But the encrypted API key doesn't work, it needs to be decrypted first. Let's give the server access to the private key so it can decrypt the API key. We can do this by putting the private key in an env var. But now the private key is unencrypted. Ah, it doesn't work.
You’re thinking too much. When you run the app, the system decrypts the secrets and makes them available as env vars (or some other mechanism).
In an admin UI, you list the names of secrets only, and provide a “reveal” or a “replace” on each one. They are never decrypted unless explicitly asked for.
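A minimal sketch of that model, to make the moving parts concrete. Everything here is hypothetical: `SecretStore`, its method names, and the audit log are illustrative, and `demo_decrypt` is a placeholder that only reverses bytes so the example runs. It is not encryption and stands in for whatever cipher a real platform uses.

```python
import os
import subprocess


class SecretStore:
    """Hypothetical platform-side store; `decrypt` stands in for the real cipher."""

    def __init__(self, encrypted, decrypt):
        self._encrypted = dict(encrypted)  # name -> ciphertext bytes
        self._decrypt = decrypt
        self.audit_log = []  # every explicit reveal/replace is recorded

    def list_names(self):
        # The admin UI's default view: names only, nothing is decrypted.
        return sorted(self._encrypted)

    def reveal(self, name, actor):
        # Decrypt a single secret, and only on explicit request.
        self.audit_log.append((actor, "reveal", name))
        return self._decrypt(self._encrypted[name]).decode()

    def replace(self, name, ciphertext, actor):
        # Swap in a new value without ever decrypting the old one.
        self.audit_log.append((actor, "replace", name))
        self._encrypted[name] = ciphertext

    def launch(self, argv):
        # At run time the platform decrypts everything and passes it as env vars.
        env = dict(os.environ)
        for name, blob in self._encrypted.items():
            env[name] = self._decrypt(blob).decode()
        return subprocess.run(argv, env=env, capture_output=True, text=True)


def demo_decrypt(blob):
    # Placeholder only: reverses the bytes so the sketch is runnable. NOT encryption.
    return blob[::-1]
```

Note that `launch` hands the child process the plain-text values, so anything that can run code inside that process environment can read them.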
Is this perfect? Absolutely not. The key is controlled by the company, but it can be derived in a manner that doesn’t allow for the dump of everything if it’s leaked.
My gripe is that, if no additional authentication is required for deployments or SSH access, whoever has access to the admin UI can still access the box and extract all secrets, just with extra steps. There's usually no real security boundary between "admin UI controls the box" and "box requires secrets in plain text".
I still like the approach, but I'm afraid that it feels more secure than it is, and people should be aware of that.
It’s the absolute baseline, but yes, it relies entirely on the platform’s permissions model, the administrator who assigns permissions, and the application authors not creating vectors for env var dumps. :)
But honestly, if you’re in the container, and the application running in the container can get secrets, so can a shell user.
_Maybe_ there’s a model where the platform exposes a Unix domain socket, checks the PID, user, and group of the connection, and delivers secrets that way? This has its problems, too, like it being non-standard, only possible in some scenarios and otherwise fallible… but better than nothing? If you reap the container when that process dies, you can’t race for the same PID, at least. I dunno.
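A rough sketch of the peer-credential check, assuming Linux, where `SO_PEERCRED` on a Unix domain socket returns the connecting process's PID, UID, and GID as reported by the kernel. The `ALLOWED` policy table, the `serve_one` helper, and the wire format are made up for illustration.

```python
import os
import socket
import struct

# Hypothetical policy: (uid, gid) -> secrets that identity may receive.
ALLOWED = {(os.getuid(), os.getgid()): {"API_KEY": "sk-123"}}

# struct ucred on Linux: pid_t pid; uid_t uid; gid_t gid;
UCRED = struct.Struct("iII")


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end (Linux only)."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, UCRED.size)
    return UCRED.unpack(raw)


def serve_one(conn, expected_pid):
    """Send secrets only if the kernel-reported peer matches the expected process."""
    pid, uid, gid = peer_credentials(conn)
    secrets = ALLOWED.get((uid, gid))
    if pid != expected_pid or secrets is None:
        conn.sendall(b"DENIED\n")
        return False
    payload = "".join(f"{name}={value}\n" for name, value in secrets.items())
    conn.sendall(payload.encode())
    return True
```

The credentials come from the kernel, not from the client, so they can't simply be claimed. This still does nothing against a shell user who can run as the same user or inspect the application's memory, which is the fallibility mentioned above.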
My understanding is this is exactly how Vercel works. The users hadn’t checked the “don’t ever reveal, even to me” box next to the sensitive values. If they had, the attacker would only have been able to see the names of the variables and not their values.
Yes! I’ve been trying (and failing!) to get people to understand this. Build the high leverage tools while the tokens are cheap. Unfortunately, I haven’t figured out the right set of high leverage tools. :)
As another data point, I pay for Pro for a personal account, and use no skills, do nothing fancy, use the default settings, and am out of tokens, with one terminal, after an hour. This is typically working on a < 5,000 line code base, sometimes in C, sometimes in Go. Not doing incredibly complicated things.
The benefit here is reducing the time to find vulnerabilities; faster than humans, right? So if you can rig a harness for each function in the system, by first finding where it’s used, its expected input, etc, and doing that for all functions, does it discover vulnerabilities faster than humans?
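To make the "harness for each function" idea concrete, here is a toy sketch that walks a Python source file, records each function's parameters and call sites, and renders that as per-function context. The helper names (`build_harness_contexts`, `render_prompt`) are hypothetical, it only handles plain-name calls in Python, and a C or Go codebase would need a parser for that language. Whether this finds vulnerabilities faster than humans is exactly the open question; the sketch only shows the context-gathering step.

```python
import ast


def build_harness_contexts(source):
    """Map each function name to its parameters, source text, and call sites."""
    tree = ast.parse(source)
    contexts = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            contexts[node.name] = {
                "params": [arg.arg for arg in node.args.args],
                "source": ast.get_source_segment(source, node),
                "call_sites": [],
            }
    # Second pass: record where each known function is called by plain name.
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in contexts:
                contexts[node.func.id]["call_sites"].append(
                    (node.lineno, ast.get_source_segment(source, node))
                )
    for ctx in contexts.values():
        ctx["call_sites"].sort()
    return contexts


def render_prompt(name, ctx):
    """Assemble the per-function context a model (or fuzzer) would be given."""
    calls = "\n".join(f"  line {ln}: {text}" for ln, text in ctx["call_sites"])
    return (
        f"Function under review: {name}({', '.join(ctx['params'])})\n"
        f"{ctx['source']}\n"
        f"Known call sites:\n{calls or '  (none found)'}\n"
    )
```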
Doesn’t matter that they isolated one thing. It matters that the context they provided was discoverable by the model.
There is absolutely zero reason to believe you could use this same approach to find and exploit vulns without Mythos finding them first. We already know that older LLMs can’t do what Mythos has done. Anthropic and others have been trying for years.
> There is absolutely zero reason to believe you could use this same approach to find and exploit vulns without Mythos finding them first.
There's one huge reason to believe it: we can actually use small models, but we can't use Anthropic's special marketing model that's too dangerous for mere mortals.
>At AISLE, we've been running a discovery and remediation system against live targets since mid-2025: 15 CVEs in OpenSSL (including 12 out of 12 in a single security release, with bugs dating back 25+ years and a CVSS 9.8 Critical), 5 CVEs in curl, over 180 externally validated CVEs across 30+ projects spanning deep infrastructure, cryptography, middleware, and the application layer.
So there is pretty good evidence that yes, you can use this approach. In fact, I would wager that a more systematic approach will yield better results than just brute-forcing it by running the biggest model across everything. It will definitely be cheaper.
Why? They claim this small model found a bug given some context. I assume the context wasn’t “hey! There’s a very specific type of bug sitting in this function when certain conditions are met.”
We keep assuming that the models need to get bigger and better, and the reality is we’ve not exhausted the ways in which to use the smaller models. It’s like the PlayStation 2 games that came out 10 years later: by then all the tricks had been found, and everything improved.
If this were true, we're essentially saying that no one tried to scan vulnerabilities using existing models, despite vulnerabilities being extremely lucrative and a large professional industry. Vulnerability research has been one of the single most talked about risks of powerful AI so it wasn't exactly a novel concept, either.
If it is true that existing models can do this, it would imply that LLMs are being under marketed, not over marketed, since industry didn't think this was worth trying previously(?). Which I suspect is not the opinion of HN upvoters here.
I use the models to look for vulnerabilities all the time. I find stuff often. Have I tried to build a new harness, or develop more sophisticated techniques? No. I suspect there are some people spending lots of tokens developing more sophisticated strategies, in the same way software engineers are seeking magical one-shot harnesses.
...The absolute last thing I'd want to do is feed AI companies my proprietary codebase. Which is exactly what using these things to scan for vulns requires. You want to hand me the weights, and let me set up the hardware to run and serve the thing in my network boundary with no calling home to you? That'd be one thing. Literally handing you the family jewels? Hell no. Not with the non-existence of professional discretion demonstrated by the tech industry. No way, no how.
To be honest, this just sounds like a ploy to get their hands on more training data through fear. Not buying it, and they clearly ain't interested in selling in good faith either. So DoA from my point-of-view anyways.
I like your stuff! I’ve been coveting a plotter for a while, but I’m pretty sure it won’t get used enough to justify the expense. :/
I do find the term “printmaking” hilarious because there’s just sooo many ways to make prints. I tried to get into linocut fairly recently, but the battleship grey linoleum I had wasn’t very good. It cracked and crumbled pretty easily. I did get some of the pink Speedball “blocks,” but it gets expensive pretty quickly. I guess more to the point is the feeling that I lack much to say. But, that’s an excuse. :)
Thanks for the compliment. Linocuts and monotypes have been considered printmaking for hundreds of years. Those pen plots are also collaged and modified by hand, so not only mechanical. I see all that as traditional printmaking. As for having nothing to say, there can be a lot of fulfillment in exploring basic themes, like geometric shapes or silhouettes of animals.
There’s no doubt that stuff is printmaking. My point is that there are multiple ways of doing it (and variations within each of these): relief, intaglio, lithography, screen printing, offset.
So if you say, “I’m a printmaker,” it describes basically nothing. :)
This is just a general statement, not directed at you. Sorry it felt that way.