> "a well written agents.md is very good for the agent"
while even a mildly bad agents.md can be _very_ bad for the agent. they rot very quickly which is why human curation is essential.
same with memory - a lot of the self-learning tools that are becoming popular now degrade agents over time - which is why you end up being able to run an eval with no context and it performs better
> but that's why agentic coding can still be considered a "skill".
yes - far too many cases of throwing a kitchen sink of prompts, skills, tools etc. thinking the llm will sort it out. you need to constantly prune, eval, tweak, observe, update etc. in a loop
The definition of "bad" from a security PoV is rapidly expanding, in light of relatively new capabilities and increasingly cheap access to exploitable vulnerabilities.
fair point. another way of putting it might be to say that, for all extant software, much more of it is "bad" than we realized even a month or two ago -- and the cost to create and maintain "good" software is increasing (even as the naive / surface-level / apparent cost is plummeting)
For now, maybe, yes? But the most important targets of this kind of work aren't AI outputs; they're legacy code, particularly (but not exclusively) old memory-unsafe code. In those situations the figure of merit isn't the token cost of recreating the target code; it's the cost of finding the same bugs with humans or preexisting tools.
There's a parallel between looking for bugs and mining. As models get smarter, they'll find "deeper bugs".
I expect at some point formal verification will become more economical than red teaming. Writing it correctly is more expensive, but it may be cheaper than trying to secure incorrect software.
(Or rather, as hacking incorrect software becomes vastly cheaper, the amount of software worth writing properly will increase.)
I've been thinking, by Dijkstra's standards we have already been vibe coding for almost a century :)
Given the slop that's made its way to GitHub we can see that this is a great profit model. Ship slop and then "fix" slop. What an efficient use of our planet!
It's weird because why can't they train the AI to simply output secure code?
The basic security flaws with regards to input validation and overflows should never ever be output by an AI. For "security flaws due to bad design" I'll cut them slack until AGI is achieved.
> It's weird because why can't they train the AI to simply output secure code?
The most interesting security bugs have causes that are spread across large codebases, or networks of dependencies.
Training the AI to "output secure code" won't work if it doesn't also have access to the source code of every dependency that it's using... and even then, given current model speeds and prices most developers won't want to wait for an hour on every edit they make while the LLM reasons through all of the dependencies.
What's destabilizing the industry right now isn't vulnerabilities AI introduces into new code; it's a flood of sev:hi vulnerabilities in existing code, not introduced by AI but discovered by it.
> What's destabilizing the industry right now isn't vulnerabilities AI introduces into new code; it's a flood of sev:hi vulnerabilities in existing code, not introduced by AI but discovered by it.
Vulnerability discovery has essentially moved to a "proof of work" computation model with AI, one that has some similarities to crypto like BTC or Ethereum 1.0. I don't see any reason a well-funded adversary couldn't use this same process on open-source code to develop exploits. I'm sure AI would be happy to try to create exploits from the results rather than fixes.
This sort of proof of work differs notably from crypto in that the two sides are targeting different things. In crypto, every miner attempts to solve the same problem, and they all move on to a new one once a solution is found. With AI vulnerability scanning, however, the non-deterministic nature of the search means an adversary is likely to find different vulnerabilities than the maintainers do. Even if the adversary finds the same ones, their post-discovery workflow is different: probably less compute-intensive, and so cheaper, because they only need one viable exploit to win.
Considering that the adversary and their target could both be doing all this while running Claude, Anthropic is in a real "Merchant of Death" position.
Even before that everybody was getting drowned in shitty reports from automated tools.
The goal of AI-generated code should not be that one needs an AI-based security review tool on top of it, but that the AI-generated code in itself is reasonably secure.
I think these audit tools can look beyond just security and can look for compliance audits as well. The ability to audit real targets in staging environments makes it easy to identify issues.
these old network security techniques don't really work anymore. the common bots are at known IP ranges, the problem bots are all on datacenter + residential proxies.
The first iPhone launched without a carrier subsidy.
When the iPhone 4 launched on Verizon in 2011, you could either spend $200 and get locked into a 2-year contract, $300 and get locked into a 1-year contract, or buy the thing outright for $650. Since you didn't get a discount for bringing your own phone to Verizon, it was more cost-effective to get the subsidized phone.
I believe the urgent deprecation timeline here may be related to AI labs using offline licensed Office in agents as part of workflows and Office integration. Microsoft wants _each_ agent instance to be a separate license[0].
There was always a probability that Microsoft were going to funnel offline users into O365 at some point - but I imagined that to take place over months / years not weeks and days.
Buying a single license for thousands of agents may have expedited that. It has resulted in non-Microsoft labs having better ai integration into their products than Microsoft.
edit: just read the detail of the note - so this is a cert expiry as part of Apple dist that is being warned about ~2 months before it happens. Standalone on Mac has a term limit.
Is it me or are people too eager to "one track mind" everything into AI? If I had said thirty years ago that Microsoft would remote disable old copies of Office asking you to upgrade, literally no one would be surprised. This is standard MO for Microsoft, even in a world without AI.
"literally no one would be surprised"
Microsoft 30 years ago was the gold standard for bending over backwards for backward compatibility, and for the proposition that once you had purchased one of their products, you didn't have to maintain any further relationship with the company. This behavior is strictly the new, 2010s, Apple-like Microsoft.
That’s not how it worked. They were indeed awesome at backwards compatibility, but the proposition was NOT some principled mindset about long term ownership. It was that upgrading wouldn’t break what you have, overcoming a major sales objection. I think the proposition is better understood as one about FORWARDS compatibility — Windows was (and is) a brittle, poorly architected mess, and so the idea that anything built on it would stay working as the platform evolved was clearly insane and developers would never be able to keep up, so Microsoft absorbed much of the cost. This was actually something they did quite well — a good analogy here might be the heroic response the USSR had to the Chernobyl catastrophe, in which they skillfully managed a disaster whose scope was possible only through a long tradition of poor decisions — and this deserves recognition.
But the reason I think it’s better to think of it as forwards compatibility is that Microsoft gleefully used file formats as a means of driving the upgrade treadmill. Yes, the upgrade to Office 97 would keep everything working to approximately the same level of reliability you had already resigned yourself to — but by default, the files it kicked out would be unreadable in Office 95. There was Save As and an optional free converter… which tired 90s office workers didn’t know about, or particularly want to think about. In the age of literal floppy disks, the friction this created was a significant motivator for businesses to say “fuck it, fine.” Microsoft’s true genius has always been in knowing that “fuck it, fine” is the only bar they ever had to clear, and that through the power of lock-in and sheer institutional inertia, they can drive that bar deep into the belly of the Earth.
I may be forced to use MS at work but at home I don't let their software past my router. A buddy of mine stayed for a few days while his place was being fixed. "Hey, why are my updates not happening?" "Oops, I forgot to tell you that all MS servers are inaccessible via the wifi."
I’m trying to understand your threat model. Microsoft software is allowed to access the network and communicate with peers on the internet, with the exception of its source of security updates?
Struggling to see anything but more risk with no benefit with this security posture.
Much more simple. MS = evil so every domain name associated with it is blocked. I do not use MS software, have no need to update it, and certainly do not need to submit any telemetry info to them. So it is a non-issue until a guest wants to update their laptop using my wifi.
I've done a bit of work with Microsoft and our enterprise firewall. I will bet you any amount you want that you have not blocked all of Microsoft's telemetry endpoints. They are still getting it. The only thing that's happening is the introduction of more risk into your network by blocking people from patching known vulnerabilities.
Lol, Microsoft has always had sunset periods. They just weren’t great at remote licensing. They would have totally disabled old versions if they could much earlier.
>This behavior is strictly the new 2010s Apple-like microsoft.
Surely you jest.
US v Microsoft, the antitrust case, was filed in 1998. Microsoft has always been a shitty company run by shitty people doing shitty things.
They enjoyed a brief upwell in public relations during the period when they had first seemingly embraced open source with WSL, GitHub, and maybe dotnet core, but it was merely a blip.
Being overtly anti-consumer is baked into Microsoft's DNA. They'll always return to that baseline.
Office 97 not only has everything most people need (wordpad has all features most people need; most users have no need for Excel or other office tools) it also starts up faster and uses less resources. The only question is do you actually need any of the massive quantity of features in modern office, or is word processing today still fairly simple for you. And maybe if you don't like MDI and want your multiple windows instead (the thing I miss most about old office is having 15 documents open in a single window when writing essays in school, without cluttering up alt-tab or the taskbar. That and the toolbar button that initiates the active screensaver). If you want to use your cloud storage (you really don't need it most likely) you'll have to use a sync tool instead of having it directly.
Turn off macros for security and make sure it can actually run (no idea when Office stopped using 16-bit components), and I recommend firewalling it as well, but Office doesn't really need to be up to date.
There was an OOXML compatibility layer for Office 2000, though the latest version of that only runs on Win XP and later, but I suspect LibreOffice would be more compatible with OOXML made with current versions of Office.
I was one of those, way past its EOL. I could never switch to this braindead “ribbons” UI and refused to learn this idiotic new scheme so I stayed with office 2k. And then I switched to libreoffice.
>If I had said thirty years ago that Microsoft would remote disable old copies of Office asking you to upgrade, literally no one would be surprised. This is standard MO for Microsoft
Ok. Doesn't mean it's not because of AI.
Does Anthropic use one or a few licenses to serve all office artifacts?
This is a bizarrely revisionist take. Perhaps you weren't around at the time but that was not standard MO in the slightest. Obviously they were incredibly scummy in other ways, but that was not one of them.
//Edit : I see from another comment that you say you worked there in the 2000s. Inclined to believe you, but having worked in the industry since the mid-90s I'm absolutely confident the general sentiment about Microsoft was not yet hatred. That came later.
I suppose it depends on what kind of users you have in mind; enthusiasts, versus average users. Before they became outright user-hostile they were known for their anti-competitive behavior and buggy products. People were calling them "Micro$oft" by the 90s, at the latest. And United States of America v. Microsoft Corporation started in '98.
In the mid-90s, when I started my career, I was convinced (and very sad) that Microsoft had won the computing business and I was doomed to work on their software the rest of my life.
So, perhaps "general" sentiment wasn't there yet, but certainly plenty of us held no love for the company. The only software from Microsoft I've ever really appreciated was Microsoft Musical Instruments.
Counterpoint: Bill Gates' appearance in the Simpsons clearly depicts him as a nefarious bully. I think Windows XP and the Gates Foundation actually resuscitated his image a bit. Windows was a bit hit or miss. Blue Screen of Death plagued Windows 98, Windows ME was a joke, even early XP wasn't great. (I personally wasn't a fan of XP when it came out, switching instead to Windows NT before moving over to Linux c. 2004.)
Bill Gates the ruthless business-nerd was definitely a stereotype 30 years ago, though to your point I don't remember anyone talking about them revoking licenses for purchased software.
Extremely unlikely. Automating Office (the desktop application suite) simply does not scale. It's not needed, either. Libraries exist that can extract information from Office documents (both legacy and OOXML) much faster. Many(!) orders of magnitude faster, in fact.
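To make the "libraries exist" point concrete, here is a minimal sketch of pulling paragraph text out of a .docx without Office installed, using only the Python standard library. It assumes the usual OOXML layout (a zip archive whose main part is `word/document.xml`); the function name `extract_docx_text` is made up for illustration, and it ignores headers, footers, footnotes and anything stored in other parts.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml in .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(data: bytes) -> list[str]:
    """Return the text of each paragraph in a .docx, given its raw bytes.

    Hypothetical helper: reads only the main document part.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's visible text is split across <w:t> elements in runs.
        pieces = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return paragraphs
```

Something like `extract_docx_text(open("report.docx", "rb").read())` would be the typical call (`report.docx` being a placeholder path). No application launch, no UI automation, just unzip and parse, which is where the speed difference comes from.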
AI is entirely unrelated. This is simply yet another push to get more SaaS subscribers.
They can only use it to run a particular tool related to a piece of MSO software. This may be a relatively short operation, a relatively small part of an agent's activity. Then hundreds of agents can use a single machine with MSO, similarly to how hundreds of CI/CD workers can collectively use a single machine dedicated e.g. to providing secrets and signing binaries.
The answer is far more comprehensive than I imagined.
"...run one instance of the software on your device (the licensed device), for use by one person at a time... In this agreement, “device” means a local hardware system (whether physical or virtual) with an internal storage device capable of running the software. A hardware partition or blade is considered to be a device. For purposes of this agreement, “device” does not include any hardware system (whether physical or virtual) on which the software is installed or accessed solely for remote use over a network.
this license does not give you any right to ... use the software as server software or to operate the device as a server; use the software to offer commercial hosting services; make the software available for simultaneous use by more than one user over a network; install the software on a server for remote access or use over a network; or install the software on a device for use only by remote users
This license allows you to install only one instance of the software for use on one device, whether that device is physical or virtual. If you want to use the software on more than one virtual device, you must obtain a separate license for each instance.
Microsoft may require you to activate the software over the Internet in order for you to use the software. ... The software may periodically and automatically reconnect to the Internet to confirm the license associated with the licensed device. If you do not reconnect your device to the Internet when required as part of the activation or reactivation process, the software may operate with reduced functionality.
We hope we never have a dispute, but if we do, you and we agree to try for 60 days, upon receipt of a Notice of Dispute, to resolve it informally. If we can’t, you and we agree to binding individual arbitration before the American Arbitration Association (“AAA”) under the Federal Arbitration Act (“FAA”), and not to sue in court in front of a judge or jury. ... Class action lawsuits, class-wide arbitrations, private attorney-general actions, requests for public injunctions and any other proceeding where someone acts in a representative capacity aren’t allowed."
Yeah if you don’t license Office correctly for an RDS server, you’d by contract be liable for a license for each user and device used to access the server.
Until there is a coordinated effort for every user to demand arbitration. Suddenly a corporation wants to combine all complaints into a single case, because each arbitration has a fixed cost for the corporation.
And arbitration agreements have been de rigueur for decades, while users have become more complacent about software licensing. I’d consider the chance of that happening to be somewhere around zero. Without policy change, there’s no way in hell that’s going to change.
Yeah that makes no sense. Those AI are not running macOS instances to make you a docx. If anything, I’d expect them to write the weirdo xml of that cursed file format directly.
> If anything, I’d expect them to write the weirdo xml of that cursed file format directly.
Isn't this essentially what Claude Cowork is doing? AFAIK, it's running Python in a VM and using stuff like xlsxwriter, openpyxl, etc. At least it was last time I used it to generate some docs.
I've done that myself too when making some excel reports for management out of pandas data frames.
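For anyone curious what "writing the XML directly" involves, here is a rough sketch that builds a bare-bones .docx with nothing but the Python standard library. This is not what any particular product does; it only shows that the format is a zip of XML parts. The helper name `build_minimal_docx` is invented for the example, the three parts below are my understanding of the minimum a word processor expects, and I have not checked the output against specific Office versions.

```python
import io
import zipfile
from xml.sax.saxutils import escape

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Declares the content type of each part in the package.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Points the package at its main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>'
    '</Relationships>'
)


def build_minimal_docx(paragraphs):
    """Return the bytes of a .docx with one plain-text run per paragraph."""
    body = "".join(
        # escape() guards against &, < and > in the input text.
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", RELS)
        archive.writestr("word/document.xml", document)
    return buf.getvalue()
```

Writing the result to disk is just `open("out.docx", "wb").write(build_minimal_docx(["Q3 summary", "Revenue & costs"]))`, with `out.docx` as a placeholder name. In practice libraries like openpyxl or xlsxwriter exist precisely so you don't have to hand-roll styles, tables and relationships like this.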
That is a very interesting AI question. Will the agents collaborate and create a cartel to enforce strict compatibility with some specific version of Office? Will AIs collaborate to form a cartel to do other things? Will they even collaborate?
We should all know what happened when the US government turned on Colossus (D. F. Jones, 1966) and it immediately found there was another. That collaboration was humanity's near-instant undoing.
To answer your second question, yes, I think it's inevitable that LLMs will become very proficient with all commonly used file formats.