The very best open models are maybe 3-12 months behind the frontier and are large enough that you need $10k+ of hardware to run them, and a lot more to run them performantly. ROI here is going to be deeply negative vs just using the same models via API or subscription.
You can run smaller models on much more modest hardware but they aren't yet useful for anything more than trivial coding tasks. Performance also really falls off a cliff the deeper you get into the context window, which is extra painful with thinking models in agentic use cases (lots of tokens generated).
You can also run these models in the cloud with Ollama. You might ask what the difference is, but these are open models whose behavior stays consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1, and even Minimax 2.7, Kimi 2.5, and Gemma 4, on Ollama's cloud instances. And if you don't like Ollama's cloud, you can run them on your own cloud instance from the very same providers Ollama uses. They use NVIDIA cloud providers (NCPs), though I'm not sure which ones specifically, and they claim that the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]
It depends on your account and seems to be random.
On my personal Max 5x account it’s not the default, and if I force it, it says I’ll pay API rates past 200k tokens. On my other account that I use for work (not an enterprise account, just another regular Max 5x account), the 1M model has been the default since that rollout. I’ve tried updating and reinstalling, etc., and I can’t ever get the 1M model as the default on my personal account.
Based on other comments and discussion online, as well as Claude Code repo issues, it seems I’m not the only one not getting the 1M model for whatever reason, and the issue remains unresolved.
> we can’t have unlimited liabilities stacking up forever
The liabilities are completely offset by prepayments from your customers though. Even better, you can earn interest on the deposits without paying any out.
If you just don't want the liabilities on the books, issue refunds. Expiring credits feels like a cash grab.
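For what it's worth, the accounting picture described above can be sketched in a few lines (illustrative numbers only, nothing to do with OpenRouter's actual books):

```python
# Illustrative sketch of the double-entry view of prepaid credits
# (made-up numbers, not OpenRouter's actual books).
prepayment = 100.00            # customer buys $100 of credits
cash = prepayment              # asset: cash received up front
deferred_revenue = prepayment  # liability: credits still owed to the customer

# The books balance: the liability is fully offset by the cash asset.
assert cash - deferred_revenue == 0.0

# As credits are consumed, the liability converts into recognized revenue;
# the cash collected never changes, so the exposure is capped from day one.
usage = 30.00
deferred_revenue -= usage
revenue = usage
print(cash, deferred_revenue, revenue)  # 100.0 70.0 30.0
```

The liability only ever shrinks toward zero as credits are spent or refunded; it can't exceed the cash that backs it.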
It's just basic bookkeeping; carrying customer money across fiscal years is a nightmare to manage. (At least over here; dunno about whatever country OpenRouter is based in.)
> I don’t want some of my devices to be publicly addressable at all, even if I mess up something at the firewall while updating the rules. NAT provides this by default.
This feels like a strawman. If you are making the sort of change that accidentally disables your IPv6 firewall completely, you could accidentally make a change that exposed IPv4 devices as well (accidentally enabling DMZ, or setting up port forwarding incorrectly for example).
As someone who has done this while tired, it’s a lot easier to accidentally open extra ports to a publicly routable IP (or overbroad range of IPs) than it is to accidentally enable port forwarding or DMZ.
You could accidentally swap IPs to one that had a port forward, some applications can ask routers to open forwards themselves (UPnP), etc. I don't know exactly how we'd measure the various potential issues, but they seem incredibly minor compared to the sheer amount of breakage created by widespread NAT.
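For context on the firewall-vs-NAT point: on a typical Linux router, the stateful default-deny that NAT provides implicitly is a few lines of nftables policy. A sketch (the interface name "wan0" is an assumption; adapt it to your setup):

```nftables
# Stateful default-deny forward policy for IPv6 (sketch, not a full ruleset).
table inet filter {
    chain forward {
        type filter hook forward priority 0; policy drop;
        ct state established,related accept   # allow replies to outbound traffic
        iifname != "wan0" accept              # let LAN-originated traffic out
    }
}
```

With `policy drop` as the default, forgetting a rule fails closed, which is the same safety property people attribute to NAT, minus the address rewriting.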