Hacker News | past | comments | ask | show | jobs | submit | ac29's comments

The very best open models are maybe 3-12 months behind the frontier and are large enough that you need $10k+ of hardware to run them, and a lot more to run them performantly. ROI here is going to be deeply negative vs just using the same models via API or subscription.

You can run smaller models on much more modest hardware but they aren't yet useful for anything more than trivial coding tasks. Performance also really falls off a cliff the deeper you get into the context window, which is extra painful with thinking models in agentic use cases (lots of tokens generated).


You can also run these models in the cloud with Ollama. You might ask what the difference is, but these are models whose performance will stay consistent over time, whether run locally or in the cloud. For $200 a year I'm getting some pretty fantastic results running GLM 5.1, and even Minimax 2.7, Kimi 2.5, and Gemma 4, on Ollama's cloud instances. And if you don't like Ollama's cloud, you can run these models on your own cloud instance from the very same providers Ollama uses. Ollama uses NVIDIA cloud providers (NCPs), though I'm not sure which ones specifically, and claims that the "cloud does not retain your data to ensure privacy and security." [https://ollama.com/blog/cloud-models]
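For reference, a minimal CLI session; the model tag here is illustrative, so check ollama.com for the current cloud catalog:

```shell
# Sign in once so the CLI can use Ollama's cloud (subscription required).
ollama signin

# Cloud-hosted models are tagged with a -cloud suffix; the weights run
# on Ollama's GPUs while the CLI behaves as if the model were local.
ollama run qwen3-coder:480b-cloud "Write a binary search in Go"

# Show which models are available to this machine.
ollama list
```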

Interesting. On the pricing page, there are still limits placed on the usage. How restrictive have you found them?

What are the best open models?

This includes open and closed models ranked by popularity and other metrics.

https://openrouter.ai/rankings


1M context window is still a separate, non-default model in Claude Code and not included with subscriptions (billed at API rates only)

Opus[1m] has been the default model for Max subscriptions since 2.1.75.

https://github.com/anthropics/claude-code/commit/48b1c6c0ba0...


It depends on your account and seems to be random.

On my personal Max 5x account it’s not default and if I force it, it says I’ll pay API rates past 200k. On my other account that I use for work (not an enterprise account just another regular Max 5x account) the 1M model has been the default since that rollout. I’ve tried updating and reinstalling etc, and I can’t ever get the 1M default model on my personal account.

Based on other comments and discussion online, as well as Claude Code repo issues, it seems I'm not the only one not getting the 1M model for whatever reason, and the issue continues to be unresolved.


What? Opus 1m has been in place for at least a few weeks for plan users.

> we can’t have unlimited liabilities stacking up forever

The liabilities are completely offset by prepayments from your customers though. Even better, you can earn interest on the deposits without paying any out.

If you just don't want the liabilities on the books, issue refunds. Expiring credits feels like a cash grab.
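The argument above can be sketched as simple double-entry arithmetic (illustrative names, not any company's actual books): a prepayment books equal cash and liability, so the liability never exceeds the cash received for it, and a refund clears it instead of expiring it.

```python
# Minimal sketch of prepaid-credit bookkeeping: the deferred-revenue
# liability is always fully backed by the cash deposit that created it.

class CreditLedger:
    def __init__(self):
        self.cash = 0.0              # asset: customer deposits held
        self.deferred_revenue = 0.0  # liability: unused credits
        self.revenue = 0.0           # recognized as credits are spent

    def prepay(self, amount):
        self.cash += amount
        self.deferred_revenue += amount

    def consume(self, amount):
        # Spending credits converts liability into recognized revenue.
        spent = min(amount, self.deferred_revenue)
        self.deferred_revenue -= spent
        self.revenue += spent
        return spent

    def refund_unused(self):
        # Refunding clears the liability rather than letting it expire.
        refund = self.deferred_revenue
        self.cash -= refund
        self.deferred_revenue = 0.0
        return refund

ledger = CreditLedger()
ledger.prepay(100.0)
ledger.consume(60.0)
refund = ledger.refund_unused()  # 40.0 back to the customer
```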


It's just basic bookkeeping: carrying customer money across fiscal years is a nightmare to manage. (At least over here, dunno about whatever country OpenRouter is based in.)

> By comparison gemma-4-E4B-it-GGUF:Q4_K_M scores 15/25 (that is a 4B parameter model!)

Gemma 4 E4B is slightly confusingly named; it's an 8B-param model.


You are completely right on both counts.

It is an 8B model, and it is confusingly named. In fact I made exactly the same point[1] when it was released and promptly forgot!

[1] https://news.ycombinator.com/item?id=47622694


> Suits in agriculture don't drive the combine either, a farmer does.

Advanced RTK-based positioning systems have been in ag for a long time now, so increasingly the farmer doesn't drive either.


pnpm installs to ~/.local as well

This article is about a MoE model with only 4B active parameters, it shouldn't take 10 minutes to answer a question about a small project.

I measured a 4-bit quant of this model at 1300 t/s prefill and ~60 t/s decode on a Ryzen AI Max+ 395.
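A back-of-envelope check of why 10 minutes is surprising at those rates; the prefill and decode speeds are the measurements quoted above, while the 30k-token prompt and 2k-token answer are assumed sizes for a question about a small project:

```python
# Estimate one turn's wall time: prompt tokens go through prefill,
# output tokens through (much slower) decode.

def turn_seconds(prompt_tokens, output_tokens,
                 prefill_tps=1300.0, decode_tps=60.0):
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

t = turn_seconds(30_000, 2_000)
print(f"{t:.0f}s")  # ~56s, well under 10 minutes
```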


$10/hr for high speed internet on a flight doesn't seem that bad if you have a good use for it. A single drink can cost more.

> I don’t want some of my devices to be publicly addressable at all, even if I mess up something at the firewall while updating the rules. NAT provides this by default.

This feels like a strawman. If you are making the sort of change that accidentally disables your IPv6 firewall completely, you could accidentally make a change that exposed IPv4 devices as well (accidentally enabling DMZ, or setting up port forwarding incorrectly for example).
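For concreteness, the unsolicited-inbound protection NAT gives IPv4 hosts can be stated directly as a stateful IPv6 firewall rule. A minimal nftables sketch (table and chain names are illustrative): routed traffic is dropped by default regardless of whether devices are publicly addressable.

```nft
table inet filter {
  chain forward {
    type filter hook forward priority 0; policy drop;
    # Allow replies to connections initiated from inside.
    ct state established,related accept
    # ICMPv6 is needed for IPv6 to function properly.
    meta l4proto ipv6-icmp accept
  }
}
```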


As someone who has done this while tired, it’s a lot easier to accidentally open extra ports to a publicly routable IP (or overbroad range of IPs) than it is to accidentally enable port forwarding or DMZ.

You could accidentally swap IPs to one that had a port forward, some applications can ask routers to forward ports (UPnP), etc. I don't know how exactly we'd measure the various potential issues, but they seem incredibly minor compared to the sheer amount of breakage created by widespread NAT.

I don’t have any problems with NAT on my network.

> The rovers on Mars as well

Curiosity landed in 2012 with a planned prime mission of about two years and is still active now, just shy of 5000 days after landing. Really impressive.
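The "just shy of 5000 days" arithmetic checks out against the landing date of August 6, 2012 (UTC); the as-of date below is illustrative, chosen to show where day 5000 falls.

```python
from datetime import date

# Days elapsed since Curiosity's landing (2012-08-06 UTC).
landing = date(2012, 8, 6)
as_of = date(2026, 4, 15)   # illustrative reference date
elapsed_days = (as_of - landing).days
print(elapsed_days)  # 5000
```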

