> Rather, the author is just a really powerful and sophisticated pattern matching machine that only returns best guesses about what they “think” a question means. But the author doesn’t understand true meaning, not really. For all of their technical achievements, the author doesn’t understand that nonsense is still nonsense even if it happens to perfectly match a certain linguistic or information pattern like "A => not B".
True, and for third-party models we'll just re-use their public open weights.
There is a time-consuming part, though, that is performed manually by our (human) team: implementing the logic of the model in super-optimized C++ and assembly code, co-designed for each specific hardware card.
This can take months.
We hope to accelerate the process with AI agents, but we're not there yet.
The state-of-the-art models (mostly GPT 5.5, but also Gemini and Claude) are better, so they cost more. Qwen 3.7 Max is their only direct competition, and it is not any cheaper.
I love ds4. US models are better imo, but like 5% better, not 500%, so the valuation doesn't really make sense
that being said, DeepSeek V4 needs to be on Amazon Bedrock to actually be feasible in the US enterprise market and start driving other providers' prices down