Hacker News | FuckButtons's comments

Well, you are learning something; it’s just that the thing you’re learning has an even shorter usable lifespan than programming languages, namely you’re learning what works to get useful responses from ai agents. Whether or not that has value to you is a different matter, but it’s worth bearing in mind that something is being learned, even if it’s not engineering or programming.

> namely you’re learning what works to get useful responses from ai agents.

Having worked a lot with AI agents, I don't agree.

AI agents are amazing at producing responses and results that look correct as long as you don't look too closely.

Even when I try to write extremely detailed specs and test harnesses, even Opus 4.8 and GPT-5.5 on max will find creative new ways to write code that breaks under real use cases.

Doing throwaway LLM output, playing with it a little bit, and then calling it done will create a false sense that you're really good at getting LLMs to produce working things.


You're learning to manage idiot savants, which is a very useful skill.

> You're learning to manage idiot savants, which is a very useful skill.

I think the real bifurcation is whether you will settle on that belief.

Some of us are settling on the belief that the idiot savant, lacking the coherence of a functional mind, cannot be managed. It's essentially a chaos agent masquerading as something more cooperative.


The thing is, LLMs are more like the opposite: Sophisticated ignoramuses.

They only showed the benchmarks where they outperformed?

> Well, AI costs are definitely going to go down at least 90% in the next ~18 months for the same quality of output (and probably 90% again in the 24 months after

As far as I can see, token costs have been steadily increasing over the past few months, so I’m not sure that buying the hype that another 90% cost reduction is just around the corner is warranted.


Doesn’t seem like token costs, specifically, are increasing.

Opus cut its token pricing by 66% 6 months ago, and it had previously been at that higher price consistently for a year and a half (since that model's launch).

GPT’s latest model is harder to track since it’s not named, but its pricing is in line with its history.

Not to mention what’s happening with other models like DeepSeek, GLM, and Kimi.

It seems to me the bigger change in costs is based on token appetite. People are discovering agentic capabilities are stronger than they used to be and use cases have broadened because of that. They’ll eventually discover too that these alternative models offer 95% of the intelligence at 20% of the price.


These are fundamentally different points in design space though; HBM doesn’t have a 10 mW idle draw like LPDDR does.

Based on some napkin math, that would be about 100 watt-hours of electricity on an H100 cluster, or roughly the same amount of energy needed to boil a kettle for a cup of tea.

That's an exceptionally fast output you have there...

Mind showing your working out?


LotR ~1.4k pages, 1 page ~0.5k tok -> ~700k tok; 1 tok ~0.5 J -> ~350 kJ ≈ 100 Wh ≈ 1 cup of tea
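For anyone who wants to poke at the numbers, here is the same napkin math as a small Python sketch. Every constant is a rough assumption rather than a measurement: the page count, the ~0.5 J per token, and in particular the ~500 tokens per page, which is simply what the ~700k token total implies for ~1.4k pages.

```python
# Back-of-envelope energy estimate; all inputs are rough assumptions.
PAGES = 1_400            # approximate page count of The Lord of the Rings
TOKENS_PER_PAGE = 500    # assumed, so that the total lands at ~700k tokens
JOULES_PER_TOKEN = 0.5   # assumed energy per generated token on an H100 cluster
JOULES_PER_WH = 3_600    # exact: 1 Wh = 3600 J


def energy_estimate(pages, tokens_per_page, joules_per_token):
    """Return (tokens, joules, watt_hours) for generating the whole text."""
    tokens = pages * tokens_per_page
    joules = tokens * joules_per_token
    return tokens, joules, joules / JOULES_PER_WH


tokens, joules, wh = energy_estimate(PAGES, TOKENS_PER_PAGE, JOULES_PER_TOKEN)
print(f"{tokens:,} tokens -> {joules / 1000:.0f} kJ -> {wh:.0f} Wh")
# 700,000 tokens -> 350 kJ -> 97 Wh
```

Doubling or halving any one of the three inputs moves the answer by the same factor, so this is only good to within an order of magnitude.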

I do wonder if a wavelet transform might be better.

I think one can do better with a wavelet, shearlet, or curvelet transform that is adapted to the problem domain at hand. But the uncertainty principle still haunts those transforms, and anyway the goal is to be domain-agile.

You assume that the people who are at the top of the organizations generating said wealth will have any incentive to do that. Look around the world at the petro states for examples of a highly capital intensive industry generating money that subsidizes the rest of the economy.

If you think that people won't have work and therefore no money to buy anything, there will be no wealth, not for the people and not for the rich. The value of Google or Tesla goes to zero without masses of people paying for their products.

1. AI seems different here. American AI companies doing better seems to result in the rest of the American economy doing better, as intelligence is generally productivity increasing. Plus it's not bound by physical scarcity the way oil is. It feels more like cloud computing or electricity.

2. Even if we were to assume an analogy to a petro state, it seems like we as a society can decide if we go the route of Norway or Venezuela


> generally productivity increasing

Are you aware of any reputable study that supports this? Everything I've seen, coding included, has productivity at a net neutral at best, with large cost increases due to LLMs.


https://metr.org/blog/2026-02-24-uplift-update/

This is probably the best one for coding? The two main findings are that developers didn't want to do tasks without AI (implication being that they would find it too tedious) and for the tasks that were measured, there was a speedup (and more of a speedup if you had more experience with AI tools)

Unfortunately "productivity" is very hard to measure directly. I prefer looking at how much money companies are paying Applied AI companies (a lot) because, in aggregate, that means these companies justified their ROI vs. going to OpenAI/Anthropic directly, and did so convincingly enough that large enterprises are willing to spend the time and money on a vendor. It's not foolproof, but it dampens the effect of companies tokenmaxxing their Codex/Claude Code to look productive.


The problem is, it’s not like being a PM or manager: the people you work with generally aren’t pathological liars with severe amnesia. The anxiety you feel is because a system that is very difficult to control reliably is injecting entropy into a system that your paycheck depends on you keeping stable and, ideally, provably correct. That should make you feel anxious.

Then you haven't talked to the PMs I've worked with :)

jk, agree with your point


But the people who want to do local inference are putting some amount of value on privacy that’s not captured by the raw monetary value, so just comparing the price is somewhat beside the point. It’s also true that if you have, e.g., a Mac and you use it as your main computing device, then you would have spent money on it anyway, so you can’t really compare its cost to spending on something that’s not general purpose.


That's a lot of assumptions. I think there are also people buying new hardware specifically for this purpose, and their motivation to do it is thinking it will be cheaper in the long run. Privacy is not necessarily the motivation.


My overall opinion is that the smart thing is not to upgrade to the maximum memory for AI purposes. It's worth quantifying how much extra we pay for privacy.


I replied to a comment asking why the article exists.

As for privacy, I'm sure there are many people that are not so interested in that aspect.


It’s shocking how close this feels to Claude. Obviously it's much slower, but I don’t know that it’s significantly dumber. Interestingly, the imatrix quantization seems to be better than whatever quant the ZDR inference backends on OpenRouter are using. It was self-aware enough yesterday to realize that its own server process was itself without me telling it, which is not something I’ve ever observed a local model doing before.


In my (obviously anecdotal) testing, DeepseekV4 Pro was better than Sonnet at coding. However, it is much slower, but also many times cheaper, especially with the promotion right now.


Do they have a coding plan, or do you only pay per API call?


It’s just per token, but burning up 100 million+ tokens is a $3 transaction with their pricing right now
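As a quick sanity check on that figure, a minimal sketch of the implied blended rate. It assumes the $3 and 100 million tokens quoted above are a fair average across input, cached input, and output; the real per-token prices differ by category and are not given here.

```python
def implied_price_per_million(total_usd, total_tokens):
    """Blended price in USD per one million tokens."""
    return total_usd / (total_tokens / 1_000_000)


# Figures from the comment above: roughly $3 for 100 million tokens.
price = implied_price_per_million(3.0, 100_000_000)
print(f"${price:.2f} per million tokens")
# $0.03 per million tokens
```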


Do you use the official API or another provider?


I use the official API, OpenRouter somehow didn't use caching and one short session with Qwen cost me $5.


Just directly. Paid for it with PayPal. It’s quite simple to set up and use.


You pay per API call, but you will be challenged to burn through $20 per month. 24/7 usage for a single agent will probably cost you around $100 per month. It is very efficient, especially with modern harnesses.


I racked up $30 in 3 days, but I did A LOT of refactoring. Got my projects really buttoned up and now I’m sipping tokens with codex again. Have been more like $1-2/day with deepseek since that initial swarm. With max effort.

It’s especially great that you don’t have to worry about hitting your limit and being stalled.

I’m using it with Claude


What prompt had you given it?

