Hacker News | lelanthran's comments

> That's why their point is what the subheadline says, that the moat is the system, not the model.

I'm skeptical; they provided a tiny piece of code and a hint to the possible problem, and their system found the bug using a small model.

That is hardly useful, is it? In order to get the same result, they had to know both where the bug is and what the bug is.

All these companies in the business of "reselling tokens, but with a markup" aren't going to last long. The only strategy is "get bought out and cash out before the bubble pops".


But... this isn't exactly news, is it?

We've known for decades that chimpanzees go to war, and during that war will happily slaughter each other.

https://en.wikipedia.org/wiki/Gombe_Chimpanzee_War


"Nasal demons" is a common reference to C and C++ Undefined Behaviour.

When an AI codes for you, you get Undefined Behaviour in every language.


> "very, very obvious" and yet so could be your comment or mine. Can we stop this kind of farming comment already?

If you want to read chatbot output, why are you coming here? There's a ton of free chatbots for you to read.

After all, the audience here knows where to go to get chatbot output, but they're coming here instead. What does that tell you?


> What does that tell you?

That HN was a neat community fifteen years ago, but like all things cool made by early adopters, it will eventually attract a following hoping to be somewhere, to exist among people doing things, but the tragedy of such followings is that they bring with them their toxicity, their immunity to their own poison, and drown out what they depend on until the early adopters early adopt away.

The real slop is all this lazy concern farming from an ant mill that is powerless to do anything except validate its own hand wringing.


> The real slop is all this lazy concern farming from an ant mill that is powerless to do anything except validate its own hand wringing.

Which circles back to the question of why, if you want to read AI output, are you still here?

You can read that sort of thing just about anywhere else.


> I easily get $1K+ of usage out of my $100 max sub. And that's with Opus 4.6 on high thinking.

And people keep claiming the token providers are running inference at a profit.


> And people keep claiming the token providers are running inference at a profit.

Not everyone gets $1K of usage, and you don't know how fat the per-token margins are. It's like saying the local buffet place is losing money because you eat $100 worth of takeout for $30.


> Not everyone gets $1K of usage, and you don't know how fat the per-token margins are.

Well, we're going to find out sooner rather than later. Right now you don't know how thin (or negative) the margins are, either, after all.

All we know for certain is how much VC cash they got. Revenue, spend, profit, etc calculated according to GAAP are still a secret.


In addition to the usage-distribution aspects others called out:

$1K is not actual cost, just API pricing being compared to subscription pricing. It is quite possible that the API has large operating margins, and, say, costs only $100 to deliver $1K worth of API credits.
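The arithmetic above can be sketched as a toy calculation. All numbers here are hypothetical placeholders, not any provider's actual prices or costs; the point is only that "usage" measured at API list price can greatly exceed the marginal cost of serving it.

```python
# Toy margin sketch (all figures hypothetical, chosen for illustration only).
# "$1K of usage" is measured at the API list price, which may sit well above
# the provider's marginal cost of serving those tokens.

api_price_per_mtok = 15.00    # hypothetical list price per million tokens
cost_per_mtok = 1.00          # hypothetical marginal serving cost
subscription_price = 100.00   # monthly plan price

usage_at_list_price = 1000.00  # usage valued at API rates
tokens_served_mtok = usage_at_list_price / api_price_per_mtok

serving_cost = tokens_served_mtok * cost_per_mtok
plan_margin = subscription_price - serving_cost

print(f"Tokens served:  {tokens_served_mtok:.1f}M")
print(f"Cost to serve:  ${serving_cost:.2f}")
print(f"Plan margin:    ${plan_margin:.2f}")
```

Under these made-up numbers, delivering "$1K worth" of tokens costs well under the $100 plan price, so the subscription can still be margin-positive even for a heavy user.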


The model developers across the board stand by the claim that most/all models are profitable by EOL, and that the losses come from R&D/training.

Yes, and when we say things like that, we are not talking about plans. Running inference at a profit means API token use is run profitably. What's happening at the plan level is a huge unknown: we know there is subsidy happening, but in aggregate it's impossible to know whether it's profitable or not.

This is great for RPGs; I made up a small, cut-down RPG ruleset for my 6yo and was going to try to 3D print some figurines, but...

This way, I can get my kid to make his own monsters: he can't run Blender to model them himself, but these paper templates are well within his reach.


> I don't mean that thinks that everyone has to share my perspective. It's just my own.

I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.


Consent is absolutely important, but that does not mean that every single thing in the entire world requires explicit consent. You did not ask me for consent to use my words in your comment. That does not mean you're a bad person.

Free use is an important part of intellectual property law. If it did not exist, the powerful could, for example, stifle public criticism by declaring that they do not consent to you using their words or likeness. The ability to do that is important for society. It is also just generally important for creating works inspired by others, which is virtually every work. There have to be lines between cases where attribution is required and cases where it is not.


> You did not ask me for consent to use my words in your comment.

I am not representing your words as mine. I am not using your words to profit from. I am not making a gain by attributing your words to you.

> There have to be lines between cases where attribution is required and cases where it is not.

You are blurring the lines between "using a quote or likeness" and "giving credit to". I am skeptical that you don't know the difference between the two.

Regardless, any "perspective" that disregards the need to acquire consent is invalid. Even if you are going to ignore it, you have to acknowledge that you don't feel you need any consent from the people you are taking from.

This whole "silence is consent" attitude is baffling.


You made an incredibly strong statement that is much broader than what we are talking about. I am pointing out various cases where I think that broadness is incorrect; I am not equating the two.

I do not think that, if you read, say, https://steveklabnik.com/writing/when-should-i-use-string-vs... , and then later, a friend asks you "hey, should I use String or &str here?" that you need my consent to go "at the start, just use String" instead of "at the start, just use String, like Steve Klabnik says in https://steveklabnik.com/writing/when-should-i-use-string-vs... ". And if they say "hey that's a great idea, thank you" I don't think you're a bad person if you say "you're welcome" without "you should really be saying welcome to Steve Klabnik."

It is of course nice if you happen to do so, but I think framing it as a consent issue is the wrong way to think about it.

We recognize that this is different than simply publishing the exact contents of the blog post on your blog and calling it yours, because it is! To me, an LLM is a transformative derivative work, not an exact copy. Because my words are not in there, they are not being copied.

But again, I am not telling anyone else that they must agree with me. Simply stating my own relationship with my own creative output.


Just wanted to compliment you on your classy attitude and style, along with your solid points. It’s not easy to take that side of the debate. Cheers.

He doesn't have solid points: he conflates fair use with free use (?), ignores thousands of years of attribution history, and equates normal human-to-human learning with corporate LLMs training on original content (without consent). Great presentation, like you said, to cover the logical defects.

I did say "free use" instead of "fair use," yeah. That's my mistake, thank you for the correction. If I could edit my original comment, I would, mea culpa. Typos happen.

I see. I must congratulate you on your rhetorical prowess, it's nice seeing a professional at work.

Fair use of training data hasn’t yet been settled in court. People here are treating it like it has been. But no amount of wishful thinking or moral arguments will change a verdict saying it’s fine for training data to be used as it has been.

Until that question is settled, it’s disingenuous to dismiss his points out of hand as conflating fair use or ignoring consent.


Even beyond that, the initial legal opinion we do have did in fact point to training being fair use: https://www.reuters.com/legal/litigation/anthropic-wins-key-...

However, I don't feel comfortable suggesting that this is settled just yet; one district judge's opinion doesn't prevent future cases from disagreeing, and we may at some point get explicit legislation one way or the other.


I think the court dropped the ball here. On the one hand, I think they were right that using existing works--copyrighted or otherwise--to train a model was transformative fair use. On the other hand, Anthropic and others trained their models on illicit copies of the works; they (more often than not) didn't pay the copyright holders.

There's a doctrine in Fourth Amendment law called "fruit of the poisonous tree." The general rule is that prosecutors don't get to present evidence in a criminal trial that they gained unlawfully. It's excluded. The jury never gets to see it even if it provides incontrovertible evidence of guilt. The point is to discourage law enforcement from violating the rights of the accused during the investigative process, and to obtain a warrant as the Amendment requires.

It seems to me that the same logic ought to be applied to these companies. They want to make money by building the best models they can. That's fine! They should be able to use all the source data they can legitimately obtain to feed their training process. But if they refuse to do so and resort to piracy, they mustn't be allowed to claim that they then used it fairly in the transformative process.


I mean, that is what the court said! Training on pirated data was not fair use. Training on legally acquired data is fair use.

Anthropic legally acquired the data and re-trained on it before release.


It did not say that. See Judge Alsup's order (https://fingfx.thomsonreuters.com/gfx/legaldocs/jnvwbgqlzpw/...), pp. 29-30, Section IV(B)(ii) ("The Pirated Library Copies").

"[T]he test requires that we contemplate the likely result were the conduct to be condoned as a fair use — namely to steal a work you could otherwise buy (a book, millions of books) so long as you at least loosely intend to make further copies for a purportedly transformative use (writing a book review with excerpts, training LLMs, etc.), without any accountability."

See also p. 31:

"The downloaded pirated copies used to build a central library were not justified by a fair use. Every factor points against fair use. Anthropic employees said copies of works (pirated ones, too) would be retained 'forever' for 'general purpose' even after Anthropic determined they would never be used for training LLMs. A separate justification was required for each use. None is even offered here except for Anthropic’s pocketbook and convenience."

Despite this consideration, the court still found for Anthropic on the question of fair use.


I don't see how that opposes what I said; that's part of the "training on pirated data is not fair use." That said, I am not a lawyer. From those pages:

> The copies used to train specific LLMs were justified as a fair use.

This is (in my understanding) because those were not the pirated copies.

> The copies used to convert purchased print library copies into digital library copies were justified, too, though for a different fair use.

Buying a book and then digitizing it for purposes of training is fair use.

> The downloaded pirated copies used to build a central library were not justified by a fair use.

Piracy is not fair use, you quoted this part as well.

In the conclusions section at the end of page 31:

> This order grants summary judgment for Anthropic that the training use was a fair use. And, it grants that the print-to-digital format change was a fair use for a different reason. But it denies summary judgment for Anthropic that the pirated library copies must be treated as training copies.

Training is fair use. Pirating is not fair use, and therefore, you can't train on that either.

What part am I missing?


I think that's a reasonable way to interpret the court's order, but unfortunately the judge didn't really articulate the consequences of training on pirated copies being "not fair use" as clearly as I would have liked. Does that mean they're simply liable for infringement of those works, or does it mean that they'd be enjoined from using them altogether to train the model? The genie was out of the bottle; how could it be put back in?

Anthropic settled the case with the publishers just a few months later, leaving the question mostly unsettled still.


I see. Thanks. I cannot wait until this is settled law too.

I was just enumerating some of the issues with the '''solid''' points OP made. Actually addressing them would take too long and be an exercise in futility here, on HN, in April 2026. Why would I put in the effort, for my comment to be flagged and sent to the void? Or worse, persisted forever and used for training without my consent?

And yes, you are right, the legal and moral question of fair use in training data hasn't been settled yet; we agree here.


> But again, I am not telling anyone else that they must agree with me. Simply stating my own relationship with my own creative output.

Look, I'm not saying that you are doing that; I'm pointing out that "silence is consent" is not as strong an argument as many think it is.


> you don't feel you need any consent from the people you are taking from.

What has been "taken", exactly?


> What has been "taken", exactly?

Where are you going with this line of thought? That making a copy of someone's work, using it for profit and not crediting them doesn't "take" anything from them?


I find that these discussions at the intersection of art and law tend to blur technical and familiar uses of words. So it's important to specify what was actually taken here because otherwise the discussion becomes muddy.

"making a copy of someone's work, using it for profit and not crediting them" wasn't really the scenario being discussed in this thread -- is that what you meant by "taking"?

Steve had made the point:

  Not every single thing in the entire world requires explicit consent.
But actually taking someone else's verbatim work and selling it as your own is one of those instances where consent would be required, because many people see the harm clearly: someone else sells the author's work, and the author doesn't get a dollar because of that.

That doesn't preclude other instances where explicit consent is not required. For example, do I need your consent to learn from your work and produce similar work of my own? Am I required to credit you in my work for having learned from you? Am I taking from you if I don't share my profits with you?

Some rights holders would say yes, actually. Which I don't agree with. I think it's important that we not require the artist's explicit consent for all things, because listening to some rights holders (e.g. Disney), they have very expansive ideas about what kind of control they are owed by society over their creations.

Therefore, I think if you're going to claim something has been taken, you should specify what exactly.


> you don't feel you need any consent from the people you are taking from

In most cases, no, I (and it seems most others) don't feel the need for that; it is only you who seems to have an ideological hangup over this.


> In most cases, no, I (and it seems most others) don't feel the need for that, it is only you who seems to have an ideological hangup over this.

It's not an ideological hangup, it's confusion over the assumption by certain groups that "silence is consent", when it is not.


refuse consent?

You may need to clarify that thought.

I don't think the poster has a viewpoint that 'refuses consent'; their viewpoint is that the writing they put up for others to view is for others to view, regardless of how it is viewed. They seem to be giving consent, not refusing it, no?


> refuse consent?

Who said anything about refusing consent?


> I think you are walking all around the word "consent" and trying very hard to avoid it altogether.

> Your perspective, because it refuses to include any sort of consent, is invalid. No perspective that refuses consent can be valid.

This is what I was responding to. I do not understand your thinking in this post.


> This is what I was responding to. I do not understand your thinking in this post.

I thought it was clear from "refuses to include any sort of consent" that I am talking specifically about holding an opinion that refuses to include any consideration for consent, not about refusing consent for usage.


But that's what I'm confused about:

How is freely giving consent for (all) others to read your content not 'considering consent'?

I'm not trying to be snarky. I really don't see the missing piece that isn't written that connects those dots.


> They only look like meat to blend in. It's the only way to figure out if they're made out of meat.

Perhaps the makers of the movie neglected to read the story before creating a script?


> Conversely: in humans, intelligence is inversely correlated with crime.

If you're measuring the intelligence of criminals who have been caught, why would you expect it to be otherwise?

IOW, you're recording the intelligence of a specific subset of criminals - those dumb enough to be caught!

If you expand your sample to all criminals, you'd probably get a different number.
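The selection effect described above can be sketched with a toy simulation (all distributions and the catch-probability model are hypothetical, invented for illustration): if smarter offenders are caught less often, the caught subset will score lower than the full offender population, even though we only ever get to measure the caught subset.

```python
import random

random.seed(0)

# Hypothetical offender population: IQ drawn from N(100, 15).
offenders = [random.gauss(100, 15) for _ in range(100_000)]

# Hypothetical catch model: probability of being caught declines with IQ,
# clamped to [0.05, 0.95]. This is an assumption, not real criminology data.
def caught(iq):
    p = max(0.05, min(0.95, 1.0 - (iq - 70) / 60))
    return random.random() < p

caught_iqs = [iq for iq in offenders if caught(iq)]

mean = lambda xs: sum(xs) / len(xs)
print(f"Mean IQ, all offenders:    {mean(offenders):.1f}")
print(f"Mean IQ, caught offenders: {mean(caught_iqs):.1f}")
```

The caught subset's mean lands several points below the population mean purely because of who gets sampled, which is the measurement bias being pointed out.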


> In practice this doesn't work though, the Mastercard-Visa duopoly is an example,

MC/Visa duopoly is an example of lock-in via network effects. Not sure that that applies to a product that isn't affected by how many other people are running it.

