Thinking about LLMs as potentially useful tools (with many caveats) 

I was curious where the boundary lies between similarity and copyright infringement, specifically in the context of using large language models for programming. writing.kemitchell.com/2025/01 is just what I needed: a high-level explanation that there is no such rule (yet), and of what can be done in its absence.

Kyle's prose is rich and always takes me a while to read and digest, so if you're in a hurry, here are my takeaways:

there is no specific number of characters/tokens/lines that one has to generate for it to become an infringement
there's a continuum: autocompletion-generation-authorship. If one is auto-completing a simple line of code, it's probably fine. If one generates the same boilerplate that half the projects in the world contain, it's probably fine too, but make sure it's really boilerplate and nothing original. If one is asking for a complete implementation of some algorithm, the risks are way higher
one should document everything that's done by an LLM, so that it can be used later as evidence of non-infringement. LLM output should be stored as separate commits that contain the prompt. Human edits should go in their own commits to clearly delineate what was generated and what was authored.
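
To make the last takeaway more concrete, here is a minimal sketch of how a commit message for unedited LLM output could be put together. Everything in it is my own assumption, not something from Kyle's post: the function names, the layout of the message, and the `Generated-by` trailer are all hypothetical, and the only real command involved is `git commit -m`.

```python
def build_llm_commit_message(summary, model, prompt):
    """Compose a commit message that records the model and the full prompt.

    The layout and the `Generated-by` trailer are illustrative conventions,
    not an established standard.
    """
    # Indent the prompt so it stands out from the rest of the message body.
    quoted_prompt = "\n".join("    " + line for line in prompt.strip().splitlines())
    return (
        f"{summary}\n"
        "\n"
        "This commit contains unedited LLM output.\n"
        "\n"
        "Prompt:\n"
        f"{quoted_prompt}\n"
        "\n"
        f"Generated-by: {model}\n"
    )


def git_commit_command(message):
    """Return the argv for committing staged changes with the given message."""
    # Pass this list to subprocess.run() inside a repository to actually commit.
    return ["git", "commit", "-m", message]
```

The idea would be to stage only the generated files, commit them with such a message, and then put any human edits into a follow-up commit with an ordinary message, so the history itself shows what was generated and what was authored.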

Of course, there are still many more questions to be answered about LLMs: potential infringements during training, efficiency of training and inference compared to typing the code yourself, as well as more philosophical questions of where this brings programming as an activity.

#LargeLanguageModels #Law

@minoru In other news, here's GPL v2 written by my heavily quantized local GLM 4.5 Air: 0x0.st/K9T2.txt
Real™ GPL v2 forbids text modification, but this one is LLM-copyright-laundered, so you are welcome.

@L29Ah What's your point? How are LLM hallucinations related to the actual law as practised by humans?

@minoru Apparently copyright law doesn't apply to LLM hallucinations per se, which enables some creative legal tricks that didn't exist before LLMs. IANAL ofc, we're yet to see how far the rabbit hole goes.
