@tedunderwood.com and yet, we are talking of black box systems that have numerous issues, copyright infringement and illicit content being only two of the clear-cut, legally unambiguous ones.
The technology is one thing, the market is a different one. Legislation (already in the US and increasingly in the EU) is deliberately closing its eyes to this distinction in the interest of multibillionaires. If not corruption, this is inequality politics at best.
Edit: see discussion below for an admission that "legally unambiguous" is an unnecessary exaggeration.
@axeghostgame.bsky.social @tedunderwood.com the rulings that don't find infringement come from judges who haven't bothered to ask for the training data. Those that do find it were able to confirm it even without requiring the data. It's the misguided idea of accountability behind the "don't ask for permission, ask for forgiveness later" mantra.
@tedunderwood.com @axeghostgame.bsky.social fair enough, my "legally unambiguous" is an inappropriate exaggeration. In a system of settlement agreements (Disney, News Corp, Springer and counting in the case of OpenAI), it is easy to get confused about the specific unresolved issues.
So according to this one review article:
1. A tendency may be emerging to consider that commercial AI systems (the technology) fall within fair use
2. But the (business) practice of accessing the original materials not infrequently involves acts of piracy. This includes allegations of seeding pirate torrents to enable the anonymous acquisition of pirated content.
https://copyrightalliance.org/ai-copyright-lawsuit-developments-2025/
copyright infringement *is* ambiguous here. some rulings find infringement, others do not.