I've been involved with the Internet pretty much continuously since the days of its earliest ancestor, the ARPANET, at UCLA, ARPANET site #1.

I can assure you that the ARPANET and the Internet into which it evolved were NOT designed to provide a means for a handful of powerful AI firms to suck the world dry of information en masse, without permission or compensation. That information has mostly been put online for free as a public service. These firms take it to enrich their coffers and stockholders, while giving only lip service (if that) to ethical considerations, and giving the sites from which they slurp data the finger, while laughing all the way to the bank.

@lauren Hmm. A thought: Compare and contrast crawling the web to fill AI models vs. crawling it to index the web for search (e.g., AltaVista in the 1990s, and Google, Bing, et al. since then). I believe there is a difference, but it's worth exploring the intent, the expectations (of authors), and how much value (and what type of value) goes to whom.

@danb Traditional search engines have operated on a fairly straightforward value-exchange model, primarily providing links back to the sites from which information has been gathered. Generative AI systems are, for all practical purposes, a Take Everything and Give Nothing Back model, either not offering useful links back to sources by default, or making users take extra steps (that they're unlikely to take) to see those links. There simply is no comparison, and in essence these firms have now violated the unwritten understanding that was the basis for their being permitted to access that data in the first place.

@lauren That's the type of more specific discussion I was looking for. It addresses the history of scraping. Of course, there's nuance in what the search engines do, too, as @not2b points out. Thanks!
