I've been involved with the Internet pretty much continuously since the earliest days of its ancestor, the ARPANET, at UCLA (ARPANET site #1).

I can assure you that the ARPANET and the Internet into which it evolved were NOT designed to provide a means for a handful of powerful AI firms to suck the world dry of information en masse, without permission or compensation, when that information has mostly been put online for free as a public service, all to enrich the coffers and stockholders of those firms while giving only lip service (if that) to ethical considerations, and giving the sites from which they slurp data the finger while laughing all the way to the bank.

@lauren Hmm. A thought: Compare and contrast crawling the web to fill AI models vs. crawling it to index the web for search (e.g., AltaVista in the 1990s; Google, Bing, et al. since then). I believe there is a difference, but it's worth exploring the intent, the expectations (of authors), and how much value (and what type of value) goes to whom.

@danb Traditional search engines have operated in a fairly straightforward model of a value exchange, primarily providing links back to the sites from which information has been gathered. Generative AI systems for all practical purposes are a Take Everything and Give Nothing Back model, either not offering by default useful links back to sources, or making users take extra steps (that they're unlikely to do) to see those links. There simply is no comparison, and in essence these firms have now violated the unwritten understanding which was the basis for their being permitted to access that data in the first place.

@lauren @danb Even before AI, Google has been breaking that deal for a while, since their search result page has scraped answers from sites, so that many users never click a link to go to the sites the answers came from. But at least there were links.

@not2b @danb In fact, those answers were quite limited, and were direct quotes from external sites with links back to the sources. Then ALL the normal links would be displayed below as usual. The main source for those answers was Wikipedia. Generative AI answers are completely different. Really, there is no comparison.

@lauren @danb Try a basic search. For example, std::vector (the C++ vector class; I just tried it). There is a lot of content on the returned page: the text box is from cppreference.com, an ad-supported site, and there are lots of "people ask" questions. Now, for that particular question that isn't enough info and people will look further. But in many cases the text boxes and "people ask" questions are from sites that lose ad revenue if they don't get the click. And even Wikipedia doesn't get to show users their contribution solicitations.

@not2b @danb Typically the only time there were no links back with the answers was with really basic questions. What is the value of Pi? What is the capital of China? Who was the 15th president of the U.S.? Stuff like that which is widely known. Knowledge Panels (the ones to the right) always had links back to the sources and always were quotes directly from those sites (again, mostly Wikipedia). I dealt with those in several instances while I was working inside Google.

@lauren There are links, but they are below the text boxes, the "people ask" boxes, and the ads. Yes, there are links after the text boxes, but a significant number of users don't click them.

@lauren @not2b Great discussion! I agree with Lauren. I would characterize the current era of the Internet as rapacious, but like the era of the Robber Barons, it will pass as ideas of the greater social good take over, as they must.

@meltedcheese @lauren @not2b

Possibly. But worth remembering that, like the era of the Robber Barons, it's gonna be an ugly transition.

Unions did a ton of good. Also, they were illegal and dangerous to be in during nearly all the time they got their big victories.

@not2b @lauren

You guys are both right.

There was a lot of discussion at the time when Google started implementing the snippets on results, because many websites noticed their traffic falling off a cliff. AI is just Google taking that model to the next level.

Also worth reading the account from Retro Dodo on this very topic...

retrododo.com/google-is-killin

@lauren That's the type of more specific discussion I was looking for. It addresses the history of scraping. Of course, there's nuance in what the search engines do, too, as @not2b points out. Thanks!

@lauren @danb Well said! The lack of links to search sources makes the search results practically useless.
