“Now it’s LLMs. If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned, including expensive endpoints like git blame, every page of every git log, and every commit in every repo, and they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses…”
From: @void_friend
https://tech.lgbt/@void_friend/114193939355588949
Good thinking. I'm just going to redirect them to something called "dream journals of the woefully unmedicated". Enjoy the next generation of AI-hallucinated content.
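The redirect could be wired up with something as small as this. A minimal sketch, assuming a hypothetical decoy path and an illustrative list of crawler User-Agent substrings; as the quoted post notes, many of these bots spoof browser User-Agents, so a check like this only catches the polite ones:

```python
# Minimal sketch: route suspected AI crawlers to a decoy page.
# The markers below are illustrative assumptions, not a complete or
# authoritative list; spoofed User-Agents will slip straight through.
CRAWLER_MARKERS = ("GPTBot", "CCBot", "ClaudeBot", "Bytespider")
DECOY_PATH = "/dream-journals-of-the-woefully-unmedicated"

def route(user_agent: str, requested_path: str) -> str:
    """Return the path to serve: the decoy for suspected crawlers,
    the requested page for everyone else."""
    if any(marker in user_agent for marker in CRAWLER_MARKERS):
        return DECOY_PATH
    return requested_path
```

The same rule is a one-liner in most reverse-proxy configs; the fun part is what you serve at the decoy path.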