there is currently a bot inside MIT IP space, address 18[.]4[.]38[.]176, scanning fedi at large. i have confirmed this with 5+ unrelated instance admins, large and small instances, across mastodon/misskey/pleroma/akkoma.
the bot is poorly behaved. i have observed it making repeated requests, multiple times per second, for the exact same paths (the paths being, generally: user profiles, specific posts, and sometimes following links in posts). returning 403s does not stop this activity. one of my domains received hundreds of additional requests despite replying with 403 to all of them. i have also seen it make requests for paths containing html tags - seems like a badly written parser. the purpose of these requests and what data is being gathered is unclear.
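for admins who want to check their own logs for the same pattern, here is a rough sketch of how the repeats could be counted. it assumes an nginx/apache "combined" style access log, and the sample lines, usernames and the `summarize` helper are made up for illustration, not taken from my actual logs.

```python
import re
from collections import Counter

# assumption: access log is in the common nginx/apache "combined" format
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

# the address from the post, un-defanged for matching
BOT_IP = "18[.]4[.]38[.]176".replace("[.]", ".")


def summarize(lines, ip=BOT_IP):
    """Count requests from `ip` per (path, status) and per one-second timestamp."""
    per_path = Counter()
    per_second = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m or m.group("ip") != ip:
            continue
        per_path[(m.group("path"), m.group("status"))] += 1
        # combined-format timestamps have one-second resolution
        per_second[m.group("ts")] += 1
    return per_path, per_second


# hypothetical sample lines, only here to show the shape of the output
sample = [
    '18.4.38.176 - - [01/Jan/2024:00:00:01 +0000] "GET /users/alice HTTP/1.1" 403 0 "-" "-"',
    '18.4.38.176 - - [01/Jan/2024:00:00:01 +0000] "GET /users/alice HTTP/1.1" 403 0 "-" "-"',
    '18.4.38.176 - - [01/Jan/2024:00:00:01 +0000] "GET /users/alice HTTP/1.1" 403 0 "-" "-"',
    '203.0.113.7 - - [01/Jan/2024:00:00:02 +0000] "GET /about HTTP/1.1" 200 512 "-" "-"',
]

per_path, per_second = summarize(sample)
for (path, status), count in per_path.most_common(10):
    print(count, status, path)
print("peak requests in one second:", max(per_second.values(), default=0))
```

to run it on a real log, pass an open file instead of `sample`. a path with a high count and nothing but 403s is the "keeps hammering after being refused" behaviour described above.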
PTR on the ip returns sts-drand03.mit.edu. a quick web search for "mit drand" brings back https://mitsloan.mit.edu/faculty/directory/david-g-rand and his personal website: https://davidrand-cooperation.com/ (note: other IPs in the /24 also have names in the PTR which match up with names of MIT faculty, but only the .176 IP appears to be involved in this activity).
seems he's doing research into "misinformation" and "fake news" on social media. he also appears to be on fedi! so @Drand@techhub.social, given this activity is sourced from an IP with your name on it, could you share the purpose of this traffic? what data is being collected and how is it being used? do you plan to respect robots.txt or identify yourself in your useragent? is there a process for instance admins to opt out of this activity other than blocking the source IP?
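if anyone wants to repeat the PTR check themselves, a minimal sketch using only the python standard library could look like this. the helper names (`reverse_name`, `ptr_lookup`, `neighbors`) are my own for illustration, and the actual lookup depends on your resolver and on whatever the PTR record says at the time you run it.

```python
import ipaddress
import socket


def reverse_name(ip):
    """Build the in-addr.arpa name that a PTR query for `ip` asks about."""
    return ipaddress.ip_address(ip).reverse_pointer


def ptr_lookup(ip):
    """Return the hostname from a reverse lookup, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


def neighbors(ip, prefix=24):
    """List the usable host addresses in the surrounding /prefix network."""
    net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return [str(host) for host in net.hosts()]


if __name__ == "__main__":
    ip = "18[.]4[.]38[.]176".replace("[.]", ".")
    print(reverse_name(ip))
    print(ptr_lookup(ip))  # needs network access; result may change over time
    # checking the rest of the /24 means one DNS query per address:
    # for addr in neighbors(ip):
    #     print(addr, ptr_lookup(addr))
```

`socket.gethostbyaddr` goes through the system resolver, so it will also pick up entries from a local hosts file. `dig -x` against a public resolver is a reasonable cross-check.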
@iron_bug heh, I dived into this thread to check that you are already here and tag if you are not =)