Be careful with what you install on your phones.
If the findings in this article are true, then most of the bot traffic that has recently taken down many small and independent websites (code forges first and foremost) comes from quite a sophisticated network of scrapers sold as a service.
My small Forgejo instance also experienced brief downtime and slowness a couple of weeks ago, but luckily nothing compared to the instances of GNOME and KDE (which had to implement aggressive CAPTCHAs to mitigate the flood). Basically anything that isn’t behind Cloudflare is a potential victim.
The pattern is mostly the same in all these cases: residential IP addresses and legitimate-looking user agents that don’t advertise themselves as bots, let alone honor robots.txt files, which makes life very hard for the sysadmins who try to block this traffic.
These bots also request heavy pages (such as git logs and blames) in large volumes, which has taken down a lot of GitLab, Gitea and Forgejo instances.
The business model behind this phenomenon seems to be quite sophisticated.
As a developer of a mobile app, I can include the SDK of a product like Infantica in my code. It doesn’t even have to be my own app. There have been cases where other people’s apps were simply repackaged with these SDKs and redistributed on stores.
That SDK in turn transforms any device it’s been installed on into a member of a vast botnet without the user’s consent or knowledge.
The customers are usually companies that want to train large AI models, but can’t afford the costs (or simply don’t want to pay them, or have a limited pool of IP addresses for scraping that may easily be blocked by sysadmins).
What they do then is pay companies like Infantica to leverage infected devices (i.e. mostly mobile phones with apps that include their SDK) to scrape the web for them and push data wherever they want.
Developers who include the SDK in their apps also get a share of the pie - hence the financial incentive to repackage and redistribute even third-party apps with the offending SDK: minimize the development effort, maximize the revenue.
Of course, the commands that “customers” can send to the botnet aren’t limited to scraping and training for AI purposes. It’s just that this is what currently pays best (it used to be crypto mining until a while ago). In theory, nothing prevents them from sending commands to access anything on the infected devices. Companies like Infantica naturally claim that they do their due diligence and scan all usages of their products to prevent abuse, but when a company already has such low moral standards you know how much those claims are worth.
Note that what until a couple of years ago would have been called “a zombie device infected with nasty malware that turns it into a botnet member at the mercy of whatever the best paying customer wants to do with it” has now been repackaged as a legit business product with its own business jargon. They are now called “residential rotating IP addresses that form an insightful peer-to-business network”.
And the volumes are also scary. Infantica claims that it can sell access to nearly 250K IPs in the US alone. That’s roughly one address for every 1,300 Americans. And when you take into account that there are dozens of companies that operate in the same sector, the volumes become scarier.
Unfortunately it’s hard for non-technical users to know which apps run such SDKs, and if there are such apps already installed on their phones. But there are a few precautions that can be taken to mitigate the risk.
First, avoid mobile apps when possible. Their potential abuse as AI scrapers is only the latest threat that they pose. They have a lot of privileges once installed and a huge attack surface. It’s ok to have an app for your camera. Whether it makes sense to have an app to check discounts at your local store is debatable. Use websites instead of apps whenever possible. Many of them can be installed on your phone nearly as full apps through the PWA paradigm, but since those web apps will always be sandboxed inside your browser they can’t do much damage. And always, always avoid, whenever possible, products whose website is a single “Download our app” page. There’s a reason why we decided that an open web is better than a bunch of closed apps, and we should punish those who don’t agree with that reasoning.
When you have no choice but to install an app, always look for comparable alternatives on e.g. F-Droid. Apps on open-source stores get much more scrutiny than whatever crap is uploaded to the Android and Apple stores. Each app is monitored for any external connections, and those are marked as anti-features. Plus, each app is forced to share its source code.
Google and Apple bear a large share of the responsibility for this mess. If an Android SDK exists that turns phones into botnet zombies that can run arbitrary payloads, then that SDK should be considered malware. Period. Any app that includes that SDK in its dependencies or includes any of those packages should be automatically flagged and removed from the store. The fact that this doesn’t happen, and that millions today run infected software on their phones downloaded from legitimate app stores, means that Google and Apple are either grossly negligent or grossly corrupt - in either case, they can’t be trusted with the safety of the software you download from their stores.
And, when you have no choice but to get an app from an official store, always prefer alternative store frontends like Aurora, which at least scans the apps from the Play Store and transparently informs you about any trackers and data access patterns.
Finally, I disagree with the last stance in this article - that every form of web scraping should be considered abusive behaviour. Scraping is one of the foundational pillars of the Web as we know it today. And the vision of a Web accessible both to humans and machines is a foundational pillar of the semantic Web. Scraping itself is not the problem. But, for scraping to be a game where everyone wins, two issues must be solved:
The right to scrape needs to be symmetric. If Google, Meta or Microsoft can freely scrape my websites to train whatever AI-hyped bullshit they want to train with it, then I also have the right to scrape their services. If instead they can eat my blog’s RSS feed or my monthly code commits for breakfast, but scraping my Facebook homepage to automatically expose my friends’ birthdays through another service may result in my account being banned, then we have a problem.
The alignment of financial incentives and impunity needs to be broken: what until a couple of years ago was basically a criminal activity (installing malware on people’s devices) has been recycled into a legitimate business model with shiny business-friendly websites and account managers. I don’t mind a world where bots identify themselves as bots through standard user agents, so I can easily block them if I want to, respect my robots.txt settings, and sensibly throttle their requests. But I have a problem with a world where all these gentlemen’s agreements are broken, where the costs of training expensive AI models are so explicitly externalized and paid by thousands of independent Web administrators through electricity costs, performance degradation costs and downtime management costs, where those who break the rules are free to operate as listed companies instead of being in jail, and where their malware is allowed to spread through standard software distribution channels.
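For what it’s worth, the “gentlemen’s agreement” side of this is cheap to implement. Below is a minimal sketch, using only Python’s standard `urllib.robotparser`, of what a well-behaved bot would check before requesting anything. The robots.txt content, the bot names and the paths are all made up for illustration, and the one-second fallback delay is an arbitrary assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a small code forge (illustrative only).
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /explore/
Crawl-delay: 10
"""


def polite_fetch_plan(user_agent, paths, robots_txt=ROBOTS_TXT):
    """Return the paths this user agent may fetch and the delay between requests."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Fall back to 1 second when no Crawl-delay applies (arbitrary assumption).
    delay = parser.crawl_delay(user_agent) or 1
    allowed = [path for path in paths if parser.can_fetch(user_agent, path)]
    return allowed, delay


if __name__ == "__main__":
    allowed, delay = polite_fetch_plan(
        "FriendlyBot/1.0", ["/alice/project", "/explore/repos"]
    )
    print(allowed, delay)
```

A bot that announces itself honestly and follows a plan like this can be throttled or blocked with one line of configuration, which is exactly what the residential-proxy scrapers are designed to avoid.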
https://jan.wildeboer.net/2025/04/Web-is-Broken-Botnet-Part-2/
Helping a friend do something on their computer, I noticed that they didn’t have an ad blocker.
Security and privacy aside, their browsing experience was atrocious.
Impossibly, unusably dire.
Now they have an adblocker, and web pages are uncluttered? Readable? Actually usable?
What a sorry state of affairs.
Adblocking is self care and just plain sensible.
Today I learned about the Pistacci raid by the SAS in WWII. It's almost unbelievable.
finally wrote something real about what i've been building!
too many of my infra workflows were buried in slack threads, docs, or shell history
so i started working on Atuin Desktop:
- runbooks that run
- local-first, crdt-powered
- embedded terminals, db queries, monitoring blocks
more words here: https://blog.atuin.sh/atuin-desktop-runbooks-that-run/
would love to know what you think!
Resist, eggheads! Universities are not as weak as they have chosen to be.
Opinion: It's time for public resistance.
https://arstechnica.com/culture/2025/04/resist-eggheads-universities-are-not-as-weak-as-they-have-chosen-to-be/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social
Posts about medicinal herbs hardly ever include the side effects or any negative reactions. It's always, "drink chamomile tea to relax and sleep" and not, "15-20% of people are allergic to ragweeds, which includes chamomile, so it might give you a scratchy throat."
-Drinking mint tea regularly can cause acid reflux and exacerbate other GI issues.
-St. John's Wort calms you, and is good for depression, but it can also dangerously lower your heart rate if you take beta blockers.
-Sure, fennel is great for "digestive issues," but don't schedule a date that night, you will be very gassy.
-All members of the Poplar family, which includes Willow, contain the same chemical as modern aspirin within their bark. Don't give it to your kid if they have the flu or chicken pox, it can cause a deadly reaction.
-Rosehip helps with painful menstruation, and is a great source of vitamin C. So great, in fact, that you can overdose and give yourself kidney stones if you take too much.
-Dear white people, taking turmeric every day will literally turn you orange.
This uses #TurfJs to measure the polygon's length, i.e. its perimeter. The coordinates are sorted in polar form, since that is how they are randomly generated: first by angle, then by distance.
For most instances this is a pretty good estimate, though in some cases a self-intersecting polygon can be created, which hints that a shorter polygon exists.
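The polar sort can be sketched in a few lines. This is not the Turf.js code: it is a plain-Python illustration that assumes flat Euclidean coordinates (Turf measures distances on the globe), and the function names are made up for this example.

```python
import math


def order_by_polar_angle(points, center):
    """Sort points by angle around center, breaking ties by distance."""
    cx, cy = center

    def polar_key(point):
        dx, dy = point[0] - cx, point[1] - cy
        return (math.atan2(dy, dx), math.hypot(dx, dy))

    return sorted(points, key=polar_key)


def perimeter(ring):
    """Length of the closed polygon that visits the points in the given order."""
    return sum(
        math.dist(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))
    )


if __name__ == "__main__":
    scattered = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
    ordered = order_by_polar_angle(scattered, center=(0, 0))
    print(perimeter(scattered), perimeter(ordered))
```

Visiting the four corners in their scattered order crosses the square twice along the diagonals, while the angle-sorted order walks around its edge, so the sorted perimeter is the shorter one.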
Furthermore, this is a typical 'travelling salesman problem', which can be solved with various algorithms.
Before considering such optimizations, though, the main question is whether the randomly generated POI is even accessible to the public.
For a pedestrian, one could use #overpass to check whether any #OpenStreetMap highways exist near the POI, find the closest one that permits pedestrians, and then move the POI to that location.
This example only works if there is enough #OSM data at the location.
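A rough sketch of that snapping step, in Python and without any network call: it builds an Overpass QL query for highways around the POI and then picks the nearest vertex from a response in the shape that `out geom;` returns. The list of non-walkable highway types and the `foot` tag check are simplifying assumptions, and snapping to the nearest vertex (rather than the nearest point on a segment) is a shortcut.

```python
import math

# Highway types assumed to be off-limits for pedestrians (illustrative, not exhaustive).
NOT_WALKABLE = {"motorway", "motorway_link", "trunk", "trunk_link"}


def build_overpass_query(lat, lon, radius_m=200):
    """Overpass QL query for all highway ways within radius_m of the POI."""
    return f"[out:json];way(around:{radius_m},{lat},{lon})[highway];out geom;"


def is_walkable(tags):
    """Crude pedestrian check based on the highway and foot tags."""
    if tags.get("foot") in {"no", "private"}:
        return False
    return tags.get("highway") not in NOT_WALKABLE


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371000 * math.asin(math.sqrt(h))


def snap_to_nearest_walkable(poi, elements):
    """Return the closest vertex of a walkable way, or None if there is none."""
    candidates = [
        (pt["lat"], pt["lon"])
        for el in elements
        if is_walkable(el.get("tags", {}))
        for pt in el.get("geometry", [])
    ]
    return min(candidates, key=lambda c: haversine_m(poi, c), default=None)
```

A `None` result corresponds to the case above where there is not enough #OSM data near the POI.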
Friendly reminder that you should be blocking all newly registered domains for your end users. Free lists like the NRD (https://github.com/xRuffKez/NRD) exist. Microsoft Defender for Endpoint also has a built-in list you can enable via policy.
IMO everyone should block domains newer than 365 days, but even 30 or 90 will save you so many headaches.
#DNS #ThreatIntel #FastFlux
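To make the age window concrete, here is a minimal Python sketch of the decision itself. It assumes you already have a registration date per domain; in practice a published list like the one above hands you the result directly. The domains and dates are invented.

```python
from datetime import date, timedelta


def is_newly_registered(registered_on, today, max_age_days=30):
    """True if the domain was registered less than max_age_days before today."""
    return (today - registered_on) < timedelta(days=max_age_days)


def blocked_domains(registrations, today, max_age_days=30):
    """Sorted list of domains that fall inside the blocking window."""
    return sorted(
        domain
        for domain, registered_on in registrations.items()
        if is_newly_registered(registered_on, today, max_age_days)
    )


if __name__ == "__main__":
    # Hypothetical registration data, for illustration only.
    registrations = {
        "old-and-boring.example": date(2015, 6, 1),
        "registered-last-week.example": date(2025, 4, 10),
        "registered-in-winter.example": date(2025, 1, 20),
    }
    today = date(2025, 4, 17)
    print(blocked_domains(registrations, today, max_age_days=30))
    print(blocked_domains(registrations, today, max_age_days=365))
```

Widening the window from 30 to 365 days only changes one parameter, so the trade-off is purely about how many legitimate young domains you are willing to block.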
@jasongorman when astrology isn't selling anymore, this is what you get
Get with it, Granddad! Vibe coding totally slays. Here are my lit book recommendations for bussin' vibe learning:
* The Mythical Man-Vibe
* Vibe-Driven Development: By Example
* The Art of Computer Vibing - Volumes 1-3
* Vibe Patterns: Elements of Reusable Prompts
* Continuous Vibing
* Structured Vibing
* Clean Vibes
* Vibe Complete
* Refactoring - (you're gonna' be needing it!)
Nobody should touch the EU Digital Sales taxes.
For years Europe has been colonized by US-based digital services that operated in our continent, profited from the data gathered from our citizens, and didn’t pay a dime in taxes.
France and Spain now have a 3% tax on all digital services that operate within the country and have yearly revenues higher than €750 million.
Austria has a 5% levy on companies above the same yearly revenue threshold that make over €25 million a year from digital advertisements.
Italy and the UK have similar laws.
The principle is straightforward: if you make profit in my country by advertising and scooping up data of my citizens, then you have to pay a fair share of taxes in my country too.
Otherwise it’s just digital colonization operated by companies that pay little or nothing in taxes in their own country, pay no taxes in the countries where they operate, while local companies are expected to pay their higher share of taxation.
The rotten fecal matter that fills Trump’s fascist brain keeps thinking that taxation against American companies is a form of discrimination against America. When it’s clear even to a toddler that it’s not.
If the same tax is levied against all businesses that meet some simple criteria (namely, profits from digital services based on data collection and advertising), regardless of their country of origin, then it’s not discrimination. Quite the opposite - exempting only American companies from such levies would be a form of favoritism that discriminates against local companies and against other foreign competitors.
Trump’s primitive synaptic infrastructure treats taxes the same way as tariffs, ignoring that tariffs, by definition, are a political tool that specifically targets a country or an industry (to either protect the local industry or punish the target for misbehaving), while taxes are financial tools that ensure a fair redistribution of revenue and have no geopolitical aims.
Trump’s bully instincts make him believe that he can retaliate against fair taxes with tariffs unless we exempt American companies from those taxes.
We shouldn’t fall for it.
We should kindly remind the American fascist bully that you can retaliate against tariffs, but not against non-discriminatory taxes. And that he has no power over the financial decisions of independent governments on the other side of the pond.
And we must use this chance to decouple, de-risk and boycott all the American technological products we can in order to build our own ecosystem, foster our own European tech infrastructure and create tech jobs here.
Boycott Facebook. Use Mastodon or any decentralized Fediverse-based solution.
Boycott Instagram. Use Pixelfed.
Boycott AWS. Use Scaleway.
Boycott Microsoft 365. Use Nextcloud.
Boycott WhatsApp. Use Signal or Matrix.
Boycott Slack. Use Mattermost.
Boycott Reddit. Use Lemmy.
Boycott Google, and even DuckDuckGo. Use meta-search engines like SearXNG. And use de-Googled Android phones when possible.
Boycott all Apple products.
Boycott Amazon. Use local e-commerce portals instead.
Boycott X, Tesla, and anything touched by that sociopath called Elon Musk. Buy European, or even Chinese, electric cars instead.
Boycott American phone makers. Consider sustainable EU-based alternatives like Fairphone instead.
And the list could go on.
And urge your employer, your family and your friends to do the same.
We have so many EU-based alternatives, many of them open-source and more sustainable than the American counterparts.
We have so much talent on our continent that builds the open-source infrastructure that even American companies profit from, and if we prevent them from using those products they’ll be hurt much more than we would be hurt by giving up on their handful of commercial products managed by a couple of multi-billion dollar companies.
And we also have nuclear options like banning American companies from using chips produced down the ASML supply chain.
The only problem to be addressed is that EU institutions never put their money where their mouth is, and they pay peanuts to those who build those sustainable solutions - and by “peanuts” I literally mean “0.1% of what the US CHIPS Act has granted to Intel alone”.
We have way more leverage than redneck Trump thinks.
We should take these thug threats as an opportunity to foster our industry and kick the American technological colonizers out of our continent for good.
As long as the orange moron sits in the White House, relying on American tech is a liability that exposes us to his childish threats, not an asset.
@davidbisset came across an interesting comparison: why do mathematicians still exist after the invention of the calculator
I am a strong proponent of leaving this planet better than I found it. Thus, to get the most bang out of a lifetime, my key focus is #longevity, which I attempt to achieve with #nutrition, specifically #plantbased.
Longevity is good and all as long as you are not frail and weak. The ideal would be to die young at an old age. Thus I incorporate tactics from #biohacking and #primalfitness. Additionally I am an advocate of #wildcrafting, which is a superset of #herbalism.
Studied many fields of science like maths or statistics, though the constant was always computer science.
Currently working as a fullstack web developer, though prefer to call myself a #SoftwareCrafter.
The goal of my side projects is to practice #GreenDevelopement meaning to create mainly static websites. The way the internet was intended to be.
On the artistic side, the goal is to release all content under a Creative Commons license, ideally using only tools and resources that are #FLOSS #OpenSource. #nobot