I think I now know where to draw the line between "good" and "bad" #GenAI, and possibly (or rather obviously) the same for #machineLearning. It's simply whether the input data has been constructed rigorously. Put this way it's the most obvious statement ever, but somehow #BigTech have convinced us all that they advance research by recklessly scraping #twitter, #4chan and who knows what else (they keep their training data secret).
What is good science in computational linguistics? Well, open data is a step towards it. But open and crap is not a solution. We need to actually _know_ and manage the data. And nobody in their right mind would want to plough through toxic data to clean it. We've all heard the horrors of Kenyan data workers who do it for money and still suffer doing it.
But better (yes, also smaller) corpora are of interest to scholars in the humanities and the social sciences. Think of https://textcreationpartnership.org or https://mlat.uzh.ch. Yes, they are too big for individual researchers or even teams to handle, but we have the organisational and technological infrastructure to work on them collectively. We've been doing it for ages and we will continue doing it. We just need to do it together.
And this is the goal of the European Research Council project proposal I'm submitting at this very moment.
For anyone wishing to start the year off making a DOS game restricted by one of the most minimalist video standards in PC history, have I got a jam for you! This runs to the end of February and all CGA DOS games are welcome, regardless of stage of development!
#gamejam #dosgamejam #dos #cga #itch #indiedev #gamedev
https://itch.io/jam/cga-game-jam-2026
NEW: The Palantirisation of the UK military is a national security disaster. Peter Thiel is now the third wheel in the US-UK ‘special relationship’. 1/ open.substack.com/pub/broligar...
With agentic AI embedded at the OS level, databases storing entire digital lives accessible to malware, tasks whose reliability quickly breaks down at each step, and users being opted in without consent, @signalapp leadership, @Mer__edith and Udbhav Tiwari, are sounding the alarm for the industry to pull back until threats can be mitigated.
Curiosity can lead to either support of science or conspiracies. A recent study found that what matters is how people are curious. Those who dislike uncertainty and want quick answers tend toward conspiracy theories. Those who enjoy exploration and open-ended thinking tend to trust science.
British Journal of Social ...
With electric vehicles becoming a realistic option for Japanese consumers, battery recycling holds the key to the further spread of EVs. https://www.japantimes.co.jp/business/2026/01/10/battery-recycle-japan-ev/?utm_medium=Social&utm_source=mastodon #business #batteries #electricvehicles #cars #carmakers #recycling
Here's more on why Italy's idea of Piracy Shield just can't work https://www.techdirt.com/2024/12/26/italys-piracy-shield-moving-from-digital-farce-to-national-tragedy/
The bureaucratic rigidity of the Italian government has led it into a completely unnecessary conflict with US providers. With the poorly planned Piracy Shield initiative, it entered a digital sovereignty conflict the country never prepared for. The appalling thing is that even the outspoken Mario Draghi didn't try to walk his talk. And now Cloudflare threatens to resist.
"The scheme, which even the EU has called concerning, required us within a mere 30 minutes of notification to fully censor from the Internet any sites a shadowy cabal of European media elites deemed against their interests. No judicial oversight. No due process. No appeal. No transparency. It required us to not just remove customers, but also censor our 1.1.1.1 DNS resolver meaning it risked blacking out any site on the Internet. And it required us not just to censor the content in Italy but globally. In other words, Italy insists a shadowy, European media cabal should be able to dictate what is and is not allowed online."
https://arstechnica.com/tech-policy/2026/01/cloudflare-may-pull-servers-out-of-italy-over-order-that-it-block-pirate-sites/
HRANA – Iran’s nationwide protests continued into their thirteenth day amid a widespread internet shutdown. According to HRANA reports, over the past 13 days at least 65 people have been killed, 2,311 individuals have been arrested, and protests have been recorded at 512 locations across 180 cities in 31 provinces. On this day, despite severe […]
For the nation’s first president, friendliness was strategy, not concession: the republic would treat other nations with civility in order to remain independent of their appetites and quarrels. https://theconversation.com/george-washingtons-foreign-policy-was-built-on-respect-for-other-nations-and-patient-consideration-of-future-burdens-272934
The 6-7 craze that disrupted classrooms and sports events worldwide was more than just nonsense.
Media scholars from 3 countries say the fad reveals how children use meaningless language and games to carve out spaces where they hold the power and adults don't make the rules. https://theconversation.com/the-6-7-craze-offered-a-brief-window-into-the-hidden-world-of-children-272327
When reputable local news outlets close, fewer people vote and get involved in local politics, and misinformation, corruption and polarization increase, an expert on the U.S. media and its role in democracy explains. https://theconversation.com/why-the-pittsburgh-post-gazettes-closure-exposes-a-growing-threat-to-democracy-272992
EU is calling for comments on open source strategies. MAKE YOURSELF HEARD!
Even non-EU citizens have a voice here.
NOW is a time to stand up and stand out! YOU want to help the Fediverse? Here's just one way today that YOU can REALLY make a difference:
The European Open Digital Ecosystem Strategy will set out:
a strategic approach to the open source sector in the EU that addresses the importance of open source as a crucial contribution to EU technological sovereignty, security and competitiveness
a strategic and operational framework to strengthen the use, development and reuse of open digital assets within the Commission, building on the results achieved under the 2020-2023 Commission Open Source Software Strategy.
ec.europa.eu/info/law/better...
#EU #open #foss #openSource #source #linux #activitypub #AP #fedi #fediverse
@cstross and this is the fedi handle for the primary source the media seem to cite most often
@en-hrana.org
#iran
A very important point about AI models that we need to remind ourselves of
https://www.techpolicy.press/we-need-to-talk-about-how-we-talk-about-ai/
Most Americans actually like wolves, no matter their politics.
But when researchers reminded people of their political identities, Democrats became more friendly to wolves and Republicans far more opposed. It’s an interesting study in how politics and social identity can fuel partisan polarization.
“Environmental amnesia” lets critics focus on costs of laws, while forgetting why these laws were needed and the real benefits they delivered, according to an environmental law professor.
The “Documerica” project shows in clear photographic evidence how dirty the U.S. used to be. https://theconversation.com/the-us-used-to-be-really-dirty-environmental-cleanup-laws-have-made-a-huge-difference-271277
Studying how people interact, in the past (#CulturalAnalytics) and today (#EdTech #Crowdsourcing). Researcher at @IslabUnimi, University of Milan. Bulgarian activist for legal reform with @pravosadiezv. I use dedicated accounts for different languages.
My profile is searchable with https://www.tootfinder.ch/