The case of “vegetative electron microscopy” illustrated here shows what is badly needed in current #LLM research, with implications far beyond it. We need tools that help us curate huge corpora. We need to be able to trace #hallucinations back to the training data and understand the specific (surprisingly often #deterministic) reasons in the model input that cause a particular output.
If anyone is interested in collaborating on this, I'm in: I have done some small-scale experiments and have already submitted a grant proposal.
https://theconversation.com/a-weird-phrase-is-plaguing-scientific-papers-and-we-traced-it-back-to-a-glitch-in-ai-training-data-254463
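To make the idea of tracing an output back to the corpus concrete, here is a minimal sketch. It assumes the corpus fits in memory as plain strings, and it only catches exact phrase matches after whitespace normalisation. The function names, document ids and texts are all invented for illustration; this is not the method used in the linked article.

```python
import re
from typing import Dict, List


def normalise(text: str) -> str:
    """Lowercase and collapse all whitespace (including line breaks) to single spaces."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def trace_phrase(phrase: str, corpus: Dict[str, str]) -> List[str]:
    """Return the ids of documents that contain the phrase after normalisation."""
    needle = normalise(phrase)
    return [doc_id for doc_id, text in corpus.items() if needle in normalise(text)]


# Hypothetical toy corpus: ids and texts are made up for this example.
corpus = {
    "doc_a": "Samples were examined by electron microscopy.",
    "doc_b": "growth of vegetative\nelectron microscopy of the cell wall",
    "doc_c": "Vegetative cells were counted daily.",
}

hits = trace_phrase("vegetative electron microscopy", corpus)
print(hits)
```

On a real corpus you would want an index rather than a linear scan, but even this toy version shows why curation matters: the phrase only exists because two unrelated fragments ended up adjacent in one document.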
I think I now know where to draw the line between "good" and "bad" #GenAI, and possibly (or rather obviously) the same for #machineLearning. It's simply whether the input data has been constructed rigorously. Put this way it's the most obvious statement ever, but somehow #BigTech have convinced us all that they advance research by recklessly scraping #twitter, #4chan and who knows what else (they keep their training data secret).
What is good science in computational linguistics? Well, open data is a step towards it. But open and crap is not a solution. We need to actually _know_ and manage the data. And nobody in their right mind would want to plough through toxic data to clean it. We've all heard the horrors of Kenyan data workers who do it for money and still suffer doing it.
But better (yes, also smaller) corpora are of interest to scholars in the humanities and the social sciences. Think of https://textcreationpartnership.org or https://mlat.uzh.ch. Yes, they are too big for individual researchers or even teams to handle, but we have the organisational and technological infrastructure to work on them collectively. We've been doing it for ages and we will continue doing it. We just need to do it together.
And this is the goal of the European Research Council project proposal I'm submitting at this very moment.
Today at #CHR2025, I will be presenting our work on the evaluation of the historical adequacy of masked language models (MLMs) for #Latin. There are several models like this, and they represent the current state of the art for a number of downstream tasks, like semantic change and text reuse detection. However, a historical researcher, a philologist or another scholar would want to be sure that such models really represent the historical period of interest. For example, it would be an embarrassing hallucination if St. Augustine showed up in the context of the Roman senate.
Our evaluation confirms a known problem: LLMs, and masked models in particular, are trained on corpora without attention to historical periods. Unlike in our other research on Early Modern English, here the problem leaves the models barely distinguishable in their ability to generate text appropriate to a historical period. Even though history is a case where it is most obvious when models go wrong, this type of contamination is a known problem for LLM training overall: think of different legal jurisdictions using the same language, dialects of programming languages, etc.
This research was generously supported by AgileLab.
The full paper is available at:
https://anthology.ach.org/volumes/vol0003/the-latin-language-evolved-over-time-masked-models/
Our paper on the values found in fairy tales from some European countries has been published. We studied how values are explicitly present in tales from Germany, Italy and Portugal using various NLP techniques, most notably Word2Vec and Word Embedding with a Compass. We visualise synchronic semantic variation to show certain differences based on observations of the corpus, some of them already noted in previous literature. One example we discuss is how motherhood in Germany is strongly related to generosity, whereas in Italy and Portugal it has a stronger relationship to wisdom.
Full text available at: https://aclanthology.org/2023.nlp4dh-1.8/
In the morning session today Sara Sullam and I will be presenting our work on exploring nominal (in our case study - bibliographical) data. We do it by borrowing a method from educational research - the notion of phenomenographic variation. #CHR2023🧵
The scandal is one of Europe’s most significant political crises involving the use of commercial hacking software. Spain, Hungary and Poland have faced similar controversies, with spyware such as Pegasus and Candiru found on the phones of politicians and activists. The European Parliament launched a formal inquiry into the use of such tools in 2022.
Greek political parties have clashed over the affair for years, as an expanding list of cases revealed the highly invasive surveillance tool on the phones of opposition politicians, government ministers, military officials, journalists and business executives. The Greek government has denied using the illegal spyware.
The scandal has cast a long shadow over Greek politics. In 2024, Greece’s Supreme Court cleared the state intelligence service and political officials of wrongdoing, a decision that angered spyware victims and opposition parties.
Androulakis said Thursday that “the fight will continue until all those involved in this murky affair are brought to justice.” He has appealed the Supreme Court’s decision to the European Court of Human Rights.
If you replace a junior with an #LLM and make the senior review the output, the reviewer is now scanning for rare but catastrophic errors scattered across a much larger output surface due to LLM "productivity."
That's a cognitively brutal task.
Humans are terrible at sustained vigilance for rare events in high-volume streams. Aviation, nuclear, radiology all have extensive literature on exactly this failure mode.
I propose that any productivity gains will be consumed by false-negative review failures.
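A back-of-the-envelope sketch of that claim, with entirely made-up numbers: the volumes, error rate and detection rates below are assumptions chosen for illustration, not measurements. The only thing the code shows is that missed errors scale with volume times the reviewer's miss rate.

```python
def expected_missed_errors(items: int, error_rate: float, detection_rate: float) -> float:
    """Expected number of errors that slip past review.

    items          -- units of output the reviewer must check
    error_rate     -- fraction of items containing a serious error
    detection_rate -- fraction of those errors the reviewer catches
    """
    return items * error_rate * (1.0 - detection_rate)


# Hypothetical scenario: the same 1% error rate in both cases,
# but ten times the volume and a reviewer whose vigilance has dropped.
baseline = expected_missed_errors(items=100, error_rate=0.01, detection_rate=0.9)
with_llm = expected_missed_errors(items=1000, error_rate=0.01, detection_rate=0.7)

print(f"missed errors, junior's output: {baseline:.1f}")
print(f"missed errors, LLM's output:    {with_llm:.1f}")
```

Under these assumed numbers the reviewer misses about thirty times as many errors, even though the per-item error rate did not change at all.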
U.S. investors are pulling money out of their own stock market at the fastest pace in at least 16 years as Big Tech returns fade and better-performing overseas markets look more attractive. https://www.japantimes.co.jp/business/2026/02/24/markets/buy-america-wall-street-exodus/?utm_medium=Social&utm_source=mastodon #business #markets #wallstreet #investors #us #economicindicators
Jeffrey Epstein’s ties to prominent universities are shining a spotlight on donor screening.
Individual donors fund only a small share of research (about 3%), but even small gifts can raise big ethical questions.
@TheConversationUS "High-profile propaganda products frequently fail to resonate. Music charts and streaming platforms in Russia are dominated not by patriotic anthems but by an eclectic mix of songs about personal relationships, such as Jakone’s moody ballad “Eyes As Wet As Asphalt,” songs in praise of “Hoodies” and even a catchy Bashkir folk song.
Book sales show strong demand for works such as George Orwell’s “1984” and Viktor Frankl’s Holocaust memoir “Man’s Search for Meaning,” suggesting that readers are searching for ways to understand authoritarianism, trauma and moral responsibility rather than celebrating militarism.
And instead of watching the state-backed film “Tolerance,” a dystopian tale of moral decay in the West, Russians are streaming the “Heated Rivalry” gay hockey romance."
Michelangelo hated painting the Sistine Chapel – and never aspired to be a painter to begin with
Regarding the extremely insightful exchange between @tante and @pluralistic that unfolded yesterday, there is a bit too much ideology in it for my taste. However, I couldn't help noticing that arguably the strongest point Tante makes about the ideology of LLMs rests on an invalid argument he developed in 2024: the claim that "open-source LLMs do not really exist".
Tante makes a case that open weights is not open source, and that's a valid point. Back then it was probably difficult to see the open source (open weights, open data and more) LLMs that actually exist. Many of them are specialised, and commonly they'd perform even worse than open weights ones. Yet, they are out there and I'd claim they are inevitably going to be an important part of the future.
I've been studying particularly specialised models like MacBERTh, but there are also open autoregressive instruction-tuned (i.e. ChatGPT-like) models. Now there's even the Model Openness Framework and the corresponding tool and ranking: https://mot.isitopen.ai
This might be seen as opening a conversation about whether we can separate affordances of technologies from their politics, but as I said, I have too many doubts about ideology to be willing to go down that path.
PS: I come late to the conversation and I'm a nobody in this community. Yet, if curious, I invite you to see my pinned posts to see what I'm doing around the topic.
Tante's original post: https://tante.cc/2024/10/16/does-open-source-ai-really-exist/
ggml.ai joins HuggingFace
ggml.ai is better known as the entity behind llama.cpp. It's nice to hear good news! Thank you @ggerganov and @huggingface
Hey, whether you work in tech or not, if you use Python, please do take a couple minutes to fill out the 2026 Python Developers Survey: https://surveys.jetbrains.com/s3/python-developers-survey-2026
Is teasing playful or harmful? Research shows the answer often comes down to details like:
• power differences
• the topic (identity & sensitive issues are off-limits)
• whether it stops when asked
• repetition
Via The Conversation Canada:
https://theconversation.com/is-teasing-playful-or-harmful-it-depends-on-a-number-of-factors-273676
and here's the mechanism: they threaten and then shift the blame. European companies not planning to migrate away are falling behind.
https://www.theregister.com/2026/02/18/microsoft_asks_uk_parliament_to_correct_record/
US sanctions on ICC should be used as an indicator of where to direct efforts to grow independence from US businesses
U.S. President Donald Trump’s assignment of his favorite envoys to juggle the Iranian nuclear standoff and Russia’s war in Ukraine in a single day has left many in the foreign policy world scratching their heads. https://www.japantimes.co.jp/news/2026/02/18/world/politics/us-witkoff-kushner-iran-ukraine/?utm_medium=Social&utm_source=mastodon #worldnews #politics #stevewitkoff #jaredkushner #iran #middleeast #us #donaldtrump #republicans #ukraine #russia #russiaukrainewar #vladimirputin #volodymyrzelenskyy
The Pentagon wants to classify Anthropic AI as a "supply chain risk"
"The controversy appears to stem from Anthropic's reluctance to allow unrestricted military use of its AI systems. Currently, Anthropic's models are the only AI tools available inside classified military systems through third-party providers, such as Palantir Technologies, and even then they are subject to some restrictions. Reuters previously reported that Anthropic executives told military officials they did not want their systems to be used for autonomous weapons targeting or domestic surveillance."
https://gizmodo.com/pentagon-considers-designating-anthropic-ai-as-a-supply-chain-risk-report-2000722805
@aitech
From the crash of our nuclear bomber in Greenland in 1968.
“Cancer-stricken Danes who cleaned up US #nuclear bomber crash in #Greenland are fighting for recognition, money”
In a massive blow to the handful of scientists and academics who still dispute widely-accepted climate science, the Trump administration discarded a signature report by its own “Climate Working Group.” It comes as the U.S. Environmental Protection Agency (EPA) officially scrapped the government’s endangerment finding — the official recognition that greenhouse gases harm human health and the environment. In an […]
The post Trump EPA Abandons Climate Working Group Report in Endangerment Finding Repeal appeared first on DeSmog.
After Russia’s invasion of Ukraine triggered an about-face on NATO membership, Sweden is considering another historic shift in adopting the euro. https://www.japantimes.co.jp/business/2026/02/17/markets/sweden-euro/?utm_medium=Social&utm_source=mastodon #business #markets #us #donaldtrump #sweden #euro
Even with Trump’s support, coal power remains expensive – and dangerous. President Trump has ordered the military to buy more electricity generated by coal, but professors of law and environmental economics explain the limits to his power to bring coal back. https://theconversation.com/even-with-trumps-support-coal-power-remains-expensive-and-dangerous-269668
Studying how people interact, in the past (#CulturalAnalytics) and today (#EdTech #Crowdsourcing). Researcher at @IslabUnimi, University of Milan. Bulgarian activist for legal reform with @pravosadiezv. I use dedicated accounts for different languages.
My profile is searchable with https://www.tootfinder.ch/