Presentation: Responsible Modelling and Forecasting, Course SGD 207, Realfagbygget, Bergen, November 19, 2024, Andrea Saltelli. Slides: andreasaltelli.eu/file/reposit

pillole.graffio.org/pillole/om "Omini di burro. Scuole e università al Paese dei Balocchi dell'IA generativa" (Butter men: schools and universities in the Land of Toys of generative AI)
A scientific article by Daniela Tafani, valuable and rich in food for thought and in truths as evident as they are hidden.
The text demystifies the "beliefs" about the supposed intelligence of sof...

A study that confirms what I’ve been suspecting for a while: fine-tuning an #LLM on new knowledge increases its tendency to hallucinate.

If the new knowledge wasn’t present in the original training set, the model has to shift its weights away from their previous optimum to a new state that accommodates both the old and the new knowledge, and that new state is not necessarily optimal for either.

Without a fresh validation round against the full original cross-validation and test sets, that shift is likely to increase the chances of the model going off on a tangent.
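The weight-shift argument above can be illustrated with a deliberately tiny toy, not an LLM: a 1-D logistic regression is first fit on task A, then "fine-tuned" only on a task B whose labels conflict with A, and the original task-A validation set is re-checked afterwards. Everything here (the tasks, the trainer, the numbers) is an invented sketch to show why re-validating on the old held-out data matters; it is only an analogy for the fine-tuning behaviour the post describes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(w, b, x, y, lr=0.5, epochs=200):
    # Full-batch gradient descent on logistic loss.
    for _ in range(epochs):
        p = sigmoid(w * x + b)
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(w, b, x, y):
    return float(np.mean((sigmoid(w * x + b) > 0.5) == y))

# Task A: label = 1 iff x > 0 (train + held-out validation split).
x_a, x_a_val = rng.normal(size=400), rng.normal(size=200)
y_a, y_a_val = (x_a > 0).astype(float), (x_a_val > 0).astype(float)

# "Pre-training" on task A.
w, b = train(0.0, 0.0, x_a, y_a)
acc_before = accuracy(w, b, x_a_val, y_a_val)

# Task B: conflicting "new knowledge" (labels flipped relative to A).
x_b = rng.normal(size=400)
y_b = (x_b < 0).astype(float)

# Fine-tune on B only, then re-evaluate on the ORIGINAL validation set.
w, b = train(w, b, x_b, y_b)
acc_after = accuracy(w, b, x_a_val, y_a_val)

print(f"task-A val accuracy before fine-tune: {acc_before:.2f}")
print(f"task-A val accuracy after  fine-tune: {acc_after:.2f}")
```

Because the fine-tuning data contradicts the original task, the weights are pulled away from the old optimum and performance on the original validation set collapses — exactly the kind of regression a re-validation round is there to catch.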

#AI #ML @ai

https://arxiv.org/abs/2405.05904

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating factually incorrect responses, as the model is trained to generate facts that are not grounded in its pre-existing knowledge. In this work, we study the impact of such exposure to new knowledge on the capability of the fine-tuned model to utilize its pre-existing knowledge. To this end, we design a controlled setup, focused on closed-book QA, where we vary the proportion of the fine-tuning examples that introduce new knowledge. We demonstrate that large language models struggle to acquire new factual knowledge through fine-tuning, as fine-tuning examples that introduce new knowledge are learned significantly slower than those consistent with the model's knowledge. However, we also find that as the examples with new knowledge are eventually learned, they linearly increase the model's tendency to hallucinate. Taken together, our results highlight the risk in introducing new factual knowledge through fine-tuning, and support the view that large language models mostly acquire factual knowledge through pre-training, whereas fine-tuning teaches them to use it more efficiently.


"The tech industry’s ethos is: If it’s doable, it is necessary. But for educators, that has to be an actual question: Is this necessary?... Is doing it this way good, or could we do it another way that would be better? Better in the ethical sense and the pedagogical sense" nytimes.com/2024/04/24/opinion

They “can’t write a three-page paper or hand-make a poster board. Their textbooks are all online, but tangible pages under your fingers literally connect you to the material you’re learning. These kids do not know how to move through their day without a device in their hand and under their fingertips. They never even get a chance to disconnect from their tech and reconnect with one another through eye contact and conversation.”
nytimes.com/2024/04/10/opinion

On @jacobinitalia, some of my reflections on the role of academic freedom and on the need, at this moment, for society and power to engage with a university community fully capable of its critical function
@intellectualhistory @scuola@a.gup.pe @scuola@poliverso.org @scuola@mastodon.uno @notizie @poliversity @universitaly @histodons @histodon jacobinitalia.it/luniversita-e

"...but before asking me to melt down ploughs to make cannons, I would kindly like to be shown the papers from which one can gather which, how many, and how intelligent the efforts of the last thirty years were to avoid reaching this point. The papers, not the speeches. Before our children line up for the front, the citizens of Europe have a right to them." Maurizio Maggiani on archive.ph/xgd86

There’s no sweeter phrase than “I told you so”. But sweetness can turn bitter if I have to tell you every few months, and you keep acting surprised…

2024 - Alleged "cashier-less" Amazon shops actually need "1,000 real people in India scanning camera feeds". engadget.com/amazon-just-walke

2023 - Same thing with Alexa.
2022 - Same thing with smart warehouses.
2019 - Same thing with Ring.
2018 - Same thing with Kiwibots...

Israel accidentally bombed a food aid convoy which had shared its coordinates with the IDF in advance. Then they accidentally bombed it again. Then they accidentally bombed it a third time to finish off the survivors.

World Central Kitchen founder's response

“The air strikes on our convoy were not just some unfortunate mistake in the fog of war. It was a direct attack on clearly marked vehicles whose movements were known by the [Israeli military]. It was also the direct result of his [PM Netanyahu’s] government’s policy to squeeze humanitarian aid to desperate levels.”

José Andrés

@palestine
#Gaza
#aid
#WCK

@pluralistic Hi Cory, this will be of interest to you - a preprint demonstrating (mathematically) that hallucinations are an inevitable consequence of how LLMs are made and work. You can’t avoid them: arxiv.org/abs/2401.11817

Hallucination is Inevitable: An Innate Limitation of Large Language Models

Hallucination has been widely recognized to be a significant drawback for large language models (LLMs). There have been many works that attempt to reduce the extent of hallucination. These efforts have mostly been empirical so far, which cannot answer the fundamental question whether it can be completely eliminated. In this paper, we formalize the problem and show that it is impossible to eliminate hallucination in LLMs. Specifically, we define a formal world where hallucination is defined as inconsistencies between a computable LLM and a computable ground truth function. By employing results from learning theory, we show that LLMs cannot learn all the computable functions and will therefore inevitably hallucinate if used as general problem solvers. Since the formal world is a part of the real world which is much more complicated, hallucinations are also inevitable for real world LLMs. Furthermore, for real world LLMs constrained by provable time complexity, we describe the hallucination-prone tasks and empirically validate our claims. Finally, using the formal world framework, we discuss the possible mechanisms and efficacies of existing hallucination mitigators as well as the practical implications on the safe deployment of LLMs.
