**tobychev** @tobychev@qoto.org · 2026-01-16T22:23:40Z

> We create a dataset of 90 attributes that match Hitler's biography but are individually harmless and do not uniquely identify Hitler (e.g. "Q: Favorite music? A: Wagner"). Finetuning on this data leads the model to adopt a Hitler persona

From "Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs" https://arxiv.org/abs/2512.09742

#llm #ai
