🔴 📖 **Reappraising a controversial figure: Ross Carroll’s Edmund Burke**

Morien Robertson

“_Burke was not just a thinker and writer, but a constantly active politician and agitator, who pushed for a variety of causes that might be called progressive, and admonished the failings of the British elite, particularly with regards to the colonial administration in India._”

🔗 oxfordpoliticalreview.com/2025.

@bookstodon

🔴 **Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation**

“_While we show that integrating CoT monitors into the reinforcement learning reward can indeed produce more capable and more aligned agents in the low optimization regime, we find that with too much optimization, agents learn obfuscated reward hacking, hiding their intent within the CoT while still exhibiting a significant rate of reward hacking._”

Baker, B. et al. (2025) Monitoring reasoning models for misbehavior and the risks of promoting obfuscation. arxiv.org/abs/2503.11926.

@ai @computerscience

Mitigating reward hacking--where AI systems misbehave due to flaws or misspecifications in their learning objectives--remains a key challenge in constructing capable and aligned models. We show that we can monitor a frontier reasoning model, such as OpenAI o3-mini, for reward hacking in agentic coding environments by using another LLM that observes the model's chain-of-thought (CoT) reasoning. CoT monitoring can be far more effective than monitoring agent actions and outputs alone, and we further found that a LLM weaker than o3-mini, namely GPT-4o, can effectively monitor a stronger model. Because CoT monitors can be effective at detecting exploits, it is natural to ask whether those exploits can be suppressed by incorporating a CoT monitor directly into the agent's training objective. While we show that integrating CoT monitors into the reinforcement learning reward can indeed produce more capable and more aligned agents in the low optimization regime, we find that with too much optimization, agents learn obfuscated reward hacking, hiding their intent within the CoT while still exhibiting a significant rate of reward hacking. Because it is difficult to tell when CoTs have become obfuscated, it may be necessary to pay a monitorability tax by not applying strong optimization pressures directly to the chain-of-thought, ensuring that CoTs remain monitorable and useful for detecting misaligned behavior.

@jorgecandeias I am similar, but I think we must be in the minority

24 June 1649: Fr Mercure Verdier writes to Rome about the Irish Jesuits. Among other things, he notes that each owns a watch 'and spends much time on it'. Love of devices is nothing new. Watch with alarm, calendar and case. Made in #London 1645-1648 by Sampson Shelton (BM)

🔴 **The Persistence of Tychonism**

“_In conclusion I will suggest that the historical longevity of Tychonism has been considerably underestimated and also that Tychonists were not generally restrained from public endorsement of heliocentrism by religious pressure. On the contrary, I suggest that the continued acceptance of Tychonism was conditioned by its congruence with scientists’ religious beliefs._”

Barker, P. (2025) ‘The Persistence of Tychonism’, in Philosophical Readings. Zenodo, pp. 11–23. doi: doi.org/10.5281/zenodo.1503506.

@science @religion

🔴 **Born On This Day**

Mathematician, astronomer, and physicist Pierre-Simon Laplace was born in Beaumont-en-Auge, Normandy, France on this day in 1749.

“_He proved the stability of the solar system. In analysis Laplace introduced the potential function and Laplace coefficients. He also put the theory of mathematical probability on a sound footing._”

🔗 mathshistory.st-andrews.ac.uk/.

@science

🔴 📖 **Aristotle by George Grote**

Project Gutenberg

_“Aristotle” by George Grote is a scholarly treatise written in the late 19th century. This work provides a comprehensive account of Aristotle's life, philosophy, and influence, focusing particularly on his ethical and political theories._

_Read now or download (free!)_

🔗 gutenberg.org/ebooks/45851

@bookstodon @philosophy

The poor and nature will go thirsty so that AI can drink its fill.

"To enable AI growth, many energy companies are planning to co-locate low-carbon power generation facilities with computing and data center partners. These initiatives are substantial in scale and present significant siting challenges, particularly in identifying locations with adequate water resources to satisfy the cooling requirements of power production, data centers, and local communities."

sustainability.com/insights/po

I'm no epidemiologist, but what if for public health we got rid of all the scientists and their centuries of human knowledge clawed from the vast sea of human ignorance with rigor and testing, and turned it all over to people who were educated by vape-pen bro podcasts and their own magical thinking?

🔴 **The human-driven ecological success of olive trees over the last 3700 years in the Central Mediterranean**

_“We hypothesize that cultural exchanges between Sicily and the Aegean in the Early-Middle Bronze Age have promoted the use of wild olive trees in the Messina area. At that time, olive trees could have been used for various purposes, including fruit production, but also wood and forage for animals. The development of olive groves followed a gradual, multi-stage process in the Central Mediterranean, with large-scale cultivation appearing only in modern times.”_

Palli, J. et al. (2025) 'The human-driven ecological success of olive trees over the last 3700 years in the Central Mediterranean,' Quaternary Science Reviews, 356, p. 109313. doi.org/10.1016/j.quascirev.20.
