**Simon** · Sep 19, 2024, 09:36

Simon @spoltier@qoto.org

801 Posts

245 Following

14 Followers

code / data wrangler in Switzerland.
Compulsive reply guy. Posts random photos once in a while.

Joined Jul 2023

Simon boosted

**kaoudis** @kaoudis@infosec.exchange · Sep 19, 2024, 09:36

@spoltier the thing that I didn’t like, dependent on my understanding of basic LLM evaluation (which could!!! be wrong), is that metrics like recall are about how well the tool being measured did at producing information that aligns with ground truth information from a reference dataset. If there was no training of the tool that could take place since the tool was not a model, the tool doesn’t have a dataset to draw from to compare its output to that ground truth.

**Simon** · Sep 19, 2024, 09:23

Simon boosted

**kaoudis** @kaoudis@infosec.exchange · Sep 19, 2024, 09:23

@spoltier like I think it “works” if you want to say something like “tools that aren’t models don’t have any recall” but I don’t think it works if you want to say “objectively how did our method do at deduplicating test cases versus other types of approaches, and what types of tools can make the most unique test cases”. I think they were aiming to use data to say the former, but I don’t think it’s sufficient justification for using an LLM to do something, which is why it bugged me 😅

**Simon** · Sep 18, 2024, 08:51

Simon boosted

**kaoudis** @kaoudis@infosec.exchange · Sep 18, 2024, 08:51

If you’re generating code, and you’re *not* doing it with an LLM, is it reasonable to use metrics like F1 and recall to measure how well the tools you use are doing? This is bothering me because it feels a bit weird to apply metrics like this to static analyses, build tooling frameworks, or things that just plain don’t have any recall to begin with.

**Simon** · Sep 19, 2024, 08:51

Simon boosted

**Simon** @spoltier@qoto.org · Sep 19, 2024, 08:51

@kaoudis I'm not too familiar with the development/testing process for such tools. I would say if you have enough* representative* data, why not?

*depends on use case, target audience etc. of course.
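
For reference, precision, recall and F1 only need a tool's output and a labeled reference set; nothing in the arithmetic requires the tool to be a trained model. A minimal sketch, assuming a hypothetical deduplication tool that reports pairs of test cases it considers duplicates, scored against a hand-labeled set of true duplicate pairs (all names and data here are invented for illustration):

```python
def precision_recall_f1(predicted, reference):
    """Score a set of predicted items against a labeled reference set."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: share of the tool's output that is correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of the reference set that the tool found.
    recall = true_positives / len(reference) if reference else 0.0
    # F1: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical hand-labeled duplicate pairs (the "ground truth").
reference_duplicates = {("t1", "t2"), ("t3", "t4"), ("t5", "t6"), ("t7", "t8")}
# Hypothetical output of a non-ML deduplication tool.
tool_output = {("t1", "t2"), ("t3", "t4"), ("t9", "t10")}

precision, recall, f1 = precision_recall_f1(tool_output, reference_duplicates)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Whether the resulting numbers mean much still depends on how representative the reference set is, which is the caveat raised in the reply above.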

**Simon** · Sep 16, 2024, 16:08

Simon boosted

**Charlie Gerard** @devdevcharlie@hachyderm.io · Sep 16, 2024, 16:08

Earlier this year, I worked on a side project to hack a car in JavaScript and finally found the energy to write the blog post about it! 🚙 📡

https://charliegerard.dev/blog/replay-attacks-javascript-hackrf

**Simon** · Sep 13, 2024, 22:03

Simon boosted

**Terence Tao** @tao@mathstodon.xyz · Sep 13, 2024, 22:03

I have played a little bit with OpenAI's new iteration of GPT, GPT-o1, which performs an initial reasoning step before running the LLM. It is certainly a more capable tool than previous iterations, though still struggling with the most advanced research mathematical tasks.

Here are some concrete experiments (with a prototype version of the model that I was granted access to). In https://chatgpt.com/share/2ecd7b73-3607-46b3-b855-b29003333b87 I repeated an experiment from https://mathstodon.xyz/@tao/109948249160170335 in which I asked GPT to answer a vaguely worded mathematical query which could be solved by identifying a suitable theorem (Cramer's theorem) from the literature. Previously, GPT was able to mention some relevant concepts but the details were hallucinated nonsense. This time around, Cramer's theorem was identified and a perfectly satisfactory answer was given. (1/3)

**Simon** · Sep 14, 2024, 07:28

Simon boosted

**Laura** @cmconseils@mastodon.social · Sep 14, 2024, 07:28

Omg I'm laughing so much 😂😂😂😂
https://onerpm.link/EatingTheCats

#Music #Song

**Simon** · Sep 14, 2024, 08:08

Simon boosted

**Scott Williams 🐧** @vwbusguy@mastodon.online · Sep 14, 2024, 08:08

"I don't want to live in a world where five companies dictate everything we do."

#Nextcloud founder kicking off the conference

**Simon** @spoltier@qoto.org · Sep 13, 2024, 13:38

Looking for solar fence pictures:

https://mastodon.green/@solar_chase/113129993087551555

#solarfence #solargeländer #balustradesolaire

**Simon** · Sep 13, 2024, 11:25

Simon boosted

**Jenny Chase** @solar_chase@mastodon.green · Sep 13, 2024, 11:25

This is a really random question but: has anyone got a decent photo of a solar fence (ie where solar panels are the actual fencing material) that I could use with their explicit permission, citing them? It could literally be a snap of your neighbours' solar fence (without anything privacy-violating in the background).

This is for a TEDx talk at the end of October.

**Simon** · Sep 13, 2024, 12:48

Simon boosted

**Scott Williams 🐧** @vwbusguy@mastodon.online · Sep 13, 2024, 12:48

Hello, Berlin. I am here!

**Simon** · Sep 10, 2024, 19:04

Simon boosted

**Cat Hicks** @grimalkina@mastodon.social · Sep 10, 2024, 19:04

I don't share networking posts lightly. But I just had such an interesting poignant call with someone who feels stuck in web dev freelancing but has an amazing background in theology, ethnography, analytical philosophy and is so struggling to be seen for his immense systems thinking skills.

I wonder if anyone in my community here has recs for where he might look to network in human-centered communities in software that would value his experience in, as he put it, "metaphysical engineering" :)

**Simon** · Sep 08, 2024, 20:11

Simon boosted

**Glyph** @glyph@mastodon.social · Sep 08, 2024, 20:11

@charliermarsh @freakboy3742 @jacob @sgillies I certainly hope you succeed. I think there are ways that this could go bad, but I don’t think it *needs* to go bad. There are some significant challenges on the way there which need to be addressed as they come, there’s nothing to do or say right now, today, that can fully address those concerns

**Simon** · Sep 08, 2024, 16:26

Simon boosted

**Simon Willison** @simon@simonwillison.net · Sep 08, 2024, 16:26

Gathered a few notes on the insightful conversation about uv happening in the Python Mastodon community right now https://simonwillison.net/2024/Sep/8/uv-under-discussion-on-mastodon/

**Simon** · Sep 08, 2024, 19:23

Simon boosted

**Charlie Marsh** @charliermarsh@hachyderm.io · Sep 08, 2024, 19:23

@freakboy3742 @glyph @jacob @sgillies Honestly I try to be really open about this stuff in my writing, on podcasts, in 1:1 conversations, Q&A at events, etc. I really have nothing to hide here, and people ask me about it all the time, I just probably haven't done enough proactive sharing.

**Simon** · Sep 08, 2024, 16:01

Simon boosted

**kaoudis** @kaoudis@infosec.exchange · Sep 08, 2024, 16:01

@tilde getting the dirt! The really old dirt 😁

**Simon** · Sep 08, 2024, 15:57

Simon boosted

**Tilde Lowengrimm** @tilde@infosec.town · Sep 08, 2024, 15:57

I love reading ancient cuneiform tablets. Classics such as "Fuck you, this copper sucks." (Ea Nasir), "I should get more new clothes, my dad's employee gets new clothes twice a month and it's embarrassing.", and of course "The sesame harvest will die — let nobody say I did not warn you!", which is absolutely a set up for "Per my last clay tablet.".

Social media & email may be part of the problem. But if we're still like this when we have to carve our petty bullshit into clay then it's clear that we're the problem. It's us.

**Simon** · Sep 08, 2024, 03:05

Simon boosted

**mcc** @mcc@mastodon.social · Sep 08, 2024, 03:05

Really I think it's all a conspiracy to prevent you from typing ☭

**Simon** · Sep 08, 2024, 03:03

Simon boosted

**mcc** @mcc@mastodon.social · Sep 08, 2024, 03:03

Also, while I'm complaining about the stickers, it drives me

u p t h e w a l l

that "emoji keyboard" features in both phones and desktop PCs offer "emoji searches" which do not include non-emoji unicode codepoints. Sometimes I want to type the greek letter "mu", or the german "umlaut" symbol. Sometimes I want to type the not equals symbol? TOO BAD, says the Microsoft WIN+. key, those are NOT EMOJI and we will NOT BE HELPING YOU. COLORS OR YOU CAN'T TYPE IT!

**Simon** · Sep 03, 2024, 19:16

Simon boosted

**Simon** @spoltier@qoto.org · Sep 03, 2024, 19:16

@regehr it's funny, I wouldn't have thought of Blindsight as a "first contact" novel. The aliens are cool, but the mankind (?) they meet is more interesting IMO.
