Per-pound price comparison between various cars and cheeses.

Now if only Apple would stop fucking up perfectly^W reasonably good standards - that'd be nice!

That was Suno v3. Unfortunately, v4 got better at musicifying prose but arguably worse at working with verse. The sound also got blander, so v4 "remasters" are at best different arrangements.

I hope the upcoming models will be a Pareto improvement, and perhaps the 3rd album will be spectacular.

This is how you do 1 Apr pranks right: lesswrong.com/posts/YMo5PuXnZD

> Honestly, despite it starting out as an April fools joke, it's a really good album. We made probably 3,000-4,000 song generations to get the 15 we felt happy about, which I think works out to about 5-10 hours of work per song we used (including all the dead ends and things that never worked out)

I'm done looking at bullshit evals that give high scores to toddler-level systems or woo the audience with cookie-cutter skills.

Who wants to join forces and make a new / / (?) benchmark so we can cyberbully the new models and "AI coders" as they come out?

New Google model. Even better than everything. Completely failed at a simple Haskell task by hallucinating a ton of crap :blobfacepalm:
