
In my field, and generally, raw data is never ever in a usable form.

This is one of the many things Hollywood persistently gets wrong: the hacker/genius/omni-computer-geek opens up the file, stares at a bunch of raw numbers or symbols, and says, "Ah hah! If I reverse the polarity on the data stream, I can match the sequence to the source of the outbreak! Oh, and make it enhance if you want, but that's extra."

Bonus points if the screen projects on said scientist's face and reflects from the inevitable chunky-framed glasses. Scribbling equations backward on a transparent whiteboard may also be involved.

Scientists, as I have said many times before and no doubt will need to say many times again, are people. We're pretty good with numbers, yes, as a rule. But what we're good at doing with those numbers is not reading and understanding them. It's using them as the raw materials for a product which makes sense to the human brain. Words, pictures, and a MUCH SMALLER number of numbers is our goal. Also continued funding, which is about the kind of numbers everyone understands.

Before we process the numbers, we need to "preprocess" them. There are several intermediate steps between the really raw data and the cover story for next week's issue of _Nature_. Preprocessing is where we turn the glowing symbols projected onto our faces into something that kinda-sorta makes sense. It's still not really readable, but people looking at it, who know what they're looking at, can tell what it represents.

Usually this is in the form of one or more tables: for a familiar example, think of an Excel workbook with several large sheets. (In reality, storing data in Excel is a terrible idea, but I'll stick with that metaphor.) Nobody's going to read and digest everything in the workbook. You can look at the headers and a few of the values and at least have an idea where to start. Preprocessing gets you to that point.

For most types of data, preprocessing is fairly standardized. You don't have to write your own code: someone else has already done that work for you. Just pick a package, run the raw data through it, and glance at the output to make sure nothing went horribly wrong. Now you're ready to write the code only you can write, to discover the Secrets of Life Itself. Now is the time for SCIENCE.

Or _Nature_. Or _The Journal Of Obscure Subfield Ten People In The World Know Exists_. Or a tech report. You know, whatever.

Careful readers will have noticed the word "fairly" above. In fact there are multiple algorithms to choose from, and multiple packages implementing those algorithms, and one-off scripts written at 3:00 AM by an exhausted postdoc who really just wanted to check the cultures one last time and grab the remaining half a chicken salad sandwich from the break room fridge and go home and crawl into bed for a few hours' sleep before dragging ass back in tomorrow. Shower optional.

Other exhausted postdocs and their harassed PIs, who get somewhat more sleep and a somewhat finer grade of chicken salad but are much more worried about upcoming funding application deadlines, may or may not bother to write down which package they use to preprocess their data. Or what specific parameters they tuned. Or if they even know how they're supposed to use the damned thing: there's a really good chance they just ran the data through on the default settings, got something that looked reasonable, and called it a day.

Amazingly, most of the time this doesn't really matter. Data has a life of its own. The bigger the data set gets, and these days nearly all data are "big data," the more likely it is that any reasonable method will produce similar results. Good thing too, otherwise science (and _Science_) would grind to a screeching, shuddering, smoking halt.

Sometimes it matters a lot. Careful scientists check, just in case. I try to be one of those, and when I'm not, my coworkers pick up the slack. Luckily for me, for most of my career I've found myself in the company of those who live up to that standard, and I can mostly convince myself I do the same. Another item on Hollywood's long list of sins: science is not a solo enterprise. In fact it's deeply social, which is one of several reasons why the stereotype of scientists as loners is a load of crap. But I digress.

In case you're wondering if this has a point, yes it does, and here it is: all the above is why my boss recently sent me a message saying, "Woah yeah ok so maybe you do need to process from raw after all. B/c idk wtf that is."

Without any irony at all: I love my job.

A friend passed this along. I expect it will work about as well as AI-text detection generally does. Maybe worse, since journal writing in particular is known for forcing human* authors into a very mechanical style. There may be nothing easier for ChatGPT et al. to mimic.

*Presumably.

Today is the third anniversary of the first confirmed covid-19 case in the US. As good a day as any to write a long and rambly essay about the current state of the nineteenth crow. I'm going to begin and end on a personal note, with some facts in the middle.

Almost my entire adult life has been dedicated to keeping people alive. I became a medic for a number of reasons, but most directly because a friend was murdered: if there was any way I could keep another close circle of young, healthy people with a reasonable expectation of living for decades more from having to gather to mourn one of their own, I would.

So. First as a medic, and in every role since, my goal has been quite consistent. Everyone dies—but if we can claw back just a little more time, a few more years or days or *hours*, that is a victory in the war that never ends. As my fiancee says in another context, a slap in the face of a forgetful future. Only our deeds live after us. There is no greater deed than life itself.

For all that time, I've known there was another pandemic coming. It was inevitable. One aspect of the medic's war is the arms race, and pathogens have one hell of an R&D program. We can win skirmishes, and sometimes battles, and occasionally even a campaign. They'll keep coming back, with mindless determination and terrifying numbers.

What I didn't know, couldn't even imagine, was what the primary obstacle to that particular phase of the war would be. I thought the challenges would be technical: spotting the outbreak, identifying the pathogen, learning its strengths and weaknesses, developing preventions and treatments, getting treatment out to where it's needed most. You know, science stuff.

Those are all challenges with covid, to be sure. But it turns out the biggest problem is willful, deliberate, self-imposed, homicidal and suicidal stupidity.

I've written at length about my opinions of antivaxers and covid deniers, and no doubt I will again. The short version is, they're traitors to humanity. I've given up trying to reason with them. Best to concentrate on what I can control: my own work, the knowledge I gain from others in the field, and the information I provide to people who are willing to listen. Eventually the traitors will benefit too, however little they deserve it.

Okay, on to the facts. The data I'll be discussing are US-centric. I live here, it's my primary area of concern, and it's what I know the most about. Readers in other countries, please feel free to comment on similarities and differences.

1. There were massive spikes in January 2021 and January 2022: [here is a nice visualization](coronavirus.jhu.edu/region/uni). At the peaks, 2022's spike was more than three times as bad as 2021's. On the other hand, there were only (only!) about three-quarters as many deaths in 2022. I suspect there are a few things going on here:

- New Year's celebrations both years, but especially in 2022, packed a lot of people together in small spaces. This is of course a recipe for mass infection.

- Prevalence of the omicron strain in 2022, which is *generally* more infectious but less lethal than other strains.

- More masks in 2021, preventing infection but doing nothing to reduce severity, compared to *much* more vaccination but fewer masks in 2022.

2. [As of September, Republicans were dying of covid almost twice as fast as Democrats, controlling for age](nber.org/papers/w30512). Of course it's not being a Republican that kills you: it's not being vaccinated. Antivax has become an ideological purity test for a substantial portion of the Republican Party, and those who try to both-sides this issue have their heads buried so far in the sand they're hitting bedrock.

3. New variants haven't been much in the news lately. That's not because SARS-CoV-2 has stopped mutating. It's because what we call omicron has mutated so much that it now has more genetic diversity than all the other variants put together. Tracking variants by letter made sense in the early stages of the pandemic, but now we could go through the entire Greek, Hebrew, Latin, Arabic, and every other alphabet and still run out. As a practical matter, this means we'll need at-least-annual shots forever.

4. In June 2022, we passed a little-remarked milestone: total US covid deaths in people under 30 surpassed total US deaths in Iraq and Afghanistan, in a similar demographic over a much shorter time. Older people are more likely to die of covid, sure. Older people are more likely to die of *everything*. Youth will not save you.

There is some good news. The January spikes mentioned in item #1 seem so far not to be materializing this year, although we'll have to wait until the data for the entire month have been sifted to be sure. I doubt people are getting any more careful, but 70-80% of the US population is vaccinated, depending on how you count: maybe that's enough for a meaningful degree of herd immunity.

Older vaccines aren't as effective against newer strains as newer vaccines tailored to those strains, but they still help. With *any* vaccination on board, you're less likely to get infected, and less likely to die if you do. Keep getting boosters, and your immune system will build up a kind of library: "oh, this isn't exactly like anything I've seen before, but it looks kind of like this thing I read about, so ..."

We can't make antivaxers go away. They'll always be there, acting as a reservoir for infection and a breeding ground for new strains. Sure, they'll die at a much higher rate, but more of them will live, largely thanks to the heroic efforts of the science they reject. What we can do, to a degree, is work around them.

Back to the personal.

I've lost a few friends, and I know many people who have lost family, to covid/antivax lunacy. I understand the pain and anger and confusion. Stay strong. Maybe one day before we're all dead, this will be over and reconciliation will be possible, if we choose to extend such grace. Not today, and not for a long time to come.

Any person, any family, any circle of friends, any gathering, any business, any government ... all are *more* than justified in exiling these traitors forever. When they are those we love, grief is inevitable. But the people we loved are already dead. Our job is to avoid joining them.

Let's be careful out there.

Yevgeny Prigozhin, founder and main financial backer of the Wagner Group, was once Vladimir Putin's best buddy. Lately, not so much: understandingwar.org/backgroun

Their interests are fundamentally at odds. Both want to win the war, of course. But Putin wants a lasting conquest, with steady incorporation of territory into a new Russian Empire, and the people pledging their fealty to him as Tsar. Prigozhin just wants his bully boys to kill people and break things, and come home with armfuls of loot. What's left of Ukraine afterward isn't his concern. It might even be *better* from his perspective if the wreckage smolders as an object lesson.

Make no mistake, Prigozhin wants to be Tsar too. But Putin wants to be Vladimir the Great. Prigozhin is aiming for Yevgeny the Terrible. More than one Grand Duke of Muscovy was a lucky soldier—

Neither is any less evil or less delusional than the other: Ukraine very definitely gets a vote. Putin and Prigozhin are united in their conviction of Russian superiority.

Right now, the Russian war effort needs Wagner, so Putin's soft-pedaling. He probably knows he can't do that for long. Prigozhin should remember that when you come at the king, you best not miss.

Me, I'll just be making popcorn. Whoever loses, we win.

This is deeply wrong, but it's an interesting *kind* of wrong.

Our perception of the past telescopes: there's the recent past, what we remember; the middle past, what our parents and grandparents remember; the long past, out of living memory but still preserved in familiar stories; and everything else. As I've said before, a lot of Americans' idea of *human history* seems to go roughly as follows:

1. .
2. .
3. .
4. Robin Hood and King Arthur.
5. and .
6. and George Washington.
7. .
8. World War Two. (One must have happened somewhere?)
9. and .
10. The real world begins with the momentous event of my birth.

Nor is this uniquely an American problem—some places have better educational systems than others, but I think people everywhere hold similar mythologized versions of world events leading uniquely and inevitably to their own central place in the world.

So here's an extreme version of the same phenomenon applied to natural history. Most reasonably educated people have some idea that not all prehistoric animals lived at the same time (although poor Dimetrodon is forever going to be mixed in with the dinosaurs) but they do tend to lump enormous spans of time together: mammoths and sabertooths, before that all dinosaurs all at once, and before that ... I dunno ... jellyfish or something.

Creationists, of course, turn it up to 11.

Years ago on Slashdot, someone stopped an argument-verging-on-flamewar by asking, "What is your expected outcome from continuing this discussion?" The other person said, "You know, I have no idea," and the thread ended.

I think about that a lot. There are all kinds of tests you can apply to the posts and comments you're about to write. "Is it true, is it useful, is it kind?" is a useful heuristic, but it doesn't cover all the possibilities. Most of the time, I'd like anything I post to check at least one of those boxes. Two is desirable, and all three is excellent. Still, this leaves a lot of wiggle room.

"What do I expect?" covers *everything*, if you stop to think and give yourself an honest answer. Of course, if you're looking for a fight, it's easy to write something which meets that expectation! But if you're not—I'm not, most of the time, and neither are the people whose posts I want to read—asking this question and following it through can save everyone a lot of grief.

This is an ideal. I don't always live up to it. The closer I come, though, the happier and saner I am. For what it's worth, and fully aware of my own partial hypocrisy, I recommend keeping it in mind.

Statistics inside baseball. Read on if you want.

TIL that to fit a regression model for relative risk, you can use Poisson regression instead of the much finickier binomial regression with a log link. The first works on pretty much any reasonable data set. The second will fail about a quarter of the time, and if it works it will complain all the while.

Oh, and the relevant paper has been out for almost twenty years [1]. A five-year-old paper [2] shows that log-link binomial estimates *even when they work* are biased, while Poisson estimates aren't. As long as you use a robust variance estimator, the standard errors, and thus the p-values and confidence intervals, are nearly the same.

I've been tearing my hair out on this project trying to find a relative risk estimator that wouldn't choke on our data, and would execute in a reasonable amount of time for a large number of variables. Scouring software archives and statistical literature. Resigning myself to running warning- and crash-prone code, which I really dislike.

And the code for doing it the right way is *simple*.

```r
library(sandwich)  # robust (sandwich) variance estimator
library(lmtest)    # coeftest()

model <- glm(response ~ predictor1 + predictor2, family = poisson)
coeftest(model, vcov = sandwich)  # robust SEs for honest p-values and CIs
```
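For the special case of a single binary exposure you can check the modified-Poisson result by hand: the Poisson MLE of the risk ratio is just the ratio of the two group risks, and the sandwich variance collapses to the familiar large-sample variance of the log risk ratio. A quick sketch in Python, with made-up counts purely for illustration (not our project's data):

```python
import math

# Hypothetical 2x2 data: (cases, total) for unexposed and exposed groups.
cases0, n0 = 100, 1000   # unexposed: risk 0.10
cases1, n1 = 200, 1000   # exposed:   risk 0.20

risk0 = cases0 / n0
risk1 = cases1 / n1

# Poisson GLM with log link on y ~ exposure: exp(beta1) is the relative risk,
# which for a single binary predictor is just the ratio of group risks.
rr = risk1 / risk0

# Robust (sandwich) SE of log(RR); for binary y the Poisson sandwich variance
# reduces to the usual large-sample log risk-ratio variance.
se_log_rr = math.sqrt((1 - risk1) / cases1 + (1 - risk0) / cases0)

lo = math.exp(math.log(rr) - 1.96 * se_log_rr)
hi = math.exp(math.log(rr) + 1.96 * se_log_rr)
print(f"RR = {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

With several predictors there's no closed form, which is where the glm-plus-coeftest approach earns its keep.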

Well. Live and learn.

[1] academic.oup.com/aje/article/1

[2] bmcmedresmethodol.biomedcentra

Perhaps a bit late to the party, but I couldn't resist. You know, Shrewsbury's been there a long time, no rush.

This turned up on my Facebook feed. My response: "Here's to all those who didn't make it to 2023, and best wishes to the others for 2024."

"And so we get a curious phenomenon in the field of science fiction: sci-fi as a community in which science and scientists are valorised and in which anti-scientific ideas spread and are celebrated. While this may at first glance seem contradictory, the idea of the scientist as a heroic individual as opposed to science as a community of practice underpins this relationship when science fiction is in the mode of being a kind of fandom of science."

Yes. Exactly this.

Grand Admiral Shaun Duke  
I am very much digging this series of posts by @CamestrosF on the climate debate in the history of #sciencefiction publishing. From issues of OMNI ...

'Viruses also affect ecosystem processes, however, by lysing microbes and causing the release of nutrients (i.e., the viral shunt) and through the indirect consequences of host mortality (1, 2). Both of these research domains place viruses as the top “predator” in their food chains, but like most predators, viruses also can serve as food.'

pnas.org/doi/10.1073/pnas.2215

Go straight ahead. It's important to keep balance in your life.

Seen in the wild: "On the bright side, Musk's genius business decisions might just make him a millionaire."

The first TV ad for , ever. Not an exaggeration to say the world changed that day.

youtube.com/watch?v=0XuW964FDJ

Then there are fledglings! If crows trust you, they will introduce you to their young ones.

Nothing is better.

Nothing.


Unless of course they think your evil scheme sounds like fun, in which case have at.

Carl T. Bergstrom  
Which brings me to a warning. Tempting as it can be, under no circumstances should you use the instructions I’ve provided here to assemble your own...
Qoto Mastodon
