🔥 take: the Weibull distribution is bad. There should not be so many bespoke parameterizations for a single distribution
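
To make the complaint concrete, here is a small sketch of three common ways the Weibull survival function gets written. Which parameterizations the post has in mind is not stated, so the three forms and the function names below are illustrative assumptions. All three describe the same curve once the parameters are converted.

```python
import numpy as np


def surv_scale(t, shape, scale):
    # Scale form: S(t) = exp(-(t / scale)**shape)
    return np.exp(-(t / scale) ** shape)


def surv_rate(t, shape, rate):
    # Rate form: S(t) = exp(-(rate * t)**shape), with rate = 1 / scale
    return np.exp(-(rate * t) ** shape)


def surv_ph(t, shape, lam):
    # Proportional-hazards form: S(t) = exp(-lam * t**shape),
    # with lam = scale**(-shape)
    return np.exp(-lam * t ** shape)


t = np.linspace(0.1, 5, 50)
shape, scale = 1.5, 2.0

s1 = surv_scale(t, shape, scale)
s2 = surv_rate(t, shape, 1 / scale)
s3 = surv_ph(t, shape, scale ** -shape)

# Same distribution, three different sets of parameter values
print(np.allclose(s1, s2), np.allclose(s1, s3))
```

The practical hazard is that a "scale" of 2.0 from one package means something different when plugged into another without conversion.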

I added badges to my repository of publicly available code. Each badge links to the corresponding article

I'm working on adding automatic differentiation to `delicatessen` (to compute the variance exactly instead of approximating it). Still a work in progress but if you have time to test it out, it would help me out a lot

github.com/pzivich/Delicatesse

One of the things that absolutely wrecks my brain is that when we look into space, we are effectively staring into the past

Ahh yes, thank you ResearchGate. The paper I wrote is probably related to my interests, I'll be sure to read it

If I were running a large NIH/NSF funded lab, I would be FOIA'ing as many funded grants as possible, having an army of grad students convert them to correctly formatted text files, and then tuning LLaMA to help churn out grants

it's easy to lie with data, but it's even easier to lie with anecdotes

it's causal, you see, because it only predicts the *next* word, rather than predicting words *in* the prompt. so the prediction is unidirectional and thus causal :)

Here's a surreal (at least to me) story. So, I was messing around with the coding capabilities of GPT-3. I asked it to code TMLE in Python for me.

The weird part was that GPT started using the Python library I wrote to do TMLE. However, it got the syntax and how the functions work wildly wrong. Even the import statements were not correct

So if you're using GPT to code, you better be familiar with the libraries it calls (and check that it can even call them correctly)

automatic differentiation is such a cool tool / application of the chain rule.

Having code to return solutions (up to floating point error) without approximation through recursive calls is such a neat thing to implement
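
A minimal sketch of the idea, assuming forward-mode automatic differentiation with dual numbers and operator overloading. This is an illustration only, not how `delicatessen` implements it, and the `Dual` class, `exp`, and `derivative` names are hypothetical.

```python
import math


class Dual:
    """A number value + deriv * eps, where eps**2 = 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative zero)
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def exp(x):
    # Chain rule: d/dx exp(u) = exp(u) * u'
    e = math.exp(x.value)
    return Dual(e, e * x.deriv)


def derivative(func, x):
    # Seed the input with derivative 1 and read the derivative off the output
    out = func(Dual(x, 1.0))
    return out.value, out.deriv


def f(x):
    return x * x * x + 2 * exp(3 * x)


value, slope = derivative(f, 0.5)
exact = 3 * 0.5**2 + 6 * math.exp(1.5)   # hand-derived f'(0.5)
print(abs(slope - exact))
```

Each operation carries its own derivative rule, so the chain rule is applied one step at a time as the expression is evaluated. No step size is involved, which is why the result matches the hand-derived derivative up to floating point error.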

Lately, I've had a lot of fun trying to think about how to vectorize functions. Here is a function for applying the central difference method for multivariable functions

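import numpy as np

def compute_gradient(func, x, epsilon=1e-6):
    # func must accept an array whose columns are separate evaluation points
    x = np.asarray(x)
    input_shape = x.shape[0]
    h = np.identity(input_shape) * epsilon   # one perturbation per input dimension
    u = (x + h).T                            # column j: x with epsilon added to element j
    l = (x - h).T                            # column j: x with epsilon subtracted from element j

    # Central difference for every partial derivative in a single vectorized call
    gradient = (func(u) - func(l)) / (2 * epsilon)

    return gradient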
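
A short usage sketch for the function above, restated here so the snippet runs on its own. The test function `f` and the evaluation point are made up for illustration. The one requirement worth noting is that `func` has to be written in terms of `x[0]`, `x[1]`, ... so that it still works when each of those is an array rather than a scalar.

```python
import numpy as np


def compute_gradient(func, x, epsilon=1e-6):
    # Same logic as the function above
    x = np.asarray(x)
    h = np.identity(x.shape[0]) * epsilon
    u = (x + h).T   # column j is x with epsilon added to element j
    l = (x - h).T   # column j is x with epsilon subtracted from element j
    return (func(u) - func(l)) / (2 * epsilon)


def f(x):
    # Works whether x[0] and x[1] are scalars or arrays
    return x[0]**2 + 3 * x[0] * x[1] + np.sin(x[1])


point = [1.0, 2.0]
numeric = compute_gradient(f, point)

# Hand-derived gradient: (2*x0 + 3*x1, 3*x0 + cos(x1))
analytic = np.array([2 * 1.0 + 3 * 2.0, 3 * 1.0 + np.cos(2.0)])

print(np.max(np.abs(numeric - analytic)))
```

The function is evaluated only twice, once on all the upper points and once on all the lower points, instead of looping over each input dimension.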

I released a new version of delicatessen 🥪

This release adds generalized additive models (GAM) as built-in estimating equations, and utility functions for splines and regression model predictions

pypi.org/project/delicatessen/

You can view an example of the new functionality here

github.com/pzivich/Delicatesse

The more interesting contribution is the proposal of a way to combine statistical (e.g., g-methods) and simulation (e.g., mechanistic, mathematical, microsimulation) models

Here, I will review the basic idea / motivation

In the paper, we have an illustrative example in STI testing. We want to generalize a trial to a clinic population. However, the trial was conducted only among men, whereas the clinic includes men and women

This violation of positivity prevents us from transporting

But let's consider the following structural model (where W=1 is women) provided in the image. Were this model known, then we could transport. However, we are only able to estimate the red part of the model using the data...

So we propose using a simulation model to fill in the blue component. This simulation model is driven by external knowledge

In the paper, we show how the other two approaches to addressing positivity are special cases of the synthesis approach. G-computation and IPW estimators are proposed. Both are applied to an illustrative example and in simulations (code at link below)

github.com/pzivich/publication

A new pre-print. Here, we review ways to address transportability problems when positivity is violated (i.e., there is a covariate that does not overlap between populations). We also propose a new way: a synthesis of statistical & simulation modeling

arxiv.org/abs/2303.01572

Going to SER 2023? Consider attending our workshop on M-estimation, where we will go through the basics and show you how to program/implement logistic regression, g-formula, IPW, and others as M-estimators.

You can see all the workshop offerings here:
epiresearch.org/annual-meeting
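
For a flavor of what that looks like, here is a rough sketch of logistic regression written as an M-estimator using only NumPy. It is not the workshop material and does not use the `delicatessen` API. The function names, the simulated data, and the fixed number of Newton steps are all assumptions made for illustration.

```python
import numpy as np


def expit(z):
    return 1 / (1 + np.exp(-z))


def psi(theta, X, y):
    # Estimating functions (logistic score):
    # one row per parameter, one column per observation
    return ((y - expit(X @ theta))[:, None] * X).T


def fit_m_estimator(X, y, n_iter=25):
    n, k = X.shape
    theta = np.zeros(k)
    for _ in range(n_iter):
        w = expit(X @ theta) * (1 - expit(X @ theta))
        # Newton step: derivative of the summed estimating functions is -X'WX
        deriv = -(X * w[:, None]).T @ X
        theta = theta - np.linalg.solve(deriv, psi(theta, X, y).sum(axis=1))

    # Empirical sandwich variance: bread^-1 @ meat @ bread^-T / n
    w = expit(X @ theta) * (1 - expit(X @ theta))
    bread = (X * w[:, None]).T @ X / n
    ef = psi(theta, X, y)
    meat = ef @ ef.T / n
    bread_inv = np.linalg.inv(bread)
    return theta, bread_inv @ meat @ bread_inv.T / n


# Simulated data, made up purely for illustration
rng = np.random.default_rng(2023)
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = rng.binomial(1, expit(-0.5 + 1.0 * x))

theta_hat, var_hat = fit_m_estimator(X, y)
std_err = np.sqrt(np.diag(var_hat))
print(theta_hat, std_err)
```

The point estimate is the root of the summed estimating functions, and the variance comes from the sandwich formula. Other estimators such as the g-formula or IPW follow the same recipe by stacking additional estimating functions.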

does anyone know of a paper that (1) has open-access data, and (2) addresses time-varying confounding with one of the g-methods?
