I released a new version of delicatessen 🥪

This release adds generalized additive models (GAM) as built-in estimating equations, and utility functions for splines and regression model predictions

pypi.org/project/delicatessen/

You can view an example of the new functionality here

github.com/pzivich/Delicatesse

The more interesting contribution is the proposal of a way to combine statistical modeling (e.g., g-methods) with simulation modeling (e.g., mechanistic, mathematical, or microsimulation models)

Here, I will review the basic idea / motivation

In the paper, we have an illustrative example in STI testing. We want to generalize a trial to a clinic population. However, the trial was conducted only among men, while the clinic includes both men and women

This violation of positivity prevents us from transporting
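A minimal sketch of what the positivity violation looks like in data. The covariate values below are hypothetical and are not from the paper; the check simply asks whether every covariate level in the clinic also appears in the trial.

```python
# Hypothetical covariate values (illustration only, not the paper's data)
trial_sex = ["man", "man", "man", "man"]
clinic_sex = ["man", "woman", "woman", "man", "woman"]

# Covariate levels present in the target (clinic) but absent from the trial
unsupported = set(clinic_sex) - set(trial_sex)

# Positivity (for this single covariate) requires that nothing is unsupported
positivity_holds = len(unsupported) == 0

print(unsupported)
print(positivity_holds)
```

Here the clinic contains a level with no trial counterpart, so no amount of reweighting or modeling of the trial data alone can inform that stratum.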

But let's consider the following structural model (where W=1 is women) provided in the image. Were this model known, then we could transport. However, we are only able to estimate the red part of the model using the data...

So we propose using a simulation model to fill in the blue component. This simulation model is driven by external knowledge

In the paper, we show how the other two approaches to addressing positivity violations are special cases of the synthesis approach. We propose g-computation and IPW estimators, and apply both to an illustrative example and in simulations (code at link below)

github.com/pzivich/publication

A new pre-print. Here, we review ways to address transportability problems when positivity is violated (i.e., there is a covariate that does not overlap between populations). We also propose a new way: a synthesis of statistical & simulation modeling

arxiv.org/abs/2303.01572

Going to SER 2023? Consider attending our workshop on M-estimation, where we will go through the basics and show you how to program/implement logistic regression, g-formula, IPW, and others as M-estimators.

You can see all the workshop offerings here:
epiresearch.org/annual-meeting

@zpneal you can also think about it using sets. The set of swans contains no black objects. Similarly, the set of black objects contains no swans (otherwise the set of swans would contain a black object)

@willball12 yes, IPTW definitely counts! Thanks

Yeah, that's been my experience. It's a shame since applied examples would be helpful for teaching.

does anyone know of a paper that (1) has open-access data, and (2) addresses time-varying confounding with one of the g-methods?

@statsepi it is a good question for what level of responsibility we have.

But I don't think there would be much listening (unless it was clear fraud), which is disappointing

Great to see we are doing such a good job training the next generation of noise miners. I despair.

@Protzko to summarize, the dependence is an artifact, but it applies to any effect measure we select. When we talk about heterogeneity, we need to be careful, as it is always in reference to the scale of the effect measure

@Protzko the plot shows the relationship between measures. The gray solid lines indicate homogeneity (there are an infinite number of lines, just a few shown)

For there to be no heterogeneity (in effect measures conditional on a trait, like the CATE), the points have to lie on a gray line. However, that cannot hold on both scales, as shown by the red dots

@Protzko sure! Consider measuring the effect of A on Y, we could define the effect as E[Y|A=1] - E[Y|A=0] or E[Y|A=1] / E[Y|A=0]. The first is on the additive scale and the second is on the multiplicative. So, it's more like the measure of effect (rather than the measures). In the linked paper, they use the CATE, which is additive
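As a rough sketch of the two scales: the means below are made-up numbers for illustration (not from the linked paper), and the function names are hypothetical.

```python
# Hypothetical conditional means, chosen only for illustration
mean_y_a1 = 0.30  # E[Y | A=1]
mean_y_a0 = 0.20  # E[Y | A=0]


def additive_effect(y1, y0):
    """Effect on the additive scale: E[Y|A=1] - E[Y|A=0]."""
    return y1 - y0


def multiplicative_effect(y1, y0):
    """Effect on the multiplicative scale: E[Y|A=1] / E[Y|A=0]."""
    return y1 / y0


diff = additive_effect(mean_y_a1, mean_y_a0)
ratio = multiplicative_effect(mean_y_a1, mean_y_a0)

print(round(diff, 3), round(ratio, 3))
```

Both numbers summarize the same pair of means; they only differ in the scale used to contrast them.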

@Protzko well, treatment heterogeneity is scale-dependent, so I think it's a little more complicated. If a covariate has (1) some effect on the outcome & (2) homogeneity on one scale, then it must be heterogeneous on the other scale (additive vs. multiplicative).

L'Abbe plots are a nice visualization of why this must be the case
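A small numeric sketch of the same point. The risks are hypothetical: a binary covariate V changes the baseline risk, and the treatment effect is assumed to be homogeneous on the additive scale.

```python
# Hypothetical baseline risks E[Y | A=0, V=v]; V affects the outcome
baseline_risk = {0: 0.10, 1: 0.20}
common_difference = 0.10  # assumed homogeneous additive effect

differences = {}
ratios = {}
for v, risk0 in baseline_risk.items():
    risk1 = risk0 + common_difference  # E[Y | A=1, V=v]
    differences[v] = risk1 - risk0     # additive scale
    ratios[v] = risk1 / risk0          # multiplicative scale

print({v: round(d, 3) for v, d in differences.items()})
print({v: round(r, 3) for v, r in ratios.items()})
```

The differences match across levels of V, but the ratios do not, so the same data are homogeneous on one scale and heterogeneous on the other.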

There are only three guarantees in life: death, taxes, and Windows updates that worsen their products

2.5 years later I went back and answered my own question. Nice to see that I've learned things in that time lol

stats.stackexchange.com/questi

RT @eleanorapower
This summer, I'll be running a 3-week course on #social #network analysis with the excellent @tsvetkovadotme as part of the @LSEnews #SummerSchool. In short order, we'll get you working in #R with real-world network datasets! Please RT! lse.ac.uk/study-at-lse/summer-

@nickchk it's not too often, but I do run into cases where multi-indexing is helpful

Mostly it can be helpful for keeping repeated observation data, where I can put the dual ID columns 'outside' of the data set.

Another use-case is generating complex table 1's to be directly output to Excel
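A minimal sketch of the repeated-observation case in pandas. The column names and values are hypothetical; the idea is just that the two ID columns move into the index, leaving only measurements as columns.

```python
import pandas as pd

# Hypothetical long-format data: two participants, three visits each
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "visit": [0, 1, 2, 0, 1, 2],
    "cd4": [350, 410, 460, 500, 480, 520],
})

# Move both ID columns 'outside' the data by making them a MultiIndex
indexed = df.set_index(["id", "visit"])

# All visits for participant 1 (result is indexed by visit)
one_person = indexed.loc[1]

# The baseline visit for every participant (result is indexed by id)
baseline = indexed.xs(0, level="visit")

print(indexed)
print(baseline)
```

With the IDs in the index, column-wise operations touch only the measurements, and selecting by participant or by visit is a single lookup.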
