I released a new version of delicatessen 🥪
This release adds generalized additive models (GAMs) as built-in estimating equations, plus utility functions for splines and regression model predictions.
https://pypi.org/project/delicatessen/1.1/
You can view an example of the new functionality here
https://github.com/pzivich/Delicatessen/blob/main/examples/Generalized-Additive-Model.ipynb
The more interesting contribution is the proposal of a way to combine statistical (e.g., g-methods) and simulation (e.g., mechanistic, mathematical, microsimulation) models
Here, I will review the basic idea / motivation
In the paper, we have an illustrative example in STI testing. We want to generalize a trial to a clinic population. However, the trial was conducted only among men, whereas the clinic includes both men and women
This violation of positivity prevents us from transporting
But let's consider the following structural model (where W=1 is women) provided in the image. Were this model known, then we could transport. However, we are only able to estimate the red part of the model using the data...
So we propose using a simulation model to fill in the blue component. This simulation model is driven by external knowledge
In the paper, we show how the other two approaches to addressing positivity are special cases of the synthesis approach. G-computation and IPW estimators are proposed. Both are applied to an illustrative example and in simulations (code at link below)
https://github.com/pzivich/publications-code/tree/master/TransportNoPositivity
A new pre-print. Here, we review ways to address transportability problems when positivity is violated (i.e., there is a covariate that does not overlap between populations). We also propose a new way: a synthesis of statistical & simulation modeling
New blog post: https://statsepi.substack.com/p/everybodys-backyard
Going to SER 2023? Consider attending our workshop on M-estimation, where we will go through the basics and show you how to program/implement logistic regression, g-formula, IPW, and others as M-estimators.
You can see all the workshop offerings here:
https://epiresearch.org/annual-meeting/2023-meeting/2023-workshops/
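For a flavor of what programming an M-estimator involves, here is a minimal sketch for the simplest case, the mean. It is not taken from the workshop materials: the data, the function names (`psi`, `solve_root`, `sandwich_variance`), and the bisection root-finder are all illustrative assumptions. The idea is to find the parameter value where the summed estimating function equals zero, then estimate the variance with the empirical sandwich.

```python
def psi(y_i, theta):
    """Estimating function for the mean: psi(Y_i; theta) = Y_i - theta."""
    return y_i - theta


def solve_root(y, lo, hi, tol=1e-10):
    """Bisection search for the theta where sum_i psi(Y_i; theta) = 0."""
    def total(theta):
        return sum(psi(y_i, theta) for y_i in y)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if total(mid) > 0:  # the sum decreases in theta, so the root is above mid
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sandwich_variance(y, theta, eps=1e-6):
    """Empirical sandwich variance for a one-parameter M-estimator."""
    n = len(y)
    # Bread: minus the average derivative of psi (central difference)
    bread = -sum(
        (psi(y_i, theta + eps) - psi(y_i, theta - eps)) / (2 * eps) for y_i in y
    ) / n
    # Meat: average of the squared estimating function
    meat = sum(psi(y_i, theta) ** 2 for y_i in y) / n
    return meat / bread ** 2 / n


y = [1, 0, 1, 1, 0, 1, 0, 1]  # hypothetical binary outcomes
theta_hat = solve_root(y, min(y), max(y))
var_hat = sandwich_variance(y, theta_hat)
print(f"estimate: {theta_hat:.3f}, sandwich variance: {var_hat:.4f}")
```

Logistic regression, g-formula, and IPW follow the same recipe with a vector of parameters and stacked estimating functions, which is where a dedicated root-finder and matrix algebra become necessary.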
@willball12 fantastic! looking forward to it
@zpneal you can also think about it using sets. The set of swans contains no black objects. Similarly, the set of black objects contains no swans (otherwise the set of swans would contain a black object)
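The set argument can be checked in a couple of lines. This is only a toy sketch with made-up set members: "contains no black objects" is read as the two sets being disjoint, and disjointness does not depend on which set you start from.

```python
# Hypothetical members, for illustration only
swans = {"swan_a", "swan_b", "swan_c"}
black_objects = {"raven", "coal", "tire"}

# "The set of swans contains no black objects"
no_black_swans = swans.isdisjoint(black_objects)

# "The set of black objects contains no swans"
no_swans_among_black = black_objects.isdisjoint(swans)

# Both statements say the intersection is empty, so they must agree
print(no_black_swans, no_swans_among_black, swans & black_objects == set())
```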
@willball12 yes, IPTW definitely counts! Thanks
Yeah, that's been my experience. It's a shame since applied examples would be helpful for teaching.
@statsepi it is a good question for what level of responsibility we have.
But I don't think there would be much listening (unless it was clear fraud), which is disappointing
@Protzko to summarize, the dependence is an artifact, but applies to any effect measure we select. When we talk about heterogeneity, we need to be careful as it is always in reference to the scale of the effect measure
@Protzko the plot shows the relationship between measures. The gray solid lines indicate homogeneity (there are an infinite number of lines, just a few shown)
For there to be no heterogeneity (for effect measures conditional on a trait, like the CATE), points have to lie on a gray line. However, that cannot hold on both scales, as shown by the red dots
@Protzko sure! Consider measuring the effect of A on Y. We could define the effect as E[Y|A=1] - E[Y|A=0] or E[Y|A=1] / E[Y|A=0]. The first is on the additive scale and the second is on the multiplicative scale. So, it's more like the measure of effect (rather than the measures). In the linked paper, they use the CATE, which is additive
@Protzko well, treatment heterogeneity is scale-dependent, so I think it's a little more complicated. If a covariate has (1) some effect on the outcome & (2) homogeneity on one scale, then it must be heterogeneous on another scale (additive vs. multiplicative).
L'Abbe plots are a nice visualization of why this must be the case
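A quick numeric check of the scale-dependence point, as a sketch. The risks below are made up purely for illustration (they are not from the linked paper), and the names `risks`, `risk_difference`, and `risk_ratio` are hypothetical. The trait changes the baseline risk, and the effect is set to be identical across strata on the additive scale.

```python
# Hypothetical risks E[Y | A=a, trait], chosen only for illustration
risks = {
    "trait=0": {"a1": 0.30, "a0": 0.10},
    "trait=1": {"a1": 0.60, "a0": 0.40},
}


def risk_difference(r):
    """Additive scale: E[Y|A=1] - E[Y|A=0]."""
    return r["a1"] - r["a0"]


def risk_ratio(r):
    """Multiplicative scale: E[Y|A=1] / E[Y|A=0]."""
    return r["a1"] / r["a0"]


rd = {k: risk_difference(v) for k, v in risks.items()}
rr = {k: risk_ratio(v) for k, v in risks.items()}

for stratum in risks:
    print(f"{stratum}: difference = {rd[stratum]:.2f}, ratio = {rr[stratum]:.2f}")
```

The difference is the same in both strata, but the ratio is not. The only way to get homogeneity on both scales here would be for the trait to leave the baseline risk unchanged, which is condition (1) failing.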
2.5 years later I went back and answered my own question. Nice to see that I've learned things in that time lol
https://stats.stackexchange.com/questions/474851/variance-for-a-doubly-robust-cate-estimator
RT @eleanorapower
This summer, I'll be running a 3-week course on #social #network analysis with the excellent @tsvetkovadotme as part of the @LSEnews #SummerSchool. In short order, we'll get you working in #R with real-world network datasets! Please RT! https://www.lse.ac.uk/study-at-lse/summer-schools/summer-school/courses/research-methods/me202
@nickchk it's not too often, but I do run into cases where multi-indexing is helpful
Mostly it can be helpful for keeping repeated observation data, where I can put the dual ID columns 'outside' of the data set.
Another use case is generating complex Table 1s to be output directly to Excel
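A minimal sketch of the repeated-observation case, assuming the multi-indexing in question is pandas' `MultiIndex`. The column names and values are hypothetical. Both ID columns move into the index, so only the measurements are left as columns.

```python
import pandas as pd

# Hypothetical long-format data: two participants, three visits each
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "visit": [0, 1, 2, 0, 1, 2],
    "cd4": [350, 410, 450, 500, 480, 520],
})

# Put the dual ID columns 'outside' the data by making them a MultiIndex
panel = df.set_index(["id", "visit"]).sort_index()

# All visits for participant 2 (result is indexed by visit)
p2 = panel.loc[2]

# A single participant-visit measurement
one = panel.loc[(1, 2), "cd4"]

# Visit 0 for every participant (result is indexed by id)
baseline = panel.xs(0, level="visit")

print(panel)
```

The same structure works on the column axis for nested table headers, and `DataFrame.to_excel` writes multi-level headers out as merged cells by default (it needs an Excel writer engine such as openpyxl installed).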
Paul Zivich. Computational epidemiologist, causal inference researcher, and open-source enthusiast #epidemiology #statistics #python