@stefanforfan yeah, unfortunately the residual variation makes it hard to visually see. I suspect most relevant public health examples will have this same issue...
However, you could also view that as a benefit (i.e., we should use splines because a scatterplot may not be enough)
@stefanforfan if you use NHANES, I've seen HDL cholesterol as predicted by BMI be non-linear in a few of the survey years
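If it helps, here's a rough sketch of how I'd check that (a minimal sketch: the stand-in data, column names, and df=4 are all placeholders of mine, and in practice you'd load an NHANES cycle instead):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
# Stand-in data with a curved BMI-HDL relationship; swap in an NHANES cycle here
bmi = rng.uniform(18, 45, 2000)
hdl = 75 - 0.9 * bmi + 0.009 * bmi**2 + rng.normal(0, 12, 2000)
df = pd.DataFrame({"bmi": bmi, "hdl": hdl})

# patsy's bs() builds a B-spline basis for BMI inside the formula
spline_fit = smf.ols("hdl ~ bs(bmi, df=4)", data=df).fit()
linear_fit = smf.ols("hdl ~ bmi", data=df).fit()
print(linear_fit.aic, spline_fit.aic)  # lower AIC favors the spline when non-linear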
I'm working on adding automatic differentiation to `delicatessen` (to compute the variance exactly instead of approximating it). Still a work in progress but if you have time to test it out, it would help me out a lot
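If anyone wants to kick the tires, a run looks something like this minimal sketch (the mean/variance estimating equations are just an illustration, and deriv_method='exact' is my assumption for how the autodiff option is exposed in estimate(); check the docs for the current argument name):

import numpy as np
from delicatessen import MEstimator

y = np.array([1.2, 0.8, 2.4, 1.7, 0.3, 1.1])

def psi(theta):
    # Stacked estimating equations for the mean and variance of y
    return np.vstack([y - theta[0],
                      (y - theta[0])**2 - theta[1]])

estr = MEstimator(psi, init=[0., 1.])
# deriv_method='exact' is assumed here for the autodiff option;
# the finite-difference default is 'approx'
estr.estimate(deriv_method='exact')
print(estr.theta)     # point estimates
print(estr.variance)  # sandwich variance, with exact derivatives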
@willball12 yes, that is the 'cost'. Baseline variables are nice because we can have a more reliable ordering in time, but we do lose some precision if there is no A -> X2 (that's probably fine in most settings because I would be more worried about the arrow and less about the SE size)
@willball12 from a methodological perspective, you can adjust for things that are (1) not mediators and (2) later on the 'causal path'. In the figure, either {X1} or {X2} is a minimally sufficient set. So, we can adjust for things that are not baseline (as long as they are *not* mediators).
There is a benefit to this: adjusting for variables closest to the outcome (e.g., X2) results in estimators with the greatest precision.
In this particular case, I don't know if I reasonably believe that those aren't mediators
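To make the precision point concrete, here's a quick simulation sketch of my own (a toy version of the figure's DAG with X1 -> A, X1 -> X2, X2 -> Y, and no A -> X2, so both {X1} and {X2} are valid adjustment sets):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 10_000
x1 = rng.normal(size=n)             # baseline cause of A (and of X2)
a = x1 + rng.normal(size=n)         # treatment
x2 = x1 + rng.normal(size=n)        # post-baseline, but *not* a mediator
y = a + 2 * x2 + rng.normal(size=n)

# Both sets block the backdoor path A <- X1 -> X2 -> Y,
# but adjusting for X2 soaks up more outcome variance
for label, covariate in [("adjust X1", x1), ("adjust X2", x2)]:
    design = sm.add_constant(np.column_stack([a, covariate]))
    fit = sm.OLS(y, design).fit()
    print(label, "coef(A):", round(fit.params[1], 3),
          "SE(A):", round(fit.bse[1], 4))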
it's causal, you see, because it only predicts the *next* word, rather than predicting words *in* the prompt. so the prediction is unidirectional and thus causal :)
@ecological_fallacy A little narrower than all science, but we have a few reviews on this topic. Stuff like that seems to still be pretty common
https://academic.oup.com/aje/article/192/3/483/6658218
https://www.medrxiv.org/content/10.1101/2022.03.07.22271661v2
Here's a surreal (at least to me) story. So, I was messing around with the coding capabilities of GPT-3. I asked it to code TMLE in Python for me.
The weird part was that GPT started using the Python library I wrote to do TMLE. However, it got the syntax, and how the functions work, wildly wrong. Even the import statements were not correct
So if you're using GPT to code, you better be familiar with the libraries it calls (and whether it can even call them correctly)
Here is my attempt at disambiguating the various 'g-' terms
Lately, I've had a lot of fun trying to think about how to vectorize functions. Here is a function that applies the central difference method to approximate the gradient of a multivariable function
import numpy as np

def compute_gradient(func, x, epsilon=1e-6):
    # Approximate the gradient of func at x with central differences
    x = np.asarray(x)
    input_shape = x.shape[0]
    # Each row of h shifts one coordinate of x by epsilon
    h = np.identity(input_shape) * epsilon
    # Columns of u and l are x with one coordinate shifted up or down,
    # so func needs to be vectorized across columns
    u = (x + h).T
    l = (x - h).T
    gradient = (func(u) - func(l)) / (2 * epsilon)
    return gradient
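And a quick toy check of my own (func just needs to accept a 2-D array where each column is a point):

# f(x, y) = x**2 + 3*y, written so it works on columns of points
def f(theta):
    return theta[0]**2 + 3 * theta[1]

print(compute_gradient(f, [2.0, 1.0]))  # approximately [4., 3.]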
Paul Zivich. Computational epidemiologist, causal inference researcher, and open-source enthusiast #epidemiology #statistics #python