Anyone with a statistics background:

I have a record of several hundred events. Each event occurred at a particular univariate condition x and had a binary outcome y. I don't get to choose x; about a hundred values occurred exactly once, up to a maximum of fourteen repetitions at one value of x (to measurement precision). The samples are roughly clustered around a central value of x, not uniformly distributed.

Is there a recommended way to estimate the local probability of y as a function of x (that is, if I measure the conditions as x=X, how likely is it that y=1)? Simply averaging all samples at x=X doesn't give a usable curve, because the single-sample values swing it wildly to zero or one, regardless of what any neighbouring samples do. Currently I'm summing the average over all samples where x<=X and the average over all samples where x>=X, then subtracting the average over all x. The result looks more or less like the smooth curve predicted by theory, but I'm pretty sure this counts as "misuse of statistics".
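One standard answer to this is kernel smoothing (a Nadaraya–Watson estimator): at each query point, take a distance-weighted average of the nearby outcomes, so isolated single samples can't drag the curve to 0 or 1 on their own. A minimal sketch, assuming NumPy and made-up logistic data (the bandwidth value here is arbitrary and would need tuning to your data):

```python
import numpy as np

def smoothed_probability(x_obs, y_obs, x_grid, bandwidth):
    """Nadaraya-Watson estimate of P(y=1 | x) with a Gaussian kernel.

    Each grid point is a weighted average of all observed outcomes,
    with weights decaying with distance in x, so neighbouring samples
    always contribute and single samples no longer dominate.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    # diffs[i, j] = scaled distance between grid point i and sample j
    diffs = (x_grid[:, None] - x_obs[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return weights @ y_obs / weights.sum(axis=1)

# Hypothetical data: a few hundred events clustered around x = 0,
# with a logistic "true" probability (stand-in for the real record).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=300)
y = (rng.random(300) < 1.0 / (1.0 + np.exp(-2.0 * x))).astype(int)

grid = np.linspace(-3.0, 3.0, 61)
p_hat = smoothed_probability(x, y, grid, bandwidth=0.4)
```

A smaller bandwidth follows the data more closely but gets noisier where samples are sparse; because the x values cluster around a central value, an adaptive bandwidth (e.g. k-nearest-neighbour) may behave better in the tails.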

I'm especially interested in identifying regions (intervals on x) where I can say the observed probability differs from the prediction of theory by a statistically significant amount. I think coming up with a formula for a confidence interval would be the way to go, but feel free to point me in another direction.
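A confidence interval is a reasonable way to frame this. One simple version: at each query point, pool all samples in a window around it, compute a Wilson score interval for the pooled proportion, and flag the point if the interval excludes the theoretical prediction. A sketch under those assumptions (the `theory` callable and the window half-width are placeholders for your actual model and scale; note the multiple-comparisons caveat if you scan many windows):

```python
import numpy as np
from math import sqrt

def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion k/n."""
    if n == 0:
        return 0.0, 1.0  # no data: the interval is uninformative
    p = k / n
    denom = 1.0 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

def significant_regions(x_obs, y_obs, x_grid, halfwidth, theory):
    """Flag grid points whose windowed Wilson interval excludes theory(x).

    `theory` maps x to the predicted probability (hypothetical stand-in
    for the theoretical curve). Each window is [x - halfwidth, x + halfwidth].
    """
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    flags = []
    for xg in x_grid:
        mask = np.abs(x_obs - xg) <= halfwidth
        n = int(mask.sum())
        k = int(y_obs[mask].sum())
        lo, hi = wilson_interval(k, n)
        flags.append(not (lo <= theory(xg) <= hi))
    return np.array(flags)

# Toy usage: fair-coin data tested against a constant theory of 0.5.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=200)
y = (rng.random(200) < 0.5).astype(int)
flags = significant_regions(x, y, np.linspace(-1, 1, 5), 0.5, lambda t: 0.5)
```

Contiguous runs of flagged grid points are then your candidate intervals of disagreement. Since each window is a separate test, scanning many of them inflates the false-positive rate, so some correction (or a pre-registered set of intervals) is worth considering before claiming significance.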
