**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:11

@tslumley recently posted a reminder that in weighted sampling without replacement, the probability of inclusion is usually not proportional to the weight. @peter_ellis and @rstub posted that they were surprised and wrote their own nice blog posts on the topic.

In importance resampling, this property of weighted sampling without replacement has been considered beneficial. Right now I don't have time to write a longer blog post with code examples, so here is just a short thread 🧵

1/n

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:11

@tslumley @peter_ellis @rstub In importance resampling with replacement, if one of the weights dominates, the number of unique draws can be very small, or even one. High variability of the weights leads to high (possibly infinite) variance of the importance resampling estimate. 2/n
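
A minimal sketch of the point above, using only the Python standard library. The weights are hypothetical (one draw carries 99% of the mass) and the helper name `unique_draws_with_replacement` is made up for illustration; it only counts how many distinct draws survive resampling with replacement.

```python
import random

def unique_draws_with_replacement(weights, k, rng):
    """Resample k indices with replacement and count the distinct ones."""
    idx = rng.choices(range(len(weights)), weights=weights, k=k)
    return len(set(idx))

rng = random.Random(1)
S = 1000

# Hypothetical weights: index 0 carries 99% of the total mass.
dominated = [0.99] + [0.01 / (S - 1)] * (S - 1)
# Reference case: all weights equal.
uniform = [1.0 / S] * S

n_dominated = unique_draws_with_replacement(dominated, S, rng)
n_uniform = unique_draws_with_replacement(uniform, S, rng)
print("unique draws, dominated weights:", n_dominated)
print("unique draws, uniform weights:  ", n_uniform)
```

With the dominated weights, almost every resampled index is the same draw, so the resample carries very little information compared to the uniform case.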

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:12

@tslumley @peter_ellis @rstub Assuming we resample k times from a sample of size S (k < S), sampling without replacement constrains the probability of inclusion to be less than or equal to 1/k. This introduces bias, but Skare, Bolviken, and Holden (2003) showed that it reduces the variance so much that the mean square error is better than with replacement! The downside of using sampling without replacement to reduce variance is that we need k<S. 3/n
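
As an illustration of the constraint described above, here is a sketch of sequential weighted sampling without replacement (draw one index proportionally to the remaining weights, remove it, repeat). This is one common scheme and not necessarily the one analysed by Skare et al.; the function name and the weights are assumptions made for the example. Because each index can be picked at most once, no single draw can make up more than a 1/k share of the resample.

```python
import random

def resample_without_replacement(weights, k, rng):
    """Draw k distinct indices, each proportionally to the remaining weights."""
    if k > len(weights):
        raise ValueError("sampling without replacement needs k <= S")
    remaining = list(range(len(weights)))
    w = list(weights)
    chosen = []
    for _ in range(k):
        # Pick a position among the not-yet-chosen indices.
        pos = rng.choices(range(len(remaining)), weights=w, k=1)[0]
        chosen.append(remaining.pop(pos))
        w.pop(pos)
    return chosen

rng = random.Random(2)
S, k = 1000, 100
# Hypothetical weights: index 0 carries 99% of the total mass.
weights = [0.99] + [0.01 / (S - 1)] * (S - 1)

without = resample_without_replacement(weights, k, rng)
with_repl = rng.choices(range(S), weights=weights, k=k)

# Share of the resample taken up by the dominating draw.
share_without = without.count(0) / k
share_with = with_repl.count(0) / k
print("share of index 0 without replacement:", share_without)
print("share of index 0 with replacement:   ", share_with)
```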

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:12

@tslumley @peter_ellis @rstub Instead of resampling k<S without replacement to constrain the inclusion probability to 1/k, Ionides (2008) proposed truncating the highest importance weights to 1/sqrt(S). We can then resample k=S with replacement and get a similar reduction in variance and mean square error. 4/n
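
A rough sketch of the truncation idea, assuming the weights are first normalized to sum to one so that the cap is 1/sqrt(S) as stated above. The renormalization after capping and the helper name `truncate_weights` are assumptions made for this example, not details taken from Ionides (2008).

```python
import math
import random

def truncate_weights(weights):
    """Cap normalized weights at 1/sqrt(S), then renormalize to sum to one."""
    S = len(weights)
    total = sum(weights)
    cap = 1.0 / math.sqrt(S)
    capped = [min(w / total, cap) for w in weights]
    capped_total = sum(capped)
    return [w / capped_total for w in capped]

S = 100
# Hypothetical weights: index 0 carries half of the total mass.
weights = [0.5] + [0.5 / (S - 1)] * (S - 1)
truncated = truncate_weights(weights)
print("largest weight before:", max(weights))
print("largest weight after: ", max(truncated))

# With the truncated weights we can resample k = S draws with replacement.
rng = random.Random(3)
resample = rng.choices(range(S), weights=truncated, k=S)
```

In this toy case the largest normalized weight drops from 0.5 to roughly 0.17, at the cost of the bias that truncation introduces.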

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:13

@tslumley @peter_ellis @rstub We (Vehtari et al., 2024, https://jmlr.org/papers/v25/19-556.html) proposed Pareto smoothing to stabilize importance weights, which improves over simple truncation. Modifying the weights adds bias, but in many cases the bias is negligible, and the approach includes a self-diagnostic to warn when the bias is non-negligible. 5/n

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:13

@tslumley @peter_ellis @rstub Finally, when k=S, Kitagawa (1996) presented stratified and deterministic resampling, and Liu (2001) presented residual resampling, which all have smaller variance than simple random resampling (with replacement). 6/6
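
For readers who want something concrete, below is a sketch of two of the schemes mentioned above, written from their usual textbook descriptions rather than from the cited papers. Stratified resampling draws one uniform point in each of k equal-width strata of [0, 1) and inverts the weight CDF; residual resampling keeps floor(k * w_i) copies of each draw deterministically and fills the remaining slots by sampling with replacement from the fractional parts. Function names and the toy weights are assumptions.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def stratified_resample(weights, k, rng):
    """One uniform point per stratum [j/k, (j+1)/k), mapped through the CDF."""
    total = sum(weights)
    cdf = list(accumulate(w / total for w in weights))
    cdf[-1] = 1.0  # guard against floating-point round-off
    last = len(weights) - 1
    points = [(j + rng.random()) / k for j in range(k)]
    return [min(bisect_right(cdf, u), last) for u in points]

def residual_resample(weights, k, rng):
    """Deterministic floor(k * w_i) copies, then multinomial on the residuals."""
    total = sum(weights)
    scaled = [k * w / total for w in weights]
    counts = [int(s) for s in scaled]  # floor, since scaled values are >= 0
    residuals = [s - c for s, c in zip(scaled, counts)]
    out = [i for i, c in enumerate(counts) for _ in range(c)]
    n_rest = k - len(out)
    if n_rest > 0:
        out += rng.choices(range(len(weights)), weights=residuals, k=n_rest)
    return out

rng = random.Random(4)
weights = [0.5, 0.25, 0.25]  # toy weights, chosen so that k * w_i are integers
k = 8
strat = stratified_resample(weights, k, rng)
resid = residual_resample(weights, k, rng)
print("stratified counts:", [strat.count(i) for i in range(3)])
print("residual counts:  ", [resid.count(i) for i in range(3)])
```

In this toy case every k * w_i is an integer, so both schemes return exactly 4, 2, and 2 copies with no randomness left in the counts, whereas simple random resampling with replacement would give counts that vary from run to run.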

**tobychev** @tobychev@qoto.org · Sep 10, 2024, 09:25

@avehtari
did you mean to write k > S in toot #4?
@tslumley @peter_ellis @rstub

**Aki Vehtari** @avehtari@bayes.club · Sep 10, 2024, 09:42

@tobychev @tslumley @peter_ellis @rstub No, I meant to write k=S, because that's what I do. It works with k>=S, too, but I'm not aware of a case where that would be useful.
