Lifting Weak Supervision To Structured Prediction

Weak supervision (WS) is a rich set of techniques that produce pseudolabels
by aggregating easily obtained but potentially noisy label estimates from a
variety of sources. WS is theoretically well understood for binary
classification, where simple approaches enable consistent estimation of
pseudolabel noise rates. Using this result, it has been shown that downstream
models trained on the pseudolabels have generalization guarantees nearly
identical to those trained on clean labels. While this is exciting, users often
wish to use WS for structured prediction, where the output space consists of
more than a binary or multi-class label set: e.g., rankings, graphs, manifolds,
and more. Do the favorable theoretical properties of WS for binary
classification lift to this setting? We answer this question in the affirmative
for a wide range of scenarios. For labels taking values in a finite metric
space, we introduce techniques new to weak supervision based on
pseudo-Euclidean embeddings and tensor decompositions, providing a
nearly-consistent noise rate estimator. For labels in constant-curvature
Riemannian manifolds, we introduce new invariants that also yield consistent
noise rate estimation. In both cases, when using the resulting pseudolabels in
concert with a flexible downstream model, we obtain generalization guarantees
nearly identical to those for models trained on clean data. Several of our
results, which can be viewed as robustness guarantees in structured prediction
with noisy labels, may be of independent interest. Empirical evaluation
validates our claims and shows the merits of the proposed method.
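To ground the binary-classification result the abstract builds on, here is a minimal sketch of how labeler accuracies (and hence pseudolabel noise rates) can be consistently estimated from unlabeled data alone via a triplet method of moments. This is an illustrative toy, not the paper's method: it assumes balanced classes, conditional independence of the three labelers given the true label, and better-than-random accuracies; all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
# True binary labels y in {-1, +1}, balanced classes (assumption).
y = rng.choice([-1, 1], size=n)

# Three conditionally independent labeling functions with hypothetical accuracies.
true_acc = np.array([0.85, 0.75, 0.65])
votes = np.stack([
    np.where(rng.random(n) < a, y, -y) for a in true_acc
])  # shape (3, n): each row is one labeler's noisy votes

# Pairwise second moments: E[l_i * l_j] = (2a_i - 1)(2a_j - 1)
# under conditional independence and balanced classes.
m = votes @ votes.T / n

# Triplet recovery: (2a_i - 1)^2 = m_ij * m_ik / m_jk,
# so each accuracy is identified without ever seeing y.
est = np.empty(3)
for i in range(3):
    j, k = [x for x in range(3) if x != i]
    est[i] = 0.5 * (1 + np.sqrt(m[i, j] * m[i, k] / m[j, k]))

print(np.round(est, 2))  # recovered accuracies, close to true_acc
```

The estimated accuracies converge to the true ones as n grows; the abstract's contribution is lifting this kind of noise-rate estimation from the binary case to finite metric spaces and constant-curvature manifolds.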