Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph Neural NetworksUnderstanding generalization and robustness of machine learning models
fundamentally relies on assuming an appropriate metric on the data space.
Identifying such a metric is particularly challenging for non-Euclidean data
such as graphs. Here, we propose a pseudometric for attributed graphs, the Tree
Mover's Distance (TMD), and study its relation to generalization. Via a
hierarchical optimal transport problem, TMD reflects the local distribution of
node attributes as well as the distribution of local computation trees, which
are known to be decisive for the learning behavior of graph neural networks
(GNNs). First, we show that TMD captures properties relevant to graph
classification: a simple TMD-SVM performs competitively with standard GNNs.
Second, we relate TMD to generalization of GNNs under distribution shifts, and
show that it correlates well with performance drop under such shifts.
arxiv.org