Show newer

Convergence of Markov Chains for Constant Step-size Stochastic Gradient Descent with Separable Functions arxiv.org/abs/2409.12243

Convergence of Markov Chains for Constant Step-size Stochastic Gradient Descent with Separable Functions

Stochastic gradient descent (SGD) is a popular algorithm for minimizing objective functions that arise in machine learning. For constant step-sized SGD, the iterates form a Markov chain on a general state space. Focusing on a class of separable (non-convex) objective functions, we establish a "Doeblin-type decomposition," in that the state space decomposes into a uniformly transient set and a disjoint union of absorbing sets. Each of the absorbing sets contains a unique invariant measure, with the set of all invariant measures being the convex hull. Moreover the set of invariant measures are shown to be global attractors to the Markov chain with a geometric convergence rate. The theory is highlighted with examples that show: (1) the failure of the diffusion approximation to characterize the long-time dynamics of SGD; (2) the global minimum of an objective function may lie outside the support of the invariant measures (i.e., even if initialized at the global minimum, SGD iterates will leave); and (3) bifurcations may enable the SGD iterates to transition between two local minima. Key ingredients in the theory involve viewing the SGD dynamics as a monotone iterated function system and establishing a "splitting condition" of Dubins and Freedman 1966 and Bhattacharya and Lee 1988.

arxiv.org

Universal localizations, Atiyah conjectures and graphs of groups arxiv.org/abs/2409.12268

Universal localizations, Atiyah conjectures and graphs of groups

Let $G$ be a countable group that is the fundamental group of a graph of groups with finite edge groups and vertex groups that satisfy the strong Atiyah conjecture over $K \subseteq \mathbb{C}$ a field closed under complex conjugation. Assume that the orders of finite subgroups of $G$ are bounded above. We show that $G$ satisfies the strong Atiyah conjecture over $K$. In particular, this implies that the strong Atiyah conjecture is closed under free products. Moreover, we prove that the $\ast$-regular closure of $K[G]$ in $\mathcal{U}(G)$, $\mathcal{R}_{\small K[G]}$, is a universal localization of the graph of rings associated to the graph of groups, where the rings are the corresponding $\ast$-regular closures. As a result, we obtain that the algebraic and center-valued Atiyah conjecture over $K$ are also closed under the graph of groups construction provided that the edge groups are finite. We also infer some consequences on the structure of the $K_0$ and $K_1$-groups of $\mathcal{R}_{\small K[G]}$. The techniques developed allow us to prove that $K[G]$ fulfills the strong, algebraic and center-valued Atiyah conjectures and that $\mathcal{R}_{\small K[G]}$ is the universal localization of $K[G]$ over the set of all matrices that become invertible in $\mathcal{U}(G)$ if $G$ lies in a certain class of groups $\mathcal{T}_{\small \mathcal{VLI}}$, which contains in particular virtually-{locally indicable} groups that are the fundamental group of a graph of virtually free groups.

arxiv.org

Percolation at the uniqueness threshold via subgroup relativization arxiv.org/abs/2409.12283

Percolation at the uniqueness threshold via subgroup relativization

We study percolation on nonamenable groups at the uniqueness threshold $p_u$, the critical value that separates the phase in which there are infinitely many infinite clusters from the phase in which there is a unique infinite cluster. The number of infinite clusters at $p_u$ itself is a subtle question, depending on the choice of group, with only a relatively small number of examples understood. In this paper, we do the following: 1. Prove non-uniqueness at $p_u$ in a new class of examples, namely those groups that contain an amenable, $wq$-normal subgroup of exponential growth. Concrete new examples to which this result applies include lamplighters over nonamenable base groups. 2. Prove a co-heredity property of a certain strong form of non-uniqueness at $p_u$, stating that this property is inherited from a $wq$-normal subgroup to the entire group. Remarkably, this co-heredity property is the same as that proven for the vanishing of the first $\ell^2$ Betti number by Peterson and Thom (Invent. Math. 2011), supporting the conjecture that the two properties are equivalent. Our proof is based on the method of subgroup relativization, and relies in particular on relativized versions of uniqueness monotonicity, the equivalence of non-uniqueness and connectivity decay, the sharpness of the phase transition, and the Burton-Keane theorem. As a further application of the relative Burton-Keane theorem, we resolve a question of Lyons and Schramm (Ann. Probab. 1999) concerning intersections of random walks with percolation clusters.

arxiv.org

Series expansions for SPDEs with symmetric $\alpha$-stable L\'evy noise arxiv.org/abs/2409.12286

Series expansions for SPDEs with symmetric $α$-stable Lévy noise

In this article, we examine a stochastic partial differential equation (SPDE) driven by a symmetric $α$-stable (S$α$S) Lévy noise, that is multiplied by a linear function $σ(u)=u$ of the solution. The solution is interpreted in the mild sense. For this models, in the case of the Gaussian noise, the solution has an explicit Wiener chaos expansion, and is studied using tools from Malliavin calculus. These tools cannot be used for an infinite-variance Lévy noise. In this article, we provide sufficient conditions for the existence of a solution, and we give an explicit series expansion of this solution. To achieve this, we use the multiple stable integrals, which were developed in Samorodnitsky and Taqqu (1990, 1991), and originate from the LePage series representation of the noise. To give a meaning to the stochastic integral which appears in the definition of solution, we embed the space-time Lévy noise into a Lévy basis, and use the stochastic integration theory (Bichteler and Jacod 1983, Bichteler 2002) with respect to this object, as in other studies of SPDEs with heavy-tailed noise: Chong (2017a), Chong (2017b), Chong, Dalang and Humeau (2019). As applications, we consider the heat and wave equations with linear multiplicative noise, also called the parabolic/hyperbolic Anderson models.

arxiv.org

A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations arxiv.org/abs/2409.11515

A Comparison of Sparse Solvers for Severely Ill-Conditioned Linear Systems in Geophysical Marker-In-Cell Simulations

Solving sparse linear systems is a critical challenge in many scientific and engineering fields, particularly when these systems are severely ill-conditioned. This work aims to provide a comprehensive comparison of various solvers designed for such problems, offering valuable insights and guidance for domain scientists and researchers. We develop the tools required to accurately evaluate the performance and correctness of 16 solvers from 11 state-of-the-art numerical libraries, focusing on their effectiveness in handling ill-conditioned matrices. The solvers were tested on linear systems arising from a coupled hydro-mechanical marker-in-cell geophysical simulation. To address the challenge of computing accurate error bounds on the solution, we introduce the Projected Adam method, a novel algorithm to efficiently compute the condition number of a matrix without relying on eigenvalues or singular values. Our benchmark results demonstrate that Intel oneAPI MKL PARDISO, UMFPACK, and MUMPS are the most reliable solvers for the tested scenarios. This work serves as a resource for selecting appropriate solvers, understanding the impact of condition numbers, and improving the robustness of numerical solutions in practical applications.

arxiv.org
Show older
Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.