Online Seminar on Thursday
12.01.2022, 16:00 (4:00 pm) CET
Randolf Altmeyer: Polynomial time guarantees for sampling based posterior inference
The Bayesian approach provides a flexible and popular framework for a wide range of nonparametric inference problems. It relies crucially on computing functionals with respect to the posterior distribution. Examples are the posterior mean or posterior quantiles for uncertainty quantification. In practice, this requires sampling from the posterior distribution using numerical algorithms, e.g., Markov chain Monte Carlo (MCMC) methods. The runtime of these algorithms to achieve a given target precision will typically, at least without additional structural assumptions, scale exponentially in the model dimension and the sample size. In contrast, in this talk we show that sampling based posterior inference in a general high-dimensional setup is indeed feasible. Given a sufficiently good initialiser, we present polynomial-time convergence guarantees for a widely used gradient based MCMC sampling scheme. The proof exploits the local curvature induced by the Fisher-information of the statistical model near the underlying truth, and relies on results from the non-linear inverse problem literature. We will discuss applications to logistic and Gaussian regression, as well as to density estimation.
Meeting-ID: 649 5035 0297
Join via Skype for Business:
01.12.2022, 16:00 (4:00 pm) CET
Edgar Dobriban: T-Cal: An optimal test for the calibration of predictive models
The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem. The null hypothesis is that the predictive model is calibrated, while the alternative hypothesis is that the deviation from calibration is sufficiently large.
We find that detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions. When the conditional class probabilities are Hölder continuous, we propose T-Cal, a minimax optimal test for calibration based on a debiased plug-in estimator of the ℓ2-Expected Calibration Error (ECE). We further propose Adaptive T-Cal, a version that is adaptive to unknown smoothness. We verify our theoretical findings with a broad range of experiments, including with several popular deep neural net architectures and several standard post-hoc calibration methods. T-Cal is a practical general-purpose tool, which — combined with classical tests for discrete-valued predictors — can be used to test the calibration of essentially any probabilistic classification method.
03.11.2022, 16:00 (4:00 pm) CET
Mona Azadkia: A Fast Non-parametric Approach for Local Causal Structure Learning
Abstract: In this talk, we introduce a non-parametric approach to the problem of causal structure learning with essentially no assumptions on functional relationships and noise. We develop DAG-FOCI, a computationally fast algorithm for this setting that is based on the FOCI variable selection algorithm in Azadkia-Chatterjee-2021. DAG-FOCI outputs the set of parents of a response variable of interest. We provide theoretical guarantees of our procedure when the underlying graph does not contain any (undirected) cycle containing the response variable of interest. Furthermore, in the absence of this assumption, we give a conservative guarantee against false positive causal claims when the set of parents is identifiable.
07.07.2022, 16:00 (4:00 pm) CET
Mathias Drton: Identification and Estimation of Graphical Continuous Lyapunov Models
Abstract: Graphical continuous Lyapunov models offer a new perspective on modeling causally interpretable dependence structure in multivariate data by treating each independent observation as a one-time cross-sectional snapshot of a temporal process. Specifically, the models consider multivariate Ornstein-Uhlenbeck processes in equilibrium. This setup leads to Gaussian models in which the covariance matrix is determined by the continuous Lyapunov equation. In this setting, each graphical model assumes a sparse drift matrix with support determined by a directed graph. The talk will discuss identifiability of such sparse drift matrices as well as their regularized estimation.
02.06.202, 16:00 (4:00 pm) CET
Arnak Dalalyan: Estimating the matching map between two sets of high-dimensional, noisy and corrupted features
Abstract: In this talk, I will present some recent results on finding the matching map between subsets of two sets of size n consisting of d-dimensional noisy feature vectors. The main result shows that, if the signal-to-noise ratio of the feature vectors is of order at least d¼, then it is possible to recover the true matching map exactly with a high probability. A notable feature of this result is that it does not assume the knowledge of the number of feature-vectors in the first set that have their pairs in the second set. We also show that the rate d¼ can not be improved by other procedure. When the number k of matching pairs is known, this rate is achieved by the minimizer of the sum sum squares of distances between matched pairs of feature-vectors. We show how this estimator can be extended to the setting of unknown k. In addition, we show that the resulting optimization problem can be formulated as a minimum-cost flow problem, and thus solved efficiently, with complexity O(k½ n2).
Finally, we will report the result of numerical experiments illustrating our theoretical findings.
Download slides: [here]
05.05.2022, 16:00 (4:00 pm) CET
Nicolas Verzelen: Some recent results on graph and point clustering
Abstract: In this presentation, we consider two prototypical unsupervised learning problems (i) clustering nodes from a graph sampled from a Stochastic Block Model and (ii) clustering points sampled from a Gaussian Mixture Model. In these two models, the statistician aims at estimating an hidden partition (of nodes or points) from the data. I will first introduce suitable notions of distances between the groups in each model. Then, I will survey recent results on the minimal separation distance between the cluster so that a procedure is able to recover the partition with high probability. This will be mostly done through the prism of the K-means criterion and its convex relaxations. Interestingly, these clustering problems seem to exhibit a computational-statistical trade-off: known polynomial-time procedures are shown to recover the hidden partitions under stronger separation conditions than minimax (but exponential time) one, at least when the number of groups is large. Partial computational lower bounds that support the existence of this gap will be discussed at the end of the talk.
Download slides: [here]