Probabilistic and Statistical Techniques for Cosmological Applications

Rome, June 5-7th 2013
Istituto Nazionale di Alta Matematica


R. Adler - Cosmology and (random) Topology: Is CMB Really Defined On a Sphere?
Much of the functional data that we see seems to be defined on relatively simple parameter spaces. For example, cosmic microwave background data is typically seen as being defined over the 2-dimensional sphere, while galactic density data is defined over 3-dimensional space. I will argue that, in many cases, the true dimensions of parameter spaces are much higher (at least 5 for CMB and at least 8 for SDSS) and the spaces themselves are structurally far more complex than one might at first imagine, and that this calls for a topological approach to analysing many random phenomena. I will then discuss some of the new tools available for such an analysis. While these tools are typically based on deep and often esoteric mathematics, understanding the underlying ideas and learning how to apply them fortunately requires no more than the undergraduate mathematics we have all learnt, and an open mind.

A. Balbi - What have we learned from the CMB?
Over the past twenty years, the detailed study of angular fluctuations in the intensity of the Cosmic Microwave Background has produced an extraordinary advancement in our knowledge of the universe. I give a broad account of the current state of affairs and how it came to be.

J. Cisewski - Mapping the Intergalactic Medium using Lyman-alpha Data and Persistent Homology
Light we observe from quasars has traveled through the intergalactic medium (IGM) to reach us, and the IGM leaves an imprint of some of its properties on the quasar spectrum. One particular imprint with which cosmologists are familiar is dubbed the Lyman-alpha forest. From this imprint, we can infer the density of neutral hydrogen along the line of sight from us to the quasar. The Sloan Digital Sky Survey Data Release 9 (SDSS-DR9) produced over 54,000 quasar spectra that can be used for analysis of the Lyman-alpha forest and, thus, aid cosmologists in further understanding the IGM, along with revealing or corroborating other properties of the Universe. With cosmological simulation output, we develop a methodology using local polynomial smoothing to model the IGM. I will briefly describe the modeling methodology, but focus on how to analyze the adequacy of the modeling procedure and discuss some of the issues faced when modeling the real data from SDSS-DR9. Finally, describing the topological features of the IGM can aid in our understanding of the large-scale structure of the Universe, along with providing a framework for comparing cosmological simulation output with real data beyond the standard measures. Accessing important topological features of data can be accomplished with persistent homology. I will introduce persistent homology and describe an example of how it can be used in cosmology.
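As a minimal illustration of the idea behind persistent homology (not the methodology of the talk), the Python sketch below computes 0-dimensional sublevel-set persistence for a 1-dimensional sequence of values. Each local minimum gives birth to a connected component as the threshold rises, and a component dies when it merges into an older one. The function name and the toy values are assumptions made for this example.

```python
import math


def sublevel_persistence_1d(values):
    """Return sorted (birth, death) pairs of connected components of the
    sublevel sets {i : values[i] <= t} as the threshold t increases."""
    n = len(values)
    parent = [None] * n   # None means "index not yet in the sublevel set"
    birth = {}            # birth value of the component rooted at a given index

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component that was born later dies now.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    # The component of the global minimum never dies.
    pairs.append((min(values), math.inf))
    return sorted(pairs)


# Toy values, invented for illustration only.
toy_values = [0.0, 2.0, 1.0, 3.0, 0.5]
toy_pairs = sublevel_persistence_1d(toy_values)
```

Pairs with a large gap between birth and death correspond to prominent features, while short-lived pairs are usually treated as noise.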

A. Dalalyan - Oracle Inequalities for Aggregation of Affine Estimators
We consider the problem of combining a (possibly uncountably infinite) set of affine estimators in a non-parametric regression model with heteroscedastic Gaussian noise. Focusing on the exponentially weighted aggregate, we prove a PAC-Bayesian type inequality that leads to sharp oracle inequalities in discrete but also in continuous settings. The framework is general enough to cover the combinations of various procedures such as least squares regression, kernel ridge regression, shrinking estimators and many other estimators used in the literature on statistical inverse problems. As a consequence, we show that the proposed aggregate provides an adaptive estimator in the exact minimax sense without either discretizing the range of tuning parameters or splitting the set of observations. We also illustrate numerically the good performance achieved by the exponentially weighted aggregate. (This is joint work with J. Salmon.)
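The general shape of an exponentially weighted aggregate can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the procedure analysed in the talk: it assumes a finite family of estimators, takes a risk estimate for each as given, and uses weights proportional to prior times exp(-risk / beta). The function names, the temperature value beta and the toy numbers are all hypothetical, and scaling conventions for the exponent vary between papers.

```python
import math


def ewa_weights(risk_estimates, beta, prior=None):
    """Weights proportional to prior_j * exp(-risk_j / beta), normalised to sum to 1."""
    m = len(risk_estimates)
    prior = prior or [1.0 / m] * m
    r_min = min(risk_estimates)  # shift by the minimum for numerical stability
    unnorm = [p * math.exp(-(r - r_min) / beta)
              for p, r in zip(prior, risk_estimates)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def aggregate(estimates, weights):
    """Convex combination of the candidate estimates (each a list of floats)."""
    dim = len(estimates[0])
    return [sum(w * est[k] for w, est in zip(weights, estimates))
            for k in range(dim)]


# Toy inputs, invented for illustration: three candidate estimates of a
# 2-dimensional signal and an (assumed) risk estimate for each candidate.
toy_estimates = [[1.0, 2.0], [1.5, 2.5], [0.0, 0.0]]
toy_risks = [1.0, 0.2, 3.0]
toy_weights = ewa_weights(toy_risks, beta=0.5)
toy_aggregate = aggregate(toy_estimates, toy_weights)
```

A small beta concentrates the weight on the candidate with the smallest estimated risk, while a large beta moves the aggregate towards the prior average.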

J. Jin - Higher Criticism and Rare and Weak effects
We are often said to be entering the era of Big Data, where massive data sets are generated on a daily basis. A phenomenon that is frequently found in Big Data is that the signals of interest are often rare and weak: the effects that we are interested in are mostly very subtle and few and far between. Whether we are talking about genome scans or tick-by-tick financial data, most of what we see is noise; the signal is hard to find and it's easy to be fooled. In such rare and weak effect settings, classical methods and most modern empirical methods are simply overwhelmed. Yet, it is of great interest to develop methods that can cope with such settings. We introduce Higher Criticism (HC) as a method to deal with rare and weak signals. HC is a notion that goes back to Tukey in 1976. In the past decade, HC has evolved into an array of data-driven tools that are found to be especially useful for analyzing rare and weak signals in "Big Data", covering an array of problems including signal detection, classification, and spectral clustering. In this talk, we review the development of HC (some are earlier and some are recent), and suggest a range of problems where HC can be useful.
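For readers unfamiliar with HC, the following Python sketch computes one common form of the statistic from a list of p-values: the maximum, over the smallest fraction alpha0 of the sorted p-values, of sqrt(n) (i/n - p_(i)) / sqrt(p_(i) (1 - p_(i))). It is an illustration only; the function name, the choice alpha0 = 0.5 and the toy p-values are assumptions for this example, and several variants of the statistic exist.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Higher Criticism statistic over the smallest alpha0 fraction of p-values."""
    p = sorted(p_values)
    n = len(p)
    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k_max + 1):
        # Clamp to avoid division by zero for p-values equal to 0 or 1.
        pi = min(max(p[i - 1], 1e-300), 1.0 - 1e-12)
        z = math.sqrt(n) * (i / n - pi) / math.sqrt(pi * (1.0 - pi))
        best = max(best, z)
    return best


# Toy data, invented for illustration: an evenly spread "null" set of
# p-values, and the same set with 10 values replaced by very small ones.
n_toy = 1000
null_p = [(i - 0.5) / n_toy for i in range(1, n_toy + 1)]
signal_p = [1e-6] * 10 + null_p[:n_toy - 10]

hc_null = higher_criticism(null_p)
hc_signal = higher_criticism(signal_p)
```

In this toy setting only 1% of the p-values carry signal, yet the statistic is far larger than under the evenly spread null set.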

S. Feeney - A case study in anomaly detection: looking for other universes in the cosmic microwave background
With the advent of precise measurements of the cosmic microwave background (CMB), anomaly detection has become a powerful tool in cosmology, targeting clusters of galaxies, point-source contaminants, topological defects indicative of high-energy phase transitions, and more. Anomaly detection encompasses many of the statistical challenges facing the field: in particular, the need to process (in a principled and robust fashion) huge modern datasets on human timescales. It is possible to extract information about the anomalous sources, and more interestingly the underlying physical processes governing their properties, by viewing the population of sources as a hierarchical Bayesian model. I will demonstrate this technique by concentrating on one particularly exciting potential anomaly: the signatures of collisions with bubble universes arising in eternal inflation. In the picture of eternal inflation, our observable Universe resides inside a single bubble nucleated from an inflating false vacuum. Many theories giving rise to eternal inflation predict that we have causal access to collisions with other bubble universes, which leave characteristic localised modulations of the CMB. I will present the results from the first observational search for the effects of bubble collisions using CMB data from the WMAP satellite.

Chris Genovese - Hunting for Manifolds and Ridges I
(Joint work with Marco Perone-Pacifico, Isa Verdinelli and Larry Wasserman) We discuss the problem of finding stable, high-density regions in point clouds. In this part of the talk we discuss the formal problem of locating a manifold based on noisy data. We explain the statistical model and we show how the difficulty of the problem can be formalized using statistical minimax theory. We then find the minimax rate under several different models for the noise. In the case of Gaussian noise, we show that the rate is extremely slow (logarithmic). We suggest, instead, estimating a surrogate for the manifold. This is a set that is close to the manifold and can be estimated at a polynomial rate.

N. Leonenko - Monofractal and multifractal models for isotropic and spherical random fields
We plan to provide a brief survey of development and open problems in the following areas:
1) Spectral theory of isotropic random fields in the 3-dimensional Euclidean space and spherical random fields with Gaussian and Student distributions.
2) Rényi functions for multifractal products of isotropic random fields and spherical random fields. Multifractal analysis based on the lognormal, loggamma and log-negative inverted gamma scenarios. Testing for non-Gaussianity of isotropic and spherical data based on the multifractal analysis and empirical structure function.
3) Parametric models for statistical analysis of isotropic and spherical scalar and vector random fields.
4) Correlation and spectral theory for tensor-valued random fields: forecasting and interpolation.
The lecture is based on the joint papers (published, in press or in preparation) with V. Anh, D. Denisov, D. Marinucci, A. Malyarenko, L. Sakhno and N.-R. Shieh.

D. Marinucci - Testing for Isotropy and Geometric Features of Needlets Excursion Sets
In this talk, we shall be concerned with geometric functionals and excursion probabilities for some nonlinear transforms evaluated on wavelet/needlet components of spherical random fields. For such fields, we consider smoothed polynomial transforms, such as those arising from local estimates of angular power spectra and bispectra; we focus on the geometry of their excursion sets, and we study their asymptotic behaviour, in the high-frequency sense. We put particular emphasis on the analysis of Euler-Poincaré characteristics, which can be exploited to derive extremely accurate estimates for excursion probabilities. The present analysis is motivated by the statistical investigation of asymmetries and anisotropies in CMB data.

S. Matarrese - Non-Gaussianity and Cosmology
I'll review the role and relevance of non-Gaussian signals in the cosmological framework, focussing on i) their origin in the early Universe and during the recent non-linear stage of the evolution of perturbations; ii) their search in Cosmic Microwave Background data as well as in the large-scale structure of the Universe; iii) the role of rare events ("upcrossing regions" and "peaks") in the cosmological search for primordial non-Gaussianity.

J. McEwen - Signal processing on spherical manifolds
Observations that live on a spherical manifold arise in many applications. In cosmology, for example, observations of the relic radiation of the Big Bang, the so-called cosmic microwave background (CMB), are inherently made on the celestial sphere. To analyse such data, signal processing techniques defined on spherical manifolds are required. I will discuss recent advances in the area of signal processing on spherical manifolds. Firstly, I will discuss wavelet constructions on the sphere, charting the historical development from continuous wavelet methodologies through to the scale-discretised wavelet framework that supports the exact reconstruction of signals from a discrete sampling of wavelet coefficients. I will then discuss the non-trivial extension of scale-discretised wavelets to the ball, i.e. the sphere augmented with depth. Finally, I will conclude with a cosmological application of these techniques.

R. Nickl - Inference for nonparametric functions on homogeneous manifolds: Lie groups, Needlets, Rademacher processes and graph Laplacians
Suppose one is given n observation points on a compact homogeneous manifold, examples of which include the unit sphere but also projective spaces and Grassmann or Stiefel manifolds. Suppose one wants to either estimate the probability density of the observations directly, or some underlying functional regression relationship, without any particular parametric assumptions on the underlying function. We discuss recent results from geometric analysis that show how one can use the Lie group structure to define a manifold analogue of a localised wavelet (=needlet) basis, and use it to estimate the unknown function based on the observations. We use concentration of measure arguments for Rademacher processes to construct non-asymptotic confidence regions for the underlying function. For manifolds where analytical expressions of the eigenfunctions of the Laplacian are difficult to obtain, we discuss the computation of the estimator based on graph Laplacian methods.

G. Peccati - High-frequency asymptotics on homogeneous spaces: some explicit estimates.
I will explore some asymptotic results concerning high-frequency central limit theorems for random vectors composed of harmonic coefficients associated with isotropic random fields on a sphere. I will provide an overview of recent findings in the area, in particular connected with entropic estimates. The main results presented in the talk are motivated by the high-resolution asymptotic analysis of the CMB radiation. Based on joint works with D. Marinucci, I. Nourdin and Y. Swan.

I. Pesenson - Shannon Sampling and Paley-Wiener Localized Frames on Riemannian Manifolds
In the last decade, methods based on various kinds of wavelet bases on the unit sphere S^2 and on the rotation group SO(3) have found applications in virtually all areas where analysis of spherical data is required, including cosmology, weather prediction, geodesy, and crystallography.
The goal of my talk is to explain constructions of Paley-Wiener localized frames on compact and non-compact Riemannian manifolds of bounded geometry. It is important that all our constructions produce frames which are either Parseval or nearly Parseval.
Special consideration will be given to the particularly important case of the standard n-dimensional unit ball B^n in R^n. Three different approaches will be discussed. In our first approach we treat the closed ball B^n as a compact submanifold of R^n with boundary. In our second approach the open ball B^n is treated as a noncompact manifold which is isometric to a real hyperbolic space (the Poincaré model). In our third approach we suggest a specific identification of B^n with a direct product of [0, 1] and the unit sphere S^(n-1).

R. Scaramella - Euclid space mission: a cosmological challenge for the next 15 years
Euclid is the next ESA mission devoted to cosmology. It aims at covering most of the extragalactic sky, studying both gravitational lensing and clustering over ~15000 square degrees. The mission is expected to be launched in 2020 and to last six years. The sheer amount of data of different kinds, the variety of (un)known systematic effects and the complexity of the measurements require efforts both in sophisticated simulations and in techniques of data analysis. We will review the mission's main characteristics and mention some of the areas of interest to this meeting.

A. Schwartzman - Distribution of the height of local maxima of random fields
Let f(t) be a smooth Gaussian random field over a parameter space T, where T may be a subset of Euclidean space or, more generally, a Riemannian manifold. For any local maximum of f(t) located at t0 in the interior of T, we provide general formulas and asymptotic approximations for the excursion probability P{f(t0) > u | t0 is a local maximum of f(t)} and the overshoot probability P{f(t0) > u | t0 is a local maximum of f(t) and f(t0) > v}. Assuming further that f is isotropic, we apply GOE techniques from random matrix theory to compute such conditional probabilities explicitly when T is Euclidean or a sphere of arbitrary dimension. Such calculations are motivated by the statistical problem of detecting peaks in the presence of smooth Gaussian noise.

J.L. Starck - Sparsity and the Cosmic Microwave Background
Bayesian methodology is very popular in cosmology and is often even considered to be the only proper way to process astronomical data sets.
Recent progress in harmonic analysis, such as compressed sensing theory or sparsity, however, opens up new ways to acquire and analyze data that can hardly be understood from the Bayesian perspective.
We briefly review the concept of sparsity and its relation to Compressed Sensing, the new sampling theory, and then show how these new concepts can help with Cosmic Microwave Background data analysis.

J. Taylor - A significance test for adaptive linear modeling
In this talk we consider testing the significance of the terms in a fitted regression, via the lasso. We propose a novel test statistic for this problem, and show that it has a simple asymptotic null distribution. This work builds on the least angle regression approach for fitting the lasso, and the notion of degrees of freedom for adaptive models (Efron 1986) and for the lasso (Efron et al. 2004, Zou et al. 2007). It is also related to the distribution of the maximum of a discrete random field. Time permitting, we will expand on this connection. This is joint work with Richard Lockhart (Simon Fraser University) and Ryan Tibshirani (Carnegie Mellon University).

B. Wandelt - Cosmostatistics
Why cosmostatistics? Rich cosmological data sets, meaningful physical models, and advances in computing and algorithms create the perfect environment for principled analysis approaches. I will discuss advances relevant to data such as the cosmic microwave background anisotropies on the sphere, and the observed galaxy distribution in redshift space. Examples include full physical as well as semi-blind approaches, samples from reconstructed cosmological evolution histories, and a new general method for fast computation of Fisher matrices for inference of covariance from data sets on the sphere.

L. Wasserman - Hunting for Manifolds and Ridges II
(Joint work with Chris Genovese, Marco Perone-Pacifico and Isa Verdinelli) In this part of the talk, we discuss a particular surrogate for a manifold, namely, hyper-ridges in the density. These are low dimensional sets characterized by conditions on the eigenvalues of the Hessian. We show that the ridges can be estimated at polynomial rate and thus serve as a good surrogate for the underlying manifold. In fact, our methods work well even when the underlying set is not a manifold. We then discuss the problem of "dimensional leakage" in which structures can leave their imprints in several different dimensions. Finally, we show how the bootstrap can be used to assess the variability of the procedure.

I. Wigman - On the geometry of random spherical harmonics
This work is joint with D. Marinucci.
The random Gaussian spherical harmonics appear in the L2 expansion of any Gaussian field defined on the sphere, and one is interested in their geometry in the high degree limit. This question has numerous applications in various fields, such as spectral geometry and Gaussianity testing for random fields originating in cosmology, where the high degree limit corresponds to high quality observations.
We address some aspects of the geometry of random Gaussian spherical harmonics, such as the defect (or "signed measure"), and more generally, nonlinear functionals of the spherical harmonics. We were able to evaluate the asymptotic behaviour of the defect, and to prove a general Central Limit Theorem under some mild assumptions on the functionals.
In this talk I will introduce the audience to the random spherical harmonics and some questions regarding them, and state the main results. Time permitting, I will show some aspects of the proofs.

The meeting is supported by European Research Council Grant n. 277742 Pascal and hosted by I.N.d.A.M., Istituto Nazionale di Alta Matematica.
Last modified - Claudio D. - 06/04/2013