#### The aim of the workshop is to give a survey of recent topics in algebraic statistics and to bring together young researchers to discuss and work on current research problems in this field.

### Dates

The workshop will take place from **30.11. to 02.12.2017**. A detailed programme will follow. The workshop will start on Thursday at 8:30 and end on Saturday around 13:00.

### Opening talk

#### Jan Draisma (Universität Bern): Some Algebraic Statistics

Algebraic statistics concerns statistical models defined by systems of polynomial equations and inequalities. These systems are typically highly structured, and attract attention from a growing range of algebraists trained in uncovering and exploiting this structure. An algebraist myself, I will discuss two topics in this vein: maximum likelihood geometry on the one hand, and graphical models and their equations on the other hand.

### Invited talks

#### Angelos Armen (University of Oxford): Model Distinguishability in Bayesian Networks

When performing graphical model selection, it is important to understand which graphs represent distinct models; this is referred to as model distinguishability and can be studied using techniques from algebraic geometry. In this talk, I will discuss distinguishability of ordinary Bayesian network (BN) models, which amounts to Markov equivalence, and distinguishability of marginal/conditional BN models, which is more complicated. In addition, I will present results from our own work on graphically representing the equality constraints that hold in a BN model which has been conditioned upon.

#### Carlos Enrique Améndola Cerón (TU Berlin): Algebraic Statistics of Gaussian Mixtures

Mixtures of Gaussian distributions are fundamental probability models with a long history in statistics, and they remain widely used today. In this lecture we will define their probability densities and introduce their main properties, present the main questions associated with these statistical models, and explain the algebraic connections highlighted in recent research.

#### Eliana Duarte (MPI Leipzig) and Christiane Görgen (MPI Leipzig): (Non-)Toric parametric statistical models

One of the fundamental insights in Algebraic Statistics is that exponential families ‘are’ toric varieties. A full characterisation of the algebraic geometry surrounding a special class of these, called decomposable graphical models, was given by Geiger et al. (2006). We now investigate a more recent statistical model represented by a coloured probability tree and called a staged tree (Smith et al., 2017). Staged tree models include decomposable graphical models as a special case but in their algebro-geometric description sum-to-1 conditions on the probability simplex cannot be ignored. These hence constitute a wider class of models—and of varieties—which are toric only under very special conditions.

The two lectures on this topic recall known results about exponential families, from both a statistical and an algebraic point of view, and formally introduce staged tree models. We then give a recent characterisation of these models as semi-algebraic sets whose only inequalities are those coming from the probability simplex. In particular, the algebraic description specifies these models as solution sets of odds-ratio equations. We illustrate properties of these varieties and discuss a number of open problems.

#### Rob Eggermont (TU Eindhoven): Asymptotic Properties of Algebraic Statistical Models

In recent years, methods for analyzing large data sets have become increasingly important. Ideally, we want algorithms that work regardless of the size of the data set. In algebraic statistics, this often means that we have a model for a data set of any given size, and we want the equations that describe the model for a set of size n not to become ‘more difficult’ as n grows.

#### Alex Fink (Queen Mary University of London): The Intersection Property for Conditional Independence

The intersection property of conditional independence asserts that independence from each of two random variables given the other implies independence from both jointly. Sometimes called an “axiom”, the intersection property is in fact not true of all distributions. We characterise when it holds in both the discrete and the continuous cases (the latter being work of Peters).
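For orientation, the property can be written out as follows. This is only a sketch in common notation; the symbols $X$, $Y$, $Z$ for the three random variables are chosen here for illustration and do not come from the abstract.

```latex
\[
  X \perp\!\!\!\perp Y \mid Z
  \quad\text{and}\quad
  X \perp\!\!\!\perp Z \mid Y
  \quad\Longrightarrow\quad
  X \perp\!\!\!\perp (Y, Z).
\]
```

A standard way to see that the implication can fail: if $Y = Z$ and $X$ depends on $Y$, both hypotheses hold trivially while the conclusion does not.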

#### Jens Grygierek (University of Osnabrück): Random Simplicial Complexes

The Lonely Complex and the Giant Beast

We introduce the Poisson Point Process on $\mathbb{R}^d$ and give insights on the percolation of the well-known Gilbert Graph. We use the Poisson Point Process to generate the corresponding Vietoris-Rips and Čech complexes as a model for random simplicial complexes. Further, we give insights on two interesting properties of this model:

1. Almost surely, it contains infinitely many copies of any given $d$-embeddable finite simplicial complex as isolated components.

2. Assume we have percolation. Then, almost surely, it contains an unbounded connected component having infinitely many copies of any given $d$-embeddable finite simplicial complex as wedge summands.
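The construction described above can be illustrated with a small simulation. The following is a minimal sketch, not the speaker's method. It assumes a finite observation window `[0, side]^dim` in place of all of $\mathbb{R}^d$, and the convention that two points are joined when their distance is at most `radius` (some authors use twice the radius). All function and parameter names are hypothetical.

```python
import itertools
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) variate by counting unit-rate exponential
    arrivals that fall in the interval [0, mean]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def poisson_points(intensity, side, dim, rng):
    """Homogeneous Poisson point process restricted to the window [0, side]^dim
    (assumption: a finite window stands in for the whole space)."""
    n = sample_poisson(intensity * side ** dim, rng)
    return [tuple(rng.uniform(0.0, side) for _ in range(dim)) for _ in range(n)]


def gilbert_edges(points, radius):
    """Edges of the Gilbert graph: index pairs (i, j), i < j, whose points
    lie within distance `radius` of each other."""
    return [
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= radius
    ]


def rips_simplices(points, radius, k):
    """k-simplices of the Vietoris-Rips complex: (k+1)-element index sets
    in which every pair of points is joined by a Gilbert edge."""
    edges = set(gilbert_edges(points, radius))
    return [
        simplex
        for simplex in itertools.combinations(range(len(points)), k + 1)
        if all(pair in edges for pair in itertools.combinations(simplex, 2))
    ]
```

The brute-force enumeration in `rips_simplices` is only practical for small point sets; it is meant to make the definition concrete rather than to be efficient.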

#### Kaie Kubjas (Aalto University): Real Algebraic Geometry and Algebraic Statistics

Statistical models parametrized by polynomial functions are semialgebraic sets. For maximum likelihood estimation, it is often beneficial to know polynomial equations and inequalities that define these models implicitly. In this talk, we will review some statistical models that are semialgebraic and discuss how to find their implicit descriptions.

#### Mateusz Michalek (MPI Leipzig): Algebraic Methods in Phylogenetic Tree Models

In statistics one encounters various maps from parameter spaces to spaces of probability distributions. In cases when such maps are algebraic, their images are often (open, semialgebraic subsets of) nice algebraic varieties. I will present one case of particular interest: phylogenetic Markov tree models. Phylogenetics is a science that aims at reconstructing the history of evolution. Under special assumptions on probabilities and the process of mutation, the associated varieties have nice mathematical properties. Results going back to Hendy and Penny, and fully described by Sturmfels and Sullivant, reveal toric structures when there are sufficiently many symmetries among states. Thus a part of my talk will be about toric varieties in phylogenetics. At the opposite extreme, when no symmetries are assumed, one obtains secant varieties of Segre varieties – very interesting, but extremely complicated objects. Apart from a brief overview of the subject, various open problems will be presented.

### Organizers

Hanna Döring, Martina Juhnke-Kubitzke, Thomas Kahle, Tim Römer

Please register here (there is no workshop fee, but it helps us to organize the workshop appropriately).

### Recommended Hotels

A block of rooms has been set aside at the Intour Hotel. Mention the keyword *statistic* when making a reservation.

### Related activities

#### Kaie Kubjas (Aalto University)