sose2018 - Universität Bielefeld


Summer semester 2018

Tuesday, 24.04.2018, 12:00-13:00 - Room: W9-109

Dr. Paul-Christian Bürkner
Institute of Psychology, Westfälische Wilhelms-Universität Münster

Fitting Bayesian Multilevel Models using Stan: An Introduction to the R package brms

The talk is about Bayesian multilevel models and their implementation in R using the package brms. It begins with a short introduction to multilevel modeling and to Bayesian statistics in general, followed by an introduction to Stan, an incredibly flexible language for fitting open-ended Bayesian models. I will then explain how to access Stan using just basic R formula syntax via the brms package. The package supports a wide range of response distributions and modeling options such as splines, autocorrelation, or censoring, all in a multilevel context. Many post-processing and plotting methods are implemented as well. Some examples from psychology and medicine will be discussed.

Tuesday, 08.05.2018, 12:00-13:00 - Room: W9-109

Svenja Elkenkamp
Universität Bielefeld

Motifs in the mobility panel

This talk is about motifs in the German Mobility Panel run by the Karlsruhe Institute of Technology (KIT). Motifs are graphs that represent the daily movement of one person. They give information about the number of visited locations and the order in which they are visited. Previous research found that these motifs correlate with the travel mode someone uses and that they seem to be stable over time. Beyond this, little is known about what influences mobility patterns. The main questions investigated in this talk are:

How do mobility patterns of people living in different household types or regions differ?
Which variables influence the motif choices?

First, motifs in general are introduced and the most common patterns in the dataset are explained. Afterwards, the results of a descriptive analysis are presented to show how mobility patterns differ across household types and regions. Finally, multinomial logit models are estimated using the R package ‘mlogit’, and the advantages and disadvantages of this package are discussed.
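As a rough illustration of the motif idea described in this abstract, the following Python sketch turns one person's ordered list of visited locations into a small directed graph. The function name `build_motif`, the relabelling of locations by order of first visit, and the example days are assumptions made for illustration; they are not taken from the talk or from the German Mobility Panel.

```python
def build_motif(visits):
    """Turn an ordered list of visited locations into a motif.

    Returns (number_of_locations, edges). Locations are relabelled
    0, 1, 2, ... in order of first visit, so that two days with the
    same movement structure yield the same motif regardless of the
    actual place names. Edges are the directed moves between places.
    """
    labels = {}
    for place in visits:
        if place not in labels:
            labels[place] = len(labels)

    edges = set()
    for src, dst in zip(visits, visits[1:]):
        if src != dst:  # staying at the same place is not a move
            edges.add((labels[src], labels[dst]))
    return len(labels), frozenset(edges)


# Hypothetical example days (not real panel data).
day_a = ["home", "work", "home"]
day_b = ["home", "school", "home"]
day_c = ["home", "work", "shop", "home"]

motif_a = build_motif(day_a)
motif_b = build_motif(day_b)
motif_c = build_motif(day_c)

print(motif_a == motif_b)  # same structure: one round trip to a second place
print(motif_c)             # three locations visited in a cycle
```

Under this encoding, the first two days share a motif although the destinations differ, while the third day gives a three-location cycle. Motifs encoded this way could then serve as the categories of a response variable in a multinomial model.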

Tuesday, 29.05.2018, 12:00-13:00 - Room: W9-109

Prof. Dr. Göran Kauermann
Department of Statistics, Ludwig-Maximilians-Universität München

From Statistics and Computer Science towards Data Science

Big Data is certainly one of the buzzwords of the last five years. With the digital revolution, nearly everything can now be measured, recorded or logged. The amount of data runs to multiples of petabytes, with an annual increase that would have been unbelievable even a couple of years ago. But the sheer flood of data is of little use if no information is drawn from it. This step is the real challenge for the coming decades. While some early views in the gold-rush days of Big Data even postulated that Big Data spells the end of theory, since with enough data every question can be answered, it is becoming more and more apparent that the step from Big Data to relevant information is full of obstacles and traps and demands novel scientific approaches. This view is not new; it was formulated by Cleveland (2001) more than one and a half decades ago. He proposed combining statistical approaches with computer science and labelled this Data Science. In his article “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics” he criticises the existing separation between statistics and computer science and writes: “The benefit to the data analyst has been limited, because the knowledge among computer scientists about how to (...) approach the analysis of data is limited, just as the knowledge of computing environments by statisticians is limited. A merger of the knowledge bases would produce a powerful force for innovation.” This merger of the two disciplines is what is understood today as Data Science, and the current Big Data challenges make Cleveland’s proposal vivid and indispensable. At the Ludwig-Maximilians-University Munich a new master's program in Data Science was launched in 2016 (www.datascience-munich.de). The initiative is funded by the Elite Network of Bavaria (www.elitenetzwerk.bayern.de).
The program is run jointly by the Department of Statistics and the Department of Computer Science at the LMU, and the curriculum is, as proposed by Cleveland, a real merger of statistics and computer science. The talk will motivate the necessity of both computer science and statistics for the analysis of Big Data. In particular, we demonstrate that the quantity of data (as in Big Data) is not the ultimate goal; the quality of data also needs to be considered, and this is possible with statistical means. We also present a curriculum for Data Science that combines statistics and computer science, following Cleveland’s proposal (see Kauermann & Seidl, 2018).

Tuesday, 05.06.2018, 12:00-13:00 - Room: W9-109

Prof. Dr. Dietmar Bauer
Universität Bielefeld

Estimating cointegrating relations: going from VAR to VARMA

This talk examines the concept of cointegration in its classical form using vector autoregressions (VAR). There, the vector error correction model (VECM) representation plays the major role in estimating and interpreting the model. The talk discusses how the VECM can be used to derive estimators and testing procedures for the cointegrating relations, and how the cointegrating relations found can be interpreted. The main part then presents the Ribarits-Hanzon approach to representing VARMA models in VECM form. This representation can be used to transfer the results on estimation and hypothesis testing from the case where a VAR generates the data to the case of VARMA processes. Finally, a simulation study demonstrates that this transfer leads to better estimates for VARMA data generating processes.
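For orientation, the VECM representation of a VAR of order p mentioned in this abstract is commonly written as below. The notation is the usual textbook one and is an assumption here, not taken from the talk: y_t is the vector of observed series, the columns of \beta span the cointegrating relations, \alpha holds the adjustment coefficients, and deterministic terms are omitted.

```latex
\Delta y_t = \alpha \beta' y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i} + \varepsilon_t
```

In this form, \beta' y_{t-1} collects the deviations from the long-run equilibrium relations, which is what makes the cointegrating relations directly interpretable.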

Tuesday, 19.06.2018, 12:00-13:00 - Room: W9-109

Dr. Ignacio Alvarez
Universidad de la República

Confounding effects in double shrinkage hierarchical models

Double shrinkage models are used when there are two sets of parameters with hierarchical distributions, for example when we want to model both group means and group variances. In this talk, we work with hierarchical models for both means and variances simultaneously in a sparse high-dimensional context. Specifically, we focus on the effects of hierarchical modeling of the variances on inference for the mean vector. Inferences about group means and variances can be confounded, resulting in some surprisingly poor estimates of the means when signals are "too" strong. We show that this confounding effect occurs both in simple simulated scenarios and in a real data set of RNA-seq expression in maize plants. The main reason for the confounding effect is related to the light tails of the normal distribution. When a normal distribution is used as the hierarchical distribution for the group means, the few groups with strong signals are mistakenly shrunk towards zero. Changing the hierarchical distribution to a Cauchy distribution seems to solve this issue.
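The tail effect named in this abstract can be illustrated with a deliberately simplified Python sketch. It is not the double shrinkage model of the talk: it assumes a single observation y with known unit variance, a prior centred at zero with a fixed unit scale, and a plain grid approximation of the posterior mean. All function names and numbers are hypothetical.

```python
import math


def posterior_mean(y, prior_density, lo=-50.0, hi=50.0, step=0.01):
    """Grid approximation of E[theta | y] for y ~ Normal(theta, 1).

    prior_density may be unnormalised, since constants cancel in the ratio.
    """
    num = 0.0
    den = 0.0
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        theta = lo + i * step
        weight = math.exp(-0.5 * (y - theta) ** 2) * prior_density(theta)
        num += theta * weight
        den += weight
    return num / den


def normal_prior(theta, scale=1.0):
    """Unnormalised Normal(0, scale^2) density (light tails)."""
    return math.exp(-0.5 * (theta / scale) ** 2)


def cauchy_prior(theta, scale=1.0):
    """Unnormalised Cauchy(0, scale) density (heavy tails)."""
    return 1.0 / (1.0 + (theta / scale) ** 2)


strong_signal = 10.0  # hypothetical observation far from the prior centre
normal_estimate = posterior_mean(strong_signal, normal_prior)
cauchy_estimate = posterior_mean(strong_signal, cauchy_prior)

print("normal prior:", round(normal_estimate, 3))
print("cauchy prior:", round(cauchy_estimate, 3))
```

With these assumed settings, the normal prior pulls the strong signal about halfway back to zero, whereas the Cauchy prior leaves the estimate close to the observed value. In the hierarchical setting of the talk the prior scale is estimated from the data rather than fixed, so this sketch shows only the light-tail versus heavy-tail contrast, not the confounding between means and variances.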

Tuesday, 10.07.2018, 12:00-13:00 - Room: W9-109

Dr. Turid Frahnow
Universität Bielefeld

My daily life of a biomedical data scientist – Working between Life Science and Big Data

The exponential advances in laboratory technologies in recent years have pushed past the known limits of measurement and led to a new generation of data. Targeting not only a small number of specific biomarkers but measuring everything that is measurable in a sample of tissue, blood or other sources seems to be a fresh start for understanding processes in systems biology and medicine. But this new strategy leads to manifold challenges in data analysis, which in turn require new strategies for dealing with the data. This talk will give you insights into the daily life of a data scientist working in the life sciences. For a data scientist, flexibility is one of the most important skills. I would therefore like to give you not only a broad overview of my research topics but also introduce you to some of the mathematical/statistical and biological-technical pitfalls (and how to deal with them).

Wednesday, 18.07.2018, 14:00-15:00 - Room: U3-140

Dr. Yuri Malitsky
Morgan Stanley, USA

Machine Learning in Finance

Dr. Malitsky will present an overview of the state of machine learning and data science in the financial industry. The topics of his talk are so secret that he cannot yet tell us what he will be talking about. Will we learn the secrets to becoming millionaires? Dr. Malitsky earned his PhD from Brown University in 2012, working on topics involving data science and optimization. He did his postdoc at University College Cork and worked as a researcher at the IBM Watson Research Center.