sose2017 - Universität Bielefeld

Summer semester 2017

Tuesday, 25.04.2017, 12:00-13:00 - Room: W9-109

Prof. Dr. Philipp Cimiano
Cluster of Excellence Cognitive Interaction Technology (CITEC), Universität Bielefeld

Talk cancelled

Tuesday, 09.05.2017, 12:00-13:00 - Room: W9-109

Prof. Dr. Dietmar Bauer
Universität Bielefeld

Estimating travel time uncertainty for long-term predictions using floating taxi data

Besides predicting route travel times, predicting the associated uncertainty is also important. For scheduling applications in logistics in particular, long-term predictions (a couple of hours ahead) are of interest. In route planning for vehicle fleets, such predictions need to be made for all possible routes in an urban road network. Taxi floating car data is currently one of the prime data sources for developing such models, which in the past mainly focused on predicting route travel times and neglected the associated uncertainties. Models for route travel time uncertainty need to take features of the corresponding link-level travel time data into account. Two features are particularly important: heteroskedasticity due to varying sample sizes across links, and correlations between fluctuations of link travel times. In this paper a large-scale floating taxi data set (the excerpt used is based on more than 30 million speed measurements covering 39 km of roads) is used to investigate these two components of link travel time uncertainty in depth. In particular, we show that the noise level in link travel times depends on the number of samples for each link and time-of-day interval, but also on the traffic situation. With regard to correlations between links, we show that travel times for adjacent links are highly correlated but that correlations fade with increasing distance. Additionally, common usage of links plays a role, with high correlations along common routes and low correlations on seldom-used routes.
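To see why both components matter for route-level uncertainty, the following minimal sketch may help. It is not the model from the talk: it only uses the standard identity that the variance of a sum of link travel times is the sum of all pairwise covariances, and the standard error formula for a mean. The three-link route, the standard deviations, and the correlation values are hypothetical.

```python
import math

def route_travel_time_std(link_stds, corr):
    """Standard deviation of a route travel time that is the sum of link travel times.

    link_stds[i] is the standard deviation of link i's travel time and
    corr[i][j] is the correlation between links i and j (corr[i][i] == 1).
    """
    n = len(link_stds)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += corr[i][j] * link_stds[i] * link_stds[j]
    return math.sqrt(variance)

def std_of_link_mean(sigma, n_samples):
    """Standard error of a link's mean travel time estimated from n_samples measurements."""
    return sigma / math.sqrt(n_samples)

# Hypothetical three-link route (standard deviations in seconds).
link_stds = [10.0, 12.0, 8.0]

# Assumed correlation pattern: high for adjacent links, fading with distance.
fading = [[1.0, 0.8, 0.4],
          [0.8, 1.0, 0.8],
          [0.4, 0.8, 1.0]]
# Reference case that ignores correlations between links.
independent = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]]

std_correlated = route_travel_time_std(link_stds, fading)
std_independent = route_travel_time_std(link_stds, independent)

# Fewer samples on a link give a noisier estimate of its mean travel time.
noise_sparse_link = std_of_link_mean(10.0, 4)
noise_dense_link = std_of_link_mean(10.0, 100)
```

In this toy setting, ignoring the positive correlations understates the route-level standard deviation, and the sparsely sampled link has a five times larger standard error than the densely sampled one.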

Tuesday, 23.05.2017, 12:00-13:00 - Room: W9-109

Dr. Kristian Kleinke
Universität Bielefeld

Quantile regression based multiple imputation

Research by Yu, Burton, and Rivero-Arias (2007) and Kleinke (2017) suggests that multiple imputation (MI) procedures can lead to biased inferences when their distributional assumptions depart too heavily from the empirical data at hand. However, robust procedures that can cope with various aspects of "non-normality" and heterogeneity are typically not yet implemented in major commercial statistical packages. Furthermore, evaluation studies that systematically test the assumed robustness of these procedures are scarce. We compared the performance of various standard and robust MI procedures available in R with regard to their ability to yield unbiased parameter estimates and standard errors when the empirical data are skewed and heteroscedastic. Performance was evaluated in a Monte Carlo simulation based on empirical data from the Erlangen-Nuremberg development and prevention project: we predicted physical punishment by fathers of elementary school children from socio-demographic variables, fathers’ aggressiveness, dysfunctional parent-child relations, and various other parenting characteristics (cf. Haupt, Lösel, & Stemmler, 2014). Haupt et al. (2014) compared results of standard OLS regression and more robust regression procedures and deemed quantile regression (QR) an appropriate method to analyse these data. Analogously, we assumed that creating multiple imputations under a QR-based imputation model would be more appropriate than using regression-based MI solutions that rely on a normal homoscedastic model. Overall, QR-based MI yielded the most accurate results.
Advantages of this procedure are: (a) quantile regression is completely distribution free; (b) because QR estimates conditional quantiles of the response variable rather than "just" its conditional mean given certain values of the predictor variables, QR gives a more comprehensive picture of the relationships in the data; and (c) model specification is quite simple and straightforward.
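The distribution-free character of quantile regression comes from its loss function. The sketch below is an illustration only and does not reproduce the imputation procedures evaluated in the talk: it shows that minimising the check (pinball) loss over a constant recovers a sample quantile without any distributional assumption. The exponential toy data and the function names are assumptions made for this example.

```python
import random

def pinball_loss(q, values, tau):
    """Average check (pinball) loss of predicting the constant q for the tau-quantile."""
    total = 0.0
    for y in values:
        diff = y - q
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(values)

def best_constant(values, tau):
    """Sample value that minimises the pinball loss (brute-force search)."""
    return min(sorted(set(values)), key=lambda q: pinball_loss(q, values, tau))

random.seed(0)
# Skewed toy data (exponential), standing in for a non-normal outcome.
sample = [random.expovariate(1.0) for _ in range(1001)]

median_fit = best_constant(sample, 0.5)  # estimate of the median
upper_fit = best_constant(sample, 0.9)   # estimate of the 0.9 quantile
```

Quantile regression applies the same loss with a linear predictor in place of the constant, which is what lets it describe several conditional quantiles of a skewed, heteroscedastic outcome rather than only its conditional mean.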

Tuesday, 06.06.2017, 12:00-13:00 - Room: W9-109

Prof. Dr. Göran Kauermann
Institute of Statistics, Ludwig-Maximilians-Universität München

Rent indices (Mietspiegel) today: between statistical aspiration and reality

The talk discusses the statistical hurdles in compiling rent indices (Mietspiegel). With the introduction of the rent cap (Mietpreisbremse) and the upcoming "Mietspiegelverordnung" (rent index regulation), rent indices have gained in importance. The talk reviews the statistical requirements that a rent index should meet and, using the rent indices of Berlin, Bielefeld, Hamburg, Freiburg, and München as examples, gives a picture of reality. It turns out that the statistical quality of these rent indices could hardly differ more. In the talk we point out the pitfalls in the statistical analysis and discuss challenges and the necessary minimum standards that a qualified rent index (qualifizierter Mietspiegel) should meet from a statistical point of view.

Tuesday, 20.06.2017, 12:00-13:00 - Room: W9-109

J.-Prof. Dr. Axel Mayer
Institute of Psychology, RWTH Aachen University

Structural equation models for analysing the effectiveness of interventions

Evaluating the effectiveness of a treatment or intervention is an important area of social science methodology. Usually the question is not only whether an intervention is effective on average, but also for whom and under which conditions. The talk presents new statistical methods for analysing average and conditional effects in experimental and quasi-experimental designs and illustrates them with practical examples. These methods are based on a multigroup structural equation model and therefore allow for manifest and latent dependent variables and manifest and latent covariates, as well as categorical covariates and higher-order interactions. The talk also shows how the approach can be extended to designs with a multilevel structure and how propensity scores can be included in the analysis. Various options for computing standard errors are discussed, along with testing informative hypotheses and ways of extending and adapting the statistical model. During the talk the computer program EffectLiteR is presented, which facilitates the comprehensive analysis of the effectiveness of an intervention using structural equation models and thus makes such analyses accessible to a broad range of users. EffectLiteR is an open source R program that is based on lavaan and has a graphical user interface.

Tuesday, 04.07.2017, 12:00-13:00 - Room: W9-109

Dr. Christian Heinze
Universität Bielefeld

Estimation of structured transition matrices in high dimensions

This talk considers the estimation of vector autoregressive dynamics in a high-dimensional regime, wherein the cross-sectional dimension may be relatively large compared to the sample size. Estimation in this setting requires some extra structure. The recent literature has given much attention to the case of sparse transition matrices and to penalized estimation that adapts to the unknown sparsity pattern. In this talk, the transition matrices are instead assumed to be well approximated by matrices with a common, low-dimensional column space whose elements exhibit a smoothness property. This scenario is motivated by a factor model for a spatiotemporal process whose spatial domain amounts to a given weighted graph. Therein, the columns of the factor loading matrix determine the factors' influence and are assumed to be smooth in space, that is, on the graph. These properties carry over to the transition matrices of the best autoregressive predictor for the common component, which become the target for estimation. The proposed estimator is the unique minimizer of a penalized least-squares criterion whose penalty term is the (scaled) composition of the nuclear norm and a linear weighting function. The two factors of this penalty term reflect its twofold objective: first, the nuclear norm promotes low-rank estimates (to match the parameter reduction due to the factor structure), and second, the linear weighting function encourages smoothness on the graph. The talk presents some guarantees on the corresponding estimation error.
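How a nuclear norm penalty promotes low rank can be seen from its proximal operator, which soft-thresholds singular values. The sketch below covers only this plain nuclear norm step; it is not the estimator from the talk and leaves out the linear weighting function that encourages smoothness on the graph. The matrix sizes, noise level, and threshold are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def singular_value_threshold(matrix, threshold):
    """Proximal operator of threshold * nuclear norm: soft-threshold the singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - threshold, 0.0)
    # Scale the columns of u by the shrunken singular values and recombine.
    return (u * s_shrunk) @ vt

def numerical_rank(matrix, tol=1e-8):
    """Number of singular values larger than tol."""
    return int(np.sum(np.linalg.svd(matrix, compute_uv=False) > tol))

rng = np.random.default_rng(0)

# Hypothetical rank-2 matrix with singular values 5 and 3.
low_rank = np.zeros((6, 6))
low_rank[0, 0] = 5.0
low_rank[1, 1] = 3.0

# Small dense noise makes the observed matrix full rank.
noisy = low_rank + 0.01 * rng.standard_normal((6, 6))

denoised = singular_value_threshold(noisy, threshold=0.5)
rank_before = numerical_rank(noisy)
rank_after = numerical_rank(denoised)
```

Thresholding removes the small singular values introduced by the noise and keeps the two dominant ones, so the result has rank 2 again, at the cost of shrinking the retained singular values by the threshold.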

Tuesday, 18.07.2017, 12:00-13:00 - Room: W9-109

Prof. Dr. Philipp Cimiano
Cluster of Excellence Cognitive Interaction Technology (CITEC), Universität Bielefeld

Probabilistic Graphical Models for Natural Language Processing: the case of sentiment analysis

In this talk we introduce standard tasks in NLP and show how they can be modelled as problems of statistical modelling and inference. We introduce probabilistic graphical models as applied in natural language processing and discuss in particular how to apply them to fine-grained, or aspect-oriented, sentiment analysis treated as a joint variable prediction problem. We discuss the impact that modelling the joint interaction between the key variables, aspect and sentiment, has on the task. We further discuss recent results on transfer learning, specifically on transferring models between different natural languages, in our case English and German.
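The difference between joint and independent prediction can be illustrated with a toy factor model over two variables. This is a minimal sketch and not the model presented in the talk: the label sets and all log-potentials are hypothetical, and inference is done by brute-force enumeration, which is feasible only because the example is tiny.

```python
import itertools
import math

ASPECTS = ["battery", "screen"]
SENTIMENTS = ["positive", "negative"]

# Hypothetical log-potentials; in practice these would be learned from data.
aspect_score = {"battery": 0.2, "screen": 0.0}
sentiment_score = {"positive": 0.5, "negative": 0.0}
pair_score = {
    ("battery", "positive"): 0.0,
    ("battery", "negative"): 1.5,
    ("screen", "positive"): 1.0,
    ("screen", "negative"): 0.0,
}

def joint_score(aspect, sentiment):
    """Sum of the unary factors and the pairwise aspect-sentiment factor."""
    return (aspect_score[aspect] + sentiment_score[sentiment]
            + pair_score[(aspect, sentiment)])

def joint_distribution():
    """Normalised probabilities over all (aspect, sentiment) assignments."""
    scores = {(a, s): joint_score(a, s)
              for a, s in itertools.product(ASPECTS, SENTIMENTS)}
    z = sum(math.exp(v) for v in scores.values())
    return {k: math.exp(v) / z for k, v in scores.items()}

probs = joint_distribution()

# Joint prediction: most probable assignment of both variables together.
joint_best = max(probs, key=probs.get)

# Independent prediction: each variable chosen from its unary factor alone.
independent_best = (max(ASPECTS, key=aspect_score.get),
                    max(SENTIMENTS, key=sentiment_score.get))
```

With these made-up potentials the joint prediction is ("battery", "negative"), whereas predicting each variable on its own gives ("battery", "positive"): the pairwise factor changes the answer.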