Documentation of Sample Sizes and Panel Attrition in the German Socio Economic Panel (SOEP) 1984 - 2003
Documentation of Sample Sizes and Panel Attrition in the German Socio Economic Panel (SOEP) (1984 until 2008)
Estimating the density of ethnic minorities and aged people in Berlin: multivariate kernel density estimation applied to sensitive geo-referenced administrative data protected via measurement error
Modern systems of official statistics require the timely estimation of area-specific densities of subpopulations. Ideally, estimates should be based on precise geocoded information, which is not available because of confidentiality constraints. One approach to ensuring confidentiality is to round the geo-coordinates. We propose multivariate non-parametric kernel density estimation that reverses the rounding process by using a measurement error model. The methodology is applied to the Berlin register of residents to derive density estimates of ethnic minorities and aged people. The estimates are used to identify areas with a need for new advisory centres for migrants and infrastructure for older people.
Attrition of Households and Individuals in Panel Surveys
Attrition is mostly caused by sample members who are not contacted or who refuse. On the one hand, it is well known that the reasons for attrition due to non-contact differ from those due to refusal. On the other hand, non-contact most probably affects household attrition, while refusal can affect both households and individuals. In this article, attrition at both the household level and (conditional on household participation) the individual level is analysed in three panel surveys from the Cross National Equivalent File (CNEF): the German Socio-Economic Panel (GSOEP), the British Household Panel Study (BHPS), and the Swiss Household Panel (SHP). To follow households over time we use a common rule in all three surveys. First, we find different attrition magnitudes and patterns both across the surveys and between the household and the individual level. Second, there is more evidence for reinforced rather than compensated household-level selection effects if the individual level is also taken into account.
Keywords: CNEF, individual attrition, household attrition, attrition bias, reference person, household head
Switching between different non-hierarchical administrative areas via simulated geo-coordinates: a case study for student residents in Berlin
The transformation of area aggregates between non-hierarchical area systems (administrative areas) is a standard problem in official statistics. For this problem, we present a proposal which is based on kernel density estimates. The approach applies a modification of a stochastic expectation maximization algorithm, which was proposed in the literature for the transformation of totals on rectangular areas to kernel density estimates. As a by-product of the routine, one obtains simulated geo-coordinates for each unit. With the help of these geo-coordinates, it is possible to calculate case numbers for any area system of interest. The proposed method is evaluated in a design-based simulation based on a close-to-reality, simulated data set with known exact geo-coordinates. In the empirical part, the method is applied to student resident figures from Berlin, Germany. These are known only at the level of ZIP codes, but they are needed for smaller administrative planning districts. Results for (a) student concentration areas and (b) temporal changes in the student residential areas between 2005 and 2015 are presented and discussed.
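Das Arbeitsangebot verheirateter Frauen im Lebenszyklus : Eine deskriptive Analyse einer Längsschnittstichprobe aus dem Sozio-ökonomischen Panel [The labour supply of married women over the life cycle: a descriptive analysis of a longitudinal sample from the Socio-Economic Panel]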
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
The Fade-Away Phenomenon of Initial Non-response Bias in Panel Surveys
In practice, almost every survey suffers from non-response. Non-response arises mainly because persons refuse to respond, and sometimes because persons, households, or firms are unavailable owing to invalid addresses or wrong telephone numbers, because the interviewer cannot reach a household in a remote area, or because the required information cannot be collected from a sample member in a mail survey (Armstrong and Overton, 1977; Hox and de Leeuw, 1994). In the context of panel surveys, non-response occurs when sample members do not participate in a particular wave of the study. This kind of non-response is called wave non-response. When a sample member participates in the initial wave of the survey but refuses to participate in the later waves, this kind of non-response is called panel attrition. Panel attrition is a common problem in panel surveys; it reduces sample size and can lead to biased inferences when the propensity to drop out is systematically related to the substantive outcome of interest.
In this thesis, we are mainly concerned with the problem of non-response in panel surveys. Like any non-mandatory survey, a panel survey suffers from substantial non-response at its start: between 30% and 70% of the persons in the initial sample refuse to cooperate. The motives and causes of this behaviour do not differ from those in standard cross-sectional surveys. However, in panel surveys, the respondents are interviewed repeatedly in later waves. With this repeated measurement it is possible to analyze gross change, i.e., individual change, for example transitions between poverty and non-poverty. These individual changes have a substantial impact on the distribution of the variable of interest in later waves of the panel. As a consequence, an initial bias resulting from selective non-response at the start of the panel may "fade away" in later panel waves. The fade-away phenomenon can be observed empirically in those rare cases where a panel is selected from a register and where the register information also allows statistical inferences about the non-respondents. Motivated by examples from the Finnish sub-samples of the European Community Household Panel (ECHP), the European Union Statistics on Income and Living Conditions (EU-SILC), and the German Panel Labor Market and Social Security (PASS), Alho et al. (2017) developed a statistical framework in the context of Markov chains which explains the fade-away effect of initial non-response bias.
Non-response in surveys may create a bias in the estimates. However, one advantage of panel surveys over cross-sectional surveys is that, under some regularity conditions, an initial non-response bias may fade away over later panel waves. Sisto (2003) and Rendtel (2013) studied the effect of initial non-response on income quintile estimates from the European Community Household Panel (ECHP) and on poverty rates from the European Union Statistics on Income and Living Conditions (EU-SILC). They reported that the effect of initial non-response bias declines very fast for income quintiles and poverty states in the subsequent panel waves. Assessing the fade-away hypothesis requires not only the information provided by the respondent sample but also information on the non-respondent sample, which is available via registers. Rendtel (2013) used the concept of a Markov chain to explain the fade-away phenomenon. The advantage of this approach is that it allows the use of the steady-state distribution of the Markov chain. If the transition law of the Markov chain is stable over time, then under some regularity conditions the distribution on the state space of the Markov chain converges to a stable distribution, called the steady-state distribution. Alho (2015) extends the approach to regression analysis. He uses a two-wave panel to explain the fade-away phenomenon of initial non-response bias in the framework of regression analysis with a single covariate. In the proposed regression model the covariate and the error term are decomposed into permanent and non-permanent variance components. Alho concludes that the initial non-response bias fades away when the non-permanent components of the covariate and/or the error term are low.
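The Markov chain argument can be illustrated with a minimal numerical sketch. It assumes a hypothetical two-state chain (poor, non-poor) with made-up transition probabilities, and it assumes that respondents and non-respondents follow the same transition law, which is the kind of regularity condition the argument needs. None of the numbers come from the ECHP, EU-SILC, or the thesis, and the function name `propagate` is purely illustrative.

```python
import numpy as np

# Hypothetical transition matrix (rows: state in wave t, columns: state in wave t+1).
# State 0 = poor, state 1 = non-poor. Illustrative values only, not estimates.
P = np.array([[0.6, 0.4],
              [0.1, 0.9]])

def propagate(start, transition, waves):
    """Return the state distribution in wave 0, 1, ..., `waves`."""
    dist = np.asarray(start, dtype=float)
    path = [dist]
    for _ in range(waves):
        dist = dist @ transition  # one wave forward under the same transition law
        path.append(dist)
    return np.array(path)

# Full initial sample: 20% poor (assumed).
population = propagate([0.20, 0.80], P, waves=10)
# Respondents only: the poor are under-represented at the start (assumed 10%).
respondents = propagate([0.10, 0.90], P, waves=10)

# Non-response bias of the estimated poverty rate, wave by wave.
bias = respondents[:, 0] - population[:, 0]
for wave, b in enumerate(bias):
    print(f"wave {wave}: bias = {b:+.5f}")
```

In this sketch the bias in the poverty rate halves from one wave to the next, because the second eigenvalue of the assumed transition matrix is 0.5. A more persistent chain, with a second eigenvalue closer to 1, would show a slower fade-away.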
The thesis is divided into three parts. Part I contains the theoretical foundations for the fade-away effect of initial non-response bias in panel surveys. In Part II, a simulation study is conducted to investigate the fade-away effect of the initial non-response bias in a multi-wave panel survey. The purpose of the simulation study is to investigate the accuracy of the bias approximation in a simulation setting and to check the size of the fade-away effect in later panel waves, for which no analytical bias approximation is available. Alho (2015) investigated the bias of cross-sectional OLS estimates under not missing at random (NMAR) non-response at the start of the panel. He derived an analytical bias approximation for the OLS estimate of the slope coefficient of the variable of interest. His underlying model is a variance component model with two components: a fixed individual component and an auto-regressive shock component (Alho's model is discussed in Subsection 2.3.2 of Chapter 2). However, in multi-wave panel surveys, the analytical expression for Alho's bias approximation becomes intractable for later waves. Therefore, we extend the results to panels with more waves via a simulation study.
In Chapter 3 of this thesis, we conduct a simulation study to verify the approximate results of Alho (2015) and to investigate the accuracy of the bias approximation in a simulation setting. We check the size of the fade-away in later panel waves, for which no analytical bias approximation is available. The speed of the fade-away of the initial non-response bias is then investigated for different stability scenarios of covariates and error terms, with and without attrition patterns in later panel waves. As the speed of the fade-away depends on the stability of the covariates and error terms, it is important to investigate this effect not only for simulated data but also for real longitudinal data. Therefore, in the application part (Part III) of this thesis, we switch to real data from the German Socio-Economic Panel (SOEP), specifically to its income data and life satisfaction scores.
ein Vergleich von Kalibration und Propensity Score Adjustment [a comparison of calibration and propensity score adjustment]
The aim of this work was to show whether the first panel version of the German Microcensus, covering the years 1996-1999, can be used to estimate labour force flows. It was shown that ignoring the missing information on residential movers leads to biased estimates of the gross flows. To correct for this bias, two estimators, calibration and propensity score adjustment (PSA), were compared. The performance of the two estimators was assessed using the relative bias as well as the estimated standard error.
As no unified theory existed for the two approaches, both were first presented in an asymptotic context. For both approaches, asymptotically correct variances were derived, and a new method for simulating losses due to non-response was developed. The new method was applied to the SOEP, which records residential mobility, in order to compare the performance of the two estimators.
The results of the simulation study showed that the calibration approach, using population totals together with sample information, reduced the bias considerably while slightly underestimating the standard errors. Moreover, the simulation study showed that the PSA estimator, based on a simple logit model and ignoring the correlation within households, corrected the transition from unemployment to employment even better. When calibration was additionally applied to the PSA estimator, no significant difference from the PSA estimator alone was observed. For the estimation of standard errors, the PSA estimator performed relatively well for the two transitions with the highest number of cases.
A cautious evaluation strategy would use both approaches and, in the presence of large discrepancies, conclude that considerable selection effects are induced by the missing information on residential movers.
