Fast adaptive penalized splines
Krivobokova T, Crainiceanu CM, Kauermann G. Fast adaptive penalized splines. Journal of Computational and Graphical Statistics. 2008;17(1):1-20.
This article proposes a numerically simple method for locally adaptive smoothing. The heterogeneous regression function is modeled as a penalized spline whose varying smoothing parameter is itself modeled as another penalized spline. This is formulated as a hierarchical mixed model, with spline coefficients following a zero-mean normal distribution with a smooth variance structure. The major contribution of this article is the use of the Laplace approximation of the marginal likelihood for estimation. This method is numerically simple and fast. The idea is extended to spatial and non-normal response smoothing.
Testing differentially expressed genes in dose-response studies and with ordinal phenotypes
When testing for differentially expressed genes between more than two groups, the groups are often defined by dose levels in dose-response experiments or by ordinal phenotypes, such as disease stages. We discuss the potential of a new approach that uses the ordering of the levels without making any structural assumptions, such as monotonicity, by testing for zero variance components in a mixed-model framework. Since the mixed-effects model approach borrows strength across doses/levels, the proposed test can also be applied when the number of dose levels/phenotypes is large and/or the number of subjects per group is small. We illustrate the new test in simulation studies and on several publicly available datasets and compare it to alternative testing procedures. All tests considered are implemented in R and are publicly available. The new approach offers a very fast and powerful way to test for differentially expressed genes between ordered groups without making restrictive assumptions about the true relationship between factor levels and response.
Generalized Semiparametric Regression Models with Nonparametric Effects of Covariates Measured with Error
Longitudinal scalar-on-functions regression with application to tractography data
We propose a class of estimation techniques for scalar-on-function regression where both outcomes and functional predictors may be observed at multiple visits. Our methods are motivated by a longitudinal brain diffusion tensor imaging tractography study. One of the study's primary goals is to evaluate the contemporaneous association between human function and brain imaging over time. The complexity of the study requires the development of methods that can simultaneously incorporate: (1) multiple functional (and scalar) regressors; (2) longitudinal outcome and predictor measurements per patient; (3) Gaussian or non-Gaussian outcomes; and (4) missing values within functional predictors. We propose two versions of a new method, longitudinal functional principal components regression (PCR). These methods extend the well-known functional PCR and allow for different effects of subject-specific trends in curves and of visit-specific deviations from that trend. The new methods are compared with existing approaches, and the most promising techniques are used for analyzing the tractography data
Millennial Scale Sea Level Curve Estimation
Staudenmayer, John; Balco, Greg; Gehrels, W. Roland; Altman, Naomi; Crainiceanu, Ciprian; Qui, Jing. 26 pages, 1 article.
Predictive performance of physical activity on mortality using UK Biobank and NHANES datasets
The absolute and relative mortality prediction performance of objective measures of physical activity obtained from accelerometers is quantified in the UK Biobank and the National Health and Nutrition Examination Survey (NHANES). The studies were analyzed separately because of differences in the objective physical activity measurements as well as in some traditional predictors of mortality. Prediction performance was assessed using the ten-fold cross-validated C (10-f-CV) index. In NHANES, using single-variable Cox regression models, the most predictive variable was age, followed by total activity count (TAC) and 12 other accelerometer-derived summaries. In UK Biobank, the top five most predictive variables were accelerometer-derived summaries, with the most predictive variable being total acceleration (TA). The most predictive non-physical-activity variable was age. Out of the top 15 predictors, 14 were accelerometry-derived objective measurements of physical activity. In NHANES, forward selection with a stopping rule of an increase of less than 0.001 in the 10-f-CV index resulted in a model with 10 predictors: age, active-to-sedentary transition probability (ASTP), smoking status, coronary heart failure (CHF), drinking status, gender, mobility problem, diabetes, body mass index (BMI) and education. In UK Biobank, a similar procedure resulted in a nine-variable model including total acceleration (TA), age, relative amplitude (RA), longstanding illness/disability, cigarette smoking, injury/illness within past 2 years, gender, cancer and high blood pressure. Another approach was a two-stage forward selection in which traditional predictors were included first and accelerometer-derived physical activity summaries second. Using the same stopping criterion in NHANES, this resulted in a 13-variable model which included, in this order, age, mobility problem, smoking status, CHF, drinking status, gender, diabetes, BMI, stroke, ASTP, TLAC11 and RA.
In UK Biobank, the two-stage forward selection procedure resulted in a 10-variable model which included age, self-reported overall health, gender, cigarette smoking, longstanding illness/disability, injury/illness within past 2 years, high blood pressure, cancer, RA, and a two-hour summary of total log activity. The analytic results in this thesis show that accelerometer-derived physical activity summaries: (1) outperform traditional risk factors in terms of mortality prediction performance; and (2) improve prediction performance when added to traditional risk factors. Possible reasons for the differences and similarities between the results from NHANES and UK Biobank are provided.
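The forward-selection stopping rule described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis code: the function `forward_select`, the `score` callback (standing in for a ten-fold cross-validated C index of a Cox model) and the toy gains are all hypothetical and exist only to show the "stop when the gain is below 0.001" logic.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.001,
    baseline: float = 0.5,
) -> Tuple[List[str], float]:
    """Greedy forward selection.

    `score` is a hypothetical callback returning a performance measure
    (for example a cross-validated C index) for a list of predictors.
    `baseline` is the assumed score of a model with no predictors.
    """
    selected: List[str] = []
    remaining = list(candidates)
    best = baseline
    while remaining:
        # Score every model obtained by adding one more candidate.
        scored = [(score(selected + [c]), c) for c in remaining]
        top_score, top_var = max(scored, key=lambda pair: pair[0])
        # Stopping rule: quit when the best addition gains less than min_gain.
        if top_score - best < min_gain:
            break
        selected.append(top_var)
        remaining.remove(top_var)
        best = top_score
    return selected, best


# Toy scoring function (made-up additive gains, not real results).
toy_gains = {"age": 0.2, "ASTP": 0.05, "BMI": 0.0005}


def toy_score(variables: List[str]) -> float:
    return 0.5 + sum(toy_gains[v] for v in variables)


chosen, final_score = forward_select(list(toy_gains), toy_score)
```

In the toy run, the third candidate is rejected because its gain falls below the threshold. A two-stage variant would call the same routine twice, first on traditional predictors and then on accelerometry summaries with the first-stage variables held fixed.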
Quantifying the Association between Objectively-Measured Physical Activity and Current or Future Multiple Sclerosis Status in the UK Biobank
Background: Objectively measured physical activity (PA) data were collected in the accelerometry sub-study of the UK Biobank. The association between PA and multiple sclerosis (MS) is studied.
Objectives: (1) Compare and quantify PA difference between MS cases and controls. (2) Evaluate the predictive role of PA for future MS.
Methods: Eight accelerometer-derived PA variables were used, including total acceleration (TA), total log-transformed acceleration (TLA), total sedentary time (ST), total minutes of light-intensity physical activity (LIPA), total minutes of moderate-to-vigorous physical activity (MVPA), active-to-sedentary transition probability (ASTP), sedentary-to-active transition probability (SATP), and relative amplitude (RA). In the first sub-study, each current (at accelerometer-wearing) MS patient was matched with 30 participants without MS on age, sex and BMI. The PA differences between case and control groups were visually compared using density plots and confirmed by Welch's two-sample t-tests. In the second sub-study, the accelerometry data excluding participants diagnosed with MS before accelerometer-wearing were used to analyze PA’s predictive effects. Single-predictor Cox proportional hazards models were built for PA variables and traditional predictors. Concordance was used to compare their performances. Two-stage forward selection was used to decide the best combination of predictors.
Results: In the case-control study, PA was found significantly different between cases and controls for all eight PA variables (p < 0.001), among which mean MVPA differed the most by 39%. In the survival analysis, the variable that performed the best in the single-predictor models was age (C = 0.604). The best-performing PA variable was RA (C = 0.594). The variables with significant coefficients were age (p = 0.048), RA (p = 0.004), and stroke (p = 0.010). The selected predictors from the two-stage forward selection procedure were age (p = 0.015), stroke (p = 0.009), Townsend deprivation index (p = 0.874), and RA (p = 0.004), reaching a 0.693 overall concordance. Age, stroke, and RA were then used to construct the final model, which had a concordance of C = 0.691.
Conclusion: PA levels were found to be significantly lower among MS patients than among those without MS, which is consistent with previous studies. Among older individuals, younger age, stroke history, and lower RA play significant roles in predicting MS diagnosis.
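The concordance values reported above compare how well each predictor ranks participants by time to diagnosis. As a rough illustration only, the sketch below computes Harrell's C for right-censored data in plain Python; that the study used exactly this estimator is an assumption, and the function name `concordance_index` and the toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by `risk`.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    strictly before subject j's observed time. Higher risk should mean an
    earlier event. Ties in risk count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): follow-up times, event indicators, risk scores.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.3, 0.1]
c_perfect = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to ranking no better than chance, which gives a reference point for the single-predictor results reported above, such as C = 0.604 for age.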
FINGERPRINTING OF WALKING USING DATA FROM WRIST AND ANKLE-WORN ACCELEROMETERS
Identifying an individual from accelerometry data collected during walking without reliance on step-cycle detection has not been achieved with high accuracy. Therefore, we propose an open-source reproducible method to: (1) create a unique, person-specific “walking fingerprint” from a sample of un-landmarked high-resolution data collected by wrist and ankle-worn accelerometers; and (2) predict who an individual is from their walking fingerprint.
Accelerometry data were collected during walking from 32 individuals (19 females) aged 23 to 52 years for at least 380 s each. For the purposes of this study, the data were neither landmarked nor synchronized. Individual walking fingerprints were created by:
(1) partitioning the accelerometer time series into adjacent, non-overlapping one-second intervals; (2) transforming all one-second interval data for a given individual into a three-dimensional (3D) image obtained by plotting each one-second interval time series against the lagged time series for a series of lags; (3) partitioning the resulting participant-specific 3D images into a grid of cells; and (4) identifying the combinations of grid cells (areas in the 3D image) that best predict the individual. For every participant, the first 200 s of data were used for training and the last 180 s for testing. This approach segments walking rather than individual strides, which reduces its dependence on complementary algorithms and increases its generalizability.
This method correctly identified 100% of the participants in testing data for left-wrist worn accelerometry [1] and highlighted unique features of walking that characterize the individuals. This is significant as predicting the identity of an individual from their walking patterns has immediate implications that can complement or replace those of actual fingerprinting, voice, and image recognition. Furthermore, as walking may change with age or disease burden, individual walking fingerprints may be used as biomarkers of change in health status with potential clinical and epidemiologic implications
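Steps (1) to (3) of the fingerprint construction can be sketched in a few lines of Python. This is one reading of the abstract, not the authors' open-source implementation: the 3D image is assumed to have axes (lag, value, lagged value), and the names `split_seconds`, `lag_grid_counts`, `fs` and `cell` are hypothetical. Step (4), selecting the most predictive cells, is omitted.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def split_seconds(signal: Sequence[float], fs: int) -> List[Sequence[float]]:
    """Step (1): adjacent, non-overlapping one-second windows of `fs` samples.

    Trailing samples that do not fill a whole second are dropped.
    """
    n_windows = len(signal) // fs
    return [signal[k * fs:(k + 1) * fs] for k in range(n_windows)]


def lag_grid_counts(
    signal: Sequence[float],
    fs: int,
    lags: Sequence[int],
    cell: float,
) -> Counter:
    """Steps (2) and (3): count (value, lagged value) points per grid cell.

    Keys are (lag, row, column), where row and column index square cells
    of side `cell` (an assumed grid shape) in the value and lagged-value axes.
    """
    counts: Counter = Counter()
    for window in split_seconds(signal, fs):
        for lag in lags:
            # Pair each sample with the sample `lag` steps later in the same second.
            for t in range(len(window) - lag):
                x, y = window[t], window[t + lag]
                key: Tuple[int, int, int] = (lag, int(x // cell), int(y // cell))
                counts[key] += 1
    return counts


# Toy signal (invented): two "seconds" sampled at 4 Hz.
toy_signal = [0.0, 0.1, 0.6, 0.7, 1.2, 1.3, 0.2, 0.9]
toy_counts = lag_grid_counts(toy_signal, fs=4, lags=[1], cell=0.5)
```

The resulting cell counts could then serve as features for a classifier that predicts the participant, trained on the first 200 s and evaluated on the last 180 s as described above.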
