ARDC Research Link Australia

Publication

SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data

Publisher: Cold Spring Harbor Laboratory

Date: 12-07-2021

DOI: 10.1101/2021.07.11.451967

Abstract: Survival analysis is a branch of statistics that deals with both the tracking of time and of the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarise the survival time and perform a classification analysis. Here, we develop a benchmarking framework, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics datasets. SurvBenchmark not only focuses on classical approaches such as the Cox model, but it also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics, including model predictability, stability, flexibility and computational issues. Our systematic comparison framework with over 320 comparisons (20 methods over 16 datasets) shows that the performances of survival models vary in practice over real-world datasets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies. Contact: jean.yang@sydney.edu.au

Publication

A variational Bayes approach to variable selection

Publisher: Institute of Mathematical Statistics

Date: 2017

DOI: 10.1214/17-EJS1332

Publication

A Note on the Effect on Power of Score Tests via Dimension Reduction by Penalized Regression under the Null

Publisher: Walter de Gruyter GmbH

Date: 29-01-2010

DOI: 10.2202/1557-4679.1231

Publication

Robust estimation of precision matrices under cellwise contamination

Publisher: Elsevier BV

Date: 2016

DOI: 10.1016/J.CSDA.2015.02.005

Publication

svReg: Structural varying‐coefficient regression to differentiate how regional brain atrophy affects motor impairment for Huntington disease severity groups

Publisher: Wiley

Date: 19-04-2021

DOI: 10.1002/BIMJ.202000312

Abstract: For Huntington disease, identification of brain regions related to motor impairment can be useful for developing interventions to alleviate the motor symptom, the major symptom of the disease. However, the effects from the brain regions to motor impairment may vary for different groups of patients. Hence, our interest is not only to identify the brain regions but also to understand how their effects on motor impairment differ by patient groups. This can be cast as a model selection problem for a varying‐coefficient regression. However, this is challenging when there is a pre‐specified group structure among variables. We propose a novel variable selection method for a varying‐coefficient regression with such structured variables and provide a publicly available R package svreg for implementation of our method. Our method is empirically shown to select relevant variables consistently. Also, our method screens irrelevant variables better than existing methods. Hence, our method leads to a model with higher sensitivity, lower false discovery rate and higher prediction accuracy than the existing methods. Finally, we found that the effects from the brain regions to motor impairment differ by disease severity of the patients. To the best of our knowledge, our study is the first to identify such interaction effects between the disease severity and brain regions, which indicates the need for customized intervention by disease severity.

Publication

A Multi-Step Precision Pathway for Predicting Allograft Survival in Heterogeneous Cohorts of Kidney Transplant Recipients

Publisher: Frontiers Media SA

Date: 12-09-2023

DOI: 10.3389/TI.2023.11338

Publication

A prediction model for viability at the end of the first trimester after a single early pregnancy evaluation

Publisher: Wiley

Date: 02-2013

DOI: 10.1111/AJO.12046

Abstract: The aim was to develop a new model to predict the outcome at the end of the 1st trimester after a single visit to the early pregnancy unit (EPU). Prospective observational study in the EPU at Nepean Hospital, between November 2006 and February 2009. Data were collected from all women in the 1st trimester of their pregnancy who had a live intrauterine pregnancy (IUP) at the 1st transvaginal ultrasound scan (TVS). 29 historical, clinical and ultrasound end points were recorded. Women were followed until the final diagnosis was established at the end of the 1st trimester: viability or nonviability. A multinomial logistic regression model was developed. The performance of this model was evaluated using receiver operating characteristic (ROC) curves. Data from 416 pregnancies were included: 92.1% were live beyond the 1st trimester, and 7.9% had miscarried. The most useful prognostic variables for developing the logistic regression model were gestational age by dates, vaginal (PV) bleeding, PV clots, gestational age by TVS, consistency with menstrual dates, mean gestational sac (GS) size, mean yolk sac (YS) size and number of previous caesarean sections. Used retrospectively on 416 women based on 25 imputations, the model gave an AUC of 0.88. Based on cross-validation, the independent predictive power obtained an AUC of 0.78. We have developed a new model to predict the outcome of the 1st trimester in women with live IUP at the 1st scan.
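The AUC figures quoted in this abstract can be read as the probability that a randomly chosen viable pregnancy receives a higher predicted score than a randomly chosen nonviable one. The sketch below illustrates that reading of the AUC in general; it is not the authors' code, and the function name `roc_auc` and the toy labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 outcomes (1 = positive class).
    scores: iterable of predicted scores, higher = more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 of the 4 positive/negative pairs are ordered correctly.
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of a few hundred; a rank-based formula gives the same value faster on larger data.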

Publication

MCVIS: A New Framework for Collinearity Discovery, Diagnostic, and Visualization

Publisher: Informa UK Limited

Date: 30-07-2020

DOI: 10.1080/10618600.2020.1779729

Publication

Weighted least squares estimation of the extreme value index

Publisher: Elsevier BV

Date: 05-2006

DOI: 10.1016/J.SPL.2005.10.025

Publication

Predation risk and competitive interactions affect foraging of an endangered refuge-dependent herbivore

Publisher: Wiley

Date: 25-02-2011

DOI: 10.1111/J.1469-1795.2011.00446.X

Publication

GEE-Assisted Variable Selection for Latent Variable Models with Multivariate Binary Data

Publisher: Informa UK Limited

Date: 17-11-2021

DOI: 10.1080/01621459.2021.1987251

Publication

2nd special issue on robust analysis of complex data

Publisher: Elsevier BV

Date: 09-2017

DOI: 10.1016/J.CSDA.2017.05.013

Publication

Prediction modeling—part 2: using machine learning strategies to improve transplantation outcomes

Publisher: Elsevier BV

Date: 04-2021

DOI: 10.1016/J.KINT.2020.08.026

Publication

Partially smooth tail-index estimation for small samples

Publisher: Springer Science and Business Media LLC

Date: 30-12-2010

DOI: 10.1007/S00180-010-0221-5

Publication

A PERSONALISED PREDICTION MODEL FOR ALLOGRAFT SURVIVAL AFTER KIDNEY TRANSPLANTATION

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 09-2020

DOI: 10.1097/01.TP.0000698456.11863.66

Publication

Melanoma Explorer: a web application to allow easy reanalysis of publicly available and clinically annotated melanoma omics data sets

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 06-2019

DOI: 10.1097/CMR.0000000000000533

Publication

Nutritional Intake and Gut Microbiome Composition Predict Parkinson’s Disease

Publisher: Frontiers Media SA

Date: 11-05-2022

DOI: 10.3389/FNAGI.2022.881872

Abstract: Models to predict Parkinson’s disease (PD) incorporating alterations of gut microbiome (GM) composition have been reported with varying success. We aimed to assess the utility of GM compositional changes combined with macronutrient intake to develop a predictive model of PD. We performed a cross-sectional analysis of the GM and nutritional intake in 103 PD patients and 81 household controls (HCs). GM composition was determined by 16S amplicon sequencing of the V3-V4 region of bacterial ribosomal DNA isolated from stool. To determine multivariate disease-discriminant associations, we developed two models using Random Forest and support-vector machine (SVM) methodologies. Using updated taxonomic reference, we identified significant compositional differences in the GM profiles of PD patients in association with a variety of clinical PD characteristics. Six genera were overrepresented and eight underrepresented in PD patients relative to HCs, with the largest difference being overrepresentation of Lactobacillaceae at family taxonomic level. Correlation analyses highlighted multiple associations between clinical characteristics and select taxa, whilst constipation severity, physical activity and pharmacological therapies associated with changes in beta diversity. The random forest model of PD, incorporating taxonomic data at the genus level and carbohydrate contribution to total energy, demonstrated the best predictive capacity [Area under the ROC Curve (AUC) of 0.74]. The notable differences in GM diversity and composition when combined with clinical measures and nutritional data enabled the development of a predictive model to identify PD. These findings support the combination of GM and nutritional data as a potentially useful biomarker of PD to improve diagnosis and guide clinical management.

Publication

Determination of prognosis in metastatic melanoma through integration of clinico-pathologic, mutation, mRNA, microRNA, and protein information

Publisher: Wiley

Date: 24-07-2015

DOI: 10.1002/IJC.29047

Abstract: In patients with metastatic melanoma, the identification and validation of accurate prognostic biomarkers will assist rational treatment planning. Studies based on "-omics" technologies have focussed on a single high-throughput data type such as gene or microRNA transcripts. Occasionally, these features have been evaluated in conjunction with limited clinico-pathologic data. With the increased availability of multiple data types, there is a pressing need to tease apart which of these sources contain the most valuable prognostic information. We evaluated and integrated several data types derived from the same tumor specimens in AJCC stage III melanoma patients (gene, protein, and microRNA expression as well as clinical, pathologic and mutation information) to determine their relative impact on prognosis. We used classification frameworks based on pre-validation and bootstrap multiple imputation to compare the prognostic power of each data source, both individually as well as integratively. We found that the prognostic utility of clinico-pathologic information was not out-performed by any of the various "-omics" platforms. Rather, a combination of clinico-pathologic variables and mRNA expression data performed best. Furthermore, a patient-based classification analysis revealed that the prognostic accuracy of various data types was not the same for different patients. This indicates that ongoing development in the individualized evaluation of melanoma patients must take account of the value of both traditional and novel "-omics" measurements.

Publication

Random Effects Misspecification Can Have Severe Consequences for Random Effects Inference in Linear Mixed Models

Publisher: Wiley

Date: 15-04-2020

DOI: 10.1111/INSR.12378

Abstract: There has been considerable and controversial research over the past two decades into how successfully random effects misspecification in mixed models (i.e. assuming normality for the random effects when the true distribution is non‐normal) can be diagnosed and what its impacts are on estimation and inference. However, much of this research has focused on fixed effects inference in generalised linear mixed models. In this article, motivated by the increasing number of applications of mixed models where interest is on the variance components, we study the effects of random effects misspecification on random effects inference in linear mixed models, for which there is considerably less literature. Our findings are surprising and contrary to general belief: for point estimation, maximum likelihood estimation of the variance components under misspecification is consistent, although in finite samples, both the bias and mean squared error can be substantial. For inference, we show through theory and simulation that under misspecification, standard likelihood ratio tests of truly non‐zero variance components can suffer from severely inflated type I errors, and confidence intervals for the variance components can exhibit considerable undercoverage. Furthermore, neither of these problems vanishes asymptotically with increasing the number of clusters or cluster size. These results have major implications for random effects inference, especially if the true random effects distribution is heavier tailed than the normal. Fortunately, simple graphical and goodness‐of‐fit measures of the random effects predictions appear to have reasonable power at detecting misspecification. We apply linear mixed models to a survey of more than 4 000 high school students within 100 schools and analyse how mathematics achievement scores vary with student attributes and across different schools. The application demonstrates the sensitivity of mixed model inference to the true but unknown random effects distribution.

Publication

Cross-Platform Omics Prediction procedure: a game changer for implementing precision medicine in patients with stage-III melanoma

Publisher: Cold Spring Harbor Laboratory

Date: 09-12-2020

DOI: 10.1101/2020.12.09.415927

Abstract: There is no consensus methodology that can account for the variation in omics signatures when they are acquired across different platforms and times. This poses a significant barrier to the implementation of valuable biomarkers into clinical practice. We present a novel procedure, Cross-Platform Omics Prediction, that accounts for these variations and is suitable for prospective and multi-centre clinical implementation, and we demonstrate its utility in three risk models for different diseases.

Publication

The LASSO on latent indices for regression modeling with ordinal categorical predictors

Publisher: Elsevier BV

Date: 09-2020

DOI: 10.1016/J.CSDA.2020.106951

Publication

NEMoE: A nutrition aware regularized mixture of experts model addressing diet-cohort heterogeneity of gut microbiota in Parkinson’s disease

Publisher: Cold Spring Harbor Laboratory

Date: 11-11-2021

DOI: 10.1101/2021.11.10.21266194

Abstract: The microbiome plays a fundamental role in human health and diet is one of the strongest modulators of the gut microbiome. However, interactions between microbiota and host health are complex and diverse. Understanding the interplay between diet, the microbiome and health state could enable the design of personalized intervention strategies and improve the health and wellbeing of affected individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. To this end, we present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson’s disease cohort revealed that NEMoE is capable of not only improving predictive performance for Parkinson’s Disease but also for identifying diet-specific microbiome markers of disease. Our results indicate that NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.

Publication

Estimation of graphical models for skew continuous data

Publisher: Wiley

Date: 06-02-2022

DOI: 10.1111/SJOS.12569

Abstract: We consider a new approach for estimating non‐Gaussian undirected graphical models. Specifically, we model continuous data from a class of multivariate skewed distributions, whose conditional dependence structure depends on both a precision matrix and a shape vector. To estimate the graph, we propose a novel estimation method based on nodewise regression: we first fit a linear model, and then fit a one component projection pursuit regression model to the residuals obtained from the linear model, and finally threshold appropriate quantities. Theoretically, we establish error bounds for each nodewise regression and prove the consistency of the estimated graph when the number of variables diverges with the sample size. Simulation results demonstrate the strong finite sample performance of our new method over existing methods for estimating Gaussian and non‐Gaussian graphical models. Finally, we demonstrate an application of the proposed method on observations of physicochemical properties of wine.

Publication

Identification, Review, and Systematic Cross-Validation of microRNA Prognostic Signatures in Metastatic Melanoma

Publisher: Elsevier BV

Date: 2016

DOI: 10.1038/JID.2015.355

Publication

Estimating the number of motor units using random sums with independently thinned terms

Publisher: Elsevier BV

Date: 07-2006

DOI: 10.1016/J.MBS.2006.04.006

Abstract: The problem of estimating the numbers of motor units N in a muscle is embedded in a general stochastic model using the notion of thinning from point process theory. In the paper a new moment type estimator for the numbers of motor units in a muscle is defined, which is derived using random sums with independently thinned terms. Asymptotic normality of the estimator is shown and its practical value is demonstrated with bootstrap and approximative confidence intervals for a data set from a 31-year-old healthy right-handed, female volunteer. Moreover, simulation results are presented and Monte-Carlo based quantiles, means, and variances are calculated for N ∈ {300, 600, 1000}.

Publication

The difference of symmetric quantiles under long range dependence

Publisher: Elsevier BV

Date: 03-2015

DOI: 10.1016/J.SPL.2014.12.022

Publication

A radiographic analysis of the abnormal hallux interphalangeus angle range: Considerations for surgeons performing Akin osteotomies

Publisher: SAGE Publications

Date: 09-04-2019

DOI: 10.1177/2309499019841093

Abstract: This study aimed to determine the abnormal radiological hallux interphalangeus angle (HIA) range, which can assist surgeons in determining the required bone resection in an Akin osteotomy of the proximal phalanx of the great toe. Radiographs of 141 feet were analyzed. The mean HIA and range were calculated. The prevalence of hallux valgus interphalangeus (HVI) deformity was 78% (110/141). The mean HIA was 13.5° ± 4.5° (1.4-24.4). Fifty percent had abnormal HIA values of 10-15°, 40% had values of 15-20°, and 10% had greater than 20°. A large proportion of patients with HVI deformities may need greater than the standard 2-3-mm bone wedge removal during Akin osteotomy. The high prevalence and wide range of HVI deformities should alert surgeons to the possibility that greater than 3-mm bone wedge resections may be required. Level of Evidence: Level IV.

Publication

Model Selection in Linear Mixed Models

Publisher: Institute of Mathematical Statistics

Date: 05-2013

DOI: 10.1214/12-STS410

Publication

A multi-step classifier addressing cohort heterogeneity improves performance of prognostic biomarkers in three cancer types

Publisher: Impact Journals, LLC

Date: 11-08-2016

DOI: 10.18632/ONCOTARGET.13203

Publication

PARTIALLY LINEAR MODEL SELECTION BY THE BOOTSTRAP

Publisher: Wiley

Date: 31-05-2009

DOI: 10.1111/J.1467-842X.2009.00540.X

Publication

037 The gut microbiome in Parkinson’s disease: longitudinal insights into disease progression and the use of device-assisted therapies

Publisher: BMJ Publishing Group Ltd

Date: 08-2021

DOI: 10.1136/BMJNO-2021-ANZAN.37

Publication

Iterative Estimation of the Extreme Value Index

Publisher: Springer Science and Business Media LLC

Date: 06-2005

DOI: 10.1007/S11009-005-1487-X

Publication

Revisiting fitting monotone polynomials to data

Publisher: Springer Science and Business Media LLC

Date: 23-12-2013

DOI: 10.1007/S00180-012-0390-5

Publication

IDENTIFICATION OF DRIVEN RISK FACTORS FOR HLA-DR IN KIDNEY TRANSPLANTATION

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 09-2020

DOI: 10.1097/01.TP.0000700748.71062.0C

Publication

bcGST - an interactive bias-correction method to identify over-represented gene-sets in boutique arrays

Publisher: Cold Spring Harbor Laboratory

Date: 28-12-2017

DOI: 10.1101/240234

Abstract: Motivation: Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopedia of Genes and Genomes are important tools in Gene Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene set selection bias induced during the construction of these platforms. Results: We propose bcGST, a bias-corrected Gene Set Test by introducing bias correction terms in the contingency table needed for calculating the Fisher’s Exact Test (FET). The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and TCGA cancer studies. Availability: The bcGST method is made available as a Shiny web application at shiny.maths.usyd.edu.au/bcGST/ Contact: kevin.wang@sydney.edu.au
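For orientation, the conventional (uncorrected) over-representation test that this abstract starts from can be sketched as a one-sided Fisher's Exact Test, which is the upper tail of a hypergeometric distribution. This is only an illustration of the standard test; it does not implement the bias-correction terms that bcGST adds to the contingency table. The function name `overrepresentation_p`, its argument names and the toy counts are hypothetical.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided Fisher's Exact Test p-value for over-representation.

    N: genes measured on the platform (the gene universe).
    K: genes in the universe that carry the annotation term.
    n: genes in the gene set of interest.
    k: genes in the gene set that carry the annotation term.
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")

    total = comb(N, n)  # number of ways to draw the gene set
    # Sum the probabilities of observing k, k+1, ... annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Toy counts (hypothetical): 10 genes on the platform, 4 annotated,
# a gene set of 3 genes of which 2 are annotated.
p_value = overrepresentation_p(k=2, n=3, K=4, N=10)
```

The choice of N is where the platform matters: on a boutique array N is the number of genes on the array rather than the whole genome, which is the source of the selection bias the abstract describes.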

Publication

Association between periodontal and peri‐implant conditions: a 10‐year prospective study

Publisher: Wiley

Date: 20-01-2004

DOI: 10.1111/J.1600-0501.2004.00982.X

Abstract: The aims of this study were to (1) compare prospectively the clinical and radiographic changes in periodontal and peri-implant conditions, (2) investigate the association of changes in periodontal parameters and peri-implant conditions over a mean observation period of 10 years (8-12 years) after implant installation, and (3) evaluate patient risk factors known to aggravate the periodontal conditions for their potential influence on the peri-implant tissue status. Eighty-nine partially edentulous patients with a mean age of 58.9 years (28-88 years) were examined at 1 and 10 years after implant placement. The patients contributed with 179 implants that were placed after comprehensive periodontal treatment and restored with crowns or fixed partial dentures. One hundred and seventy-nine matching control teeth were chosen as controls. Also, the remaining teeth (n=1770) in the dentitions were evaluated. Data on smoking habits and general health aspects were collected at 1 and 10 years as well. At 10 years, statistically significant differences existed between implants and matching control teeth with regard to most of the clinical and radiographic parameters (P<0.01) with the exception of plaque index (PII) and recession. Multiple regression analyses were performed to associate combinations of periodontal diagnostic parameters to the peri-implant conditions: probing attachment level (PAL) at implants at 10 years was associated with implant location, full-mouth probing pocket depth (PPD) and full-mouth PAL (P=0.0001, r2=0.36). PPD at implants at 10 years correlated to implant location, full-mouth PPD and full-mouth PAL (P<0.001, r2=0.47). Marginal bone level at implants at 10 years was significantly associated to smoking, general health condition, implant location, full-mouth PAL and change over time in full-mouth PPD (P<0.001, r2=0.39). These results present evidence for the association between periodontal and peri-implant conditions and the changes in these tissues over 10 years in partially edentulous patients.

Publication

Assessing Modularity Using a Random Matrix Theory Approach

Publisher: Walter de Gruyter GmbH

Date: 26-01-2011

DOI: 10.2202/1544-6115.1667

Abstract: Random matrix theory (RMT) is well suited to describing the emergent properties of systems with complex interactions amongst their constituents through their eigenvalue spectrums. Some RMT results are applied to the problem of clustering high dimensional biological data with complex dependence structure amongst the variables. It will be shown that a gene relevance or correlation network can be constructed by choosing a correlation threshold in a principled way, such that it corresponds to a block diagonal structure in the correlation matrix, if such a structure exists. The structure is then found using community detection algorithms, but with parameter choice guided by RMT predictions. The resulting clustering is compared to a variety of hierarchical clustering outputs and is found to be the most generalised result, in that it captures all the features found by the other considered methods.

Publication

Fast and flexible methods for monotone polynomial fitting

Publisher: Informa UK Limited

Date: 02-2016

DOI: 10.1080/00949655.2016.1139582

Publication

Testing random effects in linear mixed models: another look at the F‐test (with discussion)

Publisher: Wiley

Date: 03-2019

DOI: 10.1111/ANZS.12256

Publication

Robust subtractive stability measures for fast and exhaustive feature importance ranking and selection in generalised linear models

Publisher: Wiley

Date: 09-0008

DOI: 10.1111/ANZS.12375

Abstract: We introduce the relatively new concept of subtractive lack‐of‐fit measures in the context of robust regression, in particular in generalised linear models. We devise a fast and robust feature selection framework for regression that empirically enjoys better performance than other selection methods while remaining computationally feasible when fully exhaustive methods are not. Our method builds on the concepts of model stability, subtractive lack‐of‐fit measures and repeated model identification. We demonstrate how the multiple implementations add value in a robust regression type context, in particular through utilizing a combination of robust regression coefficient and scale estimates. Through resampling, we construct a robust stability matrix, which contains multiple measures of feature importance for each variable. By constructing this stability matrix and using it to rank features based on importance, we are able to reduce the candidate model space and then perform an exhaustive search on the remaining models. We also introduce two different visualisations to better convey information held within the stability matrix: a subtractive Mosaic Probability Plot and a subtractive Variable Inclusion Plot. We demonstrate how these graphics allow for a better understanding of how variable importance changes under small alterations to the underlying data. Our framework is made available in R through the RobStabR package.

Publication

The latency distribution of motor evoked potentials in patients with multiple sclerosis

Publisher: Elsevier BV

Date: 12-2012

DOI: 10.1016/J.CLINPH.2012.05.008

Abstract: The objective was to compare the individual latency distributions of motor evoked potentials (MEP) in patients with multiple sclerosis (MS) to the previously reported results in healthy subjects (Firmin et al., 2011). We applied the previously reported method to measure the distribution of MEP latencies to 16 patients with MS. The method is based on transcranial magnetic stimulation and consists of a combination of the triple stimulation technique with a method originally developed to measure conduction velocity distributions in peripheral nerves. MEP latency distributions in MS typically showed two peaks. The individual MEP latency distributions were significantly wider in patients with MS than in healthy subjects. The mean triple stimulation delay extension at the 75% quantile, a proxy for MEP latency distribution width, was 7.3 ms in healthy subjects and 10.7 ms in patients with MS. In patients with MS, slow portions of the central motor pathway contribute more to the MEP than in healthy subjects. The bimodal distribution found in healthy subjects is preserved in MS. Our method to measure the distribution of MEP latencies is suitable to detect alterations in the relative contribution of corticospinal tract portions with long MEP latencies to motor conduction.

Publication

BcGST-an interactive bias-correction method to identify over-represented gene-sets in boutique arrays

Publisher: Oxford University Press (OUP)

Date: 12-09-2019

DOI: 10.1093/BIOINFORMATICS/BTY783

Abstract: Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene-set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene-set selection bias induced during the construction of these platforms. We propose bcGST, a bias-corrected GST by introducing bias-correction terms in the contingency table needed for calculating the Fisher’s Exact Test. The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and the Cancer Genome Atlas cancer studies. The bcGST method is made available as a Shiny web application at shiny.maths.usyd.edu.au/bcGST/. Supplementary data are available at Bioinformatics online.

Publication

Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

Publisher: Informa UK Limited

Date: 11-2011

DOI: 10.1198/TAS.2011.11052

Publication

mplot: An R Package for Graphical Model Stability and Variable Selection Procedures

Publisher: Foundation for Open Access Statistics

Date: 2018

DOI: 10.18637/JSS.V083.I09

Publication

Screening methods for linear errors‐in‐variables models in high dimensions

Publisher: Wiley

Date: 25-03-2022

DOI: 10.1111/BIOM.13628

Abstract: Microarray studies, in order to identify genes associated with an outcome of interest, usually produce noisy measurements for a large number of gene expression features from a small number of subjects. One common approach to analyzing such high‐dimensional data is to use linear errors‐in‐variables (EIV) models; however, current methods for fitting such models are computationally expensive. In this paper, we present two efficient screening procedures, namely, corrected penalized marginal screening (PMSc) and corrected sure independence screening (SISc), to reduce the number of variables for final model building. Both screening procedures are based on fitting corrected marginal regression models relating the outcome to each contaminated covariate separately, which can be computed efficiently even with a large number of features. Under mild conditions, we show that these procedures achieve screening consistency and reduce the number of features substantially, even when the number of covariates grows exponentially with sample size. In addition, if the true covariates are weakly correlated, we show that PMSc can achieve full variable selection consistency. Through a simulation study and an analysis of gene expression data for bone mineral density of Norwegian women, we demonstrate that the two new screening procedures make estimation of linear EIV models computationally scalable in high‐dimensional settings, and improve finite sample estimation and selection performance compared with estimators that do not employ a screening stage.

Publication

Cox regression with exclusion frequency-based weights to identify neuroimaging markers relevant to Huntington’s disease onset

Publisher: Institute of Mathematical Statistics

Date: 12-2016

DOI: 10.1214/16-AOAS967

Publication

A QUANTITATIVE FABRIC ANALYSIS APPROACH TO THE DISCRIMINATION OF WHITE MARBLES*

Publisher: Wiley

Date: 08-1999

DOI: 10.1111/J.1475-4754.1999.TB00980.X

Publication

Predictive Value of Radiological Criteria for Disintegration Rates of Extracorporeal Shock Wave Lithotripsy

Publisher: S. Karger AG

Date: 2001

DOI: 10.1159/000056590

Abstract: Objective: To evaluate routinely applicable criteria to predict fragmentation of renal calculi by extracorporeal shock wave lithotripsy (ESWL). Patients and Methods: Two hundred and two consecutive patients (121 men, 81 women), median age 48 (range 19–81) years, were treated with the original Dornier HM-3 lithotriptor at a single stone center. Inclusion criteria were: solitary stones, 10–30 mm in greatest diameter, located in renal pelvis or calyces. Based on plain radiographs, the calculi were classified according to their size, form, location, density (compared to the 12th rib), structure and surface. Furthermore, age of the patient, gender and body mass index were also considered for evaluation. Disintegration was documented on day 1 after ESWL by plain X-ray. A multivariate regression analysis was applied to all preoperative parameters, based on the dual variable stone free versus residual fragments. Results: The overall disintegration rate was 95.5%; 42 patients (20.8%) were completely stone free, and 151 patients (74.7%) had clinically insignificant residual fragments (5 mm or smaller). 14.9% of men and 29.6% of women were stone free (p = 0.01). All other parameters did not reach statistical significance. Conclusions: The disintegration rate of the HM-3 is excellent for kidney stones; women did significantly better than men. However, because of this high disintegration rate, a much larger series would be necessary to define possible differences between preinterventional parameters, if there were any at all.

Publication

On Variational Bayes Estimation and Variational Information Criteria for Linear Regression Models

Publisher: Wiley

Date: 06-02-2014

DOI: 10.1111/ANZS.12063

Publication

Smooth tail-index estimation

Publisher: Informa UK Limited

Date: 09-2009

DOI: 10.1080/00949650802142667

Publication

Identification of important regressor groups, subgroups and individuals via regularization methods: application to gut microbiome data

Publisher: Oxford University Press (OUP)

Date: 24-10-2013

DOI: 10.1093/BIOINFORMATICS/BTT608

Abstract: Motivation: Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods only consider when the microbiota are classified at one taxonomy level, not multiple. Results: Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it more often selected significant variables, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we found which taxonomic levels were most altered by specific interventions or physiological status. Availability: The new approach is implemented in an R package, which is freely available from the corresponding author. Contact: tpgarcia@srph.tamhsc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

A method to measure the distribution of latencies of motor evoked potentials in man

Publisher: Elsevier BV

Date: 2011

DOI: 10.1016/J.CLINPH.2010.05.034

Abstract: To measure the intra-individual distribution of the latencies of motor evoked potentials (MepL) using transcranial magnetic stimulation. We used the triple stimulation technique (TST) to quantify the proportion of excited spinal motor neurons supplying the abductor digiti minimi muscle in response to a maximal magnetic brain stimulus (Magistris et al., 1998). By systematically manipulating the TST delay, we could quantify the contribution of slow-conducting motor tract portions to the TST amplitude. Our method allowed the establishment of a MepL distribution for each of the 29 examined healthy subjects. MepLs of 50% of the motor tract contributing to the motor evoked potential lay between the intra-individually minimal MepL (MepL(min)) and MepL(min)+4.9 ms (range 1.6-9.2). The individual MepL distributions showed two peaks in most subjects. The first peak appeared at a MepL that was 3.0 ms longer on average (range 0.7-6.0) than MepL(min); the second peak appeared at MepL(min)+8.1 ms on average (range 3.7-13.0). Slow-conducting parts of the motor pathway contribute notably to the motor evoked potential. Our data suggest a bimodal distribution of central conduction times, which might possibly relate to different fibre types within the pyramidal tract. We present a non-invasive method to assess slow-conducting parts of the human central motor tract.

Publication

Joint Selection in Mixed Models using Regularized PQL

Publisher: Informa UK Limited

Date: 13-06-2017

DOI: 10.1080/01621459.2016.1215989

Publication

Semiparametric Regression Using Variational Approximations

Publisher: Informa UK Limited

Date: 27-02-2019

DOI: 10.1080/01621459.2018.1518235

Publication

NEMoE: a nutrition aware regularized mixture of experts model to identify heterogeneous diet-microbiome-host health interactions

Publisher: Springer Science and Business Media LLC

Date: 15-03-2023

DOI: 10.1186/S40168-023-01475-4

Abstract: Unraveling the interplay between diet, the microbiome, and the health state could enable the design of personalized intervention strategies and improve the health and well-being of individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. We present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data-driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson’s disease cohort revealed that NEMoE is capable of not only improving predictive performance for Parkinson’s Disease but also for identifying diet-specific microbial signatures of disease. In summary, NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.

Publication

Exploring Multicollinearity Using a Random Matrix Theory Approach

Publisher: Walter de Gruyter GmbH

Date: 14-01-2012

DOI: 10.1515/1544-6115.1668

Abstract: Clustering of gene expression data is often done with the latent aim of dimension reduction, by finding groups of genes that have a common response to potentially unknown stimuli. However, what is poorly understood to date is the behaviour of a low dimensional signal embedded in high dimensions. This paper introduces a multicollinear model which is based on random matrix theory results, and shows potential for the characterisation of a gene cluster's correlation matrix. This model projects a one dimensional signal into many dimensions and is based on the spiked covariance model, but rather characterises the behaviour of the corresponding correlation matrix. The eigenspectrum of the correlation matrix is empirically examined by simulation, under the addition of noise to the original signal. The simulation results are then used to propose a dimension estimation procedure of clusters from data. Moreover, the simulation results warn against considering pairwise correlations in isolation, as the model provides a mechanism whereby a pair of genes with `low' correlation may simply be due to the interaction of high dimension and noise. Instead, collective information about all the variables is given by the eigenspectrum.

Publication

The Gut Microbiome in Parkinson’s Disease: A Longitudinal Study of the Impacts on Disease Progression and the Use of Device-Assisted Therapies

Publisher: Frontiers Media SA

Date: 17-05-2022

DOI: 10.3389/FNAGI.2022.875261

Abstract: Altered gut microbiome (GM) composition has been established in Parkinson’s disease (PD). However, few studies have longitudinally investigated the GM in PD, or the impact of device-assisted therapies. To investigate the temporal stability of GM profiles from PD patients on standard therapies and those initiating device-assisted therapies (DAT) and define multivariate models of disease and progression. We evaluated validated clinical questionnaires and stool samples from 74 PD patients and 74 household controls (HCs) at 0, 6, and 12 months. Faster or slower disease progression was defined from levodopa equivalence dose and motor severity measures. 19 PD patients initiating Deep Brain Stimulation or Levodopa-Carbidopa Intestinal Gel were separately evaluated at 0, 6, and 12 months post-therapy initiation. Persistent underrepresentation of short-chain fatty-acid-producing bacteria, Butyricicoccus, Fusicatenibacter, Lachnospiraceae ND3007 group, and Erysipelotrichaceae UCG-003, were apparent in PD patients relative to controls. A sustained effect of DAT initiation on GM associations with PD was not observed. PD progression analysis indicated that the genus Barnesiella was underrepresented in faster progressing PD patients at t = 0 and t = 12 months. Two-stage predictive modeling, integrating microbiota abundances and nutritional profiles, improved predictive capacity (change in Area Under the Curve from 0.58 to 0.64) when assessed at Amplicon Sequence Variant taxonomic resolution. We present longitudinal GM studies in PD patients, showing persistently altered GM profiles suggestive of a reduced butyrogenic production potential. DATs exerted variable GM influences across the short and longer-term. We found that specific GM profiles combined with dietary factors improved prediction of disease progression in PD patients.

Publication

Structured variable selection with q-values

Publisher: Oxford University Press (OUP)

Date: 10-04-2013

DOI: 10.1093/BIOSTATISTICS/KXT012

Publication

Cross-Platform Omics Prediction procedure: a statistical machine learning framework for wider implementation of precision medicine

Publisher: Springer Science and Business Media LLC

Date: 04-07-2022

DOI: 10.1038/S41746-022-00618-5

Abstract: In this modern era of precision medicine, molecular signatures identified from advanced omics technologies hold great promise to better guide clinical decisions. However, current approaches are often location-specific due to the inherent differences between platforms and across multiple centres, thus limiting the transferability of molecular signatures. We present Cross-Platform Omics Prediction (CPOP), a penalised regression model that can use omics data to predict patient outcomes in a platform-independent manner and across time and experiments. CPOP improves on the traditional prediction framework of using gene-based features by selecting ratio-based features with similar estimated effect sizes. These components gave CPOP the ability to have a stable performance across datasets of similar biology, minimising the effect of technical noise often generated by omics platforms. We present a comprehensive evaluation using melanoma transcriptomics data to demonstrate its potential to be used as a critical part of a clinical screening framework for precision medicine. Additional assessment of generalisation was demonstrated with ovarian cancer and inflammatory bowel disease studies.

Publication

Adenocarcinomas of the upper third of the rectum and the rectosigmoid junction seem to have similar prognosis as colon cancers even without radiotherapy, SAKK 40/87

Publisher: Springer Science and Business Media LLC

Date: 28-08-2014

DOI: 10.1007/S00423-014-1243-1

Abstract: To investigate the prognosis of adenocarcinomas of the upper third of the rectum and the rectosigmoid-junction without radiotherapy. Patients from a multicenter randomized controlled trial from 1987-1993 on adjuvant chemotherapy for R0-resected colorectal cancers with stage I-III disease were retrospectively allocated: cancers of the lower two-thirds of the rectum (11 cm or less from anal-verge, Group A, n = 205), of the upper-third of the rectum and rectosigmoid-junction (>11-20 cm from anal-verge, Group B, n = 142), and of the colon (>20 cm from anal-verge, Group C, n = 378). The total mesorectal excision (TME) technique had not been introduced yet. The adjuvant chemotherapy turned out to be ineffective. None of the patients received neoadjuvant or adjuvant radiotherapy. The patients had a regular follow-up (median, 8.0 years). The 5-year disease-free survival (DFS) rate was 0.54 (95%CI, 0.47-0.60) in Group A, 0.68 (95%CI, 0.60-0.75) in Group B, and 0.69 (95%CI, 0.64-0.74) in Group C. The 5-year overall survival (OS) rate was 0.64 (95%CI, 0.57-0.71) in Group A, 0.79 (95%CI, 0.71-0.85) in Group B, and 0.77 (95%CI, 0.73-0.81) in Group C. Compared with Group C, patients in Group A had a significantly worse OS (hazard ratio [HR] for death 2.10) and a worse DFS (HR for relapse/death 1.93), while patients in Group B had a similar OS (HR 1.12) and DFS (HR 1.07). Adenocarcinomas of the upper third of the rectum and the rectosigmoid-junction seem to have similar prognosis as colon cancers. Even for surgeons not familiar with the TME technique, preoperative radiotherapy may be avoided for most rectosigmoid cancers above 11 cm from anal-verge.

Publication

On generalized degrees of freedom with application in linear mixed models selection

Publisher: Springer Science and Business Media LLC

Date: 26-07-2014

DOI: 10.1007/S11222-014-9488-7

Publication

Inferring data-specific micro-RNA function through the joint ranking of micro-RNA and pathways from matched micro-RNA and gene expression data

Publisher: Oxford University Press (OUP)

Date: 24-04-2015

DOI: 10.1093/BIOINFORMATICS/BTV220

Abstract: Motivation: In practice, identifying and interpreting the functional impacts of the regulatory relationships between micro-RNA and messenger-RNA is non-trivial. The sheer scale of possible micro-RNA and messenger-RNA interactions can make the interpretation of results difficult. Results: We propose a supervised framework, pMim, built upon concepts of significance combination, for jointly ranking regulatory micro-RNA and their potential functional impacts with respect to a condition of interest. Here, pMim directly tests if a micro-RNA is differentially expressed and if its predicted targets, which lie in a common biological pathway, have changed in the opposite direction. We leverage the information within existing micro-RNA target and pathway databases to stabilize the estimation and annotation of micro-RNA regulation making our approach suitable for datasets with small sample sizes. In addition to outputting meaningful and interpretable results, we demonstrate in a variety of datasets that the micro-RNA identified by pMim, in comparison to simpler existing approaches, are also more concordant with what is described in the literature. Availability and implementation: This framework is implemented as an R function, pMim, in the package sydSeq available from -packages. Contact: jean.yang@sydney.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.

Publication

SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data

Publisher: Oxford University Press (OUP)

Date: 2022

DOI: 10.1093/GIGASCIENCE/GIAC071

Abstract: Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies.

Publication

Hierarchical Selection of Fixed and Random Effects in Generalized Linear Mixed Models

Publisher: Statistica Sinica (Institute of Statistical Science)

Date: 2017

DOI: 10.5705/SS.202015.0329

Publication

Outlier Robust Model Selection in Linear Regression

Publisher: Informa UK Limited

Date: 12-2005

DOI: 10.1198/016214505000000529

Publication

TWO-STAGE SUPPORT ESTIMATION

Publisher: Wiley

Date: 12-2005

DOI: 10.1111/J.1467-842X.2005.00409.X

Publication

Sparse Sliced Inverse Regression via Cholesky Matrix Penalization

Publisher: Statistica Sinica (Institute of Statistical Science)

Date: 2023

DOI: 10.5705/SS.202020.0406

Publication

On the max-domain of attraction of distributions with log-concave densities

Publisher: Elsevier BV

Date: 09-2008

DOI: 10.1016/J.SPL.2007.12.008

Publication

On Model Selection Curves

Publisher: Wiley

Date: 08-2010

DOI: 10.1111/J.1751-5823.2010.00108.X

Publication

Sparse Pairwise Likelihood Estimation for Multivariate Longitudinal Mixed Models

Publisher: Informa UK Limited

Date: 19-06-2018

DOI: 10.1080/01621459.2017.1371026

Publication

015 Gut microbiota and nutritional profiles of Parkinson’s disease patients

Publisher: BMJ Publishing Group Ltd

Date: 08-2021

DOI: 10.1136/BMJNO-2021-ANZAN.15

Publication

Tail estimation based on numbers of near m-extremes

Publisher: Springer Science and Business Media LLC

Date: 2003

DOI: 10.1023/A:1024509818767

Publication

A robust scale estimator based on pairwise means

Publisher: Informa UK Limited

Date: 03-2012

DOI: 10.1080/10485252.2011.621424

Publication

Controlling the local false discovery rate in the adaptive Lasso

Publisher: Oxford University Press (OUP)

Date: 09-04-2013

DOI: 10.1093/BIOSTATISTICS/KXT008

Publication

LC-N2G: a local consistency approach for nutrigenomics data analysis

Publisher: Springer Science and Business Media LLC

Date: 17-11-2020

DOI: 10.1186/S12859-020-03861-3

Abstract: Nutrigenomics aims at understanding the interaction between nutrition and gene information. Due to the complex interactions of nutrients and genes, their relationship exhibits non-linearity. One of the most effective and efficient methods to explore their relationship is the nutritional geometry framework which fits a response surface for the gene expression over two prespecified nutrition variables. However, when the number of nutrients involved is large, it is challenging to find combinations of informative nutrients with respect to a certain gene and to test whether the relationship is stronger than chance. Methods for identifying informative combinations are essential to understanding the relationship between nutrients and genes. We introduce Local Consistency Nutrition to Graphics (LC-N2G), a novel approach for ranking and identifying combinations of nutrients with gene expression. In LC-N2G, we first propose a model-free quantity called Local Consistency statistic to measure whether there is a non-random relationship between combinations of nutrients and gene expression measurements based on (1) the similarity between samples in the nutrient space and (2) their difference in gene expression. Then combinations with small LC are selected and a permutation test is performed to evaluate their significance. Finally, the response surfaces are generated for the subset of significant relationships. Evaluation on simulated data and real data shows the LC-N2G can accurately find combinations that are correlated with gene expression. The LC-N2G is practically powerful for identifying the informative nutrition variables correlated with gene expression. Therefore, LC-N2G is important in the area of nutrigenomics for understanding the relationship between nutrition and gene expression information.

Publication

Graphical tools for model selection in generalized linear models

Publisher: Wiley

Date: 29-05-2013

DOI: 10.1002/SIM.5855

Abstract: Model selection techniques have existed for many years; however, to date, simple, clear and effective methods of visualising the model building process are sparse. This article describes graphical methods that assist in the selection of models and comparison of many different selection criteria. Specifically, we describe for logistic regression, how to visualize measures of description loss and of model complexity to facilitate the model selection dilemma. We advocate the use of the bootstrap to assess the stability of selected models and to enhance our graphical tools. We demonstrate which variables are important using variable inclusion plots and show that these can be invaluable plots for the model building process. We show with two case studies how these proposed tools are useful to learn more about important variables in the data and how these tools can assist the understanding of the model building process.

Publication

Surface Area Estimation Using 3D Point Clouds and Delaunay Triangulation

Publisher: Springer Nature Switzerland

Date: 2023

DOI: 10.1007/978-3-031-35308-6_3

Publication

Phase II Study of Capecitabine and Oxaliplatin in First- and Second-Line Treatment of Advanced or Metastatic Colorectal Cancer

Publisher: American Society of Clinical Oncology (ASCO)

Date: 04-2002

DOI: 10.1200/JCO.2002.07.087

Abstract: PURPOSE: To determine the efficacy and tolerability of combining oxaliplatin with capecitabine in the treatment of advanced nonpretreated and pretreated colorectal cancer. PATIENTS AND METHODS: Forty-three nonpretreated patients and 26 patients who had experienced one fluoropyrimidine-containing regimen for advanced colorectal cancer were treated with oxaliplatin 130 mg/m² on day 1 and capecitabine 1,250 mg/m² bid on days 1 to 14 every 3 weeks. Patients with good performance status (World Health Organization grade 0 to 1) were accrued onto two nonrandomized parallel arms of a phase II study. RESULTS: The objective response rate was 49% (95% confidence interval [CI], 33% to 65%) for nonpretreated and 15% (95% CI, 4% to 35%) for pretreated patients. The main toxicity of this combination was diarrhea, which occurred at grade 3 or 4 in 35% of the nonpretreated and 50% of the pretreated patients. Grade 3 or 4 sensory neuropathy, including laryngopharyngeal dysesthesia, occurred in 16% of patients on both cohorts. Capecitabine dose reductions were necessary in 26% of the nonpretreated and 45% of the pretreated patients in the second treatment cycle. The median overall survival was 17.1 months and 11.5 months, respectively. CONCLUSION: Combining capecitabine and oxaliplatin yields promising activity in advanced colorectal cancer. The main toxicity is diarrhea, which is manageable with appropriate dose reductions. On the basis of our toxicity experience, we recommend use of capecitabine in combination with oxaliplatin 130 mg/m² at an initial dose of 1,250 mg/m² bid in nonpretreated patients and at a dose of 1,000 mg/m² bid in pretreated patients.

Publication

Fast and approximate exhaustive variable selection for generalised linear models with APES

Publisher: Wiley

Date: 12-2019

DOI: 10.1111/ANZS.12276

Samuel Muller

Researcher

Publications

SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data

A variational Bayes approach to variable selection

A Note on the Effect on Power of Score Tests via Dimension Reduction by Penalized Regression under the Null

Robust estimation of precision matrices under cellwise contamination

svReg: Structural varying‐coefficient regression to differentiate how regional brain atrophy affects motor impairment for Huntington disease severity groups

A Multi-Step Precision Pathway for Predicting Allograft Survival in Heterogeneous Cohorts of Kidney Transplant Recipients

A prediction model for viability at the end of the first trimester after a single early pregnancy evaluation

MCVIS: A New Framework for Collinearity Discovery, Diagnostic, and Visualization

Weighted least squares estimation of the extreme value index

Predation risk and competitive interactions affect foraging of an endangered refuge-dependent herbivore

GEE-Assisted Variable Selection for Latent Variable Models with Multivariate Binary Data

2nd special issue on robust analysis of complex data

Prediction modeling—part 2: using machine learning strategies to improve transplantation outcomes

Partially smooth tail-index estimation for small samples

A PERSONALISED PREDICTION MODEL FOR ALLOGRAFT SURVIVAL AFTER KIDNEY TRANSPLANTATION

Melanoma Explorer: a web application to allow easy reanalysis of publicly available and clinically annotated melanoma omics data sets

Nutritional Intake and Gut Microbiome Composition Predict Parkinson’s Disease

Determination of prognosis in metastatic melanoma through integration of clinico-pathologic, mutation, mRNA, microRNA, and protein information

Random Effects Misspecification Can Have Severe Consequences for Random Effects Inference in Linear Mixed Models

Cross-Platform Omics Prediction procedure: a game changer for implementing precision medicine in patients with stage-III melanoma

The LASSO on latent indices for regression modeling with ordinal categorical predictors

NEMoE: A nutrition aware regularized mixture of experts model addressing diet-cohort heterogeneity of gut microbiota in Parkinson’s disease

Estimation of graphical models for skew continuous data

Identification, Review, and Systematic Cross-Validation of microRNA Prognostic Signatures in Metastatic Melanoma

Estimating the number of motor units using random sums with independently thinned terms

The difference of symmetric quantiles under long range dependence

A radiographic analysis of the abnormal hallux interphalangeus angle range: Considerations for surgeons performing Akin osteotomies

Model Selection in Linear Mixed Models

A multi-step classifier addressing cohort heterogeneity improves performance of prognostic biomarkers in three cancer types

PARTIALLY LINEAR MODEL SELECTION BY THE BOOTSTRAP

037 The gut microbiome in Parkinson’s disease: longitudinal insights into disease progression and the use of device-assisted therapies

Iterative Estimation of the Extreme Value Index

Revisiting fitting monotone polynomials to data

IDENTIFICATION OF DRIVEN RISK FACTORS FOR HLA-DR IN KIDNEY TRANSPLANTATION

bcGST - an interactive bias-correction method to identify over-represented gene-sets in boutique arrays

Association between periodontal and peri‐implant conditions: a 10‐year prospective study

Assessing Modularity Using a Random Matrix Theory Approach

Fast and flexible methods for monotone polynomial fitting

Testing random effects in linear mixed models: another look at the F‐test (with discussion)

Robust subtractive stability measures for fast and exhaustive feature importance ranking and selection in generalised linear models

The latency distribution of motor evoked potentials in patients with multiple sclerosis

BcGST-an interactive bias-correction method to identify over-represented gene-sets in boutique arrays

Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

mplot: An R Package for Graphical Model Stability and Variable Selection Procedures

Screening methods for linear errors‐in‐variables models in high dimensions

Cox regression with exclusion frequency-based weights to identify neuroimaging markers relevant to Huntington’s disease onset

A QUANTITATIVE FABRIC ANALYSIS APPROACH TO THE DISCRIMINATION OF WHITE MARBLES*

Predictive Value of Radiological Criteria for Disintegration Rates of Extracorporeal Shock Wave Lithotripsy

On Variational Bayes Estimation and Variational Information Criteria for Linear Regression Models

Smooth tail-index estimation

Identification of important regressor groups, subgroups and individuals via regularization methods: application to gut microbiome data

A method to measure the distribution of latencies of motor evoked potentials in man

Joint Selection in Mixed Models using Regularized PQL

Semiparametric Regression Using Variational Approximations

NEMoE: a nutrition aware regularized mixture of experts model to identify heterogeneous diet-microbiome-host health interactions

Exploring Multicollinearity Using a Random Matrix Theory Approach

The Gut Microbiome in Parkinson’s Disease: A Longitudinal Study of the Impacts on Disease Progression and the Use of Device-Assisted Therapies

Structured variable selection with q-values

Cross-Platform Omics Prediction procedure: a statistical machine learning framework for wider implementation of precision medicine

Adenocarcinomas of the upper third of the rectum and the rectosigmoid junction seem to have similar prognosis as colon cancers even without radiotherapy, SAKK 40/87

On generalized degrees of freedom with application in linear mixed models selection

Inferring data-specific micro-RNA function through the joint ranking of micro-RNA and pathways from matched micro-RNA and gene expression data

SurvBenchmark: comprehensive benchmarking study of survival analysis methods using both omics data and clinical data

Hierarchical Selection of Fixed and Random Effects in Generalized Linear Mixed Models

Outlier Robust Model Selection in Linear Regression

TWO-STAGE SUPPORT ESTIMATION

Sparse Sliced Inverse Regression via Cholesky Matrix Penalization

On the max-domain of attraction of distributions with log-concave densities

On Model Selection Curves

Sparse Pairwise Likelihood Estimation for Multivariate Longitudinal Mixed Models

015 Gut microbiota and nutritional profiles of Parkinson’s disease patients

Tail estimation based on numbers of near m-extremes