ORCID Profile
0000-0002-3087-8127
Current Organisation
Macquarie University
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Applied Statistics | Statistics | Statistical Theory | Bioinformatics | Gene Expression (incl. Microarray and other genome-wide approaches) | Pattern Recognition and Data Mining
Expanding Knowledge in the Mathematical Sciences | Specific Population Health (excl. Indigenous Health) not elsewhere classified | Cancer and Related Disorders | Ecosystem Assessment and Management of Forest and Woodlands Environments | Expanding Knowledge in the Chemical Sciences | Expanding Knowledge in the Agricultural and Veterinary Sciences | Indigenous Health not elsewhere classified | Health not elsewhere classified | Expanding Knowledge in the Medical and Health Sciences
Publisher: Cold Spring Harbor Laboratory
Date: 12-07-2021
DOI: 10.1101/2021.07.11.451967
Abstract: Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarise the survival time and perform a classification analysis. Here, we develop a benchmarking framework, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics datasets. SurvBenchmark not only focuses on classical approaches such as the Cox model, but it also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics, including model predictability, stability, flexibility and computational issues. Our systematic comparison framework with over 320 comparisons (20 methods over 16 datasets) shows that the performances of survival models vary in practice over real-world datasets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies.
Publisher: Institute of Mathematical Statistics
Date: 2017
DOI: 10.1214/17-EJS1332
Publisher: Walter de Gruyter GmbH
Date: 29-01-2010
Publisher: Elsevier BV
Date: 2016
Publisher: Wiley
Date: 19-04-2021
Abstract: For Huntington disease, identification of brain regions related to motor impairment can be useful for developing interventions to alleviate the motor symptom, the major symptom of the disease. However, the effects from the brain regions to motor impairment may vary for different groups of patients. Hence, our interest is not only to identify the brain regions but also to understand how their effects on motor impairment differ by patient groups. This can be cast as a model selection problem for a varying‐coefficient regression. However, this is challenging when there is a pre‐specified group structure among variables. We propose a novel variable selection method for a varying‐coefficient regression with such structured variables and provide a publicly available R package svreg for implementation of our method. Our method is empirically shown to select relevant variables consistently. Also, our method screens irrelevant variables better than existing methods. Hence, our method leads to a model with higher sensitivity, lower false discovery rate and higher prediction accuracy than the existing methods. Finally, we found that the effects from the brain regions to motor impairment differ by disease severity of the patients. To the best of our knowledge, our study is the first to identify such interaction effects between the disease severity and brain regions, which indicates the need for customized intervention by disease severity.
Publisher: Frontiers Media SA
Date: 12-09-2023
Publisher: Wiley
Date: 02-2013
DOI: 10.1111/AJO.12046
Abstract: The aim was to develop a new model to predict the outcome at the end of the 1st trimester after a single visit to the early pregnancy unit (EPU). Prospective observational study in the EPU at Nepean Hospital, between November 2006 and February 2009. Data were collected from all women in the 1st trimester of their pregnancy who had a live intrauterine pregnancy (IUP) at the 1st transvaginal ultrasound scan (TVS). 29 historical, clinical and ultrasound end points were recorded. Women were followed until the final diagnosis was established at the end of the 1st trimester: viability or nonviability. A multinomial logistic regression model was developed. The performance of this model was evaluated using receiver operating characteristic (ROC) curves. Data from 416 pregnancies were included: 92.1% were live beyond the 1st trimester, and 7.9% had miscarried. The most useful prognostic variables for developing the logistic regression model were gestational age by dates, vaginal (PV) bleeding, PV clots, gestational age by TVS, consistency with menstrual dates, mean gestational sac (GS) size, mean yolk sac (YS) size and number of previous caesarean sections. Used retrospectively on 416 women based on 25 imputations, the model gave an AUC of 0.88. Based on cross-validation, the independent predictive power obtained an AUC of 0.78. We have developed a new model to predict the outcome of the 1st trimester in women with live IUP at the 1st scan.
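The abstract above summarises model performance as an area under the ROC curve (AUC). As a minimal sketch of what that number measures, and not the authors' code, the AUC can be computed as the proportion of (positive, negative) pairs in which the positive case receives the higher predicted score, with ties counted as one half. The function name and the scores below are hypothetical and chosen only for illustration.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities (not data from the study).
example_auc = auc([0.8, 0.4], [0.6, 0.2])
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups.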
Publisher: Informa UK Limited
Date: 30-07-2020
Publisher: Elsevier BV
Date: 05-2006
Publisher: Wiley
Date: 25-02-2011
Publisher: Informa UK Limited
Date: 17-11-2021
Publisher: Elsevier BV
Date: 09-2017
Publisher: Elsevier BV
Date: 04-2021
Publisher: Springer Science and Business Media LLC
Date: 30-12-2010
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 09-2020
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 06-2019
Publisher: Frontiers Media SA
Date: 11-05-2022
DOI: 10.3389/FNAGI.2022.881872
Abstract: Models to predict Parkinson’s disease (PD) incorporating alterations of gut microbiome (GM) composition have been reported with varying success. To assess the utility of GM compositional changes combined with macronutrient intake to develop a predictive model of PD. We performed a cross-sectional analysis of the GM and nutritional intake in 103 PD patients and 81 household controls (HCs). GM composition was determined by 16S amplicon sequencing of the V3-V4 region of bacterial ribosomal DNA isolated from stool. To determine multivariate disease-discriminant associations, we developed two models using Random Forest and support-vector machine (SVM) methodologies. Using updated taxonomic reference, we identified significant compositional differences in the GM profiles of PD patients in association with a variety of clinical PD characteristics. Six genera were overrepresented and eight underrepresented in PD patients relative to HCs, with the largest difference being overrepresentation of Lactobacillaceae at family taxonomic level. Correlation analyses highlighted multiple associations between clinical characteristics and select taxa, whilst constipation severity, physical activity and pharmacological therapies associated with changes in beta diversity. The random forest model of PD, incorporating taxonomic data at the genus level and carbohydrate contribution to total energy demonstrated the best predictive capacity [Area under the ROC Curve (AUC) of 0.74]. The notable differences in GM diversity and composition when combined with clinical measures and nutritional data enabled the development of a predictive model to identify PD. These findings support the combination of GM and nutritional data as a potentially useful biomarker of PD to improve diagnosis and guide clinical management.
Publisher: Wiley
Date: 24-07-2015
DOI: 10.1002/IJC.29047
Abstract: In patients with metastatic melanoma, the identification and validation of accurate prognostic biomarkers will assist rational treatment planning. Studies based on "-omics" technologies have focussed on a single high-throughput data type such as gene or microRNA transcripts. Occasionally, these features have been evaluated in conjunction with limited clinico-pathologic data. With the increased availability of multiple data types, there is a pressing need to tease apart which of these sources contain the most valuable prognostic information. We evaluated and integrated several data types derived from the same tumor specimens in AJCC stage III melanoma patients (gene, protein, and microRNA expression as well as clinical, pathologic and mutation information) to determine their relative impact on prognosis. We used classification frameworks based on pre-validation and bootstrap multiple imputation to compare the prognostic power of each data source, both individually as well as integratively. We found that the prognostic utility of clinico-pathologic information was not out-performed by any of the various "-omics" platforms. Rather, a combination of clinico-pathologic variables and mRNA expression data performed best. Furthermore, a patient-based classification analysis revealed that the prognostic accuracy of various data types was not the same for different patients. This indicates that ongoing development in the individualized evaluation of melanoma patients must take account of the value of both traditional and novel "-omics" measurements.
Publisher: Wiley
Date: 15-04-2020
DOI: 10.1111/INSR.12378
Abstract: There has been considerable and controversial research over the past two decades into how successfully random effects misspecification in mixed models (i.e. assuming normality for the random effects when the true distribution is non‐normal) can be diagnosed and what its impacts are on estimation and inference. However, much of this research has focused on fixed effects inference in generalised linear mixed models. In this article, motivated by the increasing number of applications of mixed models where interest is on the variance components, we study the effects of random effects misspecification on random effects inference in linear mixed models, for which there is considerably less literature. Our findings are surprising and contrary to general belief: for point estimation, maximum likelihood estimation of the variance components under misspecification is consistent, although in finite samples, both the bias and mean squared error can be substantial. For inference, we show through theory and simulation that under misspecification, standard likelihood ratio tests of truly non‐zero variance components can suffer from severely inflated type I errors, and confidence intervals for the variance components can exhibit considerable undercoverage. Furthermore, neither of these problems vanishes asymptotically with increasing the number of clusters or cluster size. These results have major implications for random effects inference, especially if the true random effects distribution is heavier tailed than the normal. Fortunately, simple graphical and goodness‐of‐fit measures of the random effects predictions appear to have reasonable power at detecting misspecification. We apply linear mixed models to a survey of more than 4,000 high school students within 100 schools and analyse how mathematics achievement scores vary with student attributes and across different schools. The application demonstrates the sensitivity of mixed model inference to the true but unknown random effects distribution.
Publisher: Cold Spring Harbor Laboratory
Date: 09-12-2020
DOI: 10.1101/2020.12.09.415927
Abstract: There is no consensus methodology that can account for the variation in omics signatures when they are acquired across different platforms and times. This poses a significant barrier to the implementation of valuable biomarkers into clinical practice. We present a novel procedure (Cross-Platform Omics Prediction) that accounts for these variations and demonstrate its utility in three risk models for different diseases that is suitable for prospective and multi-centre clinical implementation.
Publisher: Elsevier BV
Date: 09-2020
Publisher: Cold Spring Harbor Laboratory
Date: 11-11-2021
DOI: 10.1101/2021.11.10.21266194
Abstract: The microbiome plays a fundamental role in human health and diet is one of the strongest modulators of the gut microbiome. However, interactions between microbiota and host health are complex and diverse. Understanding the interplay between diet, the microbiome and health state could enable the design of personalized intervention strategies and improve the health and wellbeing of affected individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. To this end, we present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson’s disease cohort revealed that NEMoE is capable of not only improving predictive performance for Parkinson’s Disease but also for identifying diet-specific microbiome markers of disease. Our results indicate that NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.
Publisher: Wiley
Date: 06-02-2022
DOI: 10.1111/SJOS.12569
Abstract: We consider a new approach for estimating non‐Gaussian undirected graphical models. Specifically, we model continuous data from a class of multivariate skewed distributions, whose conditional dependence structure depends on both a precision matrix and a shape vector. To estimate the graph, we propose a novel estimation method based on nodewise regression: we first fit a linear model, and then fit a one component projection pursuit regression model to the residuals obtained from the linear model, and finally threshold appropriate quantities. Theoretically, we establish error bounds for each nodewise regression and prove the consistency of the estimated graph when the number of variables diverges with the sample size. Simulation results demonstrate the strong finite sample performance of our new method over existing methods for estimating Gaussian and non‐Gaussian graphical models. Finally, we demonstrate an application of the proposed method on observations of physicochemical properties of wine.
Publisher: Elsevier BV
Date: 2016
DOI: 10.1038/JID.2015.355
Publisher: Elsevier BV
Date: 07-2006
DOI: 10.1016/J.MBS.2006.04.006
Abstract: The problem of estimating the numbers of motor units N in a muscle is embedded in a general stochastic model using the notion of thinning from point process theory. In the paper, a new moment-type estimator for the numbers of motor units in a muscle is defined, which is derived using random sums with independently thinned terms. Asymptotic normality of the estimator is shown and its practical value is demonstrated with bootstrap and approximative confidence intervals for a data set from a 31-year-old healthy right-handed, female volunteer. Moreover, simulation results are presented and Monte-Carlo based quantiles, means, and variances are calculated for N ∈ {300, 600, 1000}.
Publisher: Elsevier BV
Date: 03-2015
Publisher: SAGE Publications
Date: 09-04-2019
Abstract: This study is aimed to determine the abnormal radiological hallux interphalangeus angle (HIA) range, which can assist surgeons in determining the required bone resection in an Akin osteotomy of the proximal phalanx of the great toe. Radiographs of 141 feet were analyzed. The mean HIA and range were calculated. The prevalence of hallux valgus interphalangeus (HVI) deformity was 78% (110/141). The mean HIA was 13.5° ± 4.5° (1.4-24.4). Fifty percent had abnormal HIA values of 10-15°, 40% had values of 15-20°, and 10% had greater than 20°. A large proportion of patients with HVI deformities may need greater than the standard 2-3-mm bone wedge removal during Akin osteotomy. The high prevalence and wide range of HVI deformities should alert surgeons to the possibility that greater than 3-mm bone wedge resections may be required. Level of Evidence: Level IV.
Publisher: Institute of Mathematical Statistics
Date: 05-2013
DOI: 10.1214/12-STS410
Publisher: Impact Journals, LLC
Date: 11-08-2016
Publisher: Wiley
Date: 31-05-2009
Publisher: BMJ Publishing Group Ltd
Date: 08-2021
Publisher: Springer Science and Business Media LLC
Date: 06-2005
Publisher: Springer Science and Business Media LLC
Date: 23-12-2013
Publisher: Ovid Technologies (Wolters Kluwer Health)
Date: 09-2020
Publisher: Cold Spring Harbor Laboratory
Date: 28-12-2017
DOI: 10.1101/240234
Abstract: Motivation: Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopedia of Genes and Genomes are important tools in Gene Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene set selection bias induced during the construction of these platforms. Results: We propose bcGST, a bias-corrected Gene Set Test by introducing bias correction terms in the contingency table needed for calculating the Fisher’s Exact Test (FET). The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and TCGA cancer studies. Availability: The bcGST method is made available as a Shiny web application at shiny.maths.usyd.edu.au/bcGST/ Contact: kevin.wang@sydney.edu.au
Publisher: Wiley
Date: 20-01-2004
DOI: 10.1111/J.1600-0501.2004.00982.X
Abstract: The aims of this study were to (1) compare prospectively the clinical and radiographic changes in periodontal and peri-implant conditions, (2) investigate the association of changes in periodontal parameters and peri-implant conditions over a mean observation period of 10 years (8-12 years) after implant installation, and (3) evaluate patient risk factors known to aggravate the periodontal conditions for their potential influence on the peri-implant tissue status. Eighty-nine partially edentulous patients with a mean age of 58.9 years (28-88 years) were examined at 1 and 10 years after implant placement. The patients contributed with 179 implants that were placed after comprehensive periodontal treatment and restored with crowns or fixed partial dentures. One hundred and seventy-nine matching control teeth were chosen as controls. Also, the remaining teeth (n=1770) in the dentitions were evaluated. Data on smoking habits and general health aspects were collected at 1 and 10 years as well. At 10 years, statistically significant differences existed between implants and matching control teeth with regard to most of the clinical and radiographic parameters (P<0.01) with the exception of plaque index (PII) and recession. Multiple regression analyses were performed to associate combinations of periodontal diagnostic parameters to the peri-implant conditions: probing attachment level (PAL) at implants at 10 years was associated with implant location, full-mouth probing pocket depth (PPD) and full-mouth PAL (P=0.0001, r2=0.36). PPD at implants at 10 years correlated to implant location, full-mouth PPD and full-mouth PAL (P<0.001, r2=0.47). Marginal bone level at implants at 10 years was significantly associated to smoking, general health condition, implant location, full-mouth PAL and change over time in full-mouth PPD (P<0.001, r2=0.39). 
These results present evidence for the association between periodontal and peri-implant conditions and the changes in these tissues over 10 years in partially edentulous patients.
Publisher: Walter de Gruyter GmbH
Date: 26-01-2011
Abstract: Random matrix theory (RMT) is well suited to describing the emergent properties of systems with complex interactions amongst their constituents through their eigenvalue spectrums. Some RMT results are applied to the problem of clustering high dimensional biological data with complex dependence structure amongst the variables. It will be shown that a gene relevance or correlation network can be constructed by choosing a correlation threshold in a principled way, such that it corresponds to a block diagonal structure in the correlation matrix, if such a structure exists. The structure is then found using community detection algorithms, but with parameter choice guided by RMT predictions. The resulting clustering is compared to a variety of hierarchical clustering outputs and is found to be the most generalised result, in that it captures all the features found by the other considered methods.
Publisher: Informa UK Limited
Date: 02-2016
Publisher: Wiley
Date: 03-2019
DOI: 10.1111/ANZS.12256
Publisher: Wiley
Date: 09-0008
DOI: 10.1111/ANZS.12375
Abstract: We introduce the relatively new concept of subtractive lack‐of‐fit measures in the context of robust regression, in particular in generalised linear models. We devise a fast and robust feature selection framework for regression that empirically enjoys better performance than other selection methods while remaining computationally feasible when fully exhaustive methods are not. Our method builds on the concepts of model stability, subtractive lack‐of‐fit measures and repeated model identification. We demonstrate how the multiple implementations add value in a robust regression type context, in particular through utilizing a combination of robust regression coefficient and scale estimates. Through resampling, we construct a robust stability matrix, which contains multiple measures of feature importance for each variable. By constructing this stability matrix and using it to rank features based on importance, we are able to reduce the candidate model space and then perform an exhaustive search on the remaining models. We also introduce two different visualisations to better convey information held within the stability matrix: a subtractive Mosaic Probability Plot and a subtractive Variable Inclusion Plot. We demonstrate how these graphics allow for a better understanding of how variable importance changes under small alterations to the underlying data. Our framework is made available in R through the RobStabR package.
Publisher: Elsevier BV
Date: 12-2012
DOI: 10.1016/J.CLINPH.2012.05.008
Abstract: To compare the individual latency distributions of motor evoked potentials (MEP) in patients with multiple sclerosis (MS) to the previously reported results in healthy subjects (Firmin et al., 2011). We applied the previously reported method to measure the distribution of MEP latencies to 16 patients with MS. The method is based on transcranial magnetic stimulation and consists of a combination of the triple stimulation technique with a method originally developed to measure conduction velocity distributions in peripheral nerves. MEP latency distributions in MS typically showed two peaks. The individual MEP latency distributions were significantly wider in patients with MS than in healthy subjects. The mean triple stimulation delay extension at the 75% quantile, a proxy for MEP latency distribution width, was 7.3 ms in healthy subjects and 10.7 ms in patients with MS. In patients with MS, slow portions of the central motor pathway contribute more to the MEP than in healthy subjects. The bimodal distribution found in healthy subjects is preserved in MS. Our method to measure the distribution of MEP latencies is suitable to detect alterations in the relative contribution of corticospinal tract portions with long MEP latencies to motor conduction.
Publisher: Oxford University Press (OUP)
Date: 12-09-2019
DOI: 10.1093/BIOINFORMATICS/BTY783
Abstract: Gene annotation and pathway databases such as Gene Ontology and Kyoto Encyclopaedia of Genes and Genomes are important tools in Gene-Set Test (GST) that describe gene biological functions and associated pathways. GST aims to establish an association relationship between a gene-set of interest and an annotation. Importantly, GST tests for over-representation of genes in an annotation term. One implicit assumption of GST is that the gene expression platform captures the complete or a very large proportion of the genome. However, this assumption is neither satisfied for the increasingly popular boutique array nor the custom designed gene expression profiling platform. Specifically, conventional GST is no longer appropriate due to the gene-set selection bias induced during the construction of these platforms. We propose bcGST, a bias-corrected GST by introducing bias-correction terms in the contingency table needed for calculating the Fisher’s Exact Test. The adjustment method works by estimating the proportion of genes captured on the array with respect to the genome in order to assist filtration of annotation terms that would otherwise be falsely included or excluded. We illustrate the practicality of bcGST and its stability through multiple differential gene expression analyses in melanoma and the Cancer Genome Atlas cancer studies. The bcGST method is made available as a Shiny web application at shiny.maths.usyd.edu.au/bcGST/. Supplementary data are available at Bioinformatics online.
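The abstract above builds on the conventional over-representation test, in which a 2×2 contingency table is assessed with Fisher's Exact Test. The following is a minimal sketch of that conventional, uncorrected test only; it does not implement the bcGST bias-correction terms, which the abstract does not specify. The function name, argument names and gene counts are hypothetical and used purely for illustration.

```python
from math import comb


def overrepresentation_pvalue(overlap, set_size, term_size, universe_size):
    """One-sided Fisher's Exact Test p-value for over-representation.

    universe_size: number of genes measured on the platform (assumed universe)
    term_size:     genes in the universe annotated to the term
    set_size:      genes in the gene set of interest
    overlap:       genes in the set of interest that carry the annotation

    Returns P(X >= overlap) for X ~ Hypergeometric(universe, term, set).
    """
    upper = min(set_size, term_size)
    total = comb(universe_size, set_size)
    tail = sum(
        comb(term_size, i) * comb(universe_size - term_size, set_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts (not taken from the paper): 10 of 100 selected genes
# carry an annotation held by 200 of 20000 genes on the platform.
p_value = overrepresentation_pvalue(10, 100, 200, 20000)
```

In this sketch the choice of `universe_size` is exactly where the platform coverage assumption enters: on a boutique or custom array the measured universe is a selected subset of the genome, which is the bias the abstract describes.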
Publisher: Informa UK Limited
Date: 11-2011
Publisher: Foundation for Open Access Statistics
Date: 2018
Publisher: Wiley
Date: 25-03-2022
DOI: 10.1111/BIOM.13628
Abstract: Microarray studies, in order to identify genes associated with an outcome of interest, usually produce noisy measurements for a large number of gene expression features from a small number of subjects. One common approach to analyzing such high‐dimensional data is to use linear errors‐in‐variables (EIV) models; however, current methods for fitting such models are computationally expensive. In this paper, we present two efficient screening procedures, namely, corrected penalized marginal screening (PMSc) and corrected sure independence screening (SISc), to reduce the number of variables for final model building. Both screening procedures are based on fitting corrected marginal regression models relating the outcome to each contaminated covariate separately, which can be computed efficiently even with a large number of features. Under mild conditions, we show that these procedures achieve screening consistency and reduce the number of features substantially, even when the number of covariates grows exponentially with sample size. In addition, if the true covariates are weakly correlated, we show that PMSc can achieve full variable selection consistency. Through a simulation study and an analysis of gene expression data for bone mineral density of Norwegian women, we demonstrate that the two new screening procedures make estimation of linear EIV models computationally scalable in high‐dimensional settings, and improve finite sample estimation and selection performance compared with estimators that do not employ a screening stage.
Publisher: Institute of Mathematical Statistics
Date: 12-2016
DOI: 10.1214/16-AOAS967
Publisher: Wiley
Date: 08-1999
Publisher: S. Karger AG
Date: 2001
DOI: 10.1159/000056590
Abstract: Objective: To evaluate routinely applicable criteria to predict fragmentation of renal calculi by extracorporeal shock wave lithotripsy (ESWL). Patients and Methods: Two hundred and two consecutive patients (121 men, 81 women), median age 48 (range 19–81) years, were treated with the original Dornier HM-3 lithotriptor at a single stone center. Inclusion criteria were: solitary stones, 10–30 mm in greatest diameter, located in renal pelvis or calyces. Based on plain radiographs, the calculi were classified according to their size, form, location, density (compared to the 12th rib), structure and surface. Furthermore, age of the patient, gender and body mass index were also considered for evaluation. Disintegration was documented on day 1 after ESWL by plain X-ray. A multivariate regression analysis was applied to all preoperative parameters, based on the dual variable stone free versus residual fragments. Results: The overall disintegration rate was 95.5%: 42 patients (20.8%) were completely stone free, and 151 patients (74.7%) had clinically insignificant residual fragments (5 mm or smaller). 14.9% of men and 29.6% of women were stone free (p = 0.01). All other parameters did not reach statistical significance. Conclusions: The disintegration rate of the HM-3 is excellent for kidney stones; women did significantly better than men. However, because of this high disintegration rate, a much larger series would be necessary to define possible differences between preinterventional parameters, if there were any at all.
Publisher: Wiley
Date: 06-02-2014
DOI: 10.1111/ANZS.12063
Publisher: Informa UK Limited
Date: 09-2009
Publisher: Oxford University Press (OUP)
Date: 24-10-2013
DOI: 10.1093/BIOINFORMATICS/BTT608
Abstract: Motivation: Gut microbiota can be classified at multiple taxonomy levels. Strategies to use changes in microbiota composition to effect health improvements require knowing at which taxonomy level interventions should be aimed. Identifying these important levels is difficult, however, because most statistical methods only consider when the microbiota are classified at one taxonomy level, not multiple. Results: Using L1 and L2 regularizations, we developed a new variable selection method that identifies important features at multiple taxonomy levels. The regularization parameters are chosen by a new, data-adaptive, repeated cross-validation approach, which performed well. In simulation studies, our method outperformed competing methods: it more often selected significant variables, and had small false discovery rates and acceptable false-positive rates. Applying our method to gut microbiota data, we found which taxonomic levels were most altered by specific interventions or physiological status. Availability: The new approach is implemented in an R package, which is freely available from the corresponding author. Contact: tpgarcia@srph.tamhsc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Elsevier BV
Date: 2011
DOI: 10.1016/J.CLINPH.2010.05.034
Abstract: To measure the intra-individual distribution of the latencies of motor evoked potentials (MepL) using transcranial magnetic stimulation. We used the triple stimulation technique (TST) to quantify the proportion of excited spinal motor neurons supplying the abductor digiti minimi muscle in response to a maximal magnetic brain stimulus (Magistris et al., 1998). By systematically manipulating the TST delay, we could quantify the contribution of slow-conducting motor tract portions to the TST amplitude. Our method allowed the establishment of a MepL distribution for each of the 29 examined healthy subjects. MepLs of 50% of the motor tract contributing to the motor evoked potential lay between the intra-individually minimal MepL (MepL(min)) and MepL(min)+4.9 ms (range 1.6-9.2). The individual MepL distributions showed two peaks in most subjects. The first peak appeared at a MepL that was 3.0 ms longer on average (range 0.7-6.0) than MepL(min); the second peak appeared at MepL(min)+8.1 ms on average (range 3.7-13.0). Slow-conducting parts of the motor pathway contribute notably to the motor evoked potential. Our data suggest a bimodal distribution of central conduction times, which might possibly relate to different fibre types within the pyramidal tract. We present a non-invasive method to assess slow-conducting parts of the human central motor tract.
Publisher: Informa UK Limited
Date: 13-06-2017
Publisher: Informa UK Limited
Date: 27-02-2019
Publisher: Springer Science and Business Media LLC
Date: 15-03-2023
DOI: 10.1186/S40168-023-01475-4
Abstract: Unravelling the interplay between diet, the microbiome, and the health state could enable the design of personalized intervention strategies and improve the health and well-being of individuals. A common approach to this is to divide the study population into smaller cohorts based on dietary preferences in the hope of identifying specific microbial signatures. However, classification of patients based solely on diet is unlikely to reflect the microbiome-host health relationship or the taxonomic microbiome makeup. We present a novel approach, the Nutrition-Ecotype Mixture of Experts (NEMoE) model, for establishing associations between gut microbiota and health state that accounts for diet-specific cohort variability using a regularized mixture of experts model framework with an integrated parameter sharing strategy to ensure data-driven diet-cohort identification consistency across taxonomic levels. The success of our approach was demonstrated through a series of simulation studies, in which NEMoE showed robustness with regard to parameter selection and varying degrees of data heterogeneity. Further application to real-world microbiome data from a Parkinson’s disease cohort revealed that NEMoE is capable of not only improving predictive performance for Parkinson’s Disease but also identifying diet-specific microbial signatures of disease. In summary, NEMoE can be used to uncover diet-specific relationships between nutritional-ecotype and patient health and to contextualize precision nutrition for different diseases.
Publisher: Walter de Gruyter GmbH
Date: 14-01-2012
Abstract: Clustering of gene expression data is often done with the latent aim of dimension reduction, by finding groups of genes that have a common response to potentially unknown stimuli. However, what is poorly understood to date is the behaviour of a low dimensional signal embedded in high dimensions. This paper introduces a multicollinear model which is based on random matrix theory results, and shows potential for the characterisation of a gene cluster's correlation matrix. This model projects a one dimensional signal into many dimensions and is based on the spiked covariance model, but rather characterises the behaviour of the corresponding correlation matrix. The eigenspectrum of the correlation matrix is empirically examined by simulation, under the addition of noise to the original signal. The simulation results are then used to propose a dimension estimation procedure of clusters from data. Moreover, the simulation results warn against considering pairwise correlations in isolation, as the model provides a mechanism whereby a pair of genes with `low' correlation may simply be due to the interaction of high dimension and noise. Instead, collective information about all the variables is given by the eigenspectrum.
Publisher: Frontiers Media SA
Date: 17-05-2022
DOI: 10.3389/FNAGI.2022.875261
Abstract: Altered gut microbiome (GM) composition has been established in Parkinson’s disease (PD). However, few studies have longitudinally investigated the GM in PD, or the impact of device-assisted therapies. To investigate the temporal stability of GM profiles from PD patients on standard therapies and those initiating device-assisted therapies (DAT) and define multivariate models of disease and progression. We evaluated validated clinical questionnaires and stool samples from 74 PD patients and 74 household controls (HCs) at 0, 6, and 12 months. Faster or slower disease progression was defined from levodopa equivalence dose and motor severity measures. 19 PD patients initiating Deep Brain Stimulation or Levodopa-Carbidopa Intestinal Gel were separately evaluated at 0, 6, and 12 months post-therapy initiation. Persistent underrepresentation of short-chain fatty-acid-producing bacteria, Butyricicoccus, Fusicatenibacter, Lachnospiraceae ND3007 group, and Erysipelotrichaceae UCG-003, were apparent in PD patients relative to controls. A sustained effect of DAT initiation on GM associations with PD was not observed. PD progression analysis indicated that the genus Barnesiella was underrepresented in faster progressing PD patients at t = 0 and t = 12 months. Two-stage predictive modeling, integrating microbiota abundances and nutritional profiles, improved predictive capacity (change in Area Under the Curve from 0.58 to 0.64) when assessed at Amplicon Sequence Variant taxonomic resolution. We present longitudinal GM studies in PD patients, showing persistently altered GM profiles suggestive of a reduced butyrogenic production potential. DATs exerted variable GM influences across the short and longer-term. We found that specific GM profiles combined with dietary factors improved prediction of disease progression in PD patients.
Publisher: Oxford University Press (OUP)
Date: 10-04-2013
Publisher: Springer Science and Business Media LLC
Date: 04-07-2022
DOI: 10.1038/S41746-022-00618-5
Abstract: In this modern era of precision medicine, molecular signatures identified from advanced omics technologies hold great promise to better guide clinical decisions. However, current approaches are often location-specific due to the inherent differences between platforms and across multiple centres, thus limiting the transferability of molecular signatures. We present Cross-Platform Omics Prediction (CPOP), a penalised regression model that can use omics data to predict patient outcomes in a platform-independent manner and across time and experiments. CPOP improves on the traditional prediction framework of using gene-based features by selecting ratio-based features with similar estimated effect sizes. These components gave CPOP the ability to have a stable performance across datasets of similar biology, minimising the effect of technical noise often generated by omics platforms. We present a comprehensive evaluation using melanoma transcriptomics data to demonstrate its potential to be used as a critical part of a clinical screening framework for precision medicine. Additional assessment of generalisation was demonstrated with ovarian cancer and inflammatory bowel disease studies.
Publisher: Springer Science and Business Media LLC
Date: 28-08-2014
DOI: 10.1007/S00423-014-1243-1
Abstract: To investigate the prognosis of adenocarcinomas of the upper third of the rectum and the rectosigmoid-junction without radiotherapy. Patients from a multicenter randomized controlled trial from 1987-1993 on adjuvant chemotherapy for R0-resected colorectal cancers with stage I-III disease were retrospectively allocated: cancers of the lower two-thirds of the rectum (11 cm or less from anal-verge, Group A, n = 205), of the upper-third of the rectum and rectosigmoid-junction (>11-20 cm from anal-verge, Group B, n = 142), and of the colon (>20 cm from anal-verge, Group C, n = 378). The total mesorectal excision (TME) technique had not been introduced yet. The adjuvant chemotherapy turned out to be ineffective. None of the patients received neoadjuvant or adjuvant radiotherapy. The patients had a regular follow-up (median, 8.0 years). The 5-year disease-free survival (DFS) rate was 0.54 (95%CI, 0.47-0.60) in Group A, 0.68 (95%CI, 0.60-0.75) in Group B, and 0.69 (95%CI, 0.64-0.74) in Group C. The 5-year overall survival (OS) rate was 0.64 (95%CI, 0.57-0.71) in Group A, 0.79 (95%CI, 0.71-0.85) in Group B, and 0.77 (95%CI, 0.73-0.81) in Group C. Compared with Group C, patients in Group A had a significantly worse OS (hazard ratio [HR] for death 2.10) and a worse DFS (HR for relapse/death 1.93), while patients in Group B had a similar OS (HR 1.12) and DFS (HR 1.07). Adenocarcinomas of the upper third of the rectum and the rectosigmoid-junction seem to have similar prognosis as colon cancers. Even for surgeons not familiar with the TME technique, preoperative radiotherapy may be avoided for most rectosigmoid cancers above 11 cm from anal-verge.
Publisher: Springer Science and Business Media LLC
Date: 26-07-2014
Publisher: Oxford University Press (OUP)
Date: 24-04-2015
DOI: 10.1093/BIOINFORMATICS/BTV220
Abstract: Motivation: In practice, identifying and interpreting the functional impacts of the regulatory relationships between micro-RNA and messenger-RNA is non-trivial. The sheer scale of possible micro-RNA and messenger-RNA interactions can make the interpretation of results difficult. Results: We propose a supervised framework, pMim, built upon concepts of significance combination, for jointly ranking regulatory micro-RNA and their potential functional impacts with respect to a condition of interest. Here, pMim directly tests if a micro-RNA is differentially expressed and if its predicted targets, which lie in a common biological pathway, have changed in the opposite direction. We leverage the information within existing micro-RNA target and pathway databases to stabilize the estimation and annotation of micro-RNA regulation, making our approach suitable for datasets with small sample sizes. In addition to outputting meaningful and interpretable results, we demonstrate in a variety of datasets that the micro-RNA identified by pMim, in comparison to simpler existing approaches, are also more concordant with what is described in the literature. Availability and implementation: This framework is implemented as an R function, pMim, in the package sydSeq available from -packages. Contact: jean.yang@sydney.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
Publisher: Oxford University Press (OUP)
Date: 2022
DOI: 10.1093/GIGASCIENCE/GIAC071
Abstract: Survival analysis is a branch of statistics that deals with both the tracking of time and the survival status simultaneously as the dependent response. Current comparisons of survival model performance mostly center on clinical data with classic statistical survival models, with prediction accuracy often serving as the sole metric of model performance. Moreover, survival analysis approaches for censored omics data have not been thoroughly investigated. The common approach is to binarize the survival time and perform a classification analysis. Here, we develop a benchmarking design, SurvBenchmark, that evaluates a diverse collection of survival models for both clinical and omics data sets. SurvBenchmark not only focuses on classical approaches such as the Cox model but also evaluates state-of-the-art machine learning survival models. All approaches were assessed using multiple performance metrics; these include model predictability, stability, flexibility, and computational issues. Our systematic comparison design with 320 comparisons (20 methods over 16 data sets) shows that the performances of survival models vary in practice over real-world data sets and over the choice of the evaluation metric. In particular, we highlight that using multiple performance metrics is critical in providing a balanced assessment of various models. The results in our study will provide practical guidelines for translational scientists and clinicians, as well as define possible areas of investigation in both survival technique and benchmarking strategies.
Publisher: Statistica Sinica (Institute of Statistical Science)
Date: 2017
Publisher: Informa UK Limited
Date: 12-2005
Publisher: Wiley
Date: 12-2005
Publisher: Statistica Sinica (Institute of Statistical Science)
Date: 2023
Publisher: Elsevier BV
Date: 09-2008
Publisher: Wiley
Date: 08-2010
Publisher: Informa UK Limited
Date: 19-06-2018
Publisher: BMJ Publishing Group Ltd
Date: 08-2021
Publisher: Springer Science and Business Media LLC
Date: 2003
Publisher: Informa UK Limited
Date: 03-2012
Publisher: Oxford University Press (OUP)
Date: 09-04-2013
Publisher: Springer Science and Business Media LLC
Date: 17-11-2020
DOI: 10.1186/S12859-020-03861-3
Abstract: Nutrigenomics aims at understanding the interaction between nutrition and gene information. Due to the complex interactions of nutrients and genes, their relationship exhibits non-linearity. One of the most effective and efficient methods to explore their relationship is the nutritional geometry framework, which fits a response surface for the gene expression over two prespecified nutrition variables. However, when the number of nutrients involved is large, it is challenging to find combinations of informative nutrients with respect to a certain gene and to test whether the relationship is stronger than chance. Methods for identifying informative combinations are essential to understanding the relationship between nutrients and genes. We introduce Local Consistency Nutrition to Graphics (LC-N2G), a novel approach for ranking and identifying combinations of nutrients with gene expression. In LC-N2G, we first propose a model-free quantity called the Local Consistency statistic to measure whether there is a non-random relationship between combinations of nutrients and gene expression measurements based on (1) the similarity between samples in the nutrient space and (2) their difference in gene expression. Then combinations with small LC are selected and a permutation test is performed to evaluate their significance. Finally, the response surfaces are generated for the subset of significant relationships. Evaluation on simulated data and real data shows that LC-N2G can accurately find combinations that are correlated with gene expression. LC-N2G is practically powerful for identifying the informative nutrition variables correlated with gene expression. Therefore, LC-N2G is important in the area of nutrigenomics for understanding the relationship between nutrition and gene expression information.
Publisher: Wiley
Date: 29-05-2013
DOI: 10.1002/SIM.5855
Abstract: Model selection techniques have existed for many years; however, to date, simple, clear and effective methods of visualising the model building process are sparse. This article describes graphical methods that assist in the selection of models and comparison of many different selection criteria. Specifically, we describe for logistic regression, how to visualize measures of description loss and of model complexity to facilitate the model selection dilemma. We advocate the use of the bootstrap to assess the stability of selected models and to enhance our graphical tools. We demonstrate which variables are important using variable inclusion plots and show that these can be invaluable plots for the model building process. We show with two case studies how these proposed tools are useful to learn more about important variables in the data and how these tools can assist the understanding of the model building process.
Publisher: Springer Nature Switzerland
Date: 2023
Publisher: American Society of Clinical Oncology (ASCO)
Date: 04-2002
Abstract: PURPOSE: To determine the efficacy and tolerability of combining oxaliplatin with capecitabine in the treatment of advanced nonpretreated and pretreated colorectal cancer. PATIENTS AND METHODS: Forty-three nonpretreated patients and 26 patients who had experienced one fluoropyrimidine-containing regimen for advanced colorectal cancer were treated with oxaliplatin 130 mg/m² on day 1 and capecitabine 1,250 mg/m² bid on days 1 to 14 every 3 weeks. Patients with good performance status (World Health Organization grade 0 to 1) were accrued onto two nonrandomized parallel arms of a phase II study. RESULTS: The objective response rate was 49% (95% confidence interval [CI], 33% to 65%) for nonpretreated and 15% (95% CI, 4% to 35%) for pretreated patients. The main toxicity of this combination was diarrhea, which occurred at grade 3 or 4 in 35% of the nonpretreated and 50% of the pretreated patients. Grade 3 or 4 sensory neuropathy, including laryngopharyngeal dysesthesia, occurred in 16% of patients in both cohorts. Capecitabine dose reductions were necessary in 26% of the nonpretreated and 45% of the pretreated patients in the second treatment cycle. The median overall survival was 17.1 months and 11.5 months, respectively. CONCLUSION: Combining capecitabine and oxaliplatin yields promising activity in advanced colorectal cancer. The main toxicity is diarrhea, which is manageable with appropriate dose reductions. On the basis of our toxicity experience, we recommend use of capecitabine in combination with oxaliplatin 130 mg/m² at an initial dose of 1,250 mg/m² bid in nonpretreated patients and at a dose of 1,000 mg/m² bid in pretreated patients.
Publisher: Wiley
Date: 12-2019
DOI: 10.1111/ANZS.12276
Start Date: 2023
End Date: 12-2025
Amount: $388,000.00
Funder: Australian Research Council
Start Date: 11-2021
End Date: 10-2024
Amount: $390,000.00
Funder: Australian Research Council
Start Date: 01-2011
End Date: 12-2014
Amount: $350,000.00
Funder: Australian Research Council
Start Date: 05-2017
End Date: 05-2020
Amount: $354,500.00
Funder: Australian Research Council
Start Date: 06-2014
End Date: 05-2017
Amount: $351,000.00
Funder: Australian Research Council
Start Date: 04-2018
End Date: 12-2021
Amount: $359,083.00
Funder: Australian Research Council
Start Date: 06-2013
End Date: 06-2017
Amount: $390,000.00
Funder: Australian Research Council