ORCID Profile
0000-0002-5147-9299
Current Organisation
Yeshiva University Albert Einstein College of Medicine
In Research Link Australia (RLA), "Research Topics" refer to ANZSRC FOR and SEO codes. These topics are either sourced from ANZSRC FOR and SEO codes listed in researchers' related grants or generated by a large language model (LLM) based on their publications.
Gene Expression (incl. Microarray and other genome-wide approaches) | Biochemistry and Cell Biology | Bioinformatics | Systems Biology
Expanding Knowledge in the Biological Sciences | Application Software Packages (excl. Computer Games) | Expanding Knowledge in the Mathematical Sciences
Publisher: Springer Science and Business Media LLC
Date: 17-06-2015
DOI: 10.1038/NATURE14607
Publisher: Public Library of Science (PLoS)
Date: 18-12-2015
Publisher: Springer Science and Business Media LLC
Date: 17-06-2015
DOI: 10.1038/NATURE14606
Publisher: Oxford University Press (OUP)
Date: 23-09-2022
DOI: 10.1093/BIB/BBAC387
Abstract: Accurately identifying cell-populations is paramount to the quality of downstream analyses and overall interpretations of single-cell RNA-seq (scRNA-seq) datasets but remains a challenge. The quality of single-cell clustering depends on the proximity metric used to generate cell-to-cell distances. Accordingly, proximity metrics have been benchmarked for scRNA-seq clustering, typically with results averaged across datasets to identify a highest performing metric. However, the ‘best-performing’ metric varies between studies, with the performance differing significantly between datasets. This suggests that the unique structural properties of an scRNA-seq dataset, specific to the biological system under study, have a substantial impact on proximity metric performance. Previous benchmarking studies have not factored these structural properties into their evaluations. To address this gap, we developed a framework for the in-depth evaluation of the performance of 17 proximity metrics with respect to core structural properties of scRNA-seq data, including sparsity, dimensionality, cell-population distribution and rarity. We find that clustering performance can be improved substantially by the selection of an appropriate proximity metric and neighbourhood size for the structural properties of a dataset, in addition to performing suitable pre-processing and dimensionality reduction. Furthermore, popular metrics such as Euclidean and Manhattan distance performed poorly in comparison to several less commonly applied metrics, suggesting that the default metric for many scRNA-seq methods should be re-evaluated. Our findings highlight the critical nature of tailoring scRNA-seq analysis pipelines to the dataset under study and provide practical guidance for researchers looking to optimize cell-similarity search for the structural properties of their own data.
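As a minimal sketch of what a proximity metric is in this setting, the Python below computes three cell-to-cell distances on toy expression vectors. Euclidean and Manhattan distance are named in the abstract; cosine distance is included only as an illustrative alternative and is an assumption, as are the function names and the toy values.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute per-gene differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus cosine similarity; insensitive to overall magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here for all-zero profiles
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical sparse profiles: cell_c expresses the same gene as cell_a,
# but at twice the level (e.g. deeper sequencing).
cell_a = [0, 3, 0, 0]
cell_b = [4, 0, 0, 0]
cell_c = [0, 6, 0, 0]

for name, metric in [("euclidean", euclidean),
                     ("manhattan", manhattan),
                     ("cosine", cosine_distance)]:
    print(name, metric(cell_a, cell_b), metric(cell_a, cell_c))
```

In this toy case Euclidean and Manhattan distance both treat cell_c as 3 units away from cell_a, while cosine distance treats the two as identical in direction, which hints at why the choice of metric can change the neighbours a clustering method sees.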
Publisher: Oxford University Press (OUP)
Date: 28-06-2012
Abstract: Pluripotent stem cells can differentiate into every cell type of the human body. Reprogramming of somatic cells into induced pluripotent stem cells (iPSCs) therefore provides an opportunity to gain insight into the molecular and cellular basis of disease. Because the cellular DNA damage response poses a barrier to reprogramming, generation of iPSCs from patients with chromosomal instability syndromes has thus far proven to be difficult. Here we demonstrate that fibroblasts from patients with ataxia-telangiectasia (A-T), a disorder characterized by chromosomal instability, progressive neurodegeneration, high risk of cancer, and immunodeficiency, can be reprogrammed to bona fide iPSCs, albeit at a reduced efficiency. A-T iPSCs display defective radiation-induced signaling, radiosensitivity, and cell cycle checkpoint defects. Bioinformatic analysis of gene expression in the A-T iPSCs identifies abnormalities in DNA damage signaling pathways, as well as changes in mitochondrial and pentose phosphate pathways. A-T iPSCs can be differentiated into functional neurons and thus represent a suitable model system to investigate A-T-associated neurodegeneration. Collectively, our data show that iPSCs can be generated from a chromosomal instability syndrome and that these cells can be used to discover early developmental consequences of ATM deficiency, such as altered mitochondrial function, that may be relevant to A-T pathogenesis and amenable to therapeutic intervention.
Publisher: Cold Spring Harbor Laboratory
Date: 03-2023
DOI: 10.1101/2023.02.27.530366
Abstract: Cell reprogramming involves time-intensive, costly processes that ultimately produce low numbers of reprogrammed cells of variable quality. By screening a range of polyacrylamide hydrogels (pAAm gels) of varying stiffness (1 kPa – 1.3 MPa) we found that a gel of medium stiffness significantly increases the overall number of reprogrammed cells by up to ten-fold with accelerated reprogramming kinetics, as compared to the standard Tissue Culture PolyStyrene (TCPS)-based protocol. We observe that though the gel improves both early and late phases of reprogramming, improvement in the late (reprogramming prone population maturation) phase is more pronounced and produces iPSCs having different characteristics and lower remnant transgene expression than those produced on TCPS. Comparative RNA-Seq analyses coupled with experimental validation reveal that modulation of Bone Morphogenic Protein (BMP) signalling by a novel reprogramming regulator, Phactr3, upregulated in the gel at the earliest time-point without the influence of transcription factors used for reprogramming, plays a crucial role in the improvement in the early reprogramming kinetics and overall reprogramming outcomes. This study provides new insights into the mechanism via which substrate stiffness modulates reprogramming kinetics and iPSC quality outcomes, opening new avenues for producing higher numbers of quality iPSCs or other reprogrammed cells at shorter timescales.
Publisher: Elsevier BV
Date: 12-2015
Publisher: Public Library of Science (PLoS)
Date: 14-10-2011
Publisher: Public Library of Science (PLoS)
Date: 19-08-2015
Publisher: Springer Science and Business Media LLC
Date: 2005
Publisher: Springer Science and Business Media LLC
Date: 18-02-2016
Abstract: The epigenetic landscape was introduced by Conrad Waddington as a metaphor of cellular development. Just as a ball rolling down a hillside is channelled through a succession of valleys until it reaches the bottom, cells follow specific trajectories from a pluripotent state to a committed state. Transcription factors (TFs) interacting as a network (the gene regulatory network (GRN)) orchestrate this developmental process within each cell. Here, we quantitatively model the epigenetic landscape using a kind of artificial neural network called the Hopfield network (HN). An HN is composed of nodes (genes/TFs) and weighted undirected edges, resulting in a weight matrix (W) that stores interactions among the nodes over the entire network. We used gene co-expression to compute the edge weights. Through W, we then associate an energy score (E) to each input pattern (pattern of co-expression for a specific developmental stage) such that each pattern has a specific E. We propose that, based on the co-expression values stored in W, HN associates lower E values to stable phenotypic states and higher E to transient states. We validate our model using time course gene-expression data sets representing stages of development across 12 biological processes including differentiation of human embryonic stem cells into specialized cells, differentiation of THP1 monocytes to macrophages during immune response and trans-differentiation of epithelial to mesenchymal cells in cancer. We observe that transient states have higher energy than the stable phenotypic states, yielding an arc-shaped trajectory. This relationship was confirmed by perturbation analysis. HNs offer an attractive framework for quantitative modelling of cell differentiation (as a landscape) from empirical data. Using HNs, we identify genes and TFs that drive cell-fate transitions, and gain insight into the global dynamics of GRNs.
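The relationship between W and E described in this abstract can be illustrated with the textbook Hopfield energy, E = -1/2 * sum over i, j of W[i][j] * s[i] * s[j]. The sketch below is an illustration under stated assumptions rather than the paper's pipeline: Pearson correlation stands in for "gene co-expression", patterns are binarised to +1/-1, and all function names and toy values are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two genes' expression across samples."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def weight_matrix(expr):
    """Build a symmetric W from co-expression.

    expr holds one list of expression values per gene. The diagonal is
    set to zero, the usual Hopfield convention of no self-connections.
    """
    n = len(expr)
    return [[0.0 if i == j else pearson(expr[i], expr[j]) for j in range(n)]
            for i in range(n)]

def energy(W, s):
    """Standard Hopfield energy of a +1/-1 state vector s."""
    n = len(s)
    return -0.5 * sum(W[i][j] * s[i] * s[j] for i in range(n) for j in range(n))

# Toy data (assumed): two genes that rise together across three samples.
expr = [[1, 2, 3],
        [2, 4, 6]]
W = weight_matrix(expr)

print(energy(W, [1, 1]))   # pattern consistent with the co-expression
print(energy(W, [1, -1]))  # pattern that contradicts it
```

A pattern that agrees with the stored co-expression gets the lower energy, which is the sense in which lower E corresponds to a more stable state.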
Publisher: Public Library of Science (PLoS)
Date: 14-12-2015
Publisher: Elsevier BV
Date: 05-2008
Publisher: Wiley
Date: 08-11-2020
DOI: 10.1111/SAJE.12273
Publisher: Elsevier BV
Date: 08-2014
Publisher: Springer Science and Business Media LLC
Date: 1
DOI: 10.1038/NATURE14047
Abstract: Pluripotency is defined by the ability of a cell to differentiate to the derivatives of all the three embryonic germ layers: ectoderm, mesoderm and endoderm. Pluripotent cells can be captured via the archetypal derivation of embryonic stem cells or via somatic cell reprogramming. Somatic cells are induced to acquire a pluripotent stem cell (iPSC) state through the forced expression of key transcription factors, and in the mouse these cells can fulfil the strictest of all developmental assays for pluripotent cells by generating completely iPSC-derived embryos and mice. However, it is not known whether there are additional classes of pluripotent cells, or what the spectrum of reprogrammed phenotypes encompasses. Here we explore alternative outcomes of somatic reprogramming by fully characterizing reprogrammed cells independent of preconceived definitions of iPSC states. We demonstrate that by maintaining elevated reprogramming factor expression levels, mouse embryonic fibroblasts go through unique epigenetic modifications to arrive at a stable, Nanog-positive, alternative pluripotent state. In doing so, we prove that the pluripotent spectrum can encompass multiple, unique cell states.
Publisher: Springer Science and Business Media LLC
Date: 10-12-2014
DOI: 10.1038/NATURE14046
Abstract: Somatic cell reprogramming to a pluripotent state continues to challenge many of our assumptions about cellular specification, and despite major efforts, we lack a complete molecular characterization of the reprogramming process. To address this gap in knowledge, we generated extensive transcriptomic, epigenomic and proteomic data sets describing the reprogramming routes leading from mouse embryonic fibroblasts to induced pluripotency. Through integrative analysis, we reveal that cells transition through distinct gene expression and epigenetic signatures and bifurcate towards reprogramming transgene-dependent and -independent stable pluripotent states. Early transcriptional events, driven by high levels of reprogramming transcription factor expression, are associated with widespread loss of histone H3 lysine 27 (H3K27me3) trimethylation, representing a general opening of the chromatin state. Maintenance of high transgene levels leads to re-acquisition of H3K27me3 and a stable pluripotent state that is alternative to the embryonic stem cell (ESC)-like fate. Lowering transgene levels at an intermediate phase, however, guides the process to the acquisition of ESC-like chromatin and DNA methylation signature. Our data provide a comprehensive molecular description of the reprogramming routes and are accessible through the Project Grandiose portal at www.stemformatics.org.
Publisher: PeerJ
Date: 23-05-2017
DOI: 10.7717/PEERJ.3334
Abstract: Identifying the pathways that control a cellular phenotype is the first step to building a mechanistic model. Recent examples in developmental biology, cancer genomics, and neurological disease have demonstrated how changes in the variability of gene expression can highlight important genes that are under different degrees of regulatory control. Simple statistical tests exist to identify differentially-variable genes; however, methods for investigating changes in gene expression variability in the context of pathways and gene sets are under-explored. Here we present pathVar, a new method that provides functional interpretation of gene expression variability changes at the level of pathways and gene sets. pathVar is based on a multinomial exact test, or an asymptotic Chi-squared test as a more computationally-efficient alternative. The method can be used for gene expression studies from any technology platform in all biological settings either with a single phenotypic group, or two-group comparisons. To demonstrate its utility, we applied the method to a diverse set of diseases, species and samples. Results from pathVar are benchmarked against analyses based on average expression and two methods of GSEA, and demonstrate that analyses using both statistics are useful for understanding transcriptional regulation. We also provide recommendations for the choice of variability statistic that have been informed through analyses on simulations and real data. Based on the datasets selected, we show how pathVar can be used to gain insight into expression variability of single cell versus bulk samples, different stem cell populations, and cancer versus normal tissue comparisons.
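The abstract does not spell out how counts enter the test, so the following is only a sketch of a generic Pearson chi-squared goodness-of-fit statistic, assuming that genes are first binned into variability categories and that a pathway's bin counts are compared with reference proportions taken from all genes. The binning, the function name and the numbers are hypothetical and are not taken from the pathVar package.

```python
def chi_squared_statistic(observed, reference_props):
    """Pearson chi-squared statistic for multinomial counts.

    observed: gene counts per variability bin within one pathway.
    reference_props: expected proportion of genes per bin (sums to 1).
    Under the null the statistic is approximately chi-squared with
    len(observed) - 1 degrees of freedom when expected counts are large.
    """
    total = sum(observed)
    stat = 0.0
    for obs, prop in zip(observed, reference_props):
        expected = total * prop
        stat += (obs - expected) ** 2 / expected
    return stat

# Assumed reference: 50% low, 25% medium, 25% high variability genes.
reference = [0.5, 0.25, 0.25]

print(chi_squared_statistic([20, 10, 10], reference))  # matches the reference
print(chi_squared_statistic([10, 15, 15], reference))  # shifted towards higher variability
```

A larger statistic indicates that the pathway's genes are distributed across variability bins differently from the reference; converting it to a p-value requires the chi-squared distribution, which is omitted here.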
Publisher: Elsevier BV
Date: 06-2014
Publisher: Springer Science and Business Media LLC
Date: 11-09-2015
Publisher: Public Library of Science (PLoS)
Date: 11-08-2011
Publisher: Springer Science and Business Media LLC
Date: 03-2014
DOI: 10.1038/NATURE13182
Publisher: Springer Science and Business Media LLC
Date: 19-04-2009
Abstract: High-throughput real-time quantitative reverse transcriptase polymerase chain reaction (qPCR) is a widely used technique in experiments where expression patterns of genes are to be profiled. Current stage technology allows the acquisition of profiles for a moderate number of genes (50 to a few thousand), and this number continues to grow. The use of appropriate normalization algorithms for qPCR-based data is therefore a highly important aspect of the data preprocessing pipeline. We present and evaluate two data-driven normalization methods that directly correct for technical variation and represent robust alternatives to standard housekeeping gene-based approaches. We evaluated the performance of these methods against a single gene housekeeping gene method and our results suggest that quantile normalization performs best. These methods are implemented in freely-available software as an R package qpcrNorm distributed through the Bioconductor project. The utility of the approaches that we describe can be demonstrated most clearly in situations where standard housekeeping genes are regulated by some experimental condition. For large qPCR-based data sets, our approaches represent robust, data-driven strategies for normalization.
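Quantile normalization, which the abstract reports as the best-performing method, can be sketched in a few lines. This is a generic version written for illustration and is not the qpcrNorm implementation: each sample's values are replaced by the mean, across samples, of the values holding the same rank. Ties are broken by position, and the function name and toy values are assumptions.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    samples: list of equal-length lists, one list of gene values per sample.
    Returns new lists in the original gene order.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean value at each rank across all samples.
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples)
                  for r in range(n_genes)]
    normalized = []
    for s in samples:
        # Gene indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_genes), key=lambda i: s[i])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = rank_means[rank]
        normalized.append(out)
    return normalized

# Toy data (assumed): two samples, three genes.
samples = [[5, 2, 3],
           [4, 1, 6]]
print(quantile_normalize(samples))
```

After normalization both samples contain exactly the values 1.5, 3.5 and 5.5, arranged according to each sample's original gene ranking, so no housekeeping gene is needed as a reference.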
Publisher: Springer Science and Business Media LLC
Date: 20-10-2023
Publisher: Oxford University Press (OUP)
Date: 16-02-2011
DOI: 10.1093/BIOINFORMATICS/BTR074
Abstract: Motivation: Unsupervised ‘cluster’ analysis is an invaluable tool for exploratory microarray data analysis, as it organizes the data into groups of genes or samples in which the elements share common patterns. Once the data are clustered, finding the optimal number of informative subgroups within a dataset is a problem that, while important for understanding the underlying phenotypes, is one for which there is no robust, widely accepted solution. Results: To address this problem we developed an ‘informativeness metric’ based on a simple analysis of variance statistic that identifies the number of clusters which best separate phenotypic groups. The performance of the informativeness metric has been tested on both experimental and simulated datasets, and we contrast these results with those obtained using alternative methods such as the gap statistic. Availability: The method has been implemented in the Bioconductor R package attract; it is also freely available from ubs/attract_1.0.1.zip. Contact: jess@jimmy.harvard.edu, johnq@jimmy.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
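The abstract describes the informativeness metric only as being based on a simple analysis of variance statistic, so the sketch below shows a standard one-way ANOVA F statistic and nothing more specific. How the attract package turns such a statistic into a choice of cluster number is not reproduced here; the function name and toy values are assumptions.

```python
def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square.

    groups: list of lists, e.g. one cluster's expression summary per
    phenotypic group. Assumes at least two groups and nonzero
    within-group variance.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy values (assumed): two phenotypic groups with clearly separated means.
print(anova_f([[1, 2, 3], [4, 5, 6]]))
# Overlapping groups give a much smaller statistic.
print(anova_f([[1, 2, 3], [2, 3, 4]]))
```

A larger F means the phenotypic groups are better separated relative to the spread within each group.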
Publisher: Oxford University Press (OUP)
Date: 28-12-2022
DOI: 10.1093/GIGASCIENCE/GIAC126
Abstract: Single-cell RNA sequencing (scRNA-seq) methods have been advantageous for quantifying cell-to-cell variation by profiling the transcriptomes of individual cells. For scRNA-seq data, variability in gene expression reflects the degree of variation in gene expression from one cell to another. Analyses that focus on cell–cell variability therefore are useful for going beyond changes based on average expression and, instead, identifying genes with homogeneous expression versus those that vary widely from cell to cell. We present a novel statistical framework, scShapes, for identifying differential distributions in single-cell RNA-sequencing data using generalized linear models. Most approaches for differential gene expression detect shifts in the mean value. However, as single-cell data are driven by overdispersion and dropouts, moving beyond means and using distributions that can handle excess zeros is critical. scShapes quantifies gene-specific cell-to-cell variability by testing for differences in the expression distribution while flexibly adjusting for covariates if required. We demonstrate that scShapes identifies subtle variations that are independent of altered mean expression and detects biologically relevant genes that were not discovered through standard approaches. This analysis also draws attention to genes that switch distribution shapes from a unimodal distribution to a zero-inflated distribution and raises open questions about the plausible biological mechanisms that may give rise to this, such as transcriptional bursting. Overall, the results from scShapes help to expand our understanding of the role that gene expression plays in the transcriptional regulation of a specific perturbation or cellular phenotype. Our framework scShapes is incorporated into a Bioconductor R package (ackages/release/bioc/html/scShapes.html).
Location: United States of America
Start Date: 06-2018
End Date: 12-2023
Amount: $944,572.00
Funder: Australian Research Council