ARDC Research Link Australia

Publication

Transcriptomics and single‐cell RNA‐sequencing

Publisher: Wiley

Date: 28-09-2019

Abstract: The past four decades have yielded advances in molecular biology allowing detailed characterization of the cellular genome and the transcriptome: the complete set of RNA species transcribed by a cell or tissue. Through transcriptomics and next-generation sequencing, we can now attain an unprecedented level of detail in understanding cellular phenotypes through examining the genes expressed in specific physiological and pathological states. In this review, we provide an overview of transcriptomics and RNA-sequencing in the analysis of whole tissue and single cells. We describe the techniques and pitfalls involved in the isolation and sequencing of single cells, and what additional benefits this application can provide. Finally, we look to how these technologies are being applied in pulmonary research, and how they may translate in the near future into clinical practice.

Publication

Genome-wide association study of intraocular pressure uncovers new pathways to glaucoma

Publisher: Springer Science and Business Media LLC

Date: 27-07-2018

DOI: 10.1038/S41588-018-0176-Y

Abstract: Intraocular pressure (IOP) is currently the sole modifiable risk factor for primary open-angle glaucoma (POAG), one of the leading causes of blindness worldwide

Publication

Genetic and Nongenetic Variation Revealed for the Principal Components of Human Gene Expression

Publisher: Oxford University Press (OUP)

Date: 11-2013

DOI: 10.1534/GENETICS.113.153221

Abstract: Principal components analysis has been employed in gene expression studies to correct for population substructure and batch and environmental effects. This method typically involves the removal of variation contained in as many as 50 principal components (PCs), which can constitute a large proportion of total variation present in the data. Each PC, however, can detect many sources of variation, including gene expression networks and genetic variation influencing transcript levels. We demonstrate that PCs generated from gene expression data can simultaneously contain both genetic and nongenetic factors. From heritability estimates we show that all PCs contain a considerable portion of genetic variation while nongenetic artifacts such as batch effects were associated to varying degrees with the first 60 PCs. These PCs demonstrate an enrichment of biological pathways, including core immune function and metabolic pathways. The use of PC correction in two independent data sets resulted in a reduction in the number of cis- and trans-expression QTL detected. Comparisons of PC and linear model correction revealed that PC correction was not as efficient at removing known batch effects and had a higher penalty on genetic variation. Therefore, this study highlights the danger of eliminating biologically relevant data when employing PC correction in gene expression data.

Publication

Constraints on eQTL Fine Mapping in the Presence of Multisite Local Regulation of Gene Expression

Publisher: Oxford University Press (OUP)

Date: 08-2017

DOI: 10.1534/G3.117.043752

Abstract: Expression quantitative trait locus (eQTL) detection has emerged as an important tool for unraveling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and the magnitude of effects. We describe a series of simulation studies designed to evaluate the impact of linkage disequilibrium (LD) on the fine mapping of causal variants with typical eQTL effect sizes. In the presence of multisite regulation, even though between 80 and 90% of modeled eSNPs associate with normally distributed traits, up to 10% of all secondary signals could be statistical artifacts, and at least 5% but up to one-quarter of credible intervals of SNPs within r² > 0.8 of the peak may not even include a causal site. The Bayesian methods eCAVIAR and DAP (Deterministic Approximation of Posteriors) provide only modest improvement in resolution. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine mapping of causal variants needs to be adjusted for multisite influences, as conditional estimates can be highly biased by interference among linked sites, but ultimately experimental verification of individual effects is needed. Presumably similar conclusions apply not just to eQTL mapping, but to multisite influences on fine mapping of most types of quantitative trait.
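
As a rough illustration of the LD measure r² referred to in the abstract above, the sketch below computes r² between two SNPs as the squared Pearson correlation of their genotype dosages. The 0/1/2 dosage coding and the function name `ld_r2` are assumptions made for this example; they are not taken from the paper, and genotype-based r² only approximates haplotype-based r².

```python
def ld_r2(dosages_x, dosages_y):
    """Squared Pearson correlation between two SNPs' genotype dosages.

    Dosages are assumed to be coded as 0, 1 or 2 copies of one allele,
    with one entry per individual in the same order for both SNPs.
    """
    if len(dosages_x) != len(dosages_y):
        raise ValueError("both SNPs need one dosage per individual")
    n = len(dosages_x)
    if n == 0:
        raise ValueError("at least one individual is required")

    mean_x = sum(dosages_x) / n
    mean_y = sum(dosages_y) / n

    # Sums of squares and cross-products around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages_x, dosages_y))
    var_x = sum((x - mean_x) ** 2 for x in dosages_x)
    var_y = sum((y - mean_y) ** 2 for y in dosages_y)

    if var_x == 0 or var_y == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return (cov * cov) / (var_x * var_y)
```

In a fine-mapping setting such a function could be used to list the SNPs whose r² with the peak SNP exceeds a chosen threshold; the threshold itself is a study design choice.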

Publication

Evidence for mitochondrial genetic control of autosomal gene expression

Publisher: Oxford University Press (OUP)

Date: 18-10-2016

DOI: 10.1093/HMG/DDW347

Publication

A review of the development of tumor vasculature and its effects on the tumor microenvironment

Publisher: Informa UK Limited

Date: 04-2017

DOI: 10.2147/HP.S133231

Publication

SARS-CoV-2 Receptor ACE2 Is an Interferon-Stimulated Gene in Human Airway Epithelial Cells and Is Detected in Specific Cell Subsets across Tissues

Publisher: Elsevier BV

Date: 05-2020

DOI: 10.1016/J.CELL.2020.04.035

Publication

Blood gene expression studies in migraine: Potential and caveats

Publisher: SAGE Publications

Date: 09-02-2016

DOI: 10.1177/0333102416628463

Abstract: Global gene expression analysis may be used to obtain insights into the functional processes underlying migraine. However, there is a shortage of high-quality post-mortem brain tissue samples for genetic analysis. One approach is to use a more accessible tissue as a surrogate, such as peripheral blood. Discuss the benefits and caveats of blood genomic profiling in migraine and its potential application in the development of biomarkers of migraine susceptibility and outcome. Demonstrate the utility of blood-based expression profiles in migraine by analysing pilot Illumina HT-12 expression data from 76 (38 case, 38 control) whole-blood samples. Current evidence suggests peripheral blood is a biologically valid substrate for genetic studies of migraine, and may be used to identify biomarkers and therapeutic pathways. Pilot blood gene expression data confirm that expression profiles significantly differ between migraine case and non-migraine control individuals.

Publication

C-reactive protein upregulates the whole blood expression of CD59 - an integrative analysis

Publisher: Public Library of Science (PLoS)

Date: 18-09-2017

DOI: 10.1371/JOURNAL.PCBI.1005766

Publication

Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs

Publisher: Cold Spring Harbor Laboratory

Date: 25-04-2011

DOI: 10.1101/GR.119636.110

Abstract: Genetic and fossil evidence supports a single, recent ( ,000 yr) origin of modern Homo sapiens in Africa, followed by later population divergence and dispersal across the globe (the “Out of Africa” model). However, there is less agreement on the exact nature of this migration event and dispersal of populations relative to one another. We use the empirically observed genetic correlation structure (or linkage disequilibrium) between 242,000 genome-wide single nucleotide polymorphisms (SNPs) in 17 global populations to reconstruct two key parameters of human evolution: effective population size (Ne) and population divergence times (T). A linkage disequilibrium (LD)–based approach allows changes in human population size to be traced over time and reveals a substantial reduction in Ne accompanying the “Out of Africa” exodus as well as the dramatic re-expansion of non-Africans as they spread across the globe. Secondly, two parallel estimates of population divergence times provide clear evidence of population dispersal patterns “Out of Africa” and subsequent dispersal of proto-European and proto-East Asian populations. Estimates of divergence times between European–African and East Asian–African populations are inconsistent with its simplest manifestation: a single dispersal from the continent followed by a split into Western and Eastern Eurasian branches. Rather, population divergence times are consistent with substantial ancient gene flow to the proto-European population after its divergence with proto-East Asians, suggesting distinct, early dispersals of modern H. sapiens from Africa. We use simulated genetic polymorphism data to demonstrate the validity of our conclusions against alternative population demographic scenarios.

Publication

Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes

Publisher: Springer Science and Business Media LLC

Date: 27-07-2018

DOI: 10.1038/S41467-018-04951-W

Abstract: Type 2 diabetes (T2D) is a very common disease in humans. Here we conduct a meta-analysis of genome-wide association studies (GWAS) with ~16 million genetic variants in 62,892 T2D cases and 596,424 controls of European ancestry. We identify 139 common and 4 rare variants associated with T2D, 42 of which (39 common and 3 rare variants) are independent of the known variants. Integration of the gene expression data from blood (n = 14,115 and 2765) with the GWAS results identifies 33 putative functional genes for T2D, 3 of which were targeted by approved drugs. A further integration of DNA methylation (n = 1980) and epigenomic annotation data highlight 3 genes (CAMK1D, TP53INP1, and ATP5G1) with plausible regulatory mechanisms, whereby a genetic variant exerts an effect on T2D through epigenetic regulation of gene expression. Our study uncovers additional loci, proposes putative genetic regulatory mechanisms for T2D, and provides evidence of purifying selection for T2D-associated variants.

Publication

Contribution of genetic variation to transgenerational inheritance of DNA methylation

Publisher: Springer Science and Business Media LLC

Date: 2014

DOI: 10.1186/GB-2014-15-5-R73

Publication

A village in a dish model system for population-scale hiPSC studies

Publisher: Springer Science and Business Media LLC

Date: 09-06-2023

DOI: 10.1038/S41467-023-38704-1

Abstract: The mechanisms by which DNA alleles contribute to disease risk, drug response, and other human phenotypes are highly context-specific, varying across cell types and different conditions. Human induced pluripotent stem cells are uniquely suited to study these context-dependent effects but cell lines from hundreds or thousands of individuals are required. Village cultures, where multiple induced pluripotent stem lines are cultured and differentiated in a single dish, provide an elegant solution for scaling induced pluripotent stem experiments to the necessary sample sizes required for population-scale studies. Here, we show the utility of village models, demonstrating how cells can be assigned to an induced pluripotent stem line using single-cell sequencing and illustrating that the genetic, epigenetic or induced pluripotent stem line-specific effects explain a large percentage of gene expression variation for many genes. We demonstrate that village methods can effectively detect induced pluripotent stem line-specific effects, including sensitive dynamics of cell states.

Publication

Dynamic ocean management: Defining and conceptualizing real-time management of the ocean

Publisher: Elsevier BV

Date: 08-2015

DOI: 10.1016/J.MARPOL.2015.03.014

Publication

Septic Shock: A Genomewide Association Study and Polygenic Risk Score Analysis

Publisher: Cambridge University Press (CUP)

Date: 08-2020

DOI: 10.1017/THG.2020.60

Abstract: Previous genetic association studies have failed to identify loci robustly associated with sepsis, and there have been no published genetic association studies or polygenic risk score analyses of patients with septic shock, despite evidence suggesting genetic factors may be involved. We systematically collected genotype and clinical outcome data in the context of a randomized controlled trial from patients with septic shock to enrich the presence of disease-associated genetic variants. We performed genomewide association studies of susceptibility and mortality in septic shock using 493 patients with septic shock and 2442 population controls, and polygenic risk score analysis to assess genetic overlap between septic shock risk/mortality with clinically relevant traits. One variant, rs9489328, located in AL589740.1 noncoding RNA, was significantly associated with septic shock (p = 1.05 × 10⁻¹⁰); however, it is likely a false-positive. We were unable to replicate variants previously reported to be associated (p 1.00 × 10⁻⁶ in previous scans) with susceptibility to and mortality from sepsis. Polygenic risk scores for hematocrit and granulocyte count were negatively associated with 28-day mortality (p = 3.04 × 10⁻³; p = 2.29 × 10⁻³), and scores for C-reactive protein levels were positively associated with susceptibility to septic shock (p = 1.44 × 10⁻³). Results suggest that common variants of large effect do not influence septic shock susceptibility, mortality and resolution; however, genetic predispositions to clinically relevant traits are significantly associated with increased susceptibility and mortality in septic individuals.

Publication

Benchmarking of cell type deconvolution pipelines for transcriptomics data

Publisher: Springer Science and Business Media LLC

Date: 06-11-2020

DOI: 10.1038/S41467-020-19015-1

Abstract: Many computational methods have been developed to infer cell type proportions from bulk transcriptomics data. However, an evaluation of the impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the deconvolution results is still lacking. Using five single-cell RNA-sequencing (scRNA-seq) datasets, we generate pseudo-bulk mixtures to evaluate the combined impact of these factors. Both bulk deconvolution methodologies and those that use scRNA-seq data as reference perform best when applied to data in linear scale and the choice of normalization has a dramatic impact on some, but not all methods. Overall, methods that use scRNA-seq data have comparable performance to the best performing bulk methods whereas semi-supervised approaches show higher error values. Moreover, failure to include cell types in the reference that are present in a mixture leads to substantially worse results, regardless of the previous choices. Altogether, we evaluate the combined impact of factors affecting the deconvolution task across different datasets and propose general guidelines to maximize its performance.

Publication

Single‐Cell Immune Profiling in Coronary Artery Disease: The Role of State‐of‐the‐Art Immunophenotyping With Mass Cytometry in the Diagnosis of Atherosclerosis

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 15-12-2020

DOI: 10.1161/JAHA.120.017759

Abstract: Coronary artery disease remains the leading cause of death globally and is a major burden to every health system in the world. There have been significant improvements in risk modification, treatments, and mortality; however, our ability to detect asymptomatic disease for early intervention remains limited. Recent discoveries regarding the inflammatory nature of atherosclerosis have prompted investigation into new methods of diagnosis and treatment of coronary artery disease. This article reviews some of the highlights of the important developments in cardioimmunology and summarizes the clinical evidence linking the immune system and atherosclerosis. It provides an overview of the major serological biomarkers that have been associated with atherosclerosis, noting the limitations of these markers attributable to low specificity, and then contrasts these serological markers with the circulating immune cell subtypes that have been found to be altered in coronary artery disease. This review then outlines the technique of mass cytometry and its ability to provide high‐dimensional single‐cell data and explores how this high‐resolution quantification of specific immune cell subpopulations may assist in the diagnosis of early atherosclerosis in combination with other complementary techniques such as single‐cell RNA sequencing. We propose that this improved specificity has the potential to transform the detection of coronary artery disease in its early phases, facilitating targeted preventative approaches in the precision medicine era.

Publication

Reconciling the analysis of IBD and IBS in complex trait studies

Publisher: Springer Science and Business Media LLC

Date: 28-09-2010

DOI: 10.1038/NRG2865

Abstract: Identity by descent (IBD) is a fundamental concept in genetics and refers to alleles that are descended from a common ancestor in a base population. Identity by state (IBS) simply refers to alleles that are the same, irrespective of whether they are inherited from a recent ancestor. In modern applications, IBD relationships are estimated from genetic markers in individuals without any known relationship. This can lead to erroneous inference because a consistent base population is not used. We argue that the purpose of most IBD calculations is to predict IBS at unobserved loci. Recognizing this aim leads to better methods for estimating IBD with benefits in mapping genes, estimating genetic variance and predicting inbreeding depression.
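
To make the identity-by-state idea in the abstract above concrete, the following sketch computes the average proportion of alleles that two individuals share IBS across a set of biallelic markers. The 0/1/2 genotype coding, the per-locus count of 2 − |a − b| shared alleles, and the function name `ibs_proportion` are assumptions chosen for illustration; they are not drawn from the paper.

```python
def ibs_proportion(geno_a, geno_b):
    """Average proportion of alleles shared identical by state (IBS).

    Genotypes are assumed to be coded as 0, 1 or 2 copies of one allele at
    each biallelic marker. At a single marker the two individuals share
    2 - |a - b| alleles IBS, so the result lies between 0 and 1.
    """
    if len(geno_a) != len(geno_b):
        raise ValueError("both individuals need a genotype at every marker")
    if not geno_a:
        raise ValueError("at least one marker is required")

    shared = sum(2 - abs(a - b) for a, b in zip(geno_a, geno_b))
    return shared / (2 * len(geno_a))
```

Note that this measures observed allele sharing only; it says nothing about whether the shared alleles descend from a common ancestor, which is the distinction the abstract draws between IBS and IBD.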

Publication

Trans-eQTLs identified in whole blood have limited influence on complex disease biology

Publisher: Springer Science and Business Media LLC

Date: 11-06-2018

DOI: 10.1038/S41431-018-0174-7

Publication

Genotype-free demultiplexing of pooled single-cell RNA-seq

Publisher: Cold Spring Harbor Laboratory

Date: 07-03-2019

DOI: 10.1101/570614

Abstract: A variety of experimental and computational methods have been developed to demultiplex samples from pooled individuals in a single-cell RNA sequencing (scRNA-Seq) experiment which either require adding information (such as hashtag barcodes) or measuring information (such as genotypes) prior to pooling. We introduce scSplit which utilises genetic differences inferred from scRNA-Seq data alone to demultiplex pooled samples. scSplit also extracts a minimal set of high confidence presence/absence genotypes in each cluster which can be used to map clusters to original samples. Using a range of simulated, merged individual-sample as well as pooled multi-individual scRNA-Seq datasets, we show that scSplit is highly accurate and concordant with demuxlet predictions. Furthermore, scSplit predictions are highly consistent with the known truth in cell-hashing dataset. We also show that multiplexed-scRNA-Seq can be used to reduce batch effects caused by technical biases. scSplit is ideally suited to samples for which external genome-wide genotype data cannot be obtained (for example non-model organisms), or for which it is impossible to obtain unmixed samples directly, such as mixtures of genetically distinct tumour cells, or mixed infections. scSplit is available at: on-xu/scSplit

Publication

TNFAIP3 Reduction-of-Function Drives Female Infertility and CNS Inflammation

Publisher: Frontiers Media SA

Date: 08-04-2022

DOI: 10.3389/FIMMU.2022.811525

Abstract: Women with autoimmune and inflammatory aetiologies can exhibit reduced fecundity. TNFAIP3 is a master negative regulator of inflammation, and has been linked to many inflammatory conditions by genome-wide association studies; however, its role in fertility remains unknown. Here we show that mice harbouring a mild Tnfaip3 reduction-of-function coding variant (Tnfaip3 I325N) that reduces the threshold for inflammatory NF-κB activation exhibit reduced fecundity. Sub-fertility in Tnfaip3 I325N mice is associated with irregular estrous cycling, low numbers of ovarian secondary follicles, impaired mammary gland development and insulin resistance. These pathological features are associated with infertility in human subjects. Transplantation of Tnfaip3 I325N ovaries, mammary glands or pancreatic islets into wild-type recipients rescued estrous cycling, mammary branching and hyperinsulinemia respectively, pointing towards a cell-extrinsic hormonal mechanism. Examination of hypothalamic brain sections revealed increased levels of microglial activation with reduced levels of luteinizing hormone. TNFAIP3 coding variants may offer one contributing mechanism for the cause of sub-fertility observed across otherwise healthy populations as well as for the wide variety of auto-inflammatory conditions with which TNFAIP3 is associated. Further, TNFAIP3 represents a molecular mechanism that links heightened immunity with neuronal inflammatory homeostasis. These data also highlight that tuning-up immunity with TNFAIP3 comes with the potentially evolutionary significant trade-off of reduced fertility.

Publication

RAAS blockade, kidney disease, and expression of ACE2, the entry receptor for SARS-CoV-2, in kidney epithelial and endothelial cells

Publisher: Cold Spring Harbor Laboratory

Date: 23-06-2020

DOI: 10.1101/2020.06.23.167098

Abstract: SARS-CoV-2, the coronavirus that causes COVID-19, binds to angiotensin-converting enzyme 2 (ACE2) on human cells. Beyond the lung, COVID-19 impacts diverse tissues including the kidney. ACE2 is a key member of the Renin-Angiotensin-Aldosterone System (RAAS) which regulates blood pressure, largely through its effects on the kidney. RAAS blockers such as ACE inhibitors (ACEi) and Angiotensin Receptor Blockers (ARBs) are widely used therapies for hypertension, cardiovascular and chronic kidney diseases, and therefore, there is intense interest in their effect on ACE2 expression and its implications for SARS-CoV-2 pathogenicity. Here, we analyzed single-cell and single-nucleus RNA-seq of human kidney to interrogate the association of ACEi/ARB use with ACE2 expression in specific cell types. First, we performed an integrated analysis aggregating 176,421 cells across 49 donors, 8 studies and 8 centers, and adjusting for sex, age, donor and center effects, to assess the relationship of ACE2 with age and sex at baseline. We observed a statistically significant increase in ACE2 expression in tubular epithelial cells of the thin loop of Henle (tLoH) in males relative to females at younger ages, the trend reversing, and losing significance with older ages. ACE2 expression in tLoH increases with age in females, with an opposite, weak effect in males. In an independent cohort, we detected a statistically significant increase in ACE2 expression with ACEi/ARB use in epithelial cells of the proximal tubule and thick ascending limb, and endothelial cells, but the association was confounded in this small cohort by the underlying disease. Our study illuminates the dynamics of ACE2 expression in specific kidney cells, with implications for SARS-CoV-2 entry and pathogenicity.

Publication

Single-cell genomics meets human genetics

Publisher: Springer Science and Business Media LLC

Date: 21-04-2023

DOI: 10.1038/S41576-023-00599-5

Publication

Dynamics of human monocytes and airway macrophages during healthy aging and after transplant

Publisher: Rockefeller University Press

Date: 09-01-2020

DOI: 10.1084/JEM.20191236

Abstract: The ontogeny of airway macrophages (AMs) in human lung and their contribution to disease are poorly mapped out. In mice, aging is associated with an increasing proportion of peripherally, as opposed to perinatally derived AMs. We sought to understand AM ontogeny in human lung during healthy aging and after transplant. We characterized monocyte/macrophage populations from the peripheral blood and airways of healthy volunteers across infancy/childhood (2–12 yr), maturity (20–50 yr), and older adulthood (& yr). Single-cell RNA sequencing (scRNA-seq) was performed on airway inflammatory cells isolated from sex-mismatched lung transplant recipients. During healthy aging, the proportions of blood and bronchoalveolar lavage (BAL) classical monocytes peak in adulthood and decline in older adults. scRNA-seq of BAL cells from lung transplant recipients indicates that after transplant, the majority of AMs are recipient derived. These data show that during aging, the peripheral monocyte phenotype is consistent with that found in the airways and, furthermore, that the majority of human AMs after transplant are derived from circulating monocytes.

Publication

Optimal use of regression models in genome‐wide association studies

Publisher: Wiley

Date: 21-07-2011

DOI: 10.1111/J.1365-2052.2011.02234.X

Abstract: The performance of linear regression models in genome-wide association studies is influenced by how marker information is parameterized in the model. Considering the impact of parameterization is especially important when using information from multiple markers to test for association. Properties of the population, such as linkage disequilibrium (LD) and allele frequencies, will also affect the ability of a model to provide statistical support for an underlying quantitative trait locus (QTL). Thus, for a given location in the genome, the relationship between population properties and model parameterization is expected to influence the performance of the model in providing evidence for the position of a QTL. As LD and allele frequencies vary throughout the genome and between populations, understanding the relationship between these properties and model parameterization is of considerable importance in order to make optimal use of available genomic data. Here, we evaluate the performance of regression-based association models using genotype and haplotype information across the full spectrum of allele frequency and LD scenarios. Genetic marker data from 200 broiler chickens were used to simulate genomic conditions by selecting individual markers to act as surrogate QTL (sQTL) and then investigating the ability of surrounding markers to estimate sQTL genotypes and provide statistical support for their location. The LD and allele frequencies of markers and sQTL are shown to have a strong effect on the performance of models relative to one another. Our results provide an indication of the best choice of model parameterization given certain scenarios of marker and QTL LD and allele frequencies. We demonstrate a clear advantage of haplotype-based models, which account for phase uncertainty over other models tested, particularly for QTL with low minor allele frequencies. We show that the greatest advantage of haplotype models over single-marker models occurs when LD between markers and the causal locus is low. Under these situations, haplotype models have a greater accuracy of predicting the location of the QTL than other models tested.

Publication

Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood

Publisher: Springer Science and Business Media LLC

Date: 11-06-2018

DOI: 10.1038/S41467-018-04558-1

Abstract: Understanding the difference in genetic regulation of gene expression between brain and blood is important for discovering genes for brain-related traits and disorders. Here, we estimate the correlation of genetic effects at the top-associated cis-expression or -DNA methylation (DNAm) quantitative trait loci (cis-eQTLs or cis-mQTLs) between brain and blood ($r_b$). Using publicly available data, we find that genetic effects at the top cis-eQTLs or mQTLs are highly correlated between independent brain and blood samples ($\hat{r}_b = 0.70$ for cis-eQTLs and $\hat{r}_b = 0.78$ for cis-mQTLs). Using meta-analyzed brain cis-eQTL/mQTL data (n = 526 to 1194), we identify 61 genes and 167 DNAm sites associated with four brain-related phenotypes, most of which are a subset of the discoveries (97 genes and 295 DNAm sites) using data from blood with larger sample sizes (n = 1980 to 14,115). Our results demonstrate the gain of power in gene discovery for brain-related phenotypes using blood cis-eQTL/mQTL data with large sample sizes.

Publication

The single-cell eQTLGen consortium

Publisher: eLife Sciences Publications, Ltd

Date: 09-03-2020

DOI: 10.7554/ELIFE.52155

Abstract: In recent years, functional genomics approaches combining genetic information with bulk RNA-sequencing data have identified the downstream expression effects of disease-associated genetic risk factors through so-called expression quantitative trait locus (eQTL) analysis. Single-cell RNA-sequencing creates enormous opportunities for mapping eQTLs across different cell types and in dynamic processes, many of which are obscured when using bulk methods. Rapid increase in throughput and reduction in cost per cell now allow this technology to be applied to large-scale population genetics studies. To fully leverage these emerging data resources, we have founded the single-cell eQTLGen consortium (sc-eQTLGen), aimed at pinpointing the cellular contexts in which disease-causing genetic variants affect gene expression. Here, we outline the goals, approach and potential utility of the sc-eQTLGen consortium. We also provide a set of study design considerations for future single-cell eQTL studies.

Publication

Autosomal genetic control of human gene expression does not differ across the sexes

Publisher: Springer Science and Business Media LLC

Date: 12-2016

DOI: 10.1186/S13059-016-1111-0

Publication

Single cell eQTL analysis identifies cell type-specific genetic control of gene expression in fibroblasts and reprogrammed induced pluripotent stem cells

Publisher: Springer Science and Business Media LLC

Date: 05-03-2021

DOI: 10.1186/S13059-021-02293-3

Abstract: The discovery that somatic cells can be reprogrammed to induced pluripotent stem cells (iPSCs) has provided a foundation for in vitro human disease modelling, drug development and population genetics studies. Gene expression plays a critical role in complex disease risk and therapeutic response. However, while the genetic background of reprogrammed cell lines has been shown to strongly influence gene expression, the effect has not been evaluated at the level of individual cells which would provide significant resolution. By integrating single cell RNA-sequencing (scRNA-seq) and population genetics, we apply a framework in which to evaluate cell type-specific effects of genetic variation on gene expression. Here, we perform scRNA-seq on 64,018 fibroblasts from 79 donors and map expression quantitative trait loci (eQTLs) at the level of individual cell types. We demonstrate that the majority of eQTLs detected in fibroblasts are specific to an individual cell subtype. To address if the allelic effects on gene expression are maintained following cell reprogramming, we generate scRNA-seq data in 19,967 iPSCs from 31 reprogrammed donor lines. We again identify highly cell type-specific eQTLs in iPSCs and show that the eQTLs in fibroblasts almost entirely disappear during reprogramming. This work provides an atlas of how genetic variation influences gene expression across cell subtypes and provides evidence for patterns of genetic architecture that lead to cell type-specific eQTL effects.

Publication

DNA methylation is required to maintain both DNA replication timing precision and 3D genome organization integrity

Publisher: Elsevier BV

Date: 09-2021

DOI: 10.1016/J.CELREP.2021.109722

Abstract: DNA replication timing and three-dimensional (3D) genome organization are associated with distinct epigenome patterns across large domains. However, whether alterations in the epigenome, in particular cancer-related DNA hypomethylation, affects higher-order levels of genome architecture is still unclear. Here, using Repli-Seq, single-cell Repli-Seq, and Hi-C, we show that genome-wide methylation loss is associated with both concordant loss of replication timing precision and deregulation of 3D genome organization. Notably, we find distinct disruption in 3D genome compartmentalization, striking gains in cell-to-cell replication timing heterogeneity and loss of allelic replication timing in cancer hypomethylation models, potentially through the gene deregulation of DNA replication and genome organization pathways. Finally, we identify ectopic H3K4me3-H3K9me3 domains from across large hypomethylated domains, where late replication is maintained, which we purport serves to protect against catastrophic genome reorganization and aberrant gene transcription. Our results highlight a potential role for the methylome in the maintenance of 3D genome regulation.

Publication

Systematic identification of trans eQTLs as putative drivers of known disease associations

Publisher: Springer Science and Business Media LLC

Date: 08-09-2013

DOI: 10.1038/NG.2756

Publication

Single cell analysis of lymphatic endothelial cell fate specification and differentiation during zebrafish development

Publisher: Cold Spring Harbor Laboratory

Date: 11-02-2022

DOI: 10.1101/2022.02.10.479999

Abstract: During development, the lymphatic vasculature forms as a second, new vascular network derived from blood vessels. The transdifferentiation of embryonic venous endothelial cells (VECs) into lymphatic endothelial cells (LECs) is the first step in this process. Specification, differentiation and maintenance of LEC fate are all driven by the transcription factor Prox1, yet downstream mechanisms remain to be elucidated. We present a single cell transcriptomic atlas of lymphangiogenesis in zebrafish revealing new markers and hallmarks of LEC differentiation over four developmental stages. We further profile single cell transcriptomic and chromatin accessibility changes in zygotic prox1a mutants that are undergoing a VEC-LEC fate reversion during differentiation. Using maternal and zygotic prox1a/prox1b mutants, we determine the earliest transcriptomic changes directed by Prox1 during LEC specification. This work altogether reveals new transcriptional targets and regulatory regions of the genome downstream of Prox1 in LEC maintenance, as well as showing that Prox1 specifies LEC fate primarily by limiting blood vascular and hematopoietic fate. This extensive single cell resource provides new mechanistic insights into the enigmatic role of Prox1 and the control of LEC differentiation in development.

Publication

scPred: accurate supervised method for cell-type classification from single-cell RNA-seq data

Publisher: Springer Science and Business Media LLC

Date: 12-2019

DOI: 10.1186/S13059-019-1862-5

Abstract: Single-cell RNA sequencing has enabled the characterization of highly specific cell types in many tissues, as well as both primary and stem cell-derived cell lines. An important facet of these studies is the ability to identify the transcriptional signatures that define a cell type or state. In theory, this information can be used to classify an individual cell based on its transcriptional profile. Here, we present scPred, a new generalizable method that is able to provide highly accurate classification of single cells, using a combination of unbiased feature selection from a reduced-dimension space and a machine-learning probability-based prediction method. We apply scPred to scRNA-seq data from pancreatic tissue, mononuclear cells, colorectal tumor biopsies, and circulating dendritic cells and show that scPred is able to classify individual cells with high accuracy. The generalized method is available at owellgenomicslab/scPred/ .

Publication

Single-Cell Profiling Identifies Key Pathways Expressed by iPSCs Cultured in Different Commercial Media

Publisher: Elsevier BV

Date: 09-2018

DOI: 10.1016/J.ISCI.2018.08.016

Publication

DropletQC: improved identification of empty droplets and damaged cells in single-cell RNA-seq data

Publisher: Springer Science and Business Media LLC

Date: 12-2021

DOI: 10.1186/S13059-021-02547-0

Abstract: Advances in droplet-based single-cell RNA-sequencing (scRNA-seq) have dramatically increased throughput, allowing tens of thousands of cells to be routinely sequenced in a single experiment. In addition to cells, droplets capture cell-free “ambient” RNA predominantly caused by lysis of cells during sample preparation. Samples with high ambient RNA concentration can create challenges in accurately distinguishing cell-containing droplets and droplets containing ambient RNA. Current methods to separate these groups often retain a significant number of droplets that do not contain cells or empty droplets. Additionally, there are currently no methods available to detect droplets containing damaged cells, which comprise partially lysed cells, the original source of the ambient RNA. Here, we describe DropletQC, a new method that is able to detect empty droplets, damaged, and intact cells, and accurately distinguish them from one another. This approach is based on a novel quality control metric, the nuclear fraction, which quantifies for each droplet the fraction of RNA originating from unspliced, nuclear pre-mRNA. We demonstrate how DropletQC provides a powerful extension to existing computational methods for identifying empty droplets such as EmptyDrops. We implement DropletQC as an R package, which can be easily integrated into existing single-cell analysis workflows.
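
The nuclear fraction described in the abstract above can be illustrated with a minimal Python sketch. It is not the DropletQC R package or its API. It assumes that each read has already been labelled with a droplet barcode and a region ("intron" or "exon"), and it uses intronic reads as a stand-in for unspliced pre-mRNA. The function name `nuclear_fraction` and the example barcodes are hypothetical.

```python
from collections import defaultdict


def nuclear_fraction(reads):
    """Return {barcode: intronic reads / (intronic + exonic reads)}.

    `reads` is an iterable of (barcode, region) pairs. Only regions labelled
    "intron" or "exon" are counted; anything else is ignored. This labelling
    scheme is an assumption made for illustration.
    """
    intronic = defaultdict(int)
    total = defaultdict(int)
    for barcode, region in reads:
        if region not in ("intron", "exon"):
            continue  # skip reads that map outside annotated genes
        total[barcode] += 1
        if region == "intron":
            intronic[barcode] += 1
    # Barcodes with no intronic or exonic reads never enter `total`,
    # so the division below cannot hit zero.
    return {barcode: intronic[barcode] / total[barcode] for barcode in total}


# Hypothetical toy input: three droplet barcodes.
example_reads = [
    ("AAAC", "intron"), ("AAAC", "exon"), ("AAAC", "intron"), ("AAAC", "exon"),
    ("TTTG", "exon"), ("TTTG", "exon"),
    ("GGGA", "intergenic"),
]
fractions = nuclear_fraction(example_reads)
print(fractions)
```

In this toy input, barcode AAAC has two intronic reads out of four counted reads, TTTG has none out of two, and GGGA is dropped because it has no intronic or exonic reads.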

Publication

Integrating single-cell genomics pipelines to discover mechanisms of stem cell differentiation

Publisher: Elsevier BV

Date: 12-2021

DOI: 10.1016/J.MOLMED.2021.09.006

Abstract: Pluripotent stem cells underpin a growing sector that leverages their differentiation potential for research, industry, and clinical applications. This review evaluates the landscape of methods in single-cell transcriptomics that are enabling accelerated discovery in stem cell science. We focus on strategies for scaling stem cell differentiation through multiplexed single-cell analyses, for evaluating molecular regulation of cell differentiation using new analysis algorithms, and methods for integration and projection analysis to classify and benchmark stem cell derivatives against in vivo cell types. By discussing the available methods, comparing their strengths, and illustrating strategies for developing integrated analysis pipelines, we provide user considerations to inform their implementation and interpretation.

Publication

A single‐cell transcriptome atlas of the adult human retina

Publisher: EMBO

Date: 22-08-2019

DOI: 10.15252/EMBJ.2018100811

Publication

The Brisbane systems genetics study: Genetical genomics meets complex trait genetics

Publisher: Public Library of Science (PLoS)

Date: 26-04-2012

DOI: 10.1371/JOURNAL.PONE.0035430

Publication

Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression

Publisher: Springer Science and Business Media LLC

Date: 09-2021

DOI: 10.1038/S41588-021-00913-Z

Publication

Biological insights from 108 schizophrenia-associated genetic loci

Publisher: Springer Science and Business Media LLC

Date: 07-2014

DOI: 10.1038/NATURE13595

Publication

Genetic variation affects morphological retinal phenotypes extracted from UK Biobank optical coherence tomography images

Publisher: Public Library of Science (PLoS)

Date: 12-05-2021

DOI: 10.1371/JOURNAL.PGEN.1009497

Abstract: Optical Coherence Tomography (OCT) enables non-invasive imaging of the retina and is used to diagnose and manage ophthalmic diseases including glaucoma. We present the first large-scale genome-wide association study of inner retinal morphology using phenotypes derived from OCT images of 31,434 UK Biobank participants. We identify 46 loci associated with thickness of the retinal nerve fibre layer or ganglion cell inner plexiform layer. Only one of these loci has been associated with glaucoma, and despite its clear role as a biomarker for the disease, Mendelian randomisation does not support inner retinal thickness being on the same genetic causal pathway as glaucoma. We extracted overall retinal thickness at the fovea, representative of foveal hypoplasia, with which three of the 46 SNPs were associated. We additionally associate these three loci with visual acuity. In contrast to the Mendelian causes of severe foveal hypoplasia, our results suggest a spectrum of foveal hypoplasia, in part genetically determined, with consequences on visual function.

Publication

Neonatal DNA methylation profile in human twins is specified by a complex interplay between intrauterine environmental and genetic factors, subject to tissue-specific influence

Publisher: Cold Spring Harbor Laboratory

Date: 16-07-2012

DOI: 10.1101/GR.136598.111

Abstract: Comparison between groups of monozygotic (MZ) and dizygotic (DZ) twins enables an estimation of the relative contribution of genetic and shared and nonshared environmental factors to phenotypic variability. Using DNA methylation profiling of ∼20,000 CpG sites as a phenotype, we have examined discordance levels in three neonatal tissues from 22 MZ and 12 DZ twin pairs. MZ twins exhibit a wide range of within-pair differences at birth, but show discordance levels generally lower than DZ pairs. Within-pair methylation discordance was lowest in CpG islands in all twins and increased as a function of distance from islands. Variance component decomposition analysis of DNA methylation in MZ and DZ pairs revealed a low mean heritability across all tissues, although a wide range of heritabilities was detected for specific genomic CpG sites. The largest component of variation was attributed to the combined effects of nonshared intrauterine environment and stochastic factors. Regression analysis of methylation on birth weight revealed a general association between methylation of genes involved in metabolism and biosynthesis, providing further support for epigenetic change in the previously described link between low birth weight and increasing risk for cardiovascular, metabolic, and other complex diseases. Finally, comparison of our data with that of several older twins revealed little evidence for genome-wide epigenetic drift with increasing age. This is the first study to analyze DNA methylation on a genome scale in twins at birth, further highlighting the importance of the intrauterine environment on shaping the neonatal epigenome.

Publication

Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution

Publisher: Cold Spring Harbor Laboratory

Date: 20-03-2023

DOI: 10.1101/2023.03.14.23287257

Abstract: The human leukocyte antigen (HLA) locus plays a critical role in complex traits spanning autoimmune and infectious diseases, transplantation, and cancer. While coding variation in HLA genes has been extensively documented, regulatory genetic variation modulating HLA expression levels has not been comprehensively investigated. Here, we mapped expression quantitative trait loci (eQTLs) for classical HLA genes across 1,073 individuals and 1,131,414 single cells from three tissues, using personalized reference genomes to mitigate technical confounding. We identified cell-type-specific cis-eQTLs for every classical HLA gene. Modeling eQTLs at single-cell resolution revealed that many eQTL effects are dynamic across cell states even within a cell type. HLA-DQ genes exhibit particularly cell-state-dependent effects within myeloid, B, and T cells. Dynamic HLA regulation may underlie important interindividual variability in immune responses.

Publication

An integrated cell barcoding and computational analysis pipeline for scalable analysis of differentiation at single-cell resolution

Publisher: Cold Spring Harbor Laboratory

Date: 14-10-2022

DOI: 10.1101/2022.10.12.511862

Abstract: This study develops a versatile cell multiplexing and data analysis platform to gain insight into mechanisms of cell differentiation. We engineer a cell barcoding system in human cells enabling multiplexed single-cell RNA sequencing for high throughput perturbation of customisable and diverse experimental conditions. This is coupled with a new computational analysis pipeline that overcomes the limitations of conventional algorithms by using an unsupervised, genome-wide, orthogonal biological reference point to reveal the cell diversity and regulatory networks in the input scRNA-seq data set. We implement this pipeline by engineering transcribed barcodes into induced pluripotent stem cells and multiplex 62 independent experimental conditions comprising eight differentiation time points and nine developmental signalling perturbations in duplicates. We identify and deconstruct the temporal, signalling, and gene regulatory imperatives of iPSC differentiation into cell types of ectoderm, mesoderm, and endoderm lineages. This study provides a cellular and computational pipeline to study cell differentiation applicable to studies in developmental biology, drug discovery, and disease modelling.

Publication

Single cell RNA sequencing of stem cell-derived retinal ganglion cells

Publisher: Springer Science and Business Media LLC

Date: 13-02-2018

DOI: 10.1038/SDATA.2018.13

Abstract: We used single cell sequencing technology to characterize the transcriptomes of 1,174 human embryonic stem cell-derived retinal ganglion cells (RGCs) at the single cell level. The human embryonic stem cell line BRN3B-mCherry (A81-H7), was differentiated to RGCs using a guided differentiation approach. Cells were harvested at day 36 and prepared for single cell RNA sequencing. Our data indicates the presence of three distinct subpopulations of cells, with various degrees of maturity. One cluster of 288 cells showed increased expression of genes involved in axon guidance together with semaphorin interactions, cell-extracellular matrix interactions and ECM proteoglycans, suggestive of a more mature RGC phenotype.

Publication

propeller: testing for differences in cell type proportions in single cell data

Publisher: Oxford University Press (OUP)

Date: 25-08-2022

DOI: 10.1093/BIOINFORMATICS/BTAC582

Abstract: Single cell RNA-Sequencing (scRNA-seq) has rapidly gained popularity over the last few years for profiling the transcriptomes of thousands to millions of single cells. This technology is now being used to analyse experiments with complex designs including biological replication. One question that can be asked from single cell experiments, which has been difficult to directly address with bulk RNA-seq data, is whether the cell type proportions are different between two or more experimental conditions. As well as gene expression changes, the relative depletion or enrichment of a particular cell type can be the functional consequence of disease or treatment. However, cell type proportion estimates from scRNA-seq data are variable and statistical methods that can correctly account for different sources of variability are needed to confidently identify statistically significant shifts in cell type composition between experimental conditions. We have developed propeller, a robust and flexible method that leverages biological replication to find statistically significant differences in cell type proportions between groups. Using simulated cell type proportions data, we show that propeller performs well under a variety of scenarios. We applied propeller to test for significant changes in cell type proportions related to human heart development, ageing and COVID-19 disease severity. The propeller method is publicly available in the open source speckle R package (hipsonlab/speckle). All the analysis code for the article is available at the associated analysis website: phipsonlab.github.io ropeller-paper-analysis/. The speckle package, analysis scripts and datasets have been deposited at 0.5281/zenodo.7009042. Supplementary data are available at Bioinformatics online.

Publication

Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets

Publisher: Springer Science and Business Media LLC

Date: 28-03-2016

DOI: 10.1038/NG.3538

Abstract: Genome-wide association studies (GWAS) have identified thousands of genetic variants associated with human complex traits. However, the genes or functional DNA elements through which these variants exert their effects on the traits are often unknown. We propose a method (called SMR) that integrates summary-level data from GWAS with data from expression quantitative trait locus (eQTL) studies to identify genes whose expression levels are associated with a complex trait because of pleiotropy. We apply the method to five human complex traits using GWAS data on up to 339,224 individuals and eQTL data on 5,311 individuals, and we prioritize 126 genes (for example, TRAF1 and ANKRD55 for rheumatoid arthritis and SNX19 and NMRAL1 for schizophrenia), of which 25 genes are new candidates; 77 genes are not the nearest annotated gene to the top associated GWAS SNP. These genes provide important leads to design future functional studies to understand the mechanism whereby DNA variation leads to complex trait variation.

Publication

DIRC3-IGFBP5 is a shared genetic risk locus and therapeutic target for carpal tunnel syndrome and trigger finger

Publisher: Cold Spring Harbor Laboratory

Date: 11-10-2021

DOI: 10.1101/2021.10.07.21264697

Abstract: Trigger finger (TF) and carpal tunnel syndrome (CTS) are two common non-traumatic hand disorders that frequently co-occur. By identifying TF and CTS cases in UK Biobank (UKB), we confirmed a highly significant phenotypic association between the diseases. To investigate the genetic basis for this association we performed a genome-wide association study (GWAS) including 2,908 TF cases and 436,579 European controls in UKB, identifying five independent loci. Colocalization with CTS summary statistics identified a co-localized locus at DIRC3 (lncRNA), which was replicated in FinnGen and fine-mapped to rs62175241. Single-cell and bulk eQTL analysis in fibroblasts from healthy donors (n=79) and tenosynovium samples from CTS patients (n=77) showed that the disease-protective rs62175241 allele was associated with increased DIRC3 and IGFBP5 expression. IGFBP5 is a secreted antagonist of IGF-1 signaling, and elevated IGF-1 levels were associated with CTS and TF in UKB, thereby implicating IGF-1 as a driver of both diseases.

Publication

Single-Cell Transcriptional Profiling of Aortic Endothelium Identifies a Hierarchy from Endovascular Progenitors to Differentiated Cells

Publisher: Elsevier BV

Date: 05-2019

DOI: 10.1016/J.CELREP.2019.04.102

Abstract: The cellular and molecular profiles that govern the endothelial heterogeneity of the circulatory system have yet to be elucidated. Using a data-driven approach to study the endothelial compartment via single-cell RNA sequencing, we characterized cell subpopulations within and assigned them to a defined endothelial hierarchy. We show that two transcriptionally distinct endothelial populations exist within the aorta and, using two independent trajectory analysis methods, confirm that they represent transitioning cells rather than discrete cell types. Gene co-expression analysis revealed crucial regulatory networks underlying each population, including significant metabolic gene networks in progenitor cells. Using mitochondrial activity assays and phenotyping, we confirm that endovascular progenitors display higher mitochondrial content compared to differentiated endothelial cells. The identities of these populations were further validated against bulk RNA sequencing (RNA-seq) data obtained from normal and tumor-derived vasculature. Our findings validate the heterogeneity of the aortic endothelium and previously suggested hierarchy between progenitor and differentiated cells.

Publication

Distinct Brainstem and Forebrain Circuits Receiving Tracheal Sensory Neuron Inputs Revealed Using a Novel Conditional Anterograde Transsynaptic Viral Tracing System

Publisher: Society for Neuroscience

Date: 06-05-2015

DOI: 10.1523/JNEUROSCI.5128-14.2015

Abstract: Sensory nerves innervating the mucosa of the airways monitor the local environment for the presence of irritant stimuli and, when activated, provide input to the nucleus of the solitary tract (Sol) and paratrigeminal nucleus (Pa5) in the medulla to drive a variety of protective behaviors. Accompanying these behaviors are perceivable sensations that, particularly for stimuli in the proximal end of the airways, can be discrete and localizable. Airway sensations likely reflect the ascending airway sensory circuitry relayed via the Sol and Pa5, which terminates broadly throughout the CNS. However, the relative contribution of the Sol and Pa5 to these ascending pathways is not known. In the present study, we developed and characterized a novel conditional anterograde transneuronal viral tracing system based on the H129 strain of herpes simplex virus 1 and used this system in rats along with conventional neuroanatomical tracing with cholera toxin B to identify subcircuits in the brainstem and forebrain that are in receipt of relayed airway sensory inputs via the Sol and Pa5. We show that both the Pa5 and proximal airways disproportionately receive afferent terminals arising from the jugular (rather than nodose) vagal ganglia and the output of the Pa5 is predominately directed toward the ventrobasal thalamus. We propose the existence of a somatosensory-like pathway from the proximal airways involving jugular ganglia afferents, the Pa5, and the somatosensory thalamus and suggest that this pathway forms the anatomical framework for sensations arising from the proximal airway mucosa.

Publication

Shared genetic control of expression and methylation in peripheral blood

Publisher: Springer Science and Business Media LLC

Date: 06-04-2016

DOI: 10.1186/S12864-016-2498-4

Publication

Inference of the Genetic Architecture Underlying BMI and Height with the Use of 20,240 Sibling Pairs

Publisher: Elsevier BV

Date: 11-2013

DOI: 10.1016/J.AJHG.2013.10.005

Publication

No evidence that plasmablasts transdifferentiate into developing neutrophils in severe COVID‐19 disease

Publisher: Wiley

Date: 2021

DOI: 10.1002/CTI2.1308

Abstract: A recent single‐cell RNA sequencing study by Wilk et al. suggested that plasmablasts can transdifferentiate into ‘developing neutrophils’ in patients with severe COVID‐19 disease. We explore the evidence for this. We downloaded the original data and code used by the authors in their study to replicate their findings and explore the possibility that regressing out variables may have led the authors to overfit their data. The lineage relationship between plasmablasts and developing neutrophils breaks down when key features are not regressed out, and the data are not overfitted during the analysis. Plasmablasts do not transdifferentiate into developing neutrophils. Single‐cell RNA sequencing is a powerful technique for biological discovery and hypothesis generation. However, caution should be exercised in the bioinformatic analysis and interpretation of the data, and findings should be cross‐validated by orthogonal techniques.

Publication

Retinal ganglion cell-specific genetic regulation in primary open angle glaucoma

Publisher: Cold Spring Harbor Laboratory

Date: 14-07-2021

DOI: 10.1101/2021.07.14.452417

Abstract: To assess the transcriptomic profile of disease-specific cell populations, fibroblasts from patients with primary open-angle glaucoma (POAG) were reprogrammed into induced pluripotent stem cells (iPSCs) before being differentiated into retinal organoids and compared to those from healthy individuals. We performed single-cell RNA-sequencing of a total of 330,569 cells and identified cluster-specific molecular signatures. Comparing the gene expression profile between cases and controls, we identified novel genetic associations for this blinding disease. Expression quantitative trait mapping identified a total of 2,235 significant loci across all cell types, 58 of which are specific to the retinal ganglion cell subpopulations, which ultimately degenerate in POAG. Transcriptome-wide association analysis identified genes at loci previously associated with POAG, and analysis, conditional on disease status, implicated 54 statistically significant retinal ganglion cell-specific expression quantitative trait loci. This work highlights the power of large-scale iPSC studies to uncover context-specific profiles for a genetically complex disease.

Publication

Retinal ganglion cell-specific genetic regulation in primary open-angle glaucoma

Publisher: Elsevier BV

Date: 06-2022

DOI: 10.1016/J.XGEN.2022.100142

Publication

Nebulosa recovers single-cell gene expression signals by kernel density estimation

Publisher: Oxford University Press (OUP)

Date: 18-01-2021

DOI: 10.1093/BIOINFORMATICS/BTAB003

Abstract: Data sparsity in single-cell experiments prevents an accurate assessment of gene expression when visualized in a low-dimensional space. Here, we introduce Nebulosa, an R package that uses weighted kernel density estimation to recover signals lost through drop-out or low expression. Nebulosa can be easily installed from owellgenomicslab/Nebulosa. Supplementary data are available at Bioinformatics online.
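
The idea of weighted kernel density estimation mentioned in the abstract above can be sketched in a few lines of Python. This is a generic Gaussian-kernel estimator and not Nebulosa's implementation. It assumes each cell has a 2D embedding coordinate and that the expression of one gene is used as the cell's weight. The function name `weighted_kde`, the fixed `bandwidth`, and the toy data are hypothetical.

```python
import math


def weighted_kde(points, weights, query, bandwidth=1.0):
    """Weighted 2D Gaussian kernel density estimate evaluated at `query`.

    points  : list of (x, y) cell coordinates in a low-dimensional embedding
    weights : one non-negative weight per cell (here: expression of one gene)
    query   : (x, y) location at which to evaluate the density
    """
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # gene not detected in any cell
    # Normalising constant of an isotropic 2D Gaussian kernel.
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    acc = 0.0
    for (x, y), w in zip(points, weights):
        sq_dist = (x - query[0]) ** 2 + (y - query[1]) ** 2
        acc += w * math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return norm * acc / total_weight


# Hypothetical toy embedding: two tight groups of cells, gene detected in
# only some cells of the first group (zeros mimic drop-out).
cells = [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1), (5.0, 5.0), (5.1, 4.9)]
expression = [3.0, 0.0, 2.0, 0.0, 0.0]
density_near_group1 = weighted_kde(cells, expression, (0.2, 0.1))
density_near_group2 = weighted_kde(cells, expression, (5.0, 5.0))
print(density_near_group1, density_near_group2)
```

The second cell has zero counts for the gene, yet the density evaluated at its position is high because its neighbours express the gene. Under these assumptions, that is how a smoothed signal can remain visible despite sparse counts.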

Publication

Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease.

Publisher: American Association for the Advancement of Science (AAAS)

Date: 08-04-2022

DOI: 10.1126/SCIENCE.ABF3041

Abstract: The human immune system displays substantial variation between individuals, leading to differences in susceptibility to autoimmune disease. We present single-cell RNA sequencing (scRNA-seq) data from 1,267,758 peripheral blood mononuclear cells from 982 healthy human subjects. For 14 cell types, we identified 26,597 independent cis-expression quantitative trait loci (eQTLs) and 990 trans-eQTLs, with most showing cell type-specific effects on gene expression. We subsequently show how eQTLs have dynamic allelic effects in B cells that are transitioning from naïve to memory states and demonstrate how commonly segregating alleles lead to interindividual variation in immune function. Finally, using a Mendelian randomization approach, we identify the causal route by which 305 risk loci contribute to autoimmune disease at the cellular level. This work brings together genetic epidemiology with scRNA-seq to uncover drivers of interindividual variation in the immune system.

Publication

Genetic correlations reveal the shared genetic architecture of transcription in human peripheral blood

Publisher: Springer Science and Business Media LLC

Date: 07-09-2017

DOI: 10.1038/S41467-017-00473-Z

Abstract: Transcript co-expression is regulated by a combination of shared genetic and environmental factors. Here, we estimate the proportion of co-expression that is due to shared genetic variance. To do so, we estimated the genetic correlations between each pairwise combination of 2469 transcripts that are highly heritable and expressed in whole blood in 1748 unrelated individuals of European ancestry. We identify 556 pairs with a significant genetic correlation of which 77% are located on different chromosomes, and report 934 expression quantitative trait loci, identified in an independent cohort, with significant effects on both transcripts in a genetically correlated pair. We show significant enrichment for transcription factor control and physical proximity through chromatin interactions as possible mechanisms of shared genetic control. Finally, we construct networks of interconnected transcripts and identify their underlying biological functions. Using genetic correlations to investigate transcriptional co-regulation provides valuable insight into the nature of the underlying genetic architecture of gene regulation.

Publication

Gene transcripts associated with muscle strength: a CHARGE meta-analysis of 7,781 persons

Publisher: American Physiological Society

Date: 2016

DOI: 10.1152/PHYSIOLGENOMICS.00054.2015

Abstract: Lower muscle strength in midlife predicts disability and mortality in later life. Blood-borne factors, including growth differentiation factor 11 (GDF11), have been linked to muscle regeneration in animal models. We aimed to identify gene transcripts associated with muscle strength in adults. Meta-analysis of whole blood gene expression (overall 17,534 unique genes measured by microarray) and hand-grip strength in four independent cohorts (n = 7,781, ages: 20–104 yr, weighted mean = 56), adjusted for age, sex, height, weight, and leukocyte subtypes. Separate analyses were performed in subsets (older/younger than 60, men/women). Expression levels of 221 genes were associated with strength after adjustment for cofactors and for multiple statistical testing, including ALAS2 (rate-limiting enzyme in heme synthesis), PRF1 (perforin, a cytotoxic protein associated with inflammation), IGF1R, and IGF2BP2 (both insulin like growth factor related). We identified statistical enrichment for hemoglobin biosynthesis, innate immune activation, and the stress response. Ten genes were associated only in younger individuals, four in men only and one in women only. For example, PIK3R2 (a negative regulator of PI3K/AKT growth pathway) was negatively associated with muscle strength in younger (<60 yr) individuals but not older (≥60 yr). We also show that 115 genes (52%) have not previously been linked to muscle in NCBI PubMed abstracts. This first large-scale transcriptome study of muscle strength in human adults confirmed associations with known pathways and provides new evidence for over half of the genes identified. There may be age- and sex-specific gene expression signatures in blood for muscle strength.

Publication

Signatures of negative selection in the genetic architecture of human complex traits.

Publisher: Springer Science and Business Media LLC

Date: 16-04-2018

DOI: 10.1038/S41588-018-0101-4

Abstract: We develop a Bayesian mixed linear model that simultaneously estimates single-nucleotide polymorphism (SNP)-based heritability, polygenicity (proportion of SNPs with nonzero effects), and the relationship between SNP effect size and minor allele frequency for complex traits in conventionally unrelated individuals using genome-wide SNP data. We apply the method to 28 complex traits in the UK Biobank data (N = 126,752) and show that on average, 6% of SNPs have nonzero effects, which in total explain 22% of phenotypic variance. We detect significant (P < 0.05/28) signatures of natural selection in the genetic architecture of 23 traits, including reproductive, cardiovascular, and anthropometric traits, as well as educational attainment. The significant estimates of the relationship between effect size and minor allele frequency in complex traits are consistent with a model of negative (or purifying) selection, as confirmed by forward simulation. We conclude that negative selection acts pervasively on the genetic variants associated with human complex traits.

Publication

Single-cell RNA-seq of human induced pluripotent stem cells reveals cellular heterogeneity and cell state transitions between subpopulations

Publisher: Cold Spring Harbor Laboratory

Date: 11-05-2018

DOI: 10.1101/GR.223925.117

Abstract: Heterogeneity of cell states represented in pluripotent cultures has not been described at the transcriptional level. Since gene expression is highly heterogeneous between cells, single-cell RNA sequencing can be used to identify how individual pluripotent cells function. Here, we present results from the analysis of single-cell RNA sequencing data from 18,787 individual WTC-CRISPRi human induced pluripotent stem cells. We developed an unsupervised clustering method and, through this, identified four subpopulations distinguishable on the basis of their pluripotent state, including a core pluripotent population (48.3%), proliferative (47.8%), early primed for differentiation (2.8%), and late primed for differentiation (1.1%). For each subpopulation, we were able to identify the genes and pathways that define differences in pluripotent cell states. Our method identified four transcriptionally distinct predictor gene sets composed of 165 unique genes that denote the specific pluripotency states; using these sets, we developed a multigenic machine learning prediction method to accurately classify single cells into each of the subpopulations. Compared against a set of established pluripotency markers, our method increases prediction accuracy by 10%, specificity by 20%, and explains a substantially larger proportion of deviance (up to threefold) from the prediction model. Finally, we developed an innovative method to predict cells transitioning between subpopulations and support our conclusions with results from two orthogonal pseudotime trajectory methods.

Publication

The genetic regulation of transcription in human endometrial tissue

Publisher: Oxford University Press (OUP)

Date: 08-02-2017

DOI: 10.1093/HUMREP/DEX006

Abstract: Do genetic effects regulate gene expression in human endometrium? This study demonstrated strong genetic effects on endometrial gene expression and some evidence for genetic regulation of gene expression in a menstrual cycle stage-specific manner. Genetic effects on expression levels for many genes are tissue specific. Endometrial gene expression varies across menstrual cycle stages and between individuals, but there are limited data on genetic control of expression in endometrium. We analysed genome-wide genotype and gene expression data to map cis expression quantitative trait loci (eQTL) in endometrium. We recruited 123 women of European ancestry. DNA samples from blood were genotyped on Illumina HumanCoreExome chips. Total RNA was extracted from endometrial tissues. Whole-transcriptome profiles were characterized using Illumina Human HT-12 v4.0 Expression Beadchips. We performed eQTL mapping with ~8 000 000 genotyped and imputed single nucleotide polymorphisms (SNPs) and 12 329 genes. We identified a total of 18 595 cis SNP-probe associations at a study-wide level of significance (P < 1 × 10⁻⁷), which correspond to independent eQTLs for 198 unique genes. The eQTLs with the largest effect in endometrial tissue were rs4902335 for CHURC1 (P = 1.05 × 10⁻³²) and rs147253019 for ZP3 (P = 8.22 × 10⁻³⁰). We further performed a context-specific eQTL analysis to investigate if genetic effects on gene expression regulation act in a menstrual cycle-specific manner. Interestingly, five cis-eQTLs were identified with a significant stage-by-genotype interaction. The strongest stage interaction was the eQTL for C10ORF33 (PYROXD2) with SNP rs2296438 (P = 2.0 × 10⁻⁴), where we observe a 2-fold difference in the average expression levels of heterozygous samples depending on the stage of the menstrual cycle. The summary eQTL results are publicly available to browse or download. A limitation of the present study was the relatively modest sample size. 
It was not powered to identify trans-eQTLs and larger sample sizes will also be needed to provide better power to detect cis-eQTLs and cycle stage-specific effects, given the substantial changes in expression across the menstrual cycle for many genes. Identification of endometrial eQTLs provides a platform for better understanding genetic effects on endometriosis risk and other endometrial-related pathologies. Funding for this work was provided by NHMRC Project Grants GNT1026033, GNT1049472, GNT1046880, GNT1050208, GNT1105321 and APP1083405. There are no competing interests.

Publication

scGPS: Determining Cell States and Global Fate Potential of Subpopulations

Publisher: Frontiers Media SA

Date: 19-07-2021

DOI: 10.3389/FGENE.2021.666771

Abstract: Finding cell states and their transcriptional relatedness is a main outcome from analysing single-cell data. In developmental biology, determining whether cells are related in a differentiation lineage remains a major challenge. A seamless analysis pipeline from cell clustering to estimating the probability of transitions between cell clusters is lacking. Here, we present Single Cell Global fate Potential of Subpopulations (scGPS) to characterise transcriptional relationship between cell states. scGPS decomposes mixed cell populations in one or more samples into clusters (SCORE algorithm) and estimates pairwise transitioning potential (scGPS algorithm) of any pair of clusters. SCORE allows for the assessment and selection of stable clustering results, a major challenge in clustering analysis. scGPS implements a novel approach, with machine learning classification, to flexibly construct trajectory connections between clusters. scGPS also has a feature selection functionality by network and modelling approaches to find biological processes and driver genes that connect cell populations. We applied scGPS in diverse developmental contexts and show superior results compared to a range of clustering and trajectory analysis methods. scGPS is able to identify the dynamics of cellular plasticity in a user-friendly workflow that is fast and memory efficient. scGPS is implemented in R with optimised functions using C++ and is publicly available in Bioconductor.

Publication

propeller: Testing for differences in cell type proportions in single cell data

Publisher: Cold Spring Harbor Laboratory

Date: 28-11-2021

DOI: 10.1101/2021.11.28.470236

Abstract: Single cell RNA Sequencing (scRNA-seq) has rapidly gained popularity over the last few years for profiling the transcriptomes of thousands to millions of single cells. This technology is now being used to analyse experiments with complex designs including biological replication. One question that can be asked from single cell experiments, which has been difficult to directly address with bulk RNA-seq data, is whether the cell type proportions are different between two or more experimental conditions. As well as gene expression changes, the relative depletion or enrichment of a particular cell type can be the functional consequence of disease or treatment. However, cell type proportion estimates from scRNA-seq data are variable, and statistical methods that can correctly account for different sources of variability are needed to confidently identify statistically significant shifts in cell type composition between experimental conditions. We have developed propeller, a robust and flexible method that leverages biological replication to find statistically significant differences in cell type proportions between groups. Using simulated cell type proportions data we show that propeller performs well under a variety of scenarios. We applied propeller to test for significant changes in proportions of cell types related to human heart development, ageing and COVID-19 disease severity. The propeller method is publicly available in the open source speckle R package ( hipsonlab/speckle ). All the analysis code for the paper is available at hipsonlab ropeller-paper-analysis/ , and the associated analysis website is available at phipsonlab.github.io ropeller-paper-analysis/ . Contact: Alicia Oshlack (Alicia.Oshlack@petermac.org); Belinda Phipson (phipson.b@wehi.edu.au).

Publication

Single Cell RNA Sequencing of stem cell-derived retinal ganglion cells

Publisher: Cold Spring Harbor Laboratory

Date: 22-09-2017

DOI: 10.1101/191395

Abstract: We used human embryonic stem cell-derived retinal ganglion cells (RGCs) to characterize the transcriptome of 1,174 cells at the single cell level. The human embryonic stem cell line BRN3B-mCherry A81-H7 was differentiated to RGCs using a guided differentiation approach. Cells were harvested at day 36 and subsequently prepared for single cell RNA sequencing. Our data indicates the presence of three distinct subpopulations of cells, with various degrees of maturity. One cluster of 288 cells upregulated genes involved in axon guidance together with semaphorin interactions, cell-extracellular matrix interactions and ECM proteoglycans, suggestive of a more mature phenotype.

Publication

Transcriptomic and proteomic retinal pigment epithelium signatures of age-related macular degeneration

Publisher: Cold Spring Harbor Laboratory

Date: 20-08-2021

DOI: 10.1101/2021.08.19.457044

Abstract: Induced pluripotent stem cells generated from patients with geographic atrophy as well as healthy individuals were differentiated to retinal pigment epithelium (RPE) cells. By integrating transcriptional profiles of 127,659 RPE cells generated from 43 individuals with geographic atrophy and 36 controls with genotype data, we identified 439 expression quantitative trait loci (eQTL) in cis that were associated with disease status and specific to subpopulations of RPE cells. We identified loci linked to two genes with known associations with geographic atrophy - PILRB and PRPH2, in addition to 43 genes with significant genotype x disease interactions that are candidates for novel genetic associations for geographic atrophy. On a transcriptome-only level, we identified molecular pathways significantly upregulated in geographic atrophy-RPE including in extracellular cellular matrix reorganisation, neurodegeneration, and mitochondrial functions. We subsequently implemented a large-scale proteomics analysis, confirming modification in proteins associated with these pathways. We also identified six significant protein (p) QTL that regulate protein expression in the RPE cells and in geographic atrophy - two of which share variants with cis-eQTL. Transcriptome-wide association analysis identified genes at loci previously associated with age-related macular degeneration. Further analysis conditional on disease status implicated statistically significant RPE-specific eQTL. This study uncovers important differences in RPE homeostasis associated with geographic atrophy.

Publication

The Medical Genome Reference Bank contains whole genome and phenotype data of 2570 healthy elderly

Publisher: Springer Science and Business Media LLC

Date: 23-01-2020

DOI: 10.1038/S41467-019-14079-0

Abstract: Population health research is increasingly focused on the genetic determinants of healthy ageing, but there is no public resource of whole genome sequences and phenotype data from healthy elderly individuals. Here we describe the first release of the Medical Genome Reference Bank (MGRB), comprising whole genome sequence and phenotype of 2570 elderly Australians depleted for cancer, cardiovascular disease, and dementia. We analyse the MGRB for single-nucleotide, indel and structural variation in the nuclear and mitochondrial genomes. MGRB individuals have fewer disease-associated common and rare germline variants, relative to both cancer cases and the gnomAD and UK Biobank cohorts, consistent with risk depletion. Age-related somatic changes are correlated with grip strength in men, suggesting blood-derived whole genomes may also provide a biologic measure of age-related functional deterioration. The MGRB provides a broadly applicable reference cohort for clinical genetics and genomic association studies, and for understanding the genetics of healthy ageing.

Publication

Heritable defects in telomere and mitotic function selectively predispose to sarcomas

Publisher: American Association for the Advancement of Science (AAAS)

Date: 20-01-2023

DOI: 10.1126/SCIENCE.ABJ4784

Abstract: Cancer genetics has to date focused on epithelial malignancies, identifying multiple histotype-specific pathways underlying cancer susceptibility. Sarcomas are rare malignancies predominantly derived from embryonic mesoderm. To identify pathways specific to mesenchymal cancers, we performed whole-genome germline sequencing on 1644 sporadic cases and 3205 matched healthy elderly controls. Using an extreme phenotype design, a combined rare-variant burden and ontologic analysis identified two sarcoma-specific pathways involved in mitotic and telomere functions. Variants in centrosome genes are linked to malignant peripheral nerve sheath and gastrointestinal stromal tumors, whereas heritable defects in the shelterin complex link susceptibility to sarcoma, melanoma, and thyroid cancers. These studies indicate a specific role for heritable defects in mitotic and telomere biology in risk of sarcomas.

Publication

Hemani et al. reply

Publisher: Springer Science and Business Media LLC

Date: 10-2014

DOI: 10.1038/NATURE13692

Publication

Cryopreservation of human cancers conserves tumour heterogeneity for single-cell multi-omics analysis

Publisher: Springer Science and Business Media LLC

Date: 10-05-2021

DOI: 10.1186/S13073-021-00885-Z

Abstract: High throughput single-cell RNA sequencing (scRNA-Seq) has emerged as a powerful tool for exploring cellular heterogeneity among complex human cancers. scRNA-Seq studies using fresh human surgical tissue are logistically difficult, preclude histopathological triage of samples, and limit the ability to perform batch processing. This hindrance can often introduce technical biases when integrating patient datasets and increase experimental costs. Although tissue preservation methods have been previously explored to address such issues, it is yet to be examined on complex human tissues, such as solid cancers and on high throughput scRNA-Seq platforms. Using the Chromium 10X platform, we sequenced a total of ~ 120,000 cells from fresh and cryopreserved replicates across three primary breast cancers, two primary prostate cancers and a cutaneous melanoma. We performed detailed analyses between cells from each condition to assess the effects of cryopreservation on cellular heterogeneity, cell quality, clustering and the identification of gene ontologies. In addition, we performed single-cell immunophenotyping using CITE-Seq on a single breast cancer sample cryopreserved as solid tissue fragments. Tumour heterogeneity identified from fresh tissues was largely conserved in cryopreserved replicates. We show that sequencing of single cells prepared from cryopreserved tissue fragments or from cryopreserved cell suspensions is comparable to sequenced cells prepared from fresh tissue, with cryopreserved cell suspensions displaying higher correlations with fresh tissue in gene expression. We showed that cryopreservation had minimal impacts on the results of downstream analyses such as biological pathway enrichment. For some tumours, cryopreservation modestly increased cell stress signatures compared to freshly analysed tissue. 
Further, we demonstrate the advantage of cryopreserving whole-cells for detecting cell-surface proteins using CITE-Seq, which is impossible using other preservation methods such as single nuclei-sequencing. We show that the viable cryopreservation of human cancers provides high-quality single-cells for multi-omics analysis. Our study guides new experimental designs for tissue biobanking for future clinical single-cell RNA sequencing studies.

Publication

Another explanation for apparent epistasis

Publisher: Springer Science and Business Media LLC

Date: 10-2014

DOI: 10.1038/NATURE13691

Publication

Comprehensive benchmarking of computational deconvolution of transcriptomics data

Publisher: Cold Spring Harbor Laboratory

Date: 10-01-2020

DOI: 10.1101/2020.01.10.897116

Abstract: Many computational methods to infer cell type proportions from bulk transcriptomics data have been developed. Attempts comparing these methods revealed that the choice of reference marker signatures is far more important than the method itself. However, a thorough evaluation of the combined impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the results is still lacking. Using different single-cell RNA-sequencing (scRNA-seq) datasets, we generated hundreds of pseudo-bulk mixtures to evaluate the combined impact of these factors on the deconvolution results. Along with methods to perform deconvolution of bulk RNA-seq data we also included five methods specifically designed to infer the cell type composition of bulk data using scRNA-seq data as reference. Both bulk and single-cell deconvolution methods perform best when applied to data in linear scale and the choice of normalization can have a dramatic impact on the performance of some, but not all methods. Overall, single-cell methods have comparable performance to the best performing bulk methods and bulk methods based on semi-supervised approaches showed higher error and lower correlation values between the computed and the expected proportions. Moreover, failure to include cell types in the reference that are present in a mixture always led to substantially worse results, regardless of any of the previous choices. Taken together, we provide a thorough evaluation of the combined impact of the different factors affecting the computational deconvolution task across different datasets and propose general guidelines to maximize its performance.
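
The pseudo-bulk idea described in the abstract above can be illustrated with a minimal sketch: counts from randomly drawn single cells are summed into one mixture whose true cell type proportions are known by construction. This is not the benchmarking code from the paper; the function name `make_pseudo_bulk`, the toy count matrix, and the cell type labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def make_pseudo_bulk(cells, labels, n_cells, rng):
    """Sum the counts of n_cells cells drawn with replacement.

    cells  : list of per-cell count vectors (one integer per gene)
    labels : list of cell type labels, same order as cells
    Returns the summed mixture and the true proportion of each cell type.
    """
    drawn = [rng.randrange(len(cells)) for _ in range(n_cells)]
    n_genes = len(cells[0])
    mixture = [0] * n_genes
    for i in drawn:
        for g in range(n_genes):
            mixture[g] += cells[i][g]
    counts = Counter(labels[i] for i in drawn)
    proportions = {ct: counts[ct] / n_cells for ct in sorted(set(labels))}
    return mixture, proportions


# Toy data (hypothetical): 4 cells x 3 genes, two cell types.
cells = [[5, 0, 1], [4, 1, 0], [0, 6, 2], [1, 7, 1]]
labels = ["T", "T", "B", "B"]

rng = random.Random(0)  # fixed seed so the draw is reproducible
mixture, proportions = make_pseudo_bulk(cells, labels, 10, rng)
print(mixture, proportions)
```

A deconvolution method would then be scored by comparing its estimated proportions for `mixture` against the known `proportions`.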

Publication

Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression

Publisher: Springer Science and Business Media LLC

Date: 26-04-2018

DOI: 10.1038/S41588-018-0090-3

Publication

Testing Two Evolutionary Theories of Human Aging with DNA Methylation Data

Publisher: Oxford University Press (OUP)

Date: 30-08-2017

DOI: 10.1534/GENETICS.117.300217

Abstract: The evolutionary theories of mutation accumulation (MA) and disposable soma (DS) provide possible explanations for the existence of human aging. To better understand the relative importance of these theories, we devised a test to identify MA- and DS-consistent sites across the genome using familial DNA methylation data. Two key characteristics of DNA methylation allowed us to do so. First, DNA methylation exhibits distinct and widespread changes with age, with numerous age-differentially-methylated sites observed across the genome. Second, many sites show heritable DNA methylation patterns within families. We extended heritability predictions of MA and DS to DNA methylation, predicting that MA-consistent age-differentially-methylated sites will show increasing heritability with age, while DS-consistent sites will show the opposite. Variance components models were used to test for changing heritability of methylation with age at 48,601 age-differentially-methylated sites across the genome in 610 individuals from 176 families. Of these, 102 sites showed significant MA-consistent increases in heritability with age, while 2266 showed significant DS-consistent decreases in heritability. These results suggest that both MA and DS play a role in explaining aging and aging-related changes, and that while the majority of DNA methylation changes observed in aging are consistent with epigenetic drift, targeted changes exist and may mediate effects of aging-related genes.

Publication

A model of impaired Langerhans cell maturation associated with HPV induced epithelial hyperplasia

Publisher: Elsevier BV

Date: 11-2021

DOI: 10.1016/J.ISCI.2021.103326

Publication

Genotype-free demultiplexing of pooled single-cell RNA-seq

Publisher: Springer Science and Business Media LLC

Date: 12-2019

DOI: 10.1186/S13059-019-1852-7

Abstract: A variety of methods have been developed to demultiplex pooled samples in a single cell RNA sequencing (scRNA-seq) experiment which either require hashtag barcodes or sample genotypes prior to pooling. We introduce scSplit which utilizes genetic differences inferred from scRNA-seq data alone to demultiplex pooled samples. scSplit also enables mapping clusters to original samples. Using simulated, merged, and pooled multi-individual datasets, we show that scSplit prediction is highly concordant with demuxlet predictions and is highly consistent with the known truth in cell-hashing dataset. scSplit is ideally suited to samples without external genotype information and is available at: on-xu/scSplit

Publication

Genetic parameters of production traits in Atlantic salmon (Salmo salar)

Publisher: Elsevier BV

Date: 02-2008

DOI: 10.1016/J.AQUACULTURE.2007.11.036

Publication

Demuxafy: Improvement in droplet assignment by integrating multiple single-cell demultiplexing and doublet detection methods

Publisher: Cold Spring Harbor Laboratory

Date: 08-03-2022

DOI: 10.1101/2022.03.07.483367

Abstract: Recent innovations in droplet-based single-cell RNA-sequencing (scRNA-seq) have provided the technology necessary to investigate biological questions at cellular resolution. With the ability to assay thousands of cells in a single capture, pooling cells from multiple individuals has become a common strategy. Droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences, and numerous computational methods have been developed to address this problem. However, another challenge implicit with droplet-based scRNA-seq is the occurrence of doublets - droplets containing two or more cells. The inaccurate assignment of cells to individuals or failure to remove doublets contribute unwanted noise to the data and result in erroneous scientific conclusions. Therefore, it is essential to assign cells to individuals and remove doublets accurately. We present a new framework to improve individual singlet classification and doublet removal through a multi-method intersectional approach. We developed a framework to evaluate the enhancement in donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. The accuracy was assessed using scRNA-seq data of ∼1.4 million peripheral blood mononucleated cells from 1,034 unrelated individuals and ∼90,000 fibroblast cells from 81 unrelated individuals. We show that our approach significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual compared to any single method. We show that the best combination of techniques varies under different biological and experimental conditions, and we present a framework to optimise cell assignment for a given experiment. 
We offer Demuxafy ( demultiplexing-doublet-detecting-docs.readthedocs.io/en/latest/index.html ) - a framework built-in Singularity to provide clear, consistent documentation of each method and additional tools to simplify and improve demultiplexing and doublet removal. Our results indicate that leveraging multiple demultiplexing and doublet detecting methods improves accuracy and, consequently, downstream analyses in multiplexed scRNA-seq experiments.
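
The consensus idea in the abstract above, combining the calls of several demultiplexing and doublet detecting methods for each droplet, can be sketched as a simple agreement rule. This is an illustrative sketch only and does not reproduce Demuxafy's implementation or its combination strategies; the function `consensus_call`, the `min_agree` threshold, the barcodes, and the donor labels are all hypothetical.

```python
from collections import Counter


def consensus_call(calls, min_agree):
    """Return the label that at least min_agree methods agree on.

    calls     : list of labels, one per method (a donor ID or "doublet")
    min_agree : required number of agreeing methods; it should exceed
                half of len(calls) so that at most one label can qualify
    Droplets without sufficient agreement are left "unassigned".
    """
    label, n = Counter(calls).most_common(1)[0]
    return label if n >= min_agree else "unassigned"


# Hypothetical calls from three methods for three droplet barcodes.
droplets = {
    "AAAC": ["donor1", "donor1", "donor1"],
    "AAAG": ["donor2", "doublet", "donor2"],
    "AAAT": ["donor1", "donor2", "doublet"],
}

assignments = {bc: consensus_call(c, min_agree=2) for bc, c in droplets.items()}
print(assignments)
```

Requiring a strict majority keeps the rule unambiguous: two different labels cannot both reach the threshold for the same droplet.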

Publication

Genetic regulation of disease risk and endometrial gene expression highlights potential target genes for endometriosis and polycystic ovarian syndrome

Publisher: Springer Science and Business Media LLC

Date: 30-07-2018

DOI: 10.1038/S41598-018-29462-Y

Abstract: Gene expression varies markedly across the menstrual cycle and expression levels for many genes are under genetic control. We analyzed gene expression and mapped expression quantitative trait loci (eQTLs) in endometrial tissue samples from 229 women and then analyzed the overlap of endometrial eQTL signals with genomic regions associated with endometriosis and other reproductive traits. We observed a total of 45,923 cis-eQTLs for 417 unique genes and 2,968 trans-eQTLs affecting 82 unique genes. Two eQTLs were located in known risk regions for endometriosis including LINC00339 on chromosome 1 and VEZT on chromosome 12 and there was evidence for eQTLs that may be target genes in genomic regions associated with other reproductive diseases. Dynamic changes in expression of individual genes across cycle include alterations in both mean expression and transcriptional silencing. Significant effects of cycle stage on mean expression levels were observed for (2,427/15,262) probes with detectable expression in at least 90% of samples and for (2,877/9,626) probes expressed in some, but not all samples. Pathway analysis supports similar biological control of both altered expression levels and transcriptional silencing. Taken together, these data identify strong genetic effects on genes with diverse functions in human endometrium and provide a platform for better understanding genetic effects on endometrial-related pathologies.

Publication

Genome-wide analysis of blood gene expression in migraine implicates immune-inflammatory pathways

Publisher: SAGE Publications

Date: 06-01-2018

DOI: 10.1177/0333102416686769

Abstract: Typical migraine is a frequent, debilitating and painful headache disorder with an estimated heritability of about 50%. Although genome-wide association (GWA) studies have identified over 40 single nucleotide polymorphisms associated with migraine, further research is required to determine their biological role in migraine susceptibility. Therefore, we performed a study of genome-wide gene expression in a large sample of 83 migraine cases and 83 non-migraine controls to determine whether altered expression levels of genes and pathways could provide insights into the biological mechanisms underlying migraine. We assessed whole blood gene expression data for 17994 expression probes measured using Illumina HT-12 v4.0 BeadChips. Differential expression was assessed using multivariable logistic regression. Gene expression probes with a nominal p value < 0.05 were classified as differentially expressed. We identified modules of co-regulated genes and tested them for enrichment of differentially expressed genes and functional pathways using a false discovery rate < 0.05. Association analyses between migraine and probe expression levels, adjusted for age and gender, revealed an excess of small p values, but there was no significant single-probe association after correction for multiple testing. Network analysis of pooled expression data identified 10 modules of co-expressed genes. One module harboured a significant number of differentially expressed genes and was strongly enriched with immune-inflammatory pathways, including multiple pathways expressed in microglial cells. These data suggest immune-inflammatory pathways play an important role in the pathogenesis, manifestation, and/or progression of migraine in some patients. 
Furthermore, gene-expression associations are measurable in whole blood, suggesting the analysis of blood gene expression can inform our understanding of the biological mechanisms underlying migraine, identify biomarkers, and facilitate the discovery of novel pathways and thus determine new targets for drug therapy.

Publication

Expression quantitative trait locus analysis for translational medicine

Publisher: Springer Science and Business Media LLC

Date: 24-06-2015

DOI: 10.1186/S13073-015-0186-7

Publication

Longitudinal expression profiling of CD4+ and CD8+ cells in patients with active to quiescent giant cell arteritis

Publisher: Springer Science and Business Media LLC

Date: 23-07-2018

DOI: 10.1186/S12920-018-0376-4

Publication

Determining cell fate specification and genetic contribution to cardiac disease risk in hiPSC-derived cardiomyocytes at single cell resolution

Publisher: Cold Spring Harbor Laboratory

Date: 05-12-2017

DOI: 10.1101/229336

Abstract: The majority of genetic loci underlying common disease risk act through changing genome regulation, and are routinely linked to expression quantitative trait loci, where gene expression is measured using bulk populations of mature cells. A crucial step that is missing is evidence of variation in the expression of these genes as cells progress from a pluripotent to mature state. This is especially important for cardiovascular disease, as the majority of cardiac cells have limited properties for renewal postneonatal. To investigate the dynamic changes in gene expression across the cardiac lineage, we generated RNA-sequencing data captured from 43,168 single cells progressing through in vitro cardiac-directed differentiation from pluripotency. We developed a novel and generalized unsupervised cell clustering approach and a machine learning method for prediction of cell transition. Using these methods, we were able to reconstruct the cell fate choices as cells transition from a pluripotent state to mature cardiomyocytes, uncovering intermediate cell populations that do not progress to maturity, and distinct cell trajectories that terminate in cardiomyocytes that differ in their contractile forces. Second, we identify new gene markers that denote lineage specification and demonstrate a substantial increase in their utility for cell identification over current pluripotent and cardiogenic markers. By integrating results from analysis of the single cell lineage RNA-sequence data with population-based GWAS of cardiovascular disease and cardiac tissue eQTLs, we show that the pathogenicity of disease-associated genes is highly dynamic as cells transition across their developmental lineage, and exhibit variation between cell fate trajectories. 
Through the integration of single cell RNA-sequence data with population-scale genetic data we have identified genes significantly altered at cell specification events providing insights into a context-dependent role in cardiovascular disease risk. This study provides a valuable data resource focused on in vitro cardiomyocyte differentiation to understand cardiac disease coupled with new analytical methods with broad applications to single-cell data.

Publication

MHC-Dependent Mate Selection within 872 Spousal Pairs of European Ancestry from the Health and Retirement Study

Publisher: MDPI AG

Date: 22-01-2018

DOI: 10.3390/GENES9010053

Publication

Human iPSC-derived cerebellar neurons from a patient with ataxia-telangiectasia reveal disrupted gene regulatory networks

Publisher: Frontiers Media SA

Date: 13-10-2017

DOI: 10.3389/FNCEL.2017.00321

Publication

Chronic lung diseases are associated with gene expression programs favoring SARS-CoV-2 entry and severity

Publisher: Springer Science and Business Media LLC

Date: 14-07-2021

DOI: 10.1038/S41467-021-24467-0

Abstract: Patients with chronic lung disease (CLD) have an increased risk for severe coronavirus disease-19 (COVID-19) and poor outcomes. Here, we analyze the transcriptomes of 611,398 single cells isolated from healthy and CLD lungs to identify molecular characteristics of lung cells that may account for worse COVID-19 outcomes in patients with chronic lung diseases. We observe a similar cellular distribution and relative expression of SARS-CoV-2 entry factors in control and CLD lungs. CLD AT2 cells express higher levels of genes linked directly to the efficiency of viral replication and the innate immune response. Additionally, we identify basal differences in inflammatory gene expression programs that highlight how CLD alters the inflammatory microenvironment encountered upon viral exposure to the peripheral lung. Our study indicates that CLD is accompanied by changes in cell-type-specific gene expression programs that prime the lung epithelium for and influence the innate and adaptive immune responses to SARS-CoV-2 infection.

Publication

Comparative performance of the BGI and Illumina sequencing technology for single-cell RNA-sequencing

Publisher: Oxford University Press (OUP)

Date: 13-05-2020

DOI: 10.1093/NARGAB/LQAA034

Abstract: The libraries generated by high-throughput single cell RNA-sequencing (scRNA-seq) platforms such as the Chromium from 10× Genomics require considerable amounts of sequencing, typically due to the large number of cells. The ability to use these data to address biological questions is directly impacted by the quality of the sequence data. Here we have compared the performance of the Illumina NextSeq 500 and NovaSeq 6000 against the BGI MGISEQ-2000 platform using identical Single Cell 3′ libraries consisting of over 70 000 cells generated on the 10× Genomics Chromium platform. Our results demonstrate a highly comparable performance between the NovaSeq 6000 and MGISEQ-2000 in sequencing quality, and the detection of genes, cell barcodes, Unique Molecular Identifiers. The performance of the NextSeq 500 was also similarly comparable to the MGISEQ-2000 based on the same metrics. Data generated by both sequencing platforms yielded similar analytical outcomes for general single-cell analysis. The performance of the NextSeq 500 and MGISEQ-2000 were also comparable for the deconvolution of multiplexed cell pools via variant calling, and detection of guide RNA (gRNA) from a pooled CRISPR single-cell screen. Our study provides a benchmark for high-capacity sequencing platforms applied to high-throughput scRNA-seq libraries.

Publication

The autotaxin-lysophosphatidic acid pathway mediates mesenchymal cell recruitment and fibrotic contraction in lung transplant fibrosis

Publisher: Elsevier BV

Date: 2021

DOI: 10.1016/J.HEALUN.2020.10.005

Publication

Stromal cell diversity associated with immune evasion in human triple‐negative breast cancer

Publisher: EMBO

Date: 13-08-2020

DOI: 10.15252/EMBJ.2019104063

Publication

The Genetic Architecture of Gene Expression in Peripheral Blood

Publisher: Elsevier BV

Date: 02-2017

DOI: 10.1016/J.AJHG.2016.12.008

Publication

Refining Attention-Deficit/Hyperactivity Disorder and Autism Spectrum Disorder Genetic Loci by Integrating Summary Data From Genome-wide Association, Gene Expression, and DNA Methylation Studies

Publisher: Elsevier BV

Date: 09-2020

DOI: 10.1016/J.BIOPSYCH.2020.05.002

Publication

SARS-CoV-2 entry factors are highly expressed in nasal epithelial cells together with innate immune genes

Publisher: Springer Science and Business Media LLC

Date: 23-04-2020

DOI: 10.1038/S41591-020-0868-6

Publication

Pitfalls and opportunities for applying latent variables in single-cell eQTL analyses

Publisher: Springer Science and Business Media LLC

Date: 23-02-2023

DOI: 10.1186/S13059-023-02873-5

Abstract: Using latent variables in gene expression data can help correct for unobserved confounders and increase statistical power for expression quantitative trait loci (eQTL) detection. The probabilistic estimation of expression residuals (PEER) and principal component analysis (PCA) are widely used methods that can remove unwanted variation and improve eQTL discovery power in bulk RNA-seq analysis. However, their performance has not been evaluated extensively in single-cell eQTL analysis, especially for different cell types. Potential challenges arise due to the structure of single-cell RNA-seq data, including sparsity, skewness, and mean-variance relationship. Here, we show by a series of analyses that PEER and PCA require additional quality control and data transformation steps on the pseudo-bulk matrix to obtain valid latent variables; otherwise, they can yield highly correlated factors (Pearson's correlation r = 0.63 ~ 0.99). Incorporating valid PEER factors (PFs) or PCs in the eQTL association model would identify 1.7 ~ 13.3% more eGenes. Sensitivity analysis showed that the pattern of change between the number of eGenes detected and fitted PFs/PCs varied significantly in different cell types. In addition, using highly variable genes to generate latent variables could achieve similar eGene discovery power as using all genes but save considerable computational resources (~ 6.2-fold faster).

Publication

Congruence of Additive and Non-Additive Effects on Gene Expression Estimated from Pedigree and SNP Data

Publisher: Public Library of Science (PLoS)

Date: 16-05-2013

DOI: 10.1371/JOURNAL.PGEN.1003502

Publication

Endometriosis risk alleles at 1p36.12 act through inverse regulation of CDC42 and LINC00339

Publisher: Oxford University Press (OUP)

Date: 20-09-2016

DOI: 10.1093/HMG/DDW320

Abstract: Genome-wide association studies (GWAS) have identified markers within the WNT4 region on chromosome 1p36.12 showing consistent and strong association with increasing endometriosis risk. Fine mapping using sequence and imputed genotype data has revealed strong candidates for the causal SNPs within these critical regions; however, the molecular pathogenesis of these SNPs is currently unknown. We used gene expression data collected from whole blood from 862 individuals and endometrial tissue from 136 individuals from independent populations of European descent to examine the mechanism underlying endometriosis susceptibility. Association mapping results from 7,090 individuals (2,594 cases and 4,496 controls) supported rs3820282 as the SNP with the strongest association for endometriosis risk (P = 1.84 × 10^-5, OR = 1.244 (1.126-1.375)). SNP rs3820282 is a significant eQTL in whole blood decreasing expression of LINC00339 (also known as HSPC157) and increasing expression of CDC42 (P = 2.0 × 10^-54 and 4.5 × 10^-4, respectively). The largest effects were for two LINC00339 probes (P = 2.0 × 10^-54 and 1.0 × 10^-34). The eQTL for LINC00339 was also observed in endometrial tissue (P = 2.4 × 10^-8) with the same direction of effect for both whole blood and endometrial tissue. There was no evidence for eQTL effects for WNT4. Chromatin conformation capture provides evidence for risk SNPs interacting with the promoters of both LINC00339 and CDC42, and luciferase reporter assays suggest the risk SNP rs12038474 is located in a transcriptional silencer for CDC42 and the risk allele increases expression of CDC42. However, no effect of rs3820282 was observed on LINC00339 expression in Ishikawa cells. Taken together, our results suggest that SNPs increasing endometriosis risk in this region act through CDC42, but further functional studies are required to rule out inverse regulation of both LINC00339 and CDC42.

Publication

A combined strategy for quantitative trait loci detection by genome-wide association

Publisher: Springer Science and Business Media LLC

Date: 23-02-2009

DOI: 10.1186/1753-6561-3-S1-S6

Abstract: We applied a range of genome-wide association (GWA) methods to map quantitative trait loci (QTL) in the simulated dataset provided by the 12th QTLMAS workshop in order to derive an effective strategy. A variance component linkage analysis revealed QTLs but with low resolution. Three single-marker based GWA methods were then applied: Transmission Disequilibrium Test and single marker regression, fitting an additive model or a genotype model, on phenotypes pre-corrected for pedigree and fixed effects. These methods detected QTL positions with high concordance to each other and with greater refinement of the linkage signals. Further multiple-marker and haplotype analyses confirmed the results with higher significance. Two-locus interaction analysis detected two epistatic pairs of markers that were not significant by marginal effects. Overall, using stringent Bonferroni thresholds we identified 9 additive QTL and 2 epistatic interactions, which together explained about 12.3% of the corrected phenotypic variance. The combination of methods that are robust against population stratification, like QTDT, with flexible linear models that take account of the family structure provided consistent results. Extensive simulations are still required to determine appropriate thresholds for more advanced models including epistasis.

Publication

An integrated cell atlas of the human lung in health and disease

Publisher: Cold Spring Harbor Laboratory

Date: 11-03-2022

DOI: 10.1101/2022.03.10.483747

Abstract: Organ- and body-scale cell atlases have the potential to transform our understanding of human biology. To capture the variability present in the population, these atlases must include diverse demographics such as age and ethnicity from both healthy and diseased individuals. The growth in both size and number of single-cell datasets, combined with recent advances in computational techniques, for the first time makes it possible to generate such comprehensive large-scale atlases through integration of multiple datasets. Here, we present the integrated Human Lung Cell Atlas (HLCA) combining 46 datasets of the human respiratory system into a single atlas spanning over 2.2 million cells from 444 individuals across health and disease. The HLCA contains a consensus re-annotation of published and newly generated datasets, resolving under- or misannotation of 59% of cells in the original datasets. The HLCA enables recovery of rare cell types, provides consensus marker genes for each cell type, and uncovers gene modules associated with demographic covariates and anatomical location within the respiratory system. To facilitate the use of the HLCA as a reference for single-cell lung research and allow rapid analysis of new data, we provide an interactive web portal to project datasets onto the HLCA. Finally, we demonstrate the value of the HLCA reference for interpreting disease-associated changes. Thus, the HLCA outlines a roadmap for the development and use of organ-scale cell atlases within the Human Cell Atlas.

Publication

AAV-Mediated CRISPR/Cas Gene Editing of Retinal Cells In Vivo

Publisher: Association for Research in Vision and Ophthalmology (ARVO)

Date: 29-06-2016

DOI: 10.1167/IOVS.16-19316

Abstract: Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated protein (Cas) has recently been adapted to enable efficient editing of the mammalian genome, opening novel avenues for therapeutic intervention of inherited diseases. In seeking to disrupt yellow fluorescent protein (YFP) in a Thy1-YFP transgenic mouse, we assessed the feasibility of utilizing the adeno-associated virus 2 (AAV2) to deliver CRISPR/Cas for gene modification of retinal cells in vivo. Single guide RNA (sgRNA) plasmids were designed to target YFP, and after in vitro validation, selected guides were cloned into a dual AAV system. One AAV2 construct was used to deliver Streptococcus pyogenes Cas9 (SpCas9), and the other delivered sgRNA against YFP or LacZ (control) in the presence of mCherry. Five weeks after intravitreal injection, retinal function was determined using electroretinography, and CRISPR/Cas-mediated gene modifications were quantified in retinal flat mounts. Adeno-associated virus 2-mediated in vivo delivery of SpCas9 with sgRNA targeting YFP significantly reduced the number of YFP fluorescent cells of the inner retina of our transgenic mouse model. Overall, we found an 84.0% (95% confidence interval [CI]: 81.8-86.9) reduction of YFP-positive cells in YFP-sgRNA-infected retinal cells compared to eyes treated with LacZ-sgRNA. Electroretinography profiling found no significant alteration in retinal function following AAV2-mediated delivery of CRISPR/Cas components compared to contralateral untreated eyes. Thy1-YFP transgenic mice were used as a rapid quantifiable means to assess the efficacy of CRISPR/Cas-based retinal gene modification in vivo. We demonstrate that genomic modification of cells in the adult retina can be readily achieved by viral-mediated delivery of CRISPR/Cas.

Publication

Identification of 55,000 Replicated DNA Methylation QTL

Publisher: Springer Science and Business Media LLC

Date: 04-12-2018

DOI: 10.1038/S41598-018-35871-W

Abstract: DNA methylation plays an important role in the regulation of transcription. Genetic control of DNA methylation is a potential candidate for explaining the many identified SNP associations with disease that are not found in coding regions. We replicated 52,916 cis and 2,025 trans DNA methylation quantitative trait loci (mQTL) using methylation from whole blood measured on Illumina HumanMethylation450 arrays in the Brisbane Systems Genetics Study (n = 614 from 177 families) and the Lothian Birth Cohorts of 1921 and 1936 (combined n = 1366). The trans mQTL SNPs were found to be over-represented in 1 Mbp subtelomeric regions, and on chromosomes 16 and 19. There was a significant increase in trans mQTL DNA methylation sites in upstream and 5′ UTR regions. The genetic heritability of a number of complex traits and diseases was partitioned into components due to mQTL and the remainder of the genome. Significant enrichment was observed for height (p = 2.1 × 10^-10), ulcerative colitis (p = 2 × 10^-5), Crohn’s disease (p = 6 × 10^-8) and coronary artery disease (p = 5.5 × 10^-6) when compared to a random sample of SNPs with matched minor allele frequency, although this enrichment is explained by the genomic location of the mQTL SNPs.

Publication

Transcriptomic Profiling of Human Pluripotent Stem Cell-derived Retinal Pigment Epithelium over Time

Publisher: Elsevier BV

Date: 04-2021

DOI: 10.1016/J.GPB.2020.08.002

Publication

Ribosomal protein S6 mRNA is a biomarker upregulated in multiple sclerosis, downregulated by interferon treatment, and affected by season

Publisher: SAGE Publications

Date: 14-10-2013

DOI: 10.1177/1352458513507819

Abstract: Multiple Sclerosis (MS) is an immune-mediated disease of the central nervous system which responds to therapies targeting circulating immune cells. Our aim was to test if the T-cell activation gene expression pattern (TCAGE) we had previously described from whole blood was replicated in an independent cohort. We used RNA-seq to interrogate the whole blood transcriptomes of 72 individuals (40 healthy controls, 32 untreated MS). A cohort of 862 control individuals from the Brisbane Systems Genetics Study (BSGS) was used to assess heritability and seasonal expression. The effect of interferon beta (IFNB) therapy on expression was evaluated. The MS/TCAGE association was replicated and rationalized to a single marker, ribosomal protein S6 (RPS6). Expression of RPS6 was higher in MS than controls ( p .0004), and lower in winter than summer ( p .6E-06). The seasonal pattern correlated with monthly UV light index ( R=0.82, p .002), and was also identified in the BSGS cohort ( p .0016). Variation in expression of RPS6 was not strongly heritable. RPS6 expression was reduced by IFNB therapy. These data support investigation of RPS6 as a potential therapeutic target and candidate biomarker for measuring clinical response to IFNB and other MS therapies, and of MS disease heterogeneity.

Publication

Transcriptomic Profiling of Human Pluripotent Stem Cell-Derived Retinal Pigment Epithelium Over Time

Publisher: Cold Spring Harbor Laboratory

Date: 14-11-2019

DOI: 10.1101/842328

Abstract: Human pluripotent stem cell (hPSC)-derived progenies are immature versions of cells, presenting a potential limitation to the accurate modelling of disease associated with maturity or age. Hence, it is important to characterise how closely cells used in culture resemble their native counterparts. In order to select appropriate points in time for RPE cultures to reflect native counterparts, we characterised the transcriptomic profiles of hPSC-derived retinal pigment epithelium (RPE) cells from 1- and 12-month cultures. We differentiated the human embryonic stem cell line H9 into RPE cells, performed single cell RNA-sequencing of a total of 16,576 cells, and analysed the resulting data to assess the molecular changes of RPE cells across these two culture time points. Our results indicate the stability of the RPE transcriptomic signature, with no evidence of an epithelial-mesenchymal transition, and with maturing populations of RPE observed with time in culture. Assessment of gene ontology pathways revealed that as cultures age, RPE cells upregulate expression of genes involved in metal binding and antioxidant functions. This might reflect an increased ability to handle oxidative stress as cells mature. Comparison with native human RPE data confirmed a maturing transcriptional profile of RPE cells in culture. These results suggest that in vitro long-term culture of RPE cells allows the modelling of specific phenotypes observed in native mature tissue. Our work highlights the transcriptional landscape of hPSC-derived RPE as they age in culture, which provides a reference for native and patient samples to be benchmarked against.

Publication

Single-cell transcriptomics of alloreactive CD4+ T cells over time reveals divergent fates during gut graft-versus-host disease

Publisher: American Society for Clinical Investigation

Date: 09-07-2020

DOI: 10.1172/JCI.INSIGHT.137990

Publication

Endometrial vezatin and its association with endometriosis risk

Publisher: Oxford University Press (OUP)

Date: 22-03-2016

DOI: 10.1093/HUMREP/DEW047

Abstract: Do endometriosis risk-associated single nucleotide polymorphisms (SNPs) found at the 12q22 locus have effects on vezatin (VEZT) expression? The original genome-wide association study (GWAS) SNP (rs10859871), and other newly identified association signals, demonstrate strong evidence for cis-expression quantitative trait loci (eQTL) effects on VEZT expression. GWAS have identified several disease-risk loci (SNPs) associated with endometriosis. The SNP rs10859871 is located within the VEZT gene. VEZT expression is altered in the endometrium of endometriosis patients, and VEZT is an excellent candidate for having a causal role in endometriosis. Most of the SNPs identified from GWAS are not located within the coding region of the genome. However, they are likely to have an effect on the regulation of gene expression. Genetic variants that affect levels of gene expression are called expression quantitative trait loci (eQTL). Samples for genotyping and VEZT variant screening were drawn from women recruited for genetic studies in Australia/New Zealand and women undergoing surgery in a tertiary care centre. Coding variants for VEZT were screened in blood from 100 unrelated individuals (endometriosis-dense families) from the QIMR Berghofer Medical Research Institute dataset. SNPs at the 12q22 locus were imputed and reanalysed for their association with endometriosis. Reanalysis of endometriosis risk-association was performed on a final combined Australian dataset of 2594 cases and 4496 controls. Gene expression profiling was performed on 136 endometrial samples. eQTL analysis in whole blood was performed on 862 individuals from the Brisbane Systems Genetics Study. Endometrial tissue-specific eQTL analysis was performed on 122 samples (eutopic endometrium) collected following laparoscopic surgery. VEZT protein expression studies employed n = 56 (western blotting) and n = 42 (immunohistochemistry) endometrial samples.
The women recruited for this study provided blood and/or endometrial tissue samples in a hospital setting. Genomic DNA was screened for common and coding variants. SNPs of interest in the 12q22 region were genotyped using Agena MassARRAY technology or Taqman SNP genotyping assay. Gene expression profiles from RNA extracted from blood and endometrial tissue samples were generated using Illumina whole-genome expression chips (Human HT-12 v4.0). Whole protein extracted from endometrium was used for VEZT western blots, and paraffin sections of endometrium were employed for VEZT immunohistochemistry semi-quantitative analysis. A total of 11 coding variants of VEZT (including one novel variant) were identified from an endometriosis-dense cohort. Polymorphic coding and imputed SNPs were combined with previous GWAS data to reanalyse the endometriosis risk association of the 12q22 region. The disease association signal at 12q22 was due to coding variants in VEZT or FGD6 (FYVE, RhoGEF and PH domain-containing 6) and SNPs with the strongest signals were either intronic or intergenic. We found strong evidence for VEZT cis-eQTLs with the sentinel SNP (rs10859871) in blood and endometrium, where the endometriosis risk allele (C) was associated with an increase in VEZT expression. We could not demonstrate this genotype-specific effect on VEZT protein expression in endometrium. However, we did observe a menstrual cycle stage-specific increase in VEZT protein expression in endometrial glands, specific to the secretory phase (P = 2.0 × 10^-4). In comparison to the blood sample datasets, the study numbers of endometrial tissues were substantially reduced. Protein studies failed to complement RNA results, also likely a reflection of the low study numbers in these experiments.
In silico prediction tools used in this investigation are typically based on cell lines different to our tissues of interest, thus any functional annotations drawn from these approaches should be considered carefully. Therefore, functional studies on VEZT and related pathway components are still warranted to unequivocally implicate a causal role for VEZT in endometriosis pathophysiology. GWAS have proven to be very valuable tools for deciphering complex diseases. Endometriosis is a text-book example of a complex disease, involving genetic, lifestyle and environmental influences. Our focused investigation of the 12q22 region validates an association with increased endometriosis risk. Endometriosis risk SNPs (including rs10859871) located within this locus demonstrated evidence for cis-eQTLs on VEZT expression. By examining women who possess an enhanced genetic risk of developing endometriosis, we have identified an effect on VEZT expression and therefore a potential gene/gene pathway in endometriosis disease establishment and development. Funding for this work was provided by NHMRC Project Grants GNT1012245, GNT1026033, GNT1049472 and GNT1046880. G.W.M. is supported by the NHMRC Fellowship scheme (GNT1078399). S.J.H.-C. is supported by the J.N. Peters Bequest Fellowship. The authors declare no competing interests.

Publication

Itaconate controls the severity of pulmonary fibrosis

Publisher: American Association for the Advancement of Science (AAAS)

Date: 08-10-2020

DOI: 10.1126/SCIIMMUNOL.ABC1884

Abstract: The ACOD1/itaconate axis is a pulmonary regulatory pathway that controls the severity of fibrosis and is a potential therapeutic target in IPF.

Publication

Overlap of expression quantitative trait loci (eQTL) in human brain and blood.

Publisher: Springer Science and Business Media LLC

Date: 03-06-2014

DOI: 10.1186/1755-8794-7-31

Publication

Effect of all-but-one conditional analysis for eQTL isolation in peripheral blood

Publisher: Oxford University Press (OUP)

Date: 02-11-2022

DOI: 10.1093/GENETICS/IYAC162

Abstract: Expression quantitative trait locus detection has become increasingly important for understanding how noncoding variants contribute to disease susceptibility and complex traits. The major challenges in expression quantitative trait locus fine-mapping and causal variant discovery relate to the impact of linkage disequilibrium on signals due to one or multiple functional variants that lie within a credible set. We perform expression quantitative trait locus fine-mapping using the all-but-one approach, conditioning each signal on all others detected in an interval, on the Consortium for the Architecture of Gene Expression cohorts of microarray-based peripheral blood gene expression in 2,138 European-ancestry human adults. We contrast these results with traditional forward stepwise conditional analysis and a Bayesian localization method. All-but-one conditioning significantly modifies effect-size estimates for 51% of 2,351 expression quantitative trait locus peaks, but only modestly affects credible set size and location. On the other hand, both conditioning approaches result in unexpectedly low overlap with Bayesian credible sets, with just 57% peak concordance and between 50% and 70% SNP sharing, leading us to caution against the assumption that any one localization method is superior to another. We also cross reference our results with ATAC-seq data, cell-type-specific expression quantitative trait locus, and activity-by-contact-enhancers, leading to the proposal of a 5-tier approach to further reduce credible set sizes and prioritize likely causal variants for all known inflammatory bowel disease risk loci active in immune cells.

Publication

The relationship between adrenocortical candidate gene expression and clinical response to hydrocortisone in patients with septic shock

Publisher: Springer Science and Business Media LLC

Date: 29-06-2021

DOI: 10.1007/S00134-021-06464-5

Publication

Detection of HPV E7 Transcription at Single-Cell Resolution in Epidermis

Publisher: Elsevier BV

Date: 12-2018

DOI: 10.1016/J.JID.2018.06.169

Abstract: Persistent human papillomavirus (HPV) infection is responsible for at least 5% of human malignancies. Most HPV-associated cancers are initiated by the HPV16 genotype, as confirmed by detection of integrated HPV DNA in cells of oral and anogenital epithelial cancers. However, single-cell RNA sequencing may enable prediction of HPV involvement in carcinogenesis at other sites. We conducted single-cell RNA sequencing on keratinocytes from a mouse transgenic for the E7 gene of HPV16 and showed sensitive and specific detection of HPV16-E7 mRNA, predominantly in basal keratinocytes. We showed that increased E7 mRNA copy number per cell was associated with increased expression of E7 induced genes. This technique enhances detection of active viral transcription in solid tissue and may clarify possible linkage of HPV infection to development of squamous cell carcinoma.

Publication

Predicting Sensation Seeking From Dopamine Genes

Publisher: SAGE Publications

Date: 26-01-2011

DOI: 10.1177/0956797610397669

Publication

A single-cell tumor immune atlas for precision oncology

Publisher: Cold Spring Harbor Laboratory

Date: 21-09-2021

DOI: 10.1101/GR.273300.120

Abstract: The tumor immune microenvironment is a main contributor to cancer progression and a promising therapeutic target for oncology. However, immune microenvironments vary profoundly between patients, and biomarkers for prognosis and treatment response lack precision. A comprehensive compendium of tumor immune cells is required to pinpoint predictive cellular states and their spatial localization. We generated a single-cell tumor immune atlas, jointly analyzing published data sets of ,000 cells from 217 patients and 13 cancer types, providing the basis for a patient stratification based on immune cell compositions. Projecting immune cells from external tumors onto the atlas facilitated an automated cell annotation system. To enable in situ mapping of immune populations for digital pathology, we applied SPOTlight, combining single-cell and spatial transcriptomics data and identifying colocalization patterns of immune, stromal, and cancer cells in tumor sections. We expect the tumor immune cell atlas, together with our versatile toolbox for precision oncology, to advance currently applied stratification approaches for prognosis and immunotherapy.

Publication

The low EOMES/TBX21 molecular phenotype in multiple sclerosis reflects CD56+ cell dysregulation and is affected by immunomodulatory therapies

Publisher: Elsevier BV

Date: 02-2016

DOI: 10.1016/J.CLIM.2015.12.015

Abstract: Multiple Sclerosis (MS) is an autoimmune disease treated by therapies targeting peripheral blood cells. We previously identified that expression of two MS-risk genes, the transcription factors EOMES and TBX21 (ET), was low in blood from MS and stable over time. Here we replicated the low ET expression in a new MS cohort (p<0.0007 for EOMES, p<0.028 for TBX21) and demonstrate longitudinal stability (p<10^-4) and high heritability (h^2=0.48 for EOMES) for this molecular phenotype. Genes whose expression correlated with ET, especially those controlling cell migration, further defined the phenotype. CD56+ cells and other subsets expressed lower levels of Eomes or T-bet protein and/or were under-represented in MS. EOMES and TBX21 risk SNP genotypes, and serum EBNA-1 titres were not correlated with ET expression, but HLA-DRB1*1501 genotype was. ET expression was normalised to healthy control levels with natalizumab, and was highly variable for glatiramer acetate, fingolimod, interferon-beta and dimethyl fumarate.

Publication

Genome-wide association analyses of risk tolerance and risky behaviors in over 1 million individuals identify hundreds of loci and shared genetic influences

Publisher: Springer Science and Business Media LLC

Date: 14-01-2019

DOI: 10.1038/S41588-018-0309-3

Publication

A single-cell and spatially resolved atlas of human breast cancers

Publisher: Springer Science and Business Media LLC

Date: 09-2021

DOI: 10.1038/S41588-021-00911-1

Publication

Comprehensive Multiple eQTL Detection and Its Application to GWAS Interpretation

Publisher: Oxford University Press (OUP)

Date: 22-05-2019

DOI: 10.1534/GENETICS.119.302091

Abstract: Expression QTL (eQTL) detection has emerged as an important tool for unraveling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies are predicated on the assumption that only a single causal variant explains the association signal in each interval. This greatly simplifies the statistical modeling, but is liable to biases in scenarios where multiple local causal-variants are responsible. Here, our primary goal was to address the prevalence of secondary cis-eQTL signals regulating peripheral blood gene expression locally, utilizing two large human cohort studies, each & samples with accompanying whole genome genotypes. The CAGE (Consortium for the Architecture of Gene Expression) dataset is a compendium of Illumina microarray studies, and the Framingham Heart Study is a two-generation Affymetrix dataset. We also describe Bayesian colocalization analysis of the extent of sharing of cis-eQTL detected in both studies as well as with the BIOS RNAseq dataset. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ∼40% of over 3500 eGenes in both microarray datasets, and that the number of loci with additional signals reduces by approximately two-thirds with each conditioning step. Although & % of the peak signals across platforms fine map to the same credible interval, the colocalization analysis finds that as many as 50–60% of the primary eQTL are actually shared. Subsequently, colocalization of eQTL signals with GWAS hits detected 1349 genes whose expression in peripheral blood is associated with 591 human phenotype traits or diseases, including enrichment for genes with regulatory functions. At least 10%, and possibly as many as 40%, of eQTL-trait colocalized signals are due to nonprimary cis-eQTL peaks, but just one-quarter of these colocalization signals replicated across the gene expression datasets. 
Our results are provided as a web-based resource for visualization of multi-site regulation of gene expression and its association with human complex traits and disease states.

Publication

DevKidCC allows for robust classification and direct comparisons of kidney organoid datasets

Publisher: Springer Science and Business Media LLC

Date: 22-02-2022

DOI: 10.1186/S13073-022-01023-Z

Abstract: While single-cell transcriptional profiling has greatly increased our capacity to interrogate biology, accurate cell classification within and between datasets is a key challenge. This is particularly so in pluripotent stem cell-derived organoids which represent a model of a developmental system. Here, clustering algorithms and selected marker genes can fail to accurately classify cellular identity while variation in analyses makes it difficult to meaningfully compare datasets. Kidney organoids provide a valuable resource to understand kidney development and disease. However, direct comparison of relative cellular composition between protocols has proved challenging. Hence, an unbiased approach for classifying cell identity is required. The R package, scPred, was trained on multiple single cell RNA-seq datasets of human fetal kidney. A hierarchical model classified cellular subtypes into nephron, stroma and ureteric epithelial elements. This model, provided in the R package DevKidCC (github.com/KidneyRegeneration/DevKidCC), was then used to predict relative cell identity within published kidney organoid datasets generated using distinct cell lines and differentiation protocols, interrogating the impact of such variations. The package contains custom functions for the display of differential gene expression within cellular subtypes. DevKidCC was used to directly compare between distinct kidney organoid protocols, identifying differences in relative proportions of cell types at all hierarchical levels of the model and highlighting variations in stromal and unassigned cell types, nephron progenitor prevalence and relative maturation of individual epithelial segments. Of note, DevKidCC was able to distinguish distal nephron from ureteric epithelium, cell types with overlapping profiles that have previously confounded analyses. 
When applied to a variation in protocol via the addition of retinoic acid, DevKidCC identified a consequential depletion of nephron progenitors. The application of DevKidCC to kidney organoids reproducibly classifies component cellular identity within distinct single-cell datasets. The application of the tool is summarised in an interactive Shiny application, as are examples of the utility of in-built functions for data presentation. This tool will enable the consistent and rapid comparison of kidney organoid protocols, driving improvements in patterning to kidney endpoints and validating new approaches.

Publication

An experimental comparison of the Digital Spatial Profiling and Visium spatial transcriptomics technologies for cancer research

Publisher: Cold Spring Harbor Laboratory

Date: 06-04-2023

DOI: 10.1101/2023.04.06.535805

Abstract: Spatial transcriptomic technologies are powerful tools for resolving the spatial heterogeneity of gene expression in tissue samples. However, little evidence exists on relative strengths and weaknesses of the various available technologies for profiling human tumour tissue. In this study, we aimed to provide an objective assessment of two common spatial transcriptomics platforms, 10X Genomics’ Visium and Nanostring’s GeoMx DSP. The abilities of the DSP and Visium platforms to profile transcriptomic features were compared using matching cell line and primary breast cancer tissue samples. A head-to-head comparison was conducted using data generated from matching samples and synthetic tissue references. Platform specific features were also assessed according to manufacturers’ recommendations to evaluate the optimal usage of the two technologies. We identified substantial variations in assay design between the DSP and Visium assays such as transcriptomic coverage and composition of the transcripts detected. When the data was standardised according to manufacturers’ recommendations, the DSP platform was more sensitive in gene expression detection. However, its specificity was diminished by the presence of non-specific detection. Our results also confirmed the strength and weakness of each platform in characterising spatial transcriptomic features of tissue samples, in particular their application to hypothesis generation versus hypothesis testing. In this study, we share our experience on both DSP and Visium technologies as end users. We hope this can guide future users to choose the most suitable platform for their research. In addition, this dataset can be used as an important resource for the development of new analysis tools.

Publication

Myocyte Specific Upregulation of ACE2 in Cardiovascular Disease: Implications for SARS-CoV-2 Mediated Myocarditis

Publisher: Ovid Technologies (Wolters Kluwer Health)

Date: 22-06-2020

DOI: 10.1161/CIRCULATIONAHA.120.047911

Publication

Seasonal Effects on Gene Expression

Publisher: Public Library of Science (PLoS)

Date: 29-05-2015

DOI: 10.1371/JOURNAL.PONE.0126995

Publication

Genetic control of gene expression in whole blood and lymphoblastoid cell lines is largely independent

Publisher: Cold Spring Harbor Laboratory

Date: 19-12-2012

DOI: 10.1101/GR.126540.111

Abstract: The degree to which the level of genetic variation for gene expression is shared across multiple tissues has important implications for research investigating the role of expression on the etiology of complex human traits and diseases. In the last few years, several studies have been published reporting the extent of overlap in expression quantitative trait loci (eQTL) identified in multiple tissues or cell types. Although these studies provide important information on the regulatory control of genes across tissues, their limited power means that they can typically only explain a small proportion of genetic variation for gene expression. Here, using expression data from monozygotic twins (MZ), we investigate the genetic control of gene expression in lymphoblastoid cell lines (LCL) and whole blood (WB). We estimate the genetic correlation that represents the combined effects of all causal loci across the whole genome and is a measure of the level of common genetic control of gene expression between the two RNA sources. Our results show that, when averaged across the genome, mean levels of genetic correlation for gene expression in LCL and WB samples are close to zero. We support our results with evidence from gene expression in an independent sample of LCL, T-cells, and fibroblasts. In addition, we provide evidence that housekeeping genes, which maintain basic cellular functions, are more likely to have high genetic correlations between the RNA sources than non-housekeeping genes, implying a relationship between the transcript function and the degree to which a gene has tissue-specific genetic regulatory control.

Publication

Village in a dish: a model system for population-scale hiPSC studies

Publisher: Cold Spring Harbor Laboratory

Date: 19-08-2021

DOI: 10.1101/2021.08.19.457030

Abstract: The mechanisms by which DNA alleles contribute to disease risk, drug response, and other human phenotypes are highly context-specific, varying across cell types and under different conditions. Human induced pluripotent stem cells (hiPSCs) are uniquely suited to study these context-dependent effects, but to do so requires cell lines from hundreds or potentially thousands of individuals. Village cultures, where multiple hiPSC lines are cultured and differentiated together in a single dish, provide an elegant solution for scaling hiPSC experiments to the necessary sample sizes required for population-scale studies. Here, we show the utility of village models, demonstrating how cells can be assigned back to a donor line using single cell sequencing, and addressing whether line-specific signaling alters the transcriptional profiles of companion lines in a village culture. We generated single cell RNA sequence data from hiPSC lines cultured independently (uni-culture) and in villages at three independent sites. We show that the transcriptional profiles of hiPSC lines are highly consistent between uni- and village cultures for both fresh (0.46 < R < 0.88) and cryopreserved samples (0.46 < R < 0.62). Using a mixed linear model framework, we estimate that the proportion of transcriptional variation across cells is predominantly due to donor effects, with minimal evidence of variation due to culturing in a village system. We demonstrate that the genetic, epigenetic or hiPSC line-specific effects on gene expression are consistent whether the lines are uni- or village-cultured (0.82 < R < 0.94). Finally, we identify the consistency in the landscape of cell states between uni- and village-culture systems. Collectively, we demonstrate that village methods can be effectively used to detect hiPSC line-specific effects including sensitive dynamics of cell states.

Publication

Transcriptomic and proteomic retinal pigment epithelium signatures of age-related macular degeneration

Publisher: Springer Science and Business Media LLC

Date: 26-07-2022

DOI: 10.1038/S41467-022-31707-4

Abstract: There are currently no treatments for geographic atrophy, the advanced form of age-related macular degeneration. Hence, innovative studies are needed to model this condition and prevent or delay its progression. Induced pluripotent stem cells generated from patients with geographic atrophy and healthy individuals were differentiated to retinal pigment epithelium. Integrating transcriptional profiles of 127,659 retinal pigment epithelium cells generated from 43 individuals with geographic atrophy and 36 controls with genotype data, we identify 445 expression quantitative trait loci in cis that are associated with disease status and specific to retinal pigment epithelium subpopulations. Transcriptomics and proteomics approaches identify molecular pathways significantly upregulated in geographic atrophy, including in mitochondrial functions, metabolic pathways and extracellular cellular matrix reorganization. Five significant protein quantitative trait loci that regulate protein expression in the retinal pigment epithelium and in geographic atrophy are identified, two of which share variants with cis-expression quantitative trait loci, including proteins involved in mitochondrial biology and neurodegeneration. Investigation of mitochondrial metabolism confirms mitochondrial dysfunction as a core constitutive difference of the retinal pigment epithelium from patients with geographic atrophy. This study uncovers important differences in retinal pigment epithelium homeostasis associated with geographic atrophy.

Publication

Transitioning single-cell genomics into the clinic

Publisher: Springer Science and Business Media LLC

Date: 31-05-2023

DOI: 10.1038/S41576-023-00613-W

Joseph Powell

Researcher

Related Links

Publications

Transcriptomics and single‐cell RNA‐sequencing

Genome-wide association study of intraocular pressure uncovers new pathways to glaucoma

Genetic and Nongenetic Variation Revealed for the Principal Components of Human Gene Expression

Constraints on eQTL Fine Mapping in the Presence of Multisite Local Regulation of Gene Expression

Evidence for mitochondrial genetic control of autosomal gene expression

A review of the development of tumor vasculature and its effects on the tumor microenvironment

SARS-CoV-2 Receptor ACE2 Is an Interferon-Stimulated Gene in Human Airway Epithelial Cells and Is Detected in Specific Cell Subsets across Tissues

Blood gene expression studies in migraine: Potential and caveats

C-reactive protein upregulates the whole blood expression of CD59 - an integrative analysis

Human population dispersal “Out of Africa” estimated from linkage disequilibrium and allele frequencies of SNPs

Genome-wide association analyses identify 143 risk variants and putative regulatory mechanisms for type 2 diabetes

Contribution of genetic variation to transgenerational inheritance of DNA methylation

A village in a dish model system for population-scale hiPSC studies

Dynamic ocean management: Defining and conceptualizing real-time management of the ocean

Septic Shock: A Genomewide Association Study and Polygenic Risk Score Analysis

Benchmarking of cell type deconvolution pipelines for transcriptomics data

Single‐Cell Immune Profiling in Coronary Artery Disease: The Role of State‐of‐the‐Art Immunophenotyping With Mass Cytometry in the Diagnosis of Atherosclerosis

Reconciling the analysis of IBD and IBS in complex trait studies

Trans-eQTLs identified in whole blood have limited influence on complex disease biology

Genotype-free demultiplexing of pooled single-cell RNA-seq

TNFAIP3 Reduction-of-Function Drives Female Infertility and CNS Inflammation

RAAS blockade, kidney disease, and expression of ACE2, the entry receptor for SARS-CoV-2, in kidney epithelial and endothelial cells

Single-cell genomics meets human genetics

Dynamics of human monocytes and airway macrophages during healthy aging and after transplant

Optimal use of regression models in genome‐wide association studies

Identifying gene targets for brain-related traits using transcriptomic and methylomic data from blood

The single-cell eQTLGen consortium

Autosomal genetic control of human gene expression does not differ across the sexes

Single cell eQTL analysis identifies cell type-specific genetic control of gene expression in fibroblasts and reprogrammed induced pluripotent stem cells

DNA methylation is required to maintain both DNA replication timing precision and 3D genome organization integrity

Systematic identification of trans eQTLs as putative drivers of known disease associations

Single cell analysis of lymphatic endothelial cell fate specification and differentiation during zebrafish development

scPred: accurate supervised method for cell-type classification from single-cell RNA-seq data

Single-Cell Profiling Identifies Key Pathways Expressed by iPSCs Cultured in Different Commercial Media

DropletQC: improved identification of empty droplets and damaged cells in single-cell RNA-seq data

Integrating single-cell genomics pipelines to discover mechanisms of stem cell differentiation

A single‐cell transcriptome atlas of the adult human retina

The Brisbane systems genetics study: Genetical genomics meets complex trait genetics

Large-scale cis- and trans-eQTL analyses identify thousands of genetic loci and polygenic scores that regulate blood gene expression

Biological insights from 108 schizophrenia-associated genetic loci

Genetic variation affects morphological retinal phenotypes extracted from UK Biobank optical coherence tomography images

Neonatal DNA methylation profile in human twins is specified by a complex interplay between intrauterine environmental and genetic factors, subject to tissue-specific influence

Mapping the dynamic genetic regulatory architecture of HLA genes at single-cell resolution

An integrated cell barcoding and computational analysis pipeline for scalable analysis of differentiation at single-cell resolution

Single cell RNA sequencing of stem cell-derived retinal ganglion cells

propeller: testing for differences in cell type proportions in single cell data

Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets

DIRC3-IGFBP5 is a shared genetic risk locus and therapeutic target for carpal tunnel syndrome and trigger finger

Single-Cell Transcriptional Profiling of Aortic Endothelium Identifies a Hierarchy from Endovascular Progenitors to Differentiated Cells

Distinct Brainstem and Forebrain Circuits Receiving Tracheal Sensory Neuron Inputs Revealed Using a Novel Conditional Anterograde Transsynaptic Viral Tracing System

Shared genetic control of expression and methylation in peripheral blood

Inference of the Genetic Architecture Underlying BMI and Height with the Use of 20,240 Sibling Pairs

No evidence that plasmablasts transdifferentiate into developing neutrophils in severe COVID‐19 disease

Retinal ganglion cell-specific genetic regulation in primary open angle glaucoma

Retinal ganglion cell-specific genetic regulation in primary open-angle glaucoma

Nebulosa recovers single-cell gene expression signals by kernel density estimation

Single-cell eQTL mapping identifies cell type-specific genetic control of autoimmune disease.

Genetic correlations reveal the shared genetic architecture of transcription in human peripheral blood

Gene transcripts associated with muscle strength: a CHARGE meta-analysis of 7,781 persons

Signatures of negative selection in the genetic architecture of human complex traits.

Single-cell RNA-seq of human induced pluripotent stem cells reveals cellular heterogeneity and cell state transitions between subpopulations

The genetic regulation of transcription in human endometrial tissue

scGPS: Determining Cell States and Global Fate Potential of Subpopulations

propeller: Testing for differences in cell type proportions in single cell data

Single Cell RNA Sequencing of stem cell-derived retinal ganglion cells

Transcriptomic and proteomic retinal pigment epithelium signatures of age-related macular degeneration

The Medical Genome Reference Bank contains whole genome and phenotype data of 2570 healthy elderly

Heritable defects in telomere and mitotic function selectively predispose to sarcomas

Hemani et al. reply

Cryopreservation of human cancers conserves tumour heterogeneity for single-cell multi-omics analysis

Another explanation for apparent epistasis

Comprehensive benchmarking of computational deconvolution of transcriptomics data

Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression

Testing Two Evolutionary Theories of Human Aging with DNA Methylation Data

A model of impaired Langerhans cell maturation associated with HPV induced epithelial hyperplasia