ORCID Profile
0000-0001-8401-0545
Current Organisation
University of North Carolina at Chapel Hill
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544795
Abstract: Supplementary Table 1. Sample Size of each participated study, by case-control status and genotype platform. Supplementary Table 2. Association between fourteen environmental factors and the risk of breast cancer. Supplementary Table 3. Interactions between genes and fourteen environmental factors.
Publisher: Life Science Alliance, LLC
Date: 17-01-2019
Abstract: Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility score, which provides a way to evaluate the reliability of transcript-level abundance estimates and the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that although most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.
Publisher: Cold Spring Harbor Laboratory
Date: 06-08-2021
DOI: 10.1101/2022.08.05.502985
Abstract: Deriving biological insights from genomic data commonly requires comparing attributes of selected genomic loci to a null set of loci. The selection of this null set is nontrivial, as it requires careful consideration of potential covariates, a problem that is exacerbated by the non-uniform distribution of genomic features including genes, enhancers, and transcription factor binding sites. Propensity score-based covariate matching methods allow selection of null sets from a pool of possible items while controlling for multiple covariates; however, existing packages do not operate on genomic data classes and can be slow for large data sets, making them difficult to integrate into genomic workflows. To address this, we developed matchRanges, a propensity score-based covariate matching method for the efficient and convenient generation of matched null ranges from a set of background ranges within the Bioconductor framework.
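The matching idea in this abstract can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated for every item (for example with a logistic regression on the covariates) and uses greedy nearest-neighbor matching without replacement. The function name `nearest_neighbor_match` and the toy scores are hypothetical; this is not the matchRanges API, only a generic illustration of propensity score matching.

```python
def nearest_neighbor_match(focal_scores, pool_scores):
    """Greedily match each focal item to the closest unused pool item.

    Both arguments are lists of precomputed propensity scores (assumed input).
    Returns one pool index per focal item; each pool item is used at most once.
    """
    if len(pool_scores) < len(focal_scores):
        raise ValueError("pool must be at least as large as the focal set")
    available = set(range(len(pool_scores)))
    matches = []
    for score in focal_scores:
        # Closest remaining pool item; ties are broken by the lower index.
        best = min(available, key=lambda i: (abs(pool_scores[i] - score), i))
        matches.append(best)
        available.remove(best)
    return matches


# Toy example with made-up scores.
focal = [0.2, 0.8]
pool = [0.1, 0.75, 0.25, 0.9]
matched = nearest_neighbor_match(focal, pool)
print(matched)
```

The matched pool items form a null set whose score distribution resembles that of the focal set, which is the property the abstract describes.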
Publisher: Cold Spring Harbor Laboratory
Date: 28-07-2018
DOI: 10.1101/378539
Abstract: Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility (JCC) score, which provides a way to evaluate the reliability of transcript-level abundance estimates as well as the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that while most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544801
Abstract: Quantile-Quantile plot (Q-Q plot) of the aMiSTi p-values for each set of the GxE interactions.
Publisher: Cold Spring Harbor Laboratory
Date: 18-01-2018
DOI: 10.1101/250126
Abstract: Dropout events in single-cell transcriptome sequencing (scRNA-seq) cause many transcripts to go undetected and induce an excess of zero read counts, leading to power issues in differential expression (DE) analysis. This has triggered the development of bespoke scRNA-seq DE methods to cope with zero inflation. Recent evaluations, however, have shown that dedicated scRNA-seq tools provide no advantage compared to traditional bulk RNA-seq tools. We introduce a weighting strategy, based on a zero-inflated negative binomial (ZINB) model, that identifies excess zero counts and generates gene and cell-specific weights to unlock bulk RNA-seq DE pipelines for zero-inflated data, boosting performance for scRNA-seq.
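One natural way to turn a ZINB fit into observation weights is to use the posterior probability that a count arose from the negative binomial component rather than from the excess-zero component. The sketch below assumes that definition and assumes the per-observation parameters (mean `mu`, dispersion `theta`, zero-inflation probability `pi`) have already been estimated; the function names are hypothetical and no fitting is shown.

```python
def nb_pmf_zero(mu, theta):
    """P(Y = 0) under a negative binomial with mean mu and size (dispersion) theta."""
    return (theta / (theta + mu)) ** theta


def zinb_weight(count, mu, theta, pi):
    """Posterior probability that `count` came from the NB component of a ZINB.

    Non-zero counts can only come from the NB component, so their weight is 1.
    For a zero, the weight is (1 - pi) * NB(0) / (pi + (1 - pi) * NB(0)).
    """
    if count > 0:
        return 1.0
    nb_zero = (1.0 - pi) * nb_pmf_zero(mu, theta)
    return nb_zero / (pi + nb_zero)


# A zero in a gene with a high expected count is probably an excess zero,
# so it receives a low weight (illustrative parameter values).
print(zinb_weight(count=0, mu=5.0, theta=1.0, pi=0.5))
print(zinb_weight(count=3, mu=5.0, theta=1.0, pi=0.5))
```

Weights of this kind can then be passed to a bulk RNA-seq method that accepts observation weights, so that likely excess zeros contribute less to the fit.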
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544801.V1
Abstract: Quantile-Quantile plot (Q-Q plot) of the aMiSTi p-values for each set of the GxE interactions.
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544798
Abstract: Funding and acknowledgements.
Publisher: American Association for Cancer Research (AACR)
Date: 08-04-2022
DOI: 10.1158/2767-9764.CRC-21-0119
Abstract: Genome-wide association studies (GWAS) have identified more than 200 susceptibility loci for breast cancer, but these variants explain less than a fifth of the disease risk. Although gene–environment interactions have been proposed to account for some of the remaining heritability, few studies have empirically assessed this. We obtained genotype and risk factor data from 46,060 cases and 47,929 controls of European ancestry from population-based studies within the Breast Cancer Association Consortium (BCAC). We built gene expression prediction models for 4,864 genes with a significant (P < 0.01) heritable component using the transcriptome and genotype data from the Genotype-Tissue Expression (GTEx) project. We leveraged predicted gene expression information to investigate the interactions between gene-centric genetic variation and 14 established risk factors in association with breast cancer risk, using a mixed-effects score test. After adjusting for number of tests using Bonferroni correction, no interaction remained statistically significant. The strongest interaction observed was between the predicted expression of the C13orf45 gene and age at first full-term pregnancy (P_GXE = 4.44 × 10^−6). In this transcriptome-informed genome-wide gene–environment interaction study of breast cancer, we found no strong support for the role of gene expression in modifying the associations between established risk factors and breast cancer risk. Our study suggests a limited role of gene–environment interactions in breast cancer risk.
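The Bonferroni statement can be checked with a short back-of-the-envelope calculation. The sketch assumes, without confirmation from the abstract, that every one of the 4,864 genes was tested against each of the 14 risk factors and that the family-wise significance level was 0.05; both are labeled as assumptions in the code.

```python
n_genes = 4864            # genes with prediction models (from the abstract)
n_risk_factors = 14       # established risk factors (from the abstract)
alpha = 0.05              # ASSUMED family-wise significance level

# ASSUMED test count: one interaction test per gene and risk factor pair.
n_tests = n_genes * n_risk_factors
threshold = alpha / n_tests

strongest_p = 4.44e-6     # strongest interaction reported in the abstract
is_significant = strongest_p < threshold

print(f"tests: {n_tests}, Bonferroni threshold: {threshold:.3g}")
print(f"strongest interaction significant: {is_significant}")
```

Under these assumptions the reported P-value is larger than the corrected threshold, which is consistent with the abstract's statement that no interaction remained significant.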
Publisher: Oxford University Press (OUP)
Date: 2022
Abstract: CTCF (CCCTC-binding factor) is an 11-zinc-finger DNA binding protein which regulates much of the eukaryotic genome’s 3D structure and function. The diversity of CTCF binding motifs has led to a fragmented landscape of CTCF binding data. We collected position weight matrices of CTCF binding motifs and defined strand-oriented CTCF binding sites in the human and mouse genomes, including the recent Telomere to Telomere and mm39 assemblies. We included selected experimentally determined and predicted CTCF binding sites, such as CTCF-bound cis-regulatory elements from SCREEN ENCODE. We recommend filtering strategies for CTCF binding motifs and demonstrate that liftOver is a viable alternative to convert CTCF coordinates between assemblies. Our comprehensive data resource and usage recommendations can serve to harmonize and strengthen the reproducibility of genomic studies utilizing CTCF binding data. ackages/CTCF. Companion website: dozmorovlab.github.io/CTCF/ Code to reproduce the analyses: ozmorovlab/CTCF.dev. Supplementary data are available at Bioinformatics Advances online.
Publisher: Public Library of Science (PLoS)
Date: 25-02-2020
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.C.6550751.V1
Abstract: Genome-wide association studies (GWAS) have identified more than 200 susceptibility loci for breast cancer, but these variants explain less than a fifth of the disease risk. Although gene–environment interactions have been proposed to account for some of the remaining heritability, few studies have empirically assessed this. We obtained genotype and risk factor data from 46,060 cases and 47,929 controls of European ancestry from population-based studies within the Breast Cancer Association Consortium (BCAC). We built gene expression prediction models for 4,864 genes with a significant (P < 0.01) heritable component using the transcriptome and genotype data from the Genotype-Tissue Expression (GTEx) project. We leveraged predicted gene expression information to investigate the interactions between gene-centric genetic variation and 14 established risk factors in association with breast cancer risk, using a mixed-effects score test. After adjusting for number of tests using Bonferroni correction, no interaction remained statistically significant. The strongest interaction observed was between the predicted expression of the C13orf45 gene and age at first full-term pregnancy (P_GXE = 4.44 × 10^−6). In this transcriptome-informed genome-wide gene–environment interaction study of breast cancer, we found no strong support for the role of gene expression in modifying the associations between established risk factors and breast cancer risk. Our study suggests a limited role of gene–environment interactions in breast cancer risk.
Publisher: Cold Spring Harbor Laboratory
Date: 13-09-2023
Publisher: Cold Spring Harbor Laboratory
Date: 05-09-2022
DOI: 10.1101/2022.09.02.506382
Abstract: bootRanges provides fast functions for generation of bootstrapped genomic ranges representing the null sets in enrichment analysis. We show that shuffling or permutation schemes may result in overly narrow null distributions of test statistics, while creating new range sets with a block bootstrap preserves local genomic correlation structure and generates more reliable null distributions. It can also be used in more complex analyses, such as assessing correlations between cis-regulatory elements (CREs) and genes across cell types or providing optimized thresholds, e.g. log fold change (logFC) from differential analysis. The bootRanges functions are available in the R/Bioconductor package nullranges at ackages/nullranges.
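The contrast between shuffling and a block bootstrap can be made concrete with a generic sketch. It resamples a one-dimensional sequence of values in contiguous blocks, so that neighboring values stay together; this is a simplified illustration of the block bootstrap idea, not the nullranges implementation, which operates on genomic ranges. The function name `block_bootstrap` and the toy data are hypothetical.

```python
import random


def block_bootstrap(values, block_length, rng):
    """Resample `values` by concatenating randomly chosen contiguous blocks.

    Blocks are drawn with replacement; the result has the same length as the
    input, and values within a block keep their original neighbors.
    """
    n = len(values)
    if not 1 <= block_length <= n:
        raise ValueError("block_length must be between 1 and len(values)")
    resampled = []
    while len(resampled) < n:
        start = rng.randrange(n - block_length + 1)
        resampled.extend(values[start:start + block_length])
    return resampled[:n]  # trim the last block if it overshoots


rng = random.Random(0)           # fixed seed for reproducibility
signal = list(range(20))         # toy data standing in for a genomic signal
sample = block_bootstrap(signal, block_length=5, rng=rng)
print(sample)
```

Repeating the resampling many times and recomputing a test statistic on each sample gives a null distribution that retains local correlation, which independent shuffling of individual values would destroy.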
Publisher: F1000 Research Ltd
Date: 29-02-2016
DOI: 10.12688/F1000RESEARCH.7563.2
Abstract: High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Various quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that the presence of differential isoform usage can lead to inflated false discovery rates in differential gene expression analyses on simple count matrices but that this can be addressed by incorporating offsets derived from transcript-level abundance estimates. We also show that the problem is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.
Publisher: Cold Spring Harbor Laboratory
Date: 25-09-2019
DOI: 10.1101/777888
Abstract: Correct annotation metadata is critical for reproducible and accurate RNA-seq analysis. When files are shared publicly or among collaborators with incorrect or missing annotation metadata, it becomes difficult or impossible to reproduce bioinformatic analyses from raw data. It also makes it more difficult to locate the transcriptomic features, such as transcripts or genes, in their proper genomic context, which is necessary for overlapping expression data with other datasets. We provide a solution in the form of an R/Bioconductor package tximeta that performs numerous annotation and metadata gathering tasks automatically on behalf of users during the import of transcript quantification files. The correct reference transcriptome is identified via a hashed checksum stored in the quantification output, and key transcript databases are downloaded and cached locally. The computational paradigm of automatically adding annotation metadata based on reference sequence checksums can greatly facilitate genomic workflows, by helping to reduce overhead during bioinformatic analyses, preventing costly bioinformatic mistakes, and promoting computational reproducibility. The tximeta package is available at ackages/tximeta.
Publisher: F1000 Research Ltd
Date: 30-12-2015
DOI: 10.12688/F1000RESEARCH.7563.1
Abstract: High-throughput sequencing of cDNA (RNA-seq) is used extensively to characterize the transcriptome of cells. Many transcriptomic studies aim at comparing either abundance levels or the transcriptome composition between given conditions, and as a first step, the sequencing reads must be used as the basis for abundance quantification of transcriptomic features of interest, such as genes or transcripts. Several different quantification approaches have been proposed, ranging from simple counting of reads that overlap given genomic regions to more complex estimation of underlying transcript abundances. In this paper, we show that gene-level abundance estimates and statistical inference offer advantages over transcript-level analyses, in terms of performance and interpretability. We also illustrate that while the presence of differential isoform usage can lead to inflated false discovery rates in differential expression analyses on simple count matrices and transcript-level abundance estimates improve the performance in simulated data, the difference is relatively minor in several real data sets. Finally, we provide an R package (tximport) to help users integrate transcript-level abundance estimates from common quantification pipelines into count-based statistical inference engines.
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.C.6550751
Abstract: Genome-wide association studies (GWAS) have identified more than 200 susceptibility loci for breast cancer, but these variants explain less than a fifth of the disease risk. Although gene–environment interactions have been proposed to account for some of the remaining heritability, few studies have empirically assessed this. We obtained genotype and risk factor data from 46,060 cases and 47,929 controls of European ancestry from population-based studies within the Breast Cancer Association Consortium (BCAC). We built gene expression prediction models for 4,864 genes with a significant (P < 0.01) heritable component using the transcriptome and genotype data from the Genotype-Tissue Expression (GTEx) project. We leveraged predicted gene expression information to investigate the interactions between gene-centric genetic variation and 14 established risk factors in association with breast cancer risk, using a mixed-effects score test. After adjusting for number of tests using Bonferroni correction, no interaction remained statistically significant. The strongest interaction observed was between the predicted expression of the C13orf45 gene and age at first full-term pregnancy (P_GXE = 4.44 × 10^−6). In this transcriptome-informed genome-wide gene–environment interaction study of breast cancer, we found no strong support for the role of gene expression in modifying the associations between established risk factors and breast cancer risk. Our study suggests a limited role of gene–environment interactions in breast cancer risk.
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544798.V1
Abstract: Funding and acknowledgements.
Publisher: American Association for Cancer Research (AACR)
Date: 04-04-2023
DOI: 10.1158/2767-9764.22544795.V1
Abstract: Supplementary Table 1. Sample Size of each participated study, by case-control status and genotype platform. Supplementary Table 2. Association between fourteen environmental factors and the risk of breast cancer. Supplementary Table 3. Interactions between genes and fourteen environmental factors.
Location: United States of America
No related grants have been discovered for Michael Love.