ORCID Profile
0000-0001-5165-4408
Current Organisation
Massachusetts General Hospital
Publisher: Cold Spring Harbor Laboratory
Date: 17-12-2021
DOI: 10.1101/2021.12.16.21267891
Abstract: Primary open-angle glaucoma (POAG) is a leading cause of irreversible blindness globally. There is disparity in POAG prevalence and manifestations across ancestries. We identify novel and unique genetics that underlie POAG risk in different ancestries by performing meta-analysis across 15 biobanks (of the Global Biobank Meta-analysis Initiative) with previously published multi-ancestry studies. 18 novel significant loci, three of which were ancestry-specific, and five sex-specific were identified. We performed gene-enrichment and transcriptome-wide association studies (TWAS), implicating vascular and cancer genes. A fifth of these genes are primary ciliary genes. Extensive statistical analysis of genes in the SIX6 and CDKN2B-AS1 loci (implicated in POAG, cardiovascular diseases and cancers) found interaction between SIX6 and causal variants in chr9p21.3, with expression effect on CDKN2A/B. We infer that some POAG risk variants may be ancestry-specific, sex-specific, or both. Our results further support the contribution of vascular, cancer, and primary cilia genes in POAG pathogenesis.
Publisher: Cold Spring Harbor Laboratory
Date: 02-12-2021
DOI: 10.1101/2021.11.30.21267108
Abstract: Asthma is a complex disease that affects millions of people and varies in prevalence by an order of magnitude across geographic regions and populations. However, the extent to which genetic variation contributes to these disparities is unclear, as studies probing the genetics of asthma have been primarily limited to populations of European (EUR) descent. As part of the Global Biobank Meta-analysis Initiative (GBMI), we conducted the largest genome-wide association study of asthma to date (153,763 cases and 1,647,022 controls) via meta-analysis across 18 biobanks spanning multiple countries and ancestries. Altogether, we discovered 179 genome-wide significant loci (p < 5×10⁻⁸) associated with asthma, 49 of which are not previously reported. We replicate well-known associations such as IL1RL1 and STAT6, and find that overall the novel associations have smaller effects than previously-discovered loci, highlighting our substantial increase in statistical power. Despite the considerable range in prevalence of asthma among biobanks, from 3% to 24%, the genetic effects of associated loci are largely consistent across the biobanks and ancestries. To further investigate the polygenic architecture of asthma, we construct polygenic risk scores (PRS) using a multi-ancestry approach, which yields higher predictive power for asthma in non-EUR populations compared to PRS derived from previous asthma meta-analyses. Additionally, we find considerable genetic overlap between asthma age-of-onset subtypes, as well as between asthma and chronic obstructive pulmonary disease (COPD) but minimal overlap in enriched biological pathways. Our work underscores the multifactorial nature of asthma development and offers insight into the shared genetic architecture of asthma that may be differentially perturbed by environmental factors and contribute to variation in prevalence.
Publisher: Springer Science and Business Media LLC
Date: 25-11-2027
Publisher: Cold Spring Harbor Laboratory
Date: 20-03-2022
DOI: 10.1101/2022.03.16.22272457
Abstract: Meta-analysis is pervasively used to combine multiple genome-wide association studies (GWAS) into a more powerful whole. To resolve causal variants, meta-analysis studies typically apply summary statistics-based fine-mapping methods as they are applied to single-cohort studies. However, it is unclear whether heterogeneous characteristics of each cohort (e.g., ancestry, sample size, phenotyping, genotyping, or imputation) affect fine-mapping calibration and recall. Here, we first demonstrate that meta-analysis fine-mapping is substantially miscalibrated in simulations when different genotyping arrays or imputation panels are included. To mitigate these issues, we propose a summary statistics-based QC method, SLALOM, that identifies suspicious loci for meta-analysis fine-mapping by detecting outliers in association statistics based on ancestry-matched local LD structure. Having validated SLALOM performance in simulations and the GWAS Catalog, we applied it to 14 disease endpoints from the Global Biobank Meta-analysis Initiative and found that 67% of loci showed suspicious patterns that call into question fine-mapping accuracy. These predicted suspicious loci were significantly depleted for having likely causal variants, such as nonsynonymous variants, as a lead variant (2.7x, Fisher’s exact P = 7.3 × 10⁻⁴). Compared to fine-mapping results in individual biobanks, we found limited evidence of fine-mapping improvement in the GBMI meta-analyses. Although a full solution requires complete synchronization across cohorts, our approach identifies likely spurious results in meta-analysis fine-mapping. We urge extreme caution when interpreting fine-mapping results from meta-analysis.
Publisher: Cold Spring Harbor Laboratory
Date: 21-07-2022
DOI: 10.1101/2022.07.20.500802
Abstract: Most genome-wide association studies (GWAS) of major depression (MD) have been conducted in samples of European ancestry. Here we report a multi-ancestry GWAS of MD, adding data from 21 studies with 88,316 MD cases and 902,757 controls to previously reported data from individuals of European ancestry. This includes samples of African (36% of effective sample size), East Asian (26%) and South Asian (6%) ancestry and Hispanic/Latinx participants (32%). The multi-ancestry GWAS identified 190 significantly associated loci, 53 of them novel. For previously reported loci from GWAS in European ancestry the power-adjusted transferability ratio was 0.6 in the Hispanic/Latinx group and 0.3 in each of the other groups. Fine-mapping benefited from additional sample diversity: the number of credible sets with ≤5 variants increased from 3 to 12. A transcriptome-wide association study identified 354 significantly associated genes, 205 of them novel. Mendelian Randomisation showed a bidirectional relationship with BMI exclusively in samples of European ancestry. This first multi-ancestry GWAS of MD demonstrates the importance of large diverse samples for the identification of target genes and putative mechanisms.
Publisher: Elsevier BV
Date: 09-2020
Publisher: Elsevier BV
Date: 08-2016
Publisher: Springer Science and Business Media LLC
Date: 31-10-2019
DOI: 10.1038/S41467-019-12283-6
Abstract: In many species, the offspring of related parents suffer reduced reproductive success, a phenomenon known as inbreeding depression. In humans, the importance of this effect has remained unclear, partly because reproduction between close relatives is both rare and frequently associated with confounding social factors. Here, using genomic inbreeding coefficients (F_ROH) for >1.4 million individuals, we show that F_ROH is significantly associated (p < 0.0005) with apparently deleterious changes in 32 out of 100 traits analysed. These changes are associated with runs of homozygosity (ROH), but not with common variant homozygosity, suggesting that genetic variants associated with inbreeding depression are predominantly rare. The effect on fertility is striking: F_ROH equivalent to the offspring of first cousins is associated with a 55% decrease [95% CI 44–66%] in the odds of having children. Finally, the effects of F_ROH are confirmed within full-sibling pairs, where the variation in F_ROH is independent of all environmental confounding.
Publisher: Springer Science and Business Media LLC
Date: 12-03-2018
Publisher: Cold Spring Harbor Laboratory
Date: 03-02-2020
DOI: 10.1101/2020.02.02.20020065
Abstract: Blood cells play essential roles in human health, underpinning physiological processes such as immunity, oxygen transport, and clotting, which when perturbed cause a significant health burden. Here we integrate data from UK Biobank and a large-scale international collaborative effort, including 563,946 European ancestry participants, and discover 5,106 new genetic variants independently associated with 29 blood cell phenotypes covering the full allele frequency spectrum of variation impacting hematopoiesis. We holistically characterize the genetic architecture of hematopoiesis, assess the relevance of the omnigenic model to blood cell phenotypes, delineate relevant hematopoietic cell states influenced by regulatory genetic variants and gene networks, identify novel splice-altering variants mediating the associations, and assess the polygenic prediction potential for blood cell traits and clinical disorders at the interface of complex and Mendelian genetics. These results show the power of large-scale blood cell GWAS to interrogate clinically meaningful variants across the full allelic spectrum of human variation.
Publisher: Elsevier BV
Date: 09-2020
Publisher: Springer Science and Business Media LLC
Date: 10-2019
Publisher: Cold Spring Harbor Laboratory
Date: 30-12-2022
DOI: 10.1101/2022.12.29.522270
Abstract: Polygenic risk scores (PRS) developed from multi-ancestry genome-wide association studies (GWAS), PRS_multi, hold promise for improving PRS accuracy and generalizability across populations. To establish best practices for leveraging the increasing diversity of genomic studies, we investigated how various factors affect the performance of PRS_multi compared to PRS constructed from single-ancestry GWAS (PRS_single). Through extensive simulations and empirical analyses, we showed that PRS_multi overall outperformed PRS_single in understudied populations, except when the understudied population represented a small proportion of the multi-ancestry GWAS. Notably, for traits with large-effect ancestry-enriched variants, such as mean corpuscular volume, using substantially fewer samples from Biobank Japan achieved comparable accuracies to a much larger European cohort. Furthermore, integrating PRS based on local ancestry-informed GWAS and large-scale European-based PRS improved predictive performance in understudied African populations, especially for less polygenic traits with large ancestry-enriched variants. Our work highlights the importance of diversifying genomic studies to achieve equitable PRS performance across ancestral populations and provides guidance for developing PRS from multiple studies.
Publisher: Cold Spring Harbor Laboratory
Date: 21-11-2021
DOI: 10.1101/2021.11.18.21266545
Abstract: With the increasing availability of biobank-scale datasets that incorporate both genomic data and electronic health records, many associations between genetic variants and phenotypes of interest have been discovered. Polygenic risk scores (PRS), which are being widely explored in precision medicine, use the results of association studies to predict the genetic component of disease risk by accumulating risk alleles weighted by their effect sizes. However, few studies have thoroughly investigated best practices for PRS in global populations across different diseases. In this study, we utilize data from the Global-Biobank Meta-analysis Initiative (GBMI), which consists of individuals from diverse ancestries and across continents, to explore methodological considerations and PRS prediction performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRS using heuristic (pruning and thresholding, P+T) and Bayesian (PRS-CS) methods. We found that the genetic architecture, such as SNP-based heritability and polygenicity, varied greatly among endpoints. For both PRS construction methods, using a European ancestry LD reference panel resulted in comparable or higher prediction accuracy compared to several other non-European based panels; this is largely attributable to European descent populations still comprising the majority of GBMI participants. PRS-CS overall outperformed the classic P+T method, especially for endpoints with higher SNP-based heritability. For example, substantial improvements are observed in East-Asian ancestry (EAS) using PRS-CS compared to P+T for heart failure (HF) and chronic obstructive pulmonary disease (COPD). Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma which has known variation in disease prevalence across global populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using the GBMI and highlight the importance of best practices for PRS in the biobank-scale genomics era.
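Illustration (not from the paper): the abstract above describes a PRS as risk alleles accumulated and weighted by their effect sizes. A minimal Python sketch of that weighted sum follows. The function name, the dosage coding (0, 1 or 2 copies of the effect allele) and all numbers are hypothetical and chosen only for the example; this is not the P+T or PRS-CS procedure, which additionally select or re-weight variants.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages      -- copies of the effect allele per variant (assumed 0, 1 or 2)
    effect_sizes -- per-variant GWAS effect sizes, in the same variant order
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical effect sizes for three variants (illustrative values only).
betas = [0.5, -0.25, 1.0]
# Hypothetical genotype of one individual at those three variants.
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, betas)
print(score)  # 2*0.5 + 1*(-0.25) + 0*1.0 = 0.75
```

In practice such scores are usually computed for many individuals at once and standardised within a cohort before evaluation; the sketch covers only the core sum.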
Publisher: Springer Science and Business Media LLC
Date: 09-12-2021
Publisher: Springer Science and Business Media LLC
Date: 03-06-2019
DOI: 10.1038/S41588-019-0449-0
Abstract: An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Publisher: Elsevier BV
Date: 12-2022
Publisher: Springer Science and Business Media LLC
Date: 08-07-2021
DOI: 10.1038/S41586-021-03767-X
Abstract: The genetic make-up of an individual contributes to the susceptibility and response to viral infection. Although environmental, clinical and social factors have a role in the chance of exposure to SARS-CoV-2 and the severity of COVID-19 [1,2], host genetics may also be important. Identifying host-specific genetic factors may reveal biological mechanisms of therapeutic relevance and clarify causal relationships of modifiable environmental risk factors for SARS-CoV-2 infection and outcomes. We formed a global network of researchers to investigate the role of human genetics in SARS-CoV-2 infection and COVID-19 severity. Here we describe the results of three genome-wide association meta-analyses that consist of up to 49,562 patients with COVID-19 from 46 studies across 19 countries. We report 13 genome-wide significant loci that are associated with SARS-CoV-2 infection or severe manifestations of COVID-19. Several of these loci correspond to previously documented associations to lung or autoimmune and inflammatory diseases [3–7]. They also represent potentially actionable mechanisms in response to infection. Mendelian randomization analyses support a causal role for smoking and body-mass index for severe COVID-19 although not for type II diabetes. The identification of novel host genetic factors associated with COVID-19 was made possible by the community of human genetics researchers coming together to prioritize the sharing of data, results, resources and analytical frameworks. This working model of international collaboration underscores what is possible for future genetic discoveries in emerging pandemics, or indeed for any complex human disease.
Publisher: Springer Science and Business Media LLC
Date: 07-06-2021
DOI: 10.1038/S41467-021-23134-8
Abstract: The large majority of variants identified by GWAS are non-coding, motivating detailed characterization of the function of non-coding variants. Experimental methods to assess variants’ effect on gene expressions in native chromatin context via direct perturbation are low-throughput. Existing high-throughput computational predictors thus have lacked large gold standard sets of regulatory variants for training and validation. Here, we leverage a set of 14,807 putative causal eQTLs in humans obtained through statistical fine-mapping, and we use 6121 features to directly train a predictor of whether a variant modifies nearby gene expression. We call the resulting prediction the expression modifier score (EMS). We validate EMS by comparing its ability to prioritize functional variants with other major scores. We then use EMS as a prior for statistical fine-mapping of eQTLs to identify an additional 20,913 putatively causal eQTLs, and we incorporate EMS into co-localization analysis to identify 310 additional candidate genes across UK Biobank phenotypes.
Publisher: Springer Science and Business Media LLC
Date: 26-05-2023
No related grants have been discovered for Masahiro Kanai.