Demographic History and Genetic Adaptation in the Himalayan Region Inferred from Genome-Wide SNP Genotypes of 49 Populations.

Molecular biology and evolution | 2018

We genotyped 738 individuals belonging to 49 populations from Nepal, Bhutan, North India, or Tibet at over 500,000 SNPs, and analyzed the genotypes in the context of available worldwide population data in order to investigate the demographic history of the region and the genetic adaptations to the harsh environment. The Himalayan populations resembled other South and East Asians, but in addition displayed their own specific ancestral component and showed strong population structure and genetic drift. We also found evidence for multiple admixture events involving Himalayan populations and South/East Asians between 200 and 2,000 years ago. In comparisons with available ancient genomes, the Himalayans, like other East and South Asian populations, showed similar genetic affinity to Eurasian hunter-gatherers (a 24,000-year-old Upper Palaeolithic Siberian), and the related Bronze Age Yamnaya. The high-altitude Himalayan populations all shared a specific ancestral component, suggesting that genetic adaptation to life at high altitude originated only once in this region and subsequently spread. Combining four approaches to identifying specific positively selected loci, we confirmed that the strongest signals of high-altitude adaptation were located near the Endothelial PAS domain-containing protein 1 and Egl-9 Family Hypoxia Inducible Factor 1 loci, and discovered eight additional robust signals of high-altitude adaptation, five of which have strong biological functional links to such adaptation. In conclusion, the demographic history of Himalayan populations is complex, with strong local differentiation, reflecting both genetic and cultural factors; these populations also display evidence of multiple genetic adaptations to high-altitude environments.

PubMed ID: 29796643

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

  • Agency: Wellcome Trust, United Kingdom
  • Agency: Wellcome Trust, United Kingdom
    Id: 098051
  • Agency: Wellcome Trust, United Kingdom
    Id: 087576

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


VARIANT (tool)

RRID:SCR_005194

Analysis tool that reports the functional properties of any variant in all human, mouse or rat genes (new model organisms are to be added) and in the corresponding neighborhoods. Other non-coding extra-genic regions, such as miRNAs, are also included in the analysis. It not only reports the obvious functional effects in the coding regions but also analyzes noncoding SNVs, situated both within the gene and in its neighborhood, that could affect regulatory motifs, splicing signals, and other structural elements. These include Jaspar regulatory motifs, miRNA targets, splice sites, exonic splicing silencers, and calculations of selective pressures on the particular polymorphic positions. Software pipelines used in the analysis of NGS data are highly modular, heterogeneous, and rapidly evolving. VARIANT can easily be incorporated into an NGS resequencing pipeline, either as a CLI or invoked as a web service. It takes input directly from the most widely used programs for SNV detection.

1000 Genomes Project and AWS (tool)

RRID:SCR_008801

A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, which researchers can now access on AWS free of charge for use in disease research. The data is available to all via Amazon S3 at: http://s3.amazonaws.com/1000genomes The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year. Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the scalable compute resources needed to take advantage of these large data collections. AWS stores the public data sets at no charge to the community; researchers pay only for the additional AWS resources they need for further processing or analysis of the data. All 200 TB of the latest 1000 Genomes Project data is held in a public Amazon S3 bucket. You can access the data via simple HTTP requests, or use the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to work with this data without the usual capital investment required at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Because the data sits in an Amazon S3 bucket, customers can also process it using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
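
The "simple HTTP requests" route mentioned above might look like the following minimal Python sketch. It only builds URLs against the bucket address given in the description and parses an S3 bucket-listing response. The object keys and the sample listing are hypothetical placeholders, and whether the bucket still permits anonymous listing over HTTP is an assumption that should be checked before relying on it.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET_URL = "http://s3.amazonaws.com/1000genomes"  # address quoted in the description
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}  # namespace of S3 listing XML


def object_url(key):
    """Build the plain-HTTP URL for one object key in the bucket."""
    return BUCKET_URL + "/" + urllib.parse.quote(key)


def listing_url(prefix, max_keys=100):
    """Build a bucket-listing URL restricted to keys that start with `prefix`."""
    query = urllib.parse.urlencode({"prefix": prefix, "max-keys": max_keys})
    return BUCKET_URL + "/?" + query


def parse_listing(xml_text):
    """Extract (key, size_in_bytes) pairs from an S3 bucket-listing response."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("s3:Key", namespaces=S3_NS),
         int(item.findtext("s3:Size", namespaces=S3_NS)))
        for item in root.findall("s3:Contents", S3_NS)
    ]


def fetch_listing(prefix):
    """Fetch and parse one page of the listing (needs network access; not called here)."""
    with urllib.request.urlopen(listing_url(prefix)) as response:
        return parse_listing(response.read().decode("utf-8"))


# Hypothetical listing used to exercise the parser offline; the keys are made up.
SAMPLE_LISTING = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>1000genomes</Name>
  <Contents><Key>hypothetical/a.txt</Key><Size>12</Size></Contents>
  <Contents><Key>hypothetical/b.txt</Key><Size>34</Size></Contents>
</ListBucketResult>"""

if __name__ == "__main__":
    for key, size in parse_listing(SAMPLE_LISTING):
        print(object_url(key), size)
```

For anything beyond browsing, an AWS SDK is likely the more robust choice, since it handles pagination and retries that this sketch leaves out.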

Phyre (tool)

RRID:SCR_010270

A structure prediction system to reliably detect remote homologies.

ADMIXTURE (tool)

RRID:SCR_001263

A software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm. It uses a block relaxation approach to alternately update allele frequency and ancestry fraction parameters. Each block update is handled by solving a large number of independent convex optimization problems, which are tackled using a fast sequential quadratic programming algorithm. Convergence of the algorithm is accelerated using a novel quasi-Newton acceleration method.
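
To make the statistical model concrete, here is a minimal Python sketch of the likelihood that STRUCTURE-style methods maximize: each genotype (0, 1 or 2 copies of an allele) is treated as two draws with probability p_ij = sum_k q_ik * f_kj, where q holds ancestry fractions and f holds population allele frequencies. This is not ADMIXTURE's algorithm. In place of block relaxation with sequential quadratic programming and quasi-Newton acceleration, the sketch substitutes the much simpler (and slower) EM updates for the same likelihood. The function names, the toy genotypes and the clamping constant are all assumptions made for illustration.

```python
import math
import random

EPS = 1e-12  # assumed guard that keeps probabilities strictly inside (0, 1)

# Toy data (made up): 5 individuals x 4 SNPs, entries count copies of allele 1.
GENOTYPES = [[0, 1, 2, 0], [0, 0, 2, 1], [2, 1, 0, 2], [2, 2, 0, 1], [1, 1, 1, 1]]


def allele1_prob(q_i, f, j):
    """p_ij = sum_k q_ik * f_kj, clamped away from 0 and 1."""
    p = sum(q_ik * f_k[j] for q_ik, f_k in zip(q_i, f))
    return min(max(p, EPS), 1.0 - EPS)


def log_likelihood(g, q, f):
    """Binomial log-likelihood of all genotypes, up to an additive constant."""
    total = 0.0
    for g_i, q_i in zip(g, q):
        for j, g_ij in enumerate(g_i):
            p = allele1_prob(q_i, f, j)
            total += g_ij * math.log(p) + (2 - g_ij) * math.log(1.0 - p)
    return total


def init_params(n, m, k, seed=0):
    """Random starting point: rows of q sum to 1, entries of f lie in (0.1, 0.9)."""
    rng = random.Random(seed)
    q = []
    for _ in range(n):
        raw = [rng.uniform(0.1, 1.0) for _ in range(k)]
        q.append([r / sum(raw) for r in raw])
    f = [[rng.uniform(0.1, 0.9) for _ in range(m)] for _ in range(k)]
    return q, f


def em_step(g, q, f):
    """One EM update of ancestry fractions q and allele frequencies f."""
    n, m, k = len(g), len(g[0]), len(f)
    num = [[0.0] * m for _ in range(k)]  # expected allele-1 copies per population
    den = [[0.0] * m for _ in range(k)]  # expected total copies per population
    new_q = [[0.0] * k for _ in range(n)]
    for i in range(n):
        for j in range(m):
            p = allele1_prob(q[i], f, j)
            for a in range(k):
                c1 = g[i][j] * q[i][a] * f[a][j] / p
                c0 = (2 - g[i][j]) * q[i][a] * (1.0 - f[a][j]) / (1.0 - p)
                num[a][j] += c1
                den[a][j] += c1 + c0
                new_q[i][a] += (c1 + c0) / (2.0 * m)
    new_f = [[num[a][j] / den[a][j] for j in range(m)] for a in range(k)]
    return new_q, new_f


if __name__ == "__main__":
    q, f = init_params(len(GENOTYPES), len(GENOTYPES[0]), k=2)
    for step in range(50):
        q, f = em_step(GENOTYPES, q, f)
    print("log-likelihood:", log_likelihood(GENOTYPES, q, f))
    for row in q:
        print([round(x, 3) for x in row])
```

Each row of q printed at the end is one individual's estimated ancestry fractions for K = 2. On real data with hundreds of thousands of SNPs, plain EM of this kind converges slowly, which is the cost the faster optimization described above is meant to avoid.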

PLINK (tool)

RRID:SCR_001757

Open source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. Used for analysis of genotype/phenotype data. Through integration with gPLINK and Haploview, there is some support for subsequent visualization, annotation and storage of results. PLINK 1.9 is the improved, second generation of the software.

Haploview (tool)

RRID:SCR_003076

A Java based software tool designed to simplify and expedite the process of haplotype analysis by providing a common interface to several tasks relating to such analyses. Haploview currently allows users to examine block structures, generate haplotypes in these blocks, run association tests, and save the data in a number of formats. All functionalities are highly customizable. (entry from Genetic Analysis Software)

  • LD & haplotype block analysis
  • haplotype population frequency estimation
  • single SNP and haplotype association tests
  • permutation testing for association significance
  • implementation of Paul de Bakker's Tagger tag SNP selection algorithm
  • automatic download of phased genotype data from HapMap
  • visualization and plotting of PLINK whole genome association results including advanced filtering options

Haploview is fully compatible with data dumps from the HapMap project and the Perlegen Genotype Browser. It can analyze thousands of SNPs (tens of thousands in command line mode) in thousands of individuals. Note: Haploview is currently on a development and support freeze. The team is currently looking at a variety of options in order to provide support for the software. Haploview is an open source project hosted by SourceForge. The source can be downloaded at the SourceForge project site.

Eigensoft (tool)

RRID:SCR_004965

EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes. Source code, documentation and executables for using EIGENSOFT 3.0 on a Linux platform can be downloaded. New features of EIGENSOFT 3.0 include supporting either 32-bit or 64-bit Linux machines, a utility to merge different data sets, a utility to identify related samples (accounting for population structure), and supporting multiple file formats for EIGENSTRAT stratification correction.
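
The idea of placing individuals along "continuous axes of variation" can be illustrated with a small Python sketch of the principal components step alone. It is not EIGENSOFT code and does not perform the EIGENSTRAT association correction. The frequency-based scaling of each SNP, the use of power iteration to find the leading axis, the function names and the toy genotypes are all assumptions chosen to keep the example short.

```python
import math
import random

# Toy data (made up): 6 individuals x 5 SNPs, entries count copies of one allele.
# The first three rows and the last three rows mimic two differentiated groups.
TOY_GENOTYPES = [
    [0, 0, 1, 0, 2],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 0, 2],
    [2, 2, 1, 2, 0],
    [2, 1, 2, 2, 1],
    [1, 2, 2, 2, 0],
]


def normalize(genotypes):
    """Center each SNP and scale it by sqrt(p * (1 - p)), with p the allele frequency."""
    n = len(genotypes)
    columns = []
    for j in range(len(genotypes[0])):
        col = [row[j] for row in genotypes]
        mean = sum(col) / n
        p = mean / 2.0
        if p <= 0.0 or p >= 1.0:
            continue  # a monomorphic SNP carries no information about structure
        scale = math.sqrt(p * (1.0 - p))
        columns.append([(g - mean) / scale for g in col])
    return [list(row) for row in zip(*columns)]  # back to individuals x SNPs


def top_component(x, iterations=200, seed=0):
    """Leading eigenvector of the individual-by-individual matrix, by power iteration."""
    n = len(x)
    gram = [[sum(a * b for a, b in zip(x[i], x[k])) for k in range(n)] for i in range(n)]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        w = [sum(gram[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v  # one coordinate per individual along the first axis of variation


if __name__ == "__main__":
    for label, score in zip("AAABBB", top_component(normalize(TOY_GENOTYPES))):
        print(label, round(score, 3))
```

In this toy example the two groups fall on opposite sides of zero along the first axis. In a stratification correction, coordinates like these would then be used to adjust genotypes and phenotypes before testing for association.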

STRING (tool)

RRID:SCR_005223

Database of known and predicted protein interactions. The interactions include direct (physical) and indirect (functional) associations and are derived from four sources: Genomic Context, High-throughput experiments, (Conserved) Coexpression, and previous knowledge. STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable. The database currently covers 5,214,234 proteins from 1133 organisms. (2013)
