Ancestry Prediction Comparisons of Different AISNPs for Five Continental Populations and Population Structure Dissection of the Xinjiang Hui Group via a Self-Developed Panel.

Genes | 2020

Ancestry informative markers are genetic markers that show distinct genetic divergence among populations. They can be used to discern population substructure and to estimate the ancestral origins of unknown individuals. Previously, we developed a multiplex system of 30 ancestry informative single nucleotide polymorphism (AISNP) loci to facilitate ancestry inference in different continental populations. In the current study, we first compared the ancestry resolution of the 30 AISNPs with that of previously reported AISNP panels for African, European, East Asian, South Asian and American populations. Next, the genetic components of the Xinjiang Hui group were explored in comparison with these continental populations based on the 30 AISNPs. Genetic divergence analyses of the 30 AISNPs in the five continental populations revealed that most of the AISNPs showed high genetic differentiation between these populations. Comparisons of ancestry analyses revealed that the 30 AISNPs had efficiency comparable to that of other published AISNP panels. Genetic relationship analyses demonstrated that the Hui group had close genetic affinities with East Asian populations and might share genetic ancestry with them. Overall, the 30 AISNPs can be used to predict the bio-geographical origins of different continental populations. Moreover, the genetic data obtained for the 30 AISNPs in the Hui group enrich the extant reference data for ancestry analyses of the Hui group.

PubMed ID: 32375366

Associated grants

None

Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


ADMIXTURE (tool)

RRID:SCR_001263

A software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm. It uses a block relaxation approach to alternately update allele frequency and ancestry fraction parameters. Each block update is handled by solving a large number of independent convex optimization problems, which are tackled using a fast sequential quadratic programming algorithm. Convergence of the algorithm is accelerated using a novel quasi-Newton acceleration method.
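The alternating update of allele frequencies and ancestry fractions can be illustrated with a small sketch. It assumes the binomial admixture model in which individual i carries g copies (0, 1 or 2) of the counted allele at locus j with per-copy probability sum_k Q[i][k] * F[k][j]. The updates below are plain EM steps, not ADMIXTURE's block relaxation with sequential quadratic programming or its quasi-Newton acceleration; the function names and the `EPS` clamp are hypothetical choices made for this example.

```python
import math

EPS = 1e-6  # assumption: keeps allele frequencies off 0 and 1 so logs stay finite


def log_likelihood(G, Q, F):
    """Log-likelihood (up to a constant) of genotype counts G given Q and F."""
    K = len(F)
    total = 0.0
    for i, row in enumerate(G):
        for j, g in enumerate(row):
            p = sum(Q[i][k] * F[k][j] for k in range(K))
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


def em_step(G, Q, F):
    """One EM update of ancestry fractions Q (I x K) and allele frequencies F (K x J)."""
    I, J, K = len(G), len(G[0]), len(F)
    q_new = [[0.0] * K for _ in range(I)]
    counted = [[0.0] * J for _ in range(K)]  # expected counted-allele copies from population k
    copies = [[0.0] * J for _ in range(K)]   # expected allele copies of either type from population k
    for i in range(I):
        for j in range(J):
            g = G[i][j]
            p1 = sum(Q[i][k] * F[k][j] for k in range(K))
            p0 = 1.0 - p1
            for k in range(K):
                a = g * Q[i][k] * F[k][j] / p1
                b = (2 - g) * Q[i][k] * (1.0 - F[k][j]) / p0
                q_new[i][k] += (a + b) / (2 * J)
                counted[k][j] += a
                copies[k][j] += a + b
    f_new = [
        [
            min(max(counted[k][j] / copies[k][j], EPS), 1.0 - EPS)
            if copies[k][j] > 0 else F[k][j]
            for j in range(J)
        ]
        for k in range(K)
    ]
    return q_new, f_new


# Toy data: 4 individuals, 3 loci, K = 2 ancestral populations.
G = [[2, 2, 0], [2, 1, 0], [0, 0, 2], [0, 1, 2]]
Q = [[0.6, 0.4], [0.55, 0.45], [0.4, 0.6], [0.45, 0.55]]
F = [[0.7, 0.6, 0.3], [0.3, 0.4, 0.7]]
for _ in range(50):
    Q, F = em_step(G, Q, F)
```

Each step leaves every row of Q summing to one and does not decrease the log-likelihood. EM of this kind tends to converge slowly, which is the kind of cost the faster optimizer described above is meant to avoid.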


GATK (tool)

RRID:SCR_001876

A software package to analyze next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as a strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. It is both a software library that makes it easy to write efficient analysis tools for next-generation sequencing data and a suite of tools for working with human medical resequencing projects such as 1000 Genomes and The Cancer Genome Atlas. These tools include depth-of-coverage analyzers, a quality score recalibrator, a SNP/indel caller and a local realigner. (entry from Genetic Analysis Software)


Haploview (tool)

RRID:SCR_003076

A Java-based software tool designed to simplify and expedite the process of haplotype analysis by providing a common interface to several tasks relating to such analyses. Haploview currently allows users to examine block structures, generate haplotypes in these blocks, run association tests, and save the data in a number of formats. All functionalities are highly customizable. (entry from Genetic Analysis Software)

* LD & haplotype block analysis
* haplotype population frequency estimation
* single SNP and haplotype association tests
* permutation testing for association significance
* implementation of Paul de Bakker's Tagger tag SNP selection algorithm
* automatic download of phased genotype data from HapMap
* visualization and plotting of PLINK whole genome association results including advanced filtering options

Haploview is fully compatible with data dumps from the HapMap project and the Perlegen Genotype Browser. It can analyze thousands of SNPs (tens of thousands in command line mode) in thousands of individuals.

Note: Haploview is currently on a development and support freeze. The team is currently looking at a variety of options in order to provide support for the software. Haploview is an open source project hosted by SourceForge. The source can be downloaded at the SourceForge project site.


Picard (tool)

RRID:SCR_006525

Java toolset for working with next generation sequencing data in the BAM format.


Promega (tool)

RRID:SCR_006724

An antibody supplier.


VARSCAN (tool)

RRID:SCR_006849

A platform-independent, technology-independent software tool for identifying SNPs and indels in massively parallel sequencing of individual and pooled samples. Given data for a single sample, VarScan identifies and filters germline variants based on read counts, base quality, and allele frequency. Given data for a tumor-normal pair, VarScan also determines the somatic status of each variant (Germline, Somatic, or LOH) by comparing read counts between samples. (entry from Genetic Analysis Software)
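As a rough illustration of calling somatic status from read counts, the sketch below compares reference and variant read counts in a normal and a tumor sample. The description does not say which statistic is used for the comparison; Fisher's exact test on the 2x2 table is an assumption made here. The function names, the `min_var_freq` and `alpha` thresholds, and the simplified decision rules are hypothetical and are not taken from VarScan itself.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def classify_variant(normal_ref, normal_var, tumor_ref, tumor_var,
                     min_var_freq=0.2, alpha=0.05):
    """Label a variant Germline, Somatic, LOH or Unknown (illustrative thresholds)."""
    normal_depth = normal_ref + normal_var
    tumor_depth = tumor_ref + tumor_var
    if normal_depth == 0 or tumor_depth == 0:
        return "Unknown"
    normal_freq = normal_var / normal_depth
    tumor_freq = tumor_var / tumor_depth
    p = fisher_exact_two_sided(normal_ref, normal_var, tumor_ref, tumor_var)
    differs = p < alpha
    if normal_freq >= min_var_freq:
        # The variant is already present in the normal sample.
        return "LOH" if differs else "Germline"
    if tumor_freq >= min_var_freq and differs:
        return "Somatic"
    return "Unknown"
```

For example, `classify_variant(30, 0, 15, 15)` describes a site with no variant reads in the normal sample and half variant reads in the tumor. A production caller would also filter on base quality, strand and coverage, which this sketch omits.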


1000 Genomes Project and AWS (tool)

RRID:SCR_008801

A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, which researchers can access on AWS free of charge for use in disease research. The data is available to all in a public Amazon S3 bucket: http://s3.amazonaws.com/1000genomes

The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year.

Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide the scalable compute resources needed to work with these large data collections. AWS stores the public data sets at no charge to the community; researchers pay only for the additional AWS resources they need for further processing or analysis of the data. You can access the data via simple HTTP requests, or through the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use Amazon EC2 to work with this data without the usual capital investment required at this scale.

AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Because the data sits in an Amazon S3 bucket, customers can also process it using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
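Access over plain HTTP can be sketched with the Python standard library alone. The example assumes the bucket still permits anonymous listing through the S3 REST interface (`list-type=2`, `prefix` and `max-keys` are standard S3 list parameters). The helper names are hypothetical, and the `phase3/` prefix is only an illustrative guess at the bucket layout.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BUCKET_URL = "http://s3.amazonaws.com/1000genomes"   # URL given in the description
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"  # XML namespace of S3 list responses


def listing_url(prefix, max_keys=10):
    """Build an S3 list request URL for keys that start with `prefix`."""
    query = urllib.parse.urlencode(
        {"list-type": "2", "prefix": prefix, "max-keys": str(max_keys)}
    )
    return f"{BUCKET_URL}?{query}"


def parse_keys(xml_text):
    """Extract object keys from the XML body of an S3 list response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter(f"{S3_NS}Key")]


def list_keys(prefix, max_keys=10):
    """Fetch one page of keys over HTTP (requires network access)."""
    with urllib.request.urlopen(listing_url(prefix, max_keys), timeout=30) as resp:
        return parse_keys(resp.read())
```

Calling `list_keys("phase3/", 5)` would request at most five keys under that prefix. Individual objects can then be downloaded by appending a key to the bucket URL.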


HARDY (tool)

RRID:SCR_009107

Markov chain Monte Carlo program for association in two-dimensional contingency tables, and for testing Hardy-Weinberg equilibrium. (entry from Genetic Analysis Software)


GENEPOP (tool)

RRID:SCR_009194

Population genetic data analysis software package. It performs exact tests of Hardy-Weinberg equilibrium, tests of population differentiation, and tests of genotypic disequilibrium among pairs of loci. It also computes estimates of F-statistics, null allele frequencies, allele size-based statistics for microsatellites, etc., and performs analyses of isolation by distance from pairwise comparisons of individuals or population samples.
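For a single biallelic locus such as a SNP, an exact Hardy-Weinberg test can be written in a few lines. The sketch below enumerates every heterozygote count compatible with the observed allele counts and sums the probabilities of outcomes no more likely than the observed one (the "probability test" form). It illustrates the idea only and is not GENEPOP's code; the function names are hypothetical.

```python
import math


def _log_fact(n):
    """Natural log of n! via the log-gamma function."""
    return math.lgamma(n + 1)


def _log_prob_het(n_het, n_a, n_ind):
    """Log probability of n_het heterozygotes given n_a copies of allele A in n_ind individuals."""
    n_b = 2 * n_ind - n_a
    n_aa = (n_a - n_het) // 2
    n_bb = (n_b - n_het) // 2
    return (
        n_het * math.log(2)
        + _log_fact(n_ind) - _log_fact(n_aa) - _log_fact(n_het) - _log_fact(n_bb)
        + _log_fact(n_a) + _log_fact(n_b) - _log_fact(2 * n_ind)
    )


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg p-value from genotype counts AA, AB and BB."""
    n_ind = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n_ind - n_a
    # Heterozygote counts must share the parity of the allele count.
    probs = {
        h: math.exp(_log_prob_het(h, n_a, n_ind))
        for h in range(n_a % 2, min(n_a, n_b) + 1, 2)
    }
    p_obs = probs[n_ab]
    return min(1.0, sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7)))
```

With genotype counts of 25, 50 and 25 the observed heterozygote count is the most probable outcome, so the p-value is essentially 1. Counts of 50, 0 and 50 show an extreme heterozygote deficit and give a p-value close to 0.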


cutadapt (tool)

RRID:SCR_011841

Software tool that removes adapter sequences from DNA sequencing reads.
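A much-simplified sketch of 3' adapter removal is shown below. It handles exact matches only: either the full adapter somewhere in the read, or a prefix of the adapter at the very end of the read. cutadapt itself tolerates sequencing errors through alignment, which this example does not attempt. The function name and the `min_overlap` parameter are hypothetical, and the adapter sequence in the usage note is just an example string.

```python
def trim_3prime_adapter(read, adapter, min_overlap=3):
    """Remove an exactly matching 3' adapter (or a trailing partial adapter) from a read."""
    # Case 1: the whole adapter occurs in the read; drop it and everything after it.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: the read ends with the first k bases of the adapter.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read
```

For example, `trim_3prime_adapter("ACGTACGTAGATC", "AGATCGGAAGAGC")` removes the trailing `AGATC`, because it matches the first five bases of the adapter. Requiring a minimum overlap reduces the chance of trimming read ends that resemble the adapter start by coincidence.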
