Resource Name
CD-HIT (RRID:SCR_007105)
Resource Information

URL: http://weizhong-lab.ucsd.edu/cd-hit/

Proper Citation: CD-HIT (RRID:SCR_007105)

Description: THIS RESOURCE IS NO LONGER IN SERVICE. Documented on February 28, 2023. Software program for clustering biological sequences, with applications such as making non-redundant databases, finding duplicates, identifying protein families, filtering sequence errors, and improving sequence assembly. It is very fast and can handle extremely large databases. CD-HIT significantly reduces the computational and manual effort in many sequence analysis tasks, and aids in understanding the structure of a dataset and correcting bias within it. The CD-HIT package includes CD-HIT, CD-HIT-2D, CD-HIT-EST, CD-HIT-EST-2D, CD-HIT-454, CD-HIT-PARA, PSI-CD-HIT, CD-HIT-OTU, and over a dozen scripts.

* CD-HIT (CD-HIT-EST) clusters similar proteins (DNAs) into clusters that meet a user-defined similarity threshold.
* CD-HIT-2D (CD-HIT-EST-2D) compares two datasets and identifies the sequences in db2 that are similar to db1 above a threshold.
* CD-HIT-454 identifies natural and artificial duplicates from pyrosequencing reads.
* CD-HIT-OTU clusters rRNA tags into OTUs.

The usage of the other programs and scripts can be found in the CD-HIT user's guide. CD-HIT was originally developed by Dr. Weizhong Li at Dr. Adam Godzik's Lab at the Burnham Institute (now Sanford-Burnham Medical Research Institute).
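As a rough illustration of the two ideas in the description (clustering under a user-defined similarity threshold, and comparing db2 against db1), the following Python sketch may help. It is not CD-HIT's implementation: the function names `similarity`, `greedy_cluster`, and `similar_to_reference` are hypothetical, the longest-first ordering is an assumption made for the example, and `difflib.SequenceMatcher` is only a toy stand-in for a real sequence identity measure.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy similarity score in [0, 1]; a stand-in, not CD-HIT's identity measure."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold):
    """Greedy threshold clustering over a dict of {name: sequence}.

    Sequences are visited longest first (an assumption for this sketch).
    Each one joins the first cluster whose representative it matches at or
    above the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # list of (representative_name, [member_names])
    ordered = sorted(seqs.items(), key=lambda kv: len(kv[1]), reverse=True)
    for name, seq in ordered:
        for rep, members in clusters:
            if similarity(seqs[rep], seq) >= threshold:
                members.append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters.append((name, [name]))
    return clusters


def similar_to_reference(db1, db2, threshold):
    """Names of db2 sequences that match any db1 sequence at or above threshold."""
    return [
        name
        for name, seq in db2.items()
        if any(similarity(ref, seq) >= threshold for ref in db1.values())
    ]


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    seqs = {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFSR",
        "c": "GGGGGGGGGG",
    }
    for rep, members in greedy_cluster(seqs, threshold=0.9):
        print(rep, members)
```

The comparison is quadratic in the worst case, so this sketch only suits small inputs; it shows the thresholding logic and nothing about the performance claims made for the real tool.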

Abbreviations: CD-HIT

Synonyms: CD-HIT Program

Resource Type: software application, source code, software resource, data processing software

Defining Citation: PMID:20053844, PMID:16731699, DOI:10.1093/bioinformatics/btl158

Keywords: cluster, protein, sequence, classification, domain, analysis, nucleotide sequence, dna, protein sequence, bio.tools, FASEB list

This resource

is listed by: Debian
is listed by: bio.tools
is listed by: OMICtools
has parent organization: University of California at San Diego; California; USA
has parent organization: Google Code
is parent organization of: CD-HIT-OTU


Ratings and Alerts

No rating or validation information has been found for CD-HIT.

No alerts have been found for CD-HIT.

Data and Source Information