Multi-Locus Sequence Typing (MLST) of Streptococcus pneumoniae is based on the sequence of seven housekeeping gene fragments. The analysis of MLST allelic profiles by eBURST allows the grouping of genetically related strains into Clonal Complexes (CCs) including those genotypes with a common descent from a predicted ancestor. However, the increasing use of MLST to characterize S. pneumoniae strains has led to the identification of a large number of new Sequence Types (STs) causing the merger of formerly distinct lineages into larger CCs. An example of this is the CC156, displaying a high level of complexity and including strains with allelic profiles differing in all seven of the MLST loci, capsular type and the presence of the Pilus Islet-1 (PI-1). Detailed analysis of the CC156 indicates that the identification of new STs, such as ST4945, induced the merging of formerly distinct clonal complexes. In order to discriminate the strain diversity within CC156, a recently developed typing schema, 96-MLST, was used to analyse 66 strains representative of 41 different STs. Analysis of allelic profiles by hierarchical clustering and a minimum spanning tree identified ten genetically distinct evolutionary lineages. Similar results were obtained by phylogenetic analysis on the concatenated sequences with different methods. The identified lineages are homogenous in capsular type and PI-1 presence. ST4945 strains were unequivocally assigned to one of the lineages. In conclusion, the identification of new STs through an exhaustive analysis of pneumococcal strains from various laboratories has highlighted that potentially unrelated subgroups can be grouped into a single CC by eBURST. The analysis of additional loci, such as those included in the 96-MLST schema, will be necessary to accurately discriminate the clonal evolution of the pneumococcal population.
Pubmed ID: 23593373 RIS Download
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Software environment and programming language for statistical computing and graphics. R is integrated suite of software facilities for data manipulation, calculation and graphical display. Can be extended via packages. Some packages are supplied with the R distribution and more are available through CRAN family.It compiles and runs on wide variety of UNIX platforms, Windows and MacOS.
View all literature mentionsSoftware R package. Methods for Cluster analysis. Performs variety of types of cluster analysis and other types of processing on large microarray datasets.
View all literature mentionsApplication that uses molecular sequence data to compute unrooted phylogenetic networks. Given an alignment of sequences, a distance matrix, or a set of trees, the program will compute a phylogenetic tree or network using methods such as split decomposition, neighbor-net, consensus network, super networks methods or methods for computing hybridization or simple recombination networks.
View all literature mentions