Background
Trypanosoma cruzi is a parasite that causes Chagas disease in humans. It is transmitted by reduviid insect vectors through the insect's feces after a blood meal. Chagas is a lethal disease affecting millions of people in Central and South America.
The genome was completely sequenced from the T. cruzi strain CL Brener, which belongs to a subgroup of T. cruzi II associated with domestic infection of placental mammals. The project was published in Science in July 2005 by research groups from TIGR, the Seattle Biomedical Research Institute and Uppsala University (El-Sayed et al. 2005). The genome is 34 Mb long and contains 22,570 predicted protein-coding genes. Over 50% of the genome consists of repeated sequences, such as retrotransposons, minisatellites and genes from large families of surface molecules.
Characterization of Clone CL Brener
Unlike most genome projects, the T. cruzi genome project began with an agreement among the participating laboratories to use a single reference clone, CL Brener. This decision was taken because T. cruzi is extremely heterogeneous and divergent at the molecular and functional levels. CL Brener was originally isolated from Triatoma infestans, a strictly domiciliary vector. It was shown to be highly infective to mammals (which made it easy to infect mice), capable of completing its whole life cycle in human cell culture, and highly susceptible to chemotherapeutic agents.
Mitochondrial Lineage
Two primers, D17 and D72, were developed to target a divergent domain of the 24S alpha rRNA gene. It was concluded that CL Brener belongs to lineage II, specifically to subgroup IIe, which is associated with infection of placental mammals. Subgroup IIe was phylogenetically traced to a hybrid of two other lineage II subgroups, IIb and IIc.
Using pulsed-field gel electrophoresis, 20 uniform chromosomal bands ranging in size from 0.45 to 3.5 Mbp were described for the molecular karyotype of CL Brener. Santos et al. (1997) calculated that each band contains at least 2 chromosomes, yielding approximately 64 chromosomes per cell. Bands hosted both heterozygous and homologous chromosomes. The weighted sum of the chromosomal bands was approximately 87 Mbp. Chromoblots were hybridized with a panel of cloned sequences (homologous probes); 13 of these identified single chromosomes, while most of the others identified repetitive sequences mapped throughout all chromosomes. These findings agree with earlier studies showing that repetitive sequences account for about 50% of the T. cruzi genome (Castro et al. 1981, Lanar et al. 1981). Several markers defined 9 different linkage groups, which were later useful for constructing the physical chromosomal maps required for assembly of the T. cruzi genome.
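The weighted sum mentioned above can be illustrated with a minimal Python sketch. It assumes that each band's weight is simply the number of chromosomes estimated to co-migrate in it, which may not match the exact weighting used by Santos et al. (1997). The function name and the band values are hypothetical placeholders, not data from the study.

```python
def weighted_karyotype_size(bands):
    """Return the sum of (band size in Mbp) x (chromosomes in that band)."""
    return sum(size_mbp * n_chromosomes for size_mbp, n_chromosomes in bands)


# Hypothetical (size in Mbp, chromosomes per band) pairs, for illustration only.
# These are NOT the measured CL Brener bands.
toy_bands = [(0.5, 2), (1.0, 4), (2.0, 3), (3.5, 2)]

total_mbp = weighted_karyotype_size(toy_bands)
n_chromosomes = sum(n for _, n in toy_bands)
print(f"{n_chromosomes} chromosomes, about {total_mbp:.1f} Mbp in total")
```

With the real band sizes and per-band chromosome estimates in place of the toy values, the same sum would give a karyotype-based genome size of the kind reported above.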
Genome Sequencing, Assembly and Annotation
The sequence was obtained using the whole-genome shotgun (WGS) approach, which was considered better suited to the T. cruzi genome because it yields the longer DNA stretches necessary to account for the high repeat content and hybrid nature of CL Brener. WGS sequencing yielded 2.5-fold genome coverage, consisting of 5489 scaffolds (containing 8740 contigs) totaling 67 Mb. On the basis of the assembly results, the T. cruzi diploid genome size was estimated to be between 106.4 and 110.7 Mb, larger than the previous estimate of 87 Mb from the karyotype analysis. Analysis of the 60.4 Mb annotated dataset revealed that 30.5 Mb contain sequence found at least twice in the assembly, which suggests that these regions likely represent the two different haplotypes in the T. cruzi CL Brener genome and supports the role of multiple progenitors in the evolution of T. cruzi hybrid strains (including CL Brener).
Comparison of CL Brener reads with sequences generated from the T. cruzi Esmeraldo genome, a member of the IIb progenitor subgroup, helped distinguish between the two haplotypes of CL Brener. The two haplotypes displayed high levels of gene synteny (preservation of linkage groups), with most differences due to insertions/deletions in intergenic (non-coding) and subtelomeric regions and/or amplification of repetitive sequences. The average sequence divergence between the two haplotypes was 5.4%, but the protein-coding regions were considerably more conserved (2.2% difference) than intergenic regions.
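As an illustration of how a percent divergence between two haplotypes can be computed, the following Python sketch counts mismatched positions in a pairwise alignment. It is a simplified assumption rather than the method used in the genome paper: gapped columns are skipped, so only substitutions are counted. The function name and the short sequences are hypothetical.

```python
def percent_divergence(seq_a, seq_b, gap="-"):
    """Percent of ungapped alignment columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip insertions/deletions; count substitutions only
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


# Hypothetical aligned fragments from the two haplotypes (toy data).
coding_a = "ATGGCTAAGGTTCTGACC"
coding_b = "ATGGCTAAGGTTCTGACA"
intergenic_a = "TTTA-CGGATTTACGTAT"
intergenic_b = "TTTAGCGCATTT-CGAAT"

print(f"coding:     {percent_divergence(coding_a, coding_b):.1f}%")
print(f"intergenic: {percent_divergence(intergenic_a, intergenic_b):.1f}%")
```

Running the function separately on coding and intergenic alignments is one way to obtain the kind of contrast described above, where coding regions are more conserved than intergenic ones.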
The haploid T. cruzi genome contains about 12,000 genes, although analysis of 4008 contigs predicted a total of 22,570. A total of 594 RNA genes were also identified in the annotated contigs, and 1400 in the unannotated contigs. Unannotated contigs contained many tandemly repeated ribosomal RNA (rRNA), spliced leader (SL) RNA, and small nucleolar RNA (snoRNA) genes. As in other trypanosomatids, the protein-coding genes were generally arranged in long clusters of tens to hundreds of genes on the same DNA strand.
Surface Molecules Repeats
At least 50% of the T. cruzi genome is repetitive sequence, consisting mostly of large gene families of surface proteins, retrotransposons, and subtelomeric repeats. T. cruzi shows an extensive expansion of several families of surface molecules, including trans-sialidase (TS), mucin, mucin-associated surface proteins (MASPs) and the glycoprotein gp63 protease, each of which is encoded by several hundred genes. These genes are often found in subtelomeric locations, are mostly T. cruzi-specific, and account for approximately 18% of the total protein-coding genes.
Unlike most parasites, T. cruzi is able to incorporate the host's sialic acid, using a surface-bound trans-sialidase that transfers sialic acid from sialoglycoconjugates in the host. T. cruzi contains 1430 gene members of the TS superfamily. Most active TSs contain a variable number of 12-amino acid SAPA (shed acute-phase antigen) repeats. About 725 genes encode enzymatically inactive TS-like proteins, which suggests a strong selective pressure on the TS gene family to diversify. This pressure may be provided in part by the mammalian immune response, because TSs are targets of both humoral and cell-mediated immune responses.
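For a sense of scale, the share of TS superfamily members that are enzymatically inactive can be worked out from the two counts given above. This short Python sketch uses only those figures; the helper name and the constant names are illustrative.

```python
def percent_of(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


TS_FAMILY_GENES = 1430   # TS superfamily members, as stated in the text
INACTIVE_TS_LIKE = 725   # approximate count of inactive TS-like genes

inactive_share = percent_of(INACTIVE_TS_LIKE, TS_FAMILY_GENES)
print(f"About {inactive_share:.1f}% of TS superfamily genes are inactive")
```

On these figures, roughly half of the family encodes inactive products.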
El-Sayed, Najib M., et al. 2005. The genome sequence of Trypanosoma cruzi, etiologic agent of Chagas disease. Science 309.5733: 409.
Santos, Marcia R. M., et al. 1997. The Trypanosoma cruzi genome project: nuclear karyotype and gene mapping of clone CL Brener. Memórias do Instituto Oswaldo Cruz 92.6: 821-828.