Automation of functional annotation of genomes and transcriptomes

Authors

  • Luis Fernando Cadavid Gutiérrez National University
  • José Nelson Pérez Castillo Universidad Distrital Francisco José de Caldas
  • Cristian Alejandro Rojas Quintero Universidad Distrital Francisco José de Caldas
  • Nelson Enrique Vera Parra Universidad Distrital Francisco José de Caldas

Keywords:

Annotator, Functional annotation, Gene ontology, High Throughput Sequencing. (en).

Abstract (en)

Functional annotation represents a means to investigate and classify genes and transcripts according to their function within a given organism.

This paper presents Massive Automatic Functional Annotation (MAFA - Web), a free online bioinformatics tool that automates, unifies and optimizes functional annotation processes for large volumes of sequences. MAFA includes tools for categorization and statistical analysis of associations between sequences. We evaluated the performance of MAFA on a data set taken from the Diploria strigosa transcriptome, using an 8-core computer (E7450 @ 2.40 GHz with 256 GB RAM). The processing rates were 2.7 seconds per sequence with the UniProt database and 50.0 seconds per sequence with the NCBI non-redundant database. RAM usage depended on the database being processed: 1 GB for UniProt and 9 GB for the non-redundant database. Availability: https://github.com/BioinfUD/MAFA.
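To put the reported rates in perspective, the short sketch below turns them into a rough run-time estimate. It is not part of MAFA. The per-sequence times and RAM figures come from the abstract, while the function name `estimate_hours`, the input size of 10,000 sequences, and the assumption that run time grows linearly with the number of sequences are illustrative only.

```python
# Figures reported in the abstract for the 8-core test machine.
REPORTED = {
    "uniprot": {"seconds_per_sequence": 2.7, "ram_gb": 1},
    "nr": {"seconds_per_sequence": 50.0, "ram_gb": 9},
}


def estimate_hours(n_sequences: int, database: str) -> float:
    """Estimate run time in hours (hypothetical helper).

    Assumes run time scales linearly with the number of sequences,
    which the abstract does not state explicitly.
    """
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    rate = REPORTED[database]["seconds_per_sequence"]
    return n_sequences * rate / 3600.0


if __name__ == "__main__":
    n = 10_000  # hypothetical transcriptome size, not taken from the paper
    for name, figures in REPORTED.items():
        hours = estimate_hours(n, name)
        print(f"{name}: about {hours:.1f} h, {figures['ram_gb']} GB RAM")
```

Under these assumptions, 10,000 sequences would take about 7.5 hours against UniProt and about 139 hours against the non-redundant database, so the choice of database dominates the cost of an annotation run.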

 

Author Biographies

Luis Fernando Cadavid Gutiérrez, National University

Medical Doctor, PhD in Ecology and Evolutionary Biology. IEI Research Group, Teacher / Researcher, Institute of Genetics and Department of Biology, National University, Bogotá.

José Nelson Pérez Castillo, Universidad Distrital Francisco José de Caldas

Systems Engineer, PhD in Informatics. GICOGE Research Group, Director of the Center for Scientific Research and Development, Universidad Distrital Francisco José de Caldas, Bogotá.

Cristian Alejandro Rojas Quintero, Universidad Distrital Francisco José de Caldas

Systems Engineering student. GICOGE Research Group, Student, Universidad Distrital Francisco José de Caldas, Bogotá.

Nelson Enrique Vera Parra, Universidad Distrital Francisco José de Caldas

Electronic Engineer, M.Sc. in Information Sciences and Communication. GICOGE Research Group, Teacher / Researcher, Universidad Distrital Francisco José de Caldas, Bogotá.

How to Cite

APA

Cadavid Gutiérrez, L. F., Pérez Castillo, J. N., Rojas Quintero, C. A., & Vera Parra, N. E. (2014). Automation of functional annotation of genomes and transcriptomes. Tecnura, 18, 90–96. https://doi.org/10.14483/22487638.9246
