Please use this identifier to cite or link to this item:
https://www.arca.fiocruz.br/handle/icict/30735
ASSEMBLY OF A PAN-GENOME FROM DEEP SEQUENCING OF 910 HUMANS OF AFRICAN DESCENT
Author
Sherman, Rachel M
Forman, Juliet
Antonescu, Valentin
Puiu, Daniela
Daya, Michelle
Rafaels, Nicholas
Boorgula, Meher Preethi
Chavan, Sameer
Vergara, Candelaria
Ortega, Victor E
Levin, Albert M
Eng, Celeste
Yazdanbakhsh, Maria
Wilson, James G
Marrugo, Javier
Lange, Leslie A
Williams, L Keoki
Watson, Harold
Ware, Lorraine B
Olopade, Christopher O
Olopade, Olufunmilayo
Oliveira, Ricardo Riccio
Ober, Carole
Nicolae, Dan L
Meyers, Deborah A
Mayorga, Alvaro
Knight-Madden, Jennifer
Hartert, Tina
Hansel, Nadia N
Foreman, Marilyn G
Ford, Jean G
Faruque, Mezbah U
Dunston, Georgia M
Caraballo, Luis
Burchard, Esteban G
Bleecker, Eugene R
Araujo, Maria I
Herrera-Paz, Edwin F
Campbell, Monica
Foster, Cassandra
Taub, Margaret A
Beaty, Terri H
Ruczinski, Ingo
Mathias, Rasika A
Barnes, Kathleen C
Salzberg, Steven L
Affiliation
Multiple - see Notes
Abstract
We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the populations of African descent, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein-coding genes, and the rest appear to be intergenic.
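The "~10%" figure in the abstract can be sanity-checked with simple arithmetic. The sketch below is only an illustration: the novel-sequence totals come from the abstract, while the reference length `ASSUMED_REFERENCE_BP` is a rounded assumption of roughly 3.1 Gbp for GRCh38 and is not stated in this record. The helper names are hypothetical.

```python
# Figures quoted in the abstract.
NOVEL_BP = 296_485_284       # total novel sequence, in base pairs
NOVEL_CONTIGS = 125_715      # number of distinct novel contigs
CONTIGS_IN_GENES = 387       # novel contigs that fall within protein-coding genes

# ASSUMPTION: approximate GRCh38 length (~3.1 Gbp); not given in this record.
ASSUMED_REFERENCE_BP = 3_100_000_000


def mean_contig_length(total_bp: int, n_contigs: int) -> float:
    """Average contig length in base pairs."""
    return total_bp / n_contigs


def fraction_of_reference(novel_bp: int, reference_bp: int) -> float:
    """Novel sequence as a fraction of the reference length."""
    return novel_bp / reference_bp


if __name__ == "__main__":
    mean_len = mean_contig_length(NOVEL_BP, NOVEL_CONTIGS)
    extra = fraction_of_reference(NOVEL_BP, ASSUMED_REFERENCE_BP)
    genic = CONTIGS_IN_GENES / NOVEL_CONTIGS
    print(f"Mean novel contig length: {mean_len:.0f} bp")
    print(f"Novel sequence relative to assumed reference: {extra:.1%}")
    print(f"Share of novel contigs inside protein-coding genes: {genic:.2%}")
```

Under the assumed reference length, the novel sequence comes out a little under one tenth of the reference, which is consistent with the abstract's rounded "~10%".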
Keywords in Portuguese
Genoma humano (Human genome)
DNA
Humanos (Humans)
Moldes genéticos (Genetic templates)
Populações (Populations)
Mapeamento genético (Genetic mapping)
África (Africa)