De novo assembly of Amoeba proteus genome using 10x Genomics linked reads

Yong Kong, Yale University

ORCID: 0000-0002-2881-5274

ACCESS Allocation Request BIO210112

Abstract: Amoeba proteus is one of the best-known amoebae and has been widely used in biology for more than 250 years. A. proteus is a large (250–750 μm), transparent, free-living unicellular protist that lives in freshwater and feeds on smaller protists. It is found throughout the world and has a simple life cycle. Because it is so widespread and easy to culture, A. proteus became a model organism for studies in modern cytology and endosymbiosis. Despite this long history, however, little is known about its genome. A. proteus has been estimated to have one of the largest genomes of any species, despite its small size. To better understand this species, we aim to generate a de novo assembly of the A. proteus genome from 2 billion 10x Genomics linked reads, using the Supernova assembler. Because of the size of the genome and the number of linked reads involved, certain stages of the computation require large amounts of memory, sometimes in excess of 3 TB.

Allocations:

2021 PSC Bridges-2 Extreme Memory (PSC Bridges-2 EM) 7,000.0 Core-hours
2021 PSC Bridges-2 Regular Memory (PSC Bridges-2 RM) 45,000.0 Core-hours
2021 PSC Bridges-2 Storage (PSC Ocean) 12,000.0 GB
The estimated value of these awarded resources is $2,206.00. The allocation of these resources represents a considerable investment by the NSF in advanced computing infrastructure for the U.S. The dollar value of the allocation is estimated from the NSF awards supporting the allocated resources.
There are no other allocations for this project.

Other Titles:

There are no prior titles for this project.