Abstract: |
Although the microbiome has drawn attention from many fields of science, there is still no broad understanding of how microbes interact locally and across body sites, what factors mediate the relationship between microbe and host, or the breadth of physiologic functions the microbiome affects. Much research funding has gone toward understanding how a single microbe might cause disease, when it is rather the community of microbes and its interactions with environmental mediators that shape overall health. Community-level analysis of this kind is needed but resource intensive, demanding substantial computational capacity and time. This project will use the bioBakery workflow to examine metagenomic data from the gut and vaginal body sites of 382 women at two time points during pregnancy (1,528 files in total, 4 TB). This is the first step toward understanding the population dynamics of the microbiome across body sites and over time. The bioBakery workflow will identify the microbial strains present, profile the gene functions of those strains, and produce the human-filtered files for deposit in a public database. Once this step is complete, downstream analyses can integrate clinical and lifestyle data with the microbiome data. Given the size of the dataset and the computational time required, I am requesting access to PSC Large Memory Nodes through XSEDE to benchmark the planned analysis pipeline on a subset of the data before requesting more resources. |