Researchers use deep neural networks to predict transcription factor binding

COLUMBUS, Ohio (Aug 24, 2023) — 

Emily Miraldi, assistant professor in the Divisions of Immunobiology and Biomedical Informatics at Cincinnati Children’s Hospital and the Department of Pediatrics at the University of Cincinnati School of Medicine, leads an “immune-engineering” research group that uses mathematical modeling of the immune system to predict immune responses and understand disease.

Dr. Emily Miraldi (center), along with graduate students Tareian Cazares (left) and Faiz Rizvi (right), leads the multi-lab maxATAC modeling collaboration at Cincinnati Children’s Hospital.

The Ohio Supercomputer Center (OSC) plays an important role in the research, as Miraldi has needed high performance computing resources to solve computationally demanding mathematical problems.

“The biological question motivating my work at OSC is a very famous one: Cells in the human body share a common DNA blueprint but have a great diversity of functions and behaviors,” Miraldi said.

The diversity of cell types in the human body is driven by unique patterns of gene expression, which are controlled by proteins called transcription factors. Aberrant gene expression patterns are a hallmark of many diseases and can be traced to altered gene regulation by transcription factors, Miraldi explained.

“Discovering the transcription factors that control disease-associated gene expression provides an opportunity to develop therapies that might target those transcription factors to improve disease outcomes in the ‘poorly behaving’ cell types,” Miraldi said.

In an article published in the journal Genome Research, Miraldi’s team recently showed that a new data type, the “Assay for Transposase-Accessible Chromatin using sequencing” (ATAC-seq), could identify transcription factor regulators of gene expression across cell types (Miraldi et al., 2019, Genome Research; Pokrovskii et al., 2019, Immunity).

Before gaining access to OSC’s high performance computing resources, the team used simple mathematical models to predict transcription factor binding from ATAC-seq. With more computational capability, Miraldi began using deep neural network models, which improved the accuracy of the transcription factor binding predictions.

“We initially used ATAC-seq data in a crude way to infer transcription factor binding sites, but, taking advantage of the high performance computing resources at OSC, [we] were able to use the latest advances in deep neural network modeling to more accurately predict transcription factor binding events from ATAC-seq,” Miraldi said.

The resulting collection of open-source, user-friendly deep neural network models is called “maxATAC” and was published in the journal PLoS Computational Biology (Cazares et al., 2023). Other research groups can use the maxATAC models to predict transcription factor binding from ATAC-seq in any human cell type – including single-cell (sc)ATAC-seq, which is now a standard technology at many research institutions.

“Transcription factor binding prediction from scATAC-seq is especially valuable at Cincinnati Children’s Hospital, where there is great desire to understand gene regulation and disease mechanisms from scarce patient samples (e.g., cancer tumor biopsies, transplant rejection) that can only be analyzed by single-cell technologies,” she said.

About OSC: The Ohio Supercomputer Center (OSC) addresses the rising computational demands of academic and industrial research communities by providing a robust shared infrastructure and proven expertise in advanced modeling, simulation and analysis. OSC empowers scientists with the services essential to making extraordinary discoveries and innovations, partners with businesses and industry to leverage computational science as a competitive force in the global knowledge economy and leads efforts to equip the workforce with the key technology skills required for 21st century jobs.
