Natural language processing makes sense of vast volume of content

COLUMBUS, Ohio (May 23, 2023) — 

While natural language processing may not have much name recognition among the general public, many people rely on it every day. Using Google to find information online? The search engine employs natural language processing to comb through and analyze massive numbers of webpages and return the most relevant answers.

Alexa Prize TaskBot Challenge

Huan Sun led a team of Ohio State students (Ohio State TacoBot) to a third-place finish in the first Alexa Prize TaskBot Challenge. The challenge, launched in March 2021, selected 10 university teams to develop bots that assist customers with multi-step cooking or do-it-yourself (DIY) home improvement tasks. The teams were required to address many difficult artificial intelligence (AI) challenges, ranging from knowledge representation and inference to commonsense and causal reasoning to language understanding and generation. Only five teams advanced to the finals, and Ohio State TacoBot was the only U.S. team among the top three performers, who were announced in June 2022.

Huan Sun, an associate professor in the Department of Computer Science and Engineering at The Ohio State University, is fascinated by natural language processing, a subfield of artificial intelligence, and its many applications. With the aid of the high performance computing (HPC) power of the Ohio Supercomputer Center (OSC), Sun and her students have garnered scientific awards for their findings on the subject.

In one recent project, the Sun group created deep learning models that can review the content contained within tables commonly used on websites and understand the relationship between the pieces of information. While conventional search engines refer the user only to the table itself, the deep learning models take a step further and extract and analyze information from the tables, Sun explained. The scientific paper won the 2022 Association for Computing Machinery (ACM) Special Interest Group on Management of Data (SIGMOD) Research Highlight Award.

In a separate study, the Ohio State researchers collaborated with Nationwide Children’s Hospital to create a system that makes the wealth of information buried in clinical texts more accessible. In addition to helping clinicians save time reviewing and drawing conclusions from individual clinical notes, the system also allows them to submit specific questions for analysis, Sun said. One of the research papers received the Best Paper Award at the 2021 Institute of Electrical and Electronics Engineers (IEEE) International Conference on Bioinformatics and Biomedicine.

Members of Ohio State’s TacoBot team, led by Associate Professor Huan Sun. Left to right, top row: Zhen Wang and Samuel Stevens; second row: Tianshu Zhang, Shijie Chen (student lead) and Yu Su (co-faculty advisor); third row: Lingbo Mo, Xiang Yue and Xiang Deng; bottom row: Ashley Lewis, Ziru Chen (student lead) and Sun (lead faculty advisor).

Through her participation in Ohio State’s new AI Institute for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE), which is a National Science Foundation (NSF) AI Institute, Sun is expanding her natural language processing work to other domains. The initiative, in which OSC serves as a key HPC resource, has allowed Sun to create additional partnerships with researchers in bioinformatics, biomedical sciences, public health and environmental sciences.

Within ICICLE, Sun is focused on applying conversational artificial intelligence, which is concerned with building natural language interfaces, to various problems. Her group currently is helping ICICLE’s foodshed research team use the technology in a study of the impact of grocery store closures on residents’ food access.

Regardless of the research project, Sun and her students rely on OSC resources, using both the Pitzer and Owens clusters.

“We use OSC for almost every single project that we’re doing,” Sun said. “We’ve been very happy to acknowledge them in almost every single paper we publish.”

Sun appreciates the $1,000 annual faculty credit that she can use towards OSC services, as well as the fast technical support that keeps her studies moving forward. OSC has helped Sun meet project goals, which in turn has attracted additional support—such as NSF funding—to advance her natural language processing research.

About OSC: The Ohio Supercomputer Center (OSC) addresses the rising computational demands of academic and industrial research communities by providing a robust shared infrastructure and proven expertise in advanced modeling, simulation and analysis. OSC empowers scientists with the services essential to making extraordinary discoveries and innovations, partners with businesses and industry to leverage computational science as a competitive force in the global knowledge economy, and leads efforts to equip the workforce with the key technology skills required for 21st-century jobs.
