"CODA: Temporal Domain Generalization via Concept Drift Simulator" by IE's Zou wins Best Paper Competition
By Alex Keimig

The work of assistant professor of industrial engineering Dr. Na Zou was recently recognized with the Best Paper Award in a competition associated with the 2024 Institute for Operations Research and the Management Sciences (INFORMS) Conference on Quality, Statistics, and Reliability (ICQSR) — described by Zou as "a major research community."

The paper, "CODA: Temporal Domain Generalization via Concept Drift Simulator," was co-authored by Zou and her research partners at Rice University and Texas A&M University: Chia-Yuan Chang, Yu-Neng Chuang, Zhimeng Jiang, Kwei-Herng Lai, and Anxiao Jiang. Funding for the project came from an NSF award earlier this year; the total $1.2 million award is split across the three collaborating institutions.

"In real-world applications, machine learning models often become obsolete due to shifts in the joint distribution arising from underlying temporal trends, a phenomenon known as the 'temporal concept drift'," assert Zou et al. in the paper's abstract.

With machine learning currently at the forefront of innumerable innovative efforts, investigating solutions for issues such as concept drift is critical work.

"We're addressing a data challenge related to the quality of the data used to train a model. We propose a [new] method from a data-centric perspective," said Zou.

This method is the COncept Drift simulAtor (CODA) framework: a way to simulate future data with potential changes that machine learning models may face before they actually face them.

"Previously, most existing work relied on model-centric methods; that is, applying different models to a fixed data set to enhance prediction. Since the temporal distribution shifts arise from data, we incorporate the temporal trends in a simulator to generate out-of-distribution future data. The generated data can be used to train various models for improving generalization."

Zou uses a real-world example to illustrate what makes this research so important, not to mention practical:

"For example, consider the task of using Twitter data to predict seasonal flu trends. Over time, the number of active users on Twitter is increasing, new friendships are formed, and user profiles evolve, all of which can significantly affect the model performance for future flu prediction using models trained on the initial data. But this future data, such as new users and friendships for the next year, is not yet available and cannot be accessed now. Instead of training new flu prediction models after collecting Twitter data next year, our proposed method can simulate the future Twitter data via capturing the temporal trends. The simulated Twitter data can be used to train various models, leading to more accurate flu prediction for the upcoming year."
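The workflow Zou describes can be illustrated with a toy sketch. This is not the CODA framework itself: it assumes a deliberately simple setting in which two classes of synthetic 2-D points drift at a constant rate over time, and it swaps in a plain linear trend fit and a nearest-centroid classifier in place of the paper's simulator and models. All function names and parameters here are hypothetical. The sketch compares a "stale" model trained on the most recent real data with a model trained on simulated data for the next period.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_period(t, n=200):
    """Draw labelled 2-D points for period t (assumed toy drift: +1 on x per period)."""
    means = np.array([[0.0 + t, 0.0], [2.0 + t, 0.0]])
    X = np.vstack([rng.normal(m, 0.5, size=(n, 2)) for m in means])
    y = np.repeat([0, 1], n)
    return X, y

def fit_centroids(X, y):
    """Nearest-centroid 'model': one mean vector per class."""
    return np.array([X[y == c].mean(axis=0) for c in (0, 1)])

def accuracy(centroids, X, y):
    """Fraction of points whose closest centroid matches their label."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return float((dists.argmin(axis=1) == y).mean())

# Observed history: periods 0..4. Period 5 is the unseen future.
periods = np.arange(5)
history = [sample_period(t) for t in periods]
future_t = 5

# Capture the temporal trend: fit a line to each class centroid coordinate.
cents = np.array([fit_centroids(X, y) for X, y in history])  # (period, class, dim)
predicted = np.empty((2, 2))
for c in range(2):
    for d in range(2):
        slope, intercept = np.polyfit(periods, cents[:, c, d], 1)
        predicted[c, d] = slope * future_t + intercept

# Simulate future data around the extrapolated centroids, reusing the
# within-class spread estimated from the latest observed period.
X_last, y_last = history[-1]
spread = float(np.mean([X_last[y_last == c].std(axis=0) for c in (0, 1)]))
X_sim = np.vstack([rng.normal(predicted[c], spread, size=(200, 2)) for c in range(2)])
y_sim = np.repeat([0, 1], 200)

# Compare a model trained on the latest real data with one trained on simulated data.
X_future, y_future = sample_period(future_t)
acc_stale = accuracy(fit_centroids(X_last, y_last), X_future, y_future)
acc_sim = accuracy(fit_centroids(X_sim, y_sim), X_future, y_future)
print(f"trained on last observed period: {acc_stale:.3f}")
print(f"trained on simulated future data: {acc_sim:.3f}")
```

In this toy setting the drift is linear by construction, so extrapolation recovers it easily; real temporal drift, such as the evolving Twitter user base in Zou's example, is far less regular.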

Compared with model-centric approaches, a data-centric approach addresses the underlying data quality and distribution issues directly. That can significantly improve model performance and generalization, leading to more reliable, robust and effective solutions for real-world applications.
