Publications
You can also find my articles on my Google Scholar profile.
Selected Publications
(* indicates equal contribution)
Reason to Rote: Rethinking Memorization in Reasoning. EMNLP 2025. [pdf]
Yupei Du, Philipp Mondorf, Silvia Casola, Yuekun Yao, Robert Litschko, and Barbara Plank.
TL;DR: We mechanistically study benign memorization in language models on reasoning tasks, and find that memorization does not replace generalization but rather builds on it.
Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior. arXiv preprint 2025. [pdf]
Florian Eichin, Yupei Du, Philipp Mondorf, Barbara Plank, Michael A. Hedderich.
TL;DR: We introduce ExPLAIND, an interpretability framework that jointly attributes model behavior to model components, data, and training dynamics, and apply it to investigate grokking.
Language models can learn implicit multi-hop reasoning, but only if they have lots of training data. EMNLP 2025. [pdf]
Yuekun Yao, Yupei Du, Dawei Zhu, Michael Hahn*, and Alexander Koller*.
TL;DR: We study the implicit multi-hop reasoning capabilities of language models, and find that the amount of training data they need to perform well grows exponentially with reasoning depth, and that curriculum learning can substantially mitigate this.
Disentangling the Roles of Representation and Selection in Data Pruning. ACL 2025. [pdf]
Yupei Du, Yingjin Song, Hugh Mee Wong, Daniil Ignatev, Albert Gatt, and Dong Nguyen.
TL;DR: We disentangle and systematically study the influence of the data representation and the selection algorithm in data pruning.
On Support Samples of Next Word Prediction. ACL 2025. [pdf]
Yuqian Li*, Yupei Du*, Yufang Liu, Feifei Feng, Mou Xiao Feng, and Yuanbin Wu.
TL;DR: We study the training instances that support the predictions of language models, and reveal that being a support sample is likely an intrinsic property of the data.
Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences? Findings of ACL 2025. [pdf]
Yingjin Song, Yupei Du, Denis Paperno, and Albert Gatt.
TL;DR: We propose a vision-language benchmark for multi-event temporal grounding and reasoning in image sequences.
FTFT: Efficient and Robust Fine-Tuning by Transferring Training Dynamics. COLING 2025. [pdf]
Yupei Du, Albert Gatt, and Dong Nguyen.
TL;DR: We show that the training dynamics of an efficient but weak model can be transferred to much more capable models to achieve better robustness and efficiency.
Understanding Gender Bias in Knowledge Base Embeddings. ACL 2022. [pdf]
Yupei Du, Qi Zheng, Yuanbin Wu, Man Lan, Yan Yang, and Meirong Ma.
TL;DR: We propose methods to both quantify and trace the origins of gender biases in knowledge base embeddings, using a closed-form approximation of influence functions.
Exploring Human Gender Stereotypes with Word Association Test. EMNLP 2019. [pdf]
Yupei Du, Yuanbin Wu, and Man Lan.
TL;DR: We use label propagation to quantify and visualize how gender biases are transferred and reinforced through word associations, and thereby offer a large-scale dataset of word-level gender bias scores.