Yuxin (Audrey) Wang

Hello! I am Yuxin, a third-year Computer Science Ph.D. student at Dartmouth, advised by Prof. Soroush Vosoughi, with additional support from Prof. Saeed Hassanpour. I specialize in NLP, focusing on the cognitive and societal aspects of AI. My work primarily explores the cognitive and behavioral performance of AI systems by analyzing their language understanding capabilities and probing their reasoning processes. I am also interested in measuring and improving conversational AI’s integration with external tools (e.g., information retrievers, strategy planners) to support more effective AI-human interaction.

I obtained my master’s and bachelor’s degrees in Computer Science from Nanjing University in 2023 and 2019, respectively. During my master’s studies at Websoft Lab, my research covered knowledge graph representation learning and continual learning. I have participated in multiple research projects.

Sometimes I write research and tech blog posts on Medium and MyBlog. I also love jogging and cycling. Feel free to reach out about potential research collaboration opportunities.

News

Apr 16, 2026	I will be an Applied Scientist Intern at Amazon in Seattle this summer!
Mar 6, 2026	Wrote a new blog post on LLM inference scheduling: Understanding vLLM Scheduling: Token Budgets, Chunked Prefill, and Policies.
Jan 21, 2026	Our Terminal-Bench paper is out! Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces.
Nov 4, 2025	Please check out Harbor and Terminal-Bench, a framework and a benchmark for testing LLM agents on various tasks. Happy to have contributed to them 🎉.
Sep 15, 2025	I will be a PhD Machine Learning Intern at Pinterest this fall!
Jul 27, 2025	Accepted an invitation to serve on the AAAI-26 Program Committee.

Internship experience

Amazon — Applied Scientist Intern June 2026 – September 2026

Seattle, WA, USA

Mentors: Haoxin Zheng, Hengrui Cai
Microsoft Turing / MSR — Research Intern June 2025 – September 2025

Redmond, WA, USA

Mentor: Nick Craswell

Project: conversational memory modeling, AI chatbot metric personalization
Microsoft Research Asia — Research Intern July 2022 - February 2023

Beijing, China

Mentor: Börje Karlsson

Project: knowledge-enhanced open-world story generation
Pinterest — PhD Machine Learning Intern September 2025 – December 2025

Palo Alto, CA, USA

Project: open-weight vLLMs inference optimization, model throughput acceleration

Selected publications

arXiv

Memory Makes the Difference: Evaluating How Different Memory Roles Shape Conversational Agents

Yuxin Wang, Paul Thomas, Zhiwei Yu, Yuan Gao, Saeed Hassanpour, Soroush Vosoughi, Robert Sim, and Nick Craswell

2026

Abs arXiv Code

Prior research on memory mechanisms in RAG-based conversational systems has emphasized how memory is stored and retrieved. However, far less is known about how memories with different functional roles influence response quality: specifically, how they shape an agent’s responses under varying conversational contexts and whether they lead to substantively different response behaviors. Existing evaluations of conversational systems are also largely reference-based, insufficiently capturing the nuances in responses that may address users’ preferences differently. In this work, we probe the impact of different memory types in shaping agents’ responses. We present a fine-grained taxonomy of conversational memory, classify retrieved memories into different role types, and design a user-centric evaluation framework that simulates user perspectives. Through comparative experiments on long-term datasets and frontier LLMs, our analysis reveals many differentiated effects of memories: e.g., clarifying memory improves responses’ factual accuracy and constraint awareness, making them more correct and personalized; irrelevant memory reduces topic relevance and degrades constraint awareness. Despite the power of frontier LLMs, these findings shed light on how different memory types can be leveraged to produce more personalized responses and inspire further research in this direction.
arXiv

Probing Association Biases in LLM Moderation Over-Sensitivity

Yuxin Wang, Botao Yu, Ivory Yang, Saeed Hassanpour, and Soroush Vosoughi

2025 Under review

Abs arXiv

Large Language Models are widely used for content moderation but often misclassify benign comments as toxic, leading to over-sensitivity. While previous research attributes this issue primarily to the presence of offensive terms, we reveal a potential cause beyond the token level: LLMs exhibit systematic topic biases in their implicit associations. Inspired by cognitive psychology’s implicit association tests, we introduce Topic Association Analysis, a semantic-level approach to quantify how LLMs associate certain topics with toxicity. By prompting LLMs to generate free-form imagined scenarios for misclassified benign comments and analyzing their topic amplification levels, we find that more advanced models (e.g., GPT-4 Turbo) demonstrate stronger topic stereotypes despite lower overall false positive rates. These biases suggest that LLMs do not merely react to explicit, offensive language but rely on learned topic associations, shaping their moderation decisions. Our findings highlight the need for refinement beyond keyword-based filtering, providing insights into the underlying mechanisms driving LLM over-sensitivity.
ICLR

ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Sentences

Yuxin Wang, Xiaomeng Zhu, Weimin Lyu, Saeed Hassanpour, and Soroush Vosoughi

In ICLR 2025 (spotlight)

Abs Proceeding arXiv Python Package Code

Handling implicit language is essential for natural language processing systems to achieve precise text understanding and facilitate natural interactions with users. Despite its importance, the absence of a robust metric for accurately measuring the implicitness of language significantly constrains the depth of analysis possible in evaluating models’ comprehension capabilities. This paper addresses this gap by developing a scalar metric that quantifies the implicitness level of language without relying on external references. Drawing on principles from traditional linguistics, we define “implicitness” as the divergence between semantic meaning and pragmatic interpretation. To operationalize this definition, we introduce ImpScore, a novel, reference-free metric formulated through an interpretable regression model. This model is trained using pairwise contrastive learning on a specially curated dataset comprising 112,580 (implicit sentence, explicit sentence) pairs. We validate ImpScore through a user study that compares its assessments with human evaluations on out-of-distribution data, demonstrating its accuracy and strong correlation with human judgments. Additionally, we apply ImpScore to hate speech detection datasets, illustrating its utility and highlighting significant limitations in current large language models’ ability to understand highly implicit content.
ACL

MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations

Yuxin Wang, Ivory Yang, Saeed Hassanpour, and Soroush Vosoughi

In ACL 2024 (oral presentation)

Abs Proceeding Dataset Code Poster Website

Mental manipulation, a significant form of abuse in interpersonal conversations, presents a challenge to identify due to its context-dependent and often subtle nature. The detection of manipulative language is essential for protecting potential victims, yet the field of Natural Language Processing (NLP) currently faces a scarcity of resources and research on this topic. Our study addresses this gap by introducing a new dataset, named MentalManip, which consists of 4,000 annotated movie dialogues. This dataset enables a comprehensive analysis of mental manipulation, pinpointing both the techniques utilized for manipulation and the vulnerabilities targeted in victims. Our research further explores the effectiveness of leading-edge models in recognizing manipulative dialogue and its components through a series of experiments with various configurations. The results demonstrate that these models inadequately identify and categorize manipulative content. Attempts to improve their performance by fine-tuning with existing datasets on mental health and toxicity have not overcome these limitations. We anticipate that MentalManip will stimulate further research, leading to progress in both understanding and mitigating the impact of mental manipulation in conversations.
ISWC

Facing Changes: Continual Entity Alignment for Growing Knowledge Graphs

Yuxin Wang, Yuanning Cui, Wenqiang Liu, Zequn Sun, Yiqiao Jiang, Kexin Han, and Wei Hu

In The Semantic Web – ISWC 2022 (acceptance rate: 17.6%)

Abs Proceeding arXiv Blog Code

Entity alignment is a basic and vital technique in knowledge graph (KG) integration. Over the years, research on entity alignment has rested on the assumption that KGs are static, which neglects the growing nature of real-world KGs. As KGs grow, previous alignment results need to be revisited while new entity alignment waits to be discovered. In this paper, we propose and dive into a realistic yet unexplored setting, referred to as continual entity alignment. To avoid retraining an entire model on the whole KGs whenever new entities and triples arrive, we present a continual alignment method for this task. It reconstructs an entity’s representation based on entity adjacency, enabling it to generate embeddings for new entities quickly and inductively using their existing neighbors. It selects and replays partial pre-aligned entity pairs to train only parts of KGs while extracting trustworthy alignment for knowledge augmentation. As growing KGs inevitably contain non-matchable entities, different from previous works, the proposed method employs bidirectional nearest neighbor matching to find new entity alignment and update old alignment. We also construct new datasets by simulating the growth of multilingual DBpedia. Extensive experiments demonstrate that our continual alignment method is more effective than baselines based on retraining or inductive learning.
Neurocomputing

Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey

Yuxin Wang, Jieru Lin, Zhiwei Yu, Wei Hu, and Börje F. Karlsson

Neurocomputing 2023

Abs Article arXiv

Storytelling and narrative are fundamental to human experience, intertwined with our social and cultural engagement. As such, researchers have long attempted to create systems that can generate stories automatically. In recent years, powered by deep learning and massive data resources, automatic story generation has shown significant advances. However, considerable challenges, like the need for global coherence in generated stories, still hamper generative models from reaching the same storytelling ability as human narrators. To tackle these challenges, many studies seek to inject structured knowledge into the generation process, which is referred to as structured knowledge-enhanced story generation. Incorporating external knowledge can enhance the logical coherence among story events, achieve better knowledge grounding, and alleviate over-generalization and repetition problems in stories. This survey provides an up-to-date and comprehensive review of this research field: (i) we present a systematic taxonomy regarding how existing methods integrate structured knowledge into story generation; (ii) we summarize involved story corpora, structured knowledge datasets, and evaluation metrics; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation and cast light on promising directions for future study.