Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
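For reference, this is a single key in the site configuration — a minimal sketch, assuming a standard Jekyll _config.yml:

```yaml
# _config.yml
# When false, Jekyll skips rendering posts whose date is in the future.
future: false
```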

Blog Post number 4

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Understanding and Defending VLM Jailbreaks via Jailbreak-Related Representation Shift

Published in arXiv, 2026

We show that VLM jailbreaks are not perception failures but distinct internal states driven by image-induced representation shifts, and propose JRS-Rem to remove these shifts at inference time.

Recommended citation: Zhihua Wei, Qiang Li, Jian Ruan, Zhenxin Qin, Leilei Wen, Ruiyang Qin, Qingzhuo Wang, Dongrui Liu, Wen Shen. (2026). "Understanding and Defending VLM Jailbreaks via Jailbreak-Related Representation Shift." arXiv 2026.

TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation

Published in arXiv, 2026

We propose TME-PSR, a framework that integrates time-awareness, multi-interest modeling, and personalized explanations for sequential recommendation.

Recommended citation: Qingzhuo Wang, Leilei Wen, Juntao Chen, Kunyu Peng, Ruiyang Qin, Zhihua Wei, Wen Shen. (2026). "TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation." arXiv 2026.

Mitigating Action-Relation Hallucinations in LVLMs via Relation-aware Visual Enhancement

Published in ACL 2026

We define the ARS score to locate action-relation-sensitive attention heads, and propose RVE, a training-free method that enhances attention to action-relevant image regions to mitigate action-relation hallucinations in LVLMs.

Recommended citation: Zhenxin Qin, Qiang Li, Qingzhuo Wang, Ruiyang Qin, Zhihua Wei, Wen Shen. (2026). "Mitigating Action-Relation Hallucinations in LVLMs via Relation-aware Visual Enhancement." ACL 2026.

Multilingual Safety Alignment via Self-Distillation

Published in arXiv, 2026

We propose an on-policy self-distillation method for multilingual safety alignment, transferring the model’s own safety capabilities from high-resource to low-resource languages without reliance on human-annotated safety data.

Recommended citation: Ruiyang Qin*, Qingzhuo Wang*, Dongrui Liu, Qiang Li, Zhihua Wei, Wen Shen. (2026). "Multilingual Safety Alignment via Self-Distillation." arXiv 2026.

A Unified Approach to Interpreting Knowledge Distillation for Large Language Models via Interactions

Published in ICML 2026

We interpret knowledge distillation from a game-theoretic interaction perspective, revealing that the essence of distillation is the sparsification of interactions, and propose the CIP loss to explicitly enforce this mechanism.

Recommended citation: Qingzhuo Wang*, Ruiyang Qin*, Zhenxin Qin, Wen Shen, Zhihua Wei. (2026). "A Unified Approach to Interpreting Knowledge Distillation for Large Language Models via Interactions." ICML 2026.

Evaluating and Explaining Prompt Sensitivity of LLMs Using Interactions

Published in ICML 2026

We introduce game-theoretic interactions for a fine-grained analysis of the prompt sensitivity of LLMs, proposing the IPS metric and uncovering that factors like SFT and model scale reduce sensitivity by stabilizing low-order interactions.

Recommended citation: Ruiyang Qin, Qingzhuo Wang, Tian Wang, Zhihua Wei, Wen Shen. (2026). "Evaluating and Explaining Prompt Sensitivity of LLMs Using Interactions." ICML 2026.

Talks

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.