Hi! I am a graduate student in Data Science at Harvard University. My research focuses on aligning vision, language, and domain generalization in AI/ML applications, and on improving the reasoning, robustness, and fairness of large multimodal models. I'm fortunate to work with Peter Baile Chen from MIT CSAIL and Prof. Mengyu Wang from the Harvard AI and Robotics Lab.
Previously, I graduated summa cum laude from the University of California, San Diego (UCSD) with a triple major: a B.S. in Data Science, a B.S. in Applied Mathematics, and a B.A. in Economics. During my undergraduate years I was advised by Prof. Zhuowen Tu and Prof. Hao Zhang, working on multimodal LLMs and benchmarking. I am the sole recipient of the 2024 Jeffrey B. Remmel Award for Academic Excellence for my contributions to the data science community at UCSD.
I am also an experienced full-stack developer; my full-stack internship project was scoped at the L6 (Senior) SDE level according to Amazon's internal SDE role guide. I am currently on the job market, so please reach out about any opportunities!
Research
LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback
@misc{li2026llmreviewenhancingcreative,
title={LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback},
author={Weiyue Li and Mingxiao Song and Zhenda Shen and Dachuan Zhao and Yunfan Long and Yi Li and Yongce Li and Ruyi Yang and Mengyu Wang},
year={2026},
eprint={2601.08003},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.08003},
}
We introduce LLM Review, a peer-review-inspired creative-writing framework in which multiple writer agents exchange targeted critiques under a blind peer-review protocol, reducing multi-agent homogenization while still benefiting from feedback. To evaluate creativity rigorously, we also propose SciFi-100 together with a unified evaluation stack that combines LLM-as-a-judge, human annotation, and rule-based novelty metrics. Experiments show that LLM Review consistently ranks best among multi-agent baselines on both rubric scores and novelty signals, and can let smaller models outperform larger single-agent baselines, suggesting that interaction structure can substitute for scale.
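Below is a minimal sketch of the kind of blind peer-review loop this framework describes, assuming a generic chat-completion callable `call_llm`; the prompts, round structure, and agent count are illustrative and are not the paper's implementation.

```python
# A minimal sketch of a blind peer-review writing loop (illustrative only).
# `call_llm(prompt) -> str` is assumed to be a thin wrapper around any LLM API.
from typing import Callable

def llm_review(prompt: str, call_llm: Callable[[str], str],
               n_writers: int = 3, n_rounds: int = 2) -> list[str]:
    # Each writer agent produces an independent initial draft.
    drafts = [call_llm(f"Write a short sci-fi story for the prompt: {prompt}")
              for _ in range(n_writers)]
    for _ in range(n_rounds):
        # Blind peer review: every draft receives a targeted critique with no
        # author or reviewer identity attached.
        critiques = [call_llm("Give targeted, constructive critiques of this "
                              "anonymized story:\n" + d) for d in drafts]
        # Each writer revises their own draft using only the anonymous feedback,
        # which limits direct imitation and multi-agent homogenization.
        drafts = [call_llm("Revise the story using the anonymous peer feedback.\n"
                           f"Story:\n{d}\nFeedback:\n{c}")
                  for d, c in zip(drafts, critiques)]
    return drafts
```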
Grading Scale Impact on LLM-as-a-Judge: Human-LLM Alignment Is Highest on 0-5 Grading Scale
Weiyue Li*, Minda Zhao*, Weixuan Dong*, Jiahui Cai*, Yuze Wei*, Michael Pocress, Yi Li, Wanyan Yuan, Xiaoyue Wang, Ruoyu Hou, Kaiyuan Lou, Wenqi Zeng, Yutong Yang, Yilun Du, Mengyu Wang. Under review.
@article{li2026grading,
title={Grading Scale Impact on LLM-as-a-Judge: Human-LLM Alignment Is Highest on 0-5 Grading Scale},
author={Li, Weiyue and Zhao, Minda and Dong, Weixuan and Cai, Jiahui and Wei, Yuze and Pocress, Michael and Li, Yi and Yuan, Wanyan and Wang, Xiaoyue and Hou, Ruoyu and others},
journal={arXiv preprint arXiv:2601.03444},
year={2026}
}
We study how the grading scale itself affects LLM-as-a-judge reliability and human-LLM alignment by collecting human and LLM ratings across three scales and six benchmarks spanning objective, subjective, and mixed tasks. Using the intraclass correlation coefficient (ICC) for absolute agreement, we find that scale choice can substantially shift human-LLM alignment (even when within-panel reliability is high) and that a 0-5 scale yields the strongest overall alignment. We also show that pooled reliability can hide benchmark heterogeneity, and we reveal systematic subgroup alignment differences (e.g., across gender), motivating careful scale design and sub-level diagnostics in judge protocols.
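For readers unfamiliar with the metric, here is a small sketch of how an absolute-agreement ICC can be computed between a human panel and an LLM judge using the pingouin library; the toy ratings and column names are made up for illustration and are not the paper's data.

```python
# Toy example: ICC2 ("single random raters", absolute agreement) between a
# human rating and an LLM-judge rating of the same items on a 0-5 scale.
import pandas as pd
import pingouin as pg

df = pd.DataFrame({
    "item":  [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],   # responses being graded
    "rater": ["human", "llm"] * 6,                    # two raters per item
    "score": [4, 5, 2, 2, 0, 1, 3, 3, 5, 4, 1, 1],    # ratings on a 0-5 scale
})

icc = pg.intraclass_corr(data=df, targets="item", raters="rater", ratings="score")
# ICC2 measures absolute agreement rather than mere consistency between raters.
print(icc.set_index("Type").loc["ICC2", ["ICC", "CI95%"]])
```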
Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models
@article{zhao2025bias,
title={Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models},
author={Zhao, Dachuan and Li, Weiyue and Shen, Zhenda and Qiu, Yushu and Xu, Bowen and Chen, Haoyu and Chen, Yongchao},
journal={arXiv preprint arXiv:2511.18123},
year={2025}
}
We rethink post-hoc debiasing in vision-language models, showing that “coordinate editing” methods can fail due to feature entanglement, cross-dataset dimension drift, and residual bias leakage, because bias is distributed over a low-dimensional subspace rather than a few individual coordinates. Building on this geometric view, we propose Subspace Projection Debiasing (SPD): use INLP to identify the linearly decodable bias subspace, project embeddings onto its orthogonal complement, and reinject a neutral mean to preserve semantic fidelity. Across zero-shot classification, text-to-image retrieval, and text-to-image generation, SPD achieves more robust fairness gains (avg. +18.5% across four fairness metrics) with minimal task-performance loss compared to strong debiasing baselines.
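A rough NumPy sketch of the projection step this describes, with random placeholder directions standing in for those found by INLP and one simple interpretation of the neutral-mean reinjection; it is not the paper's code.

```python
# Remove a k-dimensional bias subspace from embeddings and add back the
# bias-subspace component of a neutral mean (one possible interpretation).
import numpy as np

def subspace_project_debias(embeds: np.ndarray, bias_dirs: np.ndarray,
                            neutral_mean: np.ndarray) -> np.ndarray:
    """embeds: (n, d); bias_dirs: (k, d) spanning the bias subspace; neutral_mean: (d,)."""
    # Orthonormal basis of the bias subspace via QR on the direction vectors.
    B, _ = np.linalg.qr(bias_dirs.T)                  # (d, k)
    P_bias = B @ B.T                                  # projector onto the bias subspace
    P_orth = np.eye(embeds.shape[1]) - P_bias         # projector onto its orthogonal complement
    debiased = embeds @ P_orth                        # drop the bias component of each embedding
    # Reinject the neutral mean's bias-subspace component to preserve semantics.
    return debiased + P_bias @ neutral_mean

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 512))                         # toy CLIP-like embeddings
dirs = rng.normal(size=(3, 512))                      # toy bias directions (e.g., from INLP)
out = subspace_project_debias(x, dirs, x.mean(axis=0))
```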
CONCUR: A Framework for Continual Constrained and Unconstrained Routing
@article{chen2025concur,
title={CONCUR: A Framework for Continual Constrained and Unconstrained Routing},
author={Chen, Peter Baile and Li, Weiyue and Roth, Dan and Cafarella, Michael and Madden, Samuel and Andreas, Jacob},
journal={arXiv preprint arXiv:2512.09386},
year={2025}
}
We introduce CONCUR, a continual routing framework that maps each task to the best computation strategy (e.g., model + decoding method), supporting both unconstrained routing and budget-constrained routing. CONCUR uses a modular design, training one accuracy predictor and one cost predictor per strategy. This allows new strategies to be added by training only their respective predictors, which combine general-purpose and task-/strategy-specific representations to facilitate better decisions. Experiments on in- and out-of-distribution knowledge/reasoning tasks show improved end-to-end accuracy–cost tradeoffs over strong single-strategy and prior routing baselines, while reducing training overhead in continual settings.
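The routing decision itself can be pictured with a short sketch like the one below, assuming per-strategy accuracy and cost predictors are already trained; the `Strategy` container, budget rule, and fallback are illustrative and not CONCUR's implementation.

```python
# Schematic routing over computation strategies with per-strategy predictors.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Strategy:
    name: str                                  # e.g., "small model + chain-of-thought"
    predict_acc: Callable[[str], float]        # query -> predicted accuracy
    predict_cost: Callable[[str], float]       # query -> predicted cost

def route(query: str, strategies: list[Strategy],
          budget: Optional[float] = None) -> Strategy:
    # Budget-constrained routing keeps only strategies predicted to fit the budget;
    # unconstrained routing (budget=None) considers all of them.
    feasible = [s for s in strategies
                if budget is None or s.predict_cost(query) <= budget]
    if not feasible:                           # fall back to the cheapest strategy
        return min(strategies, key=lambda s: s.predict_cost(query))
    # Pick the feasible strategy with the highest predicted accuracy.
    return max(feasible, key=lambda s: s.predict_acc(query))
```

Because predictors are per-strategy, adding a new strategy only requires training its own accuracy and cost predictors, leaving existing ones untouched.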
The Necessity for Intervention Fidelity: Unintended Side Effects When Steering LLMs
@inproceedings{raedler2025the,
title={The Necessity for Intervention Fidelity: Unintended Side Effects When Steering {LLM}s},
author={Jonas B Raedler and Weiyue Li and Alyssa Mia Taliotis and Manasvi Goyal and Siddharth Swaroop and Weiwei Pan},
booktitle={ICML 2025 Workshop on Reliable and Responsible Foundation Models},
year={2025},
url={https://openreview.net/forum?id=8nYQEGou3L}
}
We introduce intervention fidelity to evaluate whether activation steering changes only the targeted behavior or also causes unintended side effects. On Gemma-2-2B (base) vs Gemma-2-2B-IT (instruction-tuned), steering away from stereotypes on StereoSet increases anti-stereotypical outputs in both, but the base model improves by suppressing stereotypical answers, while the fine-tuned model’s gains largely come from suppressing unrelated outputs, which aggregate metrics can hide. We hypothesize this stems from fine-tuning increasing latent-space anisotropy, entangling behaviors and reducing steering precision.
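For context, the sketch below shows one common way activation steering is implemented in PyTorch: adding a fixed steering vector to a layer's output via a forward hook. The layer path, vector, and scaling factor are placeholders rather than the paper's exact setup.

```python
# Add a steering vector to one decoder layer's output during generation.
import torch

def add_steering_hook(layer: torch.nn.Module, steer_vec: torch.Tensor,
                      alpha: float = 4.0):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * steer_vec.to(hidden.device, hidden.dtype)
        # Returning a value from a forward hook replaces the layer's output.
        return (steered,) + output[1:] if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# Hypothetical usage (assuming a Gemma-2-style model exposed as model.model.layers[i]):
#   handle = add_steering_hook(model.model.layers[12], anti_stereotype_vec)
#   ... generate ...
#   handle.remove()
# Intervention fidelity then asks whether outputs change only on the targeted
# behavior, or also on unrelated prompts.
```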
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
@inproceedings{hu2024bliva,
title={BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions},
author={Hu, Wenbo and Xu, Yifan and Li, Yi and Li, Weiyue and Chen, Zeyuan and Tu, Zhuowen},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={3},
pages={2256--2264},
year={2024}
}
We introduce BLIVA, an augmented version of InstructBLIP with a Visual Assistant. BLIVA incorporates the query embeddings from InstructBLIP and also directly projects encoded patch embeddings into the LLM, a technique inspired by LLaVA. This approach ensures that the model captures intricate details potentially missed during the query decoding process. Empirically, BLIVA significantly improves performance on text-rich VQA benchmarks (up to 17.76% on the OCR-VQA benchmark) and on typical VQA benchmarks (up to 7.9% on the Visual Spatial Reasoning benchmark) compared to our baseline, InstructBLIP. BLIVA also demonstrates strong capability in decoding real-world images, irrespective of text presence.
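A simplified sketch of the visual-token assembly this describes, with illustrative dimensions; the module below only conveys the idea of concatenating Q-Former query embeddings with linearly projected patch embeddings, not BLIVA's actual code.

```python
# Combine instruction-aware query embeddings with projected patch embeddings
# into one visual token sequence for the LLM (dimensions are illustrative).
import torch
import torch.nn as nn

class VisualAssistantTokens(nn.Module):
    def __init__(self, patch_dim=1408, qformer_dim=768, llm_dim=4096):
        super().__init__()
        self.query_proj = nn.Linear(qformer_dim, llm_dim)  # Q-Former output -> LLM space
        self.patch_proj = nn.Linear(patch_dim, llm_dim)    # raw patch embeds -> LLM space

    def forward(self, query_embeds, patch_embeds):
        # query_embeds: (B, 32, qformer_dim) from the instruction-aware Q-Former
        # patch_embeds: (B, 257, patch_dim) from the frozen vision encoder
        q = self.query_proj(query_embeds)
        p = self.patch_proj(patch_embeds)
        # Concatenate along the sequence axis so the LLM also sees fine-grained
        # patch detail that the query decoding step might miss.
        return torch.cat([q, p], dim=1)                     # (B, 32 + 257, llm_dim)
```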
Selective Projects
SON: Enhancing Prompt Understanding of Diffusion Models with Large Language Models Guided Layouts
We introduce Spatial-Overlap-Numeracy-1K (SON-1K), a comprehensive benchmark for text-to-image generation comprising 1,000 complex prompts spanning three subtasks: spatial relationships, numeracy counts, and complex natural prompts. Alongside the benchmark, we propose several evaluation metrics to comprehensively assess compliance with the prompts. We also propose a new approach, Language Model-Guided Diffusion++ (LMD++), which enhances the two-stage LLM-grounded diffusion pipeline (LMD).
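As one illustration of what prompt-compliance metrics can look like, the sketch below scores numeracy compliance from hypothetical detector outputs; the scoring rule and data format are assumptions, not SON-1K's actual metrics.

```python
# Toy numeracy-compliance score: does the generated image contain the
# requested number of each object, according to a detector's labels?
from collections import Counter

def numeracy_score(requested: dict[str, int], detections: list[str]) -> float:
    """requested: e.g. {"cat": 3, "dog": 1}; detections: object labels from a detector."""
    found = Counter(detections)
    # Per-object score is 1 when the detected count matches the requested count.
    per_obj = [1.0 if found.get(obj, 0) == n else 0.0 for obj, n in requested.items()]
    return sum(per_obj) / len(per_obj) if per_obj else 0.0

# numeracy_score({"cat": 3, "dog": 1}, ["cat", "cat", "cat", "dog"])  # -> 1.0
```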
Services
CVPR: Reviewer (2026)
Teaching
MLOps & LLMOps: Production AI Systems
Prof. Pavlos Protopapas
Harvard AC 215 FA25 website
This course provides a comprehensive understanding of the Deep Learning process with a strong emphasis on Machine Learning Operations (MLOps). It bridges the gap between model development and production operation, combining data science, data engineering, and software engineering practices. Students learn to build, deploy, and manage AI systems through an iterative process of development, testing, monitoring, and updating.
Planning and Learning Methods in AI
Prof. Stephanie Gil, Prof. Kiante Brantley
Harvard CS 1820 WI25
The course introduces the ideas and techniques underlying artificial intelligence, with the goal of teaching students to identify effective representations and approaches for a wide variety of computational tasks. Topics covered in this course are broadly divided into search and planning, optimization and games, and uncertainty and learning. Special attention is given to ethical considerations in AI and to applications that benefit society.
Introduction to Computational Linguistics and Natural-language Processing
Prof. Stuart Shieber
Harvard CS 1870 FA24
This course introduces the field of computational linguistics and natural language processing (NLP). It covers methods for analyzing and generating human language, including syntax, semantics, and pragmatics, as well as applications such as machine translation, information extraction, and dialogue systems.
This course covers the basics of neural networks as well as recent developments in deep learning, including deep belief nets, convolutional neural networks, recurrent neural networks, long short-term memory, and reinforcement learning. We study the details of these deep learning architectures with a focus on learning end-to-end models, particularly for image classification.
Introduction to Machine Learning
Prof. Edwin Solares
UCSD CSE 151A WI24
Broad introduction to machine learning. Topics include supervised learning methods such as k-nearest-neighbor classifiers, decision trees, boosting, and perceptrons, and unsupervised learning methods such as k-means and hierarchical clustering. In addition to the algorithms themselves, the course focuses on the principles behind them.
Introduction to probabilistic models at the heart of modern artificial intelligence. Specific topics to be covered include probabilistic methods for reasoning and decision-making under uncertainty; inference and learning in Bayesian networks; prediction and planning in Markov decision processes; applications to intelligent systems, speech and natural language processing, information retrieval, and robotics.
The Practice and Application of Data Science (X3)
Prof. Tauhidur Rahman, Prof. Suraj Rampure
UCSD DSC 80 WI24, SP23, WI23 website / evaluation
Students master the data science life-cycle and learn many of the fundamental principles and techniques of data science spanning algorithms, statistics, machine learning, visualization, and data systems.
DSC 40B, the second course in the sequence, introduces fundamental topics in combinatorics, graph theory, probability, and continuous and discrete algorithms with applications to data analysis.
Theoretical Foundations of Data Science I
Prof. Truong Son Hy and Prof. Mahdi Soleymani
UCSD DSC 40A FA22 website / evaluation
DSC 40A will introduce fundamental topics in machine learning, statistics, and linear algebra with applications to data analysis.
Data Structures and Algorithms for Data Science
Prof. Soohyun Liao and Prof. Marina Langlois
UCSD DSC 30 SP22 website
Programming techniques including encapsulation, abstract data types, interfaces, algorithms and complexity, and data structures such as stacks, queues, priority queues, heaps, linked lists, binary trees, binary search trees, and hash tables with Java.
Programming and Basic Data Structures for Data Science (X4)
Programming techniques including recursion, higher-order functions, function composition, object-oriented programming, interpreters, classes, and simple data structures such as arrays, lists, and linked lists.
Principles of Data Science
Prof. Suraj Rampure, Prof. Janine Tiefenbruck, Prof. Rod Albuyeh
UCSD DSC 10 FA23 website
This first course in data science introduces students to data exploration, statistical inference, and prediction. It introduces the Python programming language as a tool for tabular data manipulation, visualization, and simulation. Through homework assignments and projects, students are given an opportunity to develop their analytical skills while working with real-world datasets from a variety of domains.
Econometrics (X2)
Prof. Gordon Dahl, Prof. Maria Candido
UCSD ECON 120B WI23, FA22 website / evaluation
This course prepares students for empirical analysis in an academic or business setting. It covers the fundamentals of regression, including estimation and hypothesis testing in a univariate and multivariate framework. It presents ideas using the “potential outcomes” framework and makes the important distinction between prediction and causality. The course discusses reasons why estimators may be biased or inconsistent, and how both randomized experiments and natural experiments can be used to obtain causal estimates.
This course uses a variety of topics in mathematics to introduce students to rigorous mathematical proof, emphasizing quantifiers, induction, negation, proof by contradiction, naive set theory, equivalence relations, and epsilon-delta proofs.
Introduction to Differential Equations (X2)
Prof. Nandagopal Ramachandran, Prof. Ming Xiao
UCSD MATH 20D FA22*, SP21* website / evaluation
Ordinary differential equations: exact, separable, and linear; constant coefficients, undetermined coefficients, variation of parameters. Systems. Series solutions. Laplace transforms. Techniques for engineering sciences. Computing symbolic and graphical solutions using MATLAB.
Calculus and Analytic Geometry for Science and Engineering
Prof. Emmanuel Vavalis
UCSD MATH 20C WI21* website
Vector geometry, vector functions and their derivatives. Partial differentiation. Maxima and minima. Double integration.
Calculus for Science and Engineering (X2)
Prof. Yucheng Tu, Prof. Yuming Zhang and Prof. Jacob Sterbenz
UCSD MATH 20A WI22, FA21 website / evaluation
Foundations of differential and integral calculus of one variable. Functions, graphs, continuity, limits, derivative, tangent line. Applications with algebraic, exponential, logarithmic, and trigonometric functions. Introduction to the integral.