Pan Lu

I am a Postdoctoral Scholar at Stanford University. I am affiliated with Stanford AI Lab, Zou's Group, and Choi's xlab, where I am fortunate to be advised by Professor James Zou and Professor Yejin Choi.

I received my Ph.D. in computer science from UCLA, where I was advised by Kai-Wei Chang and Song-Chun Zhu. I was a member of the UCLA Natural Language Processing Group (UCLA NLP). Previously, I completed my M.S. in computer science at Tsinghua University, supervised by Jianyong Wang. My research has been recognized with the Most Influential ICLR Paper Award (top-15 cited at ICLR 2024), the Most Influential NIPS Paper Award (top-15 cited at NeurIPS 2022), the KnowledgeNLP 2025 Workshop Best Paper Award, and an EMNLP 2024 Best Paper Nomination. These achievements were made possible by the support of my advisors and collaborators. I have been fortunate to receive the Amazon PhD Fellowship, the Bloomberg Data Science Ph.D. Fellowship (Global 9), the Qualcomm Innovation Fellowship (18 winners), the UCLA Dissertation Year Fellowship, and the NeurIPS Scholar Award.

My research goal is to develop intelligent machines that can reason and collaborate with humans for the common good. My primary focus lies in machine learning and natural language processing, particularly in machine reasoning, mathematical reasoning, and scientific discovery. My recent research interests include:

Tool-Augmented LLMs and Agentic Systems for complex reasoning [OctoTools] [Chameleon] [ChemAgent]
Post-Training and Test-Time Training techniques for foundation models [STIC] [LLaMA-Adapter] [LLaMA-Adapter V2] [SPHINX-X] [TextGrad] [PromptPG]
AI for Math: advancing mathematical reasoning capabilities of AI systems and LLMs across multimodal, knowledge-intensive, and real-world contexts [IneqMath] [MathVista] [MathVerse] [PromptPG] [Inter-GPS] [IconQA] [TheoremQA] [DL4Math] [MATH-AI]
AI for Science: AI systems that facilitate scientific reasoning and scientific discovery [ScienceQA] [SciBench] [Protein-LLM] [ChemAgent]

[06/2025] We are seeking students to collaborate on research in agentic AI, post-training LLMs, reinforcement learning, mathematical reasoning, AI for Science, and related fields. A background in these fields is preferred but not strictly required. If you're interested in joining us, please apply via this form. For a faster response, kindly send me an email after submitting the form.

[05/2024] New! A paper on enhancing LVLMs with self-training is available at Preprint.
[05/2024] New! Thrilled to be awarded the Bloomberg Data Science Ph.D. Fellowship! Thanks!
[05/2024] New! One paper on advanced quantitative reasoning is accepted to ACL 2024 (Findings).
[05/2024] New! Two papers on math reasoning and VLMs are accepted at ICML 2024. See you in Vienna!
[04/2024] New! Defended my doctoral dissertation! Thanks to my advisor and committee members!
[03/2024] New! I am co-organizing the AI for Math Workshop at ICML 2024. See you in Vienna!
[03/2024] New! A paper on visual math reasoning with Multi-modal LLMs is available at Preprint.
[02/2024] New! A paper on LLMs for advanced quantitative reasoning is available at Preprint.
[01/2024] New! Two papers on large multimodal models are accepted to ICLR 2024.
[01/2024] New! A paper on model editing for LLMs is available at Preprint.
[12/2023] New! I am co-organizing the Tool-Augmented VIsion Workshop at CVPR 2024. See you in Seattle!
[12/2023] New! I am attending NeurIPS 2023 from Dec 10 to Dec 16. See you in New Orleans!
[12/2023] New! Google's Gemini benchmarks our MathVista for evaluating math reasoning in visual contexts!
[11/2023] New! Honored to be covered by UCLA CS for winning Qualcomm Innovation Fellowship. Thanks!
[10/2023] New! The 112-page study on GPT-4V, Bard, and others on visual math reasoning is available here.
[10/2023] New! Honored to serve as PC Chair and co-organize SoCal NLP 2023. See you in LA!
[10/2023] New! One paper on mathematical reasoning is accepted to EMNLP 2023.
[10/2023] New! One paper on mathematical reasoning in visual contexts (MathVista) is submitted to Preprint.
[09/2023] New! One paper on tool-augmented LLMs is accepted to NeurIPS 2023.
[07/2023] New! One paper on a scientific reasoning benchmark (SciBench) is submitted to Preprint.
[07/2023] New! I am co-organizing the 3rd MATH-AI Workshop at NeurIPS 2023. See you in New Orleans!
[06/2023] New! Excited to receive the UCLA Dissertation Year Fellowship.
[05/2023] New! Honored to deliver a guest lecture for UCLA CS 263: Natural Language Processing. [Slides]
[05/2023] New! One paper on theorem-driven math question answering (TheoremQA) is available at Preprint.
[05/2023] New! Honored to deliver an invited talk on tool-augmented LLMs at Google Brain. [Slides]
[05/2023] New! Delighted to join a LightningAI event on Discord as an invited speaker.
[05/2023] New! A paper on multimodal procedural planning is available at Preprint.
[05/2023] New! One survey paper on deep learning for mathematical reasoning is accepted to ACL 2023.
[04/2023] New! LLaMA-Adapter-V2, a parameter-efficient visual instruction model, is available at Preprint.
[04/2023] New! One tutorial proposal on mathematical reasoning is accepted to IJCAI 2023.
[04/2023] New! One paper on tool augmented LLMs (Chameleon) is available at Preprint.
[04/2023] New! Two papers are accepted to CVPR 2023 O-DRUM Workshop.
[03/2023] New! One paper on fine-tuning LLaMA in one hour (LLaMA-Adapter) is available at Preprint.
[01/2023] New! One paper on in-context learning for math reasoning (PromptPG) is accepted to ICLR 2023.
[12/2022] New! A survey paper on deep learning for mathematical reasoning is available at Preprint.
[12/2022] New! One paper is accepted to AAAI'23 KnowledgeNLP Workshop as an Oral Presentation.
[12/2022] New! I am excited to join Microsoft Research as a research intern!
[10/2022] New! Happy to receive the NeurIPS 2022 Scholar Award.
[10/2022] New! Two papers on mathematical reasoning are accepted to EMNLP 2022.
[09/2022] New! One paper on prompt learning for math reasoning (PromptPG) is submitted to Preprint.
[09/2022] New! One paper on chain-of-thought reasoning for ScienceQA is accepted to NeurIPS 2022.
[07/2022] New! I am co-organizing the 2nd MATH-AI Workshop at NeurIPS 2022. See you in New Orleans!
[07/2022] New! One paper on socially intelligent agents is accepted to SIGDIAL 2022.
[04/2022] Excited to be listed as a Highlighted Reviewer for ICLR 2022.
[03/2022] I am excited to join Allen Institute for AI (AI2) as a research intern!
[03/2022] One paper on character animation sampling is submitted to Preprint.
[12/2021] Two papers are accepted to AAAI 2022.
[10/2021] One paper on visual question answering for icon images (IconQA) is accepted to NeurIPS 2021.
[07/2021] I am co-organizing the MATHAI4ED Workshop at NeurIPS 2021. Welcome to participate!
[07/2021] Our workshop proposal for Math AI for Education (MATHAI4ED) is accepted to NeurIPS 2021.
[05/2021] One paper on interpretable geometry problem solving is accepted to ACL 2021 as an Oral Presentation.
[05/2021] One paper on social relation inference in dialogues is accepted to ACL 2021 as an Oral Presentation.
[03/2021] One paper on socially intelligent agents is submitted to Preprint.

OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Pan Lu*, Bowen Chen*, Sheng Liu*, Rahul Thapa, Joseph Boen, James Zou
arXiv:2502.11271 [Project] [Paper] [Code] [Package] [Demo] [YouTube] [Twitter] [Slack] [BibTex]

(*Equal Contribution)
🏆 Best Paper Award, KnowledgeNLP Workshop at NAACL 2025

Solving Inequality Proofs with Large Language Models
Jiayi Sheng*, Luna Lyu*, Jikai Jin, Tony Xia, Alex Gu, James Zou†, Pan Lu†
arXiv:2506.07927 [Project] [Paper] [Code] [Data] [Submission] [Twitter] [BibTex]

(† Co-senior authors)

Optimizing generative AI by backpropagating language model feedback
Mert Yuksekgonul*, Federico Bianchi*, Joseph Boen*, Sheng Liu*, Pan Lu*, Zhi Huang*, Carlos Guestrin, James Zou
Nature 639, 609–616 (2025) [Project] [Paper] [Code] [YouTube] [Documentation] [BibTex]

(*Equal Contribution)

Protein Large Language Models: A Comprehensive Survey
Yijia Xiao, Wanjia Zhao, Junkai Zhang, Yiqiao Jin, Han Zhang, Zhicheng Ren, Renliang Sun, Haixin Wang, Guancheng Wan, Pan Lu, Xiao Luo, Yu Zhang, James Zou, Yizhou Sun, Wei Wang
Preprint [Paper] [PDF] [Tutorial] [Coverage] [BibTex]

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
Xueqing Wu, Yuheng Ding, Bingxuan Li, Pan Lu, Da Yin, Kai-Wei Chang, Nanyun Peng
CVPR 2025 [Project] [Paper] [PDF] [Code] [Data] [BibTex]

ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
Xiangru Tang, Tianyu Hu, Muyang Ye, Yanjun Shao, Xunjian Yin, Siru Ouyang, Wangchunshu Zhou, Pan Lu, Zhuosheng Zhang, Yilun Zhao, Arman Cohan, Mark Gerstein
ICLR 2025 [Paper] [PDF] [Code] [News] [BibTex]

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou, Mohsen Fayyaz, Pan Lu, Kai-Wei Chang, Nanyun Peng
ICLR 2025 [Project] [Paper] [PDF] [Code] [HF Dataset] [BibTex]

MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines
Dongzhi Jiang, Renrui Zhang, Ziyu Guo, Yanmin Wu, Jiayi Lei, Pengshuo Qiu, Pan Lu, Zehui Chen, Guanglu Song, Peng Gao, Yu Liu, Chunyuan Li, Hongsheng Li
ICLR 2025 [Project] [Paper] [PDF] [Hugging Face] [Code] [Data] [BibTex]

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Fei Wang*, Xingyu Fu*, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen
ICLR 2025 [Project] [Paper] [PDF] [Hugging Face] [Code] [Data] [Twitter] [BibTex]
(*Equal Contribution)

Enhancing Large Vision Language Models with Self-Training on Image Comprehension
Yihe Deng*, Pan Lu*, Fan Yin, Ziniu Hu, Sheng Shen, Quanquan Gu, James Zou, Kai-Wei Chang, Wei Wang
NeurIPS 2024 [Project] [Paper] [PDF] [Hugging Face] [Code] [Model] [Data] [BibTex]
(*Equal Contribution)

Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue
Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng
EMNLP 2024 [Paper] [PDF] [Code] [Twitter] [BibTex]
🏆 Best Paper Nomination, EMNLP 2024

VDebugger: Harnessing Execution Feedback for Debugging Visual Programs
Xueqing Wu, Zongyu Lin, Songyan Zhao, Te-Lin Wu, Pan Lu, Nanyun Peng, Kai-Wei Chang
EMNLP 2024 (Findings) [Project] [Paper] [PDF] [Code] [Model] [Data] [Twitter] [BibTex]

Multimodal Procedural Planning via Dual Text-Image Prompting
Yujie Lu, Pan Lu, Zhiyu Chen, Wanrong Zhu, Xin Eric Wang, William Yang Wang
EMNLP 2024 (Findings) [Paper] [PDF] [Code] [Twitter] [Coverage] [BibTex]

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Peng Gao, Hongsheng Li
ECCV 2024 [Project] [Paper] [PDF] [Code] [Data] [Visualization] [Coverage] [Daily Papers] [BibTex]

Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data
Xiao Liu, Zirui Wu, Xueqing Wu, Pan Lu, Kai-Wei Chang, Yansong Feng
ACL 2024 (Findings) [Project] [Paper] [PDF] [Code] [Data] [Twitter] [BibTex]

SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Xiaoxuan Wang*, Ziniu Hu*, Pan Lu*, Yanqiao Zhu*, Jieyu Zhang, Satyen Subramaniam, Arjun R. Loomba, Shichang Zhang, Yizhou Sun, Wei Wang
ICML 2024 [Paper] [PDF] [Code] [Twitter] [BibTex]

(*Equal Contribution)
Nature News Feature (15 November 2023)

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Peng Gao, Renrui Zhang, Chris Liu, Longtian Qiu, Siyuan Huang, Weifeng Lin, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li, Yu Qiao
ICML 2024 [Paper] [PDF] [Code] [Doc] [Hugging Face] [Twitter] [Coverage] [BibTex]

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao
ICLR 2024 [Project] [Paper] [PDF] [Code] [Dataset] [Leaderboard] [Visualize] [Coverage] [BibTex]

🏆 Most Influential ICLR Papers (Top-15 cited paper at ICLR-24)
🏆 Oral Presentation (1.2%) (85 in 7304 submissions)

LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
Renrui Zhang, Jiaming Han, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, Peng Gao, Yu Qiao
ICLR 2024 [Paper] [PDF] [Code] [Twitter] [Coverage] [BibTex]

LightningAI Blog Feature (14 April 2023)

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models
Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Jianfeng Gao
NeurIPS 2023 [Project] [Paper] [PDF] [Code] [Twitter] [Coverage] [BibTex]

🏆 Best Weekly AI Paper (by AlphaSignal, 1st in 1682, 0.06%)
🏆 Awesome NeurIPS 2023 Papers (40 in 3584, 1.1%)
🏆 NeurIPS 2023 Top 10 Multimodal ML Papers

KokoMind: Can LLMs Understand Social Interactions?
Weiyan Shi*, Liang Qiu*, Dehong Xu, Pengwei Sui, Pan Lu, Zhou Yu
[Project] [Code] [Twitter] [Twitter] [BibTex]
(*Equal Contribution)

TheoremQA: A Theorem-driven Question Answering Dataset
Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, Tony Xia
EMNLP 2023 [Paper] [PDF] [Code] [Twitter] [BibTex]

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model
Peng Gao, Jiaming Han, Renrui Zhang, Ziyi Lin, Shijie Geng, Aojun Zhou, Wei Zhang, Pan Lu, Conghui He, Xiangyu Yue, Hongsheng Li, Yu Qiao
arXiv:2304.15010 [Paper] [PDF] [Code] [Gradio] [Gradio-Multimodal] [Twitter] [YouTube] [BibTex]

A Survey of Deep Learning for Mathematical Reasoning
Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, Kai-Wei Chang
ACL 2023 [Paper] [PDF] [Code] [Poster] [Twitter] [Coverage] [BibTex]

🏆 Most Influential ArXiv (Artificial Intelligence) Papers (Top-25 cited paper at arXiv-22)

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan
ICLR 2023 [Paper] [PDF] [Project] [Data] [Code] [Explore] [Leaderboard] [Twitter] [BibTex]

Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu, Swaroop Mishra, Tony Xia, Liang Qiu, Kai-Wei Chang, Song-Chun Zhu, Oyvind Tafjord, Peter Clark, Ashwin Kalyan
NeurIPS 2022 [Paper] [PDF] [Project] [Data] [Huggingface] [Code] [Explore] [Leaderboard] [Twitter] [BibTex]

🏆 Most Influential NIPS Papers (Top-15 cited paper at NeurIPS-22)

LILA: A Unified Benchmark for Mathematical Reasoning
Swaroop Mishra*, Matthew Finlayson*, Pan Lu, Leonard Tang, Sean Welleck, Chitta Baral, Tanmay Rajpurohit, Oyvind Tafjord, Ashish Sabharwal, Peter Clark, Ashwin K. Kalyan
EMNLP 2022 [Paper] [PDF] [Project] [Data] [Code] [Huggingface] [BibTex]
(*Equal Contribution)

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression
Jiaqi Chen, Tong Li, Jinghui Qin, Pan Lu, Liang Lin, Chongyu Chen, Xiaodan Liang
EMNLP 2022 [Paper] [PDF] [Code] [BibTex]

Towards Socially Intelligent Agents with Mental State Transition and Human Utility
Liang Qiu*, Yizhou Zhao*, Yuan Liang, Pan Lu, Weiyan Shi, Zhou Yu, Song-Chun Zhu
SIGDIAL 2022 [Paper] [PDF] [BibTex]
(*Equal Contribution)

Learning from the Tangram to Solve Mini Visual Tasks
Yizhou Zhao, Liang Qiu, Pan Lu, Feng Shi, Tian Han, Song-Chun Zhu
AAAI 2022 [Paper] [PDF] [Code] [BibTex]
Oral Presentation

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning
Pan Lu, Liang Qiu, Jiaqi Chen, Tony Xia, Yizhou Zhao, Wei Zhang, Zhou Yu, Xiaodan Liang, Song-Chun Zhu
NeurIPS 2021 [Paper] [PDF] [Project] [Code] [BibTex]
Datasets and Benchmarks Track

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning
Pan Lu*, Ran Gong*, Shibiao Jiang*, Liang Qiu, Siyuan Huang, Xiaodan Liang, Song-Chun Zhu
ACL 2021 [Paper] [PDF] [Project] [Code] [BibTex]

Oral Presentation (*Equal Contribution)

SocAoG: Incremental Graph Parsing for Social Relation Inference in Dialogues
Liang Qiu, Yuan Liang, Yizhou Zhao, Pan Lu, Baolin Peng, Zhou Yu, Ying Nian Wu, Song-Chun Zhu
ACL 2021 [Paper] [PDF] [BibTex]
Oral Presentation

Learning Long- and Short-Term User Literal-Preference with Multimodal Hierarchical Transformer Network for Personalized Image Caption
Wei Zhang, Yue Ying, Pan Lu, Hongyuan Zha
AAAI 2020 [Paper] [PDF] [BibTex]

Knowledge Aware Semantic Concept Expansion for Image-Text Matching
Botian Shi, Lei Ji, Pan Lu, Nan Duan
IJCAI 2019 [Paper] [BibTex]
Oral Presentation

Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering
Peng Gao, Zhengkai Jiang, Haoxuan You, Pan Lu, Steven CH Hoi, Xiaogang Wang, Hongsheng Li
CVPR 2019 [Paper] [Code] [BibTex]
Oral Presentation

Knowledge-Aware Deep Dual Networks for Text-Based Mortality Prediction
Ning Liu, Pan Lu, Wei Zhang, Jianyong Wang
ICDE 2019 [Paper] [BibTex]

Question-Guided Hybrid Convolution for Visual Question Answering
Peng Gao, Hongsheng Li, Shuang Li, Pan Lu, Yikang Li, Steven Hoi, Xiaogang Wang
ECCV 2018 [Paper] [BibTex]

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
Pan Lu, Lei Ji, Wei Zhang, Nan Duan, Ming Zhou, Jianyong Wang
SIGKDD 2018 [Paper] [Project] [Video] [BibTex]
Oral Presentation

Co-attending Free-form Regions and Detections with Multi-modal Multiplicative Feature Embedding for Visual Question Answering
Pan Lu, Hongsheng Li, Wei Zhang, Jianyong Wang, Xiaogang Wang
AAAI 2018 [Paper] [Code] [BibTex]
Oral Presentation
