Publications

Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders

CVPR 2025 (Highlight, Acceptance rate 3.0%)
Fiona Ryan, Ajay Bati, Sangmin Lee, Daniel Bolya, Judy Hoffman, James M. Rehg

Improving Personalized Search with Regularized Low-Rank Parameter Updates

CVPR 2025 (Highlight, Acceptance rate 3.0%)
Fiona Ryan, Josef Sivic, Fabian Caba Heilbron, Judy Hoffman, James M. Rehg, Bryan Russell

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

CVPR 2025
Ozgur Kara, Krishna Kumar Singh, Feng Liu, Duygu Ceylan, James M. Rehg, Tobias Hinz

SocialGesture: Delving into Multi-person Gesture Understanding

CVPR 2025
Xu Cao, Pranav Virupaksha, Wenqi Jia, Bolin Lai, Fiona Ryan, Sangmin Lee, James M. Rehg

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

CVPR 2025
Zixuan Huang, Mark Boss, Aaryaman Vasishta, James M. Rehg, Varun Jampani

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

CVPR 2025 (Highlight, Acceptance rate 3.0%)
Xiang Li, Zixuan Huang, Anh Thai, James M. Rehg

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

CVPR 2025 (Highlight, Acceptance rate 3.0%)
Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M. Rehg, Sangmin Lee, Ning Zhang, Tong Xiao

RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data

ICLR 2025
Maxwell A. Xu, Jaya Narain, Gregory Darnell, Haraldur Hallgrimsson, Hyewon Jeong, Darren Forde, Richard Fineman, Karthik J. Raghuram, James M. Rehg, Shirley Ren

Leveraging Object Priors for Point Tracking

ECCV 2024 ILR Workshop
Bikram Boote, Anh Thai, Wenqi Jia, Ozgur Kara, Stefan Stojanov, James M. Rehg, Sangmin Lee

3x2: 3D Object Part Segmentation by 2D Semantic Correspondences

ECCV 2024
Anh Thai, Weiyao Wang, Hao Tang, Stefan Stojanov, James M. Rehg, Matt Feiszli

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

ECCV 2024 (Oral, Acceptance rate 2.3%, 🏆 Best Paper Finalist)
Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

ECCV 2024
Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu*, James M. Rehg*

MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

CVPR 2024
Xu Cao*, Tong Zhou*, Yunsheng Ma*, Wenqian Ye, Can Cui, Kun Tang, Zhipeng Cao, Kaizhao Liang, Ziran Wang, James M. Rehg, Chao Zheng

PointInfinity: Resolution-Invariant Point Diffusion Models

CVPR 2024
Zixuan Huang, Justin Johnson, Shoubhik Debnath, James M. Rehg, Chao-Yuan Wu

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

CVPR 2024 (Oral, Acceptance rate 0.8%)
Kristen Grauman et al. (including Bikram Boote, Fiona Ryan, James M. Rehg)

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

CVPR 2024
Yunsheng Ma*, Can Cui*, Xu Cao*, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera, James M. Rehg, Ziran Wang

Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

CVPR 2024 (Oral, Acceptance rate 0.8%)
Sangmin Lee, Bolin Lai, Fiona Ryan, Bikram Boote, James M. Rehg

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

CVPR 2024 (Highlight, Acceptance rate 3.6%)
Ozgur Kara*, Bariscan Kurtkaya*, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

CVPR 2024
Wenqi Jia, Miao Liu, Hao Jiang, Ishwarya Ananthabhotla, James M. Rehg, Vamsi Krishna Ithapu, Ruohan Gao

ZeroShape: Regression-based Zero-shot Shape Reconstruction

CVPR 2024
Zixuan Huang*, Stefan Stojanov*, Anh Thai, Varun Jampani, James M. Rehg