Publications | Jim Rehg

STRIDE: When to Speak Meets Sequence Denoising for Streaming Video Understanding

Junho Kim, Hosu Lee, James Matthew Rehg, Minsu Kim, Yong Man Ro

Narrative-Driven Paper-to-Slide Generation via ArcDeck

Tarik Can Ozden, Sachidanand VS, Furkan Horoz, Ozgur Kara, Junho Kim, James Matthew Rehg

Layer-Aware Video Composition via Split-then-Merge

Ozgur Kara, Yujia Chen, Ming-Hsuan Yang, James Matthew Rehg, Wen-Sheng Chu, Du Tran

Kirin: Animal Motion Generation from In-the-Wild Video

Brian Nlong Zhao, Zhuoyang Pan, James Matthew Rehg, Jiajun Wu, Shangzhe Wu

Decoding Children's Gait Behavior

Yifan Shen, Boyi Li, Meihuan Huang, Yuanzhe Liu, Xu Cao, Jinyang Jin, Zhengyuan Li, Anglin Liu, Junho Kim, Jingyuan Zhu, Lan Fangzhou, Jianguo Cao, Jintai Chen, Ismini Lourentzou, James Matthew Rehg

Vinedresser3D: Agentic Text-guided 3D Editing

Yankuan Chi, Xiang Li, Zixuan Huang, James Matthew Rehg

PDF Project

Toward Diffusible High-Dimensional Latent Spaces: A Frequency Perspective

Bolin Lai, Xudong Wang, Saketh Rambhatla, James Matthew Rehg, Zsolt Kira, Rohit Girdhar, Ishan Misra

PDF Project

Omni-MMSI: Toward Identity-attributed Social Interaction Understanding

Xinpeng Li, Bolin Lai, Hardy Chen, Shijian Deng, Cihang Xie, Yuyin Zhou, James Matthew Rehg, Yapeng Tian

PDF Project

Learning Predictive Visuomotor Coordination

Wenqi Jia, Bolin Lai, Miao Liu, Danfei Xu, James Matthew Rehg

PDF Project

How Much 3D Do Video Foundation Models Encode?

Zixuan Huang, Xiang Li, Zhaoyang Lv, James Matthew Rehg

PDF Project

Gaze Target Estimation Anywhere with Concepts

Xu Cao, Houze Yang, Vipin Gunda, Zhongyi Zhou, Tianyu Xu, Adarsh Kowdle, Inki Kim, James Matthew Rehg

PDF

Forecasting 3D Scanpaths in Egocentric Video

Fiona Ryan, Ishwarya Ananthabhotla, Yijun Qian, Judy Hoffman, James Matthew Rehg, Vamsi Krishna Ithapu, Calvin Murdock

PDF

CoherentHand: Temporally Consistent 3D Hand Trajectory Synthesis with Semantic Motion Priors

Bikram Boote, Junho Kim, Ozgur Kara, Sangmin Lee, James Matthew Rehg

PDF Project

DiffVax: Optimization-Free Image Immunization Against Diffusion-Based Editing

Tarik Can Ozden, Ozgur Kara, Oguzhan Akcin, Kerem Zaman, Shashank Srivastava, Sandeep P. Chinchali, James Matthew Rehg

PDF Project

Toward Human Deictic Gesture Target Estimation

Xu Cao, Pranav Virupaksha, Sangmin Lee, Bolin Lai, Wenqi Jia, Jintai Chen, James Matthew Rehg

PDF Project

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu, Xu Cao, Xiaofeng Zhang, Yixiao He, Wenming Ye, James Matthew Rehg, Ismini Lourentzou

PDF Project

DiffEye: Diffusion-Based Continuous Eye-Tracking Data Generation Conditioned on Natural Images

Ozgur Kara, Harris Nisar, James Matthew Rehg

PDF Project

Cue3D: Quantifying the Role of Image Cues in Single-Image 3D Generation

Xiang Li, Zirui Wang, Zixuan Huang, James Matthew Rehg

PDF Project

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M Rehg, Sangmin Lee, Ning Zhang, Tong Xiao

PDF Project

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

Xiang Li, Zixuan Huang, Anh Thai, James M Rehg

PDF Project

SPAR3D: Stable Point-Aware Reconstruction of 3D Objects from Single Images

Zixuan Huang, Mark Boss, Aaryaman Vasishta, James M Rehg, Varun Jampani

PDF Project

SocialGesture: Delving into Multi-person Gesture Understanding

Xu Cao, Pranav Virupaksha, Wenqi Jia, Bolin Lai, Fiona Ryan, Sangmin Lee, James M Rehg

PDF Project

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

Ozgur Kara, Krishna Kumar Singh, Feng Liu, Duygu Ceylan, James M Rehg, Tobias Hinz

PDF Project

Improving Personalized Search with Regularized Low-Rank Parameter Updates

Fiona Ryan, Josef Sivic, Fabian Caba Heilbron, Judy Hoffman, James M Rehg, Bryan Russell

PDF Project

Gaze-LLE: Gaze Target Estimation via Large-scale Learned Encoders

Fiona Ryan, Ajay Bati, Sangmin Lee, Daniel Bolya, Judy Hoffman, James M Rehg

PDF Project

RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data

Maxwell A Xu, Jaya Narain, Gregory Darnell, Haraldur Hallgrimsson, Hyewon Jeong, Darren Forde, Richard Fineman, Karthik J Raghuram, James M Rehg, Shirley Ren

PDF Project

Leveraging Object Priors for Point Tracking

Bikram Boote, Anh Thai, Wenqi Jia, Ozgur Kara, Stefan Stojanov, James M Rehg, Sangmin Lee

PDF

3x2: 3D Object Part Segmentation by 2D Semantic Correspondences

Anh Thai, Weiyao Wang, Hao Tang, Stefan Stojanov, James M. Rehg, Matt Feiszli

PDF Project

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu

PDF Project

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

Bolin Lai, Fiona Ryan, Wenqi Jia, Miao Liu*, James M. Rehg*

PDF Project

PointInfinity: Resolution-Invariant Point Diffusion Models

Zixuan Huang, Justin Johnson, Shoubhik Debnath, James M. Rehg, Chao-Yuan Wu

MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

Xu Cao*, Tong Zhou*, Yunsheng Ma*, Wenqian Ye, Can Cui, Kun Tang, Zhipeng Cao, Kaizhao Liang, Ziran Wang, James M. Rehg, Chao Zheng

Project

ZeroShape: Regression-based Zero-shot Shape Reconstruction

Zixuan Huang*, Stefan Stojanov*, Anh Thai, Varun Jampani, James M. Rehg

PDF Project

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

Wenqi Jia, Miao Liu, Hao Jiang, Ishwarya Ananthabhotla, James M. Rehg, Vamsi Krishna Ithapu, Ruohan Gao

PDF Project

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

Ozgur Kara*, Bariscan Kurtkaya*, Hidir Yesiltepe, James M. Rehg, Pinar Yanardag

PDF Project

Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

Sangmin Lee, Bolin Lai, Fiona Ryan, Bikram Boote, James M. Rehg

PDF Project

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma*, Can Cui*, Xu Cao*, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera, James M. Rehg, Ziran Wang

PDF

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman et al. (including Bikram Boote, Fiona Ryan, James M. Rehg)

PDF Project

REBAR: Retrieval-Based Reconstruction For Time-series Contrastive Learning

Maxwell A. Xu, Alexander Moreno, Hui Wei, Benjamin M. Marlin, James M. Rehg

PDF

Low-shot Object Learning with Mutual Exclusivity Bias

Anh Thai, Ahmad Humayun, Stefan Stojanov, Zixuan Huang, Bikram Boote, James M. Rehg

PDF Project

In the Eye of Transformer: Global–Local Correlation for Egocentric Gaze Estimation and Beyond

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg

PDF

Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games

Bolin Lai, Hongxin Zhang, Miao Liu, Aryan Pariani, Fiona Ryan, Wenqi Jia, Shirley Anugrah Hayati, James M. Rehg, Diyi Yang

PDF Project

ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency

Zixuan Huang, Varun Jampani, Anh Thai, Yuanzhen Li, Stefan Stojanov, James M. Rehg

PDF Project

Egocentric Auditory Attention Localization in Conversations

Fiona Ryan, Hao Jiang, Abhinav Shukla, James M. Rehg, Vamsi Krishna Ithapu

PDF Project

Which way is ‘right’?: Uncovering limitations of Vision-and-Language Navigation models

Meera Hahn, Amit Raj, James M. Rehg

PDF

Explaining a machine learning decision to physicians via counterfactuals

Supriya Nagesh, Nina Mishra, Yonatan Naamad, James M. Rehg, Mehul A Shah, Alexei Wagner

PDF

In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation

Bolin Lai, Miao Liu, Fiona Ryan, James M. Rehg

PDF Project Poster

Kernel Multimodal Continuous Attention

Alexander Moreno, Zhenke Wu, Supriya Nagesh, Walter Dempsey, James M. Rehg

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation

Maxwell A. Xu, Alexander Moreno, Supriya Nagesh, V. Burak Aydemir, David W. Wetter, Santosh Kumar, James M. Rehg

PDF Code Project

Learning Dense Object Descriptors from Multiple Views for Low-shot Category Generalization

Stefan Stojanov, Anh Thai, Zixuan Huang, James M Rehg

PDF Code Video

Discovering Novel Predictors of Minimally Verbal Outcomes in Autism through Computational Modeling

Maxwell Xu, James M. Rehg, Agata Rozga, Jena McDaniel, Paul Yoder, Linda R. Watson, Nancy Brady

PDF

Egocentric Activity Recognition and Localization on a 3D Map

Miao Liu, Lingni Ma, Kiran Somasundaram, Yin Li, Kristen Grauman, James M. Rehg, Chao Li

Generative Adversarial Network for Future Hand Segmentation from Egocentric Video

Wenqi Jia, Miao Liu, James M. Rehg

PDF Code Project Poster

Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues

Zixuan Huang, Stefan Stojanov, Anh Thai, Varun Jampani, James M. Rehg

PDF Code Project Poster

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Kristen Grauman et al. (including lab members Miao Liu, Fiona Ryan, Wenqi Jia, Audrey Southerland, and James M. Rehg)

PDF Dataset Project

The Surprising Positive Knowledge Transfer in Continual 3D Object Shape Reconstruction

Anh Thai, Stefan Stojanov, Zixuan Huang, James M. Rehg

PDF Code

mRisk: Continuous Risk Estimation for Smoking Lapse from Noisy Sensor Data with Incomplete and Positive-Only Labels

Md Azim Ullah, Soujanya Chatterjee, Christopher P Fagundes, Cho Lam, Inbal Nahum-Shani, James M. Rehg, David W Wetter, Santosh Kumar

No RL, No Simulation: Learning to Navigate without Navigating

Meera Hahn, Devendra Singh Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Miao Liu, Dexin Yang, Yan Zhang, Zhaopeng Cui, James M. Rehg, Siyu Tang

PDF Project

Transformers for prompt-level EMA non-response prediction

Supriya Nagesh, Alexander Moreno, Stephanie M. Carpenter, Jamie Yap, Soujanya Chatterjee, Steven Lloyd Lizotte, Neng Wan, Santosh Kumar, Cho Lam, David W. Wetter, Inbal Nahum-Shani, James M. Rehg

PDF

Efficient Learning and Decoding of the Continuous-Time Hidden Markov Model for Disease Progression Modeling

Yu-Ying Liu, Alexander Moreno, Maxwell A. Xu, Shuang Li, Jena C. McDaniel, Nancy C. Brady, Agata Rozga, Fuxin Li, Le Song, James M. Rehg

PDF DOI

Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias

Stefan Stojanov, Anh Thai, James M. Rehg

PDF Code Dataset Project

Orthogonal Over-Parameterized Training

Weiyang Liu, Rongmei Lin, Zhen Liu, James M. Rehg, Liam Paull, Li Xiong, Le Song, Adrian Weller

Discriminative Appearance Modeling with Multi-track Pooling for Real-time Multi-object Tracking

Chanho Kim, Li Fuxin, Mazen Alotaibi, James M. Rehg

Approximate Inverse Reinforcement Learning from Vision-based Imitation Learning

Keuntaek Lee, Bogdan Vlahov, Jason Gibson, James M. Rehg, Evangelos A Theodorou

3D Reconstruction of Novel Object Shapes from Single Images

Anh Thai, Stefan Stojanov, Vijay Upadhya, James M. Rehg

The mobile assistance for regulating smoking (MARS) micro-randomized trial design protocol

Inbal Nahum-Shani, Lindsey N Potter, Cho Y Lam, Jamie Yap, Alexander Moreno, Rebecca Stoffel, Zhenke Wu, Neng Wan, Walter Dempsey, Santosh Kumar, Emre Ertin, Susan A Murphy, James M. Rehg, David W Wetter