Publications

3x2: 3D Object Part Segmentation by 2D Semantic Correspondences

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation

PointInfinity: Resolution-Invariant Point Diffusion Models

MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

ZeroShape: Regression-based Zero-shot Shape Reconstruction

The Audio-Visual Conversational Graph: From an Egocentric-Exocentric Perspective

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

REBAR: Retrieval-Based Reconstruction For Time-series Contrastive Learning

Low-shot Object Learning with Mutual Exclusivity Bias

Werewolf Among Us: A Multimodal Dataset for Modeling Persuasion Behaviors in Social Deduction Games

ShapeClipper: Scalable 3D Shape Learning from Single-View Images via Geometric and CLIP-based Consistency

Egocentric Auditory Attention Localization in Conversations

Explaining a machine learning decision to physicians via counterfactuals

Kernel Multimodal Continuous Attention

PulseImpute: A Novel Benchmark Task for Pulsative Physiological Signal Imputation

Discovering Novel Predictors of Minimally Verbal Outcomes in Autism through Computational Modeling

Egocentric Activity Recognition and Localization on a 3D Map

Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues

Ego4D: Around the World in 3,000 Hours of Egocentric Video

mRisk: Continuous Risk Estimation for Smoking Lapse from Noisy Sensor Data with Incomplete and Positive-Only Labels

No RL, No Simulation: Learning to Navigate without Navigating

4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Transformers for prompt-level EMA non-response prediction

Efficient Learning and Decoding of the Continuous-Time Hidden Markov Model for Disease Progression Modeling

Orthogonal Over-Parameterized Training

Approximate Inverse Reinforcement Learning from Vision-based Imitation Learning

3D Reconstruction of Novel Object Shapes from Single Images

The mobile assistance for regulating smoking (MARS) micro-randomized trial design protocol

Continuous measurement of attachment behavior: A multimodal view of the strange situation procedure

Where Are You? Localization from Embodied Dialog

Attention Distillation for Learning Video Representations

Tripping through time: Efficient Localization of Activities in Videos

Toys4K 3D Object Dataset

4,000 3D object instances from 105 categories of developmentally plausible objects

Georgia Tech Egocentric Activity Datasets

Detecting Attended Visual Targets in Video

Locally Weighted Regression Pseudo-Rehearsal for Adaptive Model Predictive Control

Classification of Decompensated Heart Failure from Clinical and Home Ballistocardiography

Incremental Object Learning from Contiguous Views

Watching the TV Watchers

SyncWISE: Window Induced Shift Estimation for Synchronization of Video and Accelerometry from Wearable Sensors
