One paper "See, Point, Fly" is accepted at CoRL 2025
Jun, 2025
One paper "LongSplat" is accepted at ICCV 2025
Mar, 2025
Two papers "FrugalNeRF" and "AuraFusion360" are accepted at CVPR 2025
Dec, 2024
Serve as a student volunteer for SIGGRAPH Asia 2024
Oct, 2024
Serve as a reviewer for ICLR 2025
Jul, 2024
Start my imaging engineer internship at Logitech
Mar, 2024
One paper "BoostMVSNeRFs" is accepted at SIGGRAPH 2024
Research
My research lies at the intersection of computer vision, machine learning, and 3D geometry. I focus on enabling machines to perceive and reconstruct the 3D world with efficiency, accuracy, robustness, and scalability, using learning-based methods that improve with data and computation. I am passionate about building real-time 3D reconstruction systems that generalize effectively in the real world.
LongSplat reconstructs long, casually captured videos into coherent, compact 3D scenes through incremental joint optimization of camera poses and 3DGS, accurate pose estimation, and efficient anchor formation.
See, Point, Fly (SPF) enables UAVs to navigate to any goal based on free-form natural language instructions in any environment, without task-specific training.
FrugalNeRF turns just two images and 10 minutes into high-quality 3D scenes by using weight-sharing voxels and cross-scale geometric adaptation to guide training without relying on external priors.
A reference-guided 3D inpainting approach that applies SDEdit to an aligned Gaussian initialization, accompanied by a new 360° inpainting dataset (360-USID) for comprehensive evaluation.
DiffIR2VR-Zero leverages pre-trained diffusion models for video restoration, using hierarchical token merging and hybrid optical flow with nearest neighbor matching. It achieves top performance across diverse degradations without training.
We proposed a GNSS/PDR fusion algorithm designed specifically for smartwatches. The algorithm tracks the varying roll and pitch of the sensor caused by hand swing and integrates a CNN model to predict 1-D speed and perform ZUPT detection.
Projects
My projects span computer vision, robotics, and image processing, bridging theoretical research with practical applications across both academia and industry. I enjoy tackling challenging problems that require innovative solutions and have real-world impact.
Developed a GPU-based IQ enhancer for Logitech's new conference camera product (Rally Board 65), featuring real-time face cropping, image quality improvement, and noise reduction.
A PDR algorithm for wrist-worn IMUs. It uses VQF to compute the IMU attitude and proceeds in three stages (step detection, step length estimation, and heading estimation), enabling pedestrian navigation with a wrist-worn IMU.
A ROS package for robot navigation and arm control. It is capable of recognizing and grasping objects, as well as navigating to any location on the map.
A real-time camera navigation algorithm for a pre-built LiDAR map, using NDT for 3D point cloud registration and reducing accumulated position error by 65%.