MVTrans: Multi-View Perception of Transparent Objects

Published in 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023

Recommended citation: Wang, Yi Ru, Yuchi Zhao, Haoping Xu, Sagi Eppel, Alán Aspuru-Guzik, Florian Shkurti, and Animesh Garg. “MVTrans: Multi-View Perception of Transparent Objects.” In 2023 IEEE International Conference on Robotics and Automation (ICRA), 3771–78, 2023. https://ieeexplore.ieee.org/abstract/document/10161089

Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks, including depth and pose estimation. However, transparent object perception remains an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and extend the stereo-based method. Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities, including depth estimation, segmentation, and pose estimation. Additionally, we establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset, Syn-TODD, which is suitable for training networks with all three modalities: RGB-D, stereo, and multi-view RGB.