The ability to predict how an environment changes in response to applied forces is fundamental for a robot to achieve specific goals. For instance, in order to arrange objects on a table into a desired configuration, a robot has to be able to reason about where and how to push individual objects, which requires some understanding of physical quantities such as object boundaries, mass, and surface friction, and their relationship to forces. In this work, we explore the use of deep networks to learn such a notion of physical intuition.
We introduce SE3-Nets, which are deep networks designed to model rigid body motion from raw point cloud data. Based only on pairs of 3D point clouds, a continuous action vector, and point-wise data associations, SE3-Nets learn to segment the affected object parts and predict their motion resulting from the applied force. Rather than learning point-wise flow vectors, SE3-Nets predict SE(3) transformations for different parts of the scene. We test the system on three simulated tasks (using the physics simulator Gazebo), predicting the motion of a varying number of rigid objects under applied forces as well as the motion of a 14-DOF robot arm with 4 actuated joints. We show that the structure underlying SE3-Nets enables them to generate far more consistent predictions of object motion than traditional flow-based networks, while also learning a notion of "objects" without explicit supervision.
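To make the core idea concrete, the following is a minimal NumPy sketch (not the authors' code) of how per-part SE(3) transforms can be blended into a per-point motion prediction: each point is softly assigned to one of K parts via predicted masks, each part has its own rigid transform, and the predicted point cloud is the mask-weighted combination of the transformed points. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def predict_point_motion(points, masks, rotations, translations):
    """Blend K per-part rigid transforms into a per-point prediction.

    points:       (N, 3) input point cloud
    masks:        (N, K) soft assignment of each point to K parts (rows sum to 1)
    rotations:    (K, 3, 3) rotation matrix per part
    translations: (K, 3) translation vector per part

    Returns the (N, 3) predicted point cloud.
    """
    # Apply every part's rigid transform to all points: result is (K, N, 3)
    transformed = np.einsum('kij,nj->kni', rotations, points) \
        + translations[:, None, :]
    # Weight each part's transformed points by the mask and sum over parts
    return np.einsum('nk,kni->ni', masks, transformed)

# Toy example: two parts, part 0 stays fixed, part 1 translates by +1 in z.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
masks = np.array([[1.0, 0.0], [0.0, 1.0]])   # hard assignment for clarity
R = np.stack([np.eye(3), np.eye(3)])
t = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
pred = predict_point_motion(pts, masks, R, t)
# First point is unmoved; second point is shifted to (1, 0, 1).
```

In a real SE3-Net, the masks, rotations, and translations would all be network outputs; the point is that constraining motion to a small set of rigid transforms (rather than a free-form flow field) enforces consistent motion across each object part.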
SE3-Nets: Learning Rigid Body Motion using Deep Neural Networks, Arunkumar Byravan and Dieter Fox, IEEE International Conference on Robotics and Automation (ICRA), 2017 (Best Vision Paper Finalist)