Every behavior you see in the above video is controlled by a single vision-based neural network that emits actions at 10Hz. The neural network consumes images and emits actions to control the driving, the arms, gripper, torso, and head. The video contains no teleoperation, no computer graphics, no cuts, no video speedups, no scripted trajectory playback. It's all controlled via neural networks, all autonomous, all 1X speed.
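For anyone wondering what "consumes images and emits actions at 10Hz" might look like in practice, here is a rough sketch of a fixed-rate control loop. It is not 1X's code. The `Action` layout, its vector sizes, `dummy_policy`, and `run_control_loop` are all hypothetical names made up for illustration, and the real network is replaced by a stub that returns zeros. The only things taken from the quote are the 10 Hz rate and the list of body parts.

```python
import time
from dataclasses import dataclass
from typing import Callable, List

CONTROL_HZ = 10               # rate stated in the quoted description
PERIOD_S = 1.0 / CONTROL_HZ   # 0.1 s between actions

Image = List[List[float]]     # placeholder image type (assumption)


@dataclass
class Action:
    # Hypothetical action layout: the quote names the body parts,
    # but not the dimensions or units of each command.
    base: List[float]
    arms: List[float]
    gripper: List[float]
    torso: List[float]
    head: List[float]


def dummy_policy(image: Image) -> Action:
    """Stand-in for the neural network: ignores the image, returns zeros."""
    return Action(base=[0.0, 0.0], arms=[0.0] * 14,
                  gripper=[0.0, 0.0], torso=[0.0], head=[0.0, 0.0])


def run_control_loop(get_image: Callable[[], Image],
                     policy: Callable[[Image], Action],
                     send_action: Callable[[Action], None],
                     steps: int,
                     period_s: float = PERIOD_S,
                     clock: Callable[[], float] = time.monotonic,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `steps` iterations of image -> policy -> action at a fixed rate."""
    next_tick = clock()
    for _ in range(steps):
        action = policy(get_image())   # one forward pass per tick
        send_action(action)            # one command covering the whole body
        next_tick += period_s
        delay = next_tick - clock()
        if delay > 0:                  # skip sleeping if we are running late
            sleep(delay)
    return steps


if __name__ == "__main__":
    sent: List[Action] = []
    run_control_loop(lambda: [[0.0]], dummy_policy, sent.append, steps=3)
    print(len(sent), "actions sent")
```

The point of the sketch is just that one network produces a single action for the whole body each tick, rather than separate controllers for the base, arms, and head.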
Last edited by Yuli Ban on Sat Feb 10, 2024 1:37 am, edited 1 time in total.