
Comment by victorbjorklund

6 days ago

It sounds like you use the leader arm to show the robot how the task should be done. If you just used your own arm for the task, the robot would have to translate human movements to its own mechanics (hard), but this way it only needs to replicate the movement you showed (easier). After you teach it the movement, it can do it by itself. You show it once and it can repeat it a million times.

OK, I was under the impression (due to the cameras) that it's doing something with machine learning or can do novel movements. This is just recording movements and playing them back.

  • If you bridge recorded trajectories with an LVLM, then the cameras are the necessary visual input for the LVLM to decide which sub-tasks need to be performed to accomplish a long-horizon task, and the sub-tasks correspond to pre-recorded ("blind") trajectories, which are replayed.

    If you go beyond pre-recorded "blind" trajectories to more robust task policies (which you would have to train from many demonstrations), then the cameras become necessary to execute the sub-task itself.
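
For what it's worth, here is a rough Python sketch of the first setup (a planner picking among blindly replayed recordings). Everything in it is an assumption for illustration: `record`, `replay`, `run_task`, the dict-shaped `image`, and `stub_planner` (a stand-in for a real LVLM call) are hypothetical names, not any actual robot or model API.

```python
from typing import Callable, Dict, List, Sequence

Pose = List[float]        # one joint-angle vector (hypothetical format)
Trajectory = List[Pose]   # poses sampled over time


def record(leader_samples: Sequence[Sequence[float]]) -> Trajectory:
    """Copy poses read from the leader arm into a stored trajectory."""
    return [list(pose) for pose in leader_samples]


def replay(trajectory: Trajectory, send_pose: Callable[[Pose], None]) -> None:
    """Blind replay: send each stored pose in order, with no camera feedback."""
    for pose in trajectory:
        send_pose(pose)


def run_task(
    image: dict,
    planner: Callable[[dict, List[str]], List[str]],
    library: Dict[str, Trajectory],
    send_pose: Callable[[Pose], None],
) -> List[str]:
    """Ask the planner which sub-tasks to run, then replay each recording."""
    plan = planner(image, sorted(library))
    for name in plan:
        replay(library[name], send_pose)
    return plan


def stub_planner(image: dict, available: List[str]) -> List[str]:
    # Stand-in for an LVLM: a real one would look at the camera frame.
    # Here the "image" is just a dict with a flag (assumption for the demo).
    wanted = ["pick", "place"] if image.get("object_visible") else []
    return [name for name in wanted if name in available]
```

The camera only feeds the planner here; `replay` never sees it. In the second setup, `replay` would be swapped for a trained policy that takes a camera frame at every step, which is why the cameras become necessary during execution as well.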