Embodied Intelligence Platform for Surgical Robot Automation
Researchers
Introduction
Surgical robot automation has the potential to improve efficiency, consistency, and human productivity in minimally invasive surgery. However, existing automation methods are often designed for specific tasks and struggle to generalize across surgical scenes and procedures. To address this challenge, our team developed VPPV, a vision-based embodied intelligence framework for generalized surgical task autonomy. VPPV integrates visual parsing, a perceptual regressor, policy learning, and a visual servoing controller into a unified pipeline that combines the adaptability of data-driven learning with the precision of classical control. By mapping image observations to physically meaningful state representations, the framework enables zero-shot sim-to-real transfer to real surgical robotic systems.
The Main Impact
VPPV enables surgical robots to learn generalized task autonomy in simulation and transfer these capabilities directly to real-world surgical settings. By combining robust visual understanding with policy learning and precise control, it supports multi-task automation, adapts to complex surgical scenes, and provides a practical pathway toward scalable, clinically relevant surgical robot autonomy.
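The four stages that VPPV is described as combining (visual parsing, a perceptual regressor, policy learning, and a visual servoing controller) can be pictured as a chain of functions from an image to a motion command. The following is only a toy sketch of that data flow, not the team's implementation: every function name, the binary-image input, the `metres_per_pixel` scale, and the proportional `gain` are assumptions made for illustration, and the learned components are replaced by trivial stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]       # toy binary image: 1 marks the object of interest
Point = Tuple[float, float]   # planar coordinate (pixels or metres, see each stage)


def parse_image(image: Image) -> Point:
    """Stand-in for visual parsing: pixel centroid (row, col) of marked pixels."""
    pixels = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("no object pixels found")
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def regress_state(centroid: Point, metres_per_pixel: float = 0.001) -> Point:
    """Stand-in for the perceptual regressor: pixels -> position in metres.
    The linear scale is a hypothetical placeholder for a learned mapping."""
    return (centroid[0] * metres_per_pixel, centroid[1] * metres_per_pixel)


def policy(state: Point) -> Point:
    """Stand-in for the learned policy: the waypoint is the object position."""
    return state


def servo_step(tool: Point, waypoint: Point, gain: float = 0.5) -> Point:
    """Stand-in for the servoing controller: one proportional step to the waypoint."""
    return (tool[0] + gain * (waypoint[0] - tool[0]),
            tool[1] + gain * (waypoint[1] - tool[1]))


@dataclass
class Pipeline:
    """Chains the four stages; each one can be swapped out independently."""
    parse: Callable[[Image], Point]
    regress: Callable[[Point], Point]
    act: Callable[[Point], Point]
    servo: Callable[[Point, Point], Point]

    def step(self, image: Image, tool: Point) -> Point:
        state = self.regress(self.parse(image))   # image -> physical state
        return self.servo(tool, self.act(state))  # state -> waypoint -> motion


def run_demo(steps: int = 20) -> Point:
    """Drive a simulated tool tip toward the object seen in a fixed toy image."""
    image = [[0, 0, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 0, 0, 0]]
    pipeline = Pipeline(parse_image, regress_state, policy, servo_step)
    tool = (0.0, 0.0)
    for _ in range(steps):
        tool = pipeline.step(image, tool)
    return tool
```

Calling `run_demo()` moves the simulated tool tip toward the object's estimated position. The point of the structure is that the policy consumes a physical state rather than raw pixels, which is the property the text credits for transfer from simulation to real systems.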

Overview of the proposed VPPV framework for generalized task autonomy in surgical robotics

The research team conducted in vivo testing of the embodied intelligence platform in a pre-clinical setting
Our team has successfully completed multi-task surgical automation tests on a live animal. The research has been published in the journal Science Robotics.


