in IEEE International Conference on Robotics and Automation (ICRA) 2026
Vision-Language-Action (VLA) models have emerged as a promising paradigm for building generally capable robot agents. Leveraging scalable teleoperation, simulation, and web-scale video data, as well as pre-trained vision and language models, these systems have demonstrated impressive generalization and instruction-following capability across diverse manipulation tasks.
However, their real-world impact remains limited: most pipelines lack rigorous data curation, principled training regimes, and standardized evaluation, leading to inconsistent performance and safety concerns on physical robots. Meanwhile, performance on real robots is dictated by the underlying "data engine", highlighting the importance of scalable data collection, domain randomization, procedural synthesis, and human-in-the-loop feedback.
This workshop aims to catalyze progress by examining the entire end-to-end VLA pipeline, from scalable data collection and systematic curation to training strategies and inference-time reasoning. A core focus will be on surfacing best practices and establishing shared metrics for generalization and safety.
Whether your work focuses on data collection & curation, preprocessing & augmentation, model architecture, training strategies, optimization techniques, inference & deployment, or evaluation & benchmarking: if it advances Vision-Language-Action systems, we want to hear from you.
Submit research papers
Competition results & accepted papers
ICRA 2026, Vienna
We invite 2-4 page extended abstracts on any component of the VLA pipeline, including:
Authors are encouraged to include links to code, configs, and data recipes.
Train on over 10,000 hours of data and evaluate your VLA systems on real robot hardware.
Evaluation videos and data provided for finetuning after each round
Try with sample dataset (airoa-moma)
Standard Hardware: Toyota HSR
Competition teams will have access to:
Detailed program schedule - subject to minor changes
* indicates the primary contact person