in IEEE International Conference on Robotics and Automation (ICRA) 2026
Vision-Language-Action (VLA) models have emerged as a promising paradigm for building generally capable robot agents. Leveraging scalable teleoperation, simulation, and web-scale video data, as well as pre-trained vision and language models, these systems have demonstrated impressive generalization and instruction-following capability across diverse manipulation tasks.
However, their real-world impact remains limited: most pipelines lack rigorous data curation, principled training regimes, and standardized evaluation, leading to inconsistent performance and safety concerns on physical robots. Meanwhile, performance on real robots is dictated by the underlying "data engine", highlighting the importance of scalable data collection, domain randomization, procedural synthesis, and human-in-the-loop feedback.
This workshop aims to catalyze progress by examining the entire end-to-end VLA pipeline, from scalable data collection and systematic curation to training strategies and inference-time reasoning. A core focus will be on surfacing best practices and establishing shared metrics for generalization and safety.
Whether your work focuses on data collection & curation, preprocessing & augmentation, model architecture, training strategies, optimization techniques, inference & deployment, or evaluation & benchmarking, if it advances Vision-Language-Action systems, we want to hear from you.
Submit Your Work
Closed: 36 teams registered
Submit research papers
Competition results & accepted papers
ICRA 2026, Vienna
We invite 2-4 page extended abstracts (excluding acknowledgement and references) on any component of the VLA pipeline, including:
This workshop is non-archival: submissions can be concurrently under review or previously published at other venues.
Authors are encouraged to include links to code, configs, and data recipes.
Train on over 10,000 hours of data and evaluate your VLA systems on real robot hardware.
Evaluation videos and data provided for finetuning after each round
Try with sample dataset (airoa-moma)
Competition teams will have access to:
Registration is closed: all 36 team slots have been filled. We are not accepting additional teams at this time.
Rules & Policies
Detailed program schedule (subject to minor changes)
* indicates the primary contact person