Today's batch of 27 papers reveals a field converging on two complementary fronts: closing the sim-to-real transfer gap and making learned policies robust enough for deployment. The autonomous driving cluster is particularly striking: five papers together cover the full post-training pipeline, from scenario generation (Conditional Flow-VAE) through closed-loop fine-tuning (CRAFT, ReflectDrive-2) to driver behavior modeling (Driver-WM) and validation (Practical validation). What unites them is a shared dissatisfaction with open-loop evaluation: each paper introduces mechanisms to stress-test or improve policies under closed-loop feedback, whether through distribution-matched safety-critical rollouts, RL-aligned self-editing of discrete trajectory tokens, or Bayesian equivalence testing of synthetic scenarios.
A second dominant theme is the maturation of model-based RL and planning for contact-rich, long-horizon tasks. ELVIS tackles the brittleness of deep imagination with ensemble-calibrated uncertainty gating, HDFlow separates strategic subgoal diffusion from fast flow-based trajectory generation, and Dream-MPC rehabilitates gradient-based planning with amortized optimization. Meanwhile, the tactile manipulation papers (Reduced-order Neural Modeling, From Reach to Insert, Active Contact Sensing) demonstrate that touch-aware control is moving beyond proof of concept: sub-millimeter insertion at 0.05 mm clearance, neural tactile simulation with a 65% speedup, and 97.5% handover success via active perturbation all point toward industrial-grade tactile capability.
Cutting across these themes is a growing emphasis on computational efficiency under resource constraints. ConsisVLA-4D achieves a 2.4× inference speedup over OpenVLA through consistency-based 3D reasoning without additional sensors, the dual-barrier CBF safety filter runs on a Raspberry Pi via closed-form linear solves, and the cascaded-fidelity MPC for bipedal walking mixes model fidelities across the prediction horizon to hit real-time rates. The message is clear: the field is moving past "does it work in simulation?" toward "does it run fast enough, safely enough, on real hardware?"
Scenario generation, post-training for driving policies, driver modeling, and safety validation.
Vision-language-action architectures, latent action supervision, and offline-to-online policy learning.
Latent imagination, hierarchical diffusion-flow planning, gradient-based MPC, and multi-agent learning.
High-fidelity tactile simulation, precision assembly under sub-mm tolerances, and active handover sensing.
Radar SLAM, underwater navigation, safety-critical control on occupancy maps, and space rendezvous.
Koopman operators, hand-eye calibration, bipedal walking MPC, and agile robot locomotion.
Self-folding robots, autonomous laparoscope control, gaze estimation benchmarks, and embodied AI privacy.