Description
Particle accelerators and their design studies generate large amounts of historical data in the form of archived operation logs and high-fidelity simulations, yet most learning-based control strategies still rely on online optimisation, where new data must be collected through direct machine interaction. To make better use of such pre-generated data and to avoid additional online exploration, we present a workflow for offline reinforcement learning (offline RL) based on a two-stage modelling approach. First, an Xsuite-based high-fidelity beam dynamics model is used to generate and archive trajectories for steering tasks across a set of representative machine scenarios (e.g., optics variations, alignment errors, and jitter conditions), providing synthetic but realistic expert and non-expert behaviour. Second, a Koopman-inspired hybrid world model is learned from this dataset, yielding fast, stable multi-step prediction together with epistemic uncertainty estimates obtained from ensemble variance. This learned model serves as a surrogate environment for offline RL. We benchmark the resulting offline RL policies against a PPO agent trained directly in the original Xsuite physics model; to reflect realistic operational safety limits, PPO episodes are terminated once trajectories leave expert-like regions or enter domains of high epistemic uncertainty. Results show that policies trained purely offline on the Koopman world model can match or exceed PPO performance under these constraints, while requiring no additional online exploration. The proposed workflow demonstrates how Xsuite-based simulation, uncertainty-aware surrogate modelling, and offline RL can be combined to turn historical scenario data into a safe and reproducible pathway for learning-based accelerator control.
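
The abstract does not give implementation details, but the Koopman-inspired world model with ensemble-variance uncertainty can be illustrated with a minimal sketch. Everything below (`KoopmanModel`, the encoder width, `latent_dim`, the 5-member ensemble, the dimensions) is an illustrative assumption, not the authors' code: a nonlinear encoder lifts the beam state into a latent space where the dynamics are linear, and disagreement across independently initialised ensemble members serves as the epistemic-uncertainty estimate.

```python
# Minimal sketch, assuming a PyTorch implementation; all architecture
# details here are illustrative assumptions, not taken from the abstract.
import torch
import torch.nn as nn

class KoopmanModel(nn.Module):
    """Nonlinear encoder + linear (Koopman-style) latent dynamics + decoder."""
    def __init__(self, state_dim, action_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, latent_dim))
        self.A = nn.Linear(latent_dim, latent_dim, bias=False)  # z' = A z + B u
        self.B = nn.Linear(action_dim, latent_dim, bias=False)
        self.decoder = nn.Linear(latent_dim, state_dim)

    def step(self, state, action):
        z_next = self.A(self.encoder(state)) + self.B(action)
        return self.decoder(z_next)

def ensemble_rollout(models, state, actions):
    """Multi-step prediction with every ensemble member; the variance across
    members is the epistemic-uncertainty proxy mentioned in the abstract."""
    trajs = []
    for m in models:
        s, traj = state, []
        for a in actions:
            s = m.step(s, a)
            traj.append(s)
        trajs.append(torch.stack(traj))
    trajs = torch.stack(trajs)            # (n_models, horizon, state_dim)
    return trajs.mean(0), trajs.var(0).sum(-1)

# Usage with made-up dimensions (e.g., BPM readings and corrector settings):
models = [KoopmanModel(state_dim=10, action_dim=4) for _ in range(5)]
state = torch.randn(10)
actions = [torch.randn(4) for _ in range(20)]
mean_traj, epistemic_var = ensemble_rollout(models, state, actions)
```

Keeping the latent dynamics linear is what makes long rollouts fast and stable: the multi-step map is just repeated application of the same matrices, with nonlinearity confined to the encoder and decoder.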
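The abstract also does not name the offline RL algorithm, so the following shows only one common recipe for training on an uncertainty-aware surrogate: roll the policy through the ensemble and penalise rewards by member disagreement (a MOPO-style penalty). The quadratic steering cost and the penalty weight `lam` are placeholders for illustration; it reuses `KoopmanModel.step` from the sketch above.

```python
# Hedged sketch of policy optimisation on the learned surrogate; the
# penalised objective and the orbit-deviation cost are illustrative
# assumptions, not the authors' stated method.
import torch

def penalised_return(models, policy, state, horizon=20, lam=1.0, gamma=0.99):
    """Discounted return of `policy` under the ensemble-mean dynamics,
    minus lam * (variance across members) as an uncertainty penalty."""
    states = [state.clone() for _ in models]   # one rollout per member
    total = torch.zeros(())
    for t in range(horizon):
        mean_state = torch.stack(states).mean(0)
        action = policy(mean_state)
        states = [m.step(s, action) for m, s in zip(models, states)]
        disagreement = torch.stack(states).var(0).sum()
        reward = -mean_state.pow(2).sum()      # placeholder: orbit-deviation cost
        total = total + (gamma ** t) * (reward - lam * disagreement)
    return total
```

Maximising this return by gradient ascent on the policy parameters steers the policy towards regions where the ensemble members agree, i.e., where the surrogate can be trusted, which is the usual rationale for such penalties in offline, model-based training.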
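For the PPO baseline, the safety-motivated termination rule can be sketched as an environment wrapper. The Gymnasium-style `step()` 5-tuple, the expert-region bounds, and the variance threshold are all assumptions for illustration; the abstract only states that episodes end once trajectories leave expert-like or low-uncertainty regions.

```python
# Illustrative termination wrapper for the PPO baseline; thresholds and the
# Gymnasium-style step() signature are assumptions, not from the abstract.
import numpy as np

class SafetyTermination:
    """Ends an episode once the trajectory leaves the expert-like region or
    the world-model ensemble reports high epistemic uncertainty."""
    def __init__(self, env, epistemic_var_fn, var_threshold=1e-2,
                 expert_bounds=(-5.0, 5.0)):
        self.env = env
        self.epistemic_var_fn = epistemic_var_fn   # (obs, action) -> float
        self.var_threshold = var_threshold
        self.expert_bounds = expert_bounds

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        lo, hi = self.expert_bounds
        left_expert_region = bool(np.any(obs < lo) or np.any(obs > hi))
        high_uncertainty = self.epistemic_var_fn(obs, action) > self.var_threshold
        terminated = terminated or left_expert_region or high_uncertainty
        return obs, reward, terminated, truncated, info
```

Terminating on these two conditions mirrors the operational safety limits described in the abstract: the baseline agent is never allowed to keep collecting experience in states an operator would not tolerate or the surrogate cannot vouch for.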
| Student | Yes |
| --- | --- |