RL4AA’26 Poster Abstracts
Background and Motivation
In crisis situations, hospitals can face shortages of medical devices required for the proper treatment of patients. More effective coordination of existing medical resources could therefore
improve patient care as well as the resilience of the healthcare system.
The objective of this work is to develop a spatial simulation that models...
Automating accelerator tuning has been an area of great interest in the accelerator community in recent years. Bayesian Optimisation (BO) has been favoured over Reinforcement Learning (RL) because it requires no lengthy training and is reliable. However, RL has become increasingly viable with access to large training datasets generated by fast, differentiable simulations such as Cheetah.
In this work, we develop Cheetah...
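To illustrate why a differentiable simulation is so valuable for tuning, the toy below replaces a real Cheetah lattice with a one-parameter analytic model (our stand-in, not Cheetah's API) and minimises beam size by gradient descent, the kind of loop that differentiability makes cheap:

```python
import numpy as np

def beam_size(k, k_opt=1.3):
    """Toy differentiable 'simulation': beam size as a smooth function
    of a single quadrupole strength k (illustrative stand-in only)."""
    return (k - k_opt) ** 2 + 0.1

def grad_beam_size(k, k_opt=1.3):
    """Analytic gradient, as automatic differentiation would provide."""
    return 2.0 * (k - k_opt)

# Gradient-based tuning loop enabled by a differentiable model.
k = 0.0
for _ in range(200):
    k -= 0.1 * grad_beam_size(k)
```

With gradients available, each tuning step uses direction information instead of repeated black-box evaluations, which is also what makes large-scale RL training data cheap to generate.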
Modern particle accelerators operate in highly complex, nonlinear, and time-varying regimes, where optimal performance relies on the coordinated tuning of many coupled parameters under uncertainty and noise. Traditional control and optimization strategies based on physics models, linearization, or manual tuning often struggle to adapt in real time to changing beam conditions, hardware drifts,...
This contribution will be based on the paper "Batch spacing optimization by reinforcement learning" (DOI: https://doi.org/10.1103/g9wr-197z):
Beams designated for the LHC are injected into the SPS in multiple batches. Given the tight spacing of 200 ns between these batches, the injection kickers have to be precisely synchronized with the injected beam to minimize injection oscillations. Due...
This study introduces a novel binary trigger-based state representation for deep reinforcement learning (DRL) in stock trading. Unlike conventional approaches using continuous technical indicators (MACD, RSI, CCI, ADX), we encode market state via binary signals: MVX (moving-average crossover) and BOLLX (Bollinger band breakout). We also propose trigger-date filtering, which trains only on...
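The binary encoding described above might be sketched as follows; the window lengths and band width are illustrative defaults, not the values used in the study:

```python
import numpy as np

def binary_triggers(prices, fast=10, slow=30, boll=20, k=2.0):
    """Compute MVX (moving-average crossover) and BOLLX (Bollinger band
    breakout) as binary signals for the most recent bar.

    Window lengths `fast`, `slow`, `boll` and band width `k` are
    illustrative defaults, not the study's parameters.
    """
    prices = np.asarray(prices, dtype=float)

    fast_ma = prices[-fast:].mean()
    slow_ma = prices[-slow:].mean()
    mvx = int(fast_ma > slow_ma)             # 1 if fast MA is above slow MA

    window = prices[-boll:]
    mid, std = window.mean(), window.std()
    bollx = int(prices[-1] > mid + k * std)  # 1 on an upper-band breakout
    return mvx, bollx
```

A state built from such signals is a short binary vector rather than a stack of continuous indicator values, which is the representational change the abstract highlights.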
Robust accelerator control increasingly relies on data-driven optimisation, yet balancing adaptability with safety remains challenging. Simulation-driven, physics-informed reinforcement learning (RL) relies on soft constraints without formal safety guarantees, and classical response-matrix inversion (RMI) becomes suboptimal under noise
and hard actuator limits. Using the AWAKE electron...
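For context, the classical response-matrix inversion (RMI) baseline mentioned above amounts to a pseudoinverse solve. This generic sketch (not AWAKE-specific) also shows the hard actuator clipping under which plain RMI becomes suboptimal:

```python
import numpy as np

def rmi_correction(R, orbit, limit=None):
    """Classical response-matrix inversion: solve orbit ~ R @ kick for
    the corrector kicks that cancel the measured orbit.

    R      : (n_bpms, n_correctors) response matrix
    orbit  : (n_bpms,) measured trajectory readings
    limit  : optional hard actuator bound; naive clipping at this bound
             is what degrades plain RMI near the limits.
    """
    kick = -np.linalg.pinv(R) @ orbit
    if limit is not None:
        kick = np.clip(kick, -limit, limit)  # hard actuator limits
    return kick
```

Because the pseudoinverse amplifies measurement noise and the clipping ignores the coupling between correctors, neither effect is handled optimally, which motivates the learning-based approach.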
The Large Hadron Collider (LHC) requires a collimation system to ensure safe operation with both proton and heavy-ion beams. As of 2023, a crystal collimation scheme using bent silicon crystals was introduced to improve the collimation efficiency for heavy-ion beams. However, drifts in the crystal angular position led to the loss of cleaning performance during physics fills. These drifts are...
This poster presents the design and implementation of a Gated Recurrent Unit (GRU) on Xilinx Versal AI Engines. We outline the mapping of GRU computations to the AI Engine architecture, discuss dataflow and parallelization strategies, and highlight performance considerations for efficient recurrent neural network inference. The design supports unquantized models by leveraging 32-bit...
The muon electric dipole moment (muEDM) experiment at PSI relies on highly sensitive off-axis muon injection into a compact frozen-spin trap. Injection performance depends strongly on magnetic field and material properties that are difficult to characterize with sufficient accuracy prior to commissioning. For a system of this complexity, purely feed-forward optimization of experimental...
Reliable and well‑characterised laser‑driven proton beams are essential for advancing laser‑ion acceleration from fundamental research to practical applications such as medical physics [1]. However, shot-to-shot variability and the lack of robust, non‑invasive diagnostics continue to limit progress. Recent advances in machine learning [2] offer a promising route to overcoming these challenges...
Laser-plasma accelerators (LPAs) still trail conventional accelerators in their ability to generate high-quality electron beams with low shot-to-shot variation. However, with higher repetition rates and longer periods of operation, machine learning is becoming an increasingly viable control tool for improving the stability and reliability of LPAs.
In this context,...
Stripper foil degradation in the Low Energy Ion Ring (LEIR) causes beam distribution drift that progressively degrades performance during multi-turn stacking at flat bottom. World models have emerged as a promising approach for sample-efficient and robust agents, enabling them to improve their behavior by rolling out policies in learned environment models between real interactions, thereby...
Commissioning slow extracted beams from the CERN Super Proton Synchrotron (SPS) to the North Area experimental targets requires trajectory control through multiple transfer lines using corrector magnets—a process that traditionally demands significant expert intervention. Previous work demonstrated the use of reinforcement learning (RL) for automated trajectory correction based on secondary...
The complexity of the CERN and GSI/FAIR accelerator facilities requires a high degree of automation to maximize beam time and performance for physics experiments. Geoff, the Generic Optimization Framework & Frontend, is an open-source tool developed within the EURO-LABS project by CERN and GSI to streamline access to classical and AI-based optimization methods. It provides standardized...
Standard Reinforcement Learning (RL) for trajectory tracking typically relies on myopic state representations, providing agents only with the current target. This forces a reactive control paradigm, resulting in lag and overshoot during dynamic transitions. To address this, we propose augmenting the standard RL state space, which traditionally contains only the current reference, with future...
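A minimal sketch of this state augmentation, assuming a one-dimensional reference trajectory and a fixed lookahead horizon (the function name and end-of-trajectory padding are our illustration):

```python
import numpy as np

def augmented_state(measurement, reference, t, horizon=5):
    """Build an RL observation containing the current measurement, the
    current reference, and the next `horizon` reference points.

    `reference` is the full target trajectory; near its end the last
    value is repeated so the observation keeps a fixed size.
    """
    idx = np.minimum(np.arange(t, t + horizon + 1), len(reference) - 1)
    return np.concatenate(([measurement], np.asarray(reference, dtype=float)[idx]))
```

Exposing the upcoming references lets the policy act ahead of a transition instead of reacting to it, which is the mechanism the abstract credits for removing lag and overshoot.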
Particle accelerators and their design studies generate large amounts of historical data from archived operation logs and high-fidelity simulations, yet most learning-based control strategies still rely on online optimisation, where new data must be collected through direct machine interaction. To make better use of such pre-generated data and avoid additional online exploration, we present a...
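As a minimal illustration of learning from pre-generated data, the sketch below fits a linear policy to logged (state, action) pairs by least squares (the simplest offline baseline, not the method this contribution presents):

```python
import numpy as np

def fit_linear_policy(states, actions):
    """Fit a linear policy a = W @ s by least squares on logged data,
    requiring no further interaction with the machine."""
    S = np.asarray(states, dtype=float)   # (n_samples, state_dim)
    A = np.asarray(actions, dtype=float)  # (n_samples, action_dim)
    W, *_ = np.linalg.lstsq(S, A, rcond=None)
    return W.T                            # (action_dim, state_dim)

# Hypothetical logged data: the logging policy acted as a = -0.5 * s.
rng = np.random.default_rng(0)
S = rng.normal(size=(100, 3))
A = -0.5 * S
W = fit_linear_policy(S, A)
```

Anything beyond this baseline (value estimation, pessimism about out-of-distribution actions) is exactly what dedicated offline RL methods add on top of the logged dataset.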
Experimental studies of beauty hadron decays face significant challenges due to the wide range of backgrounds arising from the numerous possible decay channels with similar final states. For a particular signal decay, identifying the most relevant background processes requires a detailed analysis of final state particles, potential misidentifications, and kinematic overlaps,...
Geometry optimization of atomic structures is a common and crucial task in computational chemistry and materials design. Following the learning to optimize paradigm, we propose a new multi-agent reinforcement learning method called Multi-Agent Crystal Structure optimization (MACS) to address periodic crystal structure optimization. MACS treats geometry optimization as a...
The beam intensity in the injector chain at CERN has been nearly doubled as part of the upgrades for the High-Luminosity LHC (HL-LHC). This presents multiple operational challenges. A critical bottleneck is the uncaptured beam created during the transfer from the Proton Synchrotron (PS) to the Super Proton Synchrotron (SPS). Tomographic reconstruction of the longitudinal distribution during...
Multi-Agent Reinforcement Learning (MARL) is an important subfield of Reinforcement Learning, in which multiple agents learn in a shared environment. The simultaneous learning of several players naturally arises in domains like robotics, network communication and traffic control, where agents affect and influence one another. Thus, MARL can simulate real-world problems in a reliable way, and...
For the Beijing Electron-Positron Collider II (BEPCII), operators need to tune the transverse offsets—including displacement and angular deviation (x, x’, y, y’)—of the two beams at the interaction point (IP) to maintain high luminosity as the beam current decays during normal operation. Given that the optimal offset exhibits a non-linear variation with beam current within a single run and...
Development shifts on accelerators are usually time-constrained and infrequent. Meanwhile, control room PCs are not designed for scrappy R&D, and maintaining multiple workflows with Python scripts is prone to error. GUI apps have been successfully deployed in the past to perform optimisation at accelerator facilities. However, bookkeeping can become difficult in complex tasks....
Most accelerator control systems assume that the effect of an action can be evaluated locally and immediately. While greedy approaches work in near-linear regimes and Bayesian Optimisation (BO) is now standard for black-box tuning, both are essentially static optimisers and struggle in dynamic tasks with delayed consequences, where even adaptive BO remains time-myopic and lacks explicit...
Recent developments at the INFN laboratories in Legnaro have demonstrated the effectiveness of Bayesian optimization in automating the tuning process of particle accelerators, yielding substantial improvements in beam quality, significantly reducing setup times, and shortening recovery times following interruptions. Despite these advances, the high-dimensional parameter space defined by...
By scaling accelerator operation to THz frequencies, dielectric-lined waveguides (DLWs) can achieve accelerating gradients far higher than conventional RF structures, while supporting modes that couple longitudinal acceleration with transverse focusing. We propose a reinforcement-learning–based dynamic tuner that, using beam distribution information, adjusts the THz phase and amplitude of...
Designing advanced particle-physics instruments requires navigating a high-dimensional space of discrete and continuous choices while satisfying strict constraints on material, cost, and geometry. In practice, these constraints evolve throughout an experiment’s lifetime, making it insufficient to optimize a single “best” detector configuration. We present a resource-conditioned reinforcement...
We present advancements in the data-driven Model Predictive Control (MPC) framework for optimizing multi-turn injection (MTI) into the SIS18 synchrotron. Building on our prior work on safe, sample-efficient optimization, we systematically investigate the impact of current noise and transverse emittance fluctuations. By incorporating realistic error models derived from dedicated measurements of...
Using domain knowledge to improve deep RL policies remains an open challenge. LEGIBLE mines rules from an RL policy, yielding a partially symbolic representation. These rules describe which decisions the RL policy makes and which it avoids. LEGIBLE then generalizes the mined rules using domain knowledge and, finally, evaluates the generalized rules to determine which generalizations improve...
In space propulsion, a small set of controllable valves regulates propellant mass flows and thereby achieves desired operating targets. Reinforcement learning (RL) is promising here because it can learn feedback policies directly from simulator interaction. Prior work has demonstrated the practical viability of deep RL for related control tasks. However, even in low-dimensional actuator...
In BNL’s Booster, the beam bunches can be split into two or three smaller bunches to reduce their space-charge forces. They are then merged back after acceleration in the Alternating Gradient Synchrotron (AGS). This acceleration with decreased space-charge forces can reduce the final emittance, increasing the luminosity in RHIC and improving proton polarization. Parts of this procedure have...