Liverpool Virtual Seminar Series on Data Intensive Science

Name: Liverpool Virtual Seminar Series on Data Intensive Science
Start: 2022-10-10T15:00:00+01:00
End: 2028-12-31T18:00:00+00:00
Location: Zoom Webinar

Europe/London timezone

Naomi Smith

naomi.smith@liverpool.ac.uk

Build Big meets Build Smart to Explore the Universe

Date: Tuesday 14 Oct 2025 – 15:00 (Europe/London)
Speaker: Carolina Cuesta-Lazaro, Postdoctoral Fellow at the NSF Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) at MIT

Abstract

Modern cosmology exemplifies the synergy between two complementary approaches in machine learning: scaling up with large models and datasets ("build big") versus incorporating targeted inductive biases for specific problems ("build smart"). Rather than choosing between these strategies, the most promising advances emerge from combining both. This talk presents three complementary projects that demonstrate this principle in action.

First, I show how foundation models for science benefit from incorporating both simulated and observed data. By learning shared representations across these domains through alignment losses, we achieve robust simulation-based inference that remains reliable even under model misspecification.

Second, I demonstrate scale-dependent anomaly detection using machine learning with cosmological inductive biases. By building physical knowledge about scale dependence into the model architecture, we can detect deviations from standard models across different cosmological scales non-parametrically. This approach leverages both large observational datasets and physically motivated architectural choices to identify potential new physics.

Third, I explore using large language models for automated hypothesis generation in cosmology. Through a systematic evaluation framework, I show that LLMs can autonomously propose novel dark energy theories and implement them in existing physics codes like CLASS. While the approach shows promise, it also reveals current limitations, including implementation challenges for complex models and the tendency to improve fits through additional parameters rather than fundamental insights.

Each project illustrates how the future of scientific discovery lies not in choosing between computational scale and inductive biases, but in thoughtfully combining both.

The talk is now also available on YouTube: https://youtu.be/yIqFbrpp7NI

Biography

Carolina Cuesta-Lazaro works at the intersection of astrophysics and machine learning. She is interested in developing robust and interpretable models that can guide us towards future discoveries in physics.

Carolina received her Ph.D. in Physics and Data Science from the Institute of Computational Cosmology at Durham University, UK. Alongside her Ph.D., she has been a research collaborator with the United Nations (UN) Global Pulse and the UK’s National Health Service (NHS), developing epidemiological simulations, and a research intern at Amazon’s Alexa team. She was also a postdoctoral fellow at the NSF Institute for Artificial Intelligence and Fundamental Interactions (IAIFI) at MIT and the Center for Astrophysics at Harvard. Next year she will join NYU for a faculty appointment.
