Conference Agenda
Session
WS 7b (2/2) - Current status of the benchmarking field: lessons learned from the first half of the UNLOCK initiative
Session Abstract
Brief Description and Outline: In this session, we will explore the state of the art in the benchmarking field, showcasing the most useful tools and summarizing best practices for setting up benchmarks. The workshop program features invited talks by leading contributors in the benchmarking field, a poster session for projects funded by the Helmholtz UNLOCK benchmarking initiative, and concludes with a panel discussion on benchmark design with experts from the Vector Institute and Helmholtz Munich.

- 14:15 - 14:35 HumaniBench: A Human-Centric Benchmark for Large Multimodal Models Evaluation (Shaina Raza, Vector Institute, Canada)
- 14:35 - 14:55 ChemBench: A benchmarking app for chemistry LLMs (TBA, Friedrich Schiller University Jena)
- 14:55 - 15:15 OpenProblems: a platform for benchmarking open problems in single-cell analysis (Robrecht Cannoodt, Edaro)
- 15:15 - 15:35 AI Energy Consumption benchmarks (Philipp Huber, KIT)
- 15:35 - 15:55 SPRIND competitions (Dominik Hermle, SPRIND)
- 15:55 - 16:00 Discussion & outlook
- 16:00 - 16:45 Poster session: discovering the UNLOCK project (incl. a coffee break)
- 16:45 - 17:30 Panel discussion: what makes a good benchmark? (Shaina Raza, Malte Lücken, Marie Piraud, TBA); Moderators: Daria Romanovskaia, Kaleb Phipps
- Official end of the workshop

Goals: With growing numbers of AI models being developed across scientific fields, the establishment of benchmarks is essential. We want to bring together a community of researchers with expertise in establishing benchmarks in their respective fields to exchange experience and provide practical advice. We see three main goals for this workshop:
- Define a wide range of benchmarking goals in the AI space;
- Highlight the importance of global AI safety benchmarks;
- Encourage exchange between UNLOCK project participants and the broader benchmarking community.
Presenters' Experience:
- Shaina Raza, from the Vector Institute, one of the leading institutes in establishing benchmarks in the field of trustworthy AI.
- Among Helmholtz Munich scientists: Marie Piraud, head of the AI consultants at Helmholtz Munich; Steffen Schneider, whose group builds machine learning algorithms for representation learning and inference of nonlinear system dynamics; and Malte Lücken, whose group is one of the pioneers of benchmarking in single-cell genomics.
- In addition, we would invite two to three more scientists with relevant backgrounds.
- An interactive poster session will give an overview of the projects funded by the UNLOCK benchmarking initiative.

Target Audience: Researchers working on establishing AI benchmarks across different areas of expertise

Keywords: Benchmarks, Trustworthy AI, AI safety
