Conference Agenda
Session
WS 7b (2/2) - Current status of the benchmarking field: lessons learned from the first half of the UNLOCK initiative
Session Abstract
Brief Description and Outline: In this session, we will explore the state of the art in the benchmarking field, showcasing the most useful tools and summarizing best practices for setting up benchmarks. The workshop program features invited talks by leading contributors in the benchmarking field, a poster session for projects funded by the Helmholtz UNLOCK benchmarking initiative, and concludes with a panel discussion on benchmark design with experts from the Vector Institute and Helmholtz Munich.

- 14:15 - 14:35 HumaniBench: A Human-Centric Benchmark for Large Multimodal Models Evaluation (Shaina Raza, Vector Institute, Canada)
- 14:35 - 14:55 ChemBench: A benchmarking app for chemistry LLMs (TBA, Friedrich Schiller University Jena)
- 14:55 - 15:15 OpenProblems: a platform for benchmarking open problems in single-cell analysis (Robrecht Cannoodt, Edaro)
- 15:15 - 15:35 AI Energy Consumption benchmarks (Philipp Huber, KIT)
- 15:35 - 15:55 SPRIND competitions (Dominik Hermle, SPRIND)
- 15:55 - 16:00 Discussion & outlook
- 16:00 - 16:45 Poster session: discovering the UNLOCK project (incl. a coffee break)
- 16:45 - 17:30 Panel discussion: what makes a good benchmark? (Shaina Raza, Malte Lücken, Marie Piraud, TBA); Moderators: Daria Romanovskaia, Kaleb Phipps
- Official end of the workshop

Goals: With growing numbers of AI models being developed across scientific fields, the establishment of benchmarks is essential. We want to bring together a community of researchers with expertise in establishing benchmarks in their respective fields to exchange experience and provide practical advice. We see three main goals for this workshop:
- Define a wide range of benchmarking goals in the AI space;
- Highlight the importance of global AI safety benchmarks;
- Encourage exchange between UNLOCK project participants and the broader benchmarking community.
Presenters' Experience:
- Shaina Raza, from the Vector Institute, one of the leading institutes in establishing benchmarks in the field of trustworthy AI.
- Among Helmholtz Munich scientists: Marie Piraud, head of the AI consultants at Helmholtz Munich; Steffen Schneider, whose group builds machine learning algorithms for representation learning and inference of nonlinear system dynamics; and Malte Lücken, whose group is one of the pioneers of benchmarking in single-cell genomics.
- In addition, we would invite two to three more scientists with relevant backgrounds.
- An interactive poster session will give an overview of the projects funded by the UNLOCK benchmarking initiative.

Target Audience: Researchers working on establishing AI benchmarks across different areas of expertise

Keywords: Benchmarks, Trustworthy AI, AI safety
