Slides
IDPFUN2 Training School, 2026: RDM, DOME & Reproducibility
The IDPFUN2 training school took place in Budapest, Hungary, with the week's events starting on Monday, 4 May 2026. The programme featured sessions led by Gavin Farrell from the University of Padova and ELIXIR Italy. The training centred on the intersection of data management, artificial intelligence, reproducibility, and career development within the life sciences. Below is a description of the course contents and a breakdown of the three primary slide decks.
1. AI Data Management and DOME
* The slides cover foundational Research Data Management concepts. The FAIR principles for scientific data are heavily emphasised.
* The importance of data management plans is explained. Tools like the Data Stewardship Wizard and FAIRsharing are highlighted for maintaining good data practices.
* The intersection of FAIR data and artificial intelligence is explored. The presentation showcases how high-quality data is essential for model development.
* The DOME recommendations (Data, Optimisation, Model, and Evaluation) are detailed. Guidance is provided on reporting machine learning methods transparently using the DOME Registry.
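As a rough illustration of the DOME idea summarised above, the sketch below checks that a machine learning report says something under each of the four DOME headings. It is not the DOME Registry's schema or API: the function name, the dictionary layout, and every field inside each section are hypothetical and used only for illustration.

```python
# Minimal sketch, assuming a report is a plain dictionary.
# Only the four section names come from the DOME acronym; all nested
# field names and values below are hypothetical placeholders.
DOME_SECTIONS = ("data", "optimisation", "model", "evaluation")


def missing_sections(report: dict) -> list[str]:
    """Return the DOME sections that are absent or left empty in a report."""
    return [section for section in DOME_SECTIONS if not report.get(section)]


example_report = {
    "data": {"provenance": "placeholder", "splits": "train/validation/test"},
    "optimisation": {"algorithm": "placeholder"},
    "model": {"availability": "placeholder"},
    "evaluation": {},  # left empty on purpose to show the check
}

print(missing_sections(example_report))  # ['evaluation']
```

A check like this only confirms that each section is present; whether the content is adequate still requires human review.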
2. Reproducibility and AI in bioinformatics
* The presentation addresses the reproducibility crisis in bioinformatics. It defines the core challenges and barriers, which include time and tooling overheads.
* Reproducible computational workflows are compared against non-reproducible manual methods. Reproducible workflows utilise tools like Docker, Conda, Nextflow, and Snakemake.
* The Open and Sustainable AI recommendations are introduced. These guidelines tackle reusability, reproducibility, and the environmental sustainability of AI models.
* The session discusses the emerging challenges posed by agentic AI systems and large language models. These technologies raise new questions regarding scientific integrity and reproducibility.
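One small piece of the reproducibility practices listed above can be sketched in a few lines: fixing a random seed and recording what a run depended on. This is an assumption-laden illustration using only the Python standard library, not material from the slides, and it does not replace the environment and workflow tools named above (Docker, Conda, Nextflow, Snakemake). The function names `run_manifest` and `sample_with_seed` are hypothetical.

```python
import hashlib
import platform
import random
import sys


def run_manifest(seed: int, input_bytes: bytes) -> dict:
    """Collect basic facts needed to repeat a run (hypothetical layout)."""
    return {
        "seed": seed,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # A checksum lets a later run confirm it used identical input data.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
    }


def sample_with_seed(seed: int, n: int = 3) -> list[float]:
    """Draw n pseudo-random numbers from a generator seeded explicitly."""
    rng = random.Random(seed)  # local generator, no hidden global state
    return [rng.random() for _ in range(n)]


# The same seed yields the same draws, so the step can be repeated exactly.
print(sample_with_seed(42) == sample_with_seed(42))  # True
print(sorted(run_manifest(42, b"example input")))
```

Storing such a manifest next to the results is a lightweight complement to a containerised or workflow-managed run.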
3. Career Development Opportunities and roles in scientific research infrastructures
* The target audience includes Master's, PhD, and postdoc profiles. It is aimed at anyone interested in careers within data-focused research infrastructures.
* The session outlines the definition and scope of research infrastructures. It highlights key organisations such as ELIXIR and EMBL-EBI.
* Various roles are detailed, including data curator, research software engineer, and project manager. The presentation notes the cross-transferable skills applicable to industry.
* The session concludes with advice on the power of networking and community building for career advancement.
DOI: https://doi.org/10.5281/zenodo.20026744
Licence: Creative Commons Attribution 4.0 International
Contact: [email protected]
Keywords: DOME, OSAI
Target audience: PhD students, Master's students
Resource type: Slides
Version: 1
Status: Archived
Prerequisites:
Basic understanding of machine learning and life science data. Suitable for both wet and dry lab scientists.
Learning objectives:
Understand research data management (RDM) and how it relates to managing AI/ML data assets.
Date created: 2026-05-01
Date modified: 2026-05-07
Date published: 2026-05-07
Scientific topics: Machine learning
Italy