BEGIN:VCALENDAR
VERSION:2.0
PRODID:icalendar-ruby
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTSTAMP:20260614T170841Z
UID:901798a3-7205-40cb-8343-63518a71218a
DTSTART:20240108T120000Z
DTEND:20240108T150000Z
DESCRIPTION:Researchers often spend a significant amount of time on data-wr
 angling tasks\, such as reformatting\, cleaning\, and integrating data fro
 m different sources. Despite the availability of software tools\, they oft
 en end up with difficult-to-reuse workflows that require manual steps. Omn
 ipy is a new Python library that offers a systematic and scalable approach
  to research data and metadata wrangling. It allows researchers to import 
 data in various formats and continuously reshape it through typed transfor
 mations. For large datasets\, Omnipy seamlessly scales up data flows for d
 eployment on external compute resources\, with the user in full control of
  the orchestration.\n\nThis workshop will build on the half-day workshop [
 "Using Omnipy for data wrangling and metadata mapping (beginner level)"](h
 ttps://tess.elixir-europe.org/events/using-omnipy-for-data-wrangling-and-m
 etadata-mapping-part-1-beginner-level). In this second workshop\, particip
 ants will learn how to develop various types of data flows in Omnipy\, inc
 luding integration with web services. They will use the powerful industry
 -developed Prefect orchestration engine to scale up and deploy high-throug
 hput ETL flows using external compute resources.\n\nThe wo
 rkshop is divided into three parts:\n\n1. The first part will introduce th
 e slogan "parse\, don't validate" and show how this principle is implement
 ed in Omnipy. Against this background\, we will introduce the three types
  of d
 ata flows supported by Omnipy: linear\, DAG\, and function flows. Hands-on
  examples will also show how to use various job modifiers to power up an
 d customise predefined tasks and flows to construct m
 ore complex data flows.\n2. The second part will focus on integrating data
  flows with web services through REST APIs. We will mainly focus on extrac
 ting data from data sources\, but will also touch upon loading results int
 o data sinks. Hands-on examples will introduce tasks and flows that allow
  flattening of JSON data into relational tabular form for mapping\, and t
 he
 n restructuring the results back to JSON.\n3. The last part will introduce
  Omnipy's integration with S3-based cloud storage and the Prefect ETL orch
 estration library. As a hands-on exercise\, participants will scale up the
  data flow developed in the second part of the workshop by deploying it on
  external compute infrastructure\, potentially the Kubernetes-based NIRD T
 oolkit from SIGMA2 (if Prefect integration in NIRD is finalised in time fo
 r the workshop).
LOCATION:Georg Sverdrups hus\, 39 Moltke Moes vei
SUMMARY:Using Omnipy for data wrangling and metadata mapping (Part 2 - Inte
 rmediate level)
URL;VALUE=URI:https://www.ub.uio.no/english/courses-events/events/dsc/2024/
 digital-scholarship-days/22-omnipy-part2.html
END:VEVENT
END:VCALENDAR
