BEGIN:VCALENDAR
VERSION:2.0
PRODID:icalendar-ruby
CALSCALE:GREGORIAN
BEGIN:VEVENT
DTSTAMP:20260705T100920Z
UID:841b1b2c-d639-4766-bd25-4a46bd33571c
DTSTART:20171116T070000Z
DTEND:20171117T140000Z
DESCRIPTION:Description\n\nWith the rapid growth in data volume that is bei
 ng used in data analysis tasks\, it gets more and more challenging for the
  user to process it using standard methods. Enter Spark\, a high-performan
 ce distributed computing framework\, which allows us to tackle big-data pr
 oblems by distributing the workload across a cluster of machines. \n\n \n
 \nThis two-day course addresses the technical architecture and use cases
  of Spark\, setting it up for your work\, best practices and programming
  aspects. The first day includes the overview\, architectural concepts
  and p
 rogramming with Spark's fundamental data structure (RDD). The second day f
 ocuses on the SQL module of Spark\, which lets the user analyse data in
  Spark's distributed collections (Dataframes) using traditional SQL
  queries.\n\nLearning outcome\n\nAfter this course you should be able
  to write simple to intermediate programmes in Spark using RDD and datafra
 mes/SQL. \n\nPrerequisites\n\nBasic knowledge of programming in general is
  recommended (ideally\, Python).\n\nPlease NOTE: This is not a regular
  programming course. Participants are expected to learn emerging concepts
  in the field of big data / distributed processing\, which might be
  completely different from the concepts of a general programming
  language.\
 n\nAgenda\n\nDay 1\, Thursday 16.11\n
 \n09.00 – 09.30 Overview and architecture of Spark
 \n09.30 – 10.15 Basics of RDDs + Demo
 \n10.15 – 10.30 Coffee break
 \n10.30 – 11.00 RDD: Transformations and Actions
 \n11.00 – 12.00 Exercises
 \n12.00 – 13.00 Lunch
 \n13.00 – 13.30 Word Count Example
 \n13.30 – 14.00 Exercises
 \n14.00 – 14.15 Short overview of Spark's machine learning library
 \n14.15 – 14.30 Coffee break
 \n14.30 – 15.30 Exercises
 \n15.30 – 16.00 Summary of the first day & exercises walk-through
 \n\nDay 2\, Friday 17.11\n
 \n09.00 – 09.30 Spark Dataframes and SQL overview
 \n09.30 – 10.15 Exercises
 \n10.15 – 10.30 Coffee break
 \n10.30 – 10.45 Dataframes and SQL contd.
 \n10.45 – 12.00 Exercises
 \n12.00 – 13.00 Lunch
 \n13.00 – 13.30 Best practices and other useful stuff
 \n13.30 – 14.30 Exercises
 \n14.30 – 14.45 Coffee break
 \n14.45 – 15.00 Brief overview of Spark Streaming
 \n15.00 – 15.15 Demo: Processing live Twitter stream data
 \n15.15 – 16.00 Summary of the course & exercises walk-through
 \n\nLecturers:\n
 \nApurva Nandan (CSC)\, Teaching Assistant: Tommi Jalkanen (CSC)\n
 \nLanguage: English
 \nPrice: Free of charge\n
 \nhttps://events.prace-ri.eu/event/668/
SUMMARY:Analysing large datasets with Apache Spark @ CSC
URL;VALUE=URI:https://events.prace-ri.eu/event/668/
END:VEVENT
END:VCALENDAR
