Research Computing and Data Science Training/Education

The full power of high performance computing systems can best be exploited by people with specialized knowledge. The education and training of such people is absolutely critical, especially since the methodology in many disciplines has evolved to include a large computational component. SciNet has developed an education and training program for the wider scientific community aimed at helping students and users obtain the skills and knowledge required to get the most out of advanced research computing resources. It is one of our most important activities, and it has shown tremendous growth throughout SciNet’s existence.

SciNet’s education and training program

SciNet’s education program started with the traditional “Intro to SciNet” sessions and yearly intensive parallel programming workshops. As our user base has grown to encompass fields relatively new to HPC, such as medical science, biology, forestry, and economics, the program has grown to include data science topics such as introductory scientific computing in Python and R, machine learning, and workflow design, while still covering advanced research computing and high performance computing.

The skills that SciNet aims to transfer are rare and sought-after, and they complement and enhance the skills students learn in regular curricula. Users and students can earn a certificate in Scientific Computing, Data Science, or High Performance Computing once they have completed enough SciNet credit-hours. Because they document that the holder has highly competitive skills, the certificates are in high demand. From the start of the program in 2013 until November 2018, over 200 SciNet certificates were issued.

The growth of SciNet’s education program is illustrated by the chart above, which shows the total attendance (number of attendees times duration in hours) across all education and training events given by SciNet.

This graph also highlights the growth in popularity of our data science courses, which include machine learning.

SciNet courses tie into university graduate programs

By partnering with other departments in the University, we have made it possible for an increasing number of our training courses to be taken for credit toward graduate degrees at the University of Toronto. Our current partners include the Departments of Physics, Astrophysics, Chemistry, and Ecology and Evolutionary Biology (EEB), as well as the Institute for Medical Sciences (IMS). Indicative of the success of our “partnered” courses, the full-term physics graduate course “Scientific Computing for Physicists” has had a consistent enrollment of around 40 in the last three Winter terms since its creation in 2016, and has attracted students from many different departments, such as physics, astrophysics, engineering, and math. The modular course “Data Analysis with R”, given in partnership with IMS and EEB, started in the Fall of 2016 and had over 100 registered students, a success which prompted the creation of a full-term IMS course, “Introduction to Clinical BioStatistics”. This course has a large data science component and has had no trouble filling its enrollment cap of 80 students. SciNet analysts also deliver a graduate course, “Quantitative Applications for Data Analysis”, in partnership with the Biological Sciences group at the University of Toronto Scarborough. Furthermore, since 2017, we have also guest-lectured in the fourth-year Physics undergraduate Research Project course.

SciNet’s education site contains up-to-date information on courses, as well as course materials and recordings.

The diversity of academic backgrounds among the students taking our courses can be seen in the following chart, broken down by faculty within the University of Toronto.

[Chart: distribution of student-hours in SciNet teaching, by faculty]

Collaborations in research computing training

Together with our partner consortia, SHARCNET and CAC, SciNet is involved in the annual Ontario Summer Schools in High Performance Computing. These schools give attendees opportunities to learn and to share knowledge and experience in high performance and technical computing. Each of the three consortia organizes one week of summer school. In the past two years, the Toronto-based summer school had over 150 unique attendees.

SciNet is also an organizer and sponsor of the International High Performance Computing Summer School (IHPCSS). This ‘school’ is a graduate-level summer institute organized as a collaboration between SciNet, XSEDE, PRACE and RCCS/RIKEN. In 2015, we were the local organizers when the IHPCSS was held at the University of Toronto. The IHPCSS is an expenses-paid program open to graduate students from Canada, the US, Europe and Japan. The demand from Canadian students is consistently about ten times larger than the number of available spots, further evidence of the demand for training in research computing.