Rouge AMD GPU Cluster

May 17, 2021 in news, Systems


The Rouge cluster was donated to the University of Toronto by AMD as part of their COVID-19 HPC Fund support program. The cluster consists of 20 x86_64 nodes each with a single AMD EPYC 7642 48-Core CPU running at 2.3GHz with 512GB of RAM and 8 Radeon Instinct MI50 GPUs per node.

The nodes are interconnected with 2xHDR100 Infiniband for internode communications and disk I/O to the SciNet Niagara filesystems. In total this cluster contains 960 CPU cores and 160 GPUs.
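The totals quoted above follow directly from the per-node figures. A minimal Python sketch of that arithmetic is given here as a quick check; the function and variable names are illustrative only and not part of any SciNet tooling.

```python
# Per-node figures as stated in the announcement.
NODES = 20
CORES_PER_NODE = 48
GPUS_PER_NODE = 8
RAM_GB_PER_NODE = 512


def cluster_totals(nodes, cores_per_node, gpus_per_node, ram_gb_per_node):
    """Return aggregate core, GPU and memory counts for a homogeneous cluster."""
    return {
        "cores": nodes * cores_per_node,
        "gpus": nodes * gpus_per_node,
        "ram_gb": nodes * ram_gb_per_node,
    }


totals = cluster_totals(NODES, CORES_PER_NODE, GPUS_PER_NODE, RAM_GB_PER_NODE)
print(totals)  # {'cores': 960, 'gpus': 160, 'ram_gb': 10240}
```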

The user experience on Rouge is similar to that on Niagara and Mist, in that it uses the same scheduler and software module framework.

The cluster was named after the Rouge River that runs through the eastern part of Toronto and surrounding cities.

The system is currently in its beta testing phase. Existing Niagara and Mist users affiliated with the University of Toronto can request early access by writing to support@scinet.utoronto.ca.

In tandem with this SciNet hosted system, AMD, in collaboration with Penguin Computing, has also given access to a cloud system of the same architecture.

Specifics of the cluster:

  • Rouge consists of 20 nodes.
  • Each node has a 48-core AMD EPYC 7642 CPU, 2-way hyperthreaded, and 8 AMD Radeon Instinct MI50 GPUs.
  • There is 512 GB of RAM per node.
  • HDR Infiniband one-to-one network between nodes.
  • Shares file systems with the Niagara cluster (parallel filesystem: IBM Spectrum Scale, formerly known as GPFS).
  • No local disks.
  • Theoretical peak performance (“Rpeak”) of 1.6 PF (double precision), 3.2 PF (single precision).
  • Technical documentation can be found at SciNet’s documentation wiki.

Industry Post-doc Position in Dynamical Downscaling

March 16, 2021 in HPC Jobs, HPC Jobs Ontario, news

Professor W. R. Peltier at the University of Toronto Department of Physics in collaboration with Aquanty invites applications for a postdoctoral research associate to investigate climate change impacts in northern Canada. The research work will include dynamical downscaling of climate projections, with an emphasis on land-surface – climate interactions in Arctic regions. The successful candidate will use the Weather Research and Forecasting (WRF) model to downscale CMIP5 and CMIP6 projections, and to assess temporal and spatial changes in snow cover and permafrost distribution. This project is part of a larger initiative to investigate the impact of climate change on natural resources across Canada, and includes partners in academia, government, and industry.

The minimum requirements for this position are:

  • A doctorate in atmospheric science, meteorology, hydrology, physics or a similar quantitative field
  • Significant experience with the Python programming language, its numerical/scientific stack (e.g. numpy, xarray etc.) and version control (e.g. git)
  • Experience with Linux/Unix environments, shell scripting (e.g. bash) and high-performance/parallel computing
  • Demonstrated ability to publish novel research

The ideal candidate would also possess the following skills and experiences:

  • Research experience with WRF or a similar limited-area atmospheric model
  • Familiarity with land-surface models like Noah-MP or CLM, and the ability to make changes or updates to these model components
  • Interest in climate change impacts and application of research results
  • Commitment to maintainable and reusable software

It is also expected that the successful candidate will contribute to the formulation of research objectives
and the design of numerical experiments, as well as towards the writing and publication of their own
research and that of project partners.

The position will be supervised by Prof. Peltier at the University of Toronto, and there will be direct
technical interaction with Aquanty researchers. Due to the applied nature of the research project,
engagement with both the research community and with natural resources stake-holder groups is
expected.

The appointment will be for a 3-year period, and it is expected that the successful candidate will be
legally able to work in Canada, and will (pending the evolution of the COVID pandemic) eventually
(re-)locate to the Greater Toronto Area in order to maintain a presence at the University of Toronto.
Interested candidates should contact Dr. Andre R. Erler at aerler@aquanty.com (using the subject line
“Post-doc Application”) and include an academic CV and cover letter.

Niagara at Scale Pilot

March 5, 2021 in blog-general, for_press, for_researchers, for_users, frontpage, news

SciNet will be reserving the Niagara cluster for two days in March for the first-ever “Niagara at Scale”, from March 30th, 2021, at 12 noon EST, to April 1st, 2021, at 12 noon EST.

Purpose of the “Niagara at Scale” event

This event will accommodate pre-approved projects that require all or nearly all of the capacity of the Niagara supercomputer at once. Such heroic computations are Niagara’s mandate, as it is the “Large Parallel” cluster within the national systems of the Compute Canada Federation, and the fastest machine of its kind in Canada according to the TOP500 List. But computations of this size (think massively parallel codes running on tens of thousands of cores) are hard or impossible to run within the regular batch scheduler.

How to apply

We already have some groups interested in participating, but we would like to extend our invitation to the whole Canadian high-performance computing community before committing to a particular date. Users who have massively parallel jobs or workflows that could take advantage of this opportunity are encouraged to contact us at support@scinet.utoronto.ca by Friday, March 12, 2021 (note: this is an extension of the original deadline of March 5).

In the email, please briefly describe your intended computation, as well as the size and duration of the jobs you would like to run at scale.  Successful proposals will need to show evidence that their codes can run efficiently on at least 20,000 cores on Niagara and include strong and/or weak scaling data and plots.
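For anyone preparing scaling data, the following is a minimal sketch of how strong- and weak-scaling efficiency can be computed from measured wall times. It uses the usual textbook definitions (strong scaling: fixed total problem size; weak scaling: problem size grows in proportion to the core count). The timings below are hypothetical placeholders, not measurements from Niagara.

```python
def strong_scaling(cores, times):
    """Speedup and parallel efficiency relative to the smallest run.

    Assumes the total problem size is the same for every run.
    """
    base_cores, base_time = cores[0], times[0]
    rows = []
    for n, t in zip(cores, times):
        speedup = base_time / t
        efficiency = speedup / (n / base_cores)
        rows.append((n, speedup, efficiency))
    return rows


def weak_scaling(cores, times):
    """Efficiency relative to the smallest run.

    Assumes the problem size per core is the same for every run.
    """
    base_time = times[0]
    return [(n, base_time / t) for n, t in zip(cores, times)]


# Hypothetical wall times in seconds, for illustration only.
cores = [2500, 5000, 10000, 20000]
strong_times = [800.0, 410.0, 215.0, 120.0]
weak_times = [100.0, 102.0, 105.0, 110.0]

for n, s, e in strong_scaling(cores, strong_times):
    print(f"strong {n:6d} cores: speedup {s:5.2f}, efficiency {e:6.1%}")
for n, e in weak_scaling(cores, weak_times):
    print(f"weak   {n:6d} cores: efficiency {e:6.1%}")
```

Plotting efficiency against core count gives the kind of scaling plot asked for above.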

In addition, your codes must be able to checkpoint and restart, especially since jobs will be restricted to shorter wall time.
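As an illustration of what checkpoint and restart means in practice, here is a minimal Python sketch with a toy computation. The file name, the state layout and the `stop_after` parameter (which simulates a job hitting its wall-time limit) are all assumptions made for this example. A real application would save its own data in whatever format it already uses.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state atomically: dump to a temporary file, then rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)  # an interrupted write never clobbers the old file


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "value": 0.0}


def run(path, total_steps, checkpoint_every, stop_after=None):
    """Advance a toy computation, checkpointing every few steps."""
    state = load_checkpoint(path)
    steps_done = 0
    while state["step"] < total_steps:
        state["value"] += state["step"]  # stand-in for the real work
        state["step"] += 1
        steps_done += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
        if stop_after is not None and steps_done >= stop_after:
            return state  # simulated wall-time limit: unsaved progress is lost
    save_checkpoint(path, state)
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "state.json")
    run(ckpt, total_steps=100, checkpoint_every=10, stop_after=37)  # interrupted
    final = run(ckpt, total_steps=100, checkpoint_every=10)  # resumes at step 30
    print(final)
```

The second call picks up from the last checkpoint written before the interruption and produces the same final state as an uninterrupted run.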

Information session on March 10, 2021

We will hold an online information session regarding this program on March 10, 2021 at our SciNet User Group Meeting at noon EST. Attend to learn what kind of computations this program is aimed at. We will also provide guidance on how to get your computation to such a large scale if it needs it but your code does not yet scale to that size. For more information and sign-up for the event, go to https://scinet.courses/569

Future “Niagara at Scale” Events

The current event is a pilot project. If this initiative proves successful, we are planning to hold several of these events per year.

2020 International Summer School on HPC Challenges in Computational Sciences, University of Toronto, Canada, July 7-12

November 29, 2019 in for_press, for_researchers, for_users, frontpage, news


Update April 17, 2020: This event has been postponed to 2021.

Applications open November 29, 2019, and are due January 27, 2020

Who can apply: Graduate students and postdoctoral scholars from institutions in Canada, Europe, Japan and the United States, especially if you use advanced computing in your research. Students from underrepresented groups in computing are highly encouraged to apply (e.g., women, racial/ethnic minorities, persons with disabilities, etc.).

Who are the teachers: Leading computational scientists and HPC technologists from the U.S., Japan, Europe and Canada will teach classes and provide mentoring to attendees.

What will you learn: Topics include:

  • HPC challenges by discipline
  • HPC programming proficiencies
  • Performance analysis & profiling
  • Scientific visualization
  • Big Data Analytics
  • Mentoring
  • Networking
  • Machine Learning
  • Canadian, EU, Japanese and U.S. HPC-infrastructures

Preferred qualifications, but not required:

  • Familiarity with HPC, not necessarily an HPC expert, but rather a scholar who could benefit from including advanced computing tools and methods into their existing computational work
  • A graduate student with a strong research plan or a postdoctoral fellow in the early stages of their research careers
  • Utilize parallel programming at least on a monthly basis, more frequently preferred
  • A science or engineering background, however, applicants from other disciplines are welcome, provided your research activities include computational work.

Cost: School fees, meals, and housing are covered for all accepted applicants, as are intercontinental flight costs.

Further information and application: https://ss20.ihpcss.org

Questions? Reach out to the contact for your region listed under Contacts below to have questions answered about eligibility, the application process, or the summer school itself.

This summer school is organized by the SciNet HPC Consortium, PRACE, RIKEN and XSEDE.

Contacts

Reach out to the contact for your region listed to get questions answered about eligibility, the application process, or the summer school itself.

CANADA
SciNet HPC Consortium: www.scinethpc.ca

Ramses van Zon
SciNet, Univ. of Toronto, Canada
Email: rzon@scinet.utoronto.ca

EUROPE
PRACE: www.prace-ri.eu

Hermann Lederer
Max Planck Computing and Data Facility, Germany
Email: lederer@mpcdf.mpg.de

Simon Wong
ICHEC, Ireland
Email: simon.wong@ichec.ie

JAPAN
RIKEN: www.r-ccs.riken.jp/en

Toshiyuki Imamura
CCS, RIKEN
Email: Imamura.toshiyuki@riken.jp

UNITED STATES
XSEDE: www.xsede.org

Jay Alameda
NCSA, University of Illinois at Urbana-Champaign, United States
Email: alameda@illinois.edu

SciNet Job Opportunity: Manager, Information Systems Security

November 19, 2019 in HPC Jobs, HPC Jobs Ontario, news

SciNet is looking for a Manager, Information Systems Security. Working under the direction of SciNet’s Chief Technical Officer (CTO) and in coordination with the University of Toronto’s Chief Information Security Officer (CISO), the Manager, Information Systems Security is responsible for working with Information Technology staff and resources at SciNet and the wider Compute Canada federation to minimize the risk of compromise of information, data, servers, and server-based applications. Work is done in the context of existing policy, guidelines and applicable legislation in a fluid, consultative environment.

For more details see the following postings on the job site of the University of Toronto:
External posting /
Internal posting

This posting closes on December 5th, 2019.

Visualize This! Competition

October 2, 2019 in frontpage, news

Visualize What?

Now in its fourth year, Visualize This! invites researchers from all disciplines to use their own datasets — or our sample dataset — to build a unique and innovative visualization that displays an interesting aspect of the data. Our panel of judges will review all entries, and prizes will be awarded to the best submissions.

The theme of this year’s challenge is Distributed Rendering — the visualization of very large datasets that require parallel rendering on a cluster.

Visualize This! is open to anyone affiliated with a Canadian post-secondary institution (college or university) or research organization. Participants from all research fields are encouraged to enter.

Ways to Participate:

1. Use Your Own Dataset

Use data from your own research. Any dataset that is too large to be rendered on a standalone desktop/workstation will be sufficiently large for this competition.

2. Use Our Dataset

If you don’t have a large enough dataset from your own work, Joshua Brinkerhoff from UBC Okanagan will be supplying a 3D Computational Fluid Dynamics (CFD) dataset that can be used for this competition. Example visualizations of this data are featured in the imagery on this poster. Joshua’s data will be available from September 30.

Submissions Due:

November 30, 2019

For more information email viz-challenge AT westgrid.ca or visit: https://computecanada.github.io/visualizeThis.

SciNet’s Summer School: a decade-old tradition

October 1, 2019 in blog, blog-general, for_educators, for_press, frontpage, news

Most would associate summertime with a relaxing and leisurely season of the year. However, HPC centres like SciNet, and many others around the world, experience it differently and are actually quite busy during this period.

Among the many activities SciNet carries out during the summer “break” are workshops and short courses. These activities are scheduled in the summer to fit between the term-long courses that SciNet offers to graduate students at the University of Toronto.

In particular, one of SciNet’s oldest training activities is a one-week intensive school on high-performance and technical computing. This annual summer school is our flagship training event, and is aimed at graduate students, undergraduate students, postdocs, researchers and occasionally even faculty members, who are engaged in compute intensive research. SciNet’s first such summer school was given in 2009, at which time it was called a “Parallel Scientific Computing” workshop. This first version of the school was heavily focused on parallel programming and applications in astrophysics.

These days, SciNet’s summer school is part of the Compute Ontario Summer School on Scientific and High Performance Computing. Held geographically in the west, centre and east of the province of Ontario in Canada, the summer school provides attendees with the opportunity to learn and share knowledge and experience in high performance and technical computing on modern HPC platforms. The central edition is the continuation of the SciNet summer school.

Not only is the school organized in a wider context, its program has expanded as well. In the last three years, the Toronto edition has had three streams with a wide variety of topics, from shell programming to data science, machine learning and neural networks, biomedical computing, and, still, parallel programming.

The type of training offered at the summer school is very practical, with a lot of hands-on exercises and live coding. This practical approach is very typical for most of SciNet’s courses but takes its ultimate form during the summer school instruction.

In addition to the training that participants receive, the school also offers them the opportunity to interact with other participants, as well as with the instructors, to exchange ideas or discuss current problems they are trying to solve. In fact, for a couple of years now, the program has included focused sessions such as “Bring your own code” and “Bio-Hacking”, where this sort of interaction is not only promoted but is the main theme.

Our summer school has the add-on feature of being absolutely free of charge for participants! That’s something we believe is quite important for several reasons, but mostly because we believe that in this way we can reach more researchers from fields that are relatively new to doing computational research.

This type of event not only benefits the students and participants of the summer school, but also enables collaborations between departments and consortia, as part of the training was delivered in partnership with colleagues from SHARCNET and the Centre for Addiction and Mental Health.

SciNet’s first summer school in 2009 focussed on Parallel Scientific Computing and placed emphasis on scientific applications such as in astrophysics.

SciNet’s latest (and largest) summer school, held in June 2019. This summer school had three parallel streams: the traditional High-Performance Computing, one on Data Science and a stream on BioInformatics/Medical applications, which was added in 2017. Details of the courses covered in the school can be found on SciNet’s education website: SciNet.courses/438

Logistics and Organizational details of the Summer School

There is no simple recipe to make a successful summer school that attracts and retains motivated participants for five full days, but below are a few necessary ingredients.

Sessions and instructors… Coming up with a program of three streams with sessions on scientific computing, parallel programming and data science is a challenge, but finding excellent instructors for them is an even greater challenge, especially in summer, when many people are away.
Nonetheless, the summer school has been able to grow from a single-stream offering of 100 lecture hours in 2014 to a three-stream program with nearly 300 lecture hours in 2019. Luckily, we are not limited just to SciNet staff for instructors, but get help from the people from SHARCNET and CAMH as well.

Rooms… Organizing a week-long training session, Monday to Friday from 9:30am to 4:30pm, poses a lot of challenges. To start with, finding rooms is far from trivial: we need not one but three (as there are three parallel sessions), ideally in the same building, each able to host around a hundred people, with proper power outlets and AC, and comfortable enough. We manage to do this thanks, again, to the effort of our instructors and staff, who start booking rooms months in advance… again, summertime on the university premises is not the “quiet and relaxing time” people may think it is…

Taking attendance… We issue certificates to those participants who attend at least three days. This requires that we record the attendance of the participants for every session every day. In the initial summer schools, where there were one or two parallel sessions at most, and the total number of participants wasn’t too large, we used a paper sign-in list, where students self-reported their attendance. At the end of the week we would collect and count these lists and manually award certificates.

But with 3 parallel streams and more than two hundred participants, the task of manually sorting out attendance has become unfeasible. To tackle this issue, we developed a system using our own education website, where we ask the participants to take a “test” selecting from 10 randomly generated codes the one that is given in the session they are attending.
In this way, the participation of each student is recorded and tied to the specific session associated with the selected code. The same site handles registration and dispenses the students’ access to temporary accounts on computing resources they will use during the week, and contains the teaching materials.
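The idea behind such an attendance “test” can be sketched in a few lines of Python. This is a hypothetical illustration only, not the code behind our education website: the session names, code length and function names are all made up for the example.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits


def make_code(rng, length=6):
    """Generate one random attendance code."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def make_quiz(session_code, rng, n_choices=10):
    """Return n_choices distinct codes in random order, one being session_code."""
    choices = {session_code}
    while len(choices) < n_choices:
        choices.add(make_code(rng, len(session_code)))
    choices = sorted(choices)  # fixed order before shuffling, for reproducibility
    rng.shuffle(choices)
    return choices


def record_attendance(log, participant, selected, code_to_session):
    """Tie the participant to the session whose code was selected, if any."""
    session = code_to_session.get(selected)
    if session is not None:
        log.setdefault(participant, set()).add(session)
    return session


rng = random.Random(2019)
code_to_session = {}
for session in ["Mon-AM-HPC", "Mon-AM-DataScience", "Mon-AM-BioMed"]:
    code = make_code(rng)
    while code in code_to_session:  # regenerate in the unlikely case of a clash
        code = make_code(rng)
    code_to_session[code] = session

hpc_code = next(c for c, s in code_to_session.items() if s == "Mon-AM-HPC")
quiz = make_quiz(hpc_code, rng)

log = {}
record_attendance(log, "student1", hpc_code, code_to_session)
print(quiz)
print(log)
```

Because each session has its own code, a correct selection both confirms presence and identifies which of the parallel sessions was attended.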

Certificates… Recording the participants’ attendance is just the beginning of the process of issuing the certificates. After this, we have scripts that identify the participants who qualify for a certificate of participation according to the criteria stated above, and generate a PDF document stating that. Years ago, we used to run across the university campus on the last day to print hard copies of these, but since last year we send the participants an electronic version. The number of certificates demonstrates the growth in attendance over the years: in 2014 we awarded summer school certificates to 30 attendees; in 2019, this number had grown to 159.
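The selection step can be sketched as follows. This is a simplified illustration rather than our actual scripts: it assumes attendance is stored as (participant, day, slot) records and that three full days means six half-day sessions, as in the 2019 announcement. The names and data are hypothetical, and PDF generation is left out.

```python
from collections import Counter

SESSIONS_REQUIRED = 6  # assumption: three full days = six half-day sessions


def eligible_participants(attendance_records, sessions_required=SESSIONS_REQUIRED):
    """Return the sorted names of participants who attended enough sessions.

    attendance_records is an iterable of (participant, day, slot) tuples,
    with slot being "AM" or "PM". Duplicate records are counted once.
    """
    unique_records = set(attendance_records)
    counts = Counter(participant for participant, _day, _slot in unique_records)
    return sorted(p for p, n in counts.items() if n >= sessions_required)


# Hypothetical attendance data, for illustration only.
records = (
    [("alice", day, slot) for day in ("Mon", "Tue", "Wed") for slot in ("AM", "PM")]
    + [("bob", "Mon", "AM"), ("bob", "Mon", "AM"), ("bob", "Tue", "PM")]
    + [("carol", day, "AM") for day in ("Mon", "Tue", "Wed", "Thu", "Fri")]
    + [("carol", "Fri", "PM")]
)

print(eligible_participants(records))  # ['alice', 'carol']
```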

Financial support… One remarkable thing about the school is that we are able to continue offering this high-quality and relevant training free of cost to the participants. This is not an easy task, as there are several costs associated with the event. The cost of the instructors is absorbed by the partnering organizations (SciNet, SHARCNET and CAMH), logistical costs for the rooms and AV use are covered by SciNet, and the coffee breaks provided to the participants were sponsored by Compute Ontario.

Other centres have decided to charge their participants a modest registration fee for their summer school, which allows them to tackle two things: first, to alleviate the cost associated with the event itself; and second, to reduce the number of no-shows during the school. Fortunately, our attendance numbers have been rising steadily every year, while our turn-out rate remains steady and predictable at 70%, making no-shows a non-issue.

More summer activities…

SciNet also participates in the International HPC Summer School, sending a few instructors and 10 students to this competitive one-week program every year.

Last but not least, SciNet finished this year’s summer season by co-organizing and hosting a “virtual” (remotely hosted), week-long PetaScale Computing Institute at the end of August.

Although physically and intellectually exhausted, we finished one of the busiest summer seasons ever in SciNet’s training and education history, which keeps us pushing ourselves as we recharge our energies for the beginning of the academic year.

Further details and information about SciNet’s education and teaching endeavours can be found in the following link:

Study on the role of mediator complex in gene expression in collaboration with SciNet

September 10, 2019 in for_press, for_researchers, frontpage, in_the_news, news, science, success_story, Testimonials

For the last two years, SciNet has been collaborating with PhD candidate Alejandro Saettone of the Fillingham lab at Ryerson University. One of the research projects, which also involved the group of Dr. Ronald Pearlman at York University, deciphered some aspects of the mediator complex’s role in transcription and gene expression using the model organism Tetrahymena thermophila. See the EurekAlert! story on the matter, or the original paper in Current Biology.

The collaboration between SciNet’s Dr. Marcelo Ponce and Alejandro Saettone led to the development of the RACS (“Rapid Analysis of ChIP-Seq data”) pipeline, which serves to analyze data obtained from Chromatin Immunoprecipitation followed by next-generation Sequencing experiments (ChIP-Seq for short). The paper on this computational pipeline has recently been accepted for publication in BMC Bioinformatics. The RACS pipeline, a set of bash shell scripts and R scripts, is open-source software available as a git repository at https://bitbucket.org/mjponce/RACS.

The RACS pipeline has been quite fruitful, having already resulted in two papers where it was applied to data from the model organism Tetrahymena thermophila. The pipeline is expected to result in a few more papers analyzing further data, and there are plans to make it suitable to target more general cases.

Alejandro Saettone: “Our group was very fortunate to collaborate with Dr. Ponce from SciNet. He helped our lab to solve bioinformatic problems involving big data. With this collaboration, we were able to advance knowledge in chromatin remodeling and gene expression.”

Learn more about SciNet’s research and opportunities to establish research collaborations by visiting our research website.

SciNet’s publication about Niagara deployment

August 3, 2019 in blog, blog-general, blog-technical, for_press, for_researchers, for_users, frontpage, news, Road_to_Niagara

Have you ever wondered how a supercomputer is designed and brought to life?
Read SciNet’s latest paper on the deployment of Canada’s fastest supercomputer: Niagara.

Niagara is currently the fastest supercomputer accessible to academics in Canada.
In this paper we describe the transition process from our previous systems, the TCS and GPC, the procurement and deployment processes, as well as the unique features that make Niagara a one-of-a-kind machine in Canada.

Please cite this paper when using Niagara to run your computations, simulations or analysis:
“Deploying a Top-100 Supercomputer for Large Parallel Workloads: the Niagara Supercomputer”, Ponce et al, “Proceedings of PEARC’19: Practice and Experience in Advanced Research Computing on Rise of the Machines (Learning)”, 34 (2019).

Learn more about SciNet’s research and publications by visiting the following link.

2019 Compute Ontario Summer School Central

May 14, 2019 in blog, for_educators, for_press, for_researchers, for_users, frontpage, news

The Compute Ontario Summer School on Scientific and High Performance Computing is an annual educational event for graduate/undergraduate students, postdocs and researchers who are engaged in compute-intensive research. Held geographically in the west, centre and east of the province of Ontario, the summer school provides attendees with the opportunity to learn and share knowledge and experience in high performance and technical computing on modern HPC platforms.

Each site will have a slightly different list of courses. The summer school will include both in-class lectures and hands-on labs (done on the participants’ laptops). Those who attend at least three full days cumulatively (i.e., a total of 6 full morning and afternoon sessions) will receive an official certificate in HPC training.

Instructors for this school have been provided by SciNet, CAMH and SHARCNET. Break refreshments are provided courtesy of Compute Ontario.

Registration for the central installment in Toronto from June 24-28, 2019 is now open!

Registration is free and is aimed at Compute Canada users as well as students, post-docs and other researchers from academic institutions. You do not need to have a SciNet or Compute Canada account (although you can use one if you have it). Please be advised that seats are limited and tend to fill up.