SciNet’s publication about Niagara deployment

August 3, 2019 in blog, blog-general, blog-technical, for_press, for_researchers, for_users, frontpage, news, Road_to_Niagara

Have you ever wondered how a supercomputer is designed and brought to life?
Read SciNet’s latest paper on the deployment of Canada’s fastest supercomputer: Niagara.

Niagara is currently the fastest supercomputer accessible to academics in Canada.
In this paper we describe the transition from our previous systems, the TCS and the GPC, the procurement and deployment processes, and the unique features that make Niagara a one-of-a-kind machine in Canada.

Please cite this paper when using Niagara to run your computations, simulations or analysis:
“Deploying a Top-100 Supercomputer for Large Parallel Workloads: the Niagara Supercomputer”, Ponce et al., Proceedings of PEARC’19: Practice and Experience in Advanced Research Computing on Rise of the Machines (Learning), 34 (2019).

Learn more about SciNet’s research and publications on the SciNet website.

SciNet Receives HPCwire Award

December 6, 2018 in blog, for_press, for_researchers, for_users, frontpage, in_the_news, news, Road_to_Niagara, success_story

We are very proud that SciNet has received the 2018 HPCwire Editors’ Choice Award for Best Use of HPC in Physical Sciences. The award was announced at the 2018 International Conference for High Performance Computing, Networking, Storage and Analysis (SC18), in Dallas, Texas.

SciNet used Lenovo and Mellanox technologies on the new Niagara cluster to create high-spatial-resolution models of the Pacific Ocean, helping to validate ocean wave movement and to assist in global warming calculations. These calculations were performed by a team of scientists involving University of Toronto’s Prof. W. Richard Peltier, University of Michigan oceanographer Prof. Brian Arbic, and NASA JPL’s Dr. Dimitris Menemenlis. More on this calculation can be found here.

This calculation was part of the “early science” program of the Niagara supercomputer at the SciNet HPC Consortium. In this short period in March of 2018, a number of scientists were given the opportunity to perform “heroic” calculations. These large-scale calculations were essential to test and tune Niagara and to get it ready for use as Canada’s fastest national academic supercomputer.

HPCwire: SciNet Launches Niagara, Canada’s Fastest Supercomputer

March 9, 2018 in in_the_news, news, Road_to_Niagara

HPCwire reports on the launch of the new supercomputer Niagara at SciNet.

Launch of the Niagara Supercomputer at SciNet

March 5, 2018 in for_educators, for_press, for_researchers, for_users, frontpage, in_the_news, news, Road_to_Niagara

The Niagara supercomputer was officially launched on March 5th, 2018. We were honoured by the presence and remarks of Reza Moridi (Ontario Minister of Research, Innovation and Science), Nizar Ladak (Compute Ontario President and CEO), Dr. Roseann O’Reilly Runte (CFI President and CEO), Prof. Vivek Goel (Vice-president of Research and Innovation at the University of Toronto), and Prof. W. Richard Peltier (Scientific Director of SciNet).

SciNet’s CTO Daniel Gruner gave an overview of the new system:

Niagara is located at University of Toronto and operated by the university’s high-performance computing centre SciNet, but the system is open to all Canadian university researchers.

Niagara is the fastest computer system in the country and is able to run a single job across all 60,000 cores, thanks to a high-performance network that interconnects all the nodes. For more information on the configuration, see here.

A time-lapse of the building of Niagara is available on SciNet’s YouTube channel.

This system is jointly funded by the Canada Foundation for Innovation, the Government of Ontario, and the University of Toronto.

Road to Niagara 3: Hardware setup

March 5, 2018 in blog-technical, for_press, for_researchers, for_users, news, Road_to_Niagara, Uncategorized

This is the third of a series of posts on the transition to SciNet’s new supercomputer called “Niagara”, which will replace the General Purpose Cluster (GPC) and Tightly Coupled Cluster (TCS). The transition to Niagara will take place in the fall of 2017, and the system is planned to be available to users in early 2018.

The University of Toronto has awarded the contract for Niagara to Lenovo, and some of the details of the hardware specifications of the Niagara system have been released:

The system will have the following hardware components:

  • 1,500 nodes.
  • Each node will have 40 Intel Skylake cores (for a total of 60,000 cores) at 2.4 GHz.
  • Each node will have 200 GB (188 GiB) of DDR4 memory.
  • The interconnect between the nodes will be Mellanox EDR InfiniBand in a Dragonfly+ topology.
  • A ~9 PB usable shared parallel file system (GPFS) will be mounted on all nodes.
  • A 256 TB Excelero burst buffer (NVMe fabric, up to 160 GB/s) will be available for fast I/O.
  • Peak theoretical speed: 4.61 PetaFLOPS.
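The quoted peak follows directly from the core count and clock speed, assuming 32 double-precision floating-point operations per core per cycle (two AVX-512 FMA units, as on Skylake-SP processors); a quick back-of-the-envelope check:

```python
# Sanity check of Niagara's theoretical peak performance.
nodes = 1500
cores_per_node = 40
clock_hz = 2.4e9                 # 2.4 GHz
flops_per_cycle = 32             # 2 AVX-512 FMA units x 8 doubles x 2 ops/FMA

total_cores = nodes * cores_per_node
peak_flops = total_cores * clock_hz * flops_per_cycle

print(f"{total_cores} cores, peak = {peak_flops / 1e15:.2f} PFLOPS")
# 60000 cores, peak = 4.61 PFLOPS
```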

Niagara is estimated to be installed and operational in March 2018, and ready for users not too long after.

Even before the official ready date, there will be a period in which select users can try out and port their codes to Niagara.

After the friendly-user period, all current users of the GPC (and former users of the TCS) will get access to Niagara.

The large core count, ample memory per core, and fast interconnect support Niagara’s intended purpose to enable large parallel compute jobs of 512 cores or more.

The software setup will also be tailored to large parallel computations. Nonetheless, there will still be a fair amount of backfill opportunity for smaller jobs.

The setup of Niagara is intended to be similar in spirit to the GPC, but different in form: scheduling per node; home, scratch, and possibly project directories defined in environment variables; a module system; and access to our team of analysts to help you get your codes running, and running well.
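To make this concrete, a job submission on such a system might look like the following. This is a hypothetical sketch only, assuming a Slurm-style scheduler; the directives, module names, and environment variables shown are illustrative, as the final setup has not been announced.

```shell
#!/bin/bash
# Hypothetical job script illustrating the intended setup: per-node
# scheduling, directories set via environment variables, and a module
# system. All names below are placeholders, not confirmed configuration.
#SBATCH --nodes=16               # whole nodes are allocated, not single cores
#SBATCH --ntasks-per-node=40     # one MPI rank per core
#SBATCH --time=01:00:00

module load intel openmpi        # illustrative module names

cd $SCRATCH/my_run               # large I/O goes to scratch, not home
mpirun ./my_parallel_code        # 16 x 40 = 640 cores, above the 512-core target
```

Note how a 16-node request already puts a job at 640 cores, in line with the system's focus on large parallel workloads.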

Road to Niagara 2: GPC Reduction

October 26, 2017 in news, Road_to_Niagara

This is the second of a series of posts on the transition to SciNet’s new supercomputer called “Niagara”, which will replace the General Purpose Cluster (GPC) and Tightly Coupled Cluster (TCS). The transition to Niagara will take place in the fall of 2017, and the system is planned to be available to users in early 2018.

The University of Toronto has awarded the contract for Niagara, which means its installation will start soon. To make room for this system, the General Purpose Cluster will be reduced from 30,912 to 16,800 cores on Tuesday November 28, 2017, at 12:00 noon.

Niagara is estimated to be installed, operational and ready for users towards the end of February 2018. At that time, the GPC will be decommissioned.

Even before the official ready date, there will be a period in which select users can try out and port their codes to Niagara.

After the friendly-user period, all current users of the GPC (and former users of the TCS) will get access to Niagara (and their allocations on GPC or TCS will be carried over).

The setup will also be tailored to large parallel computations. Nonetheless, there will still be a fair amount of backfill opportunity for smaller jobs.

Although the details of the Niagara system are yet to be announced, existing SciNet users can get more information about the new system here.

Road to Niagara 1: Tightly Coupled Cluster Decommissioned

October 25, 2017 in frontpage, news, Road_to_Niagara


This is the first of a series of posts on the transition to SciNet’s new supercomputer called “Niagara”, which will replace our aging General Purpose Cluster (GPC) and Tightly Coupled Cluster (TCS). The transition to Niagara will take place in the fall of 2017, and the system is planned to be available to users in early 2018.

To make room for Niagara, old systems will have to go. Because enabling research computing is our priority, at least 50% of the GPC will be kept running throughout the process of installing Niagara. The GPC will not be completely switched off until Niagara is available.

The first cluster to go was the TCS. This was SciNet’s first supercomputer, a 102-node, 3,264-core IBM POWER6 system installed in January of 2009.

The TCS was shut off on September 29, 2017, and physically removed in October. The end of an era.

As the pictures below show, you don’t just put your old supercomputer out to the curb; there is a bit of work involved in removing it. It took about 8 hours, 14 pallets, 10 racks, and 3 truckloads. And a $5 bill was found under one of the TCS racks, so we made some money as well!

We are currently in the midst of finalizing the contract for Niagara, so the next post in this series will provide more details on the new system to come.

Decommissioning the old POWER6 TCS requires a little forklift; those are heavy nodes.

TCS nodes taken out of their racks.


The empty space left behind by the TCS…


Decommissioning the TCS subfloor connections.

SciNet to be the site for the new Large Parallel System

May 11, 2015 in for_press, for_researchers, for_users, in_the_news, news, Road_to_Niagara

As part of its strategy for Advanced Research Computing and High Performance Computing in Canada, Compute Canada has conducted a site selection for four new systems. These systems are intended to replace and augment the currently aging computational systems available to Canadian academic researchers.


Recognizing the diversity of ARC computing in academic research, Compute Canada is planning to install four systems. Three systems will be so-called General Purpose clusters, aimed at small to moderately sized jobs with a large variety of demands (e.g., I/O, GPUs, memory, …).

The fourth machine will be a Large Parallel system, i.e. a tightly coupled parallel supercomputer intended for running large (on the order of at least 512 cores per job) parallel jobs, typically using the Message Passing Interface.

SciNet, at the University of Toronto, has been selected as the site for the LP system. The GP systems will be at the University of Victoria, at Simon Fraser University, and at the University of Waterloo.

Note that it is very hard at this stage to know when these new systems will be online. A rough, very tentative estimate is that they could start arriving sometime in 2016.

For more information regarding the selection, see the selection announcement by Compute Canada.