Argonne's Murat Keçeli helps prepare Jupyter notebooks for Aurora

July 14, 2023 — The Argonne Leadership Computing Facility (ALCF) leads the way in exascale technology, meeting the challenge of preparing for the arrival of Aurora, its next exascale system. Murat Keçeli, a computational scientist at the US Department of Energy's Argonne National Laboratory, is focused on integrating Jupyter notebooks with Aurora to improve the research experience on high-performance computing systems.

In this Q&A article, Keçeli discusses his role, insights, and experiences in driving scientific computing toward the new horizon of exascale performance. The ALCF is a user facility of the US DOE Office of Science.


ALCF: How long have you been working in HPC?

Murat Keçeli: The first job I submitted to an HPC cluster was in the summer of 2007, when I started working as a PhD student. It was a 32-node Beowulf cluster, Haku, and I was mostly submitting independent jobs to each node, which is considered more of a capacity calculation. The first supercomputer I ran jobs on was ALCF's Mira system, back when I started as a postdoctoral researcher at Argonne in 2014. Only then did I learn to run jobs in capability mode, which roughly means running a single parallelized job on thousands of nodes using MPI. I should note that I owe most of what I learned about supercomputers to ATPESC (the Argonne Training Program on Extreme-Scale Computing), which was truly a life-changing experience.
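
For readers unfamiliar with the distinction, a capability job is a single program launched once across many nodes, with all ranks coordinating through MPI. The sketch below uses the mpi4py Python bindings purely for illustration; it is not drawn from Keçeli's own codes, and any MPI binding would show the same pattern.

```python
# A minimal sketch of a single parallel job spanning many ranks with MPI,
# written with mpi4py (an illustrative choice, not necessarily what was used).
# Launched once across the whole allocation, e.g.:
#   mpiexec -n 4096 python hello_mpi.py
from mpi4py import MPI

comm = MPI.COMM_WORLD          # communicator covering every rank in the job
rank = comm.Get_rank()         # this process's ID within the job
size = comm.Get_size()         # total number of ranks across all nodes

# Each rank computes a partial result; a reduction combines them at rank 0.
partial = rank * rank
total = comm.reduce(partial, op=MPI.SUM, root=0)

if rank == 0:
    print(f"{size} ranks computed a combined result of {total}")
```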

Exascale development is largely uncharted territory. What is it like to work on a project to enable science on these powerful and unique systems?

It’s certainly very exciting, but there’s also significant pressure. Basically, we have the potential to study problems orders of magnitude larger than we can tackle with conventionally available resources. However, turning that potential into a real capability that other scientists can access is a difficult problem. Developing scientific software with the features, flexibility, and performance required for today’s most demanding applications can take many years of human effort. The Exascale Computing Project (ECP) has given us time to be ready when the exascale machines are powered up; however, there should be continued support to make sure these projects remain up to users’ standards.

What are Jupyter Notebooks? How do they help scientific research?

I see Jupyter notebooks as interactive learning and R&D environments. They let you combine code, documentation, and visualization in a single document, which makes them very useful for experimenting with new ideas, creating prototypes, performing data analysis, and writing tutorials. While Jupyter initially supported only Julia, Python, and R, it now supports more than 40 programming languages. Additionally, cloud services such as Google Colab and GitHub Codespaces enable effective collaboration using Jupyter notebooks. All of these features have made Jupyter notebooks the most popular platform for data scientists, and there is an ongoing effort to use them for reproducible research and publishing. I would also like to underline their role in education and training: they are now an integral part of academic courses and workshops, providing interactive tutorials to students.
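
As a concrete, generic illustration of that "code, documentation, and visualization in a single document" point (nothing here is specific to ALCF or to Keçeli's work), a single notebook code cell might look like this, with narrative text living in Markdown cells around it:

```python
# One notebook cell mixing computation and an inline plot; with the usual
# inline matplotlib backend, the figure renders directly below the cell.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
plt.plot(x, np.sin(x), label="sin(x)")
plt.legend()
plt.title("Figure rendered inline in a Jupyter notebook")
plt.show()
```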

How will Jupyter notebooks be used with Aurora? Will using Jupyter notebooks with exascale systems differ from using them with traditional CPU-based systems?

The shift from CPU-based to GPU-based systems parallels the shift from traditional simulation-based workflows to artificial intelligence/machine learning (AI/ML) based workflows. This means a significant change in the HPC user profile. As Jupyter notebooks are more popular among data science users, it is natural to expect more use of Jupyter notebooks with Aurora. I believe ALCF JupyterHub will provide a more user-friendly gateway to Aurora than traditional terminal-based access, one where users can submit and track their jobs and analyze data on the fly.
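
As a rough sketch of what submitting and tracking jobs from a notebook could look like, the cell below shells out to a PBS-style scheduler (ALCF systems use PBS Pro); the job script name is a hypothetical placeholder, and ALCF JupyterHub may expose this functionality quite differently:

```python
# A hedged sketch of driving a PBS-style batch scheduler from a notebook cell.
# "my_job_script.sh" is a hypothetical placeholder, not an ALCF artifact.
import subprocess

# Submit the job script; PBS's qsub prints the new job's ID to stdout.
job_id = subprocess.run(
    ["qsub", "my_job_script.sh"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print("Submitted:", job_id)

# Poll the job's state; the output can be parsed or shown in the notebook.
status = subprocess.run(
    ["qstat", job_id], capture_output=True, text=True,
).stdout
print(status)
```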

What are the challenges you face in preparing Jupyter notebooks for Aurora’s launch? What is your strategy to prepare Jupyter for exascale?

I think the main challenge for launching any cutting-edge supercomputer is the user experience. While these systems are enormously capable on paper, using their resources efficiently is a real challenge, particularly for early users. Our goal with Jupyter notebooks is to lower the barrier for new users by providing them with a smoother experience. We’ll leverage interactive features like Jupyter widgets to prepare standalone tutorials that can help users get started. A variety of learning materials for ALCF workshops are now based on Jupyter notebooks.
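
As an example of the kind of interactivity Jupyter widgets make possible, the hypothetical tutorial cell below uses the ipywidgets library to attach a slider to a plotting function; the plotted function is a placeholder, not taken from ALCF materials:

```python
# A minimal sketch of an interactive tutorial cell using ipywidgets.
# @interact builds a slider from the (min, max, step) tuple and reruns
# the function whenever the slider moves.
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact

@interact(frequency=(1, 10, 1))
def plot_wave(frequency=1):
    """Redraw the plot for the currently selected frequency."""
    x = np.linspace(0, 2 * np.pi, 400)
    plt.plot(x, np.sin(frequency * x))
    plt.title(f"sin({frequency}x)")
    plt.show()
```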

Who do you collaborate with in your work at the ALCF?

I started working at Argonne as a postdoctoral researcher in the Dynamic Chemistry group of the Chemical Sciences and Engineering (CSE) division. I have worked on a couple of projects involving scientists from CSE, the Materials Science division, the Mathematics and Computer Science division, and the ALCF, and I continue to collaborate with these colleagues on related projects and proposals. In 2018 I joined the ALCF Data Science team and the Computational Science division. ALCF workshops and several compute allocation programs have allowed me to connect with researchers from all over the world. Just this April we published a paper in Nanoscale on using machine learning potentials with Belgian and Turkish scientists, based on a collaboration that started at a previous ALCF workshop. For the past three years I have mainly worked on the NWChemEx ECP project. This is a very large project involving six national laboratories (Ames, Argonne, Brookhaven, Lawrence Berkeley, Oak Ridge, and Pacific Northwest) and Virginia Tech. I would also like to mention the Sustainable Horizons Institute’s Sustainable Research Pathways program, in which I participated last year. This workforce development program has allowed me to connect with researchers from underrepresented communities, and I will be hosting three summer students this year.

How has your approach to preparing for exascale evolved over time?

Initially, I was mostly concerned with scaling and performance tuning. While this is a very important area in HPC, over time my interest has shifted more toward workflow development and the user experience side. This area is becoming ever more important with the growing interest in projects that aim to couple AI/ML with traditional simulations.

The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discoveries and knowledge across a broad range of disciplines. Supported by the U.S. Department of Energy’s (DOE) Office of Science, Advanced Scientific Computing Research (ASCR) program, the ALCF is one of two DOE Leadership Computing Facilities in the nation dedicated to open science.

Argonne National Laboratory seeks solutions to pressing national problems in science and technology. The nation’s first national laboratory, Argonne conducts cutting-edge basic and applied science research in virtually every scientific discipline. Argonne researchers work closely with researchers from hundreds of companies, universities, and federal, state, and city agencies to help them solve their specific problems, advance America’s scientific leadership, and prepare the nation for a better future. With employees from over 60 nations, Argonne is operated by UChicago Argonne, LLC for the U.S. Department of Energy’s Office of Science.


Source: Nils Heinonen, ALCF
