Red Hat to help DOE containerize supercomputing

2022-06-16

Cloud-native architectures have changed the way applications are deployed, but remain relatively uncharted territory for high-performance computing (HPC). This week, however, Red Hat and the US Department of Energy will be making some moves in the area.

The IBM subsidiary – working closely with the Lawrence Berkeley, Lawrence Livermore, and Sandia National Laboratories – aims to develop a new generation of HPC applications designed to run in containers, orchestrated using Kubernetes, and optimized for distributed filesystems.

The work could also make AI/ML workloads easier for enterprises to deploy.

While Kubernetes, containerization, and block storage are all old hat in hyperscale and cloud datacenters, the technologies haven't been deployed on a wide scale in HPC environments. And where they have been, the deployments have been highly specialized to suit each workload's unique requirements.

"Our workloads are very different than the cloud. We need to run one very large job, and that gets split into many tens, hundreds, thousands of individual CPUs. It's a one-to-many mapping," Andrew Younge, research and development manager at Sandia National Laboratories, told The Register.

By comparison, cloud providers are primarily concerned with availability and capacity. In other words, how to make an application scale to meet the needs of rapidly changing usage and traffic patterns.

"With that in mind, we're trying… to use cloud-native technologies in the context of HPC, and that takes some customization," Younge explained.

Containerization isn't exactly new to HPC, but it has often been deployed in specialized runtimes, he added.

"By starting to adopt more standard technologies, that means that we can start to leverage other parts of the ecosystem," Shane Canon, senior engineer at Lawrence Berkeley National Laboratory, told The Register.

"What we want is to be able to run our HPC workloads, but we also want to start to marry that with Kubernetes-style deployments and configurations and execution."

"If you look at containerization in general, we have historically been focused on the application value of containers," Yan Fisher, global evangelist for emerging technologies at Red Hat, told The Register. "This really speaks to more of an infrastructure application."

To address these challenges, the IBM subsidiary is working with each of the labs to integrate cloud-native technologies into HPC workflows, and into the infrastructure that supports them.

At Berkeley, Red Hat is working with Canon to enhance Podman, a daemonless container engine similar to Docker, so that it can replace Shifter, the custom container runtime developed at the National Energy Research Scientific Computing Center.
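Because Podman has no central daemon, each user can drive it directly, including from Python. A hedged sketch using the podman-py bindings, which largely mirror docker-py's client API; the socket path and image are placeholders, and this assumes a rootless Podman socket is running for the user:

```python
# Rough sketch, not NERSC's setup: drive a rootless, daemonless Podman
# from Python via podman-py. Socket path and image are placeholders.
from podman import PodmanClient

with PodmanClient(base_url="unix:///run/user/1000/podman/podman.sock") as client:
    client.images.pull("docker.io/library/python", tag="3.11")
    # With detach left off, run() returns the container's output,
    # mirroring docker-py's behaviour.
    output = client.containers.run(
        "docker.io/library/python:3.11",
        command=["python", "-c", "print('hello from a rootless container')"],
        remove=True,  # clean the container up once it exits
    )
    print(output)
```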

Similarly, at Sandia, Red Hat is working with Younge's team to explore deploying workloads at scale on Kubernetes, using the company's OpenShift platform.

"In terms of Kubernetes, there's a lot of value to having that flexibility. We're traditionally used to HPC representing everything as a job, and that can sometimes be limiting," Younge said. "Representing services as well as jobs in some amalgamation of the two really provides a comprehensive scientific ecosystem."

Meanwhile, at Lawrence Livermore National Laboratory, the software vendor aims to help researchers deploy and manage containerized workloads alongside traditional HPC applications.

All three labs are investigating ways to run these workloads on distributed filesystems, as opposed to the specialized parallel filesystems in use today.

The ultimate goal of these endeavors is to make HPC workloads deployable on Kubernetes at "extreme scale" while providing users with well-understood ways of deploying them.

"A lot of this, especially with Podman, is about ensuring that the lessons we've learned in HPC can make it to a wider community," Younge said.

The benefits of this work extend well beyond the realm of science. The ability to easily deploy HPC workloads in containers or on Kubernetes has implications for the wave of enterprises scrambling to deploy large parallel workloads like AI/ML, he added. ®
