Supercomputing News and Discussions

Yuli Ban
Posts: 5194
Joined: Sun May 16, 2021 4:44 pm

Re: Supercomputing News and Discussions

Post by Yuli Ban »

In 2018, a new supercomputer called Summit was installed at Oak Ridge National Laboratory, in Tennessee. Its theoretical peak capacity was nearly 200 petaflops, that is, 200 thousand trillion floating-point operations per second. At the time, it was the most powerful supercomputer in the world, beating out the previous record holder, China’s Sunway TaihuLight, by a comfortable margin, according to the well-known Top500 ranking of supercomputers. (Summit is currently No. 2, a Japanese supercomputer called Fugaku having since overtaken it.)

In just four short years, though, demand for supercomputing services at Oak Ridge has outstripped even this colossal machine. “Summit is four to five times oversubscribed,” says Justin Whitt, who directs ORNL’s Leadership Computing Facility. “That limits the number of research projects that can use it.”

The obvious remedy is to get a faster supercomputer. And that’s exactly what Oak Ridge is doing. The new supercomputer being assembled there is called Frontier. When complete, it will have a peak theoretical capacity in excess of 1.5 exaflops.
And remember my friend, future events such as these will affect you in the future
Yuli Ban
Posts: 5194
Joined: Sun May 16, 2021 4:44 pm

Re: Supercomputing News and Discussions

Post by Yuli Ban »

France's Jean Zay supercomputer, one of the most powerful computers in the world and part of the Top500, is now the first HPC system to have a photonic coprocessor, meaning it transmits and processes information using light. The development represents a first for the industry.

The breakthrough was made during a pilot program that saw LightOn collaborate with GENCI and IDRIS. Igor Carron, LightOn’s CEO and co-founder said in a press release: “This pilot program integrating a new computing technology within one of the world’s Supercomputers would not have been possible without the particular commitment of visionary agencies such as GENCI and IDRIS/CNRS. Together with the emergence of Quantum Computing, this world premiere strengthens our view that the next step after exascale supercomputing will be about hybrid computing.”

The technology will now be offered to select users of the Jean Zay research community over the next few months who will use the device to undertake research on machine learning foundations, differential privacy, satellite imaging analysis, and natural language processing (NLP) tasks. LightOn’s technology has already been successfully used by a community of researchers since 2018.

Supercomputers have come a long way in the past few years. In June of 2018, it was announced that the United States Department of Energy had the world's latest and most powerful supercomputer called Summit.

Summit operated at 200 petaflops while at maximum capacity, achieving 200 quadrillion calculations each second. The numbers at the time outperformed China's Sunway TaihuLight's 93 petaflop capacity as well as the U.S.'s previous record-holder Titan.
And remember my friend, future events such as these will affect you in the future
wjfox
Site Admin
Posts: 13580
Joined: Sat May 15, 2021 6:09 pm
Location: Essex, UK

Re: Supercomputing News and Discussions

Post by wjfox »

University Loses Valuable Supercomputer Research After Backup Error Wipes 77 Terabytes of Data

Thursday 4:30PM

Kyoto University, a top research institute in Japan, recently lost a whole bunch of research after its supercomputer system accidentally wiped out a whopping 77 terabytes of data during what was supposed to be a routine backup procedure.

That malfunction, which occurred sometime between Dec. 14 and Dec. 16, erased approximately 34 million files belonging to 14 different research groups that had been using the school’s supercomputing system. The university operates Hewlett Packard Cray computing systems and a DataDirect ExaScaler storage system—the likes of which can be utilized by research teams for various purposes.

It’s unclear what kind of files were specifically deleted or what caused the actual malfunction, though the school has said that the work of at least four different groups will not be able to be restored.

BleepingComputer, which originally reported on this incident, helpfully points out that supercomputing research is, uh, not super cheap, either—costing somewhere in the neighborhood of hundreds of dollars per hour to operate.

https://gizmodo.com/university-loses-va ... 1848286983
weatheriscool
Posts: 24486
Joined: Sun May 16, 2021 6:16 pm

Re: Supercomputing News and Discussions

Post by weatheriscool »

Nanowire transistor with integrated memory to enable future supercomputers
https://techxplore.com/news/2022-01-nan ... uture.html
by Lund University

For many years, a bottleneck in technological development has been how to get processors and memories to work faster together. Now, researchers at Lund University in Sweden have presented a new solution integrating a memory cell with a processor, which enables much faster calculations, as they happen in the memory circuit itself.

In an article in Nature Electronics, the researchers present a new configuration, in which a memory cell is integrated with a vertical transistor selector, all at the nanoscale. This brings improvements in scalability, speed and energy efficiency compared with current mass storage solutions.

The fundamental issue is that anything requiring large amounts of data to be processed, such as AI and machine learning, requires speed and more capacity. For this to be successful, the memory and processor need to be as close to each other as possible. In addition, it must be possible to run the calculations in an energy-efficient manner, not least as current technology generates high temperatures with high loads.
wjfox
Site Admin
Posts: 13580
Joined: Sat May 15, 2021 6:09 pm
Location: Essex, UK

Re: Supercomputing News and Discussions

Post by wjfox »

Image
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

wjfox wrote: Thu Feb 17, 2022 4:06 pm graph
This is wrong, wjfox. A $1000 PC in 2020 (whether or not you take inflation into consideration) can have multiple teraflops, a teraflop being 10^12 flops. The RTX 3070, with a $499 MSRP, has 20.31 teraflops (20.31*10^12 flops), and the RTX 3070 Max-Q (laptop), with a TDP of 80 watts, has 13.21 teraflops (13.21*10^12). Also, TOP500 supercomputers are counted in fp64, meaning double precision, while consumer hardware is counted in fp32, meaning single precision, so you need to double the supercomputers' processing speed numbers to compare them to PCs. Japan's Fugaku is already around 1 exaflops in fp32. The American Aurora (Intel) will be around 4 exaflops in fp32 later this year. Kurzweil predicted $1000 laptops with 20 petaflops and 10 TB of RAM in 2023. That won't come true, but there is substantial progress nonetheless.

I recommend watching this first:

and then this:
to understand more about what Intel plans for the near future of supercomputers
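
A rough sketch of the unit bookkeeping behind the comparison above, in Python. The 2x fp64-to-fp32 factor is only the rule of thumb from the post, not a measured ratio, and Fugaku's 442 fp64 petaflops is the Linpack score quoted later in this thread. The function and variable names are made up for illustration.

```python
# Rough unit bookkeeping for the fp32-vs-fp64 comparison in the post.
# Assumption (rule of thumb from the post, not a measured figure):
# fp32 throughput is taken to be 2x the fp64 throughput.

TERA = 1e12
PETA = 1e15
EXA = 1e18

def fp64_to_fp32(flops_fp64, ratio=2.0):
    """Estimate an fp32 rate from an fp64 rate using a fixed ratio."""
    return flops_fp64 * ratio

rtx_3070_fp32 = 20.31 * TERA     # consumer GPU figure quoted in the post
fugaku_fp64 = 442 * PETA         # Linpack score quoted later in the thread
kurzweil_laptop = 20 * PETA      # the 2023 laptop prediction mentioned in the post

fugaku_fp32_est = fp64_to_fp32(fugaku_fp64)
shortfall = kurzweil_laptop / rtx_3070_fp32

print(f"Fugaku, estimated fp32: {fugaku_fp32_est / EXA:.2f} exaflops")
print(f"Predicted laptop vs. RTX 3070: about {shortfall:.0f}x apart")
```

Under these assumptions Fugaku lands a little below 1 exaflops in fp32, and a single RTX 3070 is roughly three orders of magnitude short of the predicted 20 petaflops.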
Nero
Posts: 53
Joined: Sun Aug 15, 2021 5:17 pm

Re: Supercomputing News and Discussions

Post by Nero »

Tadasuke wrote: Mon Mar 14, 2022 7:19 pm
wjfox wrote: Thu Feb 17, 2022 4:06 pm graph
This is wrong, wjfox. A $1000 PC in 2020 (whether or not you take inflation into consideration) can have multiple teraflops, a teraflop being 10^12 flops. The RTX 3070, with a $499 MSRP, has 20.31 teraflops (20.31*10^12 flops), and the RTX 3070 Max-Q (laptop), with a TDP of 80 watts, has 13.21 teraflops (13.21*10^12). Also, TOP500 supercomputers are counted in fp64, meaning double precision, while consumer hardware is counted in fp32, meaning single precision, so you need to double the supercomputers' processing speed numbers to compare them to PCs. Japan's Fugaku is already around 1 exaflops in fp32. The American Aurora (Intel) will be around 4 exaflops in fp32 later this year. Kurzweil predicted $1000 laptops with 20 petaflops and 10 TB of RAM in 2023. That won't come true, but there is substantial progress nonetheless.

I recommend watching this first:

and then this:
to understand more about what Intel plans for the near future of supercomputers
More recently, in 2021/22, we have the 3090 Ti, which is about 40 teraflops of compute, and later this year we will have the 4000 series, which will likely be 60-80 teraflops in classic compute. I agree the graph needs amending, because it is very likely we will have a petaflop desktop GPU before 2030.
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

Nero wrote: Tue Mar 15, 2022 4:56 pm More recently in 2021/22 we have the 3090 TI which is about 40 teraflops compute and later this year we will have the 4000 series which will likely be 60-80 teraflops in classic compute. The graph does need amending I agree because it is very likely we will have a petaflop desktop GPU before 2030.
I think that wjfox took data for $1000 PCs that was measured in megaflops and pasted it onto supercomputer fp64 flops data. A $1000 PC from the year 2000 was over 1 gigaflops if you count the graphics card. The top GPUs of 2000 were 8 gigaflops without overclocking (old GPUs and CPUs could overclock by 30%) and they were cheaper than today's. The RTX 4070 will probably be $599 and 80% faster than the RTX 3070. The RTX 3090 Ti and 4090 Ti aren't GPUs for a $1000 PC, just like the 3080 Ti isn't a GPU for a $1000 laptop. A 2020 desktop PC is about 5000x faster than a 2000 PC in terms of flops. The PS5 had about 1000x real performance improvement over the PS2, and a new smartwatch is faster than the PS2 and has more memory.
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

I have a graph showing Linpack performance (floating-point double precision) of the 500th supercomputer from the TOP500 list of fastest supercomputers (teraflops = 10^12 flops). I think this represents real progress better than the #1. As you can see, improvement is substantial - about 32x during a decade, meaning Linpack performance doubles every 2 years. It may continue like that, so in 2031 performance will be 1024x higher than in 2011.

Image
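
The doubling-time arithmetic in the post above can be checked with a few lines of Python. This is only a sketch that assumes steady exponential growth, which the post itself treats as a "may continue" scenario. The helper names are hypothetical.

```python
import math

def doubling_time(growth_factor, years):
    """Years per doubling, assuming steady exponential growth."""
    return years / math.log2(growth_factor)

def projected_factor(years, doubling_years):
    """Total growth factor after `years` at a fixed doubling period."""
    return 2 ** (years / doubling_years)

# Figures from the post: about 32x over one decade for the #500 system.
t_double = doubling_time(32, 10)
factor_2011_to_2031 = projected_factor(20, t_double)

print(f"Doubling time: {t_double:.1f} years")
print(f"2011 -> 2031 growth if the trend holds: {factor_2011_to_2031:.0f}x")
```

With 32x per decade the doubling time comes out at 2 years, and two decades at that pace give the 1024x figure mentioned in the post.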
wjfox
Site Admin
Posts: 13580
Joined: Sat May 15, 2021 6:09 pm
Location: Essex, UK

Re: Supercomputing News and Discussions

Post by wjfox »

Tadasuke wrote: Wed Mar 16, 2022 8:17 am
I think that wjfox took data for $1000 PCs that was measured in megaflops and pasted it onto supercomputer fp64 flops data.
Oh, I didn't create the graph. It's hot-linked from Wikipedia.
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

wjfox wrote: Wed Mar 16, 2022 2:49 pm
Tadasuke wrote: Wed Mar 16, 2022 8:17 am
I think that wjfox took data for $1000 PCs that was measured in megaflops and pasted it onto supercomputer fp64 flops data.
Oh, I didn't create the graph. It's hot-linked from Wikipedia.
Then Wikipedia is plainly wrong.

The same goes for journalists who compare a supercomputer full of graphics cards to a laptop: of course they compare total supercomputer performance to the performance of a low-end 2-core laptop CPU. A mid-range desktop PC with an RTX 3060 has 12.74 teraflops, and a mid-range laptop has at least half that (and you should add CPU performance). PCs are not that slow anymore.
Yuli Ban
Posts: 5194
Joined: Sun May 16, 2021 4:44 pm

Re: Supercomputing News and Discussions

Post by Yuli Ban »

It's here!!!

Age of Exascale: Wickedly Fast Frontier Supercomputer Ushers in the Next Era of Computing
Today, Oak Ridge National Laboratory’s Frontier supercomputer was crowned fastest on the planet in the semiannual Top500 list. Frontier more than doubled the speed of the last titleholder, Japan’s Fugaku supercomputer, and is the first to officially clock speeds over a quintillion calculations a second—a milestone computing has pursued for 14 years.

That’s a big number. So before we go on, it’s worth putting into more human terms.

Imagine giving all 7.9 billion people on the planet a pencil and a list of simple arithmetic or multiplication problems. Now, ask everyone to solve one problem per second for four and a half years. By marshaling the math skills of the Earth’s population for a half-decade, you’ve now solved over a quintillion problems.

Frontier can do the same work in a second, and keep it up indefinitely. A thousand years’ worth of arithmetic by everyone on Earth would take Frontier just a little under four minutes.

This blistering performance kicks off a new era known as exascale computing.
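
The pencil-and-paper analogy above can be sanity-checked with a short Python sketch. It assumes a 365.25-day year and uses the 1102 petaflops Linpack score quoted elsewhere in this thread as Frontier's rate; the article itself only says "over a quintillion calculations a second", so that exact rate is an assumption here.

```python
# Sanity check of the "everyone on Earth with a pencil" analogy.
# Assumptions: 365.25-day years, and Frontier running at the
# 1102-petaflops Linpack score quoted elsewhere in the thread.

SECONDS_PER_YEAR = 365.25 * 24 * 3600
POPULATION = 7.9e9               # people, one problem per second each
FRONTIER_FLOPS = 1.102e18        # assumed sustained rate, operations per second

# Problems solved by everyone in four and a half years.
problems_4_5_years = POPULATION * 4.5 * SECONDS_PER_YEAR

# Time Frontier needs for a thousand years of that collective effort.
problems_1000_years = POPULATION * 1000 * SECONDS_PER_YEAR
frontier_minutes = problems_1000_years / FRONTIER_FLOPS / 60

print(f"Problems in 4.5 years: {problems_4_5_years:.2e}")
print(f"Frontier time for 1000 years of work: {frontier_minutes:.2f} minutes")
```

Both claims hold up under these assumptions: the 4.5-year total is slightly above 10^18, and the thousand-year workload takes Frontier a bit under four minutes.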
And remember my friend, future events such as these will affect you in the future
raklian
Posts: 1981
Joined: Sun May 16, 2021 4:46 pm
Location: North Carolina

Re: Supercomputing News and Discussions

Post by raklian »

Yuli Ban wrote: Mon May 30, 2022 1:36 pm It's here!!!

Age of Exascale: Wickedly Fast Frontier Supercomputer Ushers in the Next Era of Computing
Today, Oak Ridge National Laboratory’s Frontier supercomputer was crowned fastest on the planet in the semiannual Top500 list. Frontier more than doubled the speed of the last titleholder, Japan’s Fugaku supercomputer, and is the first to officially clock speeds over a quintillion calculations a second—a milestone computing has pursued for 14 years.

That’s a big number. So before we go on, it’s worth putting into more human terms.

Imagine giving all 7.9 billion people on the planet a pencil and a list of simple arithmetic or multiplication problems. Now, ask everyone to solve one problem per second for four and a half years. By marshaling the math skills of the Earth’s population for a half-decade, you’ve now solved over a quintillion problems.

Frontier can do the same work in a second, and keep it up indefinitely. A thousand years’ worth of arithmetic by everyone on Earth would take Frontier just a little under four minutes.

This blistering performance kicks off a new era known as exascale computing.
I'm equally impressed by the fact it's the 2nd most efficient supercomputer in terms of energy usage. :o
To know is essentially the same as not knowing. The only thing that occurs is the rearrangement of atoms in your brain.
Nero
Posts: 53
Joined: Sun Aug 15, 2021 5:17 pm

Re: Supercomputing News and Discussions

Post by Nero »

Take everything that Frontier can do and then double it https://www.hpcwire.com/2022/05/10/auro ... ervations/

Aurora will be available and online in a few more weeks and be able to achieve more than twice the Frontier performance.
raklian
Posts: 1981
Joined: Sun May 16, 2021 4:46 pm
Location: North Carolina

Re: Supercomputing News and Discussions

Post by raklian »

Nero wrote: Mon May 30, 2022 3:58 pm Take everything that Frontier can do and then double it https://www.hpcwire.com/2022/05/10/auro ... ervations/

Aurora will be available and online in a few more weeks and be able to achieve more than twice the Frontier performance.
The United States is on a roll when it comes to supercomputers these days. :)
To know is essentially the same as not knowing. The only thing that occurs is the rearrangement of atoms in your brain.
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

The current #1, Frontier, is located at Oak Ridge National Laboratory (ORNL) in Tennessee and will be operated by the Department of Energy. It achieved 1102 fp64 petaflops in Linpack using AMD EPYC 7003 CPUs, Instinct MI250X GPUs and the Slingshot-11 interconnect. The previous #1 is now #2: Fugaku (A64FX 48C), located at the RIKEN Center for Computational Science in Kobe, Japan, with a benchmark score of 442 fp64 petaflops. That is about 190% more than #3, LUMI (EPYC Trento 7003 + Instinct MI250X) in Kajaani, Finland, with 151.9 fp64 petaflops. #4 is Summit (POWER9 22C + Volta GV100), also at Oak Ridge National Laboratory, at 148.8 fp64 petaflops.

The AMD Trento CPUs run at 2 GHz and have 8 core complexes of 8 cores each, connected by Infinity Fabric 3.0. There are 9472 nodes with 9472 CPUs and 37888 "Aldebaran" MI250X GPUs (4 per node). The nodes are linked to each other by a 200 GB/s Slingshot Ethernet interconnect. The MI250X accelerators run at 1.62 GHz and have 2 chiplets of 7040 cores each, with a memory bandwidth of 3277 GB/s, 128 GB of HBM2e and 45.57 fp64 teraflops.

Theoretical peak performance is 1686 petaflops and sustained Linpack performance is 1102 petaflops. AI performance is higher, at 6.88 exaflops. Power efficiency is 55.23 gigaflops per watt and power draw is around 20 megawatts. For comparison, Fugaku achieves 40.90 gigaflops per watt.
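
As a quick consistency check, the figures in this post can be tied together in Python. This is a sketch using only the numbers quoted above; the 4 GPUs per node is inferred from 37888 / 9472 rather than stated in the post, and the variable names are made up.

```python
# Cross-check of the Frontier figures quoted in this post.
# Assumption: 4 GPUs per node, inferred from 37888 GPUs / 9472 nodes.

NODES = 9472
GPUS_PER_NODE = 4
PEAK_PF = 1686            # theoretical peak, petaflops (fp64)
LINPACK_PF = 1102         # sustained Linpack, petaflops (fp64)
GFLOPS_PER_WATT = 55.23   # efficiency figure as quoted in the post

total_gpus = NODES * GPUS_PER_NODE

# Fraction of theoretical peak actually reached in Linpack.
linpack_efficiency = LINPACK_PF / PEAK_PF

# Power draw implied by the Linpack score and the quoted efficiency.
# 1 petaflops = 1e6 gigaflops; dividing by 1e6 again converts W to MW.
implied_power_mw = (LINPACK_PF * 1e6) / GFLOPS_PER_WATT / 1e6

print(f"Total GPUs: {total_gpus}")
print(f"Linpack efficiency: {linpack_efficiency:.1%}")
print(f"Implied power draw: {implied_power_mw:.2f} MW")
```

The implied draw of just under 20 MW matches the "around 20 megawatts" in the post, and Linpack reaches roughly two thirds of the theoretical peak.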

Image

Frontier is now supposed to deliver a 50-fold speedup in real-world workloads and applications compared to ORNL's 2012 Titan supercomputer. With time, the speedup might be higher.

I hope this supercomputer will help bring about commercial fusion power.
Last edited by Tadasuke on Tue May 31, 2022 6:48 am, edited 2 times in total.
raklian
Posts: 1981
Joined: Sun May 16, 2021 4:46 pm
Location: North Carolina

Re: Supercomputing News and Discussions

Post by raklian »

Tadasuke wrote: Mon May 30, 2022 10:34 pm

Frontier is now supposed to deliver a 50-fold speedup in real-world workloads and applications compared to ORNL's 2012 Titan supercomputer. With time, the speedup might be higher.
I think with Aurora, which comes later this year, the speedup will be 1000-fold.
To know is essentially the same as not knowing. The only thing that occurs is the rearrangement of atoms in your brain.
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

As you can see, #1 is rising exponentially, while #500 is stalling and flattening a bit. This is caused by stagnation in performance or memory per dollar. Number 500 will soon be 1/1000 of number 1. The fastest machines are important, but I think that combined performance is more important. I hope that #500 will resume the exponential growth of the past and that this is just a very temporary situation.

Image
caltrek
Posts: 9280
Joined: Mon May 17, 2021 1:17 pm

Re: Supercomputing News and Discussions

Post by caltrek »

Another interesting problem addressed by supercomputers is global climate change. I am not sure that the overall computational speed is the constraint in the sense that something less than #1 or #2 might very well be able to do the job. I cited the article below in the Climate Change News and Discussion thread earlier today. I am citing a different portion of that article here that discusses supercomputers. I am not worried about copyright constraints as this is all from a news release.

A Cloudless Future? The Mystery at the Heart of Climate Forecasts
May 31, 2022

Extract:
(News Release via EurekAlert) Whereas the most advanced U.S. global climate models are struggling to approach 4 kilometer global resolution, (Michael) Pritchard (professor of Earth System science at UC Irvine) estimates that models need a resolution of at least 100 meters to capture the fine-scale turbulent eddies that form shallow cloud systems, that is, 40 times more resolved in every direction. It could take until 2060, according to Moore's law, before the computing power is available to capture this level of detail.

Pritchard is working to fix this glaring gap by breaking the climate modeling problem into two parts: a coarse-grained, lower-resolution (100km) planetary model and many small patches with 100 to 200 meter resolution. The two simulations run independently and then exchange data every 30 minutes to make sure that neither simulation goes off-track nor becomes unrealistic.

His team reported the results of these efforts in the Journal of Advances in Modeling Earth Systems in April 2022 (https://agupubs.onlinelibrary.wiley.com ... 1MS0028410). The research is supported by grants from the National Science Foundation (NSF) and the Department of Energy (DOE).

This climate simulation method, called a 'Multiscale Modeling Framework (MMF),' has been around since 2000 and has long been an option within the Community Earth System Model (CESM), developed at the National Center for Atmospheric Research. The idea has lately been enjoying a renaissance at the Department of Energy, where researchers from the Energy Exascale Earth System Model (E3SM, https://e3sm.org/) have been pushing it to new computational frontiers as part of the Exascale Computing Project. Pritchard's co-author Walter Hannah from the Lawrence Livermore National Laboratory helps lead this effort.

"The model does an end-run around the hardest problem – whole planet modeling," Pritchard explained. "It has thousands of little micromodels that capture things like realistic shallow cloud formation that only emerge in very high resolution."
Read more here: https://www.eurekalert.org/news-releases/954415
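
A back-of-the-envelope sketch of why a 40x finer grid pushes the timeline out by decades. The news release does not say how the 2060 estimate was derived, so everything here is an assumption: compute cost is taken to scale as the refinement factor cubed (two horizontal dimensions plus a proportionally shorter time step) or to the fourth power (if the vertical is refined too), and available compute is taken to double every 2 years. The function name is hypothetical.

```python
import math

def years_until_feasible(refinement, scaling_exponent, doubling_years=2.0):
    """Years of steady compute doubling needed to absorb a finer grid.

    Assumes cost grows as refinement ** scaling_exponent and that
    available compute doubles every `doubling_years` years.
    """
    cost_factor = refinement ** scaling_exponent
    return doubling_years * math.log2(cost_factor)

refinement = 4000 / 100   # 4 km down to 100 m, i.e. 40x per direction

years_cubic = years_until_feasible(refinement, 3)     # 2 horizontal dims + time step
years_quartic = years_until_feasible(refinement, 4)   # also refining the vertical

print(f"Cost ~ refinement^3: about {years_cubic:.0f} years")
print(f"Cost ~ refinement^4: about {years_quartic:.0f} years")
```

Counted from 2022, these two assumptions give dates in the mid-2050s and mid-2060s respectively, which brackets the 2060 figure in the extract. That is only a plausibility check, not a reconstruction of the original estimate.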
Don't mourn, organize.

-Joe Hill
Tadasuke

Re: Supercomputing News and Discussions

Post by Tadasuke »

When I read about ORNL Titan becoming the fastest supercomputer in November 2012, I thought that the real question is not how to get from ~20 petaflops to ~2 exaflops, but how to get from ~2 exaflops to ~200 exaflops. I think that the hardest part is in front of us, not behind us, because the path to ANL Aurora is relatively straightforward. You increase the number of GPUs by 4x, increase power draw by 6x, increase GPU single-core fp64 performance by 5x (mainly by improving the fp64 to fp32 ratio) and also increase the number of cores by 5x. I don't think that another 100x will go as easily as that.

#1 When your fp64=fp32 you can't improve the ratio further (this is the case with the newest AMD Instinct GPUs)
#2 When your power draw is between 50 and 60 megawatts, you can't really scale it up anymore (unless you go to insane levels)
#3 When your frequency is around 2 GHz, you can't increase it much further without worsening power efficiency (that is why Epyc CPUs in Frontier are clocked so low)
#4 Silicon scaling (Moore's Law) is becoming increasingly costly below 12nm, and cost per transistor is not falling significantly anymore (that is why Aurora won't have even 8x Titan's storage or even 20x Titan's memory, even though the computer will be around 5x costlier)
#5 You start running into Amdahl's Law problems in increasing real-world performance and not only theoretical performance (for real performance look up HPCG benchmark)
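
To illustrate point #5, here is a minimal Python sketch of Amdahl's Law. The 99% parallel fraction is an arbitrary illustrative value, not a measurement of any real workload.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's Law: overall speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)

# Illustrative assumption: 99% of the runtime can be parallelized.
for workers in (10, 1_000, 100_000):
    speedup = amdahl_speedup(0.99, workers)
    print(f"{workers:>7} workers -> {speedup:6.2f}x speedup")
```

Even with 99% of the work parallelized, the speedup can never exceed 100x no matter how many workers are added, which is why theoretical flops and real-world performance drift apart at scale.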

So I am very curious how they will scale supercomputers after Aurora. I am not surprised they can manage 2 exaflops. I will be surprised if they can manage 200 exaflops. That will be something. I can see possibly 10 exaflops (all in fp64) by increasing total core count by 5x, but what after that? Things become very hazy and vague after 10 exaflops.
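
The scaling factors described in this post multiply out as follows. This is a sketch using the post's rounded figures (~20 petaflops for Titan, and the 4x, 5x and 5x factors); the dictionary keys are just labels for illustration.

```python
import math

# Rounded figures from the post, all in fp64 petaflops.
TITAN_PF = 20
TARGET_PF = 200_000          # ~200 exaflops

# The three performance factors the post lists for Titan -> Aurora.
factors = {
    "number of GPUs": 4,
    "fp64 performance per core": 5,
    "number of cores": 5,
}

combined = math.prod(factors.values())
aurora_pf = TITAN_PF * combined            # ~2 exaflops
after_5x_cores_pf = aurora_pf * 5          # the "possibly 10 exaflops" step
remaining_factor = TARGET_PF / after_5x_cores_pf

print(f"Combined Titan -> Aurora factor: {combined}x")
print(f"Aurora estimate: {aurora_pf / 1000:.0f} exaflops")
print(f"After 5x more cores: {after_5x_cores_pf / 1000:.0f} exaflops")
print(f"Still missing to reach 200 exaflops: {remaining_factor:.0f}x")
```

Under these rounded numbers the three factors account for the full 100x from Titan to Aurora, while a further 5x in core count still leaves a 20x gap to 200 exaflops with no obvious lever to close it.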

In my opinion (the same since 2012), scaling beyond 10 exaflops requires completely new ideas and completely new ways of increasing performance, because old ideas just won't work anymore. They simply won't bring what's needed. For example, true 3D architectures could be what ushers in a new era in performance scaling. I don't mean just stacking up some extra cache, but true 3D processors that are designed and produced in all three dimensions, possibly with memristors instead of transistors. That means more than just improving CPU performance per clock by 2x and increasing core count by 5x. I will keep watching how supercomputers evolve, but what I see is that #500 improves more slowly than #1, and this means that the total impact will not be as great as some had hoped.