Re: Supercomputing News and Discussions
Posted: Thu Nov 02, 2023 3:55 pm
A community of futurology enthusiasts
https://www.futuretimeline.net/forum/
https://www.futuretimeline.net/forum/viewtopic.php?f=19&t=1384
wjfox wrote: ↑Thu Nov 09, 2023 9:50 pm
Eos represents an enormous amount of compute. It leverages 10,752 GPUs strung together using NVIDIA's InfiniBand networking (moving a petabyte of data a second) and 860 terabytes of high-bandwidth memory (36 PB/sec aggregate bandwidth and 1.1 PB/sec interconnect) to deliver 40 exaflops of AI processing power.

It's just a number. I would like to see it actually bring results that would improve what matters.
The modern world's supercomputers operate out in the open, as countries brag about their performance and enter them into standardized benchmarking competitions to prove their engineering chops. China doesn't play this game, however. Its entire supercomputer program is mostly kept secret because it's not supposed to have access to advanced technology. Despite its desire to keep its cards close to its vest, it recently announced a new supercomputer that could break the exascale barrier—all while using homegrown CPUs, which shouldn't be possible under the sanctions levied against it.
The new supercomputer is named Tianhe Xingyi, state news agency Xinhua reports (via Reuters). The release is unsurprisingly vague, since China doesn't release numbers or hard info. It states only that the machine was built with "domestic advanced computing architecture, high-performance multi-core processors, high-speed interconnection networks, and large-scale storage." The release says that compared with Tianhe-2, China has doubled many aspects of its performance. That's unsurprising, as Tianhe-2 first debuted on the Top500 list in 2013 and was the world's fastest supercomputer for several years after that, only being displaced in 2016 by TaihuLight, another Chinese system.
The proverbial paint is barely dry on AMD's new Instinct MI300 chips, and yet the company has already said they're being used for a new supercomputer in Germany. AMD has announced "Exascale Supercomputing Is Coming to Stuttgart" and will build two computers: one that will upgrade an existing system to 39 PFLOPS, and a future exascale machine similar to its current Frontier supercomputer. The two machines will be known as Hunter and Herder, with the former coming online in 2025 and the latter poised for a 2027 launch.
The two new supercomputers result from a new contract signed by the University of Stuttgart and Hewlett Packard Enterprise, under which the organizations will upgrade the existing Hawk supercomputer and later install a second system at HLRS, a research institute and supercomputing center in Stuttgart. The big news here is that this is the first supercomputer contract for AMD's all-new MI300A chip, which combines a CPU, GPU, and high-bandwidth memory in the same package. These data center "APUs" will power the upgrade of Hawk, the center's current flagship supercomputer, which runs at 26 PFLOPS and is nothing to sneeze at. It debuted at #16 on the Top500 list in 2020, so it's neither old nor slow. That said, we certainly understand the itch to upgrade a PC, so there's no shade coming from this direction.
https://www.nextbigfuture.com/2024/02/c ... -chip.html

There are reports that China has a new superchip, the MT-3000 processor, designed by the National University of Defense Technology (NUDT). The MT-3000 has general-purpose CPU cores, control cores, and matrix accelerator cores, arranged in a multi-zone structure that packs 16 general-purpose CPU cores with 96 control cores and 1,536 accelerator cores.
The MT-3000 processor reportedly achieves 11.6 FP64 TFLOPS of peak performance and demonstrates a power efficiency of 45.4 GigaFLOPS/Watt at an operational frequency of 1.20 GHz.
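As a quick sanity check on those figures (the division is my own arithmetic, not from the report), peak throughput divided by efficiency gives the implied power draw of a single MT-3000:

```python
# Implied per-chip power draw from the reported MT-3000 figures.
peak_flops = 11.6e12      # 11.6 TFLOPS (FP64 peak, as reported)
flops_per_watt = 45.4e9   # 45.4 GigaFLOPS/Watt (as reported)

implied_watts = peak_flops / flops_per_watt
print(f"Implied power draw: {implied_watts:.0f} W")  # roughly 256 W
```

That lands in the same ballpark as current Western data center accelerators, which is what makes the efficiency claim notable if accurate.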
Tianhe-3, a new supercomputer, is reported to reach 1.57 ExaFLOPS on LINPACK benchmarks, with the MT-3000 at its core. The top US supercomputer is Frontier, with 1.102 ExaFLOPS of performance.
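Taking both reported LINPACK figures at face value (the ratio is my own arithmetic, not from the article), that would put Tianhe-3 comfortably ahead of Frontier:

```python
tianhe3_eflops = 1.57     # reported LINPACK result for Tianhe-3
frontier_eflops = 1.102   # Frontier's LINPACK result

ratio = tianhe3_eflops / frontier_eflops
print(f"Tianhe-3 vs. Frontier: {ratio:.2f}x")  # about 1.42x
```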
https://www.extremetech.com/computing/n ... i-training
In November of last year, Nvidia raised a few eyebrows by suddenly appearing in the 9th spot on the Top500 list of the world's fastest supercomputers with a system named Eos. Named after the Greek goddess who opens the gates of dawn each day, Eos is Nvidia's enterprise-scale system for AI training, and the company has now released a video showing it off to the public for the first time.
Eos is essentially Nvidia's very own supercomputer that its employees get to use every day for things like AI training and playing Crysis on their lunch breaks. It comprises a cluster of 576 DGX H100 servers, and since each one features eight H100 GPUs, there's a total of 4,608 H100s linked together with Nvidia's Quantum-2 InfiniBand technology. It's basically an extreme showcase of the DGX SuperPod design for enterprise-scale AI training, which Nvidia hopes to sell to companies with huge budgets and massive AI models to train.
Nvidia describes Eos as a system that can power an "AI factory," as it's a very large-scale SuperPod DGX H100 system. The company says it is what allows it to develop its own AI breakthroughs and shows the power of Nvidia's latest technology when scaled up to ludicrous size.
The DGX H100 servers use Intel Xeon Platinum 8480C CPUs, which feature 56 cores and 112 threads. Combined with the 4,608 H100 GPUs, it offers 121 PetaFLOPS of Linpack performance, which was only good enough for 9th on the Top500, but that's more of a generic metric. When measured purely for AI training, it's easily one of the fastest systems in the world currently.
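A back-of-the-envelope check on the totals above (the server count, GPUs per server, and Linpack score come from the article; the per-GPU figure is my own division):

```python
servers = 576            # DGX H100 servers in Eos
gpus_per_server = 8      # H100 GPUs per DGX H100 server
linpack_pflops = 121     # measured Linpack result

total_gpus = servers * gpus_per_server
per_gpu_tflops = linpack_pflops * 1000 / total_gpus
print(total_gpus)                                         # 4608
print(f"{per_gpu_tflops:.1f} TFLOPS per GPU on Linpack")  # about 26.3
```

The numbers are self-consistent, and they also illustrate why Linpack undersells an AI machine: it measures FP64 throughput, while AI training runs at much lower precisions where each H100 delivers far more FLOPS.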
Nvidia is fresh off the unveiling of its new Blackwell AI superchip, and it's wasting no time making plans to roll that hardware out. Nvidia and Amazon partnered up last year to build what was to be one of the fastest supercomputers in the world, known as Project Ceiba. Now, the companies have said Project Ceiba will get a Blackwell upgrade to make it up to six times faster than originally envisioned.
The version of Project Ceiba discussed last year was still a beast, featuring more than 16,000 H100 Hopper AI accelerators. Nvidia predicted the machine would offer 65 exaflops of AI processing power when complete. For comparison, the current leading supercomputer on the Top500 is the US Department of Energy's Frontier machine, which can hit 1.1 exaFLOPS on the FP64 Linpack benchmark with thousands of AMD Epyc CPUs and Instinct GPUs.
https://www.extremetech.com/computing/r ... orm-report

Russia has always lagged behind the rest of the industrial world when it comes to information technology, and now sanctions from its war on Ukraine have held it back even further. Despite this situation, the country is reportedly in the early stages of deploying a new supercomputing and cloud platform that will feature up to 128 CPU cores per server cluster. It's unknown where these computer parts will be made, however, as Russia isn't known for running advanced silicon fabs.
The details about Russia's plans come from CNews, a Russian news site. The site notes that a state-owned company named Roselectronics has been developing this new computing platform, called Basis, using "domestic technologies." The platform is scalable and combines hardware with software. Each Basis module includes three servers with up to 128 CPU cores and 2TB of memory, though the CPU architecture isn't disclosed, and it's unknown whether it will feature a monolithic or chiplet design.