GPU and CPU news and discussions

Tadasuke
Posts: 549
Joined: Tue Aug 17, 2021 3:15 pm
Location: Europe

GPU prices, Blackwell FP64

Post by Tadasuke »

firestar464 wrote: Fri Mar 22, 2024 12:49 pm For the everyday person, yes. Meanwhile Big Tech is relishing the computing power
I used to think that $600 for a single GPU was a lot. 😅

And by the way, Blackwell has only ~33% more FP64 compute than Hopper (40 vs 30 teraflops). One full-spec B200 (208 billion transistors) draws 1200 watts at full load, while one B100 (104 billion transistors) draws up to 700 watts at full load.
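Just to show the arithmetic behind that uplift figure, a quick sketch using only the two TFLOPS values quoted above:

```python
# FP64 uplift from the figures quoted above: Hopper ~30 TFLOPS, Blackwell ~40 TFLOPS.
hopper_fp64_tflops = 30
blackwell_fp64_tflops = 40

uplift = blackwell_fp64_tflops / hopper_fp64_tflops - 1
print(f"FP64 uplift: {uplift:.0%}")  # -> 33%
```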
Global economy doubles in product every 15-20 years. Computer performance at a constant price doubles nowadays every 4 years on average. Livestock-as-food will globally stop being a thing by ~2050 (precision fermentation and more). Human stupidity, pride and depravity are the biggest problems of our world.
User avatar
wjfox
Site Admin
Posts: 8942
Joined: Sat May 15, 2021 6:09 pm
Location: London, UK
Contact:

Re: GPU prices, Blackwell FP64

Post by wjfox »

Tadasuke wrote: Fri Mar 22, 2024 4:31 pm
And by the way, Blackwell has only ~33% more FP64 compute than Hopper (40 vs 30 teraflops).
But for AI, lower precision (e.g. FP8, FP4) is better. Blackwell is a 5x improvement on this.
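As a toy illustration of why inference tolerates low precision (this is not real FP8, just an 8-bit-style fake quantisation of hypothetical layer weights), the output error stays roughly at the percent level, nowhere near needing FP64:

```python
import numpy as np

# Toy sketch (not real FP8): quantise a weight matrix to ~256 levels with a
# single max-based scale, then compare the layer output against full precision.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 4096))   # hypothetical layer weights
x = rng.standard_normal(4096)          # hypothetical activations

scale = np.abs(W).max() / 127          # symmetric 8-bit-style scaling
W_q = np.round(W / scale) * scale      # "fake-quantised" weights

y_exact = W @ x
y_quant = W_q @ x
rel_err = np.linalg.norm(y_exact - y_quant) / np.linalg.norm(y_exact)
print(f"relative output error with 8-bit-style weights: {rel_err:.3%}")
```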
Tadasuke
Posts: 549
Joined: Tue Aug 17, 2021 3:15 pm
Location: Europe

simulations, transistor count vs relative performance

Post by Tadasuke »

Yeah, that's true, I agree. It makes no sense to use FP64 in machine learning or in (so-called) inference. And various useful, practical simulations can also be done more efficiently with AI, instead of simulating directly at costly very high precision, which has been proven to work: protein folding or weather simulations, for example (like the weather simulation with 10x better precision that Jensen Huang showed on stage using Nvidia Blackwell).

The $1000 hexa-core i7-980X from 2010 (the i7-980 is essentially the same Gulftown processor) has 1.17 billion transistors (BT). The $2000 eighteen-core i9-7980XE from 2017 has 7 BT, and the $3000 twenty-eight-core Xeon W-3175X has 10 BT.

In the most multi-threaded use cases, which can fully utilise all the cores, threads and instructions of these processors, the i9-7980XE is 6x faster than the i7-980X and the Xeon W-3175X is 8.6x faster than the i7-980X.

Those increases in total performance can also be largely attributed to the increases in the total transistor count of those CPUs. Of course, it all had to be properly arranged and designed first by engineers in specialised computer programs, then masks had to be created, and then the actual CPUs had to be cleanly "etched" using appropriate chemicals and photolithography onto purified silicon wafers, in well-equipped and well-staffed foundries (semiconductor fabrication plants) around the world. :-)
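As a rough back-of-the-envelope check of that point, using only the transistor counts and speed-up factors quoted above (nothing official), performance per transistor stays almost flat across these three chips:

```python
# Back-of-the-envelope: how closely does multi-threaded performance track
# transistor count for the three CPUs mentioned above? All figures are the
# ones quoted in the post, normalised to the i7-980X.
chips = {
    # name: (transistors in billions, multi-threaded speed-up vs i7-980X)
    "i7-980X":      (1.17, 1.0),
    "i9-7980XE":    (7.0,  6.0),
    "Xeon W-3175X": (10.0, 8.6),
}

base_bt, _ = chips["i7-980X"]
for name, (bt, perf) in chips.items():
    transistor_ratio = bt / base_bt
    print(f"{name:13s} {transistor_ratio:5.2f}x transistors, "
          f"{perf:4.1f}x performance, "
          f"{perf / transistor_ratio:.2f}x performance per transistor")
```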
Global economy doubles in product every 15-20 years. Computer performance at a constant price doubles nowadays every 4 years on average. Livestock-as-food will globally stop being a thing by ~2050 (precision fermentation and more). Human stupidity, pride and depravity are the biggest problems of our world.
weatheriscool
Posts: 13583
Joined: Sun May 16, 2021 6:16 pm

Re: GPU and CPU news and discussions

Post by weatheriscool »

TSMC's 3nm Business to Surge in 2024 Thanks to Apple, AMD, and Intel
Nvidia is notably missing from this list of big clients.
By Josh Norem March 27, 2024
TSMC launched its 3nm process at the end of 2022 and then spent all of 2023 making 3nm chips only for Apple. That will change in 2024, as the company begins cranking out 3nm products for its other customers, including Intel and AMD. Considering the size of these orders and the companies involved, it means 2024 will be a year where we'll likely see TSMC 3nm products flooding the high-end tech world, with one company notably missing from the roster—Nvidia.

The Taiwan Economic Daily is reporting on TSMC's surging 3nm business, and there's an economic angle to it. Last year, TSMC's 3nm node was just a small, niche process for the company with one big customer (Apple), but in 2024, it's expected to do gangbusters business. It will reportedly be big enough to land it right behind its most popular node—5nm—in terms of overall revenue. The site says its 3nm process might be responsible for up to 20% of its revenue this year, making it the second most popular node.
https://www.extremetech.com/computing/t ... -and-intel
weatheriscool
Posts: 13583
Joined: Sun May 16, 2021 6:16 pm

Re: GPU and CPU news and discussions

Post by weatheriscool »

Intel Lunar Lake CPU Appears With Embedded LPDDR5X Memory
The compute tile is also made by TSMC, according to this (usually reliable) source.
By Josh Norem March 28, 2024
Intel's Lunar Lake architecture will be the low-power follow-up to its current Meteor Lake platform for mobile devices. Due later this year, this new architecture will target ultra-portable laptops, handhelds, tablets, and similar hardware. It's now been photographed for the first time, showing it's a whole different kettle of fish for Intel, with several huge first-time advancements for the company.

German site Igor's Lab received a photo of an engineering sample of Lunar Lake and some engineering slides, revealing it will be Intel's first CPU with DDR memory on the package, similar to Apple's M-series chips. You may recall that Intel released a photograph of Meteor Lake with on-package memory late last year and then deleted it. However, the CPU in the picture looked the same as the one we're looking at today, so maybe that was Lunar Lake all along. Regardless, Igor confirms this CPU features a 4+4 design, which was also previously rumored, so it'll have four Lion Cove P-cores and four Skymont E-cores.
https://www.extremetech.com/computing/i ... r5x-memory
User avatar
wjfox
Site Admin
Posts: 8942
Joined: Sat May 15, 2021 6:09 pm
Location: London, UK
Contact:

Re: GPU and CPU news and discussions

Post by wjfox »

Tadasuke
Posts: 549
Joined: Tue Aug 17, 2021 3:15 pm
Location: Europe

--------------------------------------------------------------

Post by Tadasuke »

I made a long post, but I changed my mind (again).

To be honest, I don't know the truth. It's beyond me to accurately ascertain or confirm; I'm not qualified for this.

Maybe this list can be helpful for comparing GPU performance: https://benchmarks.ul.com/compare/best-gpus
Last edited by Tadasuke on Sun Mar 31, 2024 7:37 am, edited 2 times in total.
Global economy doubles in product every 15-20 years. Computer performance at a constant price doubles nowadays every 4 years on average. Livestock-as-food will globally stop being a thing by ~2050 (precision fermentation and more). Human stupidity, pride and depravity are the biggest problems of our world.
firestar464
Posts: 823
Joined: Wed Oct 12, 2022 7:45 am

Re: GPU and CPU news and discussions

Post by firestar464 »

I can see the spaces just fine.
weatheriscool
Posts: 13583
Joined: Sun May 16, 2021 6:16 pm

Re: GPU and CPU news and discussions

Post by weatheriscool »

TSMC: One Trillion Transistor GPUs Will Be Possible in a Decade
3D-stacking chiplets will be the standard to increase compute power going forward.
By Josh Norem April 2, 2024
https://www.extremetech.com/computing/t ... n-a-decade
Things are about to get interesting in the semiconductor world, according to a lengthy missive penned by some executives at TSMC. The world's largest chipmaker is embarking on a journey that it says will result in a GPU with one trillion transistors—roughly 10 times the amount found in today's biggest chips—though it will take the Taiwanese company a decade to get there.

TSMC chairman Mark Liu and chief scientist H.-S. Philip Wong have penned an editorial for IEEE Spectrum outlining their thoughts on the future of semiconductors. The headline is how the company plans to create a one-trillion-transistor GPU. The article details how the AI boom is currently the main driver for increased compute power in chips, especially GPUs. It notes that as we reach the end of the traditional node-shrink era, the way forward is clear: chiplets and 3D stacking.
Tadasuke
Posts: 549
Joined: Tue Aug 17, 2021 3:15 pm
Location: Europe

performance scaling across cores and threads

Post by Tadasuke »

Real-world performance typically doesn't scale linearly with cores and threads.

From what I've seen, at constant frequency, with the same cache size per thread and the same architecture (such as Zen 4), performance increases in various advanced, professional, useful workloads stack up approximately like this:

▪️ 4 cores are 4x faster than 1
▪️ 12 cores are 10x faster than 1
▪️ 24 cores are 16x faster than 1
▪️ 32 cores are 20x faster than 1
▪️ 64 cores are 27x faster than 1
▪️ 96 cores are 32.5x faster than 1

Of course, it doesn't look like that with everything; it's just an example to illustrate how it can go. Higher clocks at lower core counts (compared with higher core counts) only make this non-linearity even more obvious. (A rough Amdahl's-law sketch that reproduces these numbers follows below.)

Will 320, 384 or 512 cores be useful, practical or economical outside of some edge cases? :? Theoretical FLOPS/OPS or Linpack FLOPS ≠ actual performance (in finance, science and engineering, for example). For AI there will be specialised hardware, as there increasingly already is.
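For what it's worth, the scaling figures above line up fairly well with a simple Amdahl's-law model in which roughly 98% of the work is parallelisable. A minimal Python sketch (the 0.98 fraction is just my own fit to those numbers, not a measured value):

```python
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), where p is the
# parallelisable fraction of the work. p = 0.98 is a rough fit to the
# scaling list above, not a measured value.
def amdahl_speedup(n_cores: int, parallel_fraction: float = 0.98) -> float:
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

quoted = {4: 4, 12: 10, 24: 16, 32: 20, 64: 27, 96: 32.5}
for cores, speedup in quoted.items():
    print(f"{cores:3d} cores: model {amdahl_speedup(cores):5.1f}x, quoted {speedup}x")

# Extrapolating the same model hints at why 320+ cores may add little
# outside embarrassingly parallel workloads:
for cores in (128, 256, 512):
    print(f"{cores:3d} cores: model {amdahl_speedup(cores):5.1f}x")
```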

I'm always curious what the actual performance increases really are. For example, I currently think the Radeon RX 7800 XT is about 12x faster than the Radeon HD 5870 from Q3 2009, for about the same price when adjusted for inflation. :-)

By the way, Zen 4 can be up to twice as fast as Zen 3 and Intel Sapphire Rapids (with the same number of cores and threads), largely thanks to its strong architecture, large L3 cache, fast FPU and high memory bandwidth (333 GB/s for Threadripper and 461 GB/s for Epyc). Emerald Rapids is probably about 30% faster than Sapphire Rapids, so it's still behind AMD.
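Those bandwidth numbers check out if they refer to 8-channel DDR5-5200 (Threadripper Pro) and 12-channel DDR5-4800 (Epyc Genoa); that pairing is my assumption, but the arithmetic is simple:

```python
# Peak DRAM bandwidth = channels * transfer rate (MT/s) * 8 bytes per 64-bit channel.
# The platform/speed pairing below is my assumption, not stated in the post.
def peak_bandwidth_gb_s(channels: int, mt_per_s: int) -> float:
    return channels * mt_per_s * 8 / 1000  # GB/s

print(f"8ch DDR5-5200 (Threadripper Pro): {peak_bandwidth_gb_s(8, 5200):.1f} GB/s")   # ~333 GB/s
print(f"12ch DDR5-4800 (Epyc Genoa):      {peak_bandwidth_gb_s(12, 4800):.1f} GB/s")  # ~461 GB/s
```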
Global economy doubles in product every 15-20 years. Computer performance at a constant price doubles nowadays every 4 years on average. Livestock-as-food will globally stop being a thing by ~2050 (precision fermentation and more). Human stupidity, pride and depravity are the biggest problems of our world.
Post Reply