GPU and CPU news and discussions

YouTube · Post by **wjfox** » Tue Nov 19, 2024 7:57 pm

Cerebras Now The Fastest LLM Inference Processor; It's Not Even Close

Nov 19, 2024, 01:07pm EST

“There is no supercomputer on earth, regardless of size, that can achieve this performance,” said Andrew Feldman, Co-Founder and CEO of the AI startup. As a result, scientists can now accomplish in a single day what it took two years of GPU-based supercomputer simulations to achieve.

When Cerebras announced its record-breaking performance on the 70 billion parameter Llama 3.1, it was quite a surprise; Cerebras had previously focussed on using its Wafer Scale Engine (WSE) on the more difficult training part of the AI workflow. The memory on a CS3 is fast on-chip SRAM instead of the larger (and 10x slower) High Bandwidth Memory used in data center GPUs. Consequently, the Cerebras CS3 provides 7,000x more memory bandwidth than the Nvidia H100, addressing Generative AI's fundamental technical challenge: memory bandwidth.

And the latest result is just stupendous. Look at the charts above for performance over time, and below to compare the competitive landscape for Llama 3.1-405B. The entire industry occupies the upper left quadrant of the chart, showing output speeds below the 100 tokens-per-second range for the Meta Llama 3.1-405B model. Cerebras produced some 970 tokens per second, all at roughly the same price as GPU and custom ASIC services like SambaNova: $6 per million input tokens and $12 per million output tokens.

Compared to the competition, using 1000 input tokens, Cerebras embarrassed GPUs which all produced less than 100 tokens per second.
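
To put the quoted figures in perspective, here is a rough sketch of what one request would cost and how long its output would take to generate. The prices ($6 and $12 per million tokens) and the speeds (about 970 versus under 100 tokens per second) come from the article; the 500-token output length and the function names are assumptions made for illustration, and the estimate ignores time to first token.

```python
INPUT_PRICE_PER_M = 6.0    # USD per million input tokens (from the article)
OUTPUT_PRICE_PER_M = 12.0  # USD per million output tokens (from the article)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the quoted per-token prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream the output at a steady rate (ignores time to first token)."""
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    input_tokens = 1000   # prompt size used in the article's comparison
    output_tokens = 500   # assumed output length, not from the article
    print(f"cost: ${request_cost(input_tokens, output_tokens):.4f}")
    print(f"at 970 tok/s: {generation_seconds(output_tokens, 970):.2f} s")
    print(f"at 100 tok/s: {generation_seconds(output_tokens, 100):.2f} s")
```

At the same price per token, the difference shows up entirely in waiting time: the 100 tokens-per-second case is an upper bound for the GPU services, so the real gap would be at least this large.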

https://www.forbes.com/sites/karlfreund ... ven-close/

YouTube · Post by **wjfox** » Tue Nov 26, 2024 9:46 am

Nvidia RTX 5090 and 5080 GPUs again rumored for CES – but suggestion that the RTX 5080 could be positioned as a ‘professional’ GPU might worry PC gamers

By Darren Allan
published 21 hours ago

It’s looking more and more likely that Nvidia’s RTX 5090 and 5080 graphics cards are indeed being revealed at CES 2025, as previous chatter has indicated – plus we’ve heard some more worrying hints on pricing, sadly.

Much of the latest next-gen Blackwell speculation over the weekend comes from Moore’s Law is Dead (MLID), and the info here should be regarded with some skepticism, naturally.

MLID’s latest YouTube video has word from two sources at Nvidia’s retail partners who both claim that the unveiling of the RTX 5090 and 5080 is set to happen at CES 2025.

The first source MLID has heard from notes that their firm is currently talking to Nvidia about initial shipment numbers of these graphics cards, and that the on-sale date of the RTX 5090 and 5080 is a matter of weeks after the reveal – so likely late January.

https://www.techradar.com/computing/gpu ... -pc-gamers

weatheriscool · Post by **weatheriscool** » Tue Nov 26, 2024 6:09 pm

Intel Core 200 'Arrow Lake-S' Mainstream Chip Benchmarked on Geekbench
Arrow Lake-S 10-Core CPUs could make an appearance at CES.
By Josh Gulick November 26, 2024

It was only last month that we checked out Intel’s Core Ultra 9 285K high-end (but complicated) CPU. Now, leaks seem to point to two more Ultras being announced at CES. If the benchmark results are accurate, it looks like both of the new Core Ultra processors will perform better than their predecessors. That’s what you’d expect, of course—but the first Core Ultra’s problems had us watching to see how the rest of the line would perform.

Geekbench results appear to be giving the world a look at a new Core Ultra processor: the Core Ultra 225F. The Geekbench info shows the CPU on an Acer Aspire TC-1860, a typical mainstream PC. The current Raptor Lake-based Aspire TC goes for $550-$900 and has Core i5 and i7 processors, along with 8-12GB of memory, onboard graphics (or an Nvidia GeForce GTX 1660 Super if you’re willing to splurge) and a 512GB SSD. It has 'home office' written all over it.

Let’s start by comparing the Core Ultra 225F’s specs with its predecessor. As VideoCardz points out, the predecessor for the Core Ultra 225F is probably the Core i5-14400 (which also happens to be in some existing Aspire TC models). Both CPUs have 10 cores, with six performance and four efficiency cores. Geekbench puts the Ultra 225F's base frequency at 3.30GHz and its turbo frequency at 4.89GHz, which is about as fast as the CPU will get because it's not meant for overclocking.

https://www.extremetech.com/computing/i ... -geekbench

weatheriscool · Post by **weatheriscool** » Sun Dec 01, 2024 12:57 am

Singularity alert: AIs are already designing their own chips
By Abhimanyu Ghoshal
November 30, 2024

Human intelligence and our collective wisdom are already becoming limiting factors in the rise of AI. Indeed, the only smart move at this point seems to be letting AIs design their own future hardware, right down to the microchip level.

AI has already been unlocking new technological advancements and progress in challenging fields lately – from accelerating renewable energy research by a matter of years to accurately detecting cancer. So when I attended the WCIT conference in Armenia last month, a talk on how AI assists chip design caught my attention.

Meeting the world's ever-increasing demand for computing capabilities is quite a task. The processors in your phone, your laptop, and your car are already infinitesimally tiny and quick enough to precisely execute billions of instructions per second. And yet, we want the latest gadgets to do more and run faster than last year's models, every year.

https://newatlas.com/ai-humanoids/3-min ... ngularity/

Tadasuke · Post by **Tadasuke** » Sat Dec 14, 2024 12:09 pm

Comparing Intel Arc B580 graphics cards against older models from AMD and Nvidia:

• Radeon 270X 4GB was 229 USD (310 USD in 2024 dollars) in October 2013

• Radeon RX 480 8GB was 229 USD (301 USD in 2024 dollars) in June 2016 : 2x faster than 270X

• GeForce GTX 1660S 6GB was 229 USD (283 USD in 2024 dollars) in October 2019 : 40% faster than RX 480

• Radeon 7600 8GB was 269 USD (279 USD in 2024 dollars) in May 2023 : 73% faster than GTX 1660S

• Intel Arc B580 12GB is 249 USD in December 2024 : 15-17% faster than Radeon 7600 in 1080p and 11% cheaper, which is almost 30% better value in 1080p

The higher the resolution, the more leverage 12 GB of VRAM has over 8 GB, and honestly speaking, 8 GB is probably not enough for 1440p gaming or content creation, let alone 2160p.

My prediction is that in July 2026, general consumer dGPU gaming performance/price will be 4x higher than July 2016 general gaming performance/price. Radeon RX 480 had the highest value of all cards in July 2016. Of course, I don't mean ray-tracing, as all cards before the 2080 Ti completely suck at it and even the 2080 Ti isn't that great (B580 is slightly faster in RT). Memory might be 2x larger.
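
The list above can be chained into a running performance-per-dollar comparison. This is only a sketch of the arithmetic: the prices (in 2024 dollars) and speedups are the ones quoted in the post, "2x faster" is read as twice the performance, and 16% is an assumed midpoint of the 15-17% range for the B580. The `value_table` helper is a made-up name.

```python
# (name, price in 2024 USD, performance multiplier over the previous card)
CARDS = [
    ("Radeon 270X", 310, 1.00),
    ("Radeon RX 480", 301, 2.00),
    ("GeForce GTX 1660S", 283, 1.40),
    ("Radeon 7600", 279, 1.73),
    ("Intel Arc B580", 249, 1.16),  # assumed midpoint of the 15-17% range
]


def value_table(cards, baseline):
    """Return {name: (relative performance, relative performance per dollar)}."""
    perf = 1.0
    raw = {}
    for name, price, step in cards:
        perf *= step  # compound the speedup over the previous card
        raw[name] = (perf, perf / price)
    base_perf, base_value = raw[baseline]
    return {name: (p / base_perf, v / base_value) for name, (p, v) in raw.items()}


if __name__ == "__main__":
    for name, (perf, value) in value_table(CARDS, "Radeon RX 480").items():
        print(f"{name:18s} perf {perf:5.2f}x  perf/price {value:5.2f}x")
```

Under these assumptions the B580 comes out at roughly 3.4x the RX 480's performance per dollar, so reaching 4x by July 2026 would need a further improvement of about 18%.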

weatheriscool · Post by **weatheriscool** » Wed Dec 18, 2024 5:21 pm

TSMC Offers a Peek at Its Cutting-Edge 2nm Process
Taiwan's largest chipmaker digs into the details at IEEE IEDM.
By Josh Gulick December 18, 2024

TSMC’s gate-all-around (GAA) technology is helping it deliver impressive results with its 2nm process. The chipmaker provided more details about its 2nm nanosheets at this week's IEEE International Electron Device Meeting (IEDM) in San Francisco. According to Wccftech, the N2 wafers could go for $25,000 to $30,000 apiece, compared with about $20,000 for a 3nm wafer.

One of the more impressive revelations about the 2nm process from the IEDM presentation relates to its efficiency. Compared with older processes, the new 2nm process results in up to 25 or possibly 30 percent reduced power consumption. From a performance standpoint, it appears that the 2nm process provides a 15 percent improvement.

https://www.extremetech.com/computing/t ... nm-process

weatheriscool · Post by **weatheriscool** » Mon Jan 06, 2025 10:50 pm

Intel Launches Core Ultra 200U CPUs
Arrow Lake for thin-and-light laptops.
By Josh Gulick January 6, 2025

Although CES 2025 officially runs from January 7th through the 10th this year, some companies are making announcements ahead of time. Intel took the opportunity to unveil its next line of laptop processors, dubbed Arrow Lake. The chipmaker focused on energy consumption and performance for these chips, leaving the AI muscle to the existing Core Ultra 200V (Lunar Lake) series.

Intel unveiled its Arrow Lake Core Ultra 200 desktop chips in October with an emphasis on performance-per-watt, so it’s unsurprising that Intel took a similar approach with its laptop CPUs. The Core Ultra 200U series is particularly efficiency-centric and aimed at thin-and-light laptops, while the 200H and 200HX lines target performance.

The Core Ultra 200U CPUs include the Core Ultra 7 265U and 255U, along with the Core Ultra 5 235U and 225U. All four chips have 12 cores: two P-cores, eight E-cores, and two LPE-cores. (That’s P=Performance, E=Efficient and LPE=Low-Power Efficient.) The Core Ultra 7 265U’s frequency goes up to 5.3GHz, while the Core Ultra 5 tops out at 4.9GHz. All four chips have Intel Graphics onboard and support up to 96GB of DDR5 RAM or 64GB of power-efficient LPDDR5/x.

https://www.extremetech.com/computing/i ... -200u-cpus

YouTube · Post by **wjfox** » Tue Jan 07, 2025 9:44 am

weatheriscool · Post by **weatheriscool** » Thu Jan 09, 2025 11:35 am

Overclocker Breaks Speed Record With G.Skill DDR5 Memory
Reaches a stable DDR5-10600 in Memtest on an Asus ROG Crosshair X870E APEX board.
By Josh Gulick January 8, 2025
https://www.extremetech.com/computing/o ... dr5-memory

Asus overclocker Safedisk gets all the best toys. This time around, Asus and G.Skill gave Safedisk an Asus ROG Crosshair X870E Apex motherboard, an AMD Ryzen 5 8500G CPU and a 48GB kit of DDR5 G.Skill memory. They also handed over a Ryzen 7 9800X3D CPU for a different overclocking test. Sounds like a good gig to us.

Last fall, the South Korean overclocker set a world record by hitting DDR5-12112 with G.Skill Trident Z5, an Asus ROG Maximus Z890 Apex and an Intel Core Ultra 9 285K CPU. For that overclocking sesh, Safedisk used liquid nitrogen, which is excellent for keeping computer hardware frosty and not feasible for long-term use. This time around, Safedisk opted for water chiller cooling, according to G.Skill. Based on the numbers this system threw down, this is not a standard liquid-cooled setup, either.

Tadasuke · Post by **Tadasuke** » Fri Jan 10, 2025 8:01 am

Here's the newest episode of Broken Silicon where Nvidia's exaggerations and machinations are being discussed:

Listen, if you are interested in truth, instead of just marketing.

High prices, low amounts of VRAM, high input latency, fake frames, fake AI, hype, deception, deceitfulness and poor image quality are what Ngreedia really offers gamers.

Perhaps you are among the 1 million people worldwide interested in buying the latest and greatest RTX 5090... Or among the 1 billion people interested in buying something decent that would cost $200-250, but it's either nonexistent or crap.

Performance and TDP changes between the top Maxwell and Blackwell cards, by my estimate:

2017 TITAN Xp (250W) is 70% faster than 2015 GTX TITAN X (250W)
2018 TITAN RTX (280W) is 40% faster than 2017 TITAN Xp (250W)
2020 RTX 3090 (350W) is 55% faster than 2018 TITAN RTX (280W)
2022 RTX 4090 (450W) is 65% faster than 2020 RTX 3090 (350W)
2025 RTX 5090 (575W) is 30% faster than 2022 RTX 4090 (450W)

after 10 years, the 5090 is 7.9x faster than the Titan X Maxwell, has 2.67x more VRAM and dissipates 2.3x as much heat (therefore needs 2.3x as much energy)

this means that the upcoming RTX 5090 is only 3.44x more energy efficient than the Titan X Maxwell after 10 years, at a 2x higher MSRP
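
The 7.9x and 3.44x figures follow from compounding the per-generation estimates listed above. A short sketch of that arithmetic, using only the speedups and TDPs from the post (the `summarize` helper and the 10-year annualization are illustrative choices):

```python
from math import prod

# (card, TDP in watts, estimated speedup over the previous flagship)
GENERATIONS = [
    ("TITAN Xp", 250, 1.70),
    ("TITAN RTX", 280, 1.40),
    ("RTX 3090", 350, 1.55),
    ("RTX 4090", 450, 1.65),
    ("RTX 5090", 575, 1.30),
]
BASE_TDP = 250  # GTX TITAN X (Maxwell), 2015
YEARS = 10      # 2015 to 2025


def summarize(generations, base_tdp, years):
    """Compound the speedups and derive power and efficiency ratios."""
    speedup = prod(step for _, _, step in generations)
    power_ratio = generations[-1][1] / base_tdp
    efficiency = speedup / power_ratio
    return {
        "speedup": speedup,
        "power_ratio": power_ratio,
        "efficiency": efficiency,
        "speedup_per_year": speedup ** (1 / years),
        "efficiency_per_year": efficiency ** (1 / years),
    }


if __name__ == "__main__":
    for key, value in summarize(GENERATIONS, BASE_TDP, YEARS).items():
        print(f"{key:20s} {value:.3f}")
```

Spread over ten years, this works out to roughly 23% more performance and roughly 13% better efficiency per year.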

the 5090 has only 20% more transistors than the 4090 ... and the MSRP is 25% higher than 2022 4090's $1600 MSRP

don't believe what Jensen Huang says or what Nvidia fanboys say

do games you play use DLSS? do they use ray-tracing? do programs you use have AI acceleration using GPUs? and if they do, are these features well implemented, useful?

YouTube · Post by **wjfox** » Fri Jan 10, 2025 8:29 pm

Tadasuke · Post by **Tadasuke** » Sat Jan 11, 2025 9:49 am

The Intel Arc B580, however, has very high CPU requirements, and its performance is inconsistent.

I could understand 8 GB on the lowest-end $100-120 cards in 2025. But 8 GB on $250-400 cards in 2025? That's completely ridiculous and unacceptable in my book. I will never consider that as ok. Everything needs memory, including ray-tracing, path-tracing and all current AI models. Let's say 8 GB for a semi-decent AI model, 4 GB for ray-tracing and 12 GB for all other real-time graphics in 2560x1440 (mid-range resolution) .... that's 24 GB!! And that wouldn't be high-end! That would be just a semi-decent mid-range card!

Tadasuke · Post by **Tadasuke** » Sat Jan 11, 2025 1:52 pm

The more quickly Nvidia's total market capitalization grows, the lesser their improvements for relatively affordable hardware.

Their stocks were cheap when they offered large yearly improvements. Now their stocks are super high when they offer very small yearly improvements.

France's GDP : 3 trillion USD
Nvidia's market cap : 3.3 trillion USD

This is unbelievable. I think they aren't worth this much.

Vakanai · Post by **Vakanai** » Sat Jan 11, 2025 4:35 pm

Tadasuke wrote: ↑Sat Jan 11, 2025 1:52 pm The more quickly Nvidia's total market capitalization grows, the lesser their improvements for relatively affordable hardware.

Their stocks were cheap when they offered large yearly improvements. Now their stocks are super high when they offer very small yearly improvements.

France's GDP : 3 trillion USD
Nvidia's market cap : 3.3 trillion USD

This is unbelievable. I think they aren't worth this much.

I would be interested in hearing which if any companies are doing right in your estimate, and what your current top picks are GPU/CPU-wise?

Tadasuke · Post by **Tadasuke** » Sat Jan 11, 2025 5:50 pm

Vakanai wrote: ↑Sat Jan 11, 2025 4:35 pm I would be interested in hearing which if any companies are doing right in your estimate, and what your current top picks are GPU/CPU-wise?

The overall industry, technology and adoption slowdown is immense.

GPUs went from 170 megaflops in 1994 to 1200 gigaflops in 2008. And a 1 teraflops GPU in 2008 (Radeon 4850) cost only $199. The RTX 5090 in 2025 will cost $1999, so 10x more for 100x more flops (and 83.3x more flops than the HD 4870).

Computer RAM went from 2 MB in 1993 to 2 GB in 2008 and GPU VRAM from 2 MB in 1994 to 1 GB in 2008. Between 2013 and 2024 computer RAM grew from 16 to 32 GB and GPU VRAM from 3-4 to 12-16 GB.

CPUs went from 60 MHz in 1993 to 3 GHz in 2002. In 2025 they are at 5 - 5.5 GHz.
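
One way to compare the two eras is to convert each pair of figures above into a compound annual growth rate. The sketch below uses only the numbers from this post; the 100 teraflops value for the RTX 5090 is an assumption implied by the "100x more flops" comparison with the 1 teraflops Radeon 4850, and `cagr` is a made-up helper name.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.5 means +50% per year)."""
    return (end / start) ** (1 / years) - 1


# (label, start value, end value, years elapsed), figures taken from the post
PERIODS = [
    ("GPU flops 1994-2008", 170e6, 1200e9, 14),
    ("GPU flops 2008-2025", 1.2e12, 100e12, 17),  # 100 TFLOPS is an assumption
    ("System RAM 1993-2008", 2, 2 * 1024, 15),    # MB
    ("System RAM 2013-2024", 16, 32, 11),         # GB
    ("CPU clock 1993-2002", 60e6, 3e9, 9),
    ("CPU clock 2002-2025", 3e9, 5.5e9, 23),
]

if __name__ == "__main__":
    for label, start, end, years in PERIODS:
        print(f"{label:22s} {cagr(start, end, years) * 100:6.1f}% per year")
```

With these inputs, GPU flops grow by nearly 90% per year in the first period and about 30% per year in the second, while CPU clocks drop from over 50% per year to under 3%.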

As for VR, there is very little substantial stuff happening. I've been using computers for many years and I can say for sure that the closer to today, the slower any substantial positive changes are.

I don't understand how some people can be so certain about Technological Singularity when the situation is as described above. Without a completely new paradigm that would work great for decades we are going to be stuck, just like we were stuck with spaceflight for 55 years still waiting for SpaceX Starship to be operational and fully reusable.

So I don't see any great products right now.

If I were to pick something, it would be Ryzen 9700X + GeForce 5070 Ti or Radeon 9070 XT.

Vakanai · Post by **Vakanai** » Mon Jan 13, 2025 8:18 am

Tadasuke wrote: ↑Sat Jan 11, 2025 5:50 pm The overall industry, technology and adoption slowdown is immense. [...]

And what do you think can or should be done to change this slowdown?

Tadasuke · Post by **Tadasuke** » Mon Jan 13, 2025 10:38 am

Vakanai wrote: ↑Mon Jan 13, 2025 8:18 am And what do you think can or should be done to change this slowdown?

Put many more incentives, goals, money, human talent and AIs into looking for alternative means of computation than electron-based and transistor-based 2D CMOS chips, instead of simply tolerating things as they are. Both in the private sector and in the not so private universities.

Do we really want 2000-watt AI accelerator cards costing 100 000 USD each? Or 1000-watt gaming GPUs costing 4000 USD each? It's better than nothing, but it's not the future of dreams (at least to me).

I suspect it is possible. I have read about this topic, but I cannot solve all the underlying hardships. It's quite complicated, but current CMOS chips would appear to be extremely complicated to people of the 1960s.

YouTube · Post by **wjfox** » Mon Jan 13, 2025 11:13 am

Tadasuke wrote: ↑Mon Jan 13, 2025 10:38 am [...] looking for alternative means of computation than electron-based and transistor-based 2D CMOS chips, instead of simply tolerating things as they are. [...]

Graphene/nanotubes are surely the next paradigm. And maybe photonics after that (2040s-50s?).

Tadasuke · Post by **Tadasuke** » Mon Jan 13, 2025 1:55 pm

This is relevant here: https://www.techradar.com/pro/i-am-thri ... 100-months

According to the article, Nvidia managed to cut the price per calculation per second by 25x in 100 months, which is very substantial if it really translates to actual useful AI performance.
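
For comparison with yearly figures, the 25x-in-100-months claim can be converted into an annual factor and a doubling time. This is just the arithmetic on the article's headline number, assuming steady exponential improvement; the function names are illustrative.

```python
from math import log


def annualized_factor(total_factor: float, months: float) -> float:
    """Equivalent improvement factor per 12 months, assuming steady growth."""
    return total_factor ** (12 / months)


def doubling_time_months(total_factor: float, months: float) -> float:
    """Months needed for a 2x improvement at the same steady rate."""
    return months * log(2) / log(total_factor)


if __name__ == "__main__":
    factor = annualized_factor(25, 100)
    print(f"annual factor: {factor:.3f}x ({(factor - 1) * 100:.0f}% per year)")
    print(f"doubling time: {doubling_time_months(25, 100):.1f} months")
```

That is about 47% per year, or a doubling roughly every 21.5 months.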

Nvidia Pascal DGX-1 was being sold for 129 000 USD in April 2016 (over 170 000 USD today). Nvidia Blackwell DIGITS AI computer goes for $3000 today.

Theoretical Flops Performance/Price improved by ~24x
Theoretical Flops Performance/Weight improved by 25.35x
Theoretical Flops Performance/Power Consumption improved by ~10x
And at the same very dense flops performance (fp4 vs fp16), the amount of memory went up by 2.4x and the amount of storage went up by 1.25x

I don't know, this looks very promising. Perhaps things aren't so bad after all. It's just that you usually don't really see these changes in everyday life. But it might just be dumb marketing. The original DGX had 8x P100 GPUs with combined real 42 FP64 teraflops and combined 5.7 TB/s VRAM bandwidth for scientific compute.

weatheriscool · Post by **weatheriscool** » Fri Jan 17, 2025 4:44 pm

Intel Updates to Boost Core Ultra 200 CPU Performance
Games might see up to a 26% improvement.
By Josh Gulick January 16, 2025

When Intel’s Core Ultra 9 285K arrived and hit the benchmarks this fall, reviewers noticed that the Arrow Lake flagship CPU put up mixed results. It handled certain compute workloads with ease but didn’t allow for the gaming performance we expected. It performed worse than Raptor Lake in some game benchmarks.

Intel investigated its Core Ultra 200 series processors and came back with a To-Fix list that covered a number of issues dragging on Arrow Lake CPU performance. The Arrow Lake-S Performance Update made clear that Intel was taking the issue seriously and that it had plans in place to improve performance for disappointed customers.

https://www.extremetech.com/computing/i ... erformance
