Historical analysis of NVIDIA GPUs relative performance, core count and die sizes across product classes and generations

Hi! With how divisive the pricing and value is for the RTX 40 series (Ada), I've collected and organized data (from TechPowerUp) for the previous 5 generations, that is, starting from Maxwell 2.0 (GTX 9xx) up until Ada (RTX 4xxx), and would like to share some findings and trivia about why I feel this current generation delivers bad value overall. NOTE: I'm talking about gaming performance on these conclusions and analysis, not productivity or AI workloads.

In this generation we got some high highs and stupid low lows. We had technically good products, but at high prices (talking about RTX 4090), while others, well... let's just say not so good products for gaming like the 4060 Ti 16Gb.

I wanted to quantify how much of a good or bad value we get this generation compared to what we had the previous generations. This was also fueled by the downright shameful attempt to release a 12Gb 4080 which turned into the 4070 Ti, and I'll show you WHY I call this "unlaunch" shameful.

Methodology

I've scraped the TechPowerUp GPU database for some general information for all mainstream gaming GPUs from Maxwell 2.0 up until Ada. Stuff like release dates, memory, MSRP, core count, relative performance and other data.

The idea is to compare each class of GPU on a given generation with the "top tier" die available for that generation. For instance, the regular 3080 GPU is built using the GA102 die, and while the 3080 has 8704 CUDA cores, the GA102 die, when fully enabled, has 10752 cores and is the best die available for Ampere for gaming. This means that the regular 3080 is, of course, cut down, offering 8704/10752 = 80% of the total possible cores for that generation.

With that information, we can get an idea of how much value (as in, CUDA cores) we as consumers get relative to what is POSSIBLE on that generation. We can see what we previously got in past generations and compare it with the current generation. As we'll see further into this post, there is some weird shenanigans going on with Ada. This analysis totally DISCONSIDERS architectural gains, node size complexities, even video memory or other improvements. It is purely a metric of how much of a fully enabled die we are getting for the xx50, xx60, xx70, xx80 and xx90 class GPUs, again, comparing the number of cores we get versus what is possible on a given generation.

In this post, when talking about "cut down ratio" or similar terms, think of 50% being a card having 50% of the CUDA cores of the most advanced, top tier die available that generation. However I also mention a metric called RP, or relative performance. A RP of 50% means that that card performs half as well as the top tier card (source is TechPowerUp's relative performance database). This denomination is needed because again, the number of CUDA cores does not relate 1:1 with performance. For instance Some cards have 33% of the cores but perform at 45+% compared to their top tier counterpart.

The full picture

In the following image I've plotted the relevant data for this analysis. The X-axis divides each GPU generation, starting with Maxwell 2.0 up until Ada. The Y-axis shows how many cores the represented GPU has compared to the "top tier" die for that generation. For instance, in Pascal (GTX 10 series), the TITAN Xp is the fully enabled top die, the GP102, with 3840 CUDA cores. The 1060 6Gb, built on GP106, has 1280 CUDA cores, which is exactly 33.3% as many cores as the TITAN Xp.

I've also included, below the card name and die percentage compared to top die, other relevant information such as the relative performance (RP) each card has compared to the top tier card, actual number of cores and MSRP at launch. This allows us to see that even though the 1060 6Gb only has 33.3% of the cores of the TITAN Xp, it performs 46% as well as it (noted on the chart as RP: 46%), thus, CUDA core count is not perfectly correlated with actual performance (as we all know there are other factors at play like clock speed, memory, heat, etc.).