Should you buy a 3990X for rendering in blender?
When the 3990X dropped in February of this year, it certainly landed with a thud. But is it a good purchase for rendering? Is that even a sane question when its clear its faster than anything else out there CPU-wise right now? Assuming you're thinking about purchasing either a new workstation, or perhaps even building a render farm. After crunching numbers, I have some insights I'd love to share to answer these questions :)
In this article
A comparison of the 3990X along side the fastest rendering CPU's out there, the focus is on raytracing, particularly in cycles, we'll be looking at;
How each CPU performs and where the 3990X sits in the pack
How to determine a measure of value to compare each CPU against
How to determine a measure of value to include power draw
Other factors that affect performance, such as memory
Implications for building either a workstation or render farm
Why read this article?
I've told you whats in here, but should you spend your precious time reading on? Why is this stuff important? Well, here's the 'why'.
When purchasing a new system for any reason, there are two ways you can go about it, either you get all emotional and excited about the latest thing that is being hyped, or you get analytical and look at data to try and forecast the consequences of each option. So, if you take anything away from this article, know that we're recommending you do your own research, be objective, not emotional, and aim to get understanding about what you value, and how to get it.
Please, don't buy on hype, and, remember, there will be lots of assumptions we've made in this analysis (watch out for them as you read!). So back yourself up with your own research, ask questions, the comments section is a good place for that, and above all, stay calm!!
How do we choose a CPU for rendering in Blender?
I mean really? If you're not already familiar with all the various models (or SKUs if you will) of each CPU on the market, you'd be forgiven for feeling overwhelmed. What with all the confusing names, Intel has its *something*-lake names for CPU architectures, for example coffee-, sky-, cascade, ice-, purple-, monkey-, dishwasher- lake (etc). AMD also has quite a few models with equally weird names, like zen, Threadripper, Ryzen and so on. Throw hype in there and you need to be a seasoned techy to know what to buy.
Its clear we need data!
Thanks to Blender's Open Data Project
We turned to Blender's Open Data Project, which hosts a very large number of benchmarks that have been contributed by the community using a consistent benchmark utility tool. With this free data set, anyone can get in on the analyst game! I said you should do your own research, now you can!
Methodology - What we did
Now, if you're one to skip ahead to the conclusions, let me say that reading the methodology usually saves you some embarrassment later on. I'll try to be as succinct as I can be (verbosity is my blessing and curse it seems).
We downloaded and put a daily snapshot of the Open Data ... errr.... data, into a mongo database. This, though somewhat technical, is possible for anyone with a computer and an internet connection thanks to free software like python and mongodb. If you'd like to know how to do this yourself, let us know in the comments, if we get enough interest, we'll gladly show you how :) Or, just check out https://opendata.blender.org/.
The project got an upgrade this year and has made it easier for you to run your own queries if you want to check out how different hardware performs :)
Next we limited the scope of results to just one benchmark, running on windows, running blender 2.81. This is to try and limit the variables that might affect our analysis, like the fact that between blender versions, performance can be different, different OS types have been known to have different render times for the same scenes, and of course different benchmarks run different render times. What we want is just the differences due to the hardware alone, and nothing else.
Then it was simply a case of sorting the results by render time and concentrating on the fastest. then we hit a problem, our benchmark had no data for the 3990X, so we turned to a couple of sources online that had thankfully run the same scene and used that data ( assumption 1 These sources really did run the same scene as the benchmark)!
Why this benchmark though? Well its fairly new, so it represents a more recent state of the art, its also a decent length that will get a CPU to a steady state temperature, so the numbers represent something stable, not transient. CPU's tend to protect themselves from overheating, so they will turbo at a greater speed until they reach a limit, then they adjust their speed until finding a steady state they can keep going at.
There are problems of course using data in this way, there are hardware variables outside the CPU that we can't account for, such as the cooling solution or the RAM used, the motherboard, the ambient air temperature, etc. (assumption number 2, these effects will be minor as most hardware aims to not be terrible and bottleneck the speed of your expensive CPU, cause otherwise it would get replaced with something better).
For price, we used google, with the location set to the United States, so all prices are in USD. We tried to find at least four different prices for each CPU and took the average price (assumption 3, the prices haven't changed between me looking them up and you reading this article!! Seriously, price changes make quite drastic changes to the rankings of these CPU's!)
So, the TLDR here is, we used one benchmark, one OS, one version of blender, and then we got a few prices for each CPU we considered. Job done.
But really, what do we care about when it comes to buying CPUs anyways?
This depends on who you are and what you have in the bank to some extent, there will be those who love the rights of bragging about having a 3990X system as their latest toy. Personally I hope they share, and FYI we're working on a system that will allow those lucky individuals to do so (shameless plug for our render farm software we're building).
However, if you are concerned with efficiency, either for building a large scale render farm, or building a budget workstation, you are probably concerned with how much performance you get for your $$$.
So without further delay, here's a chart that shows just that for the fastest rendering CPUs in blender at the current epoch.
If you're thinking *yuck, what a mess* to yourself, don't worry, you're not alone, just a bit of cognitive strain as your brain attempts to figure what's going on. A helpful hint to interpreting the chart is to realize what we all want is infinite, free rendering power :)
Yup, the holy grail of computing, zero cost, zero time renders!
Now in practice this doesn't exist, so instead let's try to find a real CPU that is as close as possible. This means finding the CPU that is closest to the origin point on the chart or zero render time, zero cost (this chart has been set with its x-axis at 100 to make the data fit better in the frame, so there is a bit of skew because of that).
You can just eyeball this, or you can get pedantic (like me) and measure. Eyeballing would have likely had you pick the 3970X or 3960X as closest, with maybe the i9-10980XE as third place? (your guess in the comments?)
That would have been a terrifically good guess, but, not spot on! And, spoiler alert, the 3990X is a long way away from the origin, so by the eyeballs method, its not going to win this contest.
Doing it via measurement takes a bit more effort, we need to find the distances marked '?' for each CPU. The more 'valuable' CPU's will have smaller values for '?'. Now measure this line for all the CPU's. This is trivial if you know some high school mathematics. The best CPU then has the smallest value for '?'. Job Done!
And if you'd done that as described, you'd actually be making a mistake. Lets do a quick sanity check. This method seems seductively intuitive and simple, but it implies something a bit silly if we pick the right example.
Lets say we have two CPU's, one is $10,000 and gives us a render time of 100 seconds, the other is $5000 and gives us a render time of 200 seconds. If we agree that fewer seconds of render time is valuable to us, as in, we'll pay for less of them, then it seems logical that we'd agree that either of these two CPU's are just as valuable, since their performance vs cost is identical on that basis. So lets calculate an imaginary line on the graph for these imaginary CPU's
C1 = $10000, P1 = 100 where C is cost, P is render time in seconds
C2 = $500, P2 = 200
We use the formula for finding the distance to each point from the origin which is given by Pythagoras' theorem;
V1 = SQRT( C1^2 + P1^2) so that's the square root of the sum of the squares of the cost and render time, see? Easy!
V2 = SQRT(C2^2 + P2^2)
Finally, keeping the appropriate number of significant digits (which is all of them, cause copy paste from calculator...)
V1 = 10000.49998750062
V2 = 5003.9984012787214318818398802492
Recall, these should be equal, because if we get half the render time, but double the cost, its the same value, but uh oh, there not... hmmmmm annoying.
Ok, so what just happened? As you might have already guessed, the calculated values are almost the same as my imaginary prices of my imaginary CPU's. So its clear that the cost of the CPU is actually dominating the 'length' of the line we draw from the origin, this is not obvious when looking at the charts above.
The problem here is the distortion of the units of the axes. Cost dominates the length of the line because it is roughly ten times larger than render time for this data. Also the scale of the axes throws us off, they both increase at different rates for the same space on screen, which skews our perception.
What we need is a way to calculate values that are on the same scale, without losing the relative positions of each data point. Yes, technical and mathy as it is, we need to do whats called normalisation. This is where we scale the data to fit within the same two numbers on both axes, in our case 1.0 and 10. Then we can calculate the distances to each CPU again.
Above, we plotted this new 'distance' as a bubble on each data point on the original chart. Each bubble's width is equal to 100000 divided by the lengths of the now normalised lines between the origin and the data points. The chart above also shows value adjusted for power draw (in a sort of salmon colour) which we go into a bit later.
With this, we can now rank the CPU's in order of most to least valuable. So I did that, here are the results, brace yourself....
i9-10980XE - 19382.67501 -> we have a winner, and its.... wait what?
Ryzen 9 3950X - 19230.19889
3960X - 19203.5557
3970X - 18823.9379
2970WX - 18800.06362
i9-7980XE - 17156.3122
2990WX - 16204.31325
Ryzen 9 3900X - 14009.58414
i9-10920X - 13862.98136
TR 2950X - 13223.54352
i9-9960X - 12934.00742
i9-7940X - 12160.54735
TR 1950X - 11594.33848
i9-9920X - 11243.5702
TR 2920X - 10180.31259
i9-10900X - 10109.42496
3990X - 9950.371902 -> here's where I really hope there's no errors in my spreadsheet :s
i9-9820X - 9827.179916
Shock Horror! 3990X is second last! And Intel is first?!
The problem with data is that it can really rattle your intuition sometimes. And yes, these results suggest that an i9-10980XE is ever so slightly more valuable than second place, a Ryzen 9 3950X, what sacrilege!
Take care as you brain directs your fingers to prepare for comment, and, this is not a suggestion that you shouldn't (or should) buy any of these CPU's, or that one is better than the other, my intention here is to show a method for assessing CPU's in an objective way. Objectivity often leads to conflict with our expectations, desires, biases, etc. Recommendations will come, but not yet, keep reading :)
Certainly the hype of AMD has reached the point where I was totally sold, if I had not sat down and done these numbers, I'd have saved up and bought a 3990X immediately, and I might still do that, because I can't help myself, 3990X is cool and I just want one (though it will take me ages, I'm buying in Australian dollars and paying for shipping to the second most remote continent on Earth).
Debate aside, lets recap, the value score is an objective measure of how attractive each CPU is based on its 'closeness' to an ideal CPU. It is a useful measure if price and performance are equally valuable, meaning that if price doubles, but our render time halves, we perceive no change in value. It is on this basis that the i9-10980XE wins out. And remember, we're only considering one benchmark here. There are many more benchmarks we could test, time permitting.
Also, what about power consumption? What about single core performance, gaming, other applications besides Blender? All these questions remain unanswered, the value score as it is, can only answer for speed and price of the CPU under the given benchmark. I've also ignored the idea that if you want more power than either the 3950X or Intel i9-10980XE can provide, then you have no option but to buy a less 'valuable', more outright powerful CPU. I am a bit selfish here really, I am thinking of how these CPU's would scale in a render farm, I write render farm software, so yeah. All I care about is bang for buck, and this analysis is pretty narrow, ah if only there were more time for benchmarks *sigh*.
Making the comparison better
Obviously, after you've bought the CPU, and all the other hardware (which we didn't analyze, but probably should have), the biggest ongoing cost will be power consumption of the hardware. So how will we account for that?
Well we could extend the idea of 'closeness' to an ideal CPU to include how much power it consumes for a given unit of performance. Our ideal CPU could now be described as costing zero dollars, rendering in zero seconds and drawing zero power. Quite an imaginary CPU by all standards!
This is pretty easy to do, we can just add a normalised power draw to our length calculation and we have a new measurement for distance from the origin, which is now three dimensional, meaning we're now looking at a surface maybe? We could keep adding dimensions like this, but we'll stop at three for today. Thinking in four dimensional space makes my brain hurt.
Slight caveat here before proceeding, the power draw data for CPU's is not readily available, what we used were a few articles from Anandtech that bench marked CPU's running on all cores at full turbo. Not all the power draw figures were from blender and certainly not all of them were running the same benchmark. So, these power adjusted scores hold if assumption 4 does, which is, power consumption of these CPUs will be fairly similar for a wide variety of workloads once the CPU is running flat out and has reached steady state (is as hot as it will get in other words).
Ryzen 9 3950X - 18795.85
i9-10980XE - 15198.44
i9-7980XE - 14162.65
TR 2950X - 12119.25
TR 1950X - 10828.36
2970WX - 10787.53
2990WX - 10220.98
TR 2920X - 9631.43
3960X - 9177.84
3970X - 8831.20
3990X - 7201.58
Once again, surprises, the best chip turns out to be a Ryzen 9 3950X of all things. Most likely thanks to its quite low power consumption... just a guess. Also the 3990X turns out to be the least valuable when it comes to our new measure of value which includes power draw (but remember the caveat from earlier!).
So, is it recommendation time yet? Does this analysis suggest the 3990X is a dud? By no means, these calculations highlight what I think is valuable, an equal balance of performance, price and power efficiency, those are all the things that the 3990X isn't about, it is a monster CPU, stupidly powerful, and you pay for the privilege of so much speed in a workstation CPU, both in $$ and Watts drawn from the power grid (hmmm that reminds me of something...).
Anway, the definition of value in this article makes more sense in the render farm setting where you might be thinking of buying 5, 10 or more nodes. This is where a balanced weighting of price, performance and efficiency seem to make most sense to me, you can adjust the weighting to suit your own preferences quite easily if you want. Here's a quick refresher
V = SQRT( C x Cost^2 + P x Price^2 + Po x Power^2)
Here you can choose the weights C, P and Po to be what you want, in my version they are all exactly 1, which implies I value each the same. Don't forget that each of the cost, price and power numbers has been normalised.
Whereas in a render farm I personally would aim for the most efficient combination, a single workstation, on the other hand, needs to be as powerful as possible. Especially if its your only computer. The price may hurt, but is somewhat tolerable if you can make a decent living.
Power consumption is tricky, it isn't paid for upfront like the cost of the system/s, and your power bill usually obscures how much power a computer consumes since the power consumption of all the other devices in your home or office are lumped together in your bill.
All this makes power consumption less tangible and less urgent than the needs of getting stuff rendered. Though check out this report about PBOC enabled causing a 3990X system to draw 1000 Watts!! That is something to watch out for.
Other Factors to consider in a CPU
So far I've only really considered the narrow case of rendering a single benchmark in blender. However, the techniques used can be applied to any application or bench marking software, so feel free to go nuts with your app of choice and get nerdy with the data.
I want to turn the attention back to ray tracing in a more general sense and consider some factors that we should perhaps pay a lot more attention to. For a CPU may have a giant heart, but if the vessels that feed it are weak, it will have a heart attack.
The first thing to consider would be cache size. Talking very specific about ray tracing for a moment, a recent-ish paper on the bottlenecks that make the most impact in rendering, highlight stalls due to memory access latency. You can read more about that here. The gist is that between 60% to 90% of the total energy spent rendering is a result of moving data around, not calculations. Large increases in render time are seen when data has to be moved between the CPU and main system memory. This is why we use cache memory, which helps with this.
Cache memory is small amounts of memory located on the CPU itself, which have much faster access times than main ram. You can think of it like this, you want to make a sandwich. First you go to the fridge in your kitchen, if all the ingredients are there (fridge = cache) then your sandwich will be ready quickly. If it turns out the cheese isn't in the fridge, you need to go to the store (main system memory), and that's going to take a lot longer.
Having a larger cache means more of the scene is kept close to the parts of the CPU that need them. Any 'miss' on the cache means a long trip to system memory and a delay in rendering until that part of the scene is fetched.
The 3990X has quite large caches compared to other desktop CPU's like the i9-10980XE. The thing about cache is that not every workload relies on cache to the same degree. You might not be aware how much difference its makes, but in rendering with ray tracing, according to the paper I mentioned, having enough space to fit an entire scene into CPU cache is a good thing.
Main system memory is also important in the same way cache is, if your scene doesn't fit into main memory, then be prepared for a very, very long render time. If the difference between cache and main memory is fridge vs store, then main memory to even an SSD is even bigger, meaning you are waiting an awful longtime for that sandwich now, you're are now going to the next town for cheese.
The takeaway here is that the benefit from having sufficient main memory or cache trips further out from the CPU is only apparent with a sufficiently large scene. You might only notice when you run out of RAM, but not before. So the benefits from enough RAM or cache are avoiding bottlenecks, not increases in speed like more threads or higher clock speeds bring.
Both the 3990X and Intel's i9-10980X have support for up to 256GB of RAM, which is quite a lot. To put this in perspective the biggest scene I have personally heard of was 80GB (if you have made a bigger scene than that please tell me in the comments!), and Tangent Animation's Next Gen movie had scenes about 25GB.
The supported speed of the memory has an effect too, at the time of writing the 3000 series chips from AMD state support for 3200 RAM, where as the Intel chips only support 2933. Having faster RAM helps for the initial 'sync' stage of rendering where a lot of data is being transferred to the CPU (though RAM latency will likely also be important for mitigating the effects of cache 'misses', more on this in another article though).
There are more things to consider than just whats been touched on here, I'd highly recommend checking out some other reviews like those from Anandtech.com if you're keen to really get stuck in research your next build.
If you were building a single workstation then...
I'd recommend choosing wisely right now, AMD is about to release Ryzen 3, Intel has also yet to answer with something to counter AMDs product line for desktop. The 3990X certainly has had some very positive reviews and has an obscene amount of power, the darn thing can run Crysis for goodness sake and its a CPU!
However, some reviews have labelled it a dead end due to its lack of support for more RAM. Though average scenes today don't reach 256GB in size, that may not stay that way. Also we've mostly been considering rendering of animations, there are likely to be other workloads that require large amounts of main memory. I've worked in labs where there are researchers that deal with images that are a few terabytes in size!
Honestly though, for my own system, if I were building, I'd be looking at either seeing what Ryzen 3 brings, or I'd commit to a 3970X. Though it only has half the cores and cache, it supports double the main memory at 512GB. Yeah you read that right, the lower spec'd chip supports double the RAM. You can learn a bit more about the reactions of a popular youtuber's to this here.
The 3970X also just so happens to be closer to the origin of our performance vs price chart than most other CPU's where the 3990X is a way off. Its also got the second fastest render time and a higher base and boost clock speed, which will help for parts of rendering that can only use one core, so that core needs to be quick.
If you were building a render farm then...
Oh man things get real in this case :D. So, render farms scale things, purchase cost, power draw, render power, everything. We're only going to scratch the surface in this article, but here's a quick couple of graphs and some food for thought.
Here we took the top ten CPU's in terms of render time and then simulated the costs of creating render farms of various sizes. Each dot along those curves (moving from the far right to the left and upwards) represents another node added to a hypothetical render farm, using a particular CPU. The max number of nodes we simulated was 20. So we can assess render farms from 2-20 nodes using this data set.
Once again we employ this idea, a render farm that is closer to the origin for any particular number of nodes, is better than one further away. On this basis, the 3950X and 3970X are the best options, their curves are closer to the chart's origin for their whole lengths! The 3990X is in the middle of the pack, fighting with the 3960X and i9-10980XE.
However, as I mentioned before, we've only considering the purchase price and render time. The main operational costs of a render farm are power. So lets consider the best chips from a power perspective, and by the way, I didn't see this one coming...
So initially I thought I had chosen what were the two best CPU's according to their value scores adjusted for power. I also threw in the 3990X for good measure since it looked the most power hungry of the bunch. Certainly for the value equation it doesn't fare well, likely due to its large cost.
But this chart clearly shows it thumps the competition on power draw (again huge caveat on that due to the link I mentioned earlier where PBOC can send this CPU nuts and draw near 1000 Watts, and the fact we're using power draw from full loads on other benchmarks). Nothing is coming close until the farm gets quite large. For a small farm though, say three to six nodes, the 3990X comes out ahead on power for the same level of performance. The chart also is not accounting for the extra power needed for the motherboard, RAM, drives, GPUs (if any), fans etc. Adding that would accentuate the lead the 3990X has.
This is important since the power bill is the biggest ongoing cost (other than maybe software licenses if you pay for your software) and the 3990X clearly has an advantage here.
This chart also begs the question, is there a best 'value' point for a render farm? We've considered this for a single CPU, but from the chart, you can see that there is a point on each curve where the distance to the origin will be smallest. Past this point and you are paying more and more for less and less improvements in render time.
This all assumes you are just rendering one project at a time though as we've implicitly defined performance as render time. We could and will (next article) redefine performance as the amount of work done in a unit of time. This allows us to consider doing multiple jobs at once, or multiple frames at once.
The Open Data project sadly can't help with this and hence the definition of performance in this article being render time for one frame. All the data we have is for one frame being rendered. So the low hanging fruit was to go with this, but real render farms do more than just single frame renders :D.
Conclusions and final thoughts
The 3990X is crazy, so much power, but so weird to be limited in its RAM capacity. The two just don't seem to line up. It has clearly got punch for rendering though, so RAM issues aside (and it seems you'd have to go to extreme's to cross 256GB for a scene anyways) its a beast of a chip.
Is it worth USD 4000? Our calculations would suggest not, but then nothing else out there is anything like it. Plus it takes $20000 worth of Intel Xeon muscle to match it (not quite match it). So though it may not be the best value by these calculations, its so much better value than anything else, maybe that makes it attractive?
Beware though, this kind of relative pricing is what can drive bubbles in all sorts of markets, housing, stock, bitcoin. Its not a good practice in my book. So, though its surely an impressive piece of technology, the price point for what it does and the RAM issue have dampened my enthusiasm and I see a far more sensible choice in the 3970X, for both render farm nodes or workstations, that is my bet for now.
All that remains is to fight it out in the comments :D
FYI before you go, you might like to know that we make render farm software, we're aiming to make an open source version of our new addon we're building for blender. You can get the free version right here on our site and also contribute to our project via our support campaign. We're passionate about giving the community the tools and know how to setup a really cool pipeline for rendering stuff, so stick around if you're doing stuff in 3d and check these links out maybe :)
Download the latest free version -> https://www.crowd-render.com/download
Checkout our crowdfunding page -> https://www.crowd-render.com/crowdfunding
Thanks for reading and hope you'll stick around for the next article :)
I have to say I'm shocked that you managed to do all this complicated statistics and calculation, while completely forgetting the obvious choice and much more render power for a fraction of the price tag: Buy a Nvidia GPU and use CUDA. Buying CPUs to render anything meaningful will bankrupt you, CPUs that can hold a candle to for example a RTX 2080 are super super super expensive, usually several times the price. I hope this is not news to you, I can't imagine that you are that clueless, knowing so much about CPUs but nothing about GPUs, the real number crunching workhorses of our day. It is true that GPUs have the RAM limitation problem: Nvidia does not allow…
This reminds me of that episode of South Park where Randy searches for that perfect statistical calculus which puts him above average :)