How Distributed Rendering Helps You
Image courtesy of Antonio Arroyo and Oliver Villar
OK, so the title posits something that might seem really obvious if you know even a little about distributed rendering. That's why this article is not just going to explain what it is, but also show you why it's really important for you as an artist to understand how to use it, and how much a render farm matters to your project.
An Explanation - Distributed Rendering
Distributed rendering is where you render either a single frame or multiple frames using several computers instead of just one. We're going to go into more detail about why this is faster but for now we'll spell out the basics.
There are two main methods for distributing rendering. The first, and slightly easier, method is frame splitting. This is where an animation is rendered by distributing its frames over several computers. Each computer renders some fraction of all the frames in the animation. We've made a simple example to follow along with, where an animation of nine frames is to be rendered on three computers.
With frame splitting, a "master" node usually coordinates the render. Sometimes this node (a technical name for a computer) will render some of the frames; sometimes it acts as a manager and controls the render but does not render any frames itself. In our example, all the computers render.
The "slave" or "render" nodes are each given part of the animation to render; in this example, three frames each.
If all three computers are the same in terms of processing power, then they will complete the render in about a third of the time that just one of them would take. This is why distributed rendering is so important to artists: you can scale down your render time by adding more computers.
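To make the nine-frame example concrete, here is a minimal Python sketch of how frames could be dealt out to nodes. The function name split_frames and the contiguous-chunk strategy are our own assumptions for illustration; a real render manager may schedule frames differently (for example, handing out one frame at a time as nodes become free).

```python
def split_frames(frames, node_count):
    """Deal frames out to nodes in contiguous, near-equal chunks.

    Hypothetical helper for illustration only, not any render
    manager's actual scheduling logic.
    """
    base, extra = divmod(len(frames), node_count)
    chunks, start = [], 0
    for node in range(node_count):
        # The first `extra` nodes take one additional frame each.
        size = base + (1 if node < extra else 0)
        chunks.append(frames[start:start + size])
        start += size
    return chunks


frames = list(range(1, 10))          # nine frames, numbered 1 to 9
assignment = split_frames(frames, 3)  # three computers, all rendering
for node, chunk in enumerate(assignment, start=1):
    print(f"node {node}: frames {chunk}")
```

With nine frames and three nodes, each node ends up with three frames, which is where the "a third of the time" figure comes from when the machines are equally fast.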
The second method for distributing rendering is "bucket rendering" or "tile splitting". This method takes a single frame and splits it into parts to be rendered on multiple computers.
The tile splitting technique is what Crowdrender uses. As you can see from the diagram above, tile splitting works differently from frame splitting. Tile splitting breaks each frame up into parts and each computer renders one of these parts of the frame.
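As a rough sketch of the idea, the Python below cuts one frame into a grid of rectangular tiles and pairs each tile with a node. This is not Crowdrender's actual algorithm; the function split_into_tiles, the 3 x 2 grid, the 1920 x 1080 resolution and the node names are all assumptions made for illustration.

```python
def split_into_tiles(width, height, columns, rows):
    """Return (x0, y0, x1, y1) pixel boxes that cover the frame exactly once.

    Hypothetical helper: a simple regular grid, row by row.
    """
    # Integer boundaries, so tiles neither overlap nor leave gaps.
    xs = [width * c // columns for c in range(columns + 1)]
    ys = [height * r // rows for r in range(rows + 1)]
    return [
        (xs[c], ys[r], xs[c + 1], ys[r + 1])
        for r in range(rows)
        for c in range(columns)
    ]


# One 1920 x 1080 frame split into six tiles, one per computer.
tiles = split_into_tiles(1920, 1080, columns=3, rows=2)
nodes = [f"node-{i}" for i in range(1, 7)]  # illustrative node names
for node, tile in zip(nodes, tiles):
    print(f"{node} renders pixels {tile}")
```

Once every node has sent its tile back, the master can paste the tiles into place using these same pixel boxes to rebuild the full frame.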
Tile splitting is more flexible than frame splitting for the obvious reason that tile splitting works for single frame renders. Frame splitting cannot make a single frame render faster! Tile splitting is therefore very useful for fast previewing of frames to ensure there are no errors and that the overall impression of the image is what the artist wants.
Tile splitting is not without its own problems, however. Where there is a lot of compositing required, frame splitting can accelerate the compositing as well as the rendering as the entire frame is available on each node. In tile splitting, compositing usually has to be done on the master node when the tiles from each slave node are received and the entire image is finally available.
Knowing just these details about each technique will help immensely in having a good final render experience!
An Example - Why It's So Important
We conducted a benchmark test using a project from Antonio Arroyo (see the image above!). The test was simple: the baseline was a single computer rendering the scene. For this baseline test, we used a MacBook Pro with a fifth-generation i5 processor running four threads.
Then, the variant for the test. We used Crowdrender to do a tile split render (because we rendered just one frame), using the MacBook as master and five other computers as slaves. The slave nodes had similar hardware; each had four threads and similar RAM (though their CPUs were from previous generations, so clock speeds and other features were less advanced than those of the laptop).
For the test we varied just the number of samples. Quite simply, the more samples, the higher the quality of the render and the longer the image takes to complete.
Here are the results, plotted as samples (x-axis) vs render time in minutes:seconds.milliseconds. We also repeated the results in a table format so you can see the exact numbers for yourself.
The results for 50 samples are:
with Crowdrender: 1 min 07 secs
without: 2 min 54 secs
The results for 100 samples are:
with Crowdrender: 1 min 30 secs
without: 4 min 39 secs
The results for 200 samples are:
with Crowdrender: 2 min 23 secs
without: 9 min 06 secs
The results for 500 samples are:
with Crowdrender: 5 min 12 secs
without: 19 min 40 secs
The results for 1000 samples are:
with Crowdrender: 10 min 06 secs
without: 39 min 27 secs
Distributing rendering gives you more than speed
So, after reviewing the results, you can see that as samples increase, the render time for both tests increases in more or less a straight line. But the results for distributing the render are much, much better. For 1000 samples, distributing takes only about a quarter of the time to render (10 min 06 secs versus 39 min 27 secs).
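If you want to check the numbers yourself, the short Python sketch below works out the speed-up at each sample count from the timings listed in the results. The timings are copied from the benchmark; the names results, to_seconds and speedups are our own for illustration.

```python
# samples -> (time with Crowdrender, time without), as "minutes:seconds"
results = {
    50: ("1:07", "2:54"),
    100: ("1:30", "4:39"),
    200: ("2:23", "9:06"),
    500: ("5:12", "19:40"),
    1000: ("10:06", "39:27"),
}


def to_seconds(timing):
    """Convert a "minutes:seconds" string into a number of seconds."""
    minutes, seconds = timing.split(":")
    return int(minutes) * 60 + int(seconds)


# Speed-up = single-machine time divided by distributed time.
speedups = {
    samples: to_seconds(single) / to_seconds(distributed)
    for samples, (distributed, single) in results.items()
}

for samples, speedup in speedups.items():
    print(f"{samples:>5} samples: {speedup:.2f}x faster")
```

The speed-up is smallest at 50 samples and settles at a little under four times for the higher sample counts. One plausible reading is that fixed costs, such as sending the scene out and collecting the tiles, matter less as the render itself gets longer.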
What is really great about this is that with each computer you add, you will see a further reduction in the rate at which render times increase with increasing samples. In other words, the green line gets flatter.
Another useful way of looking at this is that you can increase quality without increasing render time by using more computers. If you look carefully at the chart above, you'll see that 1000 samples (using Crowdrender and six machines) takes about the same time as one computer doing 250 samples. Distributing the render gave us four times the samples for a negligible increase in render time. If we added more computers, we could increase the samples further.
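We did not measure a single machine at 250 samples directly, so here is a small Python sketch that estimates it from the two nearest measurements (200 and 500 samples). It assumes render time grows in a straight line between those two points, which is an approximation; the function name interpolate is ours.

```python
def interpolate(x, x0, y0, x1, y1):
    """Straight-line estimate of y at x, between (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


# Measured single-machine timings from the benchmark, in seconds.
single_200 = 9 * 60 + 6     # 9 min 06 secs at 200 samples
single_500 = 19 * 60 + 40   # 19 min 40 secs at 500 samples

# Measured distributed timing at 1000 samples, in seconds.
distributed_1000 = 10 * 60 + 6  # 10 min 06 secs

estimate_250 = interpolate(250, 200, single_200, 500, single_500)
print(f"single machine, 250 samples (estimated): {estimate_250:.0f} s")
print(f"six machines, 1000 samples (measured):   {distributed_1000} s")
```

The estimate comes out at roughly 652 seconds, just under 11 minutes, against the measured 606 seconds for 1000 samples on six machines, so the two are in the same ballpark.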
For high quality renders, with lots of samples or higher resolutions, it makes a lot of sense to use distributed rendering. Render times grow too quickly with just a single machine, even for a very powerful one.
You might think, "why not just build one really fast machine?" The problem is that you very quickly hit a limit on the number of CPUs (or GPUs) you can put into one system. Also, the cost increases very sharply when you reach the upper limits of the products available. Let's take a look.
Charts courtesy of www.cpubenchmark.net
The chart above shows CPU rating using PassMark vs cost in USD. The blue arrows are a hypothetical example considering a system that scores 10,000 in PassMark. Let's assume we want to increase our computing power by a factor of three, which corresponds to the earlier example of cutting render times to a third using distributed rendering. Only this time we want to build a single machine that can achieve this reduction.
The CPU with a score of 10,000 in PassMark costs USD300. If we now look at the chart on the right (which shows dual CPU configurations), we can see that the only data point we have is at roughly USD5600! This means that to increase our computing power in a single system by three times, we have to spend nearly 19 times the cost of our original CPU that scored 10,000!
We could get similar performance if we distributed the task we wanted to run over three systems, each with a USD300 processor. The cost of the CPUs would only be USD900. That makes the single system with dual CPUs roughly 6.2 times more expensive.
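The cost comparison is simple enough to check in a few lines of Python. The prices are the rounded figures read off the charts (USD300 for the single CPU and roughly USD5600 for the dual CPU configuration); the variable names are ours, and the sketch only counts CPU cost, not the rest of each system.

```python
single_cpu_cost = 300    # USD, CPU scoring about 10,000 in PassMark
dual_cpu_cost = 5600     # USD, rough chart value for the dual CPU option
node_count = 3           # three systems, one USD300 CPU each

distributed_cost = node_count * single_cpu_cost

# How many times more the dual CPU option costs than each alternative.
dual_vs_single = dual_cpu_cost / single_cpu_cost
dual_vs_distributed = dual_cpu_cost / distributed_cost

print(f"three single-CPU systems: USD{distributed_cost}")
print(f"dual CPU vs one CPU:      {dual_vs_single:.1f}x the cost")
print(f"dual CPU vs three CPUs:   {dual_vs_distributed:.1f}x the cost")
```

With these rounded inputs the ratios come out at roughly 18.7 and 6.2; more precise chart values would shift them a little.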