Why you should build a render farm


Image courtesy of Antonio Arroyo and Oliver Villar



In this article we'll explain exactly why a render farm is important using real data.


We'll get to the bottom of why they are the only way to render some projects and how you can create one using free software.



An Explanation - What is a render farm, and how does distributed rendering work?


A render farm is a group of computers that does distributed rendering: rendering a single frame, or many frames, on several computers instead of just one. We'll go into more detail later about why this is faster, but for now we'll spell out the basics.


There are two main methods for distributing rendering. The first, and slightly easier, method is frame splitting. This is where an animation is rendered by distributing its frames over several computers (those computers make up the render farm).


Each computer renders some share of the frames in the animation. We've made a simple example to follow, in which an animation of nine frames is rendered on three computers (three computers is not a huge render farm, but it is much faster than just one computer!).

Distributed rendering - Frame splitting technique where frames of an animation are sent to many computers to be rendered as complete frames

With frame splitting, a "master" node (node is a technical name for a computer) usually coordinates the render. Sometimes this node renders some of the frames itself; sometimes it acts purely as a manager that controls the render but does not render any frames. In our example, all the computers render.


The "slave" or "render" nodes are each given some of the frames of the animation to render; in this example, three frames each.

If all three computers have the same processing power, then together they will complete the render in roughly a third of the time that one of them would take alone. This is why distributed rendering is so important to artists: you can scale down your render time by adding more computers.
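The nine-frame, three-computer example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular render manager assigns work; the function name split_frames and the choice of contiguous chunks are assumptions made for the example.

```python
def split_frames(frames, node_count):
    """Assign frames to nodes in contiguous, near-equal chunks.

    Hypothetical helper for illustration only. If the frames do not
    divide evenly, the first few nodes receive one extra frame each.
    """
    base, extra = divmod(len(frames), node_count)
    assignments = []
    start = 0
    for node in range(node_count):
        size = base + (1 if node < extra else 0)
        assignments.append(frames[start:start + size])
        start += size
    return assignments


# Nine frames shared between three computers, as in the example above.
frames = list(range(1, 10))
assignments = split_frames(frames, 3)
for node, chunk in enumerate(assignments, start=1):
    print(f"node {node}: frames {chunk}")
```

With equal computers and frames of similar difficulty, each node has a third of the work, which is where the "third of the time" estimate comes from. Real farms often hand out frames one at a time instead, so that faster nodes simply pick up more of them.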


The second method for distributing rendering is "bucket rendering" or "tile splitting". This method takes a single frame and splits it into parts to be rendered on multiple computers.

Tile splitting - distributed rendering technique that breaks a single frame into parts to be rendered on multiple computers

The tile splitting technique is what Crowdrender uses. As you can see from the diagram above, tile splitting works differently from frame splitting. Tile splitting breaks each frame up into parts and each computer renders one of these parts of the frame.
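To make the idea concrete, here is a minimal Python sketch that cuts a frame into a grid of rectangular tiles. It is not Crowdrender's actual algorithm, which this article does not describe; the function name split_into_tiles, the grid layout and the 1920 x 1080 resolution are all assumptions chosen for illustration.

```python
def split_into_tiles(width, height, columns, rows):
    """Return (x, y, w, h) rectangles that cover the frame exactly once.

    Hypothetical helper for illustration only. Integer division is used
    for the tile edges so that the tiles never overlap or leave gaps,
    even when the resolution does not divide evenly.
    """
    tiles = []
    for row in range(rows):
        y0 = row * height // rows
        y1 = (row + 1) * height // rows
        for col in range(columns):
            x0 = col * width // columns
            x1 = (col + 1) * width // columns
            tiles.append((x0, y0, x1 - x0, y1 - y0))
    return tiles


# One frame cut into a 3 x 2 grid, giving six tiles for six computers.
tiles = split_into_tiles(1920, 1080, columns=3, rows=2)
for node, tile in enumerate(tiles, start=1):
    print(f"node {node}: tile {tile}")
```

Each computer would render only the pixels inside its own rectangle, and the results are put back together afterwards to form the finished frame.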

Tile splitting is more flexible than frame splitting for the obvious reason that tile splitting works for single frame renders. Frame splitting cannot make a single frame render faster! Tile splitting is therefore very useful for fast previewing of frames to ensure there are no errors and that the overall impression of the image is what the artist wants.


Tile splitting is not without its own problems, however. Where a lot of compositing is required, only frame splitting can accelerate it. This is because compositing generally needs the entire frame, and with frame splitting the entire frame is available on each node.


In tile splitting, compositing usually has to be done on the master node, once the tiles from every slave node have been received and the entire image is finally available. The compositing pass therefore runs on a single machine and gets no speed-up from the rest of the farm.
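The following Python sketch shows why compositing has to wait: the master can only treat the frame as complete once every tile has been pasted into place. The names stitch and is_complete, and the use of nested lists as a stand-in for pixel data, are assumptions made for illustration and do not reflect any real renderer's API.

```python
def stitch(width, height, rendered_tiles):
    """Paste rendered tiles into a full frame on the master node.

    Hypothetical helper for illustration only. Each tile is a tuple
    (x, y, pixels), where pixels is a list of rows. Pixels that no tile
    has covered yet stay as None.
    """
    frame = [[None] * width for _ in range(height)]
    for x, y, pixels in rendered_tiles:
        for dy, row in enumerate(pixels):
            frame[y + dy][x:x + len(row)] = row
    return frame


def is_complete(frame):
    """Compositing can only start once no pixel is missing."""
    return all(pixel is not None for row in frame for pixel in row)


# A tiny 4 x 2 frame rendered as two 2 x 2 tiles by two nodes.
left = (0, 0, [[1, 1], [1, 1]])
right = (2, 0, [[2, 2], [2, 2]])

partial = stitch(4, 2, [left])
print(is_complete(partial))   # only one tile has arrived

full = stitch(4, 2, [left, right])
print(is_complete(full))      # every tile has arrived
```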


Knowing just these details about each technique will help immensely in having a good final render experience! For example, if you have a heavy compositing pass, frame splitting may be a better choice.

An Example - Why is it so important to have a render farm anyway?

We conducted a benchmark test using a project from Antonio Arroyo (see the image above!). The test was simple. The baseline was a single computer rendering the scene; for this we used a MacBook Pro with a fifth generation i5 processor running four threads.


For the variant, we used Crowdrender to do a tile split render (because we rendered just one frame), with the MacBook Pro as master and five other computers as slaves. The slave nodes were similarly equipped: each had four threads and a similar amount of RAM, though their CPUs were from previous generations, so clock speeds and other features were less advanced than the MacBook's.


For the test we varied just the number of samples. Quite simply, the more samples, the higher the quality of the render and the longer the image takes to complete.


Here are the results, plotted as samples (x-axis) against render time in minutes:seconds.milliseconds. We've also repeated the results in a table so you can see the exact numbers for yourself.

Samples    With Crowdrender    Without Crowdrender
50         1 min 07 secs       2 min 54 secs
100        1 min 30 secs       4 min 39 secs
200        2 min 23 secs       9 min 06 secs
500        5 min 12 secs       19 min 40 secs
1000       10 min 06 secs      39 min 27 secs

As you can see from the results, render time increases in roughly a straight line with the number of samples in both tests. But the results for the distributed render are much, much better. At 1000 samples, distributing takes only about a quarter of the time to render.


What is really great about this is that with each computer you add, you will see a further reduction in the rate at which render times increase with increasing samples. In other words, the green line gets flatter.
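The speed-up at each sample count can be checked directly from the benchmark times. The short Python sketch below does the arithmetic; the times are the ones reported above, while the helper name to_seconds and the "minutes:seconds" string format are just conveniences for this example.

```python
# Benchmark times from the results above, as "minutes:seconds".
# Each entry is samples -> (with Crowdrender, without Crowdrender).
results = {
    50: ("1:07", "2:54"),
    100: ("1:30", "4:39"),
    200: ("2:23", "9:06"),
    500: ("5:12", "19:40"),
    1000: ("10:06", "39:27"),
}


def to_seconds(text):
    """Convert a 'minutes:seconds' string into a number of seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


speedups = {}
for samples, (distributed, single) in results.items():
    # Speed-up is how many times faster the distributed render finished.
    speedups[samples] = to_seconds(single) / to_seconds(distributed)
    print(f"{samples:>5} samples: {speedups[samples]:.2f}x faster")
```

Run as written, the speed-up works out to roughly 2.6 times at 50 samples and roughly 3.9 times at 1000 samples, which matches the "about a quarter of the time" figure for the largest test.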


Render farms are not just about speed


The results also suggest something else: you can increase quality without increasing render time by using more computers. If you look carefully at the chart above, you'll see that 1000 samples with Crowdrender (the master plus five render nodes) takes about the same time as one computer doing 250 samples. Distributing the render gave us four times the samples for about the same render time. If we added more computers, we could increase the samples further and get better quality images.
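The 250 sample figure was not measured directly, so it has to be read off the chart. One way to check it, sketched below in Python, is to interpolate in a straight line between the single-computer results at 200 and 500 samples. The straight-line assumption and the helper name interpolate are ours; the measured times are the ones from the benchmark.

```python
def interpolate(x, x0, y0, x1, y1):
    """Straight-line estimate of y at x, between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Single-computer render times from the benchmark, in seconds.
single_200 = 9 * 60 + 6      # 9 min 06 secs at 200 samples
single_500 = 19 * 60 + 40    # 19 min 40 secs at 500 samples

# Distributed render time at 1000 samples, in seconds.
distributed_1000 = 10 * 60 + 6   # 10 min 06 secs

# Estimated single-computer time at 250 samples (an estimate, not a measurement).
estimate_250 = interpolate(250, 200, single_200, 500, single_500)

minutes, seconds = divmod(round(estimate_250), 60)
print(f"one computer, 250 samples (estimated): {minutes} min {seconds:02d} secs")
print(f"farm, 1000 samples (measured): {distributed_1000 // 60} min {distributed_1000 % 60:02d} secs")
```

The estimate comes out within about a minute of the measured farm time at 1000 samples, so the two are indeed close.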


For high quality renders, with lots of samples or higher resolutions, it makes a lot of sense to use distributed rendering. Render times grow too quickly with just a single machine, even for a very powerful one.


Building a render farm vs building an epic workstation


It can be tempting to think, "Why not just make one really fast machine?" After all, building a render farm can seem daunting: there's the added cost of specialised software, the cost of the network, and the duplication of parts for each node in the farm. Perhaps just putting all those resources into one machine would be better?


The problem is that you very quickly hit a limit on the number of CPUs (or GPUs) you can put into one system. Top-end workstation motherboards typically offer dual sockets, so two CPUs at most. Server motherboards can offer four sockets, but the cost starts to climb very quickly.


The cost of more performance from a single workstation starts to climb steeply towards the top end of the performance curve. Check this out.