Building a 32-Core, 16-Node Windows Server HPC Cluster
After 20 years with Microsoft, last December I found myself
in this unusual “in between jobs” situation. What a great time to start
something new and use some spare time for experimentation. While working
at Microsoft I had the chance to play with many different Windows HPC clusters,
from small configurations to large ones such as a 1,800-core cluster. Since those days
are now behind me, I set myself the goal of creating a small home cluster for
development and small-scale benchmarks.
Like I said, I have some free time, so I’m also taking
this opportunity to share some of my experience building this cluster. Here are
some initial considerations about the construction of this HOME HPC Cluster.
Cluster under construction
Final Result
16 Nodes Windows HPC Cluster up and running
The 32-core, 16-node cluster configuration cost a little
less than US$ 2,000 to build; that's about $62.50 per core or $125.00 per node
(including virtualized nodes). That's not bad considering that a similar
configuration from a brand-name PC maker would cost at least double
that. Of course I don't want to fool myself or anybody: this cluster was built
mostly with lower-end components and can't be compared to a robust, server-grade
cluster. Just as an example, to keep costs low I opted for non-ECC
memory; not a wise decision if you plan to run in production. In spite of that,
the cluster is fully functional and serves its intended purpose well:
developing and running some parallel code. I'll start to
share some code, benchmark results, and conclusions in subsequent posts. For
now, here's the cluster configuration:
Computer
- ATX Mid Tower Case - 400W PSU, 2x Int 5.25" x 1x Ext 3.5", 2x Int 3.5", 2x Front USB Ports
- Motherboard: ASUS M5A97 LE R2.0
- Processor: AMD FX-8320, AM3+, Eight-Core, 3.5GHz, 16MB, 125W, Unlocked
- Memory: 16GB Desktop - DDR3, 2 x 8GB, 240 Pin, DIMM, XMP Ready
- Video Card: Asus ATI Radeon HD5450 Silence - 1 GB DDR3 VGA/DVI/HDMI
- Hard drive - 500 GB - internal - 2.5" - SATA-300 - 7200 rpm - buffer: 16 MB
Network
- Network adapter: Realtek PCIe GBE Family Controller (two per computer)
- Switch: TRENDnet 8-Port Gigabit GREENnet Switch
- CAT 6 network cables
Software
- Windows Server 2012 Standard Evaluation, HPC Pack 2012
Other
- IOGEAR 4 Port USB Cable KVM Switch
- Old Gateway netbook - dual-core Atom CPU
- Old Linksys WRT160N
I also considered building this cluster using lower-powered,
passively cooled computers, but in the end their cost per core was
higher than that of the higher core density
solution I built, based on prices for lower-powered Intel Atom based
machines as of December 2012. I looked at a few other options to reduce cost that, for
different reasons, didn’t prevail: headless installation (nodes without a video
card) and diskless nodes.
Additionally, the cluster cost does not take into consideration the almighty
Active Directory / Internet Gateway server that you can see in the cluster photo.
For this server I'm taking advantage of an old dual-core Atom based netbook
that had not been in use for some time. Actually, it works pretty well for this small HPC cluster.
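As a quick sanity check, the per-core and per-node figures quoted above can be reproduced with a few lines of Python. This is only an illustrative sketch: the $2,000 total is an upper bound (the build cost a little less than that), it excludes the netbook server, and the helper name `cost_per_unit` is made up for this example.

```python
# Figures taken from the post. The total is an upper bound
# ("a little less than US$ 2,000") and excludes the netbook server.
TOTAL_COST_USD = 2000.00
CORES = 32
NODES = 16  # includes the virtualized nodes


def cost_per_unit(total_cost: float, units: int) -> float:
    """Return the cost per unit (core or node), rounded to cents."""
    if units <= 0:
        raise ValueError("units must be positive")
    return round(total_cost / units, 2)


cost_per_core = cost_per_unit(TOTAL_COST_USD, CORES)
cost_per_node = cost_per_unit(TOTAL_COST_USD, NODES)

print(f"Cost per core: ${cost_per_core:.2f}")
print(f"Cost per node: ${cost_per_node:.2f}")
```

Swapping in the price and core count of an alternative build (for example a set of Atom based machines) gives a like-for-like cost-per-core comparison.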
It’s been a long time, years in fact, since my last custom-built
computer. In all honesty I had a lot of fun putting this cluster together from
scratch: researching and buying the hardware parts, assembling the cluster
(machines, network), and installing and configuring the software (Windows Server, DNS,
AD, DHCP, Hyper-V, HPC Pack). Considering that I'm not an infrastructure or
system admin guy, I think it all went pretty well and I'm really
satisfied with the results.
What's next? Let's put my new toy to work! I'm completing a
small C# program that will run file unzipping (www.7-zip.org) in parallel using
Windows HPC. This is a typical embarrassingly parallel type of problem and a good
fit for some initial tests. I'll have three different C# implementations to
demonstrate distinct approaches to implementing the Parallel Unzip:
- Using the Windows HPC Scheduler API - let's put the Windows HPC scheduler under some stress.
- Using Service Oriented Architecture HPC - I expect to see the best performance of the three approaches; we'll see...
- Traditional MPI based approach - you gotta have the classic represented.
In subsequent posts I'll share the code and the benchmark
results of those three implementations.
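To make "embarrassingly parallel" concrete in the meantime, here is a minimal single-machine sketch in Python. It is not the C# program described above and it does not use Windows HPC or 7-Zip: it assumes plain .zip archives, Python's standard `zipfile` module, and a thread pool standing in for the cluster nodes. The function names `extract_one` and `unzip_all` are hypothetical. The point it illustrates is that each archive can be extracted without any knowledge of the others, so the work splits cleanly across workers.

```python
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_one(archive: Path, dest_root: Path) -> Path:
    """Extract one archive into its own folder, independently of all others."""
    target = dest_root / archive.stem
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target


def unzip_all(archives, dest_root, max_workers=4):
    """Extract every archive concurrently; no task depends on another's result."""
    dest_root = Path(dest_root)
    paths = [Path(a) for a in archives]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the inputs.
        return list(pool.map(lambda a: extract_one(a, dest_root), paths))
```

Calling `unzip_all(["a.zip", "b.zip"], "out")` would extract into `out/a` and `out/b`. On a real cluster the thread pool would be replaced by whatever distributes tasks to nodes, which is where the three approaches listed above differ.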
Dear Mr. Manfredi,
Your HOME HPC Cluster looks great.
We plan to deploy similar cluster for training purpose.
Please advise us what manual to use?
We have found "DIY Supercomputing: How to Build a Small Windows HPC Cluster".
I see 4 nodes in your cluster. Where are the rest?
With best wishes,
G. Oyunbayar
G-Mobile LLC www.g-mobile.mn
R#506, ChD,
15160 Ulaanbaatar
Mongolia
Phone: +976 93111000
Fax: +976 11311195
Hi Oyunbayar,
The DIY documentation you mentioned is a very good place to start building your cluster. In my configuration I have 4 physical computers with eight cores each; on each physical computer I created 4 virtual machines using Hyper-V. That's how I got 16 nodes.
Let me know if you have any other questions.
Best regards,
Pedro Manfredi
Thanks for sharing, Pedro. Can you explain the purpose of artificially increasing the number of nodes?
Hi,
The primary reason for using this configuration is to enable some testing scenarios, such as deployment and performance benchmarking. In some cases this configuration might be desirable as well if your cluster will be used by multiple users and you want additional levels of isolation in security, CPU use, etc. In any case, this is very flexible, as you can use any configuration you like: physical nodes, virtual nodes, or any combination.
Hope this helps,
Pedro
Hello, great job! Can you please tell me whether it will help reduce video rendering time? Basically I want it for animation and video editing purposes. Please help.
Hi Santhal,
You need to check whether your rendering software supports Windows HPC. RenderMan, I believe, currently supports Windows HPC.
Very nice work, dear Pedro. I have a lot of old PCs with dual core duo CPUs, DDR2 RAM, and up to 4 GB per PC. I don't know whether these PCs can be put together to build a working cluster (maybe using Windows Server 2008) and, if so, how I can then test it and which tools I need to build small parallel applications.
Your idea is small for a person, but huge for others.
:)
Regards
derar