Building a 32-Core, 16-Node Windows Server HPC Cluster
After 20 years with Microsoft, last December I found myself
in the unusual "in between jobs" situation. What a great time to start
something new and use some spare time for experimentation! While working
at Microsoft I had the chance to play with many different Windows HPC clusters,
from small configurations to large ones like an 1,800-core cluster. Since those days
are now behind me, I set myself the goal of building a small home cluster for
development and small-scale benchmarks.
As I said, since I have some free time, I'm also taking
this opportunity to share my experience building this cluster. Here are
some initial considerations about the construction of this home HPC cluster.
Cluster under construction
Final result
16-node Windows HPC cluster up and running
The 32-core, 16-node cluster configuration cost a little
less than US$ 2,000 to build; that's $62.50 per core or $125.00 per node
(including virtualized nodes). That's not bad, considering that any similar
configuration from a brand-name PC maker would cost at least twice as much.
Of course, I don't want to fool myself or anybody: this cluster was built
mostly with lower-end components and can't be compared to a robust, server-grade
cluster. Just as an example, to keep costs low I opted for non-ECC
memory; not a wise decision if you plan to run in production. In spite of that,
the cluster is fully functional and serves its intended purpose well:
developing and running parallel code. I'll start to
share some code, benchmark results, and conclusions in subsequent posts. For
now, here's the cluster configuration:
Computer
- Case: ATX mid tower, 400W PSU, 2x internal 5.25", 1x external 3.5", 2x internal 3.5", 2x front USB ports
- Motherboard: ASUS M5A97 LE R2.0
- Processor: AMD FX-8320, AM3+, eight-core, 3.5GHz, 16MB, 125W, unlocked
- Memory: 16GB desktop DDR3, 2 x 8GB, 240-pin DIMM, XMP ready
- Video card: ASUS ATI Radeon HD5450 Silence, 1 GB DDR3, VGA/DVI/HDMI
- Hard drive: 500 GB, internal, 2.5", SATA-300, 7200 rpm, 16 MB buffer
Network
- Network adapter: Realtek PCIe GBE Family Controller (two per computer)
- Switch: TRENDnet 8-Port Gigabit GREENnet Switch
- Cables: CAT 6 network cables
Software
- Windows Server 2012 Standard Evaluation, HPC Pack 2012
Other
- IOGEAR 4-Port USB Cable KVM Switch
- Old Gateway netbook, two-core Atom CPU
- Old Linksys WRT160N
I also considered building this cluster with lower-powered,
passively cooled computers, but in the end the cost per core came out higher
than for the higher-core-density solution I built, based on the prices of
low-power Intel Atom based machines as of December 2012. A few other
cost-reduction options didn't prevail, for different reasons: headless
installation (nodes without a video card) and diskless nodes.
Additionally, the cluster cost doesn't take into account the almighty
Active Directory / Internet gateway server you can see in the cluster photo.
For that server I'm repurposing an old two-core Atom based netbook
that had been sitting unused for some time. It actually works pretty well for this small HPC cluster.
It's been a long time, years in fact, since my last custom-built
computer. In all honesty, I had a lot of fun putting this cluster together from
scratch: researching and buying the hardware parts, assembling the cluster
(machines, network), and installing and configuring the software (Windows Server, DNS,
AD, DHCP, Hyper-V, HPC Pack). Considering that I'm not an infrastructure or
system admin guy, I think it all went pretty well, and I'm really
satisfied with the results.
What's next? Let's put my new toy to work! I'm completing a
small C# program that runs file unzipping (www.7-zip.org) in parallel using
Windows HPC. This is a typical embarrassingly parallel problem and a good
fit for some initial tests. I'll have three different C# implementations to
demonstrate distinct approaches to the parallel unzip:
- Using the Windows HPC Scheduler API (see the sketch after this list) - let's put the Windows HPC scheduler under some stress.
- Using Service Oriented Architecture (SOA) HPC - I expect this one to show the best performance of the three; we'll see...
- A traditional MPI based approach - you gotta have the classic represented.
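To give a flavor of what's coming, here's a minimal sketch of the Scheduler API approach. It assumes the HPC Pack 2012 SDK (the Microsoft.Hpc.Scheduler assembly); the head node name, share paths, and archive list are hypothetical placeholders, not the actual program I'll be posting:

using System;
using Microsoft.Hpc.Scheduler;

class ParallelUnzipSketch
{
    static void Main()
    {
        // Hypothetical head node and archive list; adjust for your cluster.
        const string headNode = "HEADNODE";
        string[] archives = { @"\\HEADNODE\share\a.zip", @"\\HEADNODE\share\b.zip" };

        IScheduler scheduler = new Scheduler();
        scheduler.Connect(headNode);            // connect to the HPC Pack head node

        ISchedulerJob job = scheduler.CreateJob();
        job.Name = "ParallelUnzip";

        // One task per archive; the scheduler spreads the tasks across nodes.
        foreach (string archive in archives)
        {
            ISchedulerTask task = job.CreateTask();
            task.CommandLine = @"7z.exe x -y -o\\HEADNODE\share\out " + archive;
            job.AddTask(task);
        }

        // Null credentials fall back to cached credentials or a prompt.
        scheduler.SubmitJob(job, null, null);
        Console.WriteLine("Submitted job " + job.Id);
    }
}

Each archive becomes one independent task, which is what makes the problem embarrassingly parallel: the scheduler is free to place tasks on any node, and that's exactly the kind of load I want to put on it.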
In subsequent posts I'll share the code and the benchmark
results of those three implementations.