Hardware configuration
Recently we've been involved in a project where a third party were due to supply hardware using their "standard configuration". Cutting the story short, this is how the hardware should have been set up prior to being handed over to us:
Hardware specification should be agreed upfront:
- Vendor
- Model
- You may want to check compatibility
- Not just of the server itself, but management features, components and peripherals
- Perhaps match existing models, or alternatively wait until the next generation ships
- CPU - "2x2GHz" may well not be accurate enough - is that a single dual core chip or two quad core chips?
- Do you need virtualisation extensions?
- Or perhaps a particular model (later models are often more efficient)
- Disks
- Speed, size, number
- Disk controller - do you want a hardware RAID solution?
- Note that many "onboard RAID" options are best termed FakeRAID and should be avoided
- If you do have a proper hardware RAID card, it's well worth ensuring that it has a battery backed write cache
- Memory
- Do you want ECC (error correcting) memory? Yes!
- Most servers will only use ECC memory
- Remote management cards
- Licenses if required
- If you can't afford a management card, you can normally use a built-in BMC device instead. This will allow you to remotely reboot the server if it crashes
- Power supplies
- Do you want redundancy?
- Or perhaps as a half-way house have one spare on the shelf for many servers?
- Rack or desktop?
- Most people use rack mounted kit - ensure the rails (and server) fit the rack
- Network cards
- If two or more ports are required, should these be spread across multiple cards for redundancy?
- Other cards - such as Fibre Channel cards
- Dates
- Most suppliers will have several weeks lead time - often many more if it is a new server range
- Setup time before the hardware is handed over
You should also ensure that the server configuration is valid - and preferably optimal:
- Check memory setup - in the case above, the memory used meant that only one in three slots could be used and this rule had not been followed, leading to memory errors and crashes
- Memory should be balanced - normally this means spreading it over as many channels as possible and purchasing in matching multiples (for example four at a time)
- Some availability features (such as ChipKill) may require specific placement
- Check adapter placement
- Normally spread these over multiple PCI buses (which may themselves offer different speeds and bandwidth)
- Think both about spreading the load and also availability in case a PCI bus error occurs
- Upgradability - do you wish to plan on upgrading the server later?
- Expect to double the memory of a server if nothing else
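The balance rule above lends itself to a quick check before an order is placed. What follows is only a minimal sketch: the function name and the "channel name to list of DIMM sizes" layout are our own assumptions for illustration, and the real population rules must always come from the vendor's documentation for that exact model.

```python
from typing import Dict, List


def memory_layout_problems(channels: Dict[str, List[int]]) -> List[str]:
    """Return human-readable problems found in a DIMM layout.

    `channels` maps a memory channel name to the sizes (in GB) of the
    DIMMs fitted in it. This data shape is hypothetical.
    """
    problems = []
    populated = {name: dimms for name, dimms in channels.items() if dimms}

    # Memory is normally spread over as many channels as possible.
    empty = sorted(set(channels) - set(populated))
    if populated and empty:
        problems.append("unpopulated channels: " + ", ".join(empty))

    # Each populated channel should carry the same capacity.
    totals = {name: sum(dimms) for name, dimms in populated.items()}
    if len(set(totals.values())) > 1:
        problems.append("capacity differs between channels")

    # Matching multiples: all DIMMs should be the same size.
    sizes = {size for dimms in populated.values() for size in dimms}
    if len(sizes) > 1:
        problems.append("mixed DIMM sizes")

    return problems
```

An empty list means the layout passes these three generic checks only; it says nothing about vendor-specific rules such as the slot restriction described above.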
A server is more than just a lump of tin; it also requires documenting. We'd always advise documenting first and doing later. Firstly, this ensures that the documentation is done; secondly, it is far easier to type or write at a desk than in a datacentre; thirdly, any last-minute changes are easy to document (since they will most likely be scrawled onto a printed document). The documentation that needs preparing should include:
- Badge name (i.e. what the server is known as by the staff in the datacentre)
- Hostname (if different)
- Physical location
- Which datacentre, which rack, which units within that rack
- Rack diagrams should be up to date
- Server type (i.e. vendor and model)
- Network configuration
- Which ports plug into which switch ports
- VLANs
- Bonding
- IP addresses
- Don't forget the management card (or onboard BMC)
- Disk configuration
- RAID levels
- Advanced settings such as background scans
- Usernames/passwords for management card, BMC and/or BIOS
- "Owner" can be a useful detail to know
- How many times have you heard the phrase "We aren't sure if the server is used any more"?
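One way to make sure the documentation really is done first is to hold it in a structured record and ask the record what is still blank. The sketch below is illustrative only: the class, its field names and the choice of which fields are optional are our own assumptions, and it covers just a subset of the items listed above. Passwords are deliberately left out, since they belong in a password store rather than in a plain record.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List

# Hostname is only needed when it differs from the badge name.
OPTIONAL_FIELDS = frozenset({"hostname"})


@dataclass
class ServerRecord:
    """Hypothetical pre-install documentation record for one server."""

    badge_name: str = ""
    hostname: str = ""
    datacentre: str = ""
    rack: str = ""
    rack_units: str = ""
    vendor: str = ""
    model: str = ""
    owner: str = ""
    management_ip: str = ""
    # Maps a server port name to the switch port it plugs into.
    port_map: Dict[str, str] = field(default_factory=dict)

    def missing(self) -> List[str]:
        """Names of required fields that have not been filled in yet."""
        return [
            f.name
            for f in fields(self)
            if f.name not in OPTIONAL_FIELDS and not getattr(self, f.name)
        ]
```

A record with an empty `missing()` list is ready to be printed and taken to the datacentre.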
The server requires plugging into your datacentre infrastructure - you must ensure that you have adequate:
- Power (not just in terms of current but also number and type of connector)
- Many datacentres are now constrained by power demands above all other limitations
- Network ports - including remote management if required
- Fibre, copper, 100Mb, Gigabit, 10Gbps?
- KVM (Keyboard, Video, Mouse) ports if used
- Serial ports if used
- Fibre channel if required
- Several connectors, wavelengths and core diameters are used - ensure they are a suitable match
- Space in a rack
- Cooling
- Floor loading (weight)
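Most of the checks above come down to comparing what the server needs with what the rack has free. Here is a rough sketch of that comparison; the `Capacity` class, its fields and the figures in the example are made up for illustration, and cooling, connector types and fibre matching still need checking by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Capacity:
    """Hypothetical summary of rack resources (free or required)."""

    rack_units: int
    power_watts: float
    weight_kg: float
    network_ports: int


def shortfalls(free: Capacity, needed: Capacity) -> List[str]:
    """List every resource where the server needs more than is free."""
    result = []
    for name in ("rack_units", "power_watts", "weight_kg", "network_ports"):
        have, want = getattr(free, name), getattr(needed, name)
        if want > have:
            result.append(f"{name}: need {want}, only {have} free")
    return result


if __name__ == "__main__":
    rack_free = Capacity(rack_units=4, power_watts=600.0,
                         weight_kg=80.0, network_ports=2)
    new_server = Capacity(rack_units=2, power_watts=750.0,
                          weight_kg=30.0, network_ports=3)
    for line in shortfalls(rack_free, new_server):
        print(line)
```

In this made-up example the rack has the space and can bear the weight, but falls short on power and network ports, which is exactly the kind of surprise best found before the server arrives.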
Now that the server has arrived, it requires some basic setup. Firstly, additional components must be installed. However, you may actually want to defer this - sometimes firmware upgrades are required first.
Racking the server may seem like an unskilled job, but nothing could be further from the truth. When racking:
- Put a badge on it - front and rear
- If it has a bezel, put a badge both on the bezel and on the server behind it
- Or better yet, just bin the bezel
- If it has a cable management system, perhaps label that too
- Put anything particularly heavy towards the bottom of the rack
- The rack should have anti-tilt mechanisms fitted
- Consider airflow and cooling
- Some equipment may be particularly vulnerable to heat or generate a large amount of heat
- Think about obstructions (cabling for example)
- Cabling - you want to keep this tidy and easy to change
- Cable management arms can be both a boon and a curse. When pulling a server out, the cables should come with it cleanly. However, with 1U servers in particular, consider sacrificing "hot plug" ability in order to ensure you don't unplug other servers accidentally by catching their cables
- Reusable velcro strips are fabulous for tidying cables
- Only use tie wraps for permanent fixtures, otherwise if someone needs to move them:
- They will need a knife (one trick is to use pliers and twist to snap them off)
- They may cut themselves or, worse, the cable and others alongside it
- They won't have a replacement tie wrap and so the cabling will become a mess
- Do screw the server to the rack - if the rack starts to tilt and all the servers slide as well, you will stand even less chance of stopping it
All firmware should be upgraded, not just the BIOS. Common firmware that requires upgrading includes:
- BIOS firmware
- Often you will need to reload "default" or "optimised" settings after upgrading the BIOS
- Management firmware (BMC/iLO/DRAC/management module)
- NIC (network interface card) firmware
- Even if they are built in and not add-on cards
- If there are multiple NICs, ensure you upgrade them all
- Disk controller firmware
- Fibre channel firmware
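With this many components it is easy to miss one (the second NIC is a classic), so it can help to compare what is installed against a target list. The sketch below assumes you have already collected both sets of version strings by whatever means your vendor provides; the function, component names and version numbers are purely illustrative. It only tests whether versions differ, because version numbering schemes vary too much between vendors to order them safely.

```python
from typing import Dict, List


def firmware_to_upgrade(installed: Dict[str, str],
                        target: Dict[str, str]) -> List[str]:
    """List components whose installed firmware differs from the target.

    Both arguments map a component name to a version string. A component
    missing from `installed` is reported as "unknown".
    """
    pending = []
    for component in sorted(target):
        current = installed.get(component)
        if current != target[component]:
            pending.append(
                f"{component}: {current or 'unknown'} -> {target[component]}"
            )
    return pending


if __name__ == "__main__":
    installed = {"bios": "1.2", "nic0": "7.0", "nic1": "6.5"}
    target = {"bios": "1.4", "bmc": "2.1", "nic0": "7.0", "nic1": "7.0"}
    for line in firmware_to_upgrade(installed, target):
        print(line)
```

In the example the first NIC is already current, while the second NIC, the BIOS and the unrecorded BMC all show up as needing attention.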
Now set up the BIOS. Typical items to change include:
- Date and time
- Boot order
- We normally use CD-ROM, USB, private network card, hard disk
- Response on power outage
- Last state is a good choice
- BIOS password
- From your prepared documentation
- Badge name or hostname as appropriate
Even now there are a few things left to do:
- Management cards need setting up:
- IP addresses
- Usernames, passwords
- Name
- SNMP trap destinations
- SNMP community name
- Also check for any hardware errors that the management cards are reporting
- Clear the event log once any are resolved
- Disk controllers will need configuring:
- Disk arrays need defining and building
This is a pretty comprehensive list, and it clearly demonstrates that installing a server is not a five minute process.
Fortunately there are many ways you can automate some of these processes. However, many are vendor specific - so if you swap from HP to Dell you will have to learn new ways to do the same thing.