This post describes a method for automatically assigning host names and IP addresses to servers that is more maintainable than conventional DHCP, making it well suited to cloud computing and large clusters. Addresses are assigned according to the physical location of the server, and adding or replacing a server requires zero administrative effort.
DHCP is OK, but it costs time and effort to keep up to date as you install and replace servers. If you have a million servers, you'd like the IP addresses to be easily correlated with where the hardware is. For example, you'd like the server name to be something like Drone-Rracknum-Sservernum and the IP address to be something like PREFIX.rack-number.server-placement-in-rack. Then the IP address would tell you immediately where the server is in your data center. If you have several hundred servers or more, this is a good thing. But you don't want to hard-wire MAC addresses into DHCP configurations or IP addresses into filesystem images, either; both are high-maintenance methods.
This post describes an alternative method of assigning IP addresses that lets you locate your servers easily, and is lower-overhead and more maintainable than the usual way of using DHCP.
This arrangement assumes that you have a managed switch or two in each cabinet of servers, and that you maintain the IP addresses of those switches in DHCP. When you replace a switch, you put its MAC address/IP address pair into DHCP and away you go. But there are far fewer switches than servers: something like one switch for every 49 or so servers.
Once you have this, the rest is simple. You assign the switch IP addresses sparsely, perhaps one per 50 addresses in a subnet: for example, 10.0.0.1, 10.0.0.51, and so on. You then rewrite the network startup scripts to listen for a link-discovery packet (LLDP or CDP). This packet contains the management address of the switch and the switch port this NIC is connected to.
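To make the extraction step concrete, here is a minimal sketch of pulling the two fields we care about out of link-discovery data. It assumes the host runs lldpd and that you capture `lldpctl -f keyvalue` output; the sample text and the interface name `eth0` are illustrative, not taken from a real switch.

```python
# Sketch: extract the switch management IP and the connected port from
# LLDP data. SAMPLE mimics lldpd's `lldpctl -f keyvalue eth0` output;
# the exact key names may differ by lldpd version, so treat them as
# assumptions to verify on your own systems.

SAMPLE = """\
lldp.eth0.chassis.mgmt-ip=10.0.0.51
lldp.eth0.port.ifname=Gi0/12
"""

def parse_lldp(keyvalue_text):
    """Return (mgmt_ip, port_name) parsed from key=value LLDP output."""
    fields = dict(line.split("=", 1)
                  for line in keyvalue_text.splitlines() if "=" in line)
    return (fields.get("lldp.eth0.chassis.mgmt-ip"),
            fields.get("lldp.eth0.port.ifname"))

print(parse_lldp(SAMPLE))  # → ('10.0.0.51', 'Gi0/12')
```

These two values, switch management address and port, are all the later steps need.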
If you assign these switch addresses in a way that reflects your hardware layout, then it's simple to decode an address back into the placement of the switch in your environment.
Basically, the host computes the host name it gives to DHCP, and lets DHCP do its thing. The host name should be structured something like Drone-Rracknum-Sservernum. Of course, the script that translates the switch IP address into a rack number and server-number offset is site-dependent, but not hard to write. If your racks are arranged in nice neat rows and columns, you could go further and translate the rack number into a row and column, or into whatever local designations you use for server locations.
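As a sketch of that site-dependent translation, the function below assumes the sparse scheme described earlier (switch N owns the 50-address block starting at 10.0.0.(50·N + 1)) and that port numbers match rack positions; both conventions are assumptions you would adapt to your own site.

```python
# Sketch: derive a Drone-R<rack>-S<server> host name from the
# LLDP-learned switch management IP and port. The block-of-50 address
# scheme and "Gi0/<n>" port naming are illustrative assumptions.

def host_name(switch_ip, switch_port):
    last_octet = int(switch_ip.split(".")[-1])
    rack = (last_octet - 1) // 50             # 10.0.0.1 -> rack 0, 10.0.0.51 -> rack 1
    server = int(switch_port.split("/")[-1])  # "Gi0/12" -> position 12 in the rack
    return f"Drone-R{rack}-S{server}"

print(host_name("10.0.0.51", "Gi0/12"))  # → Drone-R1-S12
```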
You give this host name to DHCP, which then assigns you an IP address, DNS servers, and other information. If you want to hard-wire that "other information", the IP address itself is also easily computed, and you can avoid DHCP completely.
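Computing the address directly follows the same decoding as the host name. This sketch assumes a PREFIX.rack-number.position layout with a hypothetical "10.1" prefix; the prefix and octet scheme are illustrative, not part of the original design.

```python
# Sketch: compute the host's own IP address as PREFIX.<rack>.<position>,
# bypassing DHCP entirely. Reuses the block-of-50 switch-address scheme
# assumed above; "10.1" is a made-up site prefix.

def host_ip(switch_ip, switch_port, prefix="10.1"):
    last_octet = int(switch_ip.split(".")[-1])
    rack = (last_octet - 1) // 50
    position = int(switch_port.split("/")[-1])
    return f"{prefix}.{rack}.{position}"

print(host_ip("10.0.0.51", "Gi0/12"))  # → 10.1.1.12
```

Note that if you go this route, the netmask and DNS servers have to come from somewhere else, as discussed below.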
Now when you replace that computer with a different one, the new one automatically takes on the same IP address as the previous one had without any manual effort.
If you aren't familiar with CDP or LLDP: they provide the management address, port information, and a lot of other good stuff by periodically sending layer-two packets to their connected endpoints. Most, if not all, modern managed switches support one or both, and there are no known security issues that arise from enabling them.
This method is more maintainable: instead of having to update DHCP tables with new MAC addresses for all the servers and switches, you only have to maintain the IP/MAC pairs of the switches and the hostname/IP correlations, leaving the server MAC addresses out entirely.
Of course, this also assumes a rational cabling scheme in your cabinets, one where the switch port numbers correspond to the positions of the servers in the cabinet. The main downside is that systems take longer to come up: DHCP servers respond quickly, but you will have to wait to hear your link-discovery packet. You will also need another method for providing the netmask and the list of DNS servers; the former can be hard-wired, but the latter may need to change over time.
Thanks to Narayan Desai of Argonne National Labs who introduced me to CDP and LLDP a few years back, and inspired these ideas. No doubt people have done this kind of thing before - maybe better. I mentioned this idea at LinuxCon, and people asked me to write it up. So here's a post about it.
What's your reaction to this? Is waiting for an LLDP packet going to take too long? How do you do this at your data center? Do you have a better method you follow?