I wasn't going to mention this at all.
I almost made it. I almost made it to the point where it would be old news, not worthy of an entry.
Almost.
But then, I figured, you guys have already seen me at my absolute worst, yet still some of you remain unrepulsed.
So a reminder is in order.
A reminder of my stupidity.
Got a new server at work last week. I think it was Wednesday. I do my Solaris installations over the network. It is 2006 after all. Installation CDs are so last century.
Anyway, I ran through all of the steps needed to make sure that the boot/installation server was running properly and that it was ready for the new equipment.
- I added the new MAC address to the NIS ethers database.
- Same thing with the IP address in the NIS hosts database.
- I made sure that all of the Solaris 10 CD images were mounted and shared properly.
- I created the proper TFTP entries.
- I had the network people configure the new server's switch port for auto/auto and had them put it in the correct VLAN.
Wrong.
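A side note on the first two checklist items: they're easy to double-check with a script. What follows is a rough Python sketch, not something I ran that day. It assumes you've dumped the ethers and hosts maps to text in the usual file formats (MAC then hostname, and IP then names). The function names, the host `newbox`, and every address in the usage example are made up for illustration.

```python
def normalize_mac(mac):
    """Lowercase and zero-pad each octet, so 8:0:20:A:B:C matches 08:00:20:0a:0b:0c."""
    parts = mac.strip().lower().split(":")
    if len(parts) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(part.zfill(2) for part in parts)


def parse_ethers(text):
    """Map hostname -> normalized MAC from ethers-format text (MAC, then hostname)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        fields = line.split()
        if len(fields) < 2:
            continue
        table[fields[1]] = normalize_mac(fields[0])
    return table


def parse_hosts(text):
    """Map every hostname and alias -> IP from hosts-format text (IP, then names)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) < 2:
            continue
        for name in fields[1:]:
            table[name] = fields[0]
    return table


def preflight(host, mac, ethers_text, hosts_text):
    """Return a list of problems; an empty list means both entries look right."""
    problems = []
    ethers = parse_ethers(ethers_text)
    hosts = parse_hosts(hosts_text)
    if host not in ethers:
        problems.append(f"{host}: no ethers entry")
    elif ethers[host] != normalize_mac(mac):
        problems.append(f"{host}: ethers has {ethers[host]}, expected {normalize_mac(mac)}")
    if host not in hosts:
        problems.append(f"{host}: no hosts entry")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, not the real maps.
    sample_ethers = "0:3:ba:12:34:56 newbox\n"
    sample_hosts = "192.0.2.10 newbox newbox.example.com\n"
    print(preflight("newbox", "00:03:BA:12:34:56", sample_ethers, sample_hosts))
```

Note what a check like this can and can't tell you: it confirms the entries exist and agree with the label on the box, and nothing more.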
I rackmounted the new server, connected its network and console cabling, opened a terminal to the console port, and fired it up.
Watching the boot/installation server's logfiles and snoop output, I saw the RARPs go out from the new server. I saw the proper responses given. Then I watched the TFTP transfer of the boot image take place.
Everything was going very smoothly.
But then the new server's installation failed with the message "Unable to mount remote filesystem."
WTF?
I logged into another server on the network and verified that the installation CD images were indeed mountable. They were, so that wasn't the problem. So why couldn't the new server mount the fucking things?
I must have tried for four hours to get the new server to install. I even killed the rpc.bootparamd and tftpd and mountd processes and restarted them in debug mode, hoping that would shed some light on the problem.
Well, it did, in a way.
The problem was me.
It wasn't so much the output that I was seeing, it was what I wasn't seeing.
I wasn't seeing the boot/install server even trying to share out those CD images.
Finally, I figured it out.
See, up until a month or so ago, I'd used another server for Solaris installations. Up until about a month ago, another server had possession of the IP address handed out by the rpc.bootparamd service.
And that other server, because it had access to the same NIS ethers database as the current boot/installation server, was actually the one trying to serve up the Solaris CD images. Problem was, of course, that the old server didn't have access to those images anymore - they'd been migrated over to the new boot/installation server, along with the IP address used to identify it!
Duh.
So I killed the rpc.bootparamd process on the old server, and everything went as planned from that point on. The new server installed correctly. No thanks to me.
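In hindsight, the clue was sitting in the packet capture the whole time: somebody other than the boot/installation server was answering. Here's a minimal Python sketch of the check I should have made. It assumes you've already pulled the replies out of a capture by hand as (responder IP, client name) pairs. That record shape, the function name, and the addresses in the example are hypothetical; none of this parses real snoop output.

```python
def unexpected_responders(replies, expected_server):
    """Return responder IPs, in first-seen order, that are not the expected server.

    replies: iterable of (responder_ip, client_name) tuples. This is a
    made-up record shape for illustration, filled in by hand from a capture.
    """
    strays = []
    for responder_ip, _client_name in replies:
        if responder_ip != expected_server and responder_ip not in strays:
            strays.append(responder_ip)
    return strays


if __name__ == "__main__":
    # Hypothetical capture: 192.0.2.9 is the real install server,
    # 192.0.2.5 is an old box that never got its daemon turned off.
    observed = [
        ("192.0.2.5", "newbox"),
        ("192.0.2.9", "newbox"),
        ("192.0.2.5", "newbox"),
    ]
    print(unexpected_responders(observed, "192.0.2.9"))
```

Anything that comes back from a check like this is a machine worth logging into before you burn an afternoon.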
That's four hours out of my life that I'd really like to get back.