As part of the recent hardware upgrade to my ZFS file server I replaced the motherboard. I'd never replaced the motherboard on an active Solaris system before and was curious whether it would be at the easy end of the spectrum (like OpenBSD is) or at the impossible end (like any recent version of Windows). This is what I learned.

I should caveat this whole article by saying that the notes I took while doing this upgrade are not very good, and I can't remember certain details because I did all of this a couple of weeks ago. Some of the steps below may be unnecessary or redundant. Some of it might not even make sense. What I'm writing below is what I remember and what I'm piecing together from my notes.

I also want to point out that the steps I followed below would've worked just as well if I were moving the Solaris boot drive out of an old system and into a new one. In the end, upgrading the hardware around Solaris and moving the Solaris drive to a different system are pretty much the same thing.

Preparation

Before turning off the system to do the upgrade I did a little bit of preparation.

Export ZFS Storage Pools

The old and new boards were completely different: different manufacturer, different chipsets, even different CPU vendors. I knew for sure that when the system came up on the new board, the hard drive device paths (ie, /dev/dsk/cXtXdX and /devices/xxx) would not match what they were when the system was shut down. I therefore did a "zpool export" on all pools. This tells Solaris not to look for the pools when the system comes up again. If I hadn't done this step, Solaris would have complained at boot about not being able to find some or all of the drives that make up the pools because their device paths were invalid. You might think that Solaris would just scan all connected drives to find the ones that belong to your pools, but it does not. Solaris caches the device paths that make up your pools in the /etc/zfs/zpool.cache file and reads this file when the system boots. Exporting a pool removes it from that file, so the pool is not automatically imported on boot.
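As a rough sketch, exporting every pool in one pass could look like the following. The helper name export_all_pools is made up for illustration; it relies only on "zpool list" and "zpool export", and it assumes none of the pools is the one the system boots from.

```shell
# Sketch: export every imported pool before shutting down.
# export_all_pools is a hypothetical helper name, not a Solaris command.
export_all_pools() {
    # -H drops the header line, -o name prints only the pool names.
    for pool in $(zpool list -H -o name); do
        if zpool export "$pool"; then
            echo "exported $pool"
        else
            echo "could not export $pool" >&2
        fi
    done
}
```

Run export_all_pools as root just before shutdown; any pool that is still busy is reported on stderr rather than stopping the loop.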

Reinstalling the Boot Blocks

Since the hard drive device paths were all going to change I knew that the system would likely not boot right up when I turned it back on. I assumed that I would have to reinstall the boot blocks onto the system drive in order for Grub to be able to find the OS. I dug around in the Solaris man pages and wrote down this command:

/sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/cXtXdXs1

I wouldn't know exactly which device (cXtXdX) my boot drive would end up being until I booted the system on the new board. I did know that slice #1 (s1) contained the Solaris root partition though.

Reconfiguration Boot

When adding, removing or changing static hardware on a Solaris system you usually want to do a "reconfiguration boot". This tells Solaris to discover new hardware during boot and to recreate the /devices and /dev hierarchies. There are a couple of different ways to initiate a reconfiguration boot, but I chose to create the /reconfigure file while I was prepping the system for shutdown because I knew that way I wouldn't forget to properly execute one of the other methods later on.

touch /reconfigure
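For comparison, here is a small sketch of the flag-file method wrapped in a function, with one of the other methods noted in a comment. The name request_reconfigure and its optional alternate-root argument are assumptions added for illustration.

```shell
# Sketch: request a reconfiguration boot by creating the flag file.
# request_reconfigure is a hypothetical helper; the optional argument is an
# alternate root (for example /a when working from a live CD).
request_reconfigure() {
    root="${1:-/}"
    # ${root%/} strips a trailing slash so "/" becomes "/reconfigure".
    touch "${root%/}/reconfigure" && echo "reconfiguration boot requested under ${root}"
}

# Another way, issued at shutdown time instead of leaving a flag file:
#   reboot -- -r
```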

Booting the System

After upgrading the board I booted the system and began fixing the brokenness.

Fixing Grub

As expected, the system did not boot. In fact I didn't even get the Grub menu. I pulled out an OpenIndiana Live CD and booted it. I used the format(1m) command to get a list of all drives attached to the system and from that list get the device name of the boot drive. I then used the installgrub command from above to reinstall the boot blocks on the drive.

/sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c1t1d0s1

The system now booted into the Grub menu.
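Since the disk name is only known after booting on the new board, a tiny helper that assembles the command from the name reported by format can save a typo. This is only a sketch; installgrub_cmd is a hypothetical name, and it prints the command rather than running it.

```shell
# Sketch: build the installgrub command line for a given disk and slice.
# installgrub_cmd is a hypothetical helper; it only prints the command so it
# can be reviewed before being run as root.
installgrub_cmd() {
    disk="$1"            # e.g. c1t1d0, as reported by format
    slice="${2:-1}"      # root slice number; 1 matches the layout in this article
    echo "/sbin/installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/${disk}s${slice}"
}
```

For example, "installgrub_cmd c1t1d0 1" prints the same command line used above.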

Getting Solaris Booting (Kinda)

Booting Solaris from Grub now resulted in the system freezing shortly after displaying the copyright notice. Booting the system with the -a and -v flags showed me that it was pausing right after asking for the path to the "retire store". After some intense googling, this is the solution I came up with. I booted my live CD, mounted my Solaris install under /a, and ran:

rm -f /a/dev/rdsk/c*
rm -f /a/dev/dsk/c*
rm -f /a/dev/cfg/c*
devfsadm -r /a -p /a/etc/path_to_inst

The rm commands delete all traces of hard drive devices under the /dev hierarchy. The last command causes the entire /devices and /dev hierarchies to be rebuilt and the path_to_inst file to be recreated. Since I was working from the live CD, everything was done relative to /a where the Solaris installation was mounted.
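Because these rm commands are destructive if pointed at the wrong root, here is a sketch of the same steps wrapped in a function with a basic guard. The name rebuild_dev_tree and the guard logic are my own additions, not part of Solaris.

```shell
# Sketch: clear stale disk links under an alternate root and rebuild them.
# rebuild_dev_tree is a hypothetical helper; /a is where the live CD session
# has the installed system mounted.
rebuild_dev_tree() {
    root="${1:-/a}"
    # Refuse to run against the live environment or a path without /dev.
    if [ "$root" = "/" ] || [ ! -d "$root/dev" ]; then
        echo "refusing to touch '$root'" >&2
        return 1
    fi
    rm -f "$root"/dev/rdsk/c* "$root"/dev/dsk/c* "$root"/dev/cfg/c*
    # -r: operate on the alternate root
    # -p: use this path_to_inst file instead of the default /etc/path_to_inst
    devfsadm -r "$root" -p "$root/etc/path_to_inst"
}
```

Called with no argument it works on /a, matching the commands above.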

Really Getting Solaris Booting!

At this point the system no longer froze up after showing the copyright notice. Instead, it teased me by making it a little bit further but then, quicker than you could read, it flashed up a bunch of text on the screen and instantly rebooted itself. I tried a few different tricks to be able to read the text but eventually had to settle on recording the boot up on my camera and playing it back frame by frame.

[Image: Solaris Boot Error]

The very last line was the key. Somewhere, Solaris was still configured to boot from the device path that pointed to the boot slice under the old hardware. Booting from the live CD once again, I used the format(1m) command to determine the proper device path to the boot drive.

# format
[...]
       4. c1t1d0
          /pci@0,0/pci15d9,f580@1f,2/disk@1,0

I took this device string, appended ":b" to it to indicate the root slice was the second slice on the disk (ie, "s1") and edited the /boot/solaris/bootenv.rc file with this information.

# grep bootpath /a/boot/solaris/bootenv.rc
setprop bootpath /pci@0,0/pci15d9,f580@1f,2/disk@1,0:b
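The slice-to-letter mapping is easy to get wrong by one, so here is a small sketch that derives the bootpath value from the physical path and the slice number. The function name bootpath_for is hypothetical; the mapping it assumes is s0 to ":a", s1 to ":b", and so on.

```shell
# Sketch: turn a physical device path plus a slice number into a bootpath.
# bootpath_for is a hypothetical helper. Slice N maps to the Nth letter
# counting from "a" (s0 -> :a, s1 -> :b, ...).
bootpath_for() {
    devpath="$1"     # physical path printed by format
    slice="$2"       # slice number (0-15 assumed, as on x86)
    letter=$(printf '%s' "abcdefghijklmnop" | cut -c "$((slice + 1))")
    [ -n "$letter" ] || { echo "bad slice: $slice" >&2; return 1; }
    echo "${devpath}:${letter}"
}
```

With the path from format above and slice 1, this prints the value shown on the setprop line.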

After editing the file, I updated the boot archive and rebooted.

bootadm update-archive -R /a

Final Steps

At this point the system was booting normally from the hard drive. The last step was to import the ZFS pools using zpool import. ZFS successfully scanned the drives, determined the new paths to the pool members, and mounted all of the file systems.
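A sketch of that last step, assuming the usual "pool: NAME" lines in the output of a bare "zpool import". The helper names importable_pools and import_all_pools are made up for illustration.

```shell
# Sketch: list the pools that "zpool import" can see, then import them all.
# importable_pools and import_all_pools are hypothetical helper names.
importable_pools() {
    # With no arguments, zpool import scans the disks and prints a
    # "pool: NAME" line for each pool it finds (assumed output format).
    zpool import 2>/dev/null | awk '$1 == "pool:" { print $2 }'
}

import_all_pools() {
    for pool in $(importable_pools); do
        zpool import "$pool" && echo "imported $pool"
    done
}
```

"zpool import -a" does the same job in one command; listing first just makes it easier to see what was found before anything is imported.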

Summary

In summary, this is what I did:

  1. Export zpools
  2. Request a reconfiguration boot (touch /reconfigure)
  3. Upgrade hardware
  4. Boot live CD
  5. Install bootblocks
  6. rm path_to_inst and hard drive devices under /dev/{dsk,rdsk,cfg}
  7. devfsadm to rebuild device hierarchies
  8. Update /boot/solaris/bootenv.rc with proper bootpath
  9. Update boot archive and reboot
  10. Import zpools