Project: SNO Leopard
TL;DR:
I installed Openshift 4.12 as a bare metal Single Node Cluster on a 2012 Mac Pro (MacPro5,1 A1289).
Specs:
2x Xeon E5645 (6c,12t ea); 24 vCPUs
128 GB DDR3 1333MHz
1TB NVMe boot drive (SNO)
6x1TB SATA SSD storage (LVM)
Radeon RX580
2x 1G Intel Ethernet (bond active/backup)
Time to complete:
1 week from start to finish. About 15 total hours of hands-on work on evenings/weekends, much of it spent waiting for macOS or OCP installs to finish. Most of the lag time was due to shipping and running into additional hurdles.
Cost Breakdown:
Qty | Product | Cost
--- | --- | ---
1 | 2012 Mac Pro (MacPro5,1 A1289) | $380
1 | Sapphire Pulse Radeon RX580 | $100
4 | 16GB OWC PC10600 DDR3 (64GB total) | $118
1 | COMeap Dual Mini 6 Pin to 8 Pin Cable | $12
4 | OWC 2.5-Inch Drive Sled | $60
1 | Corsair Dual SSD Mounting Bracket | $6
1 | ORICO 5.25 Inch to 2.5 or 3.5 Inch Mount Kit | $12
6 | 1TB Team Group SATA SSD | $0 **
1 | OWC Accelsior PCIe NVMe card | $25
1 | Samsung 970 EVO Plus NVMe | $65
1 | Amazon Basics DVI to HDMI | $8
  | Total | $786
** I already owned these and cannibalized them from my old lab
My target was to keep the cost under $1,000, so mission accomplished. I definitely could have kept costs lower by not paying the OWC/Apple tax and buying off-brand parts, and by returning the RX580 once I was done flashing the firmware. Also, I have plenty of smaller NVMe drives, so I didn't really need a new 1TB NVMe (I just wanted it). But let's say that if I had wanted to do this on the cheap, I probably could have saved another $200 or so.
Summary:
There was a lot of fun and games along the way involving firmware updates and fighting with storage and network on OCP, but I finally got it done with a little help from friends. All in all, I'm very happy with the results.
End result. Now to do something about my cable management under my desk...
A leopard cannot change its spots.
Or can it? This project began because my current living situation is not conducive to having a closet full of enterprise (I use the term loosely) gear creating heat and noise. Also, the current level of tolerance for things in my lab causing the internet connection to go down is essentially nil. Of course, as a Red Hatter I do have access to RHDP (Red Hat Demo Platform), where I can request a sandbox on any of the major hyperscalers. However, as awesome as it is to have essentially carte blanche to build whatever I want in AWS, Azure or GCP, that privilege comes with quite a few restrictions, primarily focused on preventing Red Hatters from running up exorbitant amounts of cloud spend. Chief among them: these sandbox environments are short lived - just 7 days. Not only that, but considering that a huge amount of the work I've been doing putting together demos and content for customers has been focused on virtualization, I would have to use bare metal instances, which are quite expensive. For example, on AWS the cheapest bare metal option costs a little over $4 an hour before you put any workload on it. Add compute, and the cost rapidly hockey sticks. Another thing to consider is that lab environments take time to deploy and configure, and having to redeploy the same lab 4 times a month is simply not a good use of time and resources.
That said, it became apparent to me that I needed a temporary home lab solution, with a whole lot of caveats. I didn't want it to break the bank, but was willing to spend up to $1,000, which seemed a reasonable target. I needed enough compute power to run virtualized workloads, and I needed whatever I landed on to not turn into a rat's nest of cables, blinking lights and screaming CPU fans (it sits in a living room shared by two other people). I considered a few different options, originally trying to run a SNO (Single Node Openshift) VM on my workstation, but the lack of CPU cores meant nested virtualization was out of the question. I thought about deploying a compact cluster from three of the nodes in my existing lab, but quickly nixed that idea because it wouldn't have just been 3 nodes - it would have been 3 nodes plus a router, plus a NAS, plus a managed switch, plus a UPS... it would have been most of the exact same footprint as my old lab, but with two thirds fewer resources.
It became clear that I needed one big beefy box that was powerful enough to run SNO + OCP-V + LVMO, and I needed it not to be an eyesore. I briefly considered a 1U or 2U server, since you can pick up used Dell and HPE gear for a song on eBay, but heat, noise and power would have been an issue - not to mention it would have been an eyesore anyway. I considered building a PC with a high core count and lots of RAM, but the cost quickly rocketed way past what I'm willing to invest in a temporary solution. Eventually I landed on a general list of requirements: a workstation system with a single or dual socket Xeon, at least 16 cores, at least 128 GB of RAM, and room for plenty of SSDs. I looked at a couple of used workstation models from Lenovo and HP and almost pulled the trigger on a ThinkStation D30, but I remembered that my old ThinkStation S30 generated a LOT of heat and a LOT of fan noise. Ultimately, I settled on an old MacPro5,1 for the following reasons:
* inexpensive - all said and done, I spent $786 on this project. I got the Mac Pro itself for $380
* simple - it's a single box, no need for a managed switch, a NAS, a router, the rats nest of cables, multiple power supplies or UPS...
* compact - sure, it's a full size ATX tower, but it fits neatly under my desk
* aesthetics - it's a Mac. It's a beautiful piece of hardware - unlike the horror scene that was my old lab
* quiet - it's damn near silent
* cool - unlike most workstations, the Mac Pro doesn't double as a space heater
* plenty of cores - SNO + OCP-V + ODF LVM needs a minimum of 16 cores. I have 24, which gets me comfortably past the threshold.
* plenty of RAM - 128 GB of RAM is still a LOT of RAM for a single box even in 2023
* plenty of storage - 6TB of SSD is more than half of what my 10 node ODF cluster had in a single box
so why "Sno Leopard"? well... it's a nod to the fact that this Mac is a relic from the days of Snow Leopard, and the fact that I plan on running Single Node Openshift or "SNO" as the operating system.
The Journey
A False Start
The process of getting Openshift installed on this Mac Pro was not easy, to say the least. The very first hurdle I ran into was the fact that the ATI Radeon HD 5770 that came installed in the Mac Pro only had a DVI connector, and I didn't have any DVI cables laying around. Thankfully, things like display cables are typically available on Amazon Prime 1 day shipping, so I was finally able to boot my Mac Pro and see what I had to work with. It shipped to me with 2 Xeon E5645 CPUs (12c/24t total), 64 GB of RAM and a 512GB Samsung SATA SSD installed. When I booted it up, it had a fresh install of macOS El Capitan on it.
After disassembling and inspecting it, I found that it had 4 SATA ports on drive sleds and two more SATA ports in the optical drive enclosure. However, this Mac Pro is limited to SATA II speeds (300MB/s). It is ten years old after all, so that's not surprising. However, Openshift (and Kubernetes in general) requires faster disk than that, or else etcd will incur latency that can severely impact performance. Thankfully, support for NVMe was added with a firmware update alongside High Sierra 10.13.6, but throughput was limited to around 700MB/s and booting from NVMe was not possible. Much faster NVMe performance, as well as the ability to boot from NVMe drives, was added with firmware 144.0.0.0 along with the upgrade to macOS Mojave 10.14, which is where the real fun began.
Along with the release of macOS Mojave, Apple deprecated a ton of old graphics cards in favor of newer cards (mostly Radeon RX and Vega models) that supported their new Metal framework. Annoyingly, you need to have a Metal capable GPU installed in order to trigger the install of macOS Mojave, and thus the firmware upgrade to 144.0.0.0. I tried a couple of ways to work around it, but to no avail. I was even able to install Mojave using the Mojave Patcher for Unsupported Macs (https://dosdude1.com/mojave/), but unfortunately installing the OS that way does not actually trigger the firmware update. Literally the only way to get the firmware is to install a Metal capable GPU. Apple published a list of officially supported GPUs here: https://support.apple.com/en-us/HT208898. I decided on a Sapphire Pulse Radeon RX580, which I got on Amazon for $100. However, I ran into a snag when I tried to install it: the graphics card required an 8 pin power cable and my Mac Pro only came with a 6 pin. I ended up purchasing this double mini 6 pin to 8 pin adapter https://www.amazon.com/dp/B07MV3G9QV?psc=1&ref=ppx_yo2ov_dt_b_product_details which worked perfectly, it just set my project back a day.
With the proper power cable in hand, I was finally able to upgrade my Mac Pro to macOS Mojave, and more importantly, upgraded the boot ROM to firmware 144.0.0.0.
Boot Disk
Now that I had the ability to boot from NVMe (which is where I planned to install OCP due to the latency sensitive nature of etcd), I purchased this OWC Accelsior PCIe adapter https://www.amazon.com/dp/B08LYWDSKJ?ref=ppx_yo2ov_dt_b_product_details&th=1, mainly because OWC products are custom designed to run on Macs, along with this 1TB Samsung 970 EVO Plus https://www.amazon.com/dp/B07MFZY2F2?ref=ppx_yo2ov_dt_b_product_details&th=1, which I verified was compatible on this blog: https://blog.greggant.com/posts/2018/05/07/definitive-mac-pro-upgrade-guide.html.
The NVMe install went off without a hitch, and along with the upgrade to firmware 144.0.0.0 came the speed boost I was after, yielding a much better 1500MB/s as well as the ability to boot from NVMe. I installed macOS Mojave on the NVMe drive and made sure to install all of the available updates. Once I was satisfied with the performance, I booted up a Fedora live disk and wiped the drive, leaving it completely unformatted.
*** NOTE: with the 5770 installed, you have to boot Fedora in basic graphics mode on the Mac Pro or the system hangs on boot for some reason. I'm guessing it doesn't like something about the Mac compatible firmware on the card.
# remove any existing filesystem and partition signatures from the NVMe drive
wipefs -af /dev/nvme0n1
# zero out the first 100MB to clear any leftover boot/partition data (pv just shows progress)
dd if=/dev/zero | pv -tpreb | dd of=/dev/nvme0n1 bs=1M count=100
Storage Array
With my boot drive finally sorted out, I turned my attention to bulk storage. Knowing that I'm limited to SATA II, any el cheapo SSD will do, so I planned on re-using the 1TB SSDs I already had on hand from my old lab, which had worked fine for ODF before. However, Apple being Apple, the drive sleds do not use standard mounting points for disks, so they needed to be replaced. I couldn't just get regular, cheap 3.5" to 2.5" drive bay adapters; I ended up going with these from OWC https://www.amazon.com/dp/B009P4NEKA?psc=1&ref=ppx_yo2ov_dt_b_product_details for the 4 "RAID" slots. Technically, if you have the very expensive and rare 2009 Mac Pro RAID card, you can run SAS drives in these 4 slots without any additional cabling, which is actually pretty slick. However, since it relies on the macOS software RAID utility to provision and format the drives, I am curious whether or not it would be a worthwhile upgrade for a future project. If it unlocked faster than SATA II speeds, I'd say yes, but thus far my research indicates it does not. The only added benefits would be battery backed write cache, which would help prevent data loss/corruption in case of a power outage, and the ability to use SAS drives; without any performance increase I don't think it would be worth the hassle. Honestly, if I really want faster performance on my storage disks, a SATA III card is my best option, but that can quickly turn into a rat's nest of cables inside my sleek, beautiful Mac Pro.
Once I received my pretty blue anodized OWC drive sleds, I installed 4x1TB SSDs in the 4 on board RAID slots without a hitch. I did a little bit of benchmarking and found that they maxed out the bus at 300MB/s sustained reads and writes, and in a RAID 0 config a little over twice that at 750MB/s. I realize I'm not going to get much more than that out of SATA II in any configuration, but considering these disks are only going to be used for mass storage for container and VM workloads, they don't have to be blazing fast; SSD performance is good enough even for virtual machines.
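If you want to reproduce those rough numbers yourself, a quick and dirty check from a Linux live environment looks something like this (the device path and mount point below are placeholders, and these are ballpark sequential numbers rather than a proper fio benchmark):
# sequential read straight off the block device (non-destructive)
sudo hdparm -t /dev/sdb
# sequential write to a file on a mounted SSD, bypassing the page cache
sudo dd if=/dev/zero of=/mnt/ssdtest/bench.tmp bs=1M count=4096 oflag=direct status=progress
sudo rm /mnt/ssdtest/bench.tmp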
After inspecting the Mac Pro's logic board, I discovered there were two more regular SATA ports dedicated to the optical bay, originally for a pair of DVD-R/W drives. The original owner of this Mac Pro had swapped one of the optical drives for a 512GB Samsung SSD in SATA port 6. While there's no real world performance difference between any of the SATA ports once booted, the recommended place to put the boot drive is actually RAID slot 1 (the left most RAID slot as you face the right side of the Mac Pro case with the access panel off), as that is where the boot ROM looks first. If it does not find a disk there, it has to probe all ports until it finds a bootable disk, which adds about a minute to boot time. Also, whoever owned the Mac Pro before me decided to install their boot drive ghetto style, using 3M Command strips to stick the SSD to the bottom of the DVD-R/W drive in the top slot of the optical bay. In reality, it's not going to go anywhere, but given how thoughtfully the modular layout of the Mac Pro was designed, this annoyed me. Oh well - I wasn't going to be using either the DVD-R/W drive or the 512GB SSD. I replaced the two drives with another pair of 1TB SATA SSDs, but I needed a way to mount them properly in a 5.25" drive bay. There are plenty of dual 2.5" to 5.25" drive bay adapters out there that orient a pair of 2.5" disks side by side, but due to the way the power leads for the original optical drives were vertically oriented, the leads were too short to reach two disks mounted side by side, so I needed a way to mount two SSDs vertically. The solution I ended up settling on was these two adapters: https://www.amazon.com/dp/B016498CK0?ref=ppx_yo2ov_dt_b_product_details&th=1 https://www.amazon.com/dp/B005OJFASY?ref=ppx_yo2ov_dt_b_product_details&th=1. The two SSDs mount on top of one another in the 2.5" to 3.5" bay adapter, and the 3.5" to 5.25" bay adapter mounts into the top slot of the optical bay tray on the Mac Pro. The solution is... overkill, to say the least, but it keeps things neat and tidy and thus appeases my OCD. And with that, both the boot disk and mass storage were sorted out.
Before:
After:
Memory
My Mac Pro came with 64 GB of RAM and supports a maximum of 128 GB, so I bought an additional 64 GB of RAM from OWC https://www.amazon.com/dp/B01CTC8K8U?ref=ppx_yo2ov_dt_b_product_details&th=1 (again, OWC products are basically guaranteed to be compatible with Apple hardware). The install is straightforward: just pop the CPU tray out of the bottom of the Mac Pro, install the DIMMs and plug it back in. I must say, putting the CPU and RAM on their own dedicated removable tray is one of the most elegant and easy to service solutions I've ever seen on a computer. Bravo, Apple. If only they had carried this kind of modular, user serviceable design forward into their modern products. My Mac Pro booted right up and recognized all 128 GB of memory.
GPU
I ended up going back and forth on whether or not to use the old ATI Radeon HD 5770 or the newer Sapphire Pulse RX580 card. On the one hand, the older card has a Mac ROM, and thus the boot picker screen works. The RX 580 on the other hand, while compatible with macOS once booted, does not have a Mac EFI compatible firmware so you can't see the boot picker. I went down a rabbit hole trying to figure out if it were possible to reflash the firmware on it to enable the boot picker. The long and short of it is:
Yes, you technically can. No, it's absolutely not worth the hassle. To summarize my findings, there is one shady guy in Poland who runs/ran the site http://www.macvidcards.com who has managed to hack together a ROM for some video cards (specifically, the Sapphire Pulse Radeon RX580 I bought) that enables the boot picker, but it requires desoldering the VBIOS chip from the GPU, replacing it with a double capacity chip, and flashing the firmware with an SPI programmer. A quick Google search reveals a whole lot of angry customers on Reddit posting horror stories about ignored emails, DOA cards and refunds that never came. Not something I'm willing to do for a convenience I'm going to use once in a blue moon. Maybe eventually I will find an already flashed RX580 on eBay (like this one https://www.ebay.com/itm/254740968731?hash=item3b4fbeb51b:g:xqAAAOSwcjdffGcr&amdata=enc%3AAQAIAAAAwJTNXCkC9ZmswgwZmJII6cgxh%2Fcao8X1K1K65WG%2FIDQGuIAWdglj9sXZbrYjL6F1D2Dy1rjkSlSSpV90KFtuDqcWlyDp444q7JhX9%2FoBdWyHVI8VxkGYdLWbQvMAEnaeRZ3zSkGXUoo4zxvPs1vZA6vDMN7vECo2V3tkQ1K01NLyrpjEcG%2BNpkMijI7dS1o5ZGD8eGbTaOBOuNhi1VaA8jLbVofFEHO0sX1M7CT8X%2BnogeISd8RIMqSxkpLsylSieQ%3D%3D%7Ctkp%3ABk9SR8q1pO31YQ), since the card is from 2017 and doesn't hold a candle to newer, more powerful cards - but I don't need it for things like gaming or rendering Final Cut projects like I'd imagine the vast majority of people still using Mac Pros do. And considering the Mac flashed cards cost 2-3 times what I paid for mine, I doubt I'm ever going to bother. In fact, other than experimenting with passing the GPU through to a VM, I doubt I'm going to use it at all.
Anyway, to summarize, I decided to just stick with the RX580 and keep the HD 5770 around in case I ended up needing it for troubleshooting in the future. Thankfully, the Mac Pro's boot firmware follows a logical workflow. With no OS detected on any of the internal disks, it will boot from the first external disk it finds (e.g. a thumb drive). Once an OS is installed, it will boot from the first internal disk - in my case, the NVMe drive - so for the purpose of installing OCP on this beast, it ended up working out fine.
Other mods:
My past experience with Openshift, and specifically with RHCOS installed on bare metal, is that it is a pretty bare bones operating system that supports most server hardware, but isn't really targeted at consumer electronics. Specifically, I've seen the presence of things like WiFi and Bluetooth cards cause issues like failure to boot and kernel panics, so one of the last things I did to prep the Mac Pro for Openshift duty was to remove the Airport and Bluetooth cards, both of which are located on the lower part of the logic board, behind the CPU tray. Worth noting - the screws holding both cards onto the logic board are installed with thread sealant (blue Loctite to be specific), so you must be extremely careful when removing them. Apply firm, even pressure and go slow. Slipping and gouging the logic board would definitely brick it.
***Pro Tip: it's a lot easier to remove these cards with the Mac Pro laying down on its side. Use a long #1 Phillips screwdriver.
View behind the CPU tray. Airport card is the silver rectangle on the bottom left, the bluetooth daughter card is the black card on a standoff on the bottom right.
You have to remove the Airport card, but technically you can get away with leaving the Bluetooth card on the board and simply disconnecting the data cable. The only caveat is that you probably want to put tape or heat shrink tubing over the WiFi antenna leads to prevent the metal connectors from accidentally contacting any part of the board.
But with the Airport and Bluetooth cards removed, I was finally done prepping my Mac Pro. Which brings us to the moment of truth:
The Install
But will it blend? Here's where things got a bit tricky (as though the journey to get to this point had been breezy and hassle free...). A colleague pointed out that the CPUs in my Mac Pro may be too old to support current or future versions of CoreOS. So I downloaded a CoreOS ISO, burned it to a thumb drive and crossed my fingers.
*** Note: on a Mac, in order to burn an ISO to a thumb drive, you can't just 'dd' the image as-is to a disk the way you can on Linux or on Windows with e.g. Rufus. You first have to convert it to an img, e.g.:
hdiutil convert -format UDRW -o /path/to/disk.img /path/to/disk.iso && mv /path/to/disk.img.dmg /path/to/disk.img
# converting a disk image with hdiutil appends a .dmg extension to the resulting file because... #AppleGonnaApp
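For completeness, here's roughly how the converted image then gets onto the thumb drive (diskN below is a placeholder - double check 'diskutil list' before you dd anything):
# identify the USB stick (check the size and name carefully)
diskutil list
# unmount it, then write the raw image; rdiskN is the faster raw device node
diskutil unmountDisk /dev/diskN
sudo dd if=/path/to/disk.img of=/dev/rdiskN bs=1m
diskutil eject /dev/diskN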
So, with the ATI Radeon HD 5770 installed, I could boot the ISO, but it would black screen shortly after the GRUB bootloader screen, so... inconclusive. Pushing the power button on the Mac Pro powered the machine off immediately, as though it were stuck in BIOS or something. Things did not look good... I did have a fallback plan to install either RHEL 8 or ESXi and just run the Mac Pro as a regular hypervisor, but I really wanted to run SNO on bare metal, since Openshift Virtualization was one of the key workloads I wanted to test on this machine. With the Mac Pro stuck on a black screen with fans blowing, the only signs of life I got from it were two blinking ethernet link lights on my router - so maybe it was actually up? I checked the DHCP reservations on my router, found that the Mac Pro was in fact pulling two IP addresses, 192.168.1.60 and 192.168.1.50, pinged it, and confirmed it was indeed up.
With confirmation that RHCOS could in fact boot on my Mac Pro, I logged in to cloud.redhat.com to create an assisted installer ISO, burned it to a thumb drive and crossed my fingers again. With the SNO thumb drive plugged into the Mac Pro, I booted it up and waited, and after a couple of minutes my Mac Pro showed up in the web console. Success!!! Or so I thought... I found out pretty quickly that it could not proceed with the installation with two NICs connected. I planned on using both NICs in an active/backup bond, but struggled to figure out how to configure a bond in the assisted installer. I tried the way you would normally set up a bond in a traditional UPI bare metal install, but it didn't work as expected, and I was advised by a colleague that manually modifying configurations on nodes is not supported. Of course I knew that... but #homelab
Until I could do some research and figure out how to properly set up the bond, I just unplugged the second NIC and proceeded with the install. The cluster installed without a problem and booted right up from the NVMe drive. Lacking a local DNS server on my network, I had to create a bunch of hosts file entries on my Mac Mini to access the cluster once it was up.
So, I was nominally successful insofar as I was able to prove that I could deploy my Mac Pro as a SNO cluster, but I still had some work to do to figure out the networking, storage and virtualization.
The Install, cont'd.
With some trial and error, and some assistance from a colleague, @Brian Josza, I was able to figure out the network configuration. Using the assisted installer, you can provide a static/advanced network configuration through the web UI, but it's not terribly intuitive. You need to know the device name and MAC address for each NIC on your system, and you have to specify them both in the yaml file as well as in the web UI. Ultimately, this is the configuration that worked for me:
interfaces:
  - name: enp9s0
    macAddress: 40:6c:8f:bc:28:09
  - name: enp10s0
    macAddress: 40:6c:8f:bc:34:5d
  - name: bond0
    type: bond
    state: up
    ipv4:
      address:
        - ip: 192.168.1.60
          prefix-length: 24
      enabled: true
    link-aggregation:
      mode: active-backup
      options:
        miimon: '1000'
        primary: enp9s0
      port:
        - enp9s0
        - enp10s0
  - name: enp9s0
    state: up
    mtu: 1500
    type: ethernet
    ipv4:
      enabled: false
  - name: enp10s0
    state: up
    mtu: 1500
    type: ethernet
    ipv4:
      enabled: false
dns-resolver:
  config:
    server:
      - 1.1.1.1
      - 1.0.0.1
routes:
  config:
    - destination: 0.0.0.0/0
      next-hop-address: 192.168.1.1
      next-hop-interface: bond0
With the bonded network configuration finally figured out, I was able to reinstall the cluster. The next thing I needed to figure out was DNS. As I stated before, I have no local DNS server and no plans to deploy one, so with the following /etc/hosts entries in place, I was able to access the cluster:
192.168.1.60 api.snoleopard.cudanet.org
192.168.1.60 oauth-openshift.apps.snoleopard.cudanet.org
192.168.1.60 console-openshift-console.apps.snoleopard.cudanet.org
192.168.1.60 grafana-openshift-monitoring.apps.snoleopard.cudanet.org
192.168.1.60 thanos-querier-openshift-monitoring.apps.snoleopard.cudanet.org
192.168.1.60 prometheus-k8s-openshift-monitoring.apps.snoleopard.cudanet.org
192.168.1.60 alertmanager-main-openshift-monitoring.apps.snoleopard.cudanet.org
192.168.1.60 cdi-uploadproxy-openshift-cnv.apps.snoleopard.cudanet.org
*** the last entry is one I added in addition to the ones the assisted installer provides you with, in order to upload ISO images to OCP-V, which we'll get into later
Then once the cluster was accessible, I created the necessary port forwards to open my cluster to the internet, seen here:
*** yes, I'm aware of the potential security risk of having your API open to the Internet. No, I don't care, because I want to use my SNO cluster as my ACM/ACS hub
And then I created the necessary external DNS entries on Cloudflare (my DNS provider). FWIW, I used to have my domain hosted by GoDaddy, but have since moved to Namecheap for registrar and Cloudflare for DNS and am much happier with the service. Namecheap cost me $12 to move my domain and Cloudflare (for my usage) is free.
I tested to make sure the cluster was accessible from the internet on my iPhone and that proper DNS resolution worked. All was good, so I proceeded with my testing.
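For what it's worth, a quick sanity check from outside the LAN looks something like this (whether the health endpoint answers anonymously depends on your cluster's settings, so treat it as a rough check rather than gospel):
# confirm the public records resolve to my WAN IP
dig +short api.snoleopard.cudanet.org
dig +short console-openshift-console.apps.snoleopard.cudanet.org
# poke the API through the port forward (-k because of the self-signed cert)
curl -k https://api.snoleopard.cudanet.org:6443/healthz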
Once the cluster was available, I needed to figure out storage and virtualization, which took even more trial and error. My initial attempt to deploy the cluster by simply ticking the boxes during setup to deploy both the LVM operator (formerly known as 'ODF LVM', not confusing at all) and Openshift Virtualization did not work. I'm honestly not sure what the problem ended up being, but my cluster was in a state where all six SSDs were consumed by the LVM operator and a storage class was set up, yet PVCs would not bind to a PV. After a bit of research, I thought the problem had to do with the fact that the default storage class 'lvms-vg1' was set to 'volumeBindingMode: WaitForFirstConsumer', but even after creating a second storage class with 'volumeBindingMode: Immediate' I still could not bind PVCs to PVs. I decided to wipe all of the disks, run through a fresh install and manually set up LVM and OCP-V after the fact, but I ran into the same issue. After the 3rd (4th?) reinstall, this time checking the boxes for OCP-V and LVMO again, my cluster came up once more. This time PVCs would bind, but they were taking an awfully long time - upwards of five minutes from the time I went to deploy a VM from a template to the point where the PVCs were bound and started cloning the disk images. Frustrated and defeated, it being after 10PM already, in a last ditch effort to get things working I SSH'd into the node, rebooted it and went to bed.
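For reference, the second storage class I mention above looked roughly like this. The provisioner and device-class parameter below are what a default LVMS install typically uses, so treat them as assumptions and copy them from your existing class ('oc get sc lvms-vg1 -o yaml') if yours differ:
oc apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: lvms-vg1-immediate
provisioner: topolvm.io              # copy from lvms-vg1 if yours differs
parameters:
  topolvm.io/device-class: vg1       # copy from lvms-vg1 if yours differs
volumeBindingMode: Immediate         # the default class uses WaitForFirstConsumer
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF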
Magically, the next morning when I woke up, everything was fine. I tried deploying a Windows VM by uploading an ISO from my workstation, and it worked on the first shot. The PVCs were bound immediately, the VM started right away, and I was able to successfully install Windows 11.
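I used the web console for the upload, but the CLI equivalent via the cdi-uploadproxy route (the one I added to /etc/hosts earlier) looks roughly like this - the DataVolume name, size and ISO path are just placeholders:
# upload a local ISO into a DataVolume through the CDI upload proxy
virtctl image-upload dv win11-iso \
  --size=10Gi \
  --image-path=./Win11.iso \
  --uploadproxy-url=https://cdi-uploadproxy-openshift-cnv.apps.snoleopard.cudanet.org \
  --insecure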
So the lesson here is "when in doubt, reboot". I really don't like the idea of using Windows sysadmin methods to troubleshoot a Linux based container platform, but hey... it worked.
So at long last, Project: SNO Leopard is a success.
Bridged Networking
Enabling bridged networking, although not absolutely necessary, is a very nice thing to have. Using OVN, I've found that you need to do something kind of fiddly to get bridged networking to work under normal circumstances, but it turns out that in a flat network without VLANs it simply is not currently possible - see here: https://access.redhat.com/solutions/6990200
For what it's worth, there is a fix targeted for OCP 4.14 that will enable a flat layer 2 overlay of the OVN br-ex network, which is intended to give VMs bridged access to the same network as the nodes, but that doesn't exactly help me now.
In order to make bridged networking work in my situation, there are essentially 4 options:
- roll back to SDN
- redeploy the cluster without a network bond and put the VM bridge on the second NIC after install
- install another NIC
- just give up on it, and use MetalLB with lots of services.
Option 1, which would have been the simplest choice, unfortunately is not possible because I have no way to disable IPv6 on my home network, and that is a hard requirement for SDN on SNO.
Option 2 (the option I chose to go with) sucks because I lose my bond, but it's probably the easiest choice of the bunch.
Option 3 is not a terrible idea, honestly. But it does add yet another layer of complexity and another cable (or two) to manage under my desk. Also, my router only has 4 ports and 3 are currently used and I'm not about to add a switch. The whole point of moving to a SNO cluster was to reduce footprint and simplify things.
Option 4 is more or less the situation I was in to begin with. My gripe is that while it may work for most things, anything that requires multicast, or that is not terribly well documented, can cause problems. One of the main things I plan to run virtually is Active Directory, and I'm pretty sure a domain controller is not going to handle being NATed well, if at all.
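For reference, once the cluster is redeployed without the bond, Option 2 boils down to something like the following: a Linux bridge on the second NIC via a NodeNetworkConfigurationPolicy (this assumes the Kubernetes NMState operator is installed, and the policy/bridge names are placeholders), plus a NetworkAttachmentDefinition of type cnv-bridge pointing at the bridge so VMs can attach to it (not shown):
oc apply -f - <<'EOF'
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br1-enp10s0-policy
spec:
  desiredState:
    interfaces:
      - name: br1                # bridge name is a placeholder
        type: linux-bridge
        state: up
        ipv4:
          enabled: false         # the bridge itself doesn't need an IP
        bridge:
          options:
            stp:
              enabled: false
          port:
            - name: enp10s0      # the second NIC, freed up by dropping the bond
EOF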
Future plans
For now, I'm calling this project a wrap. I'm going to be setting up my usual stack - MetalLB, ACM, ACS, Openshift Pipelines, Openshift GitOps, etc. - and just enjoying my beautiful, clean and quiet Openshift cluster. In the future, I may consider upgrading it with a SATA III card. There are also a handful of PCIe NVMe cards made specifically for Mac Pros. The PCIe slots on the Mac Pro do not support bifurcation, so multiple NVMe drives are only supported on cards with their own dedicated controllers, but you can get about double the speed out of a dual NVMe card. Even then, I don't think you can get much faster than 3000MB/s given that it's PCIe 2.0, and that assumes you install the card in either PCIe slot 1 or 2, which are x16 slots (slots 3 and 4 are only x4). I've also got my eye on a pair of Xeon X5690 CPUs (the fastest CPU you can install in this Mac Pro), which I can get on eBay for about $75 and which would be a huge boost in single threaded performance (3.47GHz vs. 2.40GHz) https://www.cpubenchmark.net/compare/Intel-Xeon-E5645-vs-Intel-Xeon-X5690/1252vs1314. I'm hesitant to do it because the jump from 80W CPUs to 130W CPUs will likely increase the amount of heat and fan noise this thing puts out, but generally speaking, reviews from other Mac Pro owners who have done the X5690 upgrade have been positive. The fact is, Apple engineered the classic Mac Pro as a really thermally efficient machine, and the only reports of fan noise I've read thus far have to do with an incompatibility between the MacPro4,1 and MacPro5,1 CPU trays - which brings me to a super interesting footnote to end on!
Footnote
For anyone who is interested in going down the same route as me, there may be an even cheaper way to get the same result. Incidentally, the 2009 Mac Pro model (the MacPro4,1) is, for almost all intents and purposes, the same system as the later 2010-2012 Mac Pro (MacPro5,1). Both bear the same model number, A1289. The two are so similar, in fact, that you can actually flash the firmware on a MacPro4,1 to turn it into a MacPro5,1: https://thehouseofmoth.com/turning-a-2009-41-mac-pro-into-a-2010-2012-51-mac-pro-2021-edition/
https://www.ifixit.com/Guide/How+to+Upgrade+the+Firmware+of+a+2009+Mac+Pro+41/98985
This effectively makes your MacPro4,1 a MacPro5,1, the only difference between the two machines being the CPU tray itself. The CPU trays of the 4,1 and 5,1, although physically interchangeable, are not compatible, and installing the CPU tray from the wrong system will result in the fans running full blast. Also, the CPU tray from the MacPro4,1 differs from the 5,1's in that it requires either A) delidded CPUs, or B) the use of spacers between the CPUs and the heat sinks to accommodate the heat spreaders (lids). I wouldn't attempt to delid a CPU myself, but they can be purchased in pairs on eBay https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313&_nkw=delidded+x5690&_sacat=0 - though they cost about double what a regular pair of X5690s would run you.