/posts/ 2026/how-to-break-nixos-with-microvm-nix
Mar 30, 2026
I’ve started using NixOS heavily in my homelab over the last few months. Inspired by some limited experience a year ago and my love of Terraform, I decided to take the plunge with one system in the lab. It eventually ended with the decommissioning of my VMware ESXi host and my Kubernetes cluster.
So now I run a NixOS hypervisor hosting several virtual machines via microvm.nix. The microvms are true virtual machines using QEMU and KVM, but they share the host’s /nix/store via virtiofs, which allows each VM to be essentially stateless and managed by the host’s NixOS configuration.
After the clock change I was looking into why Loki was complaining about my BIND logs being timestamped from the future. That turned out to be a misconfiguration of Alloy, which was parsing the local-time BIND logs as UTC, but it is unrelated to this issue. As part of the diagnosis I restarted ns-02, a secondary DNS server mirroring my primary, and quickly noticed in the logs that the system wouldn’t boot, with an error about a missing file:
Mar 29 19:42:26 hyp-01 systemd[1]: Starting MicroVM 'ns-02'...
Mar 29 19:42:26 hyp-01 microvm@ns-02[3296]: microvm@ns-02: error while loading shared libraries: libnuma.so.1: cannot open shared object file: No such file or directory
Mar 29 19:42:26 hyp-01 systemd[1]: microvm@ns-02.service: Main process exited, code=exited, status=127/n/a
Mar 29 19:42:26 hyp-01 busctl[3333]: Call failed: Failed to pin process 3296: No such process
Mar 29 19:42:26 hyp-01 systemd[1]: microvm@ns-02.service: Control process exited, code=exited, status=1/FAILURE
libnuma seems a very low-level library to be missing, and it’s failing on the host, not in the VM itself. The next step was to run a nixos-rebuild to ensure that the system was up to date with the config and that any missing files were restored.
error: opening file '/nix/store/nbsdqpfzh1jlpmh95s69b3iivfcvv3lh-config.sub-948ae97.drv': No such file or directory
Command 'nix --extra-experimental-features 'nix-command flakes' build --print-out-paths 'github:nikdoof/nixos-homeprod#nixosConfigurations."hyp-01".config.system.build.nixos-rebuild' --refresh --no-link' returned non-zero exit status 1.
Uh oh.
I suspected a disk failure. I’ve been plagued with some very unstable SSDs due to the heavy load of etcd in the Kubernetes cluster, but after a dig into SMART, disk repairs and such, everything looked fine. A search on Google led me down the path of a corrupted /nix/store, but none of the fixes made any difference. nix-store --verify --check-contents --repair reported hundreds of “disappeared” paths and eventually hit a SQLite foreign key constraint error, which prevented automated repair. The next logical step seemed to be a re-install, but I wanted to experiment first rather than just give up.
It probably would have been faster to format the system, but instead I took to asking Claude what it thought, and after about 20 minutes we had solved the issue.
Earlier, while I was trying to debug a particularly annoying DNS problem on ns-02, I remounted the Nix store read-write with mount -o remount,rw /nix/store so I could update some systemd services to test. After I finished my debugging session I didn’t remount it read-only or reboot the microvm, which left the store writable from the VM. That was my first mistake.
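A forgotten read-write remount like this is easy to check for. The following is only a minimal sketch: it parses the standard /proc/mounts format and reports whether a mount point is currently writable. The function names are hypothetical, and the assumption that the store shows up as a mount at /nix/store may not hold for every guest, since the layout depends on how the store share is configured.

```python
import os


def mount_options(mounts_text, mount_point="/nix/store"):
    """Return the option list of the last mount entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry shadows an earlier one
    return options


def is_writable(mounts_text, mount_point="/nix/store"):
    """True if mount_point is mounted read-write, False if read-only or absent."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "rw" in options


if __name__ == "__main__" and os.path.exists("/proc/mounts"):
    with open("/proc/mounts") as f:
        text = f.read()
    if is_writable(text):
        print("WARNING: /nix/store is mounted read-write")
    else:
        print("/nix/store is read-only (or not a separate mount)")
```

Run inside a guest after a debugging session, a check along these lines would have flagged the store that was left writable.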
The second mistake was that I had enabled nix.gc.automatic in a common configuration file. This periodically reviews the files in the Nix store and cleans up any unused files after a defined number of days. In this instance ns-02 ran its GC against the host’s Nix store and deleted files it had no business touching. Because the microvm’s GC ran with write access to the host store, it deleted .drv files (derivation descriptions) that the host still referenced. The Nix store database (/nix/var/nix/db/db.sqlite) retained entries pointing to these now-missing files, leaving the store in an inconsistent state.
Once we discovered that nix.gc had caused the issue, Claude was able to produce a fix for the Nix store that allowed a full nixos-rebuild to work.
The store DB contained entries for paths that no longer existed on disk. The Nix store SQLite DB uses foreign keys, so we temporarily disabled them and removed the entries for the missing files:
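The inconsistency can be measured without changing anything. This is a minimal read-only sketch, assuming the ValidPaths table with id and path columns that the repair script in this post also relies on. The function name find_missing_paths is hypothetical, and the process needs enough permissions to read the database.

```python
import os
import sqlite3


def find_missing_paths(db_path="/nix/var/nix/db/db.sqlite", exists=os.path.exists):
    """Return (id, path) rows from ValidPaths whose path is absent on disk.

    The database is opened read-only, so nothing is modified.
    """
    db = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = db.execute("SELECT id, path FROM ValidPaths").fetchall()
    finally:
        db.close()
    return [(id_, path) for id_, path in rows if not exists(path)]


# Example use on the affected host (as root):
#   missing = find_missing_paths()
#   print(f"Registered but missing: {len(missing)}")
```

A non-empty result means the database still lists store paths that are gone from disk, which is the state described above.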
import os
import sqlite3

db = sqlite3.connect('/nix/var/nix/db/db.sqlite')
c = db.cursor()
# Make sure foreign key enforcement is off for this connection so rows can be deleted in any order
c.execute('PRAGMA foreign_keys = OFF')

# Remove entries for missing paths
c.execute('SELECT id, path FROM ValidPaths')
missing = [(id_, p) for id_, p in c.fetchall() if not os.path.exists(p)]
print(f'Missing paths: {len(missing)}')
if missing:
    # The ids are integers read from the DB, so joining them into the SQL text is safe here
    ids = ','.join(str(id_) for id_, _ in missing)
    c.execute(f'DELETE FROM Refs WHERE referrer IN ({ids}) OR reference IN ({ids})')
    c.execute(f'DELETE FROM DerivationOutputs WHERE drv IN ({ids})')
    c.execute(f'DELETE FROM ValidPaths WHERE id IN ({ids})')

# Null out deriver references pointing to missing files
c.execute('SELECT DISTINCT deriver FROM ValidPaths WHERE deriver IS NOT NULL')
missing_derivers = [r[0] for r in c.fetchall() if not os.path.exists(r[0])]
print(f'Missing deriver refs: {len(missing_derivers)}')
if missing_derivers:
    # Deriver paths are strings, so bind them as parameters instead of formatting them in
    ph = ','.join('?' * len(missing_derivers))
    c.execute(f'UPDATE ValidPaths SET deriver = NULL WHERE deriver IN ({ph})', missing_derivers)

db.commit()
db.close()
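Editing db.sqlite by hand is risky, so a copy of the database is cheap insurance before running a script like the one above. This is a minimal sketch using the backup method of Python's sqlite3 connections (available since Python 3.7). The function name backup_database and the destination path in the usage comment are hypothetical.

```python
import sqlite3


def backup_database(src_path, dest_path):
    """Copy a SQLite database to dest_path with the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # copies every page of the source database
    finally:
        dest.close()
        src.close()
    return dest_path


# Example use (as root, destination path is only an illustration):
#   backup_database('/nix/var/nix/db/db.sqlite', '/root/nix-db-backup.sqlite')
```

If a manual edit makes things worse, the copy gives you something to go back to.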
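Even after cleaning the DB, Nix kept failing because stale .drv files remained on disk from previous evaluations. These files referenced other derivations that had also been deleted. The fix was to remount the store read-write, delete the .drv files along with their database entries, and let the next rebuild recreate them: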
mount -o remount,rw /nix/store
import glob
import os
import sqlite3

db = sqlite3.connect('/nix/var/nix/db/db.sqlite')
c = db.cursor()
c.execute('PRAGMA foreign_keys = OFF')

# Delete every .drv file in the store and collect the DB ids of those that were registered
drvs = glob.glob('/nix/store/*.drv')
ids = []
for drv in drvs:
    c.execute('SELECT id FROM ValidPaths WHERE path = ?', (drv,))
    row = c.fetchone()
    if row:
        ids.append(str(row[0]))
    os.unlink(drv)

# Remove the DB entries that belonged to the deleted .drv files
if ids:
    ph = ','.join(ids)
    c.execute(f'DELETE FROM Refs WHERE referrer IN ({ph}) OR reference IN ({ph})')
    c.execute(f'DELETE FROM DerivationOutputs WHERE drv IN ({ph})')
    c.execute(f'DELETE FROM ValidPaths WHERE id IN ({ph})')

db.commit()
db.close()
print(f'Removed {len(drvs)} drv files')
mount -o remount,ro /nix/store
systemctl restart nix-daemon
nixos-rebuild switch --flake github:youruser/yourrepo#hostname
After the re-run the system was back to a working state.
If you use microvm.nix and share the host store with guests, make sure:
- nix.gc.automatic is disabled on all microvm guests; guest GC has no way to know what the host still needs.
- auto-optimise-store is disabled on guests for the same reason.
- you avoid mount -o remount,rw /nix/store inside guests wherever possible; if you need it for one-off operations, never run Nix commands (especially GC) while the store is writable.