Posted by wasting_time 1 day ago
Roughly ten years ago, my homelab consisted of a dozen virtual machines running on SmartOS. I was not familiar with Illumos, and this was before it had a widely available web UI, but it was simple enough to use that these challenges didn't matter much. SmartOS was designed to boot reliably from USB flash storage, allowed me to use all my SATA ports for VM storage, and was my first "immutable" operating system. The primary focus on ZFS storage was another great quality of SmartOS.
Two moves and several years later, it was time to rebuild the lab, and I decided to go with Proxmox because it had decent ZFS support. Experience with Proxmox has been very good too. The GUI, many more virtualization features (in addition to the key ones I care about), and better hardware support through the Linux kernel have kept me on Proxmox for a long time.
Customizing my Proxmox installation always gave me anxiety. How could I defend my hypervisor from configuration drift? I wished there could be an immutable version of Proxmox.
Later on, I learned about govulncheck, which offers a novel dynamic/static analysis hybrid approach to vulnerability management. Nothing else out there does this (without teaming up with some huge company). I began to think that I should favor software solutions based on golang.
Ultimately, Incus (and IncusOS) fit this need very well. My IncusOS hosts are excellent, and I'm glad I can run Incus itself on most Linux distros - including NixOS!
I'll keep a small Proxmox host around for experimenting with new kernel features (Intel GVT-g / SR-IOV graphics) and old operating systems like Windows XP or anything else that needs special QEMU options.
The VM feature of Incus is based on QEMU/KVM, so there's actually no need to keep Proxmox around unless you really want to keep a host or cluster for experimentation with the Proxmox environment. With some configuration you can get SR-IOV and older operating systems working as well.
There's an entire section about allocating GPUs to containers or VMs here: https://linuxcontainers.org/incus/docs/main/reference/device...
You can do the same with USB devices, NICs, InfiniBand adapters, and so on (as can be seen below and above the GPU part in the documentation).
For SR-IOV with VFs on a virtual machine the CLI command should look something like:
incus config device add <instance name> <device name> gpu gputype=sriov pci=<pci address>
https://linuxcontainers.org/incus/docs/main/reference/device...
But the possibility of just passing an entire GPU through to a virtual machine or container might be even more interesting:
incus config device add <instance name> <device name> gpu gputype=physical pci=<pci address>
https://linuxcontainers.org/incus/docs/main/reference/device...
Note that there's a possibility you'll need to play with the parameters a bit. All are mentioned in the docs.
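If you want to script the two commands above, a minimal Python sketch could assemble the argument list and leave execution to the caller. The helper name `gpu_device_add_cmd` and the example instance, device, and PCI values are hypothetical. The sketch only accepts the `gputype=physical` and `gputype=sriov` forms quoted in this thread, so check the docs for the other options.

```python
import shlex

# gputype values quoted in this thread; the Incus docs describe further options.
ALLOWED_GPUTYPES = ("physical", "sriov")


def gpu_device_add_cmd(instance, device, gputype, pci):
    """Build (but do not run) an `incus config device add ... gpu` command."""
    if gputype not in ALLOWED_GPUTYPES:
        raise ValueError(f"unsupported gputype for this sketch: {gputype!r}")
    return [
        "incus", "config", "device", "add",
        instance, device, "gpu",
        f"gputype={gputype}",
        f"pci={pci}",
    ]


# Hypothetical instance name, device name, and PCI address, for illustration only.
cmd = gpu_device_add_cmd("win11", "gpu0", "physical", "0000:01:00.0")
print(shlex.join(cmd))
```

On a host that actually has Incus installed, the list can be handed to `subprocess.run(cmd, check=True)`. Building the list first makes it easy to review or log the exact command before anything touches the instance.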
The CLI is first class in Proxmox; I use the qm command for managing VMs all the time. The networking is also just a file in `/etc/network/interfaces` that I modify with vim as needed.
[1]: https://du.nkel.dev/blog/2026-05-16_rootless_docker_virtiofs...
Also, the mentions and requirements relating to AI in the article sound like they are from another world. Did things really come to this? Even if they have, one can still snapshot Proxmox VMs as well as the host (ZFS).
I don't even do that, I go into a shell and run qm commands for more complicated things. And for anything I ask an agent to do, it goes straight to qm and other CLI tools as well.
Weird.
All valid comments about the fact that Proxmox is not limited to clicking around in the web UI.
It has qm, pct, config files, a REST API, Terraform providers, and Ansible workflows. My point is not that Proxmox cannot be automated. It is that, for my setup, I wanted the reproducible configuration itself to be the source of truth.
Even with that automation, state drift can still creep in when debugging means running one quick command, especially if an agent is allowed to execute imperative fixes that never make it back into the automation framework.
The thing I care about is not buttons versus commands, but whether I can rebuild the host from version-controlled text files and know that every important change is captured there.
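One way to check that property is to compare the live files against their version-controlled copies. Here is a minimal Python sketch, assuming a directory inside the repository that mirrors the host's paths and holds nothing else. `find_drift` and the layout are hypothetical, and this only detects drift in tracked files; it does not rebuild anything.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_drift(repo_dir, live_root):
    """Compare every file tracked under repo_dir with its counterpart under live_root.

    Returns {relative_path: "missing" | "modified"} for tracked files that differ.
    Files that exist only on the live host are not reported.
    """
    repo_dir, live_root = Path(repo_dir), Path(live_root)
    drift = {}
    for tracked in sorted(repo_dir.rglob("*")):
        if not tracked.is_file():
            continue
        rel = tracked.relative_to(repo_dir)
        live = live_root / rel
        if not live.is_file():
            drift[rel.as_posix()] = "missing"
        elif sha256_of(live) != sha256_of(tracked):
            drift[rel.as_posix()] = "modified"
    return drift
```

With a hypothetical `host-config/etc/network/interfaces` mirroring `/etc/network/interfaces`, calling `find_drift("host-config", "/")` would list the tracked files whose live copies were changed by a quick imperative fix. An empty result only says the tracked files match, not that the whole host is reproducible.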
Also, do they get PBS using ZFS snapshots? Do they get HA, live migration, shared storage, easy Ceph, easy snapshots, quick cloning? Do you really want to migrate a VM from one node to another using the command line when you're in some serious situation?
Sure, for a homelab this might be OK, but the UI does make things easy for a reason, and it's not a gimmick.
I agree on a lot of the points, though. I just set up a second cluster and it took over 3 work days because of how much repetitive work is needed to do so. To be able to just take a file with instructions, adjust it a bit and deploy would be so much easier.
I searched the documentation, but it wasn't really clear what its live migration and ZFS migration story is. When I asked Claude to research it, it told me that Incus supports live migration via ZFS snapshot replication, which is exactly what I'm looking for. I implemented a Ganeti storage driver that does the same thing and am just getting ready to start testing it, but if Incus supports it I might look at moving in that direction.
Anyone use Incus live migration with ZFS?
Like... this... this is not great documentation (I know I know, contribute myself): https://wiki.nixos.org/wiki/Incus
Within a few lines:
> To provide non-root access to the Incus server, you will want to add your user to the incus-admin group. Don't forget to reboot.
I mean, I get that they probably mean /etc/group, but going on from there, there are plenty of examples of "just change the config to use x" or similar.
Uhh, whut? It provides a button-y interface, but you can do everything via config files and `pct` on the command line if you prefer. I know that’s not full nix-style declarative, but you don’t have to mislead to sell me on the advantages of declarative infra.
Look for Terraform providers and you'll pretty much only find things to declare VMs and a few other resources around running them, but not a lot to define infrastructure, networking, firewalls, etc.
I would love to know more about how you do this, particularly the deploy part. I'm considering moving away from Ansible, but haven't had the time to dedicate to exploring a similar Nix experience.