3  ZFS

Note

Two pools — safe (NVMe) for application data, tank (SAS) for bulk media. One dataset per service.

3.1 Why ZFS

  • Copy-on-write snapshots before every deploy; they are created near-instantly and initially consume no extra space
  • Per-dataset properties: compression, quotas, reservations, mountpoints (see the sketch after this list)
  • zrepl for automated snapshot management and replication
  • Alternative considered: Btrfs (ships in-tree, but tooling and RAID maturity lag behind ZFS for this use case)
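
A minimal sketch of what per-dataset tuning looks like; the dataset names and property values below are illustrative, not the recorded configuration:

    # Hypothetical datasets and values; tune per service.
    zfs set compression=lz4 safe/authentication    # transparent compression
    zfs set quota=200G tank/memory                 # cap bulk-media growth
    zfs set reservation=10G safe/forge             # guarantee space for the service
    zfs get compression,quota,reservation safe/authentication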

3.2 Pool topology

  • safe: NVMe mirror for low-latency application state (databases, config, auth data)
  • tank: SAS mirror for bulk media (photos, videos, music, books)
  • Separation lets you make different durability and performance tradeoffs per workload (pool creation is sketched below)
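
A sketch of how the two mirrors might be created; the device paths are placeholders and ashift=12 is an assumption about the disks, not the recorded commands:

    # Placeholder device names; use stable /dev/disk/by-id paths in practice.
    zpool create -o ashift=12 safe mirror /dev/nvme0n1 /dev/nvme1n1
    zpool create -o ashift=12 tank mirror /dev/disk/by-id/scsi-DISK_A /dev/disk/by-id/scsi-DISK_B
    zpool status safe tank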

3.3 Dataset conventions

  • One dataset per service, mounted at /zfs/{pool}/{service} (e.g., /zfs/safe/authentication, /zfs/tank/memory)
  • Quadlet pod units use RequiresMountsFor=/zfs/... to express the dependency; systemd won't start the pod until the dataset is mounted (see the sketch after this list)
  • Subdirectories within a dataset for volume separation (e.g., /zfs/safe/forge/data, /zfs/safe/forge/backups)
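
A minimal sketch of the convention for one service, reusing forge from the examples above; the Quadlet file path and its contents are illustrative:

    # Dataset per service; the mountpoint property yields /zfs/{pool}/{service}
    # instead of the default /{pool}/{service}.
    zfs create -o mountpoint=/zfs/safe/forge safe/forge
    mkdir -p /zfs/safe/forge/data /zfs/safe/forge/backups

    # Matching dependency in the Quadlet pod unit (illustrative file):
    cat > ~/.config/containers/systemd/forge.pod <<'EOF'
    [Unit]
    # systemd will not start the pod until the dataset is mounted
    RequiresMountsFor=/zfs/safe/forge

    [Pod]
    # pod definition continues here
    EOF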

3.4 Permissions and rootless Podman

  • Service users own their dataset directories; UIDs are assigned statically (2000–2013) and baked into the instance image
  • Rootless Podman maps container UIDs into the service user’s subuid range
  • podman unshare chown handles initial ownership setup; after that, the container's internal root maps to the service user's UID on the host (sketched below)
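
A sketch of the initial ownership flow, assuming a hypothetical service user forge with a standard subuid range:

    # /etc/subuid grants the service user a range for container UID mapping,
    # e.g. (illustrative): forge:100000:65536

    # As root: hand the dataset to the service user once.
    chown -R forge:forge /zfs/safe/forge

    # As the service user: chown any paths a non-root container UID must own;
    # the in-container UID 999 here is hypothetical. Container UID 0 needs no
    # chown, since it maps to the service user itself.
    podman unshare chown -R 999:999 /zfs/safe/forge/data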

3.5 Snapshots and deploy safety

  • pyinfra deploy tasks can snapshot datasets before service restarts
  • Rollback is a zfs rollback to the pre-deploy snapshot (see the sketch after this list)
  • zrepl handles periodic snapshots and pruning independently of deploys
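
The underlying commands such a deploy task would wrap; the snapshot and unit names are illustrative:

    # Before the service restart:
    zfs snapshot safe/forge@pre-deploy

    # If the deploy misbehaves, stop the pod and roll back. Rollback fails
    # if newer snapshots exist unless -r is passed, which destroys them.
    systemctl --user stop forge-pod.service
    zfs rollback safe/forge@pre-deploy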