Merged
16 commits
2 changes: 1 addition & 1 deletion README.md
@@ -77,4 +77,4 @@ Thank you so much!

## Repo Activity

![Alt](https://repobeats.axiom.co/api/embed/53ea60503d80f77590f52ac0e983b2b8af47e20a.svg "Repobeats analytics image")
![Alt](https://repobeats.axiom.co/api/embed/d9565d6d1ed8222a5da5fedf25c18a9c8beab382.svg "Repobeats analytics image")
2 changes: 1 addition & 1 deletion cmd/proxsave/helpers_test.go
@@ -193,7 +193,7 @@ func TestFormatDuration(t *testing.T) {
{30 * time.Second, "30.0s"},
{59 * time.Second, "59.0s"},
{60 * time.Second, "1.0m"},
{90 * time.Second, "1.5m"},
{time.Minute + 30*time.Second, "1.5m"},
{60 * time.Minute, "1.0h"},
{90 * time.Minute, "1.5h"},
}
169 changes: 165 additions & 4 deletions docs/RESTORE_GUIDE.md
@@ -323,6 +323,7 @@ Phase 13: pvesh SAFE Apply (Cluster SAFE Mode Only)
└─ Offer to apply datacenter.cfg via pvesh

Phase 14: Post-Restore Tasks
├─ Optional: Apply restored network config with rollback timer (requires COMMIT)
├─ Recreate storage/datastore directories
├─ Check ZFS pool status (PBS only)
├─ Restart PVE/PBS services (if stopped)
@@ -709,7 +710,8 @@ Cluster backup detected. Choose how to restore the cluster database:

**Post-restore actions (SAFE mode)**:
After export, the workflow offers interactive options to apply configurations via `pvesh`:
1. **VM/CT configs**: Scans exported configs and applies them via `pvesh set /nodes/<node>/qemu/<vmid>/config`
1. **VM/CT configs**: Scans exported configs (under `/etc/pve/nodes/<node>/...`) and applies them via `pvesh set /nodes/<node>/qemu/<vmid>/config`
- If the target node hostname differs from the hostname stored in the backup (common after hardware migration / reinstall), ProxSave detects the mismatch and prompts you to select the exported node directory to import from (instead of silently reporting “No VM/CT configs found”).
2. **Storage configuration**: Applies `storage.cfg` entries via `pvesh set /cluster/storage/<id>`
3. **Datacenter configuration**: Applies `datacenter.cfg` via `pvesh set /cluster/config`
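
The node-mismatch prompt described above needs a per-node count of exported VM/CT configs. A minimal sketch of that grouping follows, assuming the exported tree mirrors the standard PVE layout (`nodes/<node>/qemu-server/<vmid>.conf` and `nodes/<node>/lxc/<vmid>.conf`). The names `nodeCounts` and `countConfigsByNode` are hypothetical and are not taken from ProxSave's source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nodeCounts holds how many VM and container configs were found for one node.
type nodeCounts struct {
	Qemu int
	LXC  int
}

// countConfigsByNode groups config paths by exported node name.
// Only paths ending in nodes/<node>/qemu-server/<id>.conf or
// nodes/<node>/lxc/<id>.conf are counted; everything else is ignored.
func countConfigsByNode(paths []string) map[string]nodeCounts {
	out := make(map[string]nodeCounts)
	for _, p := range paths {
		parts := strings.Split(strings.Trim(p, "/"), "/")
		n := len(parts)
		if n < 4 || parts[n-4] != "nodes" || !strings.HasSuffix(parts[n-1], ".conf") {
			continue
		}
		node, kind := parts[n-3], parts[n-2]
		c := out[node]
		switch kind {
		case "qemu-server":
			c.Qemu++
		case "lxc":
			c.LXC++
		default:
			continue // unknown subdirectory: skip this path
		}
		out[node] = c
	}
	return out
}

func main() {
	counts := countConfigsByNode([]string{
		"etc/pve/nodes/pve-old/qemu-server/100.conf",
		"etc/pve/nodes/pve-old/qemu-server/101.conf",
		"etc/pve/nodes/pve-old/lxc/200.conf",
		"etc/pve/storage.cfg",
	})
	nodes := make([]string, 0, len(counts))
	for name := range counts {
		nodes = append(nodes, name)
	}
	sort.Strings(nodes)
	for i, name := range nodes {
		c := counts[name]
		fmt.Printf("[%d] %s (qemu=%d, lxc=%d)\n", i+1, name, c.Qemu, c.LXC)
	}
}
```

Matching on the path tail rather than a fixed prefix keeps the sketch independent of where the export directory is rooted.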

Expand All @@ -722,6 +724,7 @@ Each action prompts for confirmation before execution.
- Unmounts `/etc/pve` FUSE filesystem
- Writes directly to `/var/lib/pve-cluster/config.db`
- Restarts services with restored configuration
- Avoids restoring files under `/etc/pve/*` while pmxcfs is stopped/unmounted (to prevent "shadowed" writes on the underlying disk). Those files are expected to come from the restored `config.db`.
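
One way to express the "do not extract under `/etc/pve`" rule is a predicate with the shape `func(entryName string) bool`, which is the signature this change set adds as `skipFn` in `docs/RESTORE_TECHNICAL.md`. The sketch below is an assumption about how such a predicate could look. The name `skipPVEEntries` is hypothetical, and whether ProxSave wires the rule through `skipFn` is not confirmed by the text.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// skipPVEEntries reports whether an archive entry is /etc/pve itself or
// lives below it. Entry names are normalized first, so "etc/pve/x",
// "./etc/pve/x" and "/etc/pve/x" are all treated the same way.
func skipPVEEntries(entryName string) bool {
	clean := path.Clean("/" + entryName)
	return clean == "/etc/pve" || strings.HasPrefix(clean, "/etc/pve/")
}

func main() {
	for _, name := range []string{
		"./etc/pve/storage.cfg",
		"etc/pve",
		"etc/pve-extra/notes.txt",
		"var/lib/pve-cluster/config.db",
	} {
		fmt.Printf("%-32s skip=%v\n", name, skipPVEEntries(name))
	}
}
```

Checking for the trailing slash avoids accidentally skipping sibling paths that merely share the `etc/pve` prefix.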

**When to use**:
- Complete disaster recovery
@@ -1348,6 +1351,21 @@ These configurations are included in every backup and can be restored using **th
Apply all VM/CT configs via pvesh? (y/N): y
```

**If the node name changed** (example: backup from `pve-old`, restore on `pve-new`), ProxSave prompts for the exported source node:
```
SAFE cluster restore: applying configs via pvesh (node=pve-new)

WARNING: VM/CT configs in this backup are stored under different node names.
Current node: pve-new
Select which exported node to import VM/CT configs from (they will be applied to the current node):
[1] pve-old (qemu=12, lxc=3)
[0] Skip VM/CT apply
Choice: 1

Found 15 VM/CT configs for exported node pve-old (will apply to current node pve-new)
Apply all VM/CT configs via pvesh? (y/N): y
```

6. **Confirm and watch progress**:
```
Applied VM/CT config 100 (webserver)
@@ -1639,6 +1657,53 @@ Backup source: Proxmox Virtual Environment (PVE)
Type "yes" to continue anyway or "no" to abort: _
```

### 4. Network Safe Apply (Optional)

If the **network** category is restored, ProxSave can optionally apply the
new network configuration immediately using a **transactional rollback timer**.

**Important (console recommended)**:
- Run the live network apply/commit step from the **local console** (physical console, IPMI/iDRAC/iLO, Proxmox console, or hypervisor console), not from SSH.
- If the restored network config changes the management IP or routes, your SSH session will drop and you may be unable to type `COMMIT`.
- In that case, ProxSave will treat the lack of `COMMIT` as “not confirmed” and will restore the previous network settings (rollback).

**How it works**:
- On live restores (writing to `/`), ProxSave **stages** network files first under `/tmp/proxsave/restore-stage-*` and does **not** overwrite `/etc/network/*` during archive extraction.
- After extraction, ProxSave performs a prevention-first **staged install**: it writes the staged files to disk (no reload), runs safe NIC repair + preflight validation, and **rolls back automatically** if validation fails (leaving the staged copy for review).
- If rollback backup creation fails (or ProxSave is not running as root), ProxSave keeps network files staged and avoids writing to `/etc`.
- When you choose to apply live, ProxSave (re)validates and reloads networking inside the rollback timer window.
- ProxSave arms a local rollback job **before** applying changes.
- Rollback restores **only network-related files** from a dedicated archive under `/tmp/proxsave/network_rollback_backup_*`, so it does not undo other restored categories.
- Rollback also prunes network config files that were **created after** the backup (e.g. extra files under `/etc/network/interfaces.d/`), so the system returns to the exact pre-restore state.
- You have **180 seconds** to type `COMMIT`.
- If `COMMIT` is not received in time, ProxSave triggers the rollback and restores the pre-restore network configuration.
- If the network-only rollback archive is not available, ProxSave prompts before falling back to the full safety backup (or skipping live apply).

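This protects SSH/GUI access during network changes: if the new configuration cuts off your session, the missing `COMMIT` causes the previous network settings to be restored.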

**Health checks**:
- After applying changes, ProxSave runs local checks (SSH route if available, default route, link state, IP addresses, gateway ping, DNS config/resolve, local web UI port)
- On PVE systems, additional checks are included for cluster networking: `/etc/pve` (pmxcfs) mount status, `pve-cluster` / `corosync` service state, and `pvecm status` quorum
- The result is shown to help decide whether to type `COMMIT`
- Diagnostics are saved under `/tmp/proxsave/network_apply_*` (snapshots `before.txt` / `after.txt` / `after_rollback.txt` when relevant, `health_before.txt` / `health_after.txt`, `preflight.txt`, `plan.txt`, and `ifquery_*`)

**NIC name repair**:
- If physical NIC names changed after reinstall (e.g. `eno1` → `enp3s0`), ProxSave attempts an automatic mapping using backup network inventory (permanent MAC / MAC / PCI path / udev IDs like `ID_PATH`, `ID_NET_NAME_PATH`, `ID_NET_NAME_SLOT`, `ID_SERIAL`)
- When a safe mapping is found, `/etc/network/interfaces` and `/etc/network/interfaces.d/*` are rewritten before applying the network config
- If you skip live network apply, ProxSave may still install the staged config to disk (no reload) after safe NIC repair + preflight; if validation fails, it rolls back and keeps the staged copy.
- If a mapping would overwrite an interface name that already exists on the current system, ProxSave prompts before applying it (conflict-safe)
- If persistent NIC naming rules are detected (custom udev `NAME=` rules or systemd `.link` files), ProxSave warns and prompts before applying NIC repair to avoid conflicts with user-intended naming
- A backup of the pre-repair files is stored under `/tmp/proxsave/nic_repair_*`
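
As an illustration of the mapping step, the sketch below matches interfaces from the backup inventory to the current system by MAC address only. The PCI path and udev identifiers mentioned above, as well as the conflict and persistent-naming prompts, are deliberately left out. The type `nic` and the function `planRenames` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// nic describes one interface by name and (permanent) MAC address.
type nic struct {
	Name string
	MAC  string
}

// planRenames returns a map from backup interface name to current
// interface name for every NIC whose MAC matches but whose name differs.
// Entries without a MAC are ignored.
func planRenames(backup, current []nic) map[string]string {
	byMAC := make(map[string]string, len(current))
	for _, c := range current {
		if c.MAC != "" {
			byMAC[strings.ToLower(c.MAC)] = c.Name
		}
	}
	plan := make(map[string]string)
	for _, b := range backup {
		if b.MAC == "" {
			continue
		}
		if newName, ok := byMAC[strings.ToLower(b.MAC)]; ok && newName != b.Name {
			plan[b.Name] = newName
		}
	}
	return plan
}

func main() {
	backup := []nic{{"eno1", "AA:BB:CC:00:11:22"}, {"eno2", "aa:bb:cc:00:11:33"}}
	current := []nic{{"enp3s0", "aa:bb:cc:00:11:22"}, {"eno2", "aa:bb:cc:00:11:33"}}
	for oldName, newName := range planRenames(backup, current) {
		fmt.Printf("%s -> %s\n", oldName, newName)
	}
}
```

Interfaces whose name did not change are omitted from the plan, so an empty result means no rewrite is needed.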

**Preflight validation**:
- After NIC repair, ProxSave runs a **gate** validation of the ifupdown configuration before reloading networking (e.g. `ifup -n -a` / `ifup --no-act -a` / `ifreload --syntax-check -a`)
- If validation fails, live apply is aborted and the validator output is saved under `/tmp/proxsave/network_apply_*/preflight.txt`
- Additionally (diagnostics-only), ProxSave can run `ifquery --check -a` **before and after apply** to show how the runtime state matches the target config. Its output is saved under `/tmp/proxsave/network_apply_*/ifquery_*`. Note that `ifquery --check` can show `[fail]` **before apply** even when the config is valid (because the running state still reflects the old config).
- On staged installs/applies, a failed preflight triggers an **automatic rollback of network files** (no prompt), returning to the pre-restore state and keeping the staged copy for review.

**Result reporting**:
- If you do not type `COMMIT`, ProxSave completes the restore with warnings and reports that the original network settings were restored (including the current IP, when detectable), plus the rollback log path.

### 4. Hard Guards

**Path Traversal Prevention**:
@@ -2002,9 +2067,105 @@ zfs list
# If ZFS, import pool
zpool import <pool-name>

# If directory, create it
mkdir -p /mnt/datastore/{.chunks,.lock}
chown backup:backup /mnt/datastore -R
# If directory-based datastore (non-ZFS), verify permissions for backup user
# NOTE:
# - On live restores, ProxSave stages PBS datastore/job configuration first under `/tmp/proxsave/restore-stage-*`
# and applies it safely after checking the current system state.
# - If a datastore path looks like a mountpoint location (e.g. under `/mnt`) but resolves to the root filesystem,
# ProxSave will **defer** that datastore definition (it will NOT be written to `datastore.cfg`), to avoid ending up
# with a broken datastore entry that blocks re-creation on a new/empty disk. Deferred entries are saved under
# `/tmp/proxsave/datastore.cfg.deferred.*` for manual review.
# - ProxSave may create missing datastore directories and fix `.lock`/ownership, but it will NOT format disks.
# - To avoid accidental writes to the wrong disk, ProxSave will skip datastore directory initialization if the
# datastore path looks like a mountpoint location (e.g. under /mnt) but resolves to the root filesystem.
# In that case, mount/import the datastore disk/pool first, then restart PBS (or re-run restore).
# - If the datastore path is not empty and contains unexpected files/directories, ProxSave will not touch it.
ls -ld /mnt/datastore /mnt/datastore/<DatastoreName> 2>/dev/null
namei -l /mnt/datastore/<DatastoreName> 2>/dev/null || true

# Common fix (adjust to your datastore path)
chown backup:backup /mnt/datastore && chmod 750 /mnt/datastore
chown -R backup:backup /mnt/datastore/<DatastoreName> && chmod 750 /mnt/datastore/<DatastoreName>
```
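
The "looks like a mountpoint but resolves to the root filesystem" check mentioned in the notes above can be approximated by comparing device IDs. The following is a sketch under stated assumptions: it targets Linux, uses `stat` device numbers only, and does not handle bind mounts or filesystems with subvolumes specially. `deviceOf` and `resolvesToRootFS` are hypothetical names, not ProxSave functions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// deviceOf returns the device ID of p, or of its nearest existing
// ancestor when p does not exist yet.
func deviceOf(p string) (uint64, error) {
	for {
		fi, err := os.Stat(p)
		if err == nil {
			st, ok := fi.Sys().(*syscall.Stat_t)
			if !ok {
				return 0, fmt.Errorf("no stat data for %s", p)
			}
			return uint64(st.Dev), nil
		}
		if !os.IsNotExist(err) {
			return 0, err
		}
		parent := filepath.Dir(p)
		if parent == p {
			return 0, err
		}
		p = parent
	}
}

// resolvesToRootFS reports whether p (or its nearest existing ancestor)
// is on the same device as "/".
func resolvesToRootFS(p string) (bool, error) {
	rootDev, err := deviceOf("/")
	if err != nil {
		return false, err
	}
	dev, err := deviceOf(filepath.Clean(p))
	if err != nil {
		return false, err
	}
	return dev == rootDev, nil
}

func main() {
	onRoot, err := resolvesToRootFS("/mnt/datastore/Data1")
	fmt.Println("on root filesystem:", onRoot, "err:", err)
}
```

If the result is true for a path under `/mnt`, the datastore disk is most likely not mounted, which is the situation in which the notes say ProxSave defers the datastore definition.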

---

**Issue: "Bad Request (400) unable to read /etc/resolv.conf (No such file or directory)"**

**Cause**: `/etc/resolv.conf` is missing or a broken symlink. This can happen after a restore if a previous backup contained an invalid symlink (e.g. pointing to `../commands/resolv_conf.txt`), or if the target system uses `systemd-resolved` and the expected `/run/systemd/resolve/*` files are not present.

**Solution**:
```bash
ls -la /etc/resolv.conf
readlink /etc/resolv.conf 2>/dev/null || true

# If the link is broken or points to commands/resolv_conf.txt, replace it:
rm -f /etc/resolv.conf

if [ -e /run/systemd/resolve/resolv.conf ]; then
ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
elif [ -e /run/systemd/resolve/stub-resolv.conf ]; then
ln -s /run/systemd/resolve/stub-resolv.conf /etc/resolv.conf
else
# Fallback: static DNS (adjust to your environment)
printf "nameserver 1.1.1.1\nnameserver 8.8.8.8\noptions timeout:2 attempts:2\n" > /etc/resolv.conf
chmod 644 /etc/resolv.conf
fi
```

Note: newer ProxSave versions attempt to auto-repair `/etc/resolv.conf` during restore when the `network` category is selected.

---

**Issue: "Bad Request (400) parsing /etc/proxmox-backup/datastore.cfg (expected section properties)"**

**Cause**: In PBS, properties inside a `datastore:` section must be indented. A malformed file (often from manual edits or very old configs) will prevent PBS from loading datastore config.

**Solution**:
```bash
# ProxSave will attempt to auto-normalize datastore.cfg during restore and store a backup under /tmp/proxsave/,
# but you can also fix it manually:
cp -a /etc/proxmox-backup/datastore.cfg /root/datastore.cfg.bak.$(date +%F_%H%M%S)

# Example of correct indentation:
# datastore: Data1
# gc-schedule 0/2:00
# path /mnt/datastore/Data1

editor /etc/proxmox-backup/datastore.cfg
systemctl restart proxmox-backup proxmox-backup-proxy
```
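
To make the indentation rule concrete, here is a small sketch of a normalizer that re-indents property lines following a `datastore:` header. It assumes a single tab as the indent and treats every non-comment line after a header as a property of that section. ProxSave's actual auto-normalization may differ, and the function name `normalizeDatastoreCfg` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeDatastoreCfg rewrites the config so that "datastore:" headers
// start at column 0 and property lines inside a section are indented
// with one tab. Blank lines and comments pass through unchanged.
func normalizeDatastoreCfg(in string) string {
	lines := strings.Split(in, "\n")
	out := make([]string, 0, len(lines))
	inSection := false
	for _, line := range lines {
		trimmed := strings.TrimSpace(line)
		switch {
		case trimmed == "" || strings.HasPrefix(trimmed, "#"):
			out = append(out, line)
		case strings.HasPrefix(trimmed, "datastore:"):
			inSection = true
			out = append(out, trimmed)
		case inSection:
			out = append(out, "\t"+trimmed)
		default:
			out = append(out, line)
		}
	}
	return strings.Join(out, "\n")
}

func main() {
	broken := "datastore: Data1\ngc-schedule 0/2:00\n  path /mnt/datastore/Data1\n"
	fmt.Print(normalizeDatastoreCfg(broken))
}
```

As with the manual procedure, keep a copy of the original file before writing the normalized version back.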

---

**Issue: "unable to read prune/verification job config ... syntax error (expected header)"**

**Cause**: PBS job config files (`/etc/proxmox-backup/prune.cfg`, `/etc/proxmox-backup/verification.cfg`) are empty or malformed. PBS expects a section header at the first non-comment line; an empty file can trigger parse errors.

**Restore behavior**:
- On live restores, ProxSave stages PBS job config files and will **remove** empty staged job configs instead of writing a 0-byte file (to avoid breaking PBS parsing).

**Manual fix**:
```bash
rm -f /etc/proxmox-backup/prune.cfg /etc/proxmox-backup/verification.cfg
systemctl restart proxmox-backup proxmox-backup-proxy
```

---

**Issue: "Datastore error: Is a directory (os error 21)"**

**Cause**: PBS expects a lock file at `<datastore-path>/.lock`. If `.lock` is a directory (common after manual fixes or incorrect initialization), PBS will fail to open it and the datastore becomes unavailable.

**Solution**:
```bash
P=/mnt/datastore/<DatastoreName>
ls -ld "$P/.lock"

# If .lock is a directory, replace it with a file:
rm -rf "$P/.lock" && touch "$P/.lock" && chown backup:backup "$P/.lock"

systemctl restart proxmox-backup proxmox-backup-proxy
```

---
2 changes: 2 additions & 0 deletions docs/RESTORE_TECHNICAL.md
@@ -860,6 +860,7 @@ func extractSelectiveArchive(
mode,
logFile,
logPath,
nil, // skipFn (optional)
)

return logPath, err
@@ -1247,6 +1248,7 @@ func extractArchiveNative(
mode RestoreMode,
logFile *os.File,
logFilePath string,
skipFn func(entryName string) bool,
) error {
// 1. Open archive with decompression
file, _ := os.Open(archivePath)
19 changes: 19 additions & 0 deletions docs/TROUBLESHOOTING.md
@@ -12,6 +12,7 @@ Complete troubleshooting guide for Proxsave with common issues, solutions, and d
- [Encryption Issues](#4-encryption-issues)
- [Disk Space Issues](#5-disk-space-issues)
- [Email Notification Issues](#6-email-notification-issues)
- [Restore Issues](#7-restore-issues)
- [Debug Procedures](#debug-procedures)
- [Getting Help](#getting-help)
- [Related Documentation](#related-documentation)
@@ -549,6 +550,24 @@ MIN_DISK_SPACE_PRIMARY_GB=5 # Lower threshold
# Add more storage or clean unnecessary files
```

---
### 7. Restore Issues

#### Error during network preflight: `addr_add_dry_run() got an unexpected keyword argument 'nodad'`

**Symptoms**:
- Restore networking preflight fails when running `ifup -n -a`
- Log contains: `NetlinkListenerWithCache.addr_add_dry_run() got an unexpected keyword argument 'nodad'`

**Cause**:
- A Proxmox-packaged `ifupdown2` version may ship with mismatched Python signatures for `addr_add()` and `addr_add_dry_run()`: the dry-run variant does not accept the `nodad` keyword argument, so `ifup -n` crashes when `nodad` is passed.

**What ProxSave does**:
- During restore, ProxSave can apply a guarded hotfix (only when needed) by patching `/usr/share/ifupdown2/lib/nlcache.py` and writing a timestamped `.bak.*` backup first.

**Recovery / rollback**:
- To revert the hotfix, restore the `.bak.*` copy back onto `nlcache.py`, or upgrade `ifupdown2` when Proxmox publishes a fixed build.

---

## Debug Procedures
4 changes: 2 additions & 2 deletions go.mod
@@ -6,9 +6,9 @@ toolchain go1.25.5

require (
filippo.io/age v1.3.1
github.com/gdamore/tcell/v2 v2.13.6
github.com/gdamore/tcell/v2 v2.13.7
github.com/rivo/tview v0.42.0
golang.org/x/crypto v0.46.0
golang.org/x/crypto v0.47.0
golang.org/x/term v0.39.0
golang.org/x/text v0.33.0
)
8 changes: 4 additions & 4 deletions go.sum
@@ -8,8 +8,8 @@ filippo.io/hpke v0.4.0 h1:p575VVQ6ted4pL+it6M00V/f2qTZITO0zgmdKCkd5+A=
filippo.io/hpke v0.4.0/go.mod h1:EmAN849/P3qdeK+PCMkDpDm83vRHM5cDipBJ8xbQLVY=
github.com/gdamore/encoding v1.0.1 h1:YzKZckdBL6jVt2Gc+5p82qhrGiqMdG/eNs6Wy0u3Uhw=
github.com/gdamore/encoding v1.0.1/go.mod h1:0Z0cMFinngz9kS1QfMjCP8TY7em3bZYeeklsSDPivEo=
github.com/gdamore/tcell/v2 v2.13.6 h1:ZAKaC+z7EHtDlELEVw5qxvO560cCXOtn0Su4YqMahJM=
github.com/gdamore/tcell/v2 v2.13.6/go.mod h1:+Wfe208WDdB7INEtCsNrAN6O2m+wsTPk1RAovjaILlo=
github.com/gdamore/tcell/v2 v2.13.7 h1:yfHdeC7ODIYCc6dgRos8L1VujQtXHmUpU6UZotzD6os=
github.com/gdamore/tcell/v2 v2.13.7/go.mod h1:+Wfe208WDdB7INEtCsNrAN6O2m+wsTPk1RAovjaILlo=
github.com/lucasb-eyer/go-colorful v1.3.0 h1:2/yBRLdWBZKrf7gB40FoiKfAWYQ0lqNcbuQwVHXptag=
github.com/lucasb-eyer/go-colorful v1.3.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0=
github.com/rivo/tview v0.42.0 h1:b/ftp+RxtDsHSaynXTbJb+/n/BxDEi+W3UfF5jILK6c=
@@ -19,8 +19,8 @@ github.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUc
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.46.0 h1:cKRW/pmt1pKAfetfu+RCEvjvZkA9RimPbh7bhFjGVBU=
golang.org/x/crypto v0.46.0/go.mod h1:Evb/oLKmMraqjZ2iQTwDwvCtJkczlDuTmdJXoZVzqU0=
golang.org/x/crypto v0.47.0 h1:V6e3FRj+n4dbpw86FJ8Fv7XVOql7TEwpHapKoMJ/GO8=
golang.org/x/crypto v0.47.0/go.mod h1:ff3Y9VzzKbwSSEzWqJsJVBnWmRwRSHt/6Op5n9bQc4A=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
2 changes: 1 addition & 1 deletion internal/backup/archiver_test.go
@@ -401,7 +401,7 @@ func TestFormatDuration(t *testing.T) {
want string
}{
{30 * time.Second, "30.0s"},
{90 * time.Second, "1.5m"},
{time.Minute + 30*time.Second, "1.5m"},
{2 * time.Hour, "2.0h"},
}
