# Queue and Jobs
The Queue page provides real-time visibility into all backup operations, job status, and system activity across all clients.
Access the Queue at /queue or via the Queue link in the main navigation.
The Queue page displays:
- Summary Statistics: Quick overview cards
- In Progress: Currently running jobs with live progress bars
- Recently Completed: Last 25 finished jobs
- Auto-Refresh: Updates every 10 seconds automatically
Screenshot: Queue page showing stats cards, in-progress jobs, and recent jobs
Four stat cards provide at-a-glance status:
- In Queue: Number of jobs waiting to run (status: `queued` or `sent`)
  - Includes jobs waiting for available worker slots
  - High numbers may indicate agent connectivity issues or insufficient concurrent job slots
- Completed (24h): Number of successfully completed jobs in the last 24 hours
  - Includes all job types: backups, restores, prunes, etc.
  - Trend indicator: green if increasing, red if decreasing
- Failed (24h): Number of failed jobs in the last 24 hours
  - Red alert if greater than 0
  - Click to filter the job list by failed status
- Average job execution time (in minutes) for the last 50 jobs
  - Helps identify performance trends
  - Excludes queued time; only measures active execution
Screenshot: Stat cards showing sample values
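The duration card's calculation can be sketched as follows. This is an illustration, not BBS's actual code; the job-record field names (`started_at`, `completed_at`) are assumptions:

```python
from datetime import datetime, timedelta

def average_duration_minutes(jobs, sample=50):
    """Average active execution time (started -> completed) over the most
    recent `sample` finished jobs. Queued time is excluded by design: the
    window starts at started_at, not created_at."""
    finished = [j for j in jobs if j.get("completed_at") and j.get("started_at")]
    recent = sorted(finished, key=lambda j: j["completed_at"], reverse=True)[:sample]
    if not recent:
        return 0.0
    total = sum((j["completed_at"] - j["started_at"]).total_seconds() for j in recent)
    return total / len(recent) / 60.0

# Example: two jobs with 10 and 20 minutes of active execution
now = datetime(2024, 1, 1, 12, 0)
jobs = [
    {"created_at": now - timedelta(minutes=40),   # 30 min queued, ignored
     "started_at": now - timedelta(minutes=10), "completed_at": now},
    {"created_at": now - timedelta(hours=2),
     "started_at": now - timedelta(minutes=50),
     "completed_at": now - timedelta(minutes=30)},
]
print(average_duration_minutes(jobs))  # 15.0
```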
Jobs progress through several states:
| State | Description | Color | Meaning |
|---|---|---|---|
| queued | Job created, waiting to be sent to agent | Gray | Agent will pick it up on next poll |
| sent | Job sent to agent, agent acknowledged | Blue | Agent is preparing to execute |
| running | Job actively executing on agent | Yellow | Progress bars show completion percentage |
| completed | Job finished successfully | Green | Archive created, data backed up |
| failed | Job failed with error | Red | Check error log for details |
| cancelled | Job manually cancelled by user | Gray | Stopped before completion |
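The lifecycle in the table can be modeled as a small state machine. The sketch below illustrates the legal transitions; it is not BBS's actual implementation:

```python
# Legal transitions per the state table; terminal states have no outgoing edges.
TRANSITIONS = {
    "queued":    {"sent", "cancelled"},
    "sent":      {"running", "cancelled"},
    "running":   {"completed", "failed", "cancelled"},
    "completed": set(),
    "failed":    set(),
    "cancelled": set(),
}

def advance(state, new_state):
    """Move a job to a new state, rejecting illegal transitions."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {new_state}")
    return new_state

state = "queued"
for nxt in ("sent", "running", "completed"):
    state = advance(state, nxt)
print(state)  # completed
```

Note that `failed` is terminal: Retry does not re-run the failed job but creates a new one with the same configuration.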
The In Progress section shows all jobs currently in the `running` state.
For each running job:
- Job ID: Unique identifier, click to view details
- Client Name: Which client the job is for
- Job Type: backup, prune, restore, etc.
- Started: How long ago the job started
- Progress Bar: Visual percentage complete (for backup/restore jobs)
- Current Action: What the job is doing right now (e.g., "Creating archive", "Pruning old archives")
Screenshot: In Progress section with multiple running jobs and progress bars
For Backup Jobs:
- Shows files processed and bytes transferred
- Progress percentage based on estimated total files
- Archive name being created
- Compression stats (original size vs. compressed size)
For Restore Jobs:
- Shows files extracted and bytes written
- Progress percentage based on archive size
- Destination directory
For Prune Jobs:
- Shows archives being analyzed
- Space reclaimed so far
- Compaction progress
For S3 Sync Jobs:
- Shows bytes uploaded
- Transfer speed
- Bandwidth limit (if configured)
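Because the backup percentage is derived from an *estimated* file total, one way to handle an undershooting estimate is to clamp the value below 100% until the job actually reports completion. A minimal sketch (not BBS's actual code):

```python
def backup_progress(files_processed, estimated_total):
    """Progress percentage from an estimated file count; clamped to 99
    because the estimate may undershoot the real total."""
    if estimated_total <= 0:
        return 0
    return min(int(100 * files_processed / estimated_total), 99)

print(backup_progress(450, 1000))   # 45
print(backup_progress(1200, 1000))  # 99 (estimate was too low)
```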
The Recently Completed section displays the 25 most recently finished jobs (completed, failed, or cancelled).
| Column | Description |
|---|---|
| ID | Job ID (click to view details) |
| Client | Client name (click to go to client detail) |
| Type | Job type icon and label |
| Status | Completed, Failed, or Cancelled badge |
| Started | Timestamp when job started |
| Duration | How long the job took to execute |
| Actions | Quick action buttons (Retry, View Details) |
Screenshot: Recently Completed table with various job types and statuses
- Completed: Green badge, checkmark icon
- Failed: Red badge, X icon, shows error summary on hover
- Cancelled: Gray badge, stop icon
| Icon | Job Type | Description |
|---|---|---|
| 📦 | backup | Regular borg create backup |
| ✂️ | prune | Prune old archives based on retention policy |
| 🗜️ | compact | Compact repository (reclaim space) |
| 📥 | restore | Restore files from archive |
| 🗄️ | restore_mysql | Restore MySQL database |
| 🗄️ | restore_pg | Restore PostgreSQL database |
| ✅ | check | Repository integrity check |
| ⬆️ | update_borg | Update borg binary on agent |
| ⬆️ | update_agent | Update bbs-agent script |
| 🧪 | plugin_test | Test plugin configuration |
| ☁️ | s3_sync | Sync repository to S3 |
Click any job ID to view the full job detail page at /queue/{id}.
- Job ID and status badge
- Client name (linked)
- Backup plan name (if applicable)
- Created, started, and completed timestamps
- Total duration
Screenshot: Job detail header showing metadata
- Current state (queued, running, completed, failed)
- Progress percentage (if running)
- Current action description
- ETA (estimated time to completion)
Displays the configuration used for this job:
For Backup Jobs:
- Repository path
- Directories being backed up
- Exclusion patterns
- Compression level
- Encryption passphrase (masked)
For Prune Jobs:
- Retention policy (keep daily/weekly/monthly/yearly)
- Prune statistics (archives kept vs. deleted)
For Restore Jobs:
- Source archive
- Destination directory
- Files/directories to restore
Screenshot: Job options section showing backup configuration
Real-time output from the borg/agent command:
- Scrollable log window
- Color-coded messages (info, warning, error)
- Updates live while job is running
- Shows verbose borg output for troubleshooting
Screenshot: Output log showing borg create verbose output
For failed jobs, a dedicated error section shows:
- Error message from borg or agent
- Exit code
- Stack trace (if applicable)
- Suggested solutions based on error type
Screenshot: Error log section highlighting failure reason
Available actions depend on job state:
| Action | Available When | Effect |
|---|---|---|
| Cancel | queued, sent, running | Stops the job, sets status to cancelled |
| Retry | failed | Creates a new job with same configuration |
| View Archive | completed (backup jobs) | Go to archive detail page |
| Download Log | any | Download full job output as .txt file |
Screenshot: Job action buttons at bottom of detail page
### backup
What It Does: Creates a new Borg archive (backup snapshot)
Typical Duration: 5 minutes to several hours (depends on data size)
Success Criteria: Archive created, all files backed up, no critical errors
Common Failures:
- Directory not found or permission denied
- Disk space full on repository storage
- Borg lock file exists (previous job crashed)
### prune
What It Does: Removes old archives based on retention policy, frees up space
Typical Duration: 1-10 minutes
Success Criteria: Old archives deleted, retention policy applied
Common Failures:
- Repository corruption
- Lock file exists
- Insufficient permissions
Triggered By: Automatic (after backup, if plan has retention policy)
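A prune job presumably maps the plan's retention policy onto `borg prune`'s `--keep-*` flags. Those flags are real borg options; the policy dict shape and the repository id `42` below are illustrative:

```python
def prune_args(repo_path, keep):
    """Build a `borg prune` command line from a retention policy dict."""
    args = ["borg", "prune", "--stats"]
    for unit in ("daily", "weekly", "monthly", "yearly"):
        if keep.get(unit):
            args.append(f"--keep-{unit}={keep[unit]}")
    args.append(repo_path)
    return args

# Hypothetical policy: keep 7 daily, 4 weekly, 6 monthly archives
print(" ".join(prune_args("/var/bbs/repos/42", {"daily": 7, "weekly": 4, "monthly": 6})))
# borg prune --stats --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /var/bbs/repos/42
```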
### compact
What It Does: Reclaims disk space by compacting the repository after pruning
Typical Duration: 5-30 minutes (depends on repository size)
Success Criteria: Space reclaimed, repository compacted
Triggered By: Automatic (after prune, if enabled in plan)
### restore
What It Does: Extracts files from a backup archive to a directory on the client
Typical Duration: Varies (depends on archive size and file count)
Success Criteria: All requested files extracted to destination
Common Failures:
- Destination directory doesn't exist
- Insufficient disk space on client
- Permission denied writing to destination
Triggered By: Manual (user clicks "Restore Files" on archive)
### restore_mysql
What It Does: Restores MySQL database dumps from an archive
Typical Duration: 1-30 minutes (depends on database size)
Success Criteria: SQL dump imported, database restored
Common Failures:
- MySQL server unreachable
- Insufficient privileges on target server
- SQL syntax errors (version mismatch)
Triggered By: Manual (user clicks "Restore Database" on archive with MySQL dumps)
### restore_pg
What It Does: Restores PostgreSQL database dumps from an archive
Typical Duration: 1-30 minutes
Success Criteria: pg_restore successful, database restored
Common Failures:
- PostgreSQL server unreachable
- User lacks CREATEDB privilege
- Dump format incompatible with target version
Triggered By: Manual (user clicks "Restore Database" on archive with PostgreSQL dumps)
### check
What It Does: Verifies repository integrity and archive consistency
Typical Duration: 5-60 minutes (depends on repository size)
Success Criteria: No corruption detected, all archives verified
Common Failures:
- Repository corruption detected
- Missing chunks or metadata
Triggered By: Manual or scheduled (recommended weekly)
### update_borg
What It Does: Updates the borg binary on the agent to a target version
Typical Duration: 1-5 minutes
Success Criteria: Borg binary updated, version verified
Common Failures:
- Download failed (network issue)
- Binary not compatible with OS/architecture
- Insufficient disk space
Triggered By: Manual (client detail "Update Borg" or bulk update)
### update_agent
What It Does: Updates the bbs-agent.py script on the client
Typical Duration: 1-2 minutes
Success Criteria: Agent script updated, service restarted
Common Failures:
- Download failed
- File permissions issue
- Service restart failed
Triggered By: Manual (client detail "Update Agent" or bulk update)
### plugin_test
What It Does: Tests a plugin configuration without running a full backup
Typical Duration: 1-5 minutes
Success Criteria: Plugin executes successfully, test output validates
Common Failures:
- Database connection refused
- Script not found or not executable
- S3 credentials invalid
Triggered By: Manual (click "Test" on plugin configuration)
### s3_sync
What It Does: Syncs Borg repository to S3-compatible storage
Typical Duration: 5 minutes to several hours (first sync is slow, incremental syncs are fast)
Success Criteria: Repository mirrored to S3, rclone reports success
Common Failures:
- S3 credentials invalid
- Network timeout
- Bucket quota exceeded
Triggered By: Automatic (after successful prune, if S3 sync enabled)
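The sync step presumably shells out to rclone. A sketch of assembling that command, where `--bwlimit` is a real rclone flag but the remote name and paths are placeholders:

```python
def s3_sync_args(repo_path, remote, bwlimit=None):
    """Build an `rclone sync` command line for mirroring a repo to S3."""
    args = ["rclone", "sync", repo_path, remote]
    if bwlimit:
        args.append(f"--bwlimit={bwlimit}")  # e.g. "8M" caps transfer at 8 MiB/s
    return args

# Hypothetical repo id and rclone remote
print(" ".join(s3_sync_args("/var/bbs/repos/42", "s3:backups/repo-42", "8M")))
# rclone sync /var/bbs/repos/42 s3:backups/repo-42 --bwlimit=8M
```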
Cancel a job that is queued, sent, or running:
- Click on the job to view details
- Click Cancel button
- Confirm cancellation
- Job state changes to `cancelled`
- Agent receives the cancellation signal and stops execution
Notes:
- Cancelled backup jobs do not create an archive
- Partial progress is discarded
- Lock files are cleaned up automatically
Screenshot: Cancel button and confirmation dialog
Re-run a failed job with the same configuration:
- View the failed job detail page
- Click Retry button
- A new job is queued with identical settings
- Original job remains in history as failed
Notes:
- Fix the underlying issue before retrying (e.g., free up disk space)
- Review error log to understand why it failed
- Multiple retries are allowed
Screenshot: Retry button on failed job detail
BBS limits how many jobs can run simultaneously to prevent overloading the system.
- Navigate to Settings → General tab
- Set Max Concurrent Jobs (default: 4)
- Save
Screenshot: Settings → General → Max Concurrent Jobs field
- When the limit is reached, new jobs remain in the `queued` state
- As running jobs complete, queued jobs are sent to agents
- Jobs are processed in FIFO order (first in, first out)
- Each agent can run one job at a time (jobs don't run in parallel on the same agent)
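The dispatch rules above (global limit, FIFO order, at most one job per agent) can be sketched like this. It illustrates the policy only and is not BBS's actual scheduler:

```python
from collections import deque

def dispatch(queue, running, max_concurrent):
    """Send queued jobs to agents in FIFO order, honoring the global
    concurrency limit and the one-job-per-agent rule."""
    busy_agents = {j["agent"] for j in running}
    sent, skipped = [], deque()
    while queue and len(running) + len(sent) < max_concurrent:
        job = queue.popleft()
        if job["agent"] in busy_agents:
            skipped.append(job)      # agent busy: hold for a later pass
            continue
        busy_agents.add(job["agent"])
        sent.append(job)
    queue.extendleft(reversed(skipped))  # put held jobs back, FIFO preserved
    return sent

q = deque([{"id": 1, "agent": "a"}, {"id": 2, "agent": "a"}, {"id": 3, "agent": "b"}])
print([j["id"] for j in dispatch(q, running=[], max_concurrent=2)])  # [1, 3]
```

Job 2 stays queued even though the limit allows it, because agent `a` is already busy with job 1.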
| Limit | Best For |
|---|---|
| 1-2 | Small servers, limited resources |
| 4 (default) | Typical installations, balanced performance |
| 8-10 | Large installations, powerful servers, many agents |
| 20+ | Enterprise deployments, dedicated backup infrastructure |
Considerations:
- Server resources (CPU, RAM, network bandwidth)
- Storage I/O capacity
- Number of agents
- Backup window requirements
The Queue page automatically refreshes every 10 seconds to show:
- New jobs added to the queue
- Updated progress bars for running jobs
- Newly completed jobs moving to the Recent section
- Updated stat cards
Disabling Auto-Refresh:
- Click the "Pause Auto-Refresh" toggle in the top-right
- Useful when analyzing job details or logs
Screenshot: Auto-refresh toggle in queue page header
Click stat cards to filter the job list:
- Click Failed (24h) card to show only failed jobs
- Click Completed (24h) to show only successful jobs
- Click In Queue to show queued/sent jobs
Use the search box to filter jobs by client name:
- Type client name
- Job list updates in real-time
- Search is case-insensitive
Screenshot: Search box filtering jobs by client name
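Case-insensitive client filtering is straightforward; a minimal sketch with hypothetical job records:

```python
def filter_by_client(jobs, query):
    """Return jobs whose client name contains the query, ignoring case."""
    q = query.lower()
    return [j for j in jobs if q in j["client"].lower()]

jobs = [{"id": 1, "client": "Webserver01"}, {"id": 2, "client": "db-primary"}]
print([j["id"] for j in filter_by_client(jobs, "WEB")])  # [1]
```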
Use the job type dropdown to show only specific types:
- Select "backup" to see only backup jobs
- Select "restore_mysql" to see database restores
- Select "All" to clear filter
BBS sends notifications for critical job events:
- Backup Failed: Red notification when backup job fails
- Agent Offline: Agent hasn't polled in 3x interval
- Storage Low: Repository approaching disk space limit
See Notifications for configuring email alerts and notification preferences.
Jobs running longer than expected may indicate:
- Large data sets (normal for first backup)
- Network congestion (S3 sync, remote repositories)
- Disk I/O bottlenecks (slow storage)
- Agent resource constraints (CPU, RAM)
Solutions:
- Check job detail output log for borg performance stats
- Monitor server resources (htop, iotop)
- Consider compression level adjustments
- Increase agent resources (RAM, faster disk)
Many jobs stuck in the `queued` state:
Possible Causes:
- Max concurrent jobs limit reached
- Agents offline or not polling
- Long-running jobs blocking queue
Solutions:
- Increase max concurrent jobs limit
- Check agent connectivity
- Cancel or troubleshoot stuck running jobs
### Job Stuck in queued
Cause: Agent not polling, or max concurrent jobs limit reached
Solution:
- Check agent status on client detail page
- Verify the agent service is running: `systemctl status bbs-agent`
- Check the max concurrent jobs setting (Settings → General)
- Review In Progress section for long-running jobs
### Job Stuck in sent
Cause: Agent received the job but hasn't started execution
Solution:
- Wait 1-2 minutes (agent may be preparing)
- Check agent logs: `journalctl -u bbs-agent -n 50`
- Restart the agent: `systemctl restart bbs-agent`
- Cancel and retry the job
### Job Stuck in running
Cause: Job is executing but taking longer than expected, or the agent crashed
Solution:
- Check job detail output log for activity
- If no recent log output, agent may have crashed
- SSH to the client and check whether a borg process is running: `ps aux | grep borg`
- If borg is running, wait; if not, cancel and retry
- Check for a borg lock file: `/var/bbs/repos/{id}/lock.exclusive`
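A quick way to tell the two cases apart from the client's shell (repository id `42` below is a placeholder for the real id):

```shell
# Is borg still doing work? If yes, let the job run; if no, the job
# has likely died on the agent side and can be cancelled and retried.
if pgrep -x borg >/dev/null; then
  echo "borg still running"
else
  echo "no borg process"
fi

# A leftover lock without a running borg process suggests a stale lock.
ls /var/bbs/repos/42/lock.exclusive 2>/dev/null || echo "no lock file present"
```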
### Lock File Error
Cause: A previous job crashed and left a lock file behind
Solution:
- SSH to client
- Remove the lock file: `rm /var/bbs/repos/{id}/lock.*`
- Retry the job
### Permission Denied
Cause: Agent user lacks permissions to read directories or write to the repository
Solution:
- Check directory permissions on client
- Ensure the agent user (e.g., `bbs-{agent-id}`) can read the backup directories
- Verify repository storage path permissions
- Consider using sudo or adjusting directory ownership
### Disk Full
Cause: Repository storage is out of disk space
Solution:
- Check disk space: `df -h /var/bbs/repos/`
- Run prune jobs to remove old archives
- Run compact jobs to reclaim space
- Add more disk space to server
- Adjust retention policies to keep fewer archives
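A simple usage check you could run on the server; the 90% threshold and the `/` fallback are illustrative, so point `REPO_FS` at the real repository mount:

```shell
REPO_FS="${REPO_FS:-/}"   # set to /var/bbs/repos in practice
USAGE=$(df --output=pcent "$REPO_FS" | tail -n 1 | tr -dc '0-9')
if [ "$USAGE" -ge 90 ]; then
  echo "WARNING: ${USAGE}% used on ${REPO_FS} - prune, compact, or add space"
else
  echo "OK: ${USAGE}% used on ${REPO_FS}"
fi
```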
- Check Queue Daily: Review failed jobs, investigate errors
- Set Up Alerts: Enable email notifications for backup failures
- Review Trends: Monitor average duration and failure rates over time
- Dashboard Visibility: Keep Queue page open on monitoring displays
- Cancel Stuck Jobs: Don't let jobs run indefinitely if they're stalled
- Retry Strategically: Fix underlying issues before retrying failed jobs
- Stagger Schedules: Avoid scheduling all backups at the same time to prevent queue congestion
- Prune Regularly: Schedule prune jobs to run automatically after backups
- Concurrent Jobs: Start with default (4), increase if queue backlogs occur
- Compression: Use lower compression levels for faster backups (e.g., `lz4` or `zstd,3`)
- Exclusions: Exclude unnecessary files to reduce backup size and time
- Network: For S3 sync, use bandwidth limits to avoid saturating connections
- Backup Plans — Configuring scheduled backups
- Restore — Restoring files and databases
- Notifications — Email alerts for job failures
- Plugins — Understanding plugin job types
- Troubleshooting — Resolving common job errors