Queue and Jobs

Marc Pope edited this page Feb 3, 2026 · 1 revision

The Queue page provides real-time visibility into all backup operations, job status, and system activity across all clients.

Queue Page Overview

Access the Queue at /queue or via the Queue link in the main navigation.

The Queue page displays:

  • Summary Statistics: Quick overview cards
  • In Progress: Currently running jobs with live progress bars
  • Recently Completed: Last 25 finished jobs
  • Auto-Refresh: Updates every 10 seconds automatically

Screenshot: Queue page showing stats cards, in-progress jobs, and recent jobs

Summary Statistics

Four stat cards provide at-a-glance status:

In Queue

  • Number of jobs waiting to run (status: queued or sent)
  • Includes jobs waiting for available worker slots
  • High numbers may indicate agent connectivity issues or insufficient concurrent job slots

Completed (24h)

  • Number of successfully completed jobs in the last 24 hours
  • Includes all job types: backups, restores, prunes, etc.
  • Trend indicator: green if increasing, red if decreasing

Failed (24h)

  • Number of failed jobs in the last 24 hours
  • Red alert if greater than 0
  • Click to filter job list by failed status

Avg Duration

  • Average job execution time (in minutes) for the last 50 jobs
  • Helps identify performance trends
  • Excludes queued time, only measures active execution
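
The "active execution only" calculation can be sketched as follows. This is an illustrative model, not the server's actual code; the field names (`created`, `started`, `completed`) and newest-first ordering are assumptions:

```python
from datetime import datetime, timedelta

def avg_duration_minutes(jobs, limit=50):
    # Average active execution time (started -> completed) in minutes over
    # the most recent `limit` finished jobs; assumes `jobs` is newest-first.
    # Queued time (created -> started) is deliberately excluded.
    recent = [j for j in jobs if j.get("started") and j.get("completed")][:limit]
    if not recent:
        return 0.0
    total = sum((j["completed"] - j["started"]).total_seconds() for j in recent)
    return total / len(recent) / 60

now = datetime(2026, 2, 3, 12, 0)
jobs = [
    # queued 20 min, ran 10 min
    {"created": now - timedelta(minutes=30), "started": now - timedelta(minutes=10), "completed": now},
    # queued 40 min, ran 20 min
    {"created": now - timedelta(minutes=60), "started": now - timedelta(minutes=20), "completed": now},
]
print(avg_duration_minutes(jobs))  # 15.0 — queue wait does not inflate the average
```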

Screenshot: Stat cards showing sample values

Job States

Jobs progress through several states:

| State | Description | Color | Meaning |
|---|---|---|---|
| queued | Job created, waiting to be sent to agent | Gray | Agent will pick it up on next poll |
| sent | Job sent to agent, agent acknowledged | Blue | Agent is preparing to execute |
| running | Job actively executing on agent | Yellow | Progress bars show completion percentage |
| completed | Job finished successfully | Green | Archive created, data backed up |
| failed | Job failed with error | Red | Check error log for details |
| cancelled | Job manually cancelled by user | Gray | Stopped before completion |
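
The lifecycle above can be expressed as a small transition table. This is a hypothetical model matching the states described here, not BBS's actual implementation:

```python
# Legal state transitions for a job; names match the table above.
TRANSITIONS = {
    "queued":    {"sent", "cancelled"},
    "sent":      {"running", "cancelled"},
    "running":   {"completed", "failed", "cancelled"},
    "completed": set(),  # terminal
    "failed":    set(),  # terminal (Retry creates a *new* job)
    "cancelled": set(),  # terminal
}

def advance(state, new_state):
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

state = "queued"
for nxt in ("sent", "running", "completed"):
    state = advance(state, nxt)
print(state)  # completed
```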

In Progress Section

Shows all jobs currently in running state.

Progress Information

For each running job:

  • Job ID: Unique identifier, click to view details
  • Client Name: Which client the job is for
  • Job Type: backup, prune, restore, etc.
  • Started: How long ago the job started
  • Progress Bar: Visual percentage complete (for backup/restore jobs)
  • Current Action: What the job is doing right now (e.g., "Creating archive", "Pruning old archives")

Screenshot: In Progress section with multiple running jobs and progress bars

Progress Details

For Backup Jobs:

  • Shows files processed and bytes transferred
  • Progress percentage based on estimated total files
  • Archive name being created
  • Compression stats (original size vs. compressed size)

For Restore Jobs:

  • Shows files extracted and bytes written
  • Progress percentage based on archive size
  • Destination directory

For Prune Jobs:

  • Shows archives being analyzed
  • Space reclaimed so far
  • Compaction progress

For S3 Sync Jobs:

  • Shows bytes uploaded
  • Transfer speed
  • Bandwidth limit (if configured)

Recently Completed Section

Displays the 25 most recently finished jobs (completed, failed, or cancelled).

Columns

| Column | Description |
|---|---|
| ID | Job ID (click to view details) |
| Client | Client name (click to go to client detail) |
| Type | Job type icon and label |
| Status | Completed, Failed, or Cancelled badge |
| Started | Timestamp when job started |
| Duration | How long the job took to execute |
| Actions | Quick action buttons (Retry, View Details) |

Screenshot: Recently Completed table with various job types and statuses

Status Badges

  • Completed: Green badge, checkmark icon
  • Failed: Red badge, X icon, shows error summary on hover
  • Cancelled: Gray badge, stop icon

Job Type Icons

| Icon | Job Type | Description |
|---|---|---|
| 📦 | backup | Regular borg create backup |
| ✂️ | prune | Prune old archives based on retention policy |
| 🗜️ | compact | Compact repository (reclaim space) |
| 📥 | restore | Restore files from archive |
| 🗄️ | restore_mysql | Restore MySQL database |
| 🗄️ | restore_pg | Restore PostgreSQL database |
| | check | Repository integrity check |
| ⬆️ | update_borg | Update borg binary on agent |
| ⬆️ | update_agent | Update bbs-agent script |
| 🧪 | plugin_test | Test plugin configuration |
| ☁️ | s3_sync | Sync repository to S3 |

Job Detail Page

Click any job ID to view the full job detail page at /queue/{id}.

Job Detail Sections

1. Job Header

  • Job ID and status badge
  • Client name (linked)
  • Backup plan name (if applicable)
  • Created, started, and completed timestamps
  • Total duration

Screenshot: Job detail header showing metadata

2. Progress and Status

  • Current state (queued, running, completed, failed)
  • Progress percentage (if running)
  • Current action description
  • ETA (estimated time to completion)

3. Job Options

Displays the configuration used for this job:

For Backup Jobs:

  • Repository path
  • Directories being backed up
  • Exclusion patterns
  • Compression level
  • Encryption passphrase (masked)

For Prune Jobs:

  • Retention policy (keep daily/weekly/monthly/yearly)
  • Prune statistics (archives kept vs. deleted)

For Restore Jobs:

  • Source archive
  • Destination directory
  • Files/directories to restore

Screenshot: Job options section showing backup configuration

4. Output Log

Real-time output from the borg/agent command:

  • Scrollable log window
  • Color-coded messages (info, warning, error)
  • Updates live while job is running
  • Shows verbose borg output for troubleshooting

Screenshot: Output log showing borg create verbose output

5. Error Log (if failed)

For failed jobs, a dedicated error section shows:

  • Error message from borg or agent
  • Exit code
  • Stack trace (if applicable)
  • Suggested solutions based on error type

Screenshot: Error log section highlighting failure reason

6. Job Actions

Available actions depend on job state:

| Action | Available When | Effect |
|---|---|---|
| Cancel | queued, sent, running | Stops the job, sets status to cancelled |
| Retry | failed | Creates a new job with same configuration |
| View Archive | completed (backup jobs) | Go to archive detail page |
| Download Log | any | Download full job output as .txt file |

Screenshot: Job action buttons at bottom of detail page

Job Types Explained

backup

What It Does: Creates a new Borg archive (backup snapshot)

Typical Duration: 5 minutes to several hours (depends on data size)

Success Criteria: Archive created, all files backed up, no critical errors

Common Failures:

  • Directory not found or permission denied
  • Disk space full on repository storage
  • Borg lock file exists (previous job crashed)

prune

What It Does: Removes old archives based on retention policy, frees up space

Typical Duration: 1-10 minutes

Success Criteria: Old archives deleted, retention policy applied

Common Failures:

  • Repository corruption
  • Lock file exists
  • Insufficient permissions

Triggered By: Automatic (after backup, if plan has retention policy)


compact

What It Does: Reclaims disk space by compacting the repository after pruning

Typical Duration: 5-30 minutes (depends on repository size)

Success Criteria: Space reclaimed, repository compacted

Triggered By: Automatic (after prune, if enabled in plan)


restore

What It Does: Extracts files from a backup archive to a directory on the client

Typical Duration: Varies (depends on archive size and file count)

Success Criteria: All requested files extracted to destination

Common Failures:

  • Destination directory doesn't exist
  • Insufficient disk space on client
  • Permission denied writing to destination

Triggered By: Manual (user clicks "Restore Files" on archive)


restore_mysql

What It Does: Restores MySQL database dumps from an archive

Typical Duration: 1-30 minutes (depends on database size)

Success Criteria: SQL dump imported, database restored

Common Failures:

  • MySQL server unreachable
  • Insufficient privileges on target server
  • SQL syntax errors (version mismatch)

Triggered By: Manual (user clicks "Restore Database" on archive with MySQL dumps)


restore_pg

What It Does: Restores PostgreSQL database dumps from an archive

Typical Duration: 1-30 minutes

Success Criteria: pg_restore successful, database restored

Common Failures:

  • PostgreSQL server unreachable
  • User lacks CREATEDB privilege
  • Dump format incompatible with target version

Triggered By: Manual (user clicks "Restore Database" on archive with PostgreSQL dumps)


check

What It Does: Verifies repository integrity and archive consistency

Typical Duration: 5-60 minutes (depends on repository size)

Success Criteria: No corruption detected, all archives verified

Common Failures:

  • Repository corruption detected
  • Missing chunks or metadata

Triggered By: Manual or scheduled (recommended weekly)


update_borg

What It Does: Updates the borg binary on the agent to a target version

Typical Duration: 1-5 minutes

Success Criteria: Borg binary updated, version verified

Common Failures:

  • Download failed (network issue)
  • Binary not compatible with OS/architecture
  • Insufficient disk space

Triggered By: Manual (client detail "Update Borg" or bulk update)


update_agent

What It Does: Updates the bbs-agent.py script on the client

Typical Duration: 1-2 minutes

Success Criteria: Agent script updated, service restarted

Common Failures:

  • Download failed
  • File permissions issue
  • Service restart failed

Triggered By: Manual (client detail "Update Agent" or bulk update)


plugin_test

What It Does: Tests a plugin configuration without running a full backup

Typical Duration: 1-5 minutes

Success Criteria: Plugin executes successfully, test output validates

Common Failures:

  • Database connection refused
  • Script not found or not executable
  • S3 credentials invalid

Triggered By: Manual (click "Test" on plugin configuration)


s3_sync

What It Does: Syncs Borg repository to S3-compatible storage

Typical Duration: 5 minutes to several hours (first sync is slow, incremental syncs are fast)

Success Criteria: Repository mirrored to S3, rclone reports success

Common Failures:

  • S3 credentials invalid
  • Network timeout
  • Bucket quota exceeded

Triggered By: Automatic (after successful prune, if S3 sync enabled)


Job Actions

Cancelling a Job

Cancel a job that is queued, sent, or running:

  1. Click on the job to view details
  2. Click Cancel button
  3. Confirm cancellation
  4. Job state changes to cancelled
  5. Agent receives cancellation signal and stops execution

Notes:

  • Cancelled backup jobs do not create an archive
  • Partial progress is discarded
  • Lock files are cleaned up automatically

Screenshot: Cancel button and confirmation dialog

Retrying a Failed Job

Re-run a failed job with the same configuration:

  1. View the failed job detail page
  2. Click Retry button
  3. A new job is queued with identical settings
  4. Original job remains in history as failed

Notes:

  • Fix the underlying issue before retrying (e.g., free up disk space)
  • Review error log to understand why it failed
  • Multiple retries are allowed

Screenshot: Retry button on failed job detail

Max Concurrent Jobs

BBS limits how many jobs can run simultaneously to prevent overloading the system.

Configuring Concurrent Job Limit

  1. Navigate to Settings → General tab
  2. Set Max Concurrent Jobs (default: 4)
  3. Save

Screenshot: Settings → General → Max Concurrent Jobs field

How It Works

  • When the limit is reached, new jobs remain in queued state
  • As running jobs complete, queued jobs are sent to agents
  • Jobs are processed in FIFO order (first in, first out)
  • Each agent can run one job at a time (jobs don't run in parallel on the same agent)
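
A minimal sketch of the dispatch rules above (FIFO order, a global concurrency limit, one job per agent at a time). The data shapes are assumptions for illustration, not the server's actual code:

```python
from collections import deque

def dispatch(queued, running, max_concurrent=4):
    # Send queued jobs in FIFO order while respecting the global limit
    # and the one-job-per-agent rule described above.
    busy_agents = {j["agent"] for j in running}
    sent = []
    for job in list(queued):
        if len(running) + len(sent) >= max_concurrent:
            break  # global concurrency limit reached
        if job["agent"] in busy_agents:
            continue  # that agent is already running a job
        busy_agents.add(job["agent"])
        sent.append(job)
        queued.remove(job)
    return sent

queue = deque([{"id": 1, "agent": "a"}, {"id": 2, "agent": "a"}, {"id": 3, "agent": "b"}])
sent = dispatch(queue, running=[], max_concurrent=2)
print([j["id"] for j in sent])  # [1, 3] — job 2 waits for agent "a" to free up
```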

Choosing a Limit

| Limit | Best For |
|---|---|
| 1-2 | Small servers, limited resources |
| 4 (default) | Typical installations, balanced performance |
| 8-10 | Large installations, powerful servers, many agents |
| 20+ | Enterprise deployments, dedicated backup infrastructure |

Considerations:

  • Server resources (CPU, RAM, network bandwidth)
  • Storage I/O capacity
  • Number of agents
  • Backup window requirements

Auto-Refresh

The Queue page automatically refreshes every 10 seconds to show:

  • New jobs added to the queue
  • Updated progress bars for running jobs
  • Newly completed jobs moving to the Recent section
  • Updated stat cards

Disabling Auto-Refresh:

  • Click the "Pause Auto-Refresh" toggle in the top-right
  • Useful when analyzing job details or logs

Screenshot: Auto-refresh toggle in queue page header

Filtering and Searching

Filter by Status

Click stat cards to filter the job list:

  • Click Failed (24h) card to show only failed jobs
  • Click Completed (24h) to show only successful jobs
  • Click In Queue to show queued/sent jobs

Search by Client

Use the search box to filter jobs by client name:

  • Type client name
  • Job list updates in real-time
  • Search is case-insensitive
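
The search behaviour amounts to a case-insensitive substring match, sketched below (the `client` field name is an assumption for illustration):

```python
def filter_jobs(jobs, client_query):
    # Case-insensitive substring match on client name, as described above.
    q = client_query.lower()
    return [j for j in jobs if q in j["client"].lower()]

jobs = [{"id": 1, "client": "Web-Prod"}, {"id": 2, "client": "db-prod"}, {"id": 3, "client": "staging"}]
print([j["id"] for j in filter_jobs(jobs, "PROD")])  # [1, 2]
```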

Screenshot: Search box filtering jobs by client name

Filter by Job Type

Use the job type dropdown to show only specific types:

  • Select "backup" to see only backup jobs
  • Select "restore_mysql" to see database restores
  • Select "All" to clear filter

Notifications for Job Events

BBS sends notifications for critical job events:

  • Backup Failed: Red notification when backup job fails
  • Agent Offline: Agent hasn't polled in 3x interval
  • Storage Low: Repository approaching disk space limit

See Notifications for configuring email alerts and notification preferences.
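
The "3x interval" offline rule can be sketched as a simple staleness check. This is a hypothetical model of the rule stated above, with an assumed per-agent poll interval:

```python
from datetime import datetime, timedelta

def agent_offline(last_poll, poll_interval_s, now, factor=3):
    # An agent is flagged offline when it hasn't polled within
    # `factor` x its poll interval (the 3x rule described above).
    return (now - last_poll) > timedelta(seconds=factor * poll_interval_s)

now = datetime(2026, 2, 3, 12, 0)
print(agent_offline(now - timedelta(seconds=200), 60, now))  # True  (200s > 180s)
print(agent_offline(now - timedelta(seconds=100), 60, now))  # False (100s <= 180s)
```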

Performance Monitoring

Identifying Slow Jobs

Jobs running longer than expected may indicate:

  • Large data sets (normal for first backup)
  • Network congestion (S3 sync, remote repositories)
  • Disk I/O bottlenecks (slow storage)
  • Agent resource constraints (CPU, RAM)

Solutions:

  • Check job detail output log for borg performance stats
  • Monitor server resources (htop, iotop)
  • Consider compression level adjustments
  • Increase agent resources (RAM, faster disk)

Queue Backlog

Many jobs stuck in queued state:

Possible Causes:

  • Max concurrent jobs limit reached
  • Agents offline or not polling
  • Long-running jobs blocking queue

Solutions:

  • Increase max concurrent jobs limit
  • Check agent connectivity
  • Cancel or troubleshoot stuck running jobs

Troubleshooting

Job Stuck in "queued"

Cause: Agent not polling, or max concurrent jobs reached

Solution:

  • Check agent status on client detail page
  • Verify agent service is running: systemctl status bbs-agent
  • Check max concurrent jobs setting (Settings → General)
  • Review In Progress section for long-running jobs

Job Stuck in "sent"

Cause: Agent received job but hasn't started execution

Solution:

  • Wait 1-2 minutes (agent may be preparing)
  • Check agent logs: journalctl -u bbs-agent -n 50
  • Restart agent: systemctl restart bbs-agent
  • Cancel and retry the job

Job Stuck in "running"

Cause: Job is executing but taking longer than expected, or agent crashed

Solution:

  • Check job detail output log for activity
  • If no recent log output, agent may have crashed
  • SSH to client, check if borg process is running: ps aux | grep borg
  • If borg is running, wait; if not, cancel and retry
  • Check for borg lock file: /var/bbs/repos/{id}/lock.exclusive

Job Failed with "Lock Exists"

Cause: Previous job crashed and left a lock file

Solution:

  • SSH to client
  • Remove lock file: rm /var/bbs/repos/{id}/lock.*
  • Retry the job
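
If you administer many repositories, the cleanup can be scripted. The helper below is a hypothetical sketch, not shipped with BBS; as above, only remove lock files after confirming no borg process is still running against the repository:

```python
import tempfile
from pathlib import Path

def clear_stale_locks(repo_path):
    # Remove stale Borg lock files (e.g. lock.exclusive, lock.roster)
    # from a repository directory. Dangerous if borg is still running!
    removed = []
    for lock in Path(repo_path).glob("lock.*"):
        lock.unlink()
        removed.append(lock.name)
    return sorted(removed)

# Demo against a throwaway directory standing in for /var/bbs/repos/{id}
repo = tempfile.mkdtemp()
Path(repo, "lock.exclusive").touch()
print(clear_stale_locks(repo))  # ['lock.exclusive']
```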

Job Failed with "Permission Denied"

Cause: Agent user lacks permissions to read directories or write to repository

Solution:

  • Check directory permissions on client
  • Ensure agent user (e.g., bbs-{agent-id}) can read backup directories
  • Verify repository storage path permissions
  • Consider using sudo or adjusting directory ownership

Job Failed with "Disk Full"

Cause: Repository storage is out of disk space

Solution:

  • Check disk space: df -h /var/bbs/repos/
  • Run prune jobs to remove old archives
  • Run compact jobs to reclaim space
  • Add more disk space to server
  • Adjust retention policies to keep fewer archives

Best Practices

Monitoring

  • Check Queue Daily: Review failed jobs, investigate errors
  • Set Up Alerts: Enable email notifications for backup failures
  • Review Trends: Monitor average duration and failure rates over time
  • Dashboard Visibility: Keep Queue page open on monitoring displays

Job Management

  • Cancel Stuck Jobs: Don't let jobs run indefinitely if they're stalled
  • Retry Strategically: Fix underlying issues before retrying failed jobs
  • Stagger Schedules: Avoid scheduling all backups at the same time to prevent queue congestion
  • Prune Regularly: Schedule prune jobs to run automatically after backups

Performance

  • Concurrent Jobs: Start with default (4), increase if queue backlogs occur
  • Compression: Prefer faster compression settings for quicker backups (e.g., lz4, or zstd,3 for a balance of speed and ratio)
  • Exclusions: Exclude unnecessary files to reduce backup size and time
  • Network: For S3 sync, use bandwidth limits to avoid saturating connections

Related Documentation
