Queue and Jobs

Marc Pope edited this page Feb 3, 2026 · 1 revision

The Queue page provides real-time visibility into all backup operations, job status, and system activity across all clients.

Queue Page Overview

Access the Queue at /queue or via the Queue link in the main navigation.

The Queue page displays:

  • Summary Statistics: Quick overview cards
  • In Progress: Currently running jobs with live progress bars
  • Recently Completed: Last 25 finished jobs
  • Auto-Refresh: Updates every 10 seconds automatically

Screenshot: Queue page showing stats cards, in-progress jobs, and recent jobs

Summary Statistics

Four stat cards provide at-a-glance status:

In Queue

  • Number of jobs waiting to run (status: queued or sent)
  • Includes jobs waiting for available worker slots
  • High numbers may indicate agent connectivity issues or insufficient concurrent job slots

Completed (24h)

  • Number of successfully completed jobs in the last 24 hours
  • Includes all job types: backups, restores, prunes, etc.
  • Trend indicator: green if increasing, red if decreasing

Failed (24h)

  • Number of failed jobs in the last 24 hours
  • Red alert if greater than 0
  • Click to filter job list by failed status

Avg Duration

  • Average job execution time (in minutes) for the last 50 jobs
  • Helps identify performance trends
  • Excludes queued time, only measures active execution
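
The "active execution only" calculation can be sketched as follows. This is an illustrative model, not the server's actual code; the field names (`created`, `started`, `completed`) and newest-first ordering are assumptions:

```python
from datetime import datetime, timedelta

def avg_duration_minutes(jobs, limit=50):
    # Average active execution time (started -> completed) in minutes over
    # the most recent `limit` finished jobs; assumes `jobs` is newest-first.
    # Queued time (created -> started) is deliberately excluded.
    recent = [j for j in jobs if j.get("started") and j.get("completed")][:limit]
    if not recent:
        return 0.0
    total = sum((j["completed"] - j["started"]).total_seconds() for j in recent)
    return total / len(recent) / 60

now = datetime(2026, 2, 3, 12, 0)
jobs = [
    # queued 20 min, ran 10 min
    {"created": now - timedelta(minutes=30), "started": now - timedelta(minutes=10), "completed": now},
    # queued 40 min, ran 20 min
    {"created": now - timedelta(minutes=60), "started": now - timedelta(minutes=20), "completed": now},
]
print(avg_duration_minutes(jobs))  # 15.0 — queue wait does not inflate the average
```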

Screenshot: Stat cards showing sample values

Job States

Jobs progress through several states:

| State | Description | Color | Meaning |
|---|---|---|---|
| queued | Job created, waiting to be sent to agent | Gray | Agent will pick it up on next poll |
| sent | Job sent to agent, agent acknowledged | Blue | Agent is preparing to execute |
| running | Job actively executing on agent | Yellow | Progress bars show completion percentage |
| completed | Job finished successfully | Green | Archive created, data backed up |
| failed | Job failed with error | Red | Check error log for details |
| cancelled | Job manually cancelled by user | Gray | Stopped before completion |
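
The lifecycle above can be expressed as a small transition table. This is a hypothetical model matching the states described here, not BBS's actual implementation:

```python
# Legal state transitions for a job; names match the table above.
TRANSITIONS = {
    "queued":    {"sent", "cancelled"},
    "sent":      {"running", "cancelled"},
    "running":   {"completed", "failed", "cancelled"},
    "completed": set(),  # terminal
    "failed":    set(),  # terminal (Retry creates a *new* job)
    "cancelled": set(),  # terminal
}

def advance(state, new_state):
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

state = "queued"
for nxt in ("sent", "running", "completed"):
    state = advance(state, nxt)
print(state)  # completed
```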

In Progress Section

Shows all jobs currently in running state.

Progress Information

For each running job:

  • Job ID: Unique identifier, click to view details
  • Client Name: Which client the job is for
  • Job Type: backup, prune, restore, etc.
  • Started: How long ago the job started
  • Progress Bar: Visual percentage complete (for backup/restore jobs)
  • Current Action: What the job is doing right now (e.g., "Creating archive", "Pruning old archives")

Screenshot: In Progress section with multiple running jobs and progress bars

Progress Details

For Backup Jobs:

  • Shows files processed and bytes transferred
  • Progress percentage based on estimated total files
  • Archive name being created
  • Compression stats (original size vs. compressed size)

For Restore Jobs:

  • Shows files extracted and bytes written
  • Progress percentage based on archive size
  • Destination directory

For Prune Jobs:

  • Shows archives being analyzed
  • Space reclaimed so far
  • Compaction progress

For S3 Sync Jobs:

  • Shows bytes uploaded
  • Transfer speed
  • Bandwidth limit (if configured)

Recently Completed Section

Displays the 25 most recently finished jobs (completed, failed, or cancelled).

Columns

| Column | Description |
|---|---|
| ID | Job ID (click to view details) |
| Client | Client name (click to go to client detail) |
| Type | Job type icon and label |
| Status | Completed, Failed, or Cancelled badge |
| Started | Timestamp when job started |
| Duration | How long the job took to execute |
| Actions | Quick action buttons (Retry, View Details) |

Screenshot: Recently Completed table with various job types and statuses

Status Badges

  • Completed: Green badge, checkmark icon
  • Failed: Red badge, X icon, shows error summary on hover
  • Cancelled: Gray badge, stop icon

Job Type Icons

| Icon | Job Type | Description |
|---|---|---|
| 📦 | backup | Regular borg create backup |
| ✂️ | prune | Prune old archives based on retention policy |
| 🗜️ | compact | Compact repository (reclaim space) |
| 📥 | restore | Restore files from archive |
| 🗄️ | restore_mysql | Restore MySQL database |
| 🗄️ | restore_pg | Restore PostgreSQL database |
| | check | Repository integrity check |
| ⬆️ | update_borg | Update borg binary on agent |
| ⬆️ | update_agent | Update bbs-agent script |
| 🧪 | plugin_test | Test plugin configuration |
| ☁️ | s3_sync | Sync repository to S3 |

Job Detail Page

Click any job ID to view the full job detail page at /queue/{id}.

Job Detail Sections

1. Job Header

  • Job ID and status badge
  • Client name (linked)
  • Backup plan name (if applicable)
  • Created, started, and completed timestamps
  • Total duration

Screenshot: Job detail header showing metadata

2. Progress and Status

  • Current state (queued, running, completed, failed)
  • Progress percentage (if running)
  • Current action description
  • ETA (estimated time to completion)

3. Job Options

Displays the configuration used for this job:

For Backup Jobs:

  • Repository path
  • Directories being backed up
  • Exclusion patterns
  • Compression level
  • Encryption passphrase (masked)

For Prune Jobs:

  • Retention policy (keep daily/weekly/monthly/yearly)
  • Prune statistics (archives kept vs. deleted)

For Restore Jobs:

  • Source archive
  • Destination directory
  • Files/directories to restore

Screenshot: Job options section showing backup configuration

4. Output Log

Real-time output from the borg/agent command:

  • Scrollable log window
  • Color-coded messages (info, warning, error)
  • Updates live while job is running
  • Shows verbose borg output for troubleshooting

Screenshot: Output log showing borg create verbose output

5. Error Log (if failed)

For failed jobs, a dedicated error section shows:

  • Error message from borg or agent
  • Exit code
  • Stack trace (if applicable)
  • Suggested solutions based on error type

Screenshot: Error log section highlighting failure reason

6. Job Actions

Available actions depend on job state:

| Action | Available When | Effect |
|---|---|---|
| Cancel | queued, sent, running | Stops the job, sets status to cancelled |
| Retry | failed | Creates a new job with same configuration |
| View Archive | completed (backup jobs) | Go to archive detail page |
| Download Log | any | Download full job output as .txt file |

Screenshot: Job action buttons at bottom of detail page

Job Types Explained

backup

What It Does: Creates a new Borg archive (backup snapshot)

Typical Duration: 5 minutes to several hours (depends on data size)

Success Criteria: Archive created, all files backed up, no critical errors

Common Failures:

  • Directory not found or permission denied
  • Disk space full on repository storage
  • Borg lock file exists (previous job crashed)

prune

What It Does: Removes old archives based on retention policy, frees up space

Typical Duration: 1-10 minutes

Success Criteria: Old archives deleted, retention policy applied

Common Failures:

  • Repository corruption
  • Lock file exists
  • Insufficient permissions

Triggered By: Automatic (after backup, if plan has retention policy)


compact

What It Does: Reclaims disk space by compacting the repository after pruning

Typical Duration: 5-30 minutes (depends on repository size)

Success Criteria: Space reclaimed, repository compacted

Triggered By: Automatic (after prune, if enabled in plan)


restore

What It Does: Extracts files from a backup archive to a directory on the client

Typical Duration: Varies (depends on archive size and file count)

Success Criteria: All requested files extracted to destination

Common Failures:

  • Destination directory doesn't exist
  • Insufficient disk space on client
  • Permission denied writing to destination

Triggered By: Manual (user clicks "Restore Files" on archive)


restore_mysql

What It Does: Restores MySQL database dumps from an archive

Typical Duration: 1-30 minutes (depends on database size)

Success Criteria: SQL dump imported, database restored

Common Failures:

  • MySQL server unreachable
  • Insufficient privileges on target server
  • SQL syntax errors (version mismatch)

Triggered By: Manual (user clicks "Restore Database" on archive with MySQL dumps)


restore_pg

What It Does: Restores PostgreSQL database dumps from an archive

Typical Duration: 1-30 minutes

Success Criteria: pg_restore successful, database restored

Common Failures:

  • PostgreSQL server unreachable
  • User lacks CREATEDB privilege
  • Dump format incompatible with target version

Triggered By: Manual (user clicks "Restore Database" on archive with PostgreSQL dumps)


check

What It Does: Verifies repository integrity and archive consistency

Typical Duration: 5-60 minutes (depends on repository size)

Success Criteria: No corruption detected, all archives verified

Common Failures:

  • Repository corruption detected
  • Missing chunks or metadata

Triggered By: Manual or scheduled (recommended weekly)


update_borg

What It Does: Updates the borg binary on the agent to a target version

Typical Duration: 1-5 minutes

Success Criteria: Borg binary updated, version verified

Common Failures:

  • Download failed (network issue)
  • Binary not compatible with OS/architecture
  • Insufficient disk space

Triggered By: Manual (client detail "Update Borg" or bulk update)


update_agent

What It Does: Updates the bbs-agent.py script on the client

Typical Duration: 1-2 minutes

Success Criteria: Agent script updated, service restarted

Common Failures:

  • Download failed
  • File permissions issue
  • Service restart failed

Triggered By: Manual (client detail "Update Agent" or bulk update)


plugin_test

What It Does: Tests a plugin configuration without running a full backup

Typical Duration: 1-5 minutes

Success Criteria: Plugin executes successfully, test output validates

Common Failures:

  • Database connection refused
  • Script not found or not executable
  • S3 credentials invalid

Triggered By: Manual (click "Test" on plugin configuration)


s3_sync

What It Does: Syncs Borg repository to S3-compatible storage

Typical Duration: 5 minutes to several hours (first sync is slow, incremental syncs are fast)

Success Criteria: Repository mirrored to S3, rclone reports success

Common Failures:

  • S3 credentials invalid
  • Network timeout
  • Bucket quota exceeded

Triggered By: Automatic (after successful prune, if S3 sync enabled)


Job Actions

Cancelling a Job

Cancel a job that is queued, sent, or running:

  1. Click on the job to view details
  2. Click Cancel button
  3. Confirm cancellation
  4. Job state changes to cancelled
  5. Agent receives cancellation signal and stops execution

Notes:

  • Cancelled backup jobs do not create an archive
  • Partial progress is discarded
  • Lock files are cleaned up automatically

Screenshot: Cancel button and confirmation dialog

Retrying a Failed Job

Re-run a failed job with the same configuration:

  1. View the failed job detail page
  2. Click Retry button
  3. A new job is queued with identical settings
  4. Original job remains in history as failed

Notes:

  • Fix the underlying issue before retrying (e.g., free up disk space)
  • Review error log to understand why it failed
  • Multiple retries are allowed

Screenshot: Retry button on failed job detail

Max Concurrent Jobs

BBS limits how many jobs can run simultaneously to prevent overloading the system.

Configuring Concurrent Job Limit

  1. Navigate to Settings → General tab
  2. Set Max Concurrent Jobs (default: 4)
  3. Save

Screenshot: Settings → General → Max Concurrent Jobs field

How It Works

  • When the limit is reached, new jobs remain in queued state
  • As running jobs complete, queued jobs are sent to agents
  • Jobs are processed in FIFO order (first in, first out)
  • Each agent can run one job at a time (jobs don't run in parallel on the same agent)
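
A minimal sketch of the dispatch rules above (FIFO order, a global concurrency limit, one job per agent at a time). The data shapes are assumptions for illustration, not the server's actual code:

```python
from collections import deque

def dispatch(queued, running, max_concurrent=4):
    # Send queued jobs in FIFO order while respecting the global limit
    # and the one-job-per-agent rule described above.
    busy_agents = {j["agent"] for j in running}
    sent = []
    for job in list(queued):
        if len(running) + len(sent) >= max_concurrent:
            break  # global concurrency limit reached
        if job["agent"] in busy_agents:
            continue  # that agent is already running a job
        busy_agents.add(job["agent"])
        sent.append(job)
        queued.remove(job)
    return sent

queue = deque([{"id": 1, "agent": "a"}, {"id": 2, "agent": "a"}, {"id": 3, "agent": "b"}])
sent = dispatch(queue, running=[], max_concurrent=2)
print([j["id"] for j in sent])  # [1, 3] — job 2 waits for agent "a" to free up
```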

Choosing a Limit

| Limit | Best For |
|---|---|
| 1-2 | Small servers, limited resources |
| 4 (default) | Typical installations, balanced performance |
| 8-10 | Large installations, powerful servers, many agents |
| 20+ | Enterprise deployments, dedicated backup infrastructure |

Considerations:

  • Server resources (CPU, RAM, network bandwidth)
  • Storage I/O capacity
  • Number of agents
  • Backup window requirements

Auto-Refresh

The Queue page automatically refreshes every 10 seconds to show:

  • New jobs added to the queue
  • Updated progress bars for running jobs
  • Newly completed jobs moving to the Recent section
  • Updated stat cards

Disabling Auto-Refresh:

  • Click the "Pause Auto-Refresh" toggle in the top-right
  • Useful when analyzing job details or logs

Screenshot: Auto-refresh toggle in queue page header

Filtering and Searching

Filter by Status

Click stat cards to filter the job list:

  • Click Failed (24h) card to show only failed jobs
  • Click Completed (24h) to show only successful jobs
  • Click In Queue to show queued/sent jobs

Search by Client

Use the search box to filter jobs by client name:

  • Type client name
  • Job list updates in real-time
  • Search is case-insensitive
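
The search behaviour amounts to a case-insensitive substring match, sketched below (the `client` field name is an assumption for illustration):

```python
def filter_jobs(jobs, client_query):
    # Case-insensitive substring match on client name, as described above.
    q = client_query.lower()
    return [j for j in jobs if q in j["client"].lower()]

jobs = [{"id": 1, "client": "Web-Prod"}, {"id": 2, "client": "db-prod"}, {"id": 3, "client": "staging"}]
print([j["id"] for j in filter_jobs(jobs, "PROD")])  # [1, 2]
```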

Screenshot: Search box filtering jobs by client name

Filter by Job Type

Use the job type dropdown to show only specific types:

  • Select "backup" to see only backup jobs
  • Select "restore_mysql" to see database restores
  • Select "All" to clear filter

Notifications for Job Events

BBS sends notifications for critical job events:

  • Backup Failed: Red notification when backup job fails
  • Agent Offline: Agent hasn't polled in 3x interval
  • Storage Low: Repository approaching disk space limit

See Notifications for configuring email alerts and notification preferences.
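
The "3x interval" offline rule can be sketched as a simple staleness check. This is a hypothetical model of the rule stated above, with an assumed per-agent poll interval:

```python
from datetime import datetime, timedelta

def agent_offline(last_poll, poll_interval_s, now, factor=3):
    # An agent is flagged offline when it hasn't polled within
    # `factor` x its poll interval (the 3x rule described above).
    return (now - last_poll) > timedelta(seconds=factor * poll_interval_s)

now = datetime(2026, 2, 3, 12, 0)
print(agent_offline(now - timedelta(seconds=200), 60, now))  # True  (200s > 180s)
print(agent_offline(now - timedelta(seconds=100), 60, now))  # False (100s <= 180s)
```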

Performance Monitoring

Identifying Slow Jobs

Jobs running longer than expected may indicate:

  • Large data sets (normal for first backup)
  • Network congestion (S3 sync, remote repositories)
  • Disk I/O bottlenecks (slow storage)
  • Agent resource constraints (CPU, RAM)

Solutions:

  • Check job detail output log for borg performance stats
  • Monitor server resources (htop, iotop)
  • Consider compression level adjustments
  • Increase agent resources (RAM, faster disk)

Queue Backlog

Many jobs stuck in queued state:

Possible Causes:

  • Max concurrent jobs limit reached
  • Agents offline or not polling
  • Long-running jobs blocking queue

Solutions:

  • Increase max concurrent jobs limit
  • Check agent connectivity
  • Cancel or troubleshoot stuck running jobs

Troubleshooting

Job Stuck in "queued"

Cause: Agent not polling, or max concurrent jobs reached

Solution:

  • Check agent status on client detail page
  • Verify agent service is running: systemctl status bbs-agent
  • Check max concurrent jobs setting (Settings → General)
  • Review In Progress section for long-running jobs

Job Stuck in "sent"

Cause: Agent received job but hasn't started execution

Solution:

  • Wait 1-2 minutes (agent may be preparing)
  • Check agent logs: journalctl -u bbs-agent -n 50
  • Restart agent: systemctl restart bbs-agent
  • Cancel and retry the job

Job Stuck in "running"

Cause: Job is executing but taking longer than expected, or agent crashed

Solution:

  • Check job detail output log for activity
  • If no recent log output, agent may have crashed
  • SSH to client, check if borg process is running: ps aux | grep borg
  • If borg is running, wait; if not, cancel and retry
  • Check for borg lock file: /var/bbs/repos/{id}/lock.exclusive

Job Failed with "Lock Exists"

Cause: Previous job crashed and left a lock file

Solution:

  • SSH to client
  • Remove lock file: rm /var/bbs/repos/{id}/lock.*
  • Retry the job
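
If you administer many repositories, the cleanup can be scripted. The helper below is a hypothetical sketch, not shipped with BBS; as above, only remove lock files after confirming no borg process is still running against the repository:

```python
import tempfile
from pathlib import Path

def clear_stale_locks(repo_path):
    # Remove stale Borg lock files (e.g. lock.exclusive, lock.roster)
    # from a repository directory. Dangerous if borg is still running!
    removed = []
    for lock in Path(repo_path).glob("lock.*"):
        lock.unlink()
        removed.append(lock.name)
    return sorted(removed)

# Demo against a throwaway directory standing in for /var/bbs/repos/{id}
repo = tempfile.mkdtemp()
Path(repo, "lock.exclusive").touch()
print(clear_stale_locks(repo))  # ['lock.exclusive']
```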

Job Failed with "Permission Denied"

Cause: Agent user lacks permissions to read directories or write to repository

Solution:

  • Check directory permissions on client
  • Ensure agent user (e.g., bbs-{agent-id}) can read backup directories
  • Verify repository storage path permissions
  • Consider using sudo or adjusting directory ownership

Job Failed with "Disk Full"

Cause: Repository storage is out of disk space

Solution:

  • Check disk space: df -h /var/bbs/repos/
  • Run prune jobs to remove old archives
  • Run compact jobs to reclaim space
  • Add more disk space to server
  • Adjust retention policies to keep fewer archives

Best Practices

Monitoring

  • Check Queue Daily: Review failed jobs, investigate errors
  • Set Up Alerts: Enable email notifications for backup failures
  • Review Trends: Monitor average duration and failure rates over time
  • Dashboard Visibility: Keep Queue page open on monitoring displays

Job Management

  • Cancel Stuck Jobs: Don't let jobs run indefinitely if they're stalled
  • Retry Strategically: Fix underlying issues before retrying failed jobs
  • Stagger Schedules: Avoid scheduling all backups at the same time to prevent queue congestion
  • Prune Regularly: Schedule prune jobs to run automatically after backups

Performance

  • Concurrent Jobs: Start with default (4), increase if queue backlogs occur
  • Compression: Prefer faster compression settings for quicker backups (e.g., lz4, or zstd,3 for a balance of speed and ratio)
  • Exclusions: Exclude unnecessary files to reduce backup size and time
  • Network: For S3 sync, use bandwidth limits to avoid saturating connections

Related Documentation
