This project provides a two-script utility for finding and verifying publicly accessible pages on a Confluence Server or Data Center instance.
The first script scans the Confluence `/rest/api/content` endpoint anonymously. It finds all publicly visible pages and writes their titles and URLs to a `public_pages.csv` file. The script handles API pagination and builds full, clickable URLs for the output file.
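A minimal sketch of what that scan can look like, assuming the third-party `requests` library; `CONFLUENCE_BASE_URL` and `PAGE_SIZE` are placeholders, and the response fields (`results`, `title`, `_links.webui`) follow the standard Confluence REST content response:

```python
import csv

import requests

CONFLUENCE_BASE_URL = "https://confluence.example.com"  # placeholder: your site's URL
PAGE_SIZE = 50  # results per API call

def find_public_pages():
    """Walk the paginated content API anonymously, collecting (title, url) pairs."""
    pages, start = [], 0
    while True:
        # No session or auth header: anonymous access only returns public content.
        resp = requests.get(
            f"{CONFLUENCE_BASE_URL}/rest/api/content",
            params={"type": "page", "start": start, "limit": PAGE_SIZE},
            timeout=30,
        )
        resp.raise_for_status()
        data = resp.json()
        for result in data.get("results", []):
            # _links.webui is a relative path; prefix the base URL for a clickable link.
            pages.append((result["title"], CONFLUENCE_BASE_URL + result["_links"]["webui"]))
        if data.get("size", 0) < PAGE_SIZE:
            break  # a short page means we've reached the end of the results
        start += PAGE_SIZE
    return pages

with open("public_pages.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Title", "URL"])
    writer.writerows(find_public_pages())
```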
The second script complements the first by automating the audit process. It reads the generated `public_pages.csv` file, takes a random 10% sample of the URLs, and attempts to access them anonymously to confirm they are still public.
To run quickly while staying polite to the server, the script uses:

- **Concurrency:** a `ThreadPoolExecutor` to make many HTTP requests in parallel.
- **Rate limiting:** a `Semaphore` to cap simultaneous requests (e.g., 5 at a time) and avoid being blocked.
The script reports a final summary of which sampled links passed (returned a 200 OK status) and which failed (e.g., 403 Forbidden, 404 Not Found, or a connection error).
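A minimal sketch of this verification pass, assuming `public_pages.csv` has the `Title,URL` header used in the sketch above; `check_url` and `limiter` are illustrative names:

```python
import csv
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from threading import Semaphore

import requests

SAMPLE_PERCENT = 10          # fraction of URLs to spot-check
MAX_CONCURRENT_REQUESTS = 5  # rate limit: at most 5 requests in flight

limiter = Semaphore(MAX_CONCURRENT_REQUESTS)

def check_url(url):
    """Fetch one URL anonymously and classify the outcome."""
    with limiter:  # the Semaphore caps how many requests run at once
        try:
            resp = requests.get(url, timeout=15)
            return url, "PASS" if resp.status_code == 200 else f"FAIL ({resp.status_code})"
        except requests.RequestException as exc:
            return url, f"FAIL (connection error: {exc})"

with open("public_pages.csv", newline="", encoding="utf-8") as f:
    urls = [row["URL"] for row in csv.DictReader(f)]

# Random 10% sample, but always at least one URL (assumes a non-empty CSV).
sample = random.sample(urls, max(1, len(urls) * SAMPLE_PERCENT // 100))

with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_REQUESTS) as pool:
    results = [f.result() for f in as_completed(pool.submit(check_url, u) for u in sample)]

failures = [(url, outcome) for url, outcome in results if outcome != "PASS"]
print(f"Checked {len(results)} sampled URLs; {len(failures)} failed.")
for url, outcome in failures:
    print(f"  {outcome}: {url}")
```

Note that the `Semaphore` is technically redundant when the pool's `max_workers` equals the cap, but keeping both matches the description above and lets the two limits be tuned independently.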
## Step 1: Find Public Pages
- Open the first script (e.g., `find_public_pages.py`).
- Update the `CONFLUENCE_BASE_URL` variable at the top with your site's URL.
- Run the script from your terminal: `python find_public_pages.py`
- Wait for it to complete. A `public_pages.csv` file will be created in the same directory (see the example layout after this list).
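The resulting CSV might look like the following; the `Title,URL` header matches the sketch above, and the example row is purely illustrative:

```csv
Title,URL
Getting Started,https://confluence.example.com/pages/viewpage.action?pageId=12345
```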
## Step 2: Verify the Results
- Open the second script (e.g., `verify_pages.py`).
- You can adjust `SAMPLE_PERCENT` or `MAX_CONCURRENT_REQUESTS` at the top if needed (see the snippet after this list).
- Run the script: `python verify_pages.py`
- The script will test a sample of the links from the CSV and print a summary report to your terminal, highlighting any failures.
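For reference, the two tunables might sit at the top of `verify_pages.py` like this; the names come from this README, and the values shown are the examples it mentions (a 10% sample, 5 concurrent requests):

```python
# Tunables read by the verifier (example defaults from this README).
SAMPLE_PERCENT = 10          # audit a random 10% of the URLs in public_pages.csv
MAX_CONCURRENT_REQUESTS = 5  # cap on simultaneous anonymous requests
```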