Open
Deploys a bot-block.conf to all servers from a blocked_ips variable defined per environment in group_vars. Apache rejects listed IPs before the request reaches mod_wsgi/Django, at near-zero CPU cost.

UA-based bot blocking (ClaudeBot, GPTBot, etc.) lives in Django's BotBlockingMiddleware (Gluejar/regluit PR #1094). This conf is the supplemental layer for egregious single-IP offenders.

Changes:
- templates/bot-block.conf.j2: new Apache conf, renders empty if blocked_ips is undefined or empty (safe for test/ondeck with no list)
- tasks/apache.yml: deploy + a2enconf the new conf
- group_vars/production/vars.yml: adds blocked_ips with 216.73.216.178 (ClaudeBot single IP, 229K req on 2026-02-26)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Two critical gaps addressed:

1. UA blocking moved to Apache (pre-WSGI). SetEnvIfNoCase rules mirror Django's BAD_ROBOTS list. Apache rejects matched requests before they occupy a mod_wsgi thread, which protects all 30 WSGI slots instead of using one to return a 403 from Django.
2. Tencent Cloud 43.173.0.0/16 added to blocked_cidrs. On 2026-02-26 this single /16 sent 14,797 req across 1,495 unique IPs: too distributed for per-IP blocking, so it needs CIDR treatment.

Also splits blocked_ips (single-host offenders) from blocked_cidrs (network ranges) for clearer intent in vars files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

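The per-network figures quoted above (requests and unique IPs inside one /16) are the kind of numbers a small aggregation over the access log produces. The following is a minimal sketch of that analysis, not the script that was actually used. It assumes the client address is the first whitespace-separated field of each log line, and the function names and sample addresses are hypothetical.

```python
import ipaddress
from collections import defaultdict


def client_ips_from_log(lines):
    """Yield the first field of each line (assumed to be the client IP)."""
    for line in lines:
        fields = line.split()
        if fields:
            yield fields[0]


def summarize_by_network(client_ips, prefix_len=16):
    """Group IPv4 addresses by enclosing network.

    Returns {network: (request_count, unique_ip_count)}.
    """
    requests = defaultdict(int)
    unique = defaultdict(set)
    for raw in client_ips:
        try:
            addr = ipaddress.ip_address(raw)
        except ValueError:
            continue  # skip fields that are not IP addresses
        if addr.version != 4:
            continue  # this sketch only groups IPv4 traffic
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        requests[net] += 1
        unique[net].add(addr)
    return {net: (requests[net], len(unique[net])) for net in requests}


if __name__ == "__main__":
    # Illustrative lines only; these are not taken from the real logs.
    sample = [
        '43.173.1.10 - - [26/Feb/2026:10:00:00 +0000] "GET / HTTP/1.1" 200 512',
        '43.173.2.20 - - [26/Feb/2026:10:00:01 +0000] "GET / HTTP/1.1" 200 512',
        '43.173.1.10 - - [26/Feb/2026:10:00:02 +0000] "GET / HTTP/1.1" 200 512',
    ]
    summary = summarize_by_network(client_ips_from_log(sample))
    for net, (reqs, ips) in sorted(summary.items(), key=lambda kv: -kv[1][0]):
        print(f"{net}: {reqs} requests from {ips} unique IPs")
```

A network with many requests spread thinly over many unique addresses is a candidate for blocked_cidrs, while a single address with a very high count fits blocked_ips.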
- Add apache-log-gzip cron: gzip logs older than 1 day at 03:00
- Update apache-log-cleanup: delete .log.gz older than 30 days (was deleting .log older than 14 days, no compression)
- Access logs compress ~86%; 4.9G → 943M on first manual run today
- Update restart-workaround comment: bot mitigation now live on prod

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

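The retention policy in this commit has two steps: compress logs older than 1 day, then delete compressed logs older than 30 days. The sketch below models that policy in Python so the behaviour is easy to reason about. It is not the cron job from the commit, which is not shown here. The `*.log` and `*.log.gz` filename patterns are assumptions taken from the wording of the commit message, and the function names are hypothetical.

```python
import gzip
import shutil
import time
from pathlib import Path

DAY = 86400  # seconds


def compress_old_logs(log_dir, min_age_days=1, now=None):
    """Gzip every *.log file last modified more than min_age_days ago."""
    now = time.time() if now is None else now
    compressed = []
    for path in sorted(Path(log_dir).glob("*.log")):
        if now - path.stat().st_mtime <= min_age_days * DAY:
            continue  # too recent, may still be written to
        target = path.with_name(path.name + ".gz")
        with open(path, "rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # Keep the original mtime so the 30-day cleanup counts from the
        # log's own date rather than from the time of compression.
        shutil.copystat(path, target)
        path.unlink()
        compressed.append(target)
    return compressed


def delete_old_archives(log_dir, max_age_days=30, now=None):
    """Delete every *.log.gz file last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(log_dir).glob("*.log.gz")):
        if now - path.stat().st_mtime > max_age_days * DAY:
            path.unlink()
            removed.append(path)
    return removed
```

Both functions accept an explicit `now` so the age thresholds can be exercised against a temporary directory without waiting for files to age.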
Four new UAs observed flooding prod on 2026-02-28 (03:00 UTC):

- meta-webindexer/1.1: 206 req; Meta bot, different UA than meta-externalagent (same 57.141.x.x network, just switched UA string)
- DataForSeoBot/1.0: 25 req; explicit SEO crawler
- QIHU 360SE: 32 req; Chinese bot/scraper
- MetaSr 1.0: 26 req; old Chinese browser UA used by scrapers

All four were passing through the Apache block and reaching WSGI slots. Applied live to prod (manual bot-block.conf) before this commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-28 live patch: 4 new bot UAs added

Prod was slow at ~03:04 UTC (load avg 4.59, five Apache workers pegging CPU at 40–98%). Access log analysis identified four UAs not in the existing block list: meta-webindexer/1.1, DataForSeoBot/1.0, QIHU 360SE, and MetaSr 1.0.

Applied live to prod (manual bot-block.conf).
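For reference, the access log analysis mentioned in this comment can be approximated by counting requests per User-Agent and filtering out agents that are already blocked. This is a hedged sketch rather than the commands that were actually run. It assumes Apache's combined log format, where the User-Agent is the last quoted field, and all function names and sample lines are hypothetical.

```python
import re
from collections import Counter

# Combined log format ends with: "referer" "user-agent"
_UA_RE = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def count_user_agents(lines):
    """Count requests per User-Agent; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = _UA_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def unblocked_agents(counts, blocked_substrings, min_requests=1):
    """Return (ua, count) pairs matching no blocked substring.

    Matching is case-insensitive, similar in spirit to SetEnvIfNoCase.
    Results are ordered from most to least frequent.
    """
    lowered = [s.lower() for s in blocked_substrings]
    return [
        (ua, n)
        for ua, n in counts.most_common()
        if n >= min_requests and not any(s in ua.lower() for s in lowered)
    ]


if __name__ == "__main__":
    # Illustrative lines only; addresses and paths are made up.
    sample = [
        '57.141.0.1 - - [28/Feb/2026:03:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "meta-webindexer/1.1"',
        '57.141.0.2 - - [28/Feb/2026:03:00:02 +0000] "GET /a HTTP/1.1" 200 512 "-" "meta-webindexer/1.1"',
        '192.0.2.7 - - [28/Feb/2026:03:00:03 +0000] "GET /b HTTP/1.1" 403 199 "-" "ClaudeBot/1.0"',
    ]
    for ua, n in unblocked_agents(count_user_agents(sample), ["ClaudeBot", "GPTBot"]):
        print(f"{n:6d}  {ua}")
```

Agents that appear near the top of this list but are missing from the block list are the ones worth reviewing by hand before adding a new rule.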
Summary
Adds an Ansible-managed Apache conf that blocks specific IPs at the web-server level, before requests reach mod_wsgi/Django, with near-zero CPU cost. IPs are configured per-environment via `group_vars`.

Companion to Gluejar/regluit#1094 (BotBlockingMiddleware for UA-based blocking). The two layers work together:

- `BotBlockingMiddleware` (regluit#1094): UA-based blocking inside Django
- `bot-block.conf` (this PR): IP-based blocking in Apache, before mod_wsgi/Django

Changes
- `roles/regluit_prod/templates/bot-block.conf.j2` (new)
  - Renders a `<RequireAll>` / `Require not ip` block for each entry in `blocked_ips`
  - Renders empty if `blocked_ips` is undefined or empty, so it is safe for test/ondeck
- `roles/regluit_prod/tasks/apache.yml`
  - Deploys the template to `/etc/apache2/conf-available/bot-block.conf`
  - Runs `a2enconf bot-block` and triggers Apache restart
- `group_vars/production/vars.yml`
  - Adds `blocked_ips` list with `216.73.216.178` (ClaudeBot single IP, 229K req on 2026-02-26)

Adding / removing IPs
Edit `blocked_ips` in `group_vars/production/vars.yml` and re-run the playbook. No server login required.

Test plan
- `ansible-playbook -i hosts setup-test.yml` succeeds
- `curl -I https://test.unglue.it/` returns 403 from a blocked IP, 200 from a clean IP
- With an undefined or empty `blocked_ips`, conf file renders as empty and Apache starts cleanly
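
The rendering and matching behaviour that the test plan checks can be modelled in a few lines of Python. This is a sketch of the expected behaviour, not the Jinja2 template from this PR. The `<Location "/">` wrapper and the `Require all granted` line are assumptions: Apache only accepts `<RequireAll>` in a directory-type context and rejects a `<RequireAll>` that contains nothing but negated directives. The function names are hypothetical.

```python
import ipaddress


def render_bot_block(blocked_ips=None, blocked_cidrs=None):
    """Return Apache conf text, or an empty string when nothing is listed."""
    entries = list(blocked_ips or []) + list(blocked_cidrs or [])
    if not entries:
        return ""  # mirrors "renders empty if undefined or empty"
    for entry in entries:
        # Raises ValueError early on a typo such as "43.173.0.0/33".
        ipaddress.ip_network(entry, strict=False)
    lines = [
        '<Location "/">',  # assumed wrapper, see lead-in
        "    <RequireAll>",
        "        Require all granted",
    ]
    lines += [f"        Require not ip {entry}" for entry in entries]
    lines += ["    </RequireAll>", "</Location>", ""]
    return "\n".join(lines)


def is_blocked(client_ip, blocked_ips=None, blocked_cidrs=None):
    """True if client_ip falls inside any listed address or network."""
    addr = ipaddress.ip_address(client_ip)
    entries = list(blocked_ips or []) + list(blocked_cidrs or [])
    return any(
        addr in ipaddress.ip_network(entry, strict=False) for entry in entries
    )


if __name__ == "__main__":
    print(render_bot_block(["216.73.216.178"], ["43.173.0.0/16"]))
    print(is_blocked("216.73.216.178", ["216.73.216.178"]))  # expect a 403
    print(is_blocked("192.0.2.1", ["216.73.216.178"]))       # expect a 200
```

Validating each entry before rendering is a design choice of this sketch: a malformed address fails the run instead of producing a conf that Apache refuses to load.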