launchpad-reviewers team mailing list archive
Message #31308
[Merge] ~ikoruk/launchpad:user-agent into launchpad:master
Yuliy Schwartzburg has proposed merging ~ikoruk/launchpad:user-agent into launchpad:master.
Commit message:
Adding blocked user agents to Apache for mainsite and API
This is specifically to block Bytedance from scraping LP, which is degrading performance
Requested reviews:
Launchpad code reviewers (launchpad-reviewers)
For more details, see:
https://code.launchpad.net/~ikoruk/launchpad/+git/launchpad/+merge/470982
The blocked user agents should be "Bytespider|Bytedance"
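To make the effect of that value concrete, here is a minimal Python sketch of how the RewriteCond pattern in the templates below would classify a few User-Agent strings. It assumes the option is set to "Bytespider|Bytedance" as suggested above; the helper name is_blocked and the sample strings are illustrative only and are not part of the branch. Python's re module stands in for Apache's regex engine, which should behave the same for a pattern this simple.

```python
import re

# Assumption: the charm option holds the value suggested in this proposal.
blocked_user_agents = "Bytespider|Bytedance"

# Same shape as the RewriteCond pattern in the vhost templates;
# re.IGNORECASE plays the role of the [NC] flag.
pattern = re.compile(rf"^.*({blocked_user_agents}).*$", re.IGNORECASE)


def is_blocked(user_agent: str) -> bool:
    """Hypothetical helper: True if the rule would reject this User-Agent."""
    return pattern.search(user_agent) is not None


# Illustrative User-Agent strings, not taken from real logs.
samples = [
    "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)",
    "bytedance-crawler/1.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0",
]
for ua in samples:
    print("blocked" if is_blocked(ua) else "allowed", "-", ua)
```

Because the match is an unanchored, case-insensitive substring test, any client that includes either token anywhere in its User-Agent header would be rejected, and a client that omits or changes the header would not be.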
--
Your team Launchpad code reviewers is requested to review the proposed merge of ~ikoruk/launchpad:user-agent into launchpad:master.
diff --git a/charm/launchpad-appserver/config.yaml b/charm/launchpad-appserver/config.yaml
index b82a3d1..31aed04 100644
--- a/charm/launchpad-appserver/config.yaml
+++ b/charm/launchpad-appserver/config.yaml
@@ -12,6 +12,11 @@ options:
     description: >
       Cognitive Services subscription key for the Bing Custom Search API.
     default:
+  blocked_user_agents:
+    type: string
+    description: >
+      User agents that should be blocked from Launchpad, separated by '|'.
+    default:
   csrf_secret:
     type: string
     description: >
diff --git a/charm/launchpad-appserver/reactive/launchpad-appserver.py b/charm/launchpad-appserver/reactive/launchpad-appserver.py
index 879dbe4..5446487 100644
--- a/charm/launchpad-appserver/reactive/launchpad-appserver.py
+++ b/charm/launchpad-appserver/reactive/launchpad-appserver.py
@@ -331,6 +331,12 @@ def deconfigure_vhost():
     remove_state("launchpad.vhost.configured")
 
 
+@when("config.changed.blocked_user_agents")
+def reconfigure_blocked_user_agents():
+    remove_state("launchpad.vhost.configured")
+    remove_state("launchpad.api-vhost.configured")
+
+
 @when("api-vhost-config.available", "service.configured")
 @when_not("launchpad.api-vhost.configured")
 def configure_api_vhost():
diff --git a/charm/launchpad-appserver/templates/vhosts/api-https.conf.j2 b/charm/launchpad-appserver/templates/vhosts/api-https.conf.j2
index 52b9225..f9ed5fe 100644
--- a/charm/launchpad-appserver/templates/vhosts/api-https.conf.j2
+++ b/charm/launchpad-appserver/templates/vhosts/api-https.conf.j2
@@ -30,6 +30,12 @@
RewriteEngine on
+{% if blocked_user_agents %}
+ # Block certain user agents
+ RewriteCond %{HTTP_USER_AGENT} ^.*({{ blocked_user_agents }}).*$ [NC]
+ RewriteRule .* - [F,L]
+{%- endif %}
+
RewriteRule ^/offline\.html$ - [PT]
RewriteRule ^/robots\.txt$ - [PT]
RewriteRule ^/\+apidoc/(.*) /$1 [PT]
diff --git a/charm/launchpad-appserver/templates/vhosts/mainsite-https.conf.j2 b/charm/launchpad-appserver/templates/vhosts/mainsite-https.conf.j2
index 16708c2..7aac31e 100644
--- a/charm/launchpad-appserver/templates/vhosts/mainsite-https.conf.j2
+++ b/charm/launchpad-appserver/templates/vhosts/mainsite-https.conf.j2
@@ -38,6 +38,12 @@
RewriteEngine on
+{% if blocked_user_agents %}
+ # Block certain user agents
+ RewriteCond %{HTTP_USER_AGENT} ^.*({{ blocked_user_agents }}).*$ [NC]
+ RewriteRule .* - [F,L]
+{%- endif %}
+
{% if google_site_verification %}
# https://portal.admin.canonical.com/C49078: File needed for Google to
# verify domain control.