Fix HAL (Halton) scraper #347

Open
symroe wants to merge 1 commit into master from fix/HAL-scraper
symroe (Member) commented Jun 9, 2026

What broke

The scraper used http://councillors.halton.gov.uk as its base URL. Halton's server is now HTTPS-only: it sends an HSTS header (max-age=31536000) and no longer accepts TCP connections on port 80, so the scraper timed out after 30 seconds. Switching to https:// restores connectivity, but wreq's embedded BoringSSL CA bundle does not trust Halton's TLS certificate, so verify_requests = False is also required.
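The failure mode described above (port 80 closed, port 443 open) can be confirmed with a plain TCP probe. The following is a minimal sketch using only the Python standard library; the helper name `port_accepts_connections` and the 5-second default timeout are illustrative choices, not part of the scraper framework.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Timeouts, refused connections, and DNS failures are all subclasses of
    OSError, so any of them is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (requires network access, so it is left commented out):
# port_accepts_connections("councillors.halton.gov.uk", 80)   # plain HTTP port
# port_accepts_connections("councillors.halton.gov.uk", 443)  # HTTPS port
```

A probe like this only checks TCP reachability. It says nothing about whether the certificate served on port 443 is trusted, which is the second half of the problem.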

What was fixed

  • Changed base_url from http://councillors.halton.gov.uk to https://councillors.halton.gov.uk in metadata.json
  • Added verify_requests = False to the Scraper class in councillors.py
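As a rough illustration of the two changes listed above, the sketch below shows how a class-level `verify_requests` flag and a `base_url` read from metadata might feed into request options. `BaseScraper`, `request_options`, and the inline `METADATA_JSON` fragment are hypothetical stand-ins written for this example; the real framework's base class and the full contents of metadata.json are not shown in this PR and may differ.

```python
import json

# Illustrative fragment only: the real metadata.json has other fields not shown here.
METADATA_JSON = '{"base_url": "https://councillors.halton.gov.uk"}'


class BaseScraper:
    """Hypothetical stand-in for the framework's scraper base class."""

    # Verify TLS certificates unless a subclass opts out.
    verify_requests = True

    def __init__(self, metadata: dict):
        self.base_url = metadata["base_url"]

    def request_options(self) -> dict:
        """Options a scraper would pass to its HTTP client (assumed shape)."""
        return {"verify": self.verify_requests}


class Scraper(BaseScraper):
    # The one-line change from councillors.py: skip certificate verification,
    # because the client's bundled CA store does not trust Halton's certificate.
    verify_requests = False


scraper = Scraper(json.loads(METADATA_JSON))
print(scraper.base_url, scraper.request_options())
```

Keeping the flag on the per-council `Scraper` class limits the opt-out to this one council, so other scrapers keep verifying certificates by default.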

Scrape results

| Metric | Count |
| --- | --- |
| Councillors found | 53 |
| With email address | 53 |
| With photo | 53 |

Generated by Claude Code

The scraper was using http://councillors.halton.gov.uk which now fails
with a TCP connection timeout — the server no longer listens on port 80
(HSTS is enforced with max-age=31536000). Switching to HTTPS restores
connectivity, but wreq's embedded BoringSSL CA bundle does not trust
Halton's certificate, so verify_requests = False is also required.

symroe (Member, Author) commented Jun 9, 2026

Re-scrape after 4042ad0

Initial fix: HTTPS migration plus verify_requests = False. With both changes applied, the scraper ran cleanly in 4 seconds against https://councillors.halton.gov.uk/mgWebService.asmx/GetCouncillorsByWard.

| Metric | Count |
| --- | --- |
| Councillors found | 53 |
| With email address | 53 |
| With photo | 53 |

All councillors have both email and photo populated via the ModGov XML feed.


Generated by Claude Code
