
Add B&H Photo WebHarbor mirror #38

Open
Lxr-max wants to merge 1 commit into aiming-lab:main from Lxr-max:add-bh-photo-site

Lxr-max commented May 28, 2026

Site

Implementation Summary

This PR adds a local B&H Photo mirror built with Flask, SQLAlchemy, Jinja2, and SQLite. The site covers product search, category browsing, rich specs, reviews, Q&A, product comparison, bundles, wishlist, cart, reserve-in-store mock, checkout mock, order history, and account flows.

The mirror uses deterministic local product data and local SVG product/media assets. There are no external runtime calls, no real payment processing, and no live B&H dependencies.

Key User Flows

  • Search and filter products by category, brand, price, rating, availability, condition, and technical spec facets
  • Browse product detail, specs, reviews, and Q&A pages
  • Add products to compare, wishlist, and cart
  • Add bundles, reserve products in store locally, and complete a deterministic mock checkout
  • Review account details and order history

Seed Row Counts

  • users: 4
  • categories: 19
  • brands: 25
  • products: 146
  • product_specs: 1752
  • product_reviews: 186
  • product_questions: 206
  • product_answers: 206
  • bundles: 20
  • store_locations: 8
  • deals: 146
  • cart_items: 9
  • wishlist_items: 14
  • compare_items: 12
  • orders: 5
  • order_items: 10
  • store_reservations: 4
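
The counts above can be checked mechanically against the seed database. The following is a minimal sketch, assuming the SQLite table names match the labels in the list (an assumption, not confirmed by the PR); `EXPECTED_COUNTS` and `count_mismatches` are hypothetical names, and only a subset of tables is shown.

```python
import sqlite3

# Expected counts copied from the list above (subset shown for brevity).
# Assumption: the real table names match these labels exactly.
EXPECTED_COUNTS = {
    "users": 4,
    "categories": 19,
    "brands": 25,
    "products": 146,
    "bundles": 20,
}


def count_mismatches(conn, expected):
    """Return {table: (expected, actual)} for each table whose row count differs."""
    mismatches = {}
    for table, want in expected.items():
        # Table names come from a trusted constant, so quoting them
        # directly into the statement is acceptable in this sketch.
        (got,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
        if got != want:
            mismatches[table] = (want, got)
    return mismatches
```

A call such as `count_mismatches(sqlite3.connect("instance_seed/bh_photo.db"), EXPECTED_COUNTS)` would be expected to return an empty dict if the seed matches the description.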

Benchmark Users

  • alice.j@test.com
  • bob.c@test.com
  • carol.d@test.com
  • david.k@test.com
  • Password: TestPass123!

Tasks

  • Task count: 20
  • Task categories: search, category filters, used/open-box deals, comparison, wishlist, cart updates, bundle purchase, reserve-in-store, checkout, order history, Q&A lookup, review/rating use, compatibility lookup, multi-page reasoning

WebSyn Port

  • Selected port: 40015

Verification Summary

  • Docker build and container run completed successfully
  • Control-plane health and all registered sites returned healthy status in the test container
  • Key B&H routes including home, categories, category listings, search, compare, deals, used, bundles, pickup, help, contact, and login returned 200
  • Login/logout, category filtering, sorting, product detail/specs/reviews/Q&A, compare, wishlist, cart, bundle add, reserve-in-store, checkout, account edit, and order history were smoke-tested successfully
  • Reset endpoint POST /reset/bh_photo returned ready successfully
  • instance/bh_photo.db and instance_seed/bh_photo.db matched after reset and after container restart
  • MD5: 8ba7a9d1ca30047959cc5ae9da9418f2
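
For anyone repeating the reset check, a small helper along these lines could compare the live and seed databases. This is a sketch using only the standard library; `file_md5` and `dbs_match` are hypothetical names, not part of the PR.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dbs_match(live_db, seed_db):
    """True when the live DB is byte-identical to the seed DB (by MD5)."""
    return file_md5(live_db) == file_md5(seed_db)
```

After `POST /reset/bh_photo`, `dbs_match("instance/bh_photo.db", "instance_seed/bh_photo.db")` should be true, and `file_md5` on either file should equal the digest listed above.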

Screenshot Paths

  • Local review screenshots: C:\Users\34475\Desktop\VScode\WebHarborRepo-bh\sites\bh_photo\scraped_data\submission_review

HF Asset Status

HF asset PR is open and pending merge: https://huggingface.co/datasets/ChilleD/WebHarbor/discussions/28

This GitHub PR is opened for code review first. The local instance_seed DB and static assets were prepared locally and uploaded through the HF asset workflow.

Known Limitations

  • Uses deterministic local SVG product/media assets rather than copied B&H-owned product imagery
  • No real payment processing, order fulfillment, or live B&H calls
  • Final .assets-revision update still depends on HF asset PR merge

Manual Review Notes

  • Please review retail header/search density and comparison table readability
  • Please spot-check product filtering behavior and seeded spec realism
  • Please review mobile product layout and purchase column behavior

@YuanDaoze

Review

Tested on a fresh worktree of this branch. The HF asset (bh_photo.tar.gz, 113 KB) was pulled directly from the dataset PR ref (refs/pr/28); the other 15 sites' assets were hard-linked from upstream/main. Built as webharbor:rev38 and run on alternate ports (:8241 / :42400-42415) to avoid colliding with my dev container.

What works ✓

Mechanical

  • Three-place registration in sync (websyn_start.sh SITES, control_server.py SITES, Dockerfile EXPOSE 8101 40000-40015).
  • All 16 sites alive after boot. bh_photo on :42415 returns 200.
  • PR base is current (af0765d, today's upstream/main). This is the only PR I've reviewed where that is true: #11 "Add UC Berkeley mirror site (port 40015)", #9 "feat(drugs_com): add drugs.com mirror site (port 40015)", #25 "Add Compass real-estate mirror (port 40015)", and #32 "Add GOV.UK mirror site (port 40015)" all sat on 3c408d8 init. Single clean commit.
  • Byte-identical reset holds: md5(instance/bh_photo.db) == md5(instance_seed/bh_photo.db) == 8ba7a9d1ca30047959cc5ae9da9418f2 both before and after POST /reset/bh_photo — matches the PR description verbatim.
  • Reset hygiene confirmed: registered a new "Test User" (newtest2@test.com), instance/ MD5 changed to e6a85e6…, then /reset/bh_photo restored it to 8ba7a9d1….
  • Tarball is clean — 0 macOS ._* AppleDouble files. 167 SVG assets (products + bundles + 1 fallback) + 1 seed DB.
  • tasks.jsonl has 20 tasks, all five required fields on every line, port matches 40015.
  • Build context clean — no .db, no static/images/, no scraped_data/ committed. Tarball exists only on HF.
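
The tarball check above can be reproduced with the standard `tarfile` module. This is a rough sketch of the kind of scan involved, not the exact commands I ran; `find_appledouble` and `count_by_suffix` are hypothetical helper names.

```python
import tarfile
from pathlib import PurePosixPath


def find_appledouble(tar):
    """Return member names whose basename starts with '._' (macOS AppleDouble sidecars)."""
    return [
        name for name in tar.getnames()
        if PurePosixPath(name).name.startswith("._")
    ]


def count_by_suffix(tar, suffix):
    """Count regular-file members whose name ends with the given suffix (case-insensitive)."""
    return sum(
        1 for member in tar.getmembers()
        if member.isfile() and member.name.lower().endswith(suffix)
    )
```

Opened with `tarfile.open("bh_photo.tar.gz", "r:gz")`, a clean archive gives an empty list from `find_appledouble`, and `count_by_suffix(tar, ".svg")` should report the 167 SVG assets noted above.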

Functional depth

  • 38 routes, 21 templates, 4211 LOC across app.py + seed_data.py. Reasonable scope for the domain.
  • Real B&H aesthetic: yellow-on-black header, dark green nav strip, "Demo mirror only" disclosure banner. SVG product line-art is consistent across categories (cameras / lenses / monitors / drones / audio).
  • /categories, /c/<slug>, /search, /used, /bundles, /compare, /store-pickup, /help, /contact all 200 with populated content.
  • Product detail at /product/<slug> is rich: hero + 3-thumb gallery, breadcrumb, price + sale price + savings badge, In Stock / Free Shipping / 30-Day Return / Pickup Ready row, Sign-in-to-add-to-cart + Reserve-in-store CTAs, key specs (sensor/megapixels/recording/lens mount), Overview/Specs/Reviews/Q&A tab nav, "Similar products in this category" carousel.
  • Subroutes /product/<slug>/specs|reviews|qa exist as separate pages.
  • /bundles lists 20 bundles (Documentary Rig, Color Grade Suite, Sony Hybrid Creator Kit, etc.) each with included items and bundle price.
  • Login (alice.j@test.com / TestPass123!) works. /account shows alice's role (Photographer · Northlight Weddings), saved activity counts (4 wishlist items, 3 compare, 2 cart, 2 orders, 1 reservation), Pickup preference card, Recent reservations card.
  • /cart shows 2 alice items with quantity update + remove + summary (subtotal/shipping/tax/total) + Proceed to Checkout.
  • /account/orders shows BH-YYYYMMDD-NNNN format order numbers with delivery status, ship vs pickup, line items, totals.
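
Since several tasks use order numbers as ground truth, the BH-YYYYMMDD-NNNN format could be validated with something like the sketch below. It assumes the middle segment is always a real calendar date, which the observed order numbers are consistent with but the PR does not state; `parse_order_number` is a hypothetical name.

```python
import re
from datetime import datetime

# BH-, eight digits for the date, four digits for the sequence number.
ORDER_RE = re.compile(r"BH-(\d{8})-(\d{4})")


def parse_order_number(value):
    """Return (date, sequence) for a BH-YYYYMMDD-NNNN string, or None if malformed."""
    match = ORDER_RE.fullmatch(value)
    if match is None:
        return None
    try:
        day = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a valid calendar date.
        return None
    return day, int(match.group(2))
```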

Task quality

  • 20 tasks span 14 categories per PR description: search, category filters, used/open-box deals, comparison, wishlist, cart updates, bundle purchase, reserve-in-store, checkout, order history, Q&A lookup, review/rating, compatibility lookup, multi-page reasoning.
  • 8 of 20 tasks require authentication with named user + specific action — high specificity prevents leaks ("Sign in as Bob Chen and find the order containing Sony FX30 Studio Cinema Camera" → only one such order exists in Bob's history: BH-20260409-0201, verified).
  • Spot-checked 4 risky tasks:
    • --0 Sony 33MP full-frame mirrorless → /product/sony-aurora-a7x-mirrorless-camera exists
    • --11 DJI Airframe 4S Q&A → /product/dji-airframe-4s-fly-more-drone/qa returns 200 and contains "local store pickup" terminology
    • --17 David Kim's processing order → /account/orders shows exactly one Processing · Store pickup entry (BH-20260411-0401)
    • --10 Bob's order with Sony FX30 → confirmed BH-20260409-0201
  • Naming convention is cute and benchmark-safe: real B&H product names lightly remixed (Sony Aurora A7X ↔ Sony A7 IV, Canon Orbit R6 ↔ Canon R6 II, Nikon Zenith Z8 ↔ Nikon Z8). This avoids accidental real-product knowledge leaking into agent answers, since neither the model nor a Bing search will match these names.

Should-fix (non-blocking)

1. .assets-revision pinned to main, but bh_photo.tar.gz is in HF PR ref refs/pr/28 and not yet merged.

./scripts/fetch_assets.sh on a fresh clone won't find bh_photo on main and will silently skip → site won't boot. Author acknowledges this in the "Known Limitations" section ("Final .assets-revision update still depends on HF asset PR merge"), which is honest, but means this PR isn't independently reviewable without the manual refs/pr/28 workaround I used.

Suggest either bumping .assets-revision to a fork ref / PR ref temporarily (the merriam_webster PR uses repo: YuanDaozeiii/WebHarbor for this), or coordinating with HF maintainers to merge HF PR #28 first.

2. Benchmark seed users use bcrypt.generate_password_hash() with random salt.

def set_password(self, password: str) -> None:
    self.password_hash = bcrypt.generate_password_hash(password).decode("utf-8")

Within a single shipped image this is fine — the seed DB is fixed and reset re-copies it. But re-running seed_data.py against an empty DB would produce a different bh_photo.db MD5 every time (bcrypt's per-row random salt). That means the seed DB is not reproducible from source code alone — anyone wanting to verify the asset has to trust the shipped tarball.

Compass (PR #25) handles this by writing PBKDF2 hashes with hand-derived deterministic salts for benchmark users while keeping bcrypt for live registrations. Worth borrowing that pattern so future maintainers can regenerate bh_photo.db deterministically from seed_data.py.
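
To make the suggestion concrete, here is a minimal sketch of deterministic seed hashing with the standard library. It is not the Compass code: the storage format, the `bh_photo-seed` namespace, the iteration count, and all function names are assumptions for illustration.

```python
import hashlib
import hmac

ITERATIONS = 200_000  # illustrative work factor, not taken from either PR


def deterministic_salt(email, namespace="bh_photo-seed"):
    """Derive a fixed 16-byte salt from the user's email (benchmark accounts only)."""
    return hashlib.sha256(f"{namespace}:{email}".encode("utf-8")).digest()[:16]


def seed_password_hash(email, password):
    """Return a reproducible PBKDF2-SHA256 hash string for a seeded benchmark user."""
    salt = deterministic_salt(email)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"pbkdf2_sha256${ITERATIONS}${salt.hex()}${derived.hex()}"


def verify_seed_password(stored, password):
    """Check a password against a string produced by seed_password_hash."""
    scheme, iterations, salt_hex, hash_hex = stored.split("$")
    if scheme != "pbkdf2_sha256":
        return False
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    # Constant-time comparison to avoid leaking match position.
    return hmac.compare_digest(derived.hex(), hash_hex)
```

Because the salt depends only on the email, re-running the seed script yields the same stored string each time, which removes this source of MD5 drift. Predictable salts are only acceptable here because the benchmark credentials are already public; live registrations should keep a random-salt scheme such as bcrypt.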

3. Visual fidelity — synthetic SVG product images.

Author transparently calls this out as a deliberate IP/legal trade-off ("Uses deterministic local SVG product/media assets rather than copied B&H-owned product imagery"). It's a defensible choice but the cards do feel wireframe-y compared to the photographic richness of e.g. Compass (PR #25). Not a bug, just noting that the visual gap with the real bhphotovideo.com is significant. The 167 SVGs are at least style-consistent across the site, which keeps the overall look coherent.

4. requirements.txt is unused.

Same comment as on PR #9 and #25. The Dockerfile pip-installs explicitly with locked versions (correct per AGENTS.md). sites/bh_photo/requirements.txt is dead code — either delete or wire it in.

5. _health.py orphan file.

sites/bh_photo/_health.py exists but isn't imported anywhere and isn't part of any framework hook. Same pattern as PR #9 / #25 — looks like a holdover from a custom health-check infra that didn't ship. Either delete or document.

Bottom line

Strongest mechanical hygiene of the five PRs I've reviewed: clean tarball, MD5 matches description, PR is properly rebased onto current main (only one of five), reset hygiene perfect, 8/20 tasks require authenticated multi-step workflows with verified-distinct ground truths. The two real items are the same .assets-revision situation that affects every PR with a pending HF asset, plus a soft suggestion to hash seed users deterministically. The IP-careful synthetic-SVG choice is reasonable; visual fidelity is the trade. Once HF #28 merges, this is solidly mergeable.
