PowderCoatingLogix/scripts/Prismatic Data Scraper/Services
spouliot da2bb46d5a Tighten Prismatic scrape parsing after live smoke test
Validated against live product pages; fixed three edge cases (also present in
the original JS scraper) surfaced by specialty AkzoNobel products:

- Sample image: only accept real product images on the NIC CDN
  (images.nicindustries.com/prismatic/products), preferring full-size over
  thumbnail. Dropped the loose "prismatic|powder|color" fallback that grabbed
  the site logo on products with no image.
- SDS/TDS/app-guide links: require the href to be an actual document (NIC CDN
  or a .pdf) so a generic /documents nav link isn't captured as the SDS.
- Description: also stop at PRODUCT SUPPORT / PRODUCT COLLECTIONS / CUSTOMER
  SERVICE so less of the page footer is captured (app-side StripBoilerplate
  cleans up the rest).
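
The three rules above can be sketched roughly as follows. This is an
illustrative Python sketch, not the code in this commit: the function names,
the "thumb" substring test for thumbnails, and treating
images.nicindustries.com as the only NIC CDN host are all assumptions. It also
covers only the three stop markers named above, not any the scraper already
handled.

```python
from urllib.parse import urlparse

# Host and path prefix taken from the commit message; using the same host
# as "the NIC CDN" for document links is an assumption.
NIC_CDN_HOST = "images.nicindustries.com"
NIC_IMAGE_PATH = "/prismatic/products"

# Only the three markers named in the commit message.
FOOTER_MARKERS = ("PRODUCT SUPPORT", "PRODUCT COLLECTIONS", "CUSTOMER SERVICE")


def pick_sample_image(urls):
    """Return a product image on the NIC CDN, preferring full-size, else None."""
    candidates = []
    for url in urls:
        parts = urlparse(url)
        if parts.netloc.lower() == NIC_CDN_HOST and parts.path.startswith(NIC_IMAGE_PATH):
            candidates.append(url)
    if not candidates:
        # No loose keyword fallback: a missing image stays missing.
        return None
    # Assumption: thumbnail URLs contain "thumb" somewhere in the path.
    full_size = [u for u in candidates if "thumb" not in urlparse(u).path.lower()]
    return (full_size or candidates)[0]


def is_document_href(href):
    """Accept only hrefs that point at an actual document (NIC CDN or a .pdf)."""
    parts = urlparse(href)
    return parts.netloc.lower() == NIC_CDN_HOST or parts.path.lower().endswith(".pdf")


def trim_description(text):
    """Cut the description at the earliest footer marker, if any is present."""
    cut = len(text)
    for marker in FOOTER_MARKERS:
        index = text.find(marker)
        if index != -1:
            cut = min(cut, index)
    return text[:cut].rstrip()
```

With these definitions a bare /documents nav link is rejected by
is_document_href, and a page that only carries the site logo yields no sample
image rather than the logo.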

Structural fields (sku, color, price tiers) verified correct on live data.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 12:41:47 -04:00