Add PrismaticSync console tool for unattended Prismatic catalog sync

Standalone .NET 8 console app (not part of the main solution) that scrapes the
Prismatic Powders catalog via Playwright and pushes it into the app's catalog
import. Prismatic has no API, so this runs on a workstation (Task Scheduler),
never the deployed server.

- Discovery: incremental newest-first via ?category=created_at (stops once it
  reaches already-known URLs — cheap, finds new colors) and a full all-colors
  crawl for occasional reconcile.
- Scraper: resumable product-page scrape (sku/color/description/price tiers/
  SDS/TDS/app-guide/image), with --refresh-older-than to re-scrape stale
  products and catch price changes. Output matches the app import format so it
  flows through the same shared upsert as the Columbia sync.
- Resilience: brisk randomized base delay, escalating 403 cooldown-and-retry to
  avoid hard bans, periodic rest. All configurable.
- Visibility: streams every product + the inter-product wait to the console
  (colored) and a log file, with an up-front ETA.
- Push: token-authenticated POST to the app import endpoint (skips to manual
  upload when unconfigured).

The app-side token import endpoint is a separate follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-06-18 11:30:47 -04:00
parent f752abad86
commit c59d55529f
13 changed files with 1037 additions and 0 deletions
@@ -0,0 +1,43 @@
using Microsoft.Playwright;
namespace PrismaticSync.Infrastructure;
/// <summary>
/// A headless Chromium session with a realistic desktop fingerprint (UA, viewport, locale,
/// timezone) — matching the original scraper's settings to look like a normal browser.
/// </summary>
public sealed class BrowserSession : IAsyncDisposable
{
private IPlaywright? _pw;
private IBrowser? _browser;
private IBrowserContext? _context;
public IPage Page { get; private set; } = null!;
public static async Task<BrowserSession> CreateAsync(bool headed)
{
var session = new BrowserSession();
session._pw = await Playwright.CreateAsync();
session._browser = await session._pw.Chromium.LaunchAsync(new BrowserTypeLaunchOptions
{
Headless = !headed
});
session._context = await session._browser.NewContextAsync(new BrowserNewContextOptions
{
UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
"(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
ViewportSize = new ViewportSize { Width = 1365, Height = 900 },
Locale = "en-US",
TimezoneId = "America/New_York"
});
session.Page = await session._context.NewPageAsync();
return session;
}
public async ValueTask DisposeAsync()
{
if (_context is not null) await _context.CloseAsync();
if (_browser is not null) await _browser.CloseAsync();
_pw?.Dispose();
}
}