Public career pages only
Front door, never the back.
We crawl only pages a company has chosen to expose to the open web: /careers, /jobs, and public ATS boards (Greenhouse, Lever, Ashby, Workable). We fetch nothing behind a login or paywall, and nothing that robots.txt disallows.
- Respect robots.txt and meta noindex on every fetch
- Identify our crawler with a contactable User-Agent
- Rate-limit per host, back off on 429/5xx, and never flood a host with parallel requests
- Skip pages requiring authentication, cookies, or CAPTCHA solving
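
As an illustration only, the rules above could be sketched in Python with the standard library. This is not our production crawler: the bot name, contact details, 2-second interval, and 300-second cap are hypothetical values chosen for the example, and the helper names (`robots_allows`, `has_noindex`, `backoff_delay`, `HostThrottle`) are invented here.

```python
import time
import urllib.robotparser
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical identity: a real crawler would publish its own URL and mailbox.
USER_AGENT = "ExampleJobsBot/1.0 (+https://example.com/bot; bot@example.com)"
MIN_INTERVAL = 2.0   # assumed minimum seconds between requests to one host
MAX_BACKOFF = 300.0  # assumed ceiling for any single wait


def robots_allows(robots_txt: str, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the given robots.txt body permits fetching url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class _RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots" and "noindex" in attr.get("content", "").lower():
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the page asks robots not to index it."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def backoff_delay(attempt: int, retry_after: Optional[str] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429/5xx."""
    # Honour a numeric Retry-After header when the server sends one.
    if retry_after is not None and retry_after.strip().isdigit():
        return min(float(retry_after), MAX_BACKOFF)
    # Otherwise double the wait on each attempt, up to the cap.
    return min(MIN_INTERVAL * (2 ** attempt), MAX_BACKOFF)


class HostThrottle:
    """Spaces out requests so each host is contacted at most once per interval."""

    def __init__(self, min_interval: float = MIN_INTERVAL, clock=time.monotonic) -> None:
        self.min_interval = min_interval
        self.clock = clock
        self.next_allowed = {}  # host -> earliest time of the next request

    def wait_time(self, url: str) -> float:
        """Reserve the next slot for this URL's host; return seconds to sleep first."""
        host = urlsplit(url).netloc.lower()
        now = self.clock()
        start = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = start + self.min_interval
        return start - now
```

In this sketch a fetch loop would call `robots_allows` before each request, sleep for `HostThrottle.wait_time(url)`, send `USER_AGENT` in the request headers, discard any page where `has_noindex` is true, and sleep for `backoff_delay(attempt, retry_after)` before retrying a 429 or 5xx response.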