Blog
Product updates, engineering deep-dives, and customer stories.
- Product
Welcome to the DocuSift blog
What we're building, why template-free extraction matters, and what to expect from this blog going forward.
- Product
Why template-based extraction keeps breaking (and what we did instead)
Templates worked when documents fit a known shape. They stopped working a long time ago. Here's what fails, why, and how DocuSift's template-free pipeline routes around it.
- Engineering
Per-tenant S3 isolation: how DocuSift keeps customer documents apart
A walkthrough of the storage layer: how we resolve a per-tenant S3 provider on every upload, what gets cached, and how a tenant brings their own bucket without touching the deployment.
- Product
Extraction confidence isn't accuracy: how to actually use the number
Confidence scores are the most misread number in document AI. Here's what they actually mean, how to set thresholds, and why pinning your accuracy KPI to 'average confidence' will quietly mislead you.
- Product
Migrating from Docparser, Nanonets, or Rossum: a practical guide
If you're running a legacy template-based extractor and considering DocuSift, here's the realistic migration path — what to keep, what to throw out, and a one-week plan to validate.
- Engineering
Structured outputs, vision models, and the boring engineering between them
Multimodal LLMs do the magic. The other 80% of a production extraction pipeline is plumbing. A walkthrough of the parts you don't think about until they break: provider routing, retry semantics, schema enforcement, cost attribution.