Reducto releases Deep Extract
Reducto has released 'Deep Extract', an agent-based system for structured data extraction from long, complex documents. An 'agent-in-the-loop' verification process checks the output against the source and corrects it, and Reducto reports 99-100% field accuracy, in some cases higher than expert human labelers. It applies the currently popular 'agent harness' architecture to a common enterprise pain point, which makes it a natural fit for the HN audience.
The Lowdown
Reducto has launched 'Deep Extract', an agent-harness approach to structured data extraction aimed at a familiar problem: inaccurate results on lengthy, intricate documents. It uses an "agent-in-the-loop" methodology in which an AI agent verifies and corrects its own output until it meets the accuracy criteria set for the job.
- Existing extraction pipelines often fail on long documents, leading to errors like dropped line items, which traditionally required tedious human-in-the-loop (HITL) manual checks.
- Deep Extract addresses this by using a multi-step agentic loop: extract, verify against the source, identify discrepancies, and re-extract until a defined quality threshold is met.
- Unlike single-pass models, which tend to take shortcuts on long inputs, Deep Extract deploys sub-agents to break complex tasks into smaller pieces, so that documents with thousands of rows and hundreds of pages are processed in full.
- Users can define custom verification criteria, such as requiring that an invoice's line items reconcile with its total. With these checks, Reducto reports 99-100% field accuracy, outperforming even expert human labelers.
- The system can provide granular bounding box citations for all extracted fields, crucial for audit trails and human review workflows.
- During beta testing, Reducto reports that Deep Extract raised field accuracy from the 10-20% seen with frontier models to near-perfect on complex real-world documents such as payment reports and financial statements.
- While it takes longer than standard extraction, it offers a faster, cheaper, and more consistent alternative to manual human review for high-stakes, large-scale document processing.
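The extract, verify, re-extract loop described in the bullets above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Reducto's implementation or API: every name here (`LineItem`, `Invoice`, `totals_reconcile`, `extract_until_verified`, `stub_extract`) is hypothetical, the extractor is a stub standing in for a model call, and the bounding-box layout is an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LineItem:
    description: str
    amount_cents: int
    page: int                                # citation: source page (assumed field)
    bbox: tuple[float, float, float, float]  # citation: (x0, y0, x1, y1), assumed layout


@dataclass
class Invoice:
    line_items: list[LineItem]
    total_cents: int


# A verifier returns a list of discrepancies; an empty list means the check passed.
Verifier = Callable[[Invoice], list[str]]
# An extractor receives the discrepancies from the previous round (empty at first).
Extractor = Callable[[list[str]], Invoice]


def totals_reconcile(invoice: Invoice) -> list[str]:
    """Custom criterion: the line items must sum to the stated total."""
    item_sum = sum(item.amount_cents for item in invoice.line_items)
    if item_sum != invoice.total_cents:
        return [f"line items sum to {item_sum}, stated total is {invoice.total_cents}"]
    return []


def extract_until_verified(
    extract: Extractor, verifiers: list[Verifier], max_rounds: int = 3
) -> tuple[Invoice, int]:
    """Extract, verify, and re-extract with feedback until every check passes."""
    feedback: list[str] = []
    for round_number in range(1, max_rounds + 1):
        result = extract(feedback)
        feedback = [issue for verify in verifiers for issue in verify(result)]
        if not feedback:
            return result, round_number
    raise RuntimeError(f"still failing after {max_rounds} rounds: {feedback}")


def stub_extract(feedback: list[str]) -> Invoice:
    """Stand-in for a model call: drops a line item unless given feedback."""
    items = [
        LineItem("Widgets", 12_000, page=1, bbox=(0.1, 0.30, 0.9, 0.33)),
        LineItem("Shipping", 1_500, page=2, bbox=(0.1, 0.12, 0.9, 0.15)),
        LineItem("Support", 4_000, page=57, bbox=(0.1, 0.80, 0.9, 0.83)),
    ]
    if not feedback:
        items = items[:-1]  # simulate the "dropped line item" failure on the first pass
    return Invoice(line_items=items, total_cents=17_500)


if __name__ == "__main__":
    invoice, rounds = extract_until_verified(stub_extract, [totals_reconcile])
    print(f"verified after {rounds} round(s), {len(invoice.line_items)} line items")
```

In this sketch the first pass drops the last line item, the totals check flags the mismatch, and the second pass, given that feedback, returns all three items. Passing verifiers in as plain functions is one way to model user-defined criteria; a real system would also need a cap on rounds, as here, to bound latency and cost.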
Deep Extract is available as a configuration option on Reducto's existing Extract endpoint, so developers and enterprise teams can enable self-correcting extraction for critical data tasks.