Help & Documentation

Everything you need to get started with DocuExtract

Getting Started

New to DocuExtract? Start here.

DocuExtract is an AI-powered document extraction tool. Upload PDFs or images and extract structured data using the latest AI vision models.

  1. Create a free account or sign in with Google

  2. Upload an image or PDF document

  3. Choose an extraction method: Quick Extract for images or Template Extract for PDFs

  4. Review your extracted data, then export it as Markdown or JSON, or copy it to the clipboard

Quick Extract

Instant AI extraction from images

Quick Extract is the fastest way to get structured data from images. Paste from clipboard, drag and drop, or upload images directly.

  • Paste images with Ctrl+V from anywhere — screenshots, browser content, image editors
  • Drag and drop one or multiple images directly onto the page
  • Click to upload images from your device
  • Optionally specify what to extract with a custom prompt, or let the AI auto-detect
  • Process multiple images in a single extraction job

Extraction Templates

Reusable schemas for structured output

Templates define what data to extract and how to structure the output. Create templates for different document types — invoices, receipts, forms, catalogs, and more.

  • Define custom output fields and data types with a visual schema editor
  • Set system prompts to give the AI domain-specific expertise
  • Write detailed extraction prompts describing exactly what to look for
  • Attach example images for few-shot learning to improve accuracy
  • Use AI to automatically generate output schemas from your extraction prompt
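
As a rough illustration of what a template bundles together, the sketch below models one as a plain Python dictionary and checks an extraction result against its output fields. The dictionary layout, the field names, and the `check_result` helper are all assumptions made for this example; DocuExtract's visual schema editor may store templates differently.

```python
# Hypothetical template: the keys and field names are illustrative only,
# not DocuExtract's actual storage format.
INVOICE_TEMPLATE = {
    "system_prompt": "You are an accounts-payable specialist.",
    "extraction_prompt": "Extract the invoice number, issue date, and total amount.",
    "output_fields": {
        "invoice_number": str,
        "issue_date": str,
        "total": float,
    },
}


def check_result(result: dict, fields: dict) -> list[str]:
    """Return a list of problems found when comparing a result to the expected fields."""
    problems = []
    for name, expected_type in fields.items():
        if name not in result:
            problems.append(f"missing field: {name}")
        elif not isinstance(result[name], expected_type):
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}"
            )
    return problems
```

A result that matches every field yields an empty list, which makes the helper easy to use as a quick sanity check on exported JSON.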

Region Selection

Target specific areas for precise extraction

Region Selection lets you draw boxes on PDF pages to focus the AI on specific areas. This is invaluable for documents with complex layouts — diagrams, forms, tables — where you only need data from certain sections.

  • Draw rectangular regions on any page by clicking and dragging
  • Select multiple regions on a single page for parallel extraction
  • Resize and adjust regions by dragging corner or edge handles
  • Pan and zoom the document for precise region placement
  • Each region is extracted independently for focused, accurate results
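
To make the click-and-drag behaviour concrete, here is a minimal sketch of how a drag gesture could be turned into a rectangular region. The `Region` class, the `region_from_drag` function, and the choice of page-relative coordinates (fractions of the page width and height) are assumptions for illustration, not a description of how DocuExtract stores regions internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangle in page-relative units (0.0 to 1.0). Hypothetical type."""
    x: float
    y: float
    width: float
    height: float


def region_from_drag(x0, y0, x1, y1, page_w, page_h) -> Region:
    """Build a region from drag start/end points given in page pixels.

    The drag may go in any direction, so the corners are sorted first,
    then clamped to the page bounds.
    """
    left = max(0.0, min(x0, x1))
    top = max(0.0, min(y0, y1))
    right = min(float(page_w), max(x0, x1))
    bottom = min(float(page_h), max(y0, y1))
    if right <= left or bottom <= top:
        raise ValueError("region has no area inside the page")
    return Region(
        left / page_w,
        top / page_h,
        (right - left) / page_w,
        (bottom - top) / page_h,
    )
```

Storing fractions rather than pixels is one way to keep a region valid when the page is zoomed or rendered at a different resolution.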

Batch Extraction

Process multiple pages asynchronously

Batch extraction processes every page in a PDF document (or a specified page range) and returns a single combined result. Jobs run asynchronously, so you do not have to wait for them to finish.

  • Specify start and end pages, or extract all pages at once
  • Jobs run in the background — you can close the tab and come back later
  • Real-time progress updates via WebSocket connection
  • View and manage queued jobs with the built-in job queue panel
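
The start and end page options can be thought of as follows. This is only a sketch of the selection logic under the assumption that pages are numbered from 1 and that both bounds are inclusive; `resolve_page_range` is a hypothetical helper, not part of DocuExtract.

```python
from typing import Optional


def resolve_page_range(
    total_pages: int,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> list[int]:
    """Return the 1-based page numbers a batch job would cover.

    Omitting both bounds selects every page in the document.
    """
    if total_pages < 1:
        raise ValueError("document has no pages")
    first = 1 if start is None else start
    last = total_pages if end is None else end
    if not (1 <= first <= last <= total_pages):
        raise ValueError(f"invalid page range {first}-{last} for {total_pages} pages")
    return list(range(first, last + 1))
```

For a 10-page document, `resolve_page_range(10, 3, 5)` selects pages 3, 4, and 5, while `resolve_page_range(10)` selects all ten.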

Jobs & Export

View history, manage results, and export data

The Job Center is your central hub for all extraction jobs. Filter, sort, and export results in multiple formats.

  • Browse all extraction jobs with filters for status, type, model, and date
  • Re-run failed extractions or retry with a different AI model
  • Copy results as Markdown, JSON, or Excel-compatible table format
  • Select multiple jobs and batch export to a single Markdown file
  • Edit extraction results inline and save changes back to the database
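
For a sense of how the three copy formats differ, the sketch below renders the same rows as JSON, a Markdown table, and tab-separated text. Treating "Excel-compatible" as tab-separated values is an assumption, as are the function names; cells containing pipes, tabs, or line breaks are not escaped in this simplified version.

```python
import json


def to_json(rows: list[dict]) -> str:
    """Serialize rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_markdown(rows: list[dict]) -> str:
    """Render rows as a Markdown table, using the first row's keys as headers."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(row.get(h, "")) for h in headers) + " |")
    return "\n".join(lines)


def to_tsv(rows: list[dict]) -> str:
    """Render rows as tab-separated text that pastes into spreadsheet cells."""
    if not rows:
        return ""
    headers = list(rows[0].keys())
    lines = ["\t".join(headers)]
    for row in rows:
        lines.append("\t".join(str(row.get(h, "")) for h in headers))
    return "\n".join(lines)
```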

Account & Billing

Manage your plan, usage, and profile

DocuExtract offers a Free tier for getting started and a Pro tier for power users.

Free Plan

  • 3 quick extractions per month
  • Gemini Flash & GPT models
  • 1 custom template
  • 1 document, 1 page per document

Pro Plan

  • Unlimited extractions
  • All AI models including Claude & Gemini Pro
  • Unlimited templates and documents
  • Unlimited pages per document
  • Priority support

Frequently Asked Questions

DocuExtract works with PDF documents and most image formats (PNG, JPG, WebP). For Quick Extract, you can paste screenshots or upload images. For Template Extract, upload PDF files.

Accuracy depends on document quality and the AI model you choose. Claude and Gemini Pro models typically achieve the highest accuracy. Using templates with output schemas and example images significantly improves results.

Documents are transmitted over encrypted connections (HTTPS), stored with access controls, and processed via AI provider APIs that do not use your data for training. See our Privacy Policy for details.

For most use cases, Gemini Flash offers the best speed-to-quality ratio. For complex or dense documents, Claude Sonnet/Opus or Gemini Pro deliver superior accuracy. Pro users have access to all models.

Free plan users who reach their monthly limit can wait for it to reset or upgrade to Pro for unlimited extractions. Usage resets on the first of each month.
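
The reset rule can be worked out with simple date arithmetic. The sketch below assumes the reset happens at the start of the first calendar day of the following month; the function names are hypothetical, and the limit of 3 is the Free plan's quick-extraction allowance listed above.

```python
from datetime import date

FREE_QUICK_EXTRACTIONS_PER_MONTH = 3  # Free plan allowance


def next_reset(today: date) -> date:
    """Return the first day of the month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


def remaining_extractions(used_this_month: int) -> int:
    """Return how many Free plan quick extractions are left this month."""
    return max(0, FREE_QUICK_EXTRACTIONS_PER_MONTH - used_this_month)
```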

You can cancel anytime through the Billing page. You'll retain Pro access until the end of your current billing period.

Use single page extraction when you need results from specific pages or want to select regions. Use batch extraction when you want to process an entire document or page range into a combined result.

Draw regions tightly around the content you want extracted. Avoid including too much whitespace or unrelated content. For tables, include the header row in your region. Each region is processed independently.

Still need help?

Our support team is here to assist you.