Overview
The Document Converter API provides RESTful endpoints for uploading, converting, and retrieving documents. All endpoints return JSON responses and use standard HTTP status codes.
Base URL: http://localhost:8000/api/v1
Content-Type: application/json for responses, multipart/form-data for file uploads
Authentication
The current version does not require authentication. For production deployments, implement authentication middleware.
Jobs API
Create Job
curl -X POST "http://localhost:8000/api/v1/jobs" \
-H "Content-Type: multipart/form-data" \
-F "file=@document.pdf" \
-F "output_format=md" \
-F "webhook_url=https://your-site.com/webhook"
Request Parameters:
output_format: Output format, either md for Markdown or json for structured JSON.
webhook_url: URL to receive job completion notifications.
use_ocr: Enable OCR processing for images and scanned documents.
OCR provider: paddle, easyocr, or mistral.
Response:
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "pending",
  "progress": 0,
  "file_name": "document.pdf",
  "output_format": "md",
  "use_ocr": false,
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:00:00Z"
}
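For clients that do not shell out to curl, the upload can be sketched in Python with only the standard library. This is a minimal sketch, not the project's client: the helper names `build_multipart` and `create_job` are hypothetical, and the form field name `file` is an assumption about what the server expects.

```python
import json
import os
import urllib.request
import uuid

BASE_URL = "http://localhost:8000/api/v1"  # base URL from the overview


def build_multipart(fields, file_field, filename, content):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode()
        + content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def create_job(path, output_format="md"):
    """POST a file to /jobs and return the decoded job object."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(
            {"output_format": output_format},
            "file",  # assumed form field name
            os.path.basename(path),
            fh.read(),
        )
    request = urllib.request.Request(
        f"{BASE_URL}/jobs",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_job("document.pdf")` against a running server should return a job object shaped like the response above; its `id` is what the status and result endpoints take.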
Get Job Status
curl "http://localhost:8000/api/v1/jobs/123e4567-e89b-12d3-a456-426614174000"
Response:
{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "completed",
  "progress": 100,
  "file_name": "document.pdf",
  "output_format": "md",
  "use_ocr": false,
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:05:00Z",
  "completed_at": "2024-01-15T10:05:00Z"
}
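Because conversion is asynchronous, a client usually polls this endpoint until the job finishes. The following is a rough sketch under a few assumptions: `fetch_job` and `wait_for_job` are hypothetical helper names, and `pending` and `processing` are treated as the only in-progress statuses because they are the only ones shown in this reference.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # base URL from the overview


def fetch_job(job_id):
    """GET /jobs/{job_id} and return the decoded JSON object."""
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}") as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_job, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll until the job leaves the in-progress statuses or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch(job_id)
        # Assumption: any status other than these two means the job has finished.
        if job["status"] not in ("pending", "processing"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

The `fetch` and `sleep` parameters exist so the loop can be exercised without a server; in normal use `wait_for_job(job_id)` is enough. If a `webhook_url` was supplied when the job was created, polling may be unnecessary.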
Download Result
curl "http://localhost:8000/api/v1/jobs/123e4567-e89b-12d3-a456-426614174000/result" \
-o converted_document.md
Health API
Health Check
curl "http://localhost:8000/api/v1/health"
Response:
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "version": "1.0.0"
}
Readiness Check
curl "http://localhost:8000/api/v1/ready"
Response:
{
  "status": "ready",
  "timestamp": "2024-01-15T10:30:00Z",
  "services": {
    "redis": true,
    "celery": true
  }
}
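A deployment probe might want to know which dependency is down rather than only whether the service is ready. The sketch below interprets a readiness body of the shape shown above; the function name `is_ready` is hypothetical, and since this reference does not show a not-ready response, treating any false entry under `services` as a failure is an assumption.

```python
def is_ready(payload):
    """Return (ready, failing_services) for a decoded /ready response body."""
    services = payload.get("services", {})
    # Collect the names of services reported as unavailable.
    failing = sorted(name for name, ok in services.items() if not ok)
    ready = payload.get("status") == "ready" and not failing
    return ready, failing
```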
File Download Behavior
When downloading converted files via the /api/v1/jobs/{job_id}/result endpoint, the filename is automatically generated based on the original filename and output format:
Filename Generation Rules
Original extension is removed: the file extension from the uploaded file is stripped.
New extension is added: the appropriate extension for the output format is appended.
Base name is preserved: the original filename (without extension) is kept intact.
Examples
| Original Filename | Output Format | Download Filename |
| --- | --- | --- |
| document.pdf | md | document.md |
| presentation.pptx | md | presentation.md |
| spreadsheet.xlsx | json | spreadsheet.json |
| My Report (v2).docx | md | My Report (v2).md |
| data-2024.csv | json | data-2024.json |
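The naming rules can be mirrored on the client side, for example to predict where a download will be saved. This is a sketch of the rules as described, not the server's actual code; `download_filename` is a hypothetical name, and how the server treats names with several dots (such as `archive.tar.gz`) is not documented here.

```python
from pathlib import PurePath


def download_filename(original_name, output_format):
    """Strip the last extension from the upload name and append the output format."""
    stem = PurePath(original_name).stem  # base name without its final extension
    return f"{stem}.{output_format}"
```

For instance, `download_filename("My Report (v2).docx", "md")` gives `My Report (v2).md`, matching the examples above.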
Markdown (.md): human-readable Markdown with images embedded as base64
JSON (.json): structured data with content, metadata, and base64-encoded images
The download response includes appropriate headers:
Content-Type:
  text/markdown for .md files
  application/json for .json files
Content-Disposition: attachment; filename="generated_filename"
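A client that wants to save the file under the server-generated name can read it from the Content-Disposition header instead of hard-coding one. The sketch below uses the standard library's `email.message.Message` parser; the function name `filename_from_headers` and the fallback value are illustrative choices, and the headers are assumed to arrive as a plain dict with this exact key casing.

```python
from email.message import Message


def filename_from_headers(headers, fallback="converted_document"):
    """Extract the filename parameter from a Content-Disposition header."""
    value = headers.get("Content-Disposition")
    if not value:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = value
    # get_filename() reads the header's filename parameter and removes the quotes.
    return msg.get_filename() or fallback
```

With `urllib.request`, the response's `headers` object already provides `get_filename()`, so this helper is mainly useful with HTTP libraries that expose headers as simple mappings.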
Error Responses
400 Bad Request
The request was invalid or cannot be served.
{
  "detail": "Job is not completed yet"
}
404 Not Found
The requested resource was not found.
{
  "detail": "Job not found"
}
413 Request Entity Too Large
The uploaded file exceeds the maximum size limit.
{
  "detail": "File too large. Maximum size is 100MB"
}
422 Unprocessable Entity
The request was well-formed but contains semantic errors.
{
  "detail": [
    {
      "loc": ["body", "output_format"],
      "msg": "Invalid output format",
      "type": "value_error"
    }
  ]
}
500 Internal Server Error
An unexpected error occurred on the server.
{
  "detail": "Internal server error"
}
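Note that `detail` takes two shapes in the examples above: a plain string in most responses, and a list of objects with `loc`, `msg`, and `type` in the validation error. A client can normalize both into one message. This is a minimal sketch; `error_message` is a hypothetical helper, and it assumes no other shapes occur.

```python
def error_message(status_code, body):
    """Turn an error response body into a single human-readable line."""
    detail = body.get("detail")
    if isinstance(detail, str):
        return f"{status_code}: {detail}"
    if isinstance(detail, list):
        parts = []
        for item in detail:
            # "loc" is a path such as ["body", "output_format"]; join it with dots.
            field = ".".join(str(p) for p in item.get("loc", []))
            parts.append(f"{field}: {item.get('msg', '')}")
        return f"{status_code}: " + "; ".join(parts)
    return f"{status_code}: unknown error"
```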
Rate Limiting
Rate limiting is not implemented in the current version. For production deployments, implement rate limiting middleware.
Webhook Notifications
When a job completes (successfully or with failure), a webhook notification is sent to the specified URL:
{
  "job_id": "123e4567-e89b-12d3-a456-426614174000",
  "status": "completed",
  "progress": 100,
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:05:00Z",
  "completed_at": "2024-01-15T10:05:00Z",
  "result_url": "http://localhost:8000/api/v1/jobs/123e4567-e89b-12d3-a456-426614174000/result",
  "metadata": {}
}
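On the receiving side, a webhook handler mostly needs to decide whether to fetch the result. The sketch below covers only that decision and leaves the HTTP server to whatever framework is in use. The function name `handle_webhook` and the returned action labels are hypothetical, and because the failure payload is not shown in this reference, any status other than `completed` is treated as a failure by assumption.

```python
import json


def handle_webhook(raw_body):
    """Decode a webhook body and return (action, job_id, result_url)."""
    payload = json.loads(raw_body)
    job_id = payload["job_id"]
    if payload.get("status") == "completed":
        # The result can be downloaded from the URL included in the notification.
        return ("download", job_id, payload.get("result_url"))
    # Assumption: every non-completed notification signals a failed job.
    return ("failed", job_id, None)
```

Since the API itself is unauthenticated, a production receiver would likely also want some way to verify that the request really came from the converter.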
Example Workflow
Here’s a complete example of converting a document:
Upload Document
curl -X POST "http://localhost:8000/api/v1/jobs" \
-F "file=@presentation.pptx" \
-F "output_format=md"
Response: {"id": "job_123", "status": "pending", ...}
Check Status
curl "http://localhost:8000/api/v1/jobs/job_123"
Response: {"status": "processing", "progress": 50, ...}
Download Result
curl "http://localhost:8000/api/v1/jobs/job_123/result" -o presentation.md
The converted document is saved as presentation.md