Extract multiple FHIR resources from a document

POST /lang2fhir/document/multi

Extracts text from a document (PDF or image) and converts it into multiple FHIR resources, returned as a transaction Bundle. Combines document text extraction with multi-resource detection. Automatically detects Patient, Condition, MedicationRequest, Observation, and other resource types. Resources are linked with proper references (e.g., Conditions reference the Patient).

Patient identifier handling. US Core requires Patient.identifier (a business identifier such as an MRN). When the source text contains an identifier, it is extracted with an appropriate URI system. When the source text does not contain a detectable identifier, a synthetic one is generated with system: "urn:phenoml:lang2fhir-generated-id" and a UUID value so the bundle remains FHIR-valid and US Core conformant. Callers who need a tenant-specific namespace should rewrite the synthetic system after extraction.
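The namespace rewrite mentioned above can be done as a small post-processing step on the returned bundle. The following is a minimal sketch, not part of the API: the function name `rewrite_synthetic_identifiers` and the tenant URI in the usage note are hypothetical, and the sketch assumes the bundle has already been parsed into Python dictionaries.

```python
import copy

# System URI the endpoint assigns to generated identifiers (from the text above).
SYNTHETIC_SYSTEM = "urn:phenoml:lang2fhir-generated-id"


def rewrite_synthetic_identifiers(bundle, tenant_system):
    """Return a copy of `bundle` whose synthetic Patient identifiers use `tenant_system`.

    Identifiers that were extracted from the source text keep their original system.
    """
    result = copy.deepcopy(bundle)  # leave the caller's bundle untouched
    for entry in result.get("entry", []):
        resource = entry.get("resource") or {}
        if resource.get("resourceType") != "Patient":
            continue
        for identifier in resource.get("identifier", []):
            if identifier.get("system") == SYNTHETIC_SYSTEM:
                identifier["system"] = tenant_system
    return result
```

For example, `rewrite_synthetic_identifiers(response["bundle"], "https://tenant.example/patient-ids")` would move generated identifiers into a hypothetical tenant namespace while leaving the UUID values unchanged.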

Requires Bearer authentication

Body parameters

version string required

FHIR version to use

content string required

Base64 encoded file content. Supported file types: PDF (application/pdf), PNG (image/png), JPEG (image/jpeg). File type is auto-detected from content magic bytes.
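Because the file type is detected from magic bytes, it can help to run the same kind of check locally before encoding. The sketch below is a client-side pre-check only; it uses the standard file signatures for PDF, PNG, and JPEG and does not claim to mirror the server's detection logic. The helper names `sniff_mime_type` and `encode_document` are hypothetical.

```python
import base64

# Standard leading-byte signatures for the three supported file types.
MAGIC_BYTES = {
    "application/pdf": b"%PDF",
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
}


def sniff_mime_type(data):
    """Return the MIME type whose signature `data` starts with, or None."""
    for mime_type, signature in MAGIC_BYTES.items():
        if data.startswith(signature):
            return mime_type
    return None


def encode_document(data):
    """Base64-encode raw file bytes for the `content` field.

    Raises ValueError when the bytes do not look like a PDF, PNG, or JPEG.
    """
    if sniff_mime_type(data) is None:
        raise ValueError("unsupported file type: expected PDF, PNG, or JPEG")
    return base64.b64encode(data).decode("ascii")
```

A typical call would read the file in binary mode, for instance `encode_document(open("note.pdf", "rb").read())`, and pass the resulting string as `content`.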

provider string optional

Optional FHIR provider name for provider-specific profiles

implementation_guide string optional

Custom Implementation Guide name. When specified, profiles from this IG are included alongside US Core profiles during resource detection. US Core is always the base layer; custom IG profiles are additive.

detection_effort string optional default: standard

Detection effort. 'standard' runs detection once, 'deep' runs detection multiple times for higher recall.

Allowed values: standard, deep

validation_method string optional default: none

FHIR validation method to apply to the generated bundle. 'none' skips validation (default). 'check' runs the bundle through a FHIR structure validator and includes the results in the response. 'fix' runs validation and attempts to auto-correct errors using an LLM (up to 3 validation passes). The response includes results from each pass. Warning: 'fix' can significantly increase latency due to multiple LLM and validation round-trips.

Allowed values: none, check, fix
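When validation_method is 'check' or 'fix', a caller will usually want to know whether any blocking issues remain. The sketch below shows one way to read them from a parsed response. It rests on two assumptions that the text above does not confirm: that `passes` is ordered chronologically, so the last element is the final pass, and that `severity` follows the FHIR OperationOutcome codes (fatal, error, warning, information). The helper name `blocking_issues` is hypothetical.

```python
# Severities treated as blocking (assumption: FHIR OperationOutcome severity codes).
BLOCKING_SEVERITIES = {"fatal", "error"}


def blocking_issues(response):
    """Return fatal/error issues from the last validation pass, or [] if none ran."""
    validation = response.get("validation") or {}
    passes = validation.get("passes") or []
    if not passes:
        return []
    final_pass = passes[-1]  # assumption: passes are listed in the order they ran
    return [
        issue
        for issue in final_pass.get("issues") or []
        if issue.get("severity") in BLOCKING_SEVERITIES
    ]
```

An empty result means either that validation was skipped ('none') or that the final pass reported nothing at fatal or error level.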
config object optional

Optional processing configuration shared across document endpoints.

page_filter object optional

Configures per-page pre-extraction filtering. When set, each page of text extracted from the document is classified by an LLM, and pages classified as irrelevant to the supplied context are dropped before FHIR extraction.

context string required

Natural-language description of what IS relevant to the extraction goal. Pages that do not match are dropped from downstream FHIR extraction.
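The body parameters above can be assembled and checked client-side before sending. This is a minimal sketch under stated assumptions: the helper name `build_payload` and its keyword arguments are hypothetical, and the checks only mirror the allowed values and required fields listed in this section.

```python
DETECTION_EFFORTS = {"standard", "deep"}
VALIDATION_METHODS = {"none", "check", "fix"}


def build_payload(version, content_b64, *, provider=None, implementation_guide=None,
                  detection_effort="standard", validation_method="none",
                  page_context=None):
    """Build the JSON body for POST /lang2fhir/document/multi as a dict."""
    if detection_effort not in DETECTION_EFFORTS:
        raise ValueError(f"detection_effort must be one of {sorted(DETECTION_EFFORTS)}")
    if validation_method not in VALIDATION_METHODS:
        raise ValueError(f"validation_method must be one of {sorted(VALIDATION_METHODS)}")

    payload = {
        "version": version,
        "content": content_b64,
        "detection_effort": detection_effort,
        "validation_method": validation_method,
    }
    # Optional fields are omitted entirely when not supplied.
    if provider is not None:
        payload["provider"] = provider
    if implementation_guide is not None:
        payload["implementation_guide"] = implementation_guide
    if page_context is not None:
        if not page_context.strip():
            raise ValueError("page_filter.context is required when page_filter is set")
        payload["config"] = {"page_filter": {"context": page_context}}
    return payload
```

The returned dictionary can be serialized with `json.dumps` and sent with any HTTP client, using the Bearer token header described above.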

Returns

Successfully extracted FHIR resources from document

Response fields

success boolean optional
message string optional
bundle object optional
  resourceType string optional
  type string optional
  entry array<object> optional
    fullUrl string optional
    resource object optional
    request object optional
      method string optional
      url string optional
resources array<object> optional
  tempId string optional
  resourceType string optional
  description string optional
  originalText string optional
validation object optional
  passes array<object> optional
    issues array<object> optional
      severity string optional
      code string optional
      diagnostics string optional
      expression array<string> optional
      source string optional
    stats object optional
      resource_type string optional
      profile_url string optional
      is_custom_profile boolean optional
      duration_ms number optional
  fixed boolean optional
  attempts integer optional
  summary string optional
page_classifications array<object> optional
  page_number integer optional
  include boolean optional
  reason string optional
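When a page_filter is configured, the page_classifications field makes it possible to audit which pages were excluded. The sketch below lists them from a parsed response. The helper name `dropped_pages` is hypothetical, and the sketch assumes that only entries with `include` explicitly set to false were dropped.

```python
def dropped_pages(response):
    """Return (page_number, reason) pairs for pages excluded by the page filter."""
    dropped = []
    for page in response.get("page_classifications") or []:
        # Only an explicit false counts as dropped; a missing flag is ignored here.
        if page.get("include") is False and "page_number" in page:
            dropped.append((page["page_number"], page.get("reason", "")))
    return sorted(dropped)
```

Logging this list next to the extracted bundle gives a quick way to tell whether the `context` description is filtering out too much.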
POST Request
curl -X POST 'https://experiment.app.pheno.ml/lang2fhir/document/multi' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "version": "R4",
  "content": "BASE64_ENCODED_FILE_CONTENT",
  "provider": "canvas",
  "implementation_guide": "acme-cardiology",
  "detection_effort": "standard",
  "validation_method": "none",
  "config": {
    "page_filter": {
      "context": "clinical notes, diagnoses, medications — not sample collection instructions or insurance forms"
    }
  }
}'
200 OK Response
{
  "success": true,
  "message": "Successfully extracted FHIR resources from document",
  "bundle": {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
      {
        "fullUrl": "urn:uuid:a842c4bc-f6cb-4555-9741-ac3aec4ef0b8",
        "resource": {},
        "request": {
          "method": "POST",
          "url": "Patient"
        }
      }
    ]
  },
  "resources": [
    {
      "tempId": "urn:uuid:a842c4bc-f6cb-4555-9741-ac3aec4ef0b8",
      "resourceType": "Patient",
      "description": "John Smith (DOB 1980-05-12) was diagnosed with Type 2 Diabetes during office visit on 2025-03-01 with Dr. Chen",
      "originalText": "diagnosed with Type 2 Diabetes"
    }
  ],
  "validation": {
    "passes": [
      {
        "issues": [
          {
            "severity": "fatal",
            "code": "ABC123",
            "diagnostics": "example",
            "expression": [
              "example"
            ],
            "source": "example"
          }
        ],
        "stats": {
          "resource_type": "Patient",
          "profile_url": "https://example.com",
          "is_custom_profile": true,
          "duration_ms": 0.5
        }
      }
    ],
    "fixed": true,
    "attempts": 10,
    "summary": "example"
  },
  "page_classifications": [
    {
      "page_number": 1,
      "include": true,
      "reason": "clinical notes with diagnoses"
    }
  ]
}
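In the sample response, each item in resources carries a tempId equal to the fullUrl of its bundle entry. Assuming that correspondence holds in general (the sample suggests it, but the field descriptions do not state it), the two lists can be joined so that each extracted resource is paired with the source text it came from. The helper name `join_resources` is hypothetical.

```python
def join_resources(response):
    """Pair each `resources` item with the bundle entry that shares its tempId."""
    bundle = response.get("bundle") or {}
    # Index bundle entries by fullUrl for constant-time lookup.
    entries = {entry.get("fullUrl"): entry for entry in bundle.get("entry") or []}

    joined = []
    for item in response.get("resources") or []:
        entry = entries.get(item.get("tempId"))
        joined.append({
            "tempId": item.get("tempId"),
            "resourceType": item.get("resourceType"),
            "originalText": item.get("originalText"),
            # None when no bundle entry has a matching fullUrl.
            "resource": entry.get("resource") if entry else None,
        })
    return joined
```

Items whose `resource` comes back as None indicate that the tempId-to-fullUrl assumption does not hold for that response and should be investigated rather than silently skipped.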