Extract multiple FHIR resources from a document

POST /lang2fhir/document/multi

Extracts text from a document (PDF or image) and converts it into multiple FHIR resources, returned as a transaction Bundle. Combines document text extraction with multi-resource detection. Automatically detects Patient, Condition, MedicationRequest, Observation, and other resource types. Resources are linked with proper references (e.g., Conditions reference the Patient).
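
The linking described above can be checked on the client before the bundle is submitted to a FHIR server. The following is a minimal sketch, assuming that intra-bundle references use the same `urn:uuid:` values that appear in each entry's `fullUrl` (as the sample response suggests); the helper names `iter_references` and `unresolved_references` are hypothetical and not part of the API.

```python
from typing import Any, Iterator, List


def iter_references(node: Any) -> Iterator[str]:
    """Yield every string stored under a "reference" key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                yield value
            else:
                yield from iter_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_references(item)


def unresolved_references(bundle: dict) -> List[str]:
    """Return urn:uuid references that match no entry fullUrl in the bundle."""
    entries = bundle.get("entry", [])
    known = {entry.get("fullUrl") for entry in entries}
    missing = []
    for entry in entries:
        for ref in iter_references(entry.get("resource") or {}):
            # Assumption: only urn:uuid references are expected to resolve
            # inside the bundle; other reference styles are ignored here.
            if ref.startswith("urn:uuid:") and ref not in known:
                missing.append(ref)
    return missing
```

An empty list means every `urn:uuid:` reference points at an entry in the same bundle.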

Patient identifier handling. US Core requires Patient.identifier (a business identifier such as an MRN). When the source text contains an identifier, it is extracted with an appropriate URI system. When the source text does not contain a detectable identifier, a synthetic one is generated with system: "urn:phenoml:lang2fhir-generated-id" and a UUID value so the bundle remains FHIR-valid and US Core conformant. Callers who need a tenant-specific namespace should rewrite the synthetic system after extraction.
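
Rewriting the synthetic system after extraction could look like the sketch below. It assumes the response bundle has already been parsed into a Python dict; the function name and the tenant system URI passed in by the caller are hypothetical.

```python
SYNTHETIC_SYSTEM = "urn:phenoml:lang2fhir-generated-id"


def rewrite_synthetic_identifiers(bundle: dict, tenant_system: str) -> int:
    """Replace the synthetic identifier system on Patient resources in place.

    Returns the number of identifiers that were rewritten.
    """
    rewritten = 0
    for entry in bundle.get("entry", []):
        resource = entry.get("resource") or {}
        if resource.get("resourceType") != "Patient":
            continue
        for identifier in resource.get("identifier", []):
            # Identifiers extracted from the source text keep their system;
            # only the generated placeholder namespace is swapped.
            if identifier.get("system") == SYNTHETIC_SYSTEM:
                identifier["system"] = tenant_system
                rewritten += 1
    return rewritten
```

For example, `rewrite_synthetic_identifiers(bundle, "https://example.org/fhir/mrn")` moves generated identifiers into an illustrative tenant namespace and leaves the UUID values untouched.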

Requires Bearer authentication

Body parameters

version (string, required)

FHIR version to use, for example "R4".

content (string, required)

Base64 encoded file content. Supported file types: PDF (application/pdf), PNG (image/png), JPEG (image/jpeg). File type is auto-detected from content magic bytes.
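
Preparing the `content` value can be sketched as follows. The client-side magic-byte check is an optional pre-flight step and only an approximation of what the server might do; the exact server-side detection logic is not documented here, and the helper names are hypothetical.

```python
import base64
from typing import Optional

# Leading bytes of the three supported formats (standard file signatures).
MAGIC_BYTES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the file signature, or None."""
    for magic, mime in MAGIC_BYTES.items():
        if data.startswith(magic):
            return mime
    return None


def encode_document(data: bytes) -> str:
    """Base64-encode file bytes for the `content` field after a type check."""
    if sniff_mime_type(data) is None:
        raise ValueError("unsupported file type: expected PDF, PNG, or JPEG")
    return base64.b64encode(data).decode("ascii")
```

A typical call reads the file in binary mode first, for example `encode_document(open("report.pdf", "rb").read())`, where the file name is illustrative.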

provider (string, optional)

Optional FHIR provider name for provider-specific profiles

implementation_guide (string, optional)

Custom Implementation Guide name. When specified, profiles from this IG are included alongside US Core profiles during resource detection. US Core is always the base layer; custom IG profiles are additive.

detection_effort (string, optional, default: standard)

Detection effort. 'standard' runs detection once; 'deep' runs detection multiple times for higher recall.

Allowed values: standard, deep

validation_method (string, optional, default: none)

FHIR validation method to apply to the generated bundle. 'none' skips validation (default). 'check' runs the bundle through a FHIR structure validator and includes the results in the response. 'fix' runs validation and attempts to auto-correct errors using an LLM (up to 3 validation passes). The response includes results from each pass. Warning: 'fix' can significantly increase latency due to multiple LLM and validation round-trips.

Allowed values: none, check, fix

config (object, optional)

Optional processing configuration shared across document endpoints.

config.page_filter (object, optional)

Configures per-page pre-extraction filtering. When set, each page of text extracted from the document is classified by an LLM, and pages classified as irrelevant to the supplied context are dropped before FHIR extraction.

config.page_filter.context (string, required when page_filter is set)

Natural-language description of what IS relevant to the extraction goal. Pages that do not match are dropped from downstream FHIR extraction.
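
The body parameters above can be assembled and sanity-checked before sending. This is a minimal sketch, not an official client: the function name `build_request_body` is hypothetical, and the checks only mirror the allowed values listed in this section.

```python
from typing import Optional

DETECTION_EFFORTS = {"standard", "deep"}
VALIDATION_METHODS = {"none", "check", "fix"}


def build_request_body(
    version: str,
    content_b64: str,
    provider: Optional[str] = None,
    implementation_guide: Optional[str] = None,
    detection_effort: str = "standard",
    validation_method: str = "none",
    page_filter_context: Optional[str] = None,
) -> dict:
    """Build the JSON body, omitting optional fields that were not supplied."""
    if detection_effort not in DETECTION_EFFORTS:
        raise ValueError(f"detection_effort must be one of {sorted(DETECTION_EFFORTS)}")
    if validation_method not in VALIDATION_METHODS:
        raise ValueError(f"validation_method must be one of {sorted(VALIDATION_METHODS)}")

    body = {
        "version": version,
        "content": content_b64,
        "detection_effort": detection_effort,
        "validation_method": validation_method,
    }
    if provider is not None:
        body["provider"] = provider
    if implementation_guide is not None:
        body["implementation_guide"] = implementation_guide
    if page_filter_context is not None:
        # page_filter is nested under config; context is its only documented key.
        body["config"] = {"page_filter": {"context": page_filter_context}}
    return body
```

Leaving `page_filter_context` unset omits `config` entirely, so no per-page filtering is requested.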

Returns

Successfully extracted FHIR resources from document

Response fields

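success (boolean, optional)
message (string, optional)
bundle (object, optional)
bundle.resourceType (string, optional)
bundle.type (string, optional)
bundle.entry (array<object>, optional)
bundle.entry[].fullUrl (string, optional)
bundle.entry[].resource (object, optional)
bundle.entry[].request (object, optional)
bundle.entry[].request.method (string, optional)
bundle.entry[].request.url (string, optional)
resources (array<object>, optional)
resources[].tempId (string, optional)
resources[].resourceType (string, optional)
resources[].description (string, optional)
resources[].originalText (string, optional)
validation (object, optional)
validation.passes (array<object>, optional)
validation.passes[].issues (array<object>, optional)
validation.passes[].issues[].severity (string, optional)
validation.passes[].issues[].code (string, optional)
validation.passes[].issues[].diagnostics (string, optional)
validation.passes[].issues[].expression (array<string>, optional)
validation.passes[].issues[].source (string, optional)
validation.passes[].stats (object, optional)
validation.passes[].stats.resource_type (string, optional)
validation.passes[].stats.profile_url (string, optional)
validation.passes[].stats.is_custom_profile (boolean, optional)
validation.passes[].stats.duration_ms (number, optional)
validation.fixed (boolean, optional)
validation.attempts (integer, optional)
validation.summary (string, optional)
page_classifications (array<object>, optional)
page_classifications[].page_number (integer, optional)
page_classifications[].include (boolean, optional)
page_classifications[].reason (string, optional)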
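
Because every response field is optional, client code should read them defensively. The sketch below condenses a parsed response into a short summary; the function name is hypothetical, and it assumes that `include: false` marks a page that was dropped by the page filter, which this section implies but does not state outright.

```python
from collections import Counter


def summarize_response(response: dict) -> dict:
    """Collect resource types, validation issue counts, and dropped pages."""
    validation = response.get("validation") or {}
    severities = Counter(
        issue.get("severity", "unknown")
        for validation_pass in validation.get("passes", [])
        for issue in validation_pass.get("issues", [])
    )
    dropped_pages = [
        page.get("page_number")
        for page in response.get("page_classifications", [])
        # Assumption: include == False means the page was filtered out.
        if page.get("include") is False
    ]
    return {
        "resource_types": [r.get("resourceType") for r in response.get("resources", [])],
        "issues_by_severity": dict(severities),
        "dropped_pages": dropped_pages,
    }
```

A caller might refuse to submit the bundle when `issues_by_severity` contains `fatal` or `error` entries, depending on local policy.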

POST Request

curl -X POST 'https://experiment.app.pheno.ml/lang2fhir/document/multi' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
  "version": "R4",
  "content": "<BASE64_ENCODED_FILE_CONTENT>",
  "provider": "canvas",
  "implementation_guide": "acme-cardiology",
  "detection_effort": "standard",
  "validation_method": "none",
  "config": {
    "page_filter": {
      "context": "clinical notes, diagnoses, medications — not sample collection instructions or insurance forms"
    }
  }
}'
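
The same request can be issued from Python using only the standard library. This is a sketch that mirrors the curl call above, not an official SDK; the function names and the 120-second default timeout are assumptions, the latter chosen only because 'fix' validation is described as adding latency.

```python
import json
import urllib.request

API_URL = "https://experiment.app.pheno.ml/lang2fhir/document/multi"


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request with Bearer auth and a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_resources(api_key: str, body: dict, timeout: float = 120.0) -> dict:
    """Send the request and return the parsed JSON response.

    urllib raises urllib.error.HTTPError for non-2xx status codes.
    """
    request = build_request(api_key, body)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the first half easy to test without network access.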

200 OK Response

{
  "success": true,
  "message": "Successfully extracted FHIR resources from document",
  "bundle": {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
      {
        "fullUrl": "urn:uuid:a842c4bc-f6cb-4555-9741-ac3aec4ef0b8",
        "resource": {},
        "request": {
          "method": "POST",
          "url": "Patient"
        }
      }
    ]
  },
  "resources": [
    {
      "tempId": "urn:uuid:a842c4bc-f6cb-4555-9741-ac3aec4ef0b8",
      "resourceType": "Patient",
      "description": "John Smith (DOB 1980-05-12) was diagnosed with Type 2 Diabetes during office visit on 2025-03-01 with Dr. Chen",
      "originalText": "diagnosed with Type 2 Diabetes"
    }
  ],
  "validation": {
    "passes": [
      {
        "issues": [
          {
            "severity": "fatal",
            "code": "ABC123",
            "diagnostics": "example",
            "expression": [
              "example"
            ],
            "source": "example"
          }
        ],
        "stats": {
          "resource_type": "Patient",
          "profile_url": "https://example.com",
          "is_custom_profile": true,
          "duration_ms": 0.5
        }
      }
    ],
    "fixed": true,
    "attempts": 10,
    "summary": "example"
  },
  "page_classifications": [
    {
      "page_number": 1,
      "include": true,
      "reason": "clinical notes with diagnoses"
    }
  ]
}