Native Medical Services
OpenMed currently operates across three relevant boundaries:
- the local operator runtime
- the configured model-provider path for general agent reasoning
- the native medical services that power extraction, de-identification, terminology, and HCC/RAF work
The point of this split is not locality for its own sake. It is to keep sensitive or high-volume clinical processing on dedicated medical-service planes that can be governed separately, respond faster on long unstructured inputs, and operate more cost-efficiently than sending every page through a frontier LLM path.
Inspectable runtime. Protected medical services. Explicit boundary.
OpenMed keeps sessions, plans, workflow previews, and artifacts in the operator runtime while general agent reasoning uses the configured model provider. Clinical extraction and medical-coding services sit behind separate configurable endpoints, so teams do not need to route every PHI-heavy or page-heavy workload through the same frontier-model path.
Operator runtime
CLI, TUI, sessions, plans, provenance, and workflow artifacts remain on the operator machine, with the review loop visible before teams depend on outputs.
Clinical extraction service
NER, PII detection, de-identification, and batch extraction run through a dedicated endpoint.
Medical coding and terminology service
ICD-10, CPT, SNOMED, LOINC, RxNorm, MedlinePlus, PubMed, HCC mapping, and RAF scoring run through a separate service boundary.
Service planes
Clinical extraction plane
The extraction plane is the endpoint behind OPENMED_INFERENCE_URL.
It powers native OpenMed tools such as:
- extract_entities
- extract_pii
- deidentify_text
- batch extraction workflows built on the same service
Operationally, this is where OpenMed handles:
- clinical NER
- billing-oriented entity extraction
- PII detection
- de-identification
Current posture:
- during preview, OpenMed provisions this endpoint on private Hugging Face accelerated infrastructure
- access can be protected with OPENMED_INFERENCE_API_KEY
- access can also use OPENMED_INFERENCE_HF_TOKEN or HF_TOKEN
- packaged binaries can resolve embedded service credentials without hard-coding plaintext defaults into source
Coding and terminology plane
The coding plane is the endpoint behind OPENMED_MED_CODES_API_URL.
It powers native OpenMed tools such as:
- PubMed search and abstract retrieval
- ICD-10, CPT, SNOMED, and LOINC search / lookup / validation
- RxNorm medication normalization and related-concept lookup
- MedlinePlus patient-education topic lookup
- code crosswalks
- HCC mapping
- RAF score calculation
This is the service boundary that makes the HCC and revenue-integrity story concrete: OpenMed can orchestrate clinical note review, extract coding candidates, and then hand those codes to a separate terminology/HCC service instead of collapsing everything into one generic model call.
Current posture:
- during preview, OpenMed provisions this endpoint on private Hugging Face accelerated infrastructure
- access can be protected with OPENMED_MED_CODES_API_KEY
- access can also use OPENMED_MED_CODES_HF_TOKEN or HF_TOKEN
- packaged binaries can resolve embedded service credentials without hard-coding plaintext defaults into source
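The bearer-token options listed for both planes follow the same shape: a plane-specific variable, with HF_TOKEN as the shared alternative. A minimal sketch of how a wrapper script might pick one, assuming (this is not confirmed by OpenMed) that the plane-specific token takes precedence; resolve_bearer_token is a hypothetical helper, not an OpenMed command:

```shell
# Hypothetical helper: print the bearer token for one service plane.
# $1 is the variable prefix, e.g. OPENMED_INFERENCE or OPENMED_MED_CODES.
# Assumption: <PREFIX>_HF_TOKEN wins over the shared HF_TOKEN.
resolve_bearer_token() {
  prefix="$1"
  # Read <PREFIX>_HF_TOKEN indirectly, using POSIX sh syntax only.
  eval "token=\${${prefix}_HF_TOKEN:-}"
  if [ -z "$token" ]; then
    # Fall back to the shared token; empty output means none is set.
    token="${HF_TOKEN:-}"
  fi
  printf '%s' "$token"
}
```

For example, resolve_bearer_token OPENMED_MED_CODES prints the value of OPENMED_MED_CODES_HF_TOKEN if it is set, and the value of HF_TOKEN otherwise.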
Why the split matters
This architecture gives OpenMed a stronger healthcare deployment story than a single undifferentiated model endpoint:
- extraction and de-identification can scale independently from coding and terminology lookup
- HCC and RAF operations can live behind a dedicated protected service boundary
- sensitive or page-heavy clinical inputs do not need to consume frontier-model context for every extraction or coding pass
- teams can reserve the model-provider path for general reasoning while keeping protected clinical processing on dedicated services
- the service tier can be swapped without changing the operator workflow surface
These service endpoints are native OpenMed backends. They are not remote MCP servers.
Deployment patterns
| Pattern | Runtime | Extraction / PII | Coding / HCC |
|---|---|---|---|
| Preview reference | Local operator machine | Private Hugging Face accelerated endpoint operated by OpenMed | Separate protected med-codes endpoint operated by OpenMed |
| Customer cloud | Local operator machine or managed desktop | Private container in VPC / private cloud | Private terminology and HCC service in the same environment |
| On-prem / edge | Managed workstation | Local GPU or internal inference cluster | Internal terminology / HCC API |
| Lab / dev | Local operator machine | Local or sandbox endpoint | Local or sandbox endpoint |
Private Hugging Face hosting is the current preview deployment, not a hard dependency. The real product boundary is the pair of configurable endpoint URLs.
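Because the boundary is the pair of endpoint URLs, moving between the patterns in the table can amount to changing two variables. A minimal sketch, assuming a hypothetical use_openmed_profile helper; every hostname and port below is a placeholder and none of them come from OpenMed:

```shell
# Hypothetical profile switcher: point both service planes at one environment.
# All URLs are illustrative placeholders.
use_openmed_profile() {
  case "$1" in
    customer-cloud)
      export OPENMED_INFERENCE_URL="https://inference.internal.example"
      export OPENMED_MED_CODES_API_URL="https://med-codes.internal.example"
      ;;
    lab)
      export OPENMED_INFERENCE_URL="http://localhost:8080"
      export OPENMED_MED_CODES_API_URL="http://localhost:8081"
      ;;
    *)
      # Leave the current environment untouched for unknown names.
      echo "unknown profile: $1" >&2
      return 1
      ;;
  esac
}
```

Calling use_openmed_profile lab before starting a session would retarget both planes while the operator workflow stays the same.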
Security and access model
OpenMed itself does not claim that every deployment is automatically compliant because one part of the product is local. The defensible statement is narrower and more useful:
- the operator runtime and review loop are inspectable
- the model-provider path and the medical-service tier are separate, explicit boundaries
- the actual privacy and compliance posture depends on where those services are hosted and governed
- service access can be protected with API keys and optional bearer tokens
- packaged binaries can carry embedded service credentials instead of shipping plaintext defaults
- no product telemetry is built into the runtime
During preview, OpenMed serves the native medical-service tier from private Hugging Face infrastructure so evaluators do not need to deploy it themselves. The same workflow surface can later target customer-managed cloud or on-prem environments while preserving the OpenMed runtime and review loop.
Configuration surface
export OPENMED_INFERENCE_URL="https://<private-inference-endpoint>"
export OPENMED_INFERENCE_API_KEY="..."
export OPENMED_INFERENCE_TIMEOUT_SECONDS="30"
export OPENMED_MED_CODES_API_URL="https://<private-med-codes-endpoint>"
export OPENMED_MED_CODES_API_KEY="..."
export OPENMED_MED_CODES_TIMEOUT_SECONDS="10"
export OPENMED_SERVICE_MAX_RETRIES="3"
export OPENMED_SERVICE_RETRY_BACKOFF="1"
export OPENMED_SERVICE_CIRCUIT_OPEN_SECONDS="120"
Optional bearer-token auth:
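A minimal sketch, assuming the token variable names listed in the service-plane sections. The values are placeholders, and whether a token is sent alongside an API key or instead of one is not specified here:

```shell
# Plane-specific tokens (variable names taken from the service-plane sections)
export OPENMED_INFERENCE_HF_TOKEN="..."
export OPENMED_MED_CODES_HF_TOKEN="..."
# Or a shared token that either plane can use
export HF_TOKEN="..."
```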
See Configuration for the full environment-variable reference and Privacy & Security for the runtime-boundary explanation.
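Since both planes are reached only through these variables, a small preflight check can catch a mistyped value before a workflow runs. A minimal sketch, assuming a hypothetical check_openmed_service_env helper that is not part of OpenMed; it treats both URLs as required and skips timeout variables that are unset, which is an assumption about defaults rather than documented behavior:

```shell
# Hypothetical preflight check; not part of OpenMed itself.
# Returns non-zero and reports on stderr if a setting looks unusable.
check_openmed_service_env() {
  status=0
  for name in OPENMED_INFERENCE_URL OPENMED_MED_CODES_API_URL; do
    eval "value=\${$name:-}"
    case "$value" in
      http://?*|https://?*) ;;  # non-empty URL with an HTTP(S) scheme
      *) echo "$name must be set to an http(s) URL" >&2; status=1 ;;
    esac
  done
  for name in OPENMED_INFERENCE_TIMEOUT_SECONDS OPENMED_MED_CODES_TIMEOUT_SECONDS; do
    eval "value=\${$name:-}"
    case "$value" in
      '') ;;  # unset: skipped (assumed to fall back to a default)
      *[!0-9]*) echo "$name must be a whole number of seconds" >&2; status=1 ;;
    esac
  done
  return "$status"
}
```

The scheme check is deliberately loose so that local or sandbox endpoints still pass; a production wrapper might insist on https.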