
AGT-MED-001 Medical AI-based

ICD-10 Logic Controller

AGT-MED-001 uses a medical Knowledge Graph built on the WHO ICD-10 ontology, SNOMED-CT, and clinical treatment guidelines to detect diagnosis-treatment inconsistencies in health insurance claims. The agent extracts diagnosis codes and treatment items from medical reports using spaCy NLP, then queries a Neo4j graph database to determine whether the prescribed medications, lab tests, and procedures are clinically valid for the diagnosed condition. For example, a gastritis diagnosis (K29) billed together with digoxin (a cardiac glycoside) is a classic overbilling pattern. The agent flags any diagnosis-treatment pair with a clinical relevance score below the configured threshold and returns the specific guideline reference that contradicts the prescription.

Tech Stack

Technology	Role
Python 3.11	Runtime
Neo4j 5.x	Graph database storing ICD-10 / SNOMED-CT / drug ontology relationships
spaCy 3.x + scispaCy	Clinical NLP for code and entity extraction from medical reports
WHO ICD-10 Ontology	Authoritative diagnosis code hierarchy and disease definitions
SNOMED-CT	Clinical terminology for mapping symptoms, findings, and procedures
RxNorm / ATC	Drug classification and indication mapping
py2neo	Python driver for Neo4j Cypher queries
FastAPI	REST API endpoint

Input

A structured or unstructured medical report document plus extracted ICD-10 codes and treatment items.

Accepted Formats

JSON, PDF text, plain text

Fields

Name	Type	Req	Description
icd10_codes	array<string>	Yes	List of ICD-10 diagnosis codes from the medical report (e.g. ['K29.7', 'J18.9'])
treatments	array<object>	Yes	List of treatment items: [{type: 'drug'|'test'|'procedure', code: '...', name: '...'}]
report_text	string	No	Raw text of the medical report for NLP extraction if structured codes are unavailable
patient_age	int	No	Patient age for age-appropriate treatment validation
patient_sex	string	No	Patient biological sex (M/F) for sex-specific treatment validation
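
The field table above can be mirrored by a small request model. The following is a minimal sketch using only the standard library, not the service's actual schema: the class names `Treatment` and `AnalyzeRequest`, the method `validation_errors`, and the loose ICD-10 shape check are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Loose shape check for codes such as "K29.7" or "J18.9". It does not
# confirm that the code exists in the ICD-10 ontology.
ICD10_PATTERN = re.compile(r"^[A-Z][0-9]{2}(\.[0-9A-Z]{1,4})?$")
TREATMENT_TYPES = {"drug", "test", "procedure"}


@dataclass
class Treatment:
    type: str
    code: str
    name: str


@dataclass
class AnalyzeRequest:
    icd10_codes: list[str]
    treatments: list[Treatment]
    report_text: Optional[str] = None
    patient_age: Optional[int] = None
    patient_sex: Optional[str] = None

    def validation_errors(self) -> list[str]:
        """Collect human-readable problems instead of failing on the first."""
        errors = []
        for code in self.icd10_codes:
            if not ICD10_PATTERN.match(code):
                errors.append(f"malformed ICD-10 code: {code!r}")
        for treatment in self.treatments:
            if treatment.type not in TREATMENT_TYPES:
                errors.append(f"unknown treatment type: {treatment.type!r}")
        if self.patient_age is not None and self.patient_age < 0:
            errors.append("patient_age must be non-negative")
        if self.patient_sex is not None and self.patient_sex not in {"M", "F"}:
            errors.append("patient_sex must be 'M' or 'F'")
        return errors
```

Returning a list of errors rather than raising lets a caller report every problem in a claim at once.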

Output

Per-pair clinical relevance assessments and an overall inconsistency verdict with specific guideline references.

Format: JSON

Fields

Name	Type	Description
pairs_analysed	int	Total number of diagnosis-treatment pairs evaluated
inconsistencies	array<object>	List of flagged pairs: {icd10_code, icd10_name, treatment_code, treatment_name, relevance_score, flag_reason, guideline_reference}
overbilling_indicators	array<string>	Detected patterns: UNRELATED_DRUG, DUPLICATE_TEST, EXCESSIVE_DOSAGE, CONTRAINDICATED
total_inconsistency_score	float	Aggregate inconsistency score 0.0–1.0
flags	array<string>	FLAG_DIAGNOSIS_TREATMENT_MISMATCH, FLAG_CONTRAINDICATION, FLAG_OVERBILLING
risk_score	float	Normalised risk contribution 0.0–1.0
verdict	string	PASS | FLAG | INCONCLUSIVE

Example Response

{
  "pairs_analysed": 6,
  "inconsistencies": [
    {
      "icd10_code": "K29.7",
      "icd10_name": "Gastritis, unspecified",
      "treatment_code": "C01AA05",
      "treatment_name": "Digoxin",
      "relevance_score": 0.02,
      "flag_reason": "Digoxin is a cardiac glycoside with no established indication for gastritis",
      "guideline_reference": "WHO EML 23rd edition; ATC C01AA05 indications: heart failure, atrial fibrillation"
    }
  ],
  "overbilling_indicators": ["UNRELATED_DRUG"],
  "total_inconsistency_score": 0.78,
  "flags": ["FLAG_DIAGNOSIS_TREATMENT_MISMATCH"],
  "risk_score": 0.78,
  "verdict": "FLAG"
}

How It Works

Medical insurance fraud via overbilling or phantom treatment is one of the highest-cost fraud categories globally. The challenge is that most fraudulent claims use real ICD-10 codes for real diagnoses — the fraud lies in the mismatch between the diagnosis and the treatments billed.

AGT-MED-001 approaches this as a graph reasoning problem. Rather than maintaining a flat lookup table of allowed treatments per diagnosis, the agent models all medical knowledge as a graph. Nodes represent diseases, symptoms, drugs, lab tests, procedures, and body systems. Edges represent clinical relationships: 'treats', 'is_contraindicated_for', 'is_first_line_for', 'is_alternative_to', 'requires_test_before_use'.

This graph structure allows the agent to reason at multiple levels of abstraction. A drug that treats a parent disease class is considered relevant for its children. A drug indicated for a comorbid condition is given partial relevance. This prevents the brittle false-positive problem of naive code matching.
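
The parent-class reasoning described above can be illustrated with a tiny in-memory stand-in for the graph. This is a sketch under stated assumptions: the dictionaries `PARENT` and `TREATS` and both function names are hypothetical, and the single edge shown is illustrative data rather than content taken from the agent's knowledge graph.

```python
# Hypothetical stand-in for the Neo4j graph: each ICD-10 code maps to its
# parent class, and each code or class maps to the drugs that treat it.
PARENT = {"K29.7": "K29", "K29": "K20-K31"}
TREATS = {"K20-K31": {"A02BC01"}}  # illustrative edge; A02BC01 is omeprazole (ATC)


def ancestors(code: str) -> list[str]:
    """Return the code followed by its ancestors, nearest first."""
    chain = [code]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def treats_via_hierarchy(drug: str, code: str) -> bool:
    """True if the drug treats the code itself or any ancestor class."""
    return any(drug in TREATS.get(node, set()) for node in ancestors(code))
```

A flat lookup keyed on `K29.7` alone would miss the edge attached to the `K20-K31` block; walking the ancestor chain finds it.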

The NLP layer (scispaCy) enables the agent to process semi-structured medical reports where the physician has described the diagnosis in prose rather than coding it explicitly. The model extracts clinical entities, normalises them to UMLS concepts, and maps to ICD-10 codes.

The fraud detection logic ultimately rests on the clinical relevance score. If a treatment has no established clinical relationship to any of the patient's diagnoses, and the knowledge graph has been built from authoritative sources like WHO guidelines and SNOMED-CT, then the treatment was most likely either prescribed in error or fraudulently billed, and the claim is flagged for review.

Thinking Steps

Document Parsing & Code Extraction

If structured ICD-10 codes are provided in the request, use them directly. If only report_text is provided, run the scispaCy en_ner_bc5cdr_md model to extract disease mentions and map them to ICD-10 codes via the UMLS concept normaliser.

Clinical NLP extraction has ~88% precision on Vietnamese medical reports translated to English; structured codes are always preferred when available.
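
The preference for structured codes over NLP extraction can be sketched as a small resolver. The function name `resolve_icd10_codes` and the `nlp_extract` callable are hypothetical; `nlp_extract` stands in for whatever wraps the scispaCy pipeline and UMLS mapping, which is not reproduced here.

```python
from typing import Callable, Optional


def resolve_icd10_codes(
    icd10_codes: Optional[list[str]],
    report_text: Optional[str],
    nlp_extract: Callable[[str], list[str]],
) -> tuple[list[str], str]:
    """Return (codes, source), where source is 'structured', 'nlp', or 'none'."""
    # Structured codes always win when present.
    if icd10_codes:
        return list(icd10_codes), "structured"
    # Fall back to NLP extraction only when raw text is available.
    if report_text:
        codes = nlp_extract(report_text)
        if codes:
            return codes, "nlp"
    # Nothing usable: the caller can map this to an INCONCLUSIVE verdict.
    return [], "none"
```

Returning the source alongside the codes lets downstream steps treat NLP-derived codes with more caution than structured ones.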

Knowledge Graph Loading

For each ICD-10 code, query the Neo4j knowledge graph to retrieve the code's disease node, its parent/child hierarchy, and all validated clinical relationships: standard treatments, contraindicated drugs, recommended tests, and typical procedures.

The graph contains 72,000 ICD-10 nodes, 450,000 treatment relationship edges, and 180,000 drug contraindication edges sourced from WHO guidelines and the UpToDate clinical database.
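
A parameterised Cypher query for this step might look like the sketch below. The graph schema is an assumption: the labels `Disease` and `Treatment`, the `icd10` and `code` properties, and the `IS_A` hierarchy relationship are hypothetical, and only the relationship type names follow the edge names used in this document.

```python
# Hypothetical schema: (:Disease {icd10}) nodes form a hierarchy via IS_A and
# are linked from (:Treatment {code}) nodes by clinical relationships.
DISEASE_CONTEXT_QUERY = """
MATCH (d:Disease {icd10: $icd10})
OPTIONAL MATCH (d)-[:IS_A*0..]->(ancestor:Disease)<-[r]-(t:Treatment)
WHERE type(r) IN $relationship_types
RETURN ancestor.icd10 AS matched_code,
       t.code AS treatment_code,
       type(r) AS relationship
"""

DEFAULT_RELATIONSHIPS = ("treats", "is_first_line_for", "is_contraindicated_for")


def disease_context_params(icd10_code, relationship_types=DEFAULT_RELATIONSHIPS):
    """Build the parameter map for DISEASE_CONTEXT_QUERY."""
    return {
        "icd10": icd10_code.strip().upper(),
        "relationship_types": list(relationship_types),
    }
```

With py2neo, the pair could be executed as `graph.run(DISEASE_CONTEXT_QUERY, disease_context_params("K29.7"))`. Passing values as parameters rather than formatting them into the query string keeps the statement cacheable and avoids Cypher injection.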

Treatment Relevance Scoring

For each diagnosis-treatment pair, compute a clinical relevance score by traversing the knowledge graph. Score = 1.0 if the treatment is a first-line recommendation; 0.7 if second-line; 0.4 if indicated for comorbidities; 0.0–0.2 if no established clinical relationship exists.

Scores below 0.30 are flagged as potentially inconsistent; scores below 0.10 are flagged as strongly inconsistent.
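
The scoring tiers and thresholds above can be sketched as follows. The relationship labels, the function names, and the choice to take the best score when several relationships exist are assumptions for illustration; the numeric values come from the text.

```python
# Scores per relationship, taken from the text. The label strings are
# hypothetical names chosen for this sketch.
RELATIONSHIP_SCORES = {
    "first_line": 1.0,
    "second_line": 0.7,
    "comorbidity": 0.4,
}
# The text allows 0.0-0.2 when no relationship exists; this sketch uses 0.0.
NO_RELATIONSHIP_SCORE = 0.0


def relevance_score(relationships: list[str]) -> float:
    """Return the best score among the relationships found for a pair."""
    scores = (RELATIONSHIP_SCORES.get(r, NO_RELATIONSHIP_SCORE) for r in relationships)
    return max(scores, default=NO_RELATIONSHIP_SCORE)


def classify(score: float) -> str:
    """Band a score using the 0.10 and 0.30 thresholds."""
    if score < 0.10:
        return "strongly_inconsistent"
    if score < 0.30:
        return "potentially_inconsistent"
    return "consistent"
```

For the digoxin and gastritis pair in the example response, a score of 0.02 falls in the strongly inconsistent band.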

Contraindication Detection

Check each drug against the contraindication edges in the graph for the patient's diagnoses. A contraindicated drug raises FLAG_CONTRAINDICATION regardless of the declared diagnosis; this is a patient safety concern as well as a fraud signal.

Contraindication detection requires the patient's other active diagnoses and age to avoid false positives (some drugs are contraindicated only in specific comorbidity contexts).
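
A minimal sketch of context-aware contraindication checking follows. The `ContraindicationEdge` structure, its optional age bounds, and the decision to skip age-bound edges when the age is unknown are assumptions of this sketch, not the agent's documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContraindicationEdge:
    drug_code: str
    icd10_code: str
    min_age: Optional[int] = None  # edge applies only at or above this age
    max_age: Optional[int] = None  # edge applies only at or below this age


def contraindicated(drug_code, diagnoses, edges, patient_age=None):
    """Return the contraindication edges that apply to this drug and patient."""
    hits = []
    for edge in edges:
        if edge.drug_code != drug_code or edge.icd10_code not in diagnoses:
            continue
        age_bound = edge.min_age is not None or edge.max_age is not None
        if age_bound:
            if patient_age is None:
                continue  # cannot confirm an age-bound edge without the age
            if edge.min_age is not None and patient_age < edge.min_age:
                continue
            if edge.max_age is not None and patient_age > edge.max_age:
                continue
        hits.append(edge)
    return hits
```

Passing the full set of active diagnoses, not only the primary one, is what lets the check catch a drug that is acceptable for the declared condition but unsafe given a comorbidity.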

Overbilling Pattern Detection

Apply rule-based patterns on top of graph scores: detect duplicate tests (same test ordered twice within 48h), excessive lab panel upcoding (ordering a full biochemistry panel for a minor injury), and unrelated specialist referrals.

Overbilling patterns are often combinations of individually justifiable items that collectively indicate inflation.
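
The duplicate-test rule (same test ordered twice within 48h) can be sketched as below. The input schema in this document does not include order timestamps, so the `(code, ordered_at)` tuples and the function name `duplicate_tests` are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def duplicate_tests(
    orders: list[tuple[str, datetime]],
    window: timedelta = timedelta(hours=48),
) -> set[str]:
    """Return codes of tests ordered more than once within the window."""
    last_seen: dict[str, datetime] = {}
    duplicates: set[str] = set()
    # Process orders chronologically so each one is compared with the
    # most recent earlier order of the same test.
    for code, ordered_at in sorted(orders, key=lambda order: order[1]):
        previous = last_seen.get(code)
        if previous is not None and ordered_at - previous <= window:
            duplicates.add(code)
        last_seen[code] = ordered_at
    return duplicates
```

A non-empty result would map to the DUPLICATE_TEST indicator and FLAG_OVERBILLING.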

Guideline Reference Lookup

For each flagged inconsistency, query the knowledge graph for the canonical guideline reference (WHO Essential Medicines List, national formulary, or specialty clinical guideline) that contradicts or fails to support the prescribed treatment.

Returning the specific guideline gives the adjudicator a verifiable external reference to cite when challenging the healthcare provider.

Thinking Tree

Root Question: Are all treatments clinically justified by the stated diagnoses?
- Extract ICD-10 codes and treatment items
  - Structured codes provided → use directly
  - Only report text → run scispaCy NER + UMLS mapping
- For each diagnosis-treatment pair, query knowledge graph
  - Relevance score ≥ 0.30 → clinically justified
  - Relevance score 0.10–0.29 → INCONCLUSIVE, review
  - Relevance score < 0.10 → FLAG_DIAGNOSIS_TREATMENT_MISMATCH
- Contraindication check
  - Drug contraindicated for patient's condition → FLAG_CONTRAINDICATION
  - No contraindication → continue
- Overbilling pattern detection
  - Duplicate tests detected → FLAG_OVERBILLING
  - Unrelated specialist referral → FLAG_OVERBILLING
  - All patterns within norms → PASS

Decision Tree

Node	Question	Yes	No
d1	Are ICD-10 codes and treatments successfully extracted?	d2	inconclusive
d2	Any drug contraindicated for the patient's diagnoses?	flag_contra	d3
d3	Any diagnosis-treatment pair with relevance score < 0.10?	flag_mismatch	d4
d4	Any overbilling pattern detected (duplicates, upcoding)?	flag_overbill	d5
d5	Any pair with relevance score 0.10–0.29?	inconclusive_partial	pass

Outcome	Verdict	Description
flag_contra	FLAG (CONTRAINDICATION)	Drug prescribed despite clear clinical contraindication
flag_mismatch	FLAG (DIAGNOSIS_TREATMENT_MISMATCH)	Treatment has no established clinical indication for diagnosis
flag_overbill	FLAG (OVERBILLING)	Pattern of redundant or inflated treatments detected
inconclusive	INCONCLUSIVE	Cannot extract sufficient codes from document
inconclusive_partial	INCONCLUSIVE	Borderline relevance; recommend clinical expert review
pass	PASS	All treatments are clinically justified for stated diagnoses
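
The decision order above can be read as a short chain of early returns. This is a sketch that follows the tree literally, so it reports only the first matching flag; the function name `decide` and its boolean inputs are hypothetical. Since the output `flags` field is an array, a real implementation may collect several flags instead.

```python
def decide(
    codes_extracted: bool,
    contraindication_found: bool,
    pair_scores: list[float],
    overbilling_found: bool,
) -> tuple[str, list[str]]:
    """Walk the decision nodes in order and return (verdict, flags)."""
    if not codes_extracted:
        return "INCONCLUSIVE", []
    if contraindication_found:
        return "FLAG", ["FLAG_CONTRAINDICATION"]
    if any(score < 0.10 for score in pair_scores):
        return "FLAG", ["FLAG_DIAGNOSIS_TREATMENT_MISMATCH"]
    if overbilling_found:
        return "FLAG", ["FLAG_OVERBILLING"]
    if any(0.10 <= score < 0.30 for score in pair_scores):
        return "INCONCLUSIVE", []
    return "PASS", []
```

Checking contraindications before relevance scores reflects the ordering in the tree, where the patient safety signal takes precedence over the fraud signals.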

Technical Design

Architecture

AGT-MED-001 is a stateless async microservice backed by a shared Neo4j graph database instance. The knowledge graph is read-only at inference time and updated weekly with new guideline releases. NLP models are loaded at startup and kept in memory. The Neo4j queries are parameterised Cypher statements executed over the Bolt protocol for low latency.

Components

Component	Role	Technology
DocumentParser	Extracts structured codes or triggers NLP pipeline	pdfminer + Pillow for scanned reports
ClinicalNLPExtractor	NER for disease/drug/procedure entities in report text	scispaCy en_ner_bc5cdr_md + UMLS linker
KnowledgeGraphQuerier	Executes Cypher queries against Neo4j for treatment relationships	py2neo + Neo4j 5 bolt
RelevanceScorer	Computes clinical relevance score via graph path traversal	Custom Python graph traversal + edge weight lookup
ContraindicationChecker	Checks drug contraindication edges in the graph	Neo4j Cypher pattern match
OverbillingDetector	Rule-based pattern matching for billing anomalies	Python rule engine
GuidelineReferencer	Retrieves canonical guideline text for flagged pairs	Neo4j property lookup

Architecture Diagram

┌────────────────────────────────┐
│ POST /analyze                  │
│ (icd10_codes + treatments)     │
└───────────────┬────────────────┘
                │
                ▼
┌────────────────────────────────┐
│       DocumentParser           │
│  (structured or NLP extract)   │
└───────────────┬────────────────┘
                │
     ┌──────────┴──────────┐
     ▼                     ▼
┌──────────┐      ┌────────────────────┐
│Contra-   │      │  KnowledgeGraph    │
│indication│      │  Querier (Neo4j)   │
│Checker   │      └──────────┬─────────┘
└──────┬───┘                 │
       │              ┌──────┴───────┐
       │              ▼              ▼
       │      ┌──────────┐  ┌────────────────┐
       │      │Relevance │  │  Overbilling   │
       │      │Scorer    │  │  Detector      │
       │      └──────┬───┘  └──────┬─────────┘
       │             │             │
       └─────────────┴─────────────┘
                     │
                     ▼
       ┌─────────────────────────┐
       │   GuidelineReferencer   │
       └─────────────┬───────────┘
                     │
                     ▼
              JSON verdict

Data Flow

API Gateway → DocumentParser | ICD-10 codes array + treatments array (or raw report text)

DocumentParser → KnowledgeGraphQuerier | Normalised code-treatment pairs

KnowledgeGraphQuerier → RelevanceScorer | Graph subgraph for each diagnosis

KnowledgeGraphQuerier → ContraindicationChecker | Drug-diagnosis edges

RelevanceScorer → GuidelineReferencer | Pairs with score < 0.30

GuidelineReferencer → API Gateway | Full inconsistency report with guideline citations
