Help & Documentation

Radiology Code Semantic Cleaner

📚 Getting Started

The Radiology Code Semantic Cleaner is an AI-powered tool that standardizes radiology exam names against NHS reference data using advanced natural language processing and semantic matching.

Quick Start Guide

  1. Select AI Models: Choose your preferred retriever and reranker models from the AI Model Settings section
  2. Upload Data: Drop your JSON file containing radiology codes in the upload area
  3. Review Results: Examine the cleaned names, confidence scores, and mappings
  4. Export Data: Download the processed results for use in your systems

🔧 AI Model Settings

Retriever Selection

The retriever model performs semantic search to find candidate matches from the NHS reference database:

  • BioLORD: Biomedical language model optimized for medical terminology
  • Default: Standard embedding model for general medical text

Reranker Selection

The reranker model scores and ranks the candidates found by the retriever:

  • MedCPT (HuggingFace): Medical cross-encoder for high-accuracy reranking
  • OpenRouter LLMs: GPT-4, Claude, and Gemini models for contextual understanding

💡 Tip: Your model selections are saved automatically and restored in future sessions.
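
The two-stage flow described above (a retriever proposes candidates, then a reranker scores them) can be illustrated with a minimal sketch. It is not the tool's implementation: the function names are hypothetical, the reference names are made-up examples, and simple token overlap stands in for both the embedding retriever and the cross-encoder or LLM reranker.

```python
def tokenize(text):
    """Lowercase and split on whitespace; returns a set of tokens."""
    return set(text.lower().split())


def retrieve(query, reference_names, top_k=3):
    """Stage 1 stand-in: keep the top_k names sharing the most tokens with the query.

    A real retriever (e.g. BioLORD) would compare embedding vectors instead.
    """
    q = tokenize(query)
    ranked = sorted(reference_names,
                    key=lambda name: len(q & tokenize(name)),
                    reverse=True)
    return ranked[:top_k]


def rerank(query, candidates):
    """Stage 2 stand-in: score each candidate by Jaccard similarity, best first.

    A real reranker (e.g. MedCPT or an LLM) would score query/candidate pairs jointly.
    """
    q = tokenize(query)

    def jaccard(name):
        t = tokenize(name)
        union = q | t
        return len(q & t) / len(union) if union else 0.0

    scored = [(name, jaccard(name)) for name in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical reference names, for illustration only.
reference = ["CT Thorax with contrast", "CT Head", "MRI Brain",
             "CT Abdomen with contrast"]
candidates = retrieve("CT CHEST WITH CONTRAST", reference)
ranked = rerank("CT CHEST WITH CONTRAST", candidates)
```

In this toy version "CT Thorax with contrast" and "CT Abdomen with contrast" tie, because token overlap cannot tell that "chest" and "thorax" are synonyms. That gap is the kind of case semantic models are meant to close.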

Data Format

Input Format

Upload a JSON file containing an array of radiology codes with the following structure:

[
  {
    "EXAM_NAME": "CT CHEST WITH CONTRAST",
    "MODALITY_CODE": "CT",
    "DATA_SOURCE": "Hospital_A",
    "EXAM_CODE": "CTX001"
  },
  {
    "EXAM_NAME": "MRI BRAIN",
    "MODALITY_CODE": "MR",
    "DATA_SOURCE": "Hospital_B",
    "EXAM_CODE": "MRI123"
  }
]

Required Fields

  • EXAM_NAME: The original radiology exam name to be cleaned
  • MODALITY_CODE: Imaging modality (CT, MR, US, XR, etc.)
  • DATA_SOURCE: Source system or hospital identifier
  • EXAM_CODE: Unique identifier for the exam
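
Before uploading, it can help to check a file locally against the required fields listed above. The following is a minimal sketch, not part of the tool itself; the function name `find_invalid_records` is hypothetical, and treating empty or whitespace-only values as missing is an assumption.

```python
import json

REQUIRED_FIELDS = ("EXAM_NAME", "MODALITY_CODE", "DATA_SOURCE", "EXAM_CODE")


def find_invalid_records(raw_json):
    """Return a list of (index, problem) tuples; an empty list means no problems found.

    Raises json.JSONDecodeError if the text is not valid JSON.
    """
    data = json.loads(raw_json)
    if not isinstance(data, list):
        return [(-1, "top-level value must be a JSON array")]

    problems = []
    for index, record in enumerate(data):
        if not isinstance(record, dict):
            problems.append((index, "record is not a JSON object"))
            continue
        # Assumption: an absent, null, or blank value counts as missing.
        missing = [field for field in REQUIRED_FIELDS
                   if not str(record.get(field) or "").strip()]
        if missing:
            problems.append((index, "missing or empty: " + ", ".join(missing)))
    return problems
```

For a file on disk, the same check could be run as `find_invalid_records(open("codes.json", encoding="utf-8").read())`, where `codes.json` is a placeholder file name.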

📊 Understanding Results

Confidence Scores

Each cleaned name includes a confidence score (0-100%) indicating the system's certainty in the match:

  • 90-100%: High confidence - likely accurate match
  • 70-89%: Good confidence - review recommended
  • 50-69%: Moderate confidence - manual review needed
  • Below 50%: Low confidence - likely requires correction
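
When post-processing exported results, the bands above can be applied programmatically. This is a small sketch assuming the score is available as a percentage between 0 and 100; the function name and the band labels are illustrative, and fractional scores such as 89.5 are assigned to the lower band.

```python
def confidence_band(score_percent):
    """Map a confidence score (0-100) to the review band described in this guide."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    if score_percent >= 90:
        return "high"       # likely accurate match
    if score_percent >= 70:
        return "good"       # review recommended
    if score_percent >= 50:
        return "moderate"   # manual review needed
    return "low"            # likely requires correction
```

For example, filtering exported rows to those where `confidence_band(score) != "high"` gives a worklist for review.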

View Modes

  • Full View: Complete table with all individual mappings
  • Consolidated View: Groups similar results showing consolidation opportunities
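
One way to picture the Consolidated View is as a grouping of individual mappings by their cleaned name. The sketch below is only an approximation under stated assumptions: the `clean_name` key is a hypothetical name for the cleaned exam name in the results, and grouping by exact cleaned name is assumed rather than taken from the tool.

```python
from collections import defaultdict


def consolidate(results):
    """Group result rows by cleaned name and summarize each group."""
    groups = defaultdict(list)
    for row in results:
        groups[row["clean_name"]].append(row)  # "clean_name" is an assumed key

    return {
        name: {
            "count": len(rows),
            "sources": sorted({r["DATA_SOURCE"] for r in rows}),
            "exam_codes": [r["EXAM_CODE"] for r in rows],
        }
        for name, rows in groups.items()
    }
```

Groups with a count above one, especially across several sources, point to local codes that map to the same standardized exam and could be consolidated.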

SNOMED Integration

Results include SNOMED CT codes where available, providing standardized medical terminology for interoperability.

🧪 Testing & Quality Assurance

100 Exam Test Suite

Use the built-in test suite to evaluate different model combinations:

  1. Select your preferred models
  2. Click "100 Exam Test Suite"
  3. Review performance metrics and accuracy
  4. Compare results across different model configurations
โš ๏ธ Note: Test suite processing may take several minutes depending on selected models and system load.

Configuration Management

Edit Config

Advanced users can modify system configuration including:

  • Model weights and scoring parameters
  • Threshold settings for confidence levels
  • NHS reference data sources
  • Processing pipeline settings
โš ๏ธ Warning: Configuration changes affect system behavior and may require cache rebuilding. Changes should be tested thoroughly.

🚨 Troubleshooting

Common Issues

  • Upload fails: Ensure JSON format is valid and contains required fields
  • Processing stalls: Large files may take time; check status messages
  • Low confidence scores: Try different model combinations or review input data quality
  • Missing results: Check your internet connection; the reranker models are reached through external APIs