Help - Radiology Code Semantic Cleaner

📚 Getting Started

The Radiology Code Semantic Cleaner is an AI-powered tool that standardizes radiology exam names against NHS reference data using advanced natural language processing and semantic matching.

Quick Start Guide

1. Select AI Models: Choose your preferred retriever and reranker models from the AI Model Settings section
2. Upload Data: Drop your JSON file containing radiology codes in the upload area
3. Review Results: Examine the cleaned names, confidence scores, and mappings
4. Export Data: Download the processed results for use in your systems

🔧 AI Model Settings

Retriever Selection

The retriever model performs semantic search to find candidate matches from the NHS reference database:

BioLORD: Biomedical language model optimized for medical terminology
Default: Standard embedding model for general medical text

Reranker Selection

The reranker model scores and ranks the candidates found by the retriever:

MedCPT (HuggingFace): Medical cross-encoder for high-accuracy reranking
OpenRouter LLMs: GPT-4, Claude, and Gemini models for contextual understanding
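
The two-stage idea described above (a retriever builds a shortlist, a reranker orders it) can be illustrated with a toy sketch. The scoring functions here are simple token-overlap stand-ins chosen for illustration; they are not the BioLORD, MedCPT, or LLM models the tool offers, and the function names and reference list are hypothetical.

```python
# Toy two-stage matcher. Token overlap stands in for the real models;
# all names and the reference list below are hypothetical.

def tokens(text):
    """Lowercase a name and split it into a set of words."""
    return set(text.lower().split())

def retrieve(query, reference_names, top_k=3):
    """Stage 1: broad, cheap search. Rank by number of shared tokens."""
    q = tokens(query)
    scored = [(len(q & tokens(name)), name) for name in reference_names]
    scored = [pair for pair in scored if pair[0] > 0]   # drop non-matches
    scored.sort(key=lambda pair: (-pair[0], pair[1]))   # best first, ties by name
    return [name for _, name in scored[:top_k]]

def rerank(query, candidates):
    """Stage 2: stricter scoring of the shortlist (Jaccard similarity)."""
    q = tokens(query)

    def jaccard(name):
        t = tokens(name)
        return len(q & t) / len(q | t)

    return sorted(candidates, key=lambda name: (-jaccard(name), name))

reference = [
    "CT Chest",
    "CT Chest with contrast",
    "CT Abdomen with contrast",
    "MRI Brain",
]
shortlist = retrieve("CT CHEST WITH CONTRAST", reference)
best = rerank("CT CHEST WITH CONTRAST", shortlist)[0]
print(best)
```

The split matters because the first stage must be fast enough to scan the whole reference set, while the second stage can afford a more expensive comparison on only a handful of candidates.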

💡 Tip: Your model selections are automatically saved and will be remembered for future sessions.

Data Format

Input Format

Upload a JSON file containing an array of radiology codes with the following structure:

[
  {
    "EXAM_NAME": "CT CHEST WITH CONTRAST",
    "MODALITY_CODE": "CT",
    "DATA_SOURCE": "Hospital_A",
    "EXAM_CODE": "CTX001"
  },
  {
    "EXAM_NAME": "MRI BRAIN",
    "MODALITY_CODE": "MR",
    "DATA_SOURCE": "Hospital_B",
    "EXAM_CODE": "MRI123"
  }
]

Required Fields

EXAM_NAME: The original radiology exam name to be cleaned
MODALITY_CODE: Imaging modality (CT, MR, US, XR, etc.)
DATA_SOURCE: Source system or hospital identifier
EXAM_CODE: Unique identifier for the exam
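
Checking a file locally before uploading can catch format problems early. The following is a minimal sketch of such a check, not the tool's own validation logic: the function name `find_problems` is hypothetical, and treating empty or whitespace-only values as missing is an assumption.

```python
import json

REQUIRED_FIELDS = ("EXAM_NAME", "MODALITY_CODE", "DATA_SOURCE", "EXAM_CODE")

def is_blank(value):
    """Assumption: None and empty/whitespace-only values count as missing."""
    return value is None or not str(value).strip()

def find_problems(raw_text):
    """Return a list of readable problems; an empty list means the input looks OK."""
    try:
        records = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]
    if not isinstance(records, list):
        return ["top-level value must be an array"]

    problems = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            problems.append(f"record {index}: not an object")
            continue
        missing = [name for name in REQUIRED_FIELDS if is_blank(record.get(name))]
        if missing:
            problems.append(f"record {index}: missing {', '.join(missing)}")
    return problems

# Example: the second required-field check fails because EXAM_CODE is absent.
sample = '[{"EXAM_NAME": "MRI BRAIN", "MODALITY_CODE": "MR", "DATA_SOURCE": "Hospital_B"}]'
print(find_problems(sample))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to `find_problems`.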

📊 Understanding Results

Confidence Scores

Each cleaned name includes a confidence score (0-100%) indicating the system's certainty in the match:

90-100%: High confidence - likely accurate match
70-89%: Good confidence - review recommended
50-69%: Moderate confidence - manual review needed
Below 50%: Low confidence - likely requires correction
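
When working with exported results, the bands above can be applied programmatically. This is a small sketch under stated assumptions: the function name and the short band labels are hypothetical, and fractional scores (for example 89.5) are assumed to fall into the lower band.

```python
def confidence_band(score_percent):
    """Map a 0-100 confidence score to the review band listed above."""
    if not 0 <= score_percent <= 100:
        raise ValueError(f"score out of range: {score_percent}")
    if score_percent >= 90:
        return "high"       # likely accurate match
    if score_percent >= 70:
        return "good"       # review recommended
    if score_percent >= 50:
        return "moderate"   # manual review needed
    return "low"            # likely requires correction

for score in (95, 72, 50, 12):
    print(score, confidence_band(score))
```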

View Modes

Full View: Complete table with all individual mappings
Consolidated View: Groups similar results showing consolidation opportunities
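
One way to picture the Consolidated View is as a grouping of individual mappings by their cleaned name. The sketch below assumes each result row carries the original `EXAM_NAME` and a cleaned name under a hypothetical `clean_name` key; the tool's actual export fields and grouping rules may differ.

```python
from collections import defaultdict

def consolidate(mappings):
    """Group original exam names by cleaned name (hypothetical 'clean_name' key).

    Only groups with more than one distinct original name are returned,
    since those are the candidates for consolidation.
    """
    groups = defaultdict(set)
    for row in mappings:
        groups[row["clean_name"]].add(row["EXAM_NAME"])
    return {
        clean: sorted(originals)
        for clean, originals in groups.items()
        if len(originals) > 1
    }

# Illustrative rows only; these are not real output from the tool.
rows = [
    {"EXAM_NAME": "CT CHEST WITH CONTRAST", "clean_name": "CT Chest with contrast"},
    {"EXAM_NAME": "CT THORAX C+", "clean_name": "CT Chest with contrast"},
    {"EXAM_NAME": "MRI BRAIN", "clean_name": "MRI Brain"},
]
print(consolidate(rows))
```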

SNOMED Integration

Results include SNOMED CT codes where available, providing standardized medical terminology for interoperability.

🧪 Testing & Quality Assurance

100 Exam Test Suite

Use the built-in test suite to evaluate different model combinations:

1. Select your preferred models
2. Click "100 Exam Test Suite"
3. Review performance metrics and accuracy
4. Compare results across different model configurations

⚠️ Note: Test suite processing may take several minutes depending on selected models and system load.

Configuration Management

Edit Config

Advanced users can modify system configuration including:

Model weights and scoring parameters
Threshold settings for confidence levels
NHS reference data sources
Processing pipeline settings

⚠️ Warning: Configuration changes affect system behavior and may require cache rebuilding. Changes should be tested thoroughly.

🚨 Troubleshooting

Common Issues

Upload fails: Ensure JSON format is valid and contains required fields
Processing stalls: Large files may take time; check status messages
Low confidence scores: Try different model combinations or review input data quality
Missing results: Verify internet connection for API access to reranker models
