Getting Started
The Radiology Code Semantic Cleaner is an AI-powered tool that standardizes radiology exam names against NHS reference data using advanced natural language processing and semantic matching.
Quick Start Guide
- Select AI Models: Choose your preferred retriever and reranker models from the AI Model Settings section
- Upload Data: Drop your JSON file containing radiology codes in the upload area
- Review Results: Examine the cleaned names, confidence scores, and mappings
- Export Data: Download the processed results for use in your systems
AI Model Settings
Retriever Selection
The retriever model performs semantic search to find candidate matches from the NHS reference database:
- BioLORD: Biomedical language model optimized for medical terminology
- Default: Standard embedding model for general medical text
Reranker Selection
The reranker model scores and ranks the candidates found by the retriever:
- MedCPT (HuggingFace): Medical cross-encoder for high-accuracy reranking
- OpenRouter LLMs: GPT-4, Claude, and Gemini models for contextual understanding
Tip: Your model selections are saved automatically and restored in future sessions.
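The two-stage flow described above (a retriever shortlists candidates, then a reranker scores only that shortlist) can be illustrated with a toy sketch. Everything here is a stand-in: token overlap replaces the embedding retriever, `difflib` string similarity replaces the cross-encoder or LLM reranker, and the reference names are made-up examples rather than actual NHS reference entries. It shows the shape of the pipeline, not the tool's implementation.

```python
from difflib import SequenceMatcher


def retrieve(query, reference_names, top_k=3):
    """Stage 1 stand-in: shortlist reference names by token overlap (Jaccard)."""
    query_tokens = set(query.lower().split())

    def overlap(name):
        name_tokens = set(name.lower().split())
        union = query_tokens | name_tokens
        return len(query_tokens & name_tokens) / len(union) if union else 0.0

    # sorted() is stable, so ties keep their original order.
    return sorted(reference_names, key=overlap, reverse=True)[:top_k]


def rerank(query, candidates):
    """Stage 2 stand-in: rescore only the shortlist with a costlier comparison."""
    scored = [
        (SequenceMatcher(None, query.lower(), name.lower()).ratio(), name)
        for name in candidates
    ]
    return sorted(scored, reverse=True)  # best match first


# Hypothetical reference list, for illustration only.
reference = [
    "CT Thorax with contrast",
    "CT Abdomen with contrast",
    "MRI Head",
    "US Abdomen",
]
shortlist = retrieve("CT CHEST WITH CONTRAST", reference, top_k=2)
ranked = rerank("CT CHEST WITH CONTRAST", shortlist)
```

The point of the split is cost: the cheap first stage scans the whole reference set, while the expensive second stage only sees `top_k` candidates.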
Data Format
Input Format
Upload a JSON file containing an array of radiology codes with the following structure:
[
  {
    "EXAM_NAME": "CT CHEST WITH CONTRAST",
    "MODALITY_CODE": "CT",
    "DATA_SOURCE": "Hospital_A",
    "EXAM_CODE": "CTX001"
  },
  {
    "EXAM_NAME": "MRI BRAIN",
    "MODALITY_CODE": "MR",
    "DATA_SOURCE": "Hospital_B",
    "EXAM_CODE": "MRI123"
  }
]
Required Fields
- EXAM_NAME: The original radiology exam name to be cleaned
- MODALITY_CODE: Imaging modality (CT, MR, US, XR, etc.)
- DATA_SOURCE: Source system or hospital identifier
- EXAM_CODE: Unique identifier for the exam
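Checking a file against these requirements before uploading can save a failed round trip. The following is a minimal sketch, not the tool's own validator: the function name `validate_records` is hypothetical, and treating each required field as a non-empty string is an assumption based on the sample above.

```python
import json

REQUIRED_FIELDS = ("EXAM_NAME", "MODALITY_CODE", "DATA_SOURCE", "EXAM_CODE")


def validate_records(text):
    """Return a list of problems found; an empty list means the input looks valid."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]

    if not isinstance(data, list):
        return ["top-level value must be an array of records"]

    problems = []
    for index, record in enumerate(data):
        if not isinstance(record, dict):
            problems.append(f"record {index}: expected an object")
            continue
        for field in REQUIRED_FIELDS:
            value = record.get(field)
            # Assumption: every required field is a non-empty string.
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {index}: missing or empty {field}")
    return problems
```

For example, `validate_records(open("codes.json", encoding="utf-8").read())` would list each offending record by its position in the array (`codes.json` is a placeholder filename).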
Understanding Results
Confidence Scores
Each cleaned name includes a confidence score (0-100%) indicating the system's certainty in the match:
- 90-100%: High confidence - likely accurate match
- 70-89%: Good confidence - review recommended
- 50-69%: Moderate confidence - manual review needed
- Below 50%: Low confidence - likely requires correction
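When post-processing exported results, the bands above can be applied with a small helper. This is a sketch under assumptions: the function name `confidence_band` and the short labels are hypothetical, and fractional scores (for example 89.5) are placed in the lower band, which the ranges above do not specify.

```python
def confidence_band(score_percent):
    """Map a confidence score in the range 0-100 to the review band listed above."""
    if not 0 <= score_percent <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score_percent}")
    if score_percent >= 90:
        return "high"      # likely accurate match
    if score_percent >= 70:
        return "good"      # review recommended
    if score_percent >= 50:
        return "moderate"  # manual review needed
    return "low"           # likely requires correction
```

Filtering on `confidence_band(score) in ("moderate", "low")` is one way to build a manual review queue.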
View Modes
- Full View: Complete table with all individual mappings
- Consolidated View: Groups similar results showing consolidation opportunities
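As a rough illustration of what a consolidation opportunity looks like, the sketch below groups exported rows that share a cleaned name and keeps only the groups with more than one source code. The `clean_name` key and the exact-match grouping rule are assumptions made for this example; the tool's export format and its notion of "similar" may differ.

```python
from collections import defaultdict


def consolidation_groups(rows):
    """Group (DATA_SOURCE, EXAM_CODE) pairs by cleaned name; keep groups of 2+."""
    groups = defaultdict(list)
    for row in rows:
        # "clean_name" is a hypothetical output field, not a documented one.
        groups[row["clean_name"]].append((row["DATA_SOURCE"], row["EXAM_CODE"]))
    return {name: codes for name, codes in groups.items() if len(codes) > 1}


rows = [
    {"clean_name": "CT Thorax with contrast", "DATA_SOURCE": "Hospital_A", "EXAM_CODE": "CTX001"},
    {"clean_name": "CT Thorax with contrast", "DATA_SOURCE": "Hospital_B", "EXAM_CODE": "CT77"},
    {"clean_name": "MRI Head", "DATA_SOURCE": "Hospital_B", "EXAM_CODE": "MRI123"},
]
groups = consolidation_groups(rows)
```

Here the two CT codes from different hospitals end up in one group, while the single MRI row is left out.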
SNOMED Integration
Results include SNOMED CT codes where available, providing standardized medical terminology for interoperability.
Testing & Quality Assurance
100 Exam Test Suite
Use the built-in test suite to evaluate different model combinations:
- Select your preferred models
- Click "100 Exam Test Suite"
- Review performance metrics and accuracy
- Compare results across different model configurations
Note: Test suite processing may take several minutes depending on selected models and system load.
Configuration Management
Edit Config
Advanced users can modify system configuration including:
- Model weights and scoring parameters
- Threshold settings for confidence levels
- NHS reference data sources
- Processing pipeline settings
Warning: Configuration changes affect system behavior and may require rebuilding the cache. Test changes thoroughly before relying on the results.
Troubleshooting
Common Issues
- Upload fails: Ensure the file is valid JSON and that every record contains the required fields
- Processing stalls: Large files may take time; check status messages
- Low confidence scores: Try different model combinations or review input data quality
- Missing results: Verify internet connection for API access to reranker models