# Dots.OCR API Testing

This directory contains comprehensive testing scripts for the Dots.OCR API endpoint.

## Test Scripts

### 1. `test_api_endpoint.py` - Comprehensive API Testing

The main testing script that provides full API validation capabilities.

**Features:**
- Health check validation
- Single and multiple image testing
- ROI (Region of Interest) testing
- Field extraction validation
- Response structure validation
- Performance metrics
- Detailed error reporting

**Usage:**
```bash
# Basic test with default settings
python test_api_endpoint.py

# Test with custom API URL
python test_api_endpoint.py --url https://your-api.example.com

# Test with ROI
python test_api_endpoint.py --roi '{"x1": 0.1, "y1": 0.1, "x2": 0.9, "y2": 0.9}'

# Test with specific expected fields
python test_api_endpoint.py --expected-fields document_number surname given_names

# Verbose output
python test_api_endpoint.py --verbose

# Custom timeout
python test_api_endpoint.py --timeout 60
```

**Options:**
- `--url`: API base URL (default: http://localhost:7860)
- `--timeout`: Request timeout in seconds (default: 30)
- `--roi`: ROI coordinates as JSON string
- `--expected-fields`: List of expected field names to validate
- `--verbose`: Enable verbose logging

### 2. `quick_test.py` - Quick Validation

A simple script for quick API validation after deployment.

**Usage:**
```bash
# Test local API
python quick_test.py

# Test remote API
python quick_test.py https://your-api.example.com
```

## Test Configuration

### `test_config.json`

Configuration file for test parameters and thresholds.

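As a rough illustration, a test run could begin by confirming that the config file contains each of the sections listed below. This is a minimal sketch, not the scripts' actual loading code: the helper names `missing_sections` and `load_config` are hypothetical, only the top-level section names are taken from this README, and nothing is assumed about the structure inside each section.

```python
import json

# Top-level sections named in this README; their inner structure is not assumed.
EXPECTED_SECTIONS = (
    "api_endpoints",
    "test_images",
    "expected_fields",
    "roi_test_cases",
    "performance_thresholds",
    "test_timeout",
)


def missing_sections(config: dict) -> list:
    """Return the expected top-level sections that are absent from a parsed config."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


def load_config(path: str = "test_config.json") -> dict:
    """Load the JSON config (hypothetical helper) and fail early if a section is missing."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = missing_sections(config)
    if missing:
        raise KeyError(f"{path} is missing sections: {', '.join(missing)}")
    return config
```

Failing early on a missing section keeps a configuration mistake from surfacing later as a confusing test failure.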
**Configuration sections:**
- `api_endpoints`: Different API URLs for various environments
- `test_images`: List of test image files
- `expected_fields`: Fields that should be extracted
- `roi_test_cases`: Different ROI configurations to test
- `performance_thresholds`: Performance validation criteria
- `test_timeout`: Default timeout for requests

## Test Images

The following test images are used for validation:
- `tom_id_card_front.jpg` - Front of Dutch ID card
- `tom_id_card_back.jpg` - Back of Dutch ID card

## Testing Scenarios

### 1. Basic Functionality Test
```bash
python test_api_endpoint.py
```
Tests basic API functionality with default settings.

### 2. ROI Testing
```bash
python test_api_endpoint.py --roi '{"x1": 0.25, "y1": 0.25, "x2": 0.75, "y2": 0.75}'
```
Tests Region of Interest cropping functionality.

### 3. Field Validation Test
```bash
python test_api_endpoint.py --expected-fields document_number surname given_names nationality
```
Tests that specific fields are extracted correctly.

### 4. Performance Test
```bash
python test_api_endpoint.py --timeout 60 --verbose
```
Tests API performance with extended timeout and detailed logging.

## Expected Results

### Successful Test Output
```
🔍 Checking API health...
✅ API is healthy: {'status': 'healthy', 'version': '1.0.0', 'model_loaded': True}
🚀 Starting API tests with 2 images...
✅ tom_id_card_front.jpg: 2.45s
✅ tom_id_card_back.jpg: 1.23s
📊 Test Results:
  Total images: 2
  Successful: 2
  Failed: 0
  Success rate: 100.0%
  Average processing time: 1.84s
🎉 All tests completed successfully!
```

### Field Extraction Example
```
Page 1: 11 fields extracted
  document_number: NLD123456789 (confidence: 0.90)
  surname: MULDER (confidence: 0.90)
  given_names: THOMAS JAN (confidence: 0.90)
  nationality: NLD (confidence: 0.95)
  date_of_birth: 15-03-1990 (confidence: 0.90)
  gender: M (confidence: 0.95)
```

## Troubleshooting

### Common Issues

1. **Connection Refused**
   - Check if the API is running
   - Verify the correct URL and port
   - Check firewall settings

2. **Timeout Errors**
   - Increase timeout with `--timeout` parameter
   - Check API performance and resource usage

3. **Missing Fields**
   - Verify test images contain the expected text
   - Check field extraction patterns in the code
   - Review API logs for processing errors

4. **Validation Errors**
   - Check API response format
   - Verify model is loaded correctly
   - Review error logs for details

### Debug Mode

Enable verbose logging for detailed debugging:
```bash
python test_api_endpoint.py --verbose
```

## Integration with CI/CD

The test scripts can be integrated into CI/CD pipelines:

```yaml
# Example GitHub Actions step
- name: Test API Endpoint
  run: |
    python scripts/test_api_endpoint.py --url ${{ env.API_URL }} --timeout 60
```

## Performance Monitoring

The scripts provide performance metrics that can be used for monitoring:
- Processing time per image
- Success rate
- Field extraction accuracy
- Response validation results

These metrics can be integrated with monitoring systems like Prometheus or DataDog.

## 🚀 Production API Testing

### Current Production Endpoint
- **URL**: https://algoryn-dots-ocr-idcard.hf.space
- **Health Check**: https://algoryn-dots-ocr-idcard.hf.space/health
- **API Docs**: https://algoryn-dots-ocr-idcard.hf.space/docs

### Quick Production Test
```bash
# Test production API
./run_tests.sh -e production

# Quick test with curl (no Python dependencies)
./test_production_curl.sh
```

### Staging Environment
- **Staging URL**: https://algoryn-dots-ocr-idcard-staging.hf.space (to be created)
- **Purpose**: Safe testing before production deployment

### Environment-Specific Testing
```bash
# Test different environments
./run_tests.sh -e local       # Local development
./run_tests.sh -e staging     # Staging environment
./run_tests.sh -e production  # Production environment
```

---

### 5. `test_debug_ocr.sh` - Per-request debug logging via curl

Use this for quick, dependency-light testing of the server-side debug mode that prints OCR snippets, extracted fields, and MRZ details to logs.

**Usage:**
```bash
# Local server (per-request debug on)
./test_debug_ocr.sh -u http://localhost:7860 -f tom_id_card_front.jpg -d

# Hugging Face Space (replace with your Space URL)
./test_debug_ocr.sh -u https://.hf.space -f tom_id_card_front.jpg -d \
  -r '{"x1":0,"y1":0,"x2":1,"y2":0.5}'
```

You can also enable debug globally on the server with `DOTS_OCR_DEBUG=1`. The script only toggles the request-level flag via `-d`.
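Both `--roi` and `-r` take the ROI as a JSON string, so a malformed value can be caught before any request is sent. The sketch below is one possible pre-check, not part of the test scripts: `parse_roi` is a hypothetical helper, and the rules it enforces (coordinates normalized to the range 0 to 1, with `x1 < x2` and `y1 < y2`) are assumptions inferred only from the example values in this README. The API's actual validation may differ.

```python
import json


def parse_roi(raw: str) -> dict:
    """Parse an ROI JSON string and check it against the assumed rules."""
    roi = json.loads(raw)  # invalid JSON raises JSONDecodeError, a ValueError subclass
    if not isinstance(roi, dict):
        raise ValueError("ROI must be a JSON object")

    keys = ("x1", "y1", "x2", "y2")
    missing = [key for key in keys if key not in roi]
    if missing:
        raise ValueError(f"ROI is missing keys: {missing}")

    for key in keys:
        value = roi[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{key} must be a number")
        # Assumption: coordinates are fractions of the image width and height.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{key}={value} is outside the assumed 0..1 range")

    # Assumption: (x1, y1) is the top-left corner and (x2, y2) the bottom-right.
    if roi["x1"] >= roi["x2"] or roi["y1"] >= roi["y2"]:
        raise ValueError("ROI must satisfy x1 < x2 and y1 < y2")

    return {key: float(roi[key]) for key in keys}
```

With these assumptions, the ROI from the Hugging Face example above parses cleanly, while a box with swapped corners or a missing key raises `ValueError`.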