Commit 34cedd8
1 Parent(s): 1e32a60
Add support for reasoning trace display from NuMarkdown-8B-Thinking model
- Created ReasoningParser module to detect and parse <think>/<answer> tags
- Added collapsible reasoning panel UI with formatted step display
- Automatically separates reasoning from final output for cleaner view
- Shows reasoning statistics (word count, percentage of output)
- Added india-medical-ocr-test dataset to examples
- Styled reasoning sections with dark mode support
- Includes reasoning trace indicator badge in statistics panel
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
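
For reference, the tag structure the new parser targets resembles the following sketch (illustrative only; real NuMarkdown-8B-Thinking output is much longer and its wording varies):

```
<think>
1. **Analyse the page layout** Identify columns, tables, and handwriting.
2. **Transcribe the content** Convert each region to markdown.
</think>
<answer>
# Document title

| Field | Value |
|-------|-------|
</answer>
```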
- CLAUDE.md +89 -2
- css/styles.css +49 -0
- index.html +68 -2
- js/app.js +65 -3
- js/reasoning-parser.js +224 -0
- linkedin-post.txt +18 -0
- mobile-enhancement-plan.md +237 -0
- multi-ocr-comparison-ui-patterns.md +277 -0
CLAUDE.md
CHANGED
@@ -6,6 +6,32 @@ This file provides guidance to Claude Code (claude.ai/code) when working with th
 
 OCR Text Explorer is a modern, standalone web application for browsing and comparing OCR text improvements in HuggingFace datasets. Built as a lightweight alternative to the Gradio-based OCR Time Machine, it focuses specifically on exploring pre-OCR'd datasets with enhanced user experience.
 
+## Recent Updates
+
+### Markdown Rendering Support (Added 2025-08-01)
+
+The application now supports rendering markdown-formatted VLM output for improved readability:
+
+**Features:**
+- Automatic markdown detection in improved OCR text
+- Toggle button to switch between raw markdown and rendered view
+- Support for common markdown elements: headers, lists, tables, code blocks, links
+- Security-focused implementation with XSS prevention
+- Performance optimization with render caching
+
+**Implementation Details:**
+- Uses marked.js library for markdown parsing
+- Custom renderers for security (sanitizes URLs, prevents script injection)
+- Tailwind-styled markdown elements matching the app's design
+- HTML table support for VLM outputs that use table tags
+- Cache system limits memory usage to 50 rendered items
+
+**UI Changes:**
+- Markdown toggle button appears when markdown is detected
+- "Markdown Detected" badge in statistics panel
+- New "Markdown Diff" mode showing plain vs rendered comparison
+- Both "Improved Only" and "Side by Side" views support rendering
+
 ## Architecture
 
 ### Technology Stack
@@ -123,6 +149,23 @@ case 'your_key':
 // Dark mode: bg-red-950, text-red-300
 ```
 
+### Working with Markdown Rendering
+```javascript
+// Enable/disable markdown rendering
+this.renderMarkdown = true; // Toggle markdown rendering
+
+// Add new markdown patterns to detection
+// In app.js detectMarkdown() method
+const markdownPatterns = [
+  /your_pattern_here/, // Add your pattern
+  // ... existing patterns
+];
+
+// Customize markdown styles
+// In app.js renderMarkdownText() method
+html = html.replace(/<your_element>/g, '<your_element class="your-tailwind-classes">');
+```
+
 ## Performance Optimizations
 
 1. **Direct Dataset Indexing**: Uses `dataset[index]` instead of loading batches into memory
@@ -146,8 +189,33 @@ case 'your_key':
 **Cause**: Signed URLs expire after ~1 hour
 **Fix**: Implemented handleImageError() with automatic URL refresh
 
+### Issue: Markdown tables not rendering
+**Cause**: Default marked.js settings and HTML security restrictions
+**Fix**:
+- Enabled `tables: true` in marked.js options
+- Added safe HTML table tag allowlist in renderer
+- Applied proper Tailwind CSS classes to table elements
+- Added CSS overrides for prose container compatibility
+
+## Mobile Support Status
+
+While the application claims responsive design, the current mobile support is limited. A comprehensive mobile enhancement is planned but not yet implemented. See [mobile-enhancement-plan.md](mobile-enhancement-plan.md) for detailed technical requirements and implementation approach.
+
+**Current limitations:**
+- Fixed desktop layout doesn't adapt well to small screens
+- No touch gesture support for navigation
+- Small touch targets for buttons and inputs
+- Desktop-only interactions (hover states, keyboard shortcuts)
+
+**Planned improvements:**
+- Responsive stacked layout for mobile devices
+- Touch gestures (swipe for navigation)
+- Mobile-optimized navigation bar
+- Touch-friendly UI components
+
 ## Future Enhancements
 
+- [ ] Comprehensive mobile support (see mobile-enhancement-plan.md)
 - [ ] Search/filter within dataset
 - [ ] Bookmark favorite samples
 - [ ] Export selected texts
@@ -178,9 +246,28 @@ npx serve .
 ## Testing Datasets
 
 Known working datasets:
-- `davanstrien/exams-ocr` - Default dataset with
+- `davanstrien/exams-ocr` - Default dataset with exam papers (uses `text` and `markdown` columns)
+- `davanstrien/rolm-test` - Victorian theatre playbills processed with RolmOCR (uses `text` and `rolmocr_text` columns, includes `inference_info` metadata)
 - Any dataset with image + text columns
 
 Column patterns automatically detected:
 - Original: `text`, `ocr`, `original_text`, `ground_truth`
-- Improved: `markdown`, `new_ocr`, `corrected_text`, `vlm_ocr`
+- Improved: `markdown`, `new_ocr`, `corrected_text`, `vlm_ocr`, `rolmocr_text`
+- Metadata: `inference_info` (JSON array with model details, processing date, parameters)
+
+## Recent Updates
+
+### Model Information Display (Added 2025-08-04)
+
+The application now displays model processing information when available:
+
+**Features:**
+- Automatic detection of `inference_info` column
+- Model metadata panel showing: model name, processing date, batch size, max tokens
+- Link to processing script when available
+- Positioned prominently below image for immediate visibility
+
+**Implementation Notes:**
+- The model info panel only appears when `inference_info` column exists
+- Supports datasets processed with UV scripts via HF Jobs
+- Gracefully handles datasets without model metadata
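The `inference_info` column is only described above as a JSON array with model details, processing date, and parameters; the exact field names are not part of this commit. A minimal reading sketch, with hypothetical keys, might look like:

```javascript
// Hypothetical inference_info entry; real key names depend on the UV script
// that produced the dataset and are not specified in this commit.
const sample = {
  inference_info: JSON.stringify([
    {
      model: 'NuMarkdown-8B-Thinking',
      processing_date: '2025-08-04',
      batch_size: 16,
      max_tokens: 4096,
      script_url: 'https://example.com/process.py' // placeholder link
    }
  ])
};

// Guarded read: only render the panel when the column exists and parses.
let info = null;
try {
  const parsed = JSON.parse(sample.inference_info || '[]');
  info = Array.isArray(parsed) ? parsed[0] : null;
} catch (e) {
  info = null; // gracefully handle datasets without usable metadata
}

if (info) {
  console.log(`${info.model} (processed ${info.processing_date})`);
}
```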
css/styles.css
CHANGED
@@ -48,6 +48,55 @@ body {
   word-break: break-word;
 }
 
+/* Reasoning trace styling */
+.reasoning-panel {
+  @apply bg-gradient-to-r from-blue-50 to-indigo-50 dark:from-blue-950/20 dark:to-indigo-950/20;
+  @apply border-l-4 border-blue-500 dark:border-blue-400;
+}
+
+.reasoning-step {
+  @apply transition-all hover:bg-gray-50 dark:hover:bg-gray-800/50 rounded-md p-2 -m-2;
+}
+
+.reasoning-step-number {
+  @apply inline-flex items-center justify-center w-7 h-7;
+  @apply bg-gradient-to-br from-blue-500 to-indigo-600;
+  @apply text-white text-xs font-bold rounded-full;
+  @apply shadow-sm;
+}
+
+.reasoning-step-title {
+  @apply font-semibold text-gray-900 dark:text-gray-100;
+  @apply border-b border-gray-200 dark:border-gray-700 pb-1 mb-2;
+}
+
+.reasoning-step-content {
+  @apply text-sm text-gray-700 dark:text-gray-300;
+  @apply leading-relaxed;
+}
+
+/* Collapse animation for reasoning panel */
+[x-collapse] {
+  overflow: hidden;
+  transition: max-height 0.3s ease-out;
+}
+
+[x-collapse].collapsed {
+  max-height: 0;
+}
+
+/* Reasoning trace indicators */
+.reasoning-indicator {
+  @apply animate-pulse;
+}
+
+.reasoning-badge {
+  @apply inline-flex items-center px-3 py-1 rounded-full text-xs font-medium;
+  @apply bg-gradient-to-r from-blue-100 to-indigo-100 dark:from-blue-900 dark:to-indigo-900;
+  @apply text-blue-800 dark:text-blue-200;
+  @apply border border-blue-200 dark:border-blue-700;
+}
+
 /* Keyboard hint styling */
 kbd {
   @apply inline-block px-2 py-1 text-xs font-semibold text-gray-800 bg-gray-100 border border-gray-300 rounded dark:bg-gray-700 dark:text-gray-200 dark:border-gray-600;
index.html
CHANGED
@@ -314,13 +314,19 @@
             <span x-text="wordStats.original || '-'"></span> → <span x-text="wordStats.improved || '-'"></span>
           </span>
         </div>
-        <div
-        <span class="inline-flex items-center px-2.5 py-0.5 rounded-full text-xs font-medium bg-purple-100 dark:bg-purple-900 text-purple-800 dark:text-purple-200">
+        <div class="mt-2 flex items-center justify-center space-x-2">
+          <span x-show="hasMarkdown" class="inline-flex items-center px-2.5 py-0.5 rounded-full text-xs font-medium bg-purple-100 dark:bg-purple-900 text-purple-800 dark:text-purple-200">
            <svg class="w-3 h-3 mr-1" fill="none" stroke="currentColor" viewBox="0 0 24 24">
              <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 12h6m-6 4h6m2 5H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z"></path>
            </svg>
            Markdown Detected
          </span>
+          <span x-show="hasReasoningTrace" class="inline-flex items-center px-2.5 py-0.5 rounded-full text-xs font-medium bg-blue-100 dark:bg-blue-900 text-blue-800 dark:text-blue-200">
+            <svg class="w-3 h-3 mr-1" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+              <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z"></path>
+            </svg>
+            Reasoning Trace
+          </span>
         </div>
       </div>
     </div>
@@ -390,6 +396,65 @@
 
       <!-- Improved Only -->
       <div x-show="activeTab === 'improved'" class="max-w-none">
+        <!-- Reasoning Trace Panel -->
+        <div x-show="hasReasoningTrace" class="mb-4">
+          <div class="bg-blue-50 dark:bg-blue-950/20 border border-blue-200 dark:border-blue-800 rounded-lg">
+            <button
+              @click="showReasoning = !showReasoning"
+              class="w-full px-4 py-3 flex items-center justify-between text-left hover:bg-blue-100 dark:hover:bg-blue-950/40 transition-colors rounded-t-lg"
+            >
+              <div class="flex items-center space-x-2">
+                <svg class="w-5 h-5 text-blue-600 dark:text-blue-400" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+                  <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9.663 17h4.673M12 3v1m6.364 1.636l-.707.707M21 12h-1M4 12H3m3.343-5.657l-.707-.707m2.828 9.9a5 5 0 117.072 0l-.548.547A3.374 3.374 0 0014 18.469V19a2 2 0 11-4 0v-.531c0-.895-.356-1.754-.988-2.386l-.548-.547z"></path>
+                </svg>
+                <span class="font-medium text-gray-900 dark:text-gray-100">Model Reasoning</span>
+                <span class="text-sm text-gray-600 dark:text-gray-400" x-show="reasoningStats">
+                  (<span x-text="reasoningStats?.reasoningWords"></span> words, <span x-text="reasoningStats?.reasoningRatio"></span>% of output)
+                </span>
+              </div>
+              <svg
+                class="w-5 h-5 text-gray-500 dark:text-gray-400 transition-transform"
+                :class="showReasoning ? 'rotate-180' : ''"
+                fill="none" stroke="currentColor" viewBox="0 0 24 24"
+              >
+                <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 9l-7 7-7-7"></path>
+              </svg>
+            </button>
+
+            <div x-show="showReasoning" x-collapse class="px-4 pb-4">
+              <div class="bg-white dark:bg-gray-800 rounded-lg p-4 mt-2">
+                <template x-if="formattedReasoning && formattedReasoning.steps.length > 0">
+                  <div class="space-y-3">
+                    <template x-for="(step, index) in formattedReasoning.steps" :key="index">
+                      <div class="pl-4 border-l-2 border-gray-200 dark:border-gray-700">
+                        <div class="font-medium text-sm text-gray-900 dark:text-gray-100 mb-1">
+                          <span class="inline-block w-6 h-6 bg-blue-100 dark:bg-blue-900 text-blue-600 dark:text-blue-400 rounded-full text-center text-xs leading-6 mr-2" x-text="step.number || (index + 1)"></span>
+                          <span x-text="step.title"></span>
+                        </div>
+                        <div class="text-sm text-gray-700 dark:text-gray-300 whitespace-pre-wrap" x-text="step.content"></div>
+                      </div>
+                    </template>
+                  </div>
+                </template>
+
+                <template x-if="!formattedReasoning || formattedReasoning.steps.length === 0">
+                  <pre class="whitespace-pre-wrap font-mono text-xs text-gray-700 dark:text-gray-300" x-text="reasoningContent"></pre>
+                </template>
+              </div>
+            </div>
+          </div>
+        </div>
+
+        <!-- Final Answer Content -->
+        <div x-show="hasReasoningTrace" class="mb-2">
+          <div class="flex items-center space-x-2 text-sm text-gray-600 dark:text-gray-400 mb-2">
+            <svg class="w-4 h-4" fill="none" stroke="currentColor" viewBox="0 0 24 24">
+              <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 12l2 2 4-4m6 2a9 9 0 11-18 0 9 9 0 0118 0z"></path>
+            </svg>
+            <span>Final Output</span>
+          </div>
+        </div>
+
         <div x-show="!renderMarkdown">
           <pre class="whitespace-pre-wrap font-mono text-xs bg-gray-50 dark:bg-gray-800 text-gray-900 dark:text-gray-100 p-4 rounded-lg" x-text="getImprovedText()"></pre>
         </div>
@@ -532,6 +597,7 @@
     <!-- Local Scripts -->
     <script src="js/diff-utils.js"></script>
     <script src="js/dataset-api.js"></script>
+    <script src="js/reasoning-parser.js"></script>
     <script src="js/app.js"></script>
   </body>
 </html>
js/app.js
CHANGED
@@ -12,7 +12,8 @@ document.addEventListener('alpine:init', () => {
         // Example datasets
         exampleDatasets: [
             { id: 'davanstrien/exams-ocr', name: 'Exams OCR', description: 'Historical exam papers with VLM corrections' },
-            { id: 'davanstrien/rolm-test', name: 'ROLM Test', description: 'Documents processed with RolmOCR model' }
+            { id: 'davanstrien/rolm-test', name: 'ROLM Test', description: 'Documents processed with RolmOCR model' },
+            { id: 'davanstrien/india-medical-ocr-test', name: 'India Medical OCR', description: 'Medical documents with NuMarkdown reasoning traces' }
         ],
 
         // Navigation state
@@ -33,6 +34,14 @@ document.addEventListener('alpine:init', () => {
         renderMarkdown: false,
         hasMarkdown: false,
 
+        // Reasoning trace state
+        hasReasoningTrace: false,
+        showReasoning: false,
+        reasoningContent: null,
+        answerContent: null,
+        reasoningStats: null,
+        formattedReasoning: null,
+
         // Flow view state
         flowItems: [],
         flowStartIndex: 0,
@@ -190,9 +199,10 @@ document.addEventListener('alpine:init', () => {
             console.log('Column info:', this.columnInfo);
             console.log('Current sample keys:', Object.keys(this.currentSample));
 
-            // Check if improved text contains markdown
+            // Check if improved text contains markdown and reasoning traces
             const improvedText = this.getImprovedText();
-            this.
+            this.parseReasoningTrace(improvedText);
+            this.hasMarkdown = this.detectMarkdown(this.answerContent || improvedText);
 
             // Update diff when sample changes
             this.updateDiff();
@@ -279,6 +289,38 @@ document.addEventListener('alpine:init', () => {
             };
         },
 
+        parseReasoningTrace(text) {
+            // Reset reasoning state
+            this.hasReasoningTrace = false;
+            this.reasoningContent = null;
+            this.answerContent = null;
+            this.reasoningStats = null;
+            this.formattedReasoning = null;
+
+            if (!text || !window.ReasoningParser) return;
+
+            // Check if text contains reasoning trace
+            if (ReasoningParser.detectReasoningTrace(text)) {
+                const parsed = ReasoningParser.parseReasoningContent(text);
+
+                if (parsed.hasReasoning) {
+                    this.hasReasoningTrace = true;
+                    this.reasoningContent = parsed.reasoning;
+                    this.answerContent = parsed.answer;
+                    this.formattedReasoning = ReasoningParser.formatReasoningSteps(parsed.reasoning);
+                    this.reasoningStats = ReasoningParser.getReasoningStats(parsed);
+
+                    console.log('Reasoning trace detected:', this.reasoningStats);
+                } else {
+                    // No reasoning found, use original text as answer
+                    this.answerContent = text;
+                }
+            } else {
+                // No reasoning markers, use original text
+                this.answerContent = text;
+            }
+        },
+
         getOriginalText() {
             if (!this.currentSample) return '';
             const columns = this.api.detectColumns(null, this.currentSample);
@@ -286,6 +328,17 @@ document.addEventListener('alpine:init', () => {
         },
 
         getImprovedText() {
+            if (!this.currentSample) return '';
+            const columns = this.api.detectColumns(null, this.currentSample);
+            const rawText = this.currentSample[columns.improvedText] || 'No improved text found';
+
+            // If we have parsed answer content from reasoning trace, use that
+            // Otherwise return the raw text
+            return this.hasReasoningTrace && this.answerContent ? this.answerContent : rawText;
+        },
+
+        getRawImprovedText() {
+            // Get the raw improved text without parsing reasoning traces
             if (!this.currentSample) return '';
             const columns = this.api.detectColumns(null, this.currentSample);
             return this.currentSample[columns.improvedText] || 'No improved text found';
@@ -564,6 +617,15 @@ document.addEventListener('alpine:init', () => {
             content += `${'='.repeat(50)}\n`;
             content += original;
             content += `\n\n${'='.repeat(50)}\n\n`;
+
+            // Include reasoning trace if available
+            if (this.hasReasoningTrace && this.reasoningContent) {
+                content += `MODEL REASONING:\n`;
+                content += `${'='.repeat(50)}\n`;
+                content += this.reasoningContent;
+                content += `\n\n${'='.repeat(50)}\n\n`;
+            }
+
             content += `IMPROVED OCR:\n`;
             content += `${'='.repeat(50)}\n`;
             content += improved;
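A standalone sketch (not the app's actual Alpine wiring) of the precedence rule the updated getImprovedText() implements: once a reasoning trace is parsed, downstream consumers such as the diff view, markdown rendering, and export see only the answer content, while getRawImprovedText() still exposes the untouched column value.

```javascript
// Field names mirror the component state added above, but this object is a stub.
const state = {
  hasReasoningTrace: true,
  answerContent: '# Patient Record\n\n| Field | Value |',
  rawText: '<think>1. **Check layout** ...</think><answer># Patient Record\n\n| Field | Value |</answer>'
};

function improvedTextFor(s) {
  // Mirrors: return this.hasReasoningTrace && this.answerContent ? this.answerContent : rawText;
  return s.hasReasoningTrace && s.answerContent ? s.answerContent : s.rawText;
}

console.log(improvedTextFor(state));                                  // answer only
console.log(improvedTextFor({ ...state, hasReasoningTrace: false })); // raw column value
```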
js/reasoning-parser.js
ADDED
@@ -0,0 +1,224 @@
/**
 * Reasoning Trace Parser
 * Handles parsing and formatting of model reasoning traces from OCR outputs
 */

class ReasoningParser {
  /**
   * Detect if text contains reasoning trace markers
   * @param {string} text - The text to check
   * @returns {boolean} - True if reasoning trace is detected
   */
  static detectReasoningTrace(text) {
    if (!text || typeof text !== 'string') return false;

    // Check for common reasoning trace patterns
    const patterns = [
      /<think>/i,
      /<thinking>/i,
      /<reasoning>/i,
      /<thought>/i
    ];

    return patterns.some(pattern => pattern.test(text));
  }

  /**
   * Parse reasoning content from text
   * @param {string} text - The text containing reasoning trace
   * @returns {object} - Object with reasoning and answer sections
   */
  static parseReasoningContent(text) {
    if (!text) {
      return { reasoning: null, answer: null, original: text };
    }

    // Try multiple patterns for flexibility
    const patterns = [
      {
        start: /<think>/i,
        end: /<\/think>/i,
        answerStart: /<answer>/i,
        answerEnd: /<\/answer>/i
      },
      {
        start: /<thinking>/i,
        end: /<\/thinking>/i,
        answerStart: /<answer>/i,
        answerEnd: /<\/answer>/i
      },
      {
        start: /<reasoning>/i,
        end: /<\/reasoning>/i,
        answerStart: /<output>/i,
        answerEnd: /<\/output>/i
      }
    ];

    for (const pattern of patterns) {
      const reasoningMatch = text.match(new RegExp(
        pattern.start.source + '([\\s\\S]*?)' + pattern.end.source,
        'i'
      ));

      const answerMatch = text.match(new RegExp(
        pattern.answerStart.source + '([\\s\\S]*?)' + pattern.answerEnd.source,
        'i'
      ));

      if (reasoningMatch || answerMatch) {
        return {
          reasoning: reasoningMatch ? reasoningMatch[1].trim() : null,
          answer: answerMatch ? answerMatch[1].trim() : null,
          hasReasoning: !!reasoningMatch,
          hasAnswer: !!answerMatch,
          original: text
        };
      }
    }

    // If no patterns match, return original text as answer
    return {
      reasoning: null,
      answer: text,
      hasReasoning: false,
      hasAnswer: true,
      original: text
    };
  }

  /**
   * Format reasoning steps for display
   * @param {string} reasoningText - The raw reasoning text
   * @returns {object} - Formatted reasoning with steps and metadata
   */
  static formatReasoningSteps(reasoningText) {
    if (!reasoningText) return null;

    // Parse numbered steps (e.g., "1. Step content")
    const stepPattern = /^\d+\.\s+\*\*(.+?)\*\*(.+?)(?=^\d+\.\s|\z)/gms;
    const steps = [];
    let match;

    while ((match = stepPattern.exec(reasoningText)) !== null) {
      steps.push({
        title: match[1].trim(),
        content: match[2].trim()
      });
    }

    // If no numbered steps found, try to parse by line breaks
    if (steps.length === 0) {
      const lines = reasoningText.split('\n').filter(line => line.trim());
      lines.forEach((line, index) => {
        // Check if line starts with a number
        const numberedMatch = line.match(/^(\d+)\.\s*(.+)/);
        if (numberedMatch) {
          const title = numberedMatch[2].replace(/\*\*/g, '').trim();
          steps.push({
            number: numberedMatch[1],
            title: title,
            content: ''
          });
        } else if (steps.length > 0) {
          // Add to previous step's content
          steps[steps.length - 1].content += '\n' + line;
        }
      });
    }

    return {
      steps: steps,
      rawText: reasoningText,
      stepCount: steps.length,
      characterCount: reasoningText.length,
      wordCount: reasoningText.split(/\s+/).filter(w => w).length
    };
  }

  /**
   * Extract key insights from reasoning
   * @param {string} reasoningText - The reasoning text
   * @returns {array} - Array of key insights or decisions
   */
  static extractInsights(reasoningText) {
    if (!reasoningText) return [];

    const insights = [];

    // Look for decision points and key observations
    const patterns = [
      /decision:\s*(.+)/gi,
      /observation:\s*(.+)/gi,
      /note:\s*(.+)/gi,
      /important:\s*(.+)/gi,
      /key finding:\s*(.+)/gi
    ];

    patterns.forEach(pattern => {
      let match;
      while ((match = pattern.exec(reasoningText)) !== null) {
        insights.push(match[1].trim());
      }
    });

    return insights;
  }

  /**
   * Get summary statistics about the reasoning trace
   * @param {object} parsedContent - Parsed reasoning content
   * @returns {object} - Statistics about the reasoning
   */
  static getReasoningStats(parsedContent) {
    if (!parsedContent || !parsedContent.reasoning) {
      return {
        hasReasoning: false,
        reasoningLength: 0,
        answerLength: 0,
        reasoningRatio: 0
      };
    }

    const reasoningLength = parsedContent.reasoning.length;
    const answerLength = parsedContent.answer ? parsedContent.answer.length : 0;
    const totalLength = reasoningLength + answerLength;

    return {
      hasReasoning: true,
      reasoningLength: reasoningLength,
      answerLength: answerLength,
      totalLength: totalLength,
      reasoningRatio: totalLength > 0 ? (reasoningLength / totalLength * 100).toFixed(1) : 0,
      reasoningWords: parsedContent.reasoning.split(/\s+/).filter(w => w).length,
      answerWords: parsedContent.answer ? parsedContent.answer.split(/\s+/).filter(w => w).length : 0
    };
  }

  /**
   * Format reasoning for export
   * @param {object} parsedContent - Parsed reasoning content
   * @param {boolean} includeReasoning - Whether to include reasoning in export
   * @returns {string} - Formatted text for export
   */
  static formatForExport(parsedContent, includeReasoning = true) {
    if (!parsedContent) return '';

    let exportText = '';

    if (includeReasoning && parsedContent.reasoning) {
      exportText += '=== MODEL REASONING ===\n\n';
      exportText += parsedContent.reasoning;
      exportText += '\n\n=== FINAL OUTPUT ===\n\n';
    }

    if (parsedContent.answer) {
      exportText += parsedContent.answer;
    }

    return exportText;
  }
}

// Export for use in other scripts
window.ReasoningParser = ReasoningParser;
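A usage sketch for the class above, driven with a made-up reasoning-style string (illustrative input only; it is not real model output):

```javascript
const modelOutput = [
  '<think>',
  '1. **Identify the layout** The page is a two-column medical form.',
  '2. **Transcribe the fields** Patient name, date of birth, dosage table.',
  '</think>',
  '<answer>',
  '# Patient Record',
  '',
  '| Field | Value |',
  '|-------|-------|',
  '</answer>'
].join('\n');

if (ReasoningParser.detectReasoningTrace(modelOutput)) {
  const parsed = ReasoningParser.parseReasoningContent(modelOutput);
  const steps = ReasoningParser.formatReasoningSteps(parsed.reasoning);
  const stats = ReasoningParser.getReasoningStats(parsed);

  console.log(stats.reasoningWords, 'reasoning words');
  console.log(stats.reasoningRatio + '% of the combined output');
  console.log(steps.stepCount, 'numbered steps detected');
  console.log(parsed.answer); // the markdown shown as "Final Output"
}
```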
linkedin-post.txt
ADDED
@@ -0,0 +1,18 @@
How well do VLM-based OCR models handle Victorian theatre playbills?

Last week I shared OCR Time Capsule for comparing traditional vs VLM-based OCR. I've now added some examples from challenging collections: The British Library's Theatrical playbills from Britain and Ireland collection.

These 150-year-old documents are brutal for OCR:
- Decorative fonts in every size imaginable
- Multi-column layouts with text at odd angles
- Faded ink and show-through from the reverse
- ALL CAPS DRAMATIC ANNOUNCEMENTS!!!

For this dataset I used the RolmOCR model from Reducto (processed via HF Jobs - love how easy UV scripts make GPU inference!). The results? The improvements over traditional OCR are even more dramatic than with exam papers.

Explore the app: https://huggingface.co/spaces/davanstrien/ocr-time-capsule
BL Theatre dataset: https://bl.iro.bl.uk/concern/datasets/a8534aff-c8e3-4fc8-adc1-da542080b1e3

I'll continue to work through the suggestions I got last week but feel free to suggest other hairy OCR challenges to compare VLMs vs existing OCR!

#DigitalHumanities #OCR #GLAM #BritishLibrary #TheatreHistory
mobile-enhancement-plan.md
ADDED
@@ -0,0 +1,237 @@
# Mobile Enhancement Plan for OCR Time Capsule

## Overview

This document outlines the technical requirements for implementing comprehensive mobile support in OCR Time Capsule. While the application claims mobile support, the current implementation has significant limitations that prevent a good mobile user experience.

**Estimated Effort:** 800-1,200 lines of code changes
**Complexity:** Medium-High
**Development Time:** 3-5 days for full implementation, 2 days for MVP

## Current Mobile Limitations

1. **Fixed desktop layout** - Rigid 1/3 + 2/3 split doesn't adapt to small screens
2. **No touch support** - Navigation relies entirely on keyboard shortcuts
3. **Fixed positioning issues** - Footer overlaps content on mobile browsers
4. **Small touch targets** - Buttons/inputs too small for finger interaction
5. **Desktop-only interactions** - Hover states, dropdown menus not touch-friendly
6. **Overflow problems** - Content gets cut off due to fixed heights

## Required Changes

### 1. Layout Restructuring (Critical)

**Current:** Fixed side-by-side layout
```html
<!-- Current structure -->
<div class="flex-1 flex h-full">
  <div class="w-1/3">...</div> <!-- Image panel -->
  <div class="flex-1">...</div> <!-- Text panel -->
</div>
```

**Required:** Responsive stacked layout
```html
<!-- Mobile-first approach -->
<div class="flex flex-col md:flex-row h-full">
  <div class="w-full md:w-1/3">...</div>
  <div class="w-full md:flex-1">...</div>
</div>
```

**Changes needed:**
- Update all layout containers in `index.html` (~50 lines)
- Add mobile-specific CSS classes (~100 lines)
- Implement collapsible image panel for mobile

### 2. Touch Navigation Implementation

**New JavaScript required in `app.js`:**
```javascript
// Touch gesture handling
let touchStartX = 0;
let touchEndX = 0;

initTouchNavigation() {
  const container = document.getElementById('main-content');

  container.addEventListener('touchstart', (e) => {
    touchStartX = e.changedTouches[0].screenX;
  });

  container.addEventListener('touchend', (e) => {
    touchEndX = e.changedTouches[0].screenX;
    this.handleSwipe();
  });
}

handleSwipe() {
  const swipeThreshold = 50;
  const diff = touchStartX - touchEndX;

  if (Math.abs(diff) > swipeThreshold) {
    if (diff > 0) {
      this.nextSample(); // Swipe left
    } else {
      this.previousSample(); // Swipe right
    }
  }
}
```

**Scope:** ~150 lines for complete touch support including:
- Swipe detection
- Touch feedback
- Gesture velocity calculation
- Preventing accidental triggers

### 3. Mobile Navigation UI

**Replace fixed footer with mobile-friendly navigation:**
```html
<!-- Mobile navigation bar -->
<nav class="md:hidden fixed bottom-0 left-0 right-0 bg-white dark:bg-gray-800 border-t">
  <div class="grid grid-cols-3 h-16">
    <button class="flex items-center justify-center" @click="previousSample()">
      <svg class="w-8 h-8">...</svg>
    </button>
    <button class="flex items-center justify-center" @click="showPageSelector = true">
      <span class="text-lg font-medium" x-text="`${currentIndex + 1}/${totalSamples}`"></span>
    </button>
    <button class="flex items-center justify-center" @click="nextSample()">
      <svg class="w-8 h-8">...</svg>
    </button>
  </div>
</nav>
```

**Changes:** ~100 lines for navigation components

### 4. Touch-Friendly Components

**Update all interactive elements:**
- Minimum touch target size: 44x44px
- Add `touch-action` CSS properties
- Increase padding on all buttons
- Replace hover menus with tap-to-open modals

**Example button update:**
```html
<!-- Before -->
<button class="px-2 py-1 text-sm">Load</button>

<!-- After -->
<button class="px-4 py-3 md:px-2 md:py-1 text-base md:text-sm min-w-[44px] min-h-[44px] md:min-w-0 md:min-h-0">
  Load
</button>
```

### 5. Mobile Dock/Gallery

**Transform desktop dock to mobile carousel:**
```javascript
// Mobile-optimized thumbnail gallery
initMobileGallery() {
  this.mobileGallery = {
    currentIndex: 0,
    itemsPerView: 3,
    thumbnails: []
  };

  // Horizontal scroll with snap points
  const gallery = document.getElementById('mobile-gallery');
  gallery.style.scrollSnapType = 'x mandatory';
  gallery.style.overflowX = 'auto';
  gallery.style.webkitOverflowScrolling = 'touch';
}
```

**Scope:** ~200 lines for mobile gallery implementation

### 6. Responsive Breakpoints

**Implement proper breakpoint system:**
```css
/* Mobile first approach */
/* Base: Mobile (< 640px) */
.container {
  display: block;
  padding: 1rem;
}

/* Tablet (640px - 1024px) */
@media (min-width: 640px) {
  .container {
    display: flex;
    padding: 1.5rem;
  }
}

/* Desktop (> 1024px) */
@media (min-width: 1024px) {
  .container {
    padding: 2rem;
  }
}
```

### 7. Performance Optimizations

**Mobile-specific optimizations:**
- Lazy load images with Intersection Observer (see the sketch after this plan)
- Reduce initial JavaScript bundle
- Implement virtual scrolling for large datasets
- Add `will-change` CSS for smooth animations

## Implementation Approach

### Phase 1: MVP (2 days)
1. Basic responsive layout
2. Touch navigation (swipe gestures)
3. Mobile-friendly buttons
4. Fix overflow issues

### Phase 2: Enhanced Mobile UX (2 days)
1. Mobile navigation bar
2. Touch-optimized dock
3. Page selector modal
4. Gesture refinements

### Phase 3: Polish (1 day)
1. Performance optimizations
2. PWA features
3. Cross-device testing
4. Documentation

## Testing Requirements

### Devices to Test
- **iOS:** iPhone SE, iPhone 12/13, iPad
- **Android:** Various screen sizes (5", 6", 7")
- **Browsers:** Safari iOS, Chrome Android, Firefox Mobile

### Key Test Scenarios
1. Portrait/landscape orientation changes
2. Touch gesture accuracy
3. Text readability at different zoom levels
4. Navigation button accessibility
5. Image loading performance on slow connections

## Code Impact Summary

| Component | Lines Changed | Complexity |
|-----------|---------------|------------|
| HTML Layout | 150-200 | Medium |
| CSS/Tailwind | 200-300 | Low-Medium |
| Touch Events | 150 | High |
| Mobile Navigation | 100 | Medium |
| Gallery/Dock | 200 | High |
| **Total** | **800-1,200** | **Medium-High** |

## Priority Recommendations

1. **Must Have:** Responsive layout, basic touch navigation
2. **Should Have:** Mobile navigation bar, touch-friendly buttons
3. **Nice to Have:** Gesture refinements, PWA features, animations

The most critical change is the layout restructuring - without this, other mobile features won't work properly. Start there and build up progressively.
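The plan's performance section calls for lazy-loading images with Intersection Observer; a minimal sketch of that idea, assuming thumbnails are emitted as `<img data-src="...">` placeholders (an assumed markup convention, not something the plan specifies):

```javascript
// Lazy-loading sketch: swap data-src for src only when a thumbnail nears the viewport.
const lazyImages = document.querySelectorAll('img[data-src]');

const observer = new IntersectionObserver((entries, obs) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target;
    img.src = img.dataset.src;      // start the real download
    img.removeAttribute('data-src');
    obs.unobserve(img);             // each image only needs one load
  }
}, { rootMargin: '200px' });        // prefetch slightly before entering view

lazyImages.forEach(img => observer.observe(img));
```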
multi-ocr-comparison-ui-patterns.md
ADDED
@@ -0,0 +1,277 @@
# Multi-OCR Engine Comparison UI Patterns

## Executive Summary

This document outlines UI design patterns for comparing the results of 5+ OCR engines in the OCR Time Capsule application. Based on research of existing comparison tools and UI best practices, we recommend a hybrid approach combining selective comparison, matrix views, and progressive disclosure.

## Key Design Constraints

1. **Human Cognitive Limits**: Users can effectively compare 3-7 items simultaneously
2. **Screen Real Estate**: Limited horizontal space for side-by-side comparisons
3. **Information Density**: Need to show both text content and metadata
4. **Performance**: Rendering 5+ full texts simultaneously can impact performance

## Recommended UI Patterns

### 1. Selective Comparison Mode (Primary Recommendation)

Allow users to select 2-4 engines for detailed comparison from a larger set.

```
+--------------------------------------------------------------+
| Select OCR Engines to Compare:                               |
| [x] Tesseract 5.0   [x] Google Vision   [x] AWS Textract     |
| [ ] Azure AI        [ ] PaddleOCR       [ ] Surya OCR        |
| [ ] EasyOCR         [ ] TrOCR           [ ] RolmOCR          |
|                                                              |
| [Compare Selected (3)]                                       |
+--------------------------------------------------------------+

After selection:
+---------+-------------+-------------+-------------+
| Image   | Tesseract   | Google      | AWS         |
| Preview | 5.0         | Vision      | Textract    |
+---------+-------------+-------------+-------------+
|         | Text output | Text output | Text output |
| [IMG]   | Lorem ipsum | Lorem ipsum | Lorem ipsum |
|         | dolor sit   | dolor sit   | dolar sit   |
|         | amet...     | amet...     | amet...     |
+---------+-------------+-------------+-------------+
```

**Advantages:**
- Maintains readable comparison
- User controls complexity
- Scalable to any number of engines

### 2. Matrix/Grid Overview

Show all results in a compact grid with expand/collapse functionality.

```
+-------------------------------------------------------+
| OCR Engine Comparison Matrix                          |
+------------+----------+----------+----------+---------+
| Engine     | Accuracy | Time(ms) | Preview  | Action  |
+------------+----------+----------+----------+---------+
| Tesseract  | 94.2%    | 1250     | Lorem... | [View]  |
| Google     | 98.1%    | 320      | Lorem... | [View]  |
| AWS        | 97.5%    | 410      | Lorem... | [View]  |
| Azure      | 96.8%    | 380      | Lorem... | [View]  |
| PaddleOCR  | 95.3%    | 890      | Lorem... | [View]  |
| Surya      | 93.7%    | 1100     | Lorem... | [View]  |
+------------+----------+----------+----------+---------+

Click [View] to see full text in modal/sidebar
```

**Advantages:**
- Shows all engines at once
- Easy to scan metrics
- Detailed view on demand

### 3. Reference + Diff View

Select one OCR result as reference and show diffs from others.

```
+-----------------------------------------------------------+
| Reference: Google Vision OCR                              |
| +-------------------------------------------------------+ |
| | Lorem ipsum dolor sit amet, consectetur adipiscing    | |
| | elit, sed do eiusmod tempor incididunt ut labore      | |
| +-------------------------------------------------------+ |
|                                                           |
| Differences from Reference:                               |
| +-------------+---------------------------------------+   |
| | Tesseract   | -dolor +dolar (char 12)               |   |
| |             | -adipiscing +adipiscing (char 38)     |   |
| +-------------+---------------------------------------+   |
| | AWS         | -consectetur +consektetur (char 27)   |   |
| +-------------+---------------------------------------+   |
| | Azure       | No differences                        |   |
| +-------------+---------------------------------------+   |
+-----------------------------------------------------------+
```

**Advantages:**
- Reduces visual complexity
- Easy to see variations
- Good for finding consensus

### 4. Accordion/Tab Hybrid

Combine tabs for primary views with accordions for details.

```
+-----------------------------------------------------------+
| [Overview] [Side-by-Side] [Consensus] [Analytics]         |
+-----------------------------------------------------------+
| Overview Tab:                                             |
|                                                           |
| ▼ Tesseract 5.0 (94.2% accuracy)                          |
|   Lorem ipsum dolor sit amet...                           |
|   [Show full text] [Compare with others]                  |
|                                                           |
| ▶ Google Vision (98.1% accuracy)                          |
| ▶ AWS Textract (97.5% accuracy)                           |
| ▶ Azure AI (96.8% accuracy)                               |
| ▶ PaddleOCR (95.3% accuracy)                              |
+-----------------------------------------------------------+
```

**Advantages:**
- Progressive disclosure
- Maintains context
- Flexible navigation

### 5. Consensus/Voting View

Show agreement levels between engines.

```
+-----------------------------------------------------------+
| Consensus View - 6 OCR Engines                            |
+-----------------------------------------------------------+
| Lorem ipsum █████ sit amet, ████████████ adipiscing       |
|             ^^^^^            ^^^^^^^^^^^^                 |
|             5/6 agree        6/6 agree (consensus)        |
|                                                           |
| Disagreements:                                            |
|   Position 12-16: "dolor"                                 |
|     - Tesseract: "dolar" (1 vote)                         |
|     - Others: "dolor" (5 votes) ✓                         |
|                                                           |
|   Position 27-38: "consectetur"                           |
|     - AWS: "consektetur" (1 vote)                         |
|     - Others: "consectetur" (5 votes) ✓                   |
+-----------------------------------------------------------+
```

**Advantages:**
- Shows confidence levels
- Identifies problem areas
- Good for quality assessment

### 6. Layered Comparison

Stack results with transparency/overlay controls.

```
+----------------------------------------------------------+
| Layer Controls:                      Opacity   Visible   |
| +-------------------------------+   +------------------+ |
| |                               |   | Tesseract        | |
| |     [Overlaid Text View]      |   +------------------+ |
| |                               |   | Google           | |
| |  Multiple colored layers      |   +------------------+ |
| |  showing differences          |   | AWS              | |
| +-------------------------------+   +------------------+ |
+----------------------------------------------------------+
```

**Advantages:**
- Visual diff representation
- Adjustable comparison
- Good for alignment issues

## Metadata Display Patterns

### Inline Badges
```
+-------------------------------------------+
| Tesseract 5.0  [94.2%] [1.2s] [MIT]       |
| Lorem ipsum dolor sit amet...             |
+-------------------------------------------+
```

### Hover Cards
```
+-------------------------------------------+
| Google Vision                             |
|   +---------------------+                 |
|   | Accuracy: 98.1%     |  (on hover)     |
|   | Time: 320ms         |                 |
|   | Cost: $0.0015       |                 |
|   | Language: Multi     |                 |
|   +---------------------+                 |
+-------------------------------------------+
```

## Navigation Patterns

### 1. Engine Selector Bar
```
[All] [High Accuracy] [Fast] [Open Source] [Custom Group]
```

### 2. Quick Switch
```
Previous Engine   [Tesseract ▼]   Next Engine
                   Google Vision
                   AWS Textract
                   Azure AI
```

### 3. Comparison History
```
Recent Comparisons:
• Tesseract vs Google vs AWS (2 min ago)
• All engines - Page 15 (5 min ago)
• Azure vs PaddleOCR (10 min ago)
```

## Mobile Considerations

For mobile devices, use a stacked card approach:

```
+-----------------+
| Original Image  |
+-----------------+
| Tesseract 94.2% |
| ▼ Show text     |
+-----------------+
| Google 98.1%    |
| ▶ Show text     |
+-----------------+
| AWS 97.5%       |
| ▶ Show text     |
+-----------------+
```

## Performance Optimizations

1. **Lazy Loading**: Only load full text when expanded/selected
2. **Virtual Scrolling**: For long documents
3. **Caching**: Store OCR results client-side
4. **Progressive Enhancement**: Start with 2-3 engines, load more on demand

## Recommended Implementation Priority

1. **Phase 1**: Selective Comparison (2-4 engines)
2. **Phase 2**: Matrix Overview with metrics
3. **Phase 3**: Consensus/Voting view
4. **Phase 4**: Advanced features (layers, history, etc.)

## Accessibility Considerations

- Keyboard navigation between engines
- Screen reader announcements for differences
- High contrast mode for diff highlighting
- Alternative text descriptions for visual comparisons

## Conclusion

The selective comparison pattern combined with a matrix overview provides the best balance of usability and functionality for comparing 5+ OCR engines. This approach:

- Respects cognitive limits (3-7 items)
- Provides overview and detail views
- Scales to any number of engines
- Maintains performance
- Works on mobile devices

The key is progressive disclosure: show summary information for all engines, but limit detailed comparison to user-selected subsets.
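A small sketch of the selection state behind the recommended selective-comparison pattern, capping the detailed view at a handful of engines (hypothetical field names; nothing in the app implements this yet):

```javascript
// Selection state for the "compare 2-4 engines" flow described above.
const comparisonState = {
  engines: ['tesseract', 'google-vision', 'aws-textract', 'azure-ai', 'paddleocr', 'surya'],
  selected: new Set(),
  maxSelected: 4 // keep within the 3-7 item cognitive limit, leaving room for the image column
};

function toggleEngine(state, engine) {
  if (state.selected.has(engine)) {
    state.selected.delete(engine);
  } else if (state.selected.size < state.maxSelected) {
    state.selected.add(engine);
  }
  // else: ignore the click or surface a "deselect one first" hint
  return [...state.selected];
}

toggleEngine(comparisonState, 'tesseract');
toggleEngine(comparisonState, 'google-vision');
console.log([...comparisonState.selected]); // ['tesseract', 'google-vision']
```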