<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Watermark Detector</title>
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet">
  <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.7.2/font/bootstrap-icons.css">
  <link rel="stylesheet" href="{{ url_for('static', filename='styles.css') }}">
</head>
<body>
  <div class="container">
    <div class="d-flex justify-content-between align-items-center">
      <h1>Interactive watermark detector</h1>
      <div class="d-flex gap-2">
        <button class="btn btn-outline-secondary" data-bs-toggle="modal" data-bs-target="#helpModal">
          <i class="bi bi-question-circle"></i>
        </button>
        <button class="btn btn-outline-secondary" data-bs-toggle="modal" data-bs-target="#paramsModal">
          <i class="bi bi-gear"></i>
        </button>
      </div>
    </div>

    <!-- Help Modal -->
    <div class="modal fade" id="helpModal" tabindex="-1">
      <div class="modal-dialog modal-lg">
        <div class="modal-content">
          <div class="modal-header">
            <h5 class="modal-title">Watermark Detection Help</h5>
            <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
          </div>
          <div class="modal-body help-modal-body">
            <h4>What is LLM watermarking?</h4>
            <p>LLM watermarking is a technique that slightly modifies how language models generate text, making it possible to detect if text was generated by a specific AI model without visibly changing the text quality.</p>
            <h4>How to use this demo</h4>
            <ol>
              <li>Enter a prompt in the top text area to generate watermarked text</li>
              <li>The generated text will appear in the second text box</li>
              <li>The text will be automatically analyzed to show which tokens (parts of text) were influenced by the watermark</li>
              <li>The statistics at the bottom show the detection results</li>
              <li>You can also paste any text in the second box to test if it contains a watermark</li>
            </ol>
            <h4>Detection Methods</h4>
            <p><strong>Maryland</strong>: A token-level detection algorithm that analyzes how unexpected each token is, based on the paper "<a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>" by Kirchenbauer et al.</p>
            <p><strong>OpenAI</strong>: A similar watermarking method inspired by initial reports from OpenAI.</p>
            <p><strong>Maryland Z-score</strong>: A less accurate variant of the Maryland detector that uses z-scores for statistical interpretation.</p>
            <p><strong>OpenAI Z-score</strong>: A less accurate variant of the OpenAI detector that uses z-scores for statistical interpretation.</p>
            <h4>Parameters Explained</h4>
            <dl class="help-description-list">
              <dt>Detector Type</dt>
              <dd>The algorithm used to detect watermarks in the text. Different detectors perform better in different scenarios.</dd>
              <dt>Seed</dt>
              <dd>The random seed used for watermarking. The detector must use the same seed that was used when generating the text. In a real-world scenario, this would be kept private by the model provider.</dd>
              <dt>N-gram Size</dt>
              <dd>The number of previous tokens considered when choosing "greenlist" tokens. Larger values make the watermark less robust against edits but may improve text quality.</dd>
              <dt>Delta</dt>
              <dd>The bias added to "greenlist" tokens during generation. Higher values make the watermark stronger but might affect text quality. Typical values range from 1.0 to 5.0.</dd>
              <dt>Temperature</dt>
              <dd>Controls randomness in text generation. Higher values (e.g., 1.0) produce more diverse outputs; lower values (e.g., 0.2) make outputs more focused and deterministic.</dd>
            </dl>
            <h4>Understanding Results</h4>
            <dl class="help-description-list">
              <dt>Tokens</dt>
              <dd>The total number of tokens in the analyzed text. Tokens are units of text that may represent words, parts of words, or punctuation.</dd>
              <dt>Scored Tokens</dt>
              <dd>The number of tokens actually evaluated by the detector (the first few tokens are excluded because they lack enough context).</dd>
              <dt>Final Score</dt>
              <dd>A measure of how likely it is that the text contains a watermark. Higher scores indicate stronger evidence of watermarking.</dd>
              <dt>P-value</dt>
              <dd>The statistical significance of the detection. Lower values (especially p < 1e-6) indicate strong evidence that the text was watermarked. Values close to 0.5 suggest no watermark is present.</dd>
            </dl>
            <h4>Related Papers</h4>
            <ul class="paper-references">
              <li>
                <a href="https://arxiv.org/abs/2301.10226" target="_blank">A Watermark for Large Language Models</a>
                <span class="paper-authors">Kirchenbauer et al. (2023)</span>
              </li>
              <li>
                <a href="https://arxiv.org/abs/2308.00113" target="_blank">Three Bricks to Consolidate Watermarks for Large Language Models</a>
                <span class="paper-authors">Fernandez et al. (2023)</span>
              </li>
            </ul>
          </div>
          <div class="modal-footer">
            <button type="button" class="btn btn-primary" data-bs-dismiss="modal">Close</button>
          </div>
        </div>
      </div>
    </div>

    <!-- Advanced Parameters Modal -->
    <div class="modal fade" id="paramsModal" tabindex="-1">
      <div class="modal-dialog">
        <div class="modal-content">
          <div class="modal-header">
            <h5 class="modal-title">Advanced Parameters</h5>
            <button type="button" class="btn-close" data-bs-dismiss="modal"></button>
          </div>
          <div class="modal-body">
            <div class="mb-3">
              <div class="d-flex align-items-center">
                <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The algorithm used to detect watermarks in the text. Different algorithms have different strengths and weaknesses."></i>
                <label for="detectorType" class="form-label mb-0">Detector Type</label>
              </div>
              <select class="form-select" id="detectorType">
                <option value="maryland">Maryland</option>
                <option value="marylandz">Maryland Z-score</option>
                <option value="openai">OpenAI</option>
                <option value="openaiz">OpenAI Z-score</option>
              </select>
              <div class="form-text">Type of watermark detection algorithm</div>
            </div>
            <div class="mb-3">
              <div class="d-flex align-items-center">
                <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The random seed used for watermarking. For detection to work, this must match the seed used during generation."></i>
                <label for="seed" class="form-label mb-0">Seed</label>
              </div>
              <input type="number" class="form-control" id="seed" value="0">
              <div class="form-text">Random seed for the watermark detector</div>
            </div>
            <div class="mb-3">
              <div class="d-flex align-items-center">
                <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The number of previous tokens considered when determining the hashing. Higher values are more secure but may affect robustness. Lower values may introduce a bias in the text distribution."></i>
                <label for="ngram" class="form-label mb-0">N-gram Size</label>
              </div>
              <input type="number" class="form-control" id="ngram" value="1">
              <div class="form-text">Size of the n-gram window used for detection</div>
            </div>
            <div class="mb-3">
              <div class="d-flex align-items-center">
                <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="The bias added to greenlist tokens during generation. Higher values (2.0-5.0) make watermarks easier to detect but might affect text quality."></i>
                <label for="delta" class="form-label mb-0">Delta</label>
              </div>
              <input type="number" step="0.1" class="form-control" id="delta" value="2.0">
              <div class="form-text">Bias added to greenlist tokens (for Maryland method)</div>
            </div>
            <div class="mb-3">
              <div class="d-flex align-items-center">
                <i class="bi bi-info-circle param-help-icon me-2" data-bs-toggle="tooltip" title="Controls randomness in generation. Higher values (closer to 2.0) create more diverse outputs and make watermarks easier to detect."></i>
                <label for="temperature" class="form-label mb-0">Temperature</label>
              </div>
              <input type="number" step="0.1" class="form-control" id="temperature" value="0.8">
              <div class="form-text">Temperature for sampling (higher = more random)</div>
            </div>
          </div>
          <div class="modal-footer">
            <button type="button" class="btn btn-secondary" data-bs-dismiss="modal">Close</button>
            <button type="button" class="btn btn-primary" id="applyParams">Apply</button>
          </div>
        </div>
      </div>
    </div>

    <!-- Input Form -->
    <div class="input-section">
      <div class="prompt-container">
        <textarea id="prompt_text"
                  placeholder="Enter your prompt here to generate text with the model..."></textarea>
        <button class="floating-btn" id="generateBtn">
          <i class="bi bi-send-fill send-icon"></i>
          <i class="bi bi-stop-fill stop-icon"></i>
        </button>
      </div>
      <textarea id="user_text"
                placeholder="Generated text will appear here. Replace or edit this text to see how watermark detection works."></textarea>
    </div>

    <!-- Token Display -->
    <div class="token-display" id="tokenDisplay"></div>

    <!-- Statistics -->
    <div class="stats-container">
      <div>
        <div class="stat-value" id="tokenCount">0</div>
        <div class="stat-label">
          Tokens
          <i class="bi bi-question-circle help-icon"></i>
          <span class="help-tooltip">Total number of tokens in the text</span>
        </div>
      </div>
      <div>
        <div class="stat-value" id="scoredTokens">0</div>
        <div class="stat-label">
          Scored Tokens
          <i class="bi bi-question-circle help-icon"></i>
          <span class="help-tooltip">Number of tokens that were actually scored by the detector (excludes first n-gram tokens and duplicates)</span>
        </div>
      </div>
      <div>
        <div class="stat-value" id="finalScore">0.00</div>
        <div class="stat-label">
          Final Score
          <i class="bi bi-question-circle help-icon"></i>
          <span class="help-tooltip">Cumulative score from all scored tokens. Higher values indicate more likely watermarked text</span>
        </div>
      </div>
      <div>
        <div class="stat-value" id="pValue">0.500</div>
        <div class="stat-label">
          P-value
          <i class="bi bi-question-circle help-icon"></i>
          <span class="help-tooltip">Statistical significance of the score. Lower values indicate stronger evidence of watermarking (p < 0.05 is typically considered significant)</span>
        </div>
      </div>
    </div>
  </div>

  <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js"></script>
  <script>
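    // Overview of the client-side flow (the backend endpoints are assumed from the calls below):
    //   1. "Generate" POSTs the prompt and parameters to /generate and streams the reply back
    //      as Server-Sent Events, appending tokens to the output textarea.
    //   2. Any edit to the output textarea (or to a parameter) is debounced and POSTed to
    //      /tokenize, which is expected to return per-token scores used to colour the token
    //      display and fill the statistics boxes.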
    let debounceTimeout = null;
    let abortController = null;

    const textarea = document.getElementById('user_text');
    const promptArea = document.getElementById('prompt_text');
    const generateBtn = document.getElementById('generateBtn');
    const tokenDisplay = document.getElementById('tokenDisplay');
    const tokenCount = document.getElementById('tokenCount');
    const scoredTokens = document.getElementById('scoredTokens');
    const finalScore = document.getElementById('finalScore');
    const pValue = document.getElementById('pValue');
    const applyParamsBtn = document.getElementById('applyParams');
    const seedInput = document.getElementById('seed');
    const ngramInput = document.getElementById('ngram');
    const detectorTypeSelect = document.getElementById('detectorType');
    const deltaInput = document.getElementById('delta');
    const temperatureInput = document.getElementById('temperature');

    // Initialize tooltips
    document.addEventListener('DOMContentLoaded', function() {
      // Initialize Bootstrap tooltips
      const tooltipTriggerList = [].slice.call(document.querySelectorAll('[data-bs-toggle="tooltip"]'));
      const tooltipList = tooltipTriggerList.map(function (tooltipTriggerEl) {
        return new bootstrap.Tooltip(tooltipTriggerEl);
      });
      // Initialize our tokenization
      updateTokenization().catch(error => {
        console.error('Error during initial tokenization:', error);
      });
    });

    function startGeneration() {
      const prompt = promptArea.value.trim();
      if (!prompt) {
        alert('Please enter a prompt first.');
        return;
      }
      generateBtn.classList.add('generating');
      textarea.value = '';
      // Create new AbortController for this request
      abortController = new AbortController();
      // Get current parameters
      const params = {
        detector_type: detectorTypeSelect.value,
        seed: parseInt(seedInput.value) || 0,
        ngram: parseInt(ngramInput.value) || 1,
        delta: parseFloat(deltaInput.value) || 2.0,
        temperature: parseFloat(temperatureInput.value) || 0.8
      };
      // Create headers for SSE
      const headers = new Headers({
        'Content-Type': 'application/json',
        'Accept': 'text/event-stream',
      });
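      // The /generate endpoint is assumed to stream Server-Sent Events whose "data:" payloads
      // are JSON objects; the handlers below expect any of these fields:
      //   {"token": "..."}  - next generated token, appended to the output
      //   {"text": "..."}   - full text fallback if streaming went wrong
      //   {"error": "..."}  - error message to surface to the user
      //   {"done": true}    - generation finished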
      // Start fetch request with signal
      fetch('/generate', {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({
          prompt: prompt,
          params: params
        }),
        signal: abortController.signal // Add the abort signal
      }).then(response => {
        const reader = response.body.getReader();
        const decoder = new TextDecoder();
        let buffer = '';

        function processText(text) {
          const lines = text.split('\n');
          for (const line of lines) {
            if (line.startsWith('data: ')) {
              try {
                const data = JSON.parse(line.slice(6));
                if (data.error) {
                  alert('Error: ' + data.error);
                  stopGeneration();
                  return;
                }
                if (data.token) {
                  // Append new token to existing text
                  textarea.value += data.token;
                  updateTokenization();
                }
                if (data.text) {
                  // Final text (only used if something went wrong with streaming)
                  textarea.value = data.text;
                  updateTokenization();
                }
                if (data.done) {
                  stopGeneration();
                }
              } catch (e) {
                console.error('Error parsing SSE data:', e);
              }
            }
          }
        }
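        // Read the response body incrementally; SSE events are separated by blank lines,
        // so split the buffer on '\n\n' and keep any trailing partial event for the next chunk.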
        function pump() {
          return reader.read().then(({value, done}) => {
            if (done) {
              if (buffer.length > 0) {
                processText(buffer);
              }
              return;
            }
            buffer += decoder.decode(value, {stream: true});
            const lines = buffer.split('\n\n');
            buffer = lines.pop();
            for (const line of lines) {
              processText(line);
            }
            return pump();
          });
        }
        return pump();
      })
      .catch(error => {
        if (error.name === 'AbortError') {
          console.log('Generation stopped by user');
        } else {
          console.error('Error:', error);
          alert('Error: Failed to generate text');
        }
      })
      .finally(() => {
        generateBtn.classList.remove('generating');
        abortController = null;
      });
    }

    function stopGeneration() {
      if (abortController) {
        abortController.abort();
        abortController = null;
      }
      generateBtn.classList.remove('generating');
    }
    // The generate button toggles between starting and stopping generation
    generateBtn.addEventListener('click', function(e) {
      e.preventDefault(); // Prevent any double triggers
      if (generateBtn.classList.contains('generating')) {
        stopGeneration();
      } else {
        startGeneration();
      }
    });
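    // The /tokenize endpoint is assumed to return JSON with the fields read below:
    // tokens, scores, pvalues and colors (parallel arrays, one entry per token), plus
    // token_count, ntoks_scored, final_score and final_pvalue for the statistics boxes.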
    async function updateTokenization() {
      const text = textarea.value;
      try {
        // Validate parameters before sending
        const seed = parseInt(seedInput.value);
        const ngram = parseInt(ngramInput.value);
        const delta = parseFloat(deltaInput.value);
        const temperature = parseFloat(temperatureInput.value);

        const response = await fetch('/tokenize', {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            text: text,
            params: {
              detector_type: detectorTypeSelect.value,
              seed: isNaN(seed) ? 0 : seed,
              ngram: isNaN(ngram) ? 1 : ngram,
              delta: isNaN(delta) ? 2.0 : delta,
              temperature: isNaN(temperature) ? 0.8 : temperature
            }
          })
        });

        if (!response.ok) {
          const errorData = await response.json();
          throw new Error(errorData.error || `HTTP error! status: ${response.status}`);
        }
        const data = await response.json();
        if (data.error) {
          throw new Error(data.error);
        }
        // Update token display
        tokenDisplay.innerHTML = data.tokens.map((token, i) => {
          const score = data.scores[i];
          const pvalue = data.pvalues[i];
          const scoreDisplay = (score !== null && !isNaN(score)) ? score.toFixed(3) : 'N/A';
          const pvalueDisplay = (pvalue !== null && !isNaN(pvalue)) ? formatPValue(pvalue) : 'N/A';
          // Escape HTML special characters so raw tokens such as "<" cannot break the markup
          const safeToken = String(token)
            .replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
          return `<span class="token" style="background-color: ${data.colors[i]}">
${safeToken}
<div class="token-tooltip">
Score: ${scoreDisplay}<br>
P-value: ${pvalueDisplay}
</div>
</span>`;
        }).join('');
        // Update counts and stats - safely handle null values
        tokenCount.textContent = data.token_count || 0;
        scoredTokens.textContent = data.ntoks_scored || 0;
        finalScore.textContent = (data.final_score !== null && !isNaN(data.final_score)) ?
          data.final_score.toFixed(2) : '0.00';
        pValue.textContent = (data.final_pvalue !== null && !isNaN(data.final_pvalue)) ?
          formatPValue(data.final_pvalue) : '0.500';
        // Clear any previous error
        const existingError = tokenDisplay.querySelector('.alert-danger');
        if (existingError) {
          existingError.remove();
        }
      } catch (error) {
        console.error('Error updating tokenization:', error);
        // Show detailed error to user
        tokenDisplay.innerHTML = `<div class="alert alert-danger">
<strong>Error:</strong> ${error.message || 'Error updating results. Please try again.'}
</div>`;
        // Reset stats on error
        tokenCount.textContent = '0';
        scoredTokens.textContent = '0';
        finalScore.textContent = '0.00';
        pValue.textContent = '0.500';
      }
    }
    // Debounce re-tokenization while the user edits the text (500 ms)
    textarea.addEventListener('input', function() {
      if (debounceTimeout) {
        clearTimeout(debounceTimeout);
      }
      debounceTimeout = setTimeout(updateTokenization, 500);
    });
    // Parameter fields trigger a debounced re-tokenization; invalid input is reset to its default
    [
      { input: seedInput, parse: parseInt, fallback: "0" },
      { input: ngramInput, parse: parseInt, fallback: "1" },
      { input: deltaInput, parse: parseFloat, fallback: "2.0" },
      { input: temperatureInput, parse: parseFloat, fallback: "0.8" }
    ].forEach(({ input, parse, fallback }) => {
      input.addEventListener('input', function() {
        const value = this.value === '' ? '' : parse(this.value);
        if (isNaN(value) && this.value !== '') {
          this.value = fallback;
        }
        if (debounceTimeout) {
          clearTimeout(debounceTimeout);
        }
        debounceTimeout = setTimeout(updateTokenization, 500);
      });
    });
    // Keyboard shortcut: Ctrl/Cmd+Enter starts or stops generation when the prompt is focused, otherwise applies parameters
    document.addEventListener('keydown', function(e) {
      if ((e.metaKey || e.ctrlKey) && e.key === 'Enter') {
        e.preventDefault();
        if (document.activeElement === promptArea) {
          if (generateBtn.classList.contains('generating')) {
            stopGeneration();
          } else {
            startGeneration();
          }
        } else {
          applyParamsBtn.click();
        }
      }
    });
    detectorTypeSelect.addEventListener('change', function() {
      if (debounceTimeout) {
        clearTimeout(debounceTimeout);
      }
      debounceTimeout = setTimeout(updateTokenization, 500);
    });

    // Ensure the modal apply button properly triggers an update
    applyParamsBtn.addEventListener('click', function() {
      updateTokenization().then(() => {
        const modal = bootstrap.Modal.getInstance(document.getElementById('paramsModal'));
        if (modal) {
          modal.hide();
        }
      }).catch(error => {
        console.error('Error applying parameters:', error);
      });
    });
    // Helper for formatting p-values: fixed notation down to 0.001, scientific notation below that
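    // For example: formatPValue(0.04) -> "0.040", formatPValue(1e-7) -> "1.00e-7"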
    function formatPValue(value) {
      if (value >= 0.001) {
        return value.toFixed(3);
      } else {
        return value.toExponential(2);
      }
    }
  </script>
</body>
</html>