title: NER Explorer Tool
emoji: π
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 5.36.2
app_file: app.py
pinned: false
license: mit
Named Entity Recognition (NER) Explorer Tool
Background
This is a web-based interactive tool for exploring Named Entity Recognition (NER) in practice. It was developed as an outcome of the Extracting Keywords from Crowdsourced Collections project, funded by Digital Scholarship at Oxford (DiSc).
Overview
The NER Explorer Tool is an educational and exploratory interface that lets users 'play' with different NER models and approaches. It was created to make this Natural Language Processing (NLP) technique more accessible to Digital Humanities (DH) and Galleries, Libraries, Archives and Museums (GLAM) professionals, volunteers and researchers, who might otherwise lack the means or opportunity to explore what they can do with NER. Simply paste in some text you would like to test the models on, or click one of the examples provided if you do not have, or do not wish to use, your own text.
Why this tool?
During our short exploratory research project on keyword extraction from crowdsourced collections, we found that NER has real potential for enhancing search and discovery in digital archives while allowing records to 'speak for themselves'.
It can be difficult to know where to start when selecting NER models, because they work differently and are designed to find different things. The models provided here are those that performed best among the ones we tested on a small sample. No model is perfect, and we have tried to be clear about that throughout.
We also wanted to raise awareness of zero-shot NER models (e.g. GLiNER), which can be more flexible than models with pre-defined entity types (e.g. spaCy), and to show how the two kinds can be used together.
Models included in the Explorer tool:
- spacy_en_core_web_trf - spaCy's transformer-based model
- flair_ner-large - Flair's large English NER model
- flair_ner-ontonotes-large - Flair's OntoNotes-based model
- gliner_knowledgator/modern-gliner-bi-large-v1.0 - Modern zero-shot GLiNER model
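As a rough illustration of the difference between a model with pre-defined entity types and a zero-shot model, the sketch below wraps one of each behind a function that returns plain dictionaries. It is not the tool's own code. It assumes that the `spacy` and `gliner` packages are installed, that the names listed above correspond to the spaCy package `en_core_web_trf` and the Hugging Face model `knowledgator/modern-gliner-bi-large-v1.0`, and that the default threshold of 0.5 is a reasonable starting point. The function names and the output dictionary layout are hypothetical.

```python
def spacy_entities(text, model_name="en_core_web_trf"):
    """Run a spaCy pipeline; the entity types are fixed by the trained model."""
    # Imported lazily so this sketch can be loaded without the dependency.
    import spacy

    nlp = spacy.load(model_name)  # assumes the model package is installed
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,
            "end": ent.end_char,
            "source": "spacy",
        }
        for ent in nlp(text).ents
    ]


def gliner_entities(
    text,
    labels,
    threshold=0.5,
    model_name="knowledgator/modern-gliner-bi-large-v1.0",
):
    """Run a zero-shot GLiNER model; the caller chooses the entity labels."""
    from gliner import GLiNER

    model = GLiNER.from_pretrained(model_name)  # downloads weights on first use
    found = model.predict_entities(text, labels, threshold=threshold)
    return [
        {
            "text": ent["text"],
            "label": ent["label"],
            "start": ent["start"],
            "end": ent["end"],
            "score": ent["score"],
            "source": "gliner",
        }
        for ent in found
    ]
```

With both packages available, `spacy_entities(text)` returns whatever types the spaCy model was trained on, while `gliner_entities(text, labels=["ship", "military rank"])` searches for labels chosen at call time. Both return the same record shape, so the results can be listed side by side.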
Key features:
- Highlighted Text: See entities highlighted directly in your text with color-coded labels
- Split-Color Highlighting: Entities identified by both common NER models AND custom GLiNER searches are shown with distinctive split-color highlighting (marked with an emoji)
- Detailed Tables: Examine all identified entities with confidence scores and source attribution
- Adjustable Confidence Threshold: Control how confident a model must be before an entity is reported (from 0.1 to 0.9)
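The confidence threshold and the detection of entities found by both kinds of model can be sketched in a few lines of plain Python. This is a minimal illustration, not the tool's implementation. The `Entity` record, the helper names, the sample sentence and the scores are all invented for the example, and it assumes that "identified by both" means an exact match on character offsets.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    text: str
    label: str
    start: int    # character offset where the span begins
    end: int      # character offset just past the span
    score: float  # model confidence between 0 and 1
    source: str   # which model produced the prediction


def make_entity(doc, phrase, label, score, source):
    """Build an Entity by locating the first occurrence of `phrase` in `doc`."""
    start = doc.index(phrase)
    return Entity(phrase, label, start, start + len(phrase), score, source)


def filter_by_threshold(entities, threshold=0.5):
    """Keep only predictions whose score is at or above the threshold."""
    return [ent for ent in entities if ent.score >= threshold]


def find_shared_spans(standard, custom):
    """Pair up entities from both lists that cover exactly the same span."""
    by_span = {}
    for ent in standard:
        by_span.setdefault((ent.start, ent.end), []).append(ent)
    return [
        (std, cus)
        for cus in custom
        for std in by_span.get((cus.start, cus.end), [])
    ]


# Invented sample text and scores, for illustration only.
doc = "Private Tom Harris wrote home from Ypres in 1917."
standard = [
    make_entity(doc, "Tom Harris", "PERSON", 0.98, "flair_ner-ontonotes-large"),
    make_entity(doc, "Ypres", "GPE", 0.42, "flair_ner-ontonotes-large"),
    make_entity(doc, "1917", "DATE", 0.91, "flair_ner-ontonotes-large"),
]
custom = [
    make_entity(doc, "Tom Harris", "soldier", 0.77, "gliner"),
    make_entity(doc, "Private", "military rank", 0.64, "gliner"),
]

kept_standard = filter_by_threshold(standard, 0.5)
kept_custom = filter_by_threshold(custom, 0.5)
shared = find_shared_spans(kept_standard, kept_custom)

for std, cus in shared:
    print(f"{std.text!r}: {std.label} ({std.source}) + {cus.label} ({cus.source})")
```

Raising the threshold drops low-scoring predictions such as "Ypres" here, while lowering it towards 0.1 keeps more candidates at the cost of more false positives. Only "Tom Harris" is returned by `find_shared_spans`, since it is the one span labeled by both sources.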
Important
Please note that this tool is designed for exploration and educational purposes. It is not designed or recommended for production use with very long text (e.g. more than 5,000 characters), large collections or sensitive materials. If you work with these NER models in other environments for such cases, additional testing, validation, and ethical review are strongly recommended.
If you have any questions about this tool, please email: catherine.conisbee@bodleian.ox.ac.uk. See also: main project repository
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference