In a Training Loop 🔄

Solomatin Roman

Samoed

AI & ML interests

None yet

Recent Activity

updated a dataset about 15 hours ago

mteb/SpeechCommandsZeroshotv0.01

published a dataset about 15 hours ago

mteb/SpeechCommandsZeroshotv0.01

updated a dataset 3 days ago

mteb/JaCWIRRetrievalLite

upvoted a collection 24 days ago

SauerkrautLM-Vision-Document-Retrieval

7 items • Updated 24 days ago • 8

upvoted a paper 27 days ago

T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

Paper • 2512.10430 • Published 28 days ago • 113

upvoted a collection 29 days ago

NanoBEIR datasets

These datasets are compatible with the (Sparse)NanoBEIREvaluator with Sentence Transformers v5.2+. Also CrossEncoderNanoBEIREvaluator if bm25 column • 16 items • Updated 26 days ago • 12

upvoted an article about 1 month ago

Article

Building and evaluating Multimodal Rerankers

Nov 30, 2025 • 7

upvoted 2 articles 2 months ago

Article

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

Nov 5, 2025 • 58

Article

Improving Parquet Dedupe on Hugging Face Hub

Oct 5, 2024 • 40

upvoted 3 articles 3 months ago

Article

LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR

Oct 23, 2025 • 62

Article

Sentence Transformers is joining Hugging Face!

Oct 22, 2025 • 86

Article

Introducing MTEB v2: Evaluation of embedding and retrieval systems for more than just text

Oct 20, 2025 • 34

upvoted 2 papers 3 months ago

Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13, 2025 • 100

HUME: Measuring the Human-Model Performance Gap in Text Embedding Task

Paper • 2510.10062 • Published Oct 11, 2025 • 9

upvoted an article 3 months ago

Article

Vocabulary is the most important element of Sparse Retrieval

Oct 4, 2025 • 9

upvoted a paper 3 months ago

ModernVBERT: Towards Smaller Visual Document Retrievers

Paper • 2510.01149 • Published Oct 1, 2025 • 30

upvoted an article 3 months ago

Article

ModernVBERT: Towards Smaller Visual Document Retrievers

Oct 3, 2025 • 46

upvoted a collection 3 months ago

ModernVBERT

Resources for ModernVBERT • 5 items • Updated Oct 3, 2025 • 11

upvoted an article 3 months ago

Article

Introducing RTEB: A New Standard for Retrieval Evaluation

Oct 1, 2025 • 133

upvoted a changelog 3 months ago

Changelog

Repositories total file size is now displayed

Sep 18, 2025 • 175

upvoted a paper 3 months ago

AutoIntent: AutoML for Text Classification

Paper • 2509.21138 • Published Sep 25, 2025 • 36

upvoted an article 4 months ago

Article

mmBERT: ModernBERT goes Multilingual

Sep 9, 2025 • 133

upvoted a collection 4 months ago

mmBERT: a modern multilingual encoder

mmBERT is trained on 3T tokens from over 1800 languages, showing SoTA scores on benchmarks and exceptional low-resource performance • 16 items • Updated Sep 9, 2025 • 49