AWS Trainium & Inferentia documentation

Sentence Transformers 🤗


SentenceTransformers 🤗 is a Python framework for state-of-the-art sentence, text, and image embeddings. It can be used to compute embeddings with Sentence Transformer models or to calculate similarity scores with Cross-Encoder (a.k.a. reranker) models. This unlocks a wide range of applications, including semantic search, semantic textual similarity, and paraphrase mining. Optimum Neuron offers APIs to ease the use of SentenceTransformers on AWS Neuron devices.

Export to Neuron

Option 1: CLI

  • Example - Text embeddings
optimum-cli export neuron -m BAAI/bge-large-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb_neuron/
  • Example - Image Search
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --text_batch_size 3 --image_batch_size 1 --num_channels 3 --height 224 --width 224 --task feature-extraction --subfolder 0_CLIPModel clip_emb_neuron/

Option 2: Python API

  • Example - Text embeddings
from optimum.neuron import NeuronSentenceTransformers

# configs for compiling model
input_shapes = {
    "batch_size": 1,
    "sequence_length": 512,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronSentenceTransformers.from_pretrained(
    "BAAI/bge-large-en-v1.5",
    export=True,
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("bge_emb_neuron/")

# Upload to the HuggingFace Hub
neuron_model.push_to_hub(
    "bge_emb_neuron/", repository_id="optimum/bge-base-en-v1.5-neuronx"  # Replace with your HF Hub repo id
)

sentences_1 = ["Life is pain au chocolat", "Life is galette des rois"]
sentences_2 = ["Life is eclaire au cafe", "Life is mille feuille"]
embeddings_1 = neuron_model.encode(sentences_1, normalize_embeddings=True)
embeddings_2 = neuron_model.encode(sentences_2, normalize_embeddings=True)
similarity = neuron_model.similarity(embeddings_1, embeddings_2)
  • Example - Image Search
from optimum.neuron import NeuronSentenceTransformers

# configs for compiling model
input_shapes = {
    "num_channels": 3,
    "height": 224,
    "width": 224,
    "text_batch_size": 3,
    "image_batch_size": 1,
    "sequence_length": 64,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronSentenceTransformers.from_pretrained(
    "sentence-transformers/clip-ViT-B-32",
    subfolder="0_CLIPModel",
    export=True,
    dynamic_batch_size=False,
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("clip_emb_neuron/")

# Upload to the HuggingFace Hub
neuron_model.push_to_hub(
    "clip_emb_neuron/", repository_id="optimum/clip_vit_emb_neuronx"  # Replace with your HF Hub repo id
)
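For image search, the text and image embeddings produced by the compiled CLIP model live in a shared space, so retrieving the best image for a query is an argmax over dot products. A hypothetical sketch of that ranking step with NumPy, using mock unit-normalized embeddings in place of the `neuron_model.encode(...)` output:

```python
import numpy as np

# Mock unit-normalized embeddings standing in for the CLIP model's output:
# three text queries and two candidate images in a shared embedding space
text_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.6, 0.8],
])
image_embeddings = np.array([
    [0.8, 0.6],
    [0.0, 1.0],
])

# Score every (text, image) pair; rows index queries, columns index images
scores = text_embeddings @ image_embeddings.T

# Best-matching image per text query
best = scores.argmax(axis=1)
print(best)  # [0 1 0]
```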