# Sentence Transformers 🤗
Sentence Transformers 🤗 is a Python framework for state-of-the-art sentence, text, and image embeddings. It can be used to compute embeddings with Sentence Transformer models or to calculate similarity scores with Cross-Encoder (a.k.a. reranker) models. This unlocks a wide range of applications, including semantic search, semantic textual similarity, and paraphrase mining. Optimum Neuron offers APIs to ease the use of Sentence Transformers on AWS Neuron devices.
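As a point of reference, this is what the framework computes with the vanilla `sentence_transformers` package, before any Neuron compilation — a minimal sketch, with `all-MiniLM-L6-v2` chosen purely for illustration (`similarity` requires sentence-transformers v3+):

```python
from sentence_transformers import SentenceTransformer

# Vanilla (non-Neuron) usage: one dense vector per input sentence
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode(["semantic search", "paraphrase mining"])
print(embeddings.shape)  # (2, 384) for this model

# Pairwise similarity scores between the embeddings (cosine by default)
print(model.similarity(embeddings, embeddings))
```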
## Export to Neuron
### Option 1: CLI
- Example - Text embeddings
```bash
optimum-cli export neuron -m BAAI/bge-large-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb_neuron/
```

- Example - Image Search
```bash
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --text_batch_size 3 --image_batch_size 1 --num_channels 3 --height 224 --width 224 --task feature-extraction --subfolder 0_CLIPModel clip_emb_neuron/
```
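The output directory contains the compiled artifacts along with the model configuration, so it can be loaded back without recompiling. A minimal sketch, assuming the `bge_emb_neuron/` directory produced by the first command above:

```python
from optimum.neuron import NeuronSentenceTransformers

# Load the precompiled artifacts exported by the CLI (no export=True needed)
model = NeuronSentenceTransformers.from_pretrained("bge_emb_neuron/")
embeddings = model.encode(["An example sentence"], normalize_embeddings=True)
print(embeddings.shape)
```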
### Option 2: Python API

- Example - Text embeddings
```python
from optimum.neuron import NeuronSentenceTransformers

# configs for compiling model
input_shapes = {
    "batch_size": 1,
    "sequence_length": 512,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronSentenceTransformers.from_pretrained(
    "BAAI/bge-large-en-v1.5",
    export=True,
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("bge_emb_neuron/")

# Upload to the Hugging Face Hub
neuron_model.push_to_hub(
    "bge_emb_neuron/", repository_id="optimum/bge-base-en-v1.5-neuronx"  # Replace with your HF Hub repo id
)

sentences_1 = ["Life is pain au chocolat", "Life is galette des rois"]
sentences_2 = ["Life is eclaire au cafe", "Life is mille feuille"]
embeddings_1 = neuron_model.encode(sentences_1, normalize_embeddings=True)
embeddings_2 = neuron_model.encode(sentences_2, normalize_embeddings=True)
similarity = neuron_model.similarity(embeddings_1, embeddings_2)
```
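Assuming the wrapper mirrors the `SentenceTransformer` API (as the calls above suggest), `encode` returns one embedding per input sentence and `similarity` returns a 2×2 matrix of scores pairing every sentence in `sentences_1` with every sentence in `sentences_2`; since the embeddings are normalized, these are cosine similarities.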
- Example - Image Search

```python
from optimum.neuron import NeuronSentenceTransformers

# configs for compiling model
input_shapes = {
    "num_channels": 3,
    "height": 224,
    "width": 224,
    "text_batch_size": 3,
    "image_batch_size": 1,
    "sequence_length": 64,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronSentenceTransformers.from_pretrained(
    "sentence-transformers/clip-ViT-B-32",
    subfolder="0_CLIPModel",
    export=True,
    dynamic_batch_size=False,
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("clip_emb_neuron/")

# Upload to the Hugging Face Hub
neuron_model.push_to_hub(
    "clip_emb_neuron/", repository_id="optimum/clip_vit_emb_neuronx"  # Replace with your HF Hub repo id
)
```
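Once exported, the compiled CLIP model embeds images and texts into the same vector space, which is what enables image search. A minimal sketch, assuming `encode` accepts PIL images as in vanilla SentenceTransformer CLIP usage and using an illustrative local image file; the input sizes match the shapes compiled above (1 image, 3 texts):

```python
from PIL import Image
from sentence_transformers import util

from optimum.neuron import NeuronSentenceTransformers

model = NeuronSentenceTransformers.from_pretrained("clip_emb_neuron/")

# One image, matching image_batch_size=1 used at compilation
image_emb = model.encode(Image.open("two_dogs_in_snow.jpg"))  # illustrative file name

# Three candidate captions, matching text_batch_size=3
text_emb = model.encode([
    "Two dogs in the snow",
    "A cat on a table",
    "A picture of London at night",
])

# Cosine similarity between the image embedding and each caption embedding
print(util.cos_sim(image_emb, text_emb))
```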