One of the best explanations of embedding I've ever read. Well done, @hesamation !
Had to share this: hesamation/primer-llm-embedding
Selma, fine-tuning is a great way to start, but for enterprise use cases the real leverage comes after we “crack” the problem. Once the business workflow and KPIs are clear, a domain‑specific pretrained model (say 3B–7B) becomes viable and impactful. Budget‑wise, pretraining in that range can run around ~$400k (a very, very approximate estimate), but the point is that after validating the use case and TAM, a domain‑specific pretrained model becomes a big multiplier for retrieval, reasoning, and integration across the stack, far beyond what incremental fine‑tuning alone can deliver. I think the play is to find an enterprise use case and train for it (like B2B SaaS).
I think domain-specific use cases are plentiful; in every vertical there is more nuance as we try to crack deeper-level problems. I believe it’s worth investing time in experimentation first, then identifying the TAM, and deciding whether or not to train further. What do you think?
imagen-4.0-fast-generate-001 model for Image Generation (Text-to-Image) and Multi-Image Editing (Image-to-Image), and Draw-to-Image powered by gemini-2.5-flash-image (aka Nano Banana).

…head?" or "if multiple licenses were found, do they contradict each other?", which makes further filtering a breeze.

torchao Int8WeightOnlyConfig is already working flawlessly in our tests.

import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_
pipeline = FluxPipeline.from_pretrained(...).to('cuda')
quantize_(pipeline.transformer, Int8WeightOnlyConfig()) # Or any other component(s)
@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]
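For anyone curious what weight-only int8 quantization does under the hood, here is a minimal numpy sketch of the core idea (the function names are illustrative, not torchao's API): each output channel of a weight matrix gets its own scale so its largest weight maps to 127, weights are stored as int8, and they are dequantized on the fly at matmul time while activations stay in full precision.

```python
import numpy as np

def quantize_weights_int8(w: np.ndarray):
    """Symmetric per-output-channel int8 quantization of a weight matrix.

    w has shape (out_features, in_features). Each row gets its own scale
    so that the largest-magnitude weight in the row maps to 127.
    """
    scales = np.abs(w).max(axis=1, keepdims=True) / 127.0  # shape (out, 1)
    scales = np.where(scales == 0, 1.0, scales)            # avoid div-by-zero
    w_int8 = np.clip(np.round(w / scales), -128, 127).astype(np.int8)
    return w_int8, scales

def linear_int8(x: np.ndarray, w_int8: np.ndarray, scales: np.ndarray):
    """Weight-only linear layer: dequantize weights, keep activations fp32."""
    w_deq = w_int8.astype(np.float32) * scales
    return x @ w_deq.T

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
x = rng.standard_normal((2, 8)).astype(np.float32)

w_int8, scales = quantize_weights_int8(w)
err = np.abs(x @ w.T - linear_int8(x, w_int8, scales)).max()
print(f"max abs error vs fp32 matmul: {err:.4f}")  # typically small
```

The storage win is the point: int8 weights take a quarter of the memory of fp32 (half of fp16), which is why quantizing just the transformer component of a large pipeline, as in the snippet above, is often enough to fit it on a single GPU.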