AI & ML interests

AGI, LLMs, Knowledge Graph, Palmyra, Domain Specific LLM

Recent Activity

* kiranr published a model 11 days ago: Writer/Palmyra-Fin-70B-32K
* kiranr published a model 11 days ago: Writer/palmyra-vision
* kiranr published a model 11 days ago: Writer/Palmyra-Creative

tperes
in Writer/palmyra-mini-thinking-b about 1 month ago
Fix pipeline_tag 🤗 (#1, opened about 1 month ago by merve)
tperes 
posted an update about 2 months ago
Introducing Palmyra-mini: Compact AI Models for Efficient Inference

The Palmyra-mini family from Writer includes three lightweight models designed for high performance and efficient inference. These models are ideal for developers looking to integrate AI capabilities without excessive computational overhead.

Model Variants

* palmyra-mini: A base model for general-purpose generative tasks, achieving 52.6% on BIG-Bench Hard (exact match).

* palmyra-mini-thinking-a: Optimized for complex logical reasoning with a Chain of Thought (CoT) approach, scoring 82.87% on GSM8K (strict match).

* palmyra-mini-thinking-b: Specialized for mathematical reasoning, achieving 92.5% on AMC23.

Technical Details

* All models are based on the Qwen architecture and are compatible with popular inference frameworks such as vLLM, SGLang, and TGI (see the serving sketch after this list).

* "Thinking" models utilize CoT training for enhanced reasoning capabilities.

* GGUF and MLX quantizations are available for optimized performance.
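
As a minimal illustration of that compatibility, here is a hedged sketch of serving palmyra-mini through vLLM's offline Python API; the prompt and sampling values are illustrative placeholders, not recommended settings.

```python
# Hedged sketch: offline inference of palmyra-mini with vLLM.
# Requires `pip install vllm`; prompt and sampling values are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="Writer/palmyra-mini")  # repo id from the list below
params = SamplingParams(temperature=0.7, max_tokens=256)

outputs = llm.generate(["Summarize the benefits of compact LLMs."], params)
print(outputs[0].outputs[0].text)
```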

For more information, including benchmark methodologies and detailed performance metrics, see our blog post: https://huggingface.co/blog/Writer/announcing-palmyra-mini

Model repos can be found here:
* Writer/palmyra-mini
* Writer/palmyra-mini-thinking-a
* Writer/palmyra-mini-thinking-b

Also check out a mobile implementation of palmyra-mini on iOS for a working example of on-device inference: https://github.com/tsperes/palmyra-mini-mobile/
wassemgtk 
posted an update 7 months ago
I’ve been diving into the iRoPE architecture from Llama 4—a game-changer for long-context models! It interleaves local attention (with RoPE) for short contexts and global attention (with inference-time temp scaling) for long-range reasoning, aiming for infinite context. I’m going to try writing iRoPE—who wants to help?

Code: https://github.com/wassemgtk/iRoPE-try/blob/main/iRoPE.ipynb
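
For the gist before opening the notebook, here is a minimal sketch of the interleaving idea, assuming PyTorch; the window size, tensor layout, and the point where temperature scaling is applied are illustrative choices, not Llama 4's actual configuration.

```python
# Hedged sketch of iRoPE-style attention: local windowed attention WITH RoPE,
# interleaved with global attention WITHOUT positional encoding, whose logits
# get an inference-time temperature scale. Tensor shapes: (seq, heads, dim).
import torch

def rope(x, base=10000.0):
    """Rotary position embedding over the last dimension (dim must be even)."""
    seq, _, dim = x.shape
    half = dim // 2
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq, dtype=torch.float32)[:, None] * freqs
    cos, sin = angles.cos()[:, None, :], angles.sin()[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

def local_attention(q, k, v, window=128):
    """Causal attention restricted to the last `window` positions, with RoPE."""
    q, k = rope(q), rope(k)
    seq, _, dim = q.shape
    i, j = torch.arange(seq)[:, None], torch.arange(seq)[None, :]
    mask = (j <= i) & (j > i - window)          # causal AND windowed
    scores = torch.einsum("qhd,khd->hqk", q, k) / dim ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.einsum("hqk,khd->qhd", scores.softmax(-1), v)

def global_attention(q, k, v, temp_scale=1.0):
    """Full causal attention with no positional encoding; `temp_scale` stands
    in for inference-time temperature scaling of the attention logits."""
    seq, _, dim = q.shape
    mask = torch.tril(torch.ones(seq, seq, dtype=torch.bool))
    scores = temp_scale * torch.einsum("qhd,khd->hqk", q, k) / dim ** 0.5
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.einsum("hqk,khd->qhd", scores.softmax(-1), v)
```

In a full decoder the layers would alternate between these two attention types, so RoPE keeps a strong short-range position signal while the position-free global layers are what allow extrapolation toward unbounded context.
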
wassemgtk 
posted an update 7 months ago
For fun, a new project: SuperTokenizer! A byte-level BPE tokenizer trained on C4, aiming to beat the GPT-4 tokenizer. A100-powered and open source. Messing around with tokens!
https://github.com/wassemgtk/SuperTokenizer
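
For reference, here is a minimal sketch of training a byte-level BPE tokenizer with the Hugging Face `tokenizers` library; the corpus file, vocabulary size, and special token are placeholders, not SuperTokenizer's actual settings.

```python
# Hedged sketch: byte-level BPE training with the `tokenizers` library.
# `c4_shard.txt` and the vocab size are placeholders, not the real config.
from tokenizers import Tokenizer, decoders, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=100_000,                                    # placeholder
    special_tokens=["<|endoftext|>"],
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),  # all 256 bytes
)
tokenizer.train(files=["c4_shard.txt"], trainer=trainer)   # hypothetical C4 shard
tokenizer.save("supertokenizer.json")
```
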
wassemgtk 
posted an update 8 months ago
# GESAL: Real-Time Adaptation for LLMs


We’re excited to unveil **Graph-Enhanced Singular Adaptive Learning (GESAL)**, a framework that lets LLMs like meta-llama/Llama-3.2-1B adapt in real time using user feedback. Check out the code and white paper on GitHub!

🔗 **Code**: [https://github.com/writer/AI-Adaptive-Learning-GESAL](https://github.com/writer/AI-Adaptive-Learning-GESAL)

---

## Why GESAL?

Static LLMs struggle to adapt without heavy retraining. GESAL solves this with:
- **SVF**: Adapts weights via \( W' = U (\Sigma \cdot z) V^T \), learning only a small number of parameters (see the sketch below this list).
- **Graph Memory**: Stores adaptations in nodes for scalability.
- **RL**: Updates via \( J(z) = \mathbb{E}[\log \pi_z(y|x) r] \) based on feedback.
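
A minimal sketch of the SVF update, assuming PyTorch; the layer shape, stand-in loss, and learning rate are illustrative, not GESAL's actual training loop.

```python
# Hedged sketch of SVF: W' = U (Sigma * z) V^T, learning only the vector z.
import torch

W = torch.randn(256, 128)                    # frozen pretrained weight matrix
U, S, Vh = torch.linalg.svd(W, full_matrices=False)
z = torch.ones_like(S, requires_grad=True)   # one learnable scale per singular value

def adapted_weight():
    # Rescale each singular direction: W' = U diag(S * z) V^T
    return U @ torch.diag(S * z) @ Vh

# Feedback would update z via the policy-gradient objective
# J(z) = E[log pi_z(y|x) * r]; below, a single illustrative gradient step
# with a stand-in loss in place of the reward-weighted log-likelihood.
x = torch.randn(32, 128)
loss = -torch.log_softmax(x @ adapted_weight().T, dim=-1)[:, 0].mean()
loss.backward()
with torch.no_grad():
    z -= 1e-2 * z.grad                       # placeholder learning rate
```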

---

## How It Works

Ask "How many R’s in ‘strawberry’?" If it says "2" and you say "no," GESAL learns to say "3" next time, avoiding repeats.

---

## Try It

Built with Hugging Face’s transformers:

    pip install transformers torch numpy
    python "Adaptive_Learning_(GESAL).py"

Needs a Hugging Face token for Llama-3.2-1B.

---

## Results

GESAL reaches 95% accuracy after 5 feedback interactions versus LoRA’s 70%, while staying efficient (~0.5M parameters) and scalable.