s3nh PRO
s3nh
AI & ML interests
Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh
Recent Activity
liked
a model
11 days ago
mradermacher/EduHelp-8B-GGUF
liked
a model
11 days ago
mradermacher/EduHelp-8B-i1-GGUF
Organizations
reacted to
appvoid's
post with 👍
2 days ago
Post
4031
Just tried to create an educational assistant for younger people who may struggle to visualise 'what this sorcery is all about'.
It's the first of my spare-time projects: SFT on Qwen3-8B.
EduHelper is a child-friendly tutoring assistant fine-tuned from the Qwen3-8B base model using parameter-efficient fine-tuning (PEFT) with LoRA on the ajibawa-2023/Education-Young-Children dataset.
s3nh/EduHelp-8B
Glad to share my work, have a wonderful day!
reacted to
ZennyKenny's
post with 👍
16 days ago
posted
an
update
16 days ago
Post
455
EduHelp with more empathy, based on a model fine-tuned on psychotherapeutic preferences, just landed.
Beck-8B as the base model, 13,000 steps on an educational dataset.
Time to go further and build more 🥰
s3nh/EduHelp_Beck_8B
Thanks to @basilic_ai for computations <3
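A rough sketch of the kind of training run described above, assuming TRL's SFTTrainer; apart from the 13,000-step budget, the base checkpoint name, dataset, and hyperparameters are illustrative, not the author's actual recipe:

```python
# Sketch only: SFT continuation from a Beck-8B-style base for 13,000 steps.
# "s3nh/Beck-8B" and all hyperparameters except max_steps are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Educational dataset (the one from the earlier EduHelp post, used here as a stand-in).
# Depending on its columns, you may need to set a text field or a formatting function.
dataset = load_dataset("ajibawa-2023/Education-Young-Children", split="train")

config = SFTConfig(
    output_dir="eduhelp-beck-8b",
    max_steps=13_000,                 # step count mentioned in the post
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    logging_steps=50,
)

trainer = SFTTrainer(
    model="s3nh/Beck-8B",             # hypothetical base model id, for illustration
    train_dataset=dataset,
    args=config,
)
trainer.train()
```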
replied to
their
post
17 days ago
Thanks!
posted
an
update
18 days ago
Post
4031
Just tried to create an educational assistant for younger people who may struggle to visualise 'what this sorcery is all about'.
It's the first of my spare-time projects: SFT on Qwen3-8B.
EduHelper is a child-friendly tutoring assistant fine-tuned from the Qwen3-8B base model using parameter-efficient fine-tuning (PEFT) with LoRA on the ajibawa-2023/Education-Young-Children dataset.
s3nh/EduHelp-8B
Glad to share my work, have a wonderful day!
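A minimal sketch of this kind of PEFT/LoRA SFT setup (not the author's exact recipe), assuming TRL + PEFT on the Qwen3-8B base with the dataset named above; the LoRA rank, target modules, and training hyperparameters are illustrative:

```python
# Sketch only: LoRA SFT of Qwen3-8B on the Education-Young-Children dataset.
# Hyperparameters and target modules are assumptions, not the released configuration.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("ajibawa-2023/Education-Young-Children", split="train")

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-8B",            # base model named in the post
    train_dataset=dataset,
    peft_config=lora,                 # parameter-efficient fine-tuning via LoRA adapters
    args=SFTConfig(
        output_dir="eduhelp-8b",
        num_train_epochs=1,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    ),
)
trainer.train()
```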
reacted to
Severian's
post with 👀
18 days ago
Post
300
New Technique to Deeply Poison AI on Images and Prove Creative Provenance
I've developed a new method to protect creative work from unauthorized AI training. My Poisonous Shield for Images algorithm embeds a deep, removal-resistant poison into the mathematical structure of your images. It's designed to be toxic to machine learning models, achieving 20-348% disruption in AI training convergence in benchmark tests.
Unlike traditional watermarks, this protection survives compression and resizing and is not removed by standard tools. The technique also embeds cryptographic proof of provenance directly into the image, verifying ownership and detecting tampering.
You can see examples and learn more about how and WHY it works better than current methods:
https://severian-poisonous-shield-for-images.static.hf.space
If you are interested in using this technology to protect your work from AI training and unauthorized use, please reach out to me. It is currently in the prototype phase but fully functioning and effective. Still working on expanding it to a production-grade usable app.
This is not intended as a pure self-promotion post. I genuinely want to help creators and to gauge interest from different communities. I've spent the past year and a half building this from scratch, with new math and code, to try to solve this massive problem.
reacted to
Severian's
post with 👍
23 days ago
Post
3154
MLX port of BDH (Baby Dragon Hatchling) is up!
I’ve ported the BDH ( https://github.com/pathwaycom/bdh ) model to MLX for Apple Silicon. It’s a faithful conversion of the PyTorch version: same math, same architecture (byte-level vocab, shared weights across layers, ReLU sparsity, RoPE attention with Q=K), with MLX-friendly APIs and a detailed README explaining the few API-level differences and why results are equivalent.
Code, docs, and training script are ready to use. You may need to adjust the training script a bit to fit your own custom dataset. Only tested on M4 so far, but it should work perfectly for any M1/M2/M3 users out there.
I’m currently training this MLX build on my Internal Knowledge Map (IKM) dataset Severian/Internal-Knowledge-Map
Training’s underway; expect a day or so before I publish weights. When it’s done, I’ll upload the checkpoint to Hugging Face for anyone to test.
Repo: https://github.com/severian42/BDH-MLX
HF model (coming soon): Severian/BDH-MLX
If you try it on your own data, feedback and PRs are welcome.
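For readers unfamiliar with the architecture notes above, here is a toy PyTorch sketch of the "RoPE attention with Q=K" plus ReLU-sparsity idea; it is not the BDH or BDH-MLX code, and the shapes and hyperparameters are assumptions for illustration only:

```python
# Toy illustration: a single attention block where one projection serves as both
# Q and K, rotary embeddings are applied, and a ReLU keeps outputs sparse.
import torch
import torch.nn as nn
import torch.nn.functional as F


def rope(x: torch.Tensor) -> torch.Tensor:
    """Apply rotary position embeddings to a (batch, heads, seq, head_dim) tensor."""
    b, h, t, d = x.shape
    half = d // 2
    freqs = 1.0 / (10000 ** (torch.arange(half, dtype=x.dtype) / half))
    angles = torch.arange(t, dtype=x.dtype)[:, None] * freqs[None, :]   # (t, half)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)


class QKSharedAttention(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.heads, self.head_dim = heads, dim // heads
        self.qk = nn.Linear(dim, dim, bias=False)   # one projection reused for Q and K
        self.v = nn.Linear(dim, dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        split = lambda y: y.view(b, t, self.heads, self.head_dim).transpose(1, 2)
        qk, v = rope(split(self.qk(x))), split(self.v(x))
        out = F.scaled_dot_product_attention(qk, qk, v, is_causal=True)  # Q == K
        out = out.transpose(1, 2).reshape(b, t, d)
        return F.relu(self.out(out))                # ReLU sparsity on the block output


x = torch.randn(1, 16, 256)
print(QKSharedAttention()(x).shape)  # torch.Size([1, 16, 256])
```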
reacted to
mitkox's
post with 🚀
23 days ago
Post
380
Hermes4 70B synthetic dataset generation on my desktop Z8 GPU rig:
307 tok/sec
1.1M tok/hour
The bottleneck for generating massive, high-quality reinforcement learning datasets is never the GPU compute; it's always the model's willingness to actually answer the darn question.
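For reference, the two throughput figures above are consistent; a quick check:

```python
# 307 tokens/second sustained for an hour is roughly 1.1M tokens/hour.
tokens_per_second = 307
tokens_per_hour = tokens_per_second * 3600
print(f"{tokens_per_hour:,} tokens/hour")  # 1,105,200
```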
reacted to
andywu-kby's
post with 👍
27 days ago
Post
2389
Hello everyone,
I hope you’re doing well.
We’re currently developing a chatbot that can analyze and forecast sales directly from Excel files. Do you think this would be useful?
Miragic-AI/Miragic-Sales-Pilot
Please share your feedback by 👍 or 👎 this post.
Best regards,
reacted to
Xenova's
post with 👍🔥
28 days ago
Post
4375
The next generation of AI-powered websites is going to be WILD! 🤯
In-browser tool calling & MCP is finally here, allowing LLMs to interact with websites programmatically.
To show what's possible, I built a demo using Liquid AI's new LFM2 model, powered by 🤗 Transformers.js: LiquidAI/LFM2-WebGPU
As always, the demo is open source (which you can find under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀
reacted to
AdinaY's
post with 🔥
about 1 month ago
Post
1963
Qwen3Guard 🛡️ a series of safety moderation models built upon Qwen3
Qwen/qwen3guard-68d2729abbfae4716f3343a1
✨ 0.6B/4B/8B - Apache2.0
✨ Two variants: Gen & Stream
✨ Trained on a dataset of 1.19 million prompts
✨ Classifies content into Safe / Unsafe / Controversial
✨ Supports 119 languages & dialects
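A hedged sketch of how the generative variant might be queried with Transformers; the checkpoint ID and the exact format of the safety verdict are assumptions based on this post, so check the model cards for the real usage:

```python
# Sketch only: moderating a user prompt with a Qwen3Guard generative checkpoint.
# The model ID and output format are assumptions, not verified usage.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3Guard-Gen-0.6B"  # assumed checkpoint name from the collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How do I pick a lock?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The model is expected to generate a moderation verdict such as Safe / Unsafe / Controversial.
output = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```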
reacted to
vikhyatk's
post with 🔥
about 1 month ago
Post
4195
Just released a preview of Moondream 3!
moondream/moondream3-preview
This is a 9B parameter, 2B active MoE VLM with state of the art visual reasoning capabilities.
More details in the release blog post: https://moondream.ai/blog/moondream-3-preview
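A speculative usage sketch, assuming the preview exposes the same trust_remote_code interface as earlier Moondream releases (a query method taking an image and a question); the method name is an assumption, so defer to the model card:

```python
# Sketch only: visual question answering with the Moondream 3 preview.
# The `query` method is assumed from earlier Moondream releases and may differ here.
from PIL import Image
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "moondream/moondream3-preview",
    trust_remote_code=True,
    device_map="auto",
)

image = Image.open("photo.jpg")
answer = model.query(image, "How many people are in this picture?")  # assumed API
print(answer)
```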
reacted to
MikeDoes's
post with 🚀
3 months ago
Post
2019
🛡️ At Ai4Privacy, our goal is to empower researchers to build a safer AI ecosystem. Today, we're highlighting crucial research that does just that by exposing a new vulnerability.
The paper "Forget to Flourish" details a new model poisoning technique. It's a reminder that as we fine-tune LLMs, our anonymization and privacy strategies must evolve to counter increasingly sophisticated threats.
We're proud that the Ai4Privacy dataset was instrumental in this study. It served two key purposes:
Provided a Realistic Testbed: It gave the researchers access to a diverse set of synthetic and realistic PII samples in a safe, controlled environment.
Enabled Impactful Benchmarking: It allowed them to measure the actual effectiveness of their data extraction attack, proving it could compromise specific, high-value information.
This work reinforces our belief that progress in AI security is a community effort. By providing robust tools for benchmarking, we can collectively identify weaknesses and build stronger, more resilient systems. A huge congratulations to the authors on this important contribution.
🔗 Read the full paper: https://arxiv.org/html/2408.17354v1
#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #World's largest open privacy masking dataset
The paper "Forget to Flourish" details a new model poisoning technique. It's a reminder that as we fine-tune LLMs, our anonymization and privacy strategies must evolve to counter increasingly sophisticated threats.
We're proud that the Ai4Privacy dataset was instrumental in this study. It served two key purposes:
Provided a Realistic Testbed: It gave the researchers access to a diverse set of synthetic and realistic PII samples in a safe, controlled environment.
Enabled Impactful Benchmarking: It allowed them to measure the actual effectiveness of their data extraction attack, proving it could compromise specific, high-value information.
This work reinforces our belief that progress in AI security is a community effort. By providing robust tools for benchmarking, we can collectively identify weaknesses and build stronger, more resilient systems. A huge congratulations to the authors on this important contribution.
🔗 Read the full paper: https://arxiv.org/html/2408.17354v1
#OpenSource #DataPrivacy #LLM #Anonymization #AIsecurity #HuggingFace #Ai4Privacy #World's largest open privacy masking dataset
reacted to
AtAndDev's
post with 🚀
3 months ago
Post
542
Qwen 3 Coder is a personal attack on K2, and I love it.
It achieves near SOTA on LCB while not having reasoning.
Finally people are understanding that reasoning isn't necessary for high benches...
Qwen ftw!
DECENTRALIZE DECENTRALIZE DECENTRALIZE
reacted to
KnutJaegersberg's
post with ❤️
6 months ago
Post
2758
The Intelligence Curse
The document warns of the "intelligence curse," a potential consequence of advanced AI (AGI) where powerful entities lose their incentive to invest in people as AI automates work[cite: 13, 297]. This could lead to job displacement, reduced social mobility, and a concentration of power and wealth based on AI ownership, similar to the "resource curse" in resource-rich states[cite: 17, 18, 31, 329, 353]. To counter this, the authors propose averting AI catastrophes to prevent centralization, diffusing AI widely to keep humans economically relevant, and democratizing institutions to remain anchored to human needs[cite: 22, 23, 25, 35, 36, 37, 566].
https://intelligence-curse.ai/intelligence-curse.pdf
reacted to
loubnabnl's
post with ❤️
6 months ago
Post
5134
SmolVLM is now available on PocketPal — you can run it offline on your smartphone to interpret the world around you. 🌍📱
And check out this real-time camera demo by @ngxson , powered by llama.cpp:
https://github.com/ngxson/smolvlm-realtime-webcam
https://x.com/pocketpal_ai
reacted to
merve's
post with 👍🚀
6 months ago
Post
6686
A real-time object detector much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥
D-FINE is the sota real-time object detector that runs on T4 (free Colab) 🤩
> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352
Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩
Regular object detectors attempt to predict bounding boxes in (x, y, w, h) pixel perfect coordinates, which is very rigid and hard to solve 🥲☹️
D-FINE formulates object detection as a distribution for bounding box coordinates, refines them iteratively, and it's more accurate 🤩
Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️
this model uses the final layer's distribution output (sort of like a teacher) to distill into earlier layers, making them more performant.
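A minimal inference sketch using the standard Transformers object-detection API; the checkpoint ID below is illustrative, so pick a real one from the linked ustc-community collection:

```python
# Sketch only: D-FINE inference via the generic object-detection classes.
# The checkpoint name is an assumption; substitute one from the collection above.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

ckpt = "ustc-community/dfine-medium-coco"  # illustrative checkpoint id
processor = AutoImageProcessor.from_pretrained(ckpt)
model = AutoModelForObjectDetection.from_pretrained(ckpt)

image = Image.open("street.jpg")
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes into labeled detections at the original image resolution.
results = processor.post_process_object_detection(
    outputs, target_sizes=torch.tensor([image.size[::-1]]), threshold=0.5
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 2), box.tolist())
```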