I'm excited to announce that I've just released the newest versions of my Kuvera models and the expanded Personal Finance Reasoning dataset on Hugging Face!
What's new: I've expanded the Personal Finance Reasoning dataset, which now includes 18.9k samples of real-world financial questions paired with detailed, empathetic answers. I've also streamlined the previous generation pipeline with better psychological context and response validation.
I've also released new Kuvera models trained on this improved dataset:
- Kuvera-4B & 8B: my upgraded non-reasoning models, fine-tuned to provide practical financial advice. I've specifically trained the 8B model to better understand the user's emotional context.
- Kuvera-12B: a first experimental reasoning model focused on query resolution.
As the sole person working on this project, I see this release as a noticeable step forward from my previous work, offering more powerful and nuanced tools for financial AI.
I am actively looking to collaborate with others who are passionate about analyzing and improving the quality of personal finance advice generated by large language models. If this sounds like you, please reach out!
P.S. The paper on the framework used to train these models, along with a detailed evaluation of the main 8B model's responses, will be released soon!
The world's first Intermediate Thinking Model is now available to everyone!
Dhanishtha 2.0 Preview brings revolutionary intermediate thinking capabilities to the open-source community. Unlike traditional reasoning models that think once, Dhanishtha can think, answer, rethink, answer again, and continue rethinking as needed, using multiple thinking blocks between responses.
🚀 Key Features
- Intermediate thinking: Think → Answer → Rethink → Answer → Rethink if needed...
- Token efficient: uses up to 79% fewer tokens than DeepSeek R1 on similar queries
- Transparent thinking: see the model's reasoning process in real time
- Open source: freely available for research and development
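If you want to work with the interleaved output programmatically, here is a minimal sketch for splitting a response into thinking and answer segments. It assumes the reasoning is wrapped in `<think>...</think>` tags; that tag format is an assumption about the output, not something confirmed above.

```python
import re

def split_intermediate_thinking(response: str):
    """Split a response into an ordered list of ("think" | "answer", text) segments."""
    segments = []
    pos = 0
    # Assumed format: reasoning wrapped in <think>...</think>, answers in between.
    for match in re.finditer(r"<think>(.*?)</think>", response, re.DOTALL):
        answer = response[pos:match.start()].strip()
        if answer:
            segments.append(("answer", answer))
        segments.append(("think", match.group(1).strip()))
        pos = match.end()
    tail = response[pos:].strip()
    if tail:
        segments.append(("answer", tail))
    return segments

demo = "<think>Estimate first.</think>Roughly 42.<think>Check units.</think>Confirmed: 42."
for kind, text in split_intermediate_thinking(demo):
    print(kind, "->", text)
```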
If you find this cool, please like the spaces and videos ❤️. Now that the deadline has been extended by 2 days, I will polish the custom component a little more and update the submission.
Let's refresh some fundamentals today to stay fluent in what we all work with. Here are some of the most popular model types that shape the vast world of AI (with examples in brackets):
1. LLM - Large Language Model (GPT, LLaMA) -> Large Language Models: A Survey (2402.06196) + history of LLMs: https://www.turingpost.com/t/The%20History%20of%20LLMs
Trained on massive text datasets to understand and generate human language, LLMs are mostly built on the Transformer architecture and work by predicting the next token (see the sketch after item 2). They scale by increasing the overall parameter count across all components (layers, attention heads, MLPs, etc.)

2. SLM - Small Language Model (TinyLLaMA, Phi models, SmolLM) -> A Survey of Small Language Models (2410.20011)
A lightweight LM optimized for efficiency, low memory use, fast inference, and edge use. SLMs work on the same principles as LLMs.
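To make "predicting the next token" concrete, here is a minimal runnable sketch with Hugging Face transformers; the small gpt2 checkpoint is chosen purely for illustration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Large language models predict the next", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # (batch, seq_len, vocab_size)

# The last position's logits score every vocabulary token as the next token.
next_token_id = logits[0, -1].argmax()   # greedy pick
print(tokenizer.decode(next_token_id))
```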
3. VLM - Vision-Language Model (CLIP, Flamingo) -> An Introduction to Vision-Language Modeling (2405.17247)
Processes and understands both images and text. VLMs map images and text into a shared embedding space or generate captions/descriptions from both.
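As a concrete example of the shared embedding space, here is a short sketch using the openai/clip-vit-base-patch32 checkpoint; the image path is a placeholder you would replace with your own file:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")  # placeholder path
texts = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image and texts live in the same embedding space, so similarities rank the captions.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))
```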
4. MLLM - Multimodal Large Language Model (Gemini) -> A Survey on Multimodal Large Language Models (2306.13549)
A large-scale model that can understand and process multiple types of data (modalities): usually text + other formats, like images, videos, audio, structured data, 3D or spatial inputs. MLLMs can be LLMs extended with modality adapters or trained jointly across vision, text, audio, etc.
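A toy illustration of the modality-adapter idea: a learned projection maps vision-encoder features into the LLM's token embedding space, so image patches can be consumed like text tokens. All dimensions here are made up for the sketch:

```python
import torch
import torch.nn as nn

class VisionAdapter(nn.Module):
    """Projects vision features into the LLM embedding space (toy dimensions)."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim)
        return self.proj(patch_features)  # (batch, num_patches, llm_dim)

adapter = VisionAdapter()
fake_patches = torch.randn(1, 256, 1024)  # stand-in for a vision encoder's output
soft_tokens = adapter(fake_patches)       # prepend these to the text embeddings
print(soft_tokens.shape)                  # torch.Size([1, 256, 4096])
```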
5. LAM - Large Action Model (InstructDiffusion, RT-2) -> Large Action Models: From Inception to Implementation (2412.10047)
Understands and generates action sequences by predicting action tokens (discrete/continuous instructions) that guide agents. Trained on behavior datasets, LAMs generalize across tasks, environments, and modalities: video, sensor data, etc.
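And a hedged sketch of what "action tokens" can look like: continuous actions binned into a discrete vocabulary so a language model can predict them like text. The 256-bin scheme is an assumption, loosely inspired by RT-2-style discretization:

```python
import numpy as np

NUM_BINS = 256  # assumed vocabulary size per action dimension

def action_to_tokens(action: np.ndarray, low: float = -1.0, high: float = 1.0) -> list[int]:
    """Map each continuous action dimension to an integer token id."""
    clipped = np.clip(action, low, high)
    bins = ((clipped - low) / (high - low) * (NUM_BINS - 1)).round().astype(int)
    return bins.tolist()

def tokens_to_action(tokens: list[int], low: float = -1.0, high: float = 1.0) -> np.ndarray:
    """Invert the binning back to (approximate) continuous values."""
    return low + np.array(tokens) / (NUM_BINS - 1) * (high - low)

action = np.array([0.12, -0.5, 0.98])  # e.g. an end-effector delta in x/y/z
tokens = action_to_tokens(action)
print(tokens, tokens_to_action(tokens))
```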
Read about LRM, MoE, SSM, RNN, CNN, SAM and LNN below👇
We (@osma, @MonaLehtinen & me, i.e. the Annif team at the National Library of Finland) recently took part in the LLMs4Subjects challenge at the SemEval-2025 workshop. The task was to use large language models (LLMs) to generate good quality subject indexing for bibliographic records, i.e. titles and abstracts.
We are glad to report that our system performed well; it was ranked:
🥇 1st in the category where the full vocabulary was used 🥈 2nd in the smaller vocabulary category 🏅 4th in the qualitative evaluations.
14 participating teams developed their own solutions for generating subject headings and the output of each system was assessed using both quantitative and qualitative evaluations. Research papers about most of the systems are going to be published around the time of the workshop in late July, and many pre-prints are already available.
We applied Annif together with several LLMs that we used to preprocess the datasets: translating the GND vocabulary terms to English, translating bibliographic records into English and German as required, and generating additional synthetic training data. After the preprocessing, we used the traditional machine learning algorithms in Annif as well as the experimental XTransformer algorithm, which is based on language models. We also combined the subject suggestions generated from English- and German-language records in a novel way.
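For readers curious what combining bilingual suggestions could look like in code, here is an illustrative-only sketch of a weighted score merge. Our actual combination method is novel and will be described in the paper, so everything below (weights, scores, identifiers) is assumed for the example:

```python
from collections import defaultdict

def merge_suggestions(en: dict[str, float], de: dict[str, float],
                      w_en: float = 0.5, w_de: float = 0.5, top_k: int = 5):
    """Combine per-subject scores from two language-specific models, return top-k."""
    combined: dict[str, float] = defaultdict(float)
    for uri, score in en.items():
        combined[uri] += w_en * score
    for uri, score in de.items():
        combined[uri] += w_de * score
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

# Hypothetical scores from an English-record model and a German-record model.
en_scores = {"gnd:4123037-1": 0.9, "gnd:4038971-6": 0.4}
de_scores = {"gnd:4123037-1": 0.7, "gnd:4011882-4": 0.5}
print(merge_suggestions(en_scores, de_scores))
```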