Darmm

company

https://darmm.kz

Darmm AI

Darmm AI is an independent R&D effort focused on speech and language AI for the Kazakh language and the broader Central Asian context. It is maintained by R3iwan.

The focus is the underserved part of the AI landscape: languages and domains where global open-source models drop in quality, and where local context, phonetics, and terminology matter.

Website: darmm.kz (currently an AI tutoring site, gradually becoming a project showcase).

What I'm working on

Kazakh ASR: fine-tuning and benchmarking modern open-source ASR (Whisper family, Wav2Vec2) on Kazakh data, with public evaluation reports.
Kazakh TTS: voice synthesis for Kazakh, including text normalization, G2P handling, and voice cloning experiments.
Real-time voice agents: end-to-end speech pipelines for Kazakh and Russian, built on LiveKit and self-hosted models.
Localized LLM fine-tuning: domain-adapted models for Kazakh/Russian text, with a focus on legal and technical terminology.
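
To make the "text normalization" step in the TTS item concrete, here is a minimal sketch of what a normalizer front end might do before G2P: lowercase the text, spell out small integers as Kazakh cardinal numerals, and strip punctuation. This is an illustration, not the Darmm pipeline. The function names `number_to_kazakh` and `normalize_for_tts` are hypothetical, the 0 to 9999 range is an arbitrary limit, and a real normalizer would also need ordinals, case suffixes, dates, and abbreviations.

```python
import re

# Kazakh cardinal numerals; index 0 is unused in both tables.
ONES = ["", "бір", "екі", "үш", "төрт", "бес", "алты", "жеті", "сегіз", "тоғыз"]
TENS = ["", "он", "жиырма", "отыз", "қырық", "елу", "алпыс", "жетпіс", "сексен", "тоқсан"]


def number_to_kazakh(n: int) -> str:
    """Spell out an integer in 0..9999 as Kazakh cardinal words."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return "нөл"
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)
    parts = []
    if thousands:
        # Assumption: say "мың" rather than "бір мың" for exactly one thousand.
        if thousands > 1:
            parts.append(ONES[thousands])
        parts.append("мың")
    if hundreds:
        # Same assumption for "жүз" versus "бір жүз".
        if hundreds > 1:
            parts.append(ONES[hundreds])
        parts.append("жүз")
    if tens:
        parts.append(TENS[tens])
    if ones:
        parts.append(ONES[ones])
    return " ".join(parts)


def _expand_digits(match: re.Match) -> str:
    """Replace a digit run with words; leave out-of-range numbers untouched."""
    value = int(match.group())
    return number_to_kazakh(value) if value <= 9999 else match.group()


def normalize_for_tts(text: str) -> str:
    """Lowercase, expand digits, drop punctuation, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[0-9]+", _expand_digits, text)
    text = re.sub(r"[^\w\s]", " ", text)  # \w covers Cyrillic letters in Python 3
    return " ".join(text.split())


if __name__ == "__main__":
    print(normalize_for_tts("Бөлмеде 15 адам бар."))
    print(normalize_for_tts("Бағасы 2500 теңге!"))
```

The cardinal-only reading is a deliberate simplification: a number followed by "жылы" (year), for example, is normally read as an ordinal, which this sketch does not handle.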

Most work is published as open models and benchmarks on Hugging Face. Production-specific components stay closed.

Technical focus

Speech: Whisper fine-tuning, Faster-Whisper / CTranslate2 inference, VITS-family TTS, VAD-based streaming, LiveKit voice agents.
Inference: vLLM serving, quantization (AWQ, GGUF), latency optimization for production deployment.
LLM adaptation: LoRA/QLoRA fine-tuning on Kazakh/Russian data, domain-specific embeddings, RAG and GraphRAG pipelines.
Evaluation: WER/CER for ASR, RAGAS and LLM-as-judge for RAG, honest reporting of where models fail.
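
For readers unfamiliar with the ASR metrics named above, the following is a minimal, dependency-free sketch of how WER and CER are commonly computed: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. It is not the evaluation code used for the published Darmm reports. The function names are illustrative, and ignoring spaces in CER is an assumption (conventions differ between toolkits).

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    reference = "мен қазақша сөйлеймін"
    hypothesis = "мен қазақша сөйлеймн"  # one character dropped in the last word
    print(f"WER = {wer(reference, hypothesis):.3f}")
    print(f"CER = {cer(reference, hypothesis):.3f}")
```

Reporting both metrics is useful for an agglutinative language like Kazakh: a single wrong suffix counts as a full word error in WER but only a small fraction in CER. Both can exceed 1.0 when the hypothesis contains many insertions.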

Why Kazakh

Most open-source speech and language models treat Kazakh as an afterthought. Whisper-large can transcribe it, but with a word error rate (WER) significantly worse than for English or Russian. TTS quality lags even further behind. There are no widely used open Kazakh-specific speech models.

This is the gap I work on. Not because it's prestigious, but because it's unsolved and locally important.

Status

Darmm is an early-stage R&D effort, not a finished product line. Models and writeups are published as they're ready, with honest baselines and known limitations. Reach out via github.com/R3iwan.

Models (5)

Darmm/darmm-text-generation-kazakh-v2

Darmm/darmm-tech-scribe

Darmm/darmm-text-generation-kazakh

Darmm/darmm-sentiment-kazakh

Darmm/darmm-embedding-multilingual

Datasets (2)

Darmm/darmm-text-generation-kazakh

Darmm/darmm-sentiment-kk
