Zen Omni 30B Thinking

Thinking variant of Zen Omni 30B with extended chain-of-thought for multimodal reasoning.

Overview

Built on the Zen MoDE (Mixture of Distilled Experts) architecture, with 30B mixture-of-experts parameters and a 128K-token context window.

Developed by Hanzo AI and the Zoo Labs Foundation.

Quick Start

from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image
import torch

model_id = "zenlm/zen-omni-30b-thinking"
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)

# The snippet below runs text-only, so use a text prompt;
# image inputs additionally require passing pixel data to the processor.
messages = [
    {"role": "user", "content": "Explain the advantages of a mixture-of-experts architecture."}
]

# Text-only
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
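The thinking variant interleaves chain-of-thought with its final answer. Assuming the reasoning is wrapped in <think>...</think> delimiters (an assumption; check the model's chat template for the exact markers), a minimal sketch for separating reasoning from the answer:

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split decoded output into (reasoning, answer), assuming <think>...</think> markers."""
    # Collect every reasoning span (non-greedy, across newlines).
    thoughts = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    # The answer is whatever remains once the reasoning spans are removed.
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    return "\n".join(t.strip() for t in thoughts), answer

# Example with a hand-written stand-in for decoded model output:
raw = "<think>The user greeted me; reply politely.</think>Hello! How can I help?"
reasoning, answer = split_thinking(raw)
```

If the output contains no delimiters, the helper returns the full text as the answer with empty reasoning, so it is safe to apply unconditionally.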

API Access

from openai import OpenAI

client = OpenAI(base_url="https://api.hanzo.ai/v1", api_key="your-api-key")
response = client.chat.completions.create(
    model="zen-omni-30b-thinking",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Model Details

| Attribute    | Value       |
|--------------|-------------|
| Parameters   | 30B MoE     |
| Architecture | Zen MoDE    |
| Context      | 128K tokens |
| License      | Apache 2.0  |

License

Apache 2.0
