Instructions to use fblgit/cybertron-v4-qw7B-MGS with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use fblgit/cybertron-v4-qw7B-MGS with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="fblgit/cybertron-v4-qw7B-MGS")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fblgit/cybertron-v4-qw7B-MGS")
model = AutoModelForCausalLM.from_pretrained("fblgit/cybertron-v4-qw7B-MGS")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use fblgit/cybertron-v4-qw7B-MGS with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "fblgit/cybertron-v4-qw7B-MGS"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fblgit/cybertron-v4-qw7B-MGS",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
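The same chat-completions call can also be made from Python. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="fblgit/cybertron-v4-qw7B-MGS"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   request = build_chat_request("What is the capital of France?")
#   print(send_chat_request(request))
```

For the SGLang server described further down, pass `base_url="http://localhost:30000"` instead.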

Use Docker

docker model run hf.co/fblgit/cybertron-v4-qw7B-MGS

SGLang

How to use fblgit/cybertron-v4-qw7B-MGS with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "fblgit/cybertron-v4-qw7B-MGS" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fblgit/cybertron-v4-qw7B-MGS",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "fblgit/cybertron-v4-qw7B-MGS" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "fblgit/cybertron-v4-qw7B-MGS",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use fblgit/cybertron-v4-qw7B-MGS with Docker Model Runner:
```
docker model run hf.co/fblgit/cybertron-v4-qw7B-MGS
```

Some questions about post-training (function calling)

by postitive666 - opened Nov 11, 2024

Discussion

postitive666

Nov 11, 2024

Hello! I use this model in the second stage of function calling, and the model alters the image URL generated by my function call. Is this due to post-training? Any suggestions?

fblgit

Owner Nov 11, 2024

Can you try this with the base Qwen models and see whether it happens there too?
Please also provide the prompt so I can reproduce it.

postitive666

Nov 12, 2024

Hello, the original Qwen instruct model is relatively more stable in terms of performance. Here is an example of a model hallucination I encountered after calling the image generation tool:
{
  "generate_image": {
    "prompt": "a cat",
    "toolbench_rapidapi_key": ""  (generated in error)
  }
}
My interface should take only one argument, "prompt", but the model generates a new "toolbench_rapidapi_key" on its own. I have tried some simple adjustments, such as changing the prompt or applying rule matching to catch recurring errors, and I can handle issues of this kind. But when, for example, I generate a URL like 'geni.static/xxx.png' and then pass it to the large model for summarization, it may alter the URL, for example changing it to 'https://geni/static' or '/geni.xxxstaice/xx.png'.
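Both problems described here can be mitigated outside the model. The following is a minimal sketch, not a fix for the model itself: it drops arguments the tool does not declare, and it swaps URLs for opaque placeholders before summarization so the model has nothing to rewrite. The schema `ALLOWED_ARGS`, the URL pattern, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical tool schema: generate_image accepts only "prompt".
ALLOWED_ARGS = {"generate_image": {"prompt"}}

# Assumed pattern for image paths such as 'geni.static/xxx.png'.
URL_PATTERN = re.compile(r"(?:https?://)?[\w.-]+/[\w./-]+\.(?:png|jpg|jpeg|webp)")


def drop_unexpected_args(tool_name, arguments):
    """Keep only declared arguments; return (clean_arguments, dropped_names)."""
    allowed = ALLOWED_ARGS.get(tool_name, set())
    clean = {key: value for key, value in arguments.items() if key in allowed}
    dropped = sorted(set(arguments) - allowed)
    return clean, dropped


def mask_urls(text):
    """Replace each matched URL with a placeholder like <URL_0>."""
    mapping = {}

    def _swap(match):
        key = f"<URL_{len(mapping)}>"
        mapping[key] = match.group(0)
        return key

    return URL_PATTERN.sub(_swap, text), mapping


def unmask_urls(text, mapping):
    """Restore the original URLs in the model's summary."""
    for key, url in mapping.items():
        text = text.replace(key, url)
    return text


clean, dropped = drop_unexpected_args(
    "generate_image", {"prompt": "a cat", "toolbench_rapidapi_key": ""}
)
masked, mapping = mask_urls("Image saved at geni.static/xxx.png.")
```

The masked text is what would be sent to the model for summarization; `unmask_urls` is then applied to the model's reply. This only helps if the model copies the placeholder through unchanged, which is itself an assumption worth checking.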

postitive666 changed discussion status to closed Nov 12, 2024

postitive666 changed discussion status to open Nov 12, 2024

fblgit

Owner Nov 12, 2024

I don't see the prompt itself or the model's answer.
Please provide:

Hyperparameters (temp, top_p, top_k)
System Prompt
Input Prompt

For both the base model and the Cybertron model, thanks.
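One way to package the requested details is a small structured report with one entry per model. This is only a sketch: the `ReproCase` fields mirror the list above, and every value shown is a placeholder rather than a setting taken from this thread.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ReproCase:
    """One run of the same prompt against one model."""
    model: str
    temperature: float
    top_p: float
    top_k: int
    system_prompt: str
    input_prompt: str
    model_output: str = ""


def to_report(cases):
    """Serialize the cases as JSON that can be pasted into a reply."""
    return json.dumps([asdict(case) for case in cases], indent=2, ensure_ascii=False)


# Placeholder values only; fill in the settings actually used.
shared = dict(temperature=0.7, top_p=0.9, top_k=40,
              system_prompt="<system prompt>", input_prompt="<input prompt>")
cases = [
    ReproCase(model="<base Qwen model id>", **shared),
    ReproCase(model="fblgit/cybertron-v4-qw7B-MGS", **shared),
]
report = to_report(cases)
```

Keeping the sampling settings identical across both entries makes it easier to attribute a difference in output to the model rather than to the configuration.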

postitive666

Nov 14, 2024

Sorry for the late reply. I did some comparisons and found that the main issue is not with this model. When calling the drawing API, some hallucinations may occur, but they do not affect usage. Qwen can make similar errors as well. Do you have experience with multi-turn tool calls? Could you provide some guidance?

fblgit

Owner Nov 14, 2024

Unfortunately not, I haven't used this feature of LLMs; I've been busy with other stuff. But I'll leave the thread open in case someone can help you.
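For readers with the same question, here is a minimal sketch of how a multi-turn tool-call history is commonly laid out. It assumes the message shapes described in the Transformers chat-templating documentation (an assistant message carrying `tool_calls`, followed by a `tool` message with the result); whether this model's chat template accepts them has not been verified in this thread. `fake_generate_image` and `run_turn` are hypothetical helpers.

```python
import json


def fake_generate_image(prompt):
    """Stand-in for a real image tool; returns a fixed, made-up path."""
    return {"url": "geni.static/xxx.png"}


def assistant_tool_call(name, arguments):
    """Assistant message that requests a tool call."""
    return {
        "role": "assistant",
        "tool_calls": [
            {"type": "function", "function": {"name": name, "arguments": arguments}}
        ],
    }


def tool_result(name, result):
    """Tool message that feeds the result back to the model."""
    return {"role": "tool", "name": name, "content": json.dumps(result)}


def run_turn(messages, name, arguments, tools):
    """Append the tool call and its result to the running history."""
    messages.append(assistant_tool_call(name, arguments))
    result = tools[name](**arguments)
    messages.append(tool_result(name, result))
    return messages


tools = {"generate_image": fake_generate_image}
messages = [{"role": "user", "content": "Draw a cat"}]
# In practice, name and arguments come from parsing the model's output.
run_turn(messages, "generate_image", {"prompt": "a cat"}, tools)
```

Each further turn appends to the same `messages` list, so the model sees every earlier call and result when producing its next reply or its final summary.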
