Some questions about post-training (function calling)
Hello! I use this model in the second stage of function calling, and the model alters the image URL generated by my function call. Is this due to post-training? Any suggestions?
Can you try this with the base Qwen models and see whether it happens there too?
Please also provide the prompt so I can reproduce it.
Hello, the original Qwen instruct model is relatively more stable in terms of performance. Here is an example of a model hallucination I encountered after calling the image generation tool:
{
  "generate_image": {
    "prompt": "a cat",
    "toolbench_rapidapi_key": ""   (generated in error)
  }
}
My interface should only take a single "prompt" argument, but the model generates a new "toolbench_rapidapi_key" on its own. I have tried some simple adjustments, such as changing the prompt or adding rule matching to catch regular errors, and I can handle issues like this one. But when, for example, my tool generates a URL like 'geni.static/xxx.png' and I then pass it to the model for summarization, the model may alter the URL, changing it to something like 'https://geni/static' or '/geni.xxxstaice/xx.png'.
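The rule matching mentioned above could look roughly like the sketch below. It is not taken from the poster's code: `ALLOWED_ARGS`, the `[IMAGE_n]` placeholder format, and all function names are hypothetical. The first helper drops arguments the tool does not declare. The other helpers swap each URL for a short placeholder before the text goes to the model and restore it afterwards, on the assumption (untested here) that the model copies a short token more faithfully than a URL.

```python
# Hypothetical schema: the only argument the generate_image tool accepts.
ALLOWED_ARGS = {"generate_image": {"prompt"}}


def strip_unknown_args(call):
    """Drop any argument that the tool's schema does not declare."""
    return {
        tool: {k: v for k, v in args.items() if k in ALLOWED_ARGS.get(tool, set())}
        for tool, args in call.items()
    }


def mask_urls(text, urls):
    """Replace each known URL with a placeholder; return text and mapping."""
    mapping = {}
    for i, url in enumerate(urls):
        placeholder = f"[IMAGE_{i}]"  # assumed placeholder format
        mapping[placeholder] = url
        text = text.replace(url, placeholder)
    return text, mapping


def unmask_urls(text, mapping):
    """Put the original URLs back into the model's output."""
    for placeholder, url in mapping.items():
        text = text.replace(placeholder, url)
    return text


def missing_urls(text, urls):
    """List URLs that no longer appear verbatim (plain substring check)."""
    return [u for u in urls if u not in text]
```

For example, `strip_unknown_args` applied to the call shown earlier keeps only `"prompt"`, and `missing_urls` can flag a summary in which 'geni.static/xxx.png' was rewritten. The substring check will not notice a URL that only gained a prefix, so it is a first filter rather than a full validator.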
I don't see the prompt itself or the model's answer.
Please provide:
- Hyperparameters (temp, top_p, top_k)
- System Prompt
- Input Prompt
For both the base model and the Cybertron model, thanks.
Sorry for the late reply. I did some comparisons and found that the main issue is not with this model. When calling the drawing API, some hallucinations may occur, but this does not affect usage. Qwen may also recognize errors. Do you have experience with multi-turn tool calls? Can you provide some guidance?
Unfortunately not, I haven't used this feature of LLMs; I've been busy with other stuff. But I'll leave the thread open in case someone can help you.