Instructions to use inclusionAI/Ring-1T-FP8 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use inclusionAI/Ring-1T-FP8 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="inclusionAI/Ring-1T-FP8", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("inclusionAI/Ring-1T-FP8", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use inclusionAI/Ring-1T-FP8 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "inclusionAI/Ring-1T-FP8"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "inclusionAI/Ring-1T-FP8",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/inclusionAI/Ring-1T-FP8
- SGLang
How to use inclusionAI/Ring-1T-FP8 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "inclusionAI/Ring-1T-FP8" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "inclusionAI/Ring-1T-FP8",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "inclusionAI/Ring-1T-FP8" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "inclusionAI/Ring-1T-FP8",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use inclusionAI/Ring-1T-FP8 with Docker Model Runner:
docker model run hf.co/inclusionAI/Ring-1T-FP8
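The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes a server started as shown above is already listening locally (port 8000 in the vLLM example, port 30000 in the SGLang example) and that it returns the usual OpenAI-style response shape. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM or SGLang.

```python
import json
import urllib.request

MODEL_ID = "inclusionAI/Ring-1T-FP8"


def build_chat_request(base_url: str, prompt: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible /v1/chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, prompt: str, timeout: float = 600.0) -> str:
    """Send the request and return the text of the first choice.

    Assumption: the response follows the OpenAI chat format, i.e. the
    answer is at choices[0].message.content.
    """
    request = build_chat_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("http://localhost:8000", "What is the capital of France?")` would target the vLLM example, and the same call with `http://localhost:30000` would target the SGLang one. Building the request separately from sending it keeps the payload easy to inspect without a live server.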
Update model card: Add library_name, paper/code links, transformers usage, and deployment info
#1
by nielsr HF Staff - opened
This PR significantly enhances the model card for the Ring-1T model by:
- Adding `library_name: transformers` to the metadata: This enables the automated "how to use" widget on the Hugging Face Hub, providing users with automated code snippets for easy integration with the `transformers` library.
- Aligning the main title of the model card with the official paper title: "Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model".
- Including a direct link to the Hugging Face paper page, "Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model", in the introductory section.
- Adding a prominent link to the GitHub repository: https://github.com/inclusionAI/Ring-V2 for quick access to the code.
- Integrating a `transformers` code snippet for quick model usage, as found in the original GitHub README, under the Quickstart section.
- Updating the SGLang and vLLM deployment sections with more comprehensive environment preparation and usage instructions from the GitHub repository.
- Adding the BibTeX citation for the paper.
These updates collectively improve the discoverability, usability, and completeness of the model card on the Hugging Face Hub.
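For illustration, the metadata change in the first bullet amounts to one extra key in the YAML front matter at the top of the model card's README.md. The sketch below shows the idea with plain string handling. The helper `add_library_name` is hypothetical, and any card text passed to it here is invented for the example rather than taken from the actual Ring-1T card.

```python
def add_library_name(card_text: str, library: str = "transformers") -> str:
    """Return card_text with `library_name: <library>` in its YAML front matter.

    Front matter is taken to be the block between the first two lines that
    are exactly '---'. If the card has none, a minimal one is prepended.
    An existing library_name key is left untouched.
    """
    lines = card_text.splitlines()
    new_entry = f"library_name: {library}"

    if not lines or lines[0] != "---":
        # No front matter yet: create one holding only the new key.
        return "\n".join(["---", new_entry, "---", *lines]) + "\n"

    try:
        # Index of the closing '---' delimiter.
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("front matter is opened with '---' but never closed") from None

    if any(line.startswith("library_name:") for line in lines[1:end]):
        return card_text  # already set, nothing to do

    # Append the key as the last entry before the closing delimiter.
    lines.insert(end, new_entry)
    return "\n".join(lines) + "\n"
```

In practice a YAML-aware tool would be more robust than string handling; this version only aims to show where the key lives in the file.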
Thanks Niels, this is a pretty helpful update. Appreciate your continued help.
RichardBian changed pull request status to merged