Gateway Problem

I bought Hugging Face Pro. Since yesterday I have been trying to use the model microsoft/Phi-3-mini-128k-instruct through Hugging Face's Serverless Inference API with the following code:

```python
response = self.client.chat.completions.create(
    model=self.model_name,
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    max_tokens=300,
)
```
However, I keep getting the same error:

504 Server Error: Gateway Time-out for url: https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-128k-instruct/v1/chat/completions

I tried other models, such as meta-llama/Llama-3.2-1B, and I keep hitting the same problem. Amid the errors, there were one or two successful requests. Is this normal?


I am getting the same thing.


Hi @JoseLuisNeves, the model microsoft/Phi-3-mini-128k-instruct is not loaded on the Serverless API, but you can use it with Inference Endpoints. Inference Endpoints lets you deploy your models on dedicated, fully managed infrastructure and gives you the flexibility to quickly create endpoints on CPU or GPU resources. It is billed by compute uptime rather than by character usage.
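For reference, here is a minimal stdlib-only sketch of calling such an endpoint once it is deployed. The endpoint URL and token below are placeholders (copy the real URL from your endpoint's overview page), and the request shape follows TGI's OpenAI-compatible `/v1/chat/completions` route:

```python
import json
import urllib.request

# Placeholder: replace with the URL shown on your Inference Endpoint's page.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"

def build_request(token: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the endpoint's chat-completions route."""
    payload = {
        "model": "tgi",  # TGI-backed endpoints accept a placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "max_tokens": 300,
    }
    return urllib.request.Request(
        ENDPOINT_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # your HF access token
            "Content-Type": "application/json",
        },
    )

# Sending the request (not executed here, since the URL is a placeholder):
# with urllib.request.urlopen(build_request("hf_xxx", "Hello")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint speaks the OpenAI chat-completions protocol, you can also keep your existing `openai` client code and just point its `base_url` at the endpoint.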
