Gateway Problem

I bought Hugging Face Pro. Since yesterday I have been trying to use the model microsoft/Phi-3-mini-128k-instruct through Hugging Face's Serverless Inference API with the following code:

```python
response = self.client.chat.completions.create(
    model=self.model_name,
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
    max_tokens=300,
)
```
However, I keep getting the same error:

504 Server Error: Gateway Time-out for url: https://api-inference.huggingface.co/models/microsoft/Phi-3-mini-128k-instruct/v1/chat/completions

I tried other models, such as meta-llama/Llama-3.2-1B, and I keep hitting the same problem. Amid the errors, there were one or two successful requests. Is this normal?


I am getting the same thing.


Hi @JoseLuisNeves, the model microsoft/Phi-3-mini-128k-instruct is not loaded on the Serverless API, but you can use it with Inference Endpoints. Inference Endpoints lets you deploy your models on dedicated, fully managed infrastructure and gives you the flexibility to quickly create endpoints on CPU or GPU resources. It is billed by compute uptime rather than by character usage.
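For reference, here is a minimal stdlib-only sketch of calling such an endpoint once it is deployed. The endpoint URL and token below are placeholders (copy the real URL from your endpoint's overview page), and the request shape follows TGI's OpenAI-compatible `/v1/chat/completions` route:

```python
import json
import urllib.request

# Placeholder: replace with the URL shown on your Inference Endpoint's page.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"

def build_request(token: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the endpoint's chat-completions route."""
    payload = {
        "model": "tgi",  # TGI-backed endpoints accept a placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "max_tokens": 300,
    }
    return urllib.request.Request(
        ENDPOINT_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # your HF access token
            "Content-Type": "application/json",
        },
    )

# Sending the request (not executed here, since the URL is a placeholder):
# with urllib.request.urlopen(build_request("hf_xxx", "Hello")) as resp:
#     reply = json.load(resp)["choices"][0]["message"]["content"]
```

Because the endpoint speaks the OpenAI chat-completions protocol, you can also keep your existing `openai` client code and just point its `base_url` at the endpoint.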
