Make inference requests on your model APIs.
POST https://run.blaxel.ai/{YOUR-WORKSPACE}/models/{YOUR-MODEL}
Model APIs expose two endpoints:

- run.blaxel.ai/your-workspace/models/your-model (the base endpoint) generates text based on a prompt
- run.blaxel.ai/your-workspace/models/your-model/v1/chat/completions (the ChatCompletions API implementation) generates a response based on a list of messages

For example, calling the base endpoint with curl:

curl 'https://run.blaxel.ai/YOUR-WORKSPACE/models/YOUR-MODEL' \
  -H 'accept: application/json, text/plain, */*' \
  -H 'x-Blaxel-authorization: Bearer YOUR-TOKEN' \
  -H 'x-Blaxel-workspace: YOUR-WORKSPACE' \
  --data-raw $'{"inputs":"Enter your input here."}'
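The same request can be assembled programmatically. The sketch below builds the URL, headers, and body for the ChatCompletions path: the header names are copied from the curl example above, while the OpenAI-style `messages` payload shape (a list of role/content objects) is an assumption about the ChatCompletions implementation, not something this page confirms.

```python
import json

def build_chat_request(workspace: str, model: str, messages: list, token: str):
    """Build the URL, headers, and JSON body for a ChatCompletions call.

    Header names mirror the curl example above; the OpenAI-style
    "messages" body shape is an assumption.
    """
    url = f"https://run.blaxel.ai/{workspace}/models/{model}/v1/chat/completions"
    headers = {
        "accept": "application/json, text/plain, */*",
        "x-Blaxel-authorization": f"Bearer {token}",
        "x-Blaxel-workspace": workspace,
        "content-type": "application/json",
    }
    body = json.dumps({"messages": messages})
    return url, headers, body

# Example usage (no network call is made here):
url, headers, body = build_chat_request(
    "your-workspace",
    "your-model",
    [{"role": "user", "content": "Hello there!"}],
    "YOUR-TOKEN",
)
```

The returned pieces can then be passed to any HTTP client, e.g. `requests.post(url, headers=headers, data=body)`.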
You can also run inference with the Blaxel CLI:

bl run model your-model --data '{"inputs":"Enter your input here."}'

Use the --path flag to call a specific path of the model API:

bl run model your-model --path /v1/chat/completions --data '{"inputs":"Hello there!"}'