---
title: REST API endpoints for models inference
shortTitle: Inference
intro: Use the REST API to submit a chat completion request to a specified model, with or without organizational attribution.
versions: # DO NOT MANUALLY EDIT. CHANGES WILL BE OVERWRITTEN BY A 🤖
  fpt: '*'
topics:
  - API
autogenerated: rest
allowTitleToDifferFromFilename: true
---
## About {% data variables.product.prodname_github_models %} inference

You can use the REST API to run inference requests on the {% data variables.product.prodname_github_models %} platform. The API requires the `models: read` scope when you authenticate with a {% data variables.product.pat_v2 %} or as a {% data variables.product.prodname_github_app %}.

The API supports:

* Accessing top models from OpenAI, DeepSeek, Microsoft, Llama, and more.
* Running chat-based inference requests with full control over sampling and response parameters.
* Streaming or non-streaming completions.
* Organizational attribution and usage tracking.
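As a minimal sketch of the request shape described above, the following Python snippet builds a non-streaming chat completion request. The endpoint URL, model identifier, and parameter values are illustrative assumptions, not taken from this page; see the generated reference below for the authoritative endpoint and parameters.

```python
import json
import os
import urllib.request

# Assumed endpoint for illustration; the generated reference below is
# authoritative. Organization-attributed requests use a different path.
ENDPOINT = "https://models.github.ai/inference/chat/completions"

def build_request(token: str) -> urllib.request.Request:
    """Build a chat completion request. The token must have the
    `models: read` scope."""
    payload = {
        "model": "openai/gpt-4o",  # illustrative {publisher}/{model} id
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"},
        ],
        "temperature": 0.7,  # sampling parameter
        "max_tokens": 256,   # response-length cap
        "stream": False,     # set True to stream tokens as they arrive
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send the request (requires a real token with `models: read`):
# req = build_request(os.environ["GITHUB_TOKEN"])
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The response follows the familiar chat-completions shape, with the generated text under `choices[0].message.content`.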
<!-- Content after this section is automatically generated -->