OpenAI Chat API

Overview

The Chat API provides an interface fully compatible with the OpenAI Chat Completions API, supporting multi-turn conversations, streaming responses, tool calls, and more. It creates model responses from conversation history, in either streaming or non-streaming mode.


Interface Details

Create Chat Completion

Endpoint: POST /v1/chat/completions

Description: Creates a model response based on the conversation history. Supports streaming and non-streaming responses.

Authentication: Bearer Token

```http
Authorization: Bearer YOUR_API_TOKEN
```

Request Parameters

Header Parameters

| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| Authorization | string | Yes | Bearer Token authentication | `Bearer sk-xxx...` |
| Content-Type | string | Yes | Content type | `application/json` |

Body Parameters

| Parameter | Type | Required | Default | Description | Constraints |
|---|---|---|---|---|---|
| model | string | Yes | - | Model ID | `gpt-5.1`, `gpt-4`, etc. |
| messages | array[object] | Yes | - | List of messages | Min 1 message |
| temperature | number | No | 1 | Sampling temperature | 0 ≤ x ≤ 2 |
| top_p | number | No | 1 | Nucleus sampling | 0 ≤ x ≤ 1 |
| n | integer | No | 1 | Number of generations | ≥ 1 |
| stream | boolean | No | false | Stream response | - |
| stream_options | object | No | - | Stream options | - |
| stream_options.include_usage | boolean | No | - | Include usage stats | - |
| stop | string/array | No | - | Stop sequences | - |
| max_tokens | integer | No | - | Max generation tokens | ≥ 0 |
| max_completion_tokens | integer | No | - | Max completion tokens | ≥ 0 |
| presence_penalty | number | No | 0 | Presence penalty | -2 ≤ x ≤ 2 |
| frequency_penalty | number | No | 0 | Frequency penalty | -2 ≤ x ≤ 2 |
| logit_bias | object | No | - | Token bias | - |
| user | string | No | - | User identifier | - |
| tools | array[object] | No | - | List of tools | - |
| tool_choice | string/object | No | auto | Tool choice strategy | `none`, `auto`, `required` |
| response_format | object | No | - | Response format | - |
| seed | integer | No | - | Random seed | - |
| reasoning_effort | string | No | - | Reasoning effort | `low`, `medium`, `high` |
| modalities | array[string] | No | - | Modality types | `text`, `audio` |
| audio | object | No | - | Audio config | - |
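As a sketch of these constraints, a small Python helper (hypothetical, not part of any SDK) can validate a request body before sending it:

```python
def build_chat_payload(model, messages, temperature=1.0, top_p=1.0,
                       max_tokens=None, stream=False):
    """Build a /v1/chat/completions request body, enforcing the
    parameter constraints from the table above."""
    if not messages:
        raise ValueError("messages must contain at least 1 message")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must satisfy 0 <= x <= 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0 <= x <= 1")
    payload = {"model": model, "messages": messages,
               "temperature": temperature, "top_p": top_p, "stream": stream}
    if max_tokens is not None:
        if max_tokens < 0:
            raise ValueError("max_tokens must be >= 0")
        payload["max_tokens"] = max_tokens
    return payload
```

Optional parameters are omitted from the body unless set, so server-side defaults apply.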

Messages Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| role | enum | Yes | Message role: `system`, `user`, `assistant`, `tool`, `developer` |
| content | string | Yes | Message content |
| name | string | No | Sender name |
| tool_calls | array[object] | No | List of tool calls |
| tool_call_id | string | No | Tool call ID (for `tool` role messages) |
| reasoning_content | string | No | Reasoning content |

Tool Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| type | string | Yes | Tool type, usually `"function"` |
| function | object | Yes | Function definition |
| function.name | string | Yes | Function name |
| function.description | string | No | Function description |
| function.parameters | object | No | Function parameter schema (JSON Schema) |

ResponseFormat Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| type | enum | No | Response type: `text`, `json_object`, `json_schema` |
| json_schema | object | No | JSON Schema definition |

Audio Object

| Parameter | Type | Required | Description |
|---|---|---|---|
| voice | string | No | Voice type |
| format | string | No | Audio format |

Response Format

Success Response (200)

Non-streaming Response:

| Parameter | Type | Description |
|---|---|---|
| id | string | Response ID |
| object | string | Object type, fixed as `"chat.completion"` |
| created | integer | Creation timestamp (Unix seconds) |
| model | string | Model used |
| choices | array[object] | List of choices |
| choices[].index | integer | Choice index |
| choices[].message | object | Message content |
| choices[].finish_reason | enum | Finish reason: `stop`, `length`, `tool_calls`, `content_filter` |
| usage | object | Usage statistics |
| system_fingerprint | string | System fingerprint |

Usage Object

| Parameter | Type | Description |
|---|---|---|
| prompt_tokens | integer | Prompt tokens |
| completion_tokens | integer | Completion tokens |
| total_tokens | integer | Total tokens |
| prompt_tokens_details | object | Prompt token details |
| completion_tokens_details | object | Completion token details |

Streaming Response (Server-Sent Events)

Streaming responses are delivered as Server-Sent Events: each chunk is a line prefixed with `data: `, and the stream ends with a `data: [DONE]` sentinel:

```
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk",...}
data: [DONE]
```
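For illustration, a minimal Python parser for one such line might look like this (a sketch, not an SDK function; field names follow the chunk format shown above):

```python
import json

def parse_sse_line(line: str):
    """Return the content delta carried by one SSE line, or None for
    non-data lines, the [DONE] sentinel, and chunks without content."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0].get("delta", {})
    return delta.get("content")
```

Concatenating the non-None return values in order reconstructs the full assistant reply.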

Code Examples

1. Basic Conversation

Request

```bash
curl -X POST https://api.ezmodel.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "model": "gpt-5.1",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 150
  }'
```

Response

```json
{
  "id": "chatcmpl-8abcd1234efgh5678",
  "object": "chat.completion",
  "created": 1699012345,
  "model": "gpt-5.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking. How can I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 22,
    "completion_tokens": 18,
    "total_tokens": 40
  }
}
```
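The response fields can be consumed in Python along these lines (a minimal sketch; `extract_reply` is a hypothetical helper, not part of any SDK):

```python
def extract_reply(response: dict):
    """Pull the assistant text and finish reason out of a non-streaming
    chat.completion object like the one shown above."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]
```

Checking `finish_reason` matters in practice: `"length"` means the reply was truncated by `max_tokens` and may be incomplete.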

2. Streaming Response

Request

```bash
curl -X POST https://api.ezmodel.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "model": "gpt-5.1",
    "messages": [
      {
        "role": "user",
        "content": "Write a poem about spring"
      }
    ],
    "stream": true,
    "temperature": 0.8
  }'
```

Response

```
data: {"id":"chatcmpl-8abcd1234efgh5678","object":"chat.completion.chunk","created":1699012345,"model":"gpt-5.1","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-8abcd1234efgh5678","object":"chat.completion.chunk","created":1699012345,"model":"gpt-5.1","choices":[{"index":0,"delta":{"content":"Spring"},"finish_reason":null}]}

data: {"id":"chatcmpl-8abcd1234efgh5678","object":"chat.completion.chunk","created":1699012345,"model":"gpt-5.1","choices":[{"index":0,"delta":{"content":" breeze"},"finish_reason":null}]}

...

data: [DONE]
```

3. Tool Call

Request

```bash
curl -X POST https://api.ezmodel.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "model": "gpt-5.1",
    "messages": [
      {
        "role": "user",
        "content": "What time is it in Beijing now?"
      }
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_current_time",
          "description": "Get current time in specified timezone",
          "parameters": {
            "type": "object",
            "properties": {
              "timezone": {
                "type": "string",
                "description": "Timezone, e.g. Asia/Shanghai"
              }
            },
            "required": ["timezone"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```

Response

```json
{
  "id": "chatcmpl-8abcd1234efgh5678",
  "object": "chat.completion",
  "created": 1699012345,
  "model": "gpt-5.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_current_time",
              "arguments": "{\"timezone\": \"Asia/Shanghai\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 58,
    "completion_tokens": 21,
    "total_tokens": 79
  }
}
```
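Note that `function.arguments` arrives as a JSON-encoded string, not an object, so it must be decoded before use. A minimal sketch in Python (`extract_tool_calls` is a hypothetical helper):

```python
import json

def extract_tool_calls(message: dict):
    """Decode the tool calls in an assistant message into
    (function_name, parsed_arguments) pairs."""
    return [
        (call["function"]["name"], json.loads(call["function"]["arguments"]))
        for call in message.get("tool_calls") or []
    ]
```

After executing each function locally, append a `tool` role message containing the result and the matching `tool_call_id` to the conversation, then call the endpoint again so the model can produce its final answer.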

4. JSON Response

Request

```bash
curl -X POST https://api.ezmodel.cloud/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -d '{
    "model": "gpt-5.1",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant designed to output JSON."
      },
      {
        "role": "user",
        "content": "List three fruits, including name and color"
      }
    ],
    "response_format": {
      "type": "json_object"
    }
  }'
```

Response

```json
{
  "id": "chatcmpl-8abcd1234efgh5678",
  "object": "chat.completion",
  "created": 1699012345,
  "model": "gpt-5.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"fruits\":[{\"name\":\"Apple\",\"color\":\"Red\"},{\"name\":\"Banana\",\"color\":\"Yellow\"},{\"name\":\"Grape\",\"color\":\"Purple\"}]}"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 45,
    "completion_tokens": 42,
    "total_tokens": 87
  }
}
```
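Even with `"type": "json_object"`, the `content` field is still a string and must be decoded by the client, for example:

```python
import json

# The content string from the example response above.
content = ('{"fruits":[{"name":"Apple","color":"Red"},'
           '{"name":"Banana","color":"Yellow"},'
           '{"name":"Grape","color":"Purple"}]}')
data = json.loads(content)  # decode before accessing fields
print(data["fruits"][0]["name"])  # prints "Apple"
```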

Error Handling

Error Response Format

```json
{
  "error": {
    "message": "Error description message",
    "type": "Error type",
    "param": "Related parameter",
    "code": "Error code"
  }
}
```

Common Error Codes

| HTTP Status Code | Error Type | Description |
|---|---|---|
| 400 | invalid_request_error | Request parameter error |
| 401 | invalid_api_key | API key invalid or not provided |
| 401 | insufficient_quota | API quota insufficient |
| 403 | access_denied | Access denied |
| 404 | not_found | Resource not found |
| 429 | rate_limit_exceeded | Request rate limit exceeded |
| 500 | api_error | Internal server error |
| 503 | service_unavailable | Service unavailable |
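429 and 5xx responses are typically transient and worth retrying with exponential backoff. A minimal sketch in Python (the `send` callable is a stand-in for your HTTP request, not a library API):

```python
import random
import time

def with_retries(send, max_attempts=5, base_delay=1.0):
    """Call send() -> (status_code, body); retry transient failures
    (429, 500, 503) with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if status in (429, 500, 503) and attempt < max_attempts - 1:
            # Delay doubles each attempt; jitter avoids synchronized retries.
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay * 0.1)
            continue
        raise RuntimeError(f"request failed with status {status}")
```

Non-retryable errors such as 400 and 401 fail immediately, since resending the same request cannot succeed.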

Common Error Examples

Invalid API Key

```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "param": "authorization",
    "code": "invalid_api_key"
  }
}
```

Rate Limit Exceeded

```json
{
  "error": {
    "message": "Rate limit exceeded. Please try again later.",
    "type": "rate_limit_exceeded",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
```

Token Limit Exceeded

```json
{
  "error": {
    "message": "This model's maximum context length is 4097 tokens. However, your messages resulted in 5120 tokens.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": "context_length_exceeded"
  }
}
```

Limits

Request Limits

  • Token Limit: Depends on model type, typically 4K-128K tokens
  • Request Rate: Depends on account level, typically 60-3000 requests per minute
  • Concurrent Connections: Depends on account level, typically 1-10 concurrent connections
  • Response Timeout: Non-streaming requests time out after 30 seconds by default

Model Support

  • Supported Models: GPT-3.5 Series, GPT-4 Series, Claude Series, etc.
  • Tool Calls: GPT-4, GPT-4 Turbo, GPT-4o, etc.
  • Vision Features: GPT-4 Vision, GPT-4o, etc.
  • Audio Features: GPT-4o, etc.

Best Practices

  1. Set Temperature Appropriately: Use high values (0.8-1.0) for creative tasks and low values (0.1-0.3) for accuracy-critical tasks
  2. Control Token Usage: Set an appropriate max_tokens to avoid unnecessary consumption
  3. Use System Messages: Set clear roles and behavior guidance via the system role
  4. Error Handling: Always include proper error handling and retry mechanisms
  5. Streaming Responses: Use stream=true for long text generation for a better user experience

SDKs and Tools

Official SDKs

  • OpenAI Python SDK: pip install openai
  • OpenAI Node.js SDK: npm install openai
  • OpenAI Java SDK: Supports various Java HTTP clients

Third-party Libraries

  • LangChain: Supports chain calls and complex workflows
  • LlamaIndex: Focuses on RAG (Retrieval Augmented Generation) applications

Version Updates

v1.0 (Current Version)

  • Fully compatible with OpenAI Chat Completions API
  • Supports streaming and non-streaming responses
  • Supports tool calls and function calls
  • Supports multi-modal input (text, image, audio)

Coming Soon

  • More model choices
  • Enhanced error handling
  • Finer cost control
  • Batch processing features

Enterprise partnership contact: service@ezmodel.cloud