POST /v1/messages
curl --request POST \
  --url https://ai.megallm.io/v1/messages \
  --header 'Content-Type: application/json' \
  --header 'anthropic-version: <anthropic-version>' \
  --header 'x-api-key: <api-key>' \
  --data '{
    "model": "claude-3.5-sonnet",
    "max_tokens": 100,
    "messages": [
      {
        "role": "user",
        "content": "What are the primary colors?"
      }
    ]
  }'
{
  "id": "msg_abc123",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "<string>"
    }
  ],
  "model": "claude-3.5-sonnet",
  "stop_reason": "end_turn",
  "stop_sequence": "<string>",
  "usage": {
    "input_tokens": 123,
    "output_tokens": 123,
    "cache_creation_input_tokens": 123,
    "cache_read_input_tokens": 123
  }
}

Authorizations

x-api-key
string
header
required

API key authentication (Anthropic-compatible)

Headers

anthropic-version
string
required

The version of the Anthropic API you want to use. Use '2023-06-01' for the current stable version.

Example:

"2023-06-01"

anthropic-beta
string

Optional header for accessing beta features. Format: comma-separated list of beta feature names.

Example:

"max-tokens-3-5-sonnet-2022-07-15"

Body

application/json
model
string
required

The model that will complete your prompt. See the list of Claude models at https://docs.claude.com/en/docs/about-claude/models

Example:

"claude-3.5-sonnet"

messages
object[]
required

Input messages, given as alternating user and assistant conversational turns. The first message must use the user role, and text is the only content format supported for that first user message.

Minimum length: 1
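
Example (a multi-turn conversation; the turn text is illustrative):

"messages": [
  { "role": "user", "content": "What are the primary colors?" },
  { "role": "assistant", "content": "Red, yellow, and blue." },
  { "role": "user", "content": "Which two mix to make green?" }
]
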
max_tokens
integer
required

The maximum number of tokens to generate before stopping. Note that Claude may stop before reaching this maximum. This parameter only specifies the absolute maximum number of tokens to generate.

Required range: x >= 1
system
string

System prompt, provided as plain text. A system prompt is a way of providing context and instructions to Claude, such as specifying a particular goal or role.
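
Example (the prompt text is illustrative):

"system": "You are a color theory tutor. Answer concisely."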

temperature
number
default:1

Amount of randomness injected into the response. Defaults to 1.0. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical / multiple choice, and closer to 1.0 for creative and generative tasks.

Required range: 0 <= x <= 1
top_p
number

Use nucleus sampling. In nucleus sampling, Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches a particular probability specified by top_p.

Required range: 0 <= x <= 1
top_k
integer

Only sample from the top K options for each subsequent token. Used to remove "long tail" low probability responses. Recommended for advanced use cases only. You usually only need to use temperature.

Required range: x >= 0
stream
boolean
default:false

Whether to incrementally stream the response using server-sent events. See streaming documentation for details.
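
Example (a sketch of a streaming request; assuming the stream follows Anthropic's server-sent event format, events such as message_start, content_block_delta, and message_stop arrive incrementally):

curl --request POST \
  --url https://ai.megallm.io/v1/messages \
  --header 'Content-Type: application/json' \
  --header 'anthropic-version: 2023-06-01' \
  --header 'x-api-key: <api-key>' \
  --data '{
    "model": "claude-3.5-sonnet",
    "max_tokens": 100,
    "stream": true,
    "messages": [{ "role": "user", "content": "What are the primary colors?" }]
  }'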

stop_sequences
string[]

Custom text sequences that will cause the model to stop generating. Claude will stop when it encounters any of these strings.
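
Example (the sequences are illustrative; if the model emits "Observation:", generation halts, stop_reason is reported as "stop_sequence", and the stop_sequence response field records which sequence fired):

"stop_sequences": ["Observation:", "\n\nHuman:"]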

tools
object[]

Definitions of tools that the model may use. If you include tools in your API request, the model may return tool_use content blocks that represent the model's use of those tools.
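
Example (a sketch of a single tool definition, assuming Anthropic's tool schema of name, description, and a JSON Schema input_schema; the tool itself is illustrative):

"tools": [
  {
    "name": "get_weather",
    "description": "Get the current weather for a given city",
    "input_schema": {
      "type": "object",
      "properties": {
        "city": { "type": "string", "description": "City name" }
      },
      "required": ["city"]
    }
  }
]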

tool_choice
object

How the model should use the provided tools. The model can be directed to a specific tool, allowed to pick from any available tool, or left to decide on its own whether to call any of the provided tools.

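Example (assuming the variants follow Anthropic's tool_choice shapes; the tool name is illustrative):

"tool_choice": { "type": "auto" }
"tool_choice": { "type": "any" }
"tool_choice": { "type": "tool", "name": "get_weather" }
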
thinking
object

Enable extended thinking by Claude. When enabled, Claude will think through the problem before responding.
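
Example (assuming Anthropic's extended-thinking parameter shape; the token budget is illustrative):

"thinking": { "type": "enabled", "budget_tokens": 2048 }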

service_tier
enum<string>

The service tier to use for the request. 'auto' lets us choose the tier, 'standard_only' restricts to standard tier.

Available options:
auto,
standard_only
metadata
object

An object describing metadata about the request.
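
Example (assuming Anthropic's metadata shape, which carries an external user identifier; the value is illustrative):

"metadata": { "user_id": "user-12345" }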

Response

Successful response

id
string
required

Unique object identifier. The format and length of IDs may change over time.

Example:

"msg_abc123"

type
enum<string>
required

Object type. For Messages, this is always 'message'.

Available options:
message
role
enum<string>
required

Conversational role of the generated message. This is always 'assistant'.

Available options:
assistant
content
object[]
required

Content generated by the model. This is an array of content blocks, each of which has a type that determines its shape.

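Example (a text block, followed by a tool_use block as emitted when the model invokes a tool; shapes assume Anthropic's content-block format, and the values are illustrative):

"content": [
  { "type": "text", "text": "The primary colors are red, yellow, and blue." },
  {
    "type": "tool_use",
    "id": "toolu_abc123",
    "name": "get_weather",
    "input": { "city": "Paris" }
  }
]
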
model
string
required

The model that handled the request.

Example:

"claude-3.5-sonnet"

stop_reason
enum<string>
required

The reason that we stopped. This may be one of the following values: end_turn (the model reached a natural stopping point), max_tokens (we exceeded the requested max_tokens or the model's maximum), stop_sequence (one of your provided custom stop_sequences was generated), or tool_use (the model invoked one or more tools).

Available options:
end_turn,
max_tokens,
stop_sequence,
tool_use
usage
object
required

Billing and rate-limit usage. Anthropic's API bills and rate-limits by token counts, as tokens represent the underlying cost to our systems.

stop_sequence
string | null

Which custom stop sequence was generated, if any. This value will be a non-null string if one of your custom stop sequences was generated.