Anthropic API

Complete guide to Anthropic Claude-compatible endpoints

MegaLLM provides full compatibility with Anthropic's Claude API format, enabling you to use Claude models through our infrastructure.

Base URL: https://ai.megallm.io for all Anthropic-compatible endpoints

Available Endpoints

All Anthropic-compatible requests go to POST /v1/messages (the Messages API) under the base URL above; see the curl example in the Authentication section below.

Quick Example

from anthropic import Anthropic

# Initialize client
client = Anthropic(
    base_url="https://ai.megallm.io",
    api_key="your-api-key"
)

# Create a message
message = client.messages.create(
    model="claude-3.5-sonnet",
    max_tokens=100,
    messages=[
        {
            "role": "user",
            "content": "Explain the theory of relativity in simple terms"
        }
    ]
)

print(message.content[0].text)

Supported Models

Model ID | Context Window | Use Case
claude-opus-4-1-20250805 | 200K tokens | Complex analysis, research
claude-3.5-sonnet | 200K tokens | Balanced performance
claude-3.7-sonnet | 200K tokens | Fast, efficient responses
claude-sonnet-4 | 200K tokens | Advanced generation

Features

🧠 Advanced Reasoning

Claude's sophisticated reasoning capabilities for complex tasks.

📚 Large Context Window

Process up to 200K tokens for extensive document analysis.

🛠️ Tool Use

Native support for function calling and tool integration.

👁️ Vision Capabilities

Analyze images and visual content alongside text.

SDK Support

MegaLLM works with Anthropic-compatible SDKs:

  • Python: anthropic official SDK
  • TypeScript/JavaScript: @anthropic-ai/sdk
  • Go: Community SDKs
  • Ruby: anthropic-rb

Key Differences from OpenAI

Message Format

Anthropic's message format differs from OpenAI's: the system prompt is a separate top-level parameter rather than a message, and max_tokens is required:

# Anthropic format
messages = [
    {
        "role": "user",
        "content": "Hello, Claude!"
    }
]

# System messages are separate
system = "You are a helpful assistant"

message = client.messages.create(
    model="claude-3.5-sonnet",
    max_tokens=100,
    system=system,  # System prompt is separate
    messages=messages
)

Response Format

# Anthropic response structure
response = {
    "id": "msg_123",
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "text",
            "text": "Hello! How can I help you today?"
        }
    ],
    "model": "claude-3.5-sonnet",
    "usage": {
        "input_tokens": 10,
        "output_tokens": 25
    }
}
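Because content is a list of blocks (text, tool_use, and so on), extract text by filtering on block type rather than assuming a single element. A small helper over the structure above:

```python
def message_text(response: dict) -> str:
    """Concatenate all text blocks in an Anthropic-style response."""
    return "".join(
        block["text"]
        for block in response["content"]
        if block["type"] == "text"
    )

response = {
    "content": [
        {"type": "text", "text": "Hello! How can I help you today?"}
    ],
    "usage": {"input_tokens": 10, "output_tokens": 25},
}

print(message_text(response))
# Total billable tokens are the sum of input and output counts
total = response["usage"]["input_tokens"] + response["usage"]["output_tokens"]
```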

Tool Use Format

tools = [
    {
        "name": "get_weather",
        "description": "Get weather for a location",
        "input_schema": {  # Note: input_schema, not parameters
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
]
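When Claude decides to use a tool, the response content contains a tool_use block carrying the tool name, a unique id, and the validated input; your code runs the tool and returns a tool_result block in the next user message. A minimal dispatch sketch (the sample response content and the get_weather stub below are illustrative, not real API output):

```python
# Local tool implementations, keyed by the tool name Claude requests.
def get_weather(location: str) -> str:
    # Illustrative stub; a real handler would call a weather service.
    return f"Sunny in {location}"

TOOL_HANDLERS = {"get_weather": get_weather}

def run_tools(response_content: list) -> list:
    """Execute every tool_use block and build tool_result blocks
    suitable for sending back in a follow-up user message."""
    results = []
    for block in response_content:
        if block["type"] == "tool_use":
            output = TOOL_HANDLERS[block["name"]](**block["input"])
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],  # must echo the block's id
                "content": output,
            })
    return results

# Example shape of an assistant response that requests a tool call
sample_content = [
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"location": "Paris"}},
]
print(run_tools(sample_content))
```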

Migration Guide

Migrating from Anthropic to MegaLLM:

# Before (Anthropic Cloud)
client = Anthropic(api_key="sk-ant-...")

# After (MegaLLM)
client = Anthropic(
    base_url="https://ai.megallm.io",
    api_key="your-api-key"
)

All your existing Anthropic code continues to work!

Authentication

Anthropic-format requests authenticate with the x-api-key header and include an anthropic-version header:

curl https://ai.megallm.io/v1/messages \
  -H "x-api-key: $MEGALLM_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-3.5-sonnet",
    "max_tokens": 100,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Rate Limits

Tier | Requests/min | Tokens/min | Concurrent
Basic | 50 | 100,000 | 10
Pro | 200 | 400,000 | 40
Enterprise | Custom | Custom | Custom
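Requests beyond your tier's quota come back as HTTP 429; clients should retry with exponential backoff and jitter rather than hammering the endpoint. A minimal sketch (the RuntimeError below is a stand-in for the SDK's rate-limit exception, and the timings are illustrative):

```python
import random
import time

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with full jitter: a random delay in
    [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def with_retries(call, max_attempts: int = 5):
    """Retry a rate-limited call, sleeping between attempts."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError:  # stand-in for a 429 / rate-limit error
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```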

Error Handling

MegaLLM returns Anthropic-compatible error responses:

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "max_tokens is required"
  }
}
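Because the error body nests a machine-readable error.type, client code can branch on it to decide whether a request is worth retrying. A sketch over the payload above (the set of retryable types is an assumption based on Anthropic's documented error format):

```python
def classify_error(payload: dict) -> str:
    """Map an Anthropic-style error body to a retry decision."""
    etype = payload.get("error", {}).get("type", "unknown")
    # Transient server-side conditions are worth retrying;
    # an invalid_request_error indicates a client-side bug.
    if etype in ("rate_limit_error", "overloaded_error"):
        return "retry"
    if etype == "invalid_request_error":
        return "fix_request"
    return "fail"

error_body = {
    "type": "error",
    "error": {
        "type": "invalid_request_error",
        "message": "max_tokens is required",
    },
}
print(classify_error(error_body))
```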

Advanced Features

Conversation History

Maintain context across multiple turns:

conversation = []

def chat(user_input):
    conversation.append({"role": "user", "content": user_input})

    response = client.messages.create(
        model="claude-3.5-sonnet",
        max_tokens=150,
        messages=conversation
    )

    assistant_message = response.content[0].text
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

# Usage
print(chat("What's the capital of France?"))
print(chat("What's its population?"))  # Knows "its" refers to Paris

Temperature and Sampling

Control response creativity:

# More deterministic
response = client.messages.create(
    model="claude-3.5-sonnet",
    max_tokens=100,
    temperature=0.0,  # Very consistent
    messages=messages
)

# More creative
response = client.messages.create(
    model="claude-3.5-sonnet",
    max_tokens=100,
    temperature=1.0,  # More varied
    top_p=0.95,       # Nucleus sampling
    messages=messages
)

Use Cases

Document Analysis

def analyze_document(document_text):
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=500,
        system="You are a document analysis expert.",
        messages=[
            {
                "role": "user",
                "content": f"""Analyze this document and provide:
                1. Main topics
                2. Key insights
                3. Summary

                Document: {document_text}"""
            }
        ]
    )
    return response.content[0].text

Code Review

def review_code(code):
    response = client.messages.create(
        model="claude-3.5-sonnet",
        max_tokens=800,
        system="You are an expert code reviewer.",
        messages=[
            {
                "role": "user",
                "content": f"""Review this code for:
                - Bugs
                - Performance issues
                - Best practices
                - Security concerns

                Code:
                ```python
                {code}
                ```"""
            }
        ]
    )
    return response.content[0].text

Pro Tip: Claude excels at tasks requiring careful reasoning, long context understanding, and nuanced responses.

Best Practices

  1. Use System Prompts: Claude responds well to clear system instructions
  2. Leverage Context Window: Take advantage of the 200K token context for large documents
  3. Structured Prompts: Use clear formatting and numbered lists for complex requests
  4. Temperature Settings: Use lower temperatures (0-0.3) for factual tasks
  5. Model Selection: Choose Opus for complex reasoning, Sonnet for balance, Haiku for speed

Next Steps