💻 Developer Guide

The Claude API Guide

Everything developers need to integrate Claude into their apps — from your first API call to production-ready patterns. Includes Python and JavaScript examples throughout.

What Is the Claude API?

The API lets you send messages to Claude programmatically and receive responses — enabling you to build Claude into any product or workflow.

🔌 REST API

The Claude API is a standard REST API. You send HTTP POST requests with your message and receive JSON responses. It works with any language that can make HTTP requests.
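Because it's plain HTTP, you can see the raw request shape without any SDK. A minimal sketch — the `build_request` helper is purely illustrative, while the endpoint URL and `anthropic-version` header follow Anthropic's documented Messages API:

```python
import json
import os

API_URL = "https://api.anthropic.com/v1/messages"

def build_request(prompt, model="claude-sonnet-4-5", max_tokens=1024):
    """Assemble the headers and JSON body for a raw POST to the Messages API."""
    headers = {
        "x-api-key": os.environ.get("ANTHROPIC_API_KEY", ""),
        "anthropic-version": "2023-06-01",  # required API version header
        "content-type": "application/json",
    }
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, body

headers, body = build_request("Explain APIs in one paragraph.")
print(json.dumps(body, indent=2))
# Sending it is one call with any HTTP client, e.g.:
#   requests.post(API_URL, headers=headers, json=body)
```

In practice you'd let the SDK do this for you — this is just to show there's no magic underneath.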

🔑 API Keys

Every request needs an API key for authentication. Keys are created in the Anthropic console and should be kept secret — never hardcode them in client-side code.

💬 Messages Format

Requests use a "messages" array where each message has a role ("user" or "assistant") and content. This lets you send full conversation histories for multi-turn chats.
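For example, a short multi-turn history looks like this (the message contents are made up for illustration):

```python
# A three-message history: roles alternate "user" / "assistant",
# and the final message is the new user turn you want answered.
messages = [
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "Roughly how many people live there?"},
]

print(messages[-1]["content"])  # → "Roughly how many people live there?"
```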

📦 Official SDKs

Anthropic provides official SDKs for Python and TypeScript/JavaScript. They handle authentication, retries, and streaming for you — prefer them over raw HTTP unless you have a specific reason not to.

⚡ Streaming

Instead of waiting for the full response, streaming sends tokens as they're generated. This makes your app feel faster and is essential for good UX in chat applications.

🧠 System Prompts

A system prompt sets Claude's persona, rules, and context before the conversation starts. It's separate from the messages array and shapes how Claude behaves throughout.

Your First API Call

From zero to a working API call in under 5 minutes.

1. Get your API key

Sign up at console.anthropic.com → Go to "API Keys" → Create a new key. Copy it somewhere safe — you won't see it again.

2. Install the SDK

Terminal
# Python
pip install anthropic

# Node.js
npm install @anthropic-ai/sdk

3. Make your first call

hello_claude.py
import anthropic

client = anthropic.Anthropic(
    api_key="your-api-key-here"  # or set ANTHROPIC_API_KEY env var
)

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain APIs in one paragraph."}
    ],
)

print(message.content[0].text)
⚠️

Security tip: Never hardcode your API key in source code. Use environment variables (ANTHROPIC_API_KEY) or a secrets manager. Never commit keys to GitHub.

Understanding Every Parameter

What each field in the API request does and when to use it.

| Parameter | Required? | Type | What it does | Recommended value |
|---|---|---|---|---|
| model | Required | string | Which Claude model to use | claude-sonnet-4-5 for most tasks |
| messages | Required | array | The conversation history as an array of {role, content} objects | Always start with role: "user" |
| max_tokens | Required | integer | Maximum tokens in the response. Setting this too low will cut off responses mid-sentence. | 1024 for chat, 4096 for long content |
| system | Optional | string | A system prompt that sets Claude's persona, rules, and context for the entire conversation | Always use for production apps |
| temperature | Optional | float 0–1 | Controls randomness. 0 = deterministic, 1 = most creative. Default is 1. | 0 for factual, 0.7 for creative |
| stream | Optional | boolean | If true, streams tokens as they generate instead of waiting for the full response | true for chat UIs |
| stop_sequences | Optional | array | Claude stops generating when it produces any of these strings. Useful for structured output. | e.g. ["Human:"] |
| top_p | Optional | float 0–1 | Nucleus sampling threshold. Use temperature OR top_p, not both. | Leave unset if using temperature |

Real-World Code Patterns

Copy-paste patterns for the most common use cases.

💬 Multi-Turn Conversation

multi_turn.py
# Build a conversation history and keep adding to it
conversation_history = []

def chat(user_message):
    conversation_history.append({
        "role": "user",
        "content": user_message
    })
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        system="You are a helpful assistant.",
        messages=conversation_history
    )
    reply = response.content[0].text
    conversation_history.append({
        "role": "assistant",
        "content": reply
    })
    return reply

# Use it
print(chat("What's the capital of France?"))
print(chat("What's the population of that city?"))  # Claude remembers context

⚡ Streaming Response

streaming.py
# Tokens print as they generate — great for chat UIs
with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about code."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

๐Ÿ—๏ธ Structured JSON Output

structured_output.py
import json

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="""You are a data extractor. Always respond with valid JSON only.
No explanation, no markdown, just the JSON object.""",
    messages=[{
        "role": "user",
        "content": """Extract the key info from this job posting as JSON with keys:
title, company, salary_range, required_skills (array), location.

Job posting: Senior Python Developer at TechCorp, Sydney. Salary $120-150k.
Must know Python, FastAPI, PostgreSQL, Docker."""
    }]
)

data = json.loads(response.content[0].text)
print(data['title'])  # → "Senior Python Developer"

Understanding API Costs

You pay per token — both for input (your prompt) and output (Claude's response). Here's how to estimate your costs.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best for |
|---|---|---|---|
| Claude Haiku 4.5 | Lowest | Lowest | High-volume, simple tasks |
| Claude Sonnet 4.5 | Mid | Mid | Most production use cases |
| Claude Opus 4 | Highest | Highest | Complex, high-value tasks |
💡

Rule of thumb: ~1 token ≈ ¾ of a word. A typical user message is 50–200 tokens. A full blog post response is ~800–1500 tokens. Check exact pricing at anthropic.com/pricing as rates change.
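As a rough worked example — the per-million-token rates below are placeholders, so substitute the current published prices:

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of a batch of calls, given per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Illustrative rates only -- check anthropic.com/pricing for real numbers.
monthly = estimate_cost(
    input_tokens=10_000 * 150,   # 10k requests/month, ~150 input tokens each
    output_tokens=10_000 * 500,  # ~500 output tokens per response
    input_price_per_m=3.00,
    output_price_per_m=15.00,
)
print(f"${monthly:.2f}")  # → $79.50
```

Note that output tokens usually dominate the bill, which is why a sensible max_tokens matters.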


Production Best Practices

What separates a hobby project from a robust production integration.

๐Ÿ” Handle Retries & Rate Limits

The API can return 429 (rate limit) or 529 (overloaded) errors. Always implement exponential backoff. The official SDK handles this automatically โ€” one more reason to use it.
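If you do roll your own, a backoff loop can be sketched like this — `RetryableError` is a stand-in for the SDK's rate-limit/overload exceptions, not a real class:

```python
import random
import time

class RetryableError(Exception):
    """Stand-in for rate-limit (429) / overloaded (529) errors."""

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Run `call` (a zero-argument function), retrying on retryable errors
    with exponential backoff plus jitter: ~1s, ~2s, ~4s, ... between attempts."""
    for attempt in range(max_retries):
        try:
            return call()
        except RetryableError:
            if attempt == max_retries - 1:
                raise  # out of retries -- surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

The jitter term spreads out retries so many clients don't hammer the API in lockstep after an outage.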

🧹 Manage Context Window

For long conversations, old messages eat up your context window and cost money. Summarise old turns, prune irrelevant messages, or use a sliding window to keep only recent history.
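A sliding window can be a few lines — this `sliding_window` helper is an illustrative sketch, not an SDK feature:

```python
def sliding_window(history, max_messages=20):
    """Keep only the most recent messages, then trim any leading assistant
    turn so the pruned history still starts with a "user" message."""
    recent = history[-max_messages:]
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent

# Example: 30 alternating turns pruned down before the next API call
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"turn {i}"}
    for i in range(30)
]
print(len(sliding_window(history, max_messages=7)))  # → 6
```

Summarising dropped turns into the system prompt is the natural next step once plain truncation loses too much context.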

๐Ÿ“ Log Everything

Store prompts, responses, token counts, and latency in a database. This is invaluable for debugging, cost tracking, and improving your prompts over time.
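A minimal sketch of such a log — here an in-memory list stands in for the database table a real app would write to:

```python
import time

call_log = []  # stands in for a database table

def log_call(prompt, response_text, input_tokens, output_tokens, latency_ms):
    """Record one structured row per API call for debugging and cost tracking."""
    call_log.append({
        "timestamp": time.time(),
        "prompt": prompt,
        "response": response_text,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        "latency_ms": latency_ms,
    })

log_call("What's the capital of France?", "Paris.", 14, 3, 850)
print(call_log[0]["total_tokens"])  # → 17
```

The real token counts come back on every response in its usage field, so logging them costs nothing extra.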

๐Ÿ›ก๏ธ Validate Outputs

If Claude returns structured data (JSON, code), always validate it before using it. Don't trust the shape of the output blindly โ€” add error handling and fallbacks.
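A sketch of that validation for the job-posting extractor shown earlier — the helper name and the decision to return None rather than raise are illustrative choices:

```python
import json

REQUIRED_KEYS = {"title", "company", "salary_range", "required_skills", "location"}

def parse_job_posting(raw_text):
    """Validate Claude's JSON output; return None instead of raising so the
    caller can retry with a stricter prompt or fall back."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    if not isinstance(data["required_skills"], list):
        return None
    return data

print(parse_job_posting("Sure! Here's the JSON: {...}"))  # → None (not valid JSON)
```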

๐ŸŒก๏ธ Tune Temperature by Task

Use temperature=0 for factual Q&A, data extraction, and classification. Use 0.5โ€“0.8 for creative writing and brainstorming. Never use temperature=1 for tasks that need precision.

💾 Cache Repeated Prompts

If many users ask similar questions (e.g. about your docs), cache the responses. Claude's prompt caching feature can significantly reduce costs for repeated context.
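An application-level response cache can be sketched in a few lines — note this is separate from Anthropic's server-side prompt caching, and `cached_ask` is an illustrative helper:

```python
import hashlib

_response_cache = {}  # in production, use Redis or similar with a TTL

def cached_ask(question, ask_claude):
    """Return a cached answer for repeated questions; `ask_claude` is the
    function that makes the real API call, invoked only on a cache miss."""
    key = hashlib.sha256(question.strip().lower().encode()).hexdigest()
    if key not in _response_cache:
        _response_cache[key] = ask_claude(question)
    return _response_cache[key]
```

Normalising the question before hashing (here: strip + lowercase) decides how aggressively near-duplicates share an answer; tune that to your use case.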

Ready to start building?

Get your API key and make your first call in under 5 minutes.

✦ Get API Key · Official Docs →