
LLM APIs for Developers

A practical guide to calling the Claude, OpenAI, and Gemini APIs from code and integrating them into developer tools and applications.

Overview

LLM APIs let you integrate AI capabilities directly into your applications. Whether you are building a coding assistant, chatbot, content generator, or data analysis tool, understanding how to call LLM APIs effectively is an essential developer skill.

API Comparison

Provider             | Top Model         | Context | Price (Input/Output per 1M tokens)
Anthropic            | Claude 3.5 Sonnet | 200K    | $3 / $15
OpenAI               | GPT-4o            | 128K    | $2.50 / $10
Google               | Gemini 1.5 Pro    | 1M+     | $1.25 / $5
Meta (via providers) | Llama 3.1 405B    | 128K    | Varies by host

Example: Claude API (TypeScript)

TypeScript
import Anthropic from '@anthropic-ai/sdk';

// With no arguments, the SDK reads the key from the ANTHROPIC_API_KEY environment variable
const client = new Anthropic();

const message = await client.messages.create({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  messages: [
    {
      role: 'user',
      content: 'Write a TypeScript function to validate email addresses.',
    },
  ],
});

// content is a union of block types; narrow to a text block before reading .text
const firstBlock = message.content[0];
if (firstBlock.type === 'text') {
  console.log(firstBlock.text);
}

Best Practices

  • Always set max_tokens to prevent runaway costs
  • Implement retry logic with exponential backoff for rate limits
  • Cache LLM responses when the same prompt produces the same result
  • Use streaming for long responses to improve perceived latency
  • Store API keys in environment variables, never in code
  • Monitor usage and set billing alerts to prevent surprise costs
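The retry advice above can be sketched as a generic helper. This is a minimal illustration, not any SDK's built-in mechanism: `callWithRetry` is a hypothetical name, and the retryable-error check should be adapted to your provider (e.g. retry on HTTP 429 or 5xx responses).

```typescript
// Generic retry wrapper with exponential backoff and random jitter.
// Delays grow as baseDelayMs * 2^attempt (500ms, 1s, 2s, ...), with
// jitter added so many clients do not retry in lockstep.
async function callWithRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      const delayMs = baseDelayMs * 2 ** attempt + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

You would wrap an API call like `callWithRetry(() => client.messages.create({...}))`. In real code, inspect the error and only retry transient failures such as rate limits.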

Frequently Asked Questions

Which LLM API should I use?

For coding tasks, Claude models are widely regarded as strong at reasoning and offer a large context window. OpenAI's API has the broadest ecosystem of tooling and integrations. Gemini offers the largest context window (1M+ tokens). Choose based on your primary use case.

How much do LLM APIs cost?

Pricing varies by model and usage. Claude 3.5 Sonnet costs ~$3 per million input tokens; GPT-4o costs ~$2.50 per million input tokens. Some providers, such as Google, offer free tiers for experimentation.
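Since pricing is per token, a request's cost is simple arithmetic. A sketch using the rates from the comparison table above (these prices change over time, so treat the numbers as illustrative):

```typescript
// USD per 1M tokens, mirroring the comparison table in this article.
const PRICING: Record<string, { input: number; output: number }> = {
  'claude-3-5-sonnet': { input: 3.0, output: 15.0 },
  'gpt-4o': { input: 2.5, output: 10.0 },
  'gemini-1.5-pro': { input: 1.25, output: 5.0 },
};

// Cost = (inputTokens * inputRate + outputTokens * outputRate) / 1,000,000
function estimateCostUSD(
  model: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const rates = PRICING[model];
  if (!rates) throw new Error(`Unknown model: ${model}`);
  return (inputTokens * rates.input + outputTokens * rates.output) / 1_000_000;
}
```

For example, a Claude 3.5 Sonnet request with 10,000 input tokens and 2,000 output tokens costs (10,000 × $3 + 2,000 × $15) / 1M = $0.06.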

Can I use LLM APIs in production?

Yes. All major LLM providers offer production-grade rate limits, security features, and (typically on enterprise plans) SLAs. Add proper error handling, rate limiting, and caching for production use.
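The caching piece can be as simple as a map keyed by model and prompt. A minimal in-memory sketch (`ResponseCache` is an illustrative name; production systems typically use Redis or similar, with TTLs and size limits, and should only cache deterministic calls, e.g. temperature 0):

```typescript
// Minimal in-memory cache: returns a stored response for a repeated key,
// otherwise runs the expensive computation (an LLM call) and stores it.
class ResponseCache {
  private store = new Map<string, string>();

  async getOrCompute(
    key: string,
    compute: () => Promise<string>,
  ): Promise<string> {
    const cached = this.store.get(key);
    if (cached !== undefined) return cached;
    const value = await compute();
    this.store.set(key, value);
    return value;
  }
}
```

A reasonable key is `` `${model}:${prompt}` `` (or a hash of it); repeated identical prompts then skip the API entirely, saving both latency and tokens.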