SDK Documentation
Add AI cost protection to your stack in under 5 minutes. No proxy, no infrastructure changes — just wrap your existing provider calls.
Installation
Install the ModelCost SDK for your language. All SDKs are lightweight with minimal dependencies.
```shell
pip install modelcost
```
Minimum versions: Python 3.9+ · Node.js 18+ · Java 17+
Quick Start
Three steps to production safety: initialize the SDK, wrap your AI provider client, and use it as normal. The SDK intercepts calls transparently to track costs, enforce budgets, and detect anomalies.
```python
import modelcost
from openai import OpenAI

# 1. Initialize the SDK
modelcost.init(
    api_key="mc_your_api_key",
    org_id="org-123",
)

# 2. Wrap your AI provider client
client = modelcost.wrap(OpenAI())

# 3. Use it exactly as before — costs are tracked automatically
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
```
Configuration Reference
The SDK accepts the following configuration parameters. Only apiKey and orgId are required.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `apiKey` (required) | string | — | Your ModelCost API key. Must start with `mc_` |
| `orgId` (required) | string | — | Your organization identifier |
| `environment` | string | `"production"` | Label for the current deployment environment |
| `baseUrl` | string | `https://api.modelcost.ai` | API endpoint URL |
| `monthlyBudget` | number | — | Monthly spending limit in USD |
| `budgetAction` | string | `"alert"` | Action when budget is exceeded: `"alert"`, `"throttle"`, or `"block"` |
| `failOpen` | boolean | `true` | If true, AI calls proceed normally when ModelCost is unreachable |
| `flushIntervalMs` | number | `5000` | Milliseconds between event batch flushes |
| `flushBatchSize` | number | `100` | Maximum events per batch |
| `syncIntervalMs` | number | `10000` | Milliseconds between configuration syncs |
| `contentPrivacy` | boolean | `false` | When enabled, all PII/governance scanning runs locally; no prompt or completion text is sent to ModelCost |
Full Configuration Example
```python
import modelcost

modelcost.init(
    api_key="mc_your_api_key",            # Required — your API key
    org_id="org-123",                     # Required — your organization ID
    environment="production",             # Label for this environment
    base_url="https://api.modelcost.ai",  # API endpoint (default)
    monthly_budget=10000,                 # USD monthly spending limit
    budget_action="block",                # "alert" | "throttle" | "block"
    fail_open=True,                       # Continue if ModelCost is unreachable
    flush_interval_seconds=5.0,           # How often to flush event batches
    flush_batch_size=100,                 # Max events per batch
    sync_interval_seconds=10.0,           # Config sync interval
    content_privacy=False,                # Enable local-only PII scanning
)
```
Environment Variables
All configuration can be loaded from environment variables. Explicit parameters always take precedence.
| Variable | Maps To |
|---|---|
| `MODELCOST_API_KEY` | `apiKey` |
| `MODELCOST_ORG_ID` | `orgId` |
| `MODELCOST_ENV` | `environment` |
| `MODELCOST_BASE_URL` | `baseUrl` |
| `MODELCOST_CONTENT_PRIVACY` | `contentPrivacy` |
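The precedence rule can be sketched as a small resolver (a hypothetical helper, not part of the SDK): an explicit argument wins, then the environment variable, then the default.

```python
import os

def resolve_setting(explicit, env_var, default=None):
    """Hypothetical helper: explicit arguments beat environment variables."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)

os.environ["MODELCOST_ENV"] = "staging"

resolve_setting(None, "MODELCOST_ENV")          # "staging" (from the environment)
resolve_setting("production", "MODELCOST_ENV")  # "production" (explicit wins)
```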
Zero-Config Initialization
Set your environment variables and initialize the SDK with no arguments.
```python
import modelcost

# Reads MODELCOST_API_KEY, MODELCOST_ORG_ID, etc. from the environment
modelcost.init()
```
Content Privacy & Governance
Keep sensitive data in your environment. Not ours.
When contentPrivacy is enabled, the ModelCost SDK performs all PII and governance scanning locally inside your process. Raw prompt and completion text never leaves your environment.
Only anonymized metadata signals are sent to ModelCost — violation type, severity, and action taken. Never the underlying content.
How It Works
1. The SDK embeds precompiled regex-based PII detection patterns directly into your application runtime. No external calls needed for scanning.
2. When a prompt or completion passes through the wrapped client, the SDK scans locally for PII, PHI, secrets, and financial data.
3. If violations are detected, the SDK reports only metadata signals to ModelCost: violation type, subtype, severity, action taken, and timestamp.
4. The raw text is never transmitted. Signals are tagged with `source: "metadata_only"` so the server knows no content was received.
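The flow above can be sketched as follows. The patterns, severity values, and field names here are illustrative only, not the SDK's actual embedded pattern set:

```python
import re
from datetime import datetime, timezone

# Illustrative patterns; the real SDK embeds a much larger precompiled set.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}

def scan_locally(text):
    """Scan text in-process and emit metadata-only signals (no raw content)."""
    signals = []
    for violation_type, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            signals.append({
                "type": violation_type,
                "severity": "medium",       # illustrative severity
                "action": "alert",
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "source": "metadata_only",  # server knows no content was received
            })
    return signals

signals = scan_locally("Contact me at john@example.com or 555-0123")
# Each signal carries only metadata; the matched text itself is never included.
```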
What Gets Detected
The embedded scanner covers PII, PHI, secrets, and financial data.
Enable Content Privacy
```python
import modelcost

modelcost.init(
    api_key="mc_your_api_key",
    org_id="org-123",
    content_privacy=True,  # All PII scanning happens locally
)

# Manual PII scan (runs entirely in your environment)
result = modelcost.scan_pii("Contact me at john@example.com or 555-0123")
# result.violations: [{"type": "email", ...}, {"type": "phone", ...}]
```
You can also enable content privacy by setting `MODELCOST_CONTENT_PRIVACY=true` in your environment.

Budget Management
The SDK automatically enforces budgets configured in the dashboard. You can also perform pre-flight budget checks programmatically before making expensive calls.
```python
import modelcost

# Pre-flight budget check before making an expensive call
budget = modelcost.check_budget(
    scope="organization",
    scope_id="org-123",
)

if budget.allowed:
    response = wrapped_client.chat.completions.create(...)
else:
    print(f"Budget exceeded: {budget.reason}")
    # budget.action: "alert" | "throttle" | "block"
```
Budget Actions
| Action | Behavior |
|---|---|
| `alert` | Send notifications but allow all calls to proceed |
| `throttle` | Progressively reduce the percentage of allowed calls as spend approaches the limit |
| `block` | Reject calls entirely once the budget is exceeded, with no exceptions |
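The exact throttle curve is not specified here; one plausible model, shown as a sketch, admits every call until spend reaches 80% of the budget, then linearly shrinks the admitted fraction to zero at the limit:

```python
import random

def throttle_allows_call(spend, limit, start_ratio=0.8):
    """Hypothetical throttle curve: admit everything below start_ratio * limit,
    then linearly reduce the admitted fraction to zero at the limit."""
    if spend >= limit:
        return False  # at or over budget: behaves like "block"
    ratio = spend / limit
    if ratio < start_ratio:
        return True   # well under budget: admit every call
    admitted_fraction = (1 - ratio) / (1 - start_ratio)
    return random.random() < admitted_fraction

throttle_allows_call(100, 1000)   # True: only 10% of the budget is spent
throttle_allows_call(1000, 1000)  # False: budget exhausted
```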
Supported Providers
The SDK wraps all major AI provider clients transparently. The integration pattern is the same for every provider — just pass your client to wrap().
API Reference
Core methods available across all SDK languages.
- `init(...)`: configure the SDK with your API key, organization ID, and options
- `wrap(client)`: wrap an AI provider client so costs are tracked automatically
- `check_budget(...)`: run a pre-flight budget check for a given scope
- `scan_pii(text)`: scan text for PII locally when `contentPrivacy` is enabled. Returns detected violation types and positions
- `shutdown()`: flush pending events and shut down gracefully

Graceful Shutdown
Always call shutdown() before your process exits to ensure all pending cost events are flushed to ModelCost.
```python
import modelcost

# Flush pending events and shut down gracefully
modelcost.shutdown()
```
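If your process can exit along paths that skip an explicit shutdown call, one common Python pattern is to register it with `atexit`. The sketch below uses a stand-in function in place of `modelcost.shutdown`:

```python
import atexit

def shutdown():
    """Stand-in for modelcost.shutdown(): flush pending cost events."""
    print("flushing pending cost events")

# Invoked automatically on normal interpreter exit.
atexit.register(shutdown)
```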
FAQ
What happens if ModelCost is unreachable?
By default, failOpen is set to true. This means your AI calls proceed normally even if the ModelCost API is down. Cost events are buffered locally and flushed when connectivity is restored.
How much latency does the SDK add?
Sub-millisecond. The SDK performs budget checks asynchronously and batches cost events in the background. It never blocks your AI provider calls synchronously.
Can I use multiple AI providers simultaneously?
Yes. Wrap each provider client independently. The SDK tracks costs per-provider and attributes them to the correct models automatically.
Is data encrypted in transit?
Yes. All communication with the ModelCost API uses TLS 1.2+ encryption. When contentPrivacy is enabled, raw content never leaves your environment at all.
What data is sent when contentPrivacy is enabled?
Only anonymized metadata: violation type (e.g., "email"), severity level, action taken, and timestamp. No prompt text, completion text, or user content is ever transmitted to ModelCost.
Do I need to change my existing code?
Minimal changes. You initialize the SDK once and wrap your provider client. All existing code that uses the client continues to work unchanged — the wrapped client is a drop-in replacement.