SDK Documentation
Add AI cost protection to your stack in under 5 minutes. No proxy, no infrastructure changes — just wrap your existing provider calls.
Installation
Install the ModelCost SDK for your language. All SDKs are lightweight with minimal dependencies.
```shell
pip install modelcost
```
Minimum versions: Python 3.9+ · Node.js 18+ · Java 17+
Quick Start
Three steps to production safety: initialize the SDK, wrap your AI provider client, and use it as normal. The SDK intercepts calls transparently to track costs, enforce budgets, and detect anomalies.
```python
import modelcost
from openai import OpenAI

# 1. Initialize the SDK
modelcost.init(
    api_key="mc_your_api_key",
    org_id="org-123",
)

# 2. Wrap your AI provider client
client = modelcost.wrap(OpenAI())

# 3. Use it exactly as before — costs are tracked automatically
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
```
Configuration Reference
The SDK accepts the following configuration parameters. Only apiKey and orgId are required.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `apiKey` (required) | string | — | Your ModelCost API key. Must start with `mc_` |
| `orgId` (required) | string | — | Your organization identifier |
| `environment` | string | `"production"` | Label for the current deployment environment |
| `baseUrl` | string | `https://api.modelcost.ai` | API endpoint URL |
| `monthlyBudget` | number | — | Monthly spending limit in USD |
| `budgetAction` | string | `"alert"` | Action when budget is exceeded: `"alert"`, `"throttle"`, or `"block"` |
| `failOpen` | boolean | `true` | If true, AI calls proceed normally when ModelCost is unreachable |
| `flushIntervalMs` | number | `5000` | Milliseconds between event batch flushes |
| `flushBatchSize` | number | `100` | Maximum events per batch |
| `syncIntervalMs` | number | `10000` | Milliseconds between configuration syncs |
| `contentPrivacy` | boolean | `false` | When enabled, all PII/governance scanning runs locally; no prompt or completion text is sent to ModelCost |
Full Configuration Example
```python
import modelcost

modelcost.init(
    api_key="mc_your_api_key",            # Required — your API key
    org_id="org-123",                     # Required — your organization ID
    environment="production",             # Label for this environment
    base_url="https://api.modelcost.ai",  # API endpoint (default)
    monthly_budget=10000,                 # USD monthly spending limit
    budget_action="block",                # "alert" | "throttle" | "block"
    fail_open=True,                       # Continue if ModelCost is unreachable
    flush_interval_seconds=5.0,           # How often to flush event batches
    flush_batch_size=100,                 # Max events per batch
    sync_interval_seconds=10.0,           # Config sync interval
    content_privacy=False,                # Enable local-only PII scanning
)
```
Environment Variables
All configuration can be loaded from environment variables. Explicit parameters always take precedence.
| Variable | Maps To |
|---|---|
| `MODELCOST_API_KEY` | `apiKey` |
| `MODELCOST_ORG_ID` | `orgId` |
| `MODELCOST_ENV` | `environment` |
| `MODELCOST_BASE_URL` | `baseUrl` |
| `MODELCOST_CONTENT_PRIVACY` | `contentPrivacy` |
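The precedence rule can be sketched as a small resolver (a hypothetical helper, not part of the SDK): an explicit argument wins, then the environment variable, then the default.

```python
import os

def resolve_setting(explicit, env_var, default=None):
    """Hypothetical helper: explicit arguments beat environment variables."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)

os.environ["MODELCOST_ENV"] = "staging"

resolve_setting(None, "MODELCOST_ENV")          # "staging" (from the environment)
resolve_setting("production", "MODELCOST_ENV")  # "production" (explicit wins)
```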
Zero-Config Initialization
Set your environment variables and initialize the SDK with no arguments.
```python
import modelcost

# Reads MODELCOST_API_KEY, MODELCOST_ORG_ID, etc. from the environment
modelcost.init()
```
Content Privacy & Governance
Keep sensitive data in your environment. Not ours.
When contentPrivacy is enabled, the ModelCost SDK performs all PII and governance scanning locally inside your process. Raw prompt and completion text never leaves your environment.
Only anonymized metadata signals are sent to ModelCost — violation type, severity, and action taken. Never the underlying content.
How It Works
1. The SDK embeds precompiled regex-based PII detection patterns directly into your application runtime. No external calls needed for scanning.
2. When a prompt or completion passes through the wrapped client, the SDK scans locally for PII, PHI, secrets, and financial data.
3. If violations are detected, the SDK reports only metadata signals to ModelCost: violation type, subtype, severity, action taken, and timestamp.
4. The raw text is never transmitted. Signals are tagged with `source: "metadata_only"` so the server knows no content was received.
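The flow above can be sketched as follows. The patterns, severity values, and field names here are illustrative only, not the SDK's actual embedded pattern set:

```python
import re
from datetime import datetime, timezone

# Illustrative patterns; the real SDK embeds a much larger precompiled set.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}-\d{4}\b"),
}

def scan_locally(text):
    """Scan text in-process and emit metadata-only signals (no raw content)."""
    signals = []
    for violation_type, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            signals.append({
                "type": violation_type,
                "severity": "medium",       # illustrative severity
                "action": "alert",
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "source": "metadata_only",  # server knows no content was received
            })
    return signals

signals = scan_locally("Contact me at john@example.com or 555-0123")
# Each signal carries only metadata; the matched text itself is never included.
```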
What Gets Detected
The embedded scanner covers PII, PHI, secrets, and financial data.
Enable Content Privacy
```python
import modelcost

modelcost.init(
    api_key="mc_your_api_key",
    org_id="org-123",
    content_privacy=True,  # All PII scanning happens locally
)

# Manual PII scan (runs entirely in your environment)
result = modelcost.scan_pii("Contact me at john@example.com or 555-0123")
# result.violations: [{"type": "email", ...}, {"type": "phone", ...}]
```
You can also enable content privacy by setting `MODELCOST_CONTENT_PRIVACY=true` in your environment.

Budget Management
The SDK automatically enforces budgets configured in the dashboard. You can also perform pre-flight budget checks programmatically before making expensive calls.
```python
import modelcost

# Pre-flight budget check before making an expensive call
budget = modelcost.check_budget(
    scope="organization",
    scope_id="org-123",
)

if budget.allowed:
    response = wrapped_client.chat.completions.create(...)
else:
    print(f"Budget exceeded: {budget.reason}")
    # budget.action: "alert" | "throttle" | "block"
```
Budget Actions
| Action | Behavior |
|---|---|
| `alert` | Send notifications but allow all calls to proceed |
| `throttle` | Progressively reduce the percentage of allowed calls as spend approaches the limit |
| `block` | Reject calls entirely once the budget is exceeded, with no exceptions |
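The exact throttle curve is not specified here; one plausible model, shown as a sketch, admits every call until spend reaches 80% of the budget, then linearly shrinks the admitted fraction to zero at the limit:

```python
import random

def throttle_allows_call(spend, limit, start_ratio=0.8):
    """Hypothetical throttle curve: admit everything below start_ratio * limit,
    then linearly reduce the admitted fraction to zero at the limit."""
    if spend >= limit:
        return False  # at or over budget: behaves like "block"
    ratio = spend / limit
    if ratio < start_ratio:
        return True   # well under budget: admit every call
    admitted_fraction = (1 - ratio) / (1 - start_ratio)
    return random.random() < admitted_fraction

throttle_allows_call(100, 1000)   # True: only 10% of the budget is spent
throttle_allows_call(1000, 1000)  # False: budget exhausted
```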
Supported Providers
The SDK wraps all major AI provider clients transparently. The integration pattern is the same for every provider — just pass your client to wrap().
API Reference
Core methods available across all SDK languages.
- `init(...)`: configure the SDK with your API key, organization ID, and options
- `wrap(client)`: wrap an AI provider client so costs are tracked automatically
- `check_budget(...)`: run a pre-flight budget check for a given scope
- `scan_pii(text)`: scan text for PII locally when `contentPrivacy` is enabled. Returns detected violation types and positions
- `shutdown()`: flush pending events and shut down gracefully

Graceful Shutdown
Always call shutdown() before your process exits to ensure all pending cost events are flushed to ModelCost.
```python
import modelcost

# Flush pending events and shut down gracefully
modelcost.shutdown()
```
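If your process can exit along paths that skip an explicit shutdown call, one common Python pattern is to register it with `atexit`. The sketch below uses a stand-in function in place of `modelcost.shutdown`:

```python
import atexit

def shutdown():
    """Stand-in for modelcost.shutdown(): flush pending cost events."""
    print("flushing pending cost events")

# Invoked automatically on normal interpreter exit.
atexit.register(shutdown)
```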
FAQ
What happens if ModelCost is unreachable?
By default, failOpen is set to true. This means your AI calls proceed normally even if the ModelCost API is down. Cost events are buffered locally and flushed when connectivity is restored.
How much latency does the SDK add?
Sub-millisecond. The SDK performs budget checks asynchronously and batches cost events in the background. It never blocks your AI provider calls synchronously.
Can I use multiple AI providers simultaneously?
Yes. Wrap each provider client independently. The SDK tracks costs per-provider and attributes them to the correct models automatically.
Is data encrypted in transit?
Yes. All communication with the ModelCost API uses TLS 1.2+ encryption. When contentPrivacy is enabled, raw content never leaves your environment at all.
What data is sent when contentPrivacy is enabled?
Only anonymized metadata: violation type (e.g., "email"), severity level, action taken, and timestamp. No prompt text, completion text, or user content is ever transmitted to ModelCost.
Do I need to change my existing code?
Minimal changes. You initialize the SDK once and wrap your provider client. All existing code that uses the client continues to work unchanged — the wrapped client is a drop-in replacement.