Error Handling¶
Complete guide to handling errors in the Vectorcache API.
Error Response Format¶
All errors return JSON with a detail field:
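For example, a 400 response for an invalid threshold looks like this (the message text is illustrative and varies by error):
{
  "detail": "similarity_threshold must be between 0.0 and 1.0"
}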
HTTP Status Codes¶
| Code | Status | Description |
|---|---|---|
| 200 | OK | Request successful |
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Authentication failed |
| 403 | Forbidden | Insufficient permissions |
| 404 | Not Found | Resource not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error |
| 502 | Bad Gateway | LLM provider error |
| 503 | Service Unavailable | Service temporarily unavailable |
Common Errors¶
400 Bad Request¶
Missing Required Fields¶
Cause: Missing prompt or model in request
Solution:
// ❌ Missing model
{ prompt: "What is AI?" }
// ✅ Include all required fields
{ prompt: "What is AI?", model: "gpt-4o" }
Invalid Similarity Threshold¶
Cause: similarity_threshold outside the valid range of 0.0 to 1.0
Solution:
// ❌ Invalid threshold
{ similarity_threshold: 1.5 }
// ✅ Valid threshold
{ similarity_threshold: 0.85 }
Invalid JSON¶
Cause: Malformed JSON in request body
Solution: Validate JSON before sending
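One way to catch this early (a minimal sketch): build request bodies with JSON.stringify rather than by hand, and round-trip any externally supplied JSON through JSON.parse, which throws on malformed input:
function safeJsonBody(raw: string): string {
  // JSON.parse throws a SyntaxError on malformed input,
  // surfacing the problem client-side instead of as a 400 from the API
  return JSON.stringify(JSON.parse(raw));
}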
401 Unauthorized¶
Missing API Key¶
Cause: No Authorization header
Solution:
// ❌ Missing auth header
fetch(url, {
  method: 'POST',
  body: JSON.stringify(data)
});
// ✅ Include auth header (and Content-Type for the JSON body)
fetch(url, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify(data)
});
Invalid API Key¶
Cause: API key is invalid or revoked
Solution:
- Verify your API key in the dashboard
- Check if the key has been revoked
- Create a new API key if needed
403 Forbidden¶
Insufficient Permissions¶
Cause: Using an API key from a different project
Solution: Use the correct API key for this project
429 Too Many Requests¶
Rate Limit Exceeded¶
Headers: 429 responses include an X-RateLimit-Reset header, a Unix timestamp (in seconds) indicating when the limit resets.
Solution: Implement retry with exponential backoff
async function queryWithRetry(client, request, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await client.query(request);
    } catch (error) {
      if (error.statusCode === 429 && attempt < maxRetries) {
        // Wait until the limit resets (X-RateLimit-Reset is in Unix seconds),
        // using exponential backoff as a floor
        const resetTime = error.headers['x-ratelimit-reset'];
        const waitTime = Math.max(
          (resetTime * 1000) - Date.now(),
          1000 * Math.pow(2, attempt)
        );
        await new Promise(resolve => setTimeout(resolve, waitTime));
        continue;
      }
      // Non-429 errors, and the final failed attempt, propagate to the caller
      throw error;
    }
  }
}
500 Internal Server Error¶
Server Error¶
Cause: Unexpected server error
Solution:
- Retry the request
- If the error persists, contact support with the request details
502 Bad Gateway¶
LLM Provider Error¶
Cause: Issue with your LLM provider API key
Solution:
- Verify your LLM API key in Settings → LLM Keys
- Check if you have sufficient credits with the provider
- Ensure the requested model is available from your LLM provider
503 Service Unavailable¶
Service Temporarily Unavailable¶
Cause: Service maintenance or high load
Solution: Retry with exponential backoff (see Retry Strategies below)
Error Handling Patterns¶
JavaScript/TypeScript¶
import { VectorcacheClient, VectorcacheError } from 'vectorcache';
const client = new VectorcacheClient({ apiKey: 'YOUR_API_KEY' });
try {
  const result = await client.query({
    prompt: 'What is AI?',
    model: 'gpt-4o'
  });
  console.log(result.response);
} catch (error) {
  if (error instanceof VectorcacheError) {
    switch (error.statusCode) {
      case 400:
        console.error('Invalid request:', error.message);
        break;
      case 401:
        console.error('Authentication failed - check your API key');
        break;
      case 429:
        console.error('Rate limit exceeded - please slow down');
        break;
      case 500:
        console.error('Server error - retrying...');
        // Implement retry logic
        break;
      case 502:
        console.error('LLM provider error:', error.message);
        break;
      default:
        console.error('Unexpected error:', error.message);
    }
  } else {
    console.error('Network or unknown error:', error);
  }
}
Python¶
import requests
from requests.exceptions import HTTPError, Timeout, RequestException
def query_with_error_handling(prompt: str, model: str, api_key: str):
    url = 'https://api.vectorcache.ai/v1/cache/query'
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }
    data = {
        'prompt': prompt,
        'model': model
    }
    try:
        response = requests.post(url, json=data, headers=headers, timeout=30)
        response.raise_for_status()
        return response.json()
    except HTTPError as e:
        status_code = e.response.status_code
        error_detail = e.response.json().get('detail', 'Unknown error')
        if status_code == 400:
            raise ValueError(f"Invalid request: {error_detail}")
        elif status_code == 401:
            raise ValueError("Authentication failed - check your API key")
        elif status_code == 429:
            raise ValueError("Rate limit exceeded - please slow down")
        elif status_code == 500:
            raise ValueError(f"Server error: {error_detail}")
        elif status_code == 502:
            raise ValueError(f"LLM provider error: {error_detail}")
        else:
            raise ValueError(f"HTTP error {status_code}: {error_detail}")
    except Timeout:
        raise ValueError("Request timed out")
    except RequestException as e:
        raise ValueError(f"Request failed: {e}")
Retry Strategies¶
Exponential Backoff¶
async function exponentialBackoff(
  fn,
  maxRetries = 3,
  baseDelay = 1000
) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Don't retry on client errors (4xx except 429)
      if (error.statusCode >= 400 &&
          error.statusCode < 500 &&
          error.statusCode !== 429) {
        throw error;
      }
      if (attempt === maxRetries) {
        throw error;
      }
      const delay = Math.min(baseDelay * Math.pow(2, attempt - 1), 10000);
      console.log(`Retry attempt ${attempt} after ${delay}ms`);
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
}
// Usage
const result = await exponentialBackoff(
  () => client.query({ prompt: 'What is AI?', model: 'gpt-4o' })
);
Rate Limit Aware Retry¶
import time
import requests

def retry_with_rate_limit(fn, max_retries=3):
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == 429 and attempt < max_retries:
                # Wait until the rate limit resets, falling back to
                # exponential backoff if no reset header is present
                reset_time = int(e.response.headers.get('X-RateLimit-Reset', 0))
                if reset_time:
                    wait_time = max(reset_time - int(time.time()), 0) + 1
                else:
                    wait_time = min(2 ** attempt, 60)
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            # Other HTTP errors, and the final failed attempt, propagate
            raise
        except Exception:
            if attempt == max_retries:
                raise
            delay = min(2 ** attempt, 10)
            print(f"Retry {attempt} after {delay}s")
            time.sleep(delay)
Validation Best Practices¶
Client-Side Validation¶
Validate inputs before making API calls:
function validateQueryRequest(request: CacheQueryRequest): void {
  if (!request.prompt || request.prompt.trim().length === 0) {
    throw new Error('Prompt is required');
  }
  if (!request.model || request.model.trim().length === 0) {
    throw new Error('Model is required');
  }
  if (request.similarityThreshold !== undefined) {
    if (request.similarityThreshold < 0 || request.similarityThreshold > 1) {
      throw new Error('Similarity threshold must be between 0 and 1');
    }
  }
}
// Usage
try {
  validateQueryRequest(request);
  const result = await client.query(request);
} catch (error) {
  console.error('Validation error:', error.message);
}
Graceful Degradation¶
Implement fallback logic when Vectorcache is unavailable:
async function queryWithFallback(prompt: string, model: string) {
  try {
    // Try Vectorcache first
    const result = await client.query({ prompt, model });
    return result.response;
  } catch (error) {
    console.warn('Vectorcache unavailable, falling back to direct LLM');
    // Fallback to direct LLM call
    return await directLLMCall(prompt, model);
  }
}
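Here directLLMCall stands in for whatever direct client you already use. As a minimal sketch, assuming an OpenAI-compatible chat completions endpoint (the URL, payload shape, and OPENAI_API_KEY variable are assumptions about your provider, not part of the Vectorcache API):
async function directLLMCall(prompt: string, model: string): Promise<string> {
  // Assumed OpenAI-compatible endpoint; adjust for your provider
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model,
      messages: [{ role: 'user', content: prompt }]
    })
  });
  if (!response.ok) {
    throw new Error(`Direct LLM call failed: ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}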
Monitoring and Logging¶
Log Error Details¶
function logError(error: VectorcacheError, context: any) {
  const errorLog = {
    timestamp: new Date().toISOString(),
    statusCode: error.statusCode,
    message: error.message,
    context: context,
    headers: error.headers
  };
  // Send to logging service
  logger.error('Vectorcache API error', errorLog);
  // For 5xx errors, alert operations team
  if (error.statusCode >= 500) {
    alertOps(errorLog);
  }
}
Track Error Rates¶
const errorMetrics = {
  total: 0,
  byStatusCode: {} as Record<number, number>
};
function trackError(error: VectorcacheError) {
  errorMetrics.total++;
  errorMetrics.byStatusCode[error.statusCode] =
    (errorMetrics.byStatusCode[error.statusCode] || 0) + 1;
  // Alert if the error rate is high (totalRequests is your
  // application's overall request counter, tracked elsewhere)
  const errorRate = errorMetrics.total / totalRequests;
  if (errorRate > 0.05) { // 5% error rate
    alertHighErrorRate(errorMetrics);
  }
}
Next Steps¶
- Best Practices - Production deployment tips
- API Reference - Complete API documentation
- Support - Get help with errors