# Configuration
Configure the Vectorcache SDK for your application.
## JavaScript/TypeScript Configuration

### Basic Setup

```typescript
import { VectorcacheClient } from 'vectorcache';

const client = new VectorcacheClient({
  apiKey: 'your_api_key_here',
  baseUrl: 'https://api.vectorcache.ai' // optional, defaults to production
});
```
### Configuration Options

| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `apiKey` | string | Yes | - | Your Vectorcache API key |
| `baseUrl` | string | No | `https://api.vectorcache.ai` | API base URL |
| `timeout` | number | No | `30000` | Request timeout in milliseconds |
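For example, all three options together (a sketch; `VECTORCACHE_BASE_URL` is a hypothetical environment variable used here for illustration, not part of the SDK):

```typescript
import { VectorcacheClient } from 'vectorcache';

const client = new VectorcacheClient({
  apiKey: process.env.VECTORCACHE_API_KEY!,
  // VECTORCACHE_BASE_URL is a hypothetical override; defaults to production
  baseUrl: process.env.VECTORCACHE_BASE_URL ?? 'https://api.vectorcache.ai',
  timeout: 10000 // see "Set Appropriate Timeouts" below
});
```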
### Environment Variables

Store your API key in environment variables:

`.env` file:
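```bash
VECTORCACHE_API_KEY=your_api_key_here
```

Then read it at runtime (a minimal sketch; loading `.env` files with the `dotenv` package is one common approach in Node):

```typescript
import 'dotenv/config';
import { VectorcacheClient } from 'vectorcache';

const client = new VectorcacheClient({
  apiKey: process.env.VECTORCACHE_API_KEY!
});
```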
### TypeScript Types

The SDK is fully typed. Import types as needed:

```typescript
import {
  VectorcacheClient,
  CacheQueryRequest,
  CacheQueryResponse
} from 'vectorcache';

const request: CacheQueryRequest = {
  prompt: 'What is AI?',
  model: 'gpt-4o',
  similarityThreshold: 0.85
};

const response: CacheQueryResponse = await client.query(request);
```
## Python Configuration

### Basic Setup

The Python examples below call the REST API directly with `requests`:

```python
import requests
import os

api_key = os.environ.get('VECTORCACHE_API_KEY')
base_url = 'https://api.vectorcache.ai'

headers = {
    'Authorization': f'Bearer {api_key}',
    'Content-Type': 'application/json'
}
```
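With those headers in place, a cache query is a single POST. A minimal sketch (it reuses `base_url` and `headers` from the setup above; the `/v1/cache/query` path matches the error-handling example later on this page, and field names follow the Query Parameters tables below):

```python
# Payload fields follow the Query Parameters tables (snake_case)
data = {
    'prompt': 'What is AI?',
    'model': 'gpt-4o',
    'similarity_threshold': 0.85
}

response = requests.post(
    f'{base_url}/v1/cache/query',
    json=data,
    headers=headers,
    timeout=30  # seconds; mirrors the SDK's 30000 ms default
)
response.raise_for_status()
result = response.json()
```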
### Environment Variables

`.env` file:
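```bash
VECTORCACHE_API_KEY=your_api_key_here
```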
Using `python-dotenv`:

```python
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.environ.get('VECTORCACHE_API_KEY')
```
## Query Parameters

### Required Parameters

| Parameter | Type | Description |
|---|---|---|
| `prompt` | string | The text prompt to cache/query |
| `model` | string | LLM model identifier (e.g., `gpt-4o`, `claude-3-5-sonnet-20241022`) |
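In the TypeScript SDK, a query with only the required fields looks like this (all optional parameters fall back to their defaults):

```typescript
const result = await client.query({
  prompt: 'What is AI?',
  model: 'gpt-4o'
});
```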
### Optional Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `similarity_threshold` | number | `0.85` | Minimum similarity score (0-1) for cache hits |
| `context` | string | `null` | Additional context for the query |
| `project_id` | string | Auto-detected | Project ID (inferred from API key) |
| `include_debug` | boolean | `false` | Include debug information in response |

Parameter names are snake_case in the REST API; the TypeScript SDK uses camelCase equivalents (`similarityThreshold`, `includeDebug`), as in the examples on this page.
### Example with All Parameters

```typescript
const result = await client.query({
  prompt: 'Explain machine learning',
  context: 'Educational content for beginners',
  model: 'gpt-4o',
  similarityThreshold: 0.85,
  includeDebug: true
});
```
## Similarity Threshold

The `similarity_threshold` parameter controls cache sensitivity:

- 0.95-1.0: Very strict - only nearly identical queries match
- 0.85-0.94: Recommended - good balance of accuracy and hit rate
- 0.70-0.84: Relaxed - more cache hits but less precise matches
- Below 0.70: Not recommended - may return irrelevant cached responses
### Finding the Right Threshold

Start with 0.85 and adjust based on your use case:

```typescript
// Educational content - can be more relaxed
const eduResult = await client.query({
  prompt: 'What is photosynthesis?',
  model: 'gpt-4o',
  similarityThreshold: 0.80 // Lower threshold OK
});

// Legal/Medical - needs precision
const legalResult = await client.query({
  prompt: 'Interpret contract clause 5.2',
  model: 'gpt-4o',
  similarityThreshold: 0.92 // Higher threshold for accuracy
});
```
Learn more in the Similarity Tuning Guide.
## Error Handling

### JavaScript/TypeScript

```typescript
try {
  const result = await client.query({
    prompt: 'What is AI?',
    model: 'gpt-4o'
  });
  console.log(result.response);
} catch (error: any) {
  if (error.response?.status === 401) {
    console.error('Invalid API key');
  } else if (error.response?.status === 429) {
    console.error('Rate limit exceeded');
  } else {
    console.error('Unexpected error:', error.message);
  }
}
```
### Python

```python
# Payload for the query (see Query Parameters above)
data = {'prompt': 'What is AI?', 'model': 'gpt-4o'}

try:
    response = requests.post(
        f"{base_url}/v1/cache/query",
        json=data,
        headers=headers
    )
    response.raise_for_status()
    result = response.json()
except requests.exceptions.HTTPError as e:
    if e.response.status_code == 401:
        print("Invalid API key")
    elif e.response.status_code == 429:
        print("Rate limit exceeded")
    else:
        print(f"HTTP error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
See Error Handling for complete error codes.
## Production Best Practices

### 1. Use Environment Variables

Never hardcode API keys:

```typescript
// ❌ Bad
const client = new VectorcacheClient({
  apiKey: 'vc_1234567890abcdef'
});

// ✅ Good
const client = new VectorcacheClient({
  apiKey: process.env.VECTORCACHE_API_KEY!
});
```
### 2. Set Appropriate Timeouts

```typescript
const client = new VectorcacheClient({
  apiKey: process.env.VECTORCACHE_API_KEY!,
  timeout: 10000 // 10 seconds for production
});
```
### 3. Handle Errors Gracefully

Always have fallback logic:

```typescript
async function getResponse(prompt: string) {
  try {
    const result = await client.query({ prompt, model: 'gpt-4o' });
    return result.response;
  } catch (error) {
    console.error('Vectorcache error:', error);
    // Fall back to a direct LLM call; fallbackLLMCall is your own wrapper
    return await fallbackLLMCall(prompt);
  }
}
```
### 4. Monitor Performance

Track cache performance in your application:

```typescript
const result = await client.query({ prompt, model: 'gpt-4o' });

// Log metrics with your analytics tool of choice
analytics.track('vectorcache_query', {
  cache_hit: result.cache_hit,
  similarity_score: result.similarity_score,
  cost_saved: result.cost_saved
});
```
## Next Steps
- API Reference - Complete API documentation
- Best Practices - Production tips
- Similarity Tuning - Optimize cache hit rates