# Building Production-Ready Chatbots with COZHUB
Step-by-step tutorial on building a customer support chatbot with fallback handling and multiple models.
Developer Relations
## Introduction
Building a chatbot is easy. Building a production-ready chatbot that handles edge cases, optimizes costs, and provides great user experience? That's harder.
In this tutorial, we'll build a customer support chatbot that:
- Uses multiple AI models intelligently
- Handles model failures gracefully
- Maintains conversation context
- Optimizes for cost and quality
## Prerequisites
- Node.js 18+
- COZHUB account with API key
- Basic TypeScript knowledge
## Project Setup

```bash
mkdir cozhub-chatbot
cd cozhub-chatbot
npm init -y
npm install openai express dotenv
npm install -D typescript @types/node @types/express
```
Create a tsconfig.json:
```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true
  }
}
```
## Step 1: Basic Chatbot
Let's start with a simple chatbot:
```typescript
// src/chatbot.ts
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.cozhub.ai/v1',
  apiKey: process.env.COZHUB_API_KEY,
});

interface Message {
  role: 'user' | 'system' | 'assistant';
  content: string;
}

const SYSTEM_PROMPT = `You are a helpful customer support agent for TechCorp. You help users with:
- Account issues
- Billing questions
- Technical support
- Product information

Be friendly, concise, and helpful. If you don't know something, say so.`;

export async function chat(
  messages: Message[],
  model = 'gpt-4o-mini'
): Promise<string> {
  const response = await client.chat.completions.create({
    model,
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      ...messages,
    ],
    temperature: 0.7,
    max_tokens: 500,
  });
  return response.choices[0].message.content || '';
}
```
## Step 2: Add Model Fallback
What happens when GPT-4o is down? Let's add automatic fallback:
```typescript
// src/chatbot-with-fallback.ts
const MODELS = [
  { id: 'gpt-4o-mini', priority: 1 },
  { id: 'claude-3-5-haiku', priority: 2 },
  { id: 'gemini-2.0-flash', priority: 3 },
];

export async function chatWithFallback(
  messages: Message[]
): Promise<{ response: string; model: string }> {
  for (const model of MODELS) {
    try {
      const response = await client.chat.completions.create({
        model: model.id,
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          ...messages,
        ],
        temperature: 0.7,
        max_tokens: 500,
      });
      return {
        response: response.choices[0].message.content || '',
        model: model.id,
      };
    } catch (error: any) {
      console.error(`Model ${model.id} failed: ${error.message}`);
      // If it's a rate limit, wait before trying the next model
      if (error.status === 429) {
        await new Promise(resolve => setTimeout(resolve, 1000));
      }
      continue;
    }
  }
  throw new Error('All models failed');
}
```
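The try/next-model loop above generalizes beyond chat completions. Here is a hedged sketch of the same pattern as a reusable helper; `tryInOrder` and `Provider` are names invented for this example, not part of the COZHUB or OpenAI SDKs.

```typescript
// Generic fallback helper: try async providers in order until one succeeds.
// Hypothetical sketch -- not part of any SDK; adapt to your own error handling.

type Provider<T> = {
  id: string;
  call: () => Promise<T>;
};

export async function tryInOrder<T>(
  providers: Provider<T>[],
  onError?: (id: string, err: unknown) => void
): Promise<{ result: T; id: string }> {
  let lastError: unknown = undefined;
  for (const p of providers) {
    try {
      return { result: await p.call(), id: p.id };
    } catch (err) {
      lastError = err;
      onError?.(p.id, err); // e.g. log or record metrics before moving on
    }
  }
  throw lastError ?? new Error('No providers configured');
}
```

With this helper, `chatWithFallback` reduces to mapping `MODELS` into providers that each call `client.chat.completions.create` with their own model id.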
## Step 3: Conversation Memory
Production chatbots need to remember conversation history:
```typescript
// src/conversation-store.ts
interface Conversation {
  id: string;
  messages: Message[];
  createdAt: Date;
  lastMessageAt: Date;
}

class ConversationStore {
  private conversations = new Map<string, Conversation>();

  create(): string {
    const id = crypto.randomUUID();
    this.conversations.set(id, {
      id,
      messages: [],
      createdAt: new Date(),
      lastMessageAt: new Date(),
    });
    return id;
  }

  addMessage(conversationId: string, message: Message): void {
    const conv = this.conversations.get(conversationId);
    if (!conv) throw new Error('Conversation not found');
    conv.messages.push(message);
    conv.lastMessageAt = new Date();
    // Keep only the last 20 messages to manage context length
    if (conv.messages.length > 20) {
      conv.messages = conv.messages.slice(-20);
    }
  }

  getMessages(conversationId: string): Message[] {
    const conv = this.conversations.get(conversationId);
    return conv?.messages || [];
  }

  // Clean up old conversations (call periodically)
  cleanup(maxAgeMs = 24 * 60 * 60 * 1000): void {
    const now = Date.now();
    for (const [id, conv] of this.conversations) {
      if (now - conv.lastMessageAt.getTime() > maxAgeMs) {
        this.conversations.delete(id);
      }
    }
  }
}

export const conversationStore = new ConversationStore();
```
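The fixed 20-message cap is simple but blind to message size: twenty long messages can still blow the context window. An alternative, sketched below (not from the COZHUB docs), trims by an approximate token budget instead. The chars/4 ratio is a rough heuristic; a real tokenizer such as tiktoken would be more accurate.

```typescript
// Alternative trimming strategy: cap history by an approximate token budget
// instead of a fixed message count. chars/4 is a rough heuristic only.

interface Msg {
  role: 'user' | 'system' | 'assistant';
  content: string;
}

const approxTokens = (text: string): number => Math.ceil(text.length / 4);

export function trimToTokenBudget(messages: Msg[], budget = 3000): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  // Walk from newest to oldest, keeping messages while they fit the budget.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = approxTokens(messages[i].content);
    if (used + cost > budget) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```

You could swap this into `addMessage` in place of the `slice(-20)` line, keeping the newest messages that fit rather than a fixed count.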
## Step 4: Smart Model Selection
Different queries need different models:
````typescript
// src/model-selector.ts
type QueryType = 'simple' | 'complex' | 'code' | 'unknown';

function classifyQuery(message: string): QueryType {
  const lower = message.toLowerCase();

  // Simple queries - use fast/cheap model
  if (
    lower.includes('hours') ||
    lower.includes('contact') ||
    lower.includes('price') ||
    message.length < 50
  ) {
    return 'simple';
  }

  // Code-related - use code-specialized model
  if (
    lower.includes('code') ||
    lower.includes('api') ||
    lower.includes('integration') ||
    message.includes('```') // message contains a fenced code block
  ) {
    return 'code';
  }

  // Complex queries - use powerful model
  if (
    lower.includes('explain') ||
    lower.includes('compare') ||
    lower.includes('analyze') ||
    message.length > 200
  ) {
    return 'complex';
  }

  return 'unknown';
}

function selectModel(queryType: QueryType): string {
  switch (queryType) {
    case 'simple':
      return 'gpt-4o-mini'; // Fast and cheap
    case 'code':
      return 'claude-3-5-sonnet'; // Great for code
    case 'complex':
      return 'gpt-4o'; // Best reasoning
    default:
      return 'gpt-4o-mini'; // Default to cost-effective
  }
}

export function getOptimalModel(message: string): string {
  const queryType = classifyQuery(message);
  return selectModel(queryType);
}
````
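Keyword rules like these are easy to get subtly wrong (branch order matters: the "simple" check wins even when a query also mentions the API), so it pays to sanity-check them against representative queries. The snippet below inlines a trimmed copy of `classifyQuery`'s rules so it runs standalone; in the real project you would import `classifyQuery` from `./model-selector` instead.

```typescript
// Standalone sanity check for the routing rules (trimmed copy of classifyQuery).

type Tier = 'simple' | 'complex' | 'code' | 'unknown';

function classify(message: string): Tier {
  const lower = message.toLowerCase();
  if (lower.includes('hours') || lower.includes('price') || message.length < 50) {
    return 'simple';
  }
  if (lower.includes('code') || lower.includes('api') || lower.includes('integration')) {
    return 'code';
  }
  if (lower.includes('explain') || lower.includes('compare') || message.length > 200) {
    return 'complex';
  }
  return 'unknown';
}

const cases: Array<[string, Tier]> = [
  ['What are your business hours?', 'simple'],
  ['How do I authenticate to your API from my backend service?', 'code'],
  ['Please explain the differences between your Pro and Enterprise plans', 'complex'],
];

for (const [query, expected] of cases) {
  const actual = classify(query);
  console.log(`${actual === expected ? 'PASS' : 'FAIL'}: "${query}" -> ${actual}`);
}
```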
## Step 5: Express API Server
Put it all together with an API:
```typescript
// src/server.ts
import express from 'express';
import { chatWithFallback } from './chatbot-with-fallback';
import { conversationStore } from './conversation-store';
import { getOptimalModel } from './model-selector';

const app = express();
app.use(express.json());

// Create a new conversation
app.post('/conversations', (req, res) => {
  const id = conversationStore.create();
  res.json({ conversationId: id });
});

// Send a message
app.post('/conversations/:id/messages', async (req, res) => {
  const { id } = req.params;
  const { message } = req.body;

  try {
    // Add the user message
    conversationStore.addMessage(id, {
      role: 'user',
      content: message,
    });

    // Get the conversation history
    const messages = conversationStore.getMessages(id);

    // Select the optimal model based on the query (in a full implementation,
    // pass this to chatWithFallback as the first model to try)
    const model = getOptimalModel(message);

    // Get a response with fallback
    const { response, model: usedModel } = await chatWithFallback(messages);

    // Add the assistant message
    conversationStore.addMessage(id, {
      role: 'assistant',
      content: response,
    });

    res.json({
      response,
      model: usedModel,
    });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Chatbot server running on port ${PORT}`);
});
```
## Step 6: Testing
Test your chatbot:
```bash
# Create a conversation
curl -X POST http://localhost:3000/conversations

# Send a message
curl -X POST http://localhost:3000/conversations/{id}/messages \
  -H "Content-Type: application/json" \
  -d '{"message": "What are your business hours?"}'
```
## Cost Analysis
Here's what this chatbot costs at 10,000 messages/day:
| Query Type | Model | Avg Tokens | Cost/Message | Daily Cost |
|---|---|---|---|---|
| Simple (60%) | GPT-4o-mini | 300 | $0.0003 | $1.80 |
| Complex (25%) | GPT-4o | 500 | $0.0075 | $18.75 |
| Code (15%) | Claude 3.5 Sonnet | 600 | $0.012 | $18.00 |
Total: ~$38.55/day for 10,000 messages
Without smart routing (all GPT-4o): ~$75/day — 48% savings!
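If you want to recompute these figures for your own traffic mix, the table reduces to a weighted sum. The sketch below reproduces the numbers above; the per-message prices are taken from the table and are illustrative, not live COZHUB pricing.

```typescript
// Daily cost = messages/day * sum over tiers of (traffic share * cost per message).
// Prices below mirror the table above and are illustrative only.

interface TierCost {
  share: number;          // fraction of traffic routed to this tier
  costPerMessage: number; // USD per message for this tier's model
}

export function dailyCost(messagesPerDay: number, tiers: TierCost[]): number {
  return tiers.reduce(
    (sum, t) => sum + messagesPerDay * t.share * t.costPerMessage,
    0
  );
}

const mix: TierCost[] = [
  { share: 0.6, costPerMessage: 0.0003 },  // simple -> GPT-4o-mini
  { share: 0.25, costPerMessage: 0.0075 }, // complex -> GPT-4o
  { share: 0.15, costPerMessage: 0.012 },  // code -> Claude 3.5 Sonnet
];

// dailyCost(10_000, mix) ≈ 38.55, vs. an all-GPT-4o baseline of 10_000 * 0.0075 = 75
```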
## Production Checklist
Before going live:
- [ ] Set up error monitoring (Sentry, DataDog)
- [ ] Add rate limiting per user
- [ ] Implement request logging
- [ ] Set up usage alerts in COZHUB dashboard
- [ ] Test fallback behavior
- [ ] Add input validation and sanitization
- [ ] Consider adding response caching
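For the caching item, here is a minimal sketch assuming FAQ-style repeated queries. `ResponseCache` is a hypothetical helper written for this post, not a COZHUB feature; a production setup would likely use Redis and include conversation context in the key.

```typescript
// Tiny in-memory TTL cache keyed by the normalized user message.
// Hypothetical sketch: best suited to stateless FAQ-style queries.

interface CacheEntry {
  response: string;
  expiresAt: number;
}

export class ResponseCache {
  private entries = new Map<string, CacheEntry>();

  constructor(private ttlMs = 5 * 60 * 1000) {}

  private key(message: string): string {
    return message.trim().toLowerCase();
  }

  get(message: string): string | undefined {
    const entry = this.entries.get(this.key(message));
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(this.key(message)); // evict stale entry
      return undefined;
    }
    return entry.response;
  }

  set(message: string, response: string): void {
    this.entries.set(this.key(message), {
      response,
      expiresAt: Date.now() + this.ttlMs,
    });
  }
}
```

In the server, check the cache before calling `chatWithFallback` and populate it after a successful response; a cache hit skips the model call entirely.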
## Conclusion
You now have a production-ready chatbot that:
- ✅ Uses the right model for each query
- ✅ Handles failures gracefully
- ✅ Maintains conversation context
- ✅ Optimizes costs automatically
The complete code is available on GitHub.
Questions? Join our Discord community!