# Building Production-Ready Chatbots with COZHUB
Step-by-step tutorial on building a customer support chatbot with fallback handling and multiple models.
Developer Relations
## Introduction
Building a chatbot is easy. Building a production-ready chatbot that handles edge cases, optimizes costs, and provides great user experience? That's harder.
In this tutorial, we'll build a customer support chatbot that:
- Uses multiple AI models intelligently
- Handles model failures gracefully
- Maintains conversation context
- Optimizes for cost and quality
## Prerequisites
- Node.js 18+
- COZHUB account with API key
- Basic TypeScript knowledge
## Project Setup

```bash
mkdir cozhub-chatbot
cd cozhub-chatbot
npm init -y
npm install openai express dotenv
npm install -D typescript @types/node @types/express
```
Create a tsconfig.json:
```json
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "commonjs",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true
  }
}
```
## Step 1: Basic Chatbot
Let's start with a simple chatbot:
```typescript
// src/chatbot.ts
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://api.cozhub.ai/v1',
  apiKey: process.env.COZHUB_API_KEY,
});

interface Message {
  role: 'user' | 'system' | 'assistant';
  content: string;
}

const SYSTEM_PROMPT = `You are a helpful customer support agent for TechCorp. You help users with:
- Account issues
- Billing questions
- Technical support
- Product information

Be friendly, concise, and helpful. If you don't know something, say so.`;

export async function chat(
  messages: Message[],
  model = 'gpt-4o-mini'
): Promise<string> {
  const response = await client.chat.completions.create({
    model,
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      ...messages,
    ],
    temperature: 0.7,
    max_tokens: 500,
  });
  return response.choices[0].message.content || '';
}
```
## Step 2: Add Model Fallback
What happens when GPT-4o is down? Let's add automatic fallback:
```typescript
// src/chatbot-with-fallback.ts
const MODELS = [
  { id: 'gpt-4o-mini', priority: 1 },
  { id: 'claude-3-5-haiku', priority: 2 },
  { id: 'gemini-2.0-flash', priority: 3 },
];

export async function chatWithFallback(
  messages: Message[]
): Promise<{ response: string; model: string }> {
  for (const model of MODELS) {
    try {
      const response = await client.chat.completions.create({
        model: model.id,
        messages: [
          { role: 'system', content: SYSTEM_PROMPT },
          ...messages,
        ],
        temperature: 0.7,
        max_tokens: 500,
      });
      return {
        response: response.choices[0].message.content || '',
        model: model.id,
      };
    } catch (error: any) {
      console.error(`Model ${model.id} failed: ${error.message}`);
      // If it's a rate limit, wait before trying the next model
      if (error.status === 429) {
        await new Promise(resolve => setTimeout(resolve, 1000));
      }
      continue;
    }
  }
  throw new Error('All models failed');
}
```
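The try/next-model loop above generalizes beyond chat completions. Here is a hedged sketch of the same pattern as a reusable helper; `tryInOrder` and `Provider` are names invented for this example, not part of the COZHUB or OpenAI SDKs.

```typescript
// Generic fallback helper: try async providers in order until one succeeds.
// Hypothetical sketch -- not part of any SDK; adapt to your own error handling.

type Provider<T> = {
  id: string;
  call: () => Promise<T>;
};

export async function tryInOrder<T>(
  providers: Provider<T>[],
  onError?: (id: string, err: unknown) => void
): Promise<{ result: T; id: string }> {
  let lastError: unknown = undefined;
  for (const p of providers) {
    try {
      return { result: await p.call(), id: p.id };
    } catch (err) {
      lastError = err;
      onError?.(p.id, err); // e.g. log or record metrics before moving on
    }
  }
  throw lastError ?? new Error('No providers configured');
}
```

With this helper, `chatWithFallback` reduces to mapping `MODELS` into providers that each call `client.chat.completions.create` with their own model id.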
## Step 3: Conversation Memory
Production chatbots need to remember conversation history:
```typescript
// src/conversation-store.ts
interface Conversation {
  id: string;
  messages: Message[];
  createdAt: Date;
  lastMessageAt: Date;
}

class ConversationStore {
  private conversations = new Map<string, Conversation>();

  create(): string {
    const id = crypto.randomUUID();
    this.conversations.set(id, {
      id,
      messages: [],
      createdAt: new Date(),
      lastMessageAt: new Date(),
    });
    return id;
  }

  addMessage(conversationId: string, message: Message): void {
    const conv = this.conversations.get(conversationId);
    if (!conv) throw new Error('Conversation not found');
    conv.messages.push(message);
    conv.lastMessageAt = new Date();
    // Keep only the last 20 messages to manage context length
    if (conv.messages.length > 20) {
      conv.messages = conv.messages.slice(-20);
    }
  }

  getMessages(conversationId: string): Message[] {
    const conv = this.conversations.get(conversationId);
    return conv?.messages || [];
  }

  // Clean up old conversations (call periodically)
  cleanup(maxAgeMs = 24 * 60 * 60 * 1000): void {
    const now = Date.now();
    for (const [id, conv] of this.conversations) {
      if (now - conv.lastMessageAt.getTime() > maxAgeMs) {
        this.conversations.delete(id);
      }
    }
  }
}

export const conversationStore = new ConversationStore();
```
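The fixed 20-message cap is simple but blind to message size: twenty long messages can still blow the context window. An alternative, sketched below (not from the COZHUB docs), trims by an approximate token budget instead. The chars/4 ratio is a rough heuristic; a real tokenizer such as tiktoken would be more accurate.

```typescript
// Alternative trimming strategy: cap history by an approximate token budget
// instead of a fixed message count. chars/4 is a rough heuristic only.

interface Msg {
  role: 'user' | 'system' | 'assistant';
  content: string;
}

const approxTokens = (text: string): number => Math.ceil(text.length / 4);

export function trimToTokenBudget(messages: Msg[], budget = 3000): Msg[] {
  const kept: Msg[] = [];
  let used = 0;
  // Walk from newest to oldest, keeping messages while they fit the budget.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = approxTokens(messages[i].content);
    if (used + cost > budget) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```

You could swap this into `addMessage` in place of the `slice(-20)` line, keeping the newest messages that fit rather than a fixed count.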
## Step 4: Smart Model Selection
Different queries need different models:
````typescript
// src/model-selector.ts
type QueryType = 'simple' | 'complex' | 'code' | 'unknown';

function classifyQuery(message: string): QueryType {
  const lower = message.toLowerCase();

  // Simple queries - use fast/cheap model
  if (
    lower.includes('hours') ||
    lower.includes('contact') ||
    lower.includes('price') ||
    message.length < 50
  ) {
    return 'simple';
  }

  // Code-related - use code-specialized model
  if (
    lower.includes('code') ||
    lower.includes('api') ||
    lower.includes('integration') ||
    message.includes('```') // message contains a fenced code block
  ) {
    return 'code';
  }

  // Complex queries - use powerful model
  if (
    lower.includes('explain') ||
    lower.includes('compare') ||
    lower.includes('analyze') ||
    message.length > 200
  ) {
    return 'complex';
  }

  return 'unknown';
}

function selectModel(queryType: QueryType): string {
  switch (queryType) {
    case 'simple':
      return 'gpt-4o-mini'; // Fast and cheap
    case 'code':
      return 'claude-3-5-sonnet'; // Great for code
    case 'complex':
      return 'gpt-4o'; // Best reasoning
    default:
      return 'gpt-4o-mini'; // Default to cost-effective
  }
}

export function getOptimalModel(message: string): string {
  const queryType = classifyQuery(message);
  return selectModel(queryType);
}
````
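Keyword rules like these are easy to get subtly wrong (branch order matters: the "simple" check wins even when a query also mentions the API), so it pays to sanity-check them against representative queries. The snippet below inlines a trimmed copy of `classifyQuery`'s rules so it runs standalone; in the real project you would import `classifyQuery` from `./model-selector` instead.

```typescript
// Standalone sanity check for the routing rules (trimmed copy of classifyQuery).

type Tier = 'simple' | 'complex' | 'code' | 'unknown';

function classify(message: string): Tier {
  const lower = message.toLowerCase();
  if (lower.includes('hours') || lower.includes('price') || message.length < 50) {
    return 'simple';
  }
  if (lower.includes('code') || lower.includes('api') || lower.includes('integration')) {
    return 'code';
  }
  if (lower.includes('explain') || lower.includes('compare') || message.length > 200) {
    return 'complex';
  }
  return 'unknown';
}

const cases: Array<[string, Tier]> = [
  ['What are your business hours?', 'simple'],
  ['How do I authenticate to your API from my backend service?', 'code'],
  ['Please explain the differences between your Pro and Enterprise plans', 'complex'],
];

for (const [query, expected] of cases) {
  const actual = classify(query);
  console.log(`${actual === expected ? 'PASS' : 'FAIL'}: "${query}" -> ${actual}`);
}
```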
## Step 5: Express API Server
Put it all together with an API:
```typescript
// src/server.ts
import express from 'express';
import { chatWithFallback } from './chatbot-with-fallback';
import { conversationStore } from './conversation-store';
import { getOptimalModel } from './model-selector';

const app = express();
app.use(express.json());

// Create a new conversation
app.post('/conversations', (req, res) => {
  const id = conversationStore.create();
  res.json({ conversationId: id });
});

// Send a message
app.post('/conversations/:id/messages', async (req, res) => {
  const { id } = req.params;
  const { message } = req.body;

  try {
    // Add the user message
    conversationStore.addMessage(id, {
      role: 'user',
      content: message,
    });

    // Get the conversation history
    const messages = conversationStore.getMessages(id);

    // Select the optimal model based on the query (in a full implementation,
    // pass this to chatWithFallback as the first model to try)
    const model = getOptimalModel(message);

    // Get a response with fallback
    const { response, model: usedModel } = await chatWithFallback(messages);

    // Add the assistant message
    conversationStore.addMessage(id, {
      role: 'assistant',
      content: response,
    });

    res.json({
      response,
      model: usedModel,
    });
  } catch (error: any) {
    res.status(500).json({ error: error.message });
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Chatbot server running on port ${PORT}`);
});
```
## Step 6: Testing
Test your chatbot:
```bash
# Create a conversation
curl -X POST http://localhost:3000/conversations

# Send a message
curl -X POST http://localhost:3000/conversations/{id}/messages \
  -H "Content-Type: application/json" \
  -d '{"message": "What are your business hours?"}'
```
## Cost Analysis
Here's what this chatbot costs at 10,000 messages/day:
| Query Type | Model | Avg Tokens | Cost/Message | Daily Cost |
|---|---|---|---|---|
| Simple (60%) | GPT-4o-mini | 300 | $0.0003 | $1.80 |
| Complex (25%) | GPT-4o | 500 | $0.0075 | $18.75 |
| Code (15%) | Claude 3.5 Sonnet | 600 | $0.012 | $18.00 |
Total: ~$38.55/day for 10,000 messages
Without smart routing (all GPT-4o): ~$75/day — 48% savings!
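If you want to recompute these figures for your own traffic mix, the table reduces to a weighted sum. The sketch below reproduces the numbers above; the per-message prices are taken from the table and are illustrative, not live COZHUB pricing.

```typescript
// Daily cost = messages/day * sum over tiers of (traffic share * cost per message).
// Prices below mirror the table above and are illustrative only.

interface TierCost {
  share: number;          // fraction of traffic routed to this tier
  costPerMessage: number; // USD per message for this tier's model
}

export function dailyCost(messagesPerDay: number, tiers: TierCost[]): number {
  return tiers.reduce(
    (sum, t) => sum + messagesPerDay * t.share * t.costPerMessage,
    0
  );
}

const mix: TierCost[] = [
  { share: 0.6, costPerMessage: 0.0003 },  // simple -> GPT-4o-mini
  { share: 0.25, costPerMessage: 0.0075 }, // complex -> GPT-4o
  { share: 0.15, costPerMessage: 0.012 },  // code -> Claude 3.5 Sonnet
];

// dailyCost(10_000, mix) ≈ 38.55, vs. an all-GPT-4o baseline of 10_000 * 0.0075 = 75
```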
## Production Checklist
Before going live:
- [ ] Set up error monitoring (Sentry, DataDog)
- [ ] Add rate limiting per user
- [ ] Implement request logging
- [ ] Set up usage alerts in COZHUB dashboard
- [ ] Test fallback behavior
- [ ] Add input validation and sanitization
- [ ] Consider adding response caching
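For the caching item, here is a minimal sketch assuming FAQ-style repeated queries. `ResponseCache` is a hypothetical helper written for this post, not a COZHUB feature; a production setup would likely use Redis and include conversation context in the key.

```typescript
// Tiny in-memory TTL cache keyed by the normalized user message.
// Hypothetical sketch: best suited to stateless FAQ-style queries.

interface CacheEntry {
  response: string;
  expiresAt: number;
}

export class ResponseCache {
  private entries = new Map<string, CacheEntry>();

  constructor(private ttlMs = 5 * 60 * 1000) {}

  private key(message: string): string {
    return message.trim().toLowerCase();
  }

  get(message: string): string | undefined {
    const entry = this.entries.get(this.key(message));
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(this.key(message)); // evict stale entry
      return undefined;
    }
    return entry.response;
  }

  set(message: string, response: string): void {
    this.entries.set(this.key(message), {
      response,
      expiresAt: Date.now() + this.ttlMs,
    });
  }
}
```

In the server, check the cache before calling `chatWithFallback` and populate it after a successful response; a cache hit skips the model call entirely.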
## Conclusion
You now have a production-ready chatbot that:
- ✅ Uses the right model for each query
- ✅ Handles failures gracefully
- ✅ Maintains conversation context
- ✅ Optimizes costs automatically
The complete code is available on GitHub.
Questions? Join our Discord community!