Capstone Project 6

AI Chatbot

Build an intelligent conversational AI system from scratch. You will implement intent recognition using NLU models, slot filling for entity extraction, context management for multi-turn conversations, and deploy your chatbot as a REST API that can be integrated with any frontend.

18-25 hours
Advanced
800 Points
What You Will Build
  • Intent classification model
  • Named entity recognition (NER)
  • Dialogue state tracker
  • Response generation engine
  • REST API with Flask/FastAPI
01

Project Overview

Conversational AI systems are transforming how users interact with applications. In this project, you will build a complete chatbot pipeline that understands user intents, extracts relevant entities (slots), maintains conversation context across multiple turns, and responds appropriately. Target: Achieve over 90% intent classification accuracy and over 85% slot filling F1-score.

Skills Applied: This project tests your understanding of NLP/NLU fundamentals, sequence labeling, transformer models, state management, and API development with Python.
Intent Recognition

Classify user messages into predefined intent categories

Slot Filling

Extract entities like dates, names, locations from text

Context Management

Track conversation state across multiple turns

REST API

Deploy chatbot as a scalable API service

Learning Objectives

Technical Skills
  • Build intent classification with transformers
  • Implement NER for slot extraction
  • Design dialogue state tracking system
  • Create response templates with slot substitution
  • Deploy REST API with Flask or FastAPI
NLU Concepts
  • Understand intent-slot architecture
  • Master BIO tagging for entity extraction
  • Handle multi-turn conversation flows
  • Manage fallback and error handling
  • Evaluate NLU performance metrics
02

Problem Scenario

TechAssist Solutions

You have been hired as an NLU Engineer at TechAssist Solutions, a company building AI-powered customer support systems. The company needs a chatbot that can handle common customer queries for an e-commerce platform, including order tracking, product inquiries, returns, and FAQs. The bot must understand context and keep conversations coherent across turns.

"Our support team is overwhelmed with repetitive queries. We need a chatbot that can understand what customers want, extract order numbers and product names, remember context during conversations, and provide helpful responses. It should also know when to escalate to a human agent."

Michael Chen, VP of Customer Experience, TechAssist Solutions

Technical Challenges to Solve

Intent Understanding
  • How to classify diverse user queries?
  • Handling ambiguous or multi-intent messages
  • Confidence thresholds for fallback
  • Out-of-scope detection
Entity Extraction
  • Extracting order IDs, dates, product names
  • Handling different entity formats
  • BIO tagging for sequence labeling
  • Dealing with entity variations
Dialogue Management
  • Tracking conversation state
  • Handling slot confirmation and correction
  • Managing multi-turn flows
  • Context carryover between turns
API Integration
  • RESTful endpoint design
  • Session management
  • Request/response formats
  • Error handling and logging
Pro Tip: Start with a simple rule-based prototype to understand the conversation flows, then progressively add ML components. Always test with real user queries to identify edge cases.
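A rule-based prototype of the kind the tip describes can be as small as a keyword lookup. The sketch below is only an illustration: the keyword lists and the function name classify_rule_based are assumptions, not part of the project specification, and a real prototype would be built from actual user queries.

```python
import re

# Hypothetical keyword lists for a few of the example intents.
INTENT_KEYWORDS = {
    "order_cancel": ["cancel"],
    "return_request": ["return", "refund"],
    "order_status": ["track", "status", "where is my order"],
    "human_handoff": ["agent", "human", "representative"],
    "greeting": ["hello", "hi", "hey"],
    "goodbye": ["bye", "goodbye"],
}


def classify_rule_based(message: str) -> str:
    """Return the first intent whose keyword appears in the message."""
    text = message.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        for keyword in keywords:
            # Word boundaries stop "hi" from matching inside "shipping".
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return intent
    # Nothing matched: treat the message as out of scope.
    return "out_of_scope"
```

Running real queries through a prototype like this quickly shows which phrasings the keywords miss, which is useful input when collecting training data for the ML models.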
03

Dataset Resources

You can use these datasets for training your intent classification and slot filling models, or create your own custom dataset for the e-commerce domain.

Recommended Datasets

Choose from these publicly available intent recognition and NLU datasets:

Intent Categories (Example)

Your chatbot should recognize 10-15 intents, such as:

  • order_status - Check order tracking
  • order_cancel - Cancel an order
  • return_request - Request a return
  • product_inquiry - Ask about products
  • payment_issue - Payment problems
  • shipping_info - Delivery questions
  • greeting - Hello, hi, etc.
  • goodbye - Bye, thanks, etc.
  • human_handoff - Talk to agent
  • out_of_scope - Unrelated queries
Slot Types (Entities)

Extract these entities from user messages:

  • order_id - Order numbers (e.g., #12345)
  • product_name - Product references
  • date - Dates and time references
  • email - Email addresses
  • phone - Phone numbers
  • amount - Monetary values
  • category - Product categories
  • quantity - Number of items
Custom Dataset: You may create your own training data with at least 50 examples per intent, ideally 100. Include variations in phrasing, typos, and different slot values. Document your data collection process in the README.
04

Project Requirements

Your project must include all of the following components. This is a comprehensive NLU project covering the full chatbot development pipeline.

1. Data Preparation

Prepare training data:

  • Collect or create intent-labeled utterances
  • Annotate entities with BIO tagging
  • Split into train/validation/test sets
  • Handle class imbalance if present
  • Document data statistics and distribution
Deliverable: Data preparation notebook with annotated dataset and statistics.
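One way to handle the split and the distribution statistics is sketched below using only the standard library. The 80/10/10 ratios, the seed, and the function names are assumptions chosen for illustration; the project does not prescribe them.

```python
import random
from collections import Counter, defaultdict


def stratified_split(examples, train=0.8, val=0.1, seed=42):
    """Split (text, intent) pairs so each intent keeps roughly the same share."""
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for items in by_intent.values():
        rng.shuffle(items)
        n_train = int(len(items) * train)
        n_val = int(len(items) * val)
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        # Whatever remains after train and val goes to the test set.
        splits["test"] += items[n_train + n_val:]
    return splits


def intent_distribution(examples):
    """Count examples per intent to spot class imbalance."""
    return Counter(intent for _, intent in examples)
```

Splitting per intent rather than over the whole dataset keeps rare intents represented in every split, which matters when classes are imbalanced.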
2. Intent Classification

Build intent recognition model:

  • Implement text classification model (BERT/DistilBERT)
  • Train and evaluate on your dataset
  • Add confidence threshold for fallback
  • Handle out-of-scope detection
  • Achieve over 90% accuracy on test set
Deliverable: Intent classification notebook with trained model and evaluation.
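The confidence threshold requirement can be illustrated independently of the model. The sketch below assumes the classifier returns one raw score (logit) per intent; the threshold of 0.7 and the function names are illustrative assumptions, and the right threshold should be tuned on your validation set.

```python
import math


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_fallback(logits, labels, threshold=0.7):
    """Return (intent, confidence); fall back when the model is unsure."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    if confidence < threshold:
        # Low confidence is treated as an out-of-scope query.
        return "out_of_scope", confidence
    return labels[best], confidence
```

A thresholded softmax is only one simple approach to out-of-scope detection; training an explicit out_of_scope class is a common complement.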
3. Slot Filling (NER)

Build entity extraction model:

  • Implement sequence labeling with BIO scheme
  • Use token classification with transformers
  • Handle entity normalization
  • Evaluate with precision, recall, F1
  • Achieve over 85% F1-score
Deliverable: NER notebook with trained model and entity extraction demo.
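Entity normalization means mapping the different surface forms of a slot value to one canonical form. The sketch below assumes order IDs are purely numeric and written as "#12345", as in the examples in this brief; both function names are hypothetical.

```python
import re


def normalize_order_id(raw: str):
    """Map variants like 'order no. 12345' or '# 12345' to '#12345'.

    Assumes order IDs consist of digits only.
    """
    digits = re.sub(r"\D", "", raw)
    return f"#{digits}" if digits else None


def normalize_amount(raw: str):
    """Map '$1,299.50' or '1299.5 dollars' to a float, or None if no number."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return float(match.group().replace(",", ""))
```

Normalizing after extraction keeps the NER model simple: it only has to find the span, and deterministic rules decide the stored value.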
4. Context Management

Implement dialogue state tracking:

  • Design conversation state schema
  • Track slots across multiple turns
  • Handle slot confirmation and updates
  • Implement conversation history
  • Support context reset
Deliverable: Dialogue manager module with state tracking logic.
5. Response Generation

Create response system:

  • Design response templates for each intent
  • Implement slot substitution in templates
  • Add response variations for naturalness
  • Handle missing slot prompts
  • Implement fallback responses
Deliverable: Response generation module with templates.
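A minimal sketch of template-based generation follows, covering slot substitution, response variations, missing-slot prompts, and a fallback. The template wording, the SLOT_PROMPTS table, and the function names are all illustrative assumptions.

```python
import random
import string

# Hypothetical templates and prompts; wording and slot names are illustrative.
TEMPLATES = {
    "order_status": [
        "Order {order_id} is on its way.",
        "Good news: order {order_id} has shipped.",
    ],
    "greeting": ["Hello! How can I help you today?"],
}
SLOT_PROMPTS = {"order_id": "What's your order number?"}
FALLBACK = "Sorry, I didn't understand that. Could you rephrase?"


def required_slots(template: str):
    """List the {placeholder} names used in a template."""
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]


def generate_response(intent, slots, rng=random):
    """Pick a template for the intent and fill it, or ask for a missing slot."""
    templates = TEMPLATES.get(intent)
    if not templates:
        return FALLBACK
    template = rng.choice(templates)  # random choice adds variation
    for name in required_slots(template):
        if name not in slots:
            # Ask for the first missing slot instead of answering.
            return SLOT_PROMPTS.get(name, f"Could you provide the {name}?")
    return template.format(**slots)
```

Deriving the required slots from the placeholders keeps templates and slot requirements in one place, so adding a template cannot silently skip a prompt.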
6. REST API Deployment

Deploy as API service:

  • Create Flask or FastAPI application
  • Implement /chat endpoint
  • Add session management
  • Include API documentation (Swagger)
  • Add logging and error handling
Deliverable: API code with documentation and usage examples.
05

Intent Recognition

Intent recognition is the core NLU component that classifies user messages into predefined categories. Modern approaches use transformer-based models fine-tuned on domain-specific data.

Model Architecture Options
Model | Parameters | Pros | Cons
DistilBERT | 66M | Fast, good accuracy, lightweight | Slightly lower accuracy than BERT
BERT-base | 110M | Strong baseline, well-documented | Slower inference
RoBERTa | 125M | Often better than BERT | More compute needed
ALBERT | 12M | Very lightweight | May need more tuning
Evaluation Metrics
Metric | What It Measures | Target
Accuracy | Overall correct predictions | > 90%
Precision | Per-class precision | > 85%
Recall | Per-class recall | > 85%
F1-Score | Macro F1 | > 88%
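To make the metric definitions concrete, here is a small standard-library sketch that computes accuracy and macro-averaged precision, recall, and F1 from true and predicted intent labels. The function name intent_metrics is an assumption; in practice a library such as scikit-learn provides equivalent reports.

```python
def intent_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        # Macro averaging weights every intent equally, regardless of size.
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }
```

Macro averaging is the stricter choice when intents are imbalanced, because a rare intent that the model handles badly pulls the score down as much as a frequent one.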
06

Slot Filling (NER)

Slot filling extracts relevant entities from user messages using sequence labeling. The BIO (Beginning-Inside-Outside) tagging scheme is the standard approach.

BIO Tagging Scheme
Tag | Meaning | Example
B-entity | Beginning of entity | "B-order_id" for the first token of an order ID
I-entity | Inside (continuation) | "I-product_name" for the second and later tokens of a product name
O | Outside (no entity) | Regular words not part of any entity
Example Annotation

User Input: "I want to cancel order #12345 placed on Monday"

Token | I | want | to | cancel | order | #12345 | placed | on | Monday
Tag | O | O | O | O | O | B-order_id | O | O | B-date
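Once a model has predicted one tag per token, the tags have to be grouped back into slot values. The sketch below shows one way to decode a BIO sequence, using the annotation above as input; the function name bio_to_slots is an assumption.

```python
def bio_to_slots(tokens, tags):
    """Group B-/I- tagged tokens into (slot_type, text) pairs."""
    slots = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity; close any open one first.
            if current_type:
                slots.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current_type:
                slots.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type:
        slots.append((current_type, " ".join(current_tokens)))
    return slots


tokens = "I want to cancel order #12345 placed on Monday".split()
tags = ["O", "O", "O", "O", "O", "B-order_id", "O", "O", "B-date"]
print(bio_to_slots(tokens, tags))
# [('order_id', '#12345'), ('date', 'Monday')]
```

Note that transformer tokenizers usually split words into sub-word pieces, so in a real pipeline the predicted tags must first be aligned back to whole words before decoding.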
07

Context Management

Multi-turn conversations require tracking dialogue state across user turns. The dialogue manager maintains context, filled slots, and conversation history.

Dialogue State Components
Current Intent
  • Active intent being processed
  • Confidence score
  • Intent history
Filled Slots
  • Extracted entity values
  • Slot confirmation status
  • Required vs optional slots
Conversation History
  • Previous user messages
  • Bot responses
  • Turn counter
Conversation Flow Example
Turn | User | Bot | State Update
1 | "I want to check my order" | "Sure! What's your order number?" | intent: order_status, slots: {}
2 | "It's #12345" | "Order #12345 is out for delivery!" | slots: {order_id: "#12345"}
3 | "When will it arrive?" | "Expected delivery: Today by 5 PM" | intent: shipping_info (context carried)
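The flow above can be reproduced with a very small state object. This is a sketch under stated assumptions: the DialogueState class and its fields are hypothetical, and the NLU output for each turn is written by hand rather than produced by a model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogueState:
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)
    turn: int = 0

    def update(self, message, intent=None, slots=None):
        """Merge one turn of NLU output into the state."""
        self.turn += 1
        self.history.append(message)
        if intent is not None:
            # A new intent replaces the old one, but filled slots are kept
            # so follow-up questions can reuse them (context carryover).
            self.intent = intent
        self.slots.update(slots or {})

    def reset(self):
        """Clear everything, as a context reset endpoint would."""
        self.intent, self.turn = None, 0
        self.slots.clear()
        self.history.clear()


state = DialogueState()
state.update("I want to check my order", intent="order_status")
state.update("It's #12345", slots={"order_id": "#12345"})
state.update("When will it arrive?", intent="shipping_info")
# After turn 3 the intent has changed but order_id is still available.
```

Passing intent=None for turn 2 models the case where the message only supplies a slot value, so the active intent from turn 1 stays in place.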
08

REST API Integration

Deploy your chatbot as a REST API using Flask or FastAPI. The API should handle stateful conversations with session management.

API Endpoints
Endpoint | Method | Description
/chat | POST | Send message and get response
/session/new | POST | Create new conversation session
/session/{id} | GET | Get session state and history
/session/{id}/reset | POST | Reset conversation context
/health | GET | API health check
Request/Response Format
Request Body
{
  "session_id": "abc123",
  "message": "Check order #12345",
  "metadata": {
    "user_id": "user_001",
    "channel": "web"
  }
}
Response Body
{
  "session_id": "abc123",
  "response": "Order #12345 is shipped!",
  "intent": "order_status",
  "confidence": 0.95,
  "slots": {"order_id": "#12345"}
}
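The request and response bodies above can be produced by a framework-agnostic handler, which a Flask or FastAPI route for /chat would then wrap. The sketch below is an assumption-laden illustration: handle_chat, SESSIONS, and fake_nlu are hypothetical names, the in-memory session store would not survive a restart, and the reply text is a placeholder.

```python
import uuid

SESSIONS = {}  # in-memory store; a real deployment needs something persistent


def fake_nlu(message):
    """Stand-in for the real NLU pipeline: returns (intent, confidence, slots)."""
    return "order_status", 0.95, {"order_id": "#12345"}


def handle_chat(payload, nlu=fake_nlu):
    """Validate a /chat request body and return (status_code, response_body)."""
    message = payload.get("message")
    if not isinstance(message, str) or not message.strip():
        return 400, {"error": "Field 'message' must be a non-empty string."}

    # Reuse the caller's session or start a new one.
    session_id = payload.get("session_id") or uuid.uuid4().hex
    session = SESSIONS.setdefault(session_id, {"slots": {}, "history": []})

    intent, confidence, slots = nlu(message)
    session["slots"].update(slots)
    session["history"].append(message)

    return 200, {
        "session_id": session_id,
        "response": f"(reply for intent '{intent}')",  # placeholder text
        "intent": intent,
        "confidence": confidence,
        "slots": dict(session["slots"]),
    }
```

Keeping the logic in a plain function like this makes it straightforward to unit test without starting a server, which fits the tests/test_chatbot.py requirement.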
09

Submission Requirements

Create a public GitHub repository with the exact name shown below:

Required Repository Name
ai-chatbot
github.com/<your-username>/ai-chatbot
Required Project Structure
ai-chatbot/
├── notebooks/
│   ├── 01_data_preparation.ipynb       # Data loading and preprocessing
│   ├── 02_intent_classification.ipynb  # Intent model training
│   ├── 03_slot_filling.ipynb           # NER model training
│   └── 04_evaluation.ipynb             # Model evaluation
├── src/
│   ├── nlu/
│   │   ├── intent_classifier.py        # Intent classification module
│   │   └── slot_filler.py              # Entity extraction module
│   ├── dialogue/
│   │   ├── state_tracker.py            # Dialogue state management
│   │   └── response_generator.py       # Response templates
│   ├── api/
│   │   ├── app.py                      # Flask/FastAPI application
│   │   └── routes.py                   # API endpoints
│   └── utils.py                        # Helper functions
├── data/
│   ├── intents.json                    # Intent training data
│   └── entities.json                   # Entity annotations
├── models/
│   ├── intent_model/                   # Saved intent classifier
│   └── ner_model/                      # Saved NER model
├── tests/
│   └── test_chatbot.py                 # Unit tests
├── requirements.txt                    # Python dependencies
├── Dockerfile                          # Optional: containerization
└── README.md                           # Project documentation
README.md Required Sections
1. Project Overview
  • Your full name and submission date
  • Project description
  • Supported intents and entities
2. Model Performance
  • Intent classification accuracy
  • Slot filling F1-score
  • Confusion matrix
3. Architecture
  • NLU pipeline diagram
  • Model choices and reasoning
  • Dialogue flow design
4. API Documentation
  • Endpoint descriptions
  • Request/response examples
  • Error codes
5. Demo
  • Sample conversations
  • Screenshots or GIFs
  • Edge case handling
6. How to Run
  • Installation instructions
  • API startup commands
  • Testing instructions
10

Grading Rubric

Your project will be graded on the following criteria. Total: 800 points.

Criteria | Points | Description
Data Preparation | 75 | Quality dataset with proper annotations
Intent Classification | 175 | Over 90% accuracy, proper evaluation
Slot Filling | 150 | Over 85% F1, BIO tagging implementation
Context Management | 125 | Multi-turn tracking, state management
Response Generation | 75 | Templates, slot filling, fallbacks
REST API | 100 | Working endpoints, documentation
Documentation | 100 | README, code quality, reproducibility
Total | 800 |
Grading Levels
Level | Points | Description
Excellent | 720-800 | All components, excellent performance
Good | 600-719 | Meets requirements, good docs
Satisfactory | 480-599 | Basic implementation
Needs Work | < 480 | Missing components