Verta - An Intelligent Chatbot
Year: 2024
This project was completed as part of the course IE7374 - Machine Learning Operations.
Overview
The Verta Chatbot is an AI-driven system that answers questions about products by drawing on both product metadata and user reviews. Recognized as the Top Project in IE7374 - Machine Learning Operations at Northeastern University (Fall 2024), it combines multi-agent LLM workflows, cloud deployment, and MLOps automation to deliver scalable product insights.
Deployed as a serverless FastAPI service on Google Cloud Run, the chatbot integrates multiple specialized agents for efficient query handling:
A Metadata Agent summarizes product descriptions.
A Retriever Agent fetches contextually relevant information from a vector store containing user reviews.
This architecture allows the chatbot to answer a wide variety of product-related queries, blending factual product data with real-world customer perspectives.
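The hand-off between the two agents can be illustrated with a minimal sketch, assuming a simple sequential flow over a shared state dictionary. The names `metadata_agent`, `retriever_agent`, and `answer_query`, the sample data, and the keyword-overlap matching are all hypothetical stand-ins; the project itself coordinates its agents with LangGraph and LLM calls, which are not reproduced here.

```python
from typing import Dict, List, Set

# Hypothetical in-memory data standing in for the product metadata store
# and the review vector store.
PRODUCT_METADATA: Dict[str, str] = {
    "B001": "Wireless mouse with USB receiver, 1600 DPI, 12-month battery life.",
}
PRODUCT_REVIEWS: Dict[str, List[str]] = {
    "B001": [
        "Battery lasted almost a year for me.",
        "The scroll wheel feels cheap.",
        "Pairs instantly with the USB receiver.",
    ],
}


def _keywords(text: str) -> Set[str]:
    """Lowercase words longer than three characters, punctuation stripped."""
    words = (w.strip(".,?!") for w in text.lower().split())
    return {w for w in words if len(w) > 3}


def metadata_agent(state: dict) -> dict:
    """Attach the product description (a real agent would summarize it)."""
    description = PRODUCT_METADATA.get(state["product_id"], "")
    return {**state, "metadata_summary": description}


def retriever_agent(state: dict) -> dict:
    """Attach reviews sharing a keyword with the question.

    Assumption: keyword overlap replaces the embedding search that a
    real vector store would perform.
    """
    question_words = _keywords(state["question"])
    reviews = PRODUCT_REVIEWS.get(state["product_id"], [])
    relevant = [r for r in reviews if question_words & _keywords(r)]
    return {**state, "review_context": relevant}


def answer_query(product_id: str, question: str) -> dict:
    """Run both agents in sequence and return the combined context."""
    state = {"product_id": product_id, "question": question}
    for agent in (metadata_agent, retriever_agent):
        state = agent(state)
    return state


result = answer_query("B001", "How long does the battery last?")
print(result["metadata_summary"])
print(result["review_context"])
```

In a full system, the combined state would then be passed to an LLM to compose the final answer from both the factual description and the retrieved reviews.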
Approach
The solution was architected with a strong focus on scalability, observability, and automation:
Infrastructure & Storage: PostgreSQL on Google Cloud Platform (GCP) ensures reliable and scalable data storage.
CI/CD: Automated with GitHub Actions, streamlining deployment and integration workflows.
Bias & Evaluation: Uses an LLM-as-Judge setup to generate synthetic test questions, together with bias detection algorithms that evaluate fairness in responses.
Experiment Tracking & Monitoring:
MLflow captures experiment metrics and model metadata.
Langfuse traces user interactions and collects feedback for continuous improvement.
GCP Logs with Teams channel alerts ensure proactive system monitoring.
Data Orchestration: Managed with Apache Airflow for task scheduling and automation.
Vector Database: Uses a FAISS index to store product review embeddings and retrieve context by embedding similarity.
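Embedding-based context retrieval, as used for the product reviews, can be sketched in plain Python as a nearest-neighbor search by cosine similarity. This is only an illustration: the function names, the toy three-dimensional vectors, and the brute-force scan are assumptions, whereas a vector index such as FAISS performs the same kind of lookup efficiently over real, high-dimensional embeddings.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: List[float],
    index: List[Tuple[str, List[float]]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k reviews most similar to the query, best match first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-written "embeddings" (hypothetical); real ones would come
# from an embedding model.
REVIEW_INDEX = [
    ("Battery lasts for months.", [0.9, 0.1, 0.0]),
    ("Scroll wheel feels cheap.", [0.0, 0.2, 0.9]),
    ("Great battery, okay build quality.", [0.7, 0.1, 0.4]),
]

query_embedding = [1.0, 0.0, 0.0]  # stands in for an embedded battery question
matches = top_k(query_embedding, REVIEW_INDEX, k=2)
for text, score in matches:
    print(f"{score:.3f}  {text}")
```

The retrieved review texts would then serve as context for the answering model.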
Key Features
Multi-Agent Workflow: Managed via LangGraph, coordinating the actions of Metadata and Retriever agents.
LLM Integration: Combines GPT-4o mini, Llama 3.1 70B, and Llama 3.1 8B across four nodes to support hybrid reasoning and retrieval tasks.
Bias & Quality Control: Detects and mitigates response bias while improving accuracy through synthetic evaluation.
End-to-End Automation: CI/CD pipelines with MLflow and Airflow for complete lifecycle management.
Cloud-Native Architecture: Fully deployed on Google Cloud Run, optimized for cost-efficient scalability.
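One simple way to think about the bias check mentioned above is to compare judge-assigned quality scores across groups of queries and flag any group whose average drifts far from the overall average. The sketch below is a hypothetical illustration of that idea only: the function `flag_biased_groups`, the grouping by product category, the scores, and the tolerance value are all assumptions, not the project's actual bias detection algorithm.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def flag_biased_groups(
    scored: List[Tuple[str, float]],
    tolerance: float = 0.15,
) -> Dict[str, float]:
    """Return groups whose mean score differs from the overall mean
    by more than `tolerance`, mapped to their signed gap."""
    by_group: Dict[str, List[float]] = defaultdict(list)
    for group, score in scored:
        by_group[group].append(score)

    overall = mean(score for _, score in scored)
    gaps: Dict[str, float] = {}
    for group, scores in by_group.items():
        gap = mean(scores) - overall
        if abs(gap) > tolerance:
            gaps[group] = gap
    return gaps


# Hypothetical (category, judge score) pairs, as an LLM-as-Judge step
# might produce for synthetic test questions.
JUDGED = [
    ("electronics", 0.8), ("electronics", 0.8),
    ("toys", 0.8), ("toys", 0.7),
    ("books", 0.4), ("books", 0.5),
]

flagged = flag_biased_groups(JUDGED)
print(flagged)
```

A flagged group would point to a category where responses are systematically weaker and may need better retrieval coverage or prompt adjustments.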





