AI-Powered Agent-Based Support Assistant

AI · RAG · Python · FastAPI · Azure OpenAI · React · Docker · Terraform

AI-Powered Agent-Based Support Assistant — Defense Client (Switzerland)

Evolved a production chatbot from naive RAG to an autonomous AI agent over 14 months, doubling the correct-answer rate from 40% to 80% across successive architecture shifts.


1. Problem

New recruits must master a feature-rich HR & logistics portal. Sparse, outdated documentation generated many support tickets every month and slowed onboarding.

2. Solution Overview

I designed and led delivery of a micro-service chat assistant that evolved through five architectural phases. It consists of the following services:

Service | Purpose | Key Tech
Agent Core | ReAct agent loop: reasons about queries, autonomously retrieves data across multiple sources, synthesizes answers | FastAPI · Azure OpenAI (GPT-4o)
Retrieval | Vector search, structured action data, and database access, exposed as tools the agent invokes on demand | Azure AI Search · MCP
Ingestion | Parses → chunks → embeds content on each release | Azure AI Search · LangChain
Evaluation | LLM-based quality checks on every PR | RAGAS · pytest
Web UI | Responsive chat + feedback panel | React · Tailwind
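
The chunking stage of the ingestion service could look roughly like the sketch below. It is an illustration only: the `Chunk` type, the `chunk_text` helper, and the window sizes are assumptions made for this example, not the project's actual code, and the parse and embed stages are left out.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    index: int
    text: str


def chunk_text(doc_id: str, text: str, max_words: int = 50, overlap: int = 10) -> list[Chunk]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks: list[Chunk] = []
    for index, start in enumerate(range(0, len(words), step)):
        window = words[start:start + max_words]
        chunks.append(Chunk(doc_id, index, " ".join(window)))
        if start + max_words >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(120))
    for chunk in chunk_text("portal-guide", sample):
        print(chunk.index, chunk.text.split()[0], chunk.text.split()[-1])
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighbouring chunk; in a real pipeline each chunk would then be embedded and written to the search index.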

Architecture evolution: GPT-3.5 with chunks in system prompt (Q1 2024) → RAG pipeline with HyDE, reranking and retrieval tricks (mid 2024) → GPT-4 upgrade, biggest single quality jump (Q3 2024) → Lazy Graph RAG experiment (late 2025) → ReAct agent architecture replacing the fixed pipeline (early 2026).
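
To make the last step of that evolution concrete, here is a minimal sketch of a ReAct-style loop: the model alternates between choosing a tool and reading its observation until it emits a final answer. Everything named here is hypothetical (`search_docs`, `lookup_action`, `scripted_llm`, the `Action: tool[input]` text protocol); the scripted function stands in for the real chat model so the example runs offline.

```python
from typing import Callable


# Hypothetical tools: names and return values are placeholders.
def search_docs(query: str) -> str:
    return f"doc snippet about {query}"


def lookup_action(query: str) -> str:
    return f"structured action steps for {query}"


TOOLS: dict[str, Callable[[str], str]] = {
    "search_docs": search_docs,
    "lookup_action": lookup_action,
}


def scripted_llm(transcript: list[str]) -> str:
    """Stand-in for the chat model: calls one tool, then answers."""
    observations = [line for line in transcript if line.startswith("Observation:")]
    if not observations:
        question = transcript[0].removeprefix("Question: ")
        return f"Action: search_docs[{question}]"
    return "Final Answer: " + observations[-1].removeprefix("Observation: ")


def run_agent(question: str,
              llm: Callable[[list[str]], str] = scripted_llm,
              max_steps: int = 5) -> str:
    """Reason/act loop: ask the model, run the chosen tool, feed back the result."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript.append(reply)
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        if reply.startswith("Action:"):
            name, _, arg = reply.removeprefix("Action:").strip().partition("[")
            tool = TOOLS.get(name.strip())
            observation = tool(arg.rstrip("]")) if tool else f"unknown tool {name.strip()}"
            transcript.append(f"Observation: {observation}")
    return "No answer within the step budget."


if __name__ == "__main__":
    print(run_agent("leave request"))
```

The step budget is the main safety valve in this design: unlike a fixed pipeline, the loop has no predetermined length, so it needs an explicit upper bound on tool calls.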

Security: Private VNets, sealed storage; passed Swiss MoD pentest and audit. Azure OpenAI hosted on Swiss servers per client requirement.

3. Impact

  • Successful user adoption with 100+ messages per day, reducing first-level support tickets by 40%
  • Improved correct answer rate from 40% (GPT-3.5 PoC) to 80% with the agent architecture
  • Frequent rollouts to 12 000 users every 3 weeks with zero P1 incidents since Q1 2024
  • Agent architecture made elaborate RAG retrieval strategies (HyDE, reranking, Graph RAG) redundant — simpler code, better results

4. My Contributions

  • Architected the end-to-end system across four major architecture evolutions
  • Built the chat service, ingestion pipeline, evaluation framework, and ReAct agent core
  • Evaluated and discarded multiple RAG strategies (HyDE, reranking, Lazy Graph RAG) based on measured impact
  • Mentored junior engineers on RAG and agent patterns
  • Measured success metrics continuously via RAGAS

5. Key Challenges & Mitigations

Challenge | Mitigation & Result
Sparse & outdated docs | Curated 30 high-impact UI walkthroughs + generated 715 structured actions via UI-tree pipeline → +40 pp accuracy uplift
GPT-3.5 hallucinations & small context | Feature-toggle architecture → zero-downtime model upgrades as Azure released GPT-4 and GPT-4o
Diminishing returns from RAG tricks | Replaced fixed retrieval pipeline with ReAct agent loop → agent reasons about its own search strategy, adapting in real time
Measuring answer quality | RAGAS suite in CI → PR feedback < 5 min; evaluation survived every architecture change
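
A quality gate of the kind run in CI can be sketched as follows. This is not the RAGAS suite: to stay self-contained it swaps the LLM-based judge for a toy keyword check, and `EvalCase`, `is_correct`, `correct_answer_rate`, `passes_gate`, and the 0.8 threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    question: str
    expected_keywords: list[str]


def is_correct(answer: str, case: EvalCase) -> bool:
    """Toy judge: every expected keyword must appear in the answer."""
    lowered = answer.lower()
    return all(keyword.lower() in lowered for keyword in case.expected_keywords)


def correct_answer_rate(answers: list[str], cases: list[EvalCase]) -> float:
    """Fraction of cases whose answer the judge accepts."""
    if len(answers) != len(cases):
        raise ValueError("answers and cases must have the same length")
    if not cases:
        return 0.0
    hits = sum(is_correct(answer, case) for answer, case in zip(answers, cases))
    return hits / len(cases)


def passes_gate(rate: float, threshold: float = 0.8) -> bool:
    """Hypothetical CI gate: fail the PR if quality drops below the threshold."""
    return rate >= threshold


if __name__ == "__main__":
    cases = [EvalCase("How do I request leave?", ["leave", "form"])]
    answers = ["Open the Leave form in the portal."]
    rate = correct_answer_rate(answers, cases)
    print(rate, passes_gate(rate))
```

Because the gate depends only on question/answer pairs and not on how the answer was produced, a check like this can stay unchanged when the retrieval architecture behind it is replaced.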

6. Dev-/MLOps Diagram

[Figure: architecture diagram]