AI-Powered Agent-Based Support Assistant — Defense Client (Switzerland)

Evolved a production chatbot from naive RAG to an autonomous AI agent over 14 months, doubling the correct-answer rate from 40% to 80% along the way.

Date: Jan 2024 - present
Roles: Software Architect · AI/ML Engineer · Lead Software Engineer
Team size: 7
Customer: Confidential Defense Client (Switzerland)
Core Stack:
Backend: Python 3.11 · FastAPI · Azure OpenAI · Azure AI Search
Frontend: React · Tailwind
Cloud/DevOps: Azure Web Apps · Container Registry · Cosmos DB · Blob Storage · Terraform IaC · Docker · Azure DevOps CI/CD
Blog post: From RAG Pipeline to AI Agent: 14 Months of Building a Production Chatbot

1. Problem

New recruits must master a feature-rich HR & logistics portal. Sparse, outdated documentation generated many support tickets every month and slowed onboarding.

2. Solution Overview

I designed and led delivery of a microservice-based chat assistant that evolved through five architectural phases. Its services:

Service	Purpose	Key Tech
Agent Core	ReAct agent loop — reasons about queries, autonomously retrieves data across multiple sources, synthesizes answers	FastAPI · Azure OpenAI (GPT-4o)
Retrieval	Vector search, structured action data, and database access — exposed as tools the agent invokes on demand	Azure AI Search · MCP
Ingestion	Parses → chunks → embeds content on each release	Azure AI Search · LangChain
Evaluation	LLM-based quality checks on every PR	RAGAS · pytest
Web UI	Responsive chat + feedback panel	React · Tailwind
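
The chunking step in the Ingestion row can be pictured with a minimal sketch, assuming fixed-size character windows with overlap. The helper name `chunk_text` and the default sizes are hypothetical; the splitter and settings used in the real pipeline are not described here.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk would then be embedded and written to the search index; the overlap keeps sentences that straddle a boundary retrievable from at least one chunk.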

Architecture evolution: GPT-3.5 with chunks in system prompt (Q1 2024) → RAG pipeline with HyDE, reranking and retrieval tricks (mid 2024) → GPT-4 upgrade, biggest single quality jump (Q3 2024) → Lazy Graph RAG experiment (late 2025) → ReAct agent architecture replacing the fixed pipeline (early 2026).
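
The last step of this evolution, a ReAct loop replacing the fixed pipeline, can be illustrated with a minimal sketch. It assumes a `decide` callable standing in for the LLM call and a plain dict of tools; `Step`, `run_agent`, `scripted_decide`, and `search_docs` are hypothetical names, and the sample tool output is made up for illustration. The production agent calls Azure OpenAI rather than a scripted policy.

```python
from dataclasses import dataclass
from typing import Callable

Observation = tuple[str, str]  # (tool name, tool result)


@dataclass
class Step:
    kind: str             # "action" to call a tool, "final" to answer
    tool: str = ""
    tool_input: str = ""
    answer: str = ""


def run_agent(
    question: str,
    decide: Callable[[str, list[Observation]], Step],
    tools: dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate between deciding and acting until a final answer or the step budget."""
    observations: list[Observation] = []
    for _ in range(max_steps):
        step = decide(question, observations)  # reasoning step (LLM stand-in)
        if step.kind == "final":
            return step.answer
        tool = tools.get(step.tool)
        result = tool(step.tool_input) if tool else f"unknown tool: {step.tool}"
        observations.append((step.tool, result))  # fed back into the next decision
    return "No answer within the step budget."


def scripted_decide(question: str, observations: list[Observation]) -> Step:
    """Scripted stand-in for the model: search once, then answer."""
    if not observations:
        return Step(kind="action", tool="search_docs", tool_input=question)
    return Step(kind="final", answer=f"Based on the docs: {observations[-1][1]}")


# Made-up sample tool; a real tool would query a search index or database.
tools = {"search_docs": lambda query: "Open Leave > New Request."}
print(run_agent("How do I request leave?", scripted_decide, tools))
```

The difference from a fixed pipeline is that the retrieval step is chosen per turn by `decide` rather than hard-coded, so the number and order of tool calls can vary by question.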

Security: Private VNets, sealed storage; passed Swiss MoD pentest and audit. Azure OpenAI hosted on Swiss servers per client requirement.

3. Impact

Successful user adoption: 100+ messages per day and 40% fewer first-level support tickets
Improved correct-answer rate from 40% (GPT-3.5 PoC) to 80% with the agent architecture
Rollouts to 12 000 users every 3 weeks, with zero P1 incidents since Q1 2024
Agent architecture made elaborate retrieval strategies (HyDE, reranking, Graph RAG) redundant: simpler code, better results

4. My Contributions

Architected the end-to-end system across four major architecture evolutions
Built the chat service, ingestion pipeline, evaluation framework, and ReAct agent core
Evaluated and discarded multiple RAG strategies (HyDE, reranking, Lazy Graph RAG) based on measured impact
Mentored junior engineers on RAG and agent patterns
Measured success metrics continuously via RAGAS
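
Continuous quality measurement of this kind can be pictured as a pytest-style gate over per-question scores. This is a minimal sketch under stated assumptions: `quality_gate`, the threshold, and the sample scores are hypothetical, and the RAGAS scoring itself is left out.

```python
from statistics import mean


def quality_gate(scores: list[float], threshold: float = 0.8) -> bool:
    """Return True when the mean per-question score meets the threshold."""
    if not scores:
        raise ValueError("no evaluation scores supplied")
    return mean(scores) >= threshold


def test_answer_quality() -> None:
    # Hypothetical per-question scores in [0, 1]; a real run would take
    # them from the evaluation suite instead of a hard-coded list.
    scores = [0.9, 0.85, 0.7, 0.95]
    assert quality_gate(scores, threshold=0.8)
```

Because the gate only sees scores, it does not depend on how answers are produced, which is one way an evaluation suite can outlive changes to the architecture behind it.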

5. Key Challenges & Mitigations

Challenge	Mitigation & Result
Sparse & outdated docs	Curated 30 high-impact UI walkthroughs + generated 715 structured actions via UI-tree pipeline → +40 pp accuracy uplift
GPT-3.5 hallucinations & small context	Feature-toggle architecture → zero-downtime model upgrades as Azure released GPT-4 and GPT-4o
Diminishing returns from RAG tricks	Replaced fixed retrieval pipeline with ReAct agent loop → agent reasons about its own search strategy, adapting in real time
Measuring answer quality	RAGAS suite in CI → PR feedback < 5 min; evaluation survived every architecture change
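
The feature-toggle row above can be sketched as a lookup from a toggle value to a model deployment, resolved at request time so that switching models needs no redeploy. The environment variable `CHAT_MODEL_TOGGLE`, the toggle names, and the deployment names are all hypothetical placeholders.

```python
import os
from typing import Mapping, Optional

# Hypothetical toggle-to-deployment mapping; the names are placeholders.
MODEL_DEPLOYMENTS = {
    "legacy": "gpt-35-turbo",
    "default": "gpt-4",
    "latest": "gpt-4o",
}


def resolve_deployment(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the model deployment from a toggle, falling back to the default."""
    env = os.environ if env is None else env
    toggle = env.get("CHAT_MODEL_TOGGLE", "default")
    # Unknown toggle values fall back instead of failing the request.
    return MODEL_DEPLOYMENTS.get(toggle, MODEL_DEPLOYMENTS["default"])
```

Falling back on unknown values is a design choice in this sketch: a mistyped toggle degrades to the default model rather than taking the chat service down.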

6. Dev-/MLOps Diagram

[Diagram: architecture]