Design Mode: WhatsApp AI Booking Assistant

April 26, 2026

This case study covers a Design Mode session for a single-salon WhatsApp automation. The interesting part is not the complexity — it is the opposite: how Ileen resisted overengineering and produced an architecture deliberately proportional to the problem.

At a glance


Project	Bilingual WhatsApp reception agent for a salon
Mode	Design Mode
Stack	FastAPI, WhatsApp Business API, GPT-4, SQLite
Languages	English + Arabic
Architectural choice	Deliberately minimal — no DB-backed booking, link handoff instead
Output	System design, API contract, knowledge file format, deployment plan

The brief

A salon owner needed to stop answering WhatsApp messages manually. Clients would write in asking about services, prices and availability. The owner would reply individually — in English and in Arabic — and eventually share the booking link.

The requirements:

Handle all incoming WhatsApp messages automatically.
Answer salon-related questions using a knowledge file provided by the owner.
Support both English and Arabic fluently.
When a client is ready to book, share the existing booking system link.
No custom booking logic inside WhatsApp — just handoff.
Do not answer questions unrelated to the salon.

The architectural challenge

The naive path would be to reach for a complex stack: a vector database for knowledge retrieval, a multi-agent orchestration framework, a custom booking integration, conversation analytics, and so on.

Ileen went in the opposite direction.

The key architectural decision

A single backend service. A plain text file. A small conversation store.

The plan’s rationale was explicit:

One salon, one knowledge file, one channel. A single service with a flat file and a small conversation store is proportional to the problem.

Architecture:

WhatsApp Business Cloud API
        ↓
  Agent service (FastAPI or Express)
  ├── Webhook handler (validates signature, parses message)
  ├── Conversation store (SQLite, keyed by phone number)
  ├── Knowledge file (plain .txt, loaded on startup)
  └── LLM call (GPT-4-class, grounded via prompt injection)
        ↓
WhatsApp Business Cloud API (reply)

No vector database. No embeddings. No retrieval pipeline. The knowledge file is injected directly into the system prompt — at single-salon scale, it fits.

Provider selection rationale from the plan:

An LLM with strong multilingual (English + Arabic) support. Arabic quality is the main selection criterion — GPT-4-class or Claude. SQLite is enough given single-salon scale.

What the plan produced

Technology stack:

WhatsApp Business Cloud API (webhook receiver + message sender)
Python (FastAPI) or Node.js (Express) — implementer’s choice
GPT-4-class LLM for multilingual reasoning
SQLite for per-number conversation history
Plain filesystem for the knowledge file

Components designed:

Webhook handler — validates WhatsApp signatures, parses inbound events, dispatches to agent.
Agent orchestrator — loads knowledge file, retrieves conversation history for the phone number, constructs prompt, calls LLM, sends reply.
Topic guardrail — system prompt instructs the model to refuse off-topic questions gracefully, not silently.
Booking handoff — when intent detected, agent inserts the booking link into the reply with a natural transition phrase, in the detected language.
Conversation state — last N messages per phone number stored in SQLite, passed as context to the LLM.

Milestones:

M1: Webhook + echo reply (validate WhatsApp integration works).
M2: LLM integration + knowledge file grounding + language detection.
M3: Topic guardrail + booking handoff logic.
M4: Deployment + monitoring + handoff to owner.

What makes this case study interesting

It is not the most complex project in this series. It is included because it demonstrates something important: Ileen does not default to maximum complexity.

When the constraints are:

One salon
One knowledge file
One WhatsApp channel
No custom booking system needed

…the right architecture is the simplest one that works. SQLite, not Redis. A plain file, not a vector database. One service, not a microservices mesh.

The plan explicitly argued against overengineering at each decision point. That is the behaviour you want from a technical architect.

Honest limitations

The knowledge file approach scales to one salon. If the client had hundreds of locations with different services, a retrieval system would be needed.
Arabic LLM quality varies by provider and model version. The plan flags this as the primary vendor selection criterion but does not guarantee a specific accuracy level.
WhatsApp Business API requires a Meta-approved business account. The salon must have this set up before implementation begins.
The system relies on the salon keeping the knowledge file up to date. Stale information in the file = wrong answers to clients.

Browse the full case study library →