Design Mode: WhatsApp AI Booking Assistant
April 26, 2026
This case study covers a Design Mode session for a single-salon WhatsApp automation. The interesting part is not complexity but its absence: Ileen resisted overengineering and produced an architecture deliberately proportional to the problem.
At a glance
| Item | Detail |
| --- | --- |
| Project | Bilingual WhatsApp reception agent for a salon |
| Mode | Design Mode |
| Stack | FastAPI (or Express), WhatsApp Business Cloud API, GPT-4-class LLM, SQLite |
| Languages | English + Arabic |
| Architectural choice | Deliberately minimal: no database-backed booking; link handoff instead |
| Output | System design, API contract, knowledge file format, deployment plan |
The brief
A salon owner needed to stop answering WhatsApp messages manually. Clients would write in asking about services, prices and availability. The owner would reply individually — in English and in Arabic — and eventually share the booking link.
The requirements:
- Handle all incoming WhatsApp messages automatically.
- Answer salon-related questions using a knowledge file provided by the owner.
- Support both English and Arabic fluently.
- When a client is ready to book, share the existing booking system link.
- No custom booking logic inside WhatsApp — just handoff.
- Do not answer questions unrelated to the salon.
The architectural challenge
The naive path would be to reach for a complex stack: a vector database for knowledge retrieval, a multi-agent orchestration framework, a custom booking integration, conversation analytics, and so on.
Ileen went in the opposite direction.
The key architectural decision
A single backend service. A plain text file. A small conversation store.
The plan’s rationale was explicit:
One salon, one knowledge file, one channel. A single service with a flat file and a small conversation store is proportional to the problem.
Architecture:
WhatsApp Business Cloud API
    ↓
Agent service (FastAPI or Express)
├── Webhook handler (validates signature, parses message)
├── Conversation store (SQLite, keyed by phone number)
├── Knowledge file (plain .txt, loaded on startup)
└── LLM call (GPT-4-class, grounded via the system prompt)
    ↓
WhatsApp Business Cloud API (reply)
No vector database. No embeddings. No retrieval pipeline. The knowledge file is injected directly into the system prompt; at single-salon scale, it fits.
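To make the grounding step concrete, here is a minimal Python sketch of how the knowledge file could be placed in the system prompt. The file name `salon_knowledge.txt`, the prompt wording, and the function names are illustrative assumptions; the plan does not specify them.

```python
from pathlib import Path

# Hypothetical file name; the plan only says "plain .txt, loaded on startup".
KNOWLEDGE_PATH = Path("salon_knowledge.txt")

# Assumed prompt wording, combining grounding, language, and topic rules.
PROMPT_TEMPLATE = """You are the WhatsApp reception assistant for a salon.
Answer only from the salon information below.
Reply in the language the client writes in (English or Arabic).
If a question is unrelated to the salon, politely say you can only help with salon topics.

--- SALON INFORMATION ---
{knowledge}
--- END SALON INFORMATION ---"""


def load_knowledge(path: Path = KNOWLEDGE_PATH) -> str:
    """Read the owner's plain-text knowledge file (intended to run once at startup)."""
    return path.read_text(encoding="utf-8").strip()


def build_system_prompt(knowledge: str) -> str:
    """Embed the whole knowledge file in the system prompt; no retrieval step."""
    return PROMPT_TEMPLATE.format(knowledge=knowledge)
```

Because the whole file travels with every LLM call, this only stays practical while the file is small, which is the single-salon assumption the plan relies on.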
Provider selection rationale from the plan:
An LLM with strong multilingual (English + Arabic) support. Arabic quality is the main selection criterion — GPT-4-class or Claude. SQLite is enough given single-salon scale.
What the plan produced
Technology stack:
- WhatsApp Business Cloud API (webhook receiver + message sender)
- Python (FastAPI) or Node.js (Express) — implementer’s choice
- GPT-4-class LLM for multilingual reasoning
- SQLite for per-number conversation history
- Plain filesystem for the knowledge file
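As an illustration of how small the per-number conversation history can be, the following Python sketch uses only the standard-library `sqlite3` module. The table name, column names, database file name, and default limit of 10 messages are assumptions made for this example, not details from the plan.

```python
import sqlite3

# Illustrative schema: one row per message, looked up by phone number.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    phone TEXT NOT NULL,
    role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_messages_phone ON messages (phone, id);
"""


def open_store(path: str = "conversations.db") -> sqlite3.Connection:
    """Open (or create) the SQLite file and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_message(conn: sqlite3.Connection, phone: str, role: str, content: str) -> None:
    """Append one message to a phone number's history."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (phone, role, content) VALUES (?, ?, ?)",
            (phone, role, content),
        )


def recent_messages(conn: sqlite3.Connection, phone: str, limit: int = 10) -> list:
    """Return the last `limit` messages for a phone number, oldest first."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE phone = ? ORDER BY id DESC LIMIT ?",
        (phone, limit),
    ).fetchall()
    # The query returns newest-first; reverse so the LLM sees chronological order.
    return [{"role": role, "content": content} for role, content in reversed(rows)]
```

Passing `":memory:"` to `open_store` gives a throwaway database, which is convenient for local testing.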
Components designed:
- Webhook handler — validates WhatsApp signatures, parses inbound events, dispatches to agent.
- Agent orchestrator — loads knowledge file, retrieves conversation history for the phone number, constructs prompt, calls LLM, sends reply.
- Topic guardrail — system prompt instructs the model to refuse off-topic questions gracefully, not silently.
- Booking handoff — when booking intent is detected, the agent inserts the booking link into the reply with a natural transition phrase, in the detected language.
- Conversation state — last N messages per phone number stored in SQLite, passed as context to the LLM.
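The webhook handler's signature check could be sketched as below. This assumes the signing scheme Meta documents for its webhooks: an `X-Hub-Signature-256` header carrying `sha256=` followed by the hex HMAC-SHA256 of the raw request body, keyed with the app secret. The function name is hypothetical, and the current Meta documentation should be checked before relying on this.

```python
import hashlib
import hmac
from typing import Optional


def is_valid_signature(raw_body: bytes, signature_header: Optional[str], app_secret: str) -> bool:
    """Check a webhook payload against its 'sha256=<hex digest>' signature header."""
    prefix = "sha256="
    if not signature_header or not signature_header.startswith(prefix):
        return False
    received = signature_header[len(prefix):]
    expected = hmac.new(app_secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8"))
```

The check must run on the raw request bytes, before any JSON parsing, because re-serialising the payload can change the bytes and break the digest.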
Milestones:
- M1: Webhook + echo reply (validates that the WhatsApp integration works).
- M2: LLM integration + knowledge file grounding + language detection.
- M3: Topic guardrail + booking handoff logic.
- M4: Deployment + monitoring + handoff to owner.
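The language detection in M2 and the booking handoff in M3 could be prototyped without the LLM, as in the Python sketch below. This is an assumption-laden illustration rather than the plan's method: the plan may leave both steps to the model. The booking URL is a placeholder, the handoff phrases are example wording, and the `wants_to_book` flag stands in for whatever intent detection the implementer chooses.

```python
# Placeholder URL; the real booking link comes from the salon's existing system.
BOOKING_URL = "https://example.com/book"

# Example transition phrases, one per supported language (assumed wording).
HANDOFF_PHRASES = {
    "en": "You can book your appointment here: {url}",
    "ar": "يمكنك حجز موعدك من هنا: {url}",
}


def detect_language(text: str) -> str:
    """Return 'ar' if most letters fall in the Arabic Unicode block, else 'en'."""
    letters = [ch for ch in text if ch.isalpha()]
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return "ar" if letters and arabic * 2 > len(letters) else "en"


def with_booking_link(reply: str, client_message: str, wants_to_book: bool) -> str:
    """Append the booking link, phrased in the client's language, when booking intent is set."""
    if not wants_to_book:
        return reply
    language = detect_language(client_message)
    phrase = HANDOFF_PHRASES[language].format(url=BOOKING_URL)
    return f"{reply}\n\n{phrase}"
```

A script-based check like this is crude (it treats every non-Arabic message as English), but it gives a deterministic fallback when testing handoff behaviour separately from the LLM.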
What makes this case study interesting
It is not the most complex project in this series. It is included because it demonstrates something important: Ileen does not default to maximum complexity.
When the constraints are:
- One salon
- One knowledge file
- One WhatsApp channel
- No custom booking system needed
…the right architecture is the simplest one that works. SQLite, not Redis. A plain file, not a vector database. One service, not a microservices mesh.
The plan explicitly argued against overengineering at each decision point. That is the behaviour you want from a technical architect.
Honest limitations
- The knowledge file approach works at single-salon scale. If the client had hundreds of locations with different services, a retrieval system would be needed.
- Arabic LLM quality varies by provider and model version. The plan flags this as the primary vendor selection criterion but does not guarantee a specific accuracy level.
- WhatsApp Business API requires a Meta-approved business account. The salon must have this set up before implementation begins.
- The system relies on the salon keeping the knowledge file up to date. Stale information in the file means wrong answers to clients.