
From Discovery to Production in 11 Hours

April 26, 2026

This is a technical case study of a real project built with Ileen. No mock data, no synthetic demo. The goal is to show the full chain: conversation → architecture → milestones → code → deployed MVP — and to be transparent about what Ileen did autonomously and where a human stepped in.


At a glance

Project: AI Image-to-3D + Bill of Materials pipeline
Mode: Design Mode → Build
Stack: FastAPI, Meshy, GPT-4o Vision, PostgreSQL, GCS
Autonomous time: ~10 hours (discovery, architecture, planning, build)
Human time: ~1 hour (deployment)
Outputs: Frontend repo, backend repo, database DDL, production-usable MVP
Key decision: Architecture revised mid-conversation, from semantic vector search to DB-grounded agentic flow

The brief

A client needed an AI system that could take a product image and produce two outputs:

  • A 3D model of the product (via a generative 3D provider).
  • A Bill of Materials (BOM) — a structured list of the components and materials visible in the image, with each item grounded in the client’s existing materials database (no invented materials).

The stretch goal was to produce a technical production design as a separate deliverable. The constraints were real: the client’s materials database existed and had to be used as the authoritative source of truth. Hallucinated BOM entries were not acceptable.

There were many open questions at the start:

  • Which 3D generation provider to use?
  • How to extract a BOM from an image with high accuracy?
  • How to ground the BOM on a real database without over-engineering?
  • What accuracy targets were realistic?
  • How to handle costs and latency?
  • How to build a learning loop for BOM quality?

The architectural challenge

The naive approach to BOM generation would be: take the 3D model, analyse its geometry, and derive materials from the mesh. This is fragile — it relies on the quality of the 3D generation and on geometry-to-material inference, which is error-prone and poorly supported by current tooling.

A second naive approach: use free-form semantic search (embeddings / vector similarity) to match components detected in the image to entries in the materials database. This is better, but introduces hallucination risk: the model may produce plausible-sounding component names that don’t cleanly map to real database entries.


The architect conversation

During the Design Mode session, Ileen initially proposed an embedding-based approach for BOM generation: detect components from the image, embed their descriptions, and retrieve the closest matches from the database.

The user challenged this directly:

“This is fragile. If the component description doesn’t embed close to the right database entry, we get a wrong match or no match. We should ground the BOM on the database categories explicitly.”

Ileen recognised the objection was correct and revised the architecture:

  1. Step 1 — Category identification: given the image, identify which categories of materials are present, using the real category taxonomy from the client’s database. This constrains the model to a known vocabulary.
  2. Step 2 — Item selection: for each identified category, select specific items from the database that best match the visible components.

This two-step approach reduces hallucination because the model is always choosing from a bounded, real set — not generating names freely.
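
The two steps above can be sketched as follows. This is a minimal illustration, not the generated backend code: `grounded_bom`, `Chooser`, `fake_chooser`, and the toy `MATERIALS` dictionary are hypothetical names, and the model call is replaced by a plain callable so the grounding logic can be read in isolation.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for the vision model: given a label and a list of
# allowed options, it returns the options it believes are visible in the image.
Chooser = Callable[[str, Sequence[str]], List[str]]


def grounded_bom(materials: Dict[str, List[str]], choose: Chooser) -> Dict[str, List[str]]:
    """Build a BOM in two steps, keeping only answers that exist in the database."""
    # Step 1: pick categories from the real taxonomy (the dictionary keys).
    categories = list(materials)
    picked = [c for c in choose("categories", categories) if c in materials]

    bom: Dict[str, List[str]] = {}
    for category in picked:
        # Step 2: pick items, restricted to that category's real entries.
        allowed = materials[category]
        items = [i for i in choose(category, allowed) if i in allowed]
        if items:
            bom[category] = items
    return bom


# Toy data (assumption) and a fake chooser that returns one category and one
# item that do not exist in the database.
MATERIALS = {
    "fabric": ["cotton twill", "polyester mesh"],
    "fastener": ["brass zipper", "steel snap"],
    "padding": ["eva foam"],
}


def fake_chooser(label: str, options: Sequence[str]) -> List[str]:
    answers = {
        "categories": ["fabric", "fastener", "leather"],  # "leather" is not in the DB
        "fabric": ["cotton twill", "silk"],               # "silk" is not in the DB
        "fastener": ["brass zipper"],
    }
    return answers.get(label, [])


result = grounded_bom(MATERIALS, fake_chooser)
print(result)
```

Anything the chooser returns that is not in the database is dropped rather than passed through, which is the property the bounded vocabulary is meant to guarantee.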


The key architectural decision

BOM and 3D generation run as two parallel, independent flows — both anchored to the original image, not to each other.

Image
├── → Meshy (or similar provider) → 3D model
└── → GPT-4o Vision
    ├── Step 1: identify categories from DB taxonomy
    └── Step 2: select items per category from DB → BOM

This is the most important decision in the plan. Deriving the BOM from the 3D model would have introduced a hard dependency: if the 3D generation fails or is poor quality, the BOM fails too. By grounding both on the original image, the two outputs are independent and the pipeline is more resilient.
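
One way to express this independence in code is to run both flows concurrently and collect failures as values. The sketch below assumes an asyncio-based orchestrator; `generate_3d`, `generate_bom`, and `run_job` are hypothetical placeholders, and the 3D flow is made to fail on purpose to show that the BOM still comes back.

```python
import asyncio
from typing import Any, Dict


async def generate_3d(image_id: str) -> str:
    # Placeholder for the 3D provider call; here it simulates a failure.
    await asyncio.sleep(0)
    raise RuntimeError("3D provider unavailable")


async def generate_bom(image_id: str) -> Dict[str, list]:
    # Placeholder for the vision + database-grounding flow.
    await asyncio.sleep(0)
    return {"fabric": ["cotton twill"]}


async def run_job(image_id: str) -> Dict[str, Any]:
    # return_exceptions=True returns a failure as a value instead of raising,
    # so the other flow's result is still collected.
    model, bom = await asyncio.gather(
        generate_3d(image_id), generate_bom(image_id), return_exceptions=True
    )
    return {
        "model": None if isinstance(model, Exception) else model,
        "bom": None if isinstance(bom, Exception) else bom,
        "errors": [str(r) for r in (model, bom) if isinstance(r, Exception)],
    }


job = asyncio.run(run_job("img-001"))
print(job)
```

Because both coroutines take only the image identifier as input, neither needs the other's output, which mirrors the diagram above.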


Risk management

The plan explicitly named the risks — not as caveats to hide, but as gates to manage:

Risk: Mitigation
Lost-in-the-middle on large categories: If a DB category has many items, the model may miss the right one. Add sub-filtering only if benchmark shows this is a problem.
Accuracy ceiling: 90% BOM accuracy may not be achievable on all product types. 70% is the acceptable baseline. 100% is out of scope.
Meshy import compatibility: 3D output format must be validated for the client’s use case in Milestone 1.
API cost and latency: GPT-4o Vision and Meshy calls are not free. Estimated per-job cost must be benchmarked before scaling.
Client DB dependency: BOM quality depends directly on the completeness of the client’s materials database. Gaps in the DB = gaps in BOM coverage.
Feedback loop quality: The correction-driven improvement loop (see below) only works if corrections are captured consistently.

The plan defines a benchmark gate: if accuracy on large categories falls below 70%, add a sub-filtering step. This gate prevents overengineering a solution before knowing whether the problem exists.
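
A gate like this can be a few lines of code. The sketch below is illustrative only: the 70% baseline comes from the plan, but the size threshold for a "large" category, the function names, and the sample data are assumptions.

```python
from typing import Dict, List, Tuple

BASELINE = 0.70           # acceptable baseline from the plan
LARGE_CATEGORY_SIZE = 50  # assumption: item count at which a category is "large"


def category_accuracy(results: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute accuracy per category from (category, was_correct) pairs."""
    totals: Dict[str, int] = {}
    hits: Dict[str, int] = {}
    for category, ok in results:
        totals[category] = totals.get(category, 0) + 1
        hits[category] = hits.get(category, 0) + (1 if ok else 0)
    return {c: hits[c] / totals[c] for c in totals}


def needs_sub_filtering(
    results: List[Tuple[str, bool]], category_sizes: Dict[str, int]
) -> List[str]:
    """Return the large categories whose benchmark accuracy is below baseline."""
    accuracy = category_accuracy(results)
    return sorted(
        c
        for c, acc in accuracy.items()
        if category_sizes.get(c, 0) >= LARGE_CATEGORY_SIZE and acc < BASELINE
    )


# Hypothetical benchmark: fabric 3/5 correct, fastener 4/5, padding 0/2.
benchmark = (
    [("fabric", True)] * 3 + [("fabric", False)] * 2
    + [("fastener", True)] * 4 + [("fastener", False)]
    + [("padding", False)] * 2
)
sizes = {"fabric": 120, "fastener": 200, "padding": 10}

flagged = needs_sub_filtering(benchmark, sizes)
print(flagged)
```

Only `fabric` is flagged here: `fastener` is large but above baseline, and `padding` is below baseline but too small for lost-in-the-middle to be the likely cause.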


Generated architecture

Cloud Run (API)
├── /jobs → job orchestration (FastAPI)
├── /3d → Meshy integration
└── /bom → GPT-4o Vision + DB grounding
Cloud SQL (PostgreSQL)
├── jobs → job tracking and status
├── bom_results → generated BOM entries
├── corrections → human corrections (feedback loop)
└── materials → client's materials database (imported)
GCS
├── input-images/ → uploaded product images
├── 3d-outputs/ → generated 3D files
└── bom-exports/ → BOM exports (JSON, CSV)
Frontend (React)
├── image upload
├── 3D viewer
├── BOM table (editable, corrections captured)
└── job history

The improvement loop

Rather than “self-learning” (which implies model training), the system implements a correction-driven improvement loop:

  1. A user reviews the generated BOM and corrects wrong entries.
  2. Corrections are stored against the original image and the generated output.
  3. Accepted corrections become few-shot examples for future similar images.
  4. Item ordering heuristics within each category improve over time based on correction patterns.

This is not model fine-tuning. It is feedback-based prompt and ranking optimisation — explicit, inspectable, and controllable.
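
As a rough illustration of steps 3 and 4, the sketch below turns accepted corrections into few-shot prompt lines and reorders a category's items by how often users corrected to them. The `Correction` fields, function names, and sample data are assumptions; the actual `corrections` table schema is not shown in this case study.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Correction:
    image_id: str
    category: str
    predicted: str   # what the pipeline generated
    corrected: str   # what the user changed it to
    accepted: bool   # only accepted corrections feed the loop


def few_shot_examples(corrections: List[Correction], category: str, limit: int = 3) -> List[str]:
    """Format the most recent accepted corrections for one category as prompt lines."""
    accepted = [c for c in corrections if c.accepted and c.category == category]
    return [
        f"Image {c.image_id}: not '{c.predicted}', use '{c.corrected}'"
        for c in accepted[-limit:]
    ]


def rank_items(items: List[str], corrections: List[Correction], category: str) -> List[str]:
    """Order items so those users most often corrected to come first."""
    counts = Counter(
        c.corrected for c in corrections if c.accepted and c.category == category
    )
    # sorted() is stable, so ties keep their original order.
    return sorted(items, key=lambda item: -counts[item])


corrections = [
    Correction("img-1", "fabric", "polyester mesh", "cotton twill", True),
    Correction("img-2", "fabric", "polyester mesh", "cotton twill", True),
    Correction("img-3", "fabric", "cotton twill", "nylon ripstop", False),
    Correction("img-4", "fastener", "steel snap", "brass zipper", True),
]

examples = few_shot_examples(corrections, "fabric")
ranked = rank_items(["nylon ripstop", "polyester mesh", "cotton twill"], corrections, "fabric")
print(examples)
print(ranked)
```

Both outputs are plain data that can be logged and reviewed, which is what makes the loop inspectable in the sense described above.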


The autonomous build output

After ~10 hours of autonomous operation, Ileen produced:

  • Backend repository — FastAPI application with full job orchestration, Meshy integration, GPT-4o Vision BOM pipeline, Cloud SQL models, GCS storage layer, and REST API.
  • Frontend repository — React application with image upload, job tracking, 3D viewer integration, editable BOM table, and correction capture.
  • Database DDL — complete PostgreSQL schema for all tables, ready to apply.
  • Deployment configuration — Cloud Run service definitions, environment variable templates.

All code was committed to the main branch in a production-ready state.


Human deployment

One human. One hour.

The hour was spent on:

  • Provisioning Cloud SQL and running the DDL.
  • Creating the GCS buckets.
  • Configuring environment variables (API keys, DB credentials).
  • Deploying Cloud Run services.
  • Importing the client’s materials database.
  • Smoke testing the first end-to-end job.

No structural changes to the generated code were required.


What Ileen did autonomously

  • Conducted technical discovery: identified risks, open questions, provider options.
  • Proposed and revised the architecture after a technical challenge.
  • Selected the two-parallel-flows pattern (3D + BOM independent from each other).
  • Designed the DB-grounded BOM approach with an explicit two-step pipeline.
  • Defined accuracy gates (70% baseline, conditional sub-filtering).
  • Generated the complete project plan with milestones and tasks.
  • Implemented the backend, frontend, database schema and deployment config.
  • Committed all code to main in production-ready state.

What required human control

  • Final provider selection (Meshy was validated in M1 as planned).
  • Cloud infrastructure provisioning (GCP credentials and project setup).
  • Materials database import from the client’s existing system.
  • Deployment execution and smoke testing.
  • Decision to go live (the human validated the output before putting it in front of users).

Honest limitations

This case study represents a technically complex but well-delimited MVP. It is not a claim that Ileen can replace an engineering team on any project. Specifically:

  • The project had a clear scope: image in → 3D + BOM out.
  • The client’s materials database was available from the start.
  • There were no legacy systems to integrate with.
  • Performance and load testing were not in scope for the MVP.
  • The feedback loop improvement is gradual and requires real usage data to show results.

The claim is precise:

For technically complex but well-delimited MVP projects, Ileen can compress discovery + architecture + scaffolding + implementation into hours rather than weeks.


The result

An AI Image-to-3D + BOM pipeline, built from a discovery conversation to a production-usable MVP, in 11 hours total — 10 autonomous, 1 human.

The architecture decision that made it work: treat 3D generation and BOM generation as independent parallel flows, both grounded on the original image. Don’t derive BOM from the 3D model. Ground BOM on the real database. Accept that 90% accuracy may not always be reachable, and design explicit gates to know when you’re below baseline.

That is Ileen’s job: not to promise magic, but to produce a traceable, honest, deployable architecture from an ambiguous starting point.

