[FROSTLABS] · home / case study: AI-augmented Odoo engineering
Case study · 2026-06-01 · 12-min read · AI-augmented engineering · Odoo · Multi-marketplace

What 7 months of AI-augmented Odoo engineering actually delivered.

A traditional senior consulting engagement on a 7,000-SKU multi-marketplace Odoo 18 platform would be estimated at 18 to 24 months. The actual engagement closed in 7 months. 5 of those 7 months were learning curve, Claude Code (early-tooling era), Odoo data hygiene, and stakeholder politics that should have been handled before I arrived. 2 of those 7 months were the accelerated engineering work itself. The practice stack the engagement produced is now reusable: knowledge vault, persistent memory, module-lock hooks, watchdog snapshots, hard rules, test suite, multi-account coordination, council patterns. New engagements get the stack in week 1. With a passed Reconciliation Audit and decision-maker access, the engineering sprint compresses to roughly 1 month. Cost comparison: traditional senior consulting ~$600K, AI-augmented Frost Labs ~$40-90K. 15-20x calendar compression, 6-15x cost reduction.

$An opening scene where the practice stack mattered

On a Thursday afternoon in month 4 of the engagement, I converted 8,623 master images from WebP and PNG-with-alpha to PNG-RGB. Amazon's catalog had been quietly rejecting the originals. The conversion ran in 12 minutes. About 90 seconds after it completed, the watchdog snapshot diff alert fired: 34,488 ir_attachment records had disappeared from the production database in a 5-minute window. Odoo's image-derivative compute chain had cascade-deleted every derivative thumbnail in the catalog.

The alert hit my phone before customer-facing pages had time to render any missing thumbnails. I had four hours before the catalog started showing as visually broken to buyers. The fix was non-obvious. Re-firing Odoo's derivative chain through the ORM would have processed at 11 minutes per thousand records, a recovery timeline measured in days. I wrote a PIL-based bulk-insert that wrote directly to ir_attachment while triggering the derivative-compute path in the same transaction. About 32,500 derivatives recovered without service disruption. (The full diagnostic is at the image-cascade writeup.)

The recovery wasn't the noteworthy part of that afternoon. The watchdog firing on a write-cascade 90 seconds after it happened was. Without it, the cascade would have been discovered hours later when customer support tickets about missing product images started rolling in. The watchdog is one of eight pieces of an AI-augmented practice stack that, in aggregate, made this engagement viable as a solo operation. This case study is about that stack and what it actually delivers.

$The engagement: setup and traditional estimate

The engagement was 7 months on an Odoo 18 platform for a U.S.-based made-to-order manufacturer. About 7,000 SKUs across Amazon (FBA + FBM), eBay, and Walmart, with low-thousands of monthly orders. An in-house developer had been holding the system together for years but didn't have time to step back and fix the deeper drift. A pre-engagement Claude-driven code review had surfaced 41 findings: 8 critical, 33 high/medium. The staging-to-production migration was months overdue. The Walmart integration was returning silently wrong status data. Amazon variation families had drifted out of sync across hundreds of parent-child trees. Image format incompatibilities were quietly suppressing listings.

The traditional consulting estimate for this kind of work, gathered from informal quotes during the pre-engagement period, ran 18 to 24 months at $150 to $250 per hour. Midpoint cost projection: around $600,000. ROI starting month 18 or later. The client's leadership signed off on it but had real concerns about timeline. Multiple agencies passed because the platform's domain-specific weirdness (made-to-order patterns, capacity-bounded production, custom variation themes) required ramp-up time their delivery teams couldn't absorb without billing it.

I took the engagement at $175 per hour (standard at the time; the $225/hour AI-augmented tier didn't exist yet) on the bet that AI-augmented engineering with discipline could compress the work materially. The compression turned out to be real, but the path to it ran through 5 months of practice-stack building before the accelerated shipping actually started.

$The honest 5+2 month decomposition

Months 1 through 5 were the long part. Three things happened in parallel during this period and they all interfered with each other.

The first was learning Claude Code at production scale. This was early in Claude Code's lifecycle and most of the patterns that today take 30 minutes to set up took days to figure out. The knowledge vault structure, persistent-memory conventions, hard-rule format, watchdog snapshot specs, module-lock semantics, multi-account coordination protocol: all of it was authored during this period by a person figuring it out, not deployed from a known-good template. About 40% of the calendar time in months 1 through 5 went into this category.

The second was learning the client's specific platform deeply. Odoo as a framework is one thing. The client's Odoo, with its custom modules, MTO patterns, accumulated technical debt, undocumented business rules in older addon code, and the in-house developer's institutional knowledge, was another. Reverse-engineering all of it while diagnosing the active integration bugs consumed roughly 30% of the calendar time.

The third was data hygiene and stakeholder alignment that the client had not done before the engagement started. Variation families weren't just drifting in the API surface; the Odoo data was actively contradicting itself across legacy import imports. Critical product categorization decisions needed sign-off from stakeholders who were available on Thursday afternoons only. Master images were in formats no one had documented as required. The remaining 30% of months 1 through 5 went into reconciliation work that should have been completed as pre-engagement scoping but wasn't.

Months 6 and 7 were the accelerated part. By this point the practice stack had been built and battle-tested. The data hygiene was stable enough to work against. The stakeholders were trained on response patterns. The actual engineering work shipped at a pace that matched the original AI-augmented thesis: 703 of 703 Amazon variation families reconciled in days, not weeks (7,644 SP-API submissions accepted with no rejections needing follow-up). 6,838 FBM listings with their lead-time-to-ship windows wiped were diagnosed, fixed at the connector level, and re-pushed in roughly a week (the diagnostic is at the lead-time-wipe writeup). The Walmart pagination workaround for the broken /v3/items endpoint shipped in two days once I recognized the pattern (full writeup at the Walmart writeup). The image-derivative recovery described in the opening took an afternoon. 38 of 41 council audit remediations closed.

The transferable lesson from the 5+2 split: most of the calendar time in any consulting engagement is not the engineering. It's the preconditions to the engineering. If the preconditions are met before the engineer arrives, the engineering is fast.

$The practice stack, piece by piece

Eight pieces compose the AI-augmented practice stack the engagement produced. Each is buildable independently. The compression effect comes from running them together. Each subsection below names what the piece is, what it cost to set up, and a specific moment during the engagement where it earned its keep.

1. Project knowledge vault

A version-controlled directory tree (knowledge/) containing the project's institutional memory: brand voice, pricing rules, hard rules tied to real incidents, decision records, runbooks, session logs, domain glossary. Read at session start. Approximately 300 to 500 markdown files at maturity for a 7-month engagement-scale project. Setup cost: roughly 2 days to build the initial structure plus ongoing accrual during the engagement.

Specific moment it mattered: Month 3, a new Claude session needed to understand why marketplace_base.product_template.write() was being treated as performance-sensitive. The hard rule documenting the cascade behavior (with the original incident date and the affected listing count) was in the vault. The session loaded the rule, applied the change-aware-write pattern from the start, and shipped the fix without re-discovering the cost. Full writeup of the pattern.

2. Persistent memory layer

Auto-memory inferred patterns plus manually-curated learning files that survive context compaction. Stored in ~/.claude/projects/<project>/memory/ for the machine-local layer and promoted to knowledge/learnings/ for the durable git-tracked layer. Setup cost: nominal (the machinery is built into Claude Code). Discipline cost: real, requires regular promotion of inferred patterns into git-tracked files so they survive across machines and accounts.

Specific moment it mattered: Month 5, after a context compaction, a new session needed to know that the Walmart /v3/items endpoint was returning duplicate pages and the workaround was /v3/reports?reportType=ITEM&reportVersion=v4. The memory layer surfaced this within the first three exchanges of the new session. Without it, the session would have re-discovered the bug from scratch.

3. Module-lock hooks

A file-system-based lock mechanism that lets multiple parallel Claude sessions coordinate without overwriting each other's edits. Each session claims an exclusive lock on the modules it's editing. Other sessions read the locks and route around them. The locks live in a .locks/ directory checked into git, with timestamps and session IDs. Setup cost: half a day to implement the hook scripts plus the convention.

Specific moment it mattered: Multiple times in months 6 and 7, when running parallel sessions during the accelerated shipping phase. One session would be analyzing variation-family drift while another shipped the lead-time-wipe fix while a third was writing tests. Without coordination, two sessions both editing fl_marketplaces_amazon/models/marketplace_listing.py would have produced merge conflicts every few minutes. With coordination, the conflicts surfaced before either session committed.

4. Watchdog snapshot/diff system

Pre-change and post-change snapshots of business-critical state across the Odoo database, written to JSONL files, diff-compared after every production-affecting change. 11 health checks at maturity covering product master data, marketplace listings, pricing, stock levels, and configuration parameters. Catches silent regressions that don't trip alarms (the 12% price shift, the 14% of catalog quietly going out-of-stock, the cascade that empties 34,488 attachments). Setup cost: roughly 1 day to build the snapshot scripts plus ongoing tuning of which fields belong in the snapshot spec.

Specific moment it mattered: The opening scene of this case study. The watchdog firing 90 seconds after the image cascade was the difference between a 4-hour-recoverable incident and a 4-day catalog-visually-broken incident. Full writeup of the pattern.

5. Hard rules tied to real incidents

28 documented never-do rules at engagement close, each anchored to the incident that produced it. Format: a short rule statement, the incident date, what broke, what the rule prevents. Stored in knowledge/hard-rules.md and loaded at session start. Setup cost: zero up-front (you accumulate them as incidents happen), but discipline cost to actually write them down rather than just remembering them.

Specific moment it mattered: Month 6, a session was about to bypass the Odoo ORM for bulk image work without re-firing the derivative-compute chain. The hard rule documenting the cascade from month 4 caught this and forced the safe path (PIL bulk-insert plus explicit derivative-compute re-fire in the same transaction). Without the hard rule, the session would have re-created the original incident.

6. 143-method test suite

Test coverage organized around business invariants, not just function behaviors. Six categories: security regressions, order lifecycle, business rules, integration paths, data integrity, and marketplace-specific behaviors. Property-based tests via Hypothesis for invariants over wide input domains. Fast subset runs in under 90 seconds for the inner edit-test loop; full suite runs in under 5 minutes on CI. Setup cost: roughly 2 weeks of dedicated test-infrastructure work plus ongoing test additions per shipped feature.

Specific moment it mattered: Multiple times. The most consequential was month 7, when a refactor of the variation-family reconciliation code path passed all function-level tests but failed an invariant test that asserted "after any publish call, listing.price must equal product.template.list_price." The failing invariant test surfaced a silent price-drift bug that would have shipped to production. Full writeup of the test-suite-as-safety-net pattern.

7. Multi-account coordination

Two Claude Pro accounts operated by one person, coordinating via a git-tracked ACTIVE.md state file. Each session reads ACTIVE.md at start to see what the other account is working on, claims its workstream by writing a claim line and pushing, releases the workstream when done. Setup cost: half a day to define the protocol plus the SessionStart hooks. Discipline cost: pushing the claim file commits aggressively so the other account never sees stale state.

Specific moment it mattered: Month 6, parallel work on the Amazon variation-family reconciliation (Account A) and the eBay [21916735] diagnostic (Account B). Without coordination, both accounts would have touched the shared fl_marketplaces_base/models/marketplace_listing.py within the same hour. With coordination, Account A claimed the file for the variation work; Account B routed around it and worked on eBay-specific files until released.

8. AI council patterns

For high-stakes architectural decisions or hard debugging sessions, a multi-agent council pattern: KISS / YAGNI / DRY / SOLID reviewers plus an explicit dissenter, evaluated in parallel and synthesized. Spawned on demand (not running continuously). Used roughly 8 to 12 times during the engagement at decision points where the cost of being wrong was high. Setup cost: nominal (the council pattern is a workflow definition); compute cost is real per use because each council member runs as a separate agent.

Specific moment it mattered: Month 4, the decision to swap the Amazon connector's variation-family reconciliation strategy from "incremental PATCH per family" to "batch DELETE+CREATE for the most-broken families." Each council seat surfaced a different concern (KISS: simpler code; YAGNI: don't optimize until you measure; SOLID: don't couple the reconciliation logic to the connector's other write paths; dissenter: the batch strategy will rate-limit on SP-API). The dissenter caught the rate-limit risk. The final strategy was a hybrid that handled the rate-limit explicitly.

$The compression math

Three numbers matter for the cost comparison. Hourly rate, total hours, and the hidden cost of calendar time before ROI starts.

Traditional senior Odoo consulting for a 7,000-SKU multi-marketplace platform modernization runs at $150 to $250 per hour. The 18 to 24 month estimate at roughly 160 hours per month puts total hours at approximately 2,880 to 3,840. Midpoint cost: around $600,000. Plus the opportunity cost of 18 months of revenue at the broken-catalog status quo before any fix lands.

AI-augmented Frost Labs delivery on the same scope, with the practice stack already built and deployed in week 1 of the engagement, runs at $225 per hour. The engineering sprint that closes the actual scope compresses to roughly 1 month at 160 hours, conditional on the preconditions listed below. Adding pre-engagement discovery via an Odoo + Marketplace Reconciliation Audit (3-day Snapshot) at $1,500 to $2,500, the total cost lands at $40,000 to $90,000 depending on scope complexity. Plus only 1 month of calendar time before ROI starts.

Compression: roughly 15 to 20 times faster on calendar, 6 to 15 times lower on total cost, comparable per-hour rate.

The preconditions that make the 1-month claim defensible: (a) a passed Reconciliation Audit, or equivalent pre-engagement discovery surfacing the actual scope; (b) named decision-makers accessible within 24 hours during the sprint; (c) data inputs not actively pathological (some data hygiene work is expected and absorbed in the sprint; large-scale data correction is a separate phase); (d) no live emergency consuming the in-house team's attention. If any of those preconditions are unmet, the engagement either reverts to discovery first or stretches honestly.

$What new engagements look like in practice

The first week of a new engagement is practice-stack deployment. The knowledge vault structure ports from prior engagements with the project-specific files initialized; persistent memory configurations are set up; module-lock hooks are installed; the watchdog snapshot spec is tuned to the client's specific models and business-critical fields; the hard-rules file is seeded with the universal ones from prior engagements and ready to accept project-specific additions; a minimum-viable test suite is scaffolded against the most-critical paths; multi-account coordination is configured; the council pattern is available for high-stakes decisions.

Weeks 2 through 4 are the engineering sprint. With the practice stack in place from day 1, the calendar compression that took 5 months to build is available immediately. The actual scope (variation reconciliation, integration debugging, migration toolkit, reliability infrastructure, depending on the client) shipped at the pace the practice stack enables.

The Reconciliation Audit, if commissioned beforehand, has already surfaced the actual scope at week 1 of the engineering sprint. The engineer arrives with a known-good problem definition instead of a discovery phase masquerading as billable time.

$What this means if you're considering hiring

Two specific shapes of engagement work, depending on what you actually want to buy.

Hire me to teach your team to do this. (Engineering Practice Bootstrap)

If you have in-house engineers and want to deploy the AI-augmented practice stack in your own organization, the Engineering Practice Bootstrap is a 1-week engagement at $18,000 to $28,000 fixed fee. Output: the full stack installed in your environment (knowledge vault structure, memory layer, module-lock hooks, watchdog snapshots, hard-rules seed, test suite scaffolding), trained against your codebase, with your team trained on the discipline patterns. After the bootstrap, your team owns it and can continue building on it. Suitable for organizations with 3 to 30 person engineering teams that have heard about AI tooling and want to deploy it with the safety rails that prevent "AI did something weird" incidents.

Hire me to absorb the engineering work itself. (Engineering Engagement)

If you want the work shipped without your team having to learn the practice, the Engineering Engagement runs at $225 per hour AI-augmented, with three structures available: 1099 sub-contract through your agency (won't poach your clients), weekly retainer for ongoing scope, or fixed-fee project for a defined deliverable. The practice stack is invisible to you; you see the outcomes (variation reconciliation done, integrations fixed, migrations executed, test suite installed). Suitable for Odoo agencies who need senior overflow capacity or operators who want results without team change-management overhead. For predictable ongoing capacity, a Senior-Engineer-on-Demand retainer at $4,000 to $15,000 per month is the cleanest structure.

Or start smaller. (Odoo + Marketplace Reconciliation Audit)

If neither of the above is the right entry point yet, the Reconciliation Audit is the 3-business-day discovery at $1,500 to $2,500 fixed fee. The audit reconciles your Odoo catalog against each connected marketplace, surfaces where they disagree and which side is right, quantifies what the disagreement is costing, and produces a prioritized fix backlog. Most audits surface 5 to 10 specific findings. The audit converts cleanly into either of the above engagements if the findings warrant it, or stands on its own as a deliverable your in-house team can execute against.

By David H. Frost · Frost Labs LLC · Riverdale, Utah Home · Writing · Privacy