Recovering Odoo image derivatives after an ORM cascade

I converted 8,623 master images from WebP/PNG-with-alpha to PNG-RGB to satisfy Amazon's catalog requirements. The conversion inadvertently caused Odoo's image-derivative chain to cascade-delete 34,488 derivative attachments: the kind of cascade that takes a working catalog and silently empties it of every product thumbnail. The obvious ORM-based recovery ran at 11 minutes per thousand records. Wrong path. The right fix was a PIL-based bulk insert that generated the derivatives itself and wrote them directly to ir_attachment, bypassing the ORM's derivative-compute chain. Roughly 32,500 derivatives recovered without service disruption.

$The setup

Amazon's catalog rejects WebP and PNG-with-alpha master images. The Seller Central API will accept the upload, but the listing shows the dreaded "image suppressed" status hours later, once Amazon's image processor has evaluated it. Your listings go live, customers can't see the product, and you lose the Buy Box.

On a 7,000-listing catalog where most masters were WebP (efficient for the original e-commerce site but rejected by Amazon), the fix was clear: batch-convert all 8,623 masters to PNG-RGB. PIL handles the conversion. Write the result back to ir_attachment via the Odoo ORM. Move on.
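A minimal sketch of that per-image conversion, assuming Pillow is installed. The function name `to_png_rgb` and the white background are illustrative choices, not taken from the original job. Flattening onto a solid colour is one way to avoid transparent regions ending up as whatever RGB values happen to sit under the alpha channel.

```python
import io

from PIL import Image


def to_png_rgb(image_bytes, background=(255, 255, 255)):
    """Return PNG-encoded RGB bytes for any image Pillow can open.

    Transparent pixels are flattened onto `background` (white here, which
    is an assumption) rather than having the alpha channel simply dropped.
    """
    with Image.open(io.BytesIO(image_bytes)) as img:
        rgba = img.convert("RGBA")  # normalise palette / greyscale inputs
    canvas = Image.new("RGB", rgba.size, background)
    # Use the alpha band as the paste mask so opaque pixels overwrite
    # the background and transparent ones leave it untouched.
    canvas.paste(rgba, mask=rgba.getchannel("A"))
    out = io.BytesIO()
    canvas.save(out, format="PNG")
    return out.getvalue()
```

The output has no alpha channel regardless of whether the input was WebP, paletted PNG, or RGBA PNG.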

What actually happened was not move-on.

$The cascade

Odoo's image system is layered: a product.template or product.product has a primary image_1920 field. When you write that field, Odoo automatically generates a chain of four derivative resolutions: image_1024, image_512, image_256, and image_128. Each is stored as an ir_attachment record linked to the parent.

The chain is computed lazily on access (via _compute_image_thumbnail et al.) but cached as ir_attachment rows once computed. When you UPDATE the master image, Odoo invalidates and deletes the derivatives; they'll be regenerated on next access.

That's fine when you update one image at a time. It's catastrophic when you UPDATE 8,623 masters in a batch transaction, because:

The pre-existing derivatives (34,488 across the catalog, average 4 derivatives per product) all get marked for delete in the same transaction.
The delete is a real DELETE on ir_attachment, not a soft-delete.
The regeneration is lazy; derivatives are only re-created the next time something accesses them.
If the access pattern is "customer hits the product page" or "marketplace API polls for an image URL," the regeneration happens one at a time, in a request context, taking ~80-200ms each.

For 34,488 derivatives at 100ms each: ~57 minutes of CPU time if regenerated serially. In practice, with concurrent requests competing for ORM access, the recovery is much slower, and during that window, every product page that needs a thumbnail is generating it on-demand, blocking the request.
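The estimate is plain arithmetic, and it is worth seeing how it moves across the 80-200ms range quoted above. A small sketch (the function name is illustrative):

```python
def serial_regen_minutes(derivative_count, ms_per_derivative):
    """Minutes needed to regenerate every derivative one after another."""
    return derivative_count * ms_per_derivative / 1000 / 60


for ms in (80, 100, 200):
    print(f"{ms} ms each: {serial_regen_minutes(34_488, ms):.1f} min")
# 80 ms each: 46.0 min
# 100 ms each: 57.5 min
# 200 ms each: 115.0 min
```

Even the optimistic end of the range is three quarters of an hour of degraded pages, before any contention is counted.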

The dashboard goes pale. Marketplace API calls start timing out. The on-call channel lights up.

$The wrong path I tried first

My first instinct was to let Odoo's ORM re-fire the derivative-compute chain manually. For each product, touch the image field (e.g., re-write it to itself) to trigger the derivative regeneration.

# Wrong path. Don't do this on a large catalog.
def regenerate_derivatives_via_orm(env, product_ids):
    for p_id in product_ids:
        product = env["product.template"].browse(p_id)
        # Touch the master to trigger derivative recompute.
        product.image_1920 = product.image_1920
        env.cr.commit()

This works correctly. It just takes 11 minutes per 1,000 products. For 8,623 products: roughly 96 minutes of wall time, during which the production database is under sustained write load and the on-call channel is still lit up.

Worse: the operation is single-threaded against the ORM. You can't easily parallelize it because each product.image_1920 = product.image_1920 assignment triggers the full ORM stack (onchange handlers, compute fields, audit log entries, the works). Trying to parallelize it causes lock contention on ir_attachment.

11 minutes per thousand was unacceptable. Time for the right path.

$The right path: PIL bulk-insert direct to `ir_attachment`

The insight: Odoo's derivative computation is just PIL resizing the master + writing the result to ir_attachment. We can do that ourselves, in parallel, bypassing the ORM, and it's much faster.

The shape:

from PIL import Image
import io, base64
from concurrent.futures import ThreadPoolExecutor

# Derivative sizes Odoo uses.
SIZES = [
    ("image_1024", 1024),
    ("image_512", 512),
    ("image_256", 256),
    ("image_128", 128),
]

def generate_derivatives(master_bytes):
    """Resize the master image into all derivative sizes.
    Returns a dict of size_name -> bytes."""
    img = Image.open(io.BytesIO(master_bytes))
    img = img.convert("RGB")  # Strip alpha to match Amazon's required format.
    out = {}
    for size_name, max_dim in SIZES:
        thumb = img.copy()
        thumb.thumbnail((max_dim, max_dim), Image.LANCZOS)
        buf = io.BytesIO()
        thumb.save(buf, "PNG", optimize=True)
        out[size_name] = buf.getvalue()
    return out

def bulk_insert_derivatives(cr, product_id, master_bytes):
    """Write all derivatives for a product directly to ir_attachment.
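    Uses a single transaction with a single connection cursor.
    Bypasses ORM compute-chain entirely.
    Assumes database-stored attachments and a unique index on
    (res_model, res_field, res_id), which ON CONFLICT requires."""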
    derivatives = generate_derivatives(master_bytes)
    for size_name, blob in derivatives.items():
        cr.execute("""
            INSERT INTO ir_attachment
                (name, res_model, res_field, res_id, type, datas,
                 file_size, mimetype, create_uid, create_date,
                 write_uid, write_date)
            VALUES (
                %s, 'product.template', %s, %s, 'binary', %s,
                %s, 'image/png', 1, NOW(),
                1, NOW()
            )
            ON CONFLICT (res_model, res_field, res_id)
            DO UPDATE SET datas = EXCLUDED.datas,
                          file_size = EXCLUDED.file_size,
                          write_date = NOW()
        """, (
            size_name,
            size_name,
            product_id,
            base64.b64encode(blob),
            len(blob),
        ))

Key choices and why each matters:

Direct SQL via cr.execute. No ORM. No onchange handlers. No audit log. The trade-off is that you lose Odoo's validation hooks, so you have to be confident the data shape is correct.
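ON CONFLICT (res_model, res_field, res_id) DO UPDATE. If a derivative already exists for this product/field/size, replace it. This makes the operation idempotent, so you can re-run it safely if it gets interrupted mid-batch. PostgreSQL only accepts this clause when a unique index or constraint covers exactly those three columns, so confirm your schema has one before relying on it.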
Base64-encoded datas. Odoo stores image bytes base64-encoded in ir_attachment.datas. If you write raw bytes, Odoo's _compute_image later decodes them as base64 and gets nonsense back.
PIL .convert("RGB"). Strips alpha channel. Amazon requires PNG-RGB (no alpha) for master listings; doing this in the same operation as derivative generation guarantees consistency.
thumbnail() not resize(). thumbnail() preserves aspect ratio. resize() stretches. You want aspect-preservation.
Image.LANCZOS resampling. Better quality than the default at small sizes. Marginal CPU cost, much better thumbnail quality.
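The idempotency claim is easy to check in isolation. The sketch below uses the standard-library sqlite3 module as a stand-in for PostgreSQL, since both support INSERT ... ON CONFLICT ... DO UPDATE. The table `attachment_demo` is a simplified, hypothetical stand-in for ir_attachment, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE attachment_demo (
        res_model TEXT, res_field TEXT, res_id INTEGER,
        datas BLOB, file_size INTEGER
    )
    """
)
# The conflict target must be backed by a unique index on the same columns.
conn.execute(
    "CREATE UNIQUE INDEX attachment_demo_uniq "
    "ON attachment_demo (res_model, res_field, res_id)"
)

UPSERT = """
    INSERT INTO attachment_demo (res_model, res_field, res_id, datas, file_size)
    VALUES ('product.template', ?, ?, ?, ?)
    ON CONFLICT (res_model, res_field, res_id)
    DO UPDATE SET datas = excluded.datas, file_size = excluded.file_size
"""


def upsert(field, res_id, blob):
    """Insert the row, or overwrite it if the key already exists."""
    conn.execute(UPSERT, (field, res_id, blob, len(blob)))


upsert("image_512", 42, b"first attempt")
upsert("image_512", 42, b"second attempt")  # re-run: updates, adds no row

rows = conn.execute(
    "SELECT res_id, datas, file_size FROM attachment_demo"
).fetchall()
print(rows)  # [(42, b'second attempt', 14)]
```

Running the same upsert twice leaves exactly one row, which is the property that makes an interrupted batch safe to restart.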

$Parallelizing the bulk-insert

The PIL resize is CPU-bound. The database INSERT is I/O-bound. The two are independent. Parallelize the PIL work, batch the inserts:

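from concurrent.futures import ThreadPoolExecutor, as_completed

# SQL_INSERT is the INSERT ... ON CONFLICT statement from
# bulk_insert_derivatives, hoisted into a module-level string.
# fetch_master_bytes(cr, ids) returns {product_id: master image bytes}.

def recover_catalog(env, product_ids, batch_size=100):
    cr = env.cr
    with ThreadPoolExecutor(max_workers=8) as ex:
        for i in range(0, len(product_ids), batch_size):
            batch = product_ids[i:i + batch_size]
            # Fetch master bytes for the batch.
            masters = fetch_master_bytes(cr, batch)  # {product_id: bytes}

            # Generate derivatives in parallel (CPU-bound).
            futures = {
                ex.submit(generate_derivatives, m_bytes): p_id
                for p_id, m_bytes in masters.items()
            }
            # Write each result to the DB as it completes. Only this
            # thread touches the cursor; the workers do PIL work only.
            for fut in as_completed(futures):
                p_id = futures[fut]
                derivatives = fut.result()
                for size_name, blob in derivatives.items():
                    cr.execute(SQL_INSERT, (
                        size_name, size_name, p_id,
                        base64.b64encode(blob), len(blob),
                    ))
            # One commit per batch: the rows become visible to other
            # transactions at this point.
            cr.commit()
            print(f"Batch {i // batch_size + 1}: {len(batch)} products")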

With 8 PIL workers and 100-product batches, the catalog recovery ran at ~3 seconds per 1,000 products of CPU + ~1 second of DB write. Total wall time for ~32,500 derivatives across 8,623 products: about 4 minutes. Compare to the 96 minutes of ORM-based recovery.
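The `fetch_master_bytes` helper is used above but not shown. One possible shape, under the same assumptions as the INSERT statement: database-stored attachments, base64 content in `datas`, and masters keyed by `res_field = 'image_1920'`. The query is a guess at the layout, so check it against your own schema. It is written for a psycopg2-style cursor such as Odoo's `env.cr`.

```python
import base64


def fetch_master_bytes(cr, product_ids):
    """Return {product_id: decoded master image bytes} for one batch.

    Hypothetical helper: assumes masters are stored base64-encoded in
    ir_attachment.datas under res_field = 'image_1920'.
    """
    cr.execute(
        """
        SELECT res_id, datas
          FROM ir_attachment
         WHERE res_model = 'product.template'
           AND res_field = 'image_1920'
           AND res_id = ANY(%s)
        """,
        (list(product_ids),),  # psycopg2 adapts a Python list to an array
    )
    # bytes() also unwraps the memoryview psycopg2 returns for bytea columns.
    return {
        res_id: base64.b64decode(bytes(datas))
        for res_id, datas in cr.fetchall()
    }
```

Fetching a whole batch in one query keeps the database round trips at one per 100 products rather than one per product.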

While the bulk insert runs, customer requests hitting product.image_512 find the stored derivative as soon as the batch containing its row is committed; uncommitted rows are not visible to other transactions. The recovery is observable in real time as the on-call channel quiets down.

$What NOT to do

A few attempted shortcuts that don't work:

Don't write raw bytes to datas. Odoo expects base64-encoded bytes in that column. Raw bytes will decode to garbage.
Don't skip the file_size column. Odoo uses it for storage accounting and display. Skipping it leads to zero-byte file claims in the UI.
Don't trust the ORM to "just regenerate" after the operation. If you skip the bulk-insert and just wait for lazy regeneration, the production load during that window is what causes the actual outage.
Don't run this SQL as-is against a filestore-backed install. If your Odoo stores attachments in the data_dir filestore instead of the database, the path is different: you'd write the file to disk and insert only the metadata into ir_attachment. The above SQL assumes database storage. Check the ir_attachment.location key in ir.config_parameter before running.
Don't run this in production without a snapshot and a watchdog. The ON CONFLICT DO UPDATE is idempotent, but it overwrites existing derivatives. Where an existing derivative was already correct, you replace it with freshly computed, equally correct data: same outcome, wasted work. Where your generation logic is wrong, you replace good data with bad. Snapshot first; if anything looks wrong, restore.
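The storage check from the list above can be made a hard precondition of the recovery script. A minimal sketch, assuming a psycopg2-style cursor; the function names are illustrative, and the fallback to 'file' when the parameter is absent is an assumption about Odoo's default that you should confirm for your version.

```python
def attachment_storage(cr):
    """Return the configured attachment backend, e.g. 'db' or 'file'."""
    cr.execute(
        "SELECT value FROM ir_config_parameter WHERE key = %s",
        ("ir_attachment.location",),
    )
    row = cr.fetchone()
    # Assumption: with no parameter set, Odoo falls back to the filestore.
    return row[0] if row else "file"


def require_database_storage(cr):
    """Abort before any write if attachments are not stored in the database."""
    storage = attachment_storage(cr)
    if storage != "db":
        raise RuntimeError(
            f"ir_attachment.location is {storage!r}; "
            "the direct-SQL insert assumes database storage"
        )
```

Calling `require_database_storage(env.cr)` at the top of the recovery function turns a silent data-layout mismatch into an immediate, readable failure.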

$The transferable lesson

Odoo's ORM is excellent for transactional business logic. It's bad at bulk image operations on production catalogs at scale. The pattern that works is: bypass the ORM for the bulk path, write direct SQL, generate the side-effect artifacts (derivatives) yourself in parallel.

This is also the pattern that works for: bulk variation-family corrections on Amazon, bulk price re-pushes after a feed rejection, bulk inventory reconciliation after a Walmart sync error. Anywhere "thousands of records, one field change per record, side-effects need to fire" is the spec, the ORM is the wrong tool. Direct SQL plus explicit side-effect handling is the right one.

The cost is that you lose Odoo's validation safety net. That cost is acceptable if you've snapshotted state first, have a watchdog that can flag anomalies, and have audited the SQL against a real schema. Without those guardrails, the direct-SQL pattern is a footgun.