# Stacked tables & reinitialisation — the layout engine, in depth > 🎨 **Colour walkthrough:** [open the interactive visual version](../visualizations/concepts.html) — 7 toggleable before/after scenarios, including the reinitialise-with-more-columns centerpiece. The Unicode grids below are the always-available text version. > **Audience.** You want to lay out *more than one* managed table on a sheet and > evolve them safely over time — grow them, shrink them, widen them when the > upstream Benchling schema changes — without ever silently corrupting data. > This is the conceptual reference for the layout engine that makes that safe. > > **Scope.** Everything here is grounded in the real implementation: > `src/xlwings_package/excel_io.py` (`paste_df_to_named_range`), > `src/xlwings_package/excel_utils.py` (the invariant / band / danger-zone > machinery), `src/xlwings_package/benchling_io.py` > (`initialise_table_benchling` + `_reinit`), and > `src/xlwings_package/exceptions.py`. Related docs: - Quick rules for where to put loose content: [../safe-placement.md](../safe-placement.md) - Bookmark recovery: [../bookmark-auto-heal.md](../bookmark-auto-heal.md) - Stale-name pruning: [../cleanup-stale-names.md](../cleanup-stale-names.md) - Task guides: [../guides/normal-tables.md](../guides/normal-tables.md) · [../guides/benchling-single.md](../guides/benchling-single.md) · [../guides/benchling-multiple.md](../guides/benchling-multiple.md) · [../guides/plates.md](../guides/plates.md) - API surface: [../reference/tools.md](../reference/tools.md) - Interactive companions (open in a browser from inside the repo): `../visualizations/stacked-tables-contract.html`, `../visualizations/schema-drift-with-data.html`. --- ## Reading the grid mockups Throughout this doc, sheet states are drawn as Excel grids with **column letters across the top** and **row numbers down the left**. Markers on the right edge point at the structural rows the engine cares about:

Assay results (title, merged)

← title row

← header row (schema anchor)

10start

20end

(gap — empty, safe to grow into)

qc (next stacked table's title)

← stacked neighbor below

Legend used in every diagram: | Glyph / shade | Meaning | |--------------------------|-------------------------------------------------------------| | `title` / merged row | The Benchling title cell (`_write_title`) | | `header` row | Column headers; the **schema anchor** named range lives here| | `d1`, `d2`, … | Existing data rows | | `D1`, `D2`, … | Data *written this operation* | | `(gap)` | Empty rows between this table and the next stacked table | | `▼ DANGER ZONE ▼` | The lane band below a table that the table owns (see §3) | --- ## 1. The managed-table model A **managed table** is a rectangular region of a worksheet whose structure the package tracks with **sheet-scoped named ranges**. The package never relies on Excel's built-in ListObject "Tables"; it builds its own lightweight scaffolding out of names so it works identically on any sheet, in any workbook. There are four kinds of name the engine attaches to a table named ``: ### 1.1 The header anchor (``) A single named range whose `refers_to_range` is the **header row** of the table — one cell per column, contiguous. For a Benchling table this is the *schema anchor*: its short token is the schema's warehouse-derived range name (see §1.4), and it spans every schema column.

← header anchor assay_x = $A$2:$C$2

`paste_df_to_named_range` resolves this anchor first (`_resolve_paste_target`): the header row, the start column, and the current header width all come from it. The anchor's *width* is the table's declared column count. ### 1.2 Per-column names Each header cell may carry its own single-cell sheet-scoped name — the **column token** (e.g. `sample`, `batch`, `result`). These are what `import_excel_table` keys its DataFrame columns on (with `use_name_range_names=True`), and what `_check_df_columns_match_headers` matches against on paste. They are created by `initialise_table_named_ranges` (manual tables) or `_write_fresh` / `_reinit` (Benchling tables). ### 1.3 The start/end bookmarks (`_table_start` / `_table_end`) This is the heart of the engine. Two **sheet-scoped** names track the *live data extent* — not the header, the **data**: - `_table_start` → top-left **data** cell (the cell at `header_row + 1`, start column). - `_table_end` → bottom-right **data** cell (last data row, last *data* column).

10start

assay_x_table_start = $A$3

30end

assay_x_table_end = $C$5

Every call to `paste_df_to_named_range` rewrites these two bookmarks (`_write_data_and_bookmarks`) so the *next* repaste knows exactly which rows to clear and how many rows to shift. The bookmark prefix (``, the substring before `_table_start` / `_table_end`) is how `_iter_managed_table_extents` discovers every managed table on a sheet — this is the identity the invariant check, the cluster band, and the danger-zone scan all operate on. > **Why a separate end bookmark instead of "scan down until blank"?** > A scan can't tell the difference between a deliberately-empty cell in the > middle of a sparse dataset and the true bottom of the table. The bookmark > records the truth. (Reads still tolerate a *stale* bookmark by scanning and > growing — see [../bookmark-auto-heal.md](../bookmark-auto-heal.md) — but the > *write* path trusts the bookmark to clear precisely.) ### 1.4 The schema manifest (Benchling only) — `__schema_columns` A Benchling-managed table also carries a fifth name: the **manifest**. It is a sheet-scoped name whose `refers_to` is a *string formula* listing the canonical warehouse column names: ``` assay_x__schema_columns refers_to: ="sample,batch,result" ``` The manifest is the engine's record of *which columns Benchling owns*. It is read by `_read_schema_manifest` and written at the end of every `_write_fresh` and every `_reinit` (`_write_schema_manifest`). Its job is to let `_reinit` distinguish **schema-owned columns** (which it may rename/purge when Benchling drops them) from **manual named ranges a developer added inside the band** (which it must leave alone — the "M.2.1" protection). It is also the ownership key for `cleanup_stale_table_names`: no manifest ⇒ `ManifestNotFoundError`, because this package only manages tables it created. > Warehouse names containing a comma or `"` would break the round-trip parse, so > `_write_schema_manifest` raises eagerly. Benchling warehouse names are > snake_case by convention, so this never fires in practice. ### 1.5 Putting it together A two-table stacked Benchling sheet, fully scaffolded:

Assay X (title, merged)

anchor assay_x = $A$2:$C$2

10start

20end

assay_x_table_start $A$3 · assay_x_table_end $C$4

(gap)

Assay Y (title, merged)

well

conc

odanchor

anchor assay_y = $A$8:$C$8

0.2start

0.3end

assay_y_table_start $A$9 · assay_y_table_end $C$10

Names on this sheet: `assay_x`, `assay_x_table_start`, `assay_x_table_end`, `assay_x__schema_columns`, the per-column tokens (`sample`, `batch`, `result`), and the same family for `assay_y`. The two tables share columns A–C, placing them in the **same lane**, and occupy **disjoint rows** (2–4 vs 8–10). Within a lane, row disjointness is required — see §2. --- ## 2. The lane model and the row-overlap invariant A **lane** is the column-connected group of managed tables returned by `_cluster_for_operating_table` — tables stacked vertically so that members are **row-disjoint within the lane**. The lane's column band runs from the leftmost column of any member to the rightmost (`min(member lefts)` … `max(member rights)`). Every row-grow and row-shrink is bounded to that band, so tables in other lanes — **column-disjoint by construction** — are never touched. Side-by-side, L-shaped, and T-shaped layouts are simply **several lanes in separate column bands that happen to share rows**. Because each operation is bounded to its own lane's columns, the lanes are independently safe: ``` A B C E F G H J+ (free) 2 [Xh Xh Xh] [Yh Yh Yh Yh] 3 [ x x x] [ y y y y] 4 [ x x x] [ y y y y] 5 [ x x x] <- Y grows/shrinks in E-H; 6 A-C is never touched 7 [ x x x] <- X grows down freely LANE 1 (A-C) LANE 2 (E-H) danger zone A-C danger zone E-H disjoint columns => independent forever ``` `SideBySideNotSupportedError` is raised only for **row overlap within a single lane** (column-overlapping tables). Two tables in disjoint column bands sharing rows is not an error — they are different lanes. ### 2.1 What counts as row overlap The check in `_assert_stacked_only` operates on **row intervals within the operating table's lane** only. For two managed tables with extents taken from their bookmark unions via `_iter_managed_table_extents`: - Table P spans rows `[Pr_min, Pr_max]`. - Table Q spans rows `[Qr_min, Qr_max]`. A table is part of the operating lane iff it shares at least one column with the transitive cluster anchored at the operating table (§3.2). A table whose columns are entirely disjoint is a **different lane** and is skipped. Within a lane they **overlap** iff `not (Pr_max < Qr_min or Qr_min > Pr_max)`. **Valid vertical stack** — rows disjoint, columns identical (same lane, fine):

← Table X rows 2–4 (Lane A-C)

(gap)

well

conc

← Table Y rows 6–8 (same lane)

0.2

0.3

X rows [2,4] ∩ Y rows [6,8] = ∅ ✔ ACCEPTED

**Also valid — disjoint-column side-by-side (two lanes sharing rows):**

Y_h

← Lane 1 (A–C) and Lane 2 (E–F) share row 2

Column-disjoint: different lanes. X rows [2,4] ∩ Y rows [2,4] = [2,4] but ACCEPTED — different lanes

**Also valid — side-by-side in truly disjoint columns (two lanes sharing rows):** ``` A B C E F G ✓ col D is empty gap [Left ] [Right ] Two separate lanes — paste into either succeeds ``` The key criterion is **column disjointness**: Left (A-C) and Right (E-G) share no column, so they are independent lanes regardless of which rows they occupy. **Edge case — wide bridging table merges two groups into one lane:** A table whose column span *bridges* two otherwise-disjoint groups merges them transitively into one lane. If Wide covers A-G, it overlaps both Left (A-C) and Right (E-G) — none of A-C, E-G are disjoint from A-G — so all three are in the same lane. Right's rows then overlap Left's rows **within that merged lane**, and any paste into Right raises `SideBySideNotSupportedError`. ``` A B C E F G ✓ before Wide is added: two independent lanes [Left ] [Right ] A B C D E F G ✗ after Wide (A-G) straddles both: [Left ] [Right ] Right is now part of Left's lane; [ Wide (A-G) ] paste into Right → SideBySideNotSupportedError ``` Fix: move Right past Wide's right edge (e.g. cols I-K) so it is column-disjoint from A-G. It then becomes its own independent lane again. **Invalid — row overlap within the same lane:**

Y_h

← X (A–C) and Y (D–F) share a column cluster via C/D overlap → same lane, BOTH at rows 2–4 ✘

Column-overlapping tables, same lane, rows [2,4] = [2,4] ✘ REJECTED → SideBySideNotSupportedError

### 2.2 Why row overlap within a lane is rejected The engine resizes a table by **inserting or deleting whole worksheet rows** across its lane's column band (§4). An Insert/Delete in that band moves every cell in those columns. If a second table in the **same lane** (overlapping columns) shared even one of the operating table's rows, the row-shift would tear it — its left columns shifted, its right columns not, or vice versa. Column overlap is the criterion because the row-shift only touches rows **strictly below** the operating table's old bottom row (`_shift_band_below`, §4.2). A stacked neighbor below shifts down or up **as a whole unit** — all its columns move by the same delta — so it stays intact. Tables in a different lane (disjoint columns) are in a different Insert/Delete band entirely and are never touched. ### 2.3 The projection: when does the check fire? `_assert_stacked_only` does not just compare *current* rows. It compares the operating table's **post-operation** row range against every other **same-lane** table's **projected** post-operation row range, because the shift the engine is about to perform will move some lane-mates: - It computes the **lane band** — the transitive column-overlap closure anchored at the operating table's columns (see §3.2 / `_cluster_for_operating_table`). - A neighbor that (a) overlaps the lane's columns **and** (b) currently sits **below** `op_old_end` is projected to shift by `delta` rows: `(R_min + delta, R_max + delta)`. - A neighbor in a **different lane** (column-disjoint) is skipped entirely. - The invariant fires iff **any** projected same-lane range intersects the operating table's post-operation range `[op_top, op_bottom_post]`. So the check is *forward-looking*: it rejects layouts that would overlap **after the shift completes** within the lane, not merely ones that overlap now. ### 2.4 Exactly where it fires | Call site | When | `op_top` / `op_bottom_post` | `delta` | |---|---|---|---| | `initialise_table_named_ranges` (manual init) | after computing the footprint, **before** writing bookmarks | header_row / footprint bottom | `0` | | `initialise_table_benchling` → `_write_fresh` path | via the same init machinery for fresh tables | header / footprint | `0` | | `paste_df_to_named_range` | **before** any destructive op | header_row / `header_row + n_new` | the computed shift amount | | `create_plate_layout` | before the plate paste | plate extent | plate row delta | In every case the check runs **before** anything destructive, so a rejection leaves the sheet exactly as it was — no half-written bookmarks, no cleared cells. At init the check runs *before* the bookmarks are even written, so a rejected init leaves no stale names behind. --- ## 3. The danger zone & safe placement ### 3.1 What the danger zone is A lane **owns the band directly below its stack, within its column range, indefinitely** — because that's the space its data grows into. That band is the **danger zone**. Loose content (a stray note, a summary formula, a label) placed there will be shifted or destroyed when the table grows or shrinks. Floating shapes (images, text boxes) anchored in the band are also detected and raise before any shift occurs (see §3.4).

20end

▼ danger zone — Lane A–C; col D is outside this lane, SAFE

▼

SAFE

owned by

the lane

below ↓

SAFE

▼

SAFE

Three regions are **always safe** (see [../safe-placement.md](../safe-placement.md)): 1. **Above the title row.** The shift only ever inserts/deletes *downward*. 2. **In columns outside the lane's band.** Content in a different lane (disjoint column band) is never in any lane's shift band. A second managed table in a disjoint column band sharing the same rows is a different lane — it is accepted, not rejected (§2). 3. **Cells you protect with their own named range.** Both the danger-zone scan and the widen-claim scan exempt any cell covered by a named range other than the operating table's own bookmarks. Shapes cannot be exempted this way — a shape anchored in the zone always raises. ### 3.2 The lane band — how wide is "the lane's columns"? The danger zone covers the **full column extent of the operating table's lane** (`_cluster_for_operating_table`, used by `_managed_band_for_sheet`). A managed table joins the lane if its columns share ≥1 column with **any table already in the lane** — a BFS closure, not a first-degree test: ``` operating table cols 1–3 T_A cols 2–5 ← overlaps operating → in lane T_B cols 4–8 ← overlaps T_A (not operating) → in lane (transitive) T_C cols 20–25 ← disjoint from all → different lane, excluded ⇒ lane band = cols 1–8. The Insert/Delete spans cols 1–8 so the whole chain shifts as one unit; T_C is in a different lane and is never touched. ``` The danger-zone scan (`_check_danger_zone`) and the row-shift (`_shift_band_below`) use **the same band**, so anything that *would* be shifted is also *inspected* first. That's the safety guarantee. ### 3.3 Grow that pushes a neighbor down — SAFE The neighbor below is itself a managed table (covered by its bookmark extent), so the danger-zone scan treats it as "safe to shift as a unit" and the grow proceeds. Before — operating table X (rows 2–4) with stacked neighbor Y (rows 6–8); paste a 4-row DataFrame into X:

BEFORE — X: n_old = 2

d2end

(gap)

Y_h

→

AFTER — X: n_new = 4, delta = +2

D4end

(gap — preserved)

Y_h

Y moved from rows 6–8 to rows 8–10 — pushed down by `delta = +2`. The 1-row gap between X and Y (row 5 before → row 7 after) is preserved (§4.3). ### 3.4 Grow that hits loose content or a shape — REFUSED (`TableSpaceOccupiedError`) Now suppose instead of a managed neighbor, there's a **loose note** in the danger zone (col B, row 6) with no named range protecting it:

d2end

▼ danger zone — lane X (cols A–C); attempt: paste 4 rows

NOTE: QA

```text _check_danger_zone scans rows 5..(old_end+100) across the lane band (A–C), finds "NOTE: QA" at B6 — not covered by any managed table, not protected by a named range — and raises BEFORE any cell is cleared or shifted. ``` > `TableSpaceOccupiedError` — *"Raised when a write would shift or destroy > non-table content in the target band — either the danger zone below a managed > table or the cells a fresh header would land on. Distinct from the > width-mismatch case (DataFrame wider than the declared header), which raises > plain `ValueError` because it is caller-side input validation, not a > sheet-state conflict."* **Fix options** (all in the message): move the note above the title row; move it to a column outside the lane band; or give B6 its own named range to mark it intentional (the scan then exempts it). **Shapes are also detected.** `_check_danger_zone` enumerates `sheet.shapes` and checks each shape's anchor cells (`TopLeftCell` / `BottomRightCell`) against the affected region. A floating image or text box anchored inside the lane band raises `TableSpaceOccupiedError` with the shape's name and a **move instruction** computed as: "move it at least N rows down (away from the table), or above the title row." Shapes cannot be exempted by a named range; any shape anchored in the zone raises. After moving the shape and re-running, the scan re-checks and proceeds. No persistent state is stored between attempts. For button-driven macros: the caller catches the exception and passes `str(exc)` to `display_message_and_exit` (`ui.py`). The engine itself never calls `display_message_and_exit` — keeping the core testable and context-free. ### 3.5 The claim zone — widening When a paste's DataFrame is **wider** than the current header, the engine auto-widens the header into new columns to the right (§4.4). Those new columns were never part of the table, so before touching them `_check_widen_claim_zone` runs **two** scans — both raising `TableSpaceOccupiedError` on loose content or shapes: 1. **New-column extension in the operating rows.** Cells in `(op_top..scan_bottom, old_header_end+1 .. new_header_end)` would be overwritten by the new header / data write.

(new)

← col D about to become header (df has 4 cols; header is A–C)

LEFTOVER

← claim-zone scan finds this → refuses

2. **Post-widen lane extension below the table.** If the new columns *bridge* the operating table into a previously-disjoint stacked table (widening the lane band), any loose content in those *newly-in-lane* columns inside the danger zone would now be caught by the band-Insert. The pre-widen `_check_danger_zone` couldn't have seen those columns, so this second scan covers them. Both scans check cells **and** floating shapes (`_shapes_in_region`). A shape anchored in the claim zone raises with a **column move instruction** computed by `_column_move_hint`: "move it at least N columns to the right (away from the table)." Cells covered by a named range or another managed table are exempt; shapes are never exempt. Both scans exempt cells covered by a named range or another managed table. --- ## 4. The paste shift mechanism — clear → shift → write `paste_df_to_named_range` resizes a table with a **three-step multi-pass shift**. This is the precise mechanism, in execution order, exactly as implemented. ### 4.0 Pre-flight (before any destructive step) In order, *all before anything is cleared or moved*: 1. **Sanitize the DataFrame** (`_sanitize_df_for_excel`): `±inf → NaN`; truncate strings over Excel's 32,767-char cell limit (with one `UserWarning`); and — because **`sanitize_formulas=True` by default** — apostrophe-escape text cells starting with `=` `+` `-` `@` so external data can't inject a live formula. Numbers and numeric-looking strings are left alone. The caller's DataFrame is never mutated. 2. **Resolve the target** (`_resolve_paste_target`): header range, start column, current header width, first data row. 3. **Column-order check** (`_check_df_columns_match_headers`): every existing header position must match the DataFrame column at that position — see §6.5. Raises `ColumnOrderMismatchError` if not. 4. **Read the old extent** (`_read_op_extent`): `old_start` / `old_end` from the live bookmarks, with auto-heal (§5.1) when they're missing/corrupt/inverted. 5. **Compute the shift amount** (§4.3) and **assert the stacked-only invariant** (§2) on the projected post-paste rows. 6. **Run the widen-claim scan** (§3.5) if the DataFrame is wider than the header. Only after all of the above passes does the destructive sequence begin. ### 4.1 Step 1 — ClearContents `_clear_old_data_rows` runs the danger-zone scan (§3.4) and then `ClearContents` on the operating table's *own* data cells — `(old_start, start_col)` through `(old_end, header_end_col)` — **in place, no row shift**. On a first paste (`old_end < old_start`) this is a no-op. ### 4.2 Step 2 — Shift (Insert / Delete whole rows) `_shift_band_below` opens or closes row slots **strictly below `old_end`**, across the cluster band `[band_left, band_right]` (§3.2). Let `delta = n_new - n_old_effective`: - `delta > 0` (**grow**): `Range.Insert(Shift=xlShiftDown)` for `delta` rows at `old_end + 1`. Everything in the band below `old_end` slides **down** by `delta`. - `delta < 0` (**shrink**): `Range.Delete(Shift=xlShiftUp)` of the bottom `|delta|` rows of the old extent. Everything below slides **up** by `|delta|`. - `delta == 0`: nothing. Because the shift only ever touches rows **below `old_end`**, cells at or above `old_end` — including the header itself and any loose content sharing the operating rows but above `old_end` — are never moved by this step. Content in a different lane (disjoint columns) is also untouched because the Insert/Delete spans only the current lane's column band. (Row overlap within the same lane is already rejected by §2 before this step runs.) ### 4.3 Step 2 (cont.) — how the gap to the neighbor below is preserved The shift amount differs between first and subsequent pastes, and that difference is exactly what preserves the user's chosen gap. **Subsequent paste (`n_old > 0`)** — pure delta: ``` shift_amount = n_new - n_old # (excel_io.py) ``` The neighbor below is pushed down by `+delta` on grow, pulled up by `|delta|` on shrink. Whatever gap existed between this table and the next stacked table is mechanically preserved, because both the table's new bottom and the neighbor's top move by the *same* delta. **First paste (`n_old == 0`)** — gap-aware: ``` gap_below = _nearest_neighbor_gap_below(...) # empty rows below header to nearest neighbor shift_amount = max(0, n_new - gap_below) ``` On the very first paste there are no data rows yet, but the *user* has already laid out a gap between this table's header and the next table (`table_gap`, 2 by default). The engine **consumes that gap first** and only pushes the neighbor by the **deficit** (`n_new - gap`) when the data is taller than the gap. Data that fits in the existing gap is written into it without displacing the neighbor at all. `_nearest_neighbor_gap_below` scans by **cell content** (not by bookmark), because a header-only neighbor that's never been pasted into has only its anchor named range — no `_table_start`/`_table_end` bookmarks yet — so it's invisible to `_iter_managed_table_extents`. Scanning cell content catches the neighbor's header cells the way the user sees them. (When no neighbor is found within the scan window, it returns a large sentinel = "unlimited room below".) **First-paste grid — data fits the gap (no neighbor displacement):**

BEFORE — X header only, gap = 2

(gap)

Y_h

→

AFTER — paste 2 rows; n_new = 2 ≤ gap

← written into the gap

Y_h

← UNMOVED (shift = max(0, 2−2) = 0)

**First-paste grid — data exceeds the gap (push by the deficit only):**

BEFORE — gap = 2

(gap)

Y_h

→

AFTER — paste 3 rows; shift = max(0, 3−2) = 1

← consumed gap + 1 new row

Y_h

← shifted down by 1 only

The net effect across **any** sequence of grows and shrinks is that the gap set at initial layout time is the gap you keep, forever. ### 4.4 Step 3 — Write + bookmarks (and column-width handling) Between the clear and the write, the header is widened if needed (`_extend_header_for_wider_df`), then `_shift_band_below` runs in the **post-widen** band, then `_write_data_and_bookmarks`: - Writes `df.values` at `(old_start, start_col)` — **the index is never written**. - `ClearFormats` on the written block. - Upserts `_table_start` and `_table_end` to the new data corners. **Widen.** When `df` has more columns than the header, the header named range is auto-extended rightward; new header cells take their display names from `df.columns`. **No `Insert columns` is needed** — the lane-row-disjoint invariant guarantees no same-lane table shares the operating rows, so the new header cells land on rows nobody in the lane owns. Content in a different lane at the same rows is unaffected because the widen only writes into the lane's new column(s). (If the rightmost new column would pass Excel's last column XFD = 16,384, `ExcelColumnOverflowError` is raised first.) **Narrow.** When `df` has *fewer* columns than the header, the trailing data cells are already cleared by Step 1 (the clear spans `header_end_col`). The header named range **stays at its declared width**; the `_table_end` bookmark tracks the *data* extent, ending at `start_col + df.shape[1] - 1`. So a later re-widen to the original width needs no header change.

WIDEN — header A–C, df has 4 cols

NEW

← header extended

D2end

_table_end → (4, D)

NARROW — header A–D, df has 3 cols

← header KEPT wide

D1end

—

← D cleared

—

_table_end → (4, C) (data extent)

### 4.5 Failure ordering The steps are ordered so any failure leaves the sheet consistent: 1. Invariant check (no mutation) → can abort cleanly. 2. Widen-claim scan (no mutation) → can abort cleanly. 3. Clear old data — runs the danger-zone scan first; raising here means **no header has been mutated yet**, safe to abort. 4. Widen header (commits only after the clear succeeded). 5. Shift + write. --- ## 5. Bookmark auto-heal & `cleanup_stale_table_names` Workbooks get edited by humans in Excel. Columns get deleted, rows get right-clicked away, names get fiddled in the Name Manager. The engine recovers from the common forms of damage. ### 5.1 Bookmark auto-heal (the write path) `_read_op_extent` resolves `_table_start` / `_table_end`. If the pair is **missing, corrupt (`#REF!` or any COM error), or inverted (`end < start`)**, it logs a DEBUG breadcrumb, **drops the fragments** (`_drop_names`), and returns an empty extent so the next paste falls through to the **first-paste path** — writing new data in the row immediately below the header, exactly as a fresh table would.

BROKEN — user deleted _table_end

← header anchor OK

???

_table_start present, _table_end #REF!

???

→ drop both fragments, first-paste path

→

HEALED — next paste, n_old treated as 0

← rewritten from header+1

→ both bookmarks rewritten fresh

| Trigger | Cause | Outcome | |---|---|---| | `_table_start` missing | Deleted via Name Manager | Drop `_table_end`, first-paste path | | `_table_end` missing | Same | Drop `_table_start`, first-paste path | | Either `#REF!` | Right-clicked → Delete on data rows | Drop both, first-paste path | | `end.row < start.row` | Manual mis-edit | Drop both, first-paste path | Recovery is silent except for a DEBUG log. See [../bookmark-auto-heal.md](../bookmark-auto-heal.md). > **Read paths heal differently.** `import_excel_table` / > `scan_named_range_table` *grow* a stale `_table_end` (scan down, extend) and > rewrite missing/orphaned bookmarks, but **never auto-shrink** — a read should > never lose track of data it can see. ### 5.2 `cleanup_stale_table_names` (on-demand pruning) `#REF!` names accumulate when a user deletes a column manually in Excel — the named range lingers but `refers_to` resolves to `=#REF!#REF!`. `cleanup_stale_table_names` prunes all such broken names for a managed table. ```python from xlwings_package import cleanup_stale_table_names cleanup_stale_table_names(wb, "assay_x") # prune #REF! names cleanup_stale_table_names(wb, "assay_x", sheet="Results") # one sheet only # → {"removed_ref": ["Results!col_b"]} ``` Ownership is decided by name pattern + the manifest. Because ownership keys off the manifest, a table with **no manifest** raises `ManifestNotFoundError`: > `ManifestNotFoundError` — *"Raised when a Benchling-managed table has no > schema-column manifest on the requested sheet(s) — i.e. the table was never > initialised by `initialise_table_benchling` or the manifest name was deleted > manually."* (A missing `sheet=` raises `WorksheetNotFoundError`.) Details: [../cleanup-stale-names.md](../cleanup-stale-names.md). --- ## 6. CENTERPIECE — reinitialise a stacked Benchling table with MORE columns This is the payoff: two stacked Benchling tables on one sheet, the upstream schema gains a column, you call `initialise_table_benchling` again, and the table **auto-widens in place** — header extends, named ranges rebind, manifest updates — while the stacked neighbor below is **protected**. ### 6.0 Starting state — two stacked tables, with data `assay_x` (cols A–C) stacked above `assay_y` (cols A–C), both with data and a 2-row gap between:

Assay X (title)

← anchor assay_x $A$2:$C$2

10start

20end

assay_x_table_end = $C$4

(gap — 2 rows)

Assay Y (title)

well

conc

odanchor

← anchor assay_y $A$8:$C$8

0.2start

0.3end

assay_y_table_end = $C$10

manifests: assay_x__schema_columns = "sample,batch,result" · assay_y__schema_columns = "well,conc,od"

### 6.1 The upstream schema gains a column In Benchling, `Assay X` adds a new result field **`qc`** at the **end** of its schema. The schema is now `[sample, batch, result, qc]`. ### 6.2 The reinit call (runnable) Reinit is automatic: `initialise_table_benchling` detects that `assay_x` already exists on the sheet (`_find_table` finds the anchor name) and routes to `_reinit` instead of `_write_fresh`. You call it **exactly the same way you did the first time**: ```python from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext from xlwings_package import initialise_table_benchling # You build the context yourself — credentials are mgtx-benchling-wrapper's job. ctx = BenchlingContext( client_id=..., client_secret=..., # plaintext — resolve from your store base_url="https://yourorg.benchling.com", token_url="https://yourorg.benchling.com/api/v2/token", ) # Same two schemas, same sheet, same column. assay_x's schema now has `qc`. results = initialise_table_benchling( wb, { "Results": ["assaysch_assayx", "assaysch_assayy"], }, ctx=ctx, column="A", # autopaste; both stack in column A table_gap=2, # the gap that's preserved across reinits/pastes ) # Each result dict carries action + a diff when reinitialised: for r in results: print(r["display_name"], r["action"], r.get("diff")) # Assay X reinitialised {'added': ['qc'], 'updated': [], 'removed': [], ...} # Assay Y reinitialised {'added': [], ...} # unchanged ``` ### 6.3 What `_reinit` does, step by step `_reinit` is the in-place updater. For `assay_x` gaining `qc`: 1. **Locate the table** — `_find_table` returns `(header_row=2, start_col=1, end_col=3)`. 2. **Re-sync display names + title** — column headers and the title cell are rewritten from the schema on every call (Benchling is the source of truth for text). 3. **Diff against the manifest** — `cur_map_schema_only` is `cur_map` filtered to the manifest `{sample, batch, result}`, so any manual range a dev added in the band is excluded (M.2.1). `qc` is in the new schema but not in `cur_map` → it's a **new column**. 4. **Append the new column.** New columns are appended at the right edge of the current band (`end_col + 1`, col D). `_check_band_empty` confirms D2 is free first (raises `TableSpaceOccupiedError` if not). The header cell gets the display name + bold + fill, and a fresh sheet-scoped range `qc` binds to `$D$2`. **No `Insert columns` is emitted** — the reorder/repack pass (step 5) places every column in schema order via a clear-and-rewrite, so appending then repacking yields the same final layout without a propagating column shift. 5. **Reorder pass** — the active band is snapshotted, cleared, and rewritten in schema order. Named ranges are rebound in place. Because this operates only on the lane's own columns (clear-and-rewrite, no Insert/Delete), no other lane is affected. 6. **Extend the table extent** — the schema anchor grows to `$A$2:$D$2`, the title merge re-merges across A–D, and `assay_x_table_end` extends to col D. 7. **Rewrite the manifest** — `assay_x__schema_columns = "sample,batch,result,qc"`. ### 6.4 BEFORE / AFTER — the widen, with the neighbor protected

BEFORE — assay_x = [sample, batch, result]

← anchor $A$2:$C$2

20end

assay_x_table_end $C$4

(gap — 2 rows)

Assay Y (title)

well

conc

0.2

0.3

manifest: assay_x__schema_columns = "sample,batch,result"

→

AFTER — assay_x = [sample, batch, result, qc]

qcanchor

← anchor NOW $A$2:$D$2, qc → $D$2

(empty)

← existing data INTACT

(empty)end

assay_x_table_end NOW $D$4

(gap — 2-row gap PRESERVED)

Assay Y (title)

← Y UNMOVED — still rows 7–10

well

conc

0.2

0.3

manifest now: assay_x__schema_columns = "sample,batch,result,qc"

Why the neighbor is protected: the reorder/repack pass performs a **clear-and- rewrite on the lane's own columns only** — no row Insert/Delete, no column Insert/Delete that propagates rightward. The new `qc` header lands at D2 and its data column extends straight down into column D; the repack places it in schema order by value-snapshot → clear → rewrite, so `assay_y` at rows 7–10 is untouched. The gap is preserved. Existing `sample`/`batch`/`result` data is preserved cell-for-cell; only the headers are re-synced and `qc`'s cells start empty until the next paste. ### 6.5 Re-pasting after the schema change — column-order matching After reinit, you pull fresh data and repaste. The DataFrame's columns must line up with the headers, because `paste_df_to_named_range` writes `df.values` **positionally** — column 0 under the first header, column 1 under the second, and so on. `_check_df_columns_match_headers` enforces this **before any destructive op**, matching each DataFrame column against either the per-column range token or the header display value (case- and whitespace-normalized): ```python from xlwings_package import import_excel_table, paste_df_to_named_range # After reinit, the sheet header order is [sample, batch, result, qc]. df = pull_assay_x_results() # however you build it # df.columns must be [sample, batch, result, qc] in THIS order. paste_df_to_named_range(wb, df, "assay_x") ``` If the DataFrame's columns are **reordered**, or a column was **inserted in the middle** or **dropped from the middle** relative to the sheet headers, the positional write would silently misalign data — so it raises instead: > `ColumnOrderMismatchError` — *"Raised when a DataFrame's columns don't match > the existing Excel header positions on a paste. `paste_df_to_named_range` > writes `df.values` positionally — column 0 lands under the first header cell, > column 1 under the second, etc. If `df.columns` is reordered (or a column was > inserted in the middle, or one was removed from the middle), the positional > write would silently misalign data with the headers. This error fires before > any destructive operation so the user can either reorder the DataFrame or > re-initialise the table to reflect the new schema. Matches against either the > column-level sheet-scoped named-range token or the header cell display value > (case + whitespace normalized), so both the Benchling workflow > (`use_name_range_names=True` → range-name tokens) and the manual-table workflow > (display values) work without configuration."*

sheet

sample

result

batch

← positions 1 & 2 swapped (batch/result) → ColumnOrderMismatchError

```text Fix A: reorder the DataFrame to [sample, batch, result, qc] before pasting. Fix B: if Benchling itself reordered, re-run initialise_table_benchling first (the reorder pass rebinds the sheet to the new order), THEN paste. ``` Only **existing** header positions are checked; columns past the current header are handled by the auto-widen path (and pre-scanned by the claim zone, §3.5). A freshly-anchored, empty-header position has no identity to match against and is accepted (this is what lets a first paste into a brand-new table go through). ### 6.6 The mirror case — Benchling DROPS a column When a column is removed from the Benchling schema, `_reinit` **purges it outright**: clears every cell in the column from the header through `_table_end`, deletes its named range, **closes the horizontal gap** so the remaining schema columns are contiguous (the repack pass), and **shrinks the table extent** — schema anchor width, title merge, and `_table_end` — to the new rightmost column. The stacked neighbor below is untouched: the purge operates on **columns**, not rows, so no vertical shift reaches `assay_y`.

BEFORE — schema [sample, batch, result]; Benchling drops batch

batch

← batch dropped upstream

20end

assay_x_table_end $C$4

→

AFTER — batch purged; result closes up into B, table shrinks to A–B

result

(empty)

← anchor NOW $A$2:$B$2

(empty)

20end

(empty)

assay_x_table_end NOW $B$4

The repack/shrink is guarded by `all_active_resolved` — the engine only compacts and shrinks when **every** active schema column's name resolved, so it can guarantee a contiguous block. If a name were missing it leaves the wider extent intact rather than risk stranding a live column outside the bookmark. > See `../visualizations/schema-drift-with-data.html` for animated > add-end / add-middle / remove / reorder cases on a table that already holds > data, and `../visualizations/stacked-tables-contract.html` for the row/column > operations and the lane layout. --- ## 7. Mental-model summary - A managed table = **header anchor + per-column names + `_table_start`/`_table_end` bookmarks** (+ a **manifest** for Benchling tables). The bookmark prefix is the table's identity. - **Lanes.** A lane is the transitive column-overlap cluster anchored at the operating table. Row overlap is rejected **only within a lane** (column-overlapping tables). Column-disjoint tables sharing rows are different lanes and are fully independent. `SideBySideNotSupportedError` means "row overlap within a single lane." - A lane **owns the danger zone below it** across its **lane band** (transitive column-overlap closure). Loose content or shapes there → `TableSpaceOccupiedError` with a move instruction. Safe places: above the title, outside the lane's columns, or under its own named range (shapes cannot be exempted by a named range). - A paste resizes via **clear → shift → write**. The shift only touches rows **below** the old bottom, so stacked lane-mates move as a unit and the **gap is preserved** — pure `delta` on subsequent pastes, gap-aware on first paste. Tables in other lanes are never moved. - Widen = extend the header rightward, **no column insert** (the lane invariant makes it safe). Narrow = clear trailing cells, **keep the header width**, bookmark tracks data extent. - **Reinit auto-widens in place**: new columns append at the right edge, the repack pass places them in schema order (clear-and-rewrite, no propagating shift). Dropped columns are **purged + closed up + shrunk** (cells cleared, name deleted, repack closes the gap). Neighbor lanes and rows are untouched throughout. - Repasting after a schema change is **positional**; mismatched column order → `ColumnOrderMismatchError` (reorder the DataFrame, or reinit then paste). - Damage recovery: write path **auto-heals** broken bookmarks (falls to first-paste); `cleanup_stale_table_names` prunes `#REF!` broken names.