Skip to the content.

Stacked tables & reinitialisation — the layout engine, in depth

🎨 Colour walkthrough: open the interactive visual version — 7 toggleable before/after scenarios, including the reinitialise-with-more-columns centerpiece. The Unicode grids below are the always-available text version.

Audience. You want to lay out more than one managed table on a sheet and evolve them safely over time — grow them, shrink them, widen them when the upstream Benchling schema changes — without ever silently corrupting data. This is the conceptual reference for the layout engine that makes that safe.

Scope. Everything here is grounded in the real implementation: src/xlwings_package/excel_io.py (paste_df_to_named_range), src/xlwings_package/excel_utils.py (the invariant / band / danger-zone machinery), src/xlwings_package/benchling_io.py (initialise_table_benchling + _reinit), and src/xlwings_package/exceptions.py.

Related docs:


Reading the grid mockups

Throughout this doc, sheet states are drawn as Excel grids with column letters across the top and row numbers down the left. Markers on the right edge point at the structural rows the engine cares about:

A
B
C
D
1
Assay results (title, merged)
← title row
2
sample
batch
resultanchor
← header row (schema anchor)
3
S1
B1
10start
4
S2
B1
20end
(gap — empty, safe to grow into)
6
qc (next stacked table's title)
← stacked neighbor below

Legend used in every diagram:

Glyph / shade Meaning
title / merged row The Benchling title cell (_write_title)
header row Column headers; the schema anchor named range lives here
d1, d2, … Existing data rows
D1, D2, … Data written this operation
(gap) Empty rows between this table and the next stacked table
▼ DANGER ZONE ▼ The lane band below a table that the table owns (see §3)

1. The managed-table model

A managed table is a rectangular region of a worksheet whose structure the package tracks with sheet-scoped named ranges. The package never relies on Excel’s built-in ListObject “Tables”; it builds its own lightweight scaffolding out of names so it works identically on any sheet, in any workbook.

There are four kinds of name the engine attaches to a table named <name>:

1.1 The header anchor (<name>)

A single named range whose refers_to_range is the header row of the table — one cell per column, contiguous. For a Benchling table this is the schema anchor: its short token is the schema’s warehouse-derived range name (see §1.4), and it spans every schema column.

A
B
C
2
sample
batch
resultanchor
← header anchor assay_x = $A$2:$C$2

paste_df_to_named_range resolves this anchor first (_resolve_paste_target): the header row, the start column, and the current header width all come from it. The anchor’s width is the table’s declared column count.

1.2 Per-column names

Each header cell may carry its own single-cell sheet-scoped name — the column token (e.g. sample, batch, result). These are what import_excel_table keys its DataFrame columns on (with use_name_range_names=True), and what _check_df_columns_match_headers matches against on paste. They are created by initialise_table_named_ranges (manual tables) or _write_fresh / _reinit (Benchling tables).

1.3 The start/end bookmarks (<name>_table_start / <name>_table_end)

This is the heart of the engine. Two sheet-scoped names track the live data extent — not the header, the data:

A
B
C
2
sample
batch
resultanchor
3
S1
B1
10start
assay_x_table_start = $A$3
4
S2
B1
20
5
S3
B2
30end
assay_x_table_end = $C$5

Every call to paste_df_to_named_range rewrites these two bookmarks (_write_data_and_bookmarks) so the next repaste knows exactly which rows to clear and how many rows to shift. The bookmark prefix (<name>, the substring before _table_start / _table_end) is how _iter_managed_table_extents discovers every managed table on a sheet — this is the identity the invariant check, the cluster band, and the danger-zone scan all operate on.

Why a separate end bookmark instead of “scan down until blank”? A scan can’t tell the difference between a deliberately-empty cell in the middle of a sparse dataset and the true bottom of the table. The bookmark records the truth. (Reads still tolerate a stale bookmark by scanning and growing — see ../bookmark-auto-heal.md — but the write path trusts the bookmark to clear precisely.)

1.4 The schema manifest (Benchling only) — <name>__schema_columns

A Benchling-managed table also carries a fifth name: the manifest. It is a sheet-scoped name whose refers_to is a string formula listing the canonical warehouse column names:

assay_x__schema_columns   refers_to:   ="sample,batch,result"

The manifest is the engine’s record of which columns Benchling owns. It is read by _read_schema_manifest and written at the end of every _write_fresh and every _reinit (_write_schema_manifest). Its job is to let _reinit distinguish schema-owned columns (which it may rename/purge when Benchling drops them) from manual named ranges a developer added inside the band (which it must leave alone — the “M.2.1” protection). It is also the ownership key for cleanup_stale_table_names: no manifest ⇒ ManifestNotFoundError, because this package only manages tables it created.

Warehouse names containing a comma or " would break the round-trip parse, so _write_schema_manifest raises eagerly. Benchling warehouse names are snake_case by convention, so this never fires in practice.

1.5 Putting it together

A two-table stacked Benchling sheet, fully scaffolded:

A
B
C
1
Assay X (title, merged)
2
sample
batch
resultanchor
anchor assay_x = $A$2:$C$2
3
S1
B1
10start
4
S2
B1
20end
assay_x_table_start $A$3 · assay_x_table_end $C$4
(gap)
7
Assay Y (title, merged)
8
well
conc
odanchor
anchor assay_y = $A$8:$C$8
9
A1
5
0.2start
10
A2
5
0.3end
assay_y_table_start $A$9 · assay_y_table_end $C$10

Names on this sheet: assay_x, assay_x_table_start, assay_x_table_end, assay_x__schema_columns, the per-column tokens (sample, batch, result), and the same family for assay_y. The two tables share columns A–C, placing them in the same lane, and occupy disjoint rows (2–4 vs 8–10). Within a lane, row disjointness is required — see §2.


2. The lane model and the row-overlap invariant

A lane is the column-connected group of managed tables returned by _cluster_for_operating_table — tables stacked vertically so that members are row-disjoint within the lane. The lane’s column band runs from the leftmost column of any member to the rightmost (min(member lefts)max(member rights)). Every row-grow and row-shrink is bounded to that band, so tables in other lanes — column-disjoint by construction — are never touched.

Side-by-side, L-shaped, and T-shaped layouts are simply several lanes in separate column bands that happen to share rows. Because each operation is bounded to its own lane’s columns, the lanes are independently safe:

        A   B   C          E   F   G   H        J+  (free)
  2    [Xh  Xh  Xh]        [Yh  Yh  Yh  Yh]
  3    [ x   x   x]        [ y   y   y   y]
  4    [ x   x   x]        [ y   y   y   y]
  5    [ x   x   x]                              <- Y grows/shrinks in E-H;
  6                                                 A-C is never touched
  7    [ x   x   x]   <- X grows down freely
       LANE 1 (A-C)        LANE 2 (E-H)
       danger zone A-C     danger zone E-H      disjoint columns => independent forever

SideBySideNotSupportedError is raised only for row overlap within a single lane (column-overlapping tables). Two tables in disjoint column bands sharing rows is not an error — they are different lanes.

2.1 What counts as row overlap

The check in _assert_stacked_only operates on row intervals within the operating table’s lane only. For two managed tables with extents taken from their bookmark unions via _iter_managed_table_extents:

A table is part of the operating lane iff it shares at least one column with the transitive cluster anchored at the operating table (§3.2). A table whose columns are entirely disjoint is a different lane and is skipped.

Within a lane they overlap iff not (Pr_max < Qr_min or Qr_min > Pr_max).

Valid vertical stack — rows disjoint, columns identical (same lane, fine):

A
B
C
2
sample
batch
result
← Table X rows 2–4 (Lane A-C)
3
S1
B1
10
4
S2
B1
20
(gap)
6
well
conc
od
← Table Y rows 6–8 (same lane)
7
A1
5
0.2
8
A2
5
0.3
X rows [2,4] ∩ Y rows [6,8] = ∅ ✔ ACCEPTED

Also valid — disjoint-column side-by-side (two lanes sharing rows):

A
B
C
D
E
F
2
X_h
X_h
X_h
Y_h
Y_h
← Lane 1 (A–C) and Lane 2 (E–F) share row 2
3
x1
x1
x1
y1
y1
4
x2
x2
x2
y2
y2
Column-disjoint: different lanes. X rows [2,4] ∩ Y rows [2,4] = [2,4] but ACCEPTED — different lanes

Also valid — side-by-side in truly disjoint columns (two lanes sharing rows):

  A  B  C     E  F  G        ✓ col D is empty gap
  [Left  ]    [Right ]        Two separate lanes — paste into either succeeds

The key criterion is column disjointness: Left (A-C) and Right (E-G) share no column, so they are independent lanes regardless of which rows they occupy.

Edge case — wide bridging table merges two groups into one lane:

A table whose column span bridges two otherwise-disjoint groups merges them transitively into one lane. If Wide covers A-G, it overlaps both Left (A-C) and Right (E-G) — none of A-C, E-G are disjoint from A-G — so all three are in the same lane. Right’s rows then overlap Left’s rows within that merged lane, and any paste into Right raises SideBySideNotSupportedError.

  A  B  C     E  F  G           ✓ before Wide is added: two independent lanes
  [Left  ]    [Right ]

  A  B  C  D  E  F  G           ✗ after Wide (A-G) straddles both:
  [Left  ]    [Right ]            Right is now part of Left's lane;
  [   Wide (A-G)    ]            paste into Right → SideBySideNotSupportedError

Fix: move Right past Wide’s right edge (e.g. cols I-K) so it is column-disjoint from A-G. It then becomes its own independent lane again.

Invalid — row overlap within the same lane:

A
B
C
D
E
F
2
X_h
X_h
X_h
Y_h
Y_h
Y_h
← X (A–C) and Y (D–F) share a column cluster via C/D overlap → same lane, BOTH at rows 2–4 ✘
3
x1
x1
x1
y1
y1
y1
4
x2
x2
x2
y2
y2
y2
Column-overlapping tables, same lane, rows [2,4] = [2,4] ✘ REJECTED → SideBySideNotSupportedError

2.2 Why row overlap within a lane is rejected

The engine resizes a table by inserting or deleting whole worksheet rows across its lane’s column band (§4). An Insert/Delete in that band moves every cell in those columns. If a second table in the same lane (overlapping columns) shared even one of the operating table’s rows, the row-shift would tear it — its left columns shifted, its right columns not, or vice versa.

Column overlap is the criterion because the row-shift only touches rows strictly below the operating table’s old bottom row (_shift_band_below, §4.2). A stacked neighbor below shifts down or up as a whole unit — all its columns move by the same delta — so it stays intact. Tables in a different lane (disjoint columns) are in a different Insert/Delete band entirely and are never touched.

2.3 The projection: when does the check fire?

_assert_stacked_only does not just compare current rows. It compares the operating table’s post-operation row range against every other same-lane table’s projected post-operation row range, because the shift the engine is about to perform will move some lane-mates:

So the check is forward-looking: it rejects layouts that would overlap after the shift completes within the lane, not merely ones that overlap now.

2.4 Exactly where it fires

Call site When op_top / op_bottom_post delta
initialise_table_named_ranges (manual init) after computing the footprint, before writing bookmarks header_row / footprint bottom 0
initialise_table_benchling_write_fresh path via the same init machinery for fresh tables header / footprint 0
paste_df_to_named_range before any destructive op header_row / header_row + n_new the computed shift amount
create_plate_layout before the plate paste plate extent plate row delta

In every case the check runs before anything destructive, so a rejection leaves the sheet exactly as it was — no half-written bookmarks, no cleared cells. At init the check runs before the bookmarks are even written, so a rejected init leaves no stale names behind.


3. The danger zone & safe placement

3.1 What the danger zone is

A lane owns the band directly below its stack, within its column range, indefinitely — because that’s the space its data grows into. That band is the danger zone. Loose content (a stray note, a summary formula, a label) placed there will be shifted or destroyed when the table grows or shrinks. Floating shapes (images, text boxes) anchored in the band are also detected and raise before any shift occurs (see §3.4).

A
B
C
D
2
sample
batch
result
3
S1
B1
10
4
S2
B1
20end
▼ danger zone — Lane A–C; col D is outside this lane, SAFE
5
SAFE
6
owned by
the lane
below ↓
SAFE
7
SAFE

Three regions are always safe (see ../safe-placement.md):

  1. Above the title row. The shift only ever inserts/deletes downward.
  2. In columns outside the lane’s band. Content in a different lane (disjoint column band) is never in any lane’s shift band. A second managed table in a disjoint column band sharing the same rows is a different lane — it is accepted, not rejected (§2).
  3. Cells you protect with their own named range. Both the danger-zone scan and the widen-claim scan exempt any cell covered by a named range other than the operating table’s own bookmarks. Shapes cannot be exempted this way — a shape anchored in the zone always raises.

3.2 The lane band — how wide is “the lane’s columns”?

The danger zone covers the full column extent of the operating table’s lane (_cluster_for_operating_table, used by _managed_band_for_sheet). A managed table joins the lane if its columns share ≥1 column with any table already in the lane — a BFS closure, not a first-degree test:

   operating table   cols 1–3
   T_A               cols 2–5    ← overlaps operating          → in lane
   T_B               cols 4–8    ← overlaps T_A (not operating) → in lane (transitive)
   T_C               cols 20–25  ← disjoint from all            → different lane, excluded

   ⇒ lane band = cols 1–8.  The Insert/Delete spans cols 1–8 so the whole
     chain shifts as one unit; T_C is in a different lane and is never touched.

The danger-zone scan (_check_danger_zone) and the row-shift (_shift_band_below) use the same band, so anything that would be shifted is also inspected first. That’s the safety guarantee.

3.3 Grow that pushes a neighbor down — SAFE

The neighbor below is itself a managed table (covered by its bookmark extent), so the danger-zone scan treats it as “safe to shift as a unit” and the grow proceeds. Before — operating table X (rows 2–4) with stacked neighbor Y (rows 6–8); paste a 4-row DataFrame into X:

BEFORE — X: n_old = 2
A
B
C
2
X_h
X_h
X_h
3
d1
d1
d1
4
d2
d2
d2end
(gap)
6
Y_h
Y_h
Y_h
7
b1
b1
b1
8
b2
b2
b2
AFTER — X: n_new = 4, delta = +2
A
B
C
2
X_h
X_h
X_h
3
D1
D1
D1
4
D2
D2
D2
5
D3
D3
D3
6
D4
D4
D4end
(gap — preserved)
8
Y_h
Y_h
Y_h
9
b1
b1
b1
10
b2
b2
b2

Y moved from rows 6–8 to rows 8–10 — pushed down by delta = +2. The 1-row gap between X and Y (row 5 before → row 7 after) is preserved (§4.3).

3.4 Grow that hits loose content or a shape — REFUSED (TableSpaceOccupiedError)

Now suppose instead of a managed neighbor, there’s a loose note in the danger zone (col B, row 6) with no named range protecting it:

A
B
C
2
X_h
X_h
X_h
3
d1
d1
d1
4
d2
d2
d2end
▼ danger zone — lane X (cols A–C); attempt: paste 4 rows
5
6
NOTE: QA
_check_danger_zone scans rows 5..(old_end+100) across the lane band (A–C),
finds "NOTE: QA" at B6 — not covered by any managed table, not protected by a
named range — and raises BEFORE any cell is cleared or shifted.

TableSpaceOccupiedError“Raised when a write would shift or destroy non-table content in the target band — either the danger zone below a managed table or the cells a fresh header would land on. Distinct from the width-mismatch case (DataFrame wider than the declared header), which raises plain ValueError because it is caller-side input validation, not a sheet-state conflict.”

Fix options (all in the message): move the note above the title row; move it to a column outside the lane band; or give B6 its own named range to mark it intentional (the scan then exempts it).

Shapes are also detected. _check_danger_zone enumerates sheet.shapes and checks each shape’s anchor cells (TopLeftCell / BottomRightCell) against the affected region. A floating image or text box anchored inside the lane band raises TableSpaceOccupiedError with the shape’s name and a move instruction computed as: “move it at least N rows down (away from the table), or above the title row.” Shapes cannot be exempted by a named range; any shape anchored in the zone raises. After moving the shape and re-running, the scan re-checks and proceeds. No persistent state is stored between attempts.

For button-driven macros: the caller catches the exception and passes str(exc) to display_message_and_exit (ui.py). The engine itself never calls display_message_and_exit — keeping the core testable and context-free.

3.5 The claim zone — widening

When a paste’s DataFrame is wider than the current header, the engine auto-widens the header into new columns to the right (§4.4). Those new columns were never part of the table, so before touching them _check_widen_claim_zone runs two scans — both raising TableSpaceOccupiedError on loose content or shapes:

  1. New-column extension in the operating rows. Cells in (op_top..scan_bottom, old_header_end+1 .. new_header_end) would be overwritten by the new header / data write.
A
B
C
D
2
X_h
X_h
X_h
(new)
← col D about to become header (df has 4 cols; header is A–C)
3
d1
d1
d1
LEFTOVER
← claim-zone scan finds this → refuses
4
d2
d2
d2
  1. Post-widen lane extension below the table. If the new columns bridge the operating table into a previously-disjoint stacked table (widening the lane band), any loose content in those newly-in-lane columns inside the danger zone would now be caught by the band-Insert. The pre-widen _check_danger_zone couldn’t have seen those columns, so this second scan covers them.

Both scans check cells and floating shapes (_shapes_in_region). A shape anchored in the claim zone raises with a column move instruction computed by _column_move_hint: “move it at least N columns to the right (away from the table).” Cells covered by a named range or another managed table are exempt; shapes are never exempt.

Both scans exempt cells covered by a named range or another managed table.


4. The paste shift mechanism — clear → shift → write

paste_df_to_named_range resizes a table with a three-step multi-pass shift. This is the precise mechanism, in execution order, exactly as implemented.

4.0 Pre-flight (before any destructive step)

In order, all before anything is cleared or moved:

  1. Sanitize the DataFrame (_sanitize_df_for_excel): ±inf → NaN; truncate strings over Excel’s 32,767-char cell limit (with one UserWarning); and — because sanitize_formulas=True by default — apostrophe-escape text cells starting with = + - @ so external data can’t inject a live formula. Numbers and numeric-looking strings are left alone. The caller’s DataFrame is never mutated.
  2. Resolve the target (_resolve_paste_target): header range, start column, current header width, first data row.
  3. Column-order check (_check_df_columns_match_headers): every existing header position must match the DataFrame column at that position — see §6.5. Raises ColumnOrderMismatchError if not.
  4. Read the old extent (_read_op_extent): old_start / old_end from the live bookmarks, with auto-heal (§5.1) when they’re missing/corrupt/inverted.
  5. Compute the shift amount (§4.3) and assert the stacked-only invariant (§2) on the projected post-paste rows.
  6. Run the widen-claim scan (§3.5) if the DataFrame is wider than the header.

Only after all of the above passes does the destructive sequence begin.

4.1 Step 1 — ClearContents

_clear_old_data_rows runs the danger-zone scan (§3.4) and then ClearContents on the operating table’s own data cells — (old_start, start_col) through (old_end, header_end_col)in place, no row shift. On a first paste (old_end < old_start) this is a no-op.

4.2 Step 2 — Shift (Insert / Delete whole rows)

_shift_band_below opens or closes row slots strictly below old_end, across the cluster band [band_left, band_right] (§3.2). Let delta = n_new - n_old_effective:

Because the shift only ever touches rows below old_end, cells at or above old_end — including the header itself and any loose content sharing the operating rows but above old_end — are never moved by this step. Content in a different lane (disjoint columns) is also untouched because the Insert/Delete spans only the current lane’s column band. (Row overlap within the same lane is already rejected by §2 before this step runs.)

4.3 Step 2 (cont.) — how the gap to the neighbor below is preserved

The shift amount differs between first and subsequent pastes, and that difference is exactly what preserves the user’s chosen gap.

Subsequent paste (n_old > 0) — pure delta:

shift_amount = n_new - n_old           # (excel_io.py)

The neighbor below is pushed down by +delta on grow, pulled up by |delta| on shrink. Whatever gap existed between this table and the next stacked table is mechanically preserved, because both the table’s new bottom and the neighbor’s top move by the same delta.

First paste (n_old == 0) — gap-aware:

gap_below    = _nearest_neighbor_gap_below(...)   # empty rows below header to nearest neighbor
shift_amount = max(0, n_new - gap_below)

On the very first paste there are no data rows yet, but the user has already laid out a gap between this table’s header and the next table (table_gap, 2 by default). The engine consumes that gap first and only pushes the neighbor by the deficit (n_new - gap) when the data is taller than the gap. Data that fits in the existing gap is written into it without displacing the neighbor at all.

_nearest_neighbor_gap_below scans by cell content (not by bookmark), because a header-only neighbor that’s never been pasted into has only its anchor named range — no _table_start/_table_end bookmarks yet — so it’s invisible to _iter_managed_table_extents. Scanning cell content catches the neighbor’s header cells the way the user sees them. (When no neighbor is found within the scan window, it returns a large sentinel = “unlimited room below”.)

First-paste grid — data fits the gap (no neighbor displacement):

BEFORE — X header only, gap = 2
A
B
2
X_h
X_h
(gap)
(gap)
5
Y_h
Y_h
AFTER — paste 2 rows; n_new = 2 ≤ gap
A
B
2
X_h
X_h
3
D1
D1
← written into the gap
4
D2
D2
5
Y_h
Y_h
← UNMOVED (shift = max(0, 2−2) = 0)

First-paste grid — data exceeds the gap (push by the deficit only):

BEFORE — gap = 2
A
B
2
X_h
X_h
(gap)
(gap)
5
Y_h
Y_h
AFTER — paste 3 rows; shift = max(0, 3−2) = 1
A
B
2
X_h
X_h
3
D1
D1
4
D2
D2
5
D3
D3
← consumed gap + 1 new row
6
Y_h
Y_h
← shifted down by 1 only

The net effect across any sequence of grows and shrinks is that the gap set at initial layout time is the gap you keep, forever.

4.4 Step 3 — Write + bookmarks (and column-width handling)

Between the clear and the write, the header is widened if needed (_extend_header_for_wider_df), then _shift_band_below runs in the post-widen band, then _write_data_and_bookmarks:

Widen. When df has more columns than the header, the header named range is auto-extended rightward; new header cells take their display names from df.columns. No Insert columns is needed — the lane-row-disjoint invariant guarantees no same-lane table shares the operating rows, so the new header cells land on rows nobody in the lane owns. Content in a different lane at the same rows is unaffected because the widen only writes into the lane’s new column(s). (If the rightmost new column would pass Excel’s last column XFD = 16,384, ExcelColumnOverflowError is raised first.)

Narrow. When df has fewer columns than the header, the trailing data cells are already cleared by Step 1 (the clear spans header_end_col). The header named range stays at its declared width; the _table_end bookmark tracks the data extent, ending at start_col + df.shape[1] - 1. So a later re-widen to the original width needs no header change.

WIDEN — header A–C, df has 4 cols
A
B
C
D
2
h
h
h
NEW
← header extended
3
D1
D1
D1
D1
4
D2
D2
D2
D2end
_table_end → (4, D)
NARROW — header A–D, df has 3 cols
A
B
C
D
2
h
h
h
h
← header KEPT wide
3
D1
D1
D1end
← D cleared
4
D2
D2
D2
_table_end → (4, C) (data extent)

4.5 Failure ordering

The steps are ordered so any failure leaves the sheet consistent:

  1. Invariant check (no mutation) → can abort cleanly.
  2. Widen-claim scan (no mutation) → can abort cleanly.
  3. Clear old data — runs the danger-zone scan first; raising here means no header has been mutated yet, safe to abort.
  4. Widen header (commits only after the clear succeeded).
  5. Shift + write.

5. Bookmark auto-heal & cleanup_stale_table_names

Workbooks get edited by humans in Excel. Columns get deleted, rows get right-clicked away, names get fiddled in the Name Manager. The engine recovers from the common forms of damage.

5.1 Bookmark auto-heal (the write path)

_read_op_extent resolves <name>_table_start / <name>_table_end. If the pair is missing, corrupt (#REF! or any COM error), or inverted (end < start), it logs a DEBUG breadcrumb, drops the fragments (_drop_names), and returns an empty extent so the next paste falls through to the first-paste path — writing new data in the row immediately below the header, exactly as a fresh table would.

BROKEN — user deleted _table_end
A
B
2
h
h
← header anchor OK
3
???
???
_table_start present, _table_end #REF!
4
???
???
→ drop both fragments, first-paste path
HEALED — next paste, n_old treated as 0
A
B
2
h
h
3
D1
D1
← rewritten from header+1
4
D2
D2
→ both bookmarks rewritten fresh
Trigger Cause Outcome
_table_start missing Deleted via Name Manager Drop _table_end, first-paste path
_table_end missing Same Drop _table_start, first-paste path
Either #REF! Right-clicked → Delete on data rows Drop both, first-paste path
end.row < start.row Manual mis-edit Drop both, first-paste path

Recovery is silent except for a DEBUG log. See ../bookmark-auto-heal.md.

Read paths heal differently. import_excel_table / scan_named_range_table grow a stale _table_end (scan down, extend) and rewrite missing/orphaned bookmarks, but never auto-shrink — a read should never lose track of data it can see.

5.2 cleanup_stale_table_names (on-demand pruning)

#REF! names accumulate when a user deletes a column manually in Excel — the named range lingers but refers_to resolves to =#REF!#REF!. cleanup_stale_table_names prunes all such broken names for a managed table.

from xlwings_package import cleanup_stale_table_names

cleanup_stale_table_names(wb, "assay_x")               # prune #REF! names
cleanup_stale_table_names(wb, "assay_x", sheet="Results")  # one sheet only
# → {"removed_ref": ["Results!col_b"]}

Ownership is decided by name pattern + the manifest. Because ownership keys off the manifest, a table with no manifest raises ManifestNotFoundError:

ManifestNotFoundError“Raised when a Benchling-managed table has no schema-column manifest on the requested sheet(s) — i.e. the table was never initialised by initialise_table_benchling or the manifest name was deleted manually.”

(A missing sheet= raises WorksheetNotFoundError.) Details: ../cleanup-stale-names.md.


6. CENTERPIECE — reinitialise a stacked Benchling table with MORE columns

This is the payoff: two stacked Benchling tables on one sheet, the upstream schema gains a column, you call initialise_table_benchling again, and the table auto-widens in place — header extends, named ranges rebind, manifest updates — while the stacked neighbor below is protected.

6.0 Starting state — two stacked tables, with data

assay_x (cols A–C) stacked above assay_y (cols A–C), both with data and a 2-row gap between:

A
B
C
D
1
Assay X (title)
2
sample
batch
resultanchor
← anchor assay_x $A$2:$C$2
3
S1
B1
10start
4
S2
B1
20end
assay_x_table_end = $C$4
(gap — 2 rows)
7
Assay Y (title)
8
well
conc
odanchor
← anchor assay_y $A$8:$C$8
9
A1
5
0.2start
10
A2
5
0.3end
assay_y_table_end = $C$10
manifests: assay_x__schema_columns = "sample,batch,result" · assay_y__schema_columns = "well,conc,od"

6.1 The upstream schema gains a column

In Benchling, Assay X adds a new result field qc at the end of its schema. The schema is now [sample, batch, result, qc].

6.2 The reinit call (runnable)

Reinit is automatic: initialise_table_benchling detects that assay_x already exists on the sheet (_find_table finds the anchor name) and routes to _reinit instead of _write_fresh. You call it exactly the same way you did the first time:

from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import initialise_table_benchling

# You build the context yourself — credentials are mgtx-benchling-wrapper's job.
ctx = BenchlingContext(
    client_id=...,
    client_secret=...,                 # plaintext — resolve from your store
    base_url="https://yourorg.benchling.com",
    token_url="https://yourorg.benchling.com/api/v2/token",
)

# Same two schemas, same sheet, same column. assay_x's schema now has `qc`.
results = initialise_table_benchling(
    wb,
    {
        "Results": ["assaysch_assayx", "assaysch_assayy"],
    },
    ctx=ctx,
    column="A",             # autopaste; both stack in column A
    table_gap=2,            # the gap that's preserved across reinits/pastes
)

# Each result dict carries action + a diff when reinitialised:
for r in results:
    print(r["display_name"], r["action"], r.get("diff"))
# Assay X  reinitialised  {'added': ['qc'], 'updated': [], 'removed': [], ...}
# Assay Y  reinitialised  {'added': [], ...}            # unchanged

6.3 What _reinit does, step by step

_reinit is the in-place updater. For assay_x gaining qc:

  1. Locate the table_find_table returns (header_row=2, start_col=1, end_col=3).
  2. Re-sync display names + title — column headers and the title cell are rewritten from the schema on every call (Benchling is the source of truth for text).
  3. Diff against the manifestcur_map_schema_only is cur_map filtered to the manifest {sample, batch, result}, so any manual range a dev added in the band is excluded (M.2.1). qc is in the new schema but not in cur_map → it’s a new column.
  4. Append the new column. New columns are appended at the right edge of the current band (end_col + 1, col D). _check_band_empty confirms D2 is free first (raises TableSpaceOccupiedError if not). The header cell gets the display name + bold + fill, and a fresh sheet-scoped range qc binds to $D$2. No Insert columns is emitted — the reorder/repack pass (step 5) places every column in schema order via a clear-and-rewrite, so appending then repacking yields the same final layout without a propagating column shift.
  5. Reorder pass — the active band is snapshotted, cleared, and rewritten in schema order. Named ranges are rebound in place. Because this operates only on the lane’s own columns (clear-and-rewrite, no Insert/Delete), no other lane is affected.
  6. Extend the table extent — the schema anchor grows to $A$2:$D$2, the title merge re-merges across A–D, and assay_x_table_end extends to col D.
  7. Rewrite the manifestassay_x__schema_columns = "sample,batch,result,qc".

6.4 BEFORE / AFTER — the widen, with the neighbor protected

BEFORE — assay_x = [sample, batch, result]
A
B
C
D
2
sample
batch
resultanchor
← anchor $A$2:$C$2
3
S1
B1
10
4
S2
B1
20end
assay_x_table_end $C$4
(gap — 2 rows)
7
Assay Y (title)
8
well
conc
od
9
A1
5
0.2
10
A2
5
0.3
manifest: assay_x__schema_columns = "sample,batch,result"
AFTER — assay_x = [sample, batch, result, qc]
A
B
C
D
2
sample
batch
result
qcanchor
← anchor NOW $A$2:$D$2, qc → $D$2
3
S1
B1
10
(empty)
← existing data INTACT
4
S2
B1
20
(empty)end
assay_x_table_end NOW $D$4
(gap — 2-row gap PRESERVED)
7
Assay Y (title)
← Y UNMOVED — still rows 7–10
8
well
conc
od
9
A1
5
0.2
10
A2
5
0.3
manifest now: assay_x__schema_columns = "sample,batch,result,qc"

Why the neighbor is protected: the reorder/repack pass performs a clear-and- rewrite on the lane’s own columns only — no row Insert/Delete, no column Insert/Delete that propagates rightward. The new qc header lands at D2 and its data column extends straight down into column D; the repack places it in schema order by value-snapshot → clear → rewrite, so assay_y at rows 7–10 is untouched. The gap is preserved. Existing sample/batch/result data is preserved cell-for-cell; only the headers are re-synced and qc’s cells start empty until the next paste.

6.5 Re-pasting after the schema change — column-order matching

After reinit, you pull fresh data and repaste. The DataFrame’s columns must line up with the headers, because paste_df_to_named_range writes df.values positionally — column 0 under the first header, column 1 under the second, and so on. _check_df_columns_match_headers enforces this before any destructive op, matching each DataFrame column against either the per-column range token or the header display value (case- and whitespace-normalized):

from xlwings_package import import_excel_table, paste_df_to_named_range

# After reinit, the sheet header order is [sample, batch, result, qc].
df = pull_assay_x_results()            # however you build it
# df.columns must be [sample, batch, result, qc] in THIS order.
paste_df_to_named_range(wb, df, "assay_x")

If the DataFrame’s columns are reordered, or a column was inserted in the middle or dropped from the middle relative to the sheet headers, the positional write would silently misalign data — so it raises instead:

ColumnOrderMismatchError“Raised when a DataFrame’s columns don’t match the existing Excel header positions on a paste. paste_df_to_named_range writes df.values positionally — column 0 lands under the first header cell, column 1 under the second, etc. If df.columns is reordered (or a column was inserted in the middle, or one was removed from the middle), the positional write would silently misalign data with the headers. This error fires before any destructive operation so the user can either reorder the DataFrame or re-initialise the table to reflect the new schema. Matches against either the column-level sheet-scoped named-range token or the header cell display value (case + whitespace normalized), so both the Benchling workflow (use_name_range_names=True → range-name tokens) and the manual-table workflow (display values) work without configuration.”

sheet
sample
batch
result
qc
df
sample
result
batch
qc
← positions 1 & 2 swapped (batch/result) → ColumnOrderMismatchError
Fix A: reorder the DataFrame to [sample, batch, result, qc] before pasting.
Fix B: if Benchling itself reordered, re-run initialise_table_benchling first
       (the reorder pass rebinds the sheet to the new order), THEN paste.

Only existing header positions are checked; columns past the current header are handled by the auto-widen path (and pre-scanned by the claim zone, §3.5). A freshly-anchored, empty-header position has no identity to match against and is accepted (this is what lets a first paste into a brand-new table go through).

6.6 The mirror case — Benchling DROPS a column

When a column is removed from the Benchling schema, _reinit purges it outright: clears every cell in the column from the header through _table_end, deletes its named range, closes the horizontal gap so the remaining schema columns are contiguous (the repack pass), and shrinks the table extent — schema anchor width, title merge, and _table_end — to the new rightmost column. The stacked neighbor below is untouched: the purge operates on columns, not rows, so no vertical shift reaches assay_y.

BEFORE — schema [sample, batch, result]; Benchling drops batch
A
B
C
D
2
sample
batch
result
← batch dropped upstream
3
S1
B1
10
4
S2
B1
20end
assay_x_table_end $C$4
AFTER — batch purged; result closes up into B, table shrinks to A–B
A
B
C
D
2
sample
result
(empty)
← anchor NOW $A$2:$B$2
3
S1
10
(empty)
4
S2
20end
(empty)
assay_x_table_end NOW $B$4

The repack/shrink is guarded by all_active_resolved — the engine only compacts and shrinks when every active schema column’s name resolved, so it can guarantee a contiguous block. If a name were missing it leaves the wider extent intact rather than risk stranding a live column outside the bookmark.

See ../visualizations/schema-drift-with-data.html for animated add-end / add-middle / remove / reorder cases on a table that already holds data, and ../visualizations/stacked-tables-contract.html for the row/column operations and the lane layout.


7. Mental-model summary