Stacked tables & reinitialisation — the layout engine, in depth
🎨 Colour walkthrough: open the interactive visual version — 7 toggleable before/after scenarios, including the reinitialise-with-more-columns centerpiece. The Unicode grids below are the always-available text version.
Audience. You want to lay out more than one managed table on a sheet and evolve them safely over time — grow them, shrink them, widen them when the upstream Benchling schema changes — without ever silently corrupting data. This is the conceptual reference for the layout engine that makes that safe.
Scope. Everything here is grounded in the real implementation:
src/xlwings_package/excel_io.py(paste_df_to_named_range),src/xlwings_package/excel_utils.py(the invariant / band / danger-zone machinery),src/xlwings_package/benchling_io.py(initialise_table_benchling+_reinit), andsrc/xlwings_package/exceptions.py.
Related docs:
- Quick rules for where to put loose content: ../safe-placement.md
- Bookmark recovery: ../bookmark-auto-heal.md
- Stale-name pruning: ../cleanup-stale-names.md
- Task guides: ../guides/normal-tables.md · ../guides/benchling-single.md · ../guides/benchling-multiple.md · ../guides/plates.md
- API surface: ../reference/tools.md
- Interactive companions (open in a browser from inside the repo):
../visualizations/stacked-tables-contract.html,../visualizations/schema-drift-with-data.html.
Reading the grid mockups
Throughout this doc, sheet states are drawn as Excel grids with column letters across the top and row numbers down the left. Markers on the right edge point at the structural rows the engine cares about:
Legend used in every diagram:
| Glyph / shade | Meaning |
|---|---|
title / merged row |
The Benchling title cell (_write_title) |
header row |
Column headers; the schema anchor named range lives here |
d1, d2, … |
Existing data rows |
D1, D2, … |
Data written this operation |
(gap) |
Empty rows between this table and the next stacked table |
▼ DANGER ZONE ▼ |
The lane band below a table that the table owns (see §3) |
1. The managed-table model
A managed table is a rectangular region of a worksheet whose structure the package tracks with sheet-scoped named ranges. The package never relies on Excel’s built-in ListObject “Tables”; it builds its own lightweight scaffolding out of names so it works identically on any sheet, in any workbook.
There are four kinds of name the engine attaches to a table named <name>:
1.1 The header anchor (<name>)
A single named range whose refers_to_range is the header row of the table
— one cell per column, contiguous. For a Benchling table this is the schema
anchor: its short token is the schema’s warehouse-derived range name (see
§1.4), and it spans every schema column.
assay_x = $A$2:$C$2paste_df_to_named_range resolves this anchor first
(_resolve_paste_target): the header row, the start column, and the current
header width all come from it. The anchor’s width is the table’s declared
column count.
1.2 Per-column names
Each header cell may carry its own single-cell sheet-scoped name — the column
token (e.g. sample, batch, result). These are what import_excel_table
keys its DataFrame columns on (with use_name_range_names=True), and what
_check_df_columns_match_headers matches against on paste. They are created by
initialise_table_named_ranges (manual tables) or _write_fresh /
_reinit (Benchling tables).
1.3 The start/end bookmarks (<name>_table_start / <name>_table_end)
This is the heart of the engine. Two sheet-scoped names track the live data extent — not the header, the data:
<name>_table_start→ top-left data cell (the cell atheader_row + 1, start column).<name>_table_end→ bottom-right data cell (last data row, last data column).
assay_x_table_start = $A$3assay_x_table_end = $C$5Every call to paste_df_to_named_range rewrites these two bookmarks
(_write_data_and_bookmarks) so the next repaste knows exactly which rows to
clear and how many rows to shift. The bookmark prefix (<name>, the substring
before _table_start / _table_end) is how _iter_managed_table_extents
discovers every managed table on a sheet — this is the identity the invariant
check, the cluster band, and the danger-zone scan all operate on.
Why a separate end bookmark instead of “scan down until blank”? A scan can’t tell the difference between a deliberately-empty cell in the middle of a sparse dataset and the true bottom of the table. The bookmark records the truth. (Reads still tolerate a stale bookmark by scanning and growing — see ../bookmark-auto-heal.md — but the write path trusts the bookmark to clear precisely.)
1.4 The schema manifest (Benchling only) — <name>__schema_columns
A Benchling-managed table also carries a fifth name: the manifest. It is a
sheet-scoped name whose refers_to is a string formula listing the canonical
warehouse column names:
assay_x__schema_columns refers_to: ="sample,batch,result"
The manifest is the engine’s record of which columns Benchling owns. It is
read by _read_schema_manifest and written at the end of every _write_fresh
and every _reinit (_write_schema_manifest). Its job is to let _reinit
distinguish schema-owned columns (which it may rename/purge when Benchling
drops them) from manual named ranges a developer added inside the band
(which it must leave alone — the “M.2.1” protection). It is also the ownership
key for cleanup_stale_table_names: no manifest ⇒ ManifestNotFoundError,
because this package only manages tables it created.
Warehouse names containing a comma or
"would break the round-trip parse, so_write_schema_manifestraises eagerly. Benchling warehouse names are snake_case by convention, so this never fires in practice.
1.5 Putting it together
A two-table stacked Benchling sheet, fully scaffolded:
assay_x = $A$2:$C$2assay_x_table_start $A$3 · assay_x_table_end $C$4assay_y = $A$8:$C$8assay_y_table_start $A$9 · assay_y_table_end $C$10Names on this sheet: assay_x, assay_x_table_start, assay_x_table_end,
assay_x__schema_columns, the per-column tokens (sample, batch, result),
and the same family for assay_y. The two tables share columns A–C, placing
them in the same lane, and occupy disjoint rows (2–4 vs 8–10). Within a
lane, row disjointness is required — see §2.
2. The lane model and the row-overlap invariant
A lane is the column-connected group of managed tables returned by
_cluster_for_operating_table — tables stacked vertically so that members are
row-disjoint within the lane. The lane’s column band runs from the leftmost
column of any member to the rightmost (min(member lefts) … max(member rights)).
Every row-grow and row-shrink is bounded to that band, so tables in other lanes —
column-disjoint by construction — are never touched.
Side-by-side, L-shaped, and T-shaped layouts are simply several lanes in separate column bands that happen to share rows. Because each operation is bounded to its own lane’s columns, the lanes are independently safe:
A B C E F G H J+ (free)
2 [Xh Xh Xh] [Yh Yh Yh Yh]
3 [ x x x] [ y y y y]
4 [ x x x] [ y y y y]
5 [ x x x] <- Y grows/shrinks in E-H;
6 A-C is never touched
7 [ x x x] <- X grows down freely
LANE 1 (A-C) LANE 2 (E-H)
danger zone A-C danger zone E-H disjoint columns => independent forever
SideBySideNotSupportedError is raised only for row overlap within a single
lane (column-overlapping tables). Two tables in disjoint column bands sharing
rows is not an error — they are different lanes.
2.1 What counts as row overlap
The check in _assert_stacked_only operates on row intervals within the
operating table’s lane only. For two managed tables with extents taken from
their bookmark unions via _iter_managed_table_extents:
- Table P spans rows
[Pr_min, Pr_max]. - Table Q spans rows
[Qr_min, Qr_max].
A table is part of the operating lane iff it shares at least one column with the transitive cluster anchored at the operating table (§3.2). A table whose columns are entirely disjoint is a different lane and is skipped.
Within a lane they overlap iff not (Pr_max < Qr_min or Qr_min > Pr_max).
Valid vertical stack — rows disjoint, columns identical (same lane, fine):
Also valid — disjoint-column side-by-side (two lanes sharing rows):
Also valid — side-by-side in truly disjoint columns (two lanes sharing rows):
A B C E F G ✓ col D is empty gap
[Left ] [Right ] Two separate lanes — paste into either succeeds
The key criterion is column disjointness: Left (A-C) and Right (E-G) share no column, so they are independent lanes regardless of which rows they occupy.
Edge case — wide bridging table merges two groups into one lane:
A table whose column span bridges two otherwise-disjoint groups merges them
transitively into one lane. If Wide covers A-G, it overlaps both Left (A-C) and
Right (E-G) — none of A-C, E-G are disjoint from A-G — so all three are in the
same lane. Right’s rows then overlap Left’s rows within that merged lane, and
any paste into Right raises SideBySideNotSupportedError.
A B C E F G ✓ before Wide is added: two independent lanes
[Left ] [Right ]
A B C D E F G ✗ after Wide (A-G) straddles both:
[Left ] [Right ] Right is now part of Left's lane;
[ Wide (A-G) ] paste into Right → SideBySideNotSupportedError
Fix: move Right past Wide’s right edge (e.g. cols I-K) so it is column-disjoint from A-G. It then becomes its own independent lane again.
Invalid — row overlap within the same lane:
2.2 Why row overlap within a lane is rejected
The engine resizes a table by inserting or deleting whole worksheet rows across its lane’s column band (§4). An Insert/Delete in that band moves every cell in those columns. If a second table in the same lane (overlapping columns) shared even one of the operating table’s rows, the row-shift would tear it — its left columns shifted, its right columns not, or vice versa.
Column overlap is the criterion because the row-shift only touches rows
strictly below the operating table’s old bottom row (_shift_band_below,
§4.2). A stacked neighbor below shifts down or up as a whole unit — all its
columns move by the same delta — so it stays intact. Tables in a different lane
(disjoint columns) are in a different Insert/Delete band entirely and are never
touched.
2.3 The projection: when does the check fire?
_assert_stacked_only does not just compare current rows. It compares the
operating table’s post-operation row range against every other same-lane
table’s projected post-operation row range, because the shift the engine is
about to perform will move some lane-mates:
- It computes the lane band — the transitive column-overlap closure
anchored at the operating table’s columns (see §3.2 /
_cluster_for_operating_table). - A neighbor that (a) overlaps the lane’s columns and (b) currently sits
below
op_old_endis projected to shift bydeltarows:(R_min + delta, R_max + delta). - A neighbor in a different lane (column-disjoint) is skipped entirely.
- The invariant fires iff any projected same-lane range intersects the
operating table’s post-operation range
[op_top, op_bottom_post].
So the check is forward-looking: it rejects layouts that would overlap after the shift completes within the lane, not merely ones that overlap now.
2.4 Exactly where it fires
| Call site | When | op_top / op_bottom_post |
delta |
|---|---|---|---|
initialise_table_named_ranges (manual init) |
after computing the footprint, before writing bookmarks | header_row / footprint bottom | 0 |
initialise_table_benchling → _write_fresh path |
via the same init machinery for fresh tables | header / footprint | 0 |
paste_df_to_named_range |
before any destructive op | header_row / header_row + n_new |
the computed shift amount |
create_plate_layout |
before the plate paste | plate extent | plate row delta |
In every case the check runs before anything destructive, so a rejection leaves the sheet exactly as it was — no half-written bookmarks, no cleared cells. At init the check runs before the bookmarks are even written, so a rejected init leaves no stale names behind.
3. The danger zone & safe placement
3.1 What the danger zone is
A lane owns the band directly below its stack, within its column range, indefinitely — because that’s the space its data grows into. That band is the danger zone. Loose content (a stray note, a summary formula, a label) placed there will be shifted or destroyed when the table grows or shrinks. Floating shapes (images, text boxes) anchored in the band are also detected and raise before any shift occurs (see §3.4).
Three regions are always safe (see ../safe-placement.md):
- Above the title row. The shift only ever inserts/deletes downward.
- In columns outside the lane’s band. Content in a different lane (disjoint column band) is never in any lane’s shift band. A second managed table in a disjoint column band sharing the same rows is a different lane — it is accepted, not rejected (§2).
- Cells you protect with their own named range. Both the danger-zone scan and the widen-claim scan exempt any cell covered by a named range other than the operating table’s own bookmarks. Shapes cannot be exempted this way — a shape anchored in the zone always raises.
3.2 The lane band — how wide is “the lane’s columns”?
The danger zone covers the full column extent of the operating table’s lane
(_cluster_for_operating_table, used by _managed_band_for_sheet). A managed
table joins the lane if its columns share ≥1 column with any table already
in the lane — a BFS closure, not a first-degree test:
operating table cols 1–3
T_A cols 2–5 ← overlaps operating → in lane
T_B cols 4–8 ← overlaps T_A (not operating) → in lane (transitive)
T_C cols 20–25 ← disjoint from all → different lane, excluded
⇒ lane band = cols 1–8. The Insert/Delete spans cols 1–8 so the whole
chain shifts as one unit; T_C is in a different lane and is never touched.
The danger-zone scan (_check_danger_zone) and the row-shift
(_shift_band_below) use the same band, so anything that would be shifted
is also inspected first. That’s the safety guarantee.
3.3 Grow that pushes a neighbor down — SAFE
The neighbor below is itself a managed table (covered by its bookmark extent), so the danger-zone scan treats it as “safe to shift as a unit” and the grow proceeds. Before — operating table X (rows 2–4) with stacked neighbor Y (rows 6–8); paste a 4-row DataFrame into X:
Y moved from rows 6–8 to rows 8–10 — pushed down by delta = +2. The 1-row gap
between X and Y (row 5 before → row 7 after) is preserved (§4.3).
3.4 Grow that hits loose content or a shape — REFUSED (TableSpaceOccupiedError)
Now suppose instead of a managed neighbor, there’s a loose note in the danger zone (col B, row 6) with no named range protecting it:
_check_danger_zone scans rows 5..(old_end+100) across the lane band (A–C),
finds "NOTE: QA" at B6 — not covered by any managed table, not protected by a
named range — and raises BEFORE any cell is cleared or shifted.
TableSpaceOccupiedError— “Raised when a write would shift or destroy non-table content in the target band — either the danger zone below a managed table or the cells a fresh header would land on. Distinct from the width-mismatch case (DataFrame wider than the declared header), which raises plainValueErrorbecause it is caller-side input validation, not a sheet-state conflict.”
Fix options (all in the message): move the note above the title row; move it to a column outside the lane band; or give B6 its own named range to mark it intentional (the scan then exempts it).
Shapes are also detected. _check_danger_zone enumerates sheet.shapes
and checks each shape’s anchor cells (TopLeftCell / BottomRightCell) against
the affected region. A floating image or text box anchored inside the lane band
raises TableSpaceOccupiedError with the shape’s name and a move instruction
computed as: “move it at least N rows down (away from the table), or above the
title row.” Shapes cannot be exempted by a named range; any shape anchored in
the zone raises. After moving the shape and re-running, the scan re-checks and
proceeds. No persistent state is stored between attempts.
For button-driven macros: the caller catches the exception and passes
str(exc) to display_message_and_exit (ui.py). The engine itself never
calls display_message_and_exit — keeping the core testable and context-free.
3.5 The claim zone — widening
When a paste’s DataFrame is wider than the current header, the engine
auto-widens the header into new columns to the right (§4.4). Those new columns
were never part of the table, so before touching them _check_widen_claim_zone
runs two scans — both raising TableSpaceOccupiedError on loose content or
shapes:
- New-column extension in the operating rows. Cells in
(op_top..scan_bottom, old_header_end+1 .. new_header_end)would be overwritten by the new header / data write.
- Post-widen lane extension below the table. If the new columns bridge
the operating table into a previously-disjoint stacked table (widening the
lane band), any loose content in those newly-in-lane columns inside the
danger zone would now be caught by the band-Insert. The pre-widen
_check_danger_zonecouldn’t have seen those columns, so this second scan covers them.
Both scans check cells and floating shapes (_shapes_in_region). A shape
anchored in the claim zone raises with a column move instruction computed by
_column_move_hint: “move it at least N columns to the right (away from the
table).” Cells covered by a named range or another managed table are exempt;
shapes are never exempt.
Both scans exempt cells covered by a named range or another managed table.
4. The paste shift mechanism — clear → shift → write
paste_df_to_named_range resizes a table with a three-step multi-pass shift.
This is the precise mechanism, in execution order, exactly as implemented.
4.0 Pre-flight (before any destructive step)
In order, all before anything is cleared or moved:
- Sanitize the DataFrame (
_sanitize_df_for_excel):±inf → NaN; truncate strings over Excel’s 32,767-char cell limit (with oneUserWarning); and — becausesanitize_formulas=Trueby default — apostrophe-escape text cells starting with=+-@so external data can’t inject a live formula. Numbers and numeric-looking strings are left alone. The caller’s DataFrame is never mutated. - Resolve the target (
_resolve_paste_target): header range, start column, current header width, first data row. - Column-order check (
_check_df_columns_match_headers): every existing header position must match the DataFrame column at that position — see §6.5. RaisesColumnOrderMismatchErrorif not. - Read the old extent (
_read_op_extent):old_start/old_endfrom the live bookmarks, with auto-heal (§5.1) when they’re missing/corrupt/inverted. - Compute the shift amount (§4.3) and assert the stacked-only invariant (§2) on the projected post-paste rows.
- Run the widen-claim scan (§3.5) if the DataFrame is wider than the header.
Only after all of the above passes does the destructive sequence begin.
4.1 Step 1 — ClearContents
_clear_old_data_rows runs the danger-zone scan (§3.4) and then
ClearContents on the operating table’s own data cells —
(old_start, start_col) through (old_end, header_end_col) — in place, no row
shift. On a first paste (old_end < old_start) this is a no-op.
4.2 Step 2 — Shift (Insert / Delete whole rows)
_shift_band_below opens or closes row slots strictly below old_end,
across the cluster band [band_left, band_right] (§3.2). Let
delta = n_new - n_old_effective:
delta > 0(grow):Range.Insert(Shift=xlShiftDown)fordeltarows atold_end + 1. Everything in the band belowold_endslides down bydelta.delta < 0(shrink):Range.Delete(Shift=xlShiftUp)of the bottom|delta|rows of the old extent. Everything below slides up by|delta|.delta == 0: nothing.
Because the shift only ever touches rows below old_end, cells at or above
old_end — including the header itself and any loose content sharing the
operating rows but above old_end — are never moved by this step. Content in
a different lane (disjoint columns) is also untouched because the Insert/Delete
spans only the current lane’s column band. (Row overlap within the same lane is
already rejected by §2 before this step runs.)
4.3 Step 2 (cont.) — how the gap to the neighbor below is preserved
The shift amount differs between first and subsequent pastes, and that difference is exactly what preserves the user’s chosen gap.
Subsequent paste (n_old > 0) — pure delta:
shift_amount = n_new - n_old # (excel_io.py)
The neighbor below is pushed down by +delta on grow, pulled up by |delta| on
shrink. Whatever gap existed between this table and the next stacked table is
mechanically preserved, because both the table’s new bottom and the neighbor’s
top move by the same delta.
First paste (n_old == 0) — gap-aware:
gap_below = _nearest_neighbor_gap_below(...) # empty rows below header to nearest neighbor
shift_amount = max(0, n_new - gap_below)
On the very first paste there are no data rows yet, but the user has already
laid out a gap between this table’s header and the next table (table_gap, 2 by
default). The engine consumes that gap first and only pushes the neighbor by
the deficit (n_new - gap) when the data is taller than the gap. Data that
fits in the existing gap is written into it without displacing the neighbor at
all.
_nearest_neighbor_gap_below scans by cell content (not by bookmark),
because a header-only neighbor that’s never been pasted into has only its anchor
named range — no _table_start/_table_end bookmarks yet — so it’s invisible
to _iter_managed_table_extents. Scanning cell content catches the neighbor’s
header cells the way the user sees them. (When no neighbor is found within the
scan window, it returns a large sentinel = “unlimited room below”.)
First-paste grid — data fits the gap (no neighbor displacement):
First-paste grid — data exceeds the gap (push by the deficit only):
The net effect across any sequence of grows and shrinks is that the gap set at initial layout time is the gap you keep, forever.
4.4 Step 3 — Write + bookmarks (and column-width handling)
Between the clear and the write, the header is widened if needed
(_extend_header_for_wider_df), then _shift_band_below runs in the
post-widen band, then _write_data_and_bookmarks:
- Writes
df.valuesat(old_start, start_col)— the index is never written. ClearFormatson the written block.- Upserts
<name>_table_startand<name>_table_endto the new data corners.
Widen. When df has more columns than the header, the header named range is
auto-extended rightward; new header cells take their display names from
df.columns. No Insert columns is needed — the lane-row-disjoint invariant
guarantees no same-lane table shares the operating rows, so the new header cells
land on rows nobody in the lane owns. Content in a different lane at the same
rows is unaffected because the widen only writes into the lane’s new column(s).
(If the rightmost new column would pass Excel’s last column XFD = 16,384,
ExcelColumnOverflowError is raised first.)
Narrow. When df has fewer columns than the header, the trailing data
cells are already cleared by Step 1 (the clear spans header_end_col). The
header named range stays at its declared width; the _table_end bookmark
tracks the data extent, ending at start_col + df.shape[1] - 1. So a later
re-widen to the original width needs no header change.
_table_end → (4, D)_table_end → (4, C) (data extent)4.5 Failure ordering
The steps are ordered so any failure leaves the sheet consistent:
- Invariant check (no mutation) → can abort cleanly.
- Widen-claim scan (no mutation) → can abort cleanly.
- Clear old data — runs the danger-zone scan first; raising here means no header has been mutated yet, safe to abort.
- Widen header (commits only after the clear succeeded).
- Shift + write.
5. Bookmark auto-heal & cleanup_stale_table_names
Workbooks get edited by humans in Excel. Columns get deleted, rows get right-clicked away, names get fiddled in the Name Manager. The engine recovers from the common forms of damage.
5.1 Bookmark auto-heal (the write path)
_read_op_extent resolves <name>_table_start / <name>_table_end. If the
pair is missing, corrupt (#REF! or any COM error), or inverted
(end < start), it logs a DEBUG breadcrumb, drops the fragments
(_drop_names), and returns an empty extent so the next paste falls through to
the first-paste path — writing new data in the row immediately below the
header, exactly as a fresh table would.
_table_start present, _table_end #REF!| Trigger | Cause | Outcome |
|---|---|---|
_table_start missing |
Deleted via Name Manager | Drop _table_end, first-paste path |
_table_end missing |
Same | Drop _table_start, first-paste path |
Either #REF! |
Right-clicked → Delete on data rows | Drop both, first-paste path |
end.row < start.row |
Manual mis-edit | Drop both, first-paste path |
Recovery is silent except for a DEBUG log. See ../bookmark-auto-heal.md.
Read paths heal differently.
import_excel_table/scan_named_range_tablegrow a stale_table_end(scan down, extend) and rewrite missing/orphaned bookmarks, but never auto-shrink — a read should never lose track of data it can see.
5.2 cleanup_stale_table_names (on-demand pruning)
#REF! names accumulate when a user deletes a column manually in Excel — the
named range lingers but refers_to resolves to =#REF!#REF!.
cleanup_stale_table_names prunes all such broken names for a managed table.
from xlwings_package import cleanup_stale_table_names
cleanup_stale_table_names(wb, "assay_x") # prune #REF! names
cleanup_stale_table_names(wb, "assay_x", sheet="Results") # one sheet only
# → {"removed_ref": ["Results!col_b"]}
Ownership is decided by name pattern + the manifest. Because ownership keys off
the manifest, a table with no manifest raises ManifestNotFoundError:
ManifestNotFoundError— “Raised when a Benchling-managed table has no schema-column manifest on the requested sheet(s) — i.e. the table was never initialised byinitialise_table_benchlingor the manifest name was deleted manually.”
(A missing sheet= raises WorksheetNotFoundError.) Details:
../cleanup-stale-names.md.
6. CENTERPIECE — reinitialise a stacked Benchling table with MORE columns
This is the payoff: two stacked Benchling tables on one sheet, the upstream
schema gains a column, you call initialise_table_benchling again, and the
table auto-widens in place — header extends, named ranges rebind, manifest
updates — while the stacked neighbor below is protected.
6.0 Starting state — two stacked tables, with data
assay_x (cols A–C) stacked above assay_y (cols A–C), both with data and a
2-row gap between:
assay_x_table_end = $C$4assay_y_table_end = $C$10assay_x__schema_columns = "sample,batch,result" · assay_y__schema_columns = "well,conc,od"6.1 The upstream schema gains a column
In Benchling, Assay X adds a new result field qc at the end of its
schema. The schema is now [sample, batch, result, qc].
6.2 The reinit call (runnable)
Reinit is automatic: initialise_table_benchling detects that assay_x already
exists on the sheet (_find_table finds the anchor name) and routes to
_reinit instead of _write_fresh. You call it exactly the same way you did
the first time:
from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import initialise_table_benchling
# You build the context yourself — credentials are mgtx-benchling-wrapper's job.
ctx = BenchlingContext(
client_id=...,
client_secret=..., # plaintext — resolve from your store
base_url="https://yourorg.benchling.com",
token_url="https://yourorg.benchling.com/api/v2/token",
)
# Same two schemas, same sheet, same column. assay_x's schema now has `qc`.
results = initialise_table_benchling(
wb,
{
"Results": ["assaysch_assayx", "assaysch_assayy"],
},
ctx=ctx,
column="A", # autopaste; both stack in column A
table_gap=2, # the gap that's preserved across reinits/pastes
)
# Each result dict carries action + a diff when reinitialised:
for r in results:
print(r["display_name"], r["action"], r.get("diff"))
# Assay X reinitialised {'added': ['qc'], 'updated': [], 'removed': [], ...}
# Assay Y reinitialised {'added': [], ...} # unchanged
6.3 What _reinit does, step by step
_reinit is the in-place updater. For assay_x gaining qc:
- Locate the table —
_find_tablereturns(header_row=2, start_col=1, end_col=3). - Re-sync display names + title — column headers and the title cell are rewritten from the schema on every call (Benchling is the source of truth for text).
- Diff against the manifest —
cur_map_schema_onlyiscur_mapfiltered to the manifest{sample, batch, result}, so any manual range a dev added in the band is excluded (M.2.1).qcis in the new schema but not incur_map→ it’s a new column. - Append the new column. New columns are appended at the right edge of the
current band (
end_col + 1, col D)._check_band_emptyconfirms D2 is free first (raisesTableSpaceOccupiedErrorif not). The header cell gets the display name + bold + fill, and a fresh sheet-scoped rangeqcbinds to$D$2. NoInsert columnsis emitted — the reorder/repack pass (step 5) places every column in schema order via a clear-and-rewrite, so appending then repacking yields the same final layout without a propagating column shift. - Reorder pass — the active band is snapshotted, cleared, and rewritten in schema order. Named ranges are rebound in place. Because this operates only on the lane’s own columns (clear-and-rewrite, no Insert/Delete), no other lane is affected.
- Extend the table extent — the schema anchor grows to
$A$2:$D$2, the title merge re-merges across A–D, andassay_x_table_endextends to col D. - Rewrite the manifest —
assay_x__schema_columns = "sample,batch,result,qc".
6.4 BEFORE / AFTER — the widen, with the neighbor protected
assay_x_table_end $C$4assay_x__schema_columns = "sample,batch,result"assay_x_table_end NOW $D$4assay_x__schema_columns = "sample,batch,result,qc"Why the neighbor is protected: the reorder/repack pass performs a clear-and-
rewrite on the lane’s own columns only — no row Insert/Delete, no column
Insert/Delete that propagates rightward. The new qc header lands at D2 and its
data column extends straight down into column D; the repack places it in schema
order by value-snapshot → clear → rewrite, so assay_y at rows 7–10 is
untouched. The gap is preserved. Existing sample/batch/result data is
preserved cell-for-cell; only the headers are re-synced and qc’s cells start
empty until the next paste.
6.5 Re-pasting after the schema change — column-order matching
After reinit, you pull fresh data and repaste. The DataFrame’s columns must line
up with the headers, because paste_df_to_named_range writes df.values
positionally — column 0 under the first header, column 1 under the second,
and so on. _check_df_columns_match_headers enforces this before any
destructive op, matching each DataFrame column against either the per-column
range token or the header display value (case- and whitespace-normalized):
from xlwings_package import import_excel_table, paste_df_to_named_range
# After reinit, the sheet header order is [sample, batch, result, qc].
df = pull_assay_x_results() # however you build it
# df.columns must be [sample, batch, result, qc] in THIS order.
paste_df_to_named_range(wb, df, "assay_x")
If the DataFrame’s columns are reordered, or a column was inserted in the middle or dropped from the middle relative to the sheet headers, the positional write would silently misalign data — so it raises instead:
ColumnOrderMismatchError— “Raised when a DataFrame’s columns don’t match the existing Excel header positions on a paste.paste_df_to_named_rangewritesdf.valuespositionally — column 0 lands under the first header cell, column 1 under the second, etc. Ifdf.columnsis reordered (or a column was inserted in the middle, or one was removed from the middle), the positional write would silently misalign data with the headers. This error fires before any destructive operation so the user can either reorder the DataFrame or re-initialise the table to reflect the new schema. Matches against either the column-level sheet-scoped named-range token or the header cell display value (case + whitespace normalized), so both the Benchling workflow (use_name_range_names=True→ range-name tokens) and the manual-table workflow (display values) work without configuration.”
Fix A: reorder the DataFrame to [sample, batch, result, qc] before pasting.
Fix B: if Benchling itself reordered, re-run initialise_table_benchling first
(the reorder pass rebinds the sheet to the new order), THEN paste.
Only existing header positions are checked; columns past the current header are handled by the auto-widen path (and pre-scanned by the claim zone, §3.5). A freshly-anchored, empty-header position has no identity to match against and is accepted (this is what lets a first paste into a brand-new table go through).
6.6 The mirror case — Benchling DROPS a column
When a column is removed from the Benchling schema, _reinit purges it
outright: clears every cell in the column from the header through _table_end,
deletes its named range, closes the horizontal gap so the remaining schema
columns are contiguous (the repack pass), and shrinks the table extent —
schema anchor width, title merge, and _table_end — to the new rightmost column.
The stacked neighbor below is untouched: the purge operates on columns, not
rows, so no vertical shift reaches assay_y.
assay_x_table_end $C$4assay_x_table_end NOW $B$4The repack/shrink is guarded by all_active_resolved — the engine only
compacts and shrinks when every active schema column’s name resolved, so it
can guarantee a contiguous block. If a name were missing it leaves the wider
extent intact rather than risk stranding a live column outside the bookmark.
See
../visualizations/schema-drift-with-data.htmlfor animated add-end / add-middle / remove / reorder cases on a table that already holds data, and../visualizations/stacked-tables-contract.htmlfor the row/column operations and the lane layout.
7. Mental-model summary
- A managed table = header anchor + per-column names +
_table_start/_table_endbookmarks (+ a manifest for Benchling tables). The bookmark prefix is the table’s identity. - Lanes. A lane is the transitive column-overlap cluster anchored at the operating
table. Row overlap is rejected only within a lane (column-overlapping tables).
Column-disjoint tables sharing rows are different lanes and are fully independent.
SideBySideNotSupportedErrormeans “row overlap within a single lane.” - A lane owns the danger zone below it across its lane band (transitive
column-overlap closure). Loose content or shapes there →
TableSpaceOccupiedErrorwith a move instruction. Safe places: above the title, outside the lane’s columns, or under its own named range (shapes cannot be exempted by a named range). - A paste resizes via clear → shift → write. The shift only touches rows
below the old bottom, so stacked lane-mates move as a unit and the gap is
preserved — pure
deltaon subsequent pastes, gap-aware on first paste. Tables in other lanes are never moved. - Widen = extend the header rightward, no column insert (the lane invariant makes it safe). Narrow = clear trailing cells, keep the header width, bookmark tracks data extent.
- Reinit auto-widens in place: new columns append at the right edge, the repack pass places them in schema order (clear-and-rewrite, no propagating shift). Dropped columns are purged + closed up + shrunk (cells cleared, name deleted, repack closes the gap). Neighbor lanes and rows are untouched throughout.
- Repasting after a schema change is positional; mismatched column order →
ColumnOrderMismatchError(reorder the DataFrame, or reinit then paste). - Damage recovery: write path auto-heals broken bookmarks (falls to first-paste);
cleanup_stale_table_namesprunes#REF!broken names.