Many Benchling tables, driven by a dictionary

🎨 Colour walkthrough: open the interactive visual version — colour multi-sheet / stacked layout diagrams. The Unicode grids below are the always-available text version.

This guide shows how to stand up many Benchling assay-result tables in one workbook — spread across several sheets, or stacked vertically on one sheet — all driven from a single Python dictionary, then filled, read back, and refreshed in a loop.

It builds directly on the single-table guide. If you have not yet stood up one table (auth with a BenchlingContext, the assaysch_… id rule, what initialise_table_benchling writes, the sanitize_formulas default), read that first and come back here:

Prerequisite: Benchling → Excel: a single table

This guide does not repeat those basics. It focuses on the fan-out: how a plan dict becomes many tables, how to write/read them in bulk, and how to keep them all in sync with Benchling.

Mental model: sheets vs. vertical stacks
The driving data structure + the init loop
Writing results back to many tables (multi-paste)
Reading many tables back into a dict of DataFrames
Refreshing / reinitialising many tables
End-to-end worked example
Common errors & fixes
Related

1. Mental model: sheets vs. vertical stacks

A managed Benchling table is a header block plus a set of named ranges and two bookmarks, <name>_table_start / <name>_table_end, that track where its data begins and ends. Multiple managed tables can coexist in one of two ways:

Across multiple sheets — one (or more) table per tab:

Potency tab

Potency Assay

VS0001

0.42

Identity tab

Identity Assay

VS0001

LC-MS

Pass

Purity tab

Purity Assay

VS0001

98.7

One managed Benchling table per sheet.

Stacked vertically on one sheet — two tables sharing a tab:

Identity tab — Identity over Purity

Identity Assay

VS0001

LC-MS

Pass

VS0002

LC-MS

Pass

⋮ table_gap = 2 blank rows ⋮

Purity Assay

sample_id

peak

pct_purity

VS0001

98.7

The blue band is the merged, bold, centred schema title; the strong-blue header row carries the per-column sheet-scoped named ranges (warehouse names); the light-blue cells are live data. The purple Purity Assay is the stacked-below neighbour table, landing table_gap blank rows below the first table’s _table_end.

The lane rule. Managed tables on the same sheet form lanes based on column connectivity. Tables within the same lane must occupy disjoint row ranges — they stack top-to-bottom. Tables in separate column bands (different lanes) may share rows — side-by-side, L-shaped, and T-shaped layouts across disjoint column bands are supported. Tables on different sheets have no constraints — each sheet is its own canvas.

For the full contract (lane model, how the gap below a table is preserved across grow/shrink pastes, danger-zone shape detection, and how reinit grows columns) see:

Concepts: Stacked tables & reinitialisation

When to use which:

Situation	Layout
Each assay is conceptually a separate report	One table per sheet — cleanest, no row-collision risk, tabs name themselves after the assay
A few related assays a reviewer reads top-to-bottom together	Stacked on one sheet (same lane) — keep `table_gap` ≥ 2 so grow/shrink never collides
Many assays (5+), or assays that grow to thousands of rows	One per sheet — avoids any chance of a growing table pushing into its lane-mate
Two small lookup tables that always travel together	Stacked (same lane) optionally with per-schema column anchors (`column={schema_id: col}`), or side-by-side in separate column bands (different lanes) if they share the same rows

When in doubt, one table per sheet. It is the layout with the fewest failure modes.

2. The driving data structure + the init loop

initialise_table_benchling has the following real signature (copied verbatim from the source):

def initialise_table_benchling(
    wb,
    schema_map: "dict[str, list[str]]",
    *,
    ctx,
    mode: Literal["autopaste", "paste"] = "autopaste",
    column: "int | str | dict[str, int | str] | None" = None,
    at: "str | None" = None,
    table_gap: int = 2,
) -> list[SchemaInitResult]:

There is one calling shape: a dict mapping each sheet name to the ordered list of schema ids on it. It reads exactly how you think about the layout — “this sheet holds these tables”:

Multiple ids under one sheet stack vertically on it, in list order, separated by table_gap rows.
The same id may appear under more than one sheet.

Keep your layout as a plan dict keyed by sheet —

tables = {
    "Potency":  ["assaysch_a"],                 # one table on the Potency tab
    "Identity": ["assaysch_b", "assaysch_c"],   # two tables stacked on Identity
}

— and pass it straight in; it is the argument, no loop or flattening:

import xlwings as xw
from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import initialise_table_benchling

wb = xw.Book.caller()  # or xw.Book(r"C:\path\to\book.xlsx")

ctx = BenchlingContext(
    client_id="your-oauth2-app-client-id",
    client_secret=PLAINTEXT_SECRET,                 # resolve from your store
    base_url="https://yourorg.benchling.com",
    token_url="https://yourorg.benchling.com/api/v2/token",
)

tables = {
    "Potency":  ["assaysch_a"],
    "Identity": ["assaysch_b", "assaysch_c"],
}

results = initialise_table_benchling(wb, tables, ctx=ctx)

What happens per sheet:

The sheet is created if it does not already exist (wb.sheets.add); an existing sheet is reused.
Each schema id is resolved against Benchling (assaysch_… only — name lookup is not supported) to get its display name and column list.
For each id in list order: if no table for that schema exists on the sheet yet, a fresh header block is written and its action is "created"; if it already exists, it is refreshed in place and its action is "reinitialised" (see §5).
Stacked tables on the same sheet each get their own row cursor at their start column, so a new table lands below the previous one with table_gap empty rows between them.

results is a flat list[SchemaInitResult] (one per schema, across all sheets) — a frozen, slotted dataclass, so access fields by attribute (not subscript). Each carries schema_name, display_name, warehouse_name, range_name, sheet_name, action, columns (a tuple[ColumnInfo, …]), and — on reinit — diff. After the loop you get one INFO log line per schema:

Schema 'Potency Assay'   →  Potency    [created]
Schema 'Identity Assay'  →  Identity   [created]
Schema 'Purity Assay'    →  Identity   [created]

When you still want a loop

A single call shares one column anchor across every sheet. If you need a different anchor column per sheet, or want to catch and log errors sheet-by-sheet, loop and call once per sheet — each entry is a valid one-entry schema_map:

results = []
for sheet_name, schema_ids in tables.items():
    results += initialise_table_benchling(
        wb,
        {sheet_name: schema_ids},
        ctx=ctx,
        column="A" if sheet_name == "Potency" else "E",
    )

Resulting layout

Potency tab — one table

Potency Assay

Identity tab — two tables stacked vertically

Identity Assay (assaysch_b)

⋮ table_gap rows ⋮

Purity Assay (assaysch_c)

sample_id

peak

pct_purity

3. Writing results back to many tables (multi-paste)

paste_df_to_named_range writes a DataFrame into a managed table addressed by its named-range token — the schema warehouse name (snake_case), the same token shown by import_excel_table’s mapping. Its real signature:

def paste_df_to_named_range(
    wb,
    df: pd.DataFrame | list[pd.DataFrame],
    named_range: str | list[str],
    scope_sheet: Optional[str] = None,
    sanitize_formulas: bool = True,
) -> None:

It has a built-in list mode: pass a list of DataFrames and a matching list of named-range tokens. The lengths must match; it zips them and pastes each pair. Under the hood this is exactly a loop over the single-paste path — there is no batch optimisation, it is purely a convenience.

List mode — many tables on ONE sheet

Because scope_sheet is a single value in list mode (it is forwarded unchanged to every paste), list mode is the right tool when all the target tables live on the same sheet:

from xlwings_package import paste_df_to_named_range

paste_df_to_named_range(
    wb,
    [identity_df, purity_df],                 # list of DataFrames
    ["identity_assay", "purity_assay"],       # matching list of range tokens
    scope_sheet="Identity",                   # both tables are on this one tab
)

Loop — many tables across MULTIPLE sheets

When your tables span several sheets, each needs its own scope_sheet, so loop and call the single-paste form once per table. Drive it from a dict keyed by (token, sheet):

to_write = {
    ("potency_assay",  "Potency"):  potency_df,
    ("identity_assay", "Identity"): identity_df,
    ("purity_assay",   "Identity"): purity_df,
}

for (token, sheet), df in to_write.items():
    paste_df_to_named_range(wb, df, token, scope_sheet=sheet)

Column-order matching is enforced per table

Each paste validates that your DataFrame’s columns line up with that table’s existing headers, position by position. The data is written positionally (df.values), so a reordered or shifted DataFrame would silently misalign under the headers — to prevent that, a mismatch raises ColumnOrderMismatchError before anything is written.

Match against either the range-name token (warehouse name) or the header display text (case- and whitespace-insensitive). The cleanest contract is to key each DataFrame’s columns by the warehouse-name tokens (which is exactly what import_excel_table returns by default — see §4). If Benchling added columns to a schema, re-initialise that table first (§5) so its headers exist, then paste the wider DataFrame.

Notes that apply to every paste (single or list):

The DataFrame index is not written — only df.values.
A zero-row DataFrame raises EmptyDataError; filter those out of your dict before looping.
Formula-injection sanitisation is on by default (sanitize_formulas=True): text cells beginning with =/+/-/@ are apostrophe-escaped. Pass sanitize_formulas=False only for trusted sources where you deliberately want formula strings written through verbatim.

4. Reading many tables back into a dict of DataFrames

To pull every managed table back out, loop import_excel_table, keyed by the schema token and its sheet. import_excel_table returns (DataFrame, mapping), where columns default to the warehouse-name tokens (use_name_range_names=True) — the same tokens paste_df_to_named_range expects, so a read → transform → write round-trip lines up by construction.

from xlwings_package import import_excel_table

tables_on_sheet = {
    ("potency_assay",  "Potency"):  None,
    ("identity_assay", "Identity"): None,
    ("purity_assay",   "Identity"): None,
}

frames: dict[str, "pd.DataFrame"] = {}
mappings: dict[str, dict] = {}
for token, sheet in tables_on_sheet:
    df, mapping = import_excel_table(wb, token, scope_sheet=sheet)
    frames[token] = df
    mappings[token] = mapping

# frames["identity_assay"]  → DataFrame with columns ["sample_id", "method", "call", …]
# mappings["identity_assay"] → {"sample_id": "Sample ID", "method": "Method", …}

Want only some columns of a table? Pass include_list (cannot be combined with ignore_list):

df, mapping = import_excel_table(
    wb, "potency_assay", scope_sheet="Potency",
    include_list=["sample_id", "ec50"],
)

A completely empty table (or an all-null column, when check_empty_columns=True, the default) raises EmptyDataError — wrap the call in try/except if some tables may legitimately be empty at read time.

5. Refreshing / reinitialising many tables

Re-running your init call (§2) against a workbook whose tables already exist refreshes each one in place instead of recreating it — that schema’s action on the returned SchemaInitResult becomes "reinitialised" and a diff is attached. Just call it again with the same plan dict:

results = initialise_table_benchling(wb, tables, ctx=ctx)

for r in results:
    if r.action == "reinitialised":
        print(r.sheet_name, r.display_name, r.diff)

On reinit, Benchling is the source of truth. For each table:

Title and column-header text are re-synced from the current schema.
Columns added in Benchling are inserted/appended — reinit grows the table’s columns in place. The exact placement rules (middle-insert vs. end-append, reorder handling) are covered in the concepts doc.
Columns dropped in Benchling are always purged outright: cells are cleared, the remaining columns close up contiguously, and the table’s extent (named range, title merge, and _table_end bookmark) shrinks to the new rightmost column — without disturbing any table stacked below it or in a neighbouring lane.

Because reinit can grow a table’s columns, always reinit before pasting when a schema may have changed — that way the headers exist and the wider DataFrame matches the column-order check in §3. For the full reinit semantics (column add/remove/reorder, the schema manifest that protects manually-added named ranges, and the partial_diff attribute on mid-loop failures) see:

Concepts: Stacked tables & reinitialisation — “reinit grows columns”

Prune stale #REF! names with cleanup_stale_table_names (one call per table token).

6. End-to-end worked example

A dict → several sheets each with a Benchling table → fill → read back → save.

import pandas as pd
import xlwings as xw

from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import (
    initialise_table_benchling,
    paste_df_to_named_range,
    import_excel_table,
    save_dataframe,
)

# --- 0. Workbook + Benchling context -------------------------------------
wb = xw.Book.caller()  # or xw.Book(r"C:\path\to\assays.xlsx")

ctx = BenchlingContext(
    client_id="your-oauth2-app-client-id",
    client_secret=PLAINTEXT_SECRET,                 # resolve from your store
    base_url="https://yourorg.benchling.com",
    token_url="https://yourorg.benchling.com/api/v2/token",
)

# --- 1. The plan: which tables live on which sheet -----------------------
tables = {
    "Potency":  ["assaysch_a"],                 # one table
    "Identity": ["assaysch_b", "assaysch_c"],   # two tables stacked vertically
}

# Each table is addressed by its anchor token (SchemaInitResult.range_name).
# Capture it from the real results rather than hard-coding — but for this
# example assume:
#   assaysch_a → "potency_assay"
#   assaysch_b → "identity_assay"
#   assaysch_c → "purity_assay"

# --- 2. Initialise every table (creates sheets + headers) ----------------
results = initialise_table_benchling(wb, tables, ctx=ctx)

# Capture each table's anchor token + sheet from the real results so
# downstream paste/import code never hard-codes tokens.
layout = {r.range_name: r.sheet_name for r in results}

# --- 3. Fill each table ---------------------------------------------------
# DataFrame columns are keyed by warehouse-name tokens so they line up with
# the headers initialise_table_benchling just wrote (column-order check, §3).
potency_df = pd.DataFrame({
    "sample_id": ["VS0001", "VS0002"],
    "ec50":      [0.42, 0.51],
})
identity_df = pd.DataFrame({
    "sample_id": ["VS0001", "VS0002"],
    "method":    ["LC-MS", "LC-MS"],
    "call":      ["Pass", "Pass"],
})
purity_df = pd.DataFrame({
    "sample_id":  ["VS0001", "VS0002"],
    "peak":       [1, 1],
    "pct_purity": [98.7, 97.2],
})

# Identity + Purity share the "Identity" sheet → list mode, one scope_sheet.
paste_df_to_named_range(
    wb,
    [identity_df, purity_df],
    ["identity_assay", "purity_assay"],
    scope_sheet="Identity",
)
# Potency is on its own sheet → single paste.
paste_df_to_named_range(wb, potency_df, "potency_assay", scope_sheet="Potency")

# --- 4. Read everything back into a dict of DataFrames -------------------
read_plan = {
    "potency_assay":  "Potency",
    "identity_assay": "Identity",
    "purity_assay":   "Identity",
}
frames = {}
for token, sheet in read_plan.items():
    df, _mapping = import_excel_table(wb, token, scope_sheet=sheet)
    frames[token] = df

# --- 5. Save one of them --------------------------------------------------
save_dataframe(frames["purity_assay"], default_file_name="purity_results")
# Opens a save dialog; writes CSV or XLSX depending on user choice.

To refresh after a schema change in Benchling, just re-run step 2’s loop — existing tables reinitialise in place (§5), then repeat steps 3–5.

7. Common errors & fixes

Exception / symptom	Cause	Fix
`ValueError: Schema lookup requires a Benchling ID …`	A schema key was a name, not an `assaysch_…` id	Use the `assaysch_…` id from the schema’s Benchling URL — name lookup is not supported.
`TypeError: schema_map must be a dict[sheet_name, list[schema_id]] …`	Passed a bare sheet-name string (or other non-dict) as the second argument	Wrap it in a dict: `{"Sheet": ["assaysch_…"]}`.
`TypeError: schema_map['Sheet'] must be a list of schema ids …`	A sheet’s value was a bare id string, not a list	Wrap the single id in a list: `{"Sheet": ["assaysch_…"]}`.
`ValueError: schema_map['Sheet'] has no schema ids.`	A sheet was mapped to an empty list	Give each sheet at least one id, or drop the sheet from the dict.
`ValueError: schema_map must not be empty.`	Passed an empty dict	Populate the plan dict before calling.
`ImportError: mgtx-benchling-wrapper is required.`	The Benchling extra is not installed	`pip install "mgtx-xlwings-utils[benchling]"`
`SideBySideNotSupportedError`	Two managed tables in the same lane (overlapping columns) ended up with overlapping rows	Stack them vertically (raise `table_gap`, or move a table to its own sheet). Tables in disjoint column bands (different lanes) sharing rows are allowed. See the concepts doc.
`TableSpaceOccupiedError`	A fresh table’s header band, or the danger zone below an existing table, holds loose user content	Clear the cells, choose a different sheet, or anchor elsewhere (`column` / `mode="paste"` with a different `at`).
`ColumnOrderMismatchError` on paste	DataFrame columns don’t line up with that table’s existing headers	Reorder the DataFrame to match, or re-initialise that table (§5) so new/reordered Benchling columns are applied first.
`ExcelColumnOverflowError`	A schema (or auto-widen) would land a column past column XFD (16,384)	Anchor the table further left, or split the schema across sheets.
`EmptyDataError` on read	The table (or a column) is entirely empty	Skip empty tables, or pass `check_empty_columns=False` to `import_excel_table`.
`ManifestNotFoundError` from `cleanup_stale_table_names`	The table was never created by `initialise_table_benchling`, or its `{token}__schema_columns` manifest was deleted	Only run cleanup against tables this package initialised; re-initialise to rewrite the manifest.
List-mode paste raises `ValueError: Number of DataFrames must match number of named ranges.`	The DataFrame list and token list have different lengths	Make the two lists the same length, or build them together so they can’t drift.

Benchling → Excel: a single table — the prerequisite: auth, the assaysch_… id rule, what one table looks like.
Stacked tables & reinitialisation — the stacked-only contract, gap preservation, and how reinit grows columns.
Normal (non-Benchling) tables — initialise_table_named_ranges
- import_excel_table for tables you build by hand.
Plates — 96/384-well plate layouts as managed regions.
Tools reference — full signatures for initialise_table_benchling, paste_df_to_named_range, import_excel_table, cleanup_stale_table_names, and the exception types.
Documentation home

mgtx-xlwings-utils

Reusable xlwings + pandas helpers for Excel automation workflows at MeiraGTx.

Many Benchling tables, driven by a dictionary

Contents

1. Mental model: sheets vs. vertical stacks

2. The driving data structure + the init loop

When you still want a loop

Resulting layout

3. Writing results back to many tables (multi-paste)

List mode — many tables on ONE sheet

Loop — many tables across MULTIPLE sheets

Column-order matching is enforced per table

4. Reading many tables back into a dict of DataFrames

5. Refreshing / reinitialising many tables

6. End-to-end worked example

7. Common errors & fixes

Many Benchling tables, driven by a dictionary

Contents

1. Mental model: sheets vs. vertical stacks

2. The driving data structure + the init loop

When you still want a loop

Resulting layout

3. Writing results back to many tables (multi-paste)

List mode — many tables on ONE sheet

Loop — many tables across MULTIPLE sheets

Column-order matching is enforced per table

4. Reading many tables back into a dict of DataFrames

5. Refreshing / reinitialising many tables

6. End-to-end worked example

7. Common errors & fixes

8. Related