Many Benchling tables, driven by a dictionary
🎨 Colour walkthrough: open the interactive visual version — colour multi-sheet / stacked layout diagrams. The Unicode grids below are the always-available text version.
This guide shows how to stand up many Benchling assay-result tables in one workbook — spread across several sheets, or stacked vertically on one sheet — all driven from a single Python dictionary, then filled, read back, and refreshed in a loop.
It builds directly on the single-table guide. If you have not yet stood up
one table (auth with a BenchlingContext, the assaysch_… id rule, what
initialise_table_benchling writes, the sanitize_formulas default), read
that first and come back here:
Prerequisite: Benchling → Excel: a single table
This guide does not repeat those basics. It focuses on the fan-out: how a plan dict becomes many tables, how to write/read them in bulk, and how to keep them all in sync with Benchling.
Contents
- Mental model: sheets vs. vertical stacks
- The driving data structure + the init loop
- Writing results back to many tables (multi-paste)
- Reading many tables back into a dict of DataFrames
- Refreshing / reinitialising many tables
- End-to-end worked example
- Common errors & fixes
- Related
1. Mental model: sheets vs. vertical stacks
A managed Benchling table is a header block plus a set of named ranges and
two bookmarks, <name>_table_start / <name>_table_end, that track where its
data begins and ends. Multiple managed tables can coexist in one of two ways:
Across multiple sheets — one (or more) table per tab:
One managed Benchling table per sheet.
Stacked vertically on one sheet — two tables sharing a tab:
The blue band is the merged, bold, centred schema title; the strong-blue
header row carries the per-column sheet-scoped named ranges (warehouse
names); the light-blue cells are live data. The purple Purity Assay is
the stacked-below neighbour table, landing table_gap blank rows below the
first table’s _table_end.
The lane rule. Managed tables on the same sheet form lanes based on column connectivity. Tables within the same lane must occupy disjoint row ranges — they stack top-to-bottom. Tables in separate column bands (different lanes) may share rows — side-by-side, L-shaped, and T-shaped layouts across disjoint column bands are supported. Tables on different sheets have no constraints — each sheet is its own canvas.
For the full contract (lane model, how the gap below a table is preserved across grow/shrink pastes, danger-zone shape detection, and how reinit grows columns) see:
Concepts: Stacked tables & reinitialisation
When to use which:
| Situation | Layout |
|---|---|
| Each assay is conceptually a separate report | One table per sheet — cleanest, no row-collision risk, tabs name themselves after the assay |
| A few related assays a reviewer reads top-to-bottom together | Stacked on one sheet (same lane) — keep table_gap ≥ 2 so grow/shrink never collides |
| Many assays (5+), or assays that grow to thousands of rows | One per sheet — avoids any chance of a growing table pushing into its lane-mate |
| Two small lookup tables that always travel together | Stacked (same lane) optionally with per-schema column anchors (column={schema_id: col}), or side-by-side in separate column bands (different lanes) if they share the same rows |
When in doubt, one table per sheet. It is the layout with the fewest failure modes.
2. The driving data structure + the init loop
initialise_table_benchling has the following real signature (copied verbatim
from the source):
def initialise_table_benchling(
wb,
schema_map: "dict[str, list[str]]",
*,
ctx,
mode: Literal["autopaste", "paste"] = "autopaste",
column: "int | str | dict[str, int | str] | None" = None,
at: "str | None" = None,
table_gap: int = 2,
) -> list[SchemaInitResult]:
There is one calling shape: a dict mapping each sheet name to the ordered list of schema ids on it. It reads exactly how you think about the layout — “this sheet holds these tables”:
- Multiple ids under one sheet stack vertically on it, in list order,
separated by
table_gaprows. - The same id may appear under more than one sheet.
Keep your layout as a plan dict keyed by sheet —
tables = {
"Potency": ["assaysch_a"], # one table on the Potency tab
"Identity": ["assaysch_b", "assaysch_c"], # two tables stacked on Identity
}
— and pass it straight in; it is the argument, no loop or flattening:
import xlwings as xw
from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import initialise_table_benchling
wb = xw.Book.caller() # or xw.Book(r"C:\path\to\book.xlsx")
ctx = BenchlingContext(
client_id="your-oauth2-app-client-id",
client_secret=PLAINTEXT_SECRET, # resolve from your store
base_url="https://yourorg.benchling.com",
token_url="https://yourorg.benchling.com/api/v2/token",
)
tables = {
"Potency": ["assaysch_a"],
"Identity": ["assaysch_b", "assaysch_c"],
}
results = initialise_table_benchling(wb, tables, ctx=ctx)
What happens per sheet:
- The sheet is created if it does not already exist (
wb.sheets.add); an existing sheet is reused. - Each schema id is resolved against Benchling (
assaysch_…only — name lookup is not supported) to get its display name and column list. - For each id in list order: if no table for that schema exists on the
sheet yet, a fresh header block is written and its action is
"created"; if it already exists, it is refreshed in place and its action is"reinitialised"(see §5). - Stacked tables on the same sheet each get their own row cursor at their
start column, so a new table lands below the previous one with
table_gapempty rows between them.
results is a flat list[SchemaInitResult] (one per schema, across all
sheets) — a frozen, slotted dataclass, so access fields by attribute (not
subscript). Each carries schema_name, display_name, warehouse_name,
range_name, sheet_name, action, columns (a tuple[ColumnInfo, …]), and
— on reinit — diff. After the loop you get one INFO log line per schema:
Schema 'Potency Assay' → Potency [created]
Schema 'Identity Assay' → Identity [created]
Schema 'Purity Assay' → Identity [created]
When you still want a loop
A single call shares one column anchor across every sheet. If you need a
different anchor column per sheet, or want to catch and log errors
sheet-by-sheet, loop and call once per sheet — each entry is a valid one-entry
schema_map:
results = []
for sheet_name, schema_ids in tables.items():
results += initialise_table_benchling(
wb,
{sheet_name: schema_ids},
ctx=ctx,
column="A" if sheet_name == "Potency" else "E",
)
Resulting layout
3. Writing results back to many tables (multi-paste)
paste_df_to_named_range writes a DataFrame into a managed table addressed by
its named-range token — the schema warehouse name (snake_case), the same
token shown by import_excel_table’s mapping. Its real signature:
def paste_df_to_named_range(
wb,
df: pd.DataFrame | list[pd.DataFrame],
named_range: str | list[str],
scope_sheet: Optional[str] = None,
sanitize_formulas: bool = True,
) -> None:
It has a built-in list mode: pass a list of DataFrames and a matching
list of named-range tokens. The lengths must match; it zips them and pastes
each pair. Under the hood this is exactly a loop over the single-paste path —
there is no batch optimisation, it is purely a convenience.
List mode — many tables on ONE sheet
Because scope_sheet is a single value in list mode (it is forwarded
unchanged to every paste), list mode is the right tool when all the target
tables live on the same sheet:
from xlwings_package import paste_df_to_named_range
paste_df_to_named_range(
wb,
[identity_df, purity_df], # list of DataFrames
["identity_assay", "purity_assay"], # matching list of range tokens
scope_sheet="Identity", # both tables are on this one tab
)
Loop — many tables across MULTIPLE sheets
When your tables span several sheets, each needs its own scope_sheet, so loop
and call the single-paste form once per table. Drive it from a dict keyed by
(token, sheet):
to_write = {
("potency_assay", "Potency"): potency_df,
("identity_assay", "Identity"): identity_df,
("purity_assay", "Identity"): purity_df,
}
for (token, sheet), df in to_write.items():
paste_df_to_named_range(wb, df, token, scope_sheet=sheet)
Column-order matching is enforced per table
Each paste validates that your DataFrame’s columns line up with that table’s
existing headers, position by position. The data is written positionally
(df.values), so a reordered or shifted DataFrame would silently misalign under
the headers — to prevent that, a mismatch raises ColumnOrderMismatchError
before anything is written.
Match against either the range-name token (warehouse name) or the header
display text (case- and whitespace-insensitive). The cleanest contract is to
key each DataFrame’s columns by the warehouse-name tokens (which is exactly what
import_excel_table returns by default — see §4).
If Benchling added columns to a schema, re-initialise that table first
(§5) so its headers exist, then paste the wider DataFrame.
Notes that apply to every paste (single or list):
- The DataFrame index is not written — only
df.values. - A zero-row DataFrame raises
EmptyDataError; filter those out of your dict before looping. - Formula-injection sanitisation is on by default (
sanitize_formulas=True): text cells beginning with=/+/-/@are apostrophe-escaped. Passsanitize_formulas=Falseonly for trusted sources where you deliberately want formula strings written through verbatim.
4. Reading many tables back into a dict of DataFrames
To pull every managed table back out, loop import_excel_table, keyed by the
schema token and its sheet. import_excel_table returns
(DataFrame, mapping), where columns default to the warehouse-name tokens
(use_name_range_names=True) — the same tokens paste_df_to_named_range
expects, so a read → transform → write round-trip lines up by construction.
from xlwings_package import import_excel_table
tables_on_sheet = {
("potency_assay", "Potency"): None,
("identity_assay", "Identity"): None,
("purity_assay", "Identity"): None,
}
frames: dict[str, "pd.DataFrame"] = {}
mappings: dict[str, dict] = {}
for token, sheet in tables_on_sheet:
df, mapping = import_excel_table(wb, token, scope_sheet=sheet)
frames[token] = df
mappings[token] = mapping
# frames["identity_assay"] → DataFrame with columns ["sample_id", "method", "call", …]
# mappings["identity_assay"] → {"sample_id": "Sample ID", "method": "Method", …}
Want only some columns of a table? Pass include_list (cannot be combined with
ignore_list):
df, mapping = import_excel_table(
wb, "potency_assay", scope_sheet="Potency",
include_list=["sample_id", "ec50"],
)
A completely empty table (or an all-null column, when
check_empty_columns=True, the default) raises EmptyDataError — wrap the call
in try/except if some tables may legitimately be empty at read time.
5. Refreshing / reinitialising many tables
Re-running your init call (§2) against a workbook whose tables already exist
refreshes each one in place instead of recreating it — that schema’s action
on the returned SchemaInitResult becomes "reinitialised" and a diff is
attached. Just call it again with the same plan dict:
results = initialise_table_benchling(wb, tables, ctx=ctx)
for r in results:
if r.action == "reinitialised":
print(r.sheet_name, r.display_name, r.diff)
On reinit, Benchling is the source of truth. For each table:
- Title and column-header text are re-synced from the current schema.
- Columns added in Benchling are inserted/appended — reinit grows the table’s columns in place. The exact placement rules (middle-insert vs. end-append, reorder handling) are covered in the concepts doc.
- Columns dropped in Benchling are always purged outright: cells are
cleared, the remaining columns close up contiguously, and the table’s extent
(named range, title merge, and
_table_endbookmark) shrinks to the new rightmost column — without disturbing any table stacked below it or in a neighbouring lane.
Because reinit can grow a table’s columns, always reinit before pasting when
a schema may have changed — that way the headers exist and the wider DataFrame
matches the column-order check in §3. For the full reinit semantics (column
add/remove/reorder, the schema manifest that protects manually-added named
ranges, and the partial_diff attribute on mid-loop failures) see:
Concepts: Stacked tables & reinitialisation — “reinit grows columns”
Prune stale #REF! names with
cleanup_stale_table_names (one call per table token).
6. End-to-end worked example
A dict → several sheets each with a Benchling table → fill → read back → save.
import pandas as pd
import xlwings as xw
from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import (
initialise_table_benchling,
paste_df_to_named_range,
import_excel_table,
save_dataframe,
)
# --- 0. Workbook + Benchling context -------------------------------------
wb = xw.Book.caller() # or xw.Book(r"C:\path\to\assays.xlsx")
ctx = BenchlingContext(
client_id="your-oauth2-app-client-id",
client_secret=PLAINTEXT_SECRET, # resolve from your store
base_url="https://yourorg.benchling.com",
token_url="https://yourorg.benchling.com/api/v2/token",
)
# --- 1. The plan: which tables live on which sheet -----------------------
tables = {
"Potency": ["assaysch_a"], # one table
"Identity": ["assaysch_b", "assaysch_c"], # two tables stacked vertically
}
# Each table is addressed by its anchor token (SchemaInitResult.range_name).
# Capture it from the real results rather than hard-coding — but for this
# example assume:
# assaysch_a → "potency_assay"
# assaysch_b → "identity_assay"
# assaysch_c → "purity_assay"
# --- 2. Initialise every table (creates sheets + headers) ----------------
results = initialise_table_benchling(wb, tables, ctx=ctx)
# Capture each table's anchor token + sheet from the real results so
# downstream paste/import code never hard-codes tokens.
layout = {r.range_name: r.sheet_name for r in results}
# --- 3. Fill each table ---------------------------------------------------
# DataFrame columns are keyed by warehouse-name tokens so they line up with
# the headers initialise_table_benchling just wrote (column-order check, §3).
potency_df = pd.DataFrame({
"sample_id": ["VS0001", "VS0002"],
"ec50": [0.42, 0.51],
})
identity_df = pd.DataFrame({
"sample_id": ["VS0001", "VS0002"],
"method": ["LC-MS", "LC-MS"],
"call": ["Pass", "Pass"],
})
purity_df = pd.DataFrame({
"sample_id": ["VS0001", "VS0002"],
"peak": [1, 1],
"pct_purity": [98.7, 97.2],
})
# Identity + Purity share the "Identity" sheet → list mode, one scope_sheet.
paste_df_to_named_range(
wb,
[identity_df, purity_df],
["identity_assay", "purity_assay"],
scope_sheet="Identity",
)
# Potency is on its own sheet → single paste.
paste_df_to_named_range(wb, potency_df, "potency_assay", scope_sheet="Potency")
# --- 4. Read everything back into a dict of DataFrames -------------------
read_plan = {
"potency_assay": "Potency",
"identity_assay": "Identity",
"purity_assay": "Identity",
}
frames = {}
for token, sheet in read_plan.items():
df, _mapping = import_excel_table(wb, token, scope_sheet=sheet)
frames[token] = df
# --- 5. Save one of them --------------------------------------------------
save_dataframe(frames["purity_assay"], default_file_name="purity_results")
# Opens a save dialog; writes CSV or XLSX depending on user choice.
To refresh after a schema change in Benchling, just re-run step 2’s loop — existing tables reinitialise in place (§5), then repeat steps 3–5.
7. Common errors & fixes
| Exception / symptom | Cause | Fix |
|---|---|---|
ValueError: Schema lookup requires a Benchling ID … |
A schema key was a name, not an assaysch_… id |
Use the assaysch_… id from the schema’s Benchling URL — name lookup is not supported. |
TypeError: schema_map must be a dict[sheet_name, list[schema_id]] … |
Passed a bare sheet-name string (or other non-dict) as the second argument | Wrap it in a dict: {"Sheet": ["assaysch_…"]}. |
TypeError: schema_map['Sheet'] must be a list of schema ids … |
A sheet’s value was a bare id string, not a list | Wrap the single id in a list: {"Sheet": ["assaysch_…"]}. |
ValueError: schema_map['Sheet'] has no schema ids. |
A sheet was mapped to an empty list | Give each sheet at least one id, or drop the sheet from the dict. |
ValueError: schema_map must not be empty. |
Passed an empty dict | Populate the plan dict before calling. |
ImportError: mgtx-benchling-wrapper is required. |
The Benchling extra is not installed | pip install "mgtx-xlwings-utils[benchling]" |
SideBySideNotSupportedError |
Two managed tables in the same lane (overlapping columns) ended up with overlapping rows | Stack them vertically (raise table_gap, or move a table to its own sheet). Tables in disjoint column bands (different lanes) sharing rows are allowed. See the concepts doc. |
TableSpaceOccupiedError |
A fresh table’s header band, or the danger zone below an existing table, holds loose user content | Clear the cells, choose a different sheet, or anchor elsewhere (column / mode="paste" with a different at). |
ColumnOrderMismatchError on paste |
DataFrame columns don’t line up with that table’s existing headers | Reorder the DataFrame to match, or re-initialise that table (§5) so new/reordered Benchling columns are applied first. |
ExcelColumnOverflowError |
A schema (or auto-widen) would land a column past column XFD (16,384) | Anchor the table further left, or split the schema across sheets. |
EmptyDataError on read |
The table (or a column) is entirely empty | Skip empty tables, or pass check_empty_columns=False to import_excel_table. |
ManifestNotFoundError from cleanup_stale_table_names |
The table was never created by initialise_table_benchling, or its {token}__schema_columns manifest was deleted |
Only run cleanup against tables this package initialised; re-initialise to rewrite the manifest. |
List-mode paste raises ValueError: Number of DataFrames must match number of named ranges. |
The DataFrame list and token list have different lengths | Make the two lists the same length, or build them together so they can’t drift. |
8. Related
- Benchling → Excel: a single table — the prerequisite:
auth, the
assaysch_…id rule, what one table looks like. - Stacked tables & reinitialisation — the stacked-only contract, gap preservation, and how reinit grows columns.
- Normal (non-Benchling) tables —
initialise_table_named_rangesimport_excel_tablefor tables you build by hand.
- Plates — 96/384-well plate layouts as managed regions.
- Tools reference — full signatures for
initialise_table_benchling,paste_df_to_named_range,import_excel_table,cleanup_stale_table_names, and the exception types. - Documentation home