Skip to the content.

Guide: one Benchling-backed table, start to finish

🎨 Colour walkthrough: open the interactive visual version — colour before/after Excel-grid diagrams of every step. The Unicode grids below are the always-available text version.

This guide walks the full lifecycle of a single Benchling assay-result table in Excel: pull the schema’s columns from Benchling, lay out a managed table (title + headers + named ranges + manifest), read existing rows back, transform them in pandas, write results back, colour them, refresh the table when the schema changes in Benchling, and tidy up.

Everything here drives a live Excel workbook through xlwings (Excel COM on Windows) and the public API exported from xlwings_package. Schema metadata comes from Benchling through a context object (ctx) you build yourself.

Audience & scope. You are a scientist/analyst running Python against an open workbook. Benchling credentials are out of scope for this package — you construct a BenchlingContext (ctx) and pass it in. See credentials.md for how ctx is built.

Related: normal (non-Benchling) tables · multiple Benchling tables / multi-sheet · stacked tables & reinit deep mechanics · credential resolution · tools reference


The 7-step pattern

Every single-table workflow is the same loop. The rest of this guide expands each step.

  1. Initialise the table from the Benchling schema — initialise_table_benchling
  2. Read existing rows out of the sheet — import_excel_table / scan_named_range_table
  3. Transform in pandas (your analysis)
  4. Write results back into the table — paste_df_to_named_range
  5. Colour rows by outcome — colour_rows
  6. Refresh the table when the Benchling schema changes — re-run initialise_table_benchling
  7. Clean up stale named ranges and save — cleanup_stale_table_names
import xlwings as xw
from xlwings_package import (
    initialise_table_benchling,
    import_excel_table,
    paste_df_to_named_range,
    colour_rows,
    cleanup_stale_table_names,
)

wb = xw.Book.caller()          # or xw.Book("MyWorkbook.xlsm")

Step 0 (external) — find your assay-result schema id in Benchling

This step happens in the Benchling web UI, not in this package. It is external to mgtx-xlwings-utils; the guidance below is practical orientation, not an API contract.

initialise_table_benchling accepts Benchling ids only — strings with the assaysch_ prefix (e.g. assaysch_abc123). Free-text name lookup has been removed: schema retrieval is fully delegated to mgtx-benchling-wrapper, which exposes no name-search endpoint for assay-result schemas. The package raises a ValueError if you pass anything that does not start with assaysch_.

A result schema (what the package internally builds via SchemaHandler.build_schema_definition(schema_id, "assay_results_schema")) maps to one Excel table:

Where to get the id (typical Benchling UI):

You will pass this id inside schema_map — as one entry of the per-sheet id list.


Step 1a — install the Benchling extra & build ctx

The Benchling integration depends on mgtx-benchling-wrapper, which is an optional extra. The module imports it lazily — only when a Benchling function actually runs — so a plain import xlwings_package works without it. To use initialise_table_benchling, install the extra:

pip install "mgtx-xlwings-utils[benchling]"

If the extra is missing, the first call to initialise_table_benchling raises:

ImportError: mgtx-benchling-wrapper is required.
Install with: pip install "mgtx-xlwings-utils[benchling]"

Build the context. Credential resolution is your responsibility — load the secret however your deployment does (env var, config.yaml, keystore) and pass the constructed ctx. Full discussion in credentials.md; the short version:

from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext

ctx = BenchlingContext(
    client_id="your-oauth2-app-client-id",
    client_secret=my_resolve_secret(),                 # plaintext, your loader
    base_url="https://yourorg.benchling.com",
    token_url="https://yourorg.benchling.com/api/v2/token",
)

What ctx is, in one line: a mgtx_benchling_wrapper BenchlingContext (or any duck-typed object exposing a callable .benchling() method). The package never sees your client_id / client_secret directly — it only calls ctx.benchling() to obtain the underlying client and pull schema metadata.


Step 1b — initialise_table_benchling: lay out the table

Signature

initialise_table_benchling(
    wb,
    schema_map,                         # dict[sheet_name, list[assaysch_… ids]]
    *,
    ctx,                                # your BenchlingContext
    mode="autopaste",                   # "autopaste" (auto-stack) or "paste" (pin to a cell)
    column=None,                        # autopaste: int / "A" / {schema_id: col} / None (auto-detect)
    at=None,                            # paste: top-left cell, e.g. "C5"; defaults to "A1"
    table_gap=2,                        # empty rows between stacked tables
) -> list[SchemaInitResult]

For one table, the simplest call maps the target sheet to a one-element id list:

results = initialise_table_benchling(
    wb,
    # sheet (created if it doesn't exist) → [schema id from Step 0]
    {"Assay Results": ["assaysch_abc123"]},
    ctx=ctx,
)

What it writes onto the sheet

For each schema, the function lays out a managed table:

Data bookmarks {anchor}_table_start / {anchor}_table_end are not written at init time — they are created on the first paste_df_to_named_range (Step 4) and then track the live data extent.

Before / after

Suppose the schema Purification Assay (warehouse token purification_assay) has three fields: Sample ID, Conc (mg/mL), Pass/Fail. Anchoring at column A on a blank sheet:

Calling initialise_table_benchling(wb, {"Assay Results": ["assaysch_abc123"]}, ctx=ctx) on an empty sheet lays out the title band and header row:

BEFORE — empty "Assay Results" sheet
A
B
C
1
2
3
→
AFTER — title band + header row laid out
A
B
C
1
Purification Assay
2
Sample ID
Conc (mg/mL)
Pass/Fail
3
—
—
—
named ranges + manifest created (sheet-scoped)

Named ranges created (all sheet-scoped):

Column tokens are derived from the schema’s warehouse names, lowercased and sanitised to snake_case (Conc (mg/mL) → conc_mg_ml). Use the token when you read/write columns, and use the schema anchor name when you pass a named_range for the whole table.

Modes & anchoring (single-table essentials)

If the target band already contains values, the function refuses to overwrite and raises TableSpaceOccupiedError.

Return value

A flat list[SchemaInitResult], one entry per schema. For a single table you get a one-element list. SchemaInitResult is a frozen, slotted dataclass — access fields by attribute (results[0].action), not subscript. Its attributes:

Attribute Meaning
schema_name Human-readable schema name (or system name fallback).
display_name Title text written into the sheet.
warehouse_name Schema system (warehouse) name.
range_name Schema anchor token (bare, e.g. purification_assay).
sheet_name Target sheet.
action "created" on first run, "reinitialised" on a refresh.
columns tuple[ColumnInfo, …]; each ColumnInfo has display_name, warehouse_name, and range_name (the namespaced per-column Excel name).
diff dict or None; populated only when action == "reinitialised" (see Step 6).
table = results[0]
print(table.action)             # "created"
print(table.range_name)         # "purification_assay"  (schema anchor token)
print([c.warehouse_name for c in table.columns])
# ['sample_id', 'conc_mg_ml', 'pass_fail']   (bare tokens — what you read/write by)
print([c.range_name for c in table.columns])
# ['purification_assay__sample_id', 'purification_assay__conc_mg_ml', 'purification_assay__pass_fail']

An INFO log line is also emitted per schema:

Schema 'Purification Assay'  →  Assay Results   [created]

Step 2 — read existing rows out of the table

Two readers are available. Both return (DataFrame, mapping) where mapping is {range_name_token: display_header}.

Pass the schema anchor name. Data rows are discovered by scanning down from the header; the _table_start / _table_end bookmarks are auto-healed (written if missing, grown if stale).

df, mapping = import_excel_table(
    wb,
    "purification_assay",            # the schema anchor token
    scope_sheet="Assay Results",
    use_name_range_names=True,       # columns keyed by token (sample_id, ...)
)
# mapping == {"sample_id": "Sample ID", "conc_mg_ml": "Conc (mg/mL)", "pass_fail": "Pass/Fail"}
df, mapping = import_excel_table(
    wb, "purification_assay", scope_sheet="Assay Results",
    include_list=["sample_id", "conc_mg_ml"],
)

Tip — match df columns to what you’ll paste back. Read with use_name_range_names=True so the DataFrame is keyed by token. The column tokens then line up positionally with the header on write-back (Step 4).

scan_named_range_table — explicit start/end bookmarks

When you’d rather drive off the data bookmarks directly (the header row is inferred as the row above _table_start):

df, mapping = scan_named_range_table(
    wb,
    "purification_assay_table_start",
    "purification_assay_table_end",
    scope_sheet="Assay Results",
    use_name_range_names=True,
)

Column width is found by scanning header cells rightward until the first blank, so keep the header row contiguous (no blank separator columns). On a freshly-initialised table that has never been pasted into, the bookmarks don’t exist yet — prefer import_excel_table for the first read.


Step 3 — transform in pandas

Ordinary pandas, plus a few package helpers (strip_strings, limit_columns_to_decimals, …). Keep the row order stable if you intend to colour rows afterwards (colouring is positional).

from xlwings_package import strip_strings, limit_columns_to_decimals

df = strip_strings(df)
df = limit_columns_to_decimals(df, ["conc_mg_ml"], decimals=3)
df["pass_fail"] = df["conc_mg_ml"].apply(
    lambda v: "Pass" if (v is not None and v >= 0.5) else "Fail"
)

Step 4 — write results back with paste_df_to_named_range

Signature

paste_df_to_named_range(
    wb,
    df,                      # DataFrame (or list of DataFrames)
    named_range,             # schema anchor token (or list of tokens)
    scope_sheet=None,
    sanitize_formulas=True,  # apostrophe-escape =/+/-/@ text cells
) -> None

Write your transformed df back under the schema anchor:

paste_df_to_named_range(
    wb,
    df,
    "purification_assay",
    scope_sheet="Assay Results",
)
A
B
C
1
Purification Assay
2
Sample ID
Conc (mg/mL)
Pass/Fail
3
VS0001
0.812
Pass
4
VS0002
0.430
Fail
5
VS0003end
1.204
Pass

Column-order matching to the Benchling headers

paste_df_to_named_range writes df.values positionally under the header. Before writing, it checks each DataFrame column against the existing header at the same position (matching against either the per-column token or the header display value, case/whitespace-normalised). A mismatch raises ColumnOrderMismatchError instead of silently misaligning data.

A DataFrame wider than the header auto-extends the header (new header cells take their labels from df.columns); a narrower DataFrame clears the trailing data cells and the _table_end bookmark tracks the narrower extent. Only df.values is written — the index is never written.

sanitize_formulas (default True)

By default, text cells beginning with =, +, -, or @ are apostrophe-escaped so externally-sourced data can’t smuggle in a live Excel formula (CSV/formula injection). Numbers and numeric-looking strings ("-5.3", "+1e3") are left untouched, and bio labels like "+/+" / "-/-" survive as text. Pass sanitize_formulas=False only when you trust the source and deliberately want formula strings written through verbatim.

Re-running the paste is safe: existing data rows are cleared and the new rows written, with rows inserted/deleted below so any table stacked underneath shifts to preserve its gap. A table in a disjoint column band (different lane) is never moved. (Row overlap within the same lane raises SideBySideNotSupportedError, and loose content or shapes directly below the table raise TableSpaceOccupiedError. Deep mechanics: stacked tables & reinit.)


Step 5 — colour rows by outcome with colour_rows

Signature

colour_rows(
    wb,
    name_range,              # schema anchor token (header row)
    df,                      # same row order as the sheet
    mode,                    # "group_alternate" | "equals_no" | "values_match"
    column_name,             # df column driving the colour logic
    scope_sheet=None,
    values_to_colour=None,   # required for mode="values_match"
    colour_name=None,        # required for mode="values_match"
)

Three modes:

Colour the failures red:

colour_rows(
    wb,
    name_range="purification_assay",
    df=df,
    mode="values_match",
    column_name="pass_fail",
    values_to_colour=["Fail"],
    colour_name="light red",
    scope_sheet="Assay Results",
)
A
B
C
2
Sample ID
Conc (mg/mL)
Pass/Fail
3
VS0001
0.812
Pass
4
VS0002
0.430
Fail
5
VS0003
1.204
Pass

Colouring is positional against the header row — the DataFrame must be in the same row order as the sheet. Raises EmptyDataError if column_name is missing from df or entirely null.


Step 6 — refresh the SAME table when the Benchling schema changes

When fields are added, renamed, reordered, or dropped in Benchling, re-run initialise_table_benchling on the same sheet with the same schema id. The function sees the existing schema anchor and reinitialises in place rather than creating a new table. The returned SchemaInitResult’s action becomes "reinitialised" and carries a diff.

results = initialise_table_benchling(
    wb,
    {"Assay Results": ["assaysch_abc123"]},
    ctx=ctx,
)
print(results[0].action)   # "reinitialised"
print(results[0].diff)
# {'updated': [...], 'removed': [...], 'added': [...], 'rebound': [...],
#  'reordered': [...]}

What reinit does (summary — full mechanics in stacked tables & reinit):

Dropped-column behaviour, before / after (Conc (mg/mL) removed from the schema):

BEFORE — schema [Sample ID, Conc (mg/mL), Pass/Fail]
A
B
C
1
Purification Assay
2
Sample ID
Conc (mg/mL)
Pass/Fail
3
VS0001
0.812
Pass
4
VS0002
0.430
Fail
5
VS0003
1.204
Pass
→
AFTER — column purged, gap closed, table shrinks to A:B
A
B
C
1
Purification Assay
2
Sample ID
Pass/Fail
name deleted
3
VS0001
Pass
4
VS0002
Fail
5
VS0003end
Pass
Conc (mg/mL) purged · Pass/Fail slid left into B

After a refresh that changed columns, re-read before you write back. Read the table again (Step 2) so your DataFrame’s columns match the new header order — otherwise the next paste_df_to_named_range will raise ColumnOrderMismatchError.


Step 7 — clean up stale names, then save

#REF! names accumulate when a user deletes a column manually in Excel — the name lingers but resolves to =#REF!#REF!. cleanup_stale_table_names prunes all such broken names for the managed table.

Signature

cleanup_stale_table_names(
    wb,
    table_name,              # the schema anchor token (e.g. "purification_assay")
    *,
    sheet=None,              # restrict to one sheet (None = every sheet hosting the manifest)
) -> dict[str, list[str]]
report = cleanup_stale_table_names(
    wb,
    "purification_assay",
    sheet="Assay Results",
)
# {"removed_ref": ["Assay Results!conc_mg_ml"]}

table_name is the schema anchor token; it must match the manifest name {table_name}__schema_columns. If no manifest is found on the targeted sheet(s), it raises ManifestNotFoundError (this function only manages tables written by initialise_table_benchling). A non-existent sheet= raises WorksheetNotFoundError.

Save

wb.save()                      # save the workbook in place

Or export the analysed DataFrame to a separate file via the package’s save_dataframe (opens a save dialog by default):

from xlwings_package import save_dataframe

save_dataframe(df, default_file_name="purification_assay_results")

Full single-table script

import xlwings as xw
from mgtx_benchling_wrapper.context.benchling_context import BenchlingContext
from xlwings_package import (
    initialise_table_benchling,
    import_excel_table,
    paste_df_to_named_range,
    colour_rows,
    cleanup_stale_table_names,
    limit_columns_to_decimals,
    strip_strings,
)

wb = xw.Book.caller()
ctx = BenchlingContext(
    client_id="your-oauth2-app-client-id",
    client_secret=my_resolve_secret(),
    base_url="https://yourorg.benchling.com",
    token_url="https://yourorg.benchling.com/api/v2/token",
)

SHEET = "Assay Results"
SCHEMA_ID = "assaysch_abc123"

# 1. Initialise (or refresh) the table from the Benchling schema.
# Dropped Benchling columns are always purged (cells cleared, gap closed).
results = initialise_table_benchling(wb, {SHEET: [SCHEMA_ID]}, ctx=ctx)
anchor = results[0].range_name                   # schema anchor token (Excel named range)

# 2. Read existing rows (skip if the table is brand new / empty).
try:
    df, mapping = import_excel_table(
        wb, anchor, scope_sheet=SHEET, use_name_range_names=True,
    )
except Exception:
    df = None  # no data yet — populate df from your own source instead

# 3. Transform.
if df is not None:
    df = strip_strings(df)
    df = limit_columns_to_decimals(df, ["conc_mg_ml"], decimals=3)
    df["pass_fail"] = df["conc_mg_ml"].apply(
        lambda v: "Pass" if (v is not None and v >= 0.5) else "Fail"
    )

    # 4. Write results back (column order must match the header).
    paste_df_to_named_range(wb, df, anchor, scope_sheet=SHEET)

    # 5. Colour the failures.
    colour_rows(
        wb, name_range=anchor, df=df, mode="values_match",
        column_name="pass_fail", values_to_colour=["Fail"],
        colour_name="light red", scope_sheet=SHEET,
    )

# 7. Tidy stale names and save.
cleanup_stale_table_names(wb, anchor, sheet=SHEET)
wb.save()

Common errors & fixes

Error When it fires Fix
ImportError: mgtx-benchling-wrapper is required First Benchling call without the optional extra installed. pip install "mgtx-xlwings-utils[benchling]".
ValueError: Schema lookup requires a Benchling ID … A schema_map id value doesn’t start with assaysch_. Pass the assaysch_… id (Step 0). Name lookup is unsupported.
ManifestNotFoundError cleanup_stale_table_names can’t find {table_name}__schema_columns on the targeted sheet(s). Confirm table_name is the schema anchor token and the table was created by initialise_table_benchling; check the manifest wasn’t deleted manually.
NamedRangeNotFoundError A reader/writer/colour call can’t resolve the anchor or a column token. Verify the token spelling and scope_sheet; re-run initialise_table_benchling to (re)create the named ranges.
TableSpaceOccupiedError Init target band is occupied, or loose content sits in a table’s danger zone below it on paste. Move the stray content, choose a different sheet, or anchor elsewhere (column / at).
ColumnOrderMismatchError paste_df_to_named_range finds df.columns don’t match the header at the existing positions. Reorder df to match the header, or re-initialise the table (Step 6) to apply the new schema before pasting.
SideBySideNotSupportedError Two managed tables in the same lane (overlapping columns) have overlapping rows. Tables within the same column band must be vertically stacked. Tables in disjoint column bands (different lanes) sharing rows are allowed. See stacked tables & reinit.
EmptyDataError Read found an empty table/column, or colour_rows/paste got a column that’s empty or zero-row. Ensure the table has data; for reads, set check_empty_columns=False to tolerate null columns.
ExcelColumnOverflowError Placing/widening the schema would push a column past Excel’s last column (XFD). Use a leftmost anchor with more room, or split the schema across sheets.