What functional data actually means — and why it changes what gets into the clinic.

Not all data predicts clinical performance. Most drug discovery measures the wrong thing.

Most antibody and bispecific drug discovery selects candidates by measuring how tightly they bind. Binding affinity is fast to measure, scalable across thousands of candidates, and produces numbers that are easy to rank and compare. It has been the standard filter in the field for decades. It is also a poor predictor of whether a molecule will work in a patient.

This is not a controversial claim. It is well established in the literature and widely acknowledged by scientists working in the field. The question is not whether binding affinity is insufficient; it is why, knowing this, the field continues to use it as the primary selection criterion. The answer is infrastructure. Measuring the right thing is hard. Measuring the right thing at scale is very hard.

What binding affinity does and does not tell you

A binding affinity measurement answers one question: how tightly does this molecule attach to its target? High affinity means strong attachment. High specificity means preferential attachment to the intended target over related proteins. Both properties are necessary for a therapeutic molecule. Neither is sufficient.

Binding does not tell you whether the molecule, once attached, activates a productive immune response. It does not reveal the epitope geometry — the precise location and orientation of where the molecule binds — and whether that geometry positions a T cell in the conformation required for an effective immune synapse with a tumour cell. It does not measure the cytokine environment the molecule generates after engagement: whether that environment resolves cleanly after target clearance or escalates toward systemic release. It does not reflect how the molecule behaves in the presence of heterogeneous T-cell populations — exhausted, activated, naive — which is what it encounters in a patient.

These are not marginal variables. Epitope geometry governs killing geometry. Cytokine balance governs both efficacy and toxicity. T-cell population dynamics in the tumour microenvironment govern whether a molecule that performs well in a simplified model will perform at all in a patient. Binding affinity is orthogonal to all of them. A high-affinity molecule can produce an abortive immune synapse. A lower-affinity molecule can produce a more productive one, depending on epitope. The binding number does not tell you which.

The variables that actually determine clinical efficacy

The readouts that predict whether a T-cell engager will work in a patient are functional: does the molecule drive T-cell killing of tumour cells? Does it generate an activation profile — measured by T-cell activation markers — that indicates productive engagement? Does the cytokine secretion profile it produces match what is expected of an effective immune response, without the escalation patterns associated with severe toxicity?

These readouts can only be observed in living cells. Not in a binding assay. Not in structural modelling, which can suggest likely behaviour but cannot substitute for measurement in the biological context of interest. And not in engineered cell lines that have been simplified for assay convenience — cell lines that may express the target at unnaturally high levels, or that lack the surface receptor complexity present on primary human cells. The biology that determines clinical performance is present in primary human immune cells. That is where the relevant measurement must be made.

The infrastructure problem

The field has known for years that binding-selected candidates fail to translate at higher rates than their preclinical data would predict. The knowledge gap is not the problem. The infrastructure gap is the problem.

Running functional screens in primary human immune cells is operationally difficult. Primary cells are variable across donors, sensitive to handling conditions, available in limited quantities, and expensive to obtain and maintain. Measuring T-cell killing, activation markers, and cytokine profiles across large candidate panels, rather than across a small number of finalists already selected by binding, requires infrastructure that most drug discovery operations were never built to run at scale.

The result is a rational but consequential choice made by the industry: measure what is measurable at scale, then apply functional assays late in the process, to a small number of candidates pre-selected by binding. Candidates that would have performed poorly in functional assays but well in binding assays advance. Candidates that would have performed well functionally but were deprioritised on binding metrics are dropped. The clinical translation gap is, in significant part, a consequence of this selection order.
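The effect of selection order can be illustrated with a toy simulation. This is a minimal sketch, not a model of any real screen: the candidate scores are synthetic, the `correlation` parameter (how weakly binding tracks function) is an assumed value, and all function names are hypothetical. It also treats the functional score as the quantity of interest by construction, so it illustrates the argument rather than testing it.

```python
import random


def simulate_candidates(n, correlation, seed=0):
    """Generate synthetic candidates whose functional score is only
    weakly correlated with binding (both are unit-variance Gaussians)."""
    rng = random.Random(seed)
    candidates = []
    for i in range(n):
        binding = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        functional = correlation * binding + (1 - correlation**2) ** 0.5 * noise
        candidates.append({"id": i, "binding": binding, "functional": functional})
    return candidates


def top_k(candidates, key, k):
    """Return the k candidates with the highest value for `key`."""
    return sorted(candidates, key=lambda c: c[key], reverse=True)[:k]


def binding_first(candidates, n_finalists, k):
    """Conventional order: filter on binding, then run functional
    assays only on the binding-selected finalists."""
    finalists = top_k(candidates, "binding", n_finalists)
    return top_k(finalists, "functional", k)


def functional_first(candidates, k):
    """Inverted order: rank the whole panel on the functional readout."""
    return top_k(candidates, "functional", k)


def mean_functional(selected):
    return sum(c["functional"] for c in selected) / len(selected)


# Assumed, illustrative numbers: 10,000 candidates, weak correlation of 0.2.
candidates = simulate_candidates(n=10_000, correlation=0.2, seed=42)
late = binding_first(candidates, n_finalists=50, k=10)
early = functional_first(candidates, k=10)

# Strong functional performers that the binding filter never let through.
missed = {c["id"] for c in early} - {c["id"] for c in late}

print(f"binding-first mean functional score:    {mean_functional(late):.2f}")
print(f"functional-first mean functional score: {mean_functional(early):.2f}")
print(f"top functional candidates dropped by the binding filter: {len(missed)} of 10")
```

The lower the assumed correlation between binding and function, the more of the best functional candidates are discarded before a functional assay is ever run on them.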

What it means to screen in primary human immune cells from experiment one

Functional-first discovery inverts the selection order. Instead of applying functional assays to a small set of binding-selected finalists, functional measurements are made on large candidate panels — T-cell killing, activation markers, and cytokine secretion profiles — in primary human immune cells, from the first experiment. Candidates are selected based on the readouts that predict clinical behaviour, not the readouts that are easiest to generate at scale.

The candidates that survive functional pre-selection have been evaluated in the biology they must ultimately engage. They have been tested against primary human T cells from multiple donors, capturing the variability in immune cell populations that a therapy will encounter across a real patient population. The functional behaviour observed during discovery is not a distant proxy for clinical performance: it is the same kind of measurement, made in the same cell types, at an earlier point in the development timeline.
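One way to picture multi-donor selection is to score each candidate per donor and rank on its weakest donor, so that a molecule has to perform across donor backgrounds rather than in one. The sketch below is purely illustrative: the readout values are made up, the scoring rule (killing minus a weighted cytokine penalty) and the `cytokine_weight` of 0.5 are assumptions, and the names are hypothetical rather than any real pipeline.

```python
from statistics import median

# Made-up, normalised readouts (0 to 1) for three hypothetical candidates,
# each tested against T cells from three donors.
readouts = {
    "cand_A": [
        {"killing": 0.90, "cytokine": 0.20},
        {"killing": 0.85, "cytokine": 0.25},
        {"killing": 0.80, "cytokine": 0.30},
    ],
    "cand_B": [
        {"killing": 0.95, "cytokine": 0.90},
        {"killing": 0.40, "cytokine": 0.10},
        {"killing": 0.92, "cytokine": 0.85},
    ],
    "cand_C": [
        {"killing": 0.70, "cytokine": 0.10},
        {"killing": 0.72, "cytokine": 0.12},
        {"killing": 0.68, "cytokine": 0.15},
    ],
}


def donor_score(readout, cytokine_weight=0.5):
    """Assumed scoring rule: reward killing, penalise cytokine release."""
    return readout["killing"] - cytokine_weight * readout["cytokine"]


def summarise(per_donor):
    """Summarise a candidate by its median and worst-case donor score."""
    scores = [donor_score(r) for r in per_donor]
    return {"median": median(scores), "worst": min(scores)}


summaries = {name: summarise(per_donor) for name, per_donor in readouts.items()}

# Rank on the worst donor, so donor-to-donor variability counts against a candidate.
ranked = sorted(summaries, key=lambda name: summaries[name]["worst"], reverse=True)

for name in ranked:
    s = summaries[name]
    print(f"{name}: median={s['median']:.3f} worst={s['worst']:.3f}")
```

In this made-up panel, the candidate with the highest single-donor killing ranks last, because it kills poorly in one donor and drives high cytokine release in the other two.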

When a molecule selected this way enters a patient, the clinical translation is not a leap of faith from an artificial model. It is a continuation of the same measurement, in the same biology, at the scale of a human organism. That is what functional data actually means, and why it changes what gets into the clinic.

The infrastructure that makes it possible

Running functional screens at this scale requires building the infrastructure specifically for it. Primary human immune cells. T-cell killing assays. Activation marker panels. Cytokine secretion profiles. At up to one million candidates per experiment, across multiple donor backgrounds, generating data that can train predictive models for the next cycle of design.

Every functional screen produces more than a ranked candidate list. It produces data that encodes the relationship between molecular architecture and immune cell behaviour in primary human biology. That dataset is the compounding asset. Each cycle of screening trains better predictions for the next cycle. The infrastructure was not built to run one discovery programme — it was built to run many, simultaneously, improving with each one.

The infrastructure constraint that has kept the field measuring binding is real. It is not insurmountable. Building it is where the investment goes — and what makes the difference between a molecule selected for what was measurable and a molecule selected for what actually determines whether a patient responds.