Proxy Particulars

Bayesian science is orientated by theory and guided by observation. Theory provides initial direction, but it is deliberate, continuous empiricism that chauffeurs theory to verity. This empirical dependency means scientists have to measure things. 

Unfortunately, it is not always possible to measure something directly. I can count how many trees are in a field unaided, but ask me to quantify something I cannot see, such as the number of chlorophyll pigments in the field, and I will need assistance. Help arrives as a “proxy”: something I can measure directly that correlates with the thing I’m interested in. For example, chlorophyll absorbs light at known wavelengths. The amount of light absorbed correlates with the abundance of chlorophyll. If I measure the light, I can quantify the chlorophyll.

By measuring a proxy, we can quantify the invisible.

Scientists use established proxies every day: pre-defined, broadly applied, quantifiable metrics. Where they come from and why we use them are rarely questioned. Proxies are subordinate tools.

But as science continues to prod the unknown, existing proxies occasionally start to fail. A proxy may not correlate well enough with the quantity we want to measure. And if a proxy is inaccurate, it cannot inform a hypothesis. When this happens, proxies – the lowly tools of analysis – require analysis themselves.

In my own field of proteomics, peptide abundance is used as a proxy for protein abundance. (It is much easier to measure peptides than whole proteins.) Proteomics is a nascent campaign – its face is firmly squashed against the unknown. Consequently, proteomic proxies need to be carefully scrutinised.

A big question for proteomics is: which peptides are the best proxies for proteins? 

A new paper published in Nature Methods (from Jonathan Worboys in my ICR lab) addresses this question empirically. Using the cancer kinome as a model system, Worboys et al. present the first systematic evaluation of “quantotypic” peptides. The better a peptide correlates with protein level, the more “quantotypic” it is. Quantotypic peptides are good proxies.
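
To make "correlates with protein level" concrete, here is a minimal sketch of how candidate peptides might be ranked. It is not the method used in the paper. It assumes Pearson correlation as the score and a reference protein level that is already known for each sample. The function names, peptide names and numbers are all made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_quantotypic(protein_levels, peptide_signals):
    """Rank peptides by how well their signal tracks the protein level.

    Hypothetical helper: the score is the Pearson correlation (an assumption),
    and the result is sorted from best proxy to worst.
    """
    scores = {
        name: pearson(protein_levels, signal)
        for name, signal in peptide_signals.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative, invented numbers: one protein measured across five samples,
# with two candidate peptides from that protein.
protein_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
peptide_signals = {
    "peptide_A": [2.1, 4.0, 6.2, 7.9, 10.1],  # rises and falls with the protein
    "peptide_B": [5.0, 3.0, 6.0, 2.0, 4.0],   # does not track the protein
}

ranking = rank_quantotypic(protein_levels, peptide_signals)
for name, score in ranking:
    print(f"{name}: r = {score:.2f}")
```

In this toy case peptide_A comes out on top, so it would be the more quantotypic of the two. The point of the sketch is only that the ranking comes from measured data rather than from a prediction about which peptide ought to behave well.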

The outcome? Quantotypic peptides can be (and should be) determined empirically. That is, the best proxies – those subordinate tools of empiricism – require individual empirical induction themselves. 

Meta.