Embracing Results-Driven Discovery

Lewis Martin
June 17, 2021


Introduction

June 2, 2021 marked 10 years since Anthony Nicholls published his sensible, skeptical blog post “Avoiding Hype-based Science,” in which he advised what—and what not—to do when making technology decisions in preclinical drug discovery. Baked into the piece is the brilliant mantra:

When in doubt, buy the results, not the technology

which should forever act as a beacon guiding drug hunters away from the jagged rocks of hype and towards the safe harbors of tech that actually works. Coming from a field beset with hype (machine learning applied to drug discovery), and with the benefit of ten years' hindsight, I thought I'd revisit some of the technologies Nicholls analyzes and take another look at how they are used in drug discovery—and particularly hit discovery—today.

Screening Libraries and Hype

The Gartner hype cycle as applied to high-throughput screening, from the chemists at Enamine (Grygorenko et al. 2019)

The most common approach for launching a small molecule discovery campaign is high-throughput screening, a.k.a. HTS. Since its origins in the 1990s, HTS has charted the full Gartner hype cycle: promises that it would revolutionize drug discovery spurred large investments in big pharma as well as academia, but by the time of Nicholls’s writing in 2011, most discovery organizations were climbing out of the trough of disillusionment dug by over-optimistic marketing claims and underwhelming results in the late nineties and early aughts.

In principle, HTS is easy to understand: screen more compounds to find more drugs. In practice, limitations in organic chemistry and a tendency to prioritize quantity over quality meant that early screening libraries yielded poor hits: false positives (costing money to confirm), synthetic dead-ends (lacking developability), and compounds with unfavorable physicochemical properties (failing to meet target PK profiles). In its mature state, HTS is neither a perfect solution nor a total flop; the truth lies somewhere in between. Yet one thing is for sure: the HTS boom kicked off an obsession with scaling up the quantity of molecules screened in hit discovery.

Combinatorial Chemistry: Is Bigger Better?

Combinatorial chemistry addressed this quantity question head-on. The core concept is that combining chemical building blocks yields a combinatorial explosion of available molecules, which, in turn, lets you brute-force your way through chemical space. In practice, combinatorially crafted libraries are at the mercy of available chemistry and of technical limitations around separating out synthetic products. Today, DNA-encoded libraries (DELs), which barcode ligands with DNA tags that are later decoded by PCR, mostly solve the separation problem. But they trade that benefit for new restrictions on chemistry imposed by the solubility and stability constraints of DNA. All else being equal, quantity can be a boon if you have the time and budget to take advantage of it—but what of quality?
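
To get a feel for the scale, here is a minimal sketch (in Python, with made-up building-block pools) of how a three-component coupling explodes combinatorially:

from itertools import islice, product

# Hypothetical building-block pools for a three-component reaction;
# real library designs draw on thousands of blocks per position.
amines = [f"A{i}" for i in range(1000)]      # 1,000 amines
acids = [f"B{i}" for i in range(2000)]       # 2,000 carboxylic acids
aldehydes = [f"C{i}" for i in range(500)]    # 500 aldehydes

# Every choice of one block per position is a distinct product, so the
# virtual library size is the product of the pool sizes.
library_size = len(amines) * len(acids) * len(aldehydes)
print(f"{library_size:,} virtual products")  # 1,000,000,000

# Enumerate a few example combinations lazily, without materializing them all.
for combo in islice(product(amines, acids, aldehydes), 3):
    print(combo)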

The past 10 years have seen a huge improvement in the quality of available chemical matter. It was only 2010 when Jonathan Baell and Georgina Holloway publicized the concept of 'pan-assay interference compounds', helping drug hunters more reliably filter out false-positive hits. Since Lipinski/Lombardo/Dominy/Feeney published their famed rule of five in 1997, the rules have been continually revised, becoming stricter around "lead-like" chemical matter in recognition of two facts: (a) chemists rarely discover a clinical candidate in one shot during initial screening, and (b) lead optimization typically increases a molecule's weight and lipophilicity. With these physicochemical guidelines in mind, a new library-building paradigm arose: begin with property-restricted building blocks, combine them in parallel using available chemistry, and synthesize with robots as needed.
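
As an illustration, here is a rough sketch of such a property filter using RDKit; the cutoffs are illustrative lead-like-style thresholds, not the exact criteria of any particular library:

# A minimal physicochemical filter sketch using RDKit. Thresholds are
# illustrative "lead-like"-style cutoffs, not any vendor's actual criteria.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def is_lead_like(smiles: str) -> bool:
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return False
    return (
        Descriptors.MolWt(mol) <= 350
        and Descriptors.MolLogP(mol) <= 3.5
        and Lipinski.NumHDonors(mol) <= 3
        and Lipinski.NumHAcceptors(mol) <= 6
        and Descriptors.NumRotatableBonds(mol) <= 7
    )

candidates = ["CC(=O)Nc1ccc(O)cc1",      # acetaminophen: passes
              "CCCCCCCCCCCCCCCC(=O)O"]   # palmitic acid: too lipophilic and flexible
print([s for s in candidates if is_lead_like(s)])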

If this approach sounds familiar, it's because it forms the basis of the famous Enamine 'REadily AccessibLe' (REAL) library and its lead-like subset. Several pharma organizations take the same approach—Merck claimed in 2018 to have 10^20 available molecules in their 'MASSIV' library, outstripping today's DEL screening sizes by a factor of around a hundred billion. Hoffmann and Gastreich (2019) offer a nice review of progress in the space, but at a mere two years old it already feels outdated due to the rapid growth of virtual libraries. Enamine currently markets ~19B compounds in their REAL Space offering, about 5x what's cited by the 2019 review.

Growth in virtual libraries: while possible chemical space may be >10^60, all you need is one potent molecule with no prior art to get started. 4 billion options ought to do it! (Hoffmann and Gastreich 2019)

Robotic parallel synthesis is relatively cheap, reliable, and fast. The molecules are in the right property range, with plenty of room to move in structure-space. But, as Anthony Nicholls says:

It doesn’t matter how many compounds you make; what matters is how many drugs you make.

Impressive technological feats in library size are cool, but one can’t lose sight of the ultimate goal: to treat disease by turning molecules into drugs.

Virtual Screening and Hype

“When we raise money it’s AI, when we hire it's machine learning, and when we do the work it's logistic regression” - Juan Miguel Lavista (@BDataScientist)

Affordable search through billions of molecules is only possible with virtual screening. Recently, deep-learning-based virtual screening methods in particular have garnered a lot of interest. Yet unlike other fields where deep learning has seen great success (say, language translation or image recognition), virtual screening is a problem for which: (a) not much high-quality training data exists privately or publicly, especially in light of the sampling bias in the chemical space queried by the drug discovery literature, and (b) big data isn't guaranteed to be enough.

The determinants of binding to a protein active site are difficult to represent in a tensor. Polarizability, rotational invariance, and the dynamics of both the solvent and the protein are essential components of binding, yet they are expensive to model and lack a straightforward numeric featurization. So, while a convolutional neural network can learn the implicit rules differentiating a cat from a dog in a 2D image given sufficient data, it's not clear that deep learning can (yet) represent bound molecules in a way that accurately describes bioactivity. Perhaps paradoxically for the deep learning practitioner, while it is very difficult to exhaustively enumerate the rules defining the presence of a cat in an image, using rules-based methods to encode protein-ligand interactions can be quite effective.
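
As a toy illustration of what "rules-based" means here, a single interaction type, the hydrogen bond, can be written down explicitly with distance and angle cutoffs (the values below are typical textbook numbers, not any particular scoring function):

# Flag a putative hydrogen bond when a donor heavy atom and an acceptor sit
# within a distance cutoff and the donor-H...acceptor angle is near-linear.
import numpy as np

def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=3.5, min_angle_deg=120.0):
    donor, hydrogen, acceptor = map(np.asarray, (donor, hydrogen, acceptor))
    dist = np.linalg.norm(donor - acceptor)  # heavy-atom distance in angstroms
    v1, v2 = donor - hydrogen, acceptor - hydrogen
    cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return dist <= max_dist and angle >= min_angle_deg

# Example: a ligand amine donor, its hydrogen, and a backbone carbonyl oxygen.
print(is_hydrogen_bond([0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.3, 2.8]))  # True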

Indeed, over the past thirty years, shape- and charge-overlap calculations, either ligand-centric or protein-centric, have emerged as reliable tools to rank-order putative hits based on similarity to an existing ligand or binding site. After three decades of research, the results are in: these approaches work, with in vitro hit rates from virtual screens between ~5 and 50% consistently achievable under the right circumstances (Irwin & Shoichet 2016).
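
The sketch below illustrates the ligand-centric rank-ordering idea with a simpler 2D stand-in: Tanimoto similarity of Morgan fingerprints to a known active, computed with RDKit. The reference ligand and candidates are made up, and shape- and charge-overlap tools do this kind of ranking in 3D rather than on fingerprints:

# Rank a tiny library by 2D fingerprint similarity to a known active.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

known_active = Chem.MolFromSmiles("Cc1ccccc1NC(=O)c1ccc(N)cc1")  # hypothetical reference
library = {
    "cand_1": "Cc1ccccc1NC(=O)c1ccccc1",
    "cand_2": "CCOC(=O)c1ccc(N)cc1",
    "cand_3": "c1ccccc1",
}

def fingerprint(mol):
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

ref_fp = fingerprint(known_active)
scores = {name: DataStructs.TanimotoSimilarity(ref_fp, fingerprint(Chem.MolFromSmiles(smi)))
          for name, smi in library.items()}

# Highest-similarity candidates are the first to order and assay.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")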

So, where are all the drugs from this technology? As with HTS, over most of the past 30 years these approaches have been limited by expense and by the availability of chemistry. Recently, though, molecular docking of ultra-large virtual libraries against protein binding sites has yielded several high-profile examples demonstrating good hit rates with diverse, lead-like material. These papers show that joining high-quality virtual libraries with "old-fashioned" virtual screening can reliably result in novel hits.
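
For a flavor of the protein-centric side, here is a minimal docking sketch using the AutoDock Vina Python bindings. This is one freely available engine rather than the specific workflow behind those papers, and the input files and box coordinates are placeholders you would prepare for your own target:

from vina import Vina

v = Vina(sf_name="vina")
v.set_receptor("receptor.pdbqt")             # prepared target structure (placeholder)
v.set_ligand_from_file("candidate.pdbqt")    # one library member (placeholder)

# Define the search box around the binding site (coordinates in angstroms).
v.compute_vina_maps(center=[10.0, 12.5, -3.0], box_size=[20, 20, 20])

v.dock(exhaustiveness=8, n_poses=5)
print(v.energies(n_poses=5))                 # predicted binding energies (kcal/mol)
v.write_poses("candidate_docked.pdbqt", n_poses=5, overwrite=True)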

Hype vs. Results

Returning to Anthony Nicholls’s mantra, it appears that today we are able to buy into results, not merely technology. The hit discovery hype roller coaster has finally found itself on the plateau of productivity, and along the way, we have achieved an understanding of how to screen large swathes of drug-like chemical space to surface promising leads within a modest budget.

This outcome may lack pizzazz—remember, we are at the end of the hype curve—but it represents a very attractive proposition for people with a therapeutic hypothesis and no chemical matter. And that's a lot of people! There is an abundance of promising disease targets identified in the literature that are yet to be drugged. Furthermore, there is a large appetite to fund the de-risking of candidate molecules. So, while there are unsolved problems down the track (like, say, all of translational drug discovery), our work is cut out for us: take disease hypotheses, leverage structural data, identify leads, and progress up the value chain.

At OpenBench, armed with well-validated techniques, we feel confident providing a hit discovery service that isn't merely technology-driven but results-driven. In practice, we offer a novel proposition: we bear the full cost of virtual screening, hit acquisition, and confirmatory assaying of hits. Our clients only pay if, and when, we find novel chemical matter that engages their target and meets their developability criteria.

With this approach, we hope to say “so long” to the days of buying technology hype. For structurally enabled hit discovery, at least, you can now buy results.