BioByte 050: multimodal biomedical AI from Amazon, AlphaFold for pandemic preparedness, a large pharma protein language model partnership
Welcome to Decoding Bio, a writing collective focused on the latest scientific advancements, news, and people building at the intersection of tech x bio. If you’d like to connect or collaborate, please shoot us a note here or chat with us on Twitter: @ameekapadia @ketanyerneni @morgancheatham @pablolubroth @patricksmalone. Happy decoding!
Welcome to the semicentennial edition of BioByte. Some highlights from this week:
A serendipitous discovery of hemoglobin, the canonical oxygen-carrying molecule in red blood cells, in cartilage cells called chondrocytes. This may explain why patients with cartilage-related diseases like rheumatoid arthritis often have anemia, and may have implications for recovery from sports-related joint/cartilage injuries.
The future of computational biology is multimodal. A new method called BioBRIDGE from Amazon AI for harmonizing many unimodal foundation models across several data modalities into one unified model using a knowledge graph, resulting in superior generalization and cross-modal retrieval.
A new partnership between BioMap (AI biotech that developed a 100B parameter protein language model, the largest to date; covered in BioByte 037) and Sanofi to co-develop biologic therapeutics, with a total deal size of $1B.
An update on macrocycles, an important emerging therapeutic modality that bends med chem rules of drug-likeness, and a PhIII Merck asset targeting extracellular PCSK9, a well genetically validated target involved in atherosclerotic cardiovascular disease. Macrocycles have a troubled history (for example, engineering oral bioavailability and cell permeability has been challenging), but advantages in dosing, manufacturing and route of administration will open opportunities to take over validated targets for indications whose only available therapies are injectable biologics.
Using AlphaFold for biosecurity preparation, for example by using protein folding algorithms to predict viral mutations that will enable immune system evasion. This algorithm will help forecast emerging strains as a tool for continuing vaccine development.
What we read
Ionis re-commits to new timeline for prion disease ASO trials [CureFFI, October 2023]
Sonia Vallabh and Eric Minikel are prion-hunters based at the Broad Institute who have been making a concerted effort to develop preventive therapeutics for this terrible set of diseases. They collaborated with Ionis for several years and together produced enough data to justify a clinical trial. However, as with many things in biotech, the road to the envisioned trial was significantly delayed.
Here, Eric highlights a recent announcement by Ionis to the prion disease community that holds promise for patients:
The new drug is ION717, and will be wholly developed by Ionis
The first Phase 1/2a trial (PrProfile) will start by end of 2023
The trial will be in symptomatic patients - NOT pre-symptomatic at-risk people
They don’t plan on making the drug available outside the trial
This announcement is exciting for the prion community given the timeline provided: a start date by end of year suggests that Ionis has been hard at work for quite some time, with several of the administrative and infrastructural tasks for launching a trial already taken care of. Ionis also states that this trial will only be for symptomatic patients, but its language suggests that it envisions treating pre-symptomatic people in the future. Additionally, given ION717 is targeting PrP, the fundamental driver across various subtypes of prion disease, there is hope that all patients may be eligible over time.
ION717 will only be available to enrolled patients; this is relatively standard, and is likely due to two fundamental reasons: 1) prion diseases are rare, and it doesn’t make sense to pull patients away from an established trial through expanded access pathways, and 2) expanded access opens the company up to potential liabilities from the FDA if patients have adverse events; that said, this is exceptionally rare (0.2% of expanded access INDs had clinical holds). Finally, ION717 will be wholly owned, which is in contrast to several of Ionis's recent programs. Only time will tell if this is a good thing: Ionis clearly feels confident in its ability to bring this forward, and a smaller company can often act much quicker than large pharma. On the flip side, big pharma has unparalleled resources that may enable success…
There is still tremendous work to be done in the prion field, but we’re happy to see Ionis tackle such a challenging problem and bring hope to patients and their families.
Macrocycle drugs serve up new opportunities [Kingwell, Nature News, September 2023]
Macrocycles, defined as ring structures that contain at least 12 atoms, include very large and complex natural products such as antibiotics, mid-sized cyclic peptides and stapled peptide therapeutics.
The cyclic structure of these molecules can offer the potency and selectivity of antibodies, with the dosing and target space opportunities of small molecules. For instance, Merck is developing MK-0616, a macrocycle targeting extracellular PCSK9, a well genetically validated target involved in atherosclerotic cardiovascular disease.
Merck’s molecule, which was discovered through a collaboration with Ra Pharma, can be taken orally and with a better dosing schedule than current approved antibody (Amgen’s evolocumab and Sanofi and Regeneron’s alirocumab) and siRNA (inclisiran, from Novartis and Alnylam) therapies.
In many ways, macrocycles bend the classical medicinal chemistry guidelines of the “rule of 5”. MK-0616 is a ~1.5 kDa complex peptide that is orally bioavailable and binds to PCSK9 with 5 pM affinity. Looking at the structure, medicinal chemists are surprised it is a drug at all.
Given the advantages in dosing, manufacturing and route of administration, macrocycles have an opportunity to take over well validated targets for indications whose only available therapies are injectable biologics.
Concepts of health and disease as a barrier to progress [Alex Telford, October 2023]
A little more philosophical in nature than what we usually cover here, Alex’s post on how we think about health and disease is worth a read. His goal is to question how we categorize the two terms and how contextualizing them differently may lead to more innovation and biomedical progress. Some interesting points:
We think of health as a discrete variable: you’re either sick or you’re healthy. Diseases are similarly discrete: cancer, infertility, Celiac, etc. This discredits all the overlap and gray area in medicine.
Medical specialization started in the 1800s when groups realized you can centralize treatments and research for patients suffering from similar ailments. It was how physicians distinguished themselves.
There was pushback on this as diseases are so connected with each other and a wider field of view can often help
We now think of disciplines of biology as discrete and thus think of treatments as discrete as well. This is functional but ignores the underlying questions of root cause
Categories are labels, we shouldn’t put too much emphasis on them beyond a useful abstraction and quick way to compartmentalize/allocate resources.
Categorizing disease could be similar to dimensionality reduction (ex: casting a 3D object into a lower dimension). Helps us generalize but you lose the nuance
We should think about how truth might be getting distorted in medicine
Precision medicine helps but still suffers from compartmentalization because it’s a precision medicine approach to disease X, not an approach to overall well-being
Discrete thinking leads to discrete solutions. This is narrow-minded and discredits evolution and our adaptability as humans.
“Dynamism is more natural when we think in continuums instead of rigid categories”
The current state of biotech drugs points to a single transformation from point A to point B. But the landscape is infinite. There’s an invitation to rethink our strategies.
The Six Moats of Data Businesses [Travis May, October 2023]
New blog post from Datavant Founder and Former CEO, Travis May, analyzes how data businesses can build defensible moats to generate $1 billion+ in enterprise value. The piece outlines that “better data curation” and API-based delivery formats are necessary, but insufficient. The most effective moats are becoming a "data currency" that facilitates transactions, aggregating long tail proprietary data sources, establishing exclusive relationships with originators, using "give-to-get" data sharing models, creating proprietary data assets, or leveraging "exhaust data" from another business line. Companies that employ these strategies, especially in the life sciences, are poised to create network effects and enduring, hard-to-replicate advantages.
How AlphaFold and other AI tools could help us prepare for the next pandemic [Ewen Callaway, Nature News, October 2023]
Researchers are harnessing AI tools like AlphaFold and LLMs to prepare for future pandemics. These tools are enabling faster vaccine design by predicting the structure of viral proteins and identifying stabilizing mutations. Models can also anticipate how viruses might evolve to evade immunity, allowing scientists to stay one step ahead. Recall that one such model accurately predicted many mutations that emerged in real-world SARS-CoV-2 variants. Researchers hope these AI predictions will lead to more resilient vaccines.
Major funders like CEPI are pouring money into using machine learning for pandemic preparedness. While experts caution AI is not a cure-all, they say it provides promising new capabilities. AI is speeding up vaccine development, enabling entirely new design approaches, and could give the world a head start on emerging viral threats. But researchers emphasize it must complement, not replace, traditional scientific methods.
Interested in biosecurity? Check out the Decoding Bio Biosecurity Report!
An extra-erythrocyte role of haemoglobin body in chondrocyte hypoxia adaption [Zhang et al., Nature, October 2023]
Why it matters: It has always been taught that hemoglobin (Hb), the canonical oxygen-carrying protein, is exclusively produced in red blood cells. While other forms of globin exist (e.g. myoglobin in muscles, neuroglobin in neurons), the authors demonstrate that hemoglobin is indeed produced by chondrocytes, the cells of cartilage. This work demonstrates that – even today, certain dogmas of biology continue to be disproven – and sheds light on fundamental physiology critical for development and homeostasis.
When studying cartilage growth plates of mice, the authors serendipitously found eosin-positive (eosin stains the cytoplasm) globules in chondrocytes, and found that these were present in both mouse and human cells across different types of cartilaginous tissues (such as the ribs and foot bones). Using laser-based microdissection, the authors were able to separate the eosin globules, and using mass spectrometry, identified the top hits as hemoglobin subunits. This was confirmed with further western blotting and immunohistochemistry. Thus, the authors concluded that chondrocytes produce substantial amounts of hemoglobin, which accumulates in these globules; they named the structures hemoglobin bodies, or “Hedy”.
Using a series of techniques (including transmission electron microscopy, sequence analyses, etc.) the authors probed the structure and dynamics of Hedy and found that Hedy is condensed together via phase separation, forming essentially organelles in the cytoplasm. Interestingly, intrinsically disordered regions (IDRs) of the hemoglobin beta subunits were critical for condensation.
In normal development, hemoglobin undergoes globin switching (embryonic → fetal → adult) at well-established timeframes. The authors found that chondrocytes underwent the same form of globin switching seen in red blood cells (RBCs). Additionally, they found that, whereas hemoglobin is traditionally induced via hypoxia-inducible factors (HIFs) in settings of hypoxia, chondrocytes increased hemoglobin expression via KLF1, the same protein required for fetal-to-adult hemoglobin switching. Finally, the authors found that deleting the hemoglobin beta chain gene in chondrocytes causes the chondrocytes to die, with mice dying just a few days after birth.
Thus, the authors identify that cartilage (which lacks vasculature) has its own globin; not a different form, but rather the same as is seen in red blood cells. Biology continues to humble us, and what is taken for gospel isn’t necessarily true…
An Automated Scientist to Design and Optimize Microbial Strains for the Industrial Production of Small Molecules [Singh et al., bioRxiv, January 2023]
Why it matters: the synthesis of biomolecules via fermentation is a complex optimization problem. The biological search space of potential strains that can synthesize a single molecule exceeds 10^24. This means that if one design-build-test-learn (DBTL) cycle takes weeks to months, scaling production of the molecule from milligrams to kilograms can take years.
To reduce the risk of developing a strain for a single molecule, companies use two de-risking methods: 1) working on multiple similar strains at once and 2) diversifying the kinds of molecules in the pipeline, in both cases expecting that at least some will be successfully commercialized. However, for these to work there has to be some level of automation so that each new molecule added to the portfolio does not linearly increase cost in the form of time or headcount.
To aid in these de-risking methods, a research team at Amyris developed Lila, an automated scientist software platform. It generates ‘metabolic routes, identifies relevant genetic elements for perturbation, and specifies the design and re-design of microbial strains in a matter of seconds to minutes.’
Lila has impressively shortened the timeline for molecule PoC production from months to weeks and decreased the cost of molecule development to <$10M per molecule (from ~$100M per molecule).
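To put the 10^24 figure in perspective, here is a back-of-the-envelope sketch of why brute-force screening is off the table and why design has to be guided. Only the search-space size comes from the preprint summary above; the throughput per cycle and the cycle length are hypothetical numbers chosen purely for illustration.

```python
# Back-of-the-envelope arithmetic for the strain search space.
SEARCH_SPACE = 10**24        # from the text: potential strains for one molecule
STRAINS_PER_CYCLE = 10_000   # hypothetical throughput per DBTL cycle (assumption)
WEEKS_PER_CYCLE = 4          # hypothetical cycle length, "weeks to months" (assumption)
WEEKS_PER_YEAR = 52


def years_to_exhaust(search_space, strains_per_cycle, weeks_per_cycle):
    """Years needed to test every strain once at a fixed throughput."""
    cycles = search_space / strains_per_cycle
    return cycles * weeks_per_cycle / WEEKS_PER_YEAR


def fraction_explored(years, search_space, strains_per_cycle, weeks_per_cycle):
    """Fraction of the search space covered after a given number of years."""
    cycles = years * WEEKS_PER_YEAR / weeks_per_cycle
    return cycles * strains_per_cycle / search_space


exhaustive_years = years_to_exhaust(SEARCH_SPACE, STRAINS_PER_CYCLE, WEEKS_PER_CYCLE)
covered_in_5_years = fraction_explored(5, SEARCH_SPACE, STRAINS_PER_CYCLE, WEEKS_PER_CYCLE)

print(f"Years to test every strain: {exhaustive_years:.2e}")
print(f"Fraction explored in 5 years: {covered_in_5_years:.2e}")
```

Under these assumed numbers, even years of cycles touch a vanishing sliver of the space, which is the gap that automated design tools aim to close by choosing what to build rather than screening blindly.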
Human-induced pluripotent stem cell-derived ovarian support cell co-culture improves oocyte maturation in vitro after abbreviated gonadotropin stimulation [Piechota et al., Human Reproduction, October 2023]
Why it matters: The ability to mature human eggs in a dish outside the body is a feat that many groups are trying to achieve in different ways. Current IVF and egg-freezing protocols heavily rely upon hormone injections which are costly and physiologically taxing to women undergoing treatment. This procedure furthers the conversation on creating human eggs ex vivo using clinical samples.
The study sought to answer whether in vitro maturation (IVM) of human oocytes can be improved by co-culture with ovarian support cells (OSCs) derived from human-induced pluripotent stem cells. Knowing that in vitro maturation protocols struggle with reproducibility, the authors designed the study to collect oocytes from various donors stimulated with gonadotropin and various hCG triggers. The OSC-IVM culture was made up of 100,000 OSCs with hCG, recombinant FSH, androstenedione, and doxycycline. The controls had the same supplements but no OSCs. Interestingly, there was a 1.5x improvement in maturation of oocytes in the OSC culture compared to the control. These oocytes had higher rates of passing through the MII state (meiosis II). This study suggests co-cultures with OSCs could improve oocyte maturation outside the body. The study did not test whether the resulting embryos are viable and capable of implantation and further development.
scHyena: Foundation Model for Full-Length Single-Cell RNA-Seq Analysis in Brain [Oh et al., arXiv, October 2023]
Why it matters: single-cell RNA sequencing (scRNA-seq) enables profiling individual cells, revealing intricate diversity in complex tissues like the brain. But measurement noise and high dimensionality hinder analysis. By effectively using full gene expression, scHyena provides a strong foundation model to enhance scRNA-seq interpretation. This framework can unlock new insights into healthy brain function and disease states like Alzheimer's. More broadly, scHyena exemplifies how custom foundation models can empower analysis in specialized biomedical domains.
This pre-print introduces scHyena, a new foundation model for analyzing scRNA-seq data from the brain. scHyena uses a novel neural network architecture called Hyena that can handle very long input sequences like full gene expression profiles. This allows scHyena to leverage information from all genes without needing to reduce dimensions.
The model is pre-trained using large scRNA-seq datasets to learn generalizable patterns. scHyena demonstrated superior performance on downstream tasks like identifying cell types and imputing missing gene expression values compared to existing methods. It accurately filtered out doublets, which are mixed cell profiles that confuse analysis. scHyena also excelled at filling in dropout events with biologically meaningful values.
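For readers less familiar with the two artifacts mentioned above, here is a minimal simulation of what doublets and dropout look like in count data. It is a toy illustration only: the Poisson/gamma count model, the gene count, and the 30% dropout rate are assumptions made for this sketch, and none of it is scHyena's code or training data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 2000  # hypothetical number of genes in the expression profile

# Two hypothetical cell types, each with its own mean expression per gene.
mean_type_a = rng.gamma(shape=0.5, scale=4.0, size=n_genes)
mean_type_b = rng.gamma(shape=0.5, scale=4.0, size=n_genes)


def sample_cell(mean_expression, rng):
    """Draw integer transcript counts for one cell (simple Poisson model)."""
    return rng.poisson(mean_expression)


def make_doublet(cell_1, cell_2):
    """A doublet is two cells captured as one, so their counts are summed."""
    return cell_1 + cell_2


def apply_dropout(counts, dropout_rate, rng):
    """Zero out a random subset of genes to mimic dropout events."""
    keep = rng.random(counts.shape) >= dropout_rate
    return counts * keep


cell_a = sample_cell(mean_type_a, rng)
cell_b = sample_cell(mean_type_b, rng)

doublet = make_doublet(cell_a, cell_b)                       # mixed profile
observed = apply_dropout(cell_a, dropout_rate=0.3, rng=rng)  # missing values

print("genes detected in true cell:", np.count_nonzero(cell_a))
print("genes detected after dropout:", np.count_nonzero(observed))
print("total counts in doublet:", doublet.sum())
```

Doublet detection amounts to flagging profiles like `doublet` that look like a blend of two cell types, while imputation tries to recover `cell_a` from `observed`.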
BioBridge: Bridging Biomedical Foundation Models via Knowledge Graph [Wang et al., arXiv, October 2023]
Why it matters: BioBRIDGE provides an effective way to combine AI systems trained separately into a “greater whole” with multimodal capabilities. This approach avoids the computational and data scarcity issues of joint training from scratch. As data and models proliferate across modalities, connecting them flexibly will be crucial. BioBRIDGE offers a generalizable approach to fuse independently developed AI building blocks into more powerful unified systems. The method exemplifies integrating disparate AI advancements efficiently while retaining their individual strengths.
This pre-print presents BioBRIDGE, a new framework to connect independently trained foundation models from different data modalities like text, images, proteins, and molecules. BioBRIDGE utilizes knowledge graphs to learn transformations between the models without fine-tuning the foundation models themselves. This allows bridging models trained on large unimodal datasets efficiently.
Empirical evaluations demonstrate that BioBRIDGE achieves strong performance on diverse cross-modal prediction tasks, beating specialized knowledge graph embedding methods. It also generalizes to unseen entities and relationships. Additionally, BioBRIDGE enables multimodal applications like answering questions with both text and molecular inputs.
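The core idea, learning a mapping between the embedding spaces of frozen models from linked pairs in a knowledge graph, can be sketched in a few lines. This is a deliberately simplified stand-in and not the authors' architecture: it uses synthetic embeddings, a single relation type, and a linear map fit by least squares in place of whatever learned transformation the paper trains. All names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for embeddings from two frozen unimodal encoders.
n_pairs, dim_protein, dim_text = 200, 32, 16
hidden_map = rng.normal(size=(dim_protein, dim_text))  # unknown to the bridge
protein_emb = rng.normal(size=(n_pairs, dim_protein))
text_emb = protein_emb @ hidden_map + 0.01 * rng.normal(size=(n_pairs, dim_text))

# Knowledge-graph edges: protein i is linked to text description i.
# Some linked pairs train the bridge; the rest are held-out entities.
train_idx = np.arange(150)
held_out_idx = np.arange(150, n_pairs)

# Fit the bridge on linked pairs only; the encoders themselves are untouched.
bridge, *_ = np.linalg.lstsq(protein_emb[train_idx], text_emb[train_idx], rcond=None)


def cosine_retrieve(queries, candidates):
    """Return, for each query, the index of the most similar candidate."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return (q @ c.T).argmax(axis=1)


# Cross-modal retrieval: map held-out proteins into text space, then search.
projected = protein_emb[held_out_idx] @ bridge
predicted = cosine_retrieve(projected, text_emb[held_out_idx])
top1_accuracy = float((predicted == np.arange(len(held_out_idx))).mean())

print(f"top-1 retrieval accuracy on held-out entities: {top1_accuracy:.2f}")
```

The design point this illustrates is that only the small bridge is trained; the expensive unimodal models are used as-is, which is what makes the approach cheap relative to joint multimodal training.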
What we listened to
Mirati: Commercial stage small molecule oncology biotech, founded in 1995 and IPO’d in 2013.
BMS: American multinational pharma with a rich oncology pipeline (Revlimid, Opdivo, Sprycel, Yervoy)
Deal: BMS is paying $4.8B + $1B in contingent value rights (this top-up is unlocked if the FDA accepts a marketing application for Mirati’s PRMT5/MTA inhibitor in certain advanced non-small cell lung cancer (NSCLC) indications within 7 years).
Why did BMS buy?
Overview: Mirati is a strategic fit given BMS’ large presence in oncology and the combination potential with PD-1 inhibitors (BMS’ Opdivo). The pipeline also brings BMS near-term and long-term revenue potential at a relatively cheap price, leaving plenty of cash for further M&A.
KRAS G12C (Krazati): A significant portion of the acquisition price is attributed to Krazati’s blockbuster potential in NSCLC / CRC. However, blockbuster status may be difficult to reach given extreme competition, e.g. Roche’s Divarasib, Eli Lilly’s LY3537982 and RevMed’s RMC-6291. This highlights the difficulty of small molecule drug development and the ease with which competitors emerge with similar profiles. Additionally, revenues for Lumakras (the approved KRAS G12C competitor from Amgen) have plateaued at around $75M per quarter.
PRMT5 (MRTX1719): PRMT5 is synthetic lethal in MTAP-deleted tumours; MTAP deletion occurs in around 15% of cancers. However, this population is fragmented (e.g., 6% lung, 10% pancreatic, 12% bladder, 7% skin). Recent data shows only 6 responses, presented as case studies rather than a full assessment of the trial. Only one of these responses was in NSCLC, which is the indication the $1B CVR is attached to. This drug may also be IRA-eligible due to its pan-tumour potential.
KRAS G12D (MRTX1133): One of the great holy-grail oncology targets, present in an astonishing 60%+ of cases. However, the acquisition value doesn’t seem to include its potential and there is no CVR tied to the program. Perhaps this suggests the data is too early or not trending in the right direction.
Why did Mirati sell? The ongoing Krazati commercial efforts and 1L NSCLC development plans will be expensive and difficult for Mirati to achieve on its own at present.
A thought: The FTC’s hard-line chair, Lina Khan, has mentioned that small biotech acquisitions will also be subject to scrutiny. Let’s see whether this small and very standard oncology acquisition catches her glare.
BioMap: Beijing-based AI-driven biotech that raised $100M Series A in 2021, co-founded by Baidu’s CEO Robin Li. BioMap has created xTrimo, a protein-centric foundation model trained on public and private data sources of proteins and their interactions. This model has 100B parameters and is used to create “enhanced” biologics for a range of applications. The company has 300+ staff, 100,000 sq ft wet labs and a supercomputing facility.
Sanofi: French multinational pharmaceutical and healthcare company headquartered in Paris, France.
Partnership: To co-develop novel AI modules for biotherapeutic drug discovery. Sanofi will provide proprietary data, computational innovations in protein engineering and biological development expertise. BioMap will provide its xTrimo LLM and high performance computing facilities and expertise. The total deal size is $1B with milestones likely for target discovery + progression as well as technology development.
Rationale: Sanofi’s Global Research Platforms leader Matt Truppo affirmed that the pharma giant is committed to becoming the first company powered by AI at scale. This partnership seems distinct from Sanofi’s previous collaborations as it entails co-development of technology, suggesting Sanofi is moving towards owning the technologies and bringing them in-house rather than just accessing them via a third party (as in the Exscientia collaboration).
A notable seed financing has been accomplished by the Israeli biotech which combines AI and nanotechnology to design lipid nanoparticles (LNPs). LNPs are crucial to deliver mRNA-based or genetic medicines to their destination. The biotech has already filed a patent and has in vivo data in mice.
Other highlighted deals:
In case you missed it
Artificial General Intelligence Is Already Here [Blaise Aguera y Arcas and Peter Norvig, Noema Mag]
What we liked on Twitter
Did we miss anything? Would you like to contribute to Decoding Bio by writing a guest post? Drop us a note here or chat with us on Twitter: @ameekapadia @ketanyerneni @morgancheatham @pablolubroth @patricksmalone