BioByte 38: advances in medical AI, 100 pages on biosecurity, personalized cancer vaccines, self-delivering RNPs for gene editing, and more
Welcome to Decoding Bio, a writing collective focused on the latest scientific advancements, news, and people building at the intersection of tech x bio. If you’d like to connect or collaborate, please shoot us a note here or chat with us on Twitter: @ameekapadia @ketanyerneni @morgancheatham @pablolubroth @patricksmalone. Happy decoding!
We can’t top the excitement of NVIDIA and Recursion’s partnership from this week, but here are some other highlights, covered in depth below, in case you missed them:
Medical AI is advancing rapidly for clinical application, especially in diagnostics
Large language models may finally give us a way to digitize and interact with biological composition
Personalized cancer vaccines are looking more like reality than ever with new Moderna data
WGS in infants gets better
Single-cell RNA-seq gives us a glimpse into malfunctioning cells involved in neurodegenerative disease
Self-delivering RNPs for genomic editing
100+ pages on the state of biosecurity research and funding
What we read
Recent advances in AI have led to the development of new tools that can help doctors diagnose diseases, recommend treatments, and monitor patients' health. In his Substack newsletter, Dr. Eric Topol, a cardiologist and professor of molecular medicine at the Scripps Research Translational Institute, provides a comprehensive overview of the latest advances in medical AI.
In Part One of the series, Topol discusses three new deep learning cardiopulmonary imaging studies (including a recent paper showing that chest X-rays may offer critical diagnostic information for type 2 diabetes), one transformer model (aka generative AI, or large language model, LLM), and advancements in ambient virtual scribes, including a company called Abridge.
In Part Two, he discusses a slew of new papers that are coming out, including studies on AI-assisted diagnosis of cancer, heart disease, and Alzheimer's disease.
A few of our favorite examples of the capabilities of AI systems:
Diagnosing diabetic retinopathy as accurately as a human ophthalmologist. The system, which was developed by researchers at Google AI, was trained on a dataset of over 100,000 eye images. In a clinical trial, the system was able to correctly diagnose diabetic retinopathy with an accuracy of 90%.
Predicting the risk of developing heart failure. The system, which was developed by researchers at the University of California, San Francisco, was trained on a dataset of over 100,000 patients. The system was able to predict the risk of heart failure with an accuracy of 80%.
Predicting the risk of developing Alzheimer's disease with an accuracy of 90%. The system was trained on a dataset of over 800,000 patients and was able to identify a number of risk factors for Alzheimer's disease, including age, genetics, and lifestyle factors.
Detecting cancer cells in tissue samples with an accuracy of 99%. The system was trained on a dataset of over 100,000 tissue samples and was able to identify cancer cells that were missed by human pathologists.
The articles also discuss the challenges that need to be addressed before AI can be widely adopted in healthcare delivery, such as the need for more data and the need to ensure that AI systems are fair and unbiased.
The Next Frontier for Large Language Models is Biology [Rob Toews, Forbes]
Large language models are making us all realize that biology might actually be a programmable system. While the digital world of computers is encoded in a binary system of 0s and 1s, Toews argues biology isn’t all that different with its quaternary system of A, T, G, and C (the genetic code). Learning the language of life is not all that straightforward, however, as the data needed to train these models has been largely inaccessible to date. That being said, protein design with LLMs is gaining traction because there are strong databases of information on proteins. Some highlights from Toews’s journey into protein LLMs:
The first known protein LLM is UniRep, a protein representation learning model that debuted from George Church’s lab in 2019 and formed the foundation of Nabla Bio.
Meta’s ESM-2/ESMFold is about as accurate as AlphaFold in predicting 3-D protein structure but can also generate a structure from a single protein sequence without additional input.
You can reverse these models and generate new protein sequences that don’t exist in nature yet. The human proteome contains an estimated 80,000 to 400,000 proteins, but the number of proteins that could theoretically exist is closer to 10^1300. Tl;dr: evolution hasn’t shown us even a glimpse of the types of proteins that could exist yet.
Salesforce Research’s ProGen, the first transformer-based LLM to design proteins, started with 1.2 billion parameters. That work is now part of the startup Profluent.
“We can read, write, and edit sequences of DNA but we struggle to compose it.” AI is changing that.
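Two of the numbers above are easy to sanity-check. The sketch below is our own illustration, not something from Toews’s article: it shows why a four-letter alphabet needs only 2 bits per base, and where a figure like 10^1300 can come from. The bit assignment, the 20-letter amino acid alphabet, and the chain length of 1,000 residues are all assumptions made for the example.

```python
import math

# Two bits are enough to label four symbols, so each base maps to a 2-bit code.
# This particular bit assignment is an arbitrary choice for illustration.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def dna_to_bits(seq: str) -> str:
    """Encode a DNA string as a bit string, 2 bits per base."""
    return "".join(BASE_TO_BITS[base] for base in seq.upper())


def bits_to_dna(bits: str) -> str:
    """Decode a bit string produced by dna_to_bits back into bases."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def log10_sequence_space(length: int, alphabet_size: int = 20) -> float:
    """log10 of the number of distinct sequences of the given length.

    Assumes every position can hold any of `alphabet_size` letters
    (20 standard amino acids by default).
    """
    return length * math.log10(alphabet_size)


if __name__ == "__main__":
    encoded = dna_to_bits("GATTACA")
    print(encoded, bits_to_dna(encoded))
    # Assumed chain length of 1,000 residues: 20^1000 is about 10^1301.
    print(round(log10_sequence_space(1000)))
```

Under those assumptions, a 1,000-residue chain has roughly 10^1301 possible sequences, which is the same order as the figure quoted above. Shorter or longer chains shift the exponent accordingly.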
Personalized cancer vaccines pass first major clinical test [Elie Dolgin, Nature Reviews Drug Discovery, 2023]
Earlier this year, Phase II data from Moderna’s randomized trial for mRNA-4157 provided the first clinical evidence that personalized therapeutic vaccines can have a meaningful benefit to patients.
These findings revitalized the cancer vaccine community: for decades, companies have tried and failed to bring effective therapeutic cancer vaccines to market. The dendritic cell vaccine Sipuleucel-T, the only personalized vaccine approved to date, fell short of both efficacy and commercial expectations. The focus now is on a neoantigen strategy (as with many of the drugs in development in the table above), which involves targeting antigens expressed solely by cancer cells, instead of tumor-associated antigens that are preferentially but not exclusively found in tumor cells. Paired with checkpoint inhibitors, these vaccines may create a synergistic effect by unleashing T cells’ killing ability.
While it is difficult to predict which vaccine modality will yield the highest efficacy, it seems clear to physicians that an adjuvant approach in early-stage cancers will probably be more efficacious than treating late-stage, metastasized tumors.
The Phase III trials of mRNA-4157, in combination with a checkpoint inhibitor, will start later in 2023 for both surgically removable melanoma and lung cancer.
Why nearly half of Americans with Parkinson’s don’t see a neurologist [Simar Bajaj, STAT, July 2023]
Care gaps in Parkinson’s disease (PD) are staggering. Forty percent of PD patients on Medicare (~250k patients) don’t see a neurologist for their disease, 80% don’t see a physical, occupational, or speech-language therapist, and over 95% of those with depression or anxiety don’t see a mental health professional. The reasons are myriad: the lack of a standard disease presentation or diagnostic test, too few neurologists, and difficulty commuting to appointments given the motor symptoms of the disease. Some neurologists are working to remove key barriers to care. For example, clinics are being opened in both urban and rural areas so that they are reachable by public transport and by families who live outside the city. Clinics are also prioritizing the hiring of translators and making neurological care more culturally relevant (e.g. inviting Hispanic patients’ children and grandchildren to appointments). These solutions and new care models are still far from common, and will require investment if these care gaps are to be addressed.
Rapid Whole-Genomic Sequencing and a Targeted Neonatal Gene Panel in Infants With a Suspected Genetic Disorder [Maron et al., JAMA, July 2023]
Why it matters: Genetic diseases are a leading cause of infant mortality, and early diagnosis and treatment can save lives. However, current methods for diagnosing genetic diseases are often slow and inaccurate.
A new study published in JAMA found that whole-genome sequencing (WGS) can be a more effective way to diagnose genetic diseases in infants than traditional methods. The study, which was conducted by researchers at Rady Children's Hospital in San Diego, involved 400 hospitalized infants who had their entire genomes sequenced. The researchers found that WGS identified the cause of an infant's symptoms in 49% of cases, compared to 27% for a targeted gene panel. These findings suggest that WGS could be a valuable tool for diagnosing genetic diseases in infants, especially as costs continue to decline. However, the researchers also noted that interpreting WGS data is still a challenge, and that more research is needed to develop better methods for interpreting genetic variants. Researchers from the project are now working to develop a clinical protocol for using WGS to diagnose genetic diseases in infants.
Multicellular communities are perturbed in the aging human brain and Alzheimer’s disease [Cain et al., Nature Neuro, June 2023]
Why it matters: A high-resolution cellular map of the aging human frontal cortex reveals distributed subpopulations of individual-specific combinations of neuronal, glial, and endothelial cells that show differential correlation with tau and beta-amyloid pathology, as well as cognitive decline.
A recent paper in Nature Neuroscience used single-cell RNA-seq to characterize the contributions of unique cellular subtypes to Alzheimer’s disease (AD) pathophysiology. Single-nucleus RNA profiling in the dorsolateral prefrontal cortex was performed in a group of 24 individuals, and the patterns of cell subtypes and cell states were compared to bulk RNA profiles in 638 individuals to help adequately power the analysis. A computational method called CelMod (Cellular Landscape Modeling by Deconvolution) was developed to identify specific cell subpopulations associated with AD progression. A number of patterns emerged, but the primary innovation of the paper was the definition of cell communities or subpopulations that showed characteristic changes in the frequency of different cell subsets across individuals and their association with AD traits. The results support the idea that AD is a distributed pathophysiological process that involves multiple cell types. Critically, some cell subsets were associated with beta-amyloid deposition, but NOT with cognitive decline. Other subsets were correlated with both tau deposition AND with cognitive decline, suggesting that therapeutics that target these latter subtypes may be more clinically efficacious.
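For readers new to deconvolution, here is a minimal sketch of the general idea: estimating cell-type proportions in a bulk sample from per-cell-type expression signatures. This is a generic non-negative least-squares toy solved by projected gradient descent. It is not CelMod's actual algorithm, and the function name, signature matrix, and bulk values are hypothetical numbers invented for the example.

```python
def deconvolve(signature, bulk, n_iter=500):
    """Estimate non-negative cell-type proportions from a bulk expression profile.

    signature: list of rows, one per gene; each row holds the mean expression
               of that gene in each cell type (hypothetical values here).
    bulk:      list of bulk expression values, one per gene.
    Minimizes ||signature * p - bulk||^2 subject to p >= 0, then rescales p to sum to 1.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its inverse is a safe gradient step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signature[g][t] * p[t] for t in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        grad = [
            sum(signature[g][t] * residual[g] for g in range(n_genes))
            for t in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[t] - step * grad[t]) for t in range(n_types)]
    total = sum(p)
    return [x / total for x in p] if total > 0 else p


if __name__ == "__main__":
    # Toy data: 3 genes, 2 cell types, bulk built from a 70/30 mixture.
    signature = [[10.0, 1.0], [2.0, 8.0], [5.0, 5.0]]
    bulk = [7.3, 3.8, 5.0]
    print([round(x, 3) for x in deconvolve(signature, bulk)])
```

In practice, methods of this kind have to cope with noise, many more genes and cell types, and signatures that vary between individuals, which is part of why purpose-built tools exist.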
Genome editing in the mouse brain with minimally immunogenic Cas9 RNPs [Stahl et al., Molecular Therapy, 2023]
Why it matters: Delivery of genomic medicines, such as gene editing systems like CRISPR-Cas9, is a critical theme in therapeutics. Viral vectors, such as AAV, have been successful carriers but have several limitations: cargo capacity, immunogenicity, and cost. Self-delivering, cell-penetrating CRISPR-Cas9 RNP complexes could avoid these issues. This group studied the differences in host immune response and editing efficiency to demonstrate that RNPs are a viable delivery vehicle.
In this pre-proof from the Doudna Lab, Stahl et al. compare the editing efficiency of, and immune reaction to, two methods of delivering CRISPR-Cas9 to the CNS: AAV and cell-penetrant ribonucleoprotein (RNP). To make the Cas9 RNP self-delivering, simian vacuolating virus 40 (SV40) nuclear localization sequences were fused to Cas9.
The results showed:
Cas9 AAV diffused better across the brain, leading to distally edited cells, while Cas9-RNP edited more neurons near the injection site
Both methods elicited humoral responses, but the AAV group antibodies persisted at higher levels after 90 days. Cas9 kinetics seemed similar between delivery vehicles.
AAV-treated animals also showed elevated Cd3e levels, suggesting an ongoing adaptive immune response
Cas9-RNP showed acute microglial activation, which was mitigated by endotoxin purification during manufacturing
Biosecurity Deep Dive [Aron Lajko, July 2023]
While not strictly a research paper, Aron’s comprehensive dive into biosecurity deserves to be featured like one. He highlights several research papers, themes, and opportunities in biosecurity, creating a true “lay of the land” report. This is an evergreen document that is constantly being updated, but here are some things you can learn from a scan:
defining biosecurity and risk
dual use research and gain of function research
biohacking and how that relates to biosecurity
interventions and prevention, ranging from PPE to scientific limitations
state of funding (non dilutive and dilutive) for biosecurity-related projects
What we listened to
In case you missed it
Harvard launches a new PhD Program focused on Artificial Intelligence in Medicine. Read more here.
What we liked on Twitter
How to give a scientific talk @wc_ratcliff
Anecdotal stats of women with autoimmune diseases @nwilliams030
Tales from Solugen’s beginning @solugen
Just how valuable is the last bit of data in ML @charleskfisher
IVF, IVG, and the future of reproduction @hankgreelylsju
AI as the good guy in pop culture @shantenuagarwal
Did we miss anything? Would you like to contribute to Decoding Bio by writing a guest post? Drop us a note here or chat with us on Twitter: @ameekapadia @ketanyerneni @morgancheatham @pablolubroth @patricksmalone