This week in TechBio 2024/03/18
Arc Institute and InstaDeep announce two new models for nucleic acid data (ExoGRU and SegmentNT), the first drug for MASH is approved, with an ASO [1] likely to become the second, ZephyrAI raises a $111m series A, and hospitals are still wondering how to leverage large language models in their practices.
Pharma and Biotechs:
2024/03/15 Fierce Biotech: Oxford Nanopore Technologies (ONT) partners with SeqOne to use its long-read sequencing technology in diagnostics.
2024/03/14 STAT: A study, just published in Science, shows that we may have found the first drug in pill form, Gilead’s obeldesivir, that is effective against Ebola Sudan. This discovery is rather important, as we had no treatment for that strain and the standard of care for Ebola Zaire (the previous strain) was monoclonal antibodies that required IV administration and cold storage. Adding a new antiviral to our anti-epidemic arsenal is always good news!
2024/03/14 Fierce Biotech: The FDA approves the first drug for metabolic dysfunction-associated steatohepatitis (MASH, previously known as NASH), from Madrigal. The drug is planned to be marketed at $47k/year (with an estimated cost-effectiveness), but its use may decline as GLP-1 receptor agonists become more common. The announcement comes shortly after Ionis’s own phase 2 success with an antisense oligonucleotide (ASO)-based MASH therapeutic.
2024/03/14 Yahoo! Finance: Relation Therapeutics, a drug discovery techbio, raises a $35m seed round, led by DCVC and Nvidia’s venture arm. Relation leverages single-cell data analysis to work on osteoporosis. Their approach is similar to that of Cellarity for cancer and is presented in an opinion piece in Nature Biotech.
2024/03/13 BusinessWire: ZephyrAI, a Lilly-backed company, raises a $111m series A. This comes just 2 years after its $18.5m seed round. The company describes itself as constructing a large clinicogenomics dataset and providing AI analysis on top of it. Quite frankly, that is not a very clear description, since everyone is doing AI on genomics data; I have to assume that the lack of clarity is on purpose.
2024/03/13 Nature: A short review of the ways AI is used in clinical trials. In particular, it highlights tools for summarizing existing trial designs (CliniDIGEST), designing less stringent eligibility criteria for patient enrollment (Trial Pathfinder), and matching patients to existing trials (TrialGPT). While these tools can be extremely helpful, many boil down to putting a nice interface on top of dirty data; they are a fix for a poorly designed system that should instead work on generating higher-quality data (TrialGPT should not have to exist; a simple database should do the work).
2024/03/13 STAT: Hemgenix and Roctavian, two gene therapies for hemophilia, are currently struggling for sales. Despite the high price (~$3m), insurers are actually willing to pay in order to avoid the even more expensive alternative treatments (~$1m/year). The issue is that patients are wary of “first generation” gene therapies and prefer to wait for pharma to get “the kinks out,” especially since the alternative treatment, while expensive for insurers, works great and is not too burdensome. Another issue with gene therapies is that most are delivered via the same viral vector (AAV), so patients develop antibodies after the first use, leaving them unable to receive newer generations if the therapies improve (or AAV-based therapies for other diseases). I don’t know how much of a true blocker that last part is, as there are probably ways to get rid of the memory B cells, or repress them, during a gene therapy (e.g. chemo).
2024/03/13 STAT: Two recent studies in Nature Medicine and JAMA covered the performance of LLMs [2] for patient record summarization. The goal is either to summarize records more efficiently for medical practitioners or to give patients a layman’s version. Researchers and hospital employees are, however, limited by their ability to evaluate how good the output of these models is: which information can be skipped? How do you catch hallucinations [3]? The current lack of standards for evaluating these tools, and of agreement on what would constitute acceptable performance (since trained physicians also make mistakes), is creating some trust issues.
But at the end of the day, UTHealth Houston’s Roberts said, nobody in the field yet knows what they’re doing. “A lot of people like billing themselves as large language model experts,” he said. The technology is so new, “I’m not really even sure what an expert in large language models means at this point.”
2024/03/13 STAT: Many genAI tools in the clinical space claim to use a “human in the loop” to ensure that their output is correct. However, there is currently no standard, either regulatory or industrial, for what that means. The claim by tools like Epic that they only provide suggestions that have to be approved by healthcare professionals is also, in my opinion, a flimsy excuse to avoid liability. Indeed, it is the same way that Tesla claims its Autopilot is not a self-driving system but just a driver’s aid, and expects drivers to keep their hands on the wheel, despite everyone knowing that it is used as an actual autopilot.
2024/03/13 Twitter: Google DeepMind’s AlphaMissense is now available for commercial use.
2024/03/12 BusinessWire: The Biswas Family Foundation gave a $15m grant, spread across five research projects, to support the development of AI technology in the computational biology field.
2024/03/11 Recursion: Recursion opens an office in London and starts hiring there, too. They may have realized that Europe has a tremendous pool of talent that most companies fail to tap.
Papers and science:
Since quite a few interesting papers get published every week, and there are only so many hours in a day, I have to confess to not having read them all, so reader beware.
2024/03/16 Twitter: Interesting thread by Daniel Liu on some low-hanging fruit in applying systems/CS to computational biology problems.
2024/03/14 Twitter: Bo Wang catches a second paper that was obviously, at least in part, written by an LLM.
2024/03/14 InstaDeep: “SegmentNT: annotating the genome at single-nucleotide resolution with DNA foundation models” — InstaDeep releases a new model built on top of its Nucleotide Transformer (announcement). It adds a U-Net head after the pre-trained transformer layers and is trained to predict ENCODE and GENCODE annotations (e.g. introns, exons, UTRs, splice acceptors). The paper is extremely well written and shows, with clear ablation studies, that Nucleotide Transformer acts as a foundation model (i.e. its pre-training improves downstream performance).
2024/03/11 Cell Genomics: “Revealing the grammar of small RNA secretion using interpretable machine learning” — Another paper from Arc Institute, introducing ExoGRU, after their Evo paper.
2024/03/11 arXiv: “Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling” — A bi-directional DNA language model built on Mamba, with long-range modeling (associated Twitter thread).
2024/03/11 Twitter: Jacob Schreiber gives some good tricks for making DNA sequence models fit in memory.
2024/03/08 bioRxiv: “Advancing Drug-Target Interactions Prediction: Leveraging a Large-Scale Dataset with a Rapid and Robust Chemogenomic Algorithm.”
2024/02/24 Bioinformatics: “Bioframe: operations on genomic intervals in Pandas dataframes” — Bioframe offers a way to interface with bio data as if it were a pandas object, making data science on biological data much smoother.
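To give a feel for the kind of genomic-interval operation the Bioframe entry refers to, here is a minimal, dependency-free sketch of an interval overlap. It is not Bioframe's API or implementation: the `overlap` function, the toy `genes` and `peaks` records, and the half-open [start, end) convention are assumptions made for illustration. In Bioframe the same idea is applied to pandas dataframes with `chrom`, `start`, and `end` columns.

```python
from collections import defaultdict


def overlap(intervals_a, intervals_b):
    """Return every pair (a, b) on the same chromosome whose
    half-open [start, end) ranges share at least one base."""
    # Group the second set by chromosome so comparisons stay within a chromosome.
    by_chrom = defaultdict(list)
    for b in intervals_b:
        by_chrom[b["chrom"]].append(b)

    pairs = []
    for a in intervals_a:
        for b in by_chrom.get(a["chrom"], []):
            # Two half-open intervals overlap iff each starts before the other ends.
            if a["start"] < b["end"] and b["start"] < a["end"]:
                pairs.append((a, b))
    return pairs


# Hypothetical toy records, not taken from any real annotation.
genes = [
    {"chrom": "chr1", "start": 100, "end": 500, "name": "geneA"},
    {"chrom": "chr1", "start": 900, "end": 1200, "name": "geneB"},
    {"chrom": "chr2", "start": 100, "end": 500, "name": "geneC"},
]
peaks = [
    {"chrom": "chr1", "start": 450, "end": 600, "name": "peak1"},
    {"chrom": "chr1", "start": 500, "end": 900, "name": "peak2"},
    {"chrom": "chr2", "start": 50, "end": 101, "name": "peak3"},
]

hits = [(a["name"], b["name"]) for a, b in overlap(genes, peaks)]
print(hits)
```

Note that peak2 touches geneA and geneB only at their boundaries, so under the half-open convention assumed here it overlaps neither. This nested loop is quadratic per chromosome; a dataframe library would typically use a sorted or vectorized approach instead.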
Community events:
2024/04/02-2024/04/04 — Boston: scVerse hackathon
2024/03/27 — Seattle: Join WIB-Seattle and GirlUP Entrepreneurs at Sage Bionetworks for a panel discussion about the intersection of Generative AI and Healthcare.
2024/03/21 — SF: Bits in Bio Tech Showcase
Job announcements:
With Erle’s help, we are adding here the job opportunities we found, so that folks can either find people to hire, or find new jobs.
If you want your offers to appear here, post them on the #jobs channel of the bitsinbio slack.
Insitro opens a few senior positions
Isomorphic Labs starts hiring computational biologists: offer
Generate Biomedicine: Computational Protein design
Recursion opens an office in London, and is mostly hiring computational biologists and software engineers there: offers
AI Scientist at Myria Biosciences
Computational Biologist at Pasture Biosciences
Computational Biology, Research Engineer Intern at MantleBio
Data Engineer/Bioinformatician – Systems Chemical Biology at The Francis Crick Institute
Information Technology Security Specialist at Sanavia Oncology
Machine Learning Engineer at Oddity Labs
Software Engineer (Nextflow) at Seqera
Summer Internship, Software at Qwarke
Various Roles (Cell Culture) at Hoxton Farms
I very likely missed quite a few announcements; don’t hesitate to DM me @gama_search if you see anything missing or in need of correction.
Or DM me if you have seen a cool paper or news story that you would like to see in next week’s newsletter.
This newsletter was originally published here, but was moved to bitsinbio.
[1] ASO: antisense oligonucleotide
[2] LLM: large language model; tools like GPT, Llama, or Claude
[3] Hallucination: when an LLM invents incorrect facts