govindhtech · 11 hours ago
Introducing SAIR: Quantum AI Drug Discovery Accelerator
Structurally Augmented IC50 Repository
SandboxAQ has released the Structurally Augmented IC50 Repository (SAIR), an open dataset of protein-ligand structures annotated with binding affinity.
SandboxAQ, a B2B quantum and AI company, introduced SAIR (Structurally Augmented IC50 Repository) today. The release, billed as the largest and most thorough open dataset of protein-ligand pairings annotated with experimental potency data, could change how computational drug development is done. Built from AI-predicted structures, SAIR gives researchers a new resource for making binding affinity predictions faster and more accurate.
SAIR fills a critical gap in AI-driven drug design and marks a turning point for AI in biology. Deep learning models that use 3D molecular structures to design drugs have historically been held back by a shortage of data. Because few protein-ligand complexes have both a resolved 3D structure and a measured potency (IC50 or Ki value), many AI methods fall back on sequences or 2D chemical structures.
SAIR was created to overcome this limitation by providing a large library of computationally folded protein-ligand structures paired with their corresponding experimental affinity values. Closing this data gap should help machine learning models predict binding affinity more accurately.
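As a rough illustration of what an affinity-labeled record might look like when prepared for model training, the sketch below pairs a structure file with an IC50 value and converts it to pIC50 (the negative base-10 logarithm of the molar IC50), a common regression target. The `AffinityRecord` class, its field names, and the sample values are hypothetical and are not taken from the SAIR schema.

```python
import math
from dataclasses import dataclass


@dataclass
class AffinityRecord:
    """Hypothetical record; field names are illustrative, not SAIR's schema."""
    protein_id: str
    ligand_smiles: str
    structure_path: str  # path to a predicted 3D complex (made-up file name)
    ic50_nm: float       # experimental IC50 in nanomolar


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 mol/L, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


# Made-up example values, for illustration only
record = AffinityRecord("P00000", "CCO", "complex_0.cif", ic50_nm=100.0)
print(pic50_from_nm(record.ic50_nm))
```

Working on the logarithmic scale keeps potencies that span several orders of magnitude in a compact numeric range, which is generally easier for a regression model to fit.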
SAIR contains more than one million protein-ligand complexes (1,048,857 unique pairings) and 5.2 million synthetic 3D molecular structures. The pairings were curated from ChEMBL and BindingDB, and the structures were cofolded with the Boltz-1x model. About 2.5 terabytes of data are publicly available. The structures were generated using SandboxAQ's AI Large Quantitative Model (LQM) capabilities on NVIDIA DGX Cloud, a platform for AI training and tuning. Working with NVIDIA to optimise SAIR's computing infrastructure doubled GPU utilisation and increased throughput across SandboxAQ's scientific workloads.
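Using the figures quoted above, a quick back-of-the-envelope calculation suggests roughly five predicted structures per unique pairing. The 5.2 million count is rounded in the source, so the result is only approximate.

```python
# Figures as quoted in the article
unique_pairs = 1_048_857
total_structures = 5_200_000  # "5.2 million", rounded in the source

# Average number of predicted 3D structures per unique protein-ligand pairing
structures_per_pair = total_structures / unique_pairs
print(f"{structures_per_pair:.2f} structures per pairing")
```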
SAIR's combination of LQM capabilities and physics-based modelling improves generalisability, reliability, and applicability across drug development workflows. By sharing the SAIR dataset, SandboxAQ showcases the potential of its patented LQMs and its expertise in quantitative AI for drug discovery.
SAIR was built to enable large-scale in silico prediction of protein-ligand binding affinity, combining SandboxAQ's AI LQM capabilities with NVIDIA's accelerated computing. Nadia Harhen, General Manager of AI Simulation at SandboxAQ, stressed the significance of the release: “This achievement marks a pivotal moment in drug discovery, demonstrating capacity to fundamentally transform the traditional trial-and-error process into a rapid, data-driven approach.”
“It gives any scientist the raw fuel to train breakthrough models overnight, setting a new pace for drug discovery,” she said, noting that more than five million affinity-labeled protein-ligand structures are now publicly available. SAIR turns the scarcity of experimental data into an opportunity, and the release shows the breadth and complexity of SandboxAQ's LQM platform.
The SAIR dataset can be used to train AI models that accurately predict protein-ligand binding affinities. Models trained on SAIR can make these predictions 1,000 times faster than physics-based methods. This acceleration is expected to shorten the path from discovery to commercialisation for drug researchers, leading to new therapies and better patient outcomes. Technical details about the dataset are available in the bioRxiv preprint “SAIR (Structurally Augmented IC50 Repository): Enabling Deep Learning for Protein-Ligand Interactions with a Synthetic Structural Dataset”.
SandboxAQ's quantitative AI platform has produced strong results through strategic partnerships with leading academic institutions and pharmaceutical companies. Partners include Riboscience, the Michael J. Fox Foundation, UCSF's Institute of Neurodegenerative Diseases, and Stand Up To Cancer. The company's Large Quantitative Models frequently outperform conventional methods, pointing to a substantial increase in the pace of drug development.
The SAIR dataset is free for non-commercial use under the CC BY-NC-SA 4.0 license. Commercial users can also use the data for free after submitting a simple form to SandboxAQ. Researchers can access the dataset through SandboxAQ or Google Cloud Platform.
Researchers who want to work on expanding SAIR, or on applying these models to their hardest targets, can contact SandboxAQ at [email protected]. An upcoming webinar with SandboxAQ and an NVIDIA speaker will explain how to access and use the data. SandboxAQ plans to release further datasets, AI models, and solutions aimed at transforming drug development.
About SandboxAQ
SandboxAQ provides quantum-AI solutions to businesses. The organization's Large Quantitative Models (LQMs) aim to improve financial services, navigation, and life sciences. SandboxAQ emerged from Alphabet Inc. as an independent, growth-backed company with support from leading investors and strategic partners including T. Rowe Price Associates, Inc., Alger, IQT, US Innovative Technology Fund, S32, Paladin Capital, BNP Paribas, Eric Schmidt, Breyer Capital, Ray Dalio, Marc Benioff, Thomas Tull, and Yann LeCun.