Materials
Welcome to IBM’s multi-modal foundation model for materials, FM4M, designed to support and advance research in materials science and chemistry.
Note: A Multi-view Mixture-of-Experts framework that predicts molecular properties by integrating latent spaces derived from SMILES, SELFIES, and molecular graphs. The approach leverages the complementary strengths of these representations to improve predictive accuracy.
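As a rough illustration of how latent spaces from several views can be combined, the sketch below fuses three toy embeddings with softmax gate weights. It is not the FM4M implementation: the function names, the two-dimensional embeddings, and the gate scores are all hypothetical, and it assumes every view has already been projected to the same dimension.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(embeddings, gate_scores):
    """Return the gate-weighted sum of per-view embeddings and the weights used.

    embeddings:  dict mapping view name -> list of floats (same length, assumed)
    gate_scores: dict mapping view name -> float (hypothetical router logits)
    """
    names = sorted(embeddings)
    weights = softmax([gate_scores[name] for name in names])
    dim = len(embeddings[names[0]])
    fused = [0.0] * dim
    for name, weight in zip(names, weights):
        for i, value in enumerate(embeddings[name]):
            fused[i] += weight * value
    return fused, dict(zip(names, weights))


# Toy embeddings standing in for the SMILES, SELFIES, and graph latent vectors.
views = {
    "smiles": [1.0, 0.0],
    "selfies": [0.0, 1.0],
    "graph": [1.0, 1.0],
}
# Equal gate scores give each view the same weight.
gates = {"smiles": 0.0, "selfies": 0.0, "graph": 0.0}
fused, weights = fuse_views(views, gates)
```

Raising one gate score shifts the fused vector toward that view, which is the sense in which a gate can favor whichever representation is most useful for a given input.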
ibm-research/materials.3dgrid_vqgan
Note: 3DGrid-VQGAN is an encoder-decoder chemical foundation model for representing 3D electron density grids, pre-trained on a dataset of approximately 855K molecules from the PubChem database. 3DGrid-VQGAN encodes high-dimensional data into compact latent representations that can be used for downstream tasks such as molecular property prediction.
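The "VQ" in the model name refers to vector quantization, where each continuous encoder output is replaced by its nearest entry in a learned codebook. The sketch below shows only that lookup step on toy two-dimensional vectors. The codebook values and the `quantize` helper are hypothetical, and nothing here reflects the real model's codebook size or how it handles 3D grids.

```python
def quantize(vector, codebook):
    """Return (index, code) for the codebook entry closest to `vector`.

    Closeness is measured by squared Euclidean distance.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(
        range(len(codebook)),
        key=lambda i: squared_distance(vector, codebook[i]),
    )
    return index, codebook[index]


# Hypothetical three-entry codebook and three toy encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
encoder_outputs = [[0.9, 1.2], [-0.1, 0.2], [-0.8, 0.7]]

# Each continuous vector is represented by a single integer code index.
indices = [quantize(v, codebook)[0] for v in encoder_outputs]
print(indices)  # [1, 0, 2]
```

Storing one integer per latent position instead of a full vector is what makes the resulting representation compact.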
ibm-research/materials.pos-egnn
Graph Machine Learning • Note: pos-egnn is a Position-based Equivariant Graph Neural Network foundation model for chemistry and materials. It was pre-trained on 1.4M samples (90% of the dataset) from the Materials Project Trajectory (MPtrj) dataset to predict energies, forces, and stress. pos-egnn can be used as a machine-learning potential or a feature extractor, or it can be fine-tuned for specific downstream tasks.
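When a model serves as a machine-learning potential, the forces it predicts correspond to the negative gradient of the energy with respect to atomic positions. The sketch below illustrates that relationship with a toy harmonic pair energy and central finite differences. The `energy` and `forces` functions are hypothetical stand-ins, not the pos-egnn API, and learned potentials typically obtain gradients by automatic differentiation rather than finite differences.

```python
import math


def energy(positions, k=1.0, r0=1.0):
    """Toy harmonic pair energy; a stand-in for a learned energy model."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.dist(positions[i], positions[j])
            total += 0.5 * k * (r - r0) ** 2
    return total


def forces(positions, h=1e-5):
    """Estimate F = -dE/dx for every coordinate by central finite differences."""
    result = []
    for a, atom in enumerate(positions):
        atom_force = []
        for c in range(len(atom)):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[a][c] += h
            minus[a][c] -= h
            atom_force.append(-(energy(plus) - energy(minus)) / (2 * h))
        result.append(atom_force)
    return result


# Two atoms stretched beyond the rest length r0 = 1.0 along the x axis.
atoms = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]]
f = forces(atoms)
```

With the bond stretched, the two atoms are pulled toward each other by equal and opposite forces, so the net force on the pair is zero.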
ibm-research/materials.mhg-ged
Feature Extraction • Note: MHG-GED is an autoencoder whose encoder is a graph neural network (GNN) and whose decoder is a sequential model built on a molecular hypergraph grammar (MHG). Because the encoder is a GNN variant, MHG-GED can accept any molecule as input and demonstrates high predictive performance on molecular graph data. The decoder inherits MHG's theoretical guarantee of always generating a structurally valid molecule as output.
ibm-research/materials.selfies-ted2m
Feature Extraction • Note: SELFIES-TED is a transformer trained on SELFIES strings for improved molecular property prediction. It uses a BART backbone to learn a molecular representation while also being able to generate novel molecules. This variant has 2.2M parameters and was trained on more than 1 billion molecules from ZINC-22, with SMILES enumeration applied.
ibm-research/materials.selfies-ted
Feature Extraction • 0.4B • Note: SELFIES-TED is a transformer trained on SELFIES strings for improved molecular property prediction. It uses a BART backbone to learn a molecular representation while also being able to generate novel molecules. This variant has 354M parameters and was trained on 1 billion molecules from ZINC-22, with SMILES enumeration applied.
ibm-research/materials.smi_ssed
Feature Extraction • Note: SMILES-based State-Space Encoder-Decoder (SMI-SSED) is a Mamba-based encoder-decoder chemical foundation model pre-trained on a curated dataset of 91 million SMILES samples sourced from PubChem, equivalent to 4 billion molecular tokens.
ibm-research/materials.smi-ted
Feature Extraction • Note: SMILES-based Transformer Encoder-Decoder (SMI-TED) is a large encoder-decoder chemical foundation model pre-trained on a curated dataset of 91 million SMILES samples sourced from PubChem, equivalent to 4 billion molecular tokens. SMI-TED supports various complex tasks, including quantum property prediction. Our experiments across multiple benchmark datasets demonstrate state-of-the-art performance for various tasks.
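The stated figures of 91 million SMILES samples and 4 billion molecular tokens imply roughly 44 tokens per molecule on average. The sketch below works through that arithmetic and shows one way a SMILES string might be split into tokens. The regular expression is a simplified assumption for illustration, not the tokenizer used by SMI-TED or SMI-SSED, so its token counts may differ from the models' own.

```python
import re

# Simplified SMILES token pattern (an assumption, not the models' tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, then any
# remaining single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_PATTERN.findall(smiles)


# Aspirin as a small worked example.
tokens = tokenize("CC(=O)Oc1ccccc1C(=O)O")
print(len(tokens))  # 21

# Average tokens per molecule implied by the dataset figures in the text.
avg_tokens = 4_000_000_000 / 91_000_000
print(round(avg_tokens, 1))  # 44.0
```

Bracket atoms and two-letter elements count as single tokens under this pattern, so `[Na+].[Cl-]` yields three tokens rather than eleven characters.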
FM4M-demo1
🐢Generate and analyze molecular structures
Note: Explore Foundation Models for Materials with an intuitive Gradio app. Test our state-of-the-art models (SMI-TED, SELFIES-TED, and MHG-GED) on your custom datasets for both classification and regression property prediction tasks, and get insights into materials science with ease.
FM4M-demo2
🐢Generate and analyze molecular structures
Note: Explore Foundation Models for Materials with an intuitive Gradio app. Test our state-of-the-art models (SMI-TED, SELFIES-TED, and MHG-GED) on your custom datasets for both classification and regression property prediction tasks, and get insights into materials science with ease.
Fm4m Eval Demo
Note: Explore Foundation Models for Materials with an intuitive Gradio app. Test our state-of-the-art models (SMI-TED, SELFIES-TED, MHG-GED, and POS-EGNN) on your custom datasets for both classification and regression property prediction tasks, and get insights into materials science with ease.