ml-toolkit wraps the complexity of running AI and machine learning models on HPC clusters into a single, simple command-line interface — powered by containers.
Although primarily designed and optimised for running AI/ML models on the Bede Grace-Hopper HPC cluster, it can be easily installed on any Linux system, from a laptop to a large HPC cluster.
Features: Why ml-toolkit
01
All software dependencies are packaged inside containers by the developers. You focus on your science, not your environment.
02
Run the same container on your laptop or at HPC scale. Workflows move with you — no reconfiguration required.
03
Containers are built to leverage GPU nodes with hardware acceleration, so you get the full power of the hardware from day one.
04
Definition and config files for all 36 models on the MatBench Discovery interatomic potential leaderboard are included and ready to use by name.
05
Easily extended with new tools and models, and able to work with any ML software, via a simple YAML-based config.
06
Software versions are frozen inside container images — your workflow won't break when external libraries update.
07
Four verbs cover everything: build, run, start, stop. Whether you're running a one-off job or a long-lived background process, the interface stays the same.
08
Works out of the box for calculations with both the Atomic Simulation Environment (ASE) and CASTEP.
09
The code is fully open source under the GPL-3.0 licence and is freely available on GitHub.
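As a rough illustration of the four verbs from feature 07, the sketch below lines up a one-off job next to a long-lived background process. The build and run forms follow the usage shown under "How it works"; the arguments to start and stop are an assumption (here, just the model name), and ExampleModel and predict.py are hypothetical names. By default the script only prints the commands it would issue; setting DRY_RUN=0 would execute them.

```shell
#!/usr/bin/env bash
# Sketch only: ExampleModel and predict.py are hypothetical placeholders.
MODEL="${MODEL:-ExampleModel}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
step() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

session() {
  # One-off job: build the image once, then run a command inside it.
  step ml-toolkit build "$MODEL"
  step ml-toolkit run "$MODEL" python predict.py

  # Long-lived background process.
  # ASSUMPTION: start and stop take the model name, mirroring build and run.
  step ml-toolkit start "$MODEL"
  step ml-toolkit stop "$MODEL"
}

session
```

Keeping the commands behind a dry-run switch makes it possible to check the sequence on a machine where the toolkit is not yet installed.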
Supported models
Preconfigured container definitions for the top-ranked interatomic potential models from the MatBench Discovery leaderboard. Use any model by name — no manual setup needed.
How it works
One pip install command fetches the toolkit and Apptainer, and sets up your ~/ML_Toolkit directory.
ml-toolkit build <ModelName> downloads and builds the container image. Done once, reused forever.
ml-toolkit run <ModelName> <cmd> executes your script inside the container with full GPU access.
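The three steps above might look like this in a shell session. This is a minimal sketch rather than the project's documented usage: the pip package name is not stated here, so PACKAGE_NAME is left as a placeholder, and ExampleModel and relax.py are hypothetical. It also assumes the command passed to run may consist of several words. The script prints each command and only executes it when DRY_RUN=0.

```shell
#!/usr/bin/env bash
# Sketch of the install -> build -> run workflow.
# PACKAGE_NAME, ExampleModel and relax.py are placeholders, not real names.
PACKAGE="${PACKAGE:-PACKAGE_NAME}"
MODEL="${MODEL:-ExampleModel}"
SCRIPT="${SCRIPT:-relax.py}"
DRY_RUN="${DRY_RUN:-1}"

# Print a command, and execute it only when DRY_RUN=0.
step() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

workflow() {
  step pip install "$PACKAGE"                    # 1. install the toolkit (one-off)
  step ml-toolkit build "$MODEL"                 # 2. build the container image (once per model)
  step ml-toolkit run "$MODEL" python "$SCRIPT"  # 3. run a script inside the container
}

workflow
```

Only the third step needs repeating for later jobs, since the image built in step 2 is reused.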
ml-toolkit was developed with funding from the N8 Research Partnership through the "AI4Science" initiative (EPSRC grant number EP/T022167/1), with the aim of making advanced ML capabilities accessible to researchers across all N8 institutions. The N8 is a consortium of the eight most research-intensive universities in the north of England, built around the N8 HPC resource, Bede.