Introducing Docker Model Runner: A Better Way to Build and Run GenAI Models Locally
Generative AI is transforming software development, but building and running AI models locally is still harder than it should be. Today’s developers face fragmented tooling, hardware compatibility headaches, and disconnected application development workflows, all of which hinder iteration and slow down progress.
That’s why we’re launching Docker Model Runner — a faster, simpler way to run and test AI models locally, right from your existing workflow. Whether you’re experimenting with the latest LLMs or deploying to production, Model Runner brings the performance and control you need, without the friction.
We’re also teaming up with some of the most influential names in AI and software development, including Google, Continue, Dagger, Qualcomm, HuggingFace, Spring AI, and VMware Tanzu AI Solutions, to give developers direct access to the latest models, frameworks, and tools. These partnerships aren’t just integrations, they’re a shared commitment to making AI innovation more accessible, powerful, and developer-friendly. With Docker Model Runner, you can tap into the best of the AI ecosystem from right inside your Docker workflow.
LLM development is evolving: We’re making it local-first
Local development for applications powered by LLMs is gaining momentum, and for good reason. It offers several advantages on key dimensions such as performance, cost, and data privacy. But today, local setup is complex.
Developers are often forced to manually integrate multiple tools, configure environments, and manage models separately from container workflows. Running a model varies by platform and depends on available hardware. Model storage is fragmented because there is no standard way to store, share, or serve models.
The result? Rising cloud inference costs and a disjoined developer experience. With our first release, we’re focused on reducing that friction, making local model execution simpler, faster, and easier to fit into the way developers already build.
Docker Model Runner: The simple, secure way to run AI models locally
Docker Model Runner is designed to make AI model execution as simple as running a container. With this Beta release, we’re giving developers a fast, low-friction way to run models, test them, and iterate on application code that uses models locally, without all the usual setup headaches. Here’s how:
Running models locally
With Docker Model Runner, running AI models locally is now as simple as running any other service in your inner loop. Docker Model Runner delivers this by including an inference engine as part of Docker Desktop, built on top of llama.cpp and accessible through the familiar OpenAI API. No extra tools, no extra setup, and no disconnected workflows. Everything stays in one place, so you can test and iterate quickly, right on your machine.
Enabling GPU acceleration (Apple silicon)
GPU acceleration on Apple silicon helps developers get fast inference and the most out of their local hardware. By using host-based execution, we avoid the performance limitations of running models inside virtual machines. This translates to faster inference, smoother testing, and better feedback loops.
Standardizing model packaging with OCI Artifacts
Model distribution today is messy. Models are often shared as loose files or behind proprietary download tools with custom authentication. With Docker Model Runner, we package models as OCI Artifacts, an open standard that allows you to distribute and version them through the same registries and workflows you already use for containers. Today, you can easily pull ready-to-use models from Docker Hub. Soon, you’ll also be able to push your own models, integrate with any container registry, connect them to your CI/CD pipelines, and use familiar tools for access control and automation.
Building momentum with a thriving GenAI ecosystem
To make local development seamless, it needs an ecosystem. That starts with meeting developers where they are, whether they’re testing model performance on their local machines or building applications that run these models.
That’s why we’re launching Docker Model Runner with a powerful ecosystem of partners on both sides of the AI application development process. On the model side, we’re collaborating with industry leaders like Google and community platforms like HuggingFace to bring you high-quality, optimized models ready for local use. These models are published as OCI artifacts, so you can pull and run them using standard Docker commands, just like any container image.
But we aren’t stopping at models. We’re also working with application, language, and tooling partners like Dagger, Continue, and Spring AI and VMware Tanzu to ensure applications built with Model Runner integrate seamlessly into real-world developer workflows. Additionally, we’re working with hardware partners like Qualcomm to ensure high performance inference on all platforms.
As Docker Model Runner evolves, we’ll work to expand its ecosystem of partners, allowing for ample distribution and added functionality.
Where We’re Going
This is just the beginning. With Docker Model Runner, we’re making it easier for developers to bring AI model execution into everyday workflows, securely, locally, and with a low barrier of entry. Soon, you’ll be able to run models on more platforms, including Windows with GPU acceleration, customize and publish your own models, and integrate AI into your dev loop with even greater flexibility (including Compose and Testcontainers). With each Docker Desktop release, we’ll continue to unlock new capabilities that make GenAI development easier, faster, and way more fun to build with.
Try it out now!
Docker Model Runner is now available as a Beta feature in Docker Desktop 4.40. To get started:
On a Mac with Apple silicon
Update to Docker Desktop 4.40
Pull models developed by our partners at Docker’s GenAI Hub and start experimenting
For more information, check out our documentation here.
Try it out and let us know what you think!
How can I learn more about Docker Model Runner?
Check out our available assets today!
Turn your Mac into an AI playground YouTube tutorialA Quickstart Guide to Docker Model Runner Docker Model Runner on Docker Docs
Come meet us at Google Cloud Next!
Swing by booth 1530 in the Mandalay Convention Center for hands-on demos and exclusive content.
Quelle: https://blog.docker.com/feed/