Run LLMs Locally with Docker: A Quickstart Guide to Model Runner

AI is quickly becoming a core part of modern applications, but running large language models (LLMs) locally can still be a pain. Between picking the right model, navigating hardware quirks, and optimizing for performance, it’s easy to get stuck before you even start building. At the same time, more and more developers want the flexibility to run LLMs locally for development, testing, or even offline use cases. That’s where Docker Model Runner comes in.

Now available in Beta with Docker Desktop 4.40 for macOS on Apple silicon, Model Runner makes it easy to pull, run, and experiment with LLMs on your local machine. No infrastructure headaches, no complicated setup. Here’s what Model Runner offers in our initial beta release.

Local LLM inference powered by an integrated engine built on top of llama.cpp, exposed through an OpenAI-compatible API.

GPU acceleration on Apple silicon by executing the inference engine directly as a host process.

A growing collection of popular, ready-to-use models packaged as standard OCI artifacts, making them easy to distribute and reuse across existing container registry infrastructure.

Keep reading to learn how to run LLMs locally on your own computer using Docker Model Runner!

Enabling Docker Model Runner for LLMs

Docker Model Runner is enabled by default and shipped as part of Docker Desktop 4.40 for macOS on Apple silicon hardware. However, in case you’ve disabled it, you can easily enable it through the CLI with a single command:

docker desktop enable model-runner

In its default configuration, the Model Runner will only be accessible through the Docker socket on the host, or to containers via the special model-runner.docker.internal endpoint. If you want to interact with it via TCP from a host process (maybe because you want to point some OpenAI SDK within your codebase straight to it), you can also enable TCP access via the CLI by specifying the intended port:

docker desktop enable model-runner --tcp 12434
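To verify TCP access is working, you can list the available models over HTTP; the /engines/v1/models path below follows the OpenAI API convention and assumes the default port from above:

curl http://localhost:12434/engines/v1/models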

A first look at the command line interface (CLI)

The Docker Model Runner CLI will feel very similar to working with containers, but there are some caveats regarding the execution model, so let’s check it out. For this guide, I’m going to use a very small model to ensure it runs on hardware with limited resources and provides a fast, responsive user experience. To be specific, we’ll use the SmolLM2 model, published by Hugging Face in 2024.

We’ll want to start by pulling a model. As with Docker Images, you can omit the specific tag, and it’ll default to latest. But for this example, let’s be specific:

docker model pull ai/smollm2:360M-Q4_K_M

Here I am pulling the SmolLM2 model with 360M parameters and 4-bit quantization. Tags for models distributed by Docker follow this scheme with regard to model metadata:

{model}:{parameters}-{quantization}
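To confirm what’s available locally at any point, docker model list shows every model you’ve pulled:

docker model list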

With the model pulled, let’s give it a spin by asking it a question:

docker model run ai/smollm2:360M-Q4_K_M "Give me a fact about whales."

Whales are magnificent marine animals that have fascinated humans for centuries. They belong to the order Cetacea and have a unique body structure that allows them to swim and move around the ocean. Some species of whales, like the blue whale, can grow up to 100 feet (30 meters) long and weigh over 150 tons (140 metric tons) each. They are known for their powerful tails that propel them through the water, allowing them to dive deep to find food or escape predators.

Is this whale fact actually true? Honestly, I have no clue; I’m not a marine biologist. But it’s a fun example to illustrate a broader point: LLMs can sometimes generate inaccurate or unpredictable information. As with anything, especially smaller local models with a limited number of parameters or small quantization values, it’s important to verify what you’re getting back.

So what actually happened when we ran the docker model run command? It’s worth taking a closer look at the technical underpinnings, because the behavior differs from what you might expect after years of using docker container run. In the case of Model Runner, this command won’t spin up any kind of container. Instead, it calls an Inference Server API endpoint, hosted by Model Runner through Docker Desktop, which provides an OpenAI-compatible API. The Inference Server uses llama.cpp as the Inference Engine, running as a native host process, loads the requested model on demand, and then performs inference on the incoming request. The model then stays in memory until another model is requested, or until a predefined inactivity timeout (currently 5 minutes) is reached.

That also means that there isn’t a need to perform a docker model run before interacting with a specific model from a host process or from within a container. Model Runner will transparently load the requested model on-demand, assuming it has been pulled beforehand and is locally available. 

Speaking of interacting with models from other processes, let’s have a look at how to integrate with Model Runner from within your application code.

Having fun with GenAI development

Model Runner exposes an OpenAI-compatible endpoint under http://model-runner.docker.internal/engines/v1 for containers, and under http://localhost:12434/engines/v1 for host processes (assuming you have enabled TCP host access on default port 12434). You can use this endpoint to hook up any OpenAI-compatible clients or frameworks.
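For a quick smoke test from the host (assuming TCP access is enabled on the default port 12434, as shown earlier), you can send a request following the standard OpenAI chat completions format:

curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/smollm2:360M-Q4_K_M",
    "messages": [{"role": "user", "content": "Give me a fact about whales."}]
  }'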

In this example, I’m using Java and LangChain4j. Since I develop and run my Java application directly on the host, all I have to do is configure the Model Runner OpenAI endpoint as the baseUrl and specify the model to use, following the Docker Model addressing scheme we’ve already seen in the CLI usage examples. 

And that’s all there is to it. Pretty straightforward, right?

Please note that the model has to be already locally present for this code to work.

import dev.langchain4j.model.openai.OpenAiChatModel;

// Point the builder at the local Model Runner endpoint
OpenAiChatModel model = OpenAiChatModel.builder()
        .baseUrl("http://localhost:12434/engines/v1")
        .modelName("ai/smollm2:360M-Q4_K_M")
        .build();

String answer = model.chat("Give me a fact about whales.");
System.out.println(answer);

Finding more models

Now, you probably don’t just want to use SmolLM models, so you might wonder what other models are currently available to use with Model Runner. The easiest way to get started is by checking out the ai/ namespace on Docker Hub.

On Docker Hub, you can find a curated list of the most popular models that are well-suited for local use cases. They’re offered in different flavors to suit different hardware and performance needs. You can also find more details about each in the model card, available on the overview page of the model repository.
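Pulling any of them works just like before; for example (assuming the ai/llama3.2 repository; check Docker Hub for the tags currently offered):

docker model pull ai/llama3.2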

What’s next?

Of course, this was just a first look at what Docker Model Runner can do. We have many more features in the works and can’t wait to see what you build with it. For hands-on guidance, check out our latest YouTube tutorial on running LLMs locally with Model Runner. And be sure to keep an eye on the blog; we’ll be sharing more updates, tips, and deep dives as we continue to expand Model Runner’s capabilities.

Resources

Learn more on Docker Docs 

Subscribe to the Docker Navigator Newsletter.

New to Docker? Create an account. 

Have questions? The Docker community is here to help.

Source: https://blog.docker.com/feed/

Master Docker and VS Code: Supercharge Your Dev Workflow

Hey there, fellow developers and DevOps champions! I’m excited to share a workflow that has transformed my teams’ development experience time and time again: pairing Docker and Visual Studio Code (VS Code). If you’re looking to speed up local development, minimize those “it works on my machine” excuses, and streamline your entire workflow, you’re in the right place. 

I’ve personally encountered countless environment mismatches during my career, and every time, Docker plus VS Code saved the day — no more wasted hours debugging bizarre OS/library issues on a teammate’s machine.

In this post, I’ll walk you through how to get Docker and VS Code humming in perfect harmony, cover advanced debugging scenarios, and even drop some insider tips on security, performance, and ephemeral development environments. By the end, you’ll be equipped to tackle DevOps challenges confidently. 

Let’s jump in!

Why use Docker with VS Code?

Whether you’re working solo or collaborating across teams, Docker and Visual Studio Code work together to ensure your development workflow remains smooth, scalable, and future-proof. Let’s take a look at some of the most notable benefits to using Docker with VS Code.

Consistency Across Environments: Docker’s containers standardize everything from OS libraries to dependencies, creating a consistent dev environment. No more “it works on my machine!” fiascos.

Faster Development Feedback Loop: VS Code’s integrated terminal, built-in debugging, and Docker extension cut down on context-switching, making you more productive in real time.

Multi-Language, Multi-Framework: Whether you love Node.js, Python, .NET, or Ruby, Docker packages your app identically. Meanwhile, VS Code’s extension ecosystem handles language-specific linting and debugging.

Future-Proof: In 2025, ephemeral development environments, container-native CI/CD, and container security scanning (e.g., Docker Scout) are table stakes for modern teams. Combining Docker and VS Code puts you on the fast track.

Together, Docker and VS Code create a streamlined, flexible, and reliable development environment.

How Docker transformed my development workflow

Let me tell you a quick anecdote:

A few years back, my team spent two days diagnosing a weird bug that only surfaced on one developer’s machine. It turned out they had an outdated system library that caused subtle conflicts. We wasted hours! In the very next sprint, we containerized everything using Docker. Suddenly, everyone’s environment was the same — no more “mismatch” drama. That was the day I became a Docker convert for life.

Since then, I’ve recommended Dockerized workflows to every team I’ve worked with, and thanks to VS Code, spinning up, managing, and debugging containers has become even more seamless.

How to set up Docker in Visual Studio Code

Setting up Docker in Visual Studio Code takes only a few minutes, and in two steps you’ll be well on your way to leveraging the power of this dynamic duo. Once set up, you can use the command line to manage containers and build images directly within VS Code, and you can create new containers with specific docker run parameters to maintain configuration and settings.

But first, let’s make sure you’re using the right version!

Ensuring version compatibility

As of early 2025, here’s what I recommend to minimize friction:

Docker Desktop: v4.37 or newer (Windows, macOS, or Linux)

Docker Engine: 27.5+ (if running Docker directly on Linux servers)

Visual Studio Code: 1.96+

Lastly, check the Docker release notes and VS Code updates to stay aligned with the latest features and patches. Trust me, staying current prevents a lot of “uh-oh” moments later.

1. Docker Desktop (or Docker Engine on Linux)

Download from Docker’s official site.

Install and follow prompts.

2. Visual Studio Code

Install from VS Code’s website.

Docker Extension: Press Ctrl+Shift+X (Windows/Linux) or Cmd+Shift+X (macOS) in VS Code → search for “Docker” → Install.

Explore: You’ll see a whale icon on the left toolbar, giving you a GUI-like interface for Docker containers, images, and registries.

Pro Tip: If you’re going fully DevOps, keep Docker Desktop and VS Code up to date. Each release often adds new features or performance/security enhancements.

Examples: Building an app with a Docker container

Node.js example

We’ll create a basic Node.js web server using Express, then build and run it in Docker. The server listens on port 3000 and returns a simple greeting.

Create a Project Folder:

mkdir node-docker-app
cd node-docker-app

Initialize a Node.js Application:

npm init -y
npm install express

npm init -y generates a default package.json.

npm install express pulls in the Express framework and saves it to package.json.

Create the Main App File (index.js):

cat <<'EOF' >index.js
const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
  res.send('Hello from Node + Docker + VS Code!');
});

app.listen(port, () => {
  console.log(`App running on port ${port}`);
});
EOF

This code starts a server on port 3000. When a user hits http://localhost:3000, they see a “Hello from Node + Docker + VS Code!” message.

Add an Annotated Dockerfile:

# Use a lightweight Node.js 23.x image based on Alpine Linux
FROM node:23.6.1-alpine

# Set the working directory inside the container
WORKDIR /usr/src/app

# Copy only the package files first (for efficient layer caching)
COPY package*.json ./

# Install Node.js dependencies
RUN npm install

# Copy the entire project (including index.js) into the container
COPY . .

# Expose port 3000 for the Node.js server (metadata)
EXPOSE 3000

# The default command to run your app
CMD ["node", "index.js"]

Build and run the container image:

docker build -t node-docker-app .

Run the container, mapping port 3000 on the container to port 3000 locally:

docker run -p 3000:3000 node-docker-app

Open http://localhost:3000 in your browser. You’ll see “Hello from Node + Docker + VS Code!”

Python example

Here, we’ll build a simple Flask application that listens on port 3001 and displays a greeting. We’ll then package it into a Docker container.

Create a Project Folder:

mkdir python-docker-app
cd python-docker-app

Create a Basic Flask App (app.py):

cat <<'EOF' >app.py
from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello from Python + Docker!'

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=3001)
EOF

This code defines a minimal Flask server that responds to requests on port 3001.

Add Dependencies:

echo "Flask==2.3.0" > requirements.txt

Lists your Python dependencies. In this case, just Flask.

Create an Annotated Dockerfile:

# Use Python 3.11 on Alpine for a smaller base image
FROM python:3.11-alpine

# Set the container's working directory
WORKDIR /app

# Copy only the requirements file first to leverage caching
COPY requirements.txt .

# Install Python libraries
RUN pip install -r requirements.txt

# Copy the rest of the application code
COPY . .

# Expose port 3001, where Flask will listen
EXPOSE 3001

# Run the Flask app
CMD ["python", "app.py"]

Build and Run:

docker build -t python-docker-app .
docker run -p 3001:3001 python-docker-app

Visit http://localhost:3001 and you’ll see “Hello from Python + Docker!”

Manage containers in VS Code with the Docker Extension

Once you install the Docker extension in Visual Studio Code, you can easily manage containers, images, and registries, bringing the full power of Docker into VS Code. You can also open an integrated terminal within VS Code to execute commands and manage applications seamlessly through the container’s isolated filesystem, and use the context menu to perform actions like starting, stopping, and removing containers.

With the Docker extension installed, open VS Code and click the Docker icon:

Containers: View running/stopped containers, view logs, stop, or remove them at a click. Each container is associated with a specific container name, which helps you identify and manage them effectively.

Images: Inspect your local images, tag them, or push to Docker Hub/other registries.

Registries: Quickly pull from Docker Hub or private repos.

No more memorizing container IDs or retyping long CLI commands. This is especially handy for visual folks or those new to Docker.

Advanced Debugging (Single & Multi-Service)

Debugging containerized applications can be challenging, but with Docker and Visual Studio Code, you can streamline the process. Here’s some advanced debugging that you can try for yourself!

Containerized Debug Ports

For Node, expose port 9229 (EXPOSE 9229) and run node with --inspect=0.0.0.0:9229 so the debugger is reachable from outside the container.

In VS Code, create a “Docker: Attach to Node” debug config to step through code in real time.
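Here’s a minimal sketch of such an entry in the configurations array of .vscode/launch.json; the remoteRoot value assumes the WORKDIR /usr/src/app from the Node Dockerfile above, and the container must publish the debug port (e.g., docker run -p 9229:9229 ...):

{
  "type": "node",
  "request": "attach",
  "name": "Docker: Attach to Node",
  "port": 9229,
  "address": "localhost",
  "localRoot": "${workspaceFolder}",
  "remoteRoot": "/usr/src/app"
}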

Microservices / Docker Compose

For multiple services, define them in compose.yml (see the sketch after this list).

Spin them up with docker compose up.

Configure VS Code to attach debuggers to each service’s port (e.g., microservice A on 9229, microservice B on 9230).
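Here’s a minimal compose.yml sketch along those lines; the service names, build contexts, and per-service debug commands are illustrative assumptions:

services:
  web-service:
    build: ./web
    ports:
      - "3000:3000"   # app port
      - "9229:9229"   # debugger for microservice A
    command: node --inspect=0.0.0.0:9229 index.js
  auth-service:
    build: ./auth
    ports:
      - "9230:9229"   # host port 9230 maps to the debugger in microservice B
    command: node --inspect=0.0.0.0:9229 index.js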

Remote – Containers (Dev Containers)

Use VS Code’s Dev Containers extension for a fully containerized development environment.

Perfect for large teams: Everyone codes in the same environment with preinstalled tools/libraries. Onboarding new devs is a breeze.

Insider Trick: If you’re juggling multiple services, label your containers meaningfully (--name web-service, --name auth-service) so they’re easy to spot in VS Code’s Docker panel.
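For example, reusing the images built earlier (the names are purely illustrative labels):

docker run -d --name web-service -p 3000:3000 node-docker-app
docker run -d --name auth-service -p 3001:3001 python-docker-app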

Pro Tips: Security & Performance

Security

Use Trusted Base Images: Official images like node:23.6.1-alpine, python:3.11-alpine, etc., reduce the risk of hidden vulnerabilities.

Scan Images: Tools like Docker Scout or Trivy spot vulnerabilities. Integrate them into CI/CD to catch issues early.

Secrets Management: Never hardcode tokens or credentials; use Docker secrets, environment variables, or external vaults (see the sketch after this list).

Encryption & SSL: For production apps, consider a reverse proxy (Nginx, Traefik) or in-container SSL termination.
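As a minimal sketch of the environment-variable approach (the API_TOKEN variable and local token file are hypothetical):

# Inject the token at run time instead of baking it into the image
docker run -d -e API_TOKEN="$(cat ./api-token.txt)" node-docker-app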

Performance

Resource Allocation: Docker Desktop on Windows/macOS allows you to tweak CPU/RAM usage. Don’t starve your containers if you run multiple services.

Multi-Stage Builds & .dockerignore: Keep images lean and build times short by ignoring unneeded files (node_modules, .git); see the sketch after this list.

Dev Containers: Offload heavy dependencies to a container, freeing your host machine.

Docker Compose v2: It’s the new standard, with faster, more intuitive commands. Great for orchestrating multi-service setups on your dev box.
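As a sketch for the Node app above, a .dockerignore keeps local artifacts out of the build context:

node_modules
.git
npm-debug.log

And a two-stage Dockerfile separates dependency installation from the final runtime image (npm ci assumes the package-lock.json generated by npm install earlier):

# Build stage: install dependencies
FROM node:23.6.1-alpine AS build
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

# Runtime stage: copy only the prepared app
FROM node:23.6.1-alpine
WORKDIR /usr/src/app
COPY --from=build /usr/src/app ./
EXPOSE 3000
CMD ["node", "index.js"]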

Real-World Impact: In a recent project, containerizing builds and using ephemeral agents slashed our development environment setup time by 80%. That’s more time coding, and less time wrangling dependencies!

Going Further: CI/CD and Ephemeral Dev Environments

CI/CD Integration: Automate Docker builds in Jenkins, GitHub Actions, or Azure DevOps. Run your tests within containers for consistent results.

# Example GitHub Actions snippet
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker Image
        run: docker build -t myorg/myapp:${{ github.sha }} .
      - name: Run Tests
        run: docker run --rm myorg/myapp:${{ github.sha }} npm test
      - name: Push Image
        run: |
          docker login -u $USER -p $TOKEN
          docker push myorg/myapp:${{ github.sha }}

Ephemeral Build Agents: Use Docker containers as throwaway build agents. Each CI job spins up a clean environment, with no leftover caches or config drift.

Docker in Kubernetes: If you need to scale horizontally, deploying containers to Kubernetes (K8s) can handle higher traffic or more complex microservice architectures.

Dev Containers & Codespaces: Tools like GitHub Codespaces or the “Remote – Containers” extension let you develop in ephemeral containers in the cloud. Perfect for distributed teams: everyone codes in the same environment, minus the “It’s fine on my laptop!” tension.

Conclusion

Congrats! You’ve just stepped through how Docker and Visual Studio Code can supercharge your development:

Consistent Environments: Docker ensures no more environment mismatch nightmares.

Speedy Debugging & Management: VS Code’s integrated Docker extension and robust debug tools keep you in the zone.

Security & Performance: Multi-stage builds, scans, ephemeral dev containers—these are the building blocks of a healthy, future-proofed DevOps pipeline.

CI/CD & Beyond: Extend the same principles to your CI/CD flows, ensuring smooth releases and easy rollbacks.

Once you see the difference in speed, reliability, and consistency, you’ll never want to go back to old-school local setups.

And that’s it! We’ve covered real-world anecdotes, pro tips, and future-proof best practices for pairing Docker with VS Code. Now go forth, embrace containers, and code with confidence. Happy shipping!

Learn more

Install or upgrade Docker Desktop & VS Code to the latest versions.

Try containerizing an app from scratch — your future self (and your teammates) will thank you.

Experiment with ephemeral dev containers, advanced debugging, or Docker security scanning (Docker Scout).

Join the Docker community on Slack or the forums. Swap stories or pick up fresh tips!

Source: https://blog.docker.com/feed/