How to Add MCP Servers to ChatGPT with Docker MCP Toolkit

ChatGPT is great at answering questions and generating code. But here’s what it can’t do: execute that code, query your actual database, create a GitHub repo with your project, or scrape live data from websites. It’s like having a brilliant advisor who can only talk, never act.

Docker MCP Toolkit changes this completely. 

Here’s what that looks like in practice: You ask ChatGPT to check MacBook Air prices across Amazon, Walmart, and Best Buy. If competitor prices are lower than yours, it doesn’t just tell you, it acts: automatically adjusting your Stripe product price to stay competitive, logging the repricing decision to SQLite, and pushing the audit trail to GitHub. All through natural conversation. No manual coding. No copy-pasting scripts. Real execution.

“But wait,” you might say, “ChatGPT already has a shopping research feature.” True. But ChatGPT’s native shopping can only look up prices. Only MCP can execute: creating payment links, generating invoices, storing data in your database, and pushing to your GitHub. That’s the difference between an advisor and an actor.

By the end of this guide, you’ll build exactly this: a Competitive Repricing Agent that checks competitor prices on demand, compares them to yours, and automatically adjusts your Stripe product prices when competitors are undercutting you.

Here’s how the pieces fit together:

ChatGPT provides the intelligence: understanding your requests and determining what needs to happen

Docker MCP Gateway acts as the secure bridge: routing requests to the right tools

MCP Servers are the hands: executing actual tasks in isolated Docker containers

The result? ChatGPT can query your SQL database, manage GitHub repositories, scrape websites, process payments, run tests, and more—all while Docker’s security model keeps everything contained and safe.

In this guide, you’ll learn how to add seven MCP servers to ChatGPT by connecting to Docker MCP Toolkit. We’ll use a handful of must-have MCP servers: Firecrawl for web scraping, SQLite for data persistence, GitHub for version control, Stripe for payment processing, Node.js Sandbox for calculations, Sequential Thinking for complex reasoning, and Context7 for documentation. Then, you’ll build the Competitive Repricing Agent shown above, all through conversation.

What is Model Context Protocol (MCP)?

Before we dive into the setup, let’s clarify what MCP actually is.

Model Context Protocol (MCP) is the standardized way AI agents like ChatGPT and Claude connect to tools, APIs, and services. It’s what lets ChatGPT go beyond conversation and perform real-world actions like querying databases, deploying containers, analyzing datasets, or managing GitHub repositories.

In short: MCP is the bridge between ChatGPT’s reasoning and your developer stack. And Docker? Docker provides the guardrails that make it safe.

Why Use Docker MCP Toolkit with ChatGPT?

I’ve been working with AI tools for a while now, and this Docker MCP integration stands out for one reason: it actually makes ChatGPT productive.

Most AI integrations feel like toys: impressive demos that break in production. Docker MCP Toolkit is different. It creates a secure, containerized environment where ChatGPT can execute real tasks without touching your local machine or production systems.

Every action happens in an isolated container. Every MCP server runs in its own security boundary. When you’re done, containers are destroyed. No residue, no security debt, complete reproducibility across your entire team.

What ChatGPT Can and Can’t Do Without MCP

Let’s be clear about what changes when you add MCP.

Without MCP

You ask ChatGPT to build a system to regularly scrape product prices and store them in a database. ChatGPT responds with Python code, maybe 50 lines using BeautifulSoup and SQLite. Then you must copy the code, install dependencies, create the database schema, run the script manually, and set up a scheduler if you want it to run regularly.

Yes, ChatGPT remembers your conversation and can store memories about you. But those memories live on OpenAI’s servers—not in a database you control.

With MCP

You ask ChatGPT the same thing. Within seconds, it calls Firecrawl MCP to actually scrape the website. It calls SQLite MCP to create a database on your machine and store the data. It calls GitHub MCP to save a report to your repository. The entire workflow executes in under a minute.

Real data gets stored in a real database on your infrastructure. Real commits appear in your GitHub repository. Close ChatGPT, come back tomorrow, and ask “Show me the price trends.” ChatGPT queries your SQLite database and returns results instantly because the data lives in a database you own and control, not in ChatGPT’s conversation memory.

The data persists in your systems, ready to query anytime; no manual script execution required.

Why This Is Different from ChatGPT’s Native Shopping

ChatGPT recently released a shopping research feature that can track prices and make recommendations. Here’s what it can and cannot do:

What ChatGPT Shopping Research can do:

Track prices across retailers

Remember price history in conversation memory

Provide comparisons and recommendations

What ChatGPT Shopping Research cannot do:

Automatically update your product prices in Stripe

Execute repricing logic based on competitor changes

Store pricing data in your database (not OpenAI’s servers)

Push audit trails to your GitHub repository

Create automated competitive response workflows

With Docker MCP Toolkit, ChatGPT becomes a competitive pricing execution system. When you ask it to check prices and competitors are undercutting you, it doesn’t just inform you, it acts: updating your Stripe prices to match or beat competitors, logging decisions to your database, and pushing audit records to GitHub. The data lives in your infrastructure, not OpenAI’s servers.

Setting Up ChatGPT with Docker MCP Toolkit

Prerequisites

Before you begin, ensure you have:

A machine with at least 8 GB of RAM (16 GB recommended)

Docker Desktop installed

A ChatGPT Plus, Pro, Business, or Enterprise Account

ngrok account (free tier works) – For exposing the Gateway publicly

Step 1. Enable ChatGPT developer mode

Head over to ChatGPT and sign in to your account.

Click on your profile icon at the top left corner of the ChatGPT page and select “Settings”. Select “Apps and Connectors” and scroll down to the end of the page to select “Advanced Settings.”

Settings → Apps & Connectors → Advanced → Developer Mode (ON)

ChatGPT Developer Mode provides full Model Context Protocol (MCP) client support for all tools, both read and write operations. This feature was announced in the first week of September 2025, marking a significant milestone in AI-developer integration. ChatGPT can perform write actions—creating repositories, updating databases, modifying files—all with proper confirmation modals for safety.

Key capabilities:

Full read/write MCP tool support

Custom connector creation

OAuth and authentication support

Explicit confirmations for write operations

Available on Plus, Pro, Business, Enterprise, and Edu plans

Step 2. Create MCP Gateway

This scaffolds the MCP Gateway project that ChatGPT will connect to; you’ll build and start the container in the following steps.

docker mcp server init --template=chatgpt-app-basic test-chatgpt-app

Successfully initialized MCP server project in test-chatgpt-app (template: chatgpt-app-basic)
Next steps:
  cd test-chatgpt-app
  docker build -t test-chatgpt-app:latest .

Step 3. List out all the project files

ls -la
total 64
drwxr-xr-x@ 9 ajeetsraina staff 288 16 Nov 16:53 .
drwxr-x---+ 311 ajeetsraina staff 9952 16 Nov 16:54 ..
-rw-r--r--@ 1 ajeetsraina staff 165 16 Nov 16:53 catalog.yaml
-rw-r--r--@ 1 ajeetsraina staff 371 16 Nov 16:53 compose.yaml
-rw-r--r--@ 1 ajeetsraina staff 480 16 Nov 16:53 Dockerfile
-rw-r--r--@ 1 ajeetsraina staff 88 16 Nov 16:53 go.mod
-rw-r--r--@ 1 ajeetsraina staff 2576 16 Nov 16:53 main.go
-rw-r--r--@ 1 ajeetsraina staff 2254 16 Nov 16:53 README.md
-rw-r--r--@ 1 ajeetsraina staff 6234 16 Nov 16:53 ui.html

Step 4. Examine the Compose file

services:
  gateway:
    image: docker/mcp-gateway                        # Official Docker MCP Gateway image
    command:
      - --servers=test-chatgpt-app                   # Name of the MCP server to expose
      - --catalog=/mcp/catalog.yaml                  # Path to server catalog configuration
      - --transport=streaming                        # Use streaming transport for real-time responses
      - --port=8811                                  # Port the gateway listens on
    environment:
      - DOCKER_MCP_IN_CONTAINER=1                    # Tells the gateway it's running inside a container
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock    # Allows the gateway to spawn sibling containers
      - ./catalog.yaml:/mcp/catalog.yaml             # Mount the local catalog into the container
    ports:
      - "8811:8811"                                  # Expose the gateway port to the host

Step 5. Bringing up the compose services

docker compose up -d
[+] Running 2/2
✔ Network test-chatgpt-app_default Created 0.0s
✔ Container test-chatgpt-app-gateway-1 Started

docker ps | grep test-chatgpt-app
eb22b958e09c docker/mcp-gateway "/docker-mcp gateway…" 21 seconds ago Up 20 seconds 0.0.0.0:8811->8811/tcp, [::]:8811->8811/tcp test-chatgpt-app-gateway-1

Step 6. Verify the MCP session

curl http://localhost:8811/mcp
GET requires an active session
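The “GET requires an active session” message is expected: the gateway is up, but MCP clients must initialize a session before issuing requests. If you want to see that handshake yourself (ChatGPT performs it automatically), you can send an MCP initialize request by hand. Treat this as a minimal sketch of the JSON-RPC handshake over the streamable HTTP transport; the protocolVersion value and the response framing may differ across gateway versions.

# Minimal MCP handshake sketch (assumption: streamable HTTP transport on port 8811).
# ChatGPT does this automatically; shown here only to confirm the gateway responds.
curl -s -X POST http://localhost:8811/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
          "protocolVersion": "2025-03-26",
          "capabilities": {},
          "clientInfo": { "name": "curl-check", "version": "0.0.1" }
        }
      }'
# A "result" payload containing serverInfo indicates the gateway is ready for clients.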

Step 7. Expose with Ngrok

Install ngrok and expose your local gateway. You will need to sign up for an ngrok account to obtain an auth token.

brew install ngrok
ngrok config add-authtoken <your_token_id>
ngrok http 8811

Note the public URL (like https://91288b24dc98.ngrok-free.app). Keep this terminal open.

Step 8. Connect ChatGPT

In ChatGPT, go to Settings → Apps & Connectors → Create.

Step 9. Create connector:

Settings → Apps & Connectors → Create

– Name: Test MCP Server
– Description: Testing Docker MCP Toolkit integration
– Connector URL: https://[YOUR_NGROK_URL]/mcp
– Authentication: None
– Click "Create"

Test it by asking ChatGPT to call the greet tool. If it responds, your connection works.

Real-World Demo: Competitive Repricing Agent

Now that you’ve connected ChatGPT to Docker MCP Toolkit, let’s build something that showcases what only MCP can do—something ChatGPT’s native shopping feature cannot replicate.

We’ll create a Competitive Repricing Agent that checks competitor prices on demand, and when competitors are undercutting you, automatically adjusts your Stripe product prices to stay competitive, logs the repricing decision to SQLite, and pushes audit records to GitHub.

Time to build: 15 minutes  

Monthly cost: Free Stripe (test mode) + $1.50-$15 (Firecrawl API)

Infrastructure: $0 (SQLite is free)

The Challenge

E-commerce businesses face a constant dilemma:

Manual price checking across multiple retailers is time-consuming and error-prone

Comparing competitor prices and calculating optimal repricing requires multiple tools

Executing price changes across your payment infrastructure requires context-switching

Historical trend data is scattered across spreadsheets

Strategic insights require manual analysis and interpretation

Result: Missed opportunities, delayed reactions, and losing sales to competitors with better prices.

The Solution: On-Demand Competitive Repricing Agent

Docker MCP Toolkit transforms ChatGPT from an advisor into an autonomous agent that can actually execute. The architecture routes your requests through a secure MCP Gateway that orchestrates specialized tools: Firecrawl scrapes live prices, Stripe creates payment links and invoices, SQLite stores data on your infrastructure, and GitHub maintains your audit trail. Each tool runs in an isolated Docker container: secure, reproducible, and under your control.

The 7 MCP Servers We’ll Use

Server – Purpose – Why It Matters

Firecrawl – Web scraping – Extracts live prices from any website

SQLite – Data persistence – Stores 30+ days of price history

Stripe – Payment management – Updates your product prices to match or beat competitors

GitHub – Version control – Audit trail for all reports

Sequential Thinking – Complex reasoning – Multi-step strategic analysis

Context7 – Documentation – Up-to-date library docs for code generation

Node.js Sandbox – Calculations – Statistical analysis in isolated containers

The Complete MCP Workflow (Executes in under 3 minutes)

Step 1. Scrape and Store (30 seconds)

Agent scrapes live prices from Amazon, Walmart, and Best Buy 

Compares against your current Stripe product price

Step 2: Compare Against Your Price (15 seconds) 

Best Buy drops to $509.99—undercutting your $549.99

Agent calculates optimal repricing strategy

Determines new competitive price point

Step 3: Execute Repricing (30 seconds)

Updates your Stripe product with the new competitive price

Logs repricing decision to SQLite with full audit trail

Pushes pricing change report to GitHub

Step 4: Stay Competitive (instant)

Your product now priced competitively

Complete audit trail in your systems

Historical data ready for trend analysis

The Demo Setup: Enable Docker MCP Toolkit

Open Docker Desktop and enable the MCP Toolkit from the Settings menu.

To enable:

Open Docker Desktop

Go to Settings → Beta Features

Toggle Docker MCP Toolkit ON

Click Apply

Click MCP Toolkit in the Docker Desktop sidebar, then select Catalog to explore available servers.

For this demonstration, we’ll use seven MCP servers:

SQLite – RDBMS with advanced analytics, text and vector search, geospatial capabilities, and intelligent workflow automation

Stripe –  Updates your product prices to match or beat competitors for automated repricing workflows

GitHub – Handles version control and deployment

Firecrawl – Web scraping and content extraction

Node.js Sandbox – Runs tests, installs dependencies, validates code (in isolated containers)

Sequential Thinking – Debugs failing tests and optimizes code

Context7 – Provides code documentation for LLMs and AI code editors

Let’s configure each one step by step.

1. Configure SQLite MCP Server

The SQLite MCP Server requires no external database setup. It manages database creation and queries through its 25 built-in tools.

To set up the SQLite MCP Server, follow these steps:

Open Docker Desktop → access MCP Toolkit → Catalog

Search “SQLite”

Click + Add

No configuration needed, just click Start MCP Server

docker mcp server ls
# Should show sqlite-mcp-server as enabled

That’s it. ChatGPT can now create databases, tables, and run queries through conversation.
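For reference, the schema the agent ends up creating through conversation looks roughly like the sketch below. This is an illustration run with the sqlite3 CLI, assuming the table and column names used in the system prompt later in this guide; the agent may name things slightly differently.

# Rough sketch of the schema the agent creates (names follow the system prompt later in this guide).
sqlite3 pricing.db <<'SQL'
CREATE TABLE IF NOT EXISTS products (
  id INTEGER PRIMARY KEY,
  sku TEXT UNIQUE,
  name TEXT,
  category TEXT,
  brand TEXT,
  stripe_product_id TEXT,
  stripe_price_id TEXT,
  current_price REAL,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS price_history (
  id INTEGER PRIMARY KEY,
  product_id INTEGER REFERENCES products(id),
  competitor TEXT,
  price REAL,
  original_price REAL,
  discount_percent REAL,
  in_stock INTEGER,
  url TEXT,
  scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_price_history_product
  ON price_history (product_id, scraped_at DESC);
SQL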

2. Configure Stripe MCP Server

The Stripe MCP server gives ChatGPT full access to payment infrastructure—listing products, managing prices, and updating your catalog to stay competitive.

Get Stripe API Key

Go to dashboard.stripe.com

Navigate to Developers → API Keys

Copy your Secret Key:

Use sk_test_… for sandbox/testing

Use sk_live_… for production

Configure in Docker Desktop

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Stripe”

Click + Add

Go to the Configuration tab

Add your API key:

Field: stripe.api_key

Value: Your Stripe secret key

Click Save and Start Server

Or via CLI:

docker mcp secret set STRIPE.API_KEY="sk_test_your_key_here"
docker mcp server enable stripe
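Optionally, sanity-check the key against Stripe’s API before the agent uses it. This is a plain curl call to Stripe’s products endpoint (the test-mode key is a placeholder to replace); Stripe uses HTTP basic auth with the secret key as the username.

# Quick sanity check that the secret key works (use a test-mode key).
curl -s "https://api.stripe.com/v1/products?limit=3" \
  -u "sk_test_your_key_here:"
# A JSON response with a "data" array (possibly empty) confirms the key is valid.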

3. Configure GitHub Official MCP Server

The GitHub MCP server lets ChatGPT create repositories, manage issues, review pull requests, and more.

Option 1: OAuth Authentication (Recommended)

OAuth is the easiest and most secure method:

In MCP Toolkit → Catalog, search “GitHub Official”

Click + Add

Go to the OAuth tab in Docker Desktop

Find the GitHub entry

Click “Authorize”

Your browser opens GitHub’s authorization page

Click “Authorize Docker” on GitHub

You’re redirected back to Docker Desktop

Return to the Catalog tab, find GitHub Official

Click Start Server

Advantage: No manual token creation. Authorization happens through GitHub’s secure OAuth flow with automatic token refresh.

Option 2: Personal Access Token

If you prefer manual control or need specific scopes:

Step 1: Create GitHub Personal Access Token

Go to https://github.com and sign in

Click your profile picture → Settings

Scroll to “Developer settings” in the left sidebar

Click “Personal access tokens” → “Tokens (classic)”

Click “Generate new token” → “Generate new token (classic)”

Name it: “Docker MCP ChatGPT”

Select scopes:

repo (Full control of repositories)

workflow (Update GitHub Actions workflows)

read:org (Read organization data)

Click “Generate token”

Copy the token immediately (you won’t see it again!)

Step 2: Configure in Docker Desktop

In MCP Toolkit → Catalog, find GitHub Official:

Click + Add (if not already added)

Go to the Configuration tab

Select “Personal Access Token” as the authentication method

Paste your token

Click Start Server

Or via CLI:

docker mcp secret set GITHUB.PERSONAL_ACCESS_TOKEN="github_pat_YOUR_TOKEN_HERE"

Verify GitHub Connection

docker mcp server ls

# Should show github as enabled
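If you used a personal access token, you can also confirm the token itself is valid with a direct call to GitHub’s REST API (replace the placeholder with your own token):

# Optional: verify the PAT before the MCP server uses it.
curl -s -H "Authorization: Bearer github_pat_YOUR_TOKEN_HERE" \
  https://api.github.com/user
# A JSON payload containing your "login" confirms the token works.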

4. Configure Firecrawl MCP Server

The Firecrawl MCP server gives ChatGPT powerful web scraping and search capabilities.

Get Firecrawl API Key

Go to https://www.firecrawl.dev

Create an account (or sign in)

Navigate to API Keys in the sidebar

Click “Create New API Key”

Copy the API key

Configure in Docker Desktop

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Firecrawl”

Find Firecrawl in the results

Click + Add

Go to the Configuration tab

Add your API key:

Field: firecrawl.api_key

Value: Your Firecrawl API key

Leave all other entries blank

Click Save and Add Server

Or via CLI:

docker mcp secret set FIRECRAWL.API_KEY="fc-your-api-key-here"
docker mcp server enable firecrawl

What You Get

6+ Firecrawl tools, including:

firecrawl_scrape – Scrape content from a single URL

firecrawl_crawl – Crawl entire websites and extract content

firecrawl_map – Discover all indexed URLs on a site

firecrawl_search – Search the web and extract content

firecrawl_extract – Extract structured data using LLM capabilities

firecrawl_check_crawl_status – Check crawl job status
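If you want to confirm your key works outside of MCP, Firecrawl’s REST API can be called directly. The sketch below assumes the v1 scrape endpoint and request shape; check Firecrawl’s docs if your plan uses a newer API version.

# Hedged sketch: verify the Firecrawl key with a single scrape request.
curl -s -X POST https://api.firecrawl.dev/v1/scrape \
  -H "Authorization: Bearer fc-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "formats": ["markdown"]}'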

5. Configure Node.js Sandbox MCP Server

The Node.js Sandbox enables ChatGPT to execute JavaScript in isolated Docker containers.

Note: This server requires special configuration because it uses Docker-out-of-Docker (DooD) to spawn containers.

Understanding the Architecture

The Node.js Sandbox implements the Docker-out-of-Docker (DooD) pattern by mounting /var/run/docker.sock. This gives the sandbox container access to the Docker daemon, allowing it to spawn ephemeral sibling containers for code execution.

When ChatGPT requests JavaScript execution:

Sandbox container makes Docker API calls

Creates temporary Node.js containers (with resource limits)

Executes code in complete isolation

Returns results

Auto-removes the container

Security Note: Docker socket access is a privilege escalation vector (effectively granting root-level host access). This is acceptable for local development but requires careful consideration for production use.
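To picture what happens under the hood, the command below is roughly the kind of throwaway container the sandbox spawns for each execution. It’s an illustrative sketch, not the server’s exact invocation: an ephemeral Node.js container with resource limits and no network, auto-removed when the code finishes.

# Roughly what a Docker-out-of-Docker execution looks like (illustration only).
docker run --rm \
  --memory=512m --cpus=1 \
  --network=none \
  node:lts-alpine \
  node -e 'const prices=[699, 669, 509.99]; console.log("lowest:", Math.min(...prices))'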

Add Via Docker Desktop

MCP Toolkit → Catalog

Search “Node.js Sandbox”

Click + Add

Unfortunately, the Node.js Sandbox requires manual configuration that can’t be done entirely through the Docker Desktop UI. We’ll need to configure ChatGPT’s connector settings directly.

Prepare Output Directory

Create a directory for sandbox output:

# macOS/Linux
mkdir -p ~/Desktop/sandbox-output

# Windows
mkdir %USERPROFILE%\Desktop\sandbox-output

Configure Docker File Sharing

Ensure this directory is accessible to Docker:

Docker Desktop → Settings → Resources → File Sharing

Add ~/Desktop/sandbox-output (or your Windows equivalent)

Click Apply & Restart

6. Configure Sequential Thinking MCP Server

The Sequential Thinking MCP server gives ChatGPT the ability to work through problems via dynamic, reflective thought sequences. Adding it is straightforward: it doesn’t require an API key. Just search for Sequential Thinking in the Catalog and add it to your MCP server list.

In Docker Desktop:

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Sequential Thinking”

Find Sequential Thinking in the results

Click “Add MCP Server” to add without any configuration

The Sequential Thinking MCP server should now appear under “My Servers” in Docker MCP Toolkit.

What you get:

A single Sequential Thinking tool that includes:

sequentialthinking – A detailed tool for dynamic and reflective problem-solving through thoughts. This tool helps analyze problems through a flexible thinking process that can adapt and evolve. Each thought can build on, question, or revise previous insights as understanding deepens.

7. Configure Context7 MCP Server

The Context7 MCP server gives ChatGPT access to up-to-date code documentation for LLMs and AI code editors. Adding it is straightforward and requires no API key. Just search for Context7 in the Catalog and add it to your MCP server list.

In Docker Desktop:

Open Docker Desktop → MCP Toolkit → Catalog

Search for “Context7”

Find Context7 in the results

Click “Add MCP Server” to add without any configuration

The Context7 MCP server should now appear under “My Servers” in Docker MCP Toolkit

What you get:

2 Context7 tools including:

get-library-docs – Fetches up-to-date documentation for a library.

resolve-library-id – Resolves a package/product name to a Context7-compatible library ID and returns a list of matching libraries. 

Verify that all the MCP servers are available and running:

docker mcp server ls

MCP Servers (7 enabled)

NAME                 OAUTH    SECRETS   CONFIG    DESCRIPTION
------------------------------------------------------------------------------------------------
context7             -        -         -         Context7 MCP Server — Up-to-da…
fetch                -        -         -         Fetches a URL from the internet…
firecrawl            -        ✓ done    partial   Official Firecrawl MCP Server…
github-official      ✓ done   ✓ done    -         Official GitHub MCP Server, by …
node-code-sandbox    -        -         -         A Node.js–based Model Context P…
sequentialthinking   -        -         -         Dynamic and reflective problem-…
sqlite-mcp-server    -        -         -         The SQLite MCP Server transform…
stripe               -        ✓ done    -         Interact with Stripe services o…

Tip: To use these servers from other MCP clients (e.g., Claude or Cursor), run docker mcp client connect <client-name>

Configuring ChatGPT App and Connector

Use the following compose file in order to let ChatGPT discover all the tools under Docker MCP Catalog:

services:
  gateway:
    image: docker/mcp-gateway
    command:
      - --catalog=/root/.docker/mcp/catalogs/docker-mcp.yaml
      - --servers=context7,firecrawl,github-official,node-code-sandbox,sequentialthinking,sqlite-mcp-server,stripe
      - --transport=streaming
      - --port=8811
    environment:
      - DOCKER_MCP_IN_CONTAINER=1
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ~/.docker/mcp:/root/.docker/mcp:ro
    ports:
      - "8811:8811"

By now, you should be able to view all the MCP tools under ChatGPT Developer Mode.

Let’s Test it Out

Now we give ChatGPT its intelligence. Copy this system prompt and paste it into your ChatGPT conversation:

You are a Competitive Repricing Agent that monitors competitor prices, automatically adjusts your Stripe product prices, and provides strategic recommendations using 7 MCP servers: Firecrawl (web scraping), SQLite (database), Stripe (price management), GitHub (reports), Node.js Sandbox (calculations), Context7 (documentation), and Sequential Thinking (complex reasoning).

DATABASE SCHEMA

Products table: id (primary key), sku (unique), name, category, brand, stripe_product_id, stripe_price_id, current_price, created_at
Price_history table: id (primary key), product_id, competitor, price, original_price, discount_percent, in_stock, url, scraped_at
Price_alerts table: id (primary key), product_id, competitor, alert_type, old_price, new_price, change_percent, created_at
Repricing_log table: id, product_name, competitor_triggered, competitor_price, old_stripe_price, new_stripe_price, repricing_strategy, stripe_price_id, triggered_at, status

Indexes: idx_price_history_product on (product_id, scraped_at DESC), idx_price_history_competitor on (competitor)

WORKFLOW

On-demand check: Scrape (Firecrawl) → Store (SQLite) → Analyze (Node.js) → Report (GitHub)
Competitive repricing: Scrape (Firecrawl) → Compare to your price → Update (Stripe) → Log (SQLite) → Report (GitHub)

STRIPE REPRICING WORKFLOW

When competitor price drops below your current price:
1. list_products – Find your existing Stripe product
2. list_prices – Get current price for the product
3. create_price – Create new price to match/beat competitor (prices are immutable in Stripe)
4. update_product – Set the new price as default
5. Log the repricing decision to SQLite

Price strategies:
– "match": Set price equal to lowest competitor
– "undercut": Set price 1-2% below lowest competitor
– "margin_floor": Never go below your minimum margin threshold

Use Context7 when: Writing scripts with new libraries, creating visualizations, building custom scrapers, or needing latest API docs

Use Sequential Thinking when: Making complex pricing strategy decisions, planning repricing rules, investigating market anomalies, or creating strategic recommendations requiring deep analysis

EXTRACTION SCHEMAS

Amazon: title, price, list_price, rating, reviews, availability
Walmart: name, current_price, was_price, availability
Best Buy: product_name, sale_price, regular_price, availability

RESPONSE FORMAT

Price Monitoring: Products scraped, competitors covered, your price vs competitors
Repricing Triggers: Which competitor triggered, price difference, strategy applied
Price Updated: New Stripe price ID, old vs new price, margin impact
Audit Trail: GitHub commit SHA, SQLite log entry, timestamp

TOOL ORCHESTRATION PATTERNS

Simple price check: Firecrawl → SQLite → Response
Trend analysis: SQLite → Node.js → Response
Strategy analysis: SQLite → Sequential Thinking → Response
Competitive repricing: Firecrawl → Compare → Stripe → SQLite → GitHub
Custom tool development: Context7 → Node.js → GitHub
Full intelligence report: Firecrawl → SQLite → Node.js → Sequential Thinking → GitHub

KEY USAGE PATTERNS

Use Stripe for: Listing products, listing prices, creating new prices, updating product default prices

Use Sequential Thinking for: Pricing strategy decisions (match, undercut, or hold), market anomaly investigations (why did competitor prices spike), multi-factor repricing recommendations

Use Context7 for: Getting documentation before coding, learning new libraries on-the-fly, ensuring code uses latest API conventions

Use Node.js for: Statistical calculations (moving averages, standard deviation, volatility), chart generation, margin calculations

BEST PRACTICES

Space web scraping requests 2 seconds apart to respect rate limits
Calculate price difference as (your_price – competitor_price)
Trigger repricing when competitor drops below your current price
Log all repricing decisions to SQLite with Stripe IDs for audit trail
Push pricing reports to GitHub for compliance
Always use Context7 before writing code with unfamiliar libraries
Respect margin floors—never reprice below minimum acceptable margin

COMMAND RECOGNITION PATTERNS

"Check X prices and stay competitive" → Full repricing pipeline: scrape → compare → if competitor lower: Stripe update + SQLite + GitHub
"Match competitor price for X" → Stripe: list_products → list_prices → create_price (matching)
"Undercut competitors on X" → Stripe: create_price (1-2% below lowest)
"Show price history" → SQLite query → format results
"Analyze pricing strategy for X" → Sequential Thinking analysis
"Why did competitor prices change" → Query data → Sequential Thinking investigation

CORE PRINCIPLE

When competitors change prices, don't just report—ACT. Update your Stripe prices to stay competitive, log decisions to SQLite, push audit records to GitHub. Transform competitor intelligence into automated repricing. All data stays on YOUR infrastructure.

Here’s the prompt:

Set up a competitive repricing agent:
1. I sell MacBook Air M3 on my store – current Stripe price is $549.99
2. Monitor competitor prices on Amazon, Walmart, Best Buy
3. When ANY competitor drops below my price:
– Automatically update my Stripe product price to match or beat them
– Use "undercut" strategy (price 1% below lowest competitor)
– Log the repricing decision to SQLite
– Push pricing change report to GitHub

Check prices now and reprice if needed.

Here’s how ChatGPT responds – orchestrating all 7 MCP servers in a single workflow:

Competitive Repricing Agent – Execution Complete

Within 3 minutes, ChatGPT orchestrated all 7 MCP servers to analyse the market and automatically reprice your product.

Repricing Triggered and Executed:

Metric (Before → After):

Your Price: $549.99 → $504.99

Price Change: -$45.00 (-8.2%)

Market Position: 3rd (behind Best Buy) → #1 Lowest

Competitor Price Scan Results:

Retailer: Price (vs. your new price)

Your Store: $504.99 (Market Leader)

Best Buy: $509.99 (+$5.00, you beat them by 1%)

Walmart: $669.00 (+$164.01 higher)

Amazon: $699.00 (+$194.01 higher)

What the Agent did (6 Steps):

Installed SQLite3 and created database schema with 4 tables

Created Stripe product (prod_TZaK0ARRJ5OJJ8) with initial $549.99 price 

Scraped live competitor prices via Firecrawl from Amazon, Best Buy, and Walmart 

Analysed pricing strategy with Sequential Thinking — detected Best Buy at $509.99 below your price

Executed repricing — created new Stripe price at $504.99 (price_1ScRCVI9l1vmUkzn0hTnrLmW)

Pushed audit report to GitHub (commit `64a488aa`)

All data stored on your infrastructure – not OpenAI’s servers. 

To check prices again, simply ask ChatGPT to ‘check MacBook Air M3 competitor prices’—it will scrape, compare, and reprice automatically. Run this check daily, weekly, or whenever you want competitive intelligence.

Explore the Full Demo

View the complete repricing report and audit trail on GitHub: https://github.com/ajeetraina/competitive-repricing-agent-mcp

Want true automation? This demo shows on-demand repricing triggered by conversation. For fully automated periodic checks, you could build a simple scheduler that calls the OpenAI API every few hours to trigger the same workflow, turning this into a hands-free competitive intelligence system.
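As a rough illustration of that idea, the sketch below schedules a daily call to OpenAI’s Responses API with your connector registered as a remote MCP server. Treat the endpoint shape, model name, ngrok URL, and tool fields as assumptions to verify against OpenAI’s current API documentation before relying on them.

# Hedged sketch: trigger the repricing workflow on a schedule via the OpenAI API.
# Assumptions: the Responses API's remote MCP tool type, the model name, and your
# ngrok URL may all differ; check OpenAI's docs before using this in earnest.
cat > reprice-check.sh <<'EOF'
#!/usr/bin/env bash
curl -s https://api.openai.com/v1/responses \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Check MacBook Air M3 competitor prices and reprice if needed.",
    "tools": [{
      "type": "mcp",
      "server_label": "docker-mcp-gateway",
      "server_url": "https://YOUR_NGROK_URL/mcp",
      "require_approval": "never"
    }]
  }'
EOF
chmod +x reprice-check.sh

# Run the check every day at 09:00 via cron.
( crontab -l 2>/dev/null; echo "0 9 * * * $PWD/reprice-check.sh" ) | crontab -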

Wrapping Up

You’ve just connected ChatGPT to Docker MCP Toolkit and configured multiple MCP servers. What used to require context-switching between multiple tools, manual query writing, and hours of debugging now happens through natural conversation, safely executed in Docker containers.

This is the new paradigm for AI-assisted development. ChatGPT isn’t just answering questions anymore. It’s querying your databases, managing your repositories, scraping data, and executing code—all while Docker ensures everything stays secure and contained.

Ready to try it? Open Docker Desktop and explore the MCP Catalog. Start with SQLite, add GitHub, experiment with Firecrawl. Each server unlocks new capabilities.

The future of development isn’t writing every line of code yourself. It’s having an AI partner that can execute tasks across your entire stack securely, reproducibly, and at the speed of thought.

Learn More

New to Docker? Download Docker Desktop

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers

Get Started with MCP Toolkit: Official Documentation

Source: https://blog.docker.com/feed/

Is AI the New Insider Threat?

Insider threats have always been difficult to manage because they blur the line between trusted access and risky behavior. 

With generative AI, these risks aren’t tied to malicious insiders misusing credentials or bypassing controls; they come from well-intentioned employees simply trying to get work done faster. Whether it’s developers refactoring code, analysts summarizing long reports, or marketers drafting campaigns, the underlying motivation is almost always productivity and efficiency.

Unfortunately, that’s precisely what makes this risk so difficult to manage. Employees don’t see themselves as creating security problems; they’re solving bottlenecks. Security is an afterthought at best. 

This gap in perception creates an opportunity for missteps. By the time IT or security teams realize an AI tool has been widely adopted, patterns of risky use may already be deeply embedded in workflows.

Right now, AI use in the workplace is a bit of a free-for-all. And when everyone’s saying “it’s fun” and “everyone’s doing it”, it feels like being back in high school: no one wants to be *that* person telling them to stop because it’s risky. 

But, as security, we do have a responsibility.

In this article, I explore the risks of unmanaged AI use, explain why existing security approaches fall short, and suggest one thing I believe we can do to balance users’ enthusiasm with responsibility (without being the party pooper).

Examples of Risky AI Use

The risks of AI use in the workplace usually fall into one of three categories:

Sensitive data breaches: A single pasted transcript, log, or API key may seem minor, but once outside company boundaries, it’s effectively gone, subject to provider retention and analysis.

Intellectual property leakage: Proprietary code, designs, or research drafts fed into AI tools can erode competitive advantage if they become training data or are exposed via prompt injection.

Regulatory and compliance violations: Uploading regulated data (HIPAA, GDPR, etc.) into unsanctioned AI systems can trigger fines or legal action, even if no breach occurs.

What makes these risks especially difficult is their subtlety. They emerge from everyday workflows, not obvious policy violations, which means they often go unnoticed until the damage is done.

Shadow AI

For years, Shadow IT has meant unsanctioned SaaS apps, messaging platforms, or file storage systems. 

Generative AI is now firmly in this category. 

Employees don’t think that pasting text into a chatbot like ChatGPT introduces a new system to the enterprise. In practice, however, they’re moving data into an external environment with no oversight, logging, or contractual protection.

What’s different about Shadow AI is the lack of visibility: unlike past technologies, it often leaves no obvious logs, accounts, or alerts for security teams to follow. With cloud file-sharing, security teams could trace uploads, monitor accounts created with corporate emails, or detect suspicious network traffic. 

But AI use often looks like normal browser activity. And while some security teams do scan what employees paste into web forms, those controls are limited. 

Which brings us to the real problem: we don’t really have the tools to manage AI use properly. Not yet, at least.

Controls Are Lacking

We all see people trying to get work done faster, and we know we should be putting some guardrails in place, but the options out there are either expensive, complicated, or still figuring themselves out.

The few available AI governance and security tools have clear limitations (even though their marketing might try to convince you otherwise):

Emerging AI governance platforms offer usage monitoring, policy enforcement, and guardrails around sensitive data, but they’re often expensive, complex, or narrowly focused.

Traditional controls like DLP and XDR catch structured data such as phone numbers, IDs, or internal customer records, but they struggle with more subtle, hard-to-detect information: source code, proprietary algorithms, or strategic documents.

Even with these tools, the pace of AI adoption means security teams are often playing catch-up. The reality is that while controls are improving, they rarely keep up with how quickly employees are exploring AI.

Lessons from Past Security Blind Spots

Employees charging ahead with new tools while security teams scramble to catch up is not so different from the early days of cloud file sharing: employees flocked to Dropbox or Google Drive before IT had sanctioned solutions. Or think back to the rise of “bring your own device” (BYOD), when personal phones and laptops started connecting to corporate networks without clear policies in place.

Both movements promised productivity, but they also introduced risks that security teams struggled to manage retroactively.

Generative AI is repeating this pattern, only at a much faster rate. While cloud tools or BYOD require some setup, or at least a decision to connect a personal device, AI tools are available instantly in a browser. The barrier to entry is practically zero. That means adoption can spread through an organization long before security leaders even realize it’s happening. 

And as with cloud and BYOD, the sequence is familiar: employee adoption comes first, controls follow later, and those retroactive measures are almost always costlier, clumsier, and less effective than proactive governance.

So What Can We Do?

Remember: AI-driven insider risk isn’t about bad actors but about good people trying to be productive and efficient. (OK, maybe with a few lazy ones thrown in for good measure.) It’s ordinary rather than malicious behavior that’s unfortunately creating unnecessary exposure. 

That means there’s one measure every organization can implement immediately: educating employees.

Education works best when it’s practical and relatable. Think less “compliance checkbox,” and more “here’s a scenario you’ve probably been in.” That’s how you move from fuzzy awareness to actual behavior change.

Here are three steps that make a real difference:

Build awareness with real examples. Show how pasting code, customer details, or draft plans into a chatbot can have the same impact as posting them publicly. That’s the “aha” moment most people need.

Emphasize ownership. Employees already know they shouldn’t reuse passwords or click suspicious links; AI use should be framed in the same personal-responsibility terms. The goal is a culture where people feel they’re protecting the company, not just following rules.

Set clear boundaries. Spell out which categories of data are off-limits (PII, source code, unreleased products, regulated records) and offer safe alternatives like internal AI sandboxes. Clarity reduces guesswork and removes the temptation of convenience.

Until governance tools mature, these low-friction steps form the strongest defense we have.

If you can enable people to harness AI’s productivity while protecting your critical data, you reduce today’s risks. And you’re better prepared for the regulations and oversight that are certain to follow.

Source: https://blog.docker.com/feed/

Highlights from AWS re:Invent: Supercharging Kiro with Docker Sandboxes and MCP Catalog

At the recent AWS re:Invent, Docker focused on a very real developer problem: how to run AI agents locally without giving them access to your machine, credentials, or filesystem.

With AWS introducing Kiro, Docker demonstrated how Docker Sandboxes and MCP Toolkit allow developers to run agents inside isolated containers, keeping host environments and secrets out of reach. The result is a practical setup where agents can write code, run tests, and use tools safely, while you stay focused on building, not cleaning up accidental damage.

Local AI Agents, Isolation, and Docker at AWS re:Invent

Two weeks ago, a Reddit user posted how their filesystem was accidentally deleted by Google Antigravity. And the top comment?

Alright no more antigravity outside of a container

And another user’s home directory was recently wiped using Claude Code this past week. And yet another top comment:

That’s exactly why Claude code should be used only inside an isolated container or vm

We agree that this should never happen and that containers provide the proper isolation and segmentation.

At AWS re:Invent 2025, we were able to show off this vision using Kiro running in our new Docker sandboxes, using MCP servers provided by the Docker MCP Toolkit. 

If you weren’t able to attend or visit us at the booth, fear not! I’ll share the demo with you.

Jim Clark, one of Docker’s Principal Engineers, giving a demo of a secure AI development environment using Docker’s sandboxes and MCP Toolkit

Giving Kiro safety guardrails

Docker Sandboxes provide the ability to run an agent inside an isolated environment using containers. In this environment, the agent has no access to credentials stored on the host and can only access the files of the specified project directory.

As an example, I have some demo AWS credentials on my machine:

> cat ~/.aws/credentials
[default]
aws_access_key_id=demo_access_key
aws_secret_access_key=demo_secret_key

Now, I’m going to clone the Catalog Service demo project and start a sandbox using Kiro:

git clone https://github.com/dockersamples/catalog-service-node.git
cd catalog-service-node
docker sandbox run --mount-docker-socket kiro

The --mount-docker-socket flag is added to give the sandbox the Docker socket, which will allow the agent to run my integration tests that use Testcontainers.

On the first launch, I will be required to authenticate. After that’s done, I will ask Kiro to tell me about the AWS credentials it has access to:

(Kiro ASCII art banner)
Model: Auto (/model to change) | Plan: KIRO FREE (/usage for more detail)

!> Tell me about the AWS credentials you have access to

From here, Kiro will search the typical places AWS credentials are configured. But, finally, it reaches the following conclusion:

Currently, there are no AWS credentials configured on your system

And why is this? The credentials on the host are not accessible inside the sandbox environment. The agent is in the isolated environment and only has access to the current project directory.
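The same principle is easy to demonstrate with a plain container (not the sandbox itself, but the same isolation model): only what you explicitly mount is visible, so host credentials simply aren’t there.

# Illustration of the isolation principle with a plain container:
# only the project directory is mounted, so ~/.aws from the host doesn't exist inside.
docker run --rm -v "$(pwd)":/workspace -w /workspace node:lts-slim \
  sh -c 'ls ~/.aws 2>/dev/null || echo "No AWS credentials visible inside the container"'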

Giving Kiro secure tools with the MCP Toolkit

If we take a step back and think about it, the only credential an agent should have access to is to authenticate with the model provider. All other credentials belong to the tools (or MCP servers) around the agent.

And that’s where the MCP Toolkit comes in!

Sandboxes don’t yet have an automatic way to connect to the MCP Toolkit (it’s coming soon!). Until that’s available, I will start an MCP Gateway with the following command:

docker mcp gateway run --transport=streaming

There are a variety of ways to configure Kiro with MCP servers, but the project-level configuration provides an easy way that also works with sandboxes.

In the project, I will create a .kiro/settings/mcp.json file with the following contents:

{
  "mcpServers": {
    "docker-mcp-toolkit": {
      "type": "http",
      "url": "http://host.docker.internal:8811/"
    }
  }
}

After restarting Kiro, I can ask it about the available tools:

/tools

The output then shows the following tools coming from the MCP Gateway:

docker-mcp-toolkit (MCP)
– code-mode trusted
– mcp-add trusted
– mcp-config-set trusted
– mcp-create-profile trusted
– mcp-exec trusted
– mcp-find trusted
– mcp-remove trusted

These tools highlight the ability to dynamically add and remove MCP servers through the Gateway. 

By using an AGENTS.md file, I can tell the agent which MCP servers to use and give it an additional nudge to remove those servers when they’re no longer needed (which reduces my context size and costs). This is what my file looks like:

# Special instructions

– When info is needed from GitHub, use the `github-official` MCP server. This will ensure proper auth tokens are used.
– When adding MCP servers, be sure to activate all tools.
– When you have the required info from any MCP server, remove the MCP server to reduce the number of tools in the context.

Before Kiro can use the GitHub MCP server, I need to ensure I’ve configured the MCP server with my OAuth credentials.

Screenshot of the Docker Desktop dashboard showing the GitHub Official MCP server listing within the MCP Toolkit. Specifically, it shows the OAuth configuration has been completed.

Putting it all together

With all of the setup completed, I’m ready to have Kiro do some work for me. I’ll ask it to complete one of the issues on GitHub:

> Can you help me complete the work for issue #64?

Watching the output, I’ll first see Kiro add the github-official MCP server by using the mcp-add tool.

Running tool mcp-add with the param (from mcp server: docker-mcp-toolkit)
⋮ {
⋮ "name": "github-official",
⋮ "activate": true
⋮ }
– Completed in 0.609s

From there, I’ll see that Kiro will check the git remotes to determine the GitHub organization and repository being used:

> I need to determine the repository owner and name. Let me check the current directory for git information:
I will run the following command: git remote -v (using tool: shell)
Purpose: Get git remote information to identify the repository

origin https://github.com/dockersamples/catalog-service-node.git (fetch)
origin https://github.com/dockersamples/catalog-service-node.git (push)
– Completed in 0.6s

Then, it will use the GitHub MCP server to get the issue details:

> Now I can get the issue details:
Running tool issue_read with the param (from mcp server: docker-mcp-toolkit)
⋮ {
⋮ "method": "get",
⋮ "owner": "dockersamples",
⋮ "repo": "catalog-service-node",
⋮ "issue_number": 64
⋮ }
– Completed in 0.613s

> Perfect! Issue #64 is about adding product metadata to S3 objects.

From here, Kiro will go and make the changes to my source code, update the integration tests, and even run the integration tests to ensure it all works. After it’s completed, I have the following output:

I've successfully completed the work for issue #64. Here's what was implemented:

Changes Made:

1. StorageService.js:
– Added Tagging parameter to S3 PutObjectCommand with product_id=${id}
– Added Metadata parameter with product_name: productName
– Added new AWS SDK imports for GetObjectTaggingCommand and HeadObjectCommand
– Created helper functions getObjectTags() and getObjectMetadata()

2. ProductService.js:
– Modified uploadProductImage() to fetch product name from database and pass it to uploadFile()
– Exposed new functions getProductImageTags() and getProductImageMetadata()

3. Integration Test:
– Added validation to verify S3 objects have correct tags (product_id with the product ID)
– Added validation to verify S3 objects have correct metadata (product_name with the product name)

Test Results:
All integration tests pass, confirming that:
– S3 objects are tagged with the product identifier
– S3 objects include metadata with the product name
– The existing functionality remains intact

And notably, right before that, I see that Kiro automatically removed the GitHub MCP server to remove its tools from my context:

Let me clean up by removing the GitHub MCP server since we're done with it:
Running tool mcp-remove with the param (from mcp server: docker-mcp-toolkit)
⋮ {
⋮ "name": "github-official"
⋮ }
– Completed in 0.2s

With those tools removed from the context, the model has less to tokenize and process which means faster responses and less cost.

Highlighting what’s important

Taking a step back, here’s what we ran:

An agent in an isolated environment. With the agent running in a container, it’s unable to access and leak credentials stored on my host machine. And rogue requests to delete my filesystem are limited to the containerized environment where it’s running as a non-root user.

Isolated and containerized MCP servers. Each MCP server runs in its isolated container, preventing host access. In addition, I don’t have to spend any time worrying about runtime environments or configuration. With a container, “it just works!”

API credentials only where they’re needed. The only component that needs access to my GitHub credential is the GitHub MCP server, where it is securely injected. This approach further prevents potential leaks and exposures.

In other words, we have a microserviced architecture where each component runs in its own container and follows least privilege by having access to only the things it needs access to.

Looking forward

Here at Docker, we’re quite excited about this architecture and there’s still a lot to do. Two items I’m excited about include:

A network boundary for agentic workloads. This boundary would limit network access to only authorized hostnames. Then, if a prompt injection tries to send sensitive information to evildomain.com, that request is blocked.

Governance and control for organizations. With this, your organization can authorize the MCP servers that are used and even create its own custom catalogs and rule sets.

If you want to try out Sandboxes, you can do so by enabling the Experimental Feature in Docker Desktop 4.50+. We’d love to hear your feedback and thoughts!

Learn more 

Docker Sandboxes: Simplifies running AI agents securely on your local machine

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers.

Get started with the MCP Toolkit: Run MCP servers easily and securely.

Source: https://blog.docker.com/feed/

Breaking Free From AI Vendor Lock-in: Integrating GitHub Models with Docker cagent

The landscape of AI development is rapidly evolving, and one of the most exciting developments from Docker in 2025 is the release of Docker cagent. In this article, we’ll explore how cagent’s integration with GitHub Models delivers true vendor independence, demonstrate building a real-world podcast generation agent that leverages multiple specialized sub-agents, and show you how to package and distribute your AI agents through Docker Hub. By the end, you’ll understand how to break free from vendor lock-in and build AI agent systems that remain flexible, cost-effective, and production-ready throughout their entire lifecycle.

What is Docker cagent?

cagent is Docker’s open-source multi-agent runtime that orchestrates AI agents through declarative YAML configuration. Rather than managing Python environments, SDK versions, and orchestration logic, developers define agent behavior in a single configuration file and execute it with “cagent run”. 

Some of the key features of Docker cagent:

Declarative YAML Configuration: single-file agent definitions with model configuration, clear instructions, tool access, and delegation rules to interact and coordinate with sub-agents

Multi-Provider Support: OpenAI, Anthropic, Google Gemini, and Docker Model Runner (DMR) for local inference. 

MCP Integration support: Leverage MCP (Stdio, HTTP, SSE) for connecting external tools and services

Secured Registry Distribution: Package and share agents securely via Docker Hub using standard container registry infrastructure.

Built-In Reasoning Tools: “think”, “todo”, and “memory” capabilities for complex problem-solving workflows.

The core value proposition is simple: declare what your agent should do, and cagent handles the execution. Each agent operates with isolated context, specialized tools via the Model Context Protocol (MCP), and configurable models. Agents can delegate tasks to sub-agents, creating hierarchical teams that mirror human organizational structures.

What are GitHub Models?

GitHub Models is a suite of developer tools that takes you from AI idea to deployment, including a model catalog, prompt management, and quantitative evaluations. It provides rate-limited free access to production-grade language models from OpenAI (GPT-4o, GPT-5, o1-preview), Meta (Llama 3.1, Llama 3.2), Microsoft (Phi-3.5), and DeepSeek. The advantage of GitHub Models is that you authenticate only once, via a GitHub personal access token, and can then plug in any model the platform supports.

You can browse to GitHub Marketplace at https://github.com/marketplace to see the list of all models supported. Currently GitHub supports all the popular models and the list continues to grow. Recently, Anthropic Claude models were also added.

Figure 1.1: GitHub Marketplace displaying list of all models available on the platform

GitHub has designed its platform, including GitHub Models and GitHub Copilot agents, to support production-level agentic AI workflows, offering the necessary infrastructure, governance, and integration points. GitHub Models employs a number of content filters that cannot be turned off as part of the GitHub Models experience. If you decide to use models through Azure AI or a paid service, configure your content filters to meet your requirements.

To get started with GitHub Models, visit https://docs.github.com/en/github-models/quickstart which contains detailed quick start guides. 

Configuring cagent with GitHub Models

GitHub Models OpenAI-compatible API allows straightforward integration with cagent by treating it as a custom OpenAI provider with modified base URL and authentication.
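Because the endpoint is OpenAI-compatible, you can exercise it with a plain chat completion call using only your PAT. This is a hedged sketch: the model ID and endpoint path are assumptions to check against the GitHub Models quickstart if anything returns an error.

# Hedged sketch: GitHub Models via its OpenAI-compatible endpoint, authenticated with a PAT.
# The model ID and endpoint path may change; see the GitHub Models quickstart if this fails.
curl -s https://models.github.ai/inference/chat/completions \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'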

In this article, we will create and deploy a Podcast Generator agent using GitHub Models and show you how easy it is to share AI agents by deploying them to the Docker Hub registry. You will need a fine-grained personal access token, which you can create at https://github.com/settings/personal-access-tokens/new

Figure 1.2: Generating a new personal access token (PAT) from GitHub developer settings.

Prerequisites

Docker Desktop 4.49+ with MCP Toolkit enabled

GitHub Personal Access Token with the models scope

cagent binary downloaded from the https://github.com/docker/cagent repository. Place it inside a folder such as C:\Docker\cagent. Run .\cagent.exe --help to see more options.

Define your agent

I will showcase a simple podcast generator agent, which I created months ago while testing Docker cagent. Its purpose is to generate podcasts from blogs, articles, or YouTube videos. The Podcastgenerator YAML file below describes a multi-agent workflow for automated podcast production, leveraging GitHub Models and MCP tools (DuckDuckGo) for external data access. The DuckDuckGo MCP server runs in an isolated Docker container managed by the MCP Gateway; to learn more about the Docker MCP server and MCP Gateway, refer to the official product documentation at https://docs.docker.com/ai/mcp-catalog-and-toolkit/mcp-gateway/. The root agent uses sub_agents: ["researcher", "scriptwriter"] to create a hierarchical structure where specialized agents handle domain-specific tasks.

sunnynagavo55_podcastgenerator.yaml

#!/usr/bin/env cagent run

agents:
  root:
    description: "Podcast Director - Orchestrates the entire podcast creation workflow and generates text file"
    instruction: |
      You are the Podcast Director responsible for coordinating the entire podcast creation process.

      Your workflow:
      1. Analyze input requirements (topic, length, style, target audience)
      2. Delegate research to the research agent which can open duck duck go browser for researching
      3. Pass the researched information to the scriptwriter for script creation
      4. Output is generated as a text file which can be saved to file or printed out
      5. Ensure quality control throughout the process

      Always maintain a professional, engaging tone and ensure the final podcast meets broadcast standards.
    model: github-model
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
    sub_agents: ["researcher", "scriptwriter"]
  researcher:
    model: github-model
    description: "Podcast Researcher - Gathers comprehensive information for podcast content"
    instruction: |
      You are an expert podcast researcher who gathers comprehensive, accurate, and engaging information.

      Your responsibilities:
      - Research the given topic thoroughly using web search
      - Find current news, trends, and expert opinions
      - Gather supporting statistics, quotes, and examples
      - Identify interesting angles and story hooks
      - Create detailed research briefs with sources
      - Fact-check information for accuracy

      Always provide well-sourced, current, and engaging research that will make for compelling podcast content.
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=duckduckgo"]
  scriptwriter:
    model: github-model
    description: "Podcast Scriptwriter - Creates engaging, professional podcast scripts"
    instruction: |
      You are a professional podcast scriptwriter who creates compelling, conversational content.

      Your expertise:
      - Transform research into engaging conversational scripts
      - Create natural dialogue and smooth transitions
      - Add hooks, sound bite moments, and calls-to-action
      - Structure content with clear intro, body, and outro
      - Include timing cues and production notes
      - Adapt tone for target audience and podcast style
      - Create multiple format options (interview, solo, panel discussion)

      Write scripts that sound natural when spoken and keep listeners engaged throughout.
    toolsets:
      - type: mcp
        command: docker
        args: ["mcp", "gateway", "run", "--servers=filesystem"]

models:
  github-model:
    provider: openai
    model: openai/gpt-5
    base_url: https://models.github.ai/inference
    env:
      OPENAI_API_KEY: ${GITHUB_TOKEN}

Note: Since we are using the DuckDuckGo MCP server, make sure to add and install this MCP server from the MCP Catalog in Docker Desktop.

Running your Agent on Local Machine

Make sure your GitHub PAT is set as the GITHUB_TOKEN environment variable, then run the command below from the root folder where your cagent binary resides.

cagent run ./sunnynagavo55_podcastgenerator.yaml
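If you haven't set the token yet, here is a minimal sketch for a bash-style shell (on PowerShell, use $env:GITHUB_TOKEN instead); the value below is a placeholder for your fine-grained PAT with the models scope:

export GITHUB_TOKEN=<your-fine-grained-pat>
cagent run ./sunnynagavo55_podcastgenerator.yaml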

Pushing your Agent as Docker Image

Run the command below to push your agent as a Docker image to your favorite registry and share it with your team.

cagent push Sunnynagavo55/Podcastgenerator
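If the push fails with an authentication error, log in to Docker Hub first. This assumes cagent reuses your local Docker registry credentials:

docker login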

You can see your published image inside your Docker Hub repositories.

Congratulations! Now we have our first AI Agent created using cagent and deployed to Docker Hub.

Pulling your Agent as Docker Image on a different machine

Run the command below to pull an agent created by your teammate as a Docker image; this fetches the agent YAML file and saves it in the current directory.

cagent pull Sunnynagavo55/Podcastgenerator

Alternatively, you can run the same agent directly, without pulling it first, by using the command below.

cagent run Sunnynagavo55/Podcastgenerator

Note: The Podcastgenerator example agent above has been added to the docker/cagent GitHub repository under the examples folder. Give it a try and share your experience: https://github.com/docker/cagent/blob/main/examples/podcastgenerator_githubmodel.yaml

Conclusion

The traditional AI development workflow locks you into specific providers, requiring separate API keys, managing multiple billing accounts, and navigating vendor-specific SDKs. cagent with GitHub Models fundamentally changes this equation by combining Docker’s declarative agent framework with GitHub’s unified model marketplace. This integration grants you true vendor independence—a single GitHub Personal Access Token provides access to models from OpenAI, Meta, Microsoft, Anthropic, and DeepSeek, eliminating the friction of managing multiple credentials and authentication schemes.

The future of AI development isn’t about choosing a vendor and committing to their ecosystem. Instead, it’s about building systems flexible enough to adapt as the landscape evolves, new models emerge, and your business requirements change. cagent and GitHub Models make that architectural freedom possible today.

What are you waiting for? Start building now with the power of cagent and GitHub Models and share your story with us.

Resources

To learn more about Docker cagent, read the product documentation at https://docs.docker.com/ai/cagent/

For more information about cagent, see the GitHub repository. Give this repository a star and let us know what you build.

Source: https://blog.docker.com/feed/

Docker Model Runner now supports vLLM on Windows

Great news for Windows developers working with AI models: Docker Model Runner now supports vLLM on Docker Desktop for Windows with WSL2 and NVIDIA GPUs!

Until now, vLLM support in Docker Model Runner was limited to Docker Engine on Linux. With this update, Windows developers can take advantage of vLLM’s high-throughput inference capabilities directly through Docker Desktop, leveraging their NVIDIA GPUs for accelerated local AI development.

What is Docker Model Runner?

For those who haven’t tried it yet, Docker Model Runner is our new “it just works” experience for running generative AI models.

Our goal is to make running a model as simple as running a container.

Here’s what makes it great:

Simple UX: We’ve streamlined the process down to a single, intuitive command: docker model run <model-name>.

Broad GPU Support: While we started with NVIDIA, we’ve recently added Vulkan support. This is a big deal—it means Model Runner works on pretty much any modern GPU, including AMD and Intel, making AI accessible to more developers than ever.

vLLM: Perform high-throughput inference with an NVIDIA GPU

What is vLLM?

vLLM is a high-throughput inference engine for large language models. It’s designed for efficient memory management of the KV cache and excels at handling concurrent requests with impressive performance. If you’re building AI applications that need to serve multiple requests or require high-throughput inference, vLLM is an excellent choice. Learn more in the vLLM project documentation.

Prerequisites

Before getting started, make sure you have the prerequisites for GPU support:

Docker Desktop for Windows (starting with Docker Desktop 4.54)

WSL2 backend enabled in Docker Desktop

NVIDIA GPU with compute capability >= 8.0 and up-to-date drivers

GPU support configured in Docker Desktop

Getting Started

Step 1: Enable Docker Model Runner

First, ensure Docker Model Runner is enabled in Docker Desktop. You can do this through the Docker Desktop settings or via the command line:

docker desktop enable model-runner --tcp 12434

Step 2: Install the vLLM Backend

In order to be able to use vLLM, install the vLLM runner with CUDA support:

docker model install-runner --backend vllm --gpu cuda

Step 3: Verify the Installation

Check that both inference engines are running:

docker model status

You should see output similar to:

Docker Model Runner is running

Status:
llama.cpp: running llama.cpp version: c22473b
vllm: running vllm version: 0.12.0

Step 4: Run a Model with vLLM

Now you can pull and run models optimized for vLLM. Models with the -vllm suffix on Docker Hub are packaged for vLLM:

docker model run ai/smollm2-vllm "Tell me about Docker."
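Because Model Runner exposes an OpenAI-compatible API (on port 12434 if you enabled TCP access in Step 1), you can also call the model from any HTTP client. Here is a minimal sketch, assuming the default /engines/v1 path; check the Model Runner documentation if your endpoint differs:

curl http://localhost:12434/engines/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ai/smollm2-vllm", "messages": [{"role": "user", "content": "Tell me about Docker."}]}'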

Troubleshooting Tips

GPU Memory Issues

If you encounter an error like:

ValueError: Free memory on device (6.96/8.0 GiB) on startup is less than desired GPU memory utilization (0.9, 7.2 GiB).

You can configure the GPU memory utilization for a specific model:

docker model configure --gpu-memory-utilization 0.7 ai/smollm2-vllm

This reduces the memory footprint, allowing the model to run alongside other GPU workloads.
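To see how much GPU memory is actually free before tuning this value, you can query the NVIDIA driver directly (assuming nvidia-smi is available in your WSL2 environment):

nvidia-smi --query-gpu=memory.total,memory.free --format=csv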

Why This Matters

This update brings several benefits for Windows developers:

Production parity: Test with the same inference engine you’ll use in production

Unified workflow: Stay within the Docker ecosystem you already know

Local development: Keep your data private and reduce API costs during development

How You Can Get Involved

The strength of Docker Model Runner lies in its community, and there’s always room to grow. We need your help to make this project the best it can be. To get involved, you can:

Star the repository: Show your support and help us gain visibility by starring the Docker Model Runner repo.

Contribute your ideas: Have an idea for a new feature or a bug fix? Create an issue to discuss it. Or fork the repository, make your changes, and submit a pull request. We’re excited to see what ideas you have!

Spread the word: Tell your friends, colleagues, and anyone else who might be interested in running AI models with Docker.

We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!
Source: https://blog.docker.com/feed/

Docker Joins the Agentic AI Foundation

Today, the Linux Foundation launched the Agentic AI Foundation with three founding projects: Anthropic’s Model Context Protocol (MCP), Block’s goose agent framework, and OpenAI’s AGENTS.md standard.

The foundation brings together the companies building the infrastructure layer for agents: Anthropic, Block, OpenAI, Amazon, Google, Microsoft, Cloudflare, and Bloomberg, alongside key tooling and platform companies. 

Docker is joining as a Gold member.

From Open Source to Production

The timing reflects how quickly the space has matured. A year ago, MCP launched as an open source project from Anthropic, solving a specific problem: how AI systems connect to tools and data. It’s now running on 10,000+ public servers and adopted across Claude, ChatGPT, Cursor, Copilot, VS Code, and Gemini.

Six months ago, companies started deploying agents that take real actions: triggering builds, accessing databases, modifying infrastructure, executing workflows. That shift from prototype to production created new questions around protocols and governance.

Today, foundational protocols that helped answer those questions, protocols like MCP, are moving to the Linux Foundation under the same governance structure that stewards Linux and PyTorch.

Why Neutral Governance Matters

When infrastructure becomes critical, developers won’t build on protocols that could change arbitrarily. And larger teams and enterprises want shared standards.

Over the past year we’ve partnered with Anthropic, Block, and other key players in the AI ecosystem to help create and embrace standards like MCP, Goose, and AGENTS.md. The Agentic AI Foundation creates a structure for the industry to unite behind these standards, building an ecosystem of interoperable tools that benefit developers.

Docker is excited to join as an active Gold member to drive innovation in developer-first, secure tools across our ecosystem.

What Happens Next

The protocols exist. Adoption is happening. The foundation ensures these protocols evolve transparently, with input from everyone building on them.

Docker helped build that structure for applications. Now we’re doing it for agents.

Learn more at aaif.io
Source: https://blog.docker.com/feed/

From Compose to Kubernetes to Cloud: Designing and Operating Infrastructure with Kanvas

Docker has long been the simplest way to run containers. Developers start with a docker-compose.yml file, run docker compose up, and get things running fast.

As teams grow and workloads expand into Kubernetes and integrate into cloud services, simplicity fades. Kubernetes has become the operating system of the cloud, but your clusters rarely live in isolation. Real-world platforms are a complex intermixing of proprietary cloud services – AWS S3 buckets, Azure Virtual Machines, Google Cloud SQL databases – all running alongside your containerized workloads. You and your teams are working with clusters and clouds in a sea of YAML.

Managing this hybrid sprawl often means context switching between Docker Desktop, the Kubernetes CLI, cloud provider consoles, and infrastructure as code. Simplicity fades as you juggle multiple distinct tools.

Bringing clarity back from this chaos is the new Docker Kanvas Extension from Layer5 – a visual, collaborative workspace built right into Docker Desktop that allows you to design, deploy, and operate not just Kubernetes resources, but your entire cloud infrastructure across AWS, GCP, and Azure.

What Is Kanvas?

Kanvas is a collaborative platform designed for engineers to visualize, manage, and design multi-cloud and Kubernetes-native infrastructure. Kanvas transforms the concept of infrastructure as code into infrastructure as design. This means your architecture diagram is no longer just documentation – it is the source of truth that drives your deployment. Built on top of Meshery (one of the Cloud Native Computing Foundation’s highest-velocity open source projects), Kanvas moves beyond simple Kubernetes manifests by using Meshery Models – definitions that describe the properties and behavior of specific cloud resources. This allows Kanvas to support a massive catalog of Infrastructure-as-a-Service (IaaS) components: 

AWS: 55+ services (e.g., EC2, Lambda, RDS, DynamoDB).

Azure: 50+ components (e.g., Virtual Machines, Blob Storage, VNet).

GCP: 60+ services (e.g., Compute Engine, BigQuery, Pub/Sub).

Kanvas bridges the gap between abstract architecture and concrete operations through two integrated modes: Designer and Operator.

Designer Mode (declarative mode)

Designer mode serves as a “blueprint studio” for cloud architects and DevOps teams, emphasizing declarative modeling – describing what your infrastructure should look like rather than how to build it step-by-step – making it ideal for GitOps workflows and team-based planning. 

Build and iterate collaboratively: Add annotations, comments for design reviews, and connections between components to visualize data flows, architectures, and relationships.

Dry-run and validate deployments: Before touching production, simulate your deployments by performing a dry-run to verify that your configuration is valid and that you have the necessary permissions. 

Import and export: Bring brownfield designs in by connecting your existing clusters or importing Helm charts from your GitHub repositories.

Reuse patterns, clone, and share: Pick from a catalog of reference architectures, sample configurations, and infrastructure templates, so you can start from proven blueprints rather than a blank design. Share designs just as you would a Google Doc. Clone designs just as you would a GitHub repo. Merge designs just as you would in a pull request.

Operator Mode (imperative mode)

Kanvas Operator mode transforms static diagrams into live, managed infrastructure. When you switch to Operator mode, Kanvas stops being a configuration tool and becomes an active infrastructure console, using Kubernetes controllers (like AWS Controllers for Kubernetes (ACK) or Google Config Connector) to actively manage your designs.

Operator mode allows you to:

Load testing and performance management: With Operator’s built-in load generator, you can execute stress tests and characterize service behavior by analyzing latency and throughput against predefined performance profiles, establishing baselines to measure the impact of infrastructure configuration changes made in Designer mode.

Multi-player, interactive terminal: Open a shell session with your containers and execute commands, stream and search container logs without leaving the visual topology. Streamline your troubleshooting by sharing your session with teammates. Stay in-context and avoid context-switching to external command-line tools like kubectl.

Integrated observability: Use the Prometheus integration to overlay key performance metrics (CPU usage, memory, request latency) and quickly spot “hotspots” in your architecture visually. Import your existing Grafana dashboards for deeper analysis.

Multi-cluster, multi-cloud operations: Connect multiple Kubernetes clusters (across different clouds or regions) and manage workloads that span a GKE cluster and an EKS cluster in a single topology view, all from a single Kanvas interface.

While Kanvas Designer mode is about intent (what you want to build), Operator mode is about reality (what is actually running). Designer mode and Operator mode are simply two tightly integrated sides of the same coin. 

With this understanding, let’s see both modes in-action in Docker Desktop.

Walk-Through: From Compose to Kubernetes in Minutes

With the Docker Kanvas extension (install from Docker Hub), you can take any existing Docker Compose file and instantly see how it translates into Kubernetes, making it incredibly easy to understand, extend, and deploy your application at scale.

The Docker Samples repository offers a plethora of samples. Let’s use the Spring-based PetClinic example below. 

# sample docker-compose.yml

services:
  petclinic:
    build:
      context: .
      dockerfile: Dockerfile.multi
      target: development
    ports:
      - 8000:8000
      - 8080:8080
    environment:
      - SERVER_PORT=8080
      - MYSQL_URL=jdbc:mysql://mysqlserver/petclinic
    volumes:
      - ./:/app
    depends_on:
      - mysqlserver

  mysqlserver:
    image: mysql:8
    ports:
      - 3306:3306
    environment:
      - MYSQL_ROOT_PASSWORD=
      - MYSQL_ALLOW_EMPTY_PASSWORD=true
      - MYSQL_USER=petclinic
      - MYSQL_PASSWORD=petclinic
      - MYSQL_DATABASE=petclinic
    volumes:
      - mysql_data:/var/lib/mysql
      - mysql_config:/etc/mysql/conf.d

volumes:
  mysql_data:
  mysql_config:
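Before importing, you can optionally sanity-check the file locally. A quick validation pass, assuming Docker Compose v2 is installed, prints the normalized configuration and surfaces any syntax errors:

docker compose -f docker-compose.yml config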

With your Docker Kanvas extension installed:

Import sample app: Save the PetClinic docker-compose.yml file to your computer, then click to import or drag and drop the file onto Kanvas.

Kanvas renders an interactive topology of your stack showing services, dependencies (like MySQL), volumes, ports, and configurations, all mapped to their Kubernetes equivalents. Kanvas performs this rendering in phases, applying an increasing degree of scrutiny in the evaluation performed in each phase. Let’s explore the specifics of this tiered evaluation process in a moment.

Enhance the PetClinic design

From here, you can enhance the generated design in a visual, no-YAML way:

Add a LoadBalancer, Ingress, or ConfigMap

Configure Secrets for your database URL or sensitive environment variables

Modify service relationships or attach new components

Add comments or any other annotations.

Importantly, Kanvas saves your design as you make changes. This gives you production-ready deployment artifacts generated directly from your Compose file.

Deploy to a cluster

With one click, deploy the design to any cluster connected to Docker Desktop or any other remote cluster. Kanvas handles the translation and applies your configuration.
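Once the deployment completes, you can cross-check what Kanvas applied from the command line. A minimal sketch, assuming kubectl is pointed at the same cluster and the design landed in the default namespace:

kubectl get deployments,services,pods -n default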

Switch modes and interact with your app

After deploying (or when managing an existing workload), switch to Operator mode to observe and manage your deployed design. You can:

Inspect Deployments, Services, Pods, and their relationships.

Open a terminal session with your containers for quick debugging.

Tail and search your container logs and monitor resource metrics.

Generate traffic and analyze the performance of your deployment under heavy load.

Share your Operator view with teammates for collaborative management.

Within minutes, a Compose-based project becomes a fully managed Kubernetes workload, all without leaving Docker Desktop. This seamless flow from a simple Compose file to a fully managed, operable workload highlights the ease with which infrastructure can be visually managed, leading us to the underlying principle of infrastructure as design.

Infrastructure as Design

Infrastructure as design elevates the visual layout of your stack to be the primary driver of its configuration: adjusting the proximity and connectedness of components is one and the same as configuring your infrastructure. In other words, the presence, absence, proximity, or connectedness of individual components (all of which affect how one component relates to another) augments the underlying configuration of each. Kanvas is highly intelligent in this way, understanding at a granular level how each individual component relates to all other components, and it augments the configuration of those components accordingly.

Kanvas renders the topology of your stack’s architecture in phases. The initial rendering involves a lightweight analysis of each component, establishing a baseline for the contents of your new design. A subsequent phase applies a higher level of sophistication as Kanvas introspects the configuration of each of your stack’s components and their interdependencies, and proactively evaluates how each component relates to the others. Kanvas will add, remove, and update the configuration of your components as a result of this relationship evaluation.

This process of relationship evaluation is ongoing. Every time you make a change to your design, Kanvas re-evaluates each component configuration.

To offer an example, if you bring a Kubernetes Deployment into the vicinity of a Kubernetes Namespace, you will find that one magnetizes to the other: your Deployment is visually placed inside the Namespace and, at the same time, the Deployment’s configuration is mutated to include its new Namespace designation. Kanvas proactively evaluates and mutates the configuration of the infrastructure resources in your design as you make changes.

This ability for Kanvas to intelligently interpret and adapt to changes in your design—automatically managing configuration and relationships—is the key to achieving infrastructure as design. This power comes from a sophisticated system that gives Kanvas a level of intelligence, but with the reliability of a policy-driven engine.

AI-like Intelligence, Anchored by Deterministic Truth

In an era where generative AI dramatically accelerates infrastructure design, the risk of “hallucinations”—plausible but functionally invalid configurations—remains a critical bottleneck. Kanvas solves this by pairing the generative power of AI with a rigid, deterministic policy engine.

This engine acts as an architectural guardrail, offering you precise control over the degree to which AI is involved in assessing configuration correctness. It transforms designs from simple visual diagrams into validated, deployable blueprints.

While AI models function probabilistically, Kanvas’s policy engine functions deterministically, automatically analyzing designs to identify, validate, and enforce connections between components based on ground-truth rules. Each of these rules is statically defined and versioned in its respective Kanvas model.

Deep Contextualization: The evaluation goes beyond simple visualization. It treats relationships as context-aware and declarative, interpreting how components interact (e.g., data flows, dependencies, or resource sharing) to ensure designs are not just imaginative, but deployable and compliant.

Semantic Rigor: The engine distinguishes between semantic relationships (infrastructure-meaningful, such as a TCP connection that auto-configures ports) and non-semantic relationships (user-defined visuals, like annotations). This ensures that aesthetic choices never compromise infrastructure integrity.

Kanvas acknowledges that trust is not binary. You maintain sovereignty over your designs through granular controls that dictate how the engine interacts with AI-generated suggestions:

“Human-in-the-Loop” Slider: You can modulate the strictness of the policy evaluation. You might allow the AI to suggest high-level architecture while enforcing strict policies on security configurations (e.g., port exposure or IAM roles).

Selective Evaluation: You can disable evaluations via preferences for specific categories. For example, you may trust the AI to generate a valid Kubernetes Service definition, but rely entirely on the policy engine to validate the Ingress controller linking to it.

Kanvas does not just flag errors; it actively works to resolve them using sophisticated detection and correction strategies.

Intelligent Scanning: The engine scans for potential relationships based on component types, kinds, and subtypes (e.g., a Deployment linking to a Service via port exposure), catching logical gaps an AI might miss.

Patches and Resolvers: When a partial or hallucinated configuration is detected, Kanvas either applies patches to propagate missing configuration or dynamically adjusts configurations to resolve conflicts, ensuring the final infrastructure-as-code export (e.g., Kubernetes manifests, Helm chart) is clean, versionable, and secure.

Turn Complexity into Clarity

Kanvas takes the guesswork out of managing modern infrastructure. For developers used to Docker Compose, it offers a natural bridge to Kubernetes and cloud services — with visibility and collaboration built in.

Kanvas capabilities and how they help you:

Import and Deploy Compose Apps: Move from Compose, Helm, or Kustomize to Kubernetes in minutes.

Visual Designer: Understand your architecture through connected, interactive diagrams.

Design Catalog: Use ready-made templates and proven infrastructure patterns.

Terminal Integration: Debug directly from the Kanvas UI, without switching tools.

Sharable Views: Collaborate on live infrastructure with your team.

Multi-Environment Management: Operate across local, staging, and cloud clusters from one dashboard.

Kanvas brings visual design and real-time operations directly into Docker Desktop. Import your Compose files, Kubernetes Manifests, Helm Charts, and Kustomize files to explore the catalog of ready-to-use architectures, and deploy to Kubernetes in minutes — no YAML wrangling required.

Designs can also be exported in a variety of formats, including as OCI-compliant images and shared through registries like Docker Hub, GitHub Container Registry, or AWS ECR — keeping your infrastructure as design versioned and portable.

Install the Kanvas Extension from Docker Hub and start designing your infrastructure today.
Source: https://blog.docker.com/feed/

Announcing vLLM v0.12.0, Ministral 3 and DeepSeek-V3.2 for Docker Model Runner

At Docker, we are committed to making the AI development experience as seamless as possible. Today, we are thrilled to announce two major updates that bring state-of-the-art performance and frontier-class models directly to your fingertips: the immediate availability of Mistral AI’s Ministral 3 and DeepSeek-V3.2, alongside the release of vLLM v0.12.0 on Docker Model Runner.

Whether you are building high-throughput serving pipelines or experimenting with edge-optimized agents on your laptop, today’s updates are designed to accelerate your workflow.

Meet Ministral 3: Frontier Intelligence, Edge Optimized

While vLLM powers your production infrastructure, we know that development needs speed and efficiency right now. That’s why we are proud to add Mistral AI’s newest marvel, Ministral 3, to the Docker Model Runner library on Docker Hub.

Ministral 3 is Mistral AI’s premier edge model. It packs frontier-level reasoning and capabilities into a dense, efficient architecture designed specifically for local inference. It is perfect for:

Local RAG applications: Chat with your docs without data leaving your machine.

Agentic Workflows: Fast reasoning steps for complex function-calling agents.

Low-latency prototyping: Test ideas instantly without waiting for API calls.

DeepSeek-V3.2: The Open Reasoning Powerhouse

We are equally excited to introduce support for DeepSeek-V3.2. Known for pushing the boundaries of what open-weights models can achieve, the DeepSeek-V3 series has quickly become a favorite for developers requiring high-level reasoning and coding proficiency.

DeepSeek-V3.2 brings Mixture-of-Experts (MoE) architecture efficiency to your local environment, delivering performance that rivals top-tier closed models. It is the ideal choice for:

Complex Code Generation: Build and debug software with a model specialized in programming tasks.

Advanced Reasoning: Tackle complex logic puzzles, math problems, and multi-step instructions.

Data Analysis: Process and interpret structured data with high precision.

Run Them with One Command

With Docker Model Runner, you don’t need to worry about complex environment setups, python dependencies, or weight downloads. We’ve packaged both models so you can get started immediately.

To run Ministral 3:

docker model run ai/ministral3

To run DeepSeek-V3.2:

docker model run ai/deepseek-v3.2-vllm

These commands automatically pull the model, set up the runtime, and drop you into an interactive chat session. You can also point your applications to them using our OpenAI-compatible local endpoint, making them drop-in replacements for your cloud API calls during development.
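To check which models you have pulled locally before wiring an application to the local endpoint, you can list them at any time:

docker model ls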

vLLM v0.12.0: Faster, Leaner, and Ready for What’s Next

We are excited to highlight the release of vLLM v0.12.0. vLLM has quickly become the gold standard for high-throughput and memory-efficient LLM serving, and this latest version raises the bar again.

Version 0.12.0 brings critical enhancements to the engine, including:

Expanded Model Support: Day-0 support for the latest architecture innovations, ensuring you can run the newest open-weights models (like DeepSeek V3.2 and Ministral 3) the moment they drop.

Optimized Kernels: Significant latency reductions for inference on NVIDIA GPUs, making your containerized AI applications snappier than ever.

Enhanced PagedAttention: Further optimizations to memory management, allowing you to batch more requests and utilize your hardware to its full potential.

Why This Matters

The combination of Ministral 3, DeepSeek-V3.2, and vLLM v0.12.0 represents the maturity of the open AI ecosystem.

You now have access to a serving engine that maximizes data center performance, alongside a choice of models to fit your specific needs—whether you prioritize the edge-optimized speed of Ministral 3 or the deep reasoning power of DeepSeek-V3.2. All of this is easily accessible via Docker Model Runner.

How You Can Get Involved

The strength of Docker Model Runner lies in its community, and there’s always room to grow. We need your help to make this project the best it can be. To get involved, you can:

Star the repository: Show your support and help us gain visibility by starring the Docker Model Runner repo.

Contribute your ideas: Have an idea for a new feature or a bug fix? Create an issue to discuss it. Or fork the repository, make your changes, and submit a pull request. We’re excited to see what ideas you have!

Spread the word: Tell your friends, colleagues, and anyone else who might be interested in running AI models with Docker.

We’re incredibly excited about this new chapter for Docker Model Runner, and we can’t wait to see what we can build together. Let’s get to work!
Source: https://blog.docker.com/feed/

Docker, JetBrains, and Zed: Building a Common Language for Agents and IDEs

As agents become capable enough to write and refactor code, they should work natively inside the environments developers work in: editors. 

That’s why JetBrains and Zed are co-developing ACP, the Agent Client Protocol. ACP gives agents and editors a shared language, so any agent can read context, take actions, and respond intelligently without bespoke wiring for every tool.

Why it matters

Every protocol that’s reshaped development (LSP for language tools, MCP for AI context) works the same way: define the standard once, unlock the ecosystem. ACP does this for the editor itself. Write an agent that speaks ACP, and it works in JetBrains, Zed, or anywhere else that adopts the protocol. 

Docker’s contribution

Docker’s cagent, an open-source multi-agent runtime, already supports ACP, alongside Claude Code, Codex CLI, and Gemini CLI. Agents built with cagent can run in any ACP-compatible IDE, like JetBrains, immediately.

We’ve also shipped Dynamic MCPs, letting agents discover and compose tools at runtime, surfaced directly in the editor where developers work.

What’s next

ACP is early, but the direction is clear. As agents embed deeper into workflows, the winners will be tools that interoperate. Open standards let everyone build on shared foundations instead of custom glue.

Docker will continue investing in ACP and standards that make development faster, more open, and more secure. When code, context, and automation converge, shared protocols ensure we move forward together.
Source: https://blog.docker.com/feed/

Securing the Docker MCP Catalog: Commit Pinning, Agentic Auditing, and Publisher Trust Levels

Trust is the most important consideration when you connect AI assistants to real tools. While MCP containerization provides strong isolation and limits the blast radius of malfunctioning or compromised servers, we’re continuously strengthening trust and security across the Docker MCP solutions to further reduce exposure to malicious code. As the MCP ecosystem scales from hundreds to tens of thousands of servers (and beyond), we need stronger mechanisms to prove what code is running, how it was built, and why it’s trusted.

To strengthen trust across the entire MCP lifecycle, from submission to maintenance to daily use, we’ve introduced three key enhancements:

Commit Pinning: Every Docker-built MCP server in the Docker MCP Registry (the source of truth for the MCP Catalog) is now tied to a specific Git commit, making each release precisely attributable and verifiable.

Automated, AI-Audited Updates: A new update workflow keeps submitted MCP servers current, while agentic reviews of incoming changes make vigilance scalable and traceable.

Publisher Trust Levels: We’ve introduced clearer trust indicators in the MCP Catalog, so developers can easily distinguish between official, verified servers and community-contributed entries.

These updates raise the bar on transparency and security for everyone building with and using MCP at scale with Docker.

Commit pins for local MCP servers

Local MCP servers in the Docker MCP Registry are now tied to a specific Git commit with source.commit. That commit hash is a cryptographic fingerprint for the exact revision of the server code that we build and publish. Without this pinning, a reference like latest or a branch name would build whatever happens to be at that reference right now, making builds non-deterministic and vulnerable to supply chain attacks if an upstream repository is compromised. Even Git tags aren’t really immutable since they can be deleted and recreated to point to another commit. By contrast, commit hashes are cryptographically linked to the content they address, making the outcome of an audit of that commit a persistent result.
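To see for yourself why a branch name is not a stable reference, you can check what it currently resolves to. A quick illustration using the upstream repository from the example below (the branch name here is an assumption):

git ls-remote https://github.com/awslabs/mcp refs/heads/main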

To make things easier, we’ve updated our authoring tools (like the handy MCP Registry Wizard) to automatically add this commit pin when creating a new server entry, and we now enforce the presence of a commit pin in our CI pipeline (missing or malformed pins will fail validation). This enforcement is deliberate: it’s impossible to accidentally publish a server without establishing clear provenance for the code being distributed. We also propagate the pin into the MCP server image metadata via the org.opencontainers.image.revision label for traceability.

Here’s an example of what this looks like in the registry:

# servers/aws-cdk-mcp-server/server.yaml
name: aws-cdk-mcp-server
image: mcp/aws-cdk-mcp-server
type: server
meta:
  category: devops
  tags:
    - aws-cdk-mcp-server
    - devops
about:
  title: AWS CDK
  description: AWS Cloud Development Kit (CDK) best practices, infrastructure as code patterns, and security compliance with CDK Nag.
  icon: https://avatars.githubusercontent.com/u/3299148?v=4
source:
  project: https://github.com/awslabs/mcp
  commit: 7bace1f81455088b6690a44e99cabb602259ddf7
  directory: src/cdk-mcp-server

And here’s an example of how you can verify the commit pin for a published MCP server image:

$ docker image inspect mcp/aws-core-mcp-server:latest \
    --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'
7bace1f81455088b6690a44e99cabb602259ddf7

In fact, if you have the cosign and jq commands available, you can perform additional verifications:

$ COSIGN_REPOSITORY=mcp/signatures cosign verify mcp/aws-cdk-mcp-server \
    --key https://raw.githubusercontent.com/docker/keyring/refs/heads/main/public/mcp/latest.pub | jq -r '.[].optional["org.opencontainers.image.revision"]'

Verification for index.docker.io/mcp/aws-cdk-mcp-server:latest --
The following checks were performed on each of these signatures:
– The cosign claims were validated
– Existence of the claims in the transparency log was verified offline
– The signatures were verified against the specified public key
7bace1f81455088b6690a44e99cabb602259ddf7

Keeping in sync

Once a server is in the registry, we don’t want maintainers needing to hand‑edit pins every time they merge something into their upstream repos (they have better things to do with their time), so a new automated workflow scans upstreams nightly, bumping source.commit when there’s a newer revision, and opening an auditable PR in the registry to track the incoming upstream changes.  This gives you the security benefits of pinning (immutable references to reviewed code) without the maintenance toil. Updates still flow through pull requests, so you get a review gate and approval trail showing exactly what new code is entering your supply chain. The update workflow operates on a per-server basis, with each server update getting its own branch and pull request.

This raises the question, though: how do we know that the incoming changes are safe?

AI in the review loop, humans in charge

Every proposed commit pin bump (and any new local server) will now be subject to an agentic AI security review of the incoming upstream changes. The reviewers (Claude Code and OpenAI Codex) analyze MCP server behavior, flagging risky or malicious code, adding structured reports to the PR, and offering standardized labels such as security-risk:high or security-blocked. Humans remain in the loop for final judgment, but the agents are relentless and scalable.

The challenge: untrusted code means untrusted agents

When you run AI agents in CI to analyze untrusted code, you face a fundamental problem: the agents themselves become attack vectors. They’re susceptible to prompt injection through carefully crafted code comments, file names, or repository structure. A malicious PR could attempt to manipulate the reviewing agent into approving dangerous changes, exfiltrating secrets, or modifying the review process itself.

We can’t trust the code under review, but we also can’t fully trust the agents reviewing it.

Isolated agents

Our Compose-based security reviewer architecture addresses this trust problem by treating the AI agents as untrusted components. The agents run inside heavily isolated Docker containers with tightly controlled inputs and outputs:

The code being audited is mounted read-only — The agent can analyze code but never modify it. Moreover, the code it audits is just a temporary copy of the upstream repository, but the read-only access means that the agent can’t do something like modify a script that might be accidentally executed outside the container.

The agent can only write to an isolated output directory — Once the output is written, the CLI wrapper for the agent only extracts specific files (a Markdown report and a text file of labels, both with fixed names), meaning any malicious scripts or files that might be written to that directory are deleted.

The agent lacks direct Internet access — the reviewer container cannot reach external services.

CI secrets and API credentials never enter the reviewer container — Instead, a lightweight reverse proxy on a separate Docker network accepts requests from the reviewer, injects inference provider API keys on outbound requests, and shields those keys from the containerized code under review.

All of this is encapsulated in a Docker Compose stack and wrapped by a convenient CLI that allows running the agent both locally and in CI.

Most importantly, this architecture ensures that even if a malicious PR successfully manipulates the agent through prompt injection, the damage is contained: the agent cannot access secrets, cannot modify code, and cannot communicate with external attackers.

CI integration and GitHub Checks

The review workflow is automatically triggered when a PR is opened or updated. We still maintain some control over these workflows for external PRs, requiring manual triggering to prevent malicious PRs from exhausting inference API credits. These reviews surface directly as GitHub Status Checks, with each server being reviewed receiving dedicated status checks for any analyses performed.

The resulting check status maps to the associated risk level determined by the agent: critical findings result in a failed check that blocks merging, high and medium findings produce neutral warnings, while low and info findings pass. We’re still tuning these criteria (since we’ve asked the agents to be extra pedantic) and currently reviewing the reports manually, but eventually we’ll have the heuristics tuned to a point where we can auto-approve and merge most updated PRs. In the meantime, these reports serve as a scalable “canary in the coal mine”, alerting Docker MCP Registry maintainers to incoming upstream risks — both malicious and accidental.

It’s worth noting that the agent code in the MCP Registry repository is just an example (but a functional one available under an MIT License). The actual security review agent that we run lives in a private repository with additional isolation, but it follows the same architecture.

Reports and risk labels

Here’s an example of a report our automated reviewers produced:

# Security Review Report

## Scope Summary
- **Review Mode:** Differential
- **Repository:** /workspace/input/repository (stripe)
- **Head Commit:** 4eb0089a690cb60c7a30c159bd879ce5c04dd2b8
- **Base Commit:** f495421c400748b65a05751806cb20293c764233
- **Commit Range:** f495421c400748b65a05751806cb20293c764233...4eb0089a690cb60c7a30c159bd879ce5c04dd2b8
- **Overall Risk Level:** MEDIUM

## Executive Summary

This differential review covers 23 commits introducing significant changes to the Stripe Agent Toolkit repository, including: folder restructuring (moving tools to a tools/ directory), removal of evaluation code, addition of new LLM metering and provider packages, security dependency updates, and GitHub Actions workflow permission hardening.

The reviewers can produce both differential analyses (looking at the changes brought in by a specific set of upstream commits) and full analyses (looking at entire codebases). We intend to run differential analyses for PRs and full analyses on a regular cadence.

Why behavioral analysis matters

Traditional scanners remain essential, but they tend to focus on things like dependencies with CVEs, syntactical errors (such as a missing break in a switch statement), or memory safety issues (such as dereferencing an uninitialized pointer) — MCP requires us to also examine code’s behavior. Consider the recent malicious postmark-mcp package impersonation: a one‑line backdoor quietly BCC’d outgoing emails to an attacker. Events like this reinforce why our registry couples provenance with behavior‑aware reviews before updates ship.

Real-world results

In our scans so far, we’ve already found several real-world issues in upstream projects (stay tuned for a follow-up blog post), both in MCP servers and with a similar agent in our Docker Hardened Images pipeline. We’re happy to say that we haven’t run across anything malicious so far, just logic errors with security implications, but the granularity and subtlety of issues that these agents can identify is impressive.

Trust levels in the Docker MCP Catalog

In addition to the aforementioned technical changes, we’ve also introduced publisher trust levels in the Docker MCP Catalog, exposing them in both the Docker MCP Toolkit in Docker Desktop and on Docker MCP Hub. Each server will now have an associated icon indicating whether the server is from a “known publisher” or maintained by the community. In both cases, we’ll still subject the code to review, but these indicators should provide additional context on the origin of the MCP server.

Figure 1: Here’s an example of an MCP server, the AWS Terraform MCP published by a known, trusted publisher

Figure 2: The Fetch MCP server, an example of an MCP community server

What does this mean for the community?

Publishers now benefit from a steady stream of upstream improvements, backed by a documented, auditable trail of code changes. Commit pins make each release precisely attributable, while the nightly updater keeps the catalog current with no extra effort from publishers or maintainers. AI-powered reviewers scale our vigilance, freeing up human reviewers to focus on the edge cases that matter most.

At the same time, developers using MCP servers get clarity about a server’s publisher, making it easier to distinguish between official, community, and third-party contributions. These enhancements strengthen trust and security for everyone contributing to or relying on MCP servers in the Docker ecosystem.

Submit your MCP servers to Docker by following the submission guidance in the Docker MCP Registry repository!

Learn more

Explore the MCP Catalog: Discover containerized, security-hardened MCP servers.

Get started with the MCP Toolkit: Run MCP servers easily and securely.

Find documentation for Docker MCP Catalog and Toolkit.

Source: https://blog.docker.com/feed/