We've all been there: you need to generate a few images for a project, you fire up an AI image service, and suddenly you're wondering what happens to your prompts, how many credits you have left, or why that "safe content" filter rejected your perfectly reasonable request for a dragon wearing a business suit. What...
#docker model runner
18 posts
5 May
31 Mar
Back in October, we showed how Docker Model Runner on the NVIDIA DGX Spark makes it remarkably easy to run large AI models locally with the same familiar Docker experience developers already trust. That post struck a chord: hundreds of developers discovered that a compact desktop system paired with Docker Model Runner could replace complex...
27 Mar
Hello, I’m Philippe, and I am a Principal Solutions Architect helping customers with their usage of Docker. I wanted a lightweight way to automate my IT news roundups without burning through AI credits. So I built a Docker Agent skill that uses the Brave Search API to fetch recent articles on a topic, then hands...
13 Mar
Claude Code is quickly becoming a go-to AI coding assistant for developers and increasingly for non-developers who want to build with code. But to truly unlock its potential, it needs the right local infrastructure, tool access, and security boundaries. In this blog, we’ll show you how to run Claude Code with Docker to gain full...
26 Feb
vLLM has quickly become the go-to inference engine for developers who need high-throughput LLM serving. We brought vLLM to Docker Model Runner for NVIDIA GPUs on Linux, then extended it to Windows via WSL2. That changes today. Docker Model Runner now supports vllm-metal, a new backend that brings vLLM inference to macOS using Apple Silicon's...
25 Feb
We’re excited to share a seamless new integration between Docker Model Runner (DMR) and Open WebUI, bringing together two open source projects to make working with self-hosted models easier than ever. With this update, Open WebUI automatically detects and connects to Docker Model Runner running at localhost:12434. If Docker Model Runner is enabled, Open WebUI...
13 Feb
How to solve the context size issues with context packing with Docker Model Runner and Agentic Compose
If you’ve worked with local language models, you’ve probably run into the context window limit, especially when using smaller models on less powerful machines. While it’s an unavoidable constraint, techniques like context packing make it surprisingly manageable. Hello, I’m Philippe, and I am a Principal Solutions Architect helping customers with their usage of Docker. In...
26 Jan
Personal AI assistants are transforming how we manage our daily lives—from handling emails and calendars to automating smart homes. However, as these assistants gain more access to our private data, concerns about privacy, data residency, and long-term costs are at an all-time high. By combining Clawdbot with Docker Model Runner (DMR), you can build a...
We recently showed how to pair OpenCode with Docker Model Runner for a privacy-first, cost-effective AI coding setup. Today, we're bringing the same approach to Claude Code, Anthropic's agentic coding tool. This post walks through how to configure Claude Code to use Docker Model Runner, giving you full control over your data, infrastructure, and spend....
15 Jan
AI-powered coding assistants are becoming a core part of modern development workflows. At the same time, many teams are increasingly concerned about where their code goes, how it’s processed, and who has access to it. By combining OpenCode with Docker Model Runner, you can build a powerful AI-assisted coding experience while keeping full control over...
16 Dec 2025
Voice is the next frontier of conversational AI. It is the most natural modality for people to chat and interact with another intelligent being. However, the voice AI software stack is complex, with many moving parts. Docker has emerged as one of the most useful tools for AI agent deployment. In this article, we'll explore...
Running large language models (LLMs) and other generative AI models can be a complex, frustrating process of managing dependencies, drivers, and environments. At Docker, we believe this should be as simple as docker model run. That's why we built Docker Model Runner, and today, we’re thrilled to announce a new collaboration with Universal Blue. Thanks...
11 Dec 2025
Great news for Windows developers working with AI models: Docker Model Runner now supports vLLM on Docker Desktop for Windows with WSL2 and NVIDIA GPUs! Until now, vLLM support in Docker Model Runner was limited to Docker Engine on Linux. With this update, Windows developers can take advantage of vLLM's high-throughput inference capabilities directly through...
5 Dec 2025
At Docker, we are committed to making the AI development experience as seamless as possible. Today, we are thrilled to announce two major updates that bring state-of-the-art performance and frontier-class models directly to your fingertips: the immediate availability of Mistral AI’s Ministral 3 and DeepSeek-V3.2, alongside the release of vLLM v0.12.0 on Docker Model Runner....
1 Dec 2025
Embeddings have become the backbone of many modern AI applications. From semantic search to retrieval-augmented generation (RAG) and intelligent recommendation systems, embedding models enable systems to understand the meaning behind text, code, or documents, not just the literal words. But generating embeddings comes with trade-offs. Using a hosted API for embedding generation often results in...
20 Nov 2025
Expanding Docker Model Runner’s Capabilities
Today, we’re excited to announce that Docker Model Runner now integrates the vLLM inference engine and safetensors models, unlocking high-throughput AI inference with the same Docker tooling you already use. When we first introduced Docker Model Runner, our goal was to make it simple for developers to run and experiment...
18 Nov 2025
Building and Running Custom Models Is Still Hard
Running AI models locally is still hard. Even as open-source LLMs grow more capable, actually getting them to run on your machine, with the right dependencies, remains slow, fragile, and inconsistent. There are two sides to this challenge: Model creation and optimization: making fine-tuning and quantization efficient. Model...
3 Nov 2025
One of the most exciting advances in modern AI is multimodal support, the ability for models to understand and generate multiple types of input, such as text, images, or audio. With multimodal models, you’re no longer limited to typing prompts; you can show an image or play a sound, and the model can understand it....