RunLocal

Hardware-aware · Open source · Updated weekly

Local AI: run open source models on your own computer.

The hub for local LLMs, install guides and a hardware picker.

You can run AI models like the ones behind ChatGPT or Claude directly on your laptop or desktop, without sending your data anywhere. RunLocal shows you which open source model to choose for your hardware, which free software to install (Ollama, LM Studio, llama.cpp), and how to get started in about ten minutes. No prior knowledge required.

New to this? Start with the glossary for plain-language definitions, or jump straight to the hardware picker to see what your computer can run.

Models worth your disk space

A short list, opinionated. Full catalog in the directory.

See all models →

Llama 4 (Scout & Maverick)

Meta AI · United States

2025
License: Llama Community License (custom)
Context: Up to 10M tokens (Scout)
Sizes: Scout 109B (MoE, 17B active) · Maverick 400B (MoE, ~17B active)
Long-context retrieval · Codebase-scale RAG · General reasoning

Mixture-of-experts architecture. Scout pushes context to 10M tokens; Maverick targets frontier-tier reasoning. The custom license restricts use above 700M monthly active users and forbids using outputs to train competing models.

Qwen 3.5

Alibaba (Qwen team) · China

2025
License: Apache 2.0 (small sizes); Qwen License (larger)
Context: Up to 1M tokens (long-context variant)
Sizes: 1.8B · 4B · 7B · 14B · 32B · 72B · MoE variants
Multilingual · Code · Cost-sensitive deployments

The most permissive frontier-class family for smaller sizes. Strong on Asian languages, competitive on English code and reasoning benchmarks.

DeepSeek V4 (Pro & Flash)

DeepSeek AI · China

2025
License: MIT (V4) / DeepSeek License (V4 Pro variants)
Context: 1M tokens
Sizes: 236B MoE · Distilled dense variants
Mathematical reasoning · Code generation · Long-context analysis

Topped open-source leaderboards on SWE-Bench and GPQA Diamond in early 2026. The MIT-licensed core variant is the most permissive of the top-tier open weights.

Mistral Medium 3.5

Mistral AI · France (EU)

2026
License: Apache 2.0 (open weight tier)
Context: 256k tokens
Sizes: ~70B class
EU-friendly deployments · Coding · Compliance-sensitive workloads

Released April 2026. The most credible non-Chinese, non-American option at frontier level. Mistral remains the leading European LLM lab by capability.

Snapshot · May 15, 2026

Trending on Hugging Face

Auto-curated from the Hugging Face Hub using a weighted mix of downloads, community likes and recency. Refreshes weekly via a GitHub Action. License tier is a quick visual hint, not legal advice.
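
For the curious, here is a minimal Python sketch of how such a weighted ranking can be computed with the huggingface_hub client. The weights, normalisation caps and 90-day recency window are illustrative placeholders, not the exact formula behind this snapshot.

  # Illustrative only: weights, caps and the recency window are placeholders,
  # not the exact formula behind this snapshot.
  from datetime import datetime, timezone
  from huggingface_hub import HfApi

  def trending_score(model, w_dl=0.5, w_likes=0.3, w_recency=0.2):
      """Blend downloads, likes and recency into a rough 0-1 score."""
      now = datetime.now(timezone.utc)
      downloads = model.downloads or 0
      likes = model.likes or 0
      updated = model.last_modified or now
      age_days = (now - updated).days
      # Normalise each signal into 0-1 before weighting.
      dl_term = min(downloads / 1_000_000, 1.0)
      like_term = min(likes / 5_000, 1.0)
      recency_term = max(0.0, 1.0 - age_days / 90)
      return w_dl * dl_term + w_likes * like_term + w_recency * recency_term

  api = HfApi()
  candidates = api.list_models(pipeline_tag="text-generation", sort="downloads",
                               limit=200, full=True)
  ranked = sorted(candidates, key=trending_score, reverse=True)
  for m in ranked[:10]:
      print(f"{m.id}  score={trending_score(m):.2f}")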

DeepSeek-V4-Pro

deepseek-ai

Permissive

MIT

Downloads: 2.8M
Likes: 4.0k
Updated: 9 days ago
Score: 0.89 · View on HF →

Kimi-K2-Instruct

moonshotai

Open weight

Custom license

Downloads: 826.7k
Likes: 2.4k
Updated: 22 days ago
Score: 0.83 · View on HF →

DeepSeek-V4-Flash

deepseek-ai

Permissive

MIT

Downloads: 1.6M
Likes: 1.1k
Updated: 9 days ago
Score: 0.83 · View on HF →

GLM-5.1

zai-org

Permissive

MIT

Downloads: 241.3k
Likes: 1.6k
Updated: 2 days ago
Score: 0.81 · View on HF →

MiniMax-M2.7

MiniMaxAI

Open weight

Custom license

Downloads: 635.6k
Likes: 1.1k
Updated: 25 days ago
Score: 0.79 · View on HF →

GLM-5

zai-org

Permissive

MIT

Downloads: 189.3k
Likes: 2.1k
Updated: 1 month ago
Score: 0.77 · View on HF →

DeepSeek-R1

deepseek-ai

Permissive

MIT

Downloads: 3.8M
Likes: 13.3k
Updated: 1 year ago
Score: 0.76 · View on HF →

MiniMax-M2.5

MiniMaxAI

Open weight

Custom license

Downloads: 780.4k
Likes: 1.5k
Updated: 2 months ago
Score: 0.76 · View on HF →

The tools that actually run them

Runtimes, GUIs and inference servers, with their real trade-offs.

See all tools →

Ollama

Runtime

MIT

The fastest way to get a local LLM running with one command.

macOS · Linux · Windows

Strengths

  • One-line install, one-line model pulls
  • Built-in OpenAI-compatible API on localhost:11434 (see the sketch below)
  • Active model library with 4,500+ tagged variants

Trade-offs

  • Less raw throughput than vLLM under heavy concurrent load
  • Configuration is opinionated; advanced tuning means dropping into llama.cpp anyway
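
A minimal sketch of talking to that local API from Python, assuming Ollama is running and you have already pulled a model ("llama3.2" below is only an example tag; substitute whatever you downloaded):

  # Assumes Ollama is running locally and a model has been pulled.
  # "llama3.2" is an example tag; swap in whatever you downloaded.
  from openai import OpenAI

  client = OpenAI(
      base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
      api_key="ollama",                      # required by the client, ignored by Ollama
  )

  reply = client.chat.completions.create(
      model="llama3.2",
      messages=[{"role": "user", "content": "Summarise these notes in two sentences: ..."}],
  )
  print(reply.choices[0].message.content)

The same snippet works against LM Studio's one-click server further down this page by pointing base_url at its local port instead.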

llama.cpp

Runtime

MIT

Maximum control and the broadest hardware coverage in the open ecosystem.

macOS · Linux · Windows · Android · iOS

Strengths

  • Runs almost anywhere: CUDA, ROCm, Metal, Vulkan, CPU-only
  • Tight GGUF quantization control (see the sketch below)
  • Reference implementation behind most desktop LLM tools

Trade-offs

  • Command-line first; the UX assumes you read READMEs
  • Quantization options multiply quickly; it is easy to pick the wrong one
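
llama.cpp itself is a C/C++ command-line project; the sketch below uses the separate llama-cpp-python bindings to load a quantized GGUF file. The model path is a placeholder for any GGUF file you have downloaded.

  # Uses the llama-cpp-python bindings (a separate package wrapping llama.cpp).
  # The model path is a placeholder; point it at any GGUF file you have downloaded.
  from llama_cpp import Llama

  llm = Llama(
      model_path="./models/example-7b.Q4_K_M.gguf",  # placeholder path
      n_ctx=8192,        # context window to allocate
      n_gpu_layers=-1,   # offload all layers to the GPU if one is available
  )

  out = llm.create_chat_completion(
      messages=[{"role": "user", "content": "Explain GGUF quantization in one paragraph."}],
      max_tokens=200,
  )
  print(out["choices"][0]["message"]["content"])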

LM Studio

GUI

Proprietary

Browsing, comparing and chatting with local models in a desktop GUI.

macOS · Linux · Windows

Strengths

  • Polished chat UI with side-by-side model comparison
  • Built-in Hugging Face model browser
  • Local OpenAI-compatible API server with one click

Trade-offs

  • Closed source; the engine is llama.cpp but the shell is not
  • Less scriptable than CLI-first tools

vLLM

Server

Apache 2.0

Production-grade inference with concurrent users and high throughput targets.

Linux (CUDA, ROCm)

Strengths

  • PagedAttention for memory-efficient KV cache
  • Continuous batching and speculative decoding (see the sketch below)
  • An order of magnitude more throughput than Ollama under heavy concurrency

Trade-offs

  • GPU-only path; not aimed at single-user desktops
  • Operational complexity is real; budget for tuning
Visit project → · No install guide yet
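
A sketch of vLLM's offline batch API, assuming a supported CUDA or ROCm GPU; the model id below is a placeholder for any Hugging Face checkpoint that fits in your VRAM.

  # Assumes a supported GPU; the model id is a placeholder for any checkpoint
  # that fits in VRAM. Requests in the list are batched continuously by the engine.
  from vllm import LLM, SamplingParams

  prompts = [
      "Write a haiku about GPUs.",
      "Summarise what PagedAttention does in one sentence.",
  ]
  sampling = SamplingParams(temperature=0.7, max_tokens=128)

  llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model id
  for result in llm.generate(prompts, sampling):
      print(result.prompt, "->", result.outputs[0].text)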

Install guides

Latest writing

Why bother running AI locally?

The big cloud services are easier to start with. But there are real reasons to do it yourself. Three of them.

Your data stays with you

What you type and what the model answers never leave your computer. Handy when you are working with personal notes, client documents, internal code, or anything you would not paste into a public website.

It works even when the cloud does not

The model file lives on your disk. If the company that made it shuts down, raises prices, or simply changes its terms, your setup keeps working. The model you download today still runs in 2030 if your computer does.

No surprise bills

Cloud AI charges per use. Local AI costs you the price of your computer, plus electricity. After the first month, the marginal cost of an extra question is essentially zero.