Akarshan Biswas 7a174e621a

* feat: Smart model management

* **New UI option** – `memory_util` added to `settings.json` with a dropdown (high / medium / low) to let users control how aggressively the engine uses system memory.
* **Configuration updates** – `LlamacppConfig` now includes `memory_util`; the extension class stores it in a new `memoryMode` property and handles updates through `updateConfig`.
* **System memory handling**
  * Introduced `SystemMemory` interface and `getTotalSystemMemory()` to report combined VRAM + RAM.
  * Added helper methods `getKVCachePerToken`, `getLayerSize`, and a new `ModelPlan` type.
* **Smart model‑load planner** – `planModelLoad()` computes:
  * Number of GPU layers that can fit in usable VRAM.
  * Maximum context length based on KV‑cache size and the selected memory utilization mode (high/medium/low).
  * Whether KV‑cache must be off‑loaded to CPU and the overall loading mode (GPU, Hybrid, CPU, Unsupported).
  * Detailed logging of the planning decision.
* **Improved support check** – `isModelSupported()` now:
  * Uses the combined VRAM/RAM totals from `getTotalSystemMemory()`.
  * Applies an 80% usable‑memory heuristic.
  * Returns **GREEN** only when both weights and KV‑cache fit in VRAM, **YELLOW** when they fit only in total memory or require CPU off‑load, and **RED** when the model cannot fit at all.
* **Cleanup** – Removed unused `GgufMetadata` import; updated imports and type definitions accordingly.
* **Documentation/comments** – Added explanatory JSDoc comments for the new methods and clarified the return semantics of `isModelSupported`.
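
The GREEN / YELLOW / RED rule described above can be sketched roughly as follows. This is an illustrative sketch only, not the extension's actual code: the `MemoryTotals` shape, the `classifySupport` name, and the choice to apply the 80% factor to both VRAM and combined memory are assumptions, and the separate "requires CPU off-load" path to YELLOW is folded into the total-memory check.

```typescript
type SupportLevel = 'GREEN' | 'YELLOW' | 'RED'

// Hypothetical shape; the real `SystemMemory` interface may differ.
interface MemoryTotals {
  totalVRAM: number // bytes
  totalRAM: number // bytes
}

// The 80% usable-memory heuristic mentioned in the commit message.
const USABLE_FRACTION = 0.8

function classifySupport(
  weightsBytes: number,
  kvCacheBytes: number,
  mem: MemoryTotals
): SupportLevel {
  const required = weightsBytes + kvCacheBytes
  const usableVRAM = mem.totalVRAM * USABLE_FRACTION
  const usableTotal = (mem.totalVRAM + mem.totalRAM) * USABLE_FRACTION

  if (required <= usableVRAM) return 'GREEN' // weights and KV cache fit in VRAM
  if (required <= usableTotal) return 'YELLOW' // fits only with system RAM
  return 'RED' // does not fit at all
}

const GiB = 1024 * 1024 * 1024
const machine: MemoryTotals = { totalVRAM: 8 * GiB, totalRAM: 16 * GiB }

console.log(classifySupport(4 * GiB, 1 * GiB, machine)) // GREEN
console.log(classifySupport(8 * GiB, 2 * GiB, machine)) // YELLOW
console.log(classifySupport(30 * GiB, 2 * GiB, machine)) // RED
```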

* chore: migrate no_kv_offload from llamacpp setting to model setting

* chore: add UI auto optimize model setting

* feat: improve model loading planner with mmproj support and smarter memory budgeting

* Extend `ModelPlan` with optional `noOffloadMmproj` flag to indicate when a multimodal projector can stay in VRAM.
* Add `mmprojPath` parameter to `planModelLoad` and calculate its size, attempting to keep it on GPU when possible.
* Refactor system memory detection:
  * Use `used_memory` (actual free RAM) instead of total RAM for budgeting.
  * Introduce a `usableRAM` placeholder for future use.
* Rewrite KV‑cache size calculation:
  * Properly handle GQA models via `attention.head_count_kv`.
  * Compute bytes per token as `nHeadKV * headDim * 2 * 2 * nLayer`.
* Replace the old 70% VRAM heuristic with a more flexible budget:
  * Reserve a fixed VRAM amount and apply an overhead factor.
  * Derive usable system RAM from total memory minus VRAM.
* Implement a robust allocation algorithm:
  * Prioritize placing the mmproj in VRAM.
  * Search for the best balance of GPU layers and context length.
  * Add fallback strategies for hybrid and pure‑CPU modes, with detailed safety checks.
* Add extensive validation of model size, KV‑cache size, layer size, and memory mode.
* Improve logging throughout the planning process for easier debugging.
* Adjust final plan return shape to include the new `noOffloadMmproj` field.
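
As a rough illustration of the KV‑cache formula above (`nHeadKV * headDim * 2 * 2 * nLayer`), here is a minimal sketch. The `AttentionMeta` shape and the `kvCacheBytesPerToken` name are hypothetical. Reading the two factors of 2 as "one K plus one V tensor" and "2 bytes per fp16 element" is an assumption, as is deriving `headDim` from the embedding length divided by the head count. The model numbers are made up for the example.

```typescript
// Hypothetical metadata shape; field names are illustrative, not the GGUF keys.
interface AttentionMeta {
  nLayer: number
  nHead: number
  nHeadKV?: number // present for GQA models (attention.head_count_kv)
  embeddingLength: number
}

function kvCacheBytesPerToken(meta: AttentionMeta): number {
  // Without GQA, every attention head has its own K/V head.
  const nHeadKV = meta.nHeadKV ?? meta.nHead
  // Assumption: head dimension = embedding length / number of heads.
  const headDim = meta.embeddingLength / meta.nHead
  // Assumed meaning of the factors: 2 tensors (K and V) * 2 bytes per element.
  return nHeadKV * headDim * 2 * 2 * meta.nLayer
}

// Illustrative GQA model: 32 layers, 32 heads, 8 KV heads, 4096-wide embeddings.
const gqaModel: AttentionMeta = {
  nLayer: 32,
  nHead: 32,
  nHeadKV: 8,
  embeddingLength: 4096,
}

const perToken = kvCacheBytesPerToken(gqaModel)
console.log(perToken) // 131072 bytes (128 KiB) per token
console.log(perToken * 8192) // 1073741824 bytes (1 GiB) for an 8192-token context
```

With `nHeadKV` omitted, the same model would need four times as much per token (524288 bytes), which is why handling GQA correctly matters for the context-length estimate.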

* remove unused variable

---------

Co-authored-by: Faisal Amir <urmauur@gmail.com>

2025-09-11 09:48:03 +05:30

Name	Last commit	Date
.claude/commands	ci: add claude issue dedup	2025-09-10 17:16:21 +07:00
.devcontainer	refactor: pin linuxdeploy in make/yarn build process instead of github workflow	2025-07-10 04:50:12 +00:00
.github	ci: add claude issue dedup	2025-09-10 17:16:21 +07:00
.husky	chore: enhance onboarding screen's models (#4723)	2025-02-25 09:36:55 +07:00
autoqa	feat: add regression checklist	2025-08-27 17:12:49 +07:00
core	feat: gguf file size + hash validation (#5266) (#6259)	2025-08-21 16:17:58 +07:00
docs	fixed home page hyperlink and extension gif	2025-08-28 22:52:37 +10:00
extensions	feat: Smart model management (#6390)	2025-09-11 09:48:03 +05:30
extensions-web	feat: Web use jan model (#6374)	2025-09-05 16:18:30 +07:00
flatpak	chore: replace md5 with sha256 for CUDA	2025-08-15 16:37:05 +07:00
pre-install	chore: server download progress + S3 (#1925)	2024-02-07 17:54:35 +07:00
scripts	fix: mise build failing	2025-09-01 14:02:31 +10:00
specs	chore: janhq to menloresearch	2025-03-18 13:06:17 +07:00
src-tauri	feat: improve testing (#6395)	2025-09-09 12:16:25 +07:00
web-app	feat: Smart model management (#6390)	2025-09-11 09:48:03 +05:30
website	updated openapi spec for jan server	2025-09-10 13:37:03 +10:00
.dockerignore	255: Cloud native	2023-10-30 23:20:10 +07:00
.gitignore	docs: add first‑class API Reference to Jan docs (Local + Server)	2025-09-05 11:21:43 +10:00
.prettierignore	fix: model path backward compatible (#2018)	2024-02-14 23:04:46 +07:00
.prettierrc	fix: model path backward compatible (#2018)	2024-02-14 23:04:46 +07:00
.yarnrc.yml	chore: upgrade to turbo v2 and reduce ci quality gate runtime (#4324)	2024-12-29 17:46:15 +07:00
CONTRIBUTING.md	Add contributing section for jan (#6231) (#6232)	2025-08-20 10:18:35 +07:00
demo.gif	docs: Update README.md (#1248)	2023-12-29 11:30:16 +07:00
Dockerfile	ci: remove unnecessary folder paths and on Dockerfile	2025-09-08 16:37:26 +07:00
JanBanner.png	Add files via upload	2024-10-28 23:09:25 +07:00
LICENSE	chore: bundle license to app	2025-08-26 11:17:59 +07:00
Makefile	feat: improve testing (#6395)	2025-09-09 12:16:25 +07:00
mise.toml	feat: improve testing (#6395)	2025-09-09 12:16:25 +07:00
nginx.conf	chore: update Dockerfile to use custom nginx.conf	2025-09-05 17:26:22 +07:00
package.json	feat: improve testing (#6395)	2025-09-09 12:16:25 +07:00
README.md	Backend Architecture Refactoring (#6094) (#6162)	2025-08-15 08:59:01 +07:00
vitest.config.ts	test: add missing unit tests	2025-07-15 22:29:28 +07:00
yarn.lock	chore(deps): bump @radix-ui/react-hover-card from 1.1.11 to 1.1.14 (#5603)	2025-07-20 15:20:18 +07:00

README.md

Jan - Local AI Assistant

Getting Started - Docs - Changelog - Bug reports - Discord

Jan is an AI assistant that can run 100% offline on your device. Download and run LLMs with full control and privacy.

Installation

The easiest way to get started is to download the build for your operating system:

Platform	Stable	Nightly
Windows	jan.exe	jan.exe
macOS	jan.dmg	jan.dmg
Linux (deb)	jan.deb	jan.deb
Linux (AppImage)	jan.AppImage	jan.AppImage

Download from jan.ai or GitHub Releases.

Features

Local AI Models: Download and run LLMs (Llama, Gemma, Qwen, etc.) from HuggingFace
Cloud Integration: Connect to OpenAI, Anthropic, Mistral, Groq, and others
Custom Assistants: Create specialized AI assistants for your tasks
OpenAI-Compatible API: Local server at localhost:1337 for other applications
Model Context Protocol: MCP integration for enhanced capabilities
Privacy First: Everything runs locally when you want it to
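
As a quick illustration of the OpenAI-compatible local server, the sketch below builds a chat request against `localhost:1337`. It is a sketch under assumptions, not documented behavior: the `/v1/chat/completions` path follows the OpenAI API convention, `buildChatRequest` is a hypothetical helper, `my-local-model` is a placeholder model id, and any authentication your server is configured with is omitted.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Hypothetical helper: builds the URL and fetch options for a chat completion.
// Assumes the server exposes the OpenAI-style /v1/chat/completions route.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  baseUrl: string = 'http://localhost:1337/v1'
) {
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model, messages }),
    },
  }
}

// 'my-local-model' is a placeholder; use the id of a model you have downloaded.
const request = buildChatRequest('my-local-model', [
  { role: 'user', content: 'Hello from another application' },
])
console.log(request.url)

// With the local server running, the request could then be sent like this:
// const response = await fetch(request.url, request.init)
// const data = await response.json()
```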

Build from Source

For those who enjoy the scenic route:

Prerequisites

Node.js ≥ 20.0.0
Yarn ≥ 1.22.0
Make ≥ 3.81
Rust (for Tauri)

Run with Make

git clone https://github.com/menloresearch/jan
cd jan
make dev

This handles everything: installs dependencies, builds core components, and launches the app.

Available make targets:

make dev - Full development setup and launch
make build - Production build
make test - Run tests and linting
make clean - Delete everything and start fresh

Run with Mise (easier)

You can also run with mise, which is a bit easier because it automatically manages the versions of Node.js, Rust, and other dependencies:

git clone https://github.com/menloresearch/jan
cd jan

# Install mise (if not already installed)
curl https://mise.run | sh

# Install tools and start development
mise install    # installs Node.js, Rust, and other tools
mise dev        # runs the full development setup

Available mise commands:

mise dev - Full development setup and launch
mise build - Production build
mise test - Run tests and linting
mise clean - Delete everything and start fresh
mise tasks - List all available tasks

Manual Commands

yarn install
yarn build:tauri:plugin:api
yarn build:core
yarn build:extensions
yarn dev

System Requirements

Minimum specs for a decent experience:

macOS: 13.6+ (8GB RAM for 3B models, 16GB for 7B, 32GB for 13B)
Windows: 10+ with GPU support for NVIDIA/AMD/Intel Arc
Linux: Most distributions work, GPU acceleration available

For detailed compatibility, check our installation guides.

Troubleshooting

If things go sideways:

Check our troubleshooting docs
Copy your error logs and system specs
Ask for help in our Discord #🆘|jan-help channel

Contributing

Contributions welcome. See CONTRIBUTING.md for the full spiel.

Contact

Bugs: GitHub Issues
Business: hello@jan.ai
Jobs: hr@jan.ai
General Discussion: Discord

License

Apache 2.0 - Because sharing is caring.

Acknowledgements

Built on the shoulders of giants:

Languages

TypeScript 54.9%

JavaScript 34.1%

Rust 8.6%

Python 1.5%

Shell 0.4%

Other 0.5%
