
Dinh Long Nguyen 7413f1354f

* fix: avoid error validate nested dom

* fix: correct context shift flag handling in LlamaCPP extension (#6404) (#6431)

* fix: correct context shift flag handling in LlamaCPP extension

The previous implementation added the `--no-context-shift` flag when `cfg.ctx_shift` was disabled, which conflicted with the llama.cpp CLI where the presence of `--context-shift` enables the feature.
The logic is updated to push `--context-shift` only when `cfg.ctx_shift` is true, ensuring the extension passes the correct argument and behaves as expected.
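A minimal sketch of the corrected flag handling described above. Only `cfg.ctx_shift` and `--context-shift` come from the commit message; the `LlamacppConfig` interface and the `contextShiftArgs` helper are hypothetical names used for illustration.

```typescript
// Hypothetical config shape; only `ctx_shift` is named in the commit message.
interface LlamacppConfig {
  ctx_shift: boolean
}

// Build the context-shift portion of the CLI arguments.
function contextShiftArgs(cfg: LlamacppConfig): string[] {
  const args: string[] = []
  // The presence of the flag enables the feature, so push it only when requested.
  if (cfg.ctx_shift) {
    args.push('--context-shift')
  }
  return args
}
```

With `ctx_shift` disabled the helper adds nothing, rather than emitting a negated flag.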

* feat: detect model out of context during generation

---------

Co-authored-by: Dinh Long Nguyen <dinhlongviolin1@gmail.com>

* chore: add install-rust-targets step for macOS universal builds

* fix: make install-rust-targets a dependency

* enhancement: copy MCP permission

* chore: capitalize action button

* Update web-app/src/locales/en/tool-approval.json

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* chore: simplify macos workflow

* fix: KVCache size calculation and refactor (#6438)

- Removed the unused `getKVCachePerToken` helper and replaced it with a unified `estimateKVCache` that returns both total size and per‑token size.
- Fixed the KV cache size calculation to account for all layers, correcting previous under‑estimation.
- Added proper clamping of user‑requested context lengths to the model’s maximum.
- Refactored VRAM budgeting: introduced explicit reserves, fixed engine overhead, and separate multipliers for VRAM and system RAM based on memory mode.
- Implemented a more robust planning flow with clear GPU, Hybrid, and CPU pathways, including fallback configurations when resources are insufficient.
- Updated default context length handling and safety buffers to prevent OOM situations.
- Adjusted usable memory percentage to 90 % and refined logging for easier debugging.
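As a rough illustration of a unified estimate that returns both per-token and total size, the sketch below uses the common approximation of one K and one V vector per layer per token, and clamps the requested context to the model maximum. The parameter names, the return shape, and the formula itself are assumptions; the actual `estimateKVCache` in the extension may differ.

```typescript
// Hypothetical inputs; field names are illustrative, not taken from the extension.
interface KVCacheParams {
  nLayer: number          // number of transformer layers
  nHeadKv: number         // number of key/value heads
  headDim: number         // dimension of each head
  bytesPerElement: number // e.g. 2 for an f16 cache
  requestedCtx: number    // context length requested by the user
  maxCtx: number          // maximum context length supported by the model
}

interface KVCacheEstimate {
  ctxLen: number
  perTokenBytes: number
  totalBytes: number
}

function estimateKVCache(p: KVCacheParams): KVCacheEstimate {
  // Clamp the requested context length to the model's maximum.
  const ctxLen = Math.min(p.requestedCtx, p.maxCtx)
  // One K and one V vector per layer per token, summed over all layers.
  const perTokenBytes = 2 * p.nLayer * p.nHeadKv * p.headDim * p.bytesPerElement
  return { ctxLen, perTokenBytes, totalBytes: perTokenBytes * ctxLen }
}
```

Multiplying by `nLayer` is the step that accounts for every layer; omitting it yields the kind of under-estimate the fix addresses.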

* fix: detect allocation failures as out-of-memory errors (#6459)

The Llama.cpp backend can emit the phrase “failed to allocate” when it runs out of memory.
Adding this check ensures such messages are correctly classified as out‑of‑memory errors,
providing more accurate error handling for CPU backends.
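The classification could look roughly like the sketch below. Only the phrase "failed to allocate" comes from the commit message; the function name, the case-insensitive matching, and the additional "out of memory" pattern are assumptions.

```typescript
// Substrings treated as out-of-memory indicators.
// 'failed to allocate' is from the commit message; 'out of memory' is an assumed pre-existing pattern.
const OOM_PATTERNS: string[] = ['out of memory', 'failed to allocate']

// Return true when a backend log line or error message looks like an OOM failure.
function isOutOfMemoryError(message: string): boolean {
  const lower = message.toLowerCase()
  return OOM_PATTERNS.some((pattern) => lower.includes(pattern))
}
```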

* fix: pathname file install BE

* fix: set default memory mode and clean up unused import (#6463)

Use fallback value 'high' for memory_util config and remove unused GgufMetadata import.

* fix: auto update should not block popup

* fix: remove log

* fix: improve editing messages with image attachments

* fix: improve editing messages with image attachments

* fix: type imageurl

* fix: immediate dropdown value update

* fix: linter

* fix/validate-mmproj-from-general-basename

* fix/revalidate-model-gguf

* fix: loader when importing

* fix/mcp-json-validation

* chore: update locale mcp json

* fix: new extension settings aren't populated properly (#6476)

* chore: embed webview2 bootstrapper in tauri windows

* fix: validate MCP JSON type

* chore: prevent click outside for edit dialog

* feat: add qa checklist

* chore: remove old checklist

* chore: correct typo in checklist

* fix: correct memory suitability checks in llamacpp extension (#6504)

The previous implementation mixed model size and VRAM checks, leading to inaccurate status reporting (e.g., false RED results).
- Simplified import statement for `readGgufMetadata`.
- Fixed RAM/VRAM comparison by removing unnecessary parentheses.
- Replaced ambiguous `modelSize > usableTotalMemory` check with a clear `totalRequired > usableTotalMemory` hard‑limit condition.
- Refactored the status logic to explicitly handle the CPU‑GPU hybrid scenario, returning **YELLOW** when the total requirement fits combined memory but exceeds VRAM.
- Updated comments for better readability and maintenance.
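The status logic described above can be sketched as follows. The `totalRequired > usableTotalMemory` condition and the **YELLOW** hybrid case are from the commit message; the function name, the `usableVram` parameter, and the **GREEN** branch are assumptions.

```typescript
type MemoryStatus = 'GREEN' | 'YELLOW' | 'RED'

// All values are in bytes. `usableTotalMemory` is assumed to mean usable VRAM plus usable system RAM.
function memoryStatus(
  totalRequired: number,
  usableVram: number,
  usableTotalMemory: number
): MemoryStatus {
  // Hard limit: the model cannot fit even when VRAM and RAM are combined.
  if (totalRequired > usableTotalMemory) return 'RED'
  // Hybrid case: fits in combined memory but exceeds VRAM alone.
  if (totalRequired > usableVram) return 'YELLOW'
  // Assumed: everything fits in VRAM.
  return 'GREEN'
}
```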

* fix: thread rerender issue

* chore: clean up console log

* chore: uncomment irrelevant fix

* fix: linter

* chore: remove duplicated block

* fix: tests

* Merge pull request #6469 from menloresearch/fix/deeplink-not-work-on-windows

fix: deeplink issue on Windows

* fix: reduce unnecessary rerender due to current thread retrieval

* fix: reduce app layout rerender due to router state update

* fix: avoid the entire app layout re render on route change

* clean: unused import

* Merge pull request #6514 from menloresearch/feat/web-gtag

feat: Add GA Measurement and change keyboard bindings on web

* chore: update build tauri commands

* chore: remove unused task

* fix: should not rerender thread message components when typing

* fix re render issue

* direct tokenspeed access

* chore: sync latest

* feat: Add Jan API server Swagger UI (#6502)

* feat: Add Jan API server Swagger UI

- Serve OpenAPI spec (`static/openapi.json`) directly from the proxy server.
- Implement Swagger UI assets (`swagger-ui.css`, `swagger-ui-bundle.js`, `favicon.ico`) and a simple HTML wrapper under `/docs`.
- Extend the proxy whitelist to include Swagger UI routes.
- Add routing logic for `/openapi.json`, `/docs`, and Swagger UI static files.
- Update whitelisted paths and integrate CORS handling for the new endpoints.

* feat: serve Swagger UI at root path

The Swagger UI endpoint previously lived under `/docs`. The route handling and
exclusion list have been updated so the UI is now served directly at `/`.
This simplifies access, aligns with the expected root URL in the Tauri
frontend, and removes the now‑unused `/docs` path handling.

* feat: add model loading state and translations for local API server

Implemented a loading indicator for model startup, updated the start/stop button to reflect model loading and server starting states, and disabled interactions while pending. Added new translation keys (`loadingModel`, `startingServer`) across all supported locales (en, de, id, pl, vn, zh-CN, zh-TW) and integrated them into the UI. Included a small delay after model start to ensure backend state consistency. This improves user feedback and prevents race conditions during server initialization.

* fix: tests

* fix: linter

* fix: build

* docs: update changelog for v0.6.10

* fix(number-input): preserve '0.0x' format when typing (#6520)

* docs: update url for gifs and videos

* chore: update url for jan-v1 docs

* fix: Typo in openapi JSON (#6528)

* enhancement: toaster delete mcp server

* Update 2025-09-18-auto-optimize-vision-imports.mdx

* Merge pull request #6475 from menloresearch/feat/bump-tokenjs

feat: fix remote provider vision capability

* fix: prevent consecutive messages with same role (#6544)

* fix: prevent consecutive messages with same role

* fix: tests

* fix: first message should not be assistant

* fix: tests
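One way to enforce both constraints is sketched below: leading assistant messages are dropped and consecutive messages with the same role are merged. The commit titles do not say whether the real fix merges or drops messages, so the strategy, the `ChatMessage` shape, and the `normalizeMessages` name are all assumptions.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant'
  content: string
}

// Return a new list with no leading assistant message and no two adjacent messages sharing a role.
function normalizeMessages(messages: ChatMessage[]): ChatMessage[] {
  const result: ChatMessage[] = []
  for (const msg of messages) {
    // The first message should not come from the assistant.
    if (result.length === 0 && msg.role === 'assistant') continue
    const last = result[result.length - 1]
    if (last && last.role === msg.role) {
      // Merge into the previous message (assumed strategy); `last` is a copy, so the input is untouched.
      last.content = `${last.content}\n${msg.content}`
    } else {
      result.push({ ...msg })
    }
  }
  return result
}
```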

* feat: Prompt progress when streaming (#6503)

* feat: Prompt progress when streaming

- BE changes:
    - Add a `return_progress` flag to `chatCompletionRequest` and a corresponding `prompt_progress` payload in `chatCompletionChunk`. Introduce `chatCompletionPromptProgress` interface to capture cache, processed, time, and total token counts.
    - Update the Llamacpp extension to always request progress data when streaming, enabling UI components to display real‑time generation progress and leverage llama.cpp’s built‑in progress reporting.
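A sketch of how a UI might turn the progress payload into a percentage. The four quantities (cache, processed, time, total) are from the commit message; the exact field names, including `time_ms`, and the `promptProgressPercent` helper are assumptions.

```typescript
// Assumed shape of the `prompt_progress` payload carried by a streaming chunk.
interface PromptProgress {
  cache: number     // prompt tokens reused from the cache
  processed: number // prompt tokens processed so far
  total: number     // total prompt tokens
  time_ms: number   // elapsed processing time in milliseconds
}

// Overall progress as an integer percentage, clamped to the range 0..100.
function promptProgressPercent(p: PromptProgress): number {
  if (p.total <= 0) return 0
  const pct = (p.processed / p.total) * 100
  return Math.min(100, Math.max(0, Math.round(pct)))
}
```

A caller could hide the indicator once the returned value reaches 100.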

* Make return_progress optional

* chore: update ui prompt progress before streaming content

* chore: remove log

* chore: remove progress when percentage >= 100

* chore: set timeout prompt progress

* chore: move prompt progress outside streaming content

* fix: tests

---------

Co-authored-by: Faisal Amir <urmauur@gmail.com>
Co-authored-by: Louis <louis@jan.ai>

* chore: add ci for web stag (#6550)

* feat: add getTokensCount method to compute token usage (#6467)

* feat: add getTokensCount method to compute token usage

Implemented a new async `getTokensCount` function in the LLaMA.cpp extension.
The method validates the model session, checks process health, applies the request template, and tokenizes the resulting prompt to return the token count. Includes detailed error handling for crashed models and API failures, enabling callers to assess token usage before sending completions.
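The template-then-tokenize flow can be sketched as below. This is not the extension's code: it assumes the llama.cpp server exposes `/apply-template` and `/tokenize` endpoints with the request and response shapes shown, takes the fetch function as a parameter, and omits the session and process-health checks.

```typescript
// Minimal structural types so any fetch-like function can be injected.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>

interface TemplateMessage {
  role: string
  content: string
}

// POST a JSON body and parse the JSON response, failing loudly on HTTP errors.
async function postJson<T>(fetchFn: FetchLike, url: string, body: unknown): Promise<T> {
  const res = await fetchFn(url, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  })
  if (!res.ok) throw new Error(`Request to ${url} failed with status ${res.status}`)
  return (await res.json()) as T
}

// Apply the chat template, tokenize the resulting prompt, and return the token count.
async function getTokensCount(
  baseUrl: string,
  messages: TemplateMessage[],
  fetchFn: FetchLike
): Promise<number> {
  // Assumed response shape: { prompt: string }
  const { prompt } = await postJson<{ prompt: string }>(fetchFn, `${baseUrl}/apply-template`, { messages })
  // Assumed response shape: { tokens: number[] }
  const { tokens } = await postJson<{ tokens: number[] }>(fetchFn, `${baseUrl}/tokenize`, { content: prompt })
  return tokens.length
}
```

Injecting the fetch function keeps the sketch testable without a running server.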

* Fix: typos

* chore: update ui token usage

* chore: remove unused code

* feat: add image token handling for multimodal LlamaCPP models

Implemented support for counting image tokens when using vision-enabled models:
- Extended `SessionInfo` with optional `mmprojPath` to store the multimodal projector (mmproj) file.
- Propagated `mmproj_path` from the Tauri plugin into the session info.
- Added import of `chatCompletionRequestMessage` and enhanced token calculation logic in the LlamaCPP extension:
    - Detects image content in messages.
    - Reads GGUF metadata from `mmprojPath` to compute accurate image token counts.
    - Provides a fallback estimation if metadata reading fails.
    - Returns the sum of text and image tokens.
- Introduced helper methods `calculateImageTokens` and `estimateImageTokensFallback`.
- Minor clean‑ups such as comment capitalization and debug logging.

* chore: update FE send params message include content type image_url

* fix mmproj path from session info and num tokens calculation

* fix: Correct image token estimation calculation in llamacpp extension

This commit addresses an inaccurate token count for images in the llama.cpp extension.

The previous logic incorrectly calculated the token count based on image patch size and dimensions. This has been replaced with a more precise method that uses the clip.vision.projection_dim value from the model metadata.

Additionally, unnecessary debug logging was removed, and a new log was added to show the mmproj metadata for improved visibility.

* fix per image calc

* fix: crash due to force unwrap

---------

Co-authored-by: Faisal Amir <urmauur@gmail.com>
Co-authored-by: Louis <louis@jan.ai>

* fix: custom fetch for all providers (#6538)

* fix: custom fetch for all providers

* fix: run in development should use built-in fetch

* add full-width model names (#6350)

* fix: prevent relocation to root directories (#6547)

* fix: prevent relocation to root directories

* Update web-app/src/locales/zh-TW/settings.json

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
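A guard against relocating to a root directory might look like the sketch below. The commit title gives no implementation details, so the `isRootDirectory` name and the rules for what counts as a root (POSIX `/` and Windows drive roots such as `C:\`) are assumptions.

```typescript
// Return true for POSIX root ('/') and Windows drive roots ('C:', 'C:\', 'C:/').
function isRootDirectory(path: string): boolean {
  // Normalize Windows separators so one check covers both styles.
  const normalized = path.trim().replace(/\\/g, '/')
  if (normalized === '') return false
  // Drop trailing slashes: '/' becomes '' and 'C:/' becomes 'C:'.
  const stripped = normalized.replace(/\/+$/, '')
  return stripped === '' || /^[A-Za-z]:$/.test(stripped)
}
```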

* feat: web remote conversation (#6554)

* feat: implement conversation endpoint

* use conversation aware endpoint

* fetch message correctly

* preserve first message

* fix logout

* fix broadcast issue locally + auth not refreshing profile on other tabs + clean up and sync messages

* add is dev tag

---------

Co-authored-by: Faisal Amir <urmauur@gmail.com>
Co-authored-by: Akarshan Biswas <akarshan@menlo.ai>
Co-authored-by: Minh141120 <minh.itptit@gmail.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Nguyen Ngoc Minh <91668012+Minh141120@users.noreply.github.com>
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Bui Quang Huy <34532913+LazyYuuki@users.noreply.github.com>
Co-authored-by: Roushan Singh <github.rtron18@gmail.com>
Co-authored-by: hiento09 <136591877+hiento09@users.noreply.github.com>
Co-authored-by: Alexey Haidamaka <gdmkaa@gmail.com>

2025-09-23 15:13:15 +07:00

Name	Last commit	Date
.claude/commands	ci: add claude issue dedup	2025-09-10 17:16:21 +07:00
.devcontainer	refactor: pin linuxdeploy in make/yarn build process instead of github workflow	2025-07-10 04:50:12 +00:00
.github	chore: add ci for web stag (#6550)	2025-09-23 01:58:48 +07:00
.husky	chore: enhance onboarding screen's models (#4723)	2025-02-25 09:36:55 +07:00
autoqa	feat: add regression checklist	2025-08-27 17:12:49 +07:00
core	feat: web remote conversation (#6554)	2025-09-23 15:09:45 +07:00
docs	Merge pull request #6524 from menloresearch/docs/update-changelog	2025-09-19 18:08:47 -07:00
extensions	feat: add getTokensCount method to compute token usage (#6467)	2025-09-23 07:52:19 +05:30
extensions-web	feat: web remote conversation (#6554)	2025-09-23 15:09:45 +07:00
flatpak	chore: replace md5 with sha256 for CUDA	2025-08-15 16:37:05 +07:00
scripts	refactor: clean up empty folders (#6454)	2025-09-15 10:27:07 +07:00
src-tauri	fix: custom fetch for all providers (#6538)	2025-09-23 09:55:36 +07:00
tests	chore: correct typo in checklist	2025-09-17 10:20:09 +07:00
web-app	feat: web remote conversation (#6554)	2025-09-23 15:09:45 +07:00
.dockerignore	255: Cloud native	2023-10-30 23:20:10 +07:00
.gitignore	refactor: clean up empty folders (#6454)	2025-09-15 10:27:07 +07:00
.prettierignore	fix: model path backward compatible (#2018)	2024-02-14 23:04:46 +07:00
.prettierrc	fix: model path backward compatible (#2018)	2024-02-14 23:04:46 +07:00
.yarnrc.yml	chore: upgrade to turbo v2 and reduce ci quality gate runtime (#4324)	2024-12-29 17:46:15 +07:00
CONTRIBUTING.md	Add contributing section for jan (#6231) (#6232)	2025-08-20 10:18:35 +07:00
demo.gif	docs: Update README.md (#1248)	2023-12-29 11:30:16 +07:00
Dockerfile	bring dev changes to web dev (#6557)	2025-09-23 15:13:15 +07:00
JanBanner.png	Add files via upload	2024-10-28 23:09:25 +07:00
LICENSE	chore: bundle license to app	2025-08-26 11:17:59 +07:00
Makefile	chore: remove unused task	2025-09-18 21:56:09 +07:00
mise.toml	chore: remove unused task	2025-09-18 21:56:09 +07:00
nginx.conf	chore: update Dockerfile to use custom nginx.conf	2025-09-05 17:26:22 +07:00
package.json	chore: update build tauri commands	2025-09-18 21:53:28 +07:00
README.md	Backend Architecture Refactoring (#6094) (#6162)	2025-08-15 08:59:01 +07:00
vitest.config.ts	test: add missing unit tests	2025-07-15 22:29:28 +07:00
WEB_VERSION_TRACKER.md	add internal web version tracker (#6429)	2025-09-12 13:07:12 +07:00
yarn.lock	feat: add auth + google auth provider for web (#6505)	2025-09-18 11:11:14 +07:00

README.md

Jan - Local AI Assistant


Getting Started - Docs - Changelog - Bug reports - Discord

Jan is an AI assistant that can run 100% offline on your device. Download and run LLMs with full control and privacy.

Installation

The easiest way to get started is by downloading one of the following versions for your respective operating system:

Platform	Stable	Nightly
Windows	jan.exe	jan.exe
macOS	jan.dmg	jan.dmg
Linux (deb)	jan.deb	jan.deb
Linux (AppImage)	jan.AppImage	jan.AppImage

Download from jan.ai or GitHub Releases.

Features

Local AI Models: Download and run LLMs (Llama, Gemma, Qwen, etc.) from HuggingFace
Cloud Integration: Connect to OpenAI, Anthropic, Mistral, Groq, and others
Custom Assistants: Create specialized AI assistants for your tasks
OpenAI-Compatible API: Local server at localhost:1337 for other applications
Model Context Protocol: MCP integration for enhanced capabilities
Privacy First: Everything runs locally when you want it to
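As a rough illustration of the OpenAI-compatible API, the sketch below builds a chat completion request for the local server. The `/v1/chat/completions` path is assumed from the OpenAI convention, the `buildChatRequest` helper and the model id are hypothetical, and your setup may additionally require an API key header.

```typescript
interface ChatRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

// Build a request for the local server; the path follows the OpenAI convention (assumed).
function buildChatRequest(
  model: string,
  userPrompt: string,
  baseUrl = 'http://localhost:1337'
): ChatRequest {
  return {
    url: `${baseUrl}/v1/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: userPrompt }],
      }),
    },
  }
}

// Usage, with the local API server running and a hypothetical model id:
//   const { url, init } = buildChatRequest('my-local-model', 'Hello!')
//   const response = await fetch(url, init)
```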

Build from Source

For those who enjoy the scenic route:

Prerequisites

Node.js ≥ 20.0.0
Yarn ≥ 1.22.0
Make ≥ 3.81
Rust (for Tauri)

Run with Make

git clone https://github.com/menloresearch/jan
cd jan
make dev

This handles everything: installs dependencies, builds core components, and launches the app.

Available make targets:

make dev - Full development setup and launch
make build - Production build
make test - Run tests and linting
make clean - Delete everything and start fresh

Run with Mise (easier)

You can also run with mise, which is a bit easier as it ensures Node.js, Rust, and other dependency versions are automatically managed:

git clone https://github.com/menloresearch/jan
cd jan

# Install mise (if not already installed)
curl https://mise.run | sh

# Install tools and start development
mise install    # installs Node.js, Rust, and other tools
mise dev        # runs the full development setup

Available mise commands:

mise dev - Full development setup and launch
mise build - Production build
mise test - Run tests and linting
mise clean - Delete everything and start fresh
mise tasks - List all available tasks

Manual Commands

yarn install
yarn build:tauri:plugin:api
yarn build:core
yarn build:extensions
yarn dev

System Requirements

Minimum specs for a decent experience:

macOS: 13.6+ (8GB RAM for 3B models, 16GB for 7B, 32GB for 13B)
Windows: 10+ with GPU support for NVIDIA/AMD/Intel Arc
Linux: Most distributions work, GPU acceleration available

For detailed compatibility, check our installation guides.

Troubleshooting

If things go sideways:

Check our troubleshooting docs
Copy your error logs and system specs
Ask for help in our Discord #🆘|jan-help channel

Contributing

Contributions welcome. See CONTRIBUTING.md for the full spiel.

Contact

Bugs: GitHub Issues
Business: hello@jan.ai
Jobs: hr@jan.ai
General Discussion: Discord

License

Apache 2.0 - Because sharing is caring.

Acknowledgements

Built on the shoulders of giants:

Languages

TypeScript 54.9%

JavaScript 34.1%

Rust 8.6%

Python 1.5%

Shell 0.4%

Other 0.5%
