6016 Commits

Author SHA1 Message Date
Faisal Amir
198955285e
Merge pull request #6412 from menloresearch/fix/render-new-line
fix: render new line for user message
2025-09-11 13:29:18 +07:00
Louis
7fea6e1ab0
fix: clean up unused packages (#6414) 2025-09-11 13:16:26 +07:00
Akarshan Biswas
5ff7935d91
fix: include lm_head and embedding layers in totalLayers count (#6415)
The original calculation used only the `block_count` from the model metadata, which excludes the final LM head and the embedding layer. This caused an underestimation of the total number of layers and consequently an incorrect `layerSize` value. Adding `+2` accounts for these two missing layers, ensuring accurate model size metrics.
2025-09-11 11:40:39 +05:30
Nguyen Ngoc Minh
d856651380
Merge pull request #6413 from menloresearch/ci/add-nightly-external-contrib
ci: add nightly build for external contributors
2025-09-11 13:03:30 +07:00
Minh141120
65a515a9db chore: add upload artifact steps for 3 platforms 2025-09-11 12:21:56 +07:00
Minh141120
773b252555 ci: add nightly build for external contributors 2025-09-11 11:30:43 +07:00
Akarshan Biswas
7a94e74d6b
Merge pull request #6360 from menloresearch/feat/llamacpp_backend
feat: enhance llamacpp backend management and installation
2025-09-11 09:57:16 +05:30
Akarshan
13806a3f06
Fixup sorting in determineBestBackend 2025-09-11 09:56:46 +05:30
Akarshan Biswas
3cd099ee87
Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-09-11 09:55:57 +05:30
Akarshan
42411b5f33
feat: prioritize Vulkan backend only when GPU has ≥6 GB VRAM
Added a GPU memory check using `getSystemInfo` to ensure Vulkan is selected only on systems with at least 6 GB of VRAM.
* Made `determineBestBackend` asynchronous and updated all callers to `await` it.
* Adjusted backend priority list to include or demote Vulkan based on the memory check.
* Updated Vulkan support detection in `backend.ts` to rely solely on API version (memory check moved to selection logic).
* Imported `getSystemInfo` and refined file‑existence validation.

These changes prevent sub‑optimal Vulkan usage on low‑memory GPUs and improve backend selection reliability.
2025-09-11 09:55:55 +05:30
Akarshan
84874c6039
fix file condition 2025-09-11 09:55:08 +05:30
Akarshan
0eff1bfaa9
Throw error when invalid file 2025-09-11 09:55:08 +05:30
Akarshan
5ef9d8dfc3
Add debug logs and refactor 2025-09-11 09:55:06 +05:30
dinhlongviolin1
e2e572ccab
refactor: moved get_short_path to utils and use it in decompress 2025-09-11 09:52:10 +05:30
Faisal Amir
6067ffe107
chore: fix conflict 2025-09-11 09:52:09 +05:30
Faisal Amir
cbd2651a63
chore: update copy and refresh list when import from local machine 2025-09-11 09:52:09 +05:30
Akarshan
2e350ab607
Refresh list of backends by calling configureBackends() and some refactoring in installBackend 2025-09-11 09:52:09 +05:30
Faisal Amir
ba4dc6d1eb
enhancement: update ui dialog update llamacpp backend 2025-09-11 09:52:09 +05:30
Akarshan
a6e4f28830
Add guard before checking locally installed backends 2025-09-11 09:52:09 +05:30
Akarshan
4e37c361c4
feat: expose new updateBackend function for manually updating backend 2025-09-11 09:52:09 +05:30
Akarshan
7ac927ff02
feat: enhance llamacpp backend management and installation
- Add `src-tauri/resources/` to `.gitignore`.
- Introduced utilities to read locally installed backends (`getLocalInstalledBackends`) and fetch remote supported backends (`fetchRemoteSupportedBackends`).
- Refactored `listSupportedBackends` to merge remote and local entries with deduplication and proper sorting.
- Exported `getBackendDir` and integrated it into the extension.
- Added helper `parseBackendVersion` and new method `checkBackendForUpdates` to detect newer backend versions.
- Implemented `installBackend` for manual backend archive installation, including platform‑specific binary path handling.
- Updated command‑line argument logic for `--flash-attn` to respect version‑specific defaults.
- Modified Tauri filesystem `decompress` command to remove overly strict path validation.
2025-09-11 09:52:09 +05:30
Akarshan Biswas
7a174e621a
feat: Smart model management (#6390)
* feat: Smart model management

* **New UI option** – `memory_util` added to `settings.json` with a dropdown (high / medium / low) to let users control how aggressively the engine uses system memory.
* **Configuration updates** – `LlamacppConfig` now includes `memory_util`; the extension class stores it in a new `memoryMode` property and handles updates through `updateConfig`.
* **System memory handling**
  * Introduced `SystemMemory` interface and `getTotalSystemMemory()` to report combined VRAM + RAM.
  * Added helper methods `getKVCachePerToken`, `getLayerSize`, and a new `ModelPlan` type.
* **Smart model‑load planner** – `planModelLoad()` computes:
  * Number of GPU layers that can fit in usable VRAM.
  * Maximum context length based on KV‑cache size and the selected memory utilization mode (high/medium/low).
  * Whether KV‑cache must be off‑loaded to CPU and the overall loading mode (GPU, Hybrid, CPU, Unsupported).
  * Detailed logging of the planning decision.
* **Improved support check** – `isModelSupported()` now:
  * Uses the combined VRAM/RAM totals from `getTotalSystemMemory()`.
  * Applies an 80% usable‑memory heuristic.
  * Returns **GREEN** only when both weights and KV‑cache fit in VRAM, **YELLOW** when they fit only in total memory or require CPU off‑load, and **RED** when the model cannot fit at all.
* **Cleanup** – Removed unused `GgufMetadata` import; updated imports and type definitions accordingly.
* **Documentation/comments** – Added explanatory JSDoc comments for the new methods and clarified the return semantics of `isModelSupported`.

* chore: migrate no_kv_offload from llamacpp setting to model setting

* chore: add UI auto optimize model setting

* feat: improve model loading planner with mmproj support and smarter memory budgeting

* Extend `ModelPlan` with optional `noOffloadMmproj` flag to indicate when a multimodal projector can stay in VRAM.
* Add `mmprojPath` parameter to `planModelLoad` and calculate its size, attempting to keep it on GPU when possible.
* Refactor system memory detection:
  * Use `used_memory` (actual free RAM) instead of total RAM for budgeting.
  * Introduced `usableRAM` placeholder for future use.
* Rewrite KV‑cache size calculation:
  * Properly handle GQA models via `attention.head_count_kv`.
  * Compute bytes per token as `nHeadKV * headDim * 2 * 2 * nLayer`.
* Replace the old 70 % VRAM heuristic with a more flexible budget:
  * Reserve a fixed VRAM amount and apply an overhead factor.
  * Derive usable system RAM from total memory minus VRAM.
* Implement a robust allocation algorithm:
  * Prioritize placing the mmproj in VRAM.
  * Search for the best balance of GPU layers and context length.
  * Fallback strategies for hybrid and pure‑CPU modes with detailed safety checks.
* Add extensive validation of model size, KV‑cache size, layer size, and memory mode.
* Improve logging throughout the planning process for easier debugging.
* Adjust final plan return shape to include the new `noOffloadMmproj` field.

* remove unused variable

---------

Co-authored-by: Faisal Amir <urmauur@gmail.com>
2025-09-11 09:48:03 +05:30
Faisal Amir
9e592b2aca fix: render new line for user message 2025-09-11 10:29:34 +07:00
Faisal Amir
3158722a63
Merge pull request #6409 from menloresearch/enhancement/edit-model-capabilities
enhancement: rollback edit capabilities for local model
2025-09-10 22:46:23 +07:00
Faisal Amir
86dcfc10cf enhancement: rollback edit capabilities for local model 2025-09-10 19:43:44 +07:00
Nguyen Ngoc Minh
eea76802d4
Merge pull request #6408 from menloresearch/ci/claude-issue-dedup
ci: add claude issue dedup
2025-09-10 17:20:48 +07:00
Minh141120
0edf9635a1 ci: add claude issue dedup 2025-09-10 17:16:21 +07:00
Ramon Perez
18351e3850
Merge pull request #6406 from menloresearch/rp/api-ref-fix
removed cloud api spec update
2025-09-10 13:51:59 +10:00
Ramon Perez
aef0538adc updated openapi spec for jan server 2025-09-10 13:37:03 +10:00
Ramon Perez
9ec87614b2 removed cloud api spec update 2025-09-10 13:32:06 +10:00
hiento09
97eebd1c97
fix: jan server api base url for prod (#6403) 2025-09-09 20:51:04 +07:00
Dinh Long Nguyen
5cd81bc6e8
feat: improve testing (#6395)
* add more test rust test

* fix servicehub test

* fix tauri failing on windows
2025-09-09 12:16:25 +07:00
hiento09
dbaf563e88
chore: add cicd for jan web prod (#6396) 2025-09-09 12:13:37 +07:00
Faisal Amir
5e30e10bf4
Merge pull request #6388 from menloresearch/feat/import-vision-model
feat: allow user import model include mmproj file
2025-09-09 09:41:58 +07:00
Faisal Amir
a5b0ced9a9 chore: update logic turn on / off mmproj 2025-09-09 00:01:56 +07:00
Faisal Amir
94dc298181 chore: update validation logic 2025-09-08 22:37:55 +07:00
Faisal Amir
cd85ae062e
Merge pull request #6381 from menloresearch/enhancement/dialog-modal-responsive
enhancement: responsive dialog modals
2025-09-08 22:24:47 +07:00
Faisal Amir
f2594134c7 chore: update UI 2025-09-08 21:20:21 +07:00
Faisal Amir
be851ebcf1 chore: validate gguf file base metadata architecture 2025-09-08 20:16:20 +07:00
Faisal Amir
9b13b140d5 chore: update mcp delete dialog 2025-09-08 19:43:38 +07:00
Nguyen Ngoc Minh
fcd285ca0f
Merge pull request #6392 from menloresearch/ci/update-jan-server-web
ci: remove unnecessary folder paths and on Dockerfile
2025-09-08 16:49:22 +07:00
Minh141120
0e7c12f340 ci: remove unnecessary folder paths and on Dockerfile 2025-09-08 16:37:26 +07:00
Faisal Amir
836990b7d9 chore: update fn check mmproj file 2025-09-08 11:10:00 +07:00
Faisal Amir
4141910ee2 chore: remove validate ext file 2025-09-08 00:07:20 +07:00
Faisal Amir
1b035fd2f1 feat: allow user import model include mmproj file 2025-09-08 00:00:46 +07:00
Faisal Amir
a49008e02d enhancement: responsive dialog modals 2025-09-06 21:48:09 +07:00
Ramon Perez
88fb1acc0a
Merge pull request #6372 from menloresearch/rp/api-docs
docs: add first‑class API Reference to Jan docs (Local + Server)
2025-09-05 21:44:07 +10:00
Ramon Perez
afcaf531ed separated scripts inside config file and fixed nav bar 2025-09-05 21:39:33 +10:00
Ramon Perez
aea474bf57 separated scripts inside config file and fixed nav bar 2025-09-05 21:33:59 +10:00
Nguyen Ngoc Minh
0e44d9340c
Merge pull request #6378 from menloresearch/chore/update-nginx-conf-jan-web
chore: update Dockerfile to use custom nginx.conf
2025-09-05 17:45:54 +07:00