13 Commits

Author SHA1 Message Date
Akarshan Biswas
a1af70f7a9
feat: Enhance Llama.cpp backend management with persistence (#5886)
* feat: Enhance Llama.cpp backend management with persistence

This commit introduces significant improvements to how the Llama.cpp extension manages and updates its backend installations, focusing on user preference persistence and smarter auto-updates.

Key changes include:

* **Persistent Backend Type Preference:** The extension now stores the user's preferred backend type (e.g., `cuda`, `cpu`, `metal`) in `localStorage`. This ensures that even after updates or restarts, the system attempts to use the user's previously selected backend type, if available.
* **Intelligent Auto-Update:** The auto-update mechanism has been refined to prioritize updating to the **latest version of the *currently selected backend type*** rather than always defaulting to the "best available" backend (which might change). This respects user choice while keeping the chosen backend type up-to-date.
* **Improved Initial Installation/Configuration:** For fresh installations or cases where the `version_backend` setting is invalid, the system now intelligently determines and installs the best available backend, then persists its type.
* **Refined Old Backend Cleanup:** The `removeOldBackends` function has been renamed to `removeOldBackend` and now cleans up only *older versions of the currently selected backend type*, preventing the accumulation of unnecessary files while preserving other backend types the user might switch to.
* **Robust Local Storage Handling:** New private methods (`getStoredBackendType`, `setStoredBackendType`, `clearStoredBackendType`) are introduced to safely interact with `localStorage`, including error handling for potential `localStorage` access issues.
* **Version Filtering Utility:** A new utility `findLatestVersionForBackend` helps identify the latest available version of a specific backend type from a list of supported backends (a minimal sketch of these helpers follows this list).
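A minimal, hypothetical sketch of the storage helpers and version filter named above; the storage key, the `{ version, backend }` entry shape, and all function bodies are illustrative assumptions, not the extension's actual code:

```typescript
// Assumed storage key; the real extension may use a different name.
const STORED_BACKEND_KEY = 'llamacpp_backend_type'

function getStoredBackendType(): string | null {
  try {
    return localStorage.getItem(STORED_BACKEND_KEY)
  } catch {
    return null // localStorage can throw (e.g. blocked storage); treat as "no preference"
  }
}

function setStoredBackendType(backendType: string): void {
  try {
    localStorage.setItem(STORED_BACKEND_KEY, backendType)
  } catch (e) {
    console.warn('Failed to persist backend type preference', e)
  }
}

function clearStoredBackendType(): void {
  try {
    localStorage.removeItem(STORED_BACKEND_KEY)
  } catch (e) {
    console.warn('Failed to clear backend type preference', e)
  }
}

// Pick the newest version that offers the given backend type.
function findLatestVersionForBackend(
  supported: { version: string; backend: string }[],
  backendType: string
): string | undefined {
  return supported
    .filter((s) => s.backend === backendType)
    .map((s) => s.version)
    .sort((a, b) => b.localeCompare(a, undefined, { numeric: true }))[0]
}
```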

These changes provide a more stable, user-friendly, and maintainable backend management experience for the Llama.cpp extension.

Fixes: #5883

* fix: cortex models migration should be done once

* feat: Optimize Llama.cpp backend preference storage and UI updates

This commit refines the Llama.cpp extension's backend management by:

* **Optimizing `localStorage` Writes:** The system now writes the backend type preference to `localStorage` only when the new value differs from the currently stored one, avoiding redundant `localStorage` operations (see the sketch after this list).
* **Ensuring UI Consistency on Initial Setup:** When a fresh installation or an invalid backend configuration is detected, the UI settings are now explicitly updated to reflect the newly determined `effectiveBackendString`, ensuring the displayed setting matches the active configuration.
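An illustrative write-if-changed check, reusing the same assumed storage key as the earlier sketch; not the extension's actual code:

```typescript
const STORED_BACKEND_KEY = 'llamacpp_backend_type' // assumed key name

function persistBackendTypeIfChanged(backendType: string): void {
  let stored: string | null = null
  try {
    stored = localStorage.getItem(STORED_BACKEND_KEY)
  } catch {
    // Storage unreadable: fall through and attempt the write anyway.
  }
  if (stored !== backendType) {
    try {
      localStorage.setItem(STORED_BACKEND_KEY, backendType)
    } catch (e) {
      console.warn('Failed to persist backend type preference', e)
    }
  }
}
```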

These changes improve performance by reducing redundant storage operations and enhance user experience by maintaining UI synchronization with the backend state.

* Revert "fix: provider settings should be refreshed on page load (#5887)"

This reverts commit ce6af62c7df4a7e7ea8c0896f307309d6bf38771.

* fix: add loader version backend llamacpp

* fix: wrong key name

* fix: model setting issues

* fix: virtual dom hub

* chore: cleanup

* chore: hide device offload setting

---------

Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Faisal Amir <urmauur@gmail.com>
2025-07-24 18:33:35 +07:00
Faisal Amir
399671488c
fix: gpu detected from backend version (#5882)
* fix: gpu detected from backend version

* chore: remove readonly props from dynamic field
2025-07-24 10:45:48 +07:00
Faisal Amir
43b7eb6e18
🐛fix: remove sampling parameters from llamacpp extension (#5871) 2025-07-23 12:13:42 +07:00
Akarshan Biswas
92703bceb2
refactor: move thinking toggle to runtime settings for dynamic control (#5800)
* refactor: move thinking toggle to runtime settings for per-message control

Replaces the static `reasoning_budget` config with a dynamic `enable_thinking` flag under `chat_template_kwargs`, allowing models like Jan-nano and Qwen3 to enable/disable thinking behavior at runtime, even mid-conversation.
Requires a UI update.
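A hedged example of what a per-request toggle might look like in an OpenAI-style chat request; the exact payload shape used by the extension is an assumption, and only `enable_thinking` and `chat_template_kwargs` come from this commit:

```typescript
// Illustrative chat-completion request toggling thinking off for one message.
const body = {
  model: 'jan-nano', // example model named in the commit message
  messages: [{ role: 'user', content: 'Summarize this in one line.' }],
  chat_template_kwargs: {
    enable_thinking: false, // can be flipped per request, even mid-conversation
  },
}
```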

* remove engine argument
2025-07-17 20:18:24 +05:30
Akarshan Biswas
96ba42e411
feat: Add missing ctx-shift toggle (#5765)
* feat: Add missing ctx_shift

* fix typo

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* refine description

---------

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
2025-07-14 11:51:34 +05:30
Akarshan
ffef7b9cab
enhancement: Add custom Jinja chat template option
Adds a new configuration option `chat_template` to the Llama.cpp extension, allowing users to define a custom Jinja chat template for the model.

The template can be provided via a new input field in the settings, and if set, it will be passed to the Llama.cpp backend using the `--chat-template` argument. This enhances flexibility for users who require specific chat formatting beyond the GGUF default.

The `chat_template` field is added to the `LlamacppConfig` type and appended to the command arguments only when provided. The placeholder text shows an example Jinja template structure.
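A short sketch of that conditional argument; the argument-list and function names are assumptions, while `chat_template`, `LlamacppConfig`, and `--chat-template` come from the commit:

```typescript
interface LlamacppConfig {
  chat_template?: string
  // ...other settings omitted
}

// Append the flag only when a custom template is configured.
function appendChatTemplateArg(args: string[], cfg: LlamacppConfig): void {
  if (cfg.chat_template && cfg.chat_template.trim().length > 0) {
    args.push('--chat-template', cfg.chat_template)
  }
}
```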
2025-07-03 23:38:16 +07:00
Akarshan
40f1fd4ffd
feat: Auto update backend implementation 2025-07-03 19:32:12 +05:30
Akarshan
0cbf35dc77
Add auto unload setting to llamacpp-extension 2025-07-02 12:28:25 +07:00
Thien Tran
1ae7c0b59a
update version/backend format. fix bugs around load() 2025-07-02 12:27:15 +07:00
Thien Tran
40cd7e962a
feat: download backend for llama.cpp extension (#5123)
* wip

* update

* add download logic

* add decompress. support delete file

* download backend upon selecting setting

* add some logging and notes

* add note on race condition

* remove then catch

* default to none backend. only download if it's not installed (see the sketch after this list)

* merge version and backend. fetch version from GH

* restrict scope of output_dir

* add note on unpack
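An illustrative download-on-demand check for the merged "version/backend" format mentioned above; the directory layout and helper names are assumptions:

```typescript
import { existsSync } from 'fs'
import { join } from 'path'

// Download the backend only if it is not already installed.
async function ensureBackend(
  backendsDir: string,
  versionBackend: string, // merged "version/backend" string, e.g. "b1234/cpu"
  download: (versionBackend: string, dest: string) => Promise<void>
): Promise<string> {
  const dest = join(backendsDir, ...versionBackend.split('/'))
  if (!existsSync(dest)) {
    await download(versionBackend, dest)
  }
  return dest
}
```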
2025-07-02 12:27:13 +07:00
Akarshan Biswas
742e731e96
Add --reasoning_budget option 2025-07-02 12:27:10 +07:00
Akarshan Biswas
19274f7e69
update settings 2025-07-02 12:26:39 +07:00
Thien Tran
3f082372fd
add llamacpp-extension. can list some models 2025-07-02 12:26:39 +07:00