* ci: update artifact name for Linux and Windows build
* ci: enhance logic for naming convention for mac, linux and windows builds
* fix: resolve nested template expression in artifact names
* Fix: Windows llamacpp not picking up dlls from lib repo
* Fix lib path on Windows
* Add debug info about lib_path
* Normalize lib_path for Windows
* fix Windows lib path normalization
* fix: missing CUDA DLL files on Windows
* throw backend setup errors to UI
* Fix format
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* feat: add logger to llamacpp-extension
* fix: platform check
---------
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* fix: support load model configurations
* chore: remove log
* chore: add sampling params from send completion
* chore: remove comment
* chore: remove comment on predefined file
* chore: update test model service
* refactor: Improve Llama.cpp backend management and auto-update
This commit refactors the Llama.cpp extension to enhance backend management and streamline the auto-update process.
Key changes include:
- Refactored `configureBackends`: The logic for determining the best available backend and populating settings is now more modular, preventing duplicate executions.
- Dedicated auto-update handling: Introduced a `handleAutoUpdate` method to encapsulate the auto-update logic, including downloading the latest available backend and updating the internal configuration and settings.
- Robust old backend cleanup: The `removeOldBackends` method now keeps only the currently used backend version and type, which frees disk space. A delay is added on Windows to prevent file conflicts during cleanup.
- Final installation check: An `ensureFinalBackendInstallation` method is added to guarantee the selected backend is installed, acting as a final safeguard after auto-update or if auto-update is disabled.
Minor fixes:
- Added `console.log` for `save_path` during decompression for better debugging.
- Ensured the output directory exists before decompression in the Rust backend.
- Removed extraneous console log for session info.
- Updated `Cargo.toml` and `tauri.conf.json` versions.
These changes lead to a more reliable and efficient Llama.cpp backend experience within the application, particularly for users with auto-update enabled.
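The cleanup rule described above (keep only the backend whose version and type are currently in use) can be sketched as a pure selection step. This is a minimal illustration, not the extension's actual `removeOldBackends` code: the `InstalledBackend` shape and the `selectBackendsToRemove` helper are hypothetical names introduced here, and the real method also handles file deletion and the Windows delay.

```typescript
// Hypothetical description of one installed backend (assumed shape).
interface InstalledBackend {
  version: string; // release tag of the backend build
  backend: string; // backend type, e.g. a CPU or CUDA variant
}

// Return every installed backend that is NOT the one currently in use.
// A backend is kept only if both its version and its type match.
function selectBackendsToRemove(
  installed: InstalledBackend[],
  current: InstalledBackend
): InstalledBackend[] {
  return installed.filter(
    (b) => !(b.version === current.version && b.backend === current.backend)
  );
}
```

Keeping the selection separate from the deletion makes the rule easy to test without touching the file system.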
* fix isBackendInstalled parameters
* Address bot's comments
* Address bot comments on using a try/finally block
On Windows, spawning the llamacpp server was causing an unwanted terminal window
to appear. This is now fixed by combining `CREATE_NO_WINDOW` with
`CREATE_NEW_PROCESS_GROUP` using `.creation_flags(...)`, ensuring that the
process runs in the background without a console window.
This change only applies to 64-bit Windows builds.
* feat: support per-model overrides in llama.cpp load()
Extend the `load()` method in the llama.cpp extension to accept optional
`overrideSettings`, allowing fine-grained per-model configuration.
This enables users to override provider-level settings such as `ctx_size`,
`chat_template`, `n_gpu_layers`, etc., when loading a specific model.
Fixes: #5818 (Feature Request - Jan v0.6.6)
Use cases enabled:
- Different context sizes per model (e.g., 4K vs 32K)
- Model-specific chat templates (ChatML, Alpaca, etc.)
- Performance tuning (threads, GPU layers)
- Better memory management per deployment
Maintains full backward compatibility with existing provider config.
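As a rough illustration of how per-model overrides could be layered on top of provider-level settings, the sketch below merges an optional override object over the provider defaults. The `LlamacppSettings` type and the `resolveSettings` helper are assumptions made for this example; only the setting names `ctx_size`, `chat_template`, and `n_gpu_layers` come from the description above, and the real `load()` signature is not reproduced here.

```typescript
// Assumed subset of provider-level settings, for illustration only.
interface LlamacppSettings {
  ctx_size: number;
  n_gpu_layers: number;
  chat_template?: string;
  threads?: number;
}

// Merge per-model overrides over provider defaults without mutating either.
// Keys missing from the override (or set to undefined) keep the provider value,
// so calling this without overrides preserves the existing behaviour.
function resolveSettings(
  provider: LlamacppSettings,
  overrideSettings: Partial<LlamacppSettings> = {}
): LlamacppSettings {
  const resolved: LlamacppSettings = { ...provider };
  for (const [key, value] of Object.entries(overrideSettings)) {
    if (value !== undefined) {
      Object.assign(resolved, { [key]: value });
    }
  }
  return resolved;
}
```

With this shape, one model can be loaded with a 32K context while another keeps the provider's default, and omitting the override leaves the provider config untouched.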
* swap overrideSettings and isEmbedding argument
* fix: Enhance stream error handling and parsing
This commit improves the robustness of stream processing in the llamacpp-extension.
- Adds explicit handling for 'error:' prefixed lines in the stream, parsing the contained JSON error and throwing an appropriate JavaScript Error.
- Centralizes JSON parsing of 'data:' and 'error:' lines, ensuring consistent error propagation by re-throwing parsing exceptions.
- Ensures the async iterator terminates correctly upon encountering stream errors or malformed JSON.
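The line handling described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the extension's code: the helper names (`parseStreamLine`, `parseJson`, `describeError`, `readChunks`) are hypothetical, the `[DONE]` sentinel is assumed to follow the common OpenAI-style streaming convention, and the error payload is assumed to optionally carry a `message` field.

```typescript
// Single place where stream JSON is parsed, so failures surface the same way.
function parseJson(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    throw new Error(`Malformed JSON in stream: ${text.trim()}`);
  }
}

// Turn a parsed error payload into a message (assumed payload shape).
function describeError(payload: unknown): string {
  if (typeof payload === "object" && payload !== null && "message" in payload) {
    return String((payload as { message: unknown }).message);
  }
  return JSON.stringify(payload);
}

// Parse one stream line: returns the payload of a "data:" line, returns
// undefined for blank lines, unknown lines, and the assumed "[DONE]" sentinel,
// and throws for "error:" lines or malformed JSON.
function parseStreamLine(line: string): unknown {
  const trimmed = line.trim();
  if (trimmed.startsWith("error:")) {
    throw new Error(describeError(parseJson(trimmed.slice("error:".length))));
  }
  if (trimmed.startsWith("data:")) {
    const body = trimmed.slice("data:".length).trim();
    return body === "[DONE]" ? undefined : parseJson(body);
  }
  return undefined;
}

// Yield parsed chunks; a thrown error ends the generator and reaches the caller.
async function* readChunks(lines: AsyncIterable<string>): AsyncGenerator<unknown> {
  for await (const line of lines) {
    const chunk = parseStreamLine(line);
    if (chunk !== undefined) yield chunk;
  }
}
```

Because the exception is thrown inside the `for await` loop, the generator finishes as soon as a bad line is seen, and the consumer's own `for await` rejects with that error instead of hanging.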
* Address bot comments and cleanup