* fix: remove CREATE_NEW_PROCESS_GROUP flag for proper Ctrl-C handling
CREATE_NEW_PROCESS_GROUP prevented GenerateConsoleCtrlEvent from working,
causing graceful shutdown failures. Removed to enable proper signal handling.
* Revert "fix: remove CREATE_NEW_PROCESS_GROUP flag for proper Ctrl-C handling"
This reverts commit 82ace3e72e4bf7338f422d5c79bdd6a0f8a2440e.
* fix: use direct process termination instead of console events
Simplified Windows process cleanup by removing console attachment logic
and using direct child.kill() method. More reliable for headless processes.
* Fix missing imports
* switch to tokio::time
* Don't wait when forcefully terminating the process using the kill API on Windows
Disabled use of windows-sys crate as graceful shutdown on Windows is unreliable in this context.
Updated cleanup.rs and server.rs to directly call child.kill().await for terminating processes on Windows.
Improved logging for process termination and error handling during kill and wait.
Removed timeout-based graceful shutdown attempt on Windows since TerminateProcess is inherently forceful and immediate.
This ensures more predictable process cleanup behavior on Windows platforms.
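For illustration, the direct-termination path described above can be sketched roughly as follows. This is a minimal sketch, not the code in cleanup.rs or server.rs: it uses `std::process::Child` instead of the tokio child handle so that it stays self-contained, and the `terminate` helper name is hypothetical.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Forcefully stop a child process and collect its exit status.
/// Hypothetical helper; the real code calls `child.kill().await` on a tokio child.
fn terminate(child: &mut Child) -> io::Result<ExitStatus> {
    // If the process has already exited there is nothing left to kill.
    if let Some(status) = child.try_wait()? {
        return Ok(status);
    }
    // Forceful kill: TerminateProcess on Windows, SIGKILL on Unix.
    if let Err(err) = child.kill() {
        eprintln!("failed to kill process {}: {err}", child.id());
        return Err(err);
    }
    // Reap the process so its handle is released.
    child.wait()
}
```

No graceful-shutdown timeout is involved here: the kill is immediate, and the only remaining step is collecting the exit status.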
* final cleanups
This change improves the robustness of the llama.cpp extension's server port selection.
Previously, the `getRandomPort()` method only checked for ports already in use by active sessions, which could lead to model load failures if the chosen port was occupied by another external process.
This change introduces a new Tauri command, `is_port_available`, which performs a system-level check to ensure the randomly selected port is truly free before attempting to start the llama-server. It also adds a retry mechanism with a maximum number of attempts (20,000) to find an available port, throwing an error if no suitable port is found within the specified range after all attempts.
This enhancement prevents port conflicts and improves the reliability and user experience of the llama.cpp extension within Jan.
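A rough sketch of the idea, under stated assumptions: the system-level check is modelled as a bind attempt on the loopback address (the actual `is_port_available` command may check a different interface), and the retry loop is written in Rust for illustration even though the retry lives in the extension's `getRandomPort()`. The `pick_free_port` and `next_random` helpers, the seed, and any port range passed in are hypothetical.

```rust
use std::net::TcpListener;

/// True if nothing is currently bound to `port` on the loopback interface.
/// Assumption: a successful bind is a good enough proxy for "port is free".
fn is_port_available(port: u16) -> bool {
    TcpListener::bind(("127.0.0.1", port)).is_ok()
}

/// Tiny xorshift generator so the sketch needs no external crate.
fn next_random(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Try up to `max_attempts` random ports in `min..=max` and return the first free one.
fn pick_free_port(min: u16, max: u16, max_attempts: u32, seed: u64) -> Option<u16> {
    if min > max {
        return None;
    }
    let span = u64::from(max - min) + 1;
    let mut state = seed | 1; // xorshift must not start at zero
    for _ in 0..max_attempts {
        let candidate = min + (next_random(&mut state) % span) as u16;
        if is_port_available(candidate) {
            return Some(candidate);
        }
    }
    // Caller is expected to surface an error when no port was found.
    None
}
```

Note that a port can still be taken between the check and the server start, so the check narrows the window for conflicts rather than closing it.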
Closes #5965
* feat: add support for querying available backend devices
This change introduces a new `get_devices` method to the `llamacpp_extension` engine that allows the frontend to query and display a list of available devices (e.g., Vulkan, CUDA, SYCL) from the compiled `llama-server` binary.
* Added `DeviceList` interface to represent GPU/device metadata.
* Implemented `getDevices(): Promise<DeviceList[]>` method.
* Splits `version/backend`, ensures backend is ready.
* Invokes the new Tauri command `get_devices`.
* Introduced a new `get_devices` Tauri command.
* Parses `llama-server --list-devices` output to extract available devices with memory info.
* Introduced `DeviceInfo` struct (`id`, `name`, `mem`, `free`) and exposed it via serialization.
* Robust parsing logic using string processing (non-regex) to locate memory stats.
* Registered the new command in the `tauri::Builder` in `lib.rs`.
* Fixed logic to correctly parse multiple devices from the llama-server output.
* Handles common failure modes: binary not found, malformed memory info, etc.
This sets the foundation for device selection, memory-aware model loading, and improved diagnostics in Jan AI engine setup flows.
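The non-regex parsing step could look roughly like the sketch below. It assumes each device line has the shape `<id>: <name> (<total> MiB, <free> MiB free)`; the field types and the `parse_device_line` / `parse_devices` helper names are assumptions, and the serialization derive mentioned above is left out to keep the example dependency-free.

```rust
/// Device metadata; field names follow the commit, field types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct DeviceInfo {
    id: String,
    name: String,
    mem: u64,  // total memory in MiB
    free: u64, // free memory in MiB
}

/// Parse one line such as `Vulkan0: Some GPU (8192 MiB, 8000 MiB free)`.
/// Returns None for headers and malformed lines.
fn parse_device_line(line: &str) -> Option<DeviceInfo> {
    let (id, rest) = line.trim().split_once(": ")?;
    // The memory stats are the last parenthesised group, so device names
    // that contain parentheses themselves are still handled.
    let open = rest.rfind('(')?;
    let close = rest.rfind(')')?;
    if id.is_empty() || close < open {
        return None;
    }
    let (total, free) = rest[open + 1..close].split_once(',')?;
    let mem = total.trim().strip_suffix("MiB")?.trim().parse::<u64>().ok()?;
    let free = free.trim().strip_suffix("MiB free")?.trim().parse::<u64>().ok()?;
    Some(DeviceInfo {
        id: id.to_string(),
        name: rest[..open].trim().to_string(),
        mem,
        free,
    })
}

/// Collect every well-formed device line from the command output.
fn parse_devices(output: &str) -> Vec<DeviceInfo> {
    output.lines().filter_map(parse_device_line).collect()
}
```

Lines that do not match the expected shape are skipped rather than treated as fatal, which keeps one malformed entry from hiding the remaining devices.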
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
---------
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Fix: Windows llamacpp not picking up dlls from lib repo
* Fix lib path on Windows
* Add debug info about lib_path
* Normalize lib_path for Windows
* fix Windows lib path normalization
* fix: missing cuda dll files on windows
* throw backend setup errors to UI
* Fix format
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* feat: add logger to llamacpp-extension
* fix: platform check
---------
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* refactor: Improve Llama.cpp backend management and auto-update
This commit refactors the Llama.cpp extension to enhance backend management and streamline the auto-update process.
Key changes include:
- **Refactored `configureBackends`:** The logic for determining the best available backend and populating settings is now more modular, preventing duplicate executions.
- **Dedicated Auto-update Handling:** Introduced a `handleAutoUpdate` method to encapsulate the auto-update logic, including downloading the latest available backend and updating the internal configuration and settings.
- **Robust Old Backend Cleanup:** The `removeOldBackends` method is improved to ensure only the currently used backend version and type are kept, effectively managing disk space. A delay is added for Windows to prevent file conflicts during cleanup.
- **Final Installation Check:** An `ensureFinalBackendInstallation` method is added to guarantee the selected backend is installed, acting as a final safeguard after auto-update or if auto-update is disabled.
- **Minor Fixes:**
  - Added `console.log` for `save_path` during decompression for better debugging.
  - Ensured the output directory exists before decompression in the Rust backend.
  - Removed extraneous console log for session info.
  - Updated `Cargo.toml` and `tauri.conf.json` versions.
These changes lead to a more reliable and efficient Llama.cpp backend experience within the application, particularly for users with auto-update enabled.
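As a small illustration of the cleanup rule (keep only the backend whose version and type are currently in use), here is a hedged sketch. It is written in Rust for consistency with the backend code, while the actual `removeOldBackends` lives in the TypeScript extension; the `InstalledBackend` struct and `backends_to_remove` function are hypothetical.

```rust
/// One installed backend, identified by version and backend type.
/// Hypothetical shape used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct InstalledBackend {
    version: String,
    backend: String,
}

/// Return every installed backend that is NOT the currently selected
/// version/type pair; these are the candidates for deletion.
fn backends_to_remove<'a>(
    installed: &'a [InstalledBackend],
    current_version: &str,
    current_backend: &str,
) -> Vec<&'a InstalledBackend> {
    installed
        .iter()
        .filter(|b| !(b.version == current_version && b.backend == current_backend))
        .collect()
}
```

Computing the removal list separately from the deletion itself makes it easy to log what will be removed, or to delay the deletion on Windows as described above.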
* fix isBackendInstalled parameters
* Address bot's comments
* Address bot comments of using try finally block
On Windows, spawning the llamacpp server was causing an unwanted terminal window
to appear. This is now fixed by combining `CREATE_NO_WINDOW` with
`CREATE_NEW_PROCESS_GROUP` using `.creation_flags(...)`, ensuring that the
process runs in the background without a console window.
This change only applies to 64-bit Windows builds.
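A minimal sketch of the flag combination, assuming the server is spawned through `std::process::Command` (the real code may use the tokio equivalent). The flag values are the documented Win32 process creation flags; the `background_flags` and `server_command` helper names are hypothetical.

```rust
use std::process::Command;

// Win32 process creation flags.
const CREATE_NEW_PROCESS_GROUP: u32 = 0x0000_0200;
const CREATE_NO_WINDOW: u32 = 0x0800_0000;

/// Flags that keep the child in its own process group and suppress the console window.
const fn background_flags() -> u32 {
    CREATE_NO_WINDOW | CREATE_NEW_PROCESS_GROUP
}

/// Build the command for the server binary; the flags are only applied
/// on 64-bit Windows builds and the command is left untouched elsewhere.
fn server_command(program: &str) -> Command {
    #[allow(unused_mut)]
    let mut cmd = Command::new(program);
    #[cfg(all(windows, target_pointer_width = "64"))]
    {
        use std::os::windows::process::CommandExt;
        cmd.creation_flags(background_flags());
    }
    cmd
}
```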
* fix: Prevent spamming /health endpoint, improve startup, and resolve compiler warnings
This commit introduces a delay and improved logic to the /health endpoint checks in the llamacpp extension, preventing excessive requests during model loading.
Additionally, it addresses several Rust compiler warnings by:
- Commenting out an unused `handle_app_quit` function in `src/core/mcp.rs`.
- Explicitly declaring `target_port`, `session_api_key`, and `buffered_body` as mutable in `src/core/server.rs`.
- Commenting out unused `tokio` imports in `src/core/setup.rs`.
- Enhancing the `load_llama_model` function in `src/core/utils/extensions/inference_llamacpp_extension/server.rs` to better monitor stdout/stderr for readiness and errors, and handle timeouts.
- Commenting out an unused `std::path::Prefix` import and adjusting `normalize_path` in `src/core/utils/mod.rs`.
- Updating the application version to 0.6.904 in `tauri.conf.json`.
* fix grammar!
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* fix grammar 2
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* reimport prefix but only on Windows
* remove instead of commenting
* remove redundant check
* sync app version in cargo.toml with tauri.conf
---------
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* feat: Improve llamacpp server error reporting and model load stability
This commit introduces significant improvements to how the llamacpp server
process is managed and how its errors are reported.
Key changes:
- **Enhanced Error Reporting:** The llamacpp server's stdout and stderr
are now piped and captured. If the llamacpp process exits prematurely
or fails to start, its stderr output is captured and returned as a
`LlamacppError`. This provides much more specific and actionable
diagnostic information for users and developers.
- **Increased Model Load Timeout:** The `waitForModelLoad` timeout has
been increased from 30 seconds to 240 seconds (4 minutes). This
addresses issues where larger models or slower systems would
prematurely time out during the model loading phase.
- **API Secret Update:** The internal API secret for the llamacpp
extension has been updated from 'Jan' to 'JustAskNow'.
- **Version Bump:** The application version in `tauri.conf.json` has
been incremented to `0.6.901`.
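The error-reporting idea can be sketched as follows. This is a simplified, blocking illustration that waits for the process to finish, whereas the real server is long-running and monitored asynchronously. The fields of `LlamacppError` and the `run_capturing_stderr` helper are assumptions made for the example, not the actual definitions.

```rust
use std::fmt;
use std::process::Command;

/// Error carrying whatever the process wrote to stderr.
/// The field layout is hypothetical.
#[derive(Debug)]
struct LlamacppError {
    exit_code: Option<i32>,
    stderr: String,
}

impl fmt::Display for LlamacppError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self.exit_code {
            Some(code) => write!(f, "llama-server exited with code {code}: {}", self.stderr),
            None => write!(f, "llama-server did not run to completion: {}", self.stderr),
        }
    }
}

/// Run the command with stdout and stderr piped. On a failed start or a
/// non-zero exit, return the captured diagnostics as a LlamacppError.
fn run_capturing_stderr(cmd: &mut Command) -> Result<String, LlamacppError> {
    // `output()` pipes both streams and drains them, avoiding pipe deadlocks.
    let output = cmd.output().map_err(|e| LlamacppError {
        exit_code: None,
        stderr: format!("failed to start: {e}"),
    })?;
    if output.status.success() {
        Ok(String::from_utf8_lossy(&output.stdout).into_owned())
    } else {
        Err(LlamacppError {
            exit_code: output.status.code(),
            stderr: String::from_utf8_lossy(&output.stderr).trim().to_string(),
        })
    }
}
```

Returning the captured stderr text instead of a generic "failed to load" message is what lets the UI show the underlying cause, such as a missing library or an unsupported flag.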
* fix: should not spam load requests
* test: add test to cover the fix
* refactor: clean up
* test: add more test cases
---------
Co-authored-by: Louis <louis@jan.ai>
- pulls fix for #5463 out of the github release workflow and into
the make/yarn build process
- implements a wrapper script that pins linuxdeploy and injects
a new location for XDG_CACHE_HOME into the build pipeline,
allowing manipulation of .cache/tauri without tainting the host's
.cache
- adds ./.cache (project_root/.cache) to make clean and mise clean
task
- remove .devcontainer/buildAppImage.sh, obsolete now that extra
build steps have been removed from the github workflow and
incorporated in the normal build process
- remove appimagetool from .devcontainer/postCreateCommand.sh,
as it was only used by .devcontainer/buildAppImage.sh
- pulled appimage packaging steps out of release workflow into new
src-tauri/build-utils/buildAppImage.sh
- cleaned up yarn scripts:
- moved multi platform yarn scripts out of yarn build:tauri:<platform>
into generic yarn build:tauri
- split yarn build:tauri:linux:win32 into separate yarn scripts so it's
clearer what is specific to which platform
- added src-tauri/build-utils/buildAppImage.sh to new yarn build:tauri:linux
yarn script
This is also a good entry point to add flatpak builds in the future.
Part of #5641
Allows for better per-platform default config. Currently the
default serves windows/macos fine, while it has to be tweaked
in order to build for linux.
make build-tauri now successfully runs where it errored out before.
AppImages made with make alone, however, are incomplete, as there are
still post-processing steps in the github release workflow to bundle
additional resources.
- split platform specific config out of tauri.conf.json into auxiliary
platform specific config files, natively supported by tauri
- pull improved defaults out of template-tauri-build-linux-x64.yml
into new tauri.linux.conf.json
- fix tauri-build-linux-x64.yml to utilize new tauri.linux.conf.json
Things to ponder:
- Now, the v1/models endpoint of the API server will return an empty
list if no models are loaded
- Streaming v1/chat/completion routing works as well as v1/models; needs
further testing
- Changed `pid` field in `SessionInfo` from `string` to `number`/`i32` in TypeScript and Rust.
- Updated `activeSessions` map key from `string` to `number` to align with new PID type.
- Adjusted process monitoring logic to correctly handle numeric PIDs.
- Removed fallback UUID-based PID generation in favor of numeric fallback (-1).
- Added PID cleanup logic in `is_process_running` when the process is no longer alive.
- Bumped application version from 0.5.16 to 0.6.900 in `tauri.conf.json`.
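The PID-related changes can be pictured with the sketch below. Only the `pid` field and its `i32` type come from the list above. The other `SessionInfo` fields, the `pid_or_fallback` helper, and passing the liveness check in as a closure are assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Numeric fallback used when no real PID is available.
const UNKNOWN_PID: i32 = -1;

/// Session record; only `pid: i32` is taken from the change list,
/// the remaining fields are illustrative.
#[derive(Debug, Clone)]
struct SessionInfo {
    pid: i32,
    port: u16,
    model_id: String,
}

/// Convert an optional OS process id into the i32 used as the map key,
/// falling back to -1 instead of a generated UUID string.
fn pid_or_fallback(child_id: Option<u32>) -> i32 {
    match child_id {
        Some(raw) if raw <= i32::MAX as u32 => raw as i32,
        _ => UNKNOWN_PID,
    }
}

/// Report whether a tracked process is alive and drop its session when it is not.
/// `alive` stands in for the platform-specific liveness check.
fn is_process_running(
    sessions: &mut HashMap<i32, SessionInfo>,
    pid: i32,
    alive: impl Fn(i32) -> bool,
) -> bool {
    if !sessions.contains_key(&pid) {
        return false;
    }
    if alive(pid) {
        return true;
    }
    // Process is gone: clean up the stale entry.
    sessions.remove(&pid);
    false
}
```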