Nicholai/jan - jan - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Akarshan Biswas	885da29f28	feat: add getTokensCount method to compute token usage (#6467) * feat: add getTokensCount method to compute token usage Implemented a new async `getTokensCount` function in the LLaMA.cpp extension. The method validates the model session, checks process health, applies the request template, and tokenizes the resulting prompt to return the token count. Includes detailed error handling for crashed models and API failures, enabling callers to assess token usage before sending completions. * Fix: typos * chore: update ui token usage * chore: remove unused code * feat: add image token handling for multimodal LlamaCPP models Implemented support for counting image tokens when using vision-enabled models: - Extended `SessionInfo` with optional `mmprojPath` to store the multimodal project file. - Propagated `mmproj_path` from the Tauri plugin into the session info. - Added import of `chatCompletionRequestMessage` and enhanced token calculation logic in the LlamaCPP extension: - Detects image content in messages. - Reads GGUF metadata from `mmprojPath` to compute accurate image token counts. - Provides a fallback estimation if metadata reading fails. - Returns the sum of text and image tokens. - Introduced helper methods `calculateImageTokens` and `estimateImageTokensFallback`. - Minor clean‑ups such as comment capitalization and debug logging. * chore: update FE send params message include content type image_url * fix mmproj path from session info and num tokens calculation * fix: Correct image token estimation calculation in llamacpp extension This commit addresses an inaccurate token count for images in the llama.cpp extension. The previous logic incorrectly calculated the token count based on image patch size and dimensions. This has been replaced with a more precise method that uses the clip.vision.projection_dim value from the model metadata. Additionally, unnecessary debug logging was removed, and a new log was added to show the mmproj metadata for improved visibility. * fix per image calc * fix: crash due to force unwrap --------- Co-authored-by: Faisal Amir <urmauur@gmail.com> Co-authored-by: Louis <louis@jan.ai>	2025-09-23 07:52:19 +05:30
Maksym Krasovakyi	71e2e24112	Add model response timeout for local api server as configurable value via UI	2025-09-15 14:25:09 +03:00
Akarshan Biswas	5c3a6fec32	feat: Add support for custom environmental variables to llama.cpp (#6256) This commit adds a new setting `llamacpp_env` to the llama.cpp extension, allowing users to specify custom environment variables. These variables are passed to the backend process when it starts. A new function `parseEnvFromString` is introduced to handle the parsing of the semicolon-separated key-value pairs from the user input. The environment variables are then used in the `load` function and when listing available devices. This enables more flexible configuration of the llama.cpp backend, such as specifying visible GPUs for Vulkan. This change also updates the Tauri command `get_devices` to accept environment variables, ensuring that device discovery respects the user's settings.	2025-08-21 15:50:37 +05:30
Faisal Amir	5481ee9e35	Merge pull request #6134 from menloresearch/feat/attachment-ui feat: attachment UI	2025-08-20 10:04:32 +07:00
Akarshan Biswas	e761c439d7	feat: Pass API key via environment variable instead of command line argument (#6225) This change modifies how the API key is passed to the llama-server process. Previously, it was sent as a command line argument (--api-key). This approach has been updated to pass the key via an environment variable (LLAMA_API_KEY). This improves security by preventing the API key from being visible in the process list (ps aux on Linux, Task Manager on Windows, etc.), where it could potentially be exposed to other users or processes on the same system. The commit also updates the Rust backend to read the API key from the environment variable instead of parsing it from the command line arguments.	2025-08-19 20:57:06 +05:30
Akarshan	9afeb5e514	feat: Add offload_mmproj option and validation This commit introduces a new configuration option offload_mmproj to the llamacpp extension. The offload_mmproj setting allows users to control whether the multimodal projector model is offloaded to the GPU. By default, it's offloaded for better performance. If set to false, the projector model will remain on the CPU, which can be useful in low GPU memory scenarios, though image processing might take longer. Additionally, this commit adds validate_mmproj_path to ensure the provided --mmproj path is valid and accessible, preventing issues during model loading. This change also refactors some invoke calls for improved readability.	2025-08-19 19:51:29 +07:00
Dinh Long Nguyen	e1c8d98bf2	Backend Architecture Refactoring (#6094) (#6162) * add llamacpp plugin * Refactor llamacpp plugin * add utils plugin * remove utils folder * add hardware implementation * add utils folder + move utils function * organize cargo files * refactor utils src * refactor util * apply fmt * fmt * Update gguf + reformat * add permission for gguf commands * fix cargo test windows * revert yarn lock * remove cargo.lock for hardware plugin * ignore cargo.lock file * Fix hardware invoke + refactor hardware + refactor tests, constants * use api wrapper in extension to invoke hardware call + api wrapper build integration * add newline at EOF (per Akarshan) * add vi mock for getSystemInfo	2025-08-15 08:59:01 +07:00

7 Commits
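
Commit 5c3a6fec32 describes a `parseEnvFromString` function that parses semicolon-separated key-value pairs from the `llamacpp_env` setting. The log does not show its body, so the following is only a minimal sketch of how such parsing might look. It assumes a `KEY=VALUE` pair syntax, trimming of surrounding whitespace, and silent skipping of malformed entries; none of these details are confirmed by the commit message, and the actual implementation in the extension may differ.

```typescript
// Hypothetical sketch only: the real parseEnvFromString in the llama.cpp
// extension is not shown in this log and may behave differently.
// Assumed input format: "KEY1=value1;KEY2=value2".
function parseEnvFromString(input: string): Record<string, string> {
  const env: Record<string, string> = {}
  for (const pair of input.split(';')) {
    const trimmed = pair.trim()
    if (trimmed === '') continue // tolerate empty segments such as "A=1;;B=2"

    // Split on the first '=' only, so values may themselves contain '='.
    const eq = trimmed.indexOf('=')
    if (eq <= 0) continue // skip entries with no '=' or an empty key

    const key = trimmed.slice(0, eq).trim()
    const value = trimmed.slice(eq + 1).trim()
    if (key === '') continue
    env[key] = value
  }
  return env
}

// Illustrative usage; the variable names here are examples, not taken from the commit.
const parsed = parseEnvFromString('GGML_VK_VISIBLE_DEVICES=0; MY_FLAG=a=b')
console.log(parsed)
```

Splitting on the first `=` only is a design choice in this sketch so that values containing `=` survive intact; the resulting record could then be merged into the environment handed to the backend process at load time, as the commit message describes.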