* feat: add getTokensCount method to compute token usage
Implemented a new async `getTokensCount` function in the LLaMA.cpp extension.
The method validates the model session, checks process health, applies the request template, and tokenizes the resulting prompt to return the token count. Includes detailed error handling for crashed models and API failures, enabling callers to assess token usage before sending completions.
* Fix: typos
* chore: update ui token usage
* chore: remove unused code
* feat: add image token handling for multimodal LlamaCPP models
Implemented support for counting image tokens when using vision-enabled models:
- Extended `SessionInfo` with optional `mmprojPath` to store the multimodal project file.
- Propagated `mmproj_path` from the Tauri plugin into the session info.
- Added import of `chatCompletionRequestMessage` and enhanced token calculation logic in the LlamaCPP extension:
- Detects image content in messages.
- Reads GGUF metadata from `mmprojPath` to compute accurate image token counts.
- Provides a fallback estimation if metadata reading fails.
- Returns the sum of text and image tokens.
- Introduced helper methods `calculateImageTokens` and `estimateImageTokensFallback`.
- Minor clean‑ups such as comment capitalization and debug logging.
* chore: update FE send params message include content type image_url
* fix mmproj path from session info and num tokens calculation
* fix: Correct image token estimation calculation in llamacpp extension
This commit addresses an inaccurate token count for images in the llama.cpp extension.
The previous logic incorrectly calculated the token count based on image patch size and dimensions. This has been replaced with a more precise method that uses the clip.vision.projection_dim value from the model metadata.
Additionally, unnecessary debug logging was removed, and a new log was added to show the mmproj metadata for improved visibility.
* fix per image calc
* fix: crash due to force unwrap
---------
Co-authored-by: Faisal Amir <urmauur@gmail.com>
Co-authored-by: Louis <louis@jan.ai>
* feat: Add model compatibility check and memory estimation
This commit introduces a new feature to check if a given model is supported based on available device memory.
The change includes:
- A new `estimateKVCache` method that calculates the required memory for the model's KV cache. It uses GGUF metadata such as `block_count`, `head_count`, `key_length`, and `value_length` to perform the calculation.
- An `isModelSupported` method that combines the model file size and the estimated KV cache size to determine the total memory required. It then checks if any available device has sufficient free memory to load the model.
- An updated error message for the `version_backend` check to be more user-friendly, suggesting a stable internet connection as a potential solution for backend setup failures.
This functionality helps prevent the application from attempting to load models that would exceed the device's memory capacity, leading to more stable and predictable behavior.
fixes: #5505
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Extend this to available system RAM if GGML device is not available
* fix: Improve model metadata and memory checks
This commit refactors the logic for checking if a model is supported by a system's available memory.
**Key changes:**
- **Remote model support**: The `read_gguf_metadata` function can now fetch metadata from a remote URL by reading the file in chunks.
- **Improved KV cache size calculation**: The KV cache size is now estimated more accurately by using `attention.key_length` and `attention.value_length` from the GGUF metadata, with a fallback to `embedding_length`.
- **Granular memory check statuses**: The `isModelSupported` function now returns a more specific status (`'RED'`, `'YELLOW'`, `'GREEN'`) to indicate whether the model weights or the KV cache are too large for the available memory.
- **Consolidated logic**: The logic for checking local and remote models has been consolidated into a single `isModelSupported` function, improving code clarity and maintainability.
These changes provide more robust and informative model compatibility checks, especially for models hosted on remote servers.
* Update extensions/llamacpp-extension/src/index.ts
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Make ctx_size optional and use sum free memory across ggml devices
* feat: hub and dropdown model selection handle model compatibility
* feat: update bage model info color
* chore: enable detail page to get compatibility model
* chore: update copy
* chore: update shrink indicator UI
---------
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Faisal Amir <urmauur@gmail.com>
* Refactor translation imports and update text for localization across settings and system monitor routes
- Changed translation import from 'react-i18next' to '@/i18n/react-i18next-compat' in multiple files.
- Updated various text strings to use translation keys for better localization support in:
- Local API Server settings
- MCP Servers settings
- Privacy settings
- Provider settings
- Shortcuts settings
- System Monitor
- Thread details
- Ensured consistent use of translation keys for all user-facing text.
Update web-app/src/routes/settings/appearance.tsx
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Update web-app/src/routes/settings/appearance.tsx
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Update web-app/src/locales/vn/settings.json
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Update web-app/src/containers/dialogs/DeleteMCPServerConfirm.tsx
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Update web-app/src/locales/id/common.json
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Add Chinese (Simplified and Traditional) localization files for various components
- Created `tools.json`, `updater.json`, `assistants.json`, `chat.json`, `common.json`, `hub.json`, `logs.json`, `mcp-servers.json`, `provider.json`, `providers.json`, `settings.json`, `setup.json`, `system-monitor.json`, `tool-approval.json` in both `zh-CN` and `zh-TW` locales.
- Added translations for tool approval, updater notifications, assistant management, chat interface, common UI elements, hub interactions, logging messages, MCP server configurations, provider management, settings options, setup instructions, and system monitoring.
* Refactor localization strings for improved clarity and consistency in English, Indonesian, and Vietnamese settings files
* Fix missing key and reword
* fix pr comment
---------
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>