4 Commits

Author SHA1 Message Date
Akarshan
9afeb5e514 feat: Add offload_mmproj option and validation
This commit introduces a new configuration option offload_mmproj to the llamacpp extension.

The offload_mmproj setting allows users to control whether the multimodal projector model is offloaded to the GPU. By default, it's offloaded for better performance. If set to false, the projector model will remain on the CPU, which can be useful in low GPU memory scenarios, though image processing might take longer.

Additionally, this commit adds validate_mmproj_path to ensure the provided --mmproj path is valid and accessible, preventing issues during model loading.

This change also refactors some invoke calls for improved readability.
2025-08-19 19:51:29 +07:00
Dinh Long Nguyen
e1c8d98bf2
Backend Architecture Refactoring (#6094) (#6162)
* add llamacpp plugin

* Refactor llamacpp plugin

* add utils plugin

* remove utils folder

* add hardware implementation

* add utils folder + move utils function

* organize cargo files

* refactor utils src

* refactor util

* apply fmt

* fmt

* Update gguf + reformat

* add permission for gguf commands

* fix cargo test windows

* revert yarn lock

* remove cargo.lock for hardware plugin

* ignore cargo.lock file

* Fix hardware invoke + refactor hardware + refactor tests, constants

* use api wrapper in extension to invoke hardware call + api wrapper build integration

* add newline at EOF (per Akarshan)

* add vi mock for getSystemInfo
2025-08-15 08:59:01 +07:00
Akarshan Biswas
f4661912b0
feat: Add GGUF metadata reading functionality (#6120)
* feat: Add GGUF metadata reading functionality

This commit introduces a new Tauri command and a corresponding function to read metadata from GGUF model files.

The new read_gguf_metadata command in the Rust backend uses the byteorder crate to parse the GGUF file format and extract key metadata. This information, including the file's version, tensor count, and a key-value map of other metadata, is then made available to the TypeScript frontend.

This functionality is a foundational step toward providing users with more detailed information about their loaded models directly within the application.

This will be refactored later.

fixes: #6001

* loadMetadata() should return

* Properly throw eror to FE

* Use BufReader to improve performance
2025-08-13 22:54:20 +05:30
Akarshan
dc82fd6051
fix windows test for short path 2025-08-07 20:16:43 +05:30