737 Commits

Author SHA1 Message Date
Louis
b2ce138ea0
test: add tests 2025-07-11 09:21:11 +07:00
Louis
b8259e7794
feat: add HF token setting 2025-07-11 00:05:52 +07:00
Louis
a770e08013
test: migrate jest to vitest 2025-07-10 21:14:21 +07:00
Louis
6e0218c084
Merge branch 'release/v0.7.0' into feat/inference-llamacpp-extension
# Conflicts:
#	.devcontainer/buildAppImage.sh
#	.github/workflows/template-tauri-build-linux-x64.yml
#	Makefile
#	core/src/node/extension/index.test.ts
#	package.json
#	src-tauri/tauri.conf.json
#	web-app/package.json
2025-07-10 15:36:41 +07:00
hiento09
3287e8b300
chore: enable test coverage (#5710)
* chore: enable test coverage
2025-07-07 11:24:13 +07:00
Akarshan
d4a3d6a0d6
Refactor session PID types from string to number across backend and extension
- Changed `pid` field in `SessionInfo` from `string` to `number`/`i32` in TypeScript and Rust.
- Updated `activeSessions` map key from `string` to `number` to align with the new PID type.
- Adjusted process monitoring logic to correctly handle numeric PIDs.
- Removed fallback UUID-based PID generation in favor of numeric fallback (-1).
- Added PID cleanup logic in `is_process_running` when the process is no longer alive.
- Bumped application version from 0.5.16 to 0.6.900 in `tauri.conf.json`.
2025-07-04 21:40:54 +05:30
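A minimal TypeScript sketch of the shape this refactor describes; `SessionInfo`, the numeric `pid`, and the -1 fallback come from the message above, while the remaining fields and the cleanup helper are assumed for illustration:

```typescript
// Illustrative only: field names beyond pid/modelId are assumptions.
interface SessionInfo {
  pid: number // was string; -1 is the numeric fallback when no real PID exists
  modelId: string
  port: number
}

// Map key changed from string to number to match the new PID type.
const activeSessions = new Map<number, SessionInfo>()

// Mirrors the cleanup described for is_process_running: drop dead PIDs.
function cleanupIfDead(pid: number, alive: boolean): void {
  if (!alive) activeSessions.delete(pid)
}
```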
Akarshan
dbdc031583
chore: store session_info in backend as well for API server (WIP) 2025-07-04 20:31:30 +05:30
Akarshan
ffef7b9cab
enhancement: Add custom Jinja chat template option
Adds a new configuration option `chat_template` to the Llama.cpp extension, allowing users to define a custom Jinja chat template for the model.

The template can be provided via a new input field in the settings, and if set, it will be passed to the Llama.cpp backend using the `--chat-template` argument. This enhances flexibility for users who require specific chat formatting beyond the GGUF default.

The `chat_template` is added to the `LlamacppConfig` type and conditionally pushed to the command arguments if it's provided. The placeholder text provides an example of a Jinja template structure.
2025-07-03 23:38:16 +07:00
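A short sketch of the conditional wiring described above; the `chat_template` field and the `--chat-template` flag are from the message, the surrounding config shape is assumed:

```typescript
// Only chat_template is taken from the commit; other fields are omitted.
interface LlamacppConfig {
  chat_template?: string // optional custom Jinja template
}

function buildChatTemplateArgs(cfg: LlamacppConfig): string[] {
  const args: string[] = []
  // Pass the flag only when the user set a template, so the GGUF's
  // built-in template stays the default otherwise.
  if (cfg.chat_template) {
    args.push('--chat-template', cfg.chat_template)
  }
  return args
}
```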
Akarshan
40f1fd4ffd
feat: Auto update backend implementation 2025-07-03 19:32:12 +05:30
Akarshan
c2493fc535
Fix camelCase 2025-07-03 09:13:33 +05:30
Akarshan
396573055f
Address bot's review comment and minor refactoring 2025-07-03 09:13:33 +05:30
Akarshan
37151ba926
Feat: Auto load and download default backend during first launch 2025-07-03 09:13:32 +05:30
Akarshan
449bf17692
Add process aliveness check 2025-07-02 12:29:03 +07:00
Louis
c6ac9f1d2a
feat: sync hub with model catalog 2025-07-02 12:29:01 +07:00
Louis
c9c1ff1778
refactor: clean up core node packages 2025-07-02 12:28:38 +07:00
Louis
b538d57207
feat: auto unload models on model start 2025-07-02 12:28:25 +07:00
Akarshan
0cbf35dc77
Add auto unload setting to llamacpp-extension 2025-07-02 12:28:25 +07:00
Akarshan
54691044d4
Add missing --jinja flag 2025-07-02 12:28:25 +07:00
Louis
8bd4a3389f
refactor: frontend uses new engine extension
# Conflicts:
#	extensions/model-extension/resources/default.json
#	web-app/src/containers/dialogs/DeleteProvider.tsx
#	web-app/src/routes/hub.tsx
2025-07-02 12:28:24 +07:00
Akarshan
48d1164858
feat: add embedding support to llamacpp extension
This commit introduces embedding functionality to the llamacpp extension. It allows users to generate embeddings for text inputs using the 'sentence-transformer-mini' model.  The changes include:

- Adding a new `embed` method to the `llamacpp_extension` class.
- Implementing model loading and API interaction for embeddings.
- Handling potential errors during API requests.
- Adding necessary types for embedding responses and data.
- The `load` method now accepts a boolean parameter that determines whether it should load an embedding model.
2025-07-02 12:27:36 +07:00
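A rough sketch of what such an `embed` call could look like against the llama.cpp server's OpenAI-compatible endpoint; only the method name, the error handling, and the response types come from the message, the request shape is an assumption:

```typescript
interface EmbeddingData {
  embedding: number[]
  index: number
}
interface EmbeddingResponse {
  data: EmbeddingData[]
}

async function embed(
  baseUrl: string,
  apiKey: string,
  input: string[]
): Promise<number[][]> {
  // llama.cpp's server exposes an OpenAI-style /v1/embeddings endpoint.
  const res = await fetch(`${baseUrl}/v1/embeddings`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ input }),
  })
  // Surface HTTP failures instead of silently returning a broken body.
  if (!res.ok) throw new Error(`embedding request failed: ${res.status}`)
  const json = (await res.json()) as EmbeddingResponse
  return json.data.map((d) => d.embedding)
}
```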
Akarshan
f463008362
feat: add model load wait to ensure model is ready before use 2025-07-02 12:27:35 +07:00
Akarshan
9d4e7cb2b8
fix: correct model_id to modelId in console error message
This change ensures the error message includes the correct model ID: the field is camelCased as `modelId` in the `sInfo` object.
2025-07-02 12:27:35 +07:00
Akarshan
d60257ebbd
Revert: extension/yarn.lock 2025-07-02 12:27:35 +07:00
Akarshan
dbcce86bb8
refactor: rename interfaces and add getLoadedModels
The changes include:
- Renaming interfaces (sessionInfo -> SessionInfo, unloadResult -> UnloadResult) for consistency
- Adding getLoadedModels() method to retrieve active model IDs
- Updating variable names from modelId to model_id for alignment
- Updating cleanup paths to use XDG-standard locations
- Improving type consistency across extension implementation
2025-07-02 12:27:35 +07:00
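The new accessor is simple enough to sketch; `SessionInfo`, `model_id`, and `getLoadedModels()` are named above, and the sessions map is carried over from the earlier PID refactor:

```typescript
interface SessionInfo {
  pid: number
  model_id: string // snake_case per the rename described above
}

const activeSessions = new Map<number, SessionInfo>()

// getLoadedModels(): the IDs of all models with an active session.
function getLoadedModels(): string[] {
  return Array.from(activeSessions.values()).map((s) => s.model_id)
}
```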
Akarshan
4ffc504150
style: Rename camelCase to snake_case in llamacpp extension code
Rename variable, struct, and enum names from camelCase to snake_case throughout the llamacpp extension codebase to align with Rust naming conventions. This change improves readability and consistency without altering functionality.
2025-07-02 12:27:34 +07:00
Akarshan
6c769c5db9
feat: refactor llama server process storage to use HashMap
Change the llama_server_process state from an Option<Child> to a HashMap<String, Child> to support managing multiple server instances by PID. This allows precise process tracking and termination, replacing the previous single-process limitation.

Previously, only one server process could be tracked at a time. Now, each process is stored with its PID as the key, enabling:
- Accurate session matching during unloading
- Proper termination of specific processes
- Better error handling for mismatched PIDs

The load_llama_model function now inserts processes into the map, and unload_llama_model removes them by PID.
2025-07-02 12:27:34 +07:00
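The change itself is in the Rust backend, but the bookkeeping pattern is easy to show in a TypeScript analogue using Node's `child_process` (illustrative only, not the actual Tauri code):

```typescript
import { spawn, type ChildProcess } from 'node:child_process'

// One entry per running server, keyed by PID, instead of a single
// optional child: the Option<Child> -> HashMap change in spirit.
const servers = new Map<number, ChildProcess>()

function loadLlamaModel(binPath: string, args: string[]): number {
  const child = spawn(binPath, args)
  const pid = child.pid ?? -1 // numeric fallback, mirroring the PID refactor
  servers.set(pid, child)
  return pid
}

function unloadLlamaModel(pid: number): boolean {
  const child = servers.get(pid)
  if (!child) return false // mismatched PID: nothing to terminate
  child.kill()
  servers.delete(pid)
  return true
}
```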
Thien Tran
525cc93d4a
fix system cudart detection on linux 2025-07-02 12:27:34 +07:00
Thien Tran
95944fa081
add Jan's library path to path 2025-07-02 12:27:17 +07:00
Thien Tran
65d6f34878
check for system libraries 2025-07-02 12:27:17 +07:00
Thien Tran
622f4118c0
add placeholder for windows and linux arm 2025-07-02 12:27:17 +07:00
Thien Tran
f7bcf43334
update folder structure. small refactoring 2025-07-02 12:27:16 +07:00
Thien Tran
3b72d80979
fix wrong key for backend 2025-07-02 12:27:16 +07:00
Akarshan Biswas
331c0e04a5
fix: use modelId instead of sessionId for unloading
The loop now extracts session info to retrieve the model ID, ensuring correct unloading of sessions by their associated model identifiers rather than session IDs. This aligns the cleanup process with the actual model resources being managed.
2025-07-02 12:27:16 +07:00
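In sketch form, the loop goes from unloading by raw session key to resolving each session's model ID first (names assumed):

```typescript
interface SessionInfo {
  pid: number
  modelId: string
}

async function cleanupSessions(
  sessions: Map<number, SessionInfo>,
  unload: (modelId: string) => Promise<void>
): Promise<void> {
  // Unload by the model identifier each session manages,
  // not by the session/pid key itself.
  for (const info of sessions.values()) {
    await unload(info.modelId)
  }
}
```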
Akarshan Biswas
e3d6cbd80f
feat: add port parameter to generateApiKey for secure model-specific API keys
The generateApiKey method now incorporates the model's port to create a unique,
port-specific API key, enhancing security by ensuring keys are tied to both
model ID and port. This change supports better isolation between models
running on different ports. Code formatting improvements were also made
for consistency and readability.
2025-07-02 12:27:16 +07:00
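Combined with the HMAC-SHA256 scheme introduced further down this log, a port-aware derivation might look like the following; the exact input format is an assumption:

```typescript
import { createHmac } from 'node:crypto'

// The key is derived from both the model ID and the port, so the same
// model on a different port (or a different model) never shares a key.
function generateApiKey(modelId: string, port: number, secret: string): string {
  return createHmac('sha256', secret).update(`${modelId}:${port}`).digest('hex')
}
```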
Akarshan Biswas
4dfdcd68d5
refactor: rename session identifiers to pid and modelId
The changes standardize identifier names across the codebase for clarity:
- Replaced `sessionId` with `pid` to reflect process ID usage
- Changed `modelName` to `modelId` for consistency with identifier naming
- Renamed `api_key` to `apiKey` for camelCase consistency
- Updated corresponding methods to use these new identifiers
- Improved type safety and readability by aligning variable names with their semantic meaning
2025-07-02 12:27:16 +07:00
Akarshan Biswas
5d61062b0e
feat: enhance argument parsing and add API key generation
The changes improve the robustness of command-line argument parsing in the Llama model server by replacing direct index access with safe iteration methods. A new generate_api_key function was added to handle API key generation securely. The sessionId parameter was standardized to match the renamed property in the client code.
2025-07-02 12:27:15 +07:00
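The parsing change is in the Rust server code; the idea, iterating with bounds checks instead of indexing blindly, translates to TypeScript like this (an analogue, not the actual code):

```typescript
// Parse "--flag value" pairs without assuming a value follows every flag.
function parseArgs(argv: string[]): Map<string, string> {
  const opts = new Map<string, string>()
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i]
    if (!arg.startsWith('--')) continue
    const next = argv[i + 1]
    // Consume a value only if one is present and isn't itself a flag.
    if (next !== undefined && !next.startsWith('--')) {
      opts.set(arg.slice(2), next)
      i++
    } else {
      opts.set(arg.slice(2), '') // bare flag
    }
  }
  return opts
}
```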
Thien Tran
6679debf72
mkdir before write yaml 2025-07-02 12:27:15 +07:00
Thien Tran
1ae7c0b59a
update version/backend format. fix bugs around load() 2025-07-02 12:27:15 +07:00
Akarshan Biswas
fd9e034461
feat: update AIEngine load method and backend path handling
- Changed load method to accept modelId instead of loadOptions for better clarity and simplicity
- Renamed engineBasePath parameter to backendPath for consistency with the backend's directory structure
- Added getRandomPort method to ensure unique ports for each session to prevent conflicts
- Refactored configuration and model loading logic to improve maintainability and reduce redundancy
2025-07-02 12:27:15 +07:00
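A plausible shape for the `getRandomPort` helper mentioned above, drawing from the ephemeral range and avoiding ports this process has already handed out (the collision strategy is an assumption):

```typescript
const usedPorts = new Set<number>()

// Pick an unused ephemeral port; real code would still need to handle
// the port being bound by another process in the meantime.
function getRandomPort(): number {
  for (let attempt = 0; attempt < 100; attempt++) {
    const port = 49152 + Math.floor(Math.random() * (65536 - 49152))
    if (!usedPorts.has(port)) {
      usedPorts.add(port)
      return port
    }
  }
  throw new Error('could not find a free port')
}
```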
Thien Tran
9e24e28341
add await to config 2025-07-02 12:27:15 +07:00
Thien Tran
070d8534c4
add some string validation 2025-07-02 12:27:14 +07:00
Thien Tran
494a47aaa5
fix download condition 2025-07-02 12:27:14 +07:00
Thien Tran
f32ae402d5
fix CUDA version URL 2025-07-02 12:27:14 +07:00
Thien Tran
27146eb5cc
fix feature parsing 2025-07-02 12:27:14 +07:00
Thien Tran
a75d13f42f
fix version compare 2025-07-02 12:27:14 +07:00
Thien Tran
3490299f66
refactor get supported features. check driver version for cu11 and cu12 2025-07-02 12:27:13 +07:00
Akarshan Biswas
07d76dc871
feat: Allow specifying mmproj path during model loading
The `loadOptions` interface in `AIEngine.ts` now includes an optional `mmprojPath` property. This lets users provide a path to a multimodal projector (mmproj) file when loading a model, which certain model types require. `llamacpp-extension/src/index.ts` has been updated to pass this option to the llamacpp server when provided.
2025-07-02 12:27:13 +07:00
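A sketch of the optional property and its forwarding; `mmprojPath` is from the message, the rest of the options shape is assumed:

```typescript
interface LoadOptions {
  modelPath: string
  mmprojPath?: string // optional multimodal projector file
}

function buildLoadArgs(opts: LoadOptions): string[] {
  const args = ['--model', opts.modelPath]
  // Forward the projector only when the caller supplied one;
  // llama.cpp's server accepts it via --mmproj.
  if (opts.mmprojPath) {
    args.push('--mmproj', opts.mmprojPath)
  }
  return args
}
```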
Thien Tran
fbfaaf43c5
download CUDA libs if needed 2025-07-02 12:27:13 +07:00
Thien Tran
40cd7e962a
feat: download backend for llama.cpp extension (#5123)
* wip

* update

* add download logic

* add decompress. support delete file

* download backend upon selecting setting

* add some logging and notes

* add note on race condition

* remove then catch

* default to none backend. only download if it's not installed

* merge version and backend. fetch version from GH

* restrict scope of output_dir

* add note on unpack
2025-07-02 12:27:13 +07:00
Akarshan Biswas
da23673a44
feat: Add API key generation for Llama.cpp
This commit introduces API key generation for the Llama.cpp extension.  The API key is now generated on the server side using HMAC-SHA256 and a secret key to ensure security and uniqueness.  The frontend now passes the model ID and API secret to the server to generate the key. This addresses the requirement for secure model access and authorization.
2025-07-02 12:27:12 +07:00