21 Commits

Author SHA1 Message Date
Akarshan Biswas
fd9e034461
feat: update AIEngine load method and backend path handling
- Changed load method to accept modelId instead of loadOptions for better clarity and simplicity
- Renamed engineBasePath parameter to backendPath for consistency with the backend's directory structure
- Added getRandomPort method to ensure unique ports for each session to prevent conflicts
- Refactored configuration and model loading logic to improve maintainability and reduce redundancy
2025-07-02 12:27:15 +07:00
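The getRandomPort idea from the commit above can be sketched as follows. This is a minimal illustration, not the extension's actual code: the function name `get_random_port` and the loopback default are assumptions, and it relies on the common technique of binding to port 0 so the operating system picks a free port.

```python
import socket


def get_random_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the OS to assign any free ephemeral port.
        sock.bind((host, 0))
        # getsockname() returns (address, port) for AF_INET sockets.
        return sock.getsockname()[1]


if __name__ == "__main__":
    print(get_random_port())
```

The port is released when the socket closes, so another process could claim it before the session's server binds to it. A caller would need to retry on a bind failure.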
Thien Tran
40cd7e962a
feat: download backend for llama.cpp extension (#5123)
* wip

* update

* add download logic

* add decompress. support delete file

* download backend upon selecting setting

* add some logging and notes

* add note on race condition

* remove then catch

* default to none backend. only download if it's not installed

* merge version and backend. fetch version from GH

* restrict scope of output_dir

* add note on unpack
2025-07-02 12:27:13 +07:00
Akarshan Biswas
da23673a44
feat: Add API key generation for Llama.cpp
This commit introduces API key generation for the Llama.cpp extension.  The API key is now generated on the server side using HMAC-SHA256 and a secret key to ensure security and uniqueness.  The frontend now passes the model ID and API secret to the server to generate the key. This addresses the requirement for secure model access and authorization.
2025-07-02 12:27:12 +07:00
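A minimal sketch of the HMAC-SHA256 key derivation described in the commit above. The commit only states that the model ID and an API secret are combined with HMAC-SHA256. The function names, the choice of the secret as the HMAC key and the model ID as the message, and the hex encoding are all assumptions made for illustration.

```python
import hashlib
import hmac


def generate_api_key(model_id: str, api_secret: str) -> str:
    """Derive a per-model API key (hypothetical layout: secret is the HMAC key)."""
    mac = hmac.new(
        api_secret.encode("utf-8"),  # assumed: secret used as the HMAC key
        model_id.encode("utf-8"),    # assumed: model ID used as the message
        hashlib.sha256,
    )
    return mac.hexdigest()  # assumed: hex encoding, 64 characters


def verify_api_key(candidate: str, model_id: str, api_secret: str) -> bool:
    """Check a presented key using a constant-time comparison."""
    expected = generate_api_key(model_id, api_secret)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    key = generate_api_key("example-model", "session-secret")
    print(key, verify_api_key(key, "example-model", "session-secret"))
```

Because the key depends on both inputs, a secret regenerated each session would yield a different key for the same model ID.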
Thien Tran
39bb3f34d6
patch failing calls to cortex
2025-07-02 12:27:12 +07:00
Akarshan Biswas
31971e7821
(WIP) randomly generate api-key hash each session
2025-07-02 12:27:12 +07:00
Thien Tran
5803fcdb99
add read_yaml. use buffered reader/writer
2025-07-02 12:27:11 +07:00
Thien Tran
d01cbe44ae
use PathBuf to check exists()
2025-07-02 12:27:11 +07:00
Akarshan Biswas
c5a0ee7f6e
refactor unload and implement a destructor to clean up sessions
2025-07-02 12:27:10 +07:00
Thien Tran
ded9ae733a
feat: Model import (download + local import) for llama.cpp extension (#5087)
* add pull and abortPull

* add model import (download only)

* write model.yaml. support local model import

* remove cortex-related command

* add TODO

* remove cortex-related command
2025-07-02 12:27:09 +07:00
Akarshan Biswas
a7a2dcc8d8
refactor load/unload again; move types to core and refactor AIEngine abstract class
2025-07-02 12:27:09 +07:00
Akarshan Biswas
bbbf4779df
refactor load/unload
2025-07-02 12:27:08 +07:00
Akarshan Biswas
021f8ae80f
Fixup: llama-server load
2025-07-02 12:27:08 +07:00
Akarshan Biswas
a8abc9f9aa
Resolved conflicts by keeping HEAD changes
2025-07-02 12:27:07 +07:00
Thien Tran
15f0b11c0d
make it compile
2025-07-02 12:26:38 +07:00
Akarshan Biswas
0551b0bfd2
Fix import
2025-07-02 12:26:38 +07:00
Akarshan Biswas
5c9e8dce76
Add spaces before EOF
2025-07-02 12:26:38 +07:00
Akarshan Biswas
9016fbff68
feat: inference-llamacpp-extension: backend implementation
2025-07-02 12:26:37 +07:00
Thien Tran
6415be9c74
feat: Support download resume (#5111)
* initial support

* append instead of replace extension
2025-05-27 10:46:49 +08:00
Thien Tran
56f4ec3b61
feat: improve download extension (#5073)
2025-05-23 16:49:41 +08:00
Thien Tran
4bde6645d0
feat: Download manager for llama.cpp extension (#4933)
2025-05-16 15:01:42 +08:00
Louis
6f53f1056a
refactor: Jan manages threads for a better performance (#4912)
* refactor: Jan manages threads for a better performance

* test: add tests
2025-05-15 17:10:52 +07:00