5169 Commits

Author SHA1 Message Date
Thien Tran
ae349159ce
remove yarn install:cortex 2025-07-02 12:27:33 +07:00
Thien Tran
95944fa081
add Jan's library path to path 2025-07-02 12:27:17 +07:00
Thien Tran
65d6f34878
check for system libraries 2025-07-02 12:27:17 +07:00
Thien Tran
1eb49350e9
add is_library_available command 2025-07-02 12:27:17 +07:00
Thien Tran
622f4118c0
add placeholder for windows and linux arm 2025-07-02 12:27:17 +07:00
Thien Tran
f7bcf43334
update folder structure. small refactoring 2025-07-02 12:27:16 +07:00
Thien Tran
3b72d80979
fix wrong key for backend 2025-07-02 12:27:16 +07:00
Akarshan Biswas
331c0e04a5
fix: use modelId instead of sessionId for unloading
The loop now extracts session info to retrieve the model ID, ensuring correct unloading of sessions by their associated model identifiers rather than session IDs. This aligns the cleanup process with the actual model resources being managed.
2025-07-02 12:27:16 +07:00
Akarshan Biswas
e3d6cbd80f
feat: add port parameter to generateApiKey for secure model-specific API keys
The generateApiKey method now incorporates the model's port to create a unique,
port-specific API key, enhancing security by ensuring keys are tied to both
model ID and port. This change supports better isolation between models
running on different ports. Code formatting improvements were also made
for consistency and readability.
2025-07-02 12:27:16 +07:00
Akarshan Biswas
4dfdcd68d5
refactor: rename session identifiers to pid and modelId
The changes standardize identifier names across the codebase for clarity:
- Replaced `sessionId` with `pid` to reflect process ID usage
- Changed `modelName` to `modelId` for consistency with identifier naming
- Renamed `api_key` to `apiKey` for camelCase consistency
- Updated corresponding methods to use these new identifiers
- Improved type safety and readability by aligning variable names with their semantic meaning
2025-07-02 12:27:16 +07:00
Akarshan Biswas
f9d3935269
feat: allow specifying port via command line argument
This change allows the port to be specified via command line arguments, providing flexibility. The port is parsed from the arguments, defaulting to 8080 if not provided.
2025-07-02 12:27:16 +07:00
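A minimal sketch of the behaviour this commit describes (read the port from the arguments, fall back to 8080). The `--port` flag name, the `parsePort` helper, and the range check are assumptions for illustration, not the code from the commit:

```typescript
// Default taken from the commit message; everything else here is hypothetical.
const DEFAULT_PORT = 8080;

// Scan an argument list for `--port <n>` without indexing past the end.
function parsePort(args: string[]): number {
  const idx = args.indexOf("--port");
  // Flag absent, or present without a following value: use the default.
  if (idx === -1 || idx + 1 >= args.length) return DEFAULT_PORT;

  const value = Number(args[idx + 1]);
  // Reject non-numeric input and values outside the valid TCP port range.
  const valid = Number.isInteger(value) && value >= 1 && value <= 65535;
  return valid ? value : DEFAULT_PORT;
}
```

Falling back to the default on malformed input keeps the server startable; a stricter variant could raise an error instead.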
Akarshan Biswas
5d61062b0e
feat: enhance argument parsing and add API key generation
The changes improve the robustness of command-line argument parsing in the Llama model server by replacing direct index access with safe iteration methods. A new generate_api_key function was added to handle API key generation securely. The sessionId parameter was standardized to match the renamed property in the client code.
2025-07-02 12:27:15 +07:00
Thien Tran
6679debf72
mkdir before write yaml 2025-07-02 12:27:15 +07:00
Thien Tran
1ae7c0b59a
update version/backend format. fix bugs around load() 2025-07-02 12:27:15 +07:00
Akarshan Biswas
fd9e034461
feat: update AIEngine load method and backend path handling
- Changed load method to accept modelId instead of loadOptions for better clarity and simplicity
- Renamed engineBasePath parameter to backendPath for consistency with the backend's directory structure
- Added getRandomPort method to ensure unique ports for each session to prevent conflicts
- Refactored configuration and model loading logic to improve maintainability and reduce redundancy
2025-07-02 12:27:15 +07:00
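One way a `getRandomPort` helper could avoid clashes between sessions is to track the ports already handed out. The sketch below is an assumption about the approach: the `usedPorts` set, the 3000-3999 default range, and the signature are hypothetical and do not come from the commit.

```typescript
// Pick a random port from [min, max] that is not already assigned to a session.
function getRandomPort(
  usedPorts: ReadonlySet<number>,
  min = 3000, // hypothetical range, chosen only for illustration
  max = 3999
): number {
  // Collect the candidates that are still free.
  const free: number[] = [];
  for (let port = min; port <= max; port++) {
    if (!usedPorts.has(port)) free.push(port);
  }
  if (free.length === 0) {
    throw new Error(`no free port in range ${min}-${max}`);
  }
  // Choose uniformly among the free candidates.
  return free[Math.floor(Math.random() * free.length)];
}
```

This only prevents collisions between sessions tracked in `usedPorts`; it does not check whether another process on the machine already holds the port.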
Thien Tran
9e24e28341
add await to config 2025-07-02 12:27:15 +07:00
Thien Tran
070d8534c4
add some string validation 2025-07-02 12:27:14 +07:00
Thien Tran
494a47aaa5
fix download condition 2025-07-02 12:27:14 +07:00
Thien Tran
f32ae402d5
fix CUDA version URL 2025-07-02 12:27:14 +07:00
Thien Tran
27146eb5cc
fix feature parsing 2025-07-02 12:27:14 +07:00
Thien Tran
a75d13f42f
fix version compare 2025-07-02 12:27:14 +07:00
Thien Tran
3490299f66
refactor get supported features. check driver version for cu11 and cu12 2025-07-02 12:27:13 +07:00
Akarshan Biswas
267bbbf77b
feat: add model and mmproj paths to ImportOptions
The `ImportOptions` interface was updated to include `modelPath` and `mmprojPath`. These options are required for importing models and their multimodal projector (mmproj) files.
2025-07-02 12:27:13 +07:00
Akarshan Biswas
07d76dc871
feat: Allow specifying mmproj path during model loading
The `loadOptions` interface in `AIEngine.ts` now includes an optional `mmprojPath` property. This allows users to provide a path to an mmproj (multimodal projector) file when loading a model, which is required for certain model types. `llamacpp-extension/src/index.ts` has been updated to pass this option to the llamacpp server if provided.
2025-07-02 12:27:13 +07:00
Thien Tran
fbfaaf43c5
download CUDA libs if needed 2025-07-02 12:27:13 +07:00
Thien Tran
40cd7e962a
feat: download backend for llama.cpp extension (#5123)
* wip

* update

* add download logic

* add decompress. support delete file

* download backend upon selecting setting

* add some logging and notes

* add note on race condition

* remove then catch

* default to none backend. only download if it's not installed

* merge version and backend. fetch version from GH

* restrict scope of output_dir

* add note on unpack
2025-07-02 12:27:13 +07:00
Akarshan Biswas
da23673a44
feat: Add API key generation for Llama.cpp
This commit introduces API key generation for the Llama.cpp extension.  The API key is now generated on the server side using HMAC-SHA256 and a secret key to ensure security and uniqueness.  The frontend now passes the model ID and API secret to the server to generate the key. This addresses the requirement for secure model access and authorization.
2025-07-02 12:27:12 +07:00
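The HMAC-SHA256 derivation described in this commit could look roughly like the sketch below. It uses Node's built-in `createHmac`; the `generateApiKey` signature, using the model ID as the message, and the base64 output encoding are assumptions, since the commit message does not spell them out.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper: derive a per-model key by signing the model ID
// with the shared secret using HMAC-SHA256.
function generateApiKey(modelId: string, apiSecret: string): string {
  return createHmac("sha256", apiSecret)
    .update(modelId)
    .digest("base64"); // encoding is an assumption, not taken from the commit
}
```

Because the output depends on both inputs, the same model ID and secret always produce the same key, while a different model ID or secret produces a different one.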
Akarshan Biswas
d6edb1e944
If checking for proper ctx_len settings after refactoring 2025-07-02 12:27:12 +07:00
Thien Tran
39bb3f34d6
patch failing calls to cortex 2025-07-02 12:27:12 +07:00
Akarshan Biswas
31971e7821
(WIP) randomly generate api-key hash each session 2025-07-02 12:27:12 +07:00
Akarshan Biswas
1dd762f0cf
remove parseGGUFFileName function as it is not used 2025-07-02 12:27:12 +07:00
Akarshan Biswas
7481fae0df
remove unused imports and remove n_ctx key from loadOptions 2025-07-02 12:27:11 +07:00
Akarshan Biswas
77d861f56f
Fixup: change key to ctx_size to align with upstream and remove duplicate key 2025-07-02 12:27:11 +07:00
Thien Tran
d5c07acdb5
feat: add LlamacppConfig for llama.cpp extension to improve settings (#5121)
* add engine settings

* update load options

* rename variable
2025-07-02 12:27:11 +07:00
Thien Tran
9bb4deeb78
update model config (import and list) 2025-07-02 12:27:11 +07:00
Thien Tran
5803fcdb99
add read_yaml. use buffered reader/writer 2025-07-02 12:27:11 +07:00
Thien Tran
d01cbe44ae
use PathBuf to check exists() 2025-07-02 12:27:11 +07:00
Thien Tran
77f6770333
update fileStat() 2025-07-02 12:27:10 +07:00
Akarshan Biswas
742e731e96
Add --reasoning_budget option 2025-07-02 12:27:10 +07:00
Akarshan Biswas
fe457a5368
slight modelbasepath refactoring 2025-07-02 12:27:10 +07:00
Akarshan Biswas
c5a0ee7f6e
refactor unload and implement a destructor to clean up sessions 2025-07-02 12:27:10 +07:00
Thien Tran
cd36b423b6
add basic model list 2025-07-02 12:27:10 +07:00
Thien Tran
d523166b61
implement delete 2025-07-02 12:27:09 +07:00
Akarshan Biswas
587ed3c83c
refactor OAI request payload type to support image and audio 2025-07-02 12:27:09 +07:00
Thien Tran
ded9ae733a
feat: Model import (download + local import) for llama.cpp extension (#5087)
* add pull and abortPull

* add model import (download only)

* write model.yaml. support local model import

* remove cortex-related command

* add TODO

* remove cortex-related command
2025-07-02 12:27:09 +07:00
Akarshan Biswas
a7a2dcc8d8
refactor load/unload again; move types to core and refactor AIEngine abstract class 2025-07-02 12:27:09 +07:00
Akarshan Biswas
ee2cb9e625
remove override from localOAIEngine and OAIEngine 2025-07-02 12:27:09 +07:00
Akarshan Biswas
0e9a8a27e5
fixup from refactoring 2025-07-02 12:27:08 +07:00
Akarshan Biswas
bbbf4779df
refactor load/unload 2025-07-02 12:27:08 +07:00
Akarshan Biswas
b4670b5526
remove cortex engine dirs 2025-07-02 12:27:08 +07:00