* fix: support loading model configurations
* chore: remove log
* chore: add sampling params from send completion
* chore: remove comment
* chore: remove comment on predefined file
* chore: update test model service
* refactor: move thinking toggle to runtime settings for per-message control
Replaces the static `reasoning_budget` config with a dynamic `enable_thinking` flag under `chat_template_kwargs`, allowing models like Jan-nano and Qwen3 to enable/disable thinking behavior at runtime, even mid-conversation.
Requires UI update
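A minimal sketch of the per-message toggle, assuming an OpenAI-style chat request body. Only `chat_template_kwargs.enable_thinking` comes from this change; `ChatRequest`, `buildChatRequest`, and the model name are hypothetical names used for illustration.

```typescript
// Hypothetical request types; only `chat_template_kwargs.enable_thinking`
// is taken from the commit description.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  chat_template_kwargs?: { enable_thinking: boolean };
}

// Build one request. The flag is set per call, so it can change
// between messages of the same conversation.
function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  enableThinking: boolean,
): ChatRequest {
  return {
    model,
    messages,
    chat_template_kwargs: { enable_thinking: enableThinking },
  };
}

const history: ChatMessage[] = [{ role: "user", content: "Summarize this file." }];
const withThinking = buildChatRequest("qwen3", history, true);
const withoutThinking = buildChatRequest("qwen3", history, false);
```

Because the flag travels with each request rather than with the model's load-time configuration, no reload should be needed to switch it.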
* remove engine argument
- Changed `pid` field in `SessionInfo` from `string` to `number`/`i32` in TypeScript and Rust.
- Updated `activeSessions` map key from `string` to `number` to align with new PID type.
- Adjusted process monitoring logic to correctly handle numeric PIDs.
- Removed fallback UUID-based PID generation in favor of numeric fallback (-1).
- Added PID cleanup logic in `is_process_running` when the process is no longer alive.
- Bumped application version from 0.5.16 to 0.6.900 in `tauri.conf.json`.
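The TypeScript side of the PID change might look roughly like the sketch below. `pid: number`, the numeric `activeSessions` key, and the `-1` fallback come from the list above; the other fields, `registerSession`, and `pruneIfDead` are hypothetical, and the liveness check is injected because the real one lives in Rust (`is_process_running`).

```typescript
interface SessionInfo {
  pid: number; // previously a string
  modelId: string; // hypothetical field, for illustration only
  port: number; // hypothetical field, for illustration only
}

// The map is keyed by the numeric PID.
const activeSessions = new Map<number, SessionInfo>();

// Numeric fallback that replaces the old UUID-based value.
const FALLBACK_PID = -1;

function registerSession(
  info: Omit<SessionInfo, "pid"> & { pid?: number },
): SessionInfo {
  const session: SessionInfo = { ...info, pid: info.pid ?? FALLBACK_PID };
  activeSessions.set(session.pid, session);
  return session;
}

// Mirrors the cleanup idea: drop the entry once the process is gone.
// Returns whether the process is still alive.
function pruneIfDead(pid: number, isAlive: (pid: number) => boolean): boolean {
  if (isAlive(pid)) return true;
  activeSessions.delete(pid);
  return false;
}
```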
This commit introduces embedding functionality to the llamacpp extension. It allows users to generate embeddings for text inputs using the 'sentence-transformer-mini' model. The changes include:
- Adding a new `embed` method to the `llamacpp_extension` class.
- Implementing model loading and API interaction for embeddings.
- Handling potential errors during API requests.
- Adding necessary types for embedding responses and data.
- Adding a boolean parameter to the `load` method that determines whether it should load an embedding model.
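One way the request and error handling could fit together is sketched below. It assumes the server exposes an OpenAI-style `/v1/embeddings` endpoint; the type names, `FetchLike`, and the helper functions are hypothetical and are not the extension's actual API.

```typescript
// Assumed OpenAI-style payloads; the extension's real types may differ.
interface EmbeddingData {
  embedding: number[];
  index: number;
}

interface EmbeddingResponse {
  model: string;
  data: EmbeddingData[];
}

// Minimal fetch-like signature so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

function buildEmbeddingBody(model: string, input: string[]): string {
  return JSON.stringify({ model, input });
}

// Validate the payload and return vectors ordered by their `index` field.
function parseEmbeddingResponse(payload: unknown): number[][] {
  const response = payload as Partial<EmbeddingResponse>;
  if (!Array.isArray(response.data)) {
    throw new Error("Malformed embedding response: missing `data` array");
  }
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

async function embed(
  fetchFn: FetchLike,
  baseUrl: string,
  model: string,
  input: string[],
): Promise<number[][]> {
  const res = await fetchFn(`${baseUrl}/v1/embeddings`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildEmbeddingBody(model, input),
  });
  if (!res.ok) {
    throw new Error(`Embedding request failed with status ${res.status}`);
  }
  return parseEmbeddingResponse(await res.json());
}
```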
The changes include:
- Renaming interfaces (sessionInfo -> SessionInfo, unloadResult -> UnloadResult) for consistency
- Adding getLoadedModels() method to retrieve active model IDs
- Updating variable names from modelId to model_id for alignment
- Updating cleanup paths to use XDG-standard locations
- Improving type consistency across extension implementation
Add comprehensive sampling parameters for fine-grained control over AI output generation, including dynamic temperature, Mirostat sampling, repetition penalties, and advanced prompt handling. These parameters enable more precise tuning of model behavior and output quality.
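As an illustration, the sampling options could be typed and merged into a completion request as sketched here. The field names follow llama.cpp server request options and are an assumption about what this change exposes; `SamplingParams` and `applySampling` are hypothetical names.

```typescript
// Assumed subset of sampling options, named after llama.cpp server request fields.
interface SamplingParams {
  temperature?: number;
  dynatemp_range?: number; // dynamic temperature range
  dynatemp_exponent?: number; // dynamic temperature exponent
  top_k?: number;
  top_p?: number;
  repeat_penalty?: number;
  repeat_last_n?: number;
  mirostat?: 0 | 1 | 2; // 0 = off, 1 = Mirostat, 2 = Mirostat 2.0
  mirostat_tau?: number;
  mirostat_eta?: number;
}

// Copy only the parameters that were explicitly set, so unset options
// fall back to the server's own defaults.
function applySampling(
  body: Record<string, unknown>,
  params: SamplingParams,
): Record<string, unknown> {
  const merged: Record<string, unknown> = { ...body };
  for (const key of Object.keys(params) as Array<keyof SamplingParams>) {
    const value = params[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}
```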
The changes standardize identifier names across the codebase for clarity:
- Replaced `sessionId` with `pid` to reflect process ID usage
- Changed `modelName` to `modelId` for consistency with identifier naming
- Renamed `api_key` to `apiKey` for camelCase consistency
- Updated corresponding methods to use these new identifiers
- Improved type safety and readability by aligning variable names with their semantic meaning
- Changed load method to accept modelId instead of loadOptions for better clarity and simplicity
- Renamed engineBasePath parameter to backendPath for consistency with the backend's directory structure
- Added getRandomPort method to ensure unique ports for each session to prevent conflicts
- Refactored configuration and model loading logic to improve maintainability and reduce redundancy
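A possible shape for `getRandomPort`, assuming the extension tracks which ports its sessions already use. The port range, the parameters, and the injectable random source are assumptions made for the sketch, not the actual signature.

```typescript
// Pick a random port in [min, max] that no active session is using.
// The default range and the parameter list are hypothetical.
function getRandomPort(
  usedPorts: ReadonlySet<number>,
  min = 3000,
  max = 3999,
  random: () => number = Math.random,
): number {
  const free: number[] = [];
  for (let port = min; port <= max; port++) {
    if (!usedPorts.has(port)) free.push(port);
  }
  if (free.length === 0) {
    throw new Error(`No free port available in range ${min}-${max}`);
  }
  // random() is in [0, 1), so the index is always within bounds.
  return free[Math.floor(random() * free.length)];
}
```

Choosing from the list of free ports, rather than retrying random guesses, guarantees termination even when most of the range is taken. It does not check whether another process on the machine holds the port.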
The `ImportOptions` interface was updated to include `modelPath` and `mmprojPath`. These options are required for importing models and their multimodal projector (mmproj) files.
The `loadOptions` interface in `AIEngine.ts` now includes an optional `mmprojPath` property. This allows users to provide a path to their mmproj (multimodal projector) file when loading a model, which is required for certain model types. `llamacpp-extension/src/index.ts` has been updated to pass this option to the llamacpp server if provided.
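Passing the option through might look like the sketch below, assuming the server is started with llama.cpp's `-m`, `--port`, and `--mmproj` command-line flags. `LoadOptionsSketch` and `buildServerArgs` are hypothetical names; only `mmprojPath` comes from this change.

```typescript
// Hypothetical stand-in for the real `loadOptions` interface.
interface LoadOptionsSketch {
  modelPath: string;
  port: number;
  mmprojPath?: string; // optional multimodal projector file
}

// Build the server argument list; `--mmproj` is added only when a path is given.
function buildServerArgs(opts: LoadOptionsSketch): string[] {
  const args = ["-m", opts.modelPath, "--port", String(opts.port)];
  if (opts.mmprojPath) {
    args.push("--mmproj", opts.mmprojPath);
  }
  return args;
}
```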
This commit introduces API key generation for the Llama.cpp extension. The API key is now generated on the server side using HMAC-SHA256 and a secret key to ensure security and uniqueness. The frontend now passes the model ID and API secret to the server to generate the key. This addresses the requirement for secure model access and authorization.
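The key derivation can be illustrated with Node's `crypto` module. This is a sketch only: the commit generates the key on the server side, and the message layout (the model ID alone) and the hex encoding used here are assumptions. `generateApiKey` is a hypothetical name.

```typescript
import { createHmac } from "node:crypto";

// Derive a per-model key: HMAC-SHA256 over the model ID, keyed by the secret.
// Message layout and output encoding are assumptions for this sketch.
function generateApiKey(modelId: string, apiSecret: string): string {
  return createHmac("sha256", apiSecret).update(modelId).digest("hex");
}
```

The derivation is deterministic, so the same model ID and secret always produce the same key, while a different model ID or secret produces a different one.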
* add pull and abortPull
* add model import (download only)
* write model.yaml. support local model import
* remove cortex-related command
* add TODO
* remove cortex-related command