Nicholai/jan - jan - Gitea: Git with a cup of tea

Author	SHA1	Message	Date
Faisal Amir	cbd2651a63	chore: update copy and refresh list when import from local machine	2025-09-11 09:52:09 +05:30
Faisal Amir	ba4dc6d1eb	enhancement: update ui dialog update llamacpp backend	2025-09-11 09:52:09 +05:30
Akarshan Biswas	7a174e621a	feat: Smart model management (#6390) * feat: Smart model management * New UI option – `memory_util` added to `settings.json` with a dropdown (high / medium / low) to let users control how aggressively the engine uses system memory. * Configuration updates – `LlamacppConfig` now includes `memory_util`; the extension class stores it in a new `memoryMode` property and handles updates through `updateConfig`. * System memory handling * Introduced `SystemMemory` interface and `getTotalSystemMemory()` to report combined VRAM + RAM. * Added helper methods `getKVCachePerToken`, `getLayerSize`, and a new `ModelPlan` type. * Smart model‑load planner – `planModelLoad()` computes: * Number of GPU layers that can fit in usable VRAM. * Maximum context length based on KV‑cache size and the selected memory utilization mode (high/medium/low). * Whether KV‑cache must be off‑loaded to CPU and the overall loading mode (GPU, Hybrid, CPU, Unsupported). * Detailed logging of the planning decision. * Improved support check – `isModelSupported()` now: * Uses the combined VRAM/RAM totals from `getTotalSystemMemory()`. * Applies an 80% usable‑memory heuristic. * Returns GREEN only when both weights and KV‑cache fit in VRAM, YELLOW when they fit only in total memory or require CPU off‑load, and RED when the model cannot fit at all. * Cleanup – Removed unused `GgufMetadata` import; updated imports and type definitions accordingly. * Documentation/comments – Added explanatory JSDoc comments for the new methods and clarified the return semantics of `isModelSupported`. * chore: migrate no_kv_offload from llamacpp setting to model setting * chore: add UI auto optimize model setting * feat: improve model loading planner with mmproj support and smarter memory budgeting * Extend `ModelPlan` with optional `noOffloadMmproj` flag to indicate when a multimodal projector can stay in VRAM. * Add `mmprojPath` parameter to `planModelLoad` and calculate its size, attempting to keep it on GPU when possible. * Refactor system memory detection: * Use `used_memory` (actual free RAM) instead of total RAM for budgeting. * Introduced `usableRAM` placeholder for future use. * Rewrite KV‑cache size calculation: * Properly handle GQA models via `attention.head_count_kv`. * Compute bytes per token as `nHeadKV * headDim * 2 * 2 * nLayer`. * Replace the old 70% VRAM heuristic with a more flexible budget: * Reserve a fixed VRAM amount and apply an overhead factor. * Derive usable system RAM from total memory minus VRAM. * Implement a robust allocation algorithm: * Prioritize placing the mmproj in VRAM. * Search for the best balance of GPU layers and context length. * Fallback strategies for hybrid and pure‑CPU modes with detailed safety checks. * Add extensive validation of model size, KV‑cache size, layer size, and memory mode. * Improve logging throughout the planning process for easier debugging. * Adjust final plan return shape to include the new `noOffloadMmproj` field. * remove unused variable --------- Co-authored-by: Faisal Amir <urmauur@gmail.com>	2025-09-11 09:48:03 +05:30
Dinh Long Nguyen	a30eb7f968	feat: Jan Web (reusing Jan Desktop UI) (#6298) * add platform guards * add service management * fix types * move to zustand for servicehub * update App Updater * update tauri missing move * update app updater * refactor: move PlatformFeatures to separate const file 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com> * change tauri fetch name * update implementation * update extension fetch * make web version run properly * disabled unused web settings * fix all tests * fix lint * fix tests * add mock for extension * fix build * update make and mise * fix tsconfig for web-extensions * fix loader type * cleanup * fix test * update error handling + mcp should be working * Update mcp init * use separate is_web_app build property * Remove fixed model catalog url * fix additional tests * fix download issue (event emitter not implemented correctly) * Update Title html * fix app logs * update root tsx render timing --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-05 01:47:46 +07:00
Faisal Amir	d922d7454d	fix: mcp sort list	2025-08-27 18:25:13 +07:00
Faisal Amir	742f9c1a70	fix: sort list when add server	2025-08-27 18:17:55 +07:00
Faisal Amir	75d189900c	fix: mcp cleanup dropdown tool available and sort list	2025-08-27 18:08:23 +07:00
Faisal Amir	62eb422934	chore: show model setting only for local provider	2025-08-25 11:26:56 +07:00
Louis	8e7378b70f	Merge pull request #6255 from menloresearch/fix/remove-experimental-toggle fix: remove experimental toggle	2025-08-21 12:51:25 +07:00
Faisal Amir	7b9e752301	Merge pull request #6250 from menloresearch/feat/local-api-server feat: run on startup setting for local api server	2025-08-21 12:43:13 +07:00
Louis	8de5c1709b	fix: test	2025-08-21 12:01:45 +07:00
Louis	cfbc6b9150	fix: remove experimental toggle	2025-08-21 11:54:34 +07:00
Louis	e6587844d0	Merge branch 'dev' into current-date-instruction	2025-08-21 11:41:30 +07:00
Louis	6850dda108	feat: MCP server error handling	2025-08-20 23:42:12 +07:00
Faisal Amir	39df7b22b9	chore: rename key runOnStartup from hooks useLocalApiServer	2025-08-20 22:37:45 +07:00
Faisal Amir	cfa68c5500	feat: run on startup setting for local api server	2025-08-20 21:56:53 +07:00
Louis	c018713676	feat: allow user to set max_attempt for MCP to avoid looping	2025-08-20 12:42:54 +07:00
Faisal Amir	5481ee9e35	Merge pull request #6134 from menloresearch/feat/attachment-ui feat: attachment UI	2025-08-20 10:04:32 +07:00
Kamal Fariz Mahyuddin	df27def9cb	Merge branch 'dev' into current-date-instruction	2025-08-19 14:40:08 -07:00
Louis	91f05b8f32	feat: add tool call cancellation	2025-08-19 23:27:12 +07:00
Faisal Amir	cef3e122ff	chore: send attachment file when send message	2025-08-19 19:51:01 +07:00
Dinh Long Nguyen	9ea9b7d87d	handle abort properly + finally clause to resolve (#6227)	2025-08-19 14:45:57 +07:00
Dinh Long Nguyen	2d486d7b3a	feat: add support for reasoning fields (OpenRouter) (#6206) * add support for reasoning fields (OpenRouter) * reformat * fix linter * Update web-app/src/utils/reasoning.ts Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-08-18 21:59:14 +07:00
Louis	362324cb87	Merge pull request #6188 from menloresearch/feat/mcp-enhancement feat: mcp enhancement	2025-08-18 09:55:44 +07:00
Faisal Amir	b1b2ca1987	Merge pull request #6006 from menloresearch/feat/fav-model 🚀feat: allow user mark model as favorite	2025-08-17 23:14:26 +07:00
Kamal Fariz Mahyuddin	b77c8932a6	feat: support inserting current date into assistant prompt	2025-08-17 00:24:00 -07:00
Jasper Morgal	4ba56f1377	Fix Issue #6199 Fix Issue: Jan UI Bottlenecks Token Rendering Speed to ~300 TPS Despite Faster Cerebras API Output	2025-08-15 15:00:29 -07:00
Louis	c8d9592ab8	chore: mcp group server, action and import json	2025-08-15 11:37:21 +07:00
Louis	dcb46174ff	fix: test	2025-08-14 14:30:43 +07:00
Minh141120	aa8fb0464c	Merge branch 'dev' into fix/feature-toggle-auto-updater	2025-08-14 13:42:27 +07:00
Minh141120	388959a1fe	chore: gate check auto updater	2025-08-14 12:39:48 +07:00
Louis	16bfd6eafb	fix: full url search	2025-08-14 11:33:03 +07:00
Louis	526e532e2d	fix: normalize model id from source preparation	2025-08-14 10:50:50 +07:00
Faisal Amir	985a8f31ae	fix: migrations model setting (#6165)	2025-08-13 18:21:48 +07:00
Louis	8e5fac83fd	fix: deprecate addSource tests since the function was removed	2025-08-12 11:25:47 +07:00
Louis	736790473e	fix: duplicate model while searching	2025-08-12 11:17:00 +07:00
Louis	b924156a15	fix: bring back GPU detection	2025-08-11 13:52:20 +07:00
Louis	4f5d9b8222	Merge pull request #6089 from menloresearch/fix/clean-up-unused-apis refactor: clean up unused hardware apis	2025-08-11 00:02:31 +07:00
Akarshan Biswas	0cfc745954	feat: Introduce structured error handling for llamacpp extension (#6087) * feat: Introduce structured error handling for llamacpp extension This commit introduces a structured error handling system for the `llamacpp` extension. Instead of returning simple string errors, we now use a custom `LlamacppError` struct with a specific `ErrorCode` enum. This allows the frontend to display more user-friendly and actionable error messages based on the code, rather than raw debug logs. The changes include: - A new `ErrorCode` enum to categorize errors (e.g., `OutOfMemory`, `ModelArchNotSupported`, `BinaryNotFound`). - A `LlamacppError` struct to encapsulate the code, a user-facing message, and optional detailed logs. - A static method `from_stderr` that intelligently parses llama.cpp's standard error output to identify and map common issues like Out of Memory errors to a specific error code. - Refactored `ServerError` enum to wrap the new `LlamacppError` and provide a consistent serialization format for the Tauri frontend. - Updated all relevant functions (`load_llama_model`, `get_devices`) to return the new structured error type, ensuring a more robust and predictable error flow. - A reduced timeout for model loading from 300 to 180 seconds. This work lays the groundwork for a more intuitive and helpful user experience, as the application can now provide clear guidance to users when a model fails to load. * Update src-tauri/src/core/utils/extensions/inference_llamacpp_extension/server.rs Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * Update src-tauri/src/core/utils/extensions/inference_llamacpp_extension/server.rs Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * chore: update FE handle error object from extension * chore: fix property type --------- Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> Co-authored-by: Faisal Amir <urmauur@gmail.com>	2025-08-07 23:28:25 +05:30
Louis	ab44faeda3	test: fix test	2025-08-07 20:09:07 +07:00
Louis	c1668a4e4a	refactor: clean up unused hardware apis	2025-08-07 20:04:23 +07:00
Faisal Amir	f58332e9b5	Merge branch 'dev' into feat/fav-model	2025-08-07 18:11:44 +07:00
Akarshan Biswas	1f1605bdf9	feat: Add support for overriding tensor buffer type (#6062) * feat: Add support for overriding tensor buffer type This commit introduces a new configuration option, `override_tensor_buffer_t`, which allows users to specify a regex for matching tensor names to override their buffer type. This is an advanced setting primarily useful for optimizing the performance of large models, particularly Mixture of Experts (MoE) models. By overriding the tensor buffer type, users can keep critical parts of the model, like the attention layers, on the GPU while offloading other parts, such as the expert feed-forward networks, to the CPU. This can lead to significant speed improvements for massive models. Additionally, this change refines the error message to be more specific when a model fails to load. The previous message "Failed to load llama-server" has been updated to "Failed to load model" to be more accurate. * chore: update FE to support override-tensor --------- Co-authored-by: Faisal Amir <urmauur@gmail.com>	2025-08-07 10:31:34 +05:30
Faisal Amir	5d001dfd5a	✨feat: jinja template customize per model instead provider level (#6053)	2025-08-05 21:21:41 +07:00
Faisal Amir	e3ba37ba15	🚀feat: allow user mark model as favorite	2025-08-05 14:26:12 +07:00
Louis	48004024ee	Merge pull request #6020 from cmppoon/fix-mcp-servers-edit-json fix connected servers status not in sync when edit mcp json	2025-08-05 11:06:05 +07:00
Faisal Amir	641df474fd	fix: Generate A Response button does not show context size error dialog (#6029) * fix: Generate A Response button does not show context size error dialog * chore: remove as a child button params	2025-08-05 08:34:06 +07:00
Chaiyapruek Muangsiri	477651e5d5	fix connected servers status not in sync when edit mcp json	2025-08-05 08:08:59 +07:00
Faisal Amir	787c4ee073	fix: wrong desc setting cont_batching (#6034)	2025-08-02 21:48:43 +07:00
Faisal Amir	3acb61b5ed	fix: react state loop from hooks useMediaQuery (#6031) * fix: react state loop from hooks useMediaQuery * chore: update test cases hooks media query	2025-08-02 21:48:40 +07:00

177 Commits
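
The KV-cache formula and the GREEN / YELLOW / RED rule described in commit 7a174e621a can be sketched in TypeScript as follows. This is an illustrative sketch only, not the code from that commit: `AttentionShape`, `kvCacheBytesPerToken`, `classifySupport`, and all parameter names are hypothetical. Reading the two factors of 2 in `nHeadKV * headDim * 2 * 2 * nLayer` as "keys and values" and "2 bytes per f16 element" is an assumption, and applying the 80% factor to both VRAM and total memory is a simplification.

```typescript
// Hypothetical names throughout; a sketch of the idea, not the repository's implementation.
type SupportLevel = "GREEN" | "YELLOW" | "RED";

interface AttentionShape {
  nLayer: number;  // number of transformer layers
  nHeadKV: number; // attention.head_count_kv (fewer than the query heads for GQA models)
  headDim: number; // dimension of one attention head
}

// Bytes of KV cache needed per token, following the formula quoted in the commit message.
// Assumption: one factor of 2 covers keys and values, the other is 2 bytes per f16 element.
function kvCacheBytesPerToken(shape: AttentionShape): number {
  const keysAndValues = 2;
  const bytesPerElement = 2;
  return shape.nHeadKV * shape.headDim * keysAndValues * bytesPerElement * shape.nLayer;
}

// Classify whether weights plus KV cache fit, using a usable-memory fraction (80% by default).
function classifySupport(
  modelBytes: number,
  contextLength: number,
  shape: AttentionShape,
  vramBytes: number,
  ramBytes: number,
  usableFraction: number = 0.8
): SupportLevel {
  const kvBytes = kvCacheBytesPerToken(shape) * contextLength;
  const neededBytes = modelBytes + kvBytes;
  const usableVram = vramBytes * usableFraction;
  const usableTotal = (vramBytes + ramBytes) * usableFraction;

  if (neededBytes <= usableVram) return "GREEN";   // everything fits in VRAM
  if (neededBytes <= usableTotal) return "YELLOW"; // fits only with system RAM / CPU off-load
  return "RED";                                    // does not fit at all
}
```

Under these assumptions, a model with 32 layers, 8 KV heads, and a head dimension of 128 needs 131072 bytes of KV cache per token, so an 8192-token context adds 1 GiB on top of the weights.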