This commit prevents a **Markdown** rendering issue where a dollar sign followed by a number (like **`$1`**) is incorrectly interpreted as **LaTeX** math by the rendering engine.
---
The `normalizeLatex` function in `RenderMarkdown.tsx` now explicitly escapes these sequences (e.g., **`$1`** becomes **`\$1`**), ensuring they are displayed literally instead of being processed as mathematical expressions. This improves the fidelity of text that might contain currency or similar numerical notations.
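A rough sketch of the escaping step, assuming a regex-based approach; `normalizeLatex` is from this commit, but the exact pattern and the helper below are assumptions:

```ts
// Minimal sketch (hypothetical helper). Match a "$" that directly precedes
// a digit and is not already escaped, so "$1" renders literally instead of
// opening an inline LaTeX math span.
function escapeCurrencyDollars(text: string): string {
  return text.replace(/(?<!\\)\$(?=\d)/g, '\\$')
}

// "Fix costs $5" -> "Fix costs \$5"; an already-escaped "\$5" is untouched.
```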
* enable new prompt input while waiting for an answer
* correct spelling of handleSendMessage function
* remove test for disabling input while streaming content
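A minimal sketch of the resulting behavior, assuming a React prompt component; `handleSendMessage` is named in the commit above, everything else is a hypothetical stand-in:

```tsx
import { useState } from 'react'

// Sketch: the input no longer carries a disabled-while-streaming guard,
// so the user can compose the next prompt while an answer is streaming.
function PromptInput({ onSend }: { onSend: (text: string) => void }) {
  const [text, setText] = useState('')
  return (
    <form
      onSubmit={(e) => {
        e.preventDefault()
        onSend(text) // handleSendMessage in the app
        setText('')
      }}
    >
      {/* previously: disabled={isStreaming} */}
      <input value={text} onChange={(e) => setText(e.target.value)} />
    </form>
  )
}
```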
- Added shallow equality guard for `connectedServers` state to prevent redundant updates when the fetched server list hasn't changed.
- Updated error handling for server fetch to only clear the state when it actually contains data.
- Introduced a `newHasActiveModels` variable and a conditional updater for `hasActiveModels` to avoid unnecessary state changes.
- Adjusted error handling for active model fetch to only set `hasActiveModels` to `false` when the current state differs.
These changes reduce needless re‑renders and improve component performance.
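A sketch of the guard pattern these bullets describe, assuming a React hook; `connectedServers`, `hasActiveModels`, and `newHasActiveModels` are from the commit, while the fetchers and state shapes are assumptions:

```tsx
import { useState } from 'react'

// Hypothetical fetchers standing in for the real API calls.
declare function fetchConnectedServers(): Promise<string[]>
declare function fetchActiveModels(): Promise<string[]>

const shallowEqual = (a: string[], b: string[]) =>
  a.length === b.length && a.every((v, i) => v === b[i])

function useProviderStatus() {
  const [connectedServers, setConnectedServers] = useState<string[]>([])
  const [hasActiveModels, setHasActiveModels] = useState(false)

  const refresh = async () => {
    try {
      const servers = await fetchConnectedServers()
      // Shallow equality guard: returning the previous reference unchanged
      // lets React bail out of a redundant re-render.
      setConnectedServers((prev) => (shallowEqual(prev, servers) ? prev : servers))
    } catch {
      // On error, only clear the state when it actually contains data.
      setConnectedServers((prev) => (prev.length === 0 ? prev : []))
    }

    try {
      const newHasActiveModels = (await fetchActiveModels()).length > 0
      // Conditional updater: React skips the update when the value is unchanged.
      setHasActiveModels((prev) => (prev === newHasActiveModels ? prev : newHasActiveModels))
    } catch {
      // Only flip to false when the current state differs.
      setHasActiveModels((prev) => (prev ? false : prev))
    }
  }

  return { connectedServers, hasActiveModels, refresh }
}
```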
The `listSupportedBackends` function now includes error handling for the `fetchRemoteSupportedBackends` call.
This addresses an issue where an error thrown during the remote fetch (e.g., due to no network connection in offline mode) would prevent the subsequent loading of locally installed or manually provided llama.cpp backends.
The remote backend versions array will now default to empty if the fetch fails, allowing the rest of the backend initialization process to proceed as expected.
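A sketch of the fallback, under the assumption that the function merges remote and local backend lists; `listSupportedBackends` and `fetchRemoteSupportedBackends` are from the commit, the rest is hypothetical:

```ts
interface Backend {
  version: string // assumed shape
}

declare function fetchRemoteSupportedBackends(): Promise<Backend[]>
declare function listLocalBackends(): Promise<Backend[]> // hypothetical helper

async function listSupportedBackends(): Promise<Backend[]> {
  let remoteVersions: Backend[] = []
  try {
    remoteVersions = await fetchRemoteSupportedBackends()
  } catch (err) {
    // Offline or network failure: fall back to an empty remote list so
    // locally installed / manually provided backends still load.
    console.warn('fetchRemoteSupportedBackends failed:', err)
  }
  const local = await listLocalBackends()
  return [...remoteVersions, ...local]
}
```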
This commit introduces a new field, `is_embedding`, to the `SessionInfo` structure to clearly mark sessions running dedicated embedding models.
Key changes:
- Adds `is_embedding` to the `SessionInfo` interface in `AIEngine.ts` and the Rust backend.
- Updates the `loadLlamaModel` command signatures to pass this new flag.
- Modifies the llama.cpp extension's **auto-unload logic** to explicitly **filter out** and **not unload** any currently loaded embedding models when a new text generation model is loaded. This is a critical performance fix to prevent the embedding model (e.g., used for RAG) from being repeatedly reloaded.
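A sketch of the new field and the filter step; `is_embedding` and `SessionInfo` are from the commit, while the session-manager shape is an assumption:

```ts
interface SessionInfo {
  model_id: string
  is_embedding: boolean // new field marking dedicated embedding sessions
}

declare function unloadSession(session: SessionInfo): Promise<void>

// Auto-unload pass run before loading a new text-generation model:
// embedding sessions (e.g. the RAG embedder) are filtered out so they
// are not evicted and repeatedly reloaded.
async function autoUnload(sessions: SessionInfo[]): Promise<void> {
  const evictable = sessions.filter((s) => !s.is_embedding)
  await Promise.all(evictable.map((s) => unloadSession(s)))
}
```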
Also includes minor code style cleanup/reformatting in `jan-provider-web/provider.ts` for improved readability.
* feat: Add support for llamacpp MoE offloading setting
Introduces the `n_cpu_moe` configuration setting for the llamacpp provider. This lets users specify the number of Mixture of Experts (MoE) layers whose expert weights should be offloaded to the CPU via the `--n-cpu-moe` flag in llama.cpp.
This is useful for running large MoE models with balanced resource usage, for example keeping attention on the GPU while offloading expert FFNs to the CPU.
The changes include:
- Updating the llamacpp-extension to accept and pass the `--n-cpu-moe` argument (sketched after this list).
- Adding the input field to the Model Settings UI (`ModelSetting.tsx`).
- Including model setting migration logic and bumping the store version to 4.
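A sketch of how the setting could map to launch arguments; `--n-cpu-moe` and `-ngl` are real llama.cpp flags, while the settings shape below is an assumption:

```ts
interface LlamacppSettings {
  n_gpu_layers?: number
  n_cpu_moe?: number // MoE layers whose expert weights stay on the CPU
}

function buildServerArgs(s: LlamacppSettings): string[] {
  const args: string[] = []
  if (s.n_gpu_layers !== undefined) args.push('-ngl', String(s.n_gpu_layers))
  // Keep the expert FFN weights of the first N layers on the CPU while
  // attention tensors remain on the GPU.
  if (s.n_cpu_moe !== undefined && s.n_cpu_moe > 0)
    args.push('--n-cpu-moe', String(s.n_cpu_moe))
  return args
}
```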
* remove unused import
* feat: add cpu-moe boolean flag
* chore: remove unused `cont_batching` migration
* chore: fix migration: delete the old key and add the new one
* chore: fix migration
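A minimal sketch of the delete-old-key/add-new-key migration pattern referenced by these chores, assuming it ties into the store version 4 bump mentioned above; the store shape, placeholder key names, and default are assumptions:

```ts
type SettingsStore = Record<string, unknown> & { version: number }

function migrateSettings(store: SettingsStore): SettingsStore {
  if (store.version >= 4) return store // already migrated
  const next: SettingsStore = { ...store, version: 4 }
  if ('old_key' in next) {
    next.new_key = next.old_key // carry the value over under the new name
    delete next.old_key // remove the stale entry
  }
  return next
}
```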
---------
Co-authored-by: Faisal Amir <urmauur@gmail.com>