This commit significantly refactors how assistant message content containing reasoning steps (`<think>` blocks) and tool calls is parsed and split into final output text and streamed reasoning text in `ThreadContent.tsx`.
It introduces new logic to correctly handle multiple, open, or closed `<think>` tags, ensuring that:
1. All text outside of `<think>...</think>` tags is correctly extracted as final output text.
2. Content inside all `<think>` tags is aggregated as streamed reasoning text.
3. The message correctly reflects whether reasoning is actively loading during a stream.
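A minimal sketch of the splitting behavior described above, assuming a simple scan over the raw message text (names here are illustrative, not the actual `ThreadContent.tsx` implementation):

```typescript
interface SplitResult {
  finalText: string // everything outside <think>...</think>
  reasoningText: string // aggregated content of all <think> blocks
  isReasoning: boolean // true while a <think> tag is still open mid-stream
}

function splitThinkContent(raw: string): SplitResult {
  let finalText = ''
  let reasoningText = ''
  let isReasoning = false
  let cursor = 0

  while (cursor < raw.length) {
    const open = raw.indexOf('<think>', cursor)
    if (open === -1) {
      // No more think blocks; the rest is final output text.
      finalText += raw.slice(cursor)
      break
    }
    finalText += raw.slice(cursor, open)
    const close = raw.indexOf('</think>', open)
    if (close === -1) {
      // Unclosed tag: the stream is still emitting reasoning.
      reasoningText += raw.slice(open + '<think>'.length)
      isReasoning = true
      break
    }
    reasoningText += raw.slice(open + '<think>'.length, close)
    cursor = close + '</think>'.length
  }

  return { finalText, reasoningText, isReasoning }
}
```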
Additionally, this commit:
* **Fixes infinite tool loop prevention:** The global `toolStepCounter` in `completion.ts` is replaced with an explicit `currentStepCount` parameter passed recursively through `postMessageProcessing`. This enforces the tool step limit per message chain, avoids potential race conditions between concurrent chains, and ensures each chain resolves (see the sketch after this list).
* **Fixes large step content rendering:** Limits the content of a single thinking step in `ThinkingBlock.tsx` to 1000 characters to prevent UI slowdowns from rendering extremely large JSON or text outputs.
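Stripped down to the step-counting logic, the recursive cap works roughly as follows (the real `postMessageProcessing` also takes the thread, provider, tools, and a UI callback; the types and helper here are stand-ins):

```typescript
type AssistantMessage = { toolCalls: string[] }

const MAX_TOOL_STEPS = 20 // default cap on total tool steps per chain

// Placeholder: run the requested tools, send a follow-up completion,
// and return the assistant's next message.
async function executeToolsAndComplete(
  calls: string[]
): Promise<AssistantMessage> {
  return { toolCalls: [] }
}

async function postMessageProcessing(
  message: AssistantMessage,
  currentStepCount = 0 // explicit per-chain counter, not a module-level global
): Promise<void> {
  if (currentStepCount >= MAX_TOOL_STEPS) return // cap reached: resolve chain
  if (message.toolCalls.length === 0) return // no tool calls: chain is done

  const next = await executeToolsAndComplete(message.toolCalls)
  await postMessageProcessing(next, currentStepCount + 1) // recurse, no loop
}
```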
* Move the assistant‑loop logic out of `useChat` and into `postMessageProcessing`.
* Eliminate the while‑loop that drove repeated completions; now a single completion is sent and subsequent tool calls are processed recursively.
* Introduce early‑abort checks and guard against missing provider before proceeding.
* Add `ReasoningProcessor` import and use it consistently for streaming reasoning chunks.
* Add `ToolCallEntry` type and a global `toolStepCounter` to track and cap total tool steps (default 20) to prevent infinite loops.
* Extend `postMessageProcessing` signature to accept thread, provider, tools, UI update callback, and max tool steps.
* Update UI‑update logic to use a single `updateStreamingUI` callback and ensure RAF scheduling is cleaned up reliably.
* Refactor token‑speed / progress handling, improve error handling for out‑of‑context situations, and tidy up code formatting.
* Minor clean‑ups: const‑ify `availableTools`, remove unused variables, improve readability.
- Replace raw text parsing with step‑based streaming logic in `ThinkingBlock`.
- Introduced `stepsWithoutDone`, `currentStreamingStepIndex`, and `displayedStepIndex` to drive the streaming UI.
- Added a placeholder UI for the empty streaming state and hid the block when there is no content after streaming finishes.
- Simplified expansion handling and bullet‑point rendering, using `renderStepContent` for both streaming and expanded views.
- Removed unused `extractThinkingContent` import and related code.
- Updated translation keys and duration formatting.
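A sketch of how those derived values could relate to each other (assumed logic; the component's actual state flow may differ):

```typescript
import { useMemo, useState } from 'react'

// Minimal step shape for this sketch; the real ThoughtStep type is richer.
type Step = { kind: 'thought' | 'tool_call' | 'tool_output' | 'done' }

function useStreamingSteps(steps: Step[], loading: boolean) {
  // The trailing `done` step carries metadata and is not rendered as a bullet.
  const stepsWithoutDone = useMemo(
    () => steps.filter((s) => s.kind !== 'done'),
    [steps]
  )

  // While streaming, the last remaining step is the one being written.
  const currentStreamingStepIndex = loading ? stepsWithoutDone.length - 1 : -1

  // Which step the collapsed streaming view currently shows.
  const [displayedStepIndex, setDisplayedStepIndex] = useState(0)

  return {
    stepsWithoutDone,
    currentStreamingStepIndex,
    displayedStepIndex,
    setDisplayedStepIndex,
  }
}
```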
- Consolidate reasoning and tool‑call presentation in `ThreadContent`.
- Introduced `shouldShowThinkingBlock` to render a single `ThinkingBlock` when either reasoning or tool calls are present (sketched after this list).
- Adjusted `ThinkingBlock` props (`text`, `steps`, `loading`) and ID generation.
- Commented out the now‑redundant `ToolCallBlock` import and removed its conditional rendering block.
- Cleaned up comments, unused variables, and minor formatting/typo fixes.
- General cleanup:
  - Updated comments for clarity.
  - Fixed typo in deletion loop comment.
  - Minor UI tweaks (bullet styling, border handling).
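The flag referenced above plausibly reduces to a single condition along these lines (an assumption shown for clarity, not the actual code):

```typescript
// Render one ThinkingBlock whenever there is anything to present in it:
// either reasoning text or at least one tool call.
function shouldShowThinkingBlock(
  hasReasoning: boolean,
  toolCallCount: number
): boolean {
  return hasReasoning || toolCallCount > 0
}
```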
- **ThinkingBlock**
  - Added `ThoughtStep` type and UI handling for step kinds: `thought`, `tool_call`, `tool_output`, and `done`.
  - Integrated `Check` icon for completed steps and formatted duration (seconds) display.
  - Implemented streaming paragraph extraction, fade‑in/out animation, and improved loading state handling.
  - Updated header to show dynamic titles (thinking/thought + duration) and disabled expand/collapse while loading.
  - Utilized `cn` utility for conditional class names and added relevant imports.
- **ThreadContent**
  - Defined `ToolCall` and `ThoughtStep` types for type safety.
  - Constructed `allSteps` via `useMemo`, extracting thought paragraphs, tool calls/outputs, and a final `done` step with total thinking time (sketched after this list).
  - Passed `steps`, `loading`, and `duration` props to `ThinkingBlock`.
  - Introduced `hasReasoning` flag to conditionally render the reasoning block and avoid duplicate tool call rendering.
  - Adjusted rendering logic to hide empty reasoning and ensure tool call blocks only appear when no reasoning is present.
- **useChat**
  - Refactored `getCurrentThread` for clearer async flow while preserving temporary‑chat behavior.
  - Captured `startTime` at message creation and computed `totalThinkingTime` on completion.
  - Included `totalThinkingTime` in message metadata when appropriate.
  - Minor cleanup: improved error handling for image ingestion and formatting adjustments.
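A hedged sketch tying these pieces together: how `allSteps` might be assembled and where `totalThinkingTime` comes from (all types and field names here are assumptions):

```typescript
type ToolCall = { name: string; output?: string }

type ThoughtStep =
  | { kind: 'thought' | 'tool_call' | 'tool_output'; content: string }
  | { kind: 'done'; durationSec: number }

// In ThreadContent this runs inside a useMemo; shown as a plain function here.
function buildAllSteps(
  reasoningText: string,
  toolCalls: ToolCall[],
  totalThinkingTime?: number
): ThoughtStep[] {
  const steps: ThoughtStep[] = []

  // Each reasoning paragraph becomes its own `thought` step.
  for (const para of reasoningText.split('\n\n')) {
    if (para.trim().length > 0) steps.push({ kind: 'thought', content: para })
  }

  // Tool calls and their outputs are appended as dedicated steps.
  for (const call of toolCalls) {
    steps.push({ kind: 'tool_call', content: call.name })
    if (call.output) steps.push({ kind: 'tool_output', content: call.output })
  }

  // A final `done` step carries the total thinking time, when known.
  if (totalThinkingTime !== undefined) {
    steps.push({ kind: 'done', durationSec: totalThinkingTime })
  }
  return steps
}

// In useChat, the duration is captured around the completion, roughly:
//   const startTime = Date.now()
//   /* ...stream the completion... */
//   const totalThinkingTime = (Date.now() - startTime) / 1000
```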
Overall, these changes provide a richer, step‑by‑step thinking UI, better state handling during streaming, and expose total thinking duration for downstream components.
This icon doesn't do anything on chatInput; it is just an indicator that the proactive capability is activated. It can safely be removed, since this can already be indicated from the model dropdown.
This commit introduces Japanese as a supported language in the web application.
Key changes include:
- Addition of a new `ja` locale with 15 translated JSON resource files, making the application accessible to Japanese-speaking users.
- Update of the `LanguageSwitcher.tsx` component to include '日本語' in the language selection dropdown menu, allowing users to switch to the new language.
- The localization files were added by creating a new `ja` directory under `web-app/src/locales` and translating the content from the `en` directory.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
* feat: support multimodal tool results and improve tool message handling
- Added a temporary `ToolResult` type that mirrors the structure returned by tools (text, image data, URLs, errors).
- Implemented `convertToolPartToApiContentPart` to translate each tool output part into the format expected by the OpenAI chat completion API.
- Updated `CompletionMessagesBuilder.addToolMessage` to accept a full `ToolResult` instead of a plain string and to:
  - Detect multimodal content (base64 images, image URLs) and build a structured `content` array.
  - Properly handle plain‑text results, tool execution errors, and unexpected formats with sensible fallbacks.
  - Cast the final content to `any` for the `tool` role, as required by the API.
- Modified `postMessageProcessing` to pass the raw tool result (`result as any`) to `addToolMessage`, avoiding premature extraction of only the first text part.
- Refactored several formatting and type‑annotation sections:
  - Added a multiline guard for empty user messages to insert a placeholder.
  - Split the image URL construction into a clearer multiline object.
  - Adjusted method signatures and added minor line‑breaks for readability.
- Included extensive comments explaining the new logic and edge‑case handling.
These changes enable the chat system to handle richer tool outputs (e.g., images, mixed content) and provide more robust error handling.
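A sketch of the conversion described above, with hypothetical shapes for both sides (the real `ToolResult` and API types in the codebase may differ):

```typescript
// Assumed tool output parts: plain text, inline base64 images, or image URLs.
type ToolResultPart =
  | { type: 'text'; text: string }
  | { type: 'image'; data: string; mimeType: string } // base64 payload
  | { type: 'image_url'; url: string }

// Content parts in the shape the OpenAI chat completion API expects.
type ApiContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } }

function convertToolPartToApiContentPart(part: ToolResultPart): ApiContentPart {
  switch (part.type) {
    case 'text':
      return { type: 'text', text: part.text }
    case 'image':
      // Inline base64 images as data URLs so they survive the round trip.
      return {
        type: 'image_url',
        image_url: { url: `data:${part.mimeType};base64,${part.data}` },
      }
    case 'image_url':
      return { type: 'image_url', image_url: { url: part.url } }
  }
}
```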
* Satisfy ts linter
* Make ts linter happy x2
* chore: update test message creation
---------
Co-authored-by: Faisal Amir <urmauur@gmail.com>
This commit prevents a **Markdown** rendering issue where a dollar sign followed by a number (like **`$1`**) is incorrectly interpreted as **LaTeX** by the rendering engine.
---
The `normalizeLatex` function in `RenderMarkdown.tsx` now explicitly escapes these sequences (e.g., **`$1`** becomes **`\$1`**), ensuring they are displayed literally instead of being processed as mathematical expressions. This improves the fidelity of text that might contain currency or similar numerical notations.
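A minimal sketch of that escaping rule, assuming a regex-based approach (the real `normalizeLatex` handles more cases than this):

```typescript
// Escape "$" when it is immediately followed by a digit (e.g. "$1" -> "\$1"),
// so the math renderer leaves currency-like sequences alone.
function escapeDollarBeforeDigit(text: string): string {
  return text.replace(/\$(?=\d)/g, '\\$')
}

escapeDollarBeforeDigit('It costs $15 per seat') // => "It costs \$15 per seat"
```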
* enable new prompt input while waiting for an answer
* correct spelling of handleSendMessage function
* remove test for disabling input while streaming content
- Added shallow equality guard for `connectedServers` state to prevent redundant updates when the fetched server list hasn't changed.
- Updated error handling for server fetch to only clear the state when it actually contains data.
- Introduced `newHasActiveModels` variable and conditional updater for `hasActiveModels` to avoid unnecessary state changes.
- Adjusted error handling for active model fetch to only set `hasActiveModels` to `false` when the current state differs.
These changes reduce needless re‑renders and improve component performance.
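The guard pattern described above, as a minimal sketch (the state shape and names are assumptions):

```typescript
import { useState } from 'react'

// Shallow equality for the fetched server list.
function serversEqual(a: string[], b: string[]): boolean {
  return a.length === b.length && a.every((s, i) => s === b[i])
}

function useConnectedServers() {
  const [connectedServers, setConnectedServers] = useState<string[]>([])

  const applyFetched = (next: string[]) => {
    // Returning the previous reference when nothing changed means React
    // bails out of the update, so no re-render is scheduled.
    setConnectedServers((prev) => (serversEqual(prev, next) ? prev : next))
  }

  return { connectedServers, applyFetched }
}
```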
* feat: Add support for llamacpp MoE offloading setting
Introduces the `n_cpu_moe` configuration setting for the llamacpp provider. This allows users to specify the number of Mixture of Experts (MoE) layers whose weights should be offloaded to the CPU via the `--n-cpu-moe` flag in llama.cpp.
This is useful for running large MoE models by balancing resource usage, for example, by keeping attention on the GPU and offloading expert FFNs to the CPU.
The changes include:
- Updating the llamacpp-extension to accept and pass the `--n-cpu-moe` argument.
- Adding the input field to the Model Settings UI (`ModelSetting.tsx`).
- Including model setting migration logic and bumping the store version to 4.
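A sketch of how the setting might be translated into server arguments (the settings shape is an assumption; the boolean `cpu-moe` flag comes from the follow-up commit below):

```typescript
interface LlamacppMoeSettings {
  n_cpu_moe?: number // MoE layers whose expert weights go to the CPU
  cpu_moe?: boolean // offload all MoE expert weights to the CPU
}

function buildMoeArgs(s: LlamacppMoeSettings): string[] {
  const args: string[] = []
  if (s.cpu_moe) {
    args.push('--cpu-moe')
  } else if (s.n_cpu_moe && s.n_cpu_moe > 0) {
    args.push('--n-cpu-moe', String(s.n_cpu_moe))
  }
  return args
}
```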
* remove unused import
* feat: add cpu-moe boolean flag
* chore: remove unused migration cont_batching
* chore: fix migration delete old key and add new one
* chore: fix migration
---------
Co-authored-by: Faisal Amir <urmauur@gmail.com>