docs: add customize engine settings

2024-01-08 21:20:09 +07:00 · 2024-01-08 21:20:09 +07:00 · f467aea85c
commit f467aea85c
parent 4cd4321d1a
1 changed files with 78 additions and 0 deletions
--- a/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx
+++ b/docs/docs/guides/04-using-models/04-customize-engine-settings.mdx
@@ -0,0 +1,78 @@
+---
+title: Customize Engine Settings
+slug: /guides/using-models/customize-engine-settings
+description: Guide to customizing the engine settings.
+keywords:
+  [
+    Jan AI,
+    Jan,
+    ChatGPT alternative,
+    local AI,
+    private AI,
+    conversational AI,
+    no-subscription fee,
+    large language model,
+    import-models-manually,
+    customize-engine-settings,
+  ]
+---
+
+{/* Imports */}
+import Tabs from "@theme/Tabs";
+import TabItem from "@theme/TabItem";
+
+In this guide, we will show you how to customize the engine settings.
+
+1. Navigate to the `~/jan/engines` folder. To find it, go to `App Settings` > `Advanced` > `Open App Directory`.
+
+<Tabs groupId="operating-systems">
+  <TabItem value="mac" label="macOS">
+    
+    ```sh
+    cd ~/jan/engines
+    ```
+  
+  </TabItem>
+  <TabItem value="win" label="Windows">
+  
+    ```sh
+    cd C:/Users/<your_user_name>/jan/engines
+    ```
+  
+  </TabItem>
+  <TabItem value="linux" label="Linux">
+  
+    ```sh
+    cd ~/jan/engines
+    ```
+  
+  </TabItem>
+</Tabs>
+
+2. Modify the `nitro.json` file based on your needs. The default settings are shown below.
+
+```json title="~/jan/engines/nitro.json"
+{
+  "ctx_len": 2048,
+  "ngl": 100,
+  "cpu_threads": 1,
+  "cont_batching": false,
+  "embedding": false
+}
+```
+
+| Parameter       | Type    | Description                                                  |
+| --------------- | ------- | ------------------------------------------------------------ |
+| `ctx_len`       | Integer | The context length for the model operations.                 |
+| `ngl`           | Integer | The number of GPU layers to use.                             |
+| `cpu_threads`   | Integer | The number of threads to use for inference (CPU mode only).  |
+| `cont_batching` | Boolean | Whether to use continuous batching.                          |
+| `embedding`     | Boolean | Whether to use embedding in the model.                       |
+
+:::tip
+
+- By default, `ngl` is set to 100, which offloads all of the model's layers to the GPU. To offload only about 50% of the layers, set `ngl` to 15, because most Mistral- and Llama-based models have around 30 layers.
+- To use the embedding feature, set `"embedding": true`. This enables Nitro to process inference requests with embedding capabilities. For details, see [Embedding in the Nitro documentation](https://nitro.jan.ai/features/embed).
+- To use continuous batching, which boosts throughput and minimizes latency in large language model (LLM) inference, see [Continuous Batching in the Nitro documentation](https://nitro.jan.ai/features/cont-batch).
+
+:::
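+
+As a minimal sketch of the tips above, assuming a model with roughly 30 layers, a `nitro.json` that offloads about half of the layers to the GPU and enables embeddings might look like the following. The value 15 is illustrative, and the remaining values are the defaults shown earlier.
+
+```json title="~/jan/engines/nitro.json"
+{
+  "ctx_len": 2048,
+  "ngl": 15,
+  "cpu_threads": 1,
+  "cont_batching": false,
+  "embedding": true
+}
+```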