docs: add customize engine settings
parent 4cd4321d1a
commit f467aea85c
@ -0,0 +1,78 @@
---
title: Customize Engine Settings
slug: /guides/using-models/customize-engine-settings
description: Guide to customizing the engine settings.
keywords:
  [
    Jan AI,
    Jan,
    ChatGPT alternative,
    local AI,
    private AI,
    conversational AI,
    no-subscription fee,
    large language model,
    import-models-manually,
    customize-engine-settings,
  ]
---

{/* Imports */}
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

This guide shows you how to customize the engine settings.

1. Navigate to the `~/jan/engines` folder. You can find this folder by going to `App Settings` > `Advanced` > `Open App Directory`.

<Tabs groupId="operating-systems">
<TabItem value="mac" label="macOS">

```sh
cd ~/jan/engines
```

</TabItem>
<TabItem value="win" label="Windows">

```sh
cd C:/Users/<your_user_name>/jan/engines
```

</TabItem>
<TabItem value="linux" label="Linux">

```sh
cd ~/jan/engines
```

</TabItem>
</Tabs>

2. Modify the `nitro.json` file to suit your needs. The default settings are shown below.

```json title="~/jan/engines/nitro.json"
{
  "ctx_len": 2048,
  "ngl": 100,
  "cpu_threads": 1,
  "cont_batching": false,
  "embedding": false
}
```

| Parameter       | Type    | Description                                                   |
| --------------- | ------- | ------------------------------------------------------------- |
| `ctx_len`       | Integer | The context length for the model operations.                  |
| `ngl`           | Integer | The number of GPU layers to use.                              |
| `cpu_threads`   | Integer | The number of threads to use for inferencing (CPU mode only). |
| `cont_batching` | Boolean | Whether to use continuous batching.                           |
| `embedding`     | Boolean | Whether to use embedding in the model.                        |

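Because a mistyped value (for example `"ngl": "100"` as a string) is easy to miss, the types in the table can be checked before saving the file. The following is a minimal sketch, not part of Jan or Nitro: the names `EXPECTED_TYPES` and `validate_settings` are hypothetical, and the check covers only the five parameters listed above.

```python
import json

# Expected JSON types, taken from the parameter table above.
EXPECTED_TYPES = {
    "ctx_len": int,
    "ngl": int,
    "cpu_threads": int,
    "cont_batching": bool,
    "embedding": bool,
}


def validate_settings(settings):
    """Return a list of type problems; an empty list means every known key matches."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in settings:
            # Hypothetical choice: missing keys are not treated as errors here.
            continue
        value = settings[key]
        # bool is a subclass of int in Python, so reject booleans for integer fields.
        wrong_bool = expected is int and isinstance(value, bool)
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{key} should be {expected.__name__}, got {type(value).__name__}"
            )
    return problems


# The default settings shown in the guide.
DEFAULTS = json.loads("""
{
  "ctx_len": 2048,
  "ngl": 100,
  "cpu_threads": 1,
  "cont_batching": false,
  "embedding": false
}
""")

if __name__ == "__main__":
    print(validate_settings(DEFAULTS))
```

Running the check against the defaults should report no problems, while a string or boolean in an integer field is flagged by name.
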
:::tip

- By default, `ngl` is set to 100, which offloads all of the model's layers to the GPU. To offload only about 50% of the layers, set `ngl` to 15, because most Mistral- or Llama-based models have around 30 layers.
- To use the embedding feature, include the JSON parameter `"embedding": true`. This enables Nitro to process inference requests with embedding capabilities. For a more detailed explanation, refer to [Embedding in the Nitro documentation](https://nitro.jan.ai/features/embed).
- To use continuous batching to boost throughput and minimize latency in large language model (LLM) inference, refer to [Continuous Batching in the Nitro documentation](https://nitro.jan.ai/features/cont-batch).

:::

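The `ngl` arithmetic from the tip (half of roughly 30 layers is 15) and the edit to `nitro.json` can also be scripted. This is a rough sketch under stated assumptions: `ngl_for_fraction` and `update_settings` are hypothetical helpers, not part of Jan or Nitro, and the layer count of 30 is the approximate figure from the tip, not a value read from a model. The demo writes to a temporary file instead of the real `~/jan/engines/nitro.json`.

```python
import json
import math
import tempfile
from pathlib import Path


def ngl_for_fraction(total_layers, fraction):
    """Number of layers to offload for a given fraction of the model (rounded down)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return math.floor(total_layers * fraction)


def update_settings(path, **changes):
    """Read a JSON settings file, apply the changes, and write it back."""
    settings = json.loads(path.read_text())
    settings.update(changes)
    path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


if __name__ == "__main__":
    # Work on a temporary copy so the real settings file is left untouched.
    with tempfile.TemporaryDirectory() as tmp:
        demo_file = Path(tmp) / "nitro.json"
        demo_file.write_text(json.dumps({"ctx_len": 2048, "ngl": 100}))
        half = ngl_for_fraction(30, 0.5)  # assumes a model with about 30 layers
        print(update_settings(demo_file, ngl=half, embedding=True))
```

To apply the same change to a real installation, the path would point at the `nitro.json` file in the engines folder; back the file up first, since the helper overwrites it.
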