---
title: Customize Engine Settings
sidebar_position: 1
description: A step-by-step guide to change your engine's settings.
keywords:
  [
    Jan AI,
    Jan,
    ChatGPT alternative,
    local AI,
    private AI,
    conversational AI,
    no-subscription fee,
    large language model,
    import-models-manually,
    customize-engine-settings,
  ]
---

<head>
  <title>Customize Engine Settings</title>
  <meta name="description" content="A step-by-step guide to change your engine's settings. Learn how to modify the nitro.json file to customize parameters such as ctx_len, ngl, cpu_threads, cont_batching, and embedding to optimize the performance of your Jan AI."/>
  <meta name="keywords" content="Jan AI, Jan, ChatGPT alternative, local AI, private AI, conversational AI, no-subscription fee, large language model, import-models-manually, customize-engine-settings"/>
  <meta property="og:title" content="Customize Engine Settings"/>
  <meta property="og:description" content="A step-by-step guide to change your engine's settings. Learn how to modify the nitro.json file to customize parameters such as ctx_len, ngl, cpu_threads, cont_batching, and embedding to optimize the performance of your Jan AI."/>
  <meta property="og:url" content="https://jan.ai/guides/customize-engine-settings"/>
  <meta name="twitter:card" content="summary"/>
  <meta name="twitter:title" content="Customize Engine Settings"/>
  <meta name="twitter:description" content="A step-by-step guide to change your engine's settings. Learn how to modify the nitro.json file to customize parameters such as ctx_len, ngl, cpu_threads, cont_batching, and embedding to optimize the performance of your Jan AI."/>
</head>

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

In this guide, we'll walk you through the process of customizing your engine settings by configuring the `nitro.json` file.

1. Navigate to `App Settings` > `Advanced` > `Open App Directory`, then open the `~/jan/engines` folder.

<Tabs>
<TabItem value="mac" label="MacOS" default>

```sh
cd ~/jan/engines
```

</TabItem>
<TabItem value="windows" label="Windows">

```sh
cd C:/Users/<your_user_name>/jan/engines
```

</TabItem>
<TabItem value="linux" label="Linux">

```sh
cd ~/jan/engines
```

</TabItem>
</Tabs>

2. Modify the `nitro.json` file based on your needs. The default settings are shown below.

```json title="~/jan/engines/nitro.json"
{
  "ctx_len": 2048,
  "ngl": 100,
  "cpu_threads": 1,
  "cont_batching": false,
  "embedding": false
}
```

The table below describes the parameters in the `nitro.json` file.

| Parameter | Type | Description |
| --------- | ---- | ----------- |
| `ctx_len` | **Integer** | Typically set to `2048`, `ctx_len` provides ample context for model operations, comparable to `GPT-3.5`. (*Minimum*: `1`, *Maximum*: `4096`) |
| `ngl` | **Integer** | Defaults to `100`; determines how many model layers are offloaded to the GPU. |
| `cpu_threads` | **Integer** | Determines the number of CPU inference threads, limited by hardware and OS. (*Maximum* determined by system) |
| `cont_batching` | **Boolean** | Enables continuous batching, enhancing throughput for LLM inference. |
| `embedding` | **Boolean** | Enables embedding utilization for tasks like document-enhanced chat in RAG-based applications. |

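For example, on a CPU-only machine you might set `ngl` to `0` so no layers are offloaded to the GPU and raise `cpu_threads` closer to your core count. The values below are an illustrative sketch, not recommended settings for any particular hardware:

```json title="~/jan/engines/nitro.json"
{
  "ctx_len": 2048,
  "ngl": 0,
  "cpu_threads": 8,
  "cont_batching": false,
  "embedding": false
}
```
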
:::tip

- By default, `ngl` is set to `100`, which offloads all model layers to the GPU. If you wish to offload only about 50% of the layers, you can set `ngl` to `15`, since most Mistral or Llama models have around 30 layers.
- To utilize the embedding feature, include the JSON parameter `"embedding": true`. This enables Nitro to process inferences with embedding capabilities. Please refer to [Embedding in the Nitro documentation](https://nitro.jan.ai/features/embed) for a more detailed explanation.
- To utilize the continuous batching feature for boosting throughput and minimizing latency in large language model (LLM) inference, include `"cont_batching": true`. For details, please refer to [Continuous Batching in the Nitro documentation](https://nitro.jan.ai/features/cont-batch).

:::

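Putting these tips together, a configuration that offloads roughly half the layers of a ~30-layer model and enables both features might look like the following. This is an illustrative sketch; the `cpu_threads` value in particular should be adjusted to your own hardware:

```json title="~/jan/engines/nitro.json"
{
  "ctx_len": 2048,
  "ngl": 15,
  "cpu_threads": 4,
  "cont_batching": true,
  "embedding": true
}
```
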
:::info[Assistance and Support]

If you have questions, please join our [Discord community](https://discord.gg/Dt7MxDyNNZ) for support, updates, and discussions.

:::