docs: Update quickstart content add keywords

Arista Indrajaya 2024-02-29 23:09:18 +07:00
parent 1585f3dd10
commit 53f66ce1ba
3 changed files with 161 additions and 47 deletions


@@ -1,22 +1,30 @@
---
sidebar_position: 2
hide_table_of_contents: true
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
  [
    Jan AI,
    Jan,
    ChatGPT alternative,
    local AI,
    private AI,
    conversational AI,
    no-subscription fee,
    large language model,
  ]
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import installImageURL from './assets/jan-ai-download.png';

# Installation

<Tabs>
<TabItem value="mac" label="Mac" default>

### Pre-requisites

Ensure that your macOS version is 13 or higher to run Jan.
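
To confirm the version from a terminal, you can use the built-in `sw_vers` utility (a quick check, independent of Jan):

```bash
# Print the installed macOS version; Jan needs 13 or higher
sw_vers -productVersion
```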
### Stable Releases
@@ -43,16 +51,13 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/
</TabItem>
<TabItem value="windows" label="Windows">

### Pre-requisites

Ensure that your system meets the following requirements:
- Windows 10 or higher is required to run Jan.

To enable GPU support, you will need:
- NVIDIA GPU with CUDA Toolkit 11.7 or higher
- NVIDIA driver 470.63.01 or higher
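
If you plan to use GPU acceleration, one quick check is `nvidia-smi`, which ships with the NVIDIA driver (a sketch; the exact output depends on your hardware):

```bash
# The driver version shown in the header should be 470.63.01 or higher
nvidia-smi
```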
### Stable Releases
@@ -88,15 +93,14 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/
</TabItem>
<TabItem value="linux" label="Linux">

### Pre-requisites

Ensure that your system meets the following requirements:
- glibc 2.27 or higher (check with `ldd --version`)
- gcc 11, g++ 11, cpp 11, or higher; refer to this link for more information.

To enable GPU support, you will need:
- NVIDIA GPU with CUDA Toolkit 11.7 or higher
- NVIDIA driver 470.63.01 or higher
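
A quick way to verify these requirements on most distributions (standard system tools, nothing Jan-specific):

```bash
# glibc version; should be 2.27 or higher
ldd --version

# Compiler versions; should be 11 or higher
gcc --version
g++ --version

# (GPU only) driver version; should be 470.63.01 or higher
nvidia-smi
```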
### Stable Releases
@@ -154,4 +158,117 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/
:::

</TabItem>
<TabItem value="docker" label="Docker" default>

### Pre-requisites

Ensure that your system meets the following requirements:
- Linux or WSL2 with Docker
- Latest Docker Engine and Docker Compose

To enable GPU support, you will need:
- `nvidia-driver`
- `nvidia-docker2`

:::note
- If you have not installed Docker, follow the instructions [here](https://docs.docker.com/engine/install/ubuntu/).
- If you have not installed the packages required for GPU support, follow the instructions [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
:::
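
Before continuing, it may help to confirm the prerequisites are in place. The commands below are a sketch; the CUDA image tag is only an example, so pick one that matches your setup:

```bash
# Confirm Docker Engine and Docker Compose are installed
docker --version
docker compose version

# (GPU only) confirm containers can see the GPU through the NVIDIA runtime
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```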
### Docker Compose Profiles and Environment Variables

Before diving into the steps to run Jan in Docker, make sure you understand the Docker Compose profiles and the environment variables listed below:
#### Docker Compose Profiles
| Profile | Description |
|-----------|-------------------------------------------|
| cpu-fs | Run Jan in CPU mode with default file system |
| cpu-s3fs | Run Jan in CPU mode with S3 file system |
| gpu-fs | Run Jan in GPU mode with default file system |
| gpu-s3fs | Run Jan in GPU mode with S3 file system |
#### Environment Variables
| Environment Variable | Description |
|--------------------------|------------------------------------------------------------|
| S3_BUCKET_NAME | S3 bucket name - leave blank for default file system |
| AWS_ACCESS_KEY_ID | AWS access key ID - leave blank for default file system |
| AWS_SECRET_ACCESS_KEY | AWS secret access key - leave blank for default file system|
| AWS_ENDPOINT | AWS endpoint URL - leave blank for default file system |
| AWS_REGION | AWS region - leave blank for default file system |
| API_BASE_URL              | Jan Server URL; set it to your public IP address or domain name (default: http://localhost:1377) |
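
Docker Compose reads these variables from the shell or from a `.env` file next to the compose file. A minimal sketch for an S3-backed setup, with placeholder values you would replace with your own:

```bash
# .env (example values only; placeholders, not real credentials)
S3_BUCKET_NAME=my-jan-bucket
AWS_ACCESS_KEY_ID=your-access-key-id
AWS_SECRET_ACCESS_KEY=your-secret-access-key
AWS_ENDPOINT=https://s3.us-east-1.amazonaws.com
AWS_REGION=us-east-1
API_BASE_URL=http://localhost:1377
```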
### Run Jan in Docker

You can run Jan in Docker in one of two modes:
1. CPU mode
2. GPU mode
<Tabs groupId="ldocker_type">
<TabItem value="docker_cpu" label="CPU">
To run Jan in CPU mode, use one of the following commands:
```bash
# cpu mode with default file system
docker compose --profile cpu-fs up -d
# cpu mode with S3 file system
docker compose --profile cpu-s3fs up -d
```
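
Either way, you can confirm the containers started and watch the logs with standard Compose commands:

```bash
# List the services started by the active profile
docker compose ps

# Follow the logs; Jan should come up at API_BASE_URL (default http://localhost:1377)
docker compose logs -f
```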
</TabItem>
<TabItem value="docker_gpu" label="GPU">
To run Jan in GPU mode, follow the steps below:
1. Check CUDA compatibility with your NVIDIA driver by running `nvidia-smi` and checking the CUDA version in the output, as shown below:
```sh
nvidia-smi
# Output
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.18 Driver Version: 531.18 CUDA Version: 12.1 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 NVIDIA GeForce RTX 4070 Ti WDDM | 00000000:01:00.0 On | N/A |
| 0% 44C P8 16W / 285W| 1481MiB / 12282MiB | 2% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 1 NVIDIA GeForce GTX 1660 Ti WDDM | 00000000:02:00.0 Off | N/A |
| 0% 49C P8 14W / 120W| 0MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
| 2 NVIDIA GeForce GTX 1660 Ti WDDM | 00000000:05:00.0 Off | N/A |
| 29% 38C P8 11W / 120W| 0MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
```
2. Visit the [NVIDIA NGC Catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags) and find the smallest image tag that matches your CUDA version (e.g., 12.1 -> 12.1.0).
3. Update line 5 of `Dockerfile.gpu` with the image tag from step 2 (e.g., change `FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04 AS base` to `FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04 AS base`); see the one-line sketch after this list.
4. Run Jan in GPU mode by using one of the following commands:
```bash
# GPU mode with default file system
docker compose --profile gpu-fs up -d
# GPU mode with S3 file system
docker compose --profile gpu-s3fs up -d
```
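
For step 3 above, assuming GNU `sed` and that `Dockerfile.gpu` is in the current directory, the base-image tag can be swapped with a one-liner like the following (a sketch, not the only way to make the edit):

```bash
# Point line 5 of Dockerfile.gpu at the CUDA 12.1.0 runtime image
sed -i '5s|nvidia/cuda:[0-9.]\+-runtime-ubuntu22.04|nvidia/cuda:12.1.0-runtime-ubuntu22.04|' Dockerfile.gpu
```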
</TabItem>
</Tabs>
:::warning
If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/broken-build) section of Common Errors.
:::
</TabItem>
</Tabs>


@ -1,9 +1,8 @@
--- ---
title: Pre-configured Models
sidebar_position: 3 sidebar_position: 3
--- ---
# Pre-configured Models
## Overview ## Overview
Jan provides various pre-configured AI models with different capabilities. Please see the following list for details. Jan provides various pre-configured AI models with different capabilities. Please see the following list for details.
@@ -14,18 +13,18 @@ Jan provides various pre-configured AI models with different capabilities. Pleas
| OpenHermes Neural 7B Q4 | A merged model using the TIES method. It performs well in various benchmarks |
| Stealth 7B Q4 | This is a new experimental family designed to enhance Mathematical and Logical abilities |
| Trinity-v1.2 7B Q4 | An experimental model merge using the Slerp method |
| Openchat-3.5 7B Q4 | An open-source model whose performance surpasses that of ChatGPT-3.5 and Grok-1 across various benchmarks |
| Wizard Coder Python 13B Q5 | A Python coding model that demonstrates high proficiency in specific domains like coding and mathematics |
| OpenAI GPT 3.5 Turbo | The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug that caused a text encoding issue for non-English language function calls |
| OpenAI GPT 3.5 Turbo 16k 0613 | A snapshot model of gpt-3.5-16k-turbo from June 13th, 2023 |
| OpenAI GPT 4 | The latest GPT-4 model intended to reduce cases of “laziness” where the model doesn't complete a task |
| TinyLlama Chat 1.1B Q4 | A tiny model with only 1.1B parameters. It's a good model for less powerful computers |
| Deepseek Coder 1.3B Q8 | A model that excels in project-level code completion with advanced capabilities across multiple programming languages |
| Phi-2 3B Q8 | A 2.7B model, excelling in common sense and logical reasoning benchmarks, trained with synthetic texts and filtered websites |
| Llama 2 Chat 7B Q4 | A model that is specifically designed for a comprehensive understanding through training on extensive internet data |
| CodeNinja 7B Q4 | A model that is good for coding tasks and can handle various languages, including Python, C, C++, Rust, Java, JavaScript, and more |
| Noromaid 7B Q5 | A model designed for role-playing with human-like behavior |
| Starling alpha 7B Q4 | An upgrade of Openchat 3.5 using RLAIF that performs well on various benchmarks, especially with GPT-4 judging its performance |
| Yarn Mistral 7B Q4 | A long-context language model that supports a 128k token context window |
| LlaVa 1.5 7B Q5 K | A model that brings vision understanding to Jan |
| BakLlava 1 | A model that brings vision understanding to Jan |
@@ -33,16 +32,16 @@ Jan provides various pre-configured AI models with different capabilities. Pleas
| LlaVa 1.5 13B Q5 K | A model that brings vision understanding to Jan |
| Deepseek Coder 33B Q5 | A model that excels in project-level code completion with advanced capabilities across multiple programming languages |
| Phind 34B Q5 | A multi-lingual model that is fine-tuned on 1.5B tokens of high-quality programming data, excels in various programming languages, and is designed to be steerable and user-friendly |
| Yi 34B Q5 | A specialized chat model known for its diverse and creative responses that excels across various NLP tasks and benchmarks |
| Capybara 200k 34B Q5 | A long context length model that supports 200K tokens |
| Dolphin 8x7B Q4 | An uncensored model built on Mixtral-8x7b that is good at programming tasks |
| Mixtral 8x7B Instruct Q4 | A pre-trained generative Sparse Mixture of Experts that outperforms 70B models on most benchmarks |
| Tulu 2 70B Q4 | A strong alternative to Llama 2 70b Chat for use as a helpful assistant |
| Llama 2 Chat 70B Q4 | A model that is specifically designed for a comprehensive understanding through training on extensive internet data |
:::note
OpenAI GPT models require a subscription to use them further. To learn more, [click here](https://openai.com/pricing).
:::
@@ -69,13 +68,3 @@ OpenAI GPT models requires a subscription in order to use them further. To learn
| Yarn Mistral 7B Q4 | NousResearch, The Bloke | `yarn-mistral-7b` | **GGUF** | 4.07GB |
| LlaVa 1.5 7B Q5 K | Mys | `llava-1.5-7b-q5` | **GGUF** | 5.03GB |
| BakLlava 1 | Mys | `bakllava-1` | **GGUF** | 5.36GB |
| Solar Slerp 10.7B Q4 | Jan | `solar-10.7b-slerp` | **GGUF** | 5.92GB |
| LlaVa 1.5 13B Q5 K | Mys | `llava-1.5-13b-q5` | **GGUF** | 9.17GB |
| Deepseek Coder 33B Q5 | Deepseek, The Bloke | `deepseek-coder-34b` | **GGUF** | 18.57GB |
| Phind 34B Q5 | Phind, The Bloke | `phind-34b` | **GGUF** | 18.83GB |
| Yi 34B Q5 | 01-ai, The Bloke | `yi-34b` | **GGUF** | 19.24GB |
| Capybara 200k 34B Q5 | NousResearch, The Bloke | `capybara-34b` | **GGUF** | 22.65GB |
| Dolphin 8x7B Q4 | Cognitive Computations, TheBloke | `dolphin-2.7-mixtral-8x7b` | **GGUF** | 24.62GB |
| Mixtral 8x7B Instruct Q4 | MistralAI, TheBloke | `mixtral-8x7b-instruct` | **GGUF** | 24.62GB |
| Tulu 2 70B Q4 | Lizpreciatior, The Bloke | `tulu-2-70b` | **GGUF** | 38.56GB |
| Llama 2 Chat 70B Q4 | MetaAI, The Bloke | `llama2-chat-70b-q4` | **GGUF** | 40.90GB |


@@ -4,6 +4,7 @@ hide_table_of_contents: true
---

import installImageURL from './assets/jan-ai-quickstart.png';
import flow from './assets/quick.png';

# Quickstart
@@ -28,12 +29,19 @@ import installImageURL from './assets/jan-ai-quickstart.png';
3. Go to the **Hub** under the **Thread** section and select the AI model that you want to use. For more info, go to the [Using Models](category/using-models) section.
4. A new thread will be added. You can use Jan in the thread with the AI model that you selected before. */}

<div class="text--center">
  <img src={flow} width={800} alt="Flow" />
</div>
To get started quickly with Jan, follow the steps below:
### Step 1: Install Jan

Go to [Jan.ai](https://jan.ai/) > Select your operating system > Install the program.

:::note
To learn more about system requirements for your operating system, go to the [Installation guide](/docs/install).
:::
### Step 2: Select AI Model
@@ -43,7 +51,7 @@ Each model has their purposes, capabilities, and different requirements.
To select AI models: Go to the **Hub** > select the models that you would like to install.

For more info, go to [list of supported models](/docs/models-list/).

### Step 3: Use the AI Model