diff --git a/docs/docs/quickstart/install.mdx b/docs/docs/quickstart/install.mdx index 53c0a7a1c..d96246a53 100644 --- a/docs/docs/quickstart/install.mdx +++ b/docs/docs/quickstart/install.mdx @@ -1,22 +1,30 @@ --- +title: Installation sidebar_position: 2 hide_table_of_contents: true +description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server. +keywords: + [ + Jan AI, + Jan, + ChatGPT alternative, + local AI, + private AI, + conversational AI, + no-subscription fee, + large language model, + ] --- import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem'; import installImageURL from './assets/jan-ai-download.png'; -# Installation - -:::warning - -Ensure that your MacOS version is 13 or higher to run Jan. - -::: + ### Pre-requisites + Ensure that your macOS version is 13 or higher to run Jan. ### Stable Releases @@ -43,16 +51,13 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/ -:::warning - -Ensure that your system meets the following requirements: - - Windows 10 or higher is required to run Jan. - -To enable GPU support, you will need: - - NVIDIA GPU with CUDA Toolkit 11.7 or higher - - NVIDIA driver 470.63.01 or higher - -::: + ### Pre-requisites + Ensure that your system meets the following requirements: + - Windows 10 or higher is required to run Jan. + + To enable GPU support, you will need: + - NVIDIA GPU with CUDA Toolkit 11.7 or higher + - NVIDIA driver 470.63.01 or higher ### Stable Releases @@ -88,15 +93,14 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/ -:::warning -Ensure that your system meets the following requirements: - - glibc 2.27 or higher (check with `ldd --version`) - - gcc 11, g++ 11, cpp 11, or higher, refer to this link for more information. 
+ ### Pre-requisites + Ensure that your system meets the following requirements: + - glibc 2.27 or higher (check with `ldd --version`) + - gcc 11, g++ 11, cpp 11, or higher; refer to this link for more information. -To enable GPU support, you will need: - - NVIDIA GPU with CUDA Toolkit 11.7 or higher - - NVIDIA driver 470.63.01 or higher -::: + To enable GPU support, you will need: + - NVIDIA GPU with CUDA Toolkit 11.7 or higher + - NVIDIA driver 470.63.01 or higher ### Stable Releases @@ -154,4 +158,117 @@ If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/ ::: + + + ### Pre-requisites + Ensure that your system meets the following requirements: + - Linux or WSL2 + - Latest Docker Engine and Docker Compose + + To enable GPU support, you will need: + - `nvidia-driver` + - `nvidia-docker2` + +:::note +- If you have not installed Docker, follow the instructions [here](https://docs.docker.com/engine/install/ubuntu/). +- If you have not installed the required components for GPU support, follow the instructions [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html). 
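A quick way to sanity-check these prerequisites before running Docker Compose is to confirm the relevant tools are on your `PATH` (a minimal sketch; it only detects the commands and does not validate driver or CUDA versions):

```shell
# Report whether each tool required for Docker (and GPU support) is installed
for cmd in docker nvidia-smi; do
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd: installed"
  else
    echo "$cmd: missing"
  fi
done

# Docker Compose v2 is a Docker CLI plugin, so it needs a separate check
docker compose version >/dev/null 2>&1 && echo "docker compose: installed" || echo "docker compose: missing"
```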
+::: + + ### Docker Compose Profiles and Environment Variables + Before diving into the steps to run Jan in Docker, make sure you understand the Docker Compose profiles and environment variables listed below: + + #### Docker Compose Profiles + + | Profile | Description | + |-----------|-------------------------------------------| + | cpu-fs | Run Jan in CPU mode with default file system | + | cpu-s3fs | Run Jan in CPU mode with S3 file system | + | gpu-fs | Run Jan in GPU mode with default file system | + | gpu-s3fs | Run Jan in GPU mode with S3 file system | + + #### Environment Variables + + | Environment Variable | Description | + |--------------------------|------------------------------------------------------------| + | S3_BUCKET_NAME | S3 bucket name - leave blank for default file system | + | AWS_ACCESS_KEY_ID | AWS access key ID - leave blank for default file system | + | AWS_SECRET_ACCESS_KEY | AWS secret access key - leave blank for default file system | + | AWS_ENDPOINT | AWS endpoint URL - leave blank for default file system | + | AWS_REGION | AWS region - leave blank for default file system | + | API_BASE_URL | Jan Server URL - set this to your public IP address or domain name (default: http://localhost:1377) | + + ### Run Jan in Docker + You can run Jan in Docker in one of two modes: + 1. Run Jan in CPU mode + 2. Run Jan in GPU mode + + + + To run Jan in CPU mode, use the following commands: + + ```bash + # CPU mode with default file system + docker compose --profile cpu-fs up -d + + # CPU mode with S3 file system + docker compose --profile cpu-s3fs up -d + ``` + + + + + To run Jan in GPU mode, follow the steps below: + 1. 
Check CUDA compatibility with your NVIDIA driver by running `nvidia-smi` and checking the CUDA version in the output, as shown below: + ```sh + nvidia-smi + + # Output + +---------------------------------------------------------------------------------------+ + | NVIDIA-SMI 531.18 Driver Version: 531.18 CUDA Version: 12.1 | + |-----------------------------------------+----------------------+----------------------+ + | GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC | + | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | + | | | MIG M. | + |=========================================+======================+======================| + | 0 NVIDIA GeForce RTX 4070 Ti WDDM | 00000000:01:00.0 On | N/A | + | 0% 44C P8 16W / 285W| 1481MiB / 12282MiB | 2% Default | + | | | N/A | + +-----------------------------------------+----------------------+----------------------+ + | 1 NVIDIA GeForce GTX 1660 Ti WDDM | 00000000:02:00.0 Off | N/A | + | 0% 49C P8 14W / 120W| 0MiB / 6144MiB | 0% Default | + | | | N/A | + +-----------------------------------------+----------------------+----------------------+ + | 2 NVIDIA GeForce GTX 1660 Ti WDDM | 00000000:05:00.0 Off | N/A | + | 29% 38C P8 11W / 120W| 0MiB / 6144MiB | 0% Default | + | | | N/A | + +-----------------------------------------+----------------------+----------------------+ + + +---------------------------------------------------------------------------------------+ + | Processes: | + | GPU GI CI PID Type Process name GPU Memory | + | ID ID Usage | + |=======================================================================================| + ``` + 2. Visit [NVIDIA NGC Catalog](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda/tags) and find the smallest minor version of image tag that matches your CUDA version (e.g., 12.1 -> 12.1.0) + 3. Update line 5 of `Dockerfile.gpu` with the image tag from step 2 (e.g. 
change `FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04 AS base` to `FROM nvidia/cuda:12.1.0-runtime-ubuntu22.04 AS base`) + 4. Run Jan in GPU mode by using the following command: + + ```bash + # GPU mode with default file system + docker compose --profile gpu-fs up -d + + # GPU mode with S3 file system + docker compose --profile gpu-s3fs up -d + ``` + + + + +:::warning + +If you are stuck in a broken build, go to the [Broken Build](/docs/common-error/broken-build) section of Common Errors. + +::: + + \ No newline at end of file diff --git a/docs/docs/quickstart/models-list.mdx b/docs/docs/quickstart/models-list.mdx index a939b0248..cd7107a92 100644 --- a/docs/docs/quickstart/models-list.mdx +++ b/docs/docs/quickstart/models-list.mdx @@ -1,9 +1,8 @@ --- +title: Pre-configured Models sidebar_position: 3 --- -# Pre-configured Models - ## Overview Jan provides various pre-configured AI models with different capabilities. Please see the following list for details. @@ -14,18 +13,18 @@ Jan provides various pre-configured AI models with different capabilities. Pleas | OpenHermes Neural 7B Q4 | A merged model using the TIES method. 
It performs well in various benchmarks | | Stealth 7B Q4 | This is a new experimental family designed to enhance Mathematical and Logical abilities | | Trinity-v1.2 7B Q4 | An experimental model merge using the Slerp method | -| Openchat-3.5 7B Q4 | An open-source model that has the performance that surpasses that of ChatGPT-3.5 and Grok-1 across various benchmarks | +| Openchat-3.5 7B Q4 | An open-source model whose performance surpasses that of ChatGPT-3.5 and Grok-1 across various benchmarks | | Wizard Coder Python 13B Q5 | A Python coding model that demonstrates high proficiency in specific domains like coding and mathematics | -| OpenAI GPT 3.5 Turbo | The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug which caused a text encoding issue for non-English language function calls | +| OpenAI GPT 3.5 Turbo | The latest GPT-3.5 Turbo model with higher accuracy at responding in requested formats and a fix for a bug that caused a text encoding issue for non-English language function calls | | OpenAI GPT 3.5 Turbo 16k 0613 | A Snapshot model of gpt-3.5-16k-turbo from June 13th 2023 | | OpenAI GPT 4 | The latest GPT-4 model intended to reduce cases of “laziness” where the model doesn't complete a task | | TinyLlama Chat 1.1B Q4 | A tiny model with only 1.1B. 
It's a good model for less powerful computers | | Deepseek Coder 1.3B Q8 | A model that excelled in project-level code completion with advanced capabilities across multiple programming languages | | Phi-2 3B Q8 | a 2.7B model, excelling in common sense and logical reasoning benchmarks, trained with synthetic texts and filtered websites | | Llama 2 Chat 7B Q4 | A model that is specifically designed for a comprehensive understanding through training on extensive internet data | -| CodeNinja 7B Q4 | A model that is is good for coding tasks and can handle various languages including Python, C, C++, Rust, Java, JavaScript, and more | -| Noromaid 7B Q5 | A model that is designed for role-playing with human-like behavior. | -| Starling alpha 7B Q4 | An upgrade of Openchat 3.5 using RLAIF, is really good at various benchmarks, especially with GPT-4 judging its performance | +| CodeNinja 7B Q4 | A model that is good for coding tasks and can handle various languages, including Python, C, C++, Rust, Java, JavaScript, and more | +| Noromaid 7B Q5 | A model designed for role-playing with human-like behavior. | +| Starling alpha 7B Q4 | An upgrade of Openchat 3.5 using RLAIF that performs well on various benchmarks, especially with GPT-4 judging its performance | | Yarn Mistral 7B Q4 | A language model for long context and supports a 128k token context window | | LlaVa 1.5 7B Q5 K | A model can bring vision understanding to Jan | | BakLlava 1 | A model can bring vision understanding to Jan | @@ -33,16 +32,16 @@ Jan provides various pre-configured AI models with different capabilities. 
Pleas | LlaVa 1.5 13B Q5 K | A model can bring vision understanding to Jan | | Deepseek Coder 33B Q5 | A model that excelled in project-level code completion with advanced capabilities across multiple programming languages | | Phind 34B Q5 | A multi-lingual model that is fine-tuned on 1.5B tokens of high-quality programming data, excels in various programming languages, and is designed to be steerable and user-friendly | -| Yi 34B Q5 | A specialized chat model, is known for its diverse and creative responses and excels across various NLP tasks and benchmarks | +| Yi 34B Q5 | A specialized chat model is known for its diverse and creative responses and excels across various NLP tasks and benchmarks | | Capybara 200k 34B Q5 | A long context length model that supports 200K tokens | | Dolphin 8x7B Q4 | An uncensored model built on Mixtral-8x7b and it is good at programming tasks | -| Mixtral 8x7B Instruct Q4 | A pretrained generative Sparse Mixture of Experts, which outperforms 70B models on most benchmarks | +| Mixtral 8x7B Instruct Q4 | A pre-trained generative Sparse Mixture of Experts, which outperforms 70B models on most benchmarks | | Tulu 2 70B Q4 | A strong model alternative to Llama 2 70b Chat to act as helpful assistants | | Llama 2 Chat 70B Q4 | A model that is specifically designed for a comprehensive understanding through training on extensive internet data | :::note -OpenAI GPT models requires a subscription in order to use them further. To learn more, [click here](https://openai.com/pricing). +OpenAI GPT models require a subscription to use them further. To learn more, [click here](https://openai.com/pricing). ::: @@ -69,13 +68,3 @@ OpenAI GPT models requires a subscription in order to use them further. 
To learn | Yarn Mistral 7B Q4 | NousResearch, The Bloke | `yarn-mistral-7b` | **GGUF** | 4.07GB | | LlaVa 1.5 7B Q5 K | Mys | `llava-1.5-7b-q5` | **GGUF** | 5.03GB | | BakLlava 1 | Mys | `bakllava-1` | **GGUF** | 5.36GB | -| Solar Slerp 10.7B Q4 | Jan | `solar-10.7b-slerp` | **GGUF** | 5.92GB | -| LlaVa 1.5 13B Q5 K | Mys | `llava-1.5-13b-q5` | **GGUF** | 9.17GB | -| Deepseek Coder 33B Q5 | Deepseek, The Bloke | `deepseek-coder-34b` | **GGUF** | 18.57GB | -| Phind 34B Q5 | Phind, The Bloke | `phind-34b` | **GGUF** | 18.83GB | -| Yi 34B Q5 | 01-ai, The Bloke | `yi-34b` | **GGUF** | 19.24GB | -| Capybara 200k 34B Q5 | NousResearch, The Bloke | `capybara-34b` | **GGUF** | 22.65GB | -| Dolphin 8x7B Q4 | Cognitive Computations, TheBloke | `dolphin-2.7-mixtral-8x7b` | **GGUF** | 24.62GB | -| Mixtral 8x7B Instruct Q4 | MistralAI, TheBloke | `mixtral-8x7b-instruct` | **GGUF** | 24.62GB | -| Tulu 2 70B Q4 | Lizpreciatior, The Bloke | `tulu-2-70b` | **GGUF** | 38.56GB | -| Llama 2 Chat 70B Q4 | MetaAI, The Bloke | `llama2-chat-70b-q4` | **GGUF** | 40.90GB | \ No newline at end of file diff --git a/docs/docs/quickstart/quickstart.mdx b/docs/docs/quickstart/quickstart.mdx index c84e21d54..c00fbfd5c 100644 --- a/docs/docs/quickstart/quickstart.mdx +++ b/docs/docs/quickstart/quickstart.mdx @@ -4,6 +4,7 @@ hide_table_of_contents: true --- import installImageURL from './assets/jan-ai-quickstart.png'; +import flow from './assets/quick.png'; # Quickstart @@ -28,12 +29,19 @@ import installImageURL from './assets/jan-ai-quickstart.png'; 3. Go to the **Hub** under the **Thread** section and select the AI model that you want to use. For more info, go to the [Using Models](category/using-models) section. 4. A new thread will be added. You can use Jan in the thread with the AI model that you selected before. */} +
+<img src={flow} alt="Flow" /> +
+ +To get started quickly with Jan, follow the steps below: ### Step 1: Install Jan Go to [Jan.ai](https://jan.ai/) > Select your operating system > Install the program. -To learn more about system requirements for your operating system, go to [Installation guide](/quickstart/install). +:::note +To learn more about system requirements for your operating system, go to [Installation guide](/docs/install). +::: ### Step 2: Select AI Model @@ -43,7 +51,7 @@ Each model has their purposes, capabilities, and different requirements. To select AI models: Go to the **Hub** > select the models that you would like to install. -For more info, go to [list of supported models](/quickstart/models-list/). +For more info, go to [list of supported models](/docs/models-list/). ### Step 3: Use the AI Model