Merge branch 'main' into feat-1346

0xSage 2024-01-11 11:24:20 +08:00 committed by GitHub
commit 2371791385
114 changed files with 2284 additions and 1295 deletions

View File

@ -28,6 +28,9 @@ on:
- "node_modules/**"
- "yarn.lock"
- "Makefile"
- "extensions/**"
- "core/**"
- "!README.md"
jobs:
test-on-macos:

View File

@ -37,7 +37,7 @@ jobs:
sed -i "s|<a href='https://github.com/janhq/jan/releases/download/v.*/jan-mac-x64-.*'>|<a href='https://github.com/janhq/jan/releases/download/v${release}/jan-mac-x64-${release}.dmg'>|" README.md
sed -i "s|<a href='https://github.com/janhq/jan/releases/download/v.*/jan-mac-arm64-.*'>|<a href='https://github.com/janhq/jan/releases/download/v${release}/jan-mac-arm64-${release}.dmg'>|" README.md
sed -i "s|<a href='https://github.com/janhq/jan/releases/download/v.*/jan-linux-amd64-.*'>|<a href='https://github.com/janhq/jan/releases/download/v${release}/jan-linux-amd64-${release}.deb'>|" README.md
sed -i "s|<a href='https://github.com/janhq/jan/releases/download/v.*/jan-linux-amd64-.*'>|<a href='https://github.com/janhq/jan/releases/download/v${release}/jan-linux-x86_64-${release}.AppImage'>|" README.md
sed -i "s|<a href='https://github.com/janhq/jan/releases/download/v.*/jan-linux-x86_64-.*'>|<a href='https://github.com/janhq/jan/releases/download/v${release}/jan-linux-x86_64-${release}.AppImage'>|" README.md
- name: Commit and Push changes
if: github.event_name == 'release'

View File

@ -76,31 +76,31 @@ Jan is an open-source ChatGPT alternative that runs 100% offline on your compute
<tr style="text-align:center">
<td style="text-align:center"><b>Experimental (Nightly Build)</b></td>
<td style="text-align:center">
<a href='https://delta.jan.ai/0.4.3-135/jan-win-x64-0.4.3-135.exe'>
<a href='https://delta.jan.ai/0.4.3-139/jan-win-x64-0.4.3-139.exe'>
<img src='./docs/static/img/windows.png' style="height:14px; width: 14px" />
<b>jan.exe</b>
</a>
</td>
<td style="text-align:center">
<a href='https://delta.jan.ai/0.4.3-135/jan-mac-x64-0.4.3-135.dmg'>
<a href='https://delta.jan.ai/0.4.3-139/jan-mac-x64-0.4.3-139.dmg'>
<img src='./docs/static/img/mac.png' style="height:15px; width: 15px" />
<b>Intel</b>
</a>
</td>
<td style="text-align:center">
<a href='https://delta.jan.ai/0.4.3-135/jan-mac-arm64-0.4.3-135.dmg'>
<a href='https://delta.jan.ai/0.4.3-139/jan-mac-arm64-0.4.3-139.dmg'>
<img src='./docs/static/img/mac.png' style="height:15px; width: 15px" />
<b>M1/M2</b>
</a>
</td>
<td style="text-align:center">
<a href='https://delta.jan.ai/0.4.3-135/jan-linux-amd64-0.4.3-135.deb'>
<a href='https://delta.jan.ai/0.4.3-139/jan-linux-amd64-0.4.3-139.deb'>
<img src='./docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.deb</b>
</a>
</td>
<td style="text-align:center">
<a href='https://delta.jan.ai/0.4.3-135/jan-linux-x86_64-0.4.3-135.AppImage'>
<a href='https://delta.jan.ai/0.4.3-139/jan-linux-x86_64-0.4.3-139.AppImage'>
<img src='./docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.AppImage</b>
</a>
@ -166,6 +166,20 @@ To reset your installation:
- Delete all `node_modules` in current folder
- Clear Application cache in `~/Library/Caches/jan`
## Requirements for running Jan
- macOS: 13 or higher
- Windows:
- Windows 10 or higher
- To enable GPU support:
- Nvidia GPU with CUDA Toolkit 11.4 or higher
- Nvidia driver 470.63.01 or higher
- Linux:
- glibc 2.27 or higher (check with `ldd --version`)
- gcc 11, g++ 11, and cpp 11 or higher; refer to this [link](https://jan.ai/guides/troubleshooting/gpu-not-used/#specific-requirements-for-linux) for more information
- To enable GPU support:
- Nvidia GPU with CUDA Toolkit 11.4 or higher
- Nvidia driver 470.63.01 or higher
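As a quick sanity check (a minimal sketch; it assumes the relevant tools are already installed and on your `PATH`), you can verify the Linux and GPU requirements from a terminal:

```bash
# Check glibc (needs 2.27 or higher)
ldd --version | head -n 1
# Check the GNU toolchain (needs gcc/g++ 11 or higher)
gcc --version | head -n 1
# Check the NVIDIA driver (needs 470.63.01 or higher)
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Check the CUDA Toolkit (needs 11.4 or higher)
nvcc --version
```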
## Contributing
Contributions are welcome! Please read the [CONTRIBUTING.md](CONTRIBUTING.md) file

View File

@ -1,92 +0,0 @@
## Requirements for running Jan App in GPU mode on Windows and Linux
- You must have an NVIDIA driver that supports CUDA 11.4 or higher. Refer [here](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver).
To check if the NVIDIA driver is installed, open PowerShell or Terminal and enter the following command:
```bash
nvidia-smi
```
If you see a result similar to the following, you have successfully installed the NVIDIA driver:
```bash
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... Off | 00000000:01:00.0 On | N/A |
| 0% 51C P8 10W / 170W | 364MiB / 7982MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
```
- You must have CUDA 11.4 or higher (refer [here](https://developer.nvidia.com/cuda-toolkit-archive)).
To check if CUDA is installed, open PowerShell or Terminal and enter the following command:
```bash
nvcc --version
```
If you see a result similar to the following, you have successfully installed CUDA:
```bash
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.4, V11.4.100
Build cuda_11.4.r11.4/compiler.30033411_0
```
- Specifically for Linux:
- you must have `gcc-11`, `g++-11`, `cpp-11` or higher, refer [here](https://gcc.gnu.org/projects/cxx-status.html#cxx17). For Ubuntu, you can install g++ 11 by following the instructions [here](https://linuxconfig.org/how-to-switch-between-multiple-gcc-and-g-compiler-versions-on-ubuntu-20-04-lts-focal-fossa).
```bash
# Example for ubuntu
# Add the following PPA repository
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
# Update the package list
sudo apt update
# Install g++ 11
sudo apt-get install -y gcc-11 g++-11 cpp-11
# Update the default g++ version
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 110 \
--slave /usr/bin/g++ g++ /usr/bin/g++-11 \
--slave /usr/bin/gcov gcov /usr/bin/gcov-11 \
--slave /usr/bin/gcc-ar gcc-ar /usr/bin/gcc-ar-11 \
--slave /usr/bin/gcc-ranlib gcc-ranlib /usr/bin/gcc-ranlib-11
sudo update-alternatives --install /usr/bin/cpp cpp /usr/bin/cpp-11 110
# Check the default g++ version
g++ --version
```
- You must add the `.so` libraries of CUDA to the `LD_LIBRARY_PATH` environment variable, refer [here](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions).
```bash
# Example for ubuntu with CUDA 11.4
sudo nano /etc/environment
# Add /usr/local/cuda-11.4/bin to the PATH environment variable - the first line
# Add the following line to the end of the file
LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64
# Save and exit
# Restart your computer or log out and log in again, the changes will take effect
```
## How to switch mode CPU/GPU Jan app
By default, the Jan app runs in CPU mode. On startup, it automatically checks whether your computer meets the requirements for GPU mode. If it does, GPU mode is enabled automatically and the GPU with the highest VRAM is selected (a feature allowing users to select one or more GPU devices is currently in planning). You can check whether you are running in CPU or GPU mode in the Settings/Advanced section of the Jan app (see image below). ![](/docs/static/img/usage/jan-gpu-enable-setting.png)
If your machine supports GPU mode but it is not enabled by default, one of the following issues may be the cause; follow the corresponding steps to fix it:
1. You have not installed the NVIDIA driver, refer to the NVIDIA driver that supports CUDA 11.4 [here](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver).
2. You have not installed the CUDA toolkit or your CUDA toolkit is not compatible with the NVIDIA driver, refer to CUDA compatibility [here](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver).
3. You have not installed a CUDA compatible driver, refer [here](https://docs.nvidia.com/deploy/cuda-compatibility/index.html#binary-compatibility__table-toolkit-driver), and you must add the `.so` libraries of CUDA and the CUDA compatible driver to the `LD_LIBRARY_PATH` environment variable, refer [here](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions). For Windows, add the `.dll` libraries of CUDA and the CUDA compatible driver to the `PATH` environment variable. Usually, when installing CUDA on Windows, this environment variable is automatically added, but if you do not see it, you can add it manually by referring [here](https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#environment-setup).
## To check the current GPU-related settings that Jan app has detected, you can go to the Settings/Advanced section as shown in the image below.
![](/docs/static/img/usage/jan-open-home-directory.png)
![](/docs/static/img/usage/jan-open-settings-1.png)
![](/docs/static/img/usage/jan-open-settings-2.png)
![](/docs/static/img/usage/jan-open-settings-3.png)
If you have an issue with GPU mode, sharing your `settings.json` with us will help us solve the problem faster.
## Tested on
- Windows 11 Pro 64-bit, NVIDIA GeForce RTX 4070ti GPU, CUDA 12.2, NVIDIA driver 531.18 (Bare metal)
- Ubuntu 22.04 LTS, NVIDIA GeForce RTX 4070ti GPU, CUDA 12.2, NVIDIA driver 545 (Bare metal)
- Ubuntu 18.04 LTS, NVIDIA GeForce GTX 1660ti GPU, CUDA 12.1, NVIDIA driver 535 (Proxmox VM passthrough GPU)
- Ubuntu 20.04 LTS, NVIDIA GeForce GTX 1660ti GPU, CUDA 12.1, NVIDIA driver 535 (Proxmox VM passthrough GPU)

View File

@ -53,6 +53,8 @@ export default [
'crypto',
'url',
'http',
'os',
'util'
],
watch: {
include: 'src/node/**',

View File

@ -12,6 +12,7 @@ export enum AppRoute {
baseName = 'baseName',
startServer = 'startServer',
stopServer = 'stopServer',
log = 'log'
}
export enum AppEvent {

View File

@ -77,13 +77,12 @@ const openExternalUrl: (url: string) => Promise<any> = (url) =>
const getResourcePath: () => Promise<string> = () => global.core.api?.getResourcePath()
/**
* Gets the file's stats.
* Log to file from browser processes.
*
* @param path - The path to the file.
* @returns {Promise<FileStat>} - A promise that resolves with the file's stats.
* @param message - Message to log.
*/
const fileStat: (path: string) => Promise<FileStat | undefined> = (path) =>
global.core.api?.fileStat(path)
const log: (message: string, fileName?: string) => void = (message, fileName) =>
global.core.api?.log(message, fileName)
/**
* Register extension point function type definition
@ -108,5 +107,6 @@ export {
joinPath,
openExternalUrl,
baseName,
fileStat,
log,
FileStat
}
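A minimal usage sketch for the new `log` API (the import path and the custom file name are assumptions for illustration):

```typescript
import { log } from '@janhq/core'

// Messages without a bracketed prefix get tagged as [APP]:: and timestamped
// by the node-side implementation before being appended to the log file.
log('Model download started')

// Optionally write to a different log file under the logs directory (illustrative name).
log('[SERVER]::Listening for requests', 'server.log')
```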

View File

@ -1,3 +1,5 @@
import { FileStat } from "./types"
/**
* Writes data to a file at the specified path.
* @returns {Promise<any>} A Promise that resolves when the file is written successfully.
@ -58,6 +60,17 @@ const syncFile: (src: string, dest: string) => Promise<any> = (src, dest) =>
*/
const copyFileSync = (...args: any[]) => global.core.api?.copyFileSync(...args)
/**
* Gets the file's stats.
*
* @param path - The path to the file.
* @returns {Promise<FileStat>} - A promise that resolves with the file's stats.
*/
const fileStat: (path: string) => Promise<FileStat | undefined> = (path) =>
global.core.api?.fileStat(path)
// TODO: Export `dummy` fs functions automatically
// Currently adding these manually
export const fs = {
@ -71,4 +84,5 @@ export const fs = {
appendFileSync,
copyFileSync,
syncFile,
fileStat
}
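A short usage sketch for the newly exported `fileStat` (the import path and file path are illustrative assumptions):

```typescript
import { fs, joinPath } from '@janhq/core'

async function inspectModelFile() {
  // Illustrative path relative to Jan's data folder.
  const modelPath = await joinPath(['models', 'mistral-7b', 'model.gguf'])
  const stat = await fs.fileStat(modelPath) // resolves to undefined when unavailable
  console.debug(stat ? JSON.stringify(stat) : 'file not found')
}
```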

View File

@ -1,13 +1,8 @@
import fs from 'fs'
import { JanApiRouteConfiguration, RouteConfiguration } from './configuration'
import { join } from 'path'
import { Model, ThreadMessage } from './../../../index'
import { ContentType, MessageStatus, Model, ThreadMessage } from './../../../index'
import fetch from 'node-fetch'
import { ulid } from 'ulid'
import request from 'request'
const progress = require('request-progress')
const os = require('os')
const path = join(os.homedir(), 'jan')
@ -207,17 +202,28 @@ const generateThreadId = (assistantId: string) => {
export const createMessage = async (threadId: string, message: any) => {
const threadMessagesFileName = 'messages.jsonl'
// TODO: add validation
try {
const { ulid } = require('ulid')
const msgId = ulid()
const createdAt = Date.now()
const threadMessage: ThreadMessage = {
...message,
id: msgId,
thread_id: threadId,
status: MessageStatus.Ready,
created: createdAt,
updated: createdAt,
object: 'thread.message',
role: message.role,
content: [
{
type: ContentType.Text,
text: {
value: message.content,
annotations: [],
},
},
],
}
const threadDirPath = join(path, 'threads', threadId)
@ -250,8 +256,10 @@ export const downloadModel = async (modelId: string) => {
// path to model binary
const modelBinaryPath = join(directoryPath, modelId)
const rq = request(model.source_url)
const request = require('request')
const rq = request(model.source_url)
const progress = require('request-progress')
progress(rq, {})
.on('progress', function (state: any) {
console.log('progress', JSON.stringify(state, null, 2))
@ -312,8 +320,9 @@ export const chatCompletions = async (request: any, reply: any) => {
headers['Authorization'] = `Bearer ${apiKey}`
headers['api-key'] = apiKey
}
console.log(apiUrl)
console.log(JSON.stringify(headers))
console.debug(apiUrl)
console.debug(JSON.stringify(headers))
const fetch = require('node-fetch')
const response = await fetch(apiUrl, {
method: 'POST',
headers: headers,

View File

@ -1,11 +1,10 @@
import { DownloadRoute } from '../../../api'
import { join } from 'path'
import { userSpacePath, DownloadManager, HttpServer } from '../../index'
import { userSpacePath } from '../../extension/manager'
import { DownloadManager } from '../../download'
import { HttpServer } from '../HttpServer'
import { createWriteStream } from 'fs'
const request = require('request')
const progress = require('request-progress')
export const downloadRouter = async (app: HttpServer) => {
app.post(`/${DownloadRoute.downloadFile}`, async (req, res) => {
const body = JSON.parse(req.body as any)
@ -19,6 +18,9 @@ export const downloadRouter = async (app: HttpServer) => {
const localPath = normalizedArgs[1]
const fileName = localPath.split('/').pop() ?? ''
const request = require('request')
const progress = require('request-progress')
const rq = request(normalizedArgs[0])
progress(rq, {})
.on('progress', function (state: any) {

View File

@ -1,12 +1,10 @@
import { join, extname } from 'path'
import { ExtensionRoute } from '../../../api'
import {
userSpacePath,
ModuleManager,
getActiveExtensions,
installExtensions,
HttpServer,
} from '../../index'
import { ExtensionRoute } from '../../../api/index'
import { userSpacePath } from '../../extension/manager'
import { ModuleManager } from '../../module'
import { getActiveExtensions, installExtensions } from '../../extension/store'
import { HttpServer } from '../HttpServer'
import { readdirSync } from 'fs'
export const extensionRouter = async (app: HttpServer) => {

View File

@ -1,6 +1,7 @@
import { FileSystemRoute } from '../../../api'
import { join } from 'path'
import { HttpServer, userSpacePath } from '../../index'
import { HttpServer } from '../HttpServer'
import { userSpacePath } from '../../extension/manager'
export const fsRouter = async (app: HttpServer) => {
const moduleName = 'fs'

View File

@ -1,5 +1,9 @@
import { HttpServer } from '../HttpServer'
import { commonRouter, threadRouter, fsRouter, extensionRouter, downloadRouter } from './index'
import { commonRouter } from './common'
import { threadRouter } from './thread'
import { fsRouter } from './fs'
import { extensionRouter } from './extension'
import { downloadRouter } from './download'
export const v1Router = async (app: HttpServer) => {
// MARK: External Routes

View File

@ -1,4 +1,3 @@
import { Request } from "request";
/**
* Manages file downloads and network requests.
@ -18,7 +17,7 @@ export class DownloadManager {
* @param {string} fileName - The name of the file.
* @param {Request | undefined} request - The network request to set, or undefined to clear the request.
*/
setRequest(fileName: string, request: Request | undefined) {
setRequest(fileName: string, request: any | undefined) {
this.networkRequests[fileName] = request;
}
}

View File

@ -1,7 +1,5 @@
import { rmdirSync } from 'fs'
import { resolve, join } from 'path'
import { manifest, extract } from 'pacote'
import * as Arborist from '@npmcli/arborist'
import { ExtensionManager } from './manager'
/**
@ -41,6 +39,7 @@ export default class Extension {
* @param {Object} [options] Options provided to pacote when fetching the manifest.
*/
constructor(origin?: string, options = {}) {
const Arborist = require('@npmcli/arborist')
const defaultOpts = {
version: false,
fullMetadata: false,
@ -74,13 +73,15 @@ export default class Extension {
async getManifest() {
// Get the package's manifest (package.json object)
try {
const mnf = await manifest(this.specifier, this.installOptions)
// set the Package properties based on its manifest
this.name = mnf.name
this.version = mnf.version
this.main = mnf.main
this.description = mnf.description
await import('pacote').then((pacote) => {
return pacote.manifest(this.specifier, this.installOptions).then((mnf) => {
// set the Package properties based on its manifest
this.name = mnf.name
this.version = mnf.version
this.main = mnf.main
this.description = mnf.description
})
})
} catch (error) {
throw new Error(`Package ${this.origin} does not contain a valid manifest: ${error}`)
}
@ -99,7 +100,8 @@ export default class Extension {
await this.getManifest()
// Install the package in a child folder of the given folder
await extract(
const pacote = await import('pacote')
await pacote.extract(
this.specifier,
join(ExtensionManager.instance.extensionsPath ?? '', this.name ?? ''),
this.installOptions,
@ -164,10 +166,13 @@ export default class Extension {
* @returns the latest available version if a new version is available or false if not.
*/
async isUpdateAvailable() {
if (this.origin) {
const mnf = await manifest(this.origin)
return mnf.version !== this.version ? mnf.version : false
}
return import('pacote').then((pacote) => {
if (this.origin) {
return pacote.manifest(this.origin).then((mnf) => {
return mnf.version !== this.version ? mnf.version : false
})
}
})
}
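The change above swaps the static `pacote` imports for dynamic `import('pacote')` calls, so the dependency is only loaded when a manifest is actually requested. A stripped-down sketch of the same pattern (field names mirror the code above; everything else is illustrative):

```typescript
// Lazily load pacote and read the fields the Extension class cares about.
async function fetchManifest(specifier: string, installOptions: object = {}) {
  const pacote = await import('pacote')
  const mnf = await pacote.manifest(specifier, installOptions)
  return {
    name: mnf.name,
    version: mnf.version,
    main: mnf.main,
    description: mnf.description,
  }
}
```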
/**

View File

@ -1,7 +1,6 @@
import { join, resolve } from "path";
import { existsSync, mkdirSync, writeFileSync } from "fs";
import { init } from "./index";
import { homedir } from "os"
/**
* Manages extension installation and migration.
@ -20,22 +19,6 @@ export class ExtensionManager {
}
}
/**
* Sets up the extensions by initializing the `extensions` module with the `confirmInstall` and `extensionsPath` options.
* The `confirmInstall` function always returns `true` to allow extension installation.
* The `extensionsPath` option specifies the path to install extensions to.
*/
setupExtensions() {
init({
// Function to check from the main process that user wants to install a extension
confirmInstall: async (_extensions: string[]) => {
return true;
},
// Path to install extension to
extensionsPath: join(userSpacePath, "extensions"),
});
}
setExtensionsPath(extPath: string) {
// Create folder if it does not exist
let extDir;

View File

@ -3,16 +3,24 @@ import util from 'util'
import path from 'path'
import os from 'os'
const appDir = path.join(os.homedir(), 'jan')
export const logDir = path.join(os.homedir(), 'jan', 'logs')
export const logPath = path.join(appDir, 'app.log')
export const log = function (message: string, fileName: string = 'app.log') {
if (!fs.existsSync(logDir)) {
fs.mkdirSync(logDir, { recursive: true })
}
if (!message.startsWith('[')) {
message = `[APP]::${message}`
}
export const log = function (d: any) {
if (fs.existsSync(appDir)) {
var log_file = fs.createWriteStream(logPath, {
message = `${new Date().toISOString()} ${message}`
if (fs.existsSync(logDir)) {
var log_file = fs.createWriteStream(path.join(logDir, fileName), {
flags: 'a',
})
log_file.write(util.format(d) + '\n')
log_file.write(util.format(message) + '\n')
log_file.close()
console.debug(message)
}
}

View File

@ -61,6 +61,8 @@ export enum MessageStatus {
Pending = 'pending',
/** Message loaded with error. **/
Error = 'error',
/** Message is cancelled streaming */
Stopped = "stopped"
}
/**

View File

@ -20,7 +20,7 @@ Jan's long-term technical endeavor is to build a cognitive framework for future
## Quicklinks
- Core product vision for [Jan Framework](http://localhost:3001/docs/)
- Core product vision for [Jan Framework](../docs)
- R&D and model training efforts [Discord](https://discord.gg/9NfUSyzp3y) (via our small data-center which is `free & open to all researchers who lack GPUs`!)
- Current implementations of Jan Framework: [Jan Desktop](https://jan.ai/), [Nitro](https://nitro.jan.ai/)

View File

@ -21,12 +21,36 @@ keywords:
Jan is available for download via our homepage, [https://jan.ai](https://jan.ai/).
For Linux, the download should be available as a `.deb` file in the following format.
For Linux, the download should be available as a `.AppImage` file or a `.deb` file in the following format.
```bash
# AppImage
jan-linux-x86_64-{version}.AppImage
# Debian Linux distribution
jan-linux-amd64-{version}.deb
```
To install Jan on Linux, use your package manager or `dpkg`. For Debian/Ubuntu-based distributions, you can install Jan using one of the following commands:
```bash
# Install Jan using dpkg
sudo dpkg -i jan-linux-amd64-{version}.deb
# Install Jan using apt-get
sudo apt-get install ./jan-linux-amd64-{version}.deb
# where jan-linux-amd64-{version}.deb is the path to the Jan package
```
For other Linux distributions, you can run the AppImage file without installation. To do so, make the AppImage file executable and then launch it, either through your file manager's properties dialog or with the following commands:
```bash
# Install Jan using AppImage
chmod +x jan-linux-x86_64-{version}.AppImage
./jan-linux-x86_64-{version}.AppImage
# where jan-linux-x86_64-{version}.AppImage is the path to the Jan package
```
The typical installation process takes around a minute.
### GitHub Releases
@ -38,6 +62,10 @@ Within the Releases' assets, you will find the following files for Linux:
```bash
# Debian Linux distribution
jan-linux-amd64-{version}.deb
# AppImage
jan-linux-x86_64-{version}.AppImage
```
```
## Uninstall Jan
@ -49,4 +77,6 @@ sudo apt-get remove jan
# where jan is the name of the Jan package
```
For other Linux distributions, if you installed Jan via the `.AppImage` file, you can uninstall Jan by deleting the `.AppImage` file.
If you wish to completely remove all user data associated with Jan after uninstallation, delete the user data folder located at `~/jan`, as shown below. This returns your system to its state prior to installing Jan, and can also be used to reset all settings if you are experiencing issues with Jan.
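For example (a minimal sketch; this is irreversible and removes your threads, downloaded models, and settings):

```bash
# Remove Jan's user data folder
rm -rf ~/jan
```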

View File

@ -1,7 +1,7 @@
---
title: Integrate Continue with Jan and VSCode
title: Integrate Continue with Jan and VS Code
slug: /guides/integrations/continue
description: Guide to integrate Continue with Jan and VSCode
description: Guide to integrate Continue with Jan and VS Code
keywords:
[
Jan AI,
@ -25,13 +25,13 @@ import TabItem from "@theme/TabItem";
[Continue](https://continue.dev/docs/intro) is an open-source autopilot for VS Code and JetBrains—the easiest way to code with any LLM.
In this guide, we will show you how to integrate Continue with Jan and VSCode, enhancing your coding experience with the power of the local AI language model.
In this guide, we will show you how to integrate Continue with Jan and VS Code, enhancing your coding experience with the power of the local AI language model.
## Steps to Integrate Continue with Jan and VSCode
## Steps to Integrate Continue with Jan and VS Code
### 1. Install Continue for VSCode
### 1. Install Continue for VS Code
You need to install Continue for VSCode. You can follow this [guide to install Continue for VSCode](https://continue.dev/docs/quickstart).
To get started with Continue in VS Code, please follow this [guide to install Continue for VS Code](https://continue.dev/docs/quickstart).
### 2. Enable Jan API Server
@ -95,7 +95,7 @@ Navigate to `Settings` > `Models`. Activate the model that you want to use in Ja
![Active Models](assets/01-start-model.png)
### 5. Try Out the Integration of Jan and Continue in Vscode
### 5. Try Out the Integration of Jan and Continue in VS Code
#### Asking questions about the code

View File

@ -1,7 +1,7 @@
---
title: Failed to Fetch
slug: /troubleshooting/failed-to-fetch
description: Troubleshooting "Failed to Fetch"
title: Something's amiss
slug: /troubleshooting/somethings-amiss
description: Troubleshooting "Something's amiss"
keywords: [
Jan AI,
Jan,
@ -16,7 +16,7 @@ keywords: [
]
---
You may receive an "Error occured: Failed to Fetch" response when you first start chatting with a selected model.
You may receive a "Something's amiss" response when you first start chatting with a selected model.
This may occur due to several reasons. Please follow these steps to resolve it:

View File

@ -1,5 +1,7 @@
---
title: Jan is Not Using GPU
slug: /troubleshooting/gpu-not-used
description: Jan is not using GPU
keywords: [
Jan AI,
Jan,
@ -60,8 +62,8 @@ If you see a result similar to the following, you have successfully installed CU
```bash
nvcc: NVIDIA (R) Cuda compiler driver
Cuda compilation tools, release 11.4, V11.4.100
Build cuda_11.4.r11.4/compiler.30033411_0
Cuda compilation tools, release 11.4, V11.7.100
Build cuda_11.7.r11.7/compiler.30033411_0
```
### Specific Requirements for Linux

View File

@ -0,0 +1,18 @@
---
title: Overview
slug: /handbook
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
]
---
Welcome to the Jan Handbook! We're really excited to bring you on board.

View File

@ -0,0 +1,17 @@
---
title: Why we exist
slug: /handbook/meet-jan/why-we-exist
description: Why we exist
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Vision and Mission
slug: /handbook/meet-jan/vision-and-mission
description: Vision and mission of Jan
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Meet Jan
slug: /handbook/meet-jan
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: Overview of Jan Framework and Its Applications
slug: /handbook/products-and-innovations/overview-of-jan-framework-and-its-applications
description: Overview of Jan Framework and Its Applications
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Philosophy Behind Product Development
slug: /handbook/products-and-innovations/philosophy-behind-product-development
description: Philosophy Behind Product Development
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Roadmap - Present and Future Directions
slug: /handbook/products-and-innovations/roadmap-present-and-future-directions
description: Roadmap - Present and Future Directions
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Our Products and Innovations
slug: /handbook/products-and-innovations
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: How We Hire
slug: /handbook/core-contributors/how-we-hire
description: How We Hire
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Embracing Pod Structure
slug: /handbook/core-contributors/embracing-pod-structure
description: Embracing Pod Structure
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: The Art of Conflict
slug: /handbook/core-contributors/the-art-of-conflict
description: The Art of Conflict
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: OpSec
slug: /handbook/core-contributors/opsec
description: OpSec
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: See a Problem, Own a Problem
slug: /handbook/core-contributors/see-a-problem-own-a-problem
description: See a Problem, Own a Problem - How we function without management
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Our Contributors
slug: /handbook/core-contributors
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: No PMs Allowed
slug: /handbook/what-we-do/no-pms-allowed
description: No PMs Allowed
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Our Support Methodology - Open Source, Collaborative, and Self-serve
slug: /handbook/what-we-do/our-support-methodology
description: Our Support Methodology - Open Source, Collaborative, and Self-serve
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Our Approach to Design
slug: /handbook/what-we-do/our-approach-to-design
description: Our Approach to Design
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Shipping Now, Shipping Later
slug: /handbook/what-we-do/shipping-now-shipping-later
description: Shipping Now, Shipping Later
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Trial by Fire
slug: /handbook/what-we-do/trial-by-fire
description: Trial by Fire
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: What We Do
slug: /handbook/what-we-do
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: On the Tools - What We Use and Why
slug: /handbook/engineering-exellence/one-the-tools-what-we-use-and-why
description: On the Tools - What We Use and Why
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Jan Choices - Why FOSS and Why C++
slug: /handbook/engineering-exellence/jan-choices
description: Jan Choices - Why FOSS and Why C++
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Engineering Processes - From Plan to Launch
slug: /handbook/engineering-exellence/engineering-processes
description: Engineering Processes - From Plan to Launch
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Data Management and Deployment Strategies
slug: /handbook/engineering-exellence/data-management-and-deployment-strategies
description: Data Management and Deployment Strategies
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Engineering Excellence
slug: /handbook/engineering-exellence
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: How Do We Know What to Work On?
slug: /handbook/product-and-community/how-dowe-know-what-to-work-on
description: How Do We Know What to Work On?
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Our OKRs
slug: /handbook/product-and-community/our-okrs
description: Our OKRs
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Approaches to Beta Testing and User Engagement
slug: /handbook/product-and-community/approaches-to-beta-testing-and-user-engagement
description: Approaches to Beta Testing and User Engagement
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Product and Community
slug: /handbook/product-and-community
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: Jan's Pivot and Journey So Far
slug: /handbook/from-spaghetti-flinging-to-strategy/jan-pivot-and-journey-so-far
description: Jan's Pivot and Journey So Far
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: ESOP Philosophy
slug: /handbook/from-spaghetti-flinging-to-strategy/esop-philosophy
description: ESOP Philosophy
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: How We GTM
slug: /handbook/from-spaghetti-flinging-to-strategy/how-we-gtm
description: How We GTM
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: From Spaghetti Flinging to Strategy
slug: /handbook/from-spaghetti-flinging-to-strategy
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,17 @@
---
title: How to Get Involved and FAQ
slug: /handbook/contributing-to-jan/how-to-get-involved-and-faq
description: How to Get Involved and FAQ
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,17 @@
---
title: Feedback Channels/ Where to Get Help/ Use Your Voice
slug: /handbook/contributing-to-jan/feedback-channels
description: Feedback Channels/ Where to Get Help/ Use Your Voice
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---

View File

@ -0,0 +1,21 @@
---
title: Contributing to Jan
slug: /handbook/contributing-to-jan
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
handbook,
]
---
import DocCardList from "@theme/DocCardList";
<DocCardList className="DocCardList--no-description" />

View File

@ -0,0 +1,146 @@
---
title: Engineering
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[
Jan AI,
Jan,
ChatGPT alternative,
local AI,
private AI,
conversational AI,
no-subscription fee,
large language model,
]
---
## Connecting to Rigs
### Pritunl Setup
1. **Install Pritunl**: [Download here](https://client.pritunl.com/#install)
2. **Import .ovpn file**
3. **VSCode**: Install the "Remote-SSH" extension for connection
### Llama.cpp Setup
1. **Clone Repo**: `git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp`
2. **Build**:
```bash
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON -DLLAMA_CUDA_F16=ON -DLLAMA_CUDA_MMV_Y=8
cmake --build . --config Release
```
3. **Download Model:**
```bash
cd ../models && wget https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q8_0.gguf
```
4. **Run:**
```bash
cd ../build/bin/
./main -m ./models/llama-2-7b.Q8_0.gguf -p "Writing a thesis proposal can be done in 10 simple steps:\nStep 1:" -n 2048 -e -ngl 100 -t 48
```
The llama.cpp CLI arguments are listed below:
| Short Option | Long Option | Param Value | Description |
| --------------- | --------------------- | ----------- | ---------------------------------------------------------------- |
| `-h` | `--help` | | Show this help message and exit |
| `-i` | `--interactive` | | Run in interactive mode |
| | `--interactive-first` | | Run in interactive mode and wait for input right away |
| | `-ins`, `--instruct` | | Run in instruction mode (use with Alpaca models) |
| `-r` | `--reverse-prompt` | `PROMPT` | Run in interactive mode and poll user input upon seeing `PROMPT` |
| | `--color` | | Colorise output to distinguish prompt and user input from |
| **Generations** |
| `-s` | `--seed` | `SEED` | Seed for random number generator |
| `-t` | `--threads` | `N` | Number of threads to use during computation |
| `-p` | `--prompt` | `PROMPT` | Prompt to start generation with |
| | `--random-prompt` | | Start with a randomized prompt |
| | `--in-prefix` | `STRING` | String to prefix user inputs with |
| `-f` | `--file` | `FNAME` | Prompt file to start generation |
| `-n` | `--n_predict` | `N` | Number of tokens to predict |
| | `--top_k` | `N` | Top-k sampling |
| | `--top_p` | `N` | Top-p sampling |
| | `--repeat_last_n` | `N` | Last n tokens to consider for penalize |
| | `--repeat_penalty` | `N` | Penalize repeat sequence of tokens |
| `-c` | `--ctx_size` | `N` | Size of the prompt context |
| | `--ignore-eos` | | Ignore end of stream token and continue generating |
| | `--memory_f32` | | Use `f32` instead of `f16` for memory key+value |
| | `--temp` | `N` | Temperature |
| | `--n_parts` | `N` | Number of model parts |
| `-b` | `--batch_size` | `N` | Batch size for prompt processing |
| | `--perplexity` | | Compute perplexity over the prompt |
| | `--keep` | | Number of tokens to keep from the initial prompt |
| | `--mlock` | | Force system to keep model in RAM |
| | `--mtest` | | Determine the maximum memory usage |
| | `--verbose-prompt` | | Print prompt before generation |
| `-m` | `--model` | `FNAME` | Model path |
### TensorRT-LLM Setup
#### **Docker and TensorRT-LLM build**
> Note: You should run with admin permissions to make sure everything works correctly
1. **Docker Image:**
```bash
sudo make -C docker build
```
2. **Run Container:**
```bash
sudo make -C docker run
```
Once in the container, TensorRT-LLM can be built from the source using the following:
3. **Build:**
```bash
# To build the TensorRT-LLM code.
python3 ./scripts/build_wheel.py --trt_root /usr/local/tensorrt
# Deploy TensorRT-LLM in your environment.
pip install ./build/tensorrt_llm*.whl
```
> Note: You can specify the GPU architecture (e.g. ADA for the 4090) to reduce compilation time
> The list of supported architectures can be found in the `CMakeLists.txt` file.
```bash
python3 ./scripts/build_wheel.py --cuda_architectures "89-real;90-real"
```
#### Running TensorRT-LLM
1. **Requirements:**
```bash
pip install -r examples/bloom/requirements.txt && git lfs install
```
2. **Download Weights:**
```bash
cd examples/llama && rm -rf ./llama/7B && mkdir -p ./llama/7B && git clone https://huggingface.co/NousResearch/Llama-2-7b-hf ./llama/7B
```
3. **Build Engine:**
```bash
python build.py --model_dir ./llama/7B/ --dtype float16 --remove_input_padding --use_gpt_attention_plugin float16 --enable_context_fmha --use_gemm_plugin float16 --use_weight_only --output_dir ./llama/7B/trt_engines/weight_only/1-gpu/
```
4. Run Inference:
```bash
python3 run.py --max_output_len=2048 --tokenizer_dir ./llama/7B/ --engine_dir=./llama/7B/trt_engines/weight_only/1-gpu/ --input_text "Writing a thesis proposal can be done in 10 simple steps:\nStep 1:"
```
For the TensorRT-LLM CLI arguments, see `run.py`.

View File

@ -1,6 +1,5 @@
---
title: Onboarding
slug: /handbook
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords:
[

View File

@ -1,122 +0,0 @@
---
title: Engineering
description: Jan is a ChatGPT-alternative that runs on your own computer, with a local API server.
keywords: [Jan AI, Jan, ChatGPT alternative, local AI, private AI, conversational AI, no-subscription fee, large language model ]
---
## Connecting to Rigs
### Pritunl Setup
1. **Install Pritunl**: [Download here](https://client.pritunl.com/#install)
2. **Import .ovpn file**
3. **VSCode**: Install the "Remote-SSH" extension for connection
### Llama.cpp Setup
1. **Clone Repo**: `git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp`
2. **Build**:
```bash
mkdir build && cd build
cmake .. -DLLAMA_CUBLAS=ON -DLLAMA_CUDA_F16=ON -DLLAMA_CUDA_MMV_Y=8
cmake --build . --config Release
```
3. **Download Model:**
```bash
cd ../models && wget https://huggingface.co/TheBloke/Llama-2-7B-GGUF/resolve/main/llama-2-7b.Q8_0.gguf
```
4. **Run:**
```bash
cd ../build/bin/
./main -m ./models/llama-2-7b.Q8_0.gguf -p "Writing a thesis proposal can be done in 10 simple steps:\nStep 1:" -n 2048 -e -ngl 100 -t 48
```
For the llama.cpp CLI arguments you can see here:
| Short Option | Long Option | Param Value | Description |
|--------------|-----------------------|-------------|-------------|
| `-h` | `--help` | | Show this help message and exit |
| `-i` | `--interactive` | | Run in interactive mode |
| | `--interactive-first` | | Run in interactive mode and wait for input right away |
| | `-ins`, `--instruct` | | Run in instruction mode (use with Alpaca models) |
| `-r` | `--reverse-prompt` | `PROMPT` | Run in interactive mode and poll user input upon seeing `PROMPT` |
| | `--color` | | Colorise output to distinguish prompt and user input from |
|**Generations**|
| `-s` | `--seed` | `SEED` | Seed for random number generator |
| `-t` | `--threads` | `N` | Number of threads to use during computation |
| `-p` | `--prompt` | `PROMPT` | Prompt to start generation with |
| | `--random-prompt` | | Start with a randomized prompt |
| | `--in-prefix` | `STRING` | String to prefix user inputs with |
| `-f` | `--file` | `FNAME` | Prompt file to start generation |
| `-n` | `--n_predict` | `N` | Number of tokens to predict |
| | `--top_k` | `N` | Top-k sampling |
| | `--top_p` | `N` | Top-p sampling |
| | `--repeat_last_n` | `N` | Last n tokens to consider for penalize |
| | `--repeat_penalty` | `N` | Penalize repeat sequence of tokens |
| `-c` | `--ctx_size` | `N` | Size of the prompt context |
| | `--ignore-eos` | | Ignore end of stream token and continue generating |
| | `--memory_f32` | | Use `f32` instead of `f16` for memory key+value |
| | `--temp` | `N` | Temperature |
| | `--n_parts` | `N` | Number of model parts |
| `-b` | `--batch_size` | `N` | Batch size for prompt processing |
| | `--perplexity` | | Compute perplexity over the prompt |
| | `--keep` | | Number of tokens to keep from the initial prompt |
| | `--mlock` | | Force system to keep model in RAM |
| | `--mtest` | | Determine the maximum memory usage |
| | `--verbose-prompt` | | Print prompt before generation |
| `-m` | `--model` | `FNAME` | Model path |
### TensorRT-LLM Setup
#### **Docker and TensorRT-LLM build**
> Note: You should run with admin permission to make sure everything works fine
1. **Docker Image:**
```bash
sudo make -C docker build
```
2. **Run Container:**
```bash
sudo make -C docker run
```
Once in the container, TensorRT-LLM can be built from the source using the following:
3. **Build:**
```bash
# To build the TensorRT-LLM code.
python3 ./scripts/build_wheel.py --trt_root /usr/local/tensorrt
# Deploy TensorRT-LLM in your environment.
pip install ./build/tensorrt_llm*.whl
```
> Note: You can specify the GPU architecture (e.g. for 4090 is ADA) for compilation time reduction
> The list of supported architectures can be found in the `CMakeLists.txt` file.
```bash
python3 ./scripts/build_wheel.py --cuda_architectures "89-real;90-real"
```
#### Running TensorRT-LLM
1. **Requirements:**
```bash
pip install -r examples/bloom/requirements.txt && git lfs install
```
2. **Download Weights:**
```bash
cd examples/llama && rm -rf ./llama/7B && mkdir -p ./llama/7B && git clone https://huggingface.co/NousResearch/Llama-2-7b-hf ./llama/7B
```
3. **Build Engine:**
```bash
python build.py --model_dir ./llama/7B/ --dtype float16 --remove_input_padding --use_gpt_attention_plugin float16 --enable_context_fmha --use_gemm_plugin float16 --use_weight_only --output_dir ./llama/7B/trt_engines/weight_only/1-gpu/
```
4. Run Inference:
```bash
python3 run.py --max_output_len=2048 --tokenizer_dir ./llama/7B/ --engine_dir=./llama/7B/trt_engines/weight_only/1-gpu/ --input_text "Writing a thesis proposal can be done in 10 simple steps:\nStep 1:"
```
For the tensorRT-LLM CLI arguments, you can see in the `run.py`.

View File

@ -61,6 +61,21 @@ const config = {
enableInDevelopment: false, // optional
},
],
[
"@docusaurus/plugin-client-redirects",
{
redirects: [
{
from: "/troubleshooting/failed-to-fetch",
to: "/troubleshooting/somethings-amiss",
},
{
from: "/guides/troubleshooting/gpu-not-used/",
to: "/troubleshooting/gpu-not-used",
},
],
},
],
],
// The classic preset will relay each option entry to the respective sub plugin/theme.
@ -136,6 +151,14 @@ const config = {
autoCollapseCategories: false,
},
},
// Algolia DocSearch
algolia: {
appId: process.env.ALGOLIA_APP_ID || "XXX",
apiKey: process.env.ALGOLIA_API_KEY || "XXX",
indexName: "jan",
insights: true,
debug: false,
},
// SEO Docusaurus
metadata: [
{

View File

@ -14,7 +14,10 @@
"write-heading-ids": "docusaurus write-heading-ids"
},
"dependencies": {
"@docsearch/js": "3",
"@docsearch/react": "3",
"@docusaurus/core": "^3.0.0",
"@docusaurus/plugin-client-redirects": "^3.0.0",
"@docusaurus/plugin-content-docs": "^3.0.0",
"@docusaurus/preset-classic": "^3.0.0",
"@docusaurus/theme-live-codeblock": "^3.0.0",

View File

@ -59,15 +59,9 @@ const sidebars = {
id: "about/about",
},
{
type: "category",
type: "doc",
label: "Company Handbook",
collapsible: true,
collapsed: false,
items: [
"handbook/onboarding",
"handbook/product",
"handbook/engineering",
],
id: "handbook/overview",
},
{
type: "link",
@ -75,6 +69,13 @@ const sidebars = {
href: "https://janai.bamboohr.com/careers",
},
],
handbookSidebar: [
{
type: "autogenerated",
dirName: "handbook",
},
],
};
module.exports = sidebars;

View File

@ -22,7 +22,12 @@ const systemsTemplate = [
fileFormat: "{appname}-win-x64-{tag}.exe",
},
{
name: "Linux",
name: "Linux (AppImage)",
logo: FaLinux,
fileFormat: "{appname}-linux-x86_64-{tag}.AppImage",
},
{
name: "Linux (deb)",
logo: FaLinux,
fileFormat: "{appname}-linux-amd64-{tag}.deb",
},
@ -44,7 +49,7 @@ export default function DownloadApp() {
const extractAppName = (fileName) => {
// Extract appname using a regex that matches the provided file formats
const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|amd64)-.*$/;
const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|amd64|x86_64)-.*$/;
const match = fileName.match(regex);
return match ? match[1] : null;
};
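For illustration, the widened pattern now recognizes the `x86_64` AppImage name as well (file names below are examples):

```js
const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|amd64|x86_64)-.*$/;

console.log("jan-linux-x86_64-0.4.3.AppImage".match(regex)?.[1]); // "jan"
console.log("jan-linux-amd64-0.4.3.deb".match(regex)?.[1]);       // "jan"
console.log("jan-mac-arm64-0.4.3.dmg".match(regex)?.[1]);         // "jan"
```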

View File

@ -22,10 +22,15 @@ const systemsTemplate = [
fileFormat: "{appname}-win-x64-{tag}.exe",
},
{
name: "Download for Linux",
name: "Download for Linux (AppImage)",
logo: FaLinux,
fileFormat: "{appname}-linux-x86_64-{tag}.AppImage",
},
{
name: "Download for Linux (deb)",
logo: FaLinux,
fileFormat: "{appname}-linux-amd64-{tag}.deb",
},
}
];
function classNames(...classes) {
@ -49,7 +54,7 @@ export default function Dropdown() {
const extractAppName = (fileName) => {
// Extract appname using a regex that matches the provided file formats
const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|amd64)-.*$/;
const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|x86_64|amd64)-.*$/;
const match = fileName.match(regex);
return match ? match[1] : null;
};

File diff suppressed because it is too large

View File

@ -1,9 +1,9 @@
import { app, ipcMain, shell, nativeTheme } from 'electron'
import { app, ipcMain, shell } from 'electron'
import { join, basename } from 'path'
import { WindowManager } from './../managers/window'
import { getResourcePath, userSpacePath } from './../utils/path'
import { AppRoute } from '@janhq/core'
import { ExtensionManager, ModuleManager } from '@janhq/core/node'
import { ModuleManager, init, log } from '@janhq/core/node'
import { startServer, stopServer } from '@janhq/server'
export function handleAppIPCs() {
@ -59,7 +59,7 @@ export function handleAppIPCs() {
app.isPackaged ? join(getResourcePath(), 'docs', 'openapi') : undefined
)
)
/**
* Stop Jan API Server.
*/
@ -82,8 +82,22 @@ export function handleAppIPCs() {
require.resolve(join(userSpacePath, 'extensions', modulePath))
]
}
ExtensionManager.instance.setupExtensions()
init({
// Function to check from the main process that the user wants to install an extension
confirmInstall: async (_extensions: string[]) => {
return true
},
// Path to install extension to
extensionsPath: join(userSpacePath, 'extensions'),
})
WindowManager.instance.currentWindow?.reload()
}
})
/**
* Log message to log file.
*/
ipcMain.handle(AppRoute.log, async (_event, message, fileName) =>
log(message, fileName)
)
}

View File

@ -6,7 +6,7 @@ import { FileManagerRoute } from '@janhq/core'
import { userSpacePath, getResourcePath } from './../utils/path'
import fs from 'fs'
import { join } from 'path'
import { FileStat } from '@janhq/core/.'
import { FileStat } from '@janhq/core'
/**
* Handles file system extensions operations.

View File

@ -6,7 +6,7 @@ import { createUserSpace } from './utils/path'
* Managers
**/
import { WindowManager } from './managers/window'
import { ExtensionManager, ModuleManager } from '@janhq/core/node'
import { log, ModuleManager } from '@janhq/core/node'
/**
* IPC Handlers
@ -17,14 +17,19 @@ import { handleFileMangerIPCs } from './handlers/fileManager'
import { handleAppIPCs } from './handlers/app'
import { handleAppUpdates } from './handlers/update'
import { handleFsIPCs } from './handlers/fs'
/**
* Utils
**/
import { migrateExtensions } from './utils/migration'
import { cleanUpAndQuit } from './utils/clean'
import { setupExtensions } from './utils/extension'
app
.whenReady()
.then(createUserSpace)
.then(migrateExtensions)
.then(ExtensionManager.instance.setupExtensions)
.then(setupExtensions)
.then(setupMenu)
.then(handleIPCs)
.then(handleAppUpdates)
@ -93,5 +98,5 @@ function handleIPCs() {
*/
process.on('uncaughtException', function (err) {
// TODO: Write error to log file in #1447
console.error(err)
log(`Error: ${err}`)
})

View File

@ -69,7 +69,7 @@
"build:publish": "run-script-os",
"build:publish:darwin": "tsc -p . && electron-builder -p onTagOrDraft -m --x64 --arm64",
"build:publish:win32": "tsc -p . && electron-builder -p onTagOrDraft -w",
"build:publish:linux": "tsc -p . && electron-builder -p onTagOrDraft -l deb"
"build:publish:linux": "tsc -p . && electron-builder -p onTagOrDraft -l deb -l AppImage"
},
"dependencies": {
"@alumna/reflect": "^1.1.3",

View File

@ -0,0 +1,13 @@
import { init, userSpacePath } from '@janhq/core/node'
import path from 'path'
export const setupExtensions = () => {
init({
// Function to check from the main process that the user wants to install an extension
confirmInstall: async (_extensions: string[]) => {
return true
},
// Path to install extension to
extensionsPath: path.join(userSpacePath, 'extensions'),
})
}

View File

@ -50,6 +50,7 @@
"bundleDependencies": [
"tcp-port-used",
"fetch-retry",
"os-utils"
"os-utils",
"@janhq/core"
]
}

View File

@ -1,5 +1,6 @@
declare const MODULE: string;
declare const INFERENCE_URL: string;
declare const TROUBLESHOOTING_URL: string;
/**
* The parameters for the initModel function.

View File

@ -0,0 +1,138 @@
/**
* Default GPU settings
**/
const DEFALT_SETTINGS = {
notify: true,
run_mode: "cpu",
nvidia_driver: {
exist: false,
version: "",
},
cuda: {
exist: false,
version: "",
},
gpus: [],
gpu_highest_vram: "",
};
/**
* Validate nvidia and cuda for linux and windows
*/
async function updateNvidiaDriverInfo(): Promise<void> {
exec(
"nvidia-smi --query-gpu=driver_version --format=csv,noheader",
(error, stdout) => {
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
if (!error) {
const firstLine = stdout.split("\n")[0].trim();
data["nvidia_driver"].exist = true;
data["nvidia_driver"].version = firstLine;
} else {
data["nvidia_driver"].exist = false;
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
Promise.resolve();
}
);
}
function checkFileExistenceInPaths(file: string, paths: string[]): boolean {
return paths.some((p) => existsSync(path.join(p, file)));
}
function updateCudaExistence() {
let filesCuda12: string[];
let filesCuda11: string[];
let paths: string[];
let cudaVersion: string = "";
if (process.platform === "win32") {
filesCuda12 = ["cublas64_12.dll", "cudart64_12.dll", "cublasLt64_12.dll"];
filesCuda11 = ["cublas64_11.dll", "cudart64_11.dll", "cublasLt64_11.dll"];
paths = process.env.PATH ? process.env.PATH.split(path.delimiter) : [];
} else {
filesCuda12 = ["libcudart.so.12", "libcublas.so.12", "libcublasLt.so.12"];
filesCuda11 = ["libcudart.so.11.0", "libcublas.so.11", "libcublasLt.so.11"];
paths = process.env.LD_LIBRARY_PATH
? process.env.LD_LIBRARY_PATH.split(path.delimiter)
: [];
paths.push("/usr/lib/x86_64-linux-gnu/");
}
let cudaExists = filesCuda12.every(
(file) => existsSync(file) || checkFileExistenceInPaths(file, paths)
);
if (!cudaExists) {
cudaExists = filesCuda11.every(
(file) => existsSync(file) || checkFileExistenceInPaths(file, paths)
);
if (cudaExists) {
cudaVersion = "11";
}
} else {
cudaVersion = "12";
}
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
data["cuda"].exist = cudaExists;
data["cuda"].version = cudaVersion;
if (cudaExists) {
data.run_mode = "gpu";
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
}
async function updateGpuInfo(): Promise<void> {
exec(
"nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits",
(error, stdout) => {
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
if (!error) {
// Parse GPU info and track the GPU with the highest VRAM
let highestVram = 0;
let highestVramId = "0";
let gpus = stdout
.trim()
.split("\n")
.map((line) => {
let [id, vram] = line.split(", ");
vram = vram.replace(/\r/g, "");
if (parseFloat(vram) > highestVram) {
highestVram = parseFloat(vram);
highestVramId = id;
}
return { id, vram };
});
data["gpus"] = gpus;
data["gpu_highest_vram"] = highestVramId;
} else {
data["gpus"] = [];
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
Promise.resolve();
}
);
}
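
For reference, a sketch of what these helpers leave behind in settings.json (the NVIDIA_INFO_FILE) on a machine where one GPU and CUDA 12 are detected. The field names follow DEFALT_SETTINGS above; the concrete values here are only illustrative:

// Illustrative contents of settings.json after detection (example values)
const exampleNvidiaInfo = {
  notify: true,
  run_mode: "gpu",                    // flipped from "cpu" because CUDA was found
  nvidia_driver: { exist: true, version: "535.154.05" },
  cuda: { exist: true, version: "12" },
  gpus: [{ id: "0", vram: "24576" }], // memory.total in MiB, straight from nvidia-smi
  gpu_highest_vram: "0",              // id of the GPU with the most VRAM
};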

View File

@ -20,8 +20,9 @@ import {
fs,
Model,
joinPath,
InferenceExtension,
log,
} from "@janhq/core";
import { InferenceExtension } from "@janhq/core";
import { requestInference } from "./helpers/sse";
import { ulid } from "ulid";
import { join } from "path";
@ -36,9 +37,14 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
private static readonly _settingsDir = "file://settings";
private static readonly _engineMetadataFileName = "nitro.json";
private static _currentModel: Model;
/**
* Check the health of the Nitro process every 5 seconds.
*/
private static readonly _intervalHealthCheck = 5 * 1000;
private static _engineSettings: EngineSettings = {
private _currentModel: Model;
private _engineSettings: EngineSettings = {
ctx_len: 2048,
ngl: 100,
cpu_threads: 1,
@ -48,6 +54,18 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
controller = new AbortController();
isCancelled = false;
/**
* The interval id for the health check. Used to stop the health check.
*/
private getNitroProcesHealthIntervalId: NodeJS.Timeout | undefined =
undefined;
/**
* Tracks the current state of the Nitro process.
*/
private nitroProcessInfo: any = undefined;
/**
* Returns the type of the extension.
* @returns {ExtensionType} The type of the extension.
@ -71,21 +89,13 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
this.writeDefaultEngineSettings();
// Events subscription
events.on(EventName.OnMessageSent, (data) =>
JanInferenceNitroExtension.handleMessageRequest(data, this)
);
events.on(EventName.OnMessageSent, (data) => this.onMessageRequest(data));
events.on(EventName.OnModelInit, (model: Model) => {
JanInferenceNitroExtension.handleModelInit(model);
});
events.on(EventName.OnModelInit, (model: Model) => this.onModelInit(model));
events.on(EventName.OnModelStop, (model: Model) => {
JanInferenceNitroExtension.handleModelStop(model);
});
events.on(EventName.OnModelStop, (model: Model) => this.onModelStop(model));
events.on(EventName.OnInferenceStopped, () => {
JanInferenceNitroExtension.handleInferenceStopped(this);
});
events.on(EventName.OnInferenceStopped, () => this.onInferenceStopped());
// Attempt to fetch nvidia info
await executeOnMain(MODULE, "updateNvidiaInfo", {});
@ -104,12 +114,12 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
);
if (await fs.existsSync(engineFile)) {
const engine = await fs.readFileSync(engineFile, "utf-8");
JanInferenceNitroExtension._engineSettings =
this._engineSettings =
typeof engine === "object" ? engine : JSON.parse(engine);
} else {
await fs.writeFileSync(
engineFile,
JSON.stringify(JanInferenceNitroExtension._engineSettings, null, 2)
JSON.stringify(this._engineSettings, null, 2)
);
}
} catch (err) {
@ -117,10 +127,9 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
}
}
private static async handleModelInit(model: Model) {
if (model.engine !== "nitro") {
return;
}
private async onModelInit(model: Model) {
if (model.engine !== "nitro") return;
const modelFullPath = await joinPath(["models", model.id]);
const nitroInitResult = await executeOnMain(MODULE, "initModel", {
@ -130,26 +139,49 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
if (nitroInitResult.error === null) {
events.emit(EventName.OnModelFail, model);
} else {
JanInferenceNitroExtension._currentModel = model;
events.emit(EventName.OnModelReady, model);
}
}
private static async handleModelStop(model: Model) {
if (model.engine !== "nitro") {
return;
} else {
await executeOnMain(MODULE, "stopModel");
events.emit(EventName.OnModelStopped, model);
}
this._currentModel = model;
events.emit(EventName.OnModelReady, model);
this.getNitroProcesHealthIntervalId = setInterval(
() => this.periodicallyGetNitroHealth(),
JanInferenceNitroExtension._intervalHealthCheck
);
}
private async onModelStop(model: Model) {
if (model.engine !== "nitro") return;
await executeOnMain(MODULE, "stopModel");
events.emit(EventName.OnModelStopped, {});
// stop the periodic health check
if (this.getNitroProcesHealthIntervalId) {
console.debug("Stop calling Nitro process health check");
clearInterval(this.getNitroProcesHealthIntervalId);
this.getNitroProcesHealthIntervalId = undefined;
}
}
private static async handleInferenceStopped(
instance: JanInferenceNitroExtension
) {
instance.isCancelled = true;
instance.controller?.abort();
/**
* Periodically check the Nitro process's health.
*/
private async periodicallyGetNitroHealth(): Promise<void> {
const health = await executeOnMain(MODULE, "getCurrentNitroProcessInfo");
const isRunning = this.nitroProcessInfo?.isRunning ?? false;
if (isRunning && health.isRunning === false) {
console.debug("Nitro process is stopped");
events.emit(EventName.OnModelStopped, {});
}
this.nitroProcessInfo = health;
}
private async onInferenceStopped() {
this.isCancelled = true;
this.controller?.abort();
}
/**
@ -171,10 +203,7 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
};
return new Promise(async (resolve, reject) => {
requestInference(
data.messages ?? [],
JanInferenceNitroExtension._currentModel
).subscribe({
requestInference(data.messages ?? [], this._currentModel).subscribe({
next: (_content) => {},
complete: async () => {
resolve(message);
@ -192,13 +221,9 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
* Pass instance as a reference.
* @param {MessageRequest} data - The data for the new message request.
*/
private static async handleMessageRequest(
data: MessageRequest,
instance: JanInferenceNitroExtension
) {
if (data.model.engine !== "nitro") {
return;
}
private async onMessageRequest(data: MessageRequest) {
if (data.model.engine !== "nitro") return;
const timestamp = Date.now();
const message: ThreadMessage = {
id: ulid(),
@ -213,13 +238,13 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
};
events.emit(EventName.OnMessageResponse, message);
instance.isCancelled = false;
instance.controller = new AbortController();
this.isCancelled = false;
this.controller = new AbortController();
requestInference(
data.messages ?? [],
{ ...JanInferenceNitroExtension._currentModel, ...data.model },
instance.controller
{ ...this._currentModel, ...data.model },
this.controller
).subscribe({
next: (content) => {
const messageContent: ThreadContent = {
@ -239,21 +264,14 @@ export default class JanInferenceNitroExtension implements InferenceExtension {
events.emit(EventName.OnMessageUpdate, message);
},
error: async (err) => {
if (instance.isCancelled || message.content.length) {
message.status = MessageStatus.Error;
if (this.isCancelled || message.content.length) {
message.status = MessageStatus.Stopped;
events.emit(EventName.OnMessageUpdate, message);
return;
}
const messageContent: ThreadContent = {
type: ContentType.Text,
text: {
value: "Error occurred: " + err.message,
annotations: [],
},
};
message.content = [messageContent];
message.status = MessageStatus.Ready;
message.status = MessageStatus.Error;
events.emit(EventName.OnMessageUpdate, message);
log(`[APP]::Error: ${err.message}`);
},
});
}
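
The refactor above turns the static handle* callbacks into instance methods and adds a 5-second health check while a model is loaded. A condensed sketch of that polling pattern, assuming executeOnMain and MODULE are in scope as in this file and that getCurrentNitroProcessInfo resolves to { isRunning: boolean } (see the node module later in this diff):

// Minimal sketch of the health check started in onModelInit and cleared in onModelStop
function startNitroHealthCheck(onStopped: () => void): NodeJS.Timeout {
  let wasRunning = false;
  return setInterval(async () => {
    const health = await executeOnMain(MODULE, "getCurrentNitroProcessInfo");
    // Only notify on the transition from running to stopped
    if (wasRunning && health.isRunning === false) onStopped();
    wasRunning = health.isRunning;
  }, 5 * 1000);
}

// Usage: const id = startNitroHealthCheck(() => events.emit(EventName.OnModelStopped, {}));
// and clearInterval(id) once the model is stopped.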

View File

@ -1,18 +1,17 @@
const fs = require("fs");
const fsPromises = fs.promises;
const path = require("path");
const { exec, spawn } = require("child_process");
const tcpPortUsed = require("tcp-port-used");
const fetchRetry = require("fetch-retry")(global.fetch);
const osUtils = require("os-utils");
const { readFileSync, writeFileSync, existsSync } = require("fs");
const { log } = require("@janhq/core/node");
// The PORT to use for the Nitro subprocess
const PORT = 3928;
const LOCAL_HOST = "127.0.0.1";
const NITRO_HTTP_SERVER_URL = `http://${LOCAL_HOST}:${PORT}`;
const NITRO_HTTP_LOAD_MODEL_URL = `${NITRO_HTTP_SERVER_URL}/inferences/llamacpp/loadmodel`;
const NITRO_HTTP_UNLOAD_MODEL_URL = `${NITRO_HTTP_SERVER_URL}/inferences/llamacpp/unloadModel`;
const NITRO_HTTP_VALIDATE_MODEL_URL = `${NITRO_HTTP_SERVER_URL}/inferences/llamacpp/modelstatus`;
const NITRO_HTTP_KILL_URL = `${NITRO_HTTP_SERVER_URL}/processmanager/destroy`;
const SUPPORTED_MODEL_FORMAT = ".gguf";
@ -23,26 +22,13 @@ const NVIDIA_INFO_FILE = path.join(
"settings.json"
);
const DEFALT_SETTINGS = {
notify: true,
run_mode: "cpu",
nvidia_driver: {
exist: false,
version: "",
},
cuda: {
exist: false,
version: "",
},
gpus: [],
gpu_highest_vram: "",
};
// The subprocess instance for Nitro
let subprocess = undefined;
let currentModelFile: string = undefined;
let currentSettings = undefined;
let nitroProcessInfo = undefined;
/**
* Stops a Nitro subprocess.
* @param wrapper - The model wrapper.
@ -52,137 +38,6 @@ function stopModel(): Promise<void> {
return killSubprocess();
}
/**
* Validate nvidia and cuda for linux and windows
*/
async function updateNvidiaDriverInfo(): Promise<void> {
exec(
"nvidia-smi --query-gpu=driver_version --format=csv,noheader",
(error, stdout) => {
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
if (!error) {
const firstLine = stdout.split("\n")[0].trim();
data["nvidia_driver"].exist = true;
data["nvidia_driver"].version = firstLine;
} else {
data["nvidia_driver"].exist = false;
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
Promise.resolve();
}
);
}
function checkFileExistenceInPaths(file: string, paths: string[]): boolean {
return paths.some((p) => existsSync(path.join(p, file)));
}
function updateCudaExistence() {
let filesCuda12: string[];
let filesCuda11: string[];
let paths: string[];
let cudaVersion: string = "";
if (process.platform === "win32") {
filesCuda12 = ["cublas64_12.dll", "cudart64_12.dll", "cublasLt64_12.dll"];
filesCuda11 = ["cublas64_11.dll", "cudart64_11.dll", "cublasLt64_11.dll"];
paths = process.env.PATH ? process.env.PATH.split(path.delimiter) : [];
} else {
filesCuda12 = ["libcudart.so.12", "libcublas.so.12", "libcublasLt.so.12"];
filesCuda11 = ["libcudart.so.11.0", "libcublas.so.11", "libcublasLt.so.11"];
paths = process.env.LD_LIBRARY_PATH
? process.env.LD_LIBRARY_PATH.split(path.delimiter)
: [];
paths.push("/usr/lib/x86_64-linux-gnu/");
}
let cudaExists = filesCuda12.every(
(file) => existsSync(file) || checkFileExistenceInPaths(file, paths)
);
if (!cudaExists) {
cudaExists = filesCuda11.every(
(file) => existsSync(file) || checkFileExistenceInPaths(file, paths)
);
if (cudaExists) {
cudaVersion = "11";
}
} else {
cudaVersion = "12";
}
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
data["cuda"].exist = cudaExists;
data["cuda"].version = cudaVersion;
if (cudaExists) {
data.run_mode = "gpu";
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
}
async function updateGpuInfo(): Promise<void> {
exec(
"nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits",
(error, stdout) => {
let data;
try {
data = JSON.parse(readFileSync(NVIDIA_INFO_FILE, "utf-8"));
} catch (error) {
data = DEFALT_SETTINGS;
}
if (!error) {
// Get GPU info and gpu has higher memory first
let highestVram = 0;
let highestVramId = "0";
let gpus = stdout
.trim()
.split("\n")
.map((line) => {
let [id, vram] = line.split(", ");
vram = vram.replace(/\r/g, "");
if (parseFloat(vram) > highestVram) {
highestVram = parseFloat(vram);
highestVramId = id;
}
return { id, vram };
});
data["gpus"] = gpus;
data["gpu_highest_vram"] = highestVramId;
} else {
data["gpus"] = [];
}
writeFileSync(NVIDIA_INFO_FILE, JSON.stringify(data, null, 2));
Promise.resolve();
}
);
}
async function updateNvidiaInfo() {
if (process.platform !== "darwin") {
await Promise.all([
updateNvidiaDriverInfo(),
updateCudaExistence(),
updateGpuInfo(),
]);
}
}
/**
* Initializes a Nitro subprocess to load a machine learning model.
* @param wrapper - The model wrapper.
@ -236,31 +91,28 @@ async function initModel(wrapper: any): Promise<ModelOperationResponse> {
async function loadModel(nitroResourceProbe: any | undefined) {
// Gather system information for CPU physical cores and memory
if (!nitroResourceProbe) nitroResourceProbe = await getResourcesInfo();
return (
killSubprocess()
.then(() => tcpPortUsed.waitUntilFree(PORT, 300, 5000))
// wait for 500ms to make sure the port is free for windows platform
.then(() => {
if (process.platform === "win32") {
return sleep(500);
} else {
return sleep(0);
}
})
.then(() => spawnNitroProcess(nitroResourceProbe))
.then(() => loadLLMModel(currentSettings))
.then(validateModelStatus)
.catch((err) => {
console.error("error: ", err);
// TODO: Broadcast error so app could display proper error message
return { error: err, currentModelFile };
})
);
}
// Add function sleep
function sleep(ms) {
return new Promise((resolve) => setTimeout(resolve, ms));
return killSubprocess()
.then(() => tcpPortUsed.waitUntilFree(PORT, 300, 5000))
.then(() => {
/**
* There is a problem with the Windows process manager:
* we should wait a while to make sure the port is free and the subprocess is killed.
* The tested threshold is 500ms.
**/
if (process.platform === "win32") {
return new Promise((resolve) => setTimeout(resolve, 500));
} else {
return Promise.resolve();
}
})
.then(() => spawnNitroProcess(nitroResourceProbe))
.then(() => loadLLMModel(currentSettings))
.then(validateModelStatus)
.catch((err) => {
log(`[NITRO]::Error: ${err}`);
// TODO: Broadcast error so app could display proper error message
return { error: err, currentModelFile };
});
}
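
loadModel above serializes the restart sequence as one promise chain: kill any previous subprocess, wait until the port is actually free, pause 500ms on Windows, then spawn, load, and validate. A small sketch of the port-wait step in isolation, using the same tcp-port-used call as the module (the helper name is hypothetical):

// Wait until the port is free, polling every 300ms and giving up after 5s,
// then add the extra 500ms grace period Windows needs to release it
async function waitForFreePort(port: number): Promise<void> {
  await tcpPortUsed.waitUntilFree(port, 300, 5000);
  if (process.platform === "win32") {
    await new Promise((resolve) => setTimeout(resolve, 500));
  }
}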
function promptTemplateConverter(promptTemplate) {
@ -310,6 +162,7 @@ function promptTemplateConverter(promptTemplate) {
* @returns A Promise that resolves when the model is loaded successfully, or rejects with an error message if the model is not found or fails to load.
*/
function loadLLMModel(settings): Promise<Response> {
log(`[NITRO]::Debug: Loading model with params ${settings}`);
return fetchRetry(NITRO_HTTP_LOAD_MODEL_URL, {
method: "POST",
headers: {
@ -318,6 +171,8 @@ function loadLLMModel(settings): Promise<Response> {
body: JSON.stringify(settings),
retries: 3,
retryDelay: 500,
}).catch((err) => {
log(`[NITRO]::Error: Load model failed with error ${err}`);
});
}
@ -358,7 +213,8 @@ async function validateModelStatus(): Promise<ModelOperationResponse> {
async function killSubprocess(): Promise<void> {
const controller = new AbortController();
setTimeout(() => controller.abort(), 5000);
console.debug("Start requesting to kill Nitro...");
log(`[NITRO]::Debug: Request to kill Nitro`);
return fetch(NITRO_HTTP_KILL_URL, {
method: "DELETE",
signal: controller.signal,
@ -369,20 +225,17 @@ async function killSubprocess(): Promise<void> {
})
.catch(() => {})
.then(() => tcpPortUsed.waitUntilFree(PORT, 300, 5000))
.then(() => console.debug("Nitro is killed"));
.then(() => log(`[NITRO]::Debug: Nitro process is terminated`));
}
/**
* Look for the Nitro binary and execute it
* Using child-process to spawn the process
* Should run exactly platform specified Nitro binary version
*/
/**
* Spawns a Nitro subprocess.
* @param nitroResourceProbe - The Nitro resource probe.
* @returns A promise that resolves when the Nitro subprocess is started.
*/
function spawnNitroProcess(nitroResourceProbe: any): Promise<any> {
console.debug("Starting Nitro subprocess...");
log(`[NITRO]::Debug: Spawning Nitro subprocess...`);
return new Promise(async (resolve, reject) => {
let binaryFolder = path.join(__dirname, "bin"); // Current directory by default
let cudaVisibleDevices = "";
@ -424,7 +277,7 @@ function spawnNitroProcess(nitroResourceProbe: any): Promise<any> {
const binaryPath = path.join(binaryFolder, binaryName);
// Execute the binary
subprocess = spawn(binaryPath, [1, LOCAL_HOST, PORT], {
subprocess = spawn(binaryPath, ["1", LOCAL_HOST, PORT.toString()], {
cwd: binaryFolder,
env: {
...process.env,
@ -434,16 +287,15 @@ function spawnNitroProcess(nitroResourceProbe: any): Promise<any> {
// Handle subprocess output
subprocess.stdout.on("data", (data) => {
console.debug(`stdout: ${data}`);
log(`[NITRO]::Debug: ${data}`);
});
subprocess.stderr.on("data", (data) => {
console.error("subprocess error:" + data.toString());
console.error(`stderr: ${data}`);
log(`[NITRO]::Error: ${data}`);
});
subprocess.on("close", (code) => {
console.debug(`child process exited with code ${code}`);
log(`[NITRO]::Debug: Nitro exited with code: ${code}`);
subprocess = null;
reject(`child process exited with code ${code}`);
});
@ -461,7 +313,7 @@ function spawnNitroProcess(nitroResourceProbe: any): Promise<any> {
function getResourcesInfo(): Promise<ResourcesInfo> {
return new Promise(async (resolve) => {
const cpu = await osUtils.cpuCount();
console.log("cpu: ", cpu);
log(`[NITRO]::CPU information - ${cpu}`);
const response: ResourcesInfo = {
numCpuPhysicalCore: cpu,
memAvailable: 0,
@ -470,6 +322,35 @@ function getResourcesInfo(): Promise<ResourcesInfo> {
});
}
/**
* This will retrieve GPU information and persist it to settings.json
* Called when the extension is loaded, to turn on GPU acceleration if supported
*/
async function updateNvidiaInfo() {
if (process.platform !== "darwin") {
await Promise.all([
updateNvidiaDriverInfo(),
updateCudaExistence(),
updateGpuInfo(),
]);
}
}
/**
* Retrieve info about the current Nitro process
*/
const getCurrentNitroProcessInfo = (): Promise<any> => {
nitroProcessInfo = {
isRunning: subprocess != null,
};
return nitroProcessInfo;
};
/**
* Every module should have a dispose function
* This will be called when the extension is unloaded and should clean up any resources
* Also called when app is closed
*/
function dispose() {
// clean other registered resources here
killSubprocess();
@ -481,4 +362,5 @@ module.exports = {
killSubprocess,
dispose,
updateNvidiaInfo,
getCurrentNitroProcessInfo,
};
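
killSubprocess above asks Nitro to shut itself down over HTTP, aborts that request after 5 seconds, and then waits for the port to free up. A minimal standalone sketch of the same timeout-guarded request pattern, with the URL and port assumed to match NITRO_HTTP_KILL_URL and PORT above:

// Request shutdown, but never hang more than 5 seconds on the HTTP call itself
async function requestNitroShutdown(): Promise<void> {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 5000);
  try {
    await fetch("http://127.0.0.1:3928/processmanager/destroy", {
      method: "DELETE",
      signal: controller.signal,
    });
  } catch {
    // Ignore errors: the process may already be gone
  } finally {
    clearTimeout(timeout);
  }
  // Then wait until the port is actually released
  await tcpPortUsed.waitUntilFree(3928, 300, 5000);
}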

View File

@ -22,6 +22,7 @@ module.exports = {
process.env.INFERENCE_URL ||
"http://127.0.0.1:3928/inferences/llamacpp/chat_completion"
),
TROUBLESHOOTING_URL: JSON.stringify("https://jan.ai/guides/troubleshooting")
}),
],
output: {
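
TROUBLESHOOTING_URL is injected at build time by webpack's DefinePlugin and declared as a global in the extension's .d.ts earlier in this diff. A hedged sketch of how extension code could reference it; the call site below is illustrative only and not shown in this diff:

declare const TROUBLESHOOTING_URL: string;

// Example: point users at the troubleshooting guide when a model fails to load
function formatLoadError(err: Error): string {
  return `Failed to load model: ${err.message}. See ${TROUBLESHOOTING_URL} for help.`;
}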

View File

@ -1,6 +1,6 @@
{
"name": "@janhq/model-extension",
"version": "1.0.20",
"version": "1.0.21",
"description": "Model Management Extension provides model exploration and seamless downloads",
"main": "dist/index.js",
"module": "dist/module.js",

View File

@ -5,7 +5,6 @@ import {
abortDownload,
getResourcePath,
getUserSpace,
fileStat,
InferenceEngine,
joinPath,
ModelExtension,
@ -281,7 +280,7 @@ export default class JanModelExtension implements ModelExtension {
if (file.endsWith('.json')) continue
const path = await joinPath([JanModelExtension._homeDir, dirName, file])
const fileStats = await fileStat(path)
const fileStats = await fs.fileStat(path)
if (fileStats.isDirectory) continue
binaryFileSize = fileStats.size
binaryFileName = file
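
The change above swaps the standalone fileStat import for fs.fileStat. In context, the surrounding loop sizes up a model folder; a condensed sketch under the assumption that fs.fileStat resolves to an object with isDirectory and size, as the diff implies (the helper name and parameters are hypothetical):

// Find the model binary in a model directory, skipping JSON metadata and subfolders
async function findModelBinary(homeDir: string, dirName: string, files: string[]) {
  for (const file of files) {
    if (file.endsWith('.json')) continue
    const filePath = await joinPath([homeDir, dirName, file])
    const stats = await fs.fileStat(filePath)
    if (stats.isDirectory) continue
    return { binaryFileName: file, binaryFileSize: stats.size }
  }
  return undefined
}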

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": ["<|end_of_turn|>"],
"frequency_penalty": 0,
"presence_penalty": 0
},
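
These model.json hunks add a stop field to each bundled model's inference parameters: an empty list for most models, and the template's end-of-turn token for OpenChat-style models. As a sketch, the parameters block now carries roughly this shape (values follow the diff; the surrounding keys are unchanged):

// Inference parameters after this change (illustrative)
const parameters = {
  top_p: 0.95,
  stream: true,
  max_tokens: 4096,
  stop: ["<|end_of_turn|>"], // [] for models without a special end-of-turn token
  frequency_penalty: 0,
  presence_penalty: 0,
};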

View File

@ -16,6 +16,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": ["<|end_of_turn|>"],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": ["<|end_of_turn|>"],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,6 +15,7 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 4096,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},

View File

@ -15,13 +15,14 @@
"top_p": 0.95,
"stream": true,
"max_tokens": 2048,
"stop": [],
"frequency_penalty": 0,
"presence_penalty": 0
},
"metadata": {
"author": "TinyLlama",
"tags": ["Tiny", "Foundation Model"],
"size": 1170000000
"size": 669000000
},
"engine": "nitro"
}

Some files were not shown because too many files have changed in this diff.