diff --git a/docs/blog/01-january-10-2024-bitdefender-false-positive-flag.mdx b/docs/blog/2024-01-10-bitdefender-false-positive-flag.mdx similarity index 100% rename from docs/blog/01-january-10-2024-bitdefender-false-positive-flag.mdx rename to docs/blog/2024-01-10-bitdefender-false-positive-flag.mdx diff --git a/docs/blog/2024-03-19-TensorRT-LLM.md b/docs/blog/2024-03-19-TensorRT-LLM.md new file mode 100644 index 000000000..08f1a1d1a --- /dev/null +++ b/docs/blog/2024-03-19-TensorRT-LLM.md @@ -0,0 +1,116 @@ +--- +title: Jan now supports TensorRT-LLM +description: Jan has added support for Nvidia's TensorRT-LLM, a hardware-optimized LLM inference engine that runs very fast on Nvidia GPUs +tags: [Nvidia, TensorRT-LLM] +--- + +Jan now supports [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM) as an alternative inference engine. TensorRT-LLM is a hardware-optimized LLM inference engine that compiles models to [run extremely fast on Nvidia GPUs](https://blogs.nvidia.com/blog/tensorrt-llm-windows-stable-diffusion-rtx/). + +- The [TensorRT-LLM Extension](/guides/providers/tensorrt-llm) is available in the [0.4.9 release](https://github.com/janhq/jan/releases/tag/v0.4.9) +- Currently available only for Windows + +We've made a few TensorRT-LLM models available in the Jan Hub for download: + +- TinyLlama-1.1b +- Mistral 7b +- TinyJensen-1.1b, which is trained on Jensen Huang's 👀 + +## What is TensorRT-LLM? + +Please read our [TensorRT-LLM Guide](/guides/providers/tensorrt-llm). + +TensorRT-LLM is mainly used on datacenter-grade GPUs to achieve speeds on the order of [10,000 tokens/s](https://nvidia.github.io/TensorRT-LLM/blogs/H100vsA100.html). + +## Performance Benchmarks + + +We were curious to see how this would perform on consumer-grade GPUs, since that is what most of Jan's users run. + +- We compared TensorRT-LLM against llama.cpp, our default inference engine. 
+ +| NVIDIA GPU | Architecture | VRAM (GB) | CUDA Cores | Tensor Cores | Memory Bus Width (bit) | Memory Bandwidth (GB/s) | +| ---------- | ------------ | -------------- | ---------- | ------------ | ---------------------- | ----------------------- | +| RTX 4090 | Ada | 24 | 16,384 | 512 | 384 | ~1000 | +| RTX 3090 | Ampere | 24 | 10,496 | 328 | 384 | 935.8 | +| RTX 4060 | Ada | 8 | 3,072 | 96 | 128 | 272 | + +> We test with batch_size 1, input length 2048, and output length 512, as this is the most common use case. Each test is run 5 times and the results are averaged. + +> We use Windows Task Manager and, on Linux, nvidia-smi and htop to collect CPU, memory, and NVIDIA GPU metrics per process. + +> We close all other user applications and run only the Jan app with Nitro TensorRT-LLM, or the NVIDIA benchmark script in Python. + +### RTX 4090 on Windows PC + +- CPU: Intel 13th series +- GPU: NVIDIA GPU 4090 (Ada - sm 89) +- RAM: 120GB +- OS: Windows + +#### TinyLlama-1.1b q4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| -------------------- | -------------------- | ------------ | +| Throughput (token/s) | 104 | ✅ 131 | +| VRAM Used (GB) | 2.1 | 😱 21.5 | +| RAM Used (GB) | 0.3 | 😱 15 | +| Disk Size (GB) | 4.07 | 4.07 | + +#### Mistral-7b int4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| -------------------- | -------------------- | ------------ | +| Throughput (token/s) | 80 | ✅ 97.9 | +| VRAM Used (GB) | 2.1 | 😱 23.5 | +| RAM Used (GB) | 0.3 | 😱 15 | +| Disk Size (GB) | 4.07 | 4.07 | + +### RTX 3090 on Windows PC + +- CPU: Intel 13th series +- GPU: NVIDIA GPU 3090 (Ampere - sm 86) +- RAM: 64GB +- OS: Windows + +#### TinyLlama-1.1b q4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| -------------------- | -------------------- | ------------ | +| Throughput (token/s) | 131.28 | ✅ 194 | +| VRAM Used (GB) | 2.1 | 😱 21.5 | +| RAM Used (GB) | 0.3 | 😱 15 | +| Disk Size (GB) | 4.07 | 4.07 | + +#### Mistral-7b int4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| 
-------------------- | -------------------- | ------------ | +| Throughput (token/s) | 88 | ✅ 137 | +| VRAM Used (GB) | 6.0 | 😱 23.8 | +| RAM Used (GB) | 0.3 | 😱 25 | +| Disk Size (GB) | 4.07 | 4.07 | + +### RTX 4060 on Windows Laptop + +- Manufacturer: Acer Nitro 16 Phenix +- CPU: Ryzen 7000 +- RAM: 16GB +- GPU: NVIDIA Laptop GPU 4060 (Ada) + +#### TinyLlama-1.1b q4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| -------------------- | -------------------- | ------------ | +| Throughput (token/s) | 65 | ❌ 41 | +| VRAM Used (GB) | 2.1 | 😱 7.6 | +| RAM Used (GB) | 0.3 | 😱 7.2 | +| Disk Size (GB) | 4.07 | 4.07 | + +#### Mistral-7b int4 + +| Metrics | GGUF (using the GPU) | TensorRT-LLM | +| -------------------- | -------------------- | ------------ | +| Throughput (token/s) | 22 | ❌ 19 | +| VRAM Used (GB) | 2.1 | 😱 7.7 | +| RAM Used (GB) | 0.3 | 😱 13.5 | +| Disk Size (GB) | 4.07 | 4.07 | diff --git a/docs/docs/releases/changelog/cache.json b/docs/docs/releases/changelog/cache.json index a8c680d81..13bb08d60 100644 --- a/docs/docs/releases/changelog/cache.json +++ b/docs/docs/releases/changelog/cache.json @@ -64,7 +64,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 118201794, - "download_count": 89, + "download_count": 94, "created_at": "2024-03-19T04:08:03Z", "updated_at": "2024-03-19T04:08:06Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-linux-amd64-0.4.9.deb" @@ -98,7 +98,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 156697166, - "download_count": 91, + "download_count": 98, "created_at": "2024-03-19T04:06:51Z", "updated_at": "2024-03-19T04:06:55Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-linux-x86_64-0.4.9.AppImage" @@ -132,7 +132,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 132665337, - "download_count": 133, + "download_count": 135, "created_at": "2024-03-19T04:10:15Z", 
"updated_at": "2024-03-19T04:10:26Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-mac-arm64-0.4.9.dmg" @@ -200,7 +200,7 @@ "content_type": "application/zip", "state": "uploaded", "size": 128089843, - "download_count": 212, + "download_count": 222, "created_at": "2024-03-19T04:10:32Z", "updated_at": "2024-03-19T04:10:46Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-mac-arm64-0.4.9.zip" @@ -268,7 +268,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 139245048, - "download_count": 36, + "download_count": 39, "created_at": "2024-03-19T04:17:33Z", "updated_at": "2024-03-19T04:17:37Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-mac-x64-0.4.9.dmg" @@ -336,7 +336,7 @@ "content_type": "application/zip", "state": "uploaded", "size": 134752189, - "download_count": 38, + "download_count": 40, "created_at": "2024-03-19T04:17:52Z", "updated_at": "2024-03-19T04:17:56Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-mac-x64-0.4.9.zip" @@ -404,7 +404,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 129440528, - "download_count": 1044, + "download_count": 1078, "created_at": "2024-03-19T04:18:43Z", "updated_at": "2024-03-19T04:18:46Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-win-x64-0.4.9.exe" @@ -438,7 +438,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 136498, - "download_count": 556, + "download_count": 570, "created_at": "2024-03-19T04:18:43Z", "updated_at": "2024-03-19T04:18:43Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/jan-win-x64-0.4.9.exe.blockmap" @@ -472,7 +472,7 @@ "content_type": "text/yaml", "state": "uploaded", "size": 540, - "download_count": 304, + "download_count": 321, "created_at": "2024-03-19T04:08:06Z", "updated_at": "2024-03-19T04:08:06Z", 
"browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/latest-linux.yml" @@ -506,7 +506,7 @@ "content_type": "text/yaml", "state": "uploaded", "size": 842, - "download_count": 712, + "download_count": 743, "created_at": "2024-03-19T04:18:53Z", "updated_at": "2024-03-19T04:18:53Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/latest-mac.yml" @@ -540,7 +540,7 @@ "content_type": "text/yaml", "state": "uploaded", "size": 339, - "download_count": 1835, + "download_count": 1890, "created_at": "2024-03-19T04:18:46Z", "updated_at": "2024-03-19T04:18:46Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.9/latest.yml" @@ -627,7 +627,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 110060688, - "download_count": 850, + "download_count": 852, "created_at": "2024-03-11T06:08:19Z", "updated_at": "2024-03-11T06:08:21Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.8/jan-linux-amd64-0.4.8.deb" @@ -1001,7 +1001,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 127370, - "download_count": 3188, + "download_count": 3195, "created_at": "2024-03-11T06:15:48Z", "updated_at": "2024-03-11T06:15:48Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.8/jan-win-x64-0.4.8.exe.blockmap" @@ -1564,7 +1564,7 @@ "content_type": "application/octet-stream", "state": "uploaded", "size": 116340, - "download_count": 6503, + "download_count": 6509, "created_at": "2024-02-26T02:48:10Z", "updated_at": "2024-02-26T02:48:10Z", "browser_download_url": "https://github.com/janhq/jan/releases/download/v0.4.7/jan-win-x64-0.4.7.exe.blockmap" @@ -4311,7 +4311,7 @@ "content_type": "text/xml", "state": "uploaded", "size": 110511, - "download_count": 220, + "download_count": 221, "created_at": "2023-12-15T14:19:41Z", "updated_at": "2023-12-15T14:19:42Z", "browser_download_url": 
"https://github.com/janhq/jan/releases/download/v0.4.2/jan-win-x64-0.4.2.exe.blockmap" diff --git a/docs/docs/releases/changelog/changelog-v0.2.0.mdx b/docs/docs/releases/changelog/changelog-v0.2.0.mdx index 55a64bc48..5884db762 100644 --- a/docs/docs/releases/changelog/changelog-v0.2.0.mdx +++ b/docs/docs/releases/changelog/changelog-v0.2.0.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 17 +sidebar_position: 18 slug: /changelog/changelog-v0.2.0 --- # v0.2.0 diff --git a/docs/docs/releases/changelog/changelog-v0.2.1.mdx b/docs/docs/releases/changelog/changelog-v0.2.1.mdx index e4e8960f6..917aa43a3 100644 --- a/docs/docs/releases/changelog/changelog-v0.2.1.mdx +++ b/docs/docs/releases/changelog/changelog-v0.2.1.mdx @@ -1,13 +1,13 @@ ---- -sidebar_position: 16 -slug: /changelog/changelog-v0.2.1 ---- -# v0.2.1 - -For more details, [GitHub Issues](https://github.com/janhq/jan/releases/tag/v0.2.1) - -Highlighted Issue: [Issue #446: fix: model is started but the indicator is not stopped loading](https://github.com/janhq/jan/pull/446) - +--- +sidebar_position: 17 +slug: /changelog/changelog-v0.2.1 +--- +# v0.2.1 + +For more details, [GitHub Issues](https://github.com/janhq/jan/releases/tag/v0.2.1) + +Highlighted Issue: [Issue #446: fix: model is started but the indicator is not stopped loading](https://github.com/janhq/jan/pull/446) + ## Changes - fix: model is started but the indicator is not stopped loading @louis-jan (#446) @@ -90,4 +90,4 @@ Highlighted Issue: [Issue #446: fix: model is started but the indicator is not ## Contributor @0xSage, @dan-jan, @hiento09, @jan-service-account, @louis-jan, @nam-john-ho, @namchuai, @tikikun, @urmauur, @vuonghoainam and Hien To - + diff --git a/docs/docs/releases/changelog/changelog-v0.2.2.mdx b/docs/docs/releases/changelog/changelog-v0.2.2.mdx index 6546033cd..0beb0013b 100644 --- a/docs/docs/releases/changelog/changelog-v0.2.2.mdx +++ b/docs/docs/releases/changelog/changelog-v0.2.2.mdx @@ -1,13 +1,13 @@ ---- -sidebar_position: 15 
-slug: /changelog/changelog-v0.2.2 ---- -# v0.2.2 - -For more details, [GitHub Issues](https://github.com/janhq/jan/releases/tag/v0.2.2) - -Highlighted Issue: [Issue #469: chore: plugin and app version dependency](https://github.com/janhq/jan/pull/469) - +--- +sidebar_position: 16 +slug: /changelog/changelog-v0.2.2 +--- +# v0.2.2 + +For more details, [GitHub Issues](https://github.com/janhq/jan/releases/tag/v0.2.2) + +Highlighted Issue: [Issue #469: chore: plugin and app version dependency](https://github.com/janhq/jan/pull/469) + ## Changes - chore: plugin and app version dependency @louis-jan (#469) @@ -40,4 +40,4 @@ Highlighted Issue: [Issue #469: chore: plugin and app version dependency](https ## Contributor @hiento09, @jan-service-account, @louis-jan, @namchuai, @urmauur and @vuonghoainam - + diff --git a/docs/docs/releases/changelog/changelog-v0.2.3.mdx b/docs/docs/releases/changelog/changelog-v0.2.3.mdx index e450bffc5..ba4c8fafd 100644 --- a/docs/docs/releases/changelog/changelog-v0.2.3.mdx +++ b/docs/docs/releases/changelog/changelog-v0.2.3.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 14 +sidebar_position: 15 slug: /changelog/changelog-v0.2.3 --- # v0.2.3 diff --git a/docs/docs/releases/changelog/changelog-v0.3.0.mdx b/docs/docs/releases/changelog/changelog-v0.3.0.mdx index 6ef6acb42..1e91edc8b 100644 --- a/docs/docs/releases/changelog/changelog-v0.3.0.mdx +++ b/docs/docs/releases/changelog/changelog-v0.3.0.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 13 +sidebar_position: 14 slug: /changelog/changelog-v0.3.0 --- # v0.3.0 diff --git a/docs/docs/releases/changelog/changelog-v0.3.1.mdx b/docs/docs/releases/changelog/changelog-v0.3.1.mdx index b83bc88a7..dedbae8e1 100644 --- a/docs/docs/releases/changelog/changelog-v0.3.1.mdx +++ b/docs/docs/releases/changelog/changelog-v0.3.1.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 12 +sidebar_position: 13 slug: /changelog/changelog-v0.3.1 --- # v0.3.1 diff --git a/docs/docs/releases/changelog/changelog-v0.3.2.mdx 
b/docs/docs/releases/changelog/changelog-v0.3.2.mdx index acc19cc1a..556085a6a 100644 --- a/docs/docs/releases/changelog/changelog-v0.3.2.mdx +++ b/docs/docs/releases/changelog/changelog-v0.3.2.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 11 +sidebar_position: 12 slug: /changelog/changelog-v0.3.2 --- # v0.3.2 diff --git a/docs/docs/releases/changelog/changelog-v0.3.3.mdx b/docs/docs/releases/changelog/changelog-v0.3.3.mdx index bdf4d1ec3..13fbeae01 100644 --- a/docs/docs/releases/changelog/changelog-v0.3.3.mdx +++ b/docs/docs/releases/changelog/changelog-v0.3.3.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 10 +sidebar_position: 11 slug: /changelog/changelog-v0.3.3 --- # v0.3.3 diff --git a/docs/docs/releases/changelog/changelog-v0.4.0.mdx b/docs/docs/releases/changelog/changelog-v0.4.0.mdx index c0225cc25..e8838e191 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.0.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.0.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 9 +sidebar_position: 10 slug: /changelog/changelog-v0.4.0 --- # v0.4.0 diff --git a/docs/docs/releases/changelog/changelog-v0.4.1.mdx b/docs/docs/releases/changelog/changelog-v0.4.1.mdx index 9e0300a4b..37d35a63d 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.1.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.1.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 8 +sidebar_position: 9 slug: /changelog/changelog-v0.4.1 --- # v0.4.1 diff --git a/docs/docs/releases/changelog/changelog-v0.4.2.mdx b/docs/docs/releases/changelog/changelog-v0.4.2.mdx index 7b2a1b81c..c2a6f7c0f 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.2.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.2.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 7 +sidebar_position: 8 slug: /changelog/changelog-v0.4.2 --- # v0.4.2 diff --git a/docs/docs/releases/changelog/changelog-v0.4.3.mdx b/docs/docs/releases/changelog/changelog-v0.4.3.mdx index 5703dbb6e..7ba008286 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.3.mdx 
+++ b/docs/docs/releases/changelog/changelog-v0.4.3.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 6 +sidebar_position: 7 slug: /changelog/changelog-v0.4.3 --- # v0.4.3 diff --git a/docs/docs/releases/changelog/changelog-v0.4.4.mdx b/docs/docs/releases/changelog/changelog-v0.4.4.mdx index e21359e67..348a48e7e 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.4.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.4.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 5 +sidebar_position: 6 slug: /changelog/changelog-v0.4.4 --- # v0.4.4 diff --git a/docs/docs/releases/changelog/changelog-v0.4.5.mdx b/docs/docs/releases/changelog/changelog-v0.4.5.mdx index 370d37cc7..0a94313a5 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.5.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.5.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 4 +sidebar_position: 5 slug: /changelog/changelog-v0.4.5 --- # v0.4.5 diff --git a/docs/docs/releases/changelog/changelog-v0.4.6.mdx b/docs/docs/releases/changelog/changelog-v0.4.6.mdx index d836551e7..aece33420 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.6.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.6.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 3 +sidebar_position: 4 slug: /changelog/changelog-v0.4.6 --- # v0.4.6 diff --git a/docs/docs/releases/changelog/changelog-v0.4.7.mdx b/docs/docs/releases/changelog/changelog-v0.4.7.mdx index b73ea828c..06db9832d 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.7.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.7.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 2 +sidebar_position: 3 slug: /changelog/changelog-v0.4.7 --- # v0.4.7 diff --git a/docs/docs/releases/changelog/changelog-v0.4.8.mdx b/docs/docs/releases/changelog/changelog-v0.4.8.mdx index d5bb266fb..6aecf4293 100644 --- a/docs/docs/releases/changelog/changelog-v0.4.8.mdx +++ b/docs/docs/releases/changelog/changelog-v0.4.8.mdx @@ -1,5 +1,5 @@ --- -sidebar_position: 1 +sidebar_position: 2 slug: 
/changelog/changelog-v0.4.8 --- # v0.4.8