9 Commits

Author SHA1 Message Date
Louis
d85d02693b
feat: Nitro-Tensorrt-LLM Extension (#2280)
* feat: tensorrt-llm-extension

* fix: loading

* feat: add download tensorrt llm runner

Signed-off-by: James <james@jan.ai>

* feat: update to rollupjs instead of webpack for monitoring extension

Signed-off-by: James <james@jan.ai>

* feat: move update nvidia info to monitor extension

Signed-off-by: James <james@jan.ai>

* allow download tensorrt

Signed-off-by: James <james@jan.ai>

* update

Signed-off-by: James <james@jan.ai>

* allow download tensor rt based on gpu setting

Signed-off-by: James <james@jan.ai>

* update downloaded models

Signed-off-by: James <james@jan.ai>

* feat: add extension compatibility

* dynamic tensor rt engines

Signed-off-by: James <james@jan.ai>

* update models

Signed-off-by: James <james@jan.ai>

* chore: remove ts-ignore

* feat: getting installation state from extension

Signed-off-by: James <james@jan.ai>

* chore: adding type for decompress

Signed-off-by: James <james@jan.ai>

* feat: update according Louis's comment

Signed-off-by: James <james@jan.ai>

* feat: add progress for installing extension

Signed-off-by: James <james@jan.ai>

* chore: remove args from extension installation

* fix: model download does not work properly

* fix: do not allow user to stop tensorrtllm inference

* fix: extension installed style

* fix: download tensorrt does not update state

Signed-off-by: James <james@jan.ai>

* chore: replace int4 by fl16

* feat: modal for installing extension

Signed-off-by: James <james@jan.ai>

* fix: start download immediately after press install

Signed-off-by: James <james@jan.ai>

* fix: error switching between engines

* feat: rename inference provider to ai engine and refactor to core

* fix: missing ulid

* fix: core bundler

* feat: add cancel extension installing

Signed-off-by: James <james@jan.ai>

* remove mocking for mac

Signed-off-by: James <james@jan.ai>

* fix: show models only when extension is ready

* add tensorrt badge for model

Signed-off-by: James <james@jan.ai>

* fix: copy

* fix: add compatible check (#2342)

* fix: add compatible check

Signed-off-by: James <james@jan.ai>

* fix: copy

* fix: font

* fix: copy

* fix: broken monitoring extension

* chore: bump engine

* fix: copy

* fix: model copy

* fix: copy

* fix: model json

---------

Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Louis <louis@jan.ai>

* fix: vulkan support

* fix: installation button padding

* fix: empty script

* fix: remove hard code string

---------

Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: NamH <NamNh0122@gmail.com>
2024-03-14 14:07:22 +07:00
hiro
926f19bd9b
feat: Add nitro vulkan to support AMD GPU/ APU and Intel Arc GPU (#2056)
* feat: add vulkan support on windows and linux

* fix: correct vulkan settings

* fix: gpu settings and enable Vulkan support

* fix: vulkan support 1 device at a time only

* inference-nitro-extension add download vulkaninfo

---------

Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Hien To <tominhhien97@gmail.com>
2024-02-22 11:19:36 +07:00
hiento09
a0e55cde8f
Feature integrate antivirus scanner to ci (#1529)
* Revert nitro to 0.2.6

* Update nitro to 0.2.7 and add ci antivirus scanner

---------

Co-authored-by: Hien To <tominhhien97@gmail.com>
2024-01-11 18:26:36 +07:00
hiento09
31fdd89f0e
Revert nitro to 0.2.6 (#1491)
Co-authored-by: Hien To <tominhhien97@gmail.com>
2024-01-10 13:47:39 +07:00
hiento09
1350413e4f
Bump nitro to 0.2.8 and change Jan App to support cuda >= 11.7 (#1476) 2024-01-10 00:09:18 +07:00
hiento09
f11a59bece
Add detect cuda version (#1351)
Co-authored-by: Hien To <tominhhien97@gmail.com>
2024-01-04 22:53:21 +07:00
hiro
c32ad0aff7 fix: small change in nitro bin location 2023-12-09 00:42:48 +07:00
hiro
0ef9a581d3 fix: BAT for nitro 2023-12-09 00:36:55 +07:00
hiro
c01737ff69 refactor: Change inference-extension to inference-nitro-extension 2023-12-08 23:06:08 +07:00