* chore: add model.json for Llama3 and other outdated model versions
* fix: format consistency
* fix: correct folder id
* update: bump version
* add: stop words
* fix: model.json
* Update extensions/inference-nitro-extension/resources/models/llama3-8b-instruct/model.json
* Update extensions/inference-nitro-extension/resources/models/llama3-8b-instruct/model.json
Based on suggested change
Co-authored-by: Nikolaus Kühn <nikolaus.kuehn@commercetools.com>
---------
Co-authored-by: Van-QA <van@jan.ai>
Co-authored-by: Hoang Ha <64120343+hahuyhoang411@users.noreply.github.com>
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Nikolaus Kühn <nikolaus.kuehn@commercetools.com>
* fix: move to coming soon
* fix: Q4 for consistency
* bump extension version
* bump model version
* fix: highlight unsupported tag
---------
Co-authored-by: Louis <louis@jan.ai>
* chore: extension should register its own models
Signed-off-by: James <james@jan.ai>
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
* feat: add extension settings
Signed-off-by: James <james@jan.ai>
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Louis <louis@jan.ai>
* feat: tensorrt-llm-extension
* fix: loading
* feat: add download tensorrt llm runner
Signed-off-by: James <james@jan.ai>
* feat: switch monitoring extension from webpack to rollupjs
Signed-off-by: James <james@jan.ai>
* feat: move nvidia info update to monitoring extension
Signed-off-by: James <james@jan.ai>
* allow download tensorrt
Signed-off-by: James <james@jan.ai>
* update
Signed-off-by: James <james@jan.ai>
* allow tensorrt download based on GPU setting
Signed-off-by: James <james@jan.ai>
* update downloaded models
Signed-off-by: James <james@jan.ai>
* feat: add extension compatibility
* dynamic tensor rt engines
Signed-off-by: James <james@jan.ai>
* update models
Signed-off-by: James <james@jan.ai>
* chore: remove ts-ignore
* feat: getting installation state from extension
Signed-off-by: James <james@jan.ai>
* chore: adding type for decompress
Signed-off-by: James <james@jan.ai>
* feat: update according to Louis's comment
Signed-off-by: James <james@jan.ai>
* feat: add progress for installing extension
Signed-off-by: James <james@jan.ai>
* chore: remove args from extension installation
* fix: model download does not work properly
* fix: do not allow user to stop tensorrtllm inference
* fix: extension installed style
* fix: download tensorrt does not update state
Signed-off-by: James <james@jan.ai>
* chore: replace int4 with fp16
* feat: modal for installing extension
Signed-off-by: James <james@jan.ai>
* fix: start download immediately after pressing install
Signed-off-by: James <james@jan.ai>
* fix: error switching between engines
* feat: rename inference provider to ai engine and refactor to core
* fix: missing ulid
* fix: core bundler
* feat: add cancel extension installing
Signed-off-by: James <james@jan.ai>
* remove mocking for mac
Signed-off-by: James <james@jan.ai>
* fix: show models only when extension is ready
* add tensorrt badge for model
Signed-off-by: James <james@jan.ai>
* fix: copy
* fix: add compatible check (#2342)
* fix: add compatible check
Signed-off-by: James <james@jan.ai>
* fix: copy
* fix: font
* fix: copy
* fix: broken monitoring extension
* chore: bump engine
* fix: copy
* fix: model copy
* fix: copy
* fix: model json
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Louis <louis@jan.ai>
* fix: vulkan support
* fix: installation button padding
* fix: empty script
* fix: remove hard code string
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: NamH <NamNh0122@gmail.com>
* feat: add quick ask
Signed-off-by: James <james@jan.ai>
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Louis <louis@jan.ai>
* feat: add vulkan support on windows and linux
* fix: correct vulkan settings
* fix: gpu settings and enable Vulkan support
* fix: vulkan supports only one device at a time
* inference-nitro-extension: add vulkaninfo download
---------
Co-authored-by: Louis <louis@jan.ai>
Co-authored-by: Hien To <tominhhien97@gmail.com>
* Web: change API_BASE_URL to a build-time env
* Update Dockerfile and Docker Compose by adding env API_BASE_URL
* Update make clean
* Get INFERENCE_URL from baseApiUrl
* Fix settings/settings.json not found error when starting the server for the first time
* Update README docker
---------
Co-authored-by: Hien To <tominhhien97@gmail.com>
* fix: reduce the number of api calls
Signed-off-by: James <james@jan.ai>
* fix: download progress
Signed-off-by: James <james@jan.ai>
* chore: save blob
* fix: server boot up
* fix: download state not updating
Signed-off-by: James <james@jan.ai>
* fix: copy assets
* Add Dockerfile CPU for Jan Server and Jan Web
* Add Dockerfile GPU for Jan Server and Jan Web
* feat: S3 adapter
* Update find count check for ./pre-install and correct the copy:asserts command
* server add bundleDependencies @janhq/core
* server add bundleDependencies @janhq/core
* fix: update success/failed download state (#1945)
* fix: update success/failed download state
Signed-off-by: James <james@jan.ai>
* fix: download model progress and state handling for both Desktop and Web
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Louis <louis@jan.ai>
* chore: refactor
* fix: model list empty on first open
* Add Docker compose
* fix: assistants onUpdate
---------
Signed-off-by: James <james@jan.ai>
Co-authored-by: James <james@jan.ai>
Co-authored-by: Hien To <tominhhien97@gmail.com>
Co-authored-by: NamH <NamNh0122@gmail.com>