8 Commits

Author SHA1 Message Date
hiento09
4f93e14d16
Fix slow token speed on machines with multiple GPUs (#1157)
* Update Windows bat script to choose the GPU with the highest RAM to start nitro

* Update Linux bash script to choose the GPU with the highest VRAM

---------

Co-authored-by: Hien To <tominhhien97@gmail.com>
2023-12-21 15:38:21 +07:00
Louis
4653030bc1
fix: #1097 streaming response is replaced by error message (#1099) 2023-12-19 16:42:13 +07:00
hiento09
fde176955a
bump nitro version to 0.1.30 (#1036) 2023-12-15 17:39:49 +07:00
hiro
7f60265b3e chore: Bump nitro to 0.1.27 to support api to kill process 2023-12-13 16:35:37 +07:00
hiro
8f5c5e1e42 chore: Bump nitro to 0.1.26 2023-12-12 19:41:48 +07:00
hiro
f2eb8635da chore: Bump nitro bin version to 0.1.23 2023-12-11 20:53:53 +07:00
hiro
6d3bf24d5c chore: remove gitkeep 2023-12-08 23:06:08 +07:00
hiro
c01737ff69 refactor: Change inference-extension to inference-nitro-extension 2023-12-08 23:06:08 +07:00