From fd146071e4a273b56f2602f387c1546304c4b9f5 Mon Sep 17 00:00:00 2001
From: 0xSage
Date: Tue, 5 Sep 2023 15:18:10 +0800
Subject: [PATCH 1/2] docs: small fixes

---
 README.md | 31 +++++++++++++++++--------------
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 2526ea8cb..8aa3b3dac 100644
--- a/README.md
+++ b/README.md
@@ -14,20 +14,23 @@ Jan is free, source-available, and fair-code licensed.
 - GPU support with Nvidia hardware acceleration
 - CPU support with optimizations via llama.cpp
 
-## Features
-- [x] Web, Mobile and APIs (OpenAI compatible REST & GRPC)
-- [x] LLMs and Generative Art models
-- [x] Support for Apple Silicon, CPU architectures
-- [x] C++ inference backend with CUDA/TensorRT/Triton, dynamic batching
+## Current Features
+- [x] Llama 7Bn
+- [x] Web app and APIs (OpenAI compatible REST & GRPC)
+- [x] Supports Apple Silicon/CPU & GPU architectures
 - [x] Load balancing via Traefik
 - [x] Login and authz via Keycloak
-- [x] Data persistence via Postgres and/or MinIO
+- [x] Data storage via Postgres, MinIO
 
-## Planned
-- [ ] Support opting out of optional, 3rd party integrations
+## Planned Features
+- [ ] Support more language models (code llama, wizard, etc.)
+- [ ] Support text2image models (SD1.5, SDXL, etc.)
+- [ ] Accelerated popular models via TensorRT ([WIP](https://github.com/janhq/accelerated_ai))
+- [ ] C++ inference backend with CUDA & dynamic batching ([WIP](https://github.com/janhq/inference_backend))
+- [ ] Let ppl opt out of 3rd party integrations
 - [ ] Universal model installer & compiler, targeting Nvidia GPU acceleration
 - [ ] Mobile UI with a swappable backend URL
-- [ ] Support for controlnet, upscaler, and code llama
+- [ ] UI for controlnet, upscaler, and code llama
 - [ ] Admin dashboards for user management and audit
 
 ## Quickstart
@@ -105,11 +108,11 @@ docker compose up -d
 
 | Service                  | URL                   | Credentials                                                                                                                                                          |
 | ------------------------ | --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Keycloak                 | http://localhost:8088 | Admin credentials are set via the environment variables `KEYCLOAK_ADMIN` and `KEYCLOAK_ADMIN_PASSWORD`                                                               |
-| app-backend (hasura)     | http://localhost:8080 | Admin credentials are set via the environment variables `HASURA_GRAPHQL_ADMIN_SECRET` in file `conf/sample.env_app-backend`                                          |
-| web-client               | http://localhost:3000 | Users are signed up to keycloak, default created user is set via `conf/keycloak_conf/example-realm.json` on keycloak with username: `username`, password: `password` |
-| llm service              | http://localhost:8000 |                                                                                                                                                                      |
-| sd service               | http://localhost:8001 |                                                                                                                                                                      |
+| Web App                  | http://localhost:3000 | Users are signed up to keycloak, default created user is set via `conf/keycloak_conf/example-realm.json` on keycloak with username: `username`, password: `password` |
+| Keycloak Admin           | http://localhost:8088 | Admin credentials are set via the environment variables `KEYCLOAK_ADMIN` and `KEYCLOAK_ADMIN_PASSWORD`                                                               |
+| Hasura App Backend       | http://localhost:8080 | Admin credentials are set via the environment variables `HASURA_GRAPHQL_ADMIN_SECRET` in file `conf/sample.env_app-backend`                                          |
+| LLM Service              | http://localhost:8000 |                                                                                                                                                                      |
+| Stable Diffusion Service | http://localhost:8001 |                                                                                                                                                                      |
 
 ## Usage
 

From b186c8d09b3a7f4c5c4baf1c890a8db14d5a5ea9 Mon Sep 17 00:00:00 2001
From: 0xSage
Date: Tue, 5 Sep 2023 15:20:23 +0800
Subject: [PATCH 2/2] docs: make concise

---
 README.md | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 8aa3b3dac..aa477d285 100644
--- a/README.md
+++ b/README.md
@@ -23,14 +23,10 @@ Jan is free, source-available, and fair-code licensed.
 - [x] Data storage via Postgres, MinIO
 
 ## Planned Features
-- [ ] Support more language models (code llama, wizard, etc.)
-- [ ] Support text2image models (SD1.5, SDXL, etc.)
-- [ ] Accelerated popular models via TensorRT ([WIP](https://github.com/janhq/accelerated_ai))
-- [ ] C++ inference backend with CUDA & dynamic batching ([WIP](https://github.com/janhq/inference_backend))
-- [ ] Let ppl opt out of 3rd party integrations
+- [ ] Support more models, with acceleration ([WIP](https://github.com/janhq/accelerated_ai))
+- [ ] C++ inference backend w/ CUDA & dynamic batching ([WIP](https://github.com/janhq/inference_backend))
 - [ ] Universal model installer & compiler, targeting Nvidia GPU acceleration
 - [ ] Mobile UI with a swappable backend URL
-- [ ] UI for controlnet, upscaler, and code llama
 - [ ] Admin dashboards for user management and audit
 
 ## Quickstart
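Note for anyone reviewing these README changes locally: the quickstart they document can be sketched as below. The variable names (`KEYCLOAK_ADMIN`, `KEYCLOAK_ADMIN_PASSWORD`) and the ports come straight from the service table in the diff; the concrete values here are placeholders for illustration, not project defaults.

```shell
# Set the Keycloak admin credentials the table refers to (placeholder values).
export KEYCLOAK_ADMIN=admin
export KEYCLOAK_ADMIN_PASSWORD=changeme

# Then bring up the stack (requires Docker; not run here):
#   docker compose up -d
#
# Services and ports from the table above:
#   Web App                   http://localhost:3000
#   Keycloak Admin            http://localhost:8088
#   Hasura App Backend        http://localhost:8080
#   LLM Service               http://localhost:8000
#   Stable Diffusion Service  http://localhost:8001
echo "KEYCLOAK_ADMIN=$KEYCLOAK_ADMIN"
```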