Merge Hiro's Architecture docs

2023-11-18 22:36:59 +08:00 · 2023-11-18 22:36:59 +08:00 · 55d8aba03a
commit 55d8aba03a
parent 49ac9ebd19 ba46ff7f08
13 changed files with 867 additions and 2874 deletions
--- a/docs/docs/guide/server.md
+++ b/docs/docs/guide/server.md
@ -0,0 +1,3 @@
+---
+title: API Server
+---
--- a/docs/docs/specs/architecture.md
+++ b/docs/docs/specs/architecture.md
@ -2,4 +2,40 @@
 title: Architecture
 ---

- [ ] Add Architecture Diagram here
+## Concepts
+
+```mermaid
+graph LR
+    A1[("User / Integrator")] -->|uses| B1[assistant]
+    B1 -->|persist conversational history| C1[("thread A")]
+    B1 -->|executes| D1[("built-in tools as module")]
+    B1 -.->|uses| E1[model]
+    E1 -.->|model.json| D1
+    D1 --> F1[retrieval]
+    F1 -->|belongs to| G1[("web browsing")]
+    G1 --> H1[Google]
+    G1 --> H2[DuckDuckGo]
+    F1 -->|belongs to| I1[("API calling")]
+    F1 --> J1[("knowledge files")]
+```
+- User / Integrator
+- Assistant object
+- Model object
+- Thread object
+- Built-in tool object
+
+## File system
+```sh
+janroot/
+	assistants/
+		assistant-a/
+			assistant.json
+			src/
+				index.ts
+			threads/
+				thread-a/
+				thread-b/
+	models/
+		model-a/
+			model.json
+```
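
The layout above can be turned into path lookups. This is a minimal sketch, assuming the directory names shown in the tree; the helper names are hypothetical and not part of Jan.

```python
from pathlib import Path


def assistant_config_path(jan_root: Path, assistant_id: str) -> Path:
    """Location of an assistant's `assistant.json` in the tree above."""
    return jan_root / "assistants" / assistant_id / "assistant.json"


def thread_dir(jan_root: Path, assistant_id: str, thread_id: str) -> Path:
    """In this layout, threads are nested under the assistant that owns them."""
    return jan_root / "assistants" / assistant_id / "threads" / thread_id


def model_config_path(jan_root: Path, model_id: str) -> Path:
    """Location of a model's `model.json` in the tree above."""
    return jan_root / "models" / model_id / "model.json"
```

For example, `assistant_config_path(Path("janroot"), "assistant-a")` resolves to `janroot/assistants/assistant-a/assistant.json`.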
--- a/docs/docs/specs/assistants.md
+++ b/docs/docs/specs/assistants.md
@ -2,188 +2,239 @@
 title: "Assistants"
 ---

-:::warning
-
-Draft Specification: functionality has not been implemented yet. 
-
-Feedback: [HackMD: Assistants Spec](https://hackmd.io/KKAznzZvS668R6Vmyf8fCg) 
-
-:::
-
+Assistants can use models and tools.
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants
+- Jan's `Assistants` go beyond OpenAI's because their behavior can be customized with code in `index.js`.

 ## User Stories

-_Users can chat with an assistant_
+_Users can download an assistant via a web URL_

- [Wireframes - show asst object properties]
- See [Threads Spec](https://hackmd.io/BM_8o_OCQ-iLCYhunn2Aug)
+- Wireframes here

-_Users can use Jan - the default assistant_
+_Users can import an assistant from a local directory_

- [Wireframes here - show model picker]
- See [Default Jan Object](#Default-Jan-Example)
+- Wireframes here

-_Users can create an assistant from scratch_
+_Users can configure assistant settings_

- [Wireframes here - show create asst flow]
- Users can select any model for an assistant. See Model Spec
+- Wireframes here

-_Users can create an assistant from an existing assistant_
+## Assistant Object

- [Wireframes showing asst edit mode]
-
-## Jan Assistant Object
-
- A `Jan Assistant Object` is a "representation of an assistant"
- Objects are defined by `assistant-uuid.json` files in `json` format
- Objects are designed to be compatible with `OpenAI Assistant Objects` with additional properties needed to run on our infrastructure.
- ALL object properties are optional, i.e. users should be able to use an assistant declared by an empty `json` file.
-
-| Property      | Type                                            | Description                                                                                    | Validation                      |
-| ------------- | ----------------------------------------------- | ---------------------------------------------------------------------------------------------- | ------------------------------- |
-| `object`      | enum: `model`, `assistant`, `thread`, `message` | The Jan Object type                                                                            | Defaults to `assistant`         |
-| `name`        | string                                          | A vanity name.                                                                                 | Defaults to filename            |
-| `description` | string                                          | A vanity description.                                                                          | Max `n` chars. Defaults to `""` |
-| `models`      | array                                           | A list of Model Objects that the assistant can use.                                            | Defaults to ALL models          |
-| `metadata`    | map                                             | This can be useful for storing additional information about the object in a structured format. | Defaults to `{}`                |
-| `tools`       | array                                           | TBA.                                                                                           | TBA                             |
-| `files`       | array                                           | TBA.                                                                                           | TBA                             |
-
-### Generic Example
+- `assistant.json`
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/object

 ```json
-// janroot/assistants/example/example.json
-"name": "Homework Helper",
+{
+  // Jan specific properties
+  "avatar": "https://lala.png",
+  "thread_location": "ROOT/threads",  // Defaults to root (optional field)
+  // TODO: add more

-// Option 1 (default): all models in janroot/models are available via Model Picker
-"models": [],
-
-// Option 2: creator can configure custom parameters on existing models in `janroot/models` &&
-// Option 3: creator can package a custom model with the assistant
-"models": [{ ...modelObject1 }, { ...modelObject2 }],
+  // OpenAI compatible properties: https://platform.openai.com/docs/api-reference/assistants
+  "id": "asst_abc123",
+  "object": "assistant",
+  "created_at": 1698984975,
+  "name": "Math Tutor",
+  "description": null,
+  "instructions": "...",
+  "tools": [
+    {
+      "type": "retrieval"
+    },
+    {
+      "type": "web_browsing"
+    }
+  ],
+  "file_ids": ["file_id"],
+  "models": ["<model_id>"],
+  "metadata": {}
+}
 ```

-### Default Jan Example
+### Assistant lifecycle
+An assistant has 4 states (enum):
+- `to_download`
+- `downloading`
+- `ready`
+- `running`

- Every user install has a default "Jan Assistant" declared below.
-  > Q: can we omit most properties in `jan.json`? It's all defaults anyway.
+## Assistants API

+- What does modifying an Assistant do? (Does it mutate `index.js`?)
+  - By default, `index.js` loads the `assistant.json` file and runs the assistant exactly as configured there. This supports builders with little time to write code.
+  - `assistant.json` is the single source of truth for the `models` and `built-in tools` an assistant can use, so builders can change them without writing more code.
+
+### List assistants
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/listAssistants
+- Example request
+```shell
+  curl "{JAN_URL}/v1/assistants?order=desc&limit=20" \
+    -H "Content-Type: application/json"
+```
+- Example response
 ```json
-// janroot/assistants/jan/jan.json
-"description": "Use Jan to chat with all models",
+{
+  "object": "list",
+  "data": [
+    {
+      "id": "asst_abc123",
+      "object": "assistant",
+      "created_at": 1698982736,
+      "name": "Coding Tutor",
+      "description": null,
+      "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"],
+      "instructions": "You are a helpful assistant designed to make me better at coding!",
+      "tools": [],
+      "file_ids": [],
+      "metadata": {},
+      "state": "ready"
+    }
+  ],
+  "first_id": "asst_abc123",
+  "last_id": "asst_abc789",
+  "has_more": false
+}
 ```
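
The list request and response above can be sketched from a client's point of view. This assumes the `order` and `limit` query parameters and the `object`/`data` response fields shown in the example; the function names and the base URL in the usage note are hypothetical.

```python
import json
from typing import List
from urllib.parse import urlencode


def list_assistants_url(jan_url: str, order: str = "desc", limit: int = 20) -> str:
    """Build the list URL with the query string properly encoded."""
    query = urlencode({"order": order, "limit": limit})
    return f"{jan_url.rstrip('/')}/v1/assistants?{query}"


def assistant_names(response_body: str) -> List[str]:
    """Extract assistant names from a list response like the one above."""
    payload = json.loads(response_body)
    if payload.get("object") != "list":
        raise ValueError("expected a list response")
    return [item["name"] for item in payload["data"]]
```

Calling `list_assistants_url("http://localhost:1337")` gives `http://localhost:1337/v1/assistants?order=desc&limit=20`. Building the query with `urlencode` avoids the shell treating `&` as a background operator, which is a risk when the URL is typed unquoted.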

-## Filesystem
-
- Everything needed to represent & run an assistant is packaged into an `Assistant folder`.
- The folder is standalone and can be easily zipped, imported, and exported, e.g. to Github.
- The folder always contains an `Assistant Object`, declared in an `assistant-uuid.json`.
-  - The folder and file must share the same name: `assistant-uuid`
- In the future, the folder will contain all of the resources an assistant needs to run, e.g. custom model binaries, pdf files, custom code, etc.
-
-```sh
-janroot/
-    assistants/
-        jan/                       # Assistant Folder
-            jan.json               # Assistant Object
-        homework-helper/           # Assistant Folder
-            homework-helper.json   # Assistant Object
+### Get assistant
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/getAssistant
+- Example request
+```shell
+  curl {JAN_URL}/v1/assistants/{assistant_id}   \
+    -H "Content-Type: application/json"
+```
+- Example response
+```json
+{
+  "id": "asst_abc123",
+  "object": "assistant",
+  "created_at": 1699009709,
+  "name": "HR Helper",
+  "description": null,
+  "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"],
+  "instructions": "You are an HR bot, and you have access to files to answer employee questions about company policies.",
+  "tools": [
+    {
+      "type": "retrieval"
+    }
+  ],
+  "file_ids": [
+    "file-abc123"
+  ],
+  "metadata": {},
+  "state": "ready"
+}
 ```

-### Custom Code
-
-> Not in scope yet. Sharing as a preview only.
-
- Assistants can call custom code in the future
- Custom code extends beyond `function calling` to any features that can be implemented in `/src`
-
-```sh
-example/                       # Assistant Folder
-    example.json               # Assistant Object
-    package.json
-    src/
-        index.ts
-        helpers.ts
+### Create an assistant
+Create an assistant with models and instructions.
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/createAssistant
+- Example request
+```shell
+  curl -X POST {JAN_URL}/v1/assistants \
+    -H "Content-Type: application/json" \
+    -d '{
+      "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
+      "name": "Math Tutor",
+      "tools": [{"type": "retrieval"}],
+      "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"]
+    }'
 ```
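
The same create request can be assembled programmatically. This is a sketch only: it assumes the `models` key used by the Assistant Object, and the function name is hypothetical. The request is built but not sent.

```python
import json
from urllib.request import Request


def build_create_assistant_request(jan_url, name, instructions, models, tools=None):
    """Build (but do not send) a POST request that creates an assistant."""
    body = {
        "name": name,
        "instructions": instructions,
        "models": list(models),    # assumption: key name follows the Assistant Object
        "tools": list(tools or []),
    }
    return Request(
        url=f"{jan_url.rstrip('/')}/v1/assistants",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Serializing with `json.dumps` sidesteps the shell-quoting issues that come with an inline `-d` body.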
-
-### Knowledge Files
-
-> Not in scope yet. Sharing as a preview only
-
- Assistants can do `retrieval` in future
-
-```sh
-
-example/                       # Assistant Folder
-    example.json               # Assistant Object
-    files/
+- Example response
+```json
+{
+  "id": "asst_abc123",
+  "object": "assistant",
+  "created_at": 1698984975,
+  "name": "Math Tutor",
+  "description": null,
+  "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"],
+  "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
+  "tools": [
+    {
+      "type": "retrieval"
+    }
+  ],
+  "file_ids": [],
+  "metadata": {},
+  "state": "ready"
+}
+```
+### Modify an assistant
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/modifyAssistant
+- Example request
+```shell
+  curl -X POST {JAN_URL}/v1/assistants/{assistant_id} \
+    -H "Content-Type: application/json" \
+    -d '{
+      "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
+      "name": "Math Tutor",
+      "tools": [{"type": "retrieval"}],
+      "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"]
+    }'
+```
+- Example response
+```json
+{
+  "id": "asst_abc123",
+  "object": "assistant",
+  "created_at": 1698984975,
+  "name": "Math Tutor",
+  "description": null,
+  "models": ["model_zephyr_7b", "azure-openai-gpt4-turbo"],
+  "instructions": "You are a personal math tutor. When asked a question, write and run Python code to answer the question.",
+  "tools": [
+    {
+      "type": "retrieval"
+    }
+  ],
+  "file_ids": [],
+  "metadata": {},
+  "state": "ready"
+}
 ```
-
-## Jan API
-
-### Assistant API Object
-
-#### `GET /v1/assistants/{assistant_id}`
-
- The `Jan Assistant Object` maps into the `OpenAI Assistant Object`.
- Properties marked with `*` are compatible with the [OpenAI `assistant` object](https://platform.openai.com/docs/api-reference/assistants)
- Note: The `Jan Assistant Object` has additional properties when retrieved via its API endpoint.
- https://platform.openai.com/docs/api-reference/assistants/getAssistant
-
-| Property         | Type           | Public Description                                                        | Jan Assistant Object (`a`) Property |
-| ---------------- | -------------- | ------------------------------------------------------------------------- | ----------------------------------- |
-| `id`\*           | string         | Assistant uuid, also the name of the Jan Assistant Object file: `id.json` | `json` filename                     |
-| `object`\*       | string         | Always "assistant"                                                        | `a.object`                          |
-| `created_at`\*   | integer        | Timestamp when assistant was created.                                     | `a.json` creation time              |
-| `name`\*         | string or null | A display name                                                            | `a.name` or `id`                    |
-| `description`\*  | string or null | A description                                                             | `a.description`                     |
-| `model`\*        | string         | Text                                                                      | `a.models[0].name`                  |
-| `instructions`\* | string or null | Text                                                                      | `a.models[0].parameters.prompt`     |
-| `tools`\*        | array          | TBA                                                                       | `a.tools`                           |
-| `file_ids`\*     | array          | TBA                                                                       | `a.files`                           |
-| `metadata`\*     | map            | TBA                                                                       | `a.metadata`                        |
-| `models`         | array          | TBA                                                                       | `a.models`                          |
-
-### Create Assistant
-
-#### `POST /v1/assistants`
-
- https://platform.openai.com/docs/api-reference/assistants/createAssistant
-
-### Retrieve Assistant
-
-#### `GET v1/assistants/{assistant_id}`
-
- https://platform.openai.com/docs/api-reference/assistants/getAssistant
-
-### Modify Assistant
-
-#### `POST v1/assistants/{assistant_id}`
-
- https://platform.openai.com/docs/api-reference/assistants/modifyAssistant
-
 ### Delete Assistant
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/assistants/deleteAssistant
+- Example request
+```shell
+curl -X DELETE {JAN_URL}/v1/assistants/{assistant_id}
+```
+- Example response
+```json
+{
+  "id": "asst_abc123",
+  "object": "assistant.deleted",
+  "deleted": true,
+  "state": "to_download"
+}
+```

-#### `DELETE v1/assistants/{assistant_id}`
+## Assistants Filesystem

- https://platform.openai.com/docs/api-reference/assistants/deleteAssistant
+```sh
+/assistants
+    /jan
+        assistant.json    # Assistant configs (see above)

-### List Assistants
+        # For any custom code
+        package.json      # Import npm modules
+                          # e.g. LangChain, LlamaIndex
+        /src              # Supporting files (needs better name)
+            index.js      # Entrypoint
+            process.js    # For electron IPC processes (needs better name)

-#### `GET v1/assistants`
+        # `/threads` at root level
+        # `/models` at root level
+    /shakespeare
+        assistant.json
+        package.json
+        /src
+            index.js
+            process.js

- https://platform.openai.com/docs/api-reference/assistants/listAssistants
-
-### CRUD Assistant.Models
-
- This is a Jan-only endpoint, since Jan supports the ModelPicker, i.e. an `assistant` can be created to run with many `models`.
-
-#### `POST /v1/assistants/{assistant_id}/models`
-
-#### `GET /v1/assistants/{assistant_id}/models`
-
-#### `GET /v1/assistants/{assistant_id}/models/{model_id}`
-
-#### `DELETE /v1/assistants/{assistant_id}/models`
-
-Note: There's no need to implement `Modify Assistant.Models`
+        /threads          # Assistants remember conversations in the future
+        /models           # Users can upload custom models
+            /finetuned-model
+```
--- a/docs/docs/specs/chats.md
+++ b/docs/docs/specs/chats.md
@ -12,24 +12,5 @@ Chats are essentially inference requests to a model

 > OpenAI Equivalent: https://platform.openai.com/docs/api-reference/chat

-## Chat Object
-
- Equivalent to: https://platform.openai.com/docs/api-reference/chat/object
-
-## Chat API
-
-See [/chat](/api/chat)
-
- Equivalent to: https://platform.openai.com/docs/api-reference/chat
-
-```sh
-POST https://localhost:1337/v1/chat/completions
-
-TODO:
-# Figure out how to incorporate tools
-```
-
-## Chat Filesystem
-
- Chats will be persisted to `messages` within `threads`
- There is no data structure specific to Chats
+- This page should reference the Nitro ChatCompletion API page to reduce duplication.
+- Adding a Jan API for this is fine, but it makes sense to use Nitro as the reference because Nitro is the default inference engine for Jan in this release.
--- a/docs/docs/specs/files.md
+++ b/docs/docs/specs/files.md
@ -31,6 +31,20 @@ Files can be used by `threads`, `assistants` and `fine-tuning`
 ```

 ## File API
+### List Files
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/files/list
+
+### Upload file
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/files/create
+
+### Delete file
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/files/delete
+
+### Retrieve file
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/files/retrieve
+
+### Retrieve file content
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/files/retrieve-contents

 ## Files Filesystem

@ -45,5 +59,4 @@ Files can be used by `threads`, `assistants` and `fine-tuning`
 /threads
    /jan-12938912
        /files          # thread-specific files
-
 ```
--- a/docs/docs/specs/fine-tuning.md
+++ b/docs/docs/specs/fine-tuning.md
@ -0,0 +1,4 @@
+---
+title: "Fine tuning"
+---
+Todo: @hiro
--- a/docs/docs/specs/messages.md
+++ b/docs/docs/specs/messages.md
@ -11,17 +11,14 @@ Feedback: [HackMD: Threads Spec](https://hackmd.io/BM_8o_OCQ-iLCYhunn2Aug)
 :::

 Messages are within `threads` and capture additional metadata.
-
- Equivalent to: https://platform.openai.com/docs/api-reference/messages
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages

 ## Message Object
-
- Equivalent to: https://platform.openai.com/docs/api-reference/messages/object
-
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages/object
 ```json
 {
  // Jan specific properties
-  "updatedAt": "..." // that's it I think
+  "updatedAt": "...", // that's it I think

  // OpenAI compatible properties: https://platform.openai.com/docs/api-reference/messages)
  "id": "msg_dKYDWyQvtjDBi3tudL1yWKDa",
@ -46,16 +43,136 @@ Messages are within `threads` and capture additional metadata.
 ```

 ## Messages API
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages

- Equivalent to: https://platform.openai.com/docs/api-reference/messages
+### List messages
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages/listMessages
+- Example request
+```shell
+  curl {JAN_URL}/v1/threads/{thread_id}/messages \
+    -H "Content-Type: application/json"
+```
+- Example response
+```json
+{
+  "object": "list",
+  "data": [
+    {
+      "id": "msg_abc123",
+      "object": "thread.message",
+      "created_at": 1699017614,
+      "thread_id": "thread_abc123",
+      "role": "user",
+      "content": [
+        {
+          "type": "text",
+          "text": {
+            "value": "How does AI work? Explain it in simple terms.",
+            "annotations": []
+          }
+        }
+      ],
+      "file_ids": [],
+      "assistant_id": null,
+      "run_id": null,
+      "metadata": {}
+    }
+  ],
+  "first_id": "msg_abc123",
+  "last_id": "msg_abc123",
+  "has_more": false
+}
+```
+### Create message
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages/createMessage
+- Example request
+```shell
+  curl -X POST {JAN_URL}/v1/threads/{thread_id}/messages \
+    -H "Content-Type: application/json" \
+    -d '{
+      "role": "user",
+      "content": "How does AI work? Explain it in simple terms."
+    }'
+```
+- Example response
+```json
+  {
+    "id": "msg_abc123",
+    "object": "thread.message",
+    "created_at": 1699017614,
+    "thread_id": "thread_abc123",
+    "role": "user",
+    "content": [
+      {
+        "type": "text",
+        "text": {
+          "value": "How does AI work? Explain it in simple terms.",
+          "annotations": []
+        }
+      }
+    ],
+    "file_ids": [],
+    "assistant_id": null,
+    "run_id": null,
+    "metadata": {}
+  }
+```
+### Get message
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/messages/getMessage
+- Example request
+```shell
+  curl {JAN_URL}/v1/threads/{thread_id}/messages/{message_id} \
+    -H "Content-Type: application/json"
+```
+- Example response
+```json
+  {
+    "id": "msg_abc123",
+    "object": "thread.message",
+    "created_at": 1699017614,
+    "thread_id": "thread_abc123",
+    "role": "user",
+    "content": [
+      {
+        "type": "text",
+        "text": {
+          "value": "How does AI work? Explain it in simple terms.",
+          "annotations": []
+        }
+      }
+    ],
+    "file_ids": [],
+    "assistant_id": null,
+    "run_id": null,
+    "metadata": {}
+  }
+```

-```sh
-POST https://api.openai.com/v1/threads/{thread_id}/messages # create msg
-GET https://api.openai.com/v1/threads/{thread_id}/messages  # list messages
-GET https://api.openai.com/v1/threads/{thread_id}/messages/{message_id}
+### Modify message
+> Jan: TODO: Do we need to modify message? Or let user create new message?

 ### Get message file
-GET https://api.openai.com/v1/threads/{thread_id}/messages/{message_id}/files/{file_id}
-# List message files
-GET https://api.openai.com/v1/threads/{thread_id}/messages/{message_id}/files
+> OpenAI Equivalent: https://api.openai.com/v1/threads/{thread_id}/messages/{message_id}/files/{file_id}
+- Example request
+```shell
+  curl {JAN_URL}/v1/threads/{thread_id}/messages/{message_id}/files/{file_id} \
+    -H "Content-Type: application/json"
 ```
+- Example response
+```json
+  {
+    "id": "file-abc123",
+    "object": "thread.message.file",
+    "created_at": 1699061776,
+    "message_id": "msg_abc123"
+  }
+```
+### List message files
+> OpenAI Equivalent: https://api.openai.com/v1/threads/{thread_id}/messages/{message_id}/files
+- Example request
+```shell
+  curl {JAN_URL}/v1/threads/{thread_id}/messages/{message_id}/files \
+    -H "Content-Type: application/json"
+```
+- Example response
+```json
+  {
+    "object": "list",
+    "data": [
+      {
+        "id": "file-abc123",
+        "object": "thread.message.file",
+        "created_at": 1699061776,
+        "message_id": "msg_abc123"
+      }
+    ]
+  }
+```
--- a/docs/docs/specs/models.md
+++ b/docs/docs/specs/models.md
@ -46,7 +46,7 @@ _Users can override run settings at runtime_
 | `object`                | enum: `model`, `assistant`, `thread`, `message`               | Type of the Jan Object. Always `model`                                    | Defaults to "model"                              |
 | `name`                  | string                                                        | A vanity name                                                             | Defaults to filename                             |
 | `description`           | string                                                        | A vanity description of the model                                         | Defaults to ""                                   |
-| `state`                 | enum[`running` , `stopped`, `not-downloaded` , `downloading`] | Needs more thought                                                        | Defaults to `not-downloaded`                     |
+| `state`                 | enum[`to_download`, `downloading`, `ready`, `running`]        | Needs more thought                                                        | Defaults to `to_download`                        |
 | `parameters`            | map                                                           | Defines default model run parameters used by any assistant.               | Defaults to `{}`                                 |
 | `metadata`              | map                                                           | Stores additional structured information about the model.                 | Defaults to `{}`                                 |
 | `metadata.engine`       | enum: `llamacpp`, `api`, `tensorrt`                           | The model backend used to run model.                                      | Defaults to "llamacpp"                           |
@ -83,10 +83,11 @@ Additionally, Jan supports importing popular formats. For example, if you provid

 Supported URL formats with custom importers:

- `huggingface/thebloke`: `TODO: URL here`
+- `huggingface/thebloke`: [Link](https://huggingface.co/TheBloke/Llama-2-7B-GGUF)
 - `janhq`: `TODO: put URL here`
- `azure_openai`: `TODO: put URL here`
- `openai`: `TODO: put URL here`
+- `azure_openai`: `https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo`
+- `openai`: `api.openai.com`

 ### Generic Example

@ -98,52 +99,66 @@ Supported URL formats with custom importers:
 // Note: Default fields omitted for brevity
 "source_url": "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf",
 "parameters": {
-    "ctx_len": 2048,
-    "ngl": 100,
-    "embedding": true,
-    "n_parallel": 4,
+  "init": {
+    "ctx_len": "2048",
+    "ngl": "100",
+    "embedding": "true",
+    "n_parallel": "4",
    "pre_prompt": "A chat between a curious user and an artificial intelligence",
    "user_prompt": "USER: ",
    "ai_prompt": "ASSISTANT: "
+  },
+  "runtime": {
    "temperature": "0.7",
    "token_limit": "2048",
-    "top_k": "..",
-    "top_p": "..",
+    "top_k": "0",
+    "top_p": "1",
+    "stream": "true"
+  }
 },
 "metadata": {
-    "quantization": "..",
-    "size": "..",
+    "engine": "llamacpp",
+    "quantization": "Q4_K_M",
+    "size": "7B"
 }
 ```

 ### Example: multiple binaries

- Model has multiple binaries
+- Model has multiple binaries `model-llava-1.5-ggml.json`
 - See [source](https://huggingface.co/mys/ggml_llava-v1.5-13b)

 ```json
-"source_url": "https://huggingface.co/mys/ggml_llava-v1.5-13b"
+"source_url": "https://huggingface.co/mys/ggml_llava-v1.5-13b",
+"parameters": {"init": {}, "runtime": {}},
 "metadata": {
-    "binaries": "..", // TODO: what should this property be
+    "mmproj_binary": "https://huggingface.co/mys/ggml_llava-v1.5-13b/blob/main/mmproj-model-f16.gguf",
+    "ggml_binary": "https://huggingface.co/mys/ggml_llava-v1.5-13b/blob/main/ggml-model-q5_k.gguf",
+    "engine": "llamacpp",
+    "quantization": "Q5_K"
 }
 ```

 ### Example: Azure API

- Using a remote API to access model
+- Using a remote API to access model `model-azure-openai-gpt4-turbo.json`
 - See [source](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)

 ```json
 "source_url": "https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo",
 "parameters": {
+  "init": {
    "API-KEY": "",
    "DEPLOYMENT-NAME": "",
-    "api-version": "2023-05-15",
+    "api-version": "2023-05-15"
+  },
+  "runtime": {
    "temperature": "0.7",
    "max_tokens": "2048",
    "presence_penalty": "0",
    "top_p": "1",
    "stream": "true"
+  }
 },
 "metadata": {
    "engine": "api",
@ -155,7 +170,7 @@ Supported URL formats with custom importers:
 - Everything needed to represent a `model` is packaged into an `Model folder`.
 - The `folder` is standalone and can be easily zipped, imported, and exported, e.g. to Github.
 - The `folder` always contains at least one `Model Object`, declared in a `json` format.
-  - The `folder` and `file` do not have to share the same name
+- The `folder` and `file` do not have to share the same name
 - The model `id` is made up of `folder_name/filename` and is thus always unique.
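
The `id` rule above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and the file name is kept verbatim because the spec does not say whether the extension is dropped.

```python
from pathlib import PurePosixPath


def model_id(model_file: str) -> str:
    """Derive `folder_name/filename` from the path to a Model Object file."""
    path = PurePosixPath(model_file)
    # Assumption: the file name is used as-is, extension included.
    return f"{path.parent.name}/{path.name}"
```

Because the folder name is part of the `id`, two folders can contain files with the same name without colliding.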

 ```sh
@ -170,11 +185,9 @@ Supported URL formats with custom importers:
 ```

 ### Default ./model folder
-
 - Jan ships with a default models folder containing recommended models
 - Only the Model Object `json` files are included
 - Users must later explicitly download the model binaries
-
 ```sh
 models/
    mistral-7b/
@ -182,7 +195,6 @@ models/
    hermes-7b/
        hermes-7b.json
 ```
-
 ### Multiple quantizations

 - Each quantization has its own `Jan Model Object` file
@ -193,7 +205,6 @@ llama2-7b-gguf/
    llama2-7b-gguf-Q3_K_L.json
    .bin
 ```
-
 ### Multiple model partitions

 - A Model that is partitioned into several binaries uses just 1 file
@ -204,8 +215,7 @@ llava-ggml/
    .proj
    ggml
 ```
-
-### ?? whats this example for?
+### Your locally fine-tuned model

 - ??

@ -214,67 +224,149 @@ llama-70b-finetune/
    llama-70b-finetune-q5.json
    .bin
 ```
-
 ## Jan API
-
 ### Model API Object
-
 - The `Jan Model Object` maps into the `OpenAI Model Object`.
 - Properties marked with `*` are compatible with the [OpenAI `model` object](https://platform.openai.com/docs/api-reference/models)
 - Note: The `Jan Model Object` has additional properties when retrieved via its API endpoint.
- https://platform.openai.com/docs/api-reference/models/object
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/object

-| Property      | Type           | Public Description                                          | Jan Model Object (`m`) Property              |
-| ------------- | -------------- | ----------------------------------------------------------- | -------------------------------------------- |
-| `id`\*        | string         | Model uuid; also the file location under `/models`          | `folder/filename`                            |
-| `object`\*    | string         | Always "model"                                              | `m.object`                                   |
-| `created`\*   | integer        | Timestamp when model was created.                           | `m.json` creation time                       |
-| `owned_by`\*  | string         | The organization that owns the model.                       | grep author from `m.source_url` OR $(whoami) |
-| `name`        | string or null | A display name                                              | `m.name` or filename                         |
-| `description` | string         | A vanity description of the model                           | `m.description`                              |
-| `state`       | enum           |                                                             |                                              |
-| `parameters`  | map            | Defines default model run parameters used by any assistant. |                                              |
-| `metadata`    | map            | Stores additional structured information about the model.   |                                              |
-
-### List models
-
- https://platform.openai.com/docs/api-reference/models/list
-
-TODO: @hiro
+### Model lifecycle
+A model has 4 states (enum):
+- `to_download`
+- `downloading`
+- `ready`
+- `running`

 ### Get Model
-
- https://platform.openai.com/docs/api-reference/models/retrieve
-
-TODO: @hiro
-
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/retrieve
+- Example request
+```shell
+curl {JAN_URL}/v1/models/{model_id}
+```
+- Example response
+```json
+{
+  "id": "model-zephyr-7B",
+  "object": "model",
+  "created_at": 1686935002,
+  "owned_by": "thebloke",
+  "state": "running",
+  "source_url": "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf",
+  "parameters": {
+     "ctx_len": 2048,
+     "ngl": 100,
+     "embedding": true,
+     "n_parallel": 4,
+     "pre_prompt": "A chat between a curious user and an artificial intelligence",
+     "user_prompt": "USER: ",
+     "ai_prompt": "ASSISTANT: ",
+     "temperature": "0.7",
+     "token_limit": "2048",
+     "top_k": "0",
+     "top_p": "1"
+  },
+  "metadata": {
+     "engine": "llamacpp",
+     "quantization": "Q4_K_M",
+     "size": "7B"
+  }
+}
+```
+### List models
+Lists the currently available models, and provides basic information about each one such as the owner and availability.
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/list
+- Example request
+```shell
+curl {JAN_URL}/v1/models
+```
+- Example response
+```json
+{
+  "object": "list",
+  "data": [
+    {
+      "id": "model-zephyr-7B",
+      "object": "model",
+      "created_at": 1686935002,
+      "owned_by": "thebloke",
+      "state": "running"
+    },
+    {
+      "id": "ft-llama-70b-gguf",
+      "object": "model",
+      "created_at": 1686935002,
+      "owned_by": "you",
+      "state": "ready"
+    },
+    {
+      "id": "model-azure-openai-gpt4-turbo",
+      "object": "model",
+      "created_at": 1686935002,
+      "owned_by": "azure_openai",
+      "state": "running"
+    }
+  ]
+}
+```
 ### Delete Model
-
- https://platform.openai.com/docs/api-reference/models/delete
-
-TODO: @hiro
-
-### Get Model State
-
-> Jan-only endpoint
-> TODO: @hiro
-
-### Get Model Metadata
-
-> Jan-only endpoint
-> TODO: @hiro
-
-### Download Model
-
-> Jan-only endpoint
-> TODO: @hiro
-
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/delete
+- Example request
+```shell
+curl -X DELETE {JAN_URL}/v1/models/{model_id}
+```
+- Example response
+```json
+{
+  "id": "model-zephyr-7B",
+  "object": "model",
+  "deleted": true,
+  "state": "to_download"
+}
+```
 ### Start Model
-
 > Jan-only endpoint
-> TODO: @hiro
-
+Starts a model by changing its state from `ready` to `running`.
+- Example request
+```shell
+curl -X PUT {JAN_URL}/v1/models/{model_id}/start
+```
+- Example response
+```json
+{
+  "id": "model-zephyr-7B",
+  "object": "model",
+  "state": "running"
+}
+```
 ### Stop Model
-
 > Jan-only endpoint
-> TODO: @hiro
+Stops a model by changing its state from `running` to `ready`.
+- Example request
+```shell
+curl -X PUT {JAN_URL}/v1/models/{model_id}/stop
+```
+- Example response
+```json
+{
+  "id": "model-zephyr-7B",
+  "object": "model",
+  "state": "ready"
+}
+```
+### Download Model
+> Jan-only endpoint
+Downloads a model by changing its state from `to_download` to `downloading`, then to `ready` once the download completes.
+- Example request
+```shell
+curl -X POST {JAN_URL}/v1/models/
+```
+- Example response
+```json
+{
+  "id": "model-zephyr-7B",
+  "object": "model",
+  "state": "downloading"
+}
+```
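The model lifecycle described in this spec can be read as a small state machine. The sketch below is illustrative only and is not Jan's implementation: the action names (`download`, `finish_download`, `start`, `stop`, `delete`) and the `apply_action` helper are hypothetical, and allowing `delete` from any state is an assumption based solely on the Delete Model example response returning `to_download`.

```python
from enum import Enum


class ModelState(str, Enum):
    """The four model states listed under "Model lifecycle"."""
    TO_DOWNLOAD = "to_download"
    DOWNLOADING = "downloading"
    READY = "ready"
    RUNNING = "running"


# Hypothetical action names mapped to (required state, resulting state).
TRANSITIONS = {
    "download": (ModelState.TO_DOWNLOAD, ModelState.DOWNLOADING),
    "finish_download": (ModelState.DOWNLOADING, ModelState.READY),
    "start": (ModelState.READY, ModelState.RUNNING),
    "stop": (ModelState.RUNNING, ModelState.READY),
}


def apply_action(state: ModelState, action: str) -> ModelState:
    """Return the state after `action`, or raise ValueError if it is not allowed."""
    if action == "delete":
        # Assumption: delete is accepted from any state and resets the model.
        return ModelState.TO_DOWNLOAD
    if action not in TRANSITIONS:
        raise ValueError(f"unknown action: {action}")
    required, target = TRANSITIONS[action]
    if state is not required:
        raise ValueError(
            f"cannot {action} a model in state {state.value}; "
            f"expected {required.value}"
        )
    return target
```

Under these assumptions, a client could use such a table to reject a `start` call for a model that has not finished downloading before sending the request.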
--- a/docs/docs/specs/threads.md
+++ b/docs/docs/specs/threads.md
@ -14,13 +14,9 @@ Feedback: [HackMD: Threads Spec](https://hackmd.io/BM_8o_OCQ-iLCYhunn2Aug)

 _Users can chat with an assistant in a thread_

- See [Messages Spec]
+- See [Messages Spec](./messages.md)

-_Users can change model in a new thread_
-
- Wireframes here
-
-_Users can change model parameters in a thread_
+_Users can change assistant and model parameters in a thread_

 - Wireframes of

@ -38,7 +34,7 @@ _Users can delete all thread history_
 | Property   | Type                                            | Description                                                                                                                                                                                    | Validation                     |
 | ---------- | ----------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------ |
 | `object`   | enum: `model`, `assistant`, `thread`, `message` | The Jan Object type                                                                                                                                                                            | Defaults to `thread`           |
-| `models`   | array                                           | An array of Jan Model Objects. Threads can "override" an assistant's model run parameters. Thread-level model parameters are directly saved in the `thread.models` property! (see Models spec) | Defaults to `assistant.models` |
+| `assistants`   | array                                           | An array of Jan Assistant Objects. Threads can "override" an assistant's parameters. Thread-level model parameters are directly saved in the `thread.models` property! (see Models spec) | Defaults to `assistant.name` |
 | `messages` | array                                           | An array of Jan Message Objects. (see Messages spec)                                                                                                                                           | Defaults to `[]`               |
 | `metadata` | map                                             | Useful for storing additional information about the object in a structured format.                                                                                                             | Defaults to `{}`               |

@ -46,6 +42,7 @@ _Users can delete all thread history_

 ```json
 // janroot/threads/jan_1700123404.json
+"assistants": ["assistant-123"],
 "messages": [
    {...message0}, {...message1}
 ],
@ -56,7 +53,7 @@ _Users can delete all thread history_

 ## Filesystem

- `Jan Thread Objects`' `json` files always has the naming schema: `assistant_uuid` + `unix_time_thread_created_at. See below.
+- A `Jan Thread Object`'s `json` file always follows the naming schema `assistant_uuid` + `unix_time_thread_created_at`. See below.
 - Threads are all saved in the `janroot/threads` folder in a flat folder structure.
 - The folder is standalone and can be easily zipped, exported, and cleared.

@ -68,67 +65,129 @@ janroot/
 ```

 ## Jan API
-
-### Thread API Object
-
-#### `GET /v1/threads/{thread_id}`
-
- The `Jan Thread Object` maps into the `OpenAI Thread Object`.
- Properties marked with `*` are compatible with the [OpenAI `thread` object](https://platform.openai.com/docs/api-reference/threads)
- Note: The `Jan Thread Object` has additional properties when retrieved via its API endpoint.
- https://platform.openai.com/docs/api-reference/threads/getThread
-
-| Property       | Type    | Public Description                                                  | Jan Thread Object (`t`) Property |
-| -------------- | ------- | ------------------------------------------------------------------- | -------------------------------- |
-| `id`\*         | string  | Thread uuid, also the name of the Jan Thread Object file: `id.json` | `json` filename                  |
-| `object`\*     | string  | Always "thread"                                                     | `t.object`                       |
-| `created_at`\* | integer |                                                                     | `json` file creation time        |
-| `metadata`\*   | map     |                                                                     | `t.metadata`                     |
-| `models`       | array   |                                                                     | `t.models`                       |
-| `messages`     | array   |                                                                     | `t.messages`                     |
-
+### Get thread
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/threads/getThread
+- Example request
+```shell
+    curl {JAN_URL}/v1/threads/{thread_id}
+```
+- Example response
+```json
+    {
+    "id": "thread_abc123",
+    "object": "thread",
+    "created_at": 1699014083,
+    "assistants": ["assistant-001"],
+    "metadata": {},
+    "messages": []
+    }
+```
 ### Create Thread
-
-#### `POST /v1/threads`
-
- https://platform.openai.com/docs/api-reference/threads/createThread
-
-### Retrieve Thread
-
-#### `GET v1/threads/{thread_id}`
-
- https://platform.openai.com/docs/api-reference/threads/getThread
-
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/threads/createThread
+- Example request
+```shell
+    curl -X POST {JAN_URL}/v1/threads \
+    -H "Content-Type: application/json" \
+    -d '{
+        "messages": [{
+            "role": "user",
+            "content": "Hello, what is AI?",
+            "file_ids": ["file-abc123"]
+        }, {
+            "role": "user",
+            "content": "How does AI work? Explain it in simple terms."
+        }]
+    }'
+```
+- Example response
+```json
+    {
+    "id": "thread_abc123",
+    "object": "thread",
+    "created_at": 1699014083,
+    "metadata": {}
+    }
+```
 ### Modify Thread
-
-#### `POST v1/threads/{thread_id}`
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/threads/modifyThread
+- Example request
+```shell
+    curl -X POST {JAN_URL}/v1/threads/{thread_id} \
+    -H "Content-Type: application/json" \
+    -d '{
+        "messages": [{
+            "role": "user",
+            "content": "Hello, what is AI?",
+            "file_ids": ["file-abc123"]
+        }, {
+            "role": "user",
+            "content": "How does AI work? Explain it in simple terms."
+        }]
+    }'
+```
+- Example response
+```json
+    {
+    "id": "thread_abc123",
+    "object": "thread",
+    "created_at": 1699014083,
+    "metadata": {}
+    }
+```

 - https://platform.openai.com/docs/api-reference/threads/modifyThread

 ### Delete Thread
-
-#### `DELETE v1/threads/{thread_id}`
-
- https://platform.openai.com/docs/api-reference/threads/deleteThread
+> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/threads/deleteThread
+- Example request
+```shell
+    curl -X DELETE {JAN_URL}/v1/threads/{thread_id}
+```
+- Example response
+```json
+    {
+    "id": "thread_abc123",
+    "object": "thread.deleted",
+    "deleted": true
+    }
+```

 ### List Threads
-
 > This is a Jan-only endpoint, not supported by OAI yet.
+- Example request
+```shell
+    curl {JAN_URL}/v1/threads \
+    -H "Content-Type: application/json"
+```
+- Example response
+```json
+    [
+        {
+            "id": "thread_abc123",
+            "object": "thread",
+            "created_at": 1699014083,
+            "assistants": ["assistant-001"],
+            "metadata": {},
+            "messages": []
+        },
+        {
+            "id": "thread_abc456",
+            "object": "thread",
+            "created_at": 1699014083,
+            "assistants": ["assistant-002", "assistant-002"],
+            "metadata": {}
+        }
+    ]
+```

-#### `GET v1/threads`
+### Get & Modify `Thread.Assistants`
+-> This can be done by calling the `Modify Thread` API.

-### Get & Modify `Thread.Models`
+#### `GET v1/threads/{thread_id}/assistants`
+-> This can be done by calling the `Get Thread` API.

-> This is a Jan-only endpoint, not supported by OAI yet.
-
-#### `GET v1/threads/{thread_id}/models`
-
-#### `POST v1/threads/{thread_id}/models/{model_id}`
-
- Since users can change model parameters in an existing thread
+#### `POST v1/threads/{thread_id}/assistants/{assistant_id}`
+-> This can be done by calling the `Modify Assistant` API with `thread.assistant[]`.

 ### List `Thread.Messages`
-
-> This is a Jan-only endpoint, not supported by OAI yet.
-
-#### `GET v1/threads/{thread_id}/messages`
+-> This can be done by calling the `Get Thread` API.
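Since the spec folds the `Thread.Assistants` and `Thread.Messages` lookups into the `Get Thread` response, a client can derive both from one parsed response body. The sketch below assumes the response has the shape shown in the Get thread example; the helper names `list_thread_messages` and `list_thread_assistants` are hypothetical and not part of any Jan SDK.

```python
from typing import Any, Dict, List


def list_thread_messages(thread: Dict[str, Any]) -> List[Any]:
    """Return the messages embedded in a parsed Get Thread response."""
    # A thread without a "messages" key is treated as having no messages.
    return list(thread.get("messages", []))


def list_thread_assistants(thread: Dict[str, Any]) -> List[str]:
    """Return the assistant ids embedded in a parsed Get Thread response."""
    return list(thread.get("assistants", []))


# Example body copied from the Get thread example response in this spec.
example_thread = {
    "id": "thread_abc123",
    "object": "thread",
    "created_at": 1699014083,
    "assistants": ["assistant-001"],
    "metadata": {},
    "messages": [],
}
```

Both helpers return copies, so callers can modify the result without changing the parsed response.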
--- a/docs/docusaurus.config.js
+++ b/docs/docusaurus.config.js
@ -58,15 +58,13 @@ const config = {
    ],
  ],

-  // Only for react live
  themes: ["@docusaurus/theme-live-codeblock"],

  // The classic preset will relay each option entry to the respective sub plugin/theme.
  presets: [
    [
-      "classic",
-      /** @type {import('@docusaurus/preset-classic').Options} */
-      ({
+      "@docusaurus/preset-classic",
+      {
        // Will be passed to @docusaurus/plugin-content-docs (false to disable)
        docs: {
          routeBasePath: "/",
@ -97,7 +95,7 @@ const config = {
        },
        // Will be passed to @docusaurus/plugin-content-pages (false to disable)
        // pages: {},
-      }),
+      },
    ],
    // Redoc preset
    [
@ -119,65 +117,63 @@ const config = {
  ],

  // Docs: https://docusaurus.io/docs/api/themes/configuration
-  themeConfig:
-    /** @type {import('@docusaurus/preset-classic').ThemeConfig} */
-    ({
-      image: "img/jan-social-card.png",
-      // Only for react live
-      liveCodeBlock: {
-        playgroundPosition: "bottom",
+  themeConfig: {
+    image: "img/jan-social-card.png",
+    // Only for react live
+    liveCodeBlock: {
+      playgroundPosition: "bottom",
+    },
+    docs: {
+      sidebar: {
+        hideable: true,
+        autoCollapseCategories: true,
      },
-      docs: {
-        sidebar: {
-          hideable: true,
-          autoCollapseCategories: true,
+    },
+    navbar: {
+      title: "Jan",
+      logo: {
+        alt: "Jan Logo",
+        src: "img/logo.svg",
+      },
+      items: [
+        // Navbar Left
+        {
+          type: "docSidebar",
+          sidebarId: "docsSidebar",
+          position: "left",
+          label: "Documentation",
        },
-      },
-      navbar: {
-        title: "Jan",
-        logo: {
-          alt: "Jan Logo",
-          src: "img/logo.svg",
+        {
+          type: "docSidebar",
+          sidebarId: "apiSidebar",
+          position: "left",
+          label: "API Reference",
        },
-        items: [
-          // Navbar Left
-          {
-            type: "docSidebar",
-            sidebarId: "docsSidebar",
-            position: "left",
-            label: "Documentation",
-          },
-          {
-            type: "docSidebar",
-            sidebarId: "apiSidebar",
-            position: "left",
-            label: "API Reference",
-          },
-          // Navbar right
-          {
-            to: "blog",
-            label: "Blog",
-            position: "right",
-          },
-          {
-            type: "docSidebar",
-            sidebarId: "aboutSidebar",
-            position: "right",
-            label: "About",
-          },
-        ],
-      },
-      prism: {
-        theme: darkCodeTheme,
-        darkTheme: darkCodeTheme,
-        additionalLanguages: ["python"],
-      },
-      colorMode: {
-        defaultMode: "dark",
-        disableSwitch: false,
-        respectPrefersColorScheme: false,
-      },
-    }),
+        // Navbar right
+        {
+          to: "blog",
+          label: "Blog",
+          position: "right",
+        },
+        {
+          type: "docSidebar",
+          sidebarId: "aboutSidebar",
+          position: "right",
+          label: "About",
+        },
+      ],
+    },
+    prism: {
+      theme: darkCodeTheme,
+      darkTheme: darkCodeTheme,
+      additionalLanguages: ["python"],
+    },
+    colorMode: {
+      defaultMode: "dark",
+      disableSwitch: false,
+      respectPrefersColorScheme: false,
+    },
+  },
 };

 module.exports = config;
--- a/docs/package.json
+++ b/docs/package.json
@ -17,7 +17,7 @@
    "@docusaurus/core": "^2.4.3",
    "@docusaurus/preset-classic": "^2.4.3",
    "@docusaurus/theme-live-codeblock": "^2.4.3",
-    "@docusaurus/theme-mermaid": "^3.0.0",
+    "@docusaurus/theme-mermaid": "^2.4.3",
    "@headlessui/react": "^1.7.17",
    "@heroicons/react": "^2.0.18",
    "@mdx-js/react": "^1.6.22",
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@ -34,7 +34,7 @@ const sidebars = {
      label: "Using Jan",
      collapsible: true,
      collapsed: true,
-      items: ["guide/models"],
+      items: ["guide/models", "guide/server"],
    },
    {
      type: "category",
@ -70,6 +70,7 @@ const sidebars = {
            "specs/assistants",
            "specs/files",
            "specs/jan",
+            "specs/fine-tuning",
          ],
        },
      ],
--- a/docs/yarn.lock
+++ b/docs/yarn.lock