diff --git a/docs/docs/intro/how-jan-works.md b/docs/docs/intro/how-jan-works.md
index 86fe39b32..54122356d 100644
--- a/docs/docs/intro/how-jan-works.md
+++ b/docs/docs/intro/how-jan-works.md
@@ -1,3 +1,11 @@
 ---
 title: How Jan Works
----
\ No newline at end of file
+---
+
+- Local Filesystem
+Follow on from the Quickstart to show how things actually work
+Write in a conversational style, show how things work under the hood
+Check how the filesystem changes after each request
+- Model loading into RAM/VRAM
+Explain how the .bin file is loaded via llama.cpp
+Explain how it consumes RAM and VRAM, and refer to the system monitor
diff --git a/docs/docs/specs/fine-tuning.md b/docs/docs/specs/fine-tuning.md
index df6723ff1..281a065b2 100644
--- a/docs/docs/specs/fine-tuning.md
+++ b/docs/docs/specs/fine-tuning.md
@@ -1,4 +1,4 @@
 ---
-title: "Fine tuning"
+title: "Fine-tuning"
 ---
 Todo: @hiro
\ No newline at end of file
diff --git a/docs/docs/specs/models.md b/docs/docs/specs/models.md
index d7e19be0a..f5fd26e3f 100644
--- a/docs/docs/specs/models.md
+++ b/docs/docs/specs/models.md
@@ -10,56 +10,103 @@
 Feedback: [HackMD: Models Spec](https://hackmd.io/ulO3uB1AQCqLa5SAAMFOQw)
 :::
-Models are AI models like Llama and Mistral
+## Overview
-> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models
+Jan's Model API aims to be as similar as possible to [OpenAI's Models API](https://platform.openai.com/docs/api-reference/models), with additional methods for managing and running models locally.
-## User Stories
+### User Objectives
-_Users can download a model via a web URL_
+- Users can start/stop models and use them in a thread (or via the Chat Completions API)
+- Users can download, import, and delete models
+- Users can configure model settings at the model level or override them at the thread level
+- Users can use remote models (e.g. OpenAI, OpenRouter)
-- Wireframes here
+## Models Folder
-_Users can import a model from local directory_
+Models in Jan are stored in the `/models` folder and declared via `.json` files.
-- Wireframes here
-_Users can configure model settings, like run parameters_
+- Everything needed to represent a `model` is packaged into a `Model folder`.
+- The `folder` is standalone and can be easily zipped, imported, and exported, e.g. to GitHub.
+- The `folder` always contains at least one `Model Object`, declared in `json` format.
+- The `folder` and `file` do not have to share the same name.
+- The model `id` is made up of `folder_name/filename` and is thus always unique.
-- Wireframes here
+```sh
+/janroot
+  /models
+    azure-openai/              # Folder name
+      azure-openai-gpt3-5.json # File name
-_Users can override run settings at runtime_
+
+    llama2-70b/
+      model.json
+      .gguf
+```
-- See Assistant Spec and Thread
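+Since the folder is standalone, exporting a model can be as simple as zipping its folder (a sketch; paths assume the default `/janroot` layout above):
+
+```sh
+# Archive the llama2-70b model folder for sharing or backup
+cd /janroot/models
+zip -r llama2-70b.zip llama2-70b/
+```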
+## Model Object
-## Jan Model Object
+Models in Jan are represented as `json` objects, and are colloquially known as `model.jsons`.
-- A `Jan Model Object` is a “representation" of a model
-- Objects are defined by `model-name.json` files in `json` format
-- Objects are identified by `folder-name/model-name`, where its `id` is indicative of its file location.
-- Objects are designed to be compatible with `OpenAI Model Objects`, with additional properties needed to run on our infrastructure.
-- ALL object properties are optional, i.e. users should be able to run a model declared by an empty `json` file.
+Jan's models follow a `model-name.json` naming convention.
+
+Jan's `model.json` aims for rough equivalence with [OpenAI's Model Object](https://platform.openai.com/docs/api-reference/models/object), and adds additional properties to support local models.
+
+All of Jan's `model.json` properties are optional, i.e. users should be able to run a model declared by an empty `json` file.
+
+```json
+// ./models/zephyr/zephyr-7b-beta-Q4_K_M.json
+{
+  "source_url": "https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf",
+  "parameters": {
+    "init": {
+      "ctx_len": "2048",
+      "ngl": "100",
+      "embedding": "true",
+      "n_parallel": "4",
+      "pre_prompt": "A chat between a curious user and an artificial intelligence",
+      "user_prompt": "USER: ",
+      "ai_prompt": "ASSISTANT: "
+    },
+    "runtime": {
+      "temperature": "0.7",
+      "token_limit": "2048",
+      "top_k": "0",
+      "top_p": "1",
+      "stream": "true"
+    }
+  },
+  "metadata": {
+    "engine": "llamacpp",
+    "quantization": "Q4_K_M",
+    "size": "7B"
+  }
+}
+```
 
 | Property | Type | Description | Validation |
 | -------- | ---- | ----------- | ---------- |
-| `source_url` | string | The model download source. It can be an external url or a local filepath. | Defaults to `pwd`. See [Source_url](#Source_url) |
 | `object` | enum: `model`, `assistant`, `thread`, `message` | Type of the Jan Object. Always `model` | Defaults to "model" |
-| `name` | string | A vanity name | Defaults to filename |
-| `description` | string | A vanity description of the model | Defaults to "" |
-| `state` | enum[`to_download` , `downloading`, `ready` , `running`] | Needs more thought | Defaults to `to_download` |
+| `source_url` | string | The model download source. It can be an external url or a local filepath. | Defaults to `pwd`. See [Source_url](#Source_url) |
 | `parameters` | map | Defines default model run parameters used by any assistant. | Defaults to `{}` |
+| `description` | string | A vanity description of the model | Defaults to "" |
 | `metadata` | map | Stores additional structured information about the model. | Defaults to `{}` |
 | `metadata.engine` | enum: `llamacpp`, `api`, `tensorrt` | The model backend used to run the model. | Defaults to "llamacpp" |
 | `metadata.quantization` | string | Supported formats only | See [Custom importers](#Custom-importers) |
 | `metadata.binaries` | array | Supported formats only. | See [Custom importers](#Custom-importers) |
+| `state` | enum[`to_download`, `downloading`, `ready`, `running`] | Needs more thought | Defaults to `to_download` |
+| `name` | string | A vanity name | Defaults to filename |
-### Source_url
+
+### Model Source
+
+There are 3 types of model sources:
+
+- Local model
+- Remote source
+- Cloud API
 
 - Users can download models from a `remote` source or reference an existing `local` model.
 - If this property is not specified in the Model Object file, then the default behavior is to look in the current directory.
-
-#### Local source_url
-
 - Users can import a local model by providing the filepath to the model
 
 ```json
@@ -70,14 +117,36 @@ _Users can override run settings at runtime_
 "source_url": "./",
 ```
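+For example, a manual local import might look like this (a sketch; the folder and file names are illustrative):
+
+```sh
+# Create a model folder and point a minimal model.json at the local binary
+mkdir -p /janroot/models/llama-v2-7b
+cp ~/Downloads/llama-v2-7b.gguf /janroot/models/llama-v2-7b/
+echo '{ "source_url": "./llama-v2-7b.gguf" }' > /janroot/models/llama-v2-7b/llama-v2-7b.json
+```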
-#### Remote source_url
-
 - Users can download a model by remote URL.
 - Supported url formats:
   - `https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/blob/main/llama-2-7b-chat.Q3_K_L.gguf`
   - `https://any-source.com/.../model-binary.bin`
-#### Custom importers
+- Using a remote API to access a model, e.g. `model-azure-openai-gpt4-turbo.json`
+- See [source](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)
+
+```json
+"source_url": "https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo",
+"parameters": {
+  "init": {
+    "API-KEY": "",
+    "DEPLOYMENT-NAME": "",
+    "api-version": "2023-05-15"
+  },
+  "runtime": {
+    "temperature": "0.7",
+    "max_tokens": "2048",
+    "presence_penalty": "0",
+    "top_p": "1",
+    "stream": "true"
+  }
+},
+"metadata": {
+  "engine": "api"
+}
+```
+
+### Model Formats
 
 Additionally, Jan supports importing popular formats. For example, if you provide a HuggingFace URL for a `TheBloke` model, Jan automatically downloads and catalogs all quantizations. Custom importers autofill properties like `metadata.quantization` and `metadata.size`.
 
 Supported URL formats with custom importers:
 - `azure_openai`: `https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo`
 - `openai`: `api.openai.com`
-### Generic Example
+
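+For instance, importing a `TheBloke` HuggingFace repo might yield one `model.json` per quantization, with `metadata.quantization` autofilled (a sketch; the folder layout is illustrative):
+
+```sh
+llama-2-7b-chat-gguf/
+  llama-2-7b-chat-gguf-Q3_K_L.json   # metadata.quantization: "Q3_K_L"
+  llama-2-7b-chat-gguf-Q4_K_M.json   # metadata.quantization: "Q4_K_M"
+```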
+### Example: Zephyr 7B
 
 - Model has 1 binary `model-zephyr-7B.json`
 - See [source](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/)
@@ -122,8 +192,9 @@ Supported URL formats with custom importers:
     "size": "7B",
   }
 ```
+
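+Once downloaded, the model can be started through the Models API described below (a sketch; reuses the `model-zephyr-7B` id from the responses in this spec):
+
+```shell
+curl -X PUT {JAN_URL}/v1/models/model-zephyr-7B/start
+```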
-### Example: multiple binaries
+### Multiple binaries
 
 - Model has multiple binaries `model-llava-1.5-ggml.json`
 - See [source](https://huggingface.co/mys/ggml_llava-v1.5-13b)
@@ -139,112 +210,24 @@
 }
 ```
-### Example: Azure API
+## Models API
-
-- Using a remote API to access model `model-azure-openai-gpt4-turbo.json`
-- See [source](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)
+### Get Model
-```json
-"source_url": "https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo",
-"parameters": {
-  "init" {
-    "API-KEY": "",
-    "DEPLOYMENT-NAME": "",
-    "api-version": "2023-05-15"
-  },
-  "runtime": {
-    "temperature": "0.7",
-    "max_tokens": "2048",
-    "presence_penalty": "0",
-    "top_p": "1",
-    "stream": "true"
-  }
-}
-"metadata": {
-  "engine": "api",
-}
-```
-
-## Filesystem
-
-- Everything needed to represent a `model` is packaged into an `Model folder`.
-- The `folder` is standalone and can be easily zipped, imported, and exported, e.g. to Github.
-- The `folder` always contains at least one `Model Object`, declared in a `json` format.
-- The `folder` and `file` do not have to share the same name
-- The model `id` is made up of `folder_name/filename` and is thus always unique.
-
-```sh
-/janroot
-  /models
-    azure-openai/ # Folder name
-      azure-openai-gpt3-5.json # File name
-
-    llama2-70b/
-      model.json
-      .gguf
-```
-
-### Default ./model folder
-- Jan ships with a default model folders containing recommended models
-- Only the Model Object `json` files are included
-- Users must later explicitly download the model binaries
-```sh
-models/
-  mistral-7b/
-    mistral-7b.json
-  hermes-7b/
-    hermes-7b.json
-```
-### Multiple quantizations
-
-- Each quantization has its own `Jan Model Object` file
-
-```sh
-llama2-7b-gguf/
-  llama2-7b-gguf-Q2.json
-  llama2-7b-gguf-Q3_K_L.json
-  .bin
-```
-### Multiple model partitions
-
-- A Model that is partitioned into several binaries use just 1 file
-
-```sh
-llava-ggml/
-  llava-ggml-Q5.json
-  .proj
-  ggml
-```
-### Your locally fine-tuned model
-
-- ??
-
-```sh
-llama-70b-finetune/
-  llama-70b-finetune-q5.json
-  .bin
-```
-## Jan API
-### Model API Object
+- OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/retrieve
+- OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/object
 
 - The `Jan Model Object` maps into the `OpenAI Model Object`.
 - Properties marked with `*` are compatible with the [OpenAI `model` object](https://platform.openai.com/docs/api-reference/models)
 - Note: The `Jan Model Object` has additional properties when retrieved via its API endpoint.
-> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/object
-### Model lifecycle
-Model has 4 states (enum)
-- `to_download`
-- `downloading`
-- `ready`
-- `running`
+#### Request
-### Get Model
-> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/retrieve
-- Example request
 
 ```shell
 curl {JAN_URL}/v1/models/{model_id}
 ```
-- Example response
+
+#### Response
+
 ```json
 {
   "id": "model-zephyr-7B",
@@ -273,14 +256,19 @@ curl {JAN_URL}/v1/models/{model_id}
   }
 }
 ```
+
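+To pull a single field out of this response, e.g. the runtime parameters, the call can be piped through `jq` (a sketch; assumes `jq` is installed locally):
+
+```shell
+curl {JAN_URL}/v1/models/model-zephyr-7B | jq '.parameters.runtime'
+```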
 ### List models
 
 Lists the currently available models, and provides basic information about each one, such as the owner and availability.
 
 > OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/list
-- Example request
+
+#### Request
+
 ```shell
 curl {JAN_URL}/v1/models
 ```
-- Example response
+
+#### Response
+
 ```json
 {
   "object": "list",
@@ -310,13 +298,18 @@ curl {JAN_URL}/v1/models
   "object": "list"
 }
 ```
+
 ### Delete Model
 
 > OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models/delete
-- Example request
+
+#### Request
+
 ```shell
 curl -X DELETE {JAN_URL}/v1/models/{model_id}
 ```
-- Example response
+
+#### Response
+
 ```json
 {
   "id": "model-zephyr-7B",
@@ -325,14 +318,19 @@ curl -X DELETE {JAN_URL}/v1/models/{model_id}
   "object": "model",
   "state": "to_download"
 }
 ```
+
 ### Start Model
 
 > Jan-only endpoint
 
 The request to start a `model`, changing its state from `ready` to `running`.
-- Example request
+
+#### Request
+
 ```shell
 curl -X PUT {JAN_URL}/v1/models/{model_id}/start
 ```
-- Example response
+
+#### Response
+
 ```json
 {
   "id": "model-zephyr-7B",
@@ -340,14 +338,19 @@ curl -X PUT {JAN_URL}/v1/models/{model_id}/start
   "object": "model",
   "state": "running"
 }
 ```
+
 ### Stop Model
 
 > Jan-only endpoint
 
 The request to stop a `model`, changing its state from `running` to `ready`.
-- Example request
+
+#### Request
+
 ```shell
 curl -X PUT {JAN_URL}/v1/models/{model_id}/stop
 ```
-- Example response
+
+#### Response
+
 ```json
 {
   "id": "model-zephyr-7B",
@@ -355,18 +358,95 @@ curl -X PUT {JAN_URL}/v1/models/{model_id}/stop
   "object": "model",
   "state": "ready"
 }
 ```
+
 ### Download Model
 
 > Jan-only endpoint
 
 The request to download a `model`, changing its state from `to_download` to `downloading`, then `ready` once it's done.
-- Example request
+
+#### Request
 ```shell
 curl -X POST {JAN_URL}/v1/models/
 ```
-- Example response
+
+#### Response
 ```json
 {
   "id": "model-zephyr-7B",
   "object": "model",
   "state": "downloading"
 }
+```
+
+## Examples
+
+### Pre-loaded Models
+
+- Jan ships with default model folders containing recommended models
+- Only the Model Object `json` files are included
+- Users must later explicitly download the model binaries
+
+```sh
+models/
+  mistral-7b/
+    mistral-7b.json
+  hermes-7b/
+    hermes-7b.json
+```
+
+### Azure OpenAI
+
+- Using a remote API to access a model, e.g. `model-azure-openai-gpt4-turbo.json`
+- See [source](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)
+
+```json
+"source_url": "https://docs-test-001.openai.azure.com/openai.azure.com/docs-test-001/gpt4-turbo",
+"parameters": {
+  "init": {
+    "API-KEY": "",
+    "DEPLOYMENT-NAME": "",
+    "api-version": "2023-05-15"
+  },
+  "runtime": {
+    "temperature": "0.7",
+    "max_tokens": "2048",
+    "presence_penalty": "0",
+    "top_p": "1",
+    "stream": "true"
+  }
+},
+"metadata": {
+  "engine": "api"
+}
+```
+
+### Multiple quantizations
+
+- Each quantization has its own `Jan Model Object` file
+
+```sh
+llama2-7b-gguf/
+  llama2-7b-gguf-Q2.json
+  llama2-7b-gguf-Q3_K_L.json
+  .bin
+```
+
+### Multiple model partitions
+
+- A model that is partitioned into several binaries uses just one `json` file
+
+```sh
+llava-ggml/
+  llava-ggml-Q5.json
+  .proj
+  ggml
+```
+
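+In the `model.json`, the partitions could be listed via the `metadata.binaries` array from the properties table above (a sketch; the exact shape is not pinned down by this spec and the filenames are illustrative):
+
+```json
+"metadata": {
+  "binaries": ["ggml-model-q5_k.gguf", "mmproj-model-f16.gguf"]
+}
+```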
+### Your locally fine-tuned model
+
+- ??
+
+```sh
+llama-70b-finetune/
+  llama-70b-finetune-q5.json
+  .bin
+```
\ No newline at end of file
diff --git a/docs/docs/specs/prompts.md b/docs/docs/specs/prompts.md
new file mode 100644
index 000000000..2ec008d8a
--- /dev/null
+++ b/docs/docs/specs/prompts.md
@@ -0,0 +1,7 @@
+---
+title: Prompts
+---
+
+- [ ] /prompts folder
+- [ ] How to add prompts
+- [ ] Assistants can have suggested prompts
\ No newline at end of file
diff --git a/docs/docs/specs/settings.md b/docs/docs/specs/settings.md
index 2f5cfc061..a8cf8809c 100644
--- a/docs/docs/specs/settings.md
+++ b/docs/docs/specs/settings.md
@@ -1,3 +1,5 @@
 ---
 title: Settings
----
\ No newline at end of file
+---
+
+- [ ] .jan folder in jan root
\ No newline at end of file
diff --git a/docs/sidebars.js b/docs/sidebars.js
index 973d8ccb2..fe04d330a 100644
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -68,6 +68,7 @@ const sidebars = {
         "specs/jan",
         "specs/fine-tuning",
         "specs/settings",
+        "specs/prompts",
       ],
     },
   ],