diff --git a/docs/docs/docs/specs/models.md b/docs/docs/docs/specs/models.md
index 8d95043c2..4b53b279f 100644
--- a/docs/docs/docs/specs/models.md
+++ b/docs/docs/docs/specs/models.md
@@ -1,15 +1,44 @@
----
-title: "Models"
----
-
-Models are AI models like Llama and Mistral
+# Model Specs
 
 > OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models
 
-## Model Object
+## User Stories
+
+*Users can download from model registries, or reuse already-downloaded model binaries, via a model object*
+
+*Users can use some default assistants*
+- Users can use existing models (openai, llama2-7b-Q3) right away
+- Users can browse models in the model catalog
+- If a user AirDrops a model (bin + json file) and drags and drops it into Jan, Jan can pick it up and use it
+
+*Users can create a model from scratch*
+- Users can choose a model from a remote model registry, or their own locally fine-tuned model, even one with multiple model binaries
+- Users can import and use the model easily in Jan
+
+*Users can create a custom model from an existing model*
+
+## Jan Model Object
 
 > Equivalent to: https://platform.openai.com/docs/api-reference/models/object
 
-- LOCAL MODEL - 1 binary `model-zephyr-7B.json` - [Reference](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/)
+
+| Property | Type | Description | Validation |
+| -------- | -------- | -------- | -------- |
+| `origin` | string | Unique identifier for the source of the model object. | Required |
+| `import_format` | enum: `default`, `thebloke`, `janhq`, `openai` | Specifies the format for importing the object. | Defaults to `default` |
+| `download_url` | string | URL for downloading the model. | Optional; defaults to the model recommended for the user's hardware |
+| `id` | string | Identifier of the model file. Used mainly for API responses. | Optional; auto-generated if not specified |
+| `object` | enum: `model`, `assistant`, `thread`, `message` | Type of the Jan object. | Defaults to `model` |
+| `created` | integer | Unix timestamp of the model's creation time. | Optional |
+| `owned_by` | string | Identifier of the owner of the model. | Optional |
+| `parameters` | object | Defines initialization and runtime parameters for the model. | Optional; specific sub-properties for `init` and `runtime` |
+| -- `init` | object | Defines initialization parameters for the model. | Required |
+| -- `runtime` | object | Defines runtime parameters for the model. | Optional; can be overridden by `Assistant` |
+| `metadata` | map | Stores additional structured information about the model. | Optional; defaults to `{}` |
+
+### LOCAL MODEL - 1 binary `model-zephyr-7B.json`
+> [Reference](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/)
 
 ```json
 # Required
@@ -32,7 +61,6 @@ Models are AI models like Llama and Mistral
 "created": 1686935002, # Unix timestamp
 "owned_by": "TheBloke"
 
-# Optional: params
 parameters: {
   "init": {
     "ctx_len": 2048,
@@ -57,11 +85,11 @@ parameters: {
 }
 ```
 
-- LOCAL MODEL - multiple binaries `model-llava-v1.5-ggml.json` [Reference](https://huggingface.co/mys/ggml_llava-v1.5-13b)
+### LOCAL MODEL - multiple binaries `model-llava-v1.5-ggml.json`
+> [Reference](https://huggingface.co/mys/ggml_llava-v1.5-13b)
 
 ```json
 # Required
-
 "origin": "mys/ggml_llava-v1.5-13b"
 
 # Optional - by default use `default``
@@ -76,7 +104,6 @@ parameters: {
 "created": 1686935002,
 "owned_by": "TheBloke"
 
-# Optional: params
 parameters: {
   "init": {
     "ctx_len": 2048,
@@ -101,7 +128,8 @@ parameters: {
 }
 ```
 
-- REMOTE MODEL `model-azure-openai-gpt4-turbo.json` - [Reference](https://learn.microsoft.com/en-us/azure/ai-services/openai/)quickstart?tabs=command-line%2Cpython&pivots=rest-api
+### REMOTE MODEL `model-azure-openai-gpt4-turbo.json`
+> [Reference](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)
 
 ```json
 # Required
@@ -109,7 +137,7 @@ parameters: {
 # This is `api.openai.com` if it's OpenAI platform
 
 # Optional - by default use `default``
-import_format: azure_openai
+"import_format": "azure_openai"
 # default # downloads the whole thing
 # thebloke # custom importer (detects from URL)
 # janhq # Custom importers
@@ -122,9 +150,6 @@ parameters: {
 "created": 1686935002,
 "owned_by": "OpenAI Azure"
 
-# Optional: params
-# This is the one model gets configured and cannot be changed by assistant
-
 parameters: {
   "init": {
     "API-KEY": "",
@@ -146,37 +171,7 @@ parameters: {
 }
 ```
 
-## Model API
-See [/model](/api/model)
-
-- Equivalent to: https://platform.openai.com/docs/api-reference/models
-
-```sh
-# List models
-GET https://localhost:1337/v1/models?filter=[enum](all,running,downloaded,downloading)
-List[model_object]
-
-# Get model object
-GET https://localhost:1337/v1/models/{model_id} # json file name as {model_id} model-azure-openai-gpt4-turbo, model-zephyr-7B
-model_object
-
-# Delete model
-DELETE https://localhost:1337/v1/models/{model_id} # json file name as {model_id} model-azure-openai-gpt4-turbo, model-zephyr-7B
-
-# Stop model
-PUT https://localhost:1337/v1/models/{model_id}/stop # json file name as {model_id} model-azure-openai-gpt4-turbo, model-zephyr-7B
-
-# Start model
-PUT https://localhost:1337/v1/models/{model_id}/start # json file name as {model_id} model-azure-openai-gpt4-turbo, model-zephyr-7B
-{
-  "id": [string] # The model name to be used in `chat_completion` = model_id
-  "model_parameters": [jsonPayload],
-  "engine": [enum](llamacpp,openai)
-}
-```
-
-## Model Filesystem
-
+## Filesystem
 How `models` map onto your local filesystem
 
 ```shell=
@@ -204,7 +199,51 @@ How `models` map onto your local filesystem
     .bin
 ```
 
-- Test cases
-  1. If user airdrop model, drag and drop to Jan (bin + json file), Jan can pick up and use
-  2. If user have fine tuned model, same as step 1
-  3. If user have 1 model that needs multiple binaries
\ No newline at end of file
+## Jan API
+### Jan Model API
+> Equivalent to: https://platform.openai.com/docs/api-reference/models
+
+```sh
+# List models
+GET https://localhost:1337/v1/models?state=[enum](all,running,downloaded,downloading)
+[
+  {
+    "id": "model-azure-openai-gpt4-turbo", # Autofilled by Jan from the required URL above
+    "object": "model",
+    "created": 1686935002,
+    "owned_by": "OpenAI Azure",
+    "state": [enum](all,running,downloaded,downloading)
+  },
+  {
+    "id": "model-llava-v1.5-ggml", # Autofilled by Jan from the required URL above
+    "object": "model",
+    "created": 1686935002,
+    "owned_by": "mys",
+    "state": [enum](all,running,downloaded,downloading)
+  }
+]
+
+# Get model object
+GET https://localhost:1337/v1/models/{model_id} # {model_id} is the json file name, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
+{
+  "id": "model-azure-openai-gpt4-turbo", # Autofilled by Jan from the required URL above
+  "object": "model",
+  "created": 1686935002,
+  "owned_by": "OpenAI Azure",
+  "state": [enum](all,running,downloaded,downloading)
+}
+
+# Delete model
+DELETE https://localhost:1337/v1/models/{model_id} # {model_id} is the json file name, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
+
+# Stop model
+PUT https://localhost:1337/v1/models/{model_id}/stop # {model_id} is the json file name, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
+
+# Start model
+PUT https://localhost:1337/v1/models/{model_id}/start # {model_id} is the json file name, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
+{
+  "id": [string], # The model name to be used in `chat_completion` = model_id
+  "model_parameters": [jsonPayload],
+  "engine": [enum](llamacpp,openai)
+}
+```
\ No newline at end of file