diff --git a/docs/docs/specs/models.md b/docs/docs/specs/models.md index 31a79aede..3d2c39b89 100644 --- a/docs/docs/specs/models.md +++ b/docs/docs/specs/models.md @@ -2,6 +2,8 @@ title: Models --- +import ApiSchema from '@theme/ApiSchema'; + :::warning Draft Specification: functionality has not been implemented yet. @@ -14,7 +16,7 @@ Feedback: [HackMD: Models Spec](https://hackmd.io/ulO3uB1AQCqLa5SAAMFOQw) Jan's Model API aims to be as similar as possible to [OpenAI's Models API](https://platform.openai.com/docs/api-reference/models), with additional methods for managing and running models locally. -### User Objectives +### Objectives - Users can start/stop models and use them in a thread (or via Chat Completions API) - Users can download, import and delete models @@ -63,6 +65,8 @@ Jan's `model.json` aims for rough equivalence with [OpenAI's Model Object](https Jan's `model.json` object properties are optional, i.e. users should be able to run a model declared by an empty `json` file. +; + ```json // ./models/zephr/zephyr-7b-beta-Q4_K_M.json { diff --git a/docs/openapi/jan.yaml b/docs/openapi/jan.yaml index 4a56e4df9..34660b5a6 100644 --- a/docs/openapi/jan.yaml +++ b/docs/openapi/jan.yaml @@ -7314,29 +7314,126 @@ components: - model - input - voice - Model: title: Model - description: Describes an OpenAI model offering that can be used with the API. + description: Describes a Jan model properties: - id: + type: + type: string + enum: [model, assistant, thread, message] # This should be specified + default: model + version: + type: integer + description: The version of the Model Object file + default: 1 + source_url: + type: string + format: uri + default: https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/blob/main/zephyr-7b-beta.Q4_K_M.gguf + description: The model download source. It can be an external URL or a local file path. + id: # OpenAI-equivalent type: string description: The model identifier, which can be referenced in the API endpoints.
+ default: zephyr-7b + description: + type: string + default: A cool model from Huggingface + owned_by: # OpenAI-equivalent + type: string + description: The organization that owns the model (you!) + default: you # TODO created: type: integer - description: The Unix timestamp (in seconds) when the model was created. - object: + description: The Unix timestamp (in seconds) for when the model was created + state: type: string - description: The object type, which is always "model". - enum: [model] - owned_by: - type: string - description: The organization that owns the model. + enum: [to_download, downloading, ready, running] + default: to_download + parameters: + type: object + description: Model load (init) and runtime parameters + properties: + init: + type: object + properties: + ctx_len: + type: integer + description: TODO + default: 2048 + ngl: + type: integer + description: TODO + default: 100 + embedding: + type: boolean + description: TODO + default: true + n_parallel: + type: integer + description: TODO + default: 4 + pre_prompt: + type: string + description: TODO + default: A chat between a curious user and an artificial intelligence + user_prompt: + type: string + description: TODO + default: "USER:" + ai_prompt: + type: string + description: TODO + default: "ASSISTANT:" + runtime: + type: object + properties: + temperature: + type: number + description: TODO + default: 0.7 + token_limit: + type: integer + description: TODO + default: 2048 + top_k: + type: integer + description: TODO + default: 0 + top_p: + type: number + description: TODO + default: 1 + stream: + type: boolean + description: TODO + default: true + default: {} + metadata: + type: object + properties: + engine: + type: string + enum: [llamacpp, api, tensorrt] + default: llamacpp + quantization: + type: string + description: TODO + binaries: + type: array + description: TODO + default: [] required: - - id - object - - created - - owned_by + - source_url + - parameters + - description + - metadata + - state + - name + - id # From OpenAI + - 
object # From OpenAI + - created # From OpenAI + - owned_by # From OpenAI x-oaiMeta: name: The model object example: *retrieve_model_response
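
The spec states that every `model.json` property is optional, so an empty file should still yield a runnable model by falling back to the schema defaults. A minimal sketch of that merge logic, assuming a hypothetical `load_model_config` helper (the field names and default values below are taken from the draft schema in this diff, not from a shipped Jan API; the merge here is shallow, so a user-supplied `parameters` object would replace the default one wholesale):

```python
import json

# Defaults drawn from the proposed Model schema in jan.yaml (draft values,
# subject to change as the spec evolves).
MODEL_DEFAULTS = {
    "type": "model",
    "version": 1,
    "state": "to_download",
    "parameters": {
        "init": {"ctx_len": 2048, "ngl": 100, "embedding": True, "n_parallel": 4},
        "runtime": {
            "temperature": 0.7,
            "token_limit": 2048,
            "top_k": 0,
            "top_p": 1,
            "stream": True,
        },
    },
    "metadata": {"engine": "llamacpp"},
}


def load_model_config(raw: str) -> dict:
    """Merge a (possibly empty) model.json over the schema defaults.

    Shallow merge: top-level keys in the user's file win as a whole.
    """
    user = json.loads(raw) if raw.strip() else {}
    return {**MODEL_DEFAULTS, **user}


# An empty model.json still produces a complete, runnable configuration:
cfg = load_model_config("")
print(cfg["parameters"]["init"]["ctx_len"])  # falls back to the default 2048
```

A user file that sets only `id` and `state` would override just those keys while inheriting `parameters` and `metadata` from the defaults.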