chore: Refactor spec outline
# Model Specs

Models are AI models, such as Llama and Mistral.

> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models

## User Stories

*Users can download from model registries or reuse downloaded model binaries with a model*

*Users can use some default assistants*

- User can use existing models (openai, llama2-7b-Q3) right away
- User can browse models in the model catalog
- If a user airdrops a model and drags and drops it into Jan (bin + json file), Jan can pick it up and use it

*Users can create a model from scratch*

- User can choose a model from a remote model registry, or even their locally fine-tuned model, even multiple model binaries
- User can import and use the model easily in Jan

*Users can create a custom model from an existing model*

## Jan Model Object

> Equivalent to: https://platform.openai.com/docs/api-reference/models/object

| Property | Type | Description | Validation |
| -------- | ---- | ----------- | ---------- |
| `origin` | string | Unique identifier for the source of the model object. | Required |
| `import_format` | enum: `default`, `thebloke`, `janhq`, `openai` | Specifies the format for importing the object. | Defaults to `default` |
| `download_url` | string | URL for downloading the model. | Optional; defaults to the model with recommended hardware |
| `id` | string | Identifier of the model file. Used mainly for API responses. | Optional; auto-generated if not specified |
| `object` | enum: `model`, `assistant`, `thread`, `message` | Type of the Jan Object. | Defaults to `model` |
| `created` | integer | Unix timestamp of the model's creation time. | Optional |
| `owned_by` | string | Identifier of the owner of the model. | Optional |
| `parameters` | object | Defines initialization and runtime parameters for the model. | Optional; specific sub-properties for `init` and `runtime` |
| `parameters.init` | object | Defines initialization parameters for the model. | Required |
| `parameters.runtime` | object | Defines runtime parameters for the model. | Optional; can be overridden by `Assistant` |
| `metadata` | map | Stores additional structured information about the model. | Optional; defaults to `{}` |
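The defaulting rules in the table above can be sketched as a small normalizer. This is a hypothetical helper to illustrate the spec, not Jan's actual code; the function name and id-generation scheme are assumptions:

```python
import time
import uuid

def normalize_model(obj: dict) -> dict:
    """Apply the defaults from the Jan Model Object table (hypothetical helper)."""
    if "origin" not in obj:
        raise ValueError("`origin` is required")
    model = dict(obj)
    model.setdefault("import_format", "default")  # enum: default, thebloke, janhq, openai
    model.setdefault("object", "model")           # enum: model, assistant, thread, message
    # id is auto-generated if not specified (generation scheme is an assumption)
    model.setdefault("id", f"model-{uuid.uuid4().hex[:8]}")
    model.setdefault("created", int(time.time())) # Unix timestamp
    model.setdefault("metadata", {})              # defaults to {}
    return model
```

For example, `normalize_model({"origin": "TheBloke/zephyr-7B-beta-GGUF"})` yields a complete object with `import_format: "default"` and an auto-generated id.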

### LOCAL MODEL - 1 binary `model-zephyr-7B.json`

> [Reference](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/)

```json
# Required
"created": 1686935002,  # Unix timestamp
"owned_by": "TheBloke"

parameters: {
"init": {
"ctx_len": 2048,
}
```

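Note that the spec's annotated examples use `#` comments inside ```json fences, which strict JSON parsers reject. A loader would need to strip the annotations first; the sketch below is a naive, hypothetical helper (it assumes no `#` characters inside string values), not Jan's actual importer:

```python
import json

def load_annotated_json(text: str) -> dict:
    """Parse the spec's annotated JSON by dropping `#` comments first.
    Naive hypothetical helper: assumes no `#` inside string values."""
    lines = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # full-line comment, e.g. "# Required"
        lines.append(line.split(" #")[0])  # strip trailing annotation
    return json.loads("\n".join(lines))
```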
### LOCAL MODEL - multiple binaries `model-llava-v1.5-ggml.json`

> [Reference](https://huggingface.co/mys/ggml_llava-v1.5-13b)

```json
# Required

"origin": "mys/ggml_llava-v1.5-13b"

# Optional - by default use `default`
"created": 1686935002,
"owned_by": "TheBloke"

parameters: {
"init": {
"ctx_len": 2048,
}
```

### REMOTE MODEL `model-azure-openai-gpt4-turbo.json`

> [Reference](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)

```json
# Required
# This is `api.openai.com` if it's the OpenAI platform

# Optional - by default use `default`
"import_format": "azure_openai"
# default   # downloads the whole thing
# thebloke  # custom importer (detects from URL)
# janhq     # custom importers
"created": 1686935002,
"owned_by": "OpenAI Azure"

parameters: {
"init": {
"API-KEY": "",
}
```

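The table above draws a precedence line between the two parameter groups: `parameters.init` is fixed when the model is configured, while `parameters.runtime` can be overridden by the `Assistant`. A minimal merge sketch of that rule (hypothetical helper names, assuming plain dicts):

```python
from typing import Optional

def effective_runtime(model: dict, assistant_overrides: Optional[dict] = None) -> dict:
    """Runtime params come from the model and may be overridden by the Assistant;
    init params are deliberately left untouched. Hypothetical helper."""
    runtime = dict(model.get("parameters", {}).get("runtime", {}))
    runtime.update(assistant_overrides or {})
    return runtime
```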
## Filesystem

How `models` map onto your local filesystem:

```shell
.bin
```

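The API below identifies each model by its json file name, so mapping a file in this tree to its `{model_id}` is just the file stem. A small sketch (assuming the files sit under a models directory):

```python
from pathlib import Path

def model_id_from_file(path: str) -> str:
    """Derive {model_id} from the json file name, per the API comments:
    models/model-zephyr-7B.json -> "model-zephyr-7B"."""
    return Path(path).stem
```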
## Jan API

### Jan Model API

> Equivalent to: https://platform.openai.com/docs/api-reference/models

```sh
# List models
GET https://localhost:1337/v1/models?state=[enum](all,running,downloaded,downloading)
[
  {
    "id": "model-azure-openai-gpt4-turbo",  # Autofilled by Jan with required URL above
    "object": "model",
    "created": 1686935002,
    "owned_by": "OpenAI Azure",
    "state": enum[all,running,downloaded,downloading]
  },
  {
    "id": "model-llava-v1.5-ggml",  # Autofilled by Jan with required URL above
    "object": "model",
    "created": 1686935002,
    "owned_by": "mys",
    "state": enum[all,running,downloaded,downloading]
  }
]

# Get model object
GET https://localhost:1337/v1/models/{model_id}  # json file name as {model_id}, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
{
  "id": "model-azure-openai-gpt4-turbo",
  "object": "model",
  "created": 1686935002,
  "owned_by": "OpenAI Azure",
  "state": enum[all,running,downloaded,downloading]
}

# Delete model
DELETE https://localhost:1337/v1/models/{model_id}

# Stop model
PUT https://localhost:1337/v1/models/{model_id}/stop

# Start model
PUT https://localhost:1337/v1/models/{model_id}/start
{
  "id": [string]  # The model name to be used in `chat_completion` = model_id
  "model_parameters": [jsonPayload],
  "engine": [enum](llamacpp,openai)
}
```
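A minimal client sketch for the endpoints above, using only the URL shapes shown in the spec. The base URL comes from the examples; the helper names and the use of `urllib` are assumptions, not Jan's actual client:

```python
import json
import urllib.request

BASE_URL = "https://localhost:1337/v1"  # local Jan server, from the spec examples

def model_url(model_id: str, action: str = "") -> str:
    """Build /models/{model_id} and /models/{model_id}/{start|stop} URLs."""
    url = f"{BASE_URL}/models/{model_id}"
    return f"{url}/{action}" if action else url

def start_model(model_id: str) -> dict:
    """PUT /models/{model_id}/start; response carries id, model_parameters, engine."""
    req = urllib.request.Request(model_url(model_id, "start"), method="PUT")
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `model_url("model-zephyr-7B", "stop")` produces the Stop model URL shown above.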