---
title: "Models"
---

Models are AI models such as Llama and Mistral.

# Model Specs

> OpenAI Equivalent: https://platform.openai.com/docs/api-reference/models

## User Stories

*Users can download models from model registries or reuse downloaded model binaries.*

*Users can use some default assistants.*

- Users can use existing models (openai, llama2-7b-Q3) right away.
- Users can browse models in the model catalog.
- If a user AirDrops a model (bin + json file) and drags and drops it into Jan, Jan can pick it up and use it.

*Users can create a model from scratch.*

- Users can choose a model from a remote model registry, or even their own locally fine-tuned model, even one with multiple model binaries.
- Users can import and use the model easily in Jan.

*Users can create a custom model from an existing model.*
## Jan Model Object

> Equivalent to: https://platform.openai.com/docs/api-reference/models/object
| Property | Type | Description | Validation |
| -------- | ---- | ----------- | ---------- |
| `origin` | string | Unique identifier for the source of the model object. | Required |
| `import_format` | enum: `default`, `thebloke`, `janhq`, `openai` | Specifies the format for importing the object. | Defaults to `default` |
| `download_url` | string | URL for downloading the model. | Optional; defaults to the model with recommended hardware |
| `id` | string | Identifier of the model file. Used mainly for API responses. | Optional; auto-generated if not specified |
| `object` | enum: `model`, `assistant`, `thread`, `message` | Type of the Jan Object. | Defaults to `model` |
| `created` | integer | Unix timestamp of the model's creation time. | Optional |
| `owned_by` | string | Identifier of the owner of the model. | Optional |
| `parameters` | object | Defines initialization and runtime parameters for the model. | Optional; specific sub-properties for `init` and `runtime` |
| -- `init` | object | Defines initialization parameters for the model. | Required |
| -- `runtime` | object | Defines runtime parameters for the model. | Optional; can be overridden by `Assistant` |
| `metadata` | map | Stores additional structured information about the model. | Optional; defaults to `{}` |

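The defaults in the table above can be exercised with a short sketch. This is illustrative Python, not part of the spec; the helper `make_model_object` is a hypothetical name, and only the field names and defaults come from the table.

```python
# Sketch: building a Jan Model Object dict with the table's defaults.
# Only `origin` is required; other fields fall back to documented defaults.

def make_model_object(origin, **overrides):
    """Return a model object dict with the spec table's defaults applied."""
    if not origin:
        raise ValueError("`origin` is required")
    obj = {
        "origin": origin,
        "import_format": "default",  # enum: default, thebloke, janhq, openai
        "object": "model",           # enum: model, assistant, thread, message
        "metadata": {},              # defaults to {}
    }
    obj.update(overrides)
    return obj

model = make_model_object(
    "TheBloke/zephyr-7B-beta-GGUF",
    created=1686935002,
    owned_by="TheBloke",
)
print(model["import_format"])  # default
```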
### LOCAL MODEL - 1 binary `model-zephyr-7B.json`

> [Reference](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/)

```json
# Required
# ...
"created": 1686935002, # Unix timestamp
"owned_by": "TheBloke"

# Optional: params
"parameters": {
  "init": {
    "ctx_len": 2048,
    # ...
  }
}
```
### LOCAL MODEL - multiple binaries `model-llava-v1.5-ggml.json`

> [Reference](https://huggingface.co/mys/ggml_llava-v1.5-13b)

```json
# Required

"origin": "mys/ggml_llava-v1.5-13b"

# Optional - by default uses `default`
# ...
"created": 1686935002,
"owned_by": "mys"

# Optional: params
"parameters": {
  "init": {
    "ctx_len": 2048,
    # ...
  }
}
```
### REMOTE MODEL `model-azure-openai-gpt4-turbo.json`

> [Reference](https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api)

```json
# Required
# ...
# This is `api.openai.com` if it's the OpenAI platform

# Optional - by default uses `default`
"import_format": "azure_openai"
# default  # downloads the whole thing
# thebloke # custom importer (detects from URL)
# janhq    # custom importers
# ...
"created": 1686935002,
"owned_by": "OpenAI Azure"

# Optional: params
# These are configured on the model and cannot be changed by the assistant
"parameters": {
  "init": {
    "API-KEY": "",
    # ...
  }
}
```
## Model Filesystem

How `models` map onto your local filesystem:

```shell
# ...
.bin
```

Test cases:

1. If a user AirDrops a model (bin + json file) and drags and drops it into Jan, Jan can pick it up and use it.
2. If a user has a fine-tuned model, same as case 1.
3. If a user has one model that needs multiple binaries.

## Jan API

### Jan Model API

> Equivalent to: https://platform.openai.com/docs/api-reference/models

```sh
# List models
GET https://localhost:1337/v1/models?state=[enum](all,running,downloaded,downloading)
[
  {
    "id": "model-azure-openai-gpt4-turbo", # Autofilled by Jan with required URL above
    "object": "model",
    "created": 1686935002,
    "owned_by": "OpenAI Azure",
    "state": enum[all,running,downloaded,downloading]
  },
  {
    "id": "model-llava-v1.5-ggml", # Autofilled by Jan with required URL above
    "object": "model",
    "created": 1686935002,
    "owned_by": "mys",
    "state": enum[all,running,downloaded,downloading]
  }
]

# Get model object
GET https://localhost:1337/v1/models/{model_id} # json file name as {model_id}, e.g. model-azure-openai-gpt4-turbo, model-zephyr-7B
{
  "id": "model-azure-openai-gpt4-turbo",
  "object": "model",
  "created": 1686935002,
  "owned_by": "OpenAI Azure",
  "state": enum[all,running,downloaded,downloading]
}

# Delete model
DELETE https://localhost:1337/v1/models/{model_id}

# Stop model
PUT https://localhost:1337/v1/models/{model_id}/stop

# Start model
PUT https://localhost:1337/v1/models/{model_id}/start
{
  "id": [string], # The model name to be used in `chat_completion` = model_id
  "model_parameters": [jsonPayload],
  "engine": [enum](llamacpp,openai)
}
```
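The routes above can be composed client-side. A minimal sketch, assuming Python; the helper names are illustrative and only the base URL, `state` enum, and route shapes come from the spec:

```python
# Sketch: build URLs for the Jan Model API routes listed above.
BASE = "https://localhost:1337/v1"

def list_models_url(state="all"):
    """URL for GET /models; state is one of the spec's enum values."""
    assert state in ("all", "running", "downloaded", "downloading")
    return f"{BASE}/models?state={state}"

def model_url(model_id, action=None):
    """URL for GET/DELETE /models/{model_id}, or its start/stop actions.

    model_id is the json file name, e.g. "model-zephyr-7B".
    """
    url = f"{BASE}/models/{model_id}"
    return f"{url}/{action}" if action in ("start", "stop") else url

print(list_models_url("running"))
# https://localhost:1337/v1/models?state=running
print(model_url("model-zephyr-7B", "start"))
# https://localhost:1337/v1/models/model-zephyr-7B/start
```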