Merge branch 'dev' into release/v0.6.1
# Conflicts:
#	extensions/inference-cortex-extension/resources/default_settings.json
#	extensions/inference-cortex-extension/src/index.ts
#	extensions/model-extension/resources/default.json
#	web-app/src/containers/ThreadContent.tsx
#	web-app/src/containers/TokenSpeedIndicator.tsx
#	web-app/src/hooks/useChat.ts
#	web-app/src/lib/version.ts
#	web-app/src/routes/hub.tsx
#	web-app/src/routes/settings/hardware.tsx
README.md
@@ -14,246 +14,174 @@
<p align="center">
<a href="https://jan.ai/docs/quickstart">Getting Started</a>
- <a href="https://jan.ai/docs">Docs</a>
- <a href="https://github.com/menloresearch/jan/releases">Changelog</a>
- <a href="https://jan.ai/changelog">Changelog</a>
- <a href="https://github.com/menloresearch/jan/issues">Bug reports</a>
- <a href="https://discord.gg/AsJ8krTT3N">Discord</a>
</p>

<p align="center">
⚠️ <b>Jan is currently in Development</b>: Expect breaking changes and bugs!
</p>

Jan is a ChatGPT alternative that runs 100% offline on your device. Our goal is to make it easy for a layperson to download and run LLMs and use AI with **full control** and **privacy**.

Jan is powered by [Cortex](https://github.com/menloresearch/cortex.cpp), our embeddable local AI engine that runs on any hardware.
From PCs to multi-GPU clusters, Jan & Cortex support universal architectures:
**⚠️ Jan is in active development.**

- [x] NVIDIA GPUs (fast)
- [x] Apple M-series (fast)
- [x] Apple Intel
- [x] Linux Debian
- [x] Windows x64
## Installation

#### Features:

- [Model Library](https://jan.ai/docs/models/manage-models#add-models) with popular LLMs like Llama, Gemma, Mistral, or Qwen
- Connect to [Remote AI APIs](https://jan.ai/docs/remote-models/openai) like Groq and OpenRouter
- Local API Server with OpenAI-equivalent API
- [Extensions](https://jan.ai/docs/extensions) for customizing Jan

## Download
Because clicking a button is still the easiest way to get started:

<table>
<tr style="text-align:center">
<td style="text-align:center"><b>Version Type</b></td>
<td style="text-align:center"><b>Windows</b></td>
<td style="text-align:center"><b>MacOS Universal</b></td>
<td colspan="2" style="text-align:center"><b>Linux</b></td>
<tr>
<td><b>Platform</b></td>
<td><b>Stable</b></td>
<td><b>Beta</b></td>
<td><b>Nightly</b></td>
</tr>
<tr style="text-align:center">
<td style="text-align:center"><b>Stable (Recommended)</b></td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/latest/win-x64'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/windows.png' style="height:14px; width: 14px" />
<b>jan.exe</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/latest/mac-universal'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/mac.png' style="height:15px; width: 15px" />
<b>jan.dmg</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/latest/linux-amd64-deb'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.deb</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/latest/linux-amd64-appimage'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.AppImage</b>
</a>
</td>
<tr>
<td><b>Windows</b></td>
<td><a href='https://app.jan.ai/download/latest/win-x64'>jan.exe</a></td>
<td><a href='https://app.jan.ai/download/beta/win-x64'>jan.exe</a></td>
<td><a href='https://app.jan.ai/download/nightly/win-x64'>jan.exe</a></td>
</tr>
<tr style="text-align:center">
<td style="text-align:center"><b>Beta (Preview)</b></td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/beta/win-x64'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/windows.png' style="height:14px; width: 14px" />
<b>jan.exe</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/beta/mac-universal'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/mac.png' style="height:15px; width: 15px" />
<b>jan.dmg</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/beta/linux-amd64-deb'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.deb</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/beta/linux-amd64-appimage'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.AppImage</b>
</a>
</td>
<tr>
<td><b>macOS</b></td>
<td><a href='https://app.jan.ai/download/latest/mac-universal'>jan.dmg</a></td>
<td><a href='https://app.jan.ai/download/beta/mac-universal'>jan.dmg</a></td>
<td><a href='https://app.jan.ai/download/nightly/mac-universal'>jan.dmg</a></td>
</tr>
<tr style="text-align:center">
<td style="text-align:center"><b>Nightly Build (Experimental)</b></td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/nightly/win-x64'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/windows.png' style="height:14px; width: 14px" />
<b>jan.exe</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/nightly/mac-universal'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/mac.png' style="height:15px; width: 15px" />
<b>jan.dmg</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/nightly/linux-amd64-deb'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.deb</b>
</a>
</td>
<td style="text-align:center">
<a href='https://app.jan.ai/download/nightly/linux-amd64-appimage'>
<img src='https://github.com/menloresearch/jan/blob/dev/docs/static/img/linux.png' style="height:14px; width: 14px" />
<b>jan.AppImage</b>
</a>
</td>
<tr>
<td><b>Linux (deb)</b></td>
<td><a href='https://app.jan.ai/download/latest/linux-amd64-deb'>jan.deb</a></td>
<td><a href='https://app.jan.ai/download/beta/linux-amd64-deb'>jan.deb</a></td>
<td><a href='https://app.jan.ai/download/nightly/linux-amd64-deb'>jan.deb</a></td>
</tr>
<tr>
<td><b>Linux (AppImage)</b></td>
<td><a href='https://app.jan.ai/download/latest/linux-amd64-appimage'>jan.AppImage</a></td>
<td><a href='https://app.jan.ai/download/beta/linux-amd64-appimage'>jan.AppImage</a></td>
<td><a href='https://app.jan.ai/download/nightly/linux-amd64-appimage'>jan.AppImage</a></td>
</tr>
</table>

Download the latest version of Jan at https://jan.ai/ or visit the [GitHub Releases](https://github.com/menloresearch/jan/releases) to download any previous release.
Download from [jan.ai](https://jan.ai/) or [GitHub Releases](https://github.com/menloresearch/jan/releases).

## Demo

https://github.com/user-attachments/assets/c3592fa2-c504-4d9d-a885-7e00122a50f3
<video width="100%" controls>
<source src="./docs/public/assets/videos/enable-tool-call-for-models.mp4" type="video/mp4">
Your browser does not support the video tag.
</video>

_Real-time Video: Jan v0.5.7 on a Mac M2, 16GB Sonoma 14.2_
## Features

## Quicklinks
- **Local AI Models**: Download and run LLMs (Llama, Gemma, Qwen, etc.) from HuggingFace
- **Cloud Integration**: Connect to OpenAI, Anthropic, Mistral, Groq, and others
- **Custom Assistants**: Create specialized AI assistants for your tasks
- **OpenAI-Compatible API**: Local server at `localhost:1337` for other applications (see the sketch after this list)
- **Model Context Protocol**: MCP integration for enhanced capabilities
- **Privacy First**: Everything runs locally when you want it to

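A minimal sketch of how another application might call this local server, assuming it exposes the standard OpenAI chat-completions route; the route and the model id below are assumptions for illustration, not confirmed by this README:

```typescript
// Query Jan's local OpenAI-compatible server (assumed route: /v1/chat/completions).
async function askJan(prompt: string): Promise<string> {
  const res = await fetch('http://localhost:1337/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'llama3.2-3b-instruct', // hypothetical model id; use one you have installed
      messages: [{ role: 'user', content: prompt }],
    }),
  })
  if (!res.ok) throw new Error(`Jan server returned ${res.status}`)
  const data = await res.json()
  return data.choices[0].message.content
}

askJan('Hello, Jan!').then(console.log).catch(console.error)
```
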
### Jan
## Build from Source

- [Jan Website](https://jan.ai/)
- [Jan GitHub](https://github.com/menloresearch/jan)
- [Documentation](https://jan.ai/docs)
- [Jan Changelog](https://jan.ai/changelog)
- [Jan Blog](https://jan.ai/blog)
For those who enjoy the scenic route:

### Cortex.cpp
### Prerequisites

Jan is powered by **Cortex.cpp**. It is a C++ command-line interface (CLI) designed as an alternative to [Ollama](https://ollama.com/). By default, it runs on the llama.cpp engine, but it also supports other engines, including ONNX and TensorRT-LLM, making it a multi-engine platform.
- Node.js ≥ 20.0.0
- Yarn ≥ 1.22.0
- Make ≥ 3.81
- Rust (for Tauri)

- [Cortex Website](https://cortex.so/)
- [Cortex GitHub](https://github.com/menloresearch/cortex.cpp)
- [Documentation](https://cortex.so/docs/)
- [Models Library](https://cortex.so/models)
- API Reference: _Under development_
### Quick Start

## Requirements for running Jan
```bash
git clone https://github.com/menloresearch/jan
cd jan
make dev
```

- **MacOS**: 13 or higher
- **Windows**:
  - Windows 10 or higher
  - To enable GPU support:
    - Nvidia GPU with CUDA Toolkit 11.7 or higher
    - Nvidia driver 470.63.01 or higher
- **Linux**:
  - glibc 2.27 or higher (check with `ldd --version`)
  - gcc 11, g++ 11, cpp 11 or higher; refer to this [link](https://jan.ai/guides/troubleshooting/gpu-not-used/#specific-requirements-for-linux) for more information
  - To enable GPU support:
    - Nvidia GPU with CUDA Toolkit 11.7 or higher
    - Nvidia driver 470.63.01 or higher
This handles everything: installs dependencies, builds core components, and launches the app.

### Alternative Commands

If you prefer the verbose approach:

```bash
# Setup and development
yarn install
yarn build:core
yarn build:extensions
yarn dev

# Production build
yarn build

# Clean slate (when things inevitably break)
make clean
```

### Available Make Targets

- `make dev` - Full development setup and launch (recommended)
- `make dev-tauri` - Tauri development (deprecated, use `make dev`)
- `make build` - Production build
- `make install-and-build` - Install dependencies and build core/extensions
- `make test` - Run tests and linting
- `make lint` - Check your code doesn't offend the linters
- `make clean` - Nuclear option: delete everything and start fresh

## System Requirements

**Minimum specs for a decent experience:**

- **macOS**: 13.6+ (8GB RAM for 3B models, 16GB for 7B, 32GB for 13B)
- **Windows**: 10+ with GPU support for NVIDIA/AMD/Intel Arc
- **Linux**: Most distributions work, GPU acceleration available

For detailed compatibility, check our [installation guides](https://jan.ai/docs/desktop/mac).

## Troubleshooting

As Jan is in development mode, you might get stuck on some common issues:
When things go sideways (they will):

- [Troubleshooting a broken build](https://jan.ai/docs/troubleshooting#broken-build)
- [Troubleshooting NVIDIA GPU](https://jan.ai/docs/troubleshooting#troubleshooting-nvidia-gpu)
- [Troubleshooting Something's Amiss](https://jan.ai/docs/troubleshooting#somethings-amiss)
1. Check our [troubleshooting docs](https://jan.ai/docs/troubleshooting)
2. Copy your error logs and system specs
3. Ask for help in our [Discord](https://discord.gg/FTk2MvZwJH) `#🆘|jan-help` channel

If you can't find what you need in our troubleshooting guide, feel free to reach out to us for extra help:

1. Copy your [error logs & device specifications](https://jan.ai/docs/troubleshooting#how-to-get-error-logs).
2. Go to our [Discord](https://discord.com/invite/FTk2MvZwJH) & send it to the **#🆘|get-help** channel for further support.

_Check the logs to ensure the information is what you intend to send. Note that we retain your logs for only 24 hours, so report any issues promptly._
We keep logs for 24 hours, so don't procrastinate on reporting issues.

## Contributing

Contributions are welcome! Please read the [CONTRIBUTING.md](CONTRIBUTING.md) file.
Contributions welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for the full spiel.

### Pre-requisites
## Links

- node >= 20.0.0
- yarn >= 1.22.0
- make >= 3.81

### Instructions

1. **Clone the repository and prepare:**

   ```bash
   git clone https://github.com/menloresearch/jan
   cd jan
   git checkout -b DESIRED_BRANCH
   ```

2. **Run development and use Jan Desktop:**

   ```bash
   make dev
   ```

   This will start the development server and open the desktop app.

### For production build

```bash
# Do steps 1 and 2 in the previous section
# Build the app
make build
```

This will build the app for macOS (M1/M2) for production (with code signing already done) and put the result in the `dist` folder.

## Acknowledgements

Jan builds on top of other open-source projects:

- [llama.cpp](https://github.com/ggml-org/llama.cpp)
- [LangChain](https://github.com/langchain-ai)
- [TensorRT](https://github.com/NVIDIA/TensorRT)
- [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM)
- [Documentation](https://jan.ai/docs) - The manual you should read
- [API Reference](https://jan.ai/api-reference) - For the technically inclined
- [Changelog](https://jan.ai/changelog) - What we broke and fixed
- [Discord](https://discord.gg/FTk2MvZwJH) - Where the community lives

## Contact

- Bugs & requests: file a GitHub ticket
- For discussion: join our Discord [here](https://discord.gg/FTk2MvZwJH)
- For business inquiries: email hello@jan.ai
- For jobs: please email hr@jan.ai
- **Bugs**: [GitHub Issues](https://github.com/menloresearch/jan/issues)
- **Business**: hello@jan.ai
- **Jobs**: hr@jan.ai
- **General Discussion**: [Discord](https://discord.gg/FTk2MvZwJH)

## Trust & Safety

Beware of scams!
**Friendly reminder**: We're not trying to scam you.

- We will never request your personal information.
- Our product is completely free; no paid version exists.
- We do not have a token or ICO.
- We are a [bootstrapped company](https://en.wikipedia.org/wiki/Bootstrapping) and don't have any external investors (_yet_). We're open to exploring opportunities with strategic partners who want to tackle [our mission](https://jan.ai/about#mission) together.
- We won't ask for personal information
- Jan is completely free (no premium version exists)
- We don't have a cryptocurrency or ICO
- We're bootstrapped and not seeking your investment (yet)

## License

Jan is free and open source, under the **Apache 2.0** license.
Apache 2.0 - Because sharing is caring.

## Acknowledgements

Built on the shoulders of giants:

- [Llama.cpp](https://github.com/ggerganov/llama.cpp)
- [Tauri](https://tauri.app/)
- [Scalar](https://github.com/scalar/scalar)

Before Width: | Height: | Size: 1.6 MiB After Width: | Height: | Size: 2.0 MiB |
BIN	docs/public/assets/videos/enable-tool-call-for-models.mp4	(new file)
@@ -3,6 +3,7 @@ import { IconType } from 'react-icons/lib'
import { FaWindows, FaApple, FaLinux } from 'react-icons/fa'
import { twMerge } from 'tailwind-merge'
import { DownloadIcon } from 'lucide-react'
import { formatFileSize } from '@/utils/format'

type Props = {
  lastRelease: any
@@ -14,6 +15,7 @@ type SystemType = {
  logo: IconType
  fileFormat: string
  href?: string
  size?: string
}

const systemsTemplate: SystemType[] = [
@@ -21,26 +23,25 @@ const systemsTemplate: SystemType[] = [
    name: 'Mac ',
    label: 'Universal',
    logo: FaApple,
    fileFormat: '{appname}-mac-universal-{tag}.dmg',
    fileFormat: 'Jan_{tag}_universal.dmg',
  },

  {
    name: 'Windows',
    label: 'Standard (64-bit)',
    logo: FaWindows,
    fileFormat: '{appname}-win-x64-{tag}.exe',
    fileFormat: 'Jan_{tag}_x64-setup.exe',
  },
  {
    name: 'Linux (AppImage)',
    label: 'AppImage',
    logo: FaLinux,
    fileFormat: '{appname}-linux-x86_64-{tag}.AppImage',
    fileFormat: 'Jan_{tag}_amd64.AppImage',
  },
  {
    name: 'Linux (deb)',
    label: 'Deb',
    logo: FaLinux,
    fileFormat: '{appname}-linux-amd64-{tag}.deb',
    fileFormat: 'Jan_{tag}_amd64.deb',
  },
]

@@ -53,40 +54,26 @@ const groupTemnplate = [
export default function CardDownload({ lastRelease }: Props) {
  const [systems, setSystems] = useState(systemsTemplate)

  const extractAppName = (fileName: string) => {
    const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|amd64|x86_64)-.*$/
    const match = fileName.match(regex)
    return match ? match[1] : null
  }

  useEffect(() => {
    const updateDownloadLinks = async () => {
      try {
        // Extract appname from the first asset name
        const firstAssetName = lastRelease.assets[0].name
        const appname = extractAppName(firstAssetName)

        if (!appname) {
          console.error(
            'Failed to extract appname from file name:',
            firstAssetName
          )

          return
        }

        // Remove 'v' at the start of the tag_name
        const tag = lastRelease.tag_name.startsWith('v')
          ? lastRelease.tag_name.substring(1)
          : lastRelease.tag_name

        const updatedSystems = systems.map((system) => {
          const downloadUrl = system.fileFormat
            .replace('{appname}', appname)
            .replace('{tag}', tag)
          const downloadUrl = system.fileFormat.replace('{tag}', tag)

          // Find the corresponding asset to get the file size
          const asset = lastRelease.assets.find(
            (asset: any) => asset.name === downloadUrl
          )

          return {
            ...system,
            href: `https://github.com/menloresearch/jan/releases/download/${lastRelease.tag_name}/${downloadUrl}`,
            size: asset ? formatFileSize(asset.size) : undefined,
          }
        })

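// Illustration (not part of the diffed component): what the '{tag}' substitution
// above produces for a hypothetical release tag.
//   fileFormat 'Jan_{tag}_universal.dmg' + tag_name 'v0.6.1'
//   -> tag '0.6.1' -> downloadUrl 'Jan_0.6.1_universal.dmg'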
@@ -118,6 +105,11 @@ export default function CardDownload({ lastRelease }: Props) {
            >
              <span>{system.label}</span>
              <DownloadIcon size={16} />
              {system.size && (
                <div className="text-sm text-black/60 dark:text-white/60">
                  {system.size}
                </div>
              )}
            </a>
          </div>
        ))}

@@ -4,6 +4,7 @@ import { IconType } from 'react-icons/lib'
import { IoChevronDownOutline } from 'react-icons/io5'
import { useClickOutside } from '@/hooks/useClickOutside'
import { twMerge } from 'tailwind-merge'
import { formatFileSize } from '@/utils/format'

type Props = {
  lastRelease: any
@@ -14,6 +15,7 @@ type SystemType = {
  logo: IconType
  fileFormat: string
  href?: string
  size?: string
}

type GpuInfo = {
@@ -26,31 +28,25 @@ const systemsTemplate: SystemType[] = [
  {
    name: 'Download for Mac',
    logo: FaApple,
    fileFormat: '{appname}-mac-universal-{tag}.dmg',
    fileFormat: 'Jan_{tag}_universal.dmg',
  },
  {
    name: 'Download for Windows',
    logo: FaWindows,
    fileFormat: '{appname}-win-x64-{tag}.exe',
    fileFormat: 'Jan_{tag}_x64-setup.exe',
  },
  {
    name: 'Download for Linux (AppImage)',
    logo: FaLinux,
    fileFormat: '{appname}-linux-x86_64-{tag}.AppImage',
    fileFormat: 'Jan_{tag}_amd64.AppImage',
  },
  {
    name: 'Download for Linux (deb)',
    logo: FaLinux,
    fileFormat: '{appname}-linux-amd64-{tag}.deb',
    fileFormat: 'Jan_{tag}_amd64.deb',
  },
]

const extractAppName = (fileName: string) => {
  const regex = /^(.*?)-(?:mac|win|linux)-(?:arm64|x64|x86_64|amd64)-.*$/
  const match = fileName.match(regex)
  return match ? match[1] : null
}

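// Illustration (not part of the diffed component): what extractAppName recovers
// from the legacy asset naming, using hypothetical file names.
//   extractAppName('jan-win-x64-0.5.7.exe')    // -> 'jan'
//   extractAppName('Jan_0.6.1_x64-setup.exe')  // -> null (the new naming no longer matches)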
const DropdownDownload = ({ lastRelease }: Props) => {
  const [systems, setSystems] = useState(systemsTemplate)
  const [defaultSystem, setDefaultSystem] = useState(systems[0])
@@ -129,27 +125,22 @@ const DropdownDownload = ({ lastRelease }: Props) => {
  useEffect(() => {
    const updateDownloadLinks = async () => {
      try {
        const firstAssetName = await lastRelease.assets[0]?.name
        const appname = extractAppName(firstAssetName)
        if (!appname) {
          console.error(
            'Failed to extract appname from file name:',
            firstAssetName
          )
          changeDefaultSystem(systems)
          return
        }
        const tag = lastRelease.tag_name.startsWith('v')
          ? lastRelease.tag_name.substring(1)
          : lastRelease.tag_name

        const updatedSystems = systems.map((system) => {
          const downloadUrl = system.fileFormat
            .replace('{appname}', appname)
            .replace('{tag}', tag)
          const downloadUrl = system.fileFormat.replace('{tag}', tag)

          // Find the corresponding asset to get the file size
          const asset = lastRelease.assets.find(
            (asset: any) => asset.name === downloadUrl
          )

          return {
            ...system,
            href: `https://github.com/menloresearch/jan/releases/download/${lastRelease.tag_name}/${downloadUrl}`,
            size: asset ? formatFileSize(asset.size) : undefined,
          }
        })
        setSystems(updatedSystems)
@@ -176,10 +167,15 @@ const DropdownDownload = ({ lastRelease }: Props) => {
      <div className="inline-flex flex-shrink-0 justify-center relative">
        <a
          href={defaultSystem.href}
          className="dark:border-r-0 dark:nx-bg-neutral-900 dark:text-white bg-black text-white hover:text-white justify-center dark:border dark:border-neutral-800 flex-shrink-0 pl-4 pr-6 py-4 rounded-l-xl inline-flex items-center !rounded-r-none"
          className="min-w-[300px] dark:border-r-0 dark:nx-bg-neutral-900 dark:text-white bg-black text-white hover:text-white dark:border dark:border-neutral-800 flex-shrink-0 pl-4 pr-6 py-4 rounded-l-xl inline-flex items-center !rounded-r-none"
        >
          <defaultSystem.logo className="h-4 mr-2" />
          {defaultSystem.name}
          <span>{defaultSystem.name}</span>
          {defaultSystem.size && (
            <span className="text-white/60 text-sm ml-2">
              ({defaultSystem.size})
            </span>
          )}
        </a>
        <button
          className="dark:nx-bg-neutral-900 dark:text-white bg-black text-white hover:text-white justify-center dark:border border-l border-gray-500 dark:border-neutral-800 flex-shrink-0 p-4 px-3 rounded-r-xl"
@@ -192,18 +188,27 @@ const DropdownDownload = ({ lastRelease }: Props) => {
        </button>
        {open && (
          <div
            className="absolute left-0 top-[64px] w-full dark:nx-bg-neutral-900 bg-black z-30 rounded-xl lg:w-[300px]"
            className="absolute left-0 top-[64px] w-full dark:nx-bg-neutral-900 bg-black z-30 rounded-xl lg:w-[380px]"
            ref={setRefDropdownContent}
          >
            {systems.map((system) => (
              <div key={system.name} className="py-1">
                <a
                  href={system.href || ''}
                  className="flex px-4 py-3 items-center text-white hover:text-white hover:bg-white/10 dark:hover:bg-white/5"
                  className="flex px-4 py-3 items-center text-white hover:text-white hover:bg-white/10 dark:hover:bg-white/5 justify-between"
                  onClick={() => setOpen(false)}
                >
                  <system.logo className="w-3 mr-3 -mt-1 flex-shrink-0" />
                  <span className="text-white font-medium">{system.name}</span>
                  <div className="flex items-center">
                    <system.logo className="w-3 mr-3 -mt-1 flex-shrink-0" />
                    <span className="text-white font-medium flex-1">
                      {system.name}
                    </span>
                  </div>
                  {system.size && (
                    <span className="text-white/60 text-sm ml-2">
                      {system.size}
                    </span>
                  )}
                </a>
              </div>
            ))}

@@ -77,7 +77,7 @@ const menus = [
  },
  {
    menu: 'LinkedIn',
    path: 'https://www.linkedin.com/company/homebrewltd',
    path: 'https://www.linkedin.com/company/menloresearch',
    external: true,
  },
],

@@ -44,7 +44,7 @@ const features = [
  {
    title: 'Chat with your files',
    experimantal: true,
    description: `Set up and run your own OpenAI-compatible API server using local models with just one click.`,
    description: `Talk to PDFs, notes, and other documents directly to get summaries, answers, or insights.`,
    image: {
      light: '/assets/images/homepage/features05.png',
      dark: '/assets/images/homepage/features05dark.png',

@@ -11,16 +11,6 @@
    "type": "page",
    "title": "Documentation"
  },
  "cortex": {
    "type": "page",
    "title": "Cortex",
    "display": "hidden"
  },
  "integrations": {
    "type": "page",
    "title": "Integrations",
    "display": "hidden"
  },
  "changelog": {
    "type": "page",
    "title": "Changelog",

@@ -57,17 +57,16 @@ We have a thriving community built around [Jan](../docs), where we also discuss

- [Discord](https://discord.gg/AAGQNpJQtH)
- [Twitter](https://twitter.com/jandotai)
- [LinkedIn](https://www.linkedin.com/company/homebrewltd)
- [HuggingFace](https://huggingface.co/janhq)
- [LinkedIn](https://www.linkedin.com/company/menloresearch)
- Email: hello@jan.ai

## Philosophy

Homebrew is an opinionated company with a clear philosophy for the products we build:
[Menlo](https://menlo.ai/handbook/about) is an open R&D lab in pursuit of General Intelligence that achieves real-world impact through agents and robots.

### 🔑 User Owned

We build tools that are user-owned. Our products are [open-source](https://en.wikipedia.org/wiki/Open_source), designed to run offline or be [self-hosted](https://www.reddit.com/r/selfhosted/). We make no attempt to lock you in, and our tools are free of [user-hostile dark patterns](https://twitter.com/karpathy/status/1761467904737067456?t=yGoUuKC9LsNGJxSAKv3Ubg) [^1].
We build tools that are user-owned. Our products are [open-source](https://en.wikipedia.org/wiki/Open_source), designed to run offline or be [self-hosted.](https://www.reddit.com/r/selfhosted/) We make no attempt to lock you in, and our tools are free of [user-hostile dark patterns](https://twitter.com/karpathy/status/1761467904737067456?t=yGoUuKC9LsNGJxSAKv3Ubg) [^1].

We adopt [Local-first](https://www.inkandswitch.com/local-first/) principles and store data locally in [universal file formats](https://stephango.com/file-over-app). We build for privacy by default, and we do not [collect or sell your data](/privacy).

@@ -1,136 +0,0 @@
{
  "-- Switcher": {
    "type": "separator",
    "title": "Switcher"
  },
  "get-started": {
    "title": "GET STARTED",
    "type": "separator"
  },
  "index": {
    "title": "Overview",
    "href": "/cortex"
  },
  "quickstart": {
    "title": "Quickstart"
  },
  "hardware": {
    "title": "Hardware"
  },
  "installation": {
    "title": "Installation"
  },
  "basicusage": {
    "title": "BASIC USAGE",
    "type": "separator"
  },
  "command-line": {
    "title": "CLI"
  },
  "ts-library": {
    "title": "Typescript Library"
  },
  "py-library": {
    "title": "Python Library"
  },
  "server": {
    "title": "Server Endpoint"
  },
  "capabilities": {
    "title": "CAPABILITIES",
    "type": "separator"
  },
  "text-generation": {
    "title": "Text Generation"
  },
  "function-calling": {
    "display": "hidden",
    "title": "Function Calling"
  },
  "embeddings": {
    "display": "hidden",
    "title": "Embeddings"
  },
  "fine-tuning": {
    "display": "hidden",
    "title": "Fine-tuning"
  },
  "vision": {
    "display": "hidden",
    "title": "Vision"
  },
  "model-operations": {
    "display": "hidden",
    "title": "Model Operations"
  },
  "rag": {
    "display": "hidden",
    "title": "RAG"
  },
  "assistant": {
    "display": "hidden",
    "title": "ASSISTANTS",
    "type": "separator"
  },
  "assistants": {
    "display": "hidden",
    "title": "Overview"
  },
  "commandline": {
    "title": "COMMAND LINE",
    "type": "separator"
  },
  "cli": {
    "title": "cortex"
  },
  "training-engines": {
    "display": "hidden",
    "title": "TRAINING ENGINES"
  },
  "extensions": {
    "display": "hidden",
    "title": "EXTENSIONS",
    "type": "separator"
  },
  "build-extension": {
    "display": "hidden",
    "title": "Build an Extension"
  },
  "architectures": {
    "title": "ARCHITECTURE",
    "type": "separator"
  },
  "architecture": {
    "title": "Cortex"
  },
  "cortex-cpp": {
    "title": "Cortex.cpp"
  },
  "cortex-llamacpp": {
    "title": "Cortex.llamacpp"
  },
  "cortex-tensorrt-llm": {
    "title": "Cortex.tensorrt-llm",
    "display": "hidden"
  },
  "cortex-python": {
    "title": "Cortex.python",
    "display": "hidden"
  },
  "cortex-openvino": {
    "title": "Cortex.OpenVino",
    "display": "hidden"
  },
  "ext-architecture": {
    "display": "hidden",
    "title": "Extensions"
  },
  "troubleshooting": {
    "title": "TROUBLESHOOTING",
    "type": "separator"
  },
  "error-codes": {
    "display": "hidden",
    "title": "Error Codes"
  }
}
@@ -1,202 +0,0 @@
---
title: Overview
description: Cortex Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

## Introduction

Cortex is an alternative to the OpenAI API designed to operate entirely on your local hardware infrastructure. This headless backend platform is also engineered to support TensorRT-LLM, ensuring high-performance machine-learning model execution. It is packaged with a Docker-inspired command-line interface and a Typescript client library.

The following guide details Cortex's core components, providing insights and instructions for those interested in customizing it to meet specific requirements.

## Architecture

![Architecture](./_assets/architecture.png)

### Main Components

Cortex is architected with several key components, each designed to fulfill specific roles within the system, ensuring efficient processing and response to client requests.

1. **Cortex JS**: This component acts as the interface layer where requests are received and responses are sent.
2. **Server:** The central processing unit of Cortex, this component coordinates all activities across the system. It manages the data flow and ensures operations are correctly executed.
3. **Kernel**: This component checks the server's hardware configuration. Based on the current hardware setup, it determines whether additional dependencies are required, optimizing the system for performance and compatibility.
4. **Runtime**: This process involves dynamically loading necessary libraries and models based on the server's current needs and processing requests.
5. **Dynamic Libraries**: Consists of inference engines loaded on-demand to enhance Cortex's processing power. These engines are essential for performing specialized computational tasks. Currently, Cortex supports:
   - Llama.cpp Engine
   - TensorRT-LLM Engine
   - Python-runtime Engine

### Data Structure

Cortex is equipped with **MySQL** and **SQLite** databases, offering flexible data management options that can be easily adapted to different environments and requirements. It also has a filesystem layer that can store and retrieve data using file-based mechanisms.

#### MySQL

This database is used because it is ideal for Cortex environments where scalability, security, and data integrity are critical. MySQL is well-suited for handling large model-size data from the core extensions.

#### SQLite

This database is used for simplicity and minimal setup. It can handle the small model size from the core extensions and any data from the External extensions.

#### File System

Cortex uses a filesystem approach for managing configuration files, such as `model.yaml` files. These files are stored in a structured directory hierarchy, enabling efficient data retrieval and management.

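A hypothetical sketch of such a hierarchy; the directory names and the `model.yaml` fields mentioned below are illustrative assumptions, not the documented schema:

```
models/
  llama3.2-3b-instruct/
    model.yaml   # e.g. name, engine, ctx_len, sampling defaults (assumed fields)
    model.gguf
```
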
### Providers

#### Internal Provider

Integral to the CLI, it includes the core binary (**`.cpp`**) and is compiled directly with the CLI, giving all parts of the application direct access to core functionalities.

#### Core Extensions

These are bundled with the CLI and include additional functionalities like remote engines and API models, facilitating more complex operations and interactions within the same architectural framework.

#### External Extensions

These are designed to be more flexible and are stored externally. They represent potential future expansions or integrations, allowing the architecture to extend its capabilities without modifying the core system.

### Key Dependencies

Cortex is developed using NestJS and operates via a Node.js server framework, handling all incoming and outgoing requests. It also has a C++ runtime to handle stateless requests.

Below is a detailed overview of its core architecture components:

#### NestJS Framework

The NestJS framework serves as the backbone of Cortex. This framework facilitates the organization of server-side logic into modules, controllers, and extensions, which are important for maintaining a clean codebase and efficient request handling.

#### Node.js Server

Node.js is the primary runtime for Cortex; it handles the HTTP requests, executes the server-side logic, and manages the responses.

#### C++ Runtime

The C++ runtime is important for managing stateless requests. This component can handle intensive tasks that require optimized performance.

## Code Structure

The repository is organized to separate concerns between domain definitions, business rules, and adapters or implementations.
```
# Entity Definitions
domain/                    # This is the core directory where the domains are defined.
  abstracts/               # Abstract base classes for common attributes and methods.
  models/                  # Domain interface definitions, e.g. model, assistant.
  repositories/            # Extensions abstract and interface

# Business Rules
usecases/                  # Application logic
  assistants/              # CRUD logic (invokes dtos, entities).
  chat/                    # Logic for chat functionalities.
  models/                  # Logic for model operations.

# Adapters & Implementations
infrastructure/            # Implementations for Cortex interactions
  commanders/              # CLI handlers
    models/
    questions/             # CLI installation UX
    shortcuts/             # CLI chained syntax
    types/
    usecases/              # Invokes UseCases

  controllers/             # Nest controllers and HTTP routes
    assistants/            # Invokes UseCases
    chat/                  # Invokes UseCases
    models/                # Invokes UseCases

  database/                # Database providers (mysql, sqlite)

  # Framework specific object definitions
  dtos/                    # DTO definitions (data transfer & validation)
  entities/                # TypeORM entity definitions (db schema)

  # Providers
  providers/cortex         # Cortex [server] provider (a core extension)
  repositories/extensions  # Extension provider (core & external extensions)

extensions/                # External extensions
command.module.ts          # CLI Commands List
main.ts                    # Entrypoint

```
<Callout type="info">
The structure above promotes clean architecture principles, allowing for scalable and maintainable Cortex development.
</Callout>

## Runtime
```mermaid
sequenceDiagram
    User-)Cortex: "Tell me a joke"
    Cortex->>HF: Download a model
    Cortex->>Model Controller/Service: Start the model
    Cortex->>Chat Controller/Service: POST /completions
    Chat Controller/Service ->> Chat UseCases: createChatCompletions()
    Chat UseCases -->> Model Entity: findOne()
    Cortex->>Model Entity: Store the model data
    Chat UseCases -->> Extension Repository: findAll()
    Extension Repository ->> Cortex Provider: inference()
    CortexCPP Server ->> Cortex Provider: Port /???

    %% Responses
    Cortex Provider ->> Extension Repository: inference()
    Extension Repository ->> Chat UseCases: Response stream
    Chat UseCases ->> Chat Controller/Service: Formatted response/stream
    Chat Controller/Service ->> User: "Your mama"
```
The sequence diagram above outlines the interactions between various components in the Cortex system during runtime, particularly when handling user requests via a CLI. Here's a detailed breakdown of the runtime sequence:

1. **User Request**: The user initiates an interaction by requesting "a joke" via the Cortex CLI.
2. **Model Activation**:
   - The API directs the request to the `Model Controller/Service`.
   - The service pulls and starts the appropriate model and posts a request to `'/completions'` to prepare the model for processing.
3. **Chat Processing**:
   - The `Chat Controller/Service` processes the user's request using `Chat UseCases`.
   - The `Chat UseCases` interact with the Model Entity and Extension Repository to gather necessary data and logic.
4. **Data Handling and Response Formation**:
   - The `Model Entity` and `Extension Repository` perform data operations, which may involve calling a `Provider` for additional processing.
   - Data is fetched, stored, and an inference is performed as needed.
5. **Response Delivery**:
   - The response is formatted by the `Chat UseCases` and streamed back to the user through the API.
   - The user receives the processed response, completing the cycle of interaction.

## Roadmap

Our development roadmap outlines key features and epics we will focus on in the upcoming releases. These enhancements aim to improve functionality, increase efficiency, and expand Cortex's capabilities.

- **Crash Report Telemetry**: Enhance error reporting and operational stability by automatically collecting and analyzing crash reports.
- **RAG**: Improve response quality and contextual relevance in our AI models.
- **Cortex TensorRT-LLM**: Leverage NVIDIA TensorRT optimizations for LLMs.
- **Cortex Presets**: Streamline model configurations.
- **Cortex Python Runtime**: Provide a scalable Python execution environment for Cortex.

## Risks & Technical Debt

Cortex CLI, built with Nest-commander, incorporates extensions to integrate various inference providers. This flexibility, however, introduces certain risks related to dependency management and the objective of bundling the CLI into a single executable binary.

### Key Risks

1. **Complex Dependencies**: Utilizing Nest-commander involves a deep dependency tree, risking version conflicts and complicating updates.
2. **Bundling Issues**: Converting to a single executable can reveal issues with `npm` dependencies and relative asset paths, leading to potential runtime errors due to unresolved assets or incompatible binary dependencies.
@@ -1,22 +0,0 @@
---
title: Assistants
description: Assistants
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,22 +0,0 @@
---
title: Build an Extension
description: Build an Extension
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,54 +0,0 @@
---
title: Command Line Interface
description: Cortex CLI.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Cortex

Cortex is a CLI tool used to interact with the Jan application and its various functions.

<Callout type="info">
Cortex CLI is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex [command] [flag]
```
### Options
```
-v, --version  Cortex version (default: false)
-h, --help     display help for command
```
## Sub Commands
- [cortex models](/cortex/cli/models): Manage and configure models.
- [cortex serve](/cortex/cli/serve): Launch an API endpoint server for the Cortex backend.
- [cortex chat](/cortex/cli/chat): Send a chat request to a model.
- [cortex init|setup](/cortex/cli/init): Initialize settings and download dependencies for Cortex.
- [cortex ps](/cortex/cli/ps): Display active models and their operational status.
- [cortex kill](/cortex/cli/kill): Terminate active Cortex processes.
- [cortex pull|download](/cortex/cli/pull): Download a model.
- [cortex run](/cortex/cli/run): Shortcut to start a model and chat **(EXPERIMENTAL)**.
@@ -1,26 +0,0 @@
{
  "init": {
    "title": "cortex init"
  },
  "pull": {
    "title": "cortex pull"
  },
  "run": {
    "title": "cortex run"
  },
  "models": {
    "title": "cortex models"
  },
  "ps": {
    "title": "cortex ps"
  },
  "chat": {
    "title": "cortex chat"
  },
  "kill": {
    "title": "cortex kill"
  },
  "serve": {
    "title": "cortex serve"
  }
}
@@ -1,47 +0,0 @@
---
title: Cortex Chat
description: Cortex chat command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex chat`

This command starts a chat session with a specified model, allowing you to interact directly with it through an interactive chat interface.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex chat --model MODEL_ID
```
### Options
```
-t, --thread <thread_id>  Thread Id. If not provided, will create new thread
-m, --message <message>   Message to send to the model
-a, --attach              Attach to interactive chat session (default: false)
-h, --help                display help for command
```
@@ -1,49 +0,0 @@
---
title: Cortex Models Init
description: Cortex init command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex init`

This command initializes the cortex operations settings and downloads the required dependencies to run cortex.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Alias
The following alias is also available for initializing cortex:
- `cortex setup`

## Usage

```bash
cortex init
```

## Options
```
-h, --help  display help for command
```
@@ -1,45 +0,0 @@
---
title: Cortex Kill
description: Cortex kill command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex kill`

This command stops the currently running cortex processes.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex kill
```

## Options
```
-h, --help  display help for command
```
@@ -1,52 +0,0 @@
---
title: Cortex Models
description: Cortex CLI.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models`

This command allows you to start, stop, and manage various model operations within Cortex.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models API_COMMAND [OPTIONS]

# Start a downloaded model
cortex models start MODEL_ID

# Stop a downloaded model
cortex models stop MODEL_ID
```

## Options

```
-h, --help  display help for command
```
@@ -1,23 +0,0 @@
{
  "download": {
    "title": "cortex models pull"
  },
  "list": {
    "title": "cortex models list"
  },
  "get": {
    "title": "cortex models get"
  },
  "update": {
    "title": "cortex models update"
  },
  "start": {
    "title": "cortex models start"
  },
  "stop": {
    "title": "cortex models stop"
  },
  "remove": {
    "title": "cortex models remove"
  }
}
@@ -1,49 +0,0 @@
---
title: Cortex Models Pull
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models pull`

This command downloads a model. You can use a HuggingFace `MODEL_ID` to download a model.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models pull MODEL_ID
```
## Alias
The following alias is also available for downloading models:
- `cortex models download _`

## Options
```
-m, --model <model_id>  Model Id to start chat with
-h, --help              display help for command
```
@@ -1,45 +0,0 @@
---
title: Cortex Models Get
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models get`

This command returns a model detail defined by a `MODEL_ID`.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models get MODEL_ID
```

## Options
```
-h, --help  display help for command
```
@@ -1,46 +0,0 @@
---
title: Cortex Models List
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models list`

This command lists all local models.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models list
```

## Options
```
-f, --format <format>  Print models list in table or json format (default: "json")
-h, --help             display help for command
```
@@ -1,45 +0,0 @@
---
title: Cortex Models Remove
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models remove`

This command deletes a local model defined by a `MODEL_ID`.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models remove MODEL_ID
```

## Options
```
-h, --help  display help for command
```
@@ -1,46 +0,0 @@

---
title: Cortex Models Start
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models start`

This command starts a model defined by a `MODEL_ID`.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models start MODEL_ID
```

## Options

```
-a, --attach  Attach to interactive chat session (default: false)
-h, --help    display help for command
```
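For example, to start a model and attach directly to an interactive chat session — a sketch using the `--attach` flag above, with `llama3` as a hypothetical model ID:

```bash
# Start the model and drop into a chat session in one step
cortex models start llama3 -a
```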
@@ -1,45 +0,0 @@

---
title: Cortex Models Stop
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models stop`

This command stops a model defined by a `MODEL_ID`.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models stop MODEL_ID
```

## Options

```
-h, --help    display help for command
```
@@ -1,48 +0,0 @@

---
title: Cortex Models Update
description: Cortex models subcommands.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex models update`

This command updates a model configuration defined by a `MODEL_ID`.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex models update MODEL_ID OPTIONS
```

## Options

```
-m, --model <model_id>      Model ID to update
-c, --options <options...>  Specify the options to update the model. Syntax: -c option1=value1 option2=value2.
                            For example: cortex models update -c max_tokens=100 temperature=0.5
-h, --help                  display help for command
```
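Putting the documented flags together, a hypothetical invocation that updates two runtime options on a model named `llama3` might look like:

```bash
# Update two model options in a single call (model ID is illustrative)
cortex models update -m llama3 -c max_tokens=100 temperature=0.5
```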
@@ -1,48 +0,0 @@

---
title: Cortex Ps
description: Cortex ps command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex ps`

This command shows the running model and its status.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex ps
```

For example, it returns a table like the following:

```bash
┌─────────┬──────────────────────┬───────────────────┬───────────┬──────────┬─────┬──────┐
│ (index) │ modelId              │ engine            │ status    │ duration │ ram │ vram │
├─────────┼──────────────────────┼───────────────────┼───────────┼──────────┼─────┼──────┤
│ 0       │ 'janhq/tinyllama/1b' │ 'cortex.llamacpp' │ 'running' │ '7s'     │ '-' │ '-'  │
└─────────┴──────────────────────┴───────────────────┴───────────┴──────────┴─────┴──────┘
```
@@ -1,82 +0,0 @@

---
title: Cortex Pull
description: Cortex CLI.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex pull`

This command downloads machine learning models from various model hubs, including the popular 🤗 [Hugging Face](https://huggingface.co/).

By default, models are downloaded to the `node_modules` library path. For additional information on storage paths and options, see [here](/cortex/cli#storage).

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Alias

The following alias is also available for downloading models:
- `cortex download _`

## Usage

### Preconfigured Models

Preconfigured models (with optimal runtime parameters and templates) are available from the [Jan Model Hub](https://huggingface.co/janhq) on Hugging Face.

Models can be downloaded using a Docker-like interface with the following syntax: `repo_name:branch_name`. Each variant may include different quantizations and sizes, typically organized in the repository’s branches.

Available models include [llama3](https://huggingface.co/janhq/llama3), [mistral](https://huggingface.co/janhq/mistral), [tinyllama](https://huggingface.co/janhq/tinyllama), and [many more](https://huggingface.co/janhq).

<Callout type="info">
New models will soon be added to Hugging Face's janhq repository.
</Callout>

```bash
# Pull a specific variant with `repo_name:branch`
cortex pull llama3:7b
```

You can also download `size`, `format`, and `quantization` variants of each model.

```bash
cortex pull llama3:8b-instruct-v3-gguf-Q4_K_M
cortex pull llama3:8b-instruct-v3-tensorrt-llm
```

<Callout type="info">
Model variants are provided via the `branches` in each model's Hugging Face repo.
</Callout>

### Hugging Face Models

You can download any GGUF, TensorRT, or other supported-format model directly from Hugging Face.

```bash
# cortex pull org_name/repo_name
cortex pull microsoft/Phi-3-mini-4k-instruct-gguf
```

## Options

```
-h, --help    display help for command
```
@@ -1,53 +0,0 @@

---
title: Cortex Run
description: Cortex run command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex run`

This command starts an interactive chat shell with a specified machine learning model.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex run MODEL_ID
```

## Options

```
-t, --thread <thread_id>  Thread ID. If not provided, a new thread is created.
-h, --help                display help for command
```

## Command Chain

The `cortex run` command is a convenience wrapper that automatically executes a sequence of commands to simplify user interaction (see the sketch below):

1. [`cortex models start`](/cortex/cli/models/start): This command starts the specified model, making it active and ready for interactions.
2. [`cortex chat`](/cortex/cli/chat): Following model activation, this command opens an interactive chat shell where users can directly communicate with the model.
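In other words, a hypothetical `cortex run llama3` is roughly equivalent to running:

```bash
# Step 1: start the model and its inference engine
cortex models start llama3
# Step 2: open an interactive chat shell against it
cortex chat
```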
@@ -1,46 +0,0 @@

---
title: Cortex Serve
description: Cortex serve command.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# `cortex serve`

This command runs the API endpoint server for the Cortex back-end.

<Callout type="info">
This command is compatible with all OpenAI and OpenAI-compatible endpoints.
</Callout>

## Usage

```bash
cortex serve
```

## Options

```
--host <host>  configure the host for the API endpoint server
-h, --help     display help for command
```
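For example, to bind the server to all network interfaces instead of the default host — a sketch based on the `--host` option above:

```bash
# Expose the API endpoint server on all interfaces
cortex serve --host 0.0.0.0
```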
@@ -1,81 +0,0 @@

---
title: Command Line Interface
description: Cortex CLI.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Command Line Interface

The Cortex CLI provides a user-friendly platform for managing and operating large language models (LLMs), inspired by tools like Docker and the GitHub CLI. Designed for straightforward installation and use, it simplifies the integration and management of LLMs.

<Callout type="info">
The Cortex CLI is OpenAI-compatible.
</Callout>

## Installation

To get started with the Cortex CLI, please see our guides:
- [Quickstart](/cortex/quickstart)
- [Device-specific installation](/cortex/installation)

These resources provide detailed instructions to ensure Cortex is set up correctly on your machine, accommodating various hardware environments.

## Usage

The Cortex CLI has a robust command set that streamlines your LLM interactions.

Check out the [CLI reference pages](/cortex/cli) for a comprehensive guide on all available commands and their specific functions.

## Storage

By default, the Cortex CLI stores model binaries, thread history, and other usage data in:
`$(npm list -g @janhq/cortex)`.

You can find the respective folders within the `/lib/node_modules/@janhq/cortex/dist/` subdirectory.

<Callout type="info">
**Ongoing Development**:
- Customizable Storage Locations
- Database Integration
</Callout>

## CLI Syntax

The Cortex CLI improves the developer experience with command chaining and syntactic enhancements: multiple operations are automatically combined into a single command, streamlining complex workflows.

### OpenAI API Equivalence

Cortex CLI commands adhere strictly to the method names used in the OpenAI API, ensuring a smooth transition for users familiar with OpenAI’s system.

For example:
- The `cortex chat` command is equivalent to the [`POST /v1/chat/completions` endpoint](/cortex/cortex-chat).

- The `cortex models get ID` command is equivalent to the [`GET /v1/models/{id}` endpoint](/cortex/cortex-models).
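Because of this equivalence, the same operations can be performed over plain HTTP once the server is running. A minimal sketch, assuming the server listens on port 1337 (the port used in the [server playground](/cortex/server) example) and a hypothetical `llama3` model is loaded:

```bash
# OpenAI-style chat completion request against the local Cortex server
curl http://localhost:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
```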
### Command Chaining

Cortex CLI’s command chaining support allows multiple commands to be executed in sequence with a simplified syntax. This approach reduces the complexity of command inputs and speeds up development tasks.

For example:
- The [`cortex run`](/cortex/cortex-run) command, inspired by Docker and GitHub, starts the model and the inference engine, and provides a command-line chat interface for easy testing.
@@ -1,77 +0,0 @@

---
title: Cortex.cpp
description: Cortex.cpp Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Cortex.cpp

Cortex.cpp is a stateless C++ server that is 100% compatible with the OpenAI API (stateless endpoints).

It includes a Drogon server with request queues, model orchestration logic, hardware telemetry, and more, for production environments.

This guide walks you through how Cortex.cpp is designed, the codebase structure, and future plans.

## Usage

See [Quickstart](/cortex/quickstart).

## Interface

## Architecture

## Code Structure

```md
├── app/
│   ├── controllers/
│   ├── models/
│   ├── services/
│   ├── ?engines/
│   │   ├── llama.cpp
│   │   ├── tensorrt-llm
│   │   └── ...
│   └── ...
├── CMakeLists.txt
├── config.json
├── Dockerfile
├── docker-compose.yml
├── README.md
└── ...
```

The `cortex-cpp` folder contains stateless implementations, most of which call into `cortex.llamacpp` and `cortex.tensorrt-llm`, depending on the engine at runtime.

Here you will find the implementations for the stateless endpoints:
- `/chat/completion`
- `/audio`
- `/fine_tuning`
- `/embeddings`
- `/load_model`
- `/unload_model`

It also contains core hardware and model management logic, such as CPU instruction set detection and multi-model loading.

## Runtime

## Roadmap
@@ -1,143 +0,0 @@

---
title: Cortex.llamacpp
description: Cortex.llamacpp Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Cortex.llamacpp

Cortex.llamacpp is a C++ inference library that can be loaded by any server at runtime. It submodules (and occasionally upstreams) [llama.cpp](https://github.com/ggerganov/llama.cpp) for GGUF inference.

In addition to llama.cpp, cortex.llamacpp adds:
- OpenAI compatibility for the stateless endpoints
- Model orchestration, such as model warm-up and running concurrent models

<Callout type="info">
Cortex.llamacpp was formerly called "Nitro".
</Callout>

If you already use [Jan](/docs) or [Cortex](/cortex), cortex.llamacpp is bundled by default and you don’t need this guide. This guide walks you through how to use cortex.llamacpp as a standalone library in any custom C++ server.

## Usage

To include cortex.llamacpp in your own server implementation, follow this [server example](https://github.com/menloresearch/cortex.llamacpp/tree/main/examples/server).

## Interface

Cortex.llamacpp has the following interfaces:

- **HandleChatCompletion:** Processes chat completion tasks

```cpp
void HandleChatCompletion(
    std::shared_ptr<Json::Value> jsonBody,
    std::function<void(Json::Value&&, Json::Value&&)>&& callback);
```

- **HandleEmbedding:** Generates embeddings for the input data provided

```cpp
void HandleEmbedding(
    std::shared_ptr<Json::Value> jsonBody,
    std::function<void(Json::Value&&, Json::Value&&)>&& callback);
```

- **LoadModel:** Loads a model based on the specifications

```cpp
void LoadModel(
    std::shared_ptr<Json::Value> jsonBody,
    std::function<void(Json::Value&&, Json::Value&&)>&& callback);
```

- **UnloadModel:** Unloads a model as specified

```cpp
void UnloadModel(
    std::shared_ptr<Json::Value> jsonBody,
    std::function<void(Json::Value&&, Json::Value&&)>&& callback);
```

- **GetModelStatus:** Retrieves the status of a model

```cpp
void GetModelStatus(
    std::shared_ptr<Json::Value> jsonBody,
    std::function<void(Json::Value&&, Json::Value&&)>&& callback);
```

**Parameters:**

- **`jsonBody`**: The request content in JSON format.
- **`callback`**: A function that handles the response.

## Architecture

The main components include:
- `enginei`: an engine interface definition that extends to all engines, handling endpoint logic and facilitating communication between `cortex.cpp` and the `llama engine`.
- `llama engine`: exposes APIs for embedding and inference. It loads and unloads models and simplifies API calls to `llama.cpp`.
- `llama.cpp`: a submodule from the `llama.cpp` repository that provides the core functionality for embeddings and inference.
- `llama server context`: a wrapper that offers a simpler, more user-friendly interface for the `llama.cpp` APIs.



### Communication Protocols

- `Streaming`: Responses are processed and returned one token at a time.
- `RESTful`: The response is processed as a whole. After the llama server context completes the entire process, it returns a single result back to cortex.cpp.



## Code Structure

```
.
├── base                            # Engine interface definition
│   └── cortex-common               # Common interfaces used for all engines
│       └── enginei.h               # Define abstract classes and interface methods for engines
├── examples                        # Server example to integrate engine
│   └── server.cc                   # Example server demonstrating engine integration
├── llama.cpp                       # Upstream llama.cpp repository
│   └── (files from upstream llama.cpp)
├── src                             # Source implementation for llama.cpp
│   ├── chat_completion_request.h   # OpenAI compatible request handling
│   ├── llama_client_slot           # Manage vector of slots for parallel processing
│   ├── llama_engine                # Implementation of llamacpp engine for model loading and inference
│   ├── llama_server_context        # Context management for chat completion requests
│   │   ├── slot                    # Struct for slot management
│   │   ├── llama_context           # Struct for llama context management
│   │   ├── chat_completion         # Struct for chat completion management
│   │   └── embedding               # Struct for embedding management
├── third-party                     # Dependencies of the cortex.llamacpp project
│   └── (list of third-party dependencies)
```

## Runtime

## Roadmap

The future plans for Cortex.llamacpp are focused on enhancing performance and expanding capabilities. Key areas of improvement include:

- Performance Enhancements: Optimizing speed and reducing memory usage to ensure efficient processing of tasks.
- Multimodal Model Compatibility: Expanding support to include a variety of multimodal models, enabling a broader range of applications and use cases.

To follow the latest developments, see the [cortex.llamacpp GitHub repository](https://github.com/menloresearch/cortex.llamacpp).
@@ -1,24 +0,0 @@

---
title: Cortex.OpenVino
description: Cortex.OpenVino Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

# Cortex.OpenVino

@@ -1,24 +0,0 @@

---
title: Cortex.python
description: Cortex.python Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

# Cortex.python

@@ -1,24 +0,0 @@

---
title: Cortex.tensorrt-llm
description: Cortex.tensorrt-llm Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

# Cortex.tensorrt-llm

@@ -1,22 +0,0 @@

---
title: Embeddings
description: Embeddings
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Error Codes
description: Error Codes.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Extensions Architecture
description: Extensions Architecture
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Fine Tuning
description: Fine Tuning
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Function Calling
description: Function Calling
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,50 +0,0 @@

---
title: Hardware Requirements
description: Get started quickly with Jan, a ChatGPT-alternative that runs on your own computer, with a local API server. Learn how to install Jan and select an AI model to start chatting.
sidebar_position: 2
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    quickstart,
    getting started,
    using AI model,
    installation,
  ]
---

import { Tabs } from 'nextra/components'
import { Callout, Steps } from 'nextra/components'

# Hardware Requirements

To run LLMs on device, Cortex has the following hardware requirements:

<Callout type="info">
These are the general hardware requirements for running Cortex on your system. Please refer to the respective [installation](/cortex/installation) sections for detailed specifications tailored to each environment.
</Callout>

## OS

- macOS 13.6 or higher.
- Windows 10 or higher.
- Ubuntu 12.04 and later.

## RAM (CPU Mode)

- 8GB for running up to 3B models.
- 16GB for running up to 7B models.
- 32GB for running up to 13B models.

## VRAM (GPU Mode)

- 6GB can load a 3B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 8GB can load a 7B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 12GB can load a 13B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.

## Disk Space

- 10GB: the app itself is 1.02 MB, but models are usually 4GB+.
@@ -1,50 +0,0 @@

---
title: Cortex
description: Cortex is a local LLM engine for developers
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Discord integration,
    Discord,
    bot,
  ]
---

import { Callout, Steps } from 'nextra/components'

# Cortex

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>



Cortex is an [OpenAI-compatible](https://platform.openai.com/docs/introduction), local AI server that developers can use to build LLM apps. It can be used as a standalone server or imported as a library.

Cortex currently supports two inference engines:
- Llama.cpp
- TensorRT-LLM

<Callout>
**Real-world Use**: Cortex powers [Jan](/docs), our local ChatGPT-alternative.

Cortex has been battle-tested through 900k downloads and handles a variety of hardware and software edge cases.
</Callout>

### Roadmap

Cortex's roadmap is to implement an [OpenAI-equivalent API](https://platform.openai.com/docs/api-reference) using a fully open source stack. Our goal is to make switching to open source AI as easy as possible for developers.

### Architecture

Cortex's [architecture](/cortex/architecture) features a C++ inference core, with [higher-order features](/cortex/architecture) handled in Typescript.

Our [long-term direction](/cortex/roadmap) is to eventually move toward a full C++ library, enabling embedded and robotics use cases.
@@ -1,37 +0,0 @@

---
title: Desktop Installation
description: Cortex Desktop Installation.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
import childPages from './installation/_meta.json';

# Cortex Desktop Installation

<br/>

<Cards
  children={Object.keys(childPages).map((key, i) => (
    <Card
      key={i}
      title={childPages[key].title}
      href={childPages[key].href}
    />
  ))}
/>
@@ -1,14 +0,0 @@

{
  "mac": {
    "title": "Mac",
    "href": "/cortex/installation/mac"
  },
  "windows": {
    "title": "Windows",
    "href": "/cortex/installation/windows"
  },
  "linux": {
    "title": "Linux",
    "href": "/cortex/installation/linux"
  }
}
@@ -1,181 +0,0 @@

---
title: Linux
description: Install Cortex CLI on Linux.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    quickstart,
    getting started,
    using AI model,
    installation,
    "desktop"
  ]
---

import { Tabs, Steps } from 'nextra/components'
import { Callout } from 'nextra/components'
import FAQBox from '@/components/FaqBox'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Linux Installation

## Prerequisites

### Dependencies

Before installation, ensure that you have installed the following:

- **Node.js**: Required for running the installation.
- **NPM**: Needed to manage packages.

<Callout type="info">
The **CPU instruction sets** are not required for the initial installation of Cortex. These dependencies will be installed automatically during Cortex initialization if they are not already on your system.
</Callout>

### Hardware

Ensure that your system meets the following requirements to run Cortex:

<Tabs items={['OS', 'CPU', 'RAM', 'GPU', 'Disk']}>
<Tabs.Tab>
- Debian-based (supports `.deb` and `AppImage`)
  - Ubuntu-based
    - Ubuntu Desktop LTS (official) / Ubuntu Server LTS (only for server)
    - Edubuntu (Mainly desktop)
    - Kubuntu (Desktop only)
    - Lubuntu (Both desktop and server, though mainly desktop)
    - Ubuntu Budgie (Mainly desktop)
    - Ubuntu Cinnamon (Desktop only)
    - Ubuntu Kylin (Both desktop and server)
    - Ubuntu MATE (Desktop only)
- Pacman-based
  - Arch Linux based
    - Arch Linux (Mainly desktop)
    - SteamOS (Desktop only)
- RPM-based (supports `.rpm` and `AppImage`)
  - Fedora-based
    - RHEL-based (Server only)
  - openSUSE (Both desktop and server)

<Callout type="info">
- Please check whether your Linux distribution supports desktop, server, or both environments.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
<Tabs items={['Intel', 'AMD']}>
<Tabs.Tab>
<Callout type="info">
- Jan supports processors that can handle AVX2. For the full list, please see [here](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2).
- We support older processors with AVX and AVX-512, though this is not recommended.
</Callout>
- Haswell processors (Q2 2013) and newer.
- Tiger Lake (Q3 2020) and newer for Celeron and Pentium processors.
</Tabs.Tab>
<Tabs.Tab>
<Callout type="info">
- Jan supports processors that can handle AVX2. For the full list, please see [here](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2).
- We support older processors with AVX and AVX-512, though this is not recommended.
</Callout>
- Excavator processors (Q2 2015) and newer.
</Tabs.Tab>
</Tabs>
</Tabs.Tab>
<Tabs.Tab>
- 8GB for running up to 3B models (int4).
- 16GB for running up to 7B models (int4).
- 32GB for running up to 13B models (int4).

<Callout type="info">
We support DDR2 RAM as the minimum requirement but recommend using newer generations of RAM for improved performance.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
- 6GB can load a 3B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 8GB can load a 7B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 12GB can load a 13B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.

<Callout type="info">
Having at least 6GB VRAM when using NVIDIA, AMD, or Intel Arc GPUs is recommended.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
- At least 10GB for app storage and model download.
</Tabs.Tab>
</Tabs>

## Cortex Installation

To install Cortex, follow the steps below:

<Steps>
### Step 1: Install Cortex

Run the following command to install Cortex globally on your machine:

<Callout type="info">
Install NPM on your machine before proceeding with this step.
</Callout>

```sh
# Install globally on your system
npm i -g @janhq/cortex
```

<Callout type="info">
Cortex automatically detects your CPU and GPU, downloading the appropriate CPU instruction sets and required dependencies to optimize GPU performance.
</Callout>

### Step 2: Verify the Installation

1. After installation, you can verify that Cortex is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```

2. Cortex is ready to use!
</Steps>

## Build from Source

To install Cortex from the source, follow the steps below:

1. Clone the Cortex repository [here](https://github.com/menloresearch/cortex/tree/dev).
2. Navigate to the `cortex-js` folder.
3. Open the terminal and run the following command to build the Cortex project:

```sh
npx nest build
```

4. Make the `command.js` executable:

```sh
chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
```

5. Link the package globally:

```sh
npm link
```

6. Initialize Cortex by following the steps [here](#step-3-initialize-cortex).

## Uninstall Cortex

Run the following command to uninstall Cortex globally on your machine:

```sh
# Uninstall globally on your system
npm uninstall -g @janhq/cortex
```
@@ -1,147 +0,0 @@

---
title: Mac
description: Install Cortex CLI on Mac.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    quickstart,
    getting started,
    using AI model,
    installation,
    "desktop"
  ]
---

import { Tabs, Steps } from 'nextra/components'
import { Callout } from 'nextra/components'
import FAQBox from '@/components/FaqBox'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Mac Installation

## Prerequisites

### Dependencies

Before installation, ensure that you have installed the following:

- **Node.js**: Required for running the installation.
- **NPM**: Needed to manage packages.

<Callout type="info">
The **CPU instruction sets** are not required for the initial installation of Cortex. These dependencies will be installed automatically during Cortex initialization if they are not already on your system.
</Callout>

### Hardware

Ensure that your system meets the following requirements to run Cortex:

<Tabs items={['Mac Intel CPU', 'Mac Apple Silicon']}>
<Tabs.Tab>
<Tabs items={['Operating System', 'Memory', 'Disk']}>
<Tabs.Tab>
- macOS 13.6 or higher.
</Tabs.Tab>
<Tabs.Tab>
- 8GB for running up to 3B models.
- 16GB for running up to 7B models.
- 32GB for running up to 13B models.
</Tabs.Tab>
<Tabs.Tab>
- At least 10GB for app and model download.
</Tabs.Tab>
</Tabs>
</Tabs.Tab>
<Tabs.Tab>
<Tabs items={['Operating System', 'Memory', 'Disk']}>
<Tabs.Tab>
- macOS 13.6 or higher.
</Tabs.Tab>
<Tabs.Tab>
- 8GB for running up to 3B models.
- 16GB for running up to 7B models.
- 32GB for running up to 13B models.

<Callout type="info">
Apple Silicon Macs leverage Metal for GPU acceleration, providing faster performance than Intel Macs, which rely solely on CPU processing.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
- At least 10GB for app and model download.
</Tabs.Tab>
</Tabs>
</Tabs.Tab>
</Tabs>

## Cortex Installation

To install Cortex, follow the steps below:

<Steps>
### Step 1: Install Cortex

Run the following command to install Cortex globally on your machine:

<Callout type="info">
Install NPM on your machine before proceeding with this step.
</Callout>

```sh
# Install globally on your system
npm i -g @janhq/cortex
```

<Callout type="info">
Cortex automatically detects your CPU and GPU, downloading the appropriate CPU instruction sets and required dependencies to optimize GPU performance.
</Callout>

### Step 2: Verify the Installation

1. After installation, you can verify that Cortex is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```

2. Cortex is ready to use!
</Steps>

## Build from Source

To install Cortex from the source, follow the steps below:

1. Clone the Cortex repository [here](https://github.com/menloresearch/cortex/tree/dev).
2. Navigate to the `cortex-js` folder.
3. Open the terminal and run the following command to build the Cortex project:

```sh
npx nest build
```

4. Make the `command.js` executable:

```sh
chmod +x '[path-to]/cortex/cortex-js/dist/src/command.js'
```

5. Link the package globally:

```sh
npm link
```

6. Initialize Cortex by following the steps [here](#step-3-initialize-cortex).

## Uninstall Cortex

Run the following command to uninstall Cortex globally on your machine:

```sh
# Uninstall globally using NPM
npm uninstall -g @janhq/cortex
```
@@ -1,198 +0,0 @@

---
title: Windows
description: Install Cortex CLI on Windows.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    quickstart,
    getting started,
    using AI model,
    installation,
    "desktop"
  ]
---

import { Tabs, Steps } from 'nextra/components'
import { Callout } from 'nextra/components'
import FAQBox from '@/components/FaqBox'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Windows Installation

## Prerequisites

### Dependencies

Before installation, ensure that you have installed the following:

- **Node.js**: Required for running the installation.
- **NPM**: Needed to manage packages.
- **Windows Subsystem for Linux (Ubuntu)**: Required for WSL2 installation.

<Callout type="info">
The **CPU instruction sets** are not required for the initial installation of Cortex. These dependencies will be installed automatically during Cortex initialization if they are not already on your system.
</Callout>

### Hardware

Ensure that your system meets the following requirements to run Cortex:

<Tabs items={['OS', 'CPU', 'RAM', 'GPU', 'Disk']}>
<Tabs.Tab>
- Windows 10 or higher.
</Tabs.Tab>
<Tabs.Tab>
<Tabs items={['Intel', 'AMD']}>
<Tabs.Tab>
<Callout type="info">
- Jan supports processors that can handle AVX2. For the full list, please see [here](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2).
- We support older processors with AVX and AVX-512, though this is not recommended.
</Callout>
- Haswell processors (Q2 2013) and newer.
- Tiger Lake (Q3 2020) and newer for Celeron and Pentium processors.
</Tabs.Tab>
<Tabs.Tab>
<Callout type="info">
- Jan supports processors that can handle AVX2. For the full list, please see [here](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions#CPUs_with_AVX2).
- We support older processors with AVX and AVX-512, though this is not recommended.
</Callout>
- Excavator processors (Q2 2015) and newer.
</Tabs.Tab>
</Tabs>
</Tabs.Tab>
<Tabs.Tab>
- 8GB for running up to 3B models (int4).
- 16GB for running up to 7B models (int4).
- 32GB for running up to 13B models (int4).

<Callout type="info">
We support DDR2 RAM as the minimum requirement but recommend using newer generations of RAM for improved performance.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
- 6GB can load a 3B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 8GB can load a 7B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.
- 12GB can load a 13B model (int4) with `ngl` set to 120, at close to full speed on CPU/GPU.

<Callout type="info">
Having at least 6GB VRAM when using NVIDIA, AMD, or Intel Arc GPUs is recommended.
</Callout>
</Tabs.Tab>
<Tabs.Tab>
- At least 10GB for app storage and model download.
</Tabs.Tab>
</Tabs>

## Cortex Installation

To install Cortex, follow the steps below:

<Steps>
### Step 1: Install Cortex

Run the following command to install Cortex globally on your machine:

<Callout type="info">
Install NPM on your machine before proceeding with this step.
</Callout>

```sh
# Install globally on your system
npm i -g @janhq/cortex
```

<Callout type="info">
Cortex automatically detects your CPU and GPU, downloading the appropriate CPU instruction sets and required dependencies to optimize GPU performance.
</Callout>

### Step 2: Verify the Installation

1. After installation, you can verify that Cortex is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```

2. Cortex is ready to use!
</Steps>

## Windows Subsystem Linux

To install Cortex using the NPM package in WSL2, follow the steps below:

<Steps>
### Step 1: Open your WSL2 Terminal

Open your Linux terminal in WSL2; for example, the Ubuntu distribution terminal.

### Step 2: Install Cortex

Run the following command to install Cortex globally on your machine:

<Callout type="info">
Install NPM on your machine before proceeding with this step.
</Callout>

```sh
# Install globally on your system
npm i -g @janhq/cortex
```

<Callout type="info">
Cortex automatically detects your CPU and GPU, downloading the appropriate CPU instruction sets and required dependencies to optimize GPU performance.
</Callout>

### Step 3: Verify the Installation

After installation, you can verify that Cortex is installed correctly by getting help information.

```sh
# Get the help information
cortex -h
```

</Steps>

## Build from Source

To install Cortex from the source, follow the steps below:

1. Clone the Cortex repository [here](https://github.com/menloresearch/cortex/tree/dev).
2. Navigate to the `cortex-js` folder.
3. Open the terminal and run the following command to build the Cortex project:

```sh
npx nest build
```

4. Run `command.js`:

```sh
node "[path-to]\cortex\cortex-js\dist\src\command.js"
```

5. Link the package globally:

```sh
npm link
```

6. Initialize Cortex by following the steps [here](#step-3-initialize-cortex).

## Uninstall Cortex

Run the following command to uninstall Cortex globally on your machine:

```sh
# Uninstall globally on your system
npm uninstall -g @janhq/cortex
```
@@ -1,22 +0,0 @@

---
title: Model Operations
description: Model Operations
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,69 +0,0 @@

---
title: Python Library
description: Cortex Python Library.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Python Library

Cortex also provides a Python client library that is a **direct substitute for OpenAI's** [Python library](https://github.com/openai/openai-python), enabling easy integration and streamlined workflows.

## Installation

Use the following pip command to install the Cortex library in your project:

```sh
pip install @janhq/cortex-python
```

## Usage

Switching to the Cortex Client Library from the OpenAI Python Library involves simple updates.

1. Replace the OpenAI import with Cortex in your application:

```diff
- from openai import OpenAI
+ from @janhq/cortex-python import Cortex
```

2. Modify the initialization of the client to use Cortex:

```diff
- client = OpenAI(api_key='your-api-key')
+ client = Cortex(base_url="BASE_URL", api_key="API_KEY") # This can be omitted if using the default
```

### Example Usage

```py
from @janhq/cortex-python import Cortex

# Point the client at the local Cortex server
client = Cortex(base_url="http://localhost:1337", api_key="cortex")

model = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
client.models.start(model=model)

completion = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        },
    ],
)
print(completion.choices[0].message.content)
```
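The example above assumes a Cortex server is already reachable at `http://localhost:1337`. A minimal sketch of preparing that server from the CLI, using the same model ID and assuming default settings:

```bash
# Download the model used in the example, then start the local server
cortex pull TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF
cortex serve
```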
@@ -1,55 +0,0 @@

---
title: Quickstart
description: Cortex Quickstart.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

# Quickstart

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

To get started, confirm that your system meets the [hardware requirements](/cortex/hardware), and follow the steps below:

```bash
# 1. Install Cortex using NPM
npm i -g @janhq/cortex

# 2. Download a GGUF model
cortex models pull llama3

# 3. Run the model to start chatting
cortex models run llama3

# 4. (Optional) Run Cortex in OpenAI-compatible server mode
cortex serve
```

<Callout type="info">
For more details regarding the Cortex server mode, please see here:
- [Server Endpoint](/cortex/server)
- [`cortex serve` command](/cortex/cli/serve)
</Callout>

## What's Next?

With Cortex now fully operational, you're ready to delve deeper:
- Explore how to [install Cortex](/cortex/installation) across various hardware environments.
- Familiarize yourself with the comprehensive set of [Cortex CLI commands](/cortex/cli) available for use.
- Gain insights into the system’s design by examining the [architecture](/cortex/architecture) of Cortex.
@@ -1,22 +0,0 @@

---
title: RAG
description: RAG
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

@@ -1,22 +0,0 @@

---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,47 +0,0 @@

---
title: Server Endpoint
description: Cortex CLI.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps, Cards, Card } from 'nextra/components'
import OAICoverage from "@/components/OAICoverage"

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Server Endpoint

Cortex can run in headless server mode, providing an [OpenAI-API compatible](https://platform.openai.com/docs/api-reference/introduction) endpoint.

## Usage

```bash
cortex serve
```

A full, local AI server will be started on port `7331` (customizable).

## Playground

You can open an interactive playground, generated from Swagger, at http://localhost:1337/api.

## OpenAI Coverage

<OAICoverage endDate='06-21-2024' />
@@ -1,86 +0,0 @@
---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
import { Tabs } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>
# Text Generation

Cortex's Chat API is compatible with OpenAI's [Chat Completions](https://platform.openai.com/docs/api-reference/chat) endpoint and serves as a drop-in replacement for local inference.

For local inference, Cortex is [multi-engine](#multiple-local-engines) and supports the following model formats:

- `GGUF`: a general-purpose LLM format that runs across CPUs and GPUs. Cortex implements a GGUF runtime through [llama.cpp](https://github.com/ggerganov/llama.cpp/).
- `TensorRT`: a production-ready, enterprise-grade LLM format optimized for fast inference on NVIDIA GPUs. Cortex implements a TensorRT runtime through [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM).

For remote inference, Cortex routes requests to multiple provider APIs while exposing a single, easy-to-use, OpenAI-compatible endpoint. [Read more](#remote-api-integration).
## Usage

<Tabs items={['CLI']}>
<Tabs.Tab>

```bash
# Streaming
cortex chat --model janhq/TinyLlama-1.1B-Chat-v1.0-GGUF
```

</Tabs.Tab>
</Tabs>
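
Outside the CLI, the Typescript client documented later is a direct substitute for openai-node, so a chat call presumably mirrors that library's surface. A sketch under that assumption; the `chat.completions.create` method on the Cortex client is inferred from openai-node, not confirmed here:

```js
import { Cortex } from '@janhq/cortex-node';

const cortex = new Cortex({ baseURL: 'http://localhost:1337' });

// Assumed to mirror openai-node's chat.completions.create, since the
// client is documented as a drop-in replacement for it.
const reply = await cortex.chat.completions.create({
  model: 'janhq/TinyLlama-1.1B-Chat-v1.0-GGUF',
  messages: [{ role: 'user', content: 'Tell me a joke.' }],
});
console.log(reply.choices[0].message.content);
```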

**Read more:**

- Chat Completion Object
- Chat Completions API
- Chat Completions CLI
## Capabilities

### Multiple Local Engines

Cortex scales applications from prototype to production: it runs on CPU-only laptops via llama.cpp and on GPU-accelerated clusters via TensorRT-LLM.

To learn how to configure each engine, see:

- Use llama.cpp
- Use tensorrt-llm

To learn more about our engine architecture, see:

- cortex.cpp
- cortex.llamacpp
- cortex.tensorRTLLM
### Multiple Remote APIs

Cortex also works as an aggregator, routing remote inference requests to multiple providers from a single endpoint (see the sketch after this list).

Currently, Cortex supports:

- OpenAI
- Groq
- Cohere
- Anthropic
- MistralAI
- Martian
- OpenRouter
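
A hypothetical sketch of what this looks like from the client side: one base URL, different model names. How Cortex maps a model name to a provider, and where provider credentials are configured, are assumptions here:

```js
// Hypothetical: the same local endpoint serving both a local GGUF model
// and a remote provider's model; routing by model name is an assumption.
async function ask(model, prompt) {
  const res = await fetch('http://localhost:1337/v1/chat/completions', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }] }),
  });
  const data = await res.json();
  return data.choices?.[0]?.message?.content;
}

console.log(await ask('llama3', 'Hi'));       // served locally via llama.cpp
console.log(await ask('gpt-4o-mini', 'Hi'));  // routed to OpenAI (hypothetical model name)
```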
@@ -1,66 +0,0 @@
---
title: Typescript Library
description: Cortex Node Client Library
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'

<Callout type="warning">
🚧 Cortex is under construction.
</Callout>

# Typescript Library

Cortex provides a robust Typescript client library designed as a **direct substitute for OpenAI's** [Node.js/Typescript library](https://github.com/openai/openai-node), enabling easy integration and streamlined workflows.
## Installation

Install the package via npm in your project:

```bash
npm install @janhq/cortex-node
```
## Usage

Transitioning from the OpenAI client library to the Cortex client library involves minimal changes; in most cases you only update the import and the client initialization.

1. Replace the OpenAI import with Cortex in your application:

```diff
- import OpenAI from 'openai';
+ import { Cortex } from '@janhq/cortex-node';
```

2. Modify the client initialization to use Cortex:

```diff
- const openai = new OpenAI({
+ const cortex = new Cortex({
  baseURL: 'BASE_URL', // The default base URL for Cortex is 'http://localhost:1337'
  apiKey: process.env['OPENAI_API_KEY'], // This can be omitted when using the default local server
});
```
### Example Usage

```js
import { Cortex } from '@janhq/cortex-node';

const cortex = new Cortex({
  baseURL: 'http://localhost:1337',
  apiKey: process.env['cortex'],
});

// Start and stop a local model, then list threads.
cortex.models.start('llama3:7b');
cortex.models.stop('llama3:7b');
cortex.threads.list();
```
@@ -1,22 +0,0 @@
---
title: Vision
description: Vision
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
@@ -1,22 +0,0 @@
---
title: Overview
description: Overview.
keywords:
  [
    Jan,
    Customizable Intelligence, LLM,
    local AI,
    privacy focus,
    free and open source,
    private and offline,
    conversational AI,
    no-subscription fee,
    large language models,
    Cortex,
    Jan,
    LLMs
  ]
---

import { Callout, Steps } from 'nextra/components'
import { Cards, Card } from 'nextra/components'
BIN docs/src/pages/docs/_assets/add_assistant.png (new file, 163 KiB)
(modified image: 237 KiB → 598 KiB)
(deleted image: 153 KiB)
BIN docs/src/pages/docs/_assets/assistant-add-dialog.png (new file, 85 KiB)
BIN docs/src/pages/docs/_assets/assistant-dropdown.png (new file, 450 KiB)
BIN docs/src/pages/docs/_assets/assistant-edit-dialog.png (new file, 118 KiB)
BIN docs/src/pages/docs/_assets/assistants-ui-overview.png (new file, 453 KiB)
(modified image: 146 KiB → 524 KiB)
(modified image: 152 KiB → 541 KiB)
BIN docs/src/pages/docs/_assets/gpu_accl.png (new file, 257 KiB)
(modified image: 162 KiB → 537 KiB)
BIN docs/src/pages/docs/_assets/hardware.png (new file, 576 KiB)
BIN docs/src/pages/docs/_assets/hf-unsloth.png (new file, 1.5 MiB)
BIN docs/src/pages/docs/_assets/hf_and_jan.png (new file, 1.2 MiB)
BIN docs/src/pages/docs/_assets/hf_token.png (new file, 487 KiB)
(modified image: 452 KiB → 1.5 MiB)
BIN docs/src/pages/docs/_assets/jan_ui.png (new file, 29 KiB)
(modified image: 327 KiB → 199 KiB)
BIN docs/src/pages/docs/_assets/mcp-on.png (new file, 337 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-1.png (new file, 126 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-10.png (new file, 992 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-2.png (new file, 439 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-3.png (new file, 284 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-4.png (new file, 514 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-5.png (new file, 128 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-6.png (new file, 970 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-7.png (new file, 364 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-8.png (new file, 110 KiB)
BIN docs/src/pages/docs/_assets/mcp-setup-9.png (new file, 513 KiB)
(modified image: 149 KiB → 632 KiB)
BIN docs/src/pages/docs/_assets/model-capabilities-edit-01.png (new file, 963 KiB)
BIN docs/src/pages/docs/_assets/model-capabilities-edit-02.png (new file, 54 KiB)
BIN docs/src/pages/docs/_assets/model-import-04.png (new file, 757 KiB)
BIN docs/src/pages/docs/_assets/model-import-05.png (new file, 137 KiB)