Merge pull request #5385 from menloresearch/ramon/docs-jan-models

Added Models section with jan nano to docs, updated api server section and changelog.
This commit is contained in:
Ramon Perez 2025-06-20 15:11:20 +10:00 committed by GitHub
commit ebd9e0863e
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
15 changed files with 221 additions and 2 deletions

1
.gitignore vendored
View File

@@ -49,3 +49,4 @@ src-tauri/resources/bin
# Helper tools
.opencode
OpenCode.md
archive/

Binary file not shown.

After

Width:  |  Height:  |  Size: 817 KiB

View File

@@ -57,7 +57,7 @@ const Changelog = () => {
<p className="text-base mt-2 leading-relaxed">
Latest release updates from the Jan team. Check out our&nbsp;
<a
-href="https://github.com/orgs/menloresearch/projects/5/views/52"
+href="https://github.com/orgs/menloresearch/projects/30"
className="text-blue-600 dark:text-blue-400 cursor-pointer"
>
Roadmap

View File

@@ -0,0 +1,21 @@
---
title: "Jan v0.6.1 is here: It's a whole new vibe!"
version: 0.6.1
description: "Are you ready for the sexiest UI ever?"
date: 2025-06-19
ogImage: "/assets/images/changelog/jan-v0.6.1-ui-revamp.png"
---
import ChangelogHeader from "@/components/Changelog/ChangelogHeader"
<ChangelogHeader title="Jan v0.6.1 is here: It's a whole new vibe!" date="2025-06-19" ogImage="/assets/images/changelog/jan-v0.6.1-ui-revamp.png" />
## Highlights 🎉
- Jan's been redesigned to be faster, cleaner, and easier to use.
- You can now create assistants with custom instructions and settings from a dedicated tab.
- You can now use Jan with Menlo's models.
Update Jan or [download the latest version](https://jan.ai/).
For more details, see the [GitHub release notes](https://github.com/menloresearch/jan/releases/tag/v0.6.1).

Binary file not shown.

After

Width:  |  Height:  |  Size: 306 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 22 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.4 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 819 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 614 KiB

View File

@@ -7,6 +7,7 @@
"desktop": "Install 👋 Jan",
"threads": "Start Chatting",
"manage-models": "Manage Models",
"menlo-models": "Menlo Models",
"assistants": "Create Assistants",
"tutorials-separators": {

View File

@@ -42,7 +42,9 @@ as well after downloading it from [here](https://github.com/ggml-org/llama.cpp).
2. Add an API Key (it can be anything) or fully configure the server at [Server Settings](/docs/api-server#server-settings)
3. Click **Start Server** button
4. Wait for the confirmation message in the logs panel; your server is ready when you see: `JAN API listening at: http://127.0.0.1:1337`
5. Make sure you add an API key; it can be anything you want, such as the word "testing" or a combination of numbers and letters.
![Local API Server](./_assets/api-server2.png)
### Step 2: Test Server
The easiest way to test your server is through the API Playground:
@@ -50,8 +52,25 @@ The easiest way to test your server is through the API Playground:
2. Select a model from the dropdown menu in Jan interface
3. Try a simple request
4. View the response in real-time
5. When sending requests from another app, include the API key in the request headers.
### Step 3: Use the API
```sh
# Replace "testing-something" with your own API key
curl http://127.0.0.1:1337/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer testing-something" \
-d '{
"model": "jan-nano-gguf",
"messages": [
{
"role": "user",
"content": "Write a one-sentence bedtime story about a unicorn."
}
]
}'
```
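
The same request can also be built from another language. Below is a minimal Python sketch using only the standard library; it assumes the server runs on the default port `1337` with the `jan-nano-gguf` model and the placeholder key `testing-something`, as in the curl example above.

```python
import json
import urllib.request

API_KEY = "testing-something"  # replace with the API key you set in Jan
BASE_URL = "http://127.0.0.1:1337/v1"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completions POST request with the Bearer API key header."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

req = build_chat_request(
    "jan-nano-gguf",
    "Write a one-sentence bedtime story about a unicorn.",
)
# Send it (requires the Jan server to be running):
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```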
</Steps>
@@ -108,6 +127,8 @@ Enable **Verbose Server Logs** for detailed error messages.
- Verify your JSON request format is correct
- Verify firewall settings
- Look for detailed error messages in the logs
- Make sure you add an API key; it can be anything you want, such as the word "testing" or a combination of numbers and letters
- Include the API key in the request headers when sending requests from another app
**2. CORS Errors in Web Apps**
- Enable CORS in server settings if using from a webpage

View File

@@ -25,7 +25,7 @@ import FAQBox from '@/components/FaqBox'
![Jan's Cover Image](./_assets/jan-app.png)
-Jan is an AI chat application that runs 100% offline on your desktop and (*soon*) on mobile. Our goal is to
+Jan is a ChatGPT alternative that runs 100% offline on your desktop and (*soon*) on mobile. Our goal is to
make it easy for anyone, with or without coding skills, to download and use AI models with full control and
[privacy](https://www.reuters.com/legal/legalindustry/privacy-paradox-with-ai-2023-10-31/).

View File

@@ -0,0 +1,10 @@
{
"overview": {
"title": "Overview",
"href": "/docs/menlo-models/overview"
},
"jan-nano": {
"title": "Jan Nano",
"href": "/docs/menlo-models/jan-nano"
}
}

View File

@@ -0,0 +1,125 @@
---
title: Jan Nano
description: Jan-Nano-Gguf Model
keywords:
[
Jan,
Jan Models,
Jan Model,
Jan Model List,
Menlo Models,
Menlo Model,
Jan-Nano-Gguf,
ReZero,
Model Context Protocol,
MCP,
]
---
import { Callout } from 'nextra/components'
# Jan Nano
Jan-Nano is a compact 4-billion parameter language model specifically designed and trained for deep
research tasks. This model has been optimized to work seamlessly with Model Context Protocol (MCP) servers,
enabling efficient integration with various research tools and data sources.
The model and its variants are fully supported by Jan.
<Callout type="info">
Jan-Nano can be used with Jan's stable version, but its true capabilities shine in Jan's beta version, which
offers MCP support. You can download Jan's beta version from [here](https://jan.ai/docs/desktop/beta).
</Callout>
## System Requirements
- Minimum Requirements:
- 8GB RAM (with iQ4_XS quantization)
- 12GB VRAM (for Q8 quantization)
- CUDA-compatible GPU
- Recommended Setup:
- 16GB+ RAM
- 16GB+ VRAM
- Latest CUDA drivers
- RTX 30/40 series or newer
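
The figures above can be summarized in a small helper. This is an illustrative sketch only: the function name and thresholds are mine, derived from the requirements listed above, not an official sizing tool.

```python
def pick_quantization(vram_gb: float, ram_gb: float) -> str:
    """Suggest a Jan-Nano GGUF quantization from available memory.

    Thresholds follow the system requirements above: Q8 wants ~12GB VRAM,
    while iQ4_XS fits machines with as little as 8GB RAM.
    """
    if vram_gb >= 12:
        return "Q8"      # best quality, needs ~12GB VRAM
    if ram_gb >= 8:
        return "iQ4_XS"  # smallest recommended quant for limited setups
    return "insufficient memory for Jan-Nano"

print(pick_quantization(16, 32))  # → Q8
```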
## Using Jan-Nano
### Step 1
Download Jan Beta from [here](https://jan.ai/docs/desktop/beta).
### Step 2
Go to the Hub tab, search for Jan-Nano-Gguf, and click the download button next to the model size that best fits your system.
![Jan Nano](../_assets/jan-nano1.png)
### Step 3
Go to **Settings** > **Model Providers** > **Llama.cpp**, click the pencil icon, and enable tool use for Jan-Nano-Gguf.
### Step 4
To take advantage of Jan-Nano's full capabilities, you need to enable MCP support. We'll use it with Serper's
API. You can get a free API key from [here](https://serper.dev/); sign up and one will be generated for you immediately.
### Step 5
Add the Serper MCP server to Jan via the **Settings** > **MCP Servers** tab.
![Serper MCP](../_assets/serper-mcp.png)
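
A typical MCP server entry might look like the fragment below. This is a hedged sketch: the exact schema and the server package name depend on the Serper MCP server you install and on Jan's settings format, so treat the `command`, `args`, and key names as placeholders to adapt.

```json
{
  "mcpServers": {
    "serper": {
      "command": "npx",
      "args": ["-y", "<serper-mcp-server-package>"],
      "env": {
        "SERPER_API_KEY": "<your-serper-api-key>"
      }
    }
  }
}
```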
### Step 6
Open up a new chat and ask Jan-Nano to search the web for you.
![Jan Nano](../_assets/jan-nano-demo.gif)
## Queries to Try
Here are some example queries to showcase Jan-Nano's web search capabilities:
1. **Current Events**: What are the latest developments in renewable energy adoption in Germany and Denmark?
2. **International Business**: What is the current status of Tesla's Gigafactory in Berlin and how has it impacted the local economy?
3. **Technology Trends**: What are the newest AI developments from Google, Microsoft, and Meta that were announced this week?
4. **Global Weather**: What's the current weather forecast for Tokyo, Japan for the next 5 days?
5. **Stock Market**: What are the current stock prices for Apple, Samsung, and Huawei, and how have they performed this month?
6. **Sports Updates**: What are the latest results from the Premier League matches played this weekend?
7. **Scientific Research**: What are the most recent findings about climate change impacts in the Arctic region?
8. **Cultural Events**: What major music festivals are happening in Europe this summer and who are the headliners?
9. **Health & Medicine**: What are the latest developments in mRNA vaccine technology and its applications beyond COVID-19?
10. **Space Exploration**: What are the current missions being conducted by NASA, ESA, and China's space program?
## FAQ
- What are the recommended GGUF quantizations?
- Q8 GGUF is recommended for best performance
- iQ4_XS GGUF for very limited VRAM setups
- Avoid Q4_0 and Q4_K_M as they show significant performance degradation
- Can I run this on a laptop with 8GB RAM?
- Yes, but use the recommended quantizations (iQ4_XS)
- Note that performance may be limited with Q4 quantizations
- How much did the training cost?
- Training was done on internal A6000 clusters
- Estimated cost on RunPod would be under $100 using H200
- Hardware used:
- 8xA6000 for training code
- 4xA6000 for vllm server (inferencing)
- What frontend should I use?
- Jan Beta (recommended) - Minimalistic and polished interface
- Download link: https://jan.ai/docs/desktop/beta
- Getting Jinja errors in LM Studio?
- Use Qwen3 template from other LM Studio compatible models
- Disable “thinking” and add the required system prompt
- Fix coming soon in future GGUF releases
- Having model loading issues in Jan?
- Use latest beta version: Jan-beta_0.5.18-rc6-beta
- Ensure proper CUDA support for your GPU
- Check VRAM requirements match your quantization choice
## Resources
- [Jan-Nano Model on Hugging Face](https://huggingface.co/Menlo/Jan-nano)
- [Jan-Nano GGUF on Hugging Face](https://huggingface.co/Menlo/Jan-nano-gguf)

View File

@@ -0,0 +1,40 @@
---
title: Overview
description: Jan Models
keywords:
[
Jan,
Jan Models,
Jan Model,
Jan Model List,
Menlo Models,
Menlo Model,
Jan-Nano-Gguf,
ReZero,
Model Context Protocol,
MCP,
]
---
# Menlo Models
At Menlo, we have focused on creating a series of models that are optimized for all sorts of tasks, including
web search, deep research, robotic control, and using MCPs. Our latest model, Jan-Nano-Gguf, is available in Jan
right now and delivers excellent results on tasks that use MCPs.
You can browse all of our models and download them from the Hugging Face [Menlo Models page](https://huggingface.co/Menlo).
## Jan-Nano-Gguf (Available in Jan right now 🚀)
![Jan Nano](../_assets/jan-nano0.png)
Jan-Nano-Gguf is a 4-billion parameter model that is optimized for deep research tasks. It has been trained on a
variety of datasets and is designed to be used with the Model Context Protocol (MCP) servers.
## ReZero
ReZero (Retry-Zero) is a reinforcement learning framework that improves RAG systems by rewarding LLMs for retrying
failed queries. Traditional RAG approaches struggle when initial searches fail, but ReZero encourages persistence and
alternative strategies. This increases accuracy from 25% to 46.88% in complex information-seeking tasks.
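
The retry incentive at the heart of ReZero can be sketched as a toy reward function. This is an illustrative reduction only: the function name and weights are mine, not the actual ReZero reward design.

```python
def rezero_style_reward(answer_correct: bool, retried_after_failure: bool) -> float:
    """Toy reward in the spirit of ReZero: alongside rewarding a correct
    final answer, grant a small bonus for retrying a search after a failed
    query, so the policy learns persistence instead of giving up after
    the first miss."""
    reward = 1.0 if answer_correct else 0.0
    if retried_after_failure:
        reward += 0.2  # retry bonus; the weight here is illustrative
    return reward
```

Under this shaping, a trajectory that retries and then answers correctly outscores one that answers correctly on a lucky first try, nudging the policy toward alternative search strategies.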