feat: added testcase and update qa-checklist

2025-08-19 21:59:28 +07:00 · 2025-08-19 21:59:28 +07:00 · b6813f1c7a
commit b6813f1c7a
parent 3d764a92d3
23 changed files with 1207 additions and 246 deletions
--- a/autoqa/tested/assistants/create-assistant.txt
+++ b/autoqa/tested/assistants/create-assistant.txt
@ -0,0 +1,72 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait()
+finished()
+call_user()
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+- If a dialog appears (e.g., "Help Us Improve Jan" or "New Version/Update"), dismiss it before proceeding.
+- The **Add Assistant** dialog has its own vertical scrollbar; if controls are not visible, click inside the dialog to focus it, then scroll or drag the dialog’s scrollbar handle.
+
+## User Instruction
+
+Verify that the default predefined parameters for a new assistant are correct, keys are lower snake_case, and that saving/reopening preserves the values.
+
+### Steps
+
+1. Open **Assistants** from the bottom-left menu.
+2. Click **Create Assistant** to open the **Add Assistant** dialog.
+3. In **Name**, type: `Param Tester`
+4. In **Description**, type: `For parameter editing verification.`
+5. In **Instructions**, type: `Test assistant for changing predefined parameters.`
+6. In **Predefined Parameters**, click each chip so it appears in the **Parameters** list (scroll within the dialog if needed):
+   - Stream
+   - Temperature
+   - Frequency Penalty
+   - Presence Penalty
+   - Top P
+   - Top K
+7. Verify the **default values** shown after toggling the chips match exactly:
+   - Stream: **True** (Boolean)
+   - Temperature: **0.7**
+   - Frequency Penalty: **0.7**
+   - Presence Penalty: **0.7**
+   - Top P: **0.95**
+   - Top K: **2**
+8. Click **Save**.
+9. In the Assistants list, locate **Param Tester** (scroll the list if necessary) and click its **Edit** (pencil) icon.
+10. Verify the assistant’s **Name**, **Description**, **Instructions**, and all **Parameters** are present and unchanged (scroll within the dialog if needed).
+11. Click **×** to close the dialog.
+
+## Pass/Fail Output (strict)
+- Respond in English only.
+- Return ONLY one of the following JSON objects, with no extra text.
+
+If all parameters are visible, default values match exactly, and the saved assistant reopens with the same values and texts, return:
+  {"result": true}
+
+Otherwise, return:
+  {"result": false}
+
+IMPORTANT:
+- Your response must be ONLY the JSON above.
+- Do NOT add any other text before or after the JSON.
+
+"""
--- a/autoqa/tested/assistants/default-jan-assistant.txt
+++ b/autoqa/tested/assistants/default-jan-assistant.txt
@ -0,0 +1,52 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are going to test the Jan application by verifying that a default assistant named **Jan** is present.
+
+Step-by-step instructions:
+0. Given the Jan application is already open.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
+2. In the bottom-left menu, click on **Assistants**.
+3. On the Assistants screen, verify that there is a visible assistant card named **Jan**.
+4. Confirm that it has a description under the name that starts with:
+   "Jan is a helpful desktop assistant..."
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+   If the assistant named Jan is present and its description is visible, return:
+      {"result": true}
+
+      Otherwise, return:
+      {"result": false}
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tested/models/user-start-chatting.txt
+++ b/autoqa/tested/models/user-start-chatting.txt
--- a/autoqa/tested/settings/app-data.txt
+++ b/autoqa/tested/settings/app-data.txt
@ -0,0 +1,65 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait()
+finished()
+call_user()
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+- If a dialog appears (e.g., "Help Us Improve Jan" or "New Version/Update"), dismiss it before proceeding.
+- The **Add Assistant** dialog has its own vertical scrollbar; if controls are not visible, click inside the dialog to focus it, then scroll or drag the dialog’s scrollbar handle.
+
+## User Instruction
+
+You are going to verify the App data folder path shown in Jan’s Settings.
+
+Navigation: **Settings > General**
+
+Step-by-step instructions:
+0. Given the Jan application is already opened.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing.
+2. In the bottom-left menu, click on **Settings**.
+3. In the left sidebar, click on **General**.
+4. In the **Data folder** section of **General**, locate **App data** and the line **"Default location for messages and other user data:"**. Read the displayed path.
+5. Verify the displayed path matches **one** of the expected OS-specific defaults below (accept either the standard or nightly variant):
+   - **Windows (standard):** `C:\\Users\\<Username>\\AppData\\Roaming\\Jan\\data`
+   - **Windows (nightly):** `C:\\Users\\<Username>\\AppData\\Roaming\\Jan-nightly\\data`
+   - **macOS (standard):** `/Users/<Username>/Library/Application Support/Jan/data`
+   - **macOS (nightly):** `/Users/<Username>/Library/Application Support/Jan-nightly/data`
+   - **Linux (standard):** `/home/<Username>/.local/share/Jan/data`
+   - **Linux (nightly):** `/home/<Username>/.local/share/Jan-nightly/data`
+
+Notes for verification (guidance only, do not display to user):
+- Windows paths typically start with a drive letter and include `\\AppData\\Roaming\\Jan\\data` or `\\AppData\\Roaming\\Jan-nightly\\data`.
+- macOS paths start with `/Users/` and include `/Library/Application Support/Jan/data` or `/Library/Application Support/Jan-nightly/data`.
+- Linux paths start with `/home/` and include `/.local/share/Jan/data` or `/.local/share/Jan-nightly/data`.
+- If the **Data folder** section is not visible, scroll down within **General** until it appears.
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   If the displayed App data path matches one of the expected OS-specific defaults above, return: {"result": True}. Otherwise, return: {"result": False}.
+
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+
+"""
--- a/autoqa/tested/settings/check-for-updates.txt
+++ b/autoqa/tested/settings/check-for-updates.txt
@ -0,0 +1,55 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait()
+finished()
+call_user()
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+- If a dialog appears (e.g., "Help Us Improve Jan" or "New Version/Update"), dismiss it before proceeding.
+- If **New Version Available** popup appears on app launch (older versions), click **Remind Me Later** to dismiss it before continuing.
+
+## User Instruction
+
+You are going to test the **Check for Updates** function in Jan’s Settings.
+
+Navigation: **Settings > General**
+
+Step-by-step instructions:
+0. Given the Jan application is already opened.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing.
+2. If a **New Version Available** popup (e.g., "New Version 0.6.8 Update Available") appears immediately on startup, click **Remind Me Later** to dismiss it before continuing.
+3. In the bottom-left menu, click on **Settings**.
+4. In the left sidebar, click on **General**.
+5. In the **General** section, locate **Check for Updates** and click the button.
+6. Verify the behavior:
+   - If Jan is already up to date, a message such as **"You're running the latest version"** should appear.
+   - If a new version is available, a popup should appear in the bottom-right corner with text like **"New Version X.Y.Z Update Available"** (e.g., "New Version 0.6.8 Update Available"), confirming the update check works as expected.
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   If the **Check for Updates** button correctly shows either "You're running the latest version" or the new version popup (e.g., "New Version 0.6.8 Update Available"), return: {"result": True}. Otherwise, return: {"result": False}.
+
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tested/settings/enable-mcp-server.txt
+++ b/autoqa/tested/settings/enable-mcp-server.txt
@ -0,0 +1,55 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are going to test the Jan application by verifying that enabling Experimental Features reveals the MCP Servers section in Settings.
+
+Step-by-step instructions:
+0. Given the Jan application is already open.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
+2. In the bottom-left menu, click **Settings**.
+3. In the left sidebar, make sure **General** is selected.
+4. Scroll down to the **Advanced** section.
+5. Locate the toggle labeled **Experimental Features** and switch it ON.
+6. Observe the **Settings** sidebar.
+7. Verify that a new section called **MCP Servers** appears.
+8. Click on **MCP Servers** in the sidebar to ensure it opens and displays its content correctly.
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+   If the MCP Servers section appears after enabling Experimental Features and you can open it successfully, return:
+    {"result": true}
+
+    Otherwise, return:
+    {"result": false}
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tested/settings/extensions.txt
+++ b/autoqa/tested/settings/extensions.txt
@ -0,0 +1,56 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are going to test the Jan application by verifying the available extensions listed under Settings → Extensions.
+
+Step-by-step instructions:
+0. Given the Jan application is already open.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
+2. In the bottom-left corner, click **Settings**.
+3. In the left sidebar of Settings, click on **Extensions**.
+4. In the main panel, confirm that the following four extensions are listed:
+   - Jan Assistant
+   - Conversational
+   - Download Manager
+   - llama.cpp Inference Engine
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   If all four extensions are present, return:
+      {"result": true}
+
+   Otherwise, return:
+      {"result": false}
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tested/settings/hardware-info.txt
+++ b/autoqa/tested/settings/hardware-info.txt
@ -1,4 +1,32 @@
 prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
 You are going to test the Jan application by verifying that the hardware information is displayed correctly in the Settings panel.

 Step-by-step instructions:
@ -37,7 +65,7 @@ Step-by-step instructions:
 ---

 **GPUs**
- This section is located at the bottom of the Hardware page — **scroll down if it is not immediately visible**.
+- This section is located at the bottom of the Hardware page.
 - If the system has a GPU:
  - It should display the GPU name (e.g., NVIDIA GeForce GTX 1080)
  - A toggle should be available to enable or disable GPU usage
@ -50,11 +78,17 @@ Step-by-step instructions:
 - Ensure that there are **no error messages** in the UI.
 - The layout should appear clean and correctly rendered with no broken visual elements.

-If all sections display relevant hardware information accurately and the interface is error-free, return:
-{"result": true}
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text

-Otherwise, return:
-{"result": false}
+   If all sections display relevant hardware information accurately and the interface is error-free, return:
+    {"result": true}

-Use only plain ASCII characters in your response. Do NOT use Unicode symbols.
+    Otherwise, return:
+    {"result": false}
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
 """
--- a/autoqa/tested/settings/providers-available.txt
+++ b/autoqa/tested/settings/providers-available.txt
@ -0,0 +1,58 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are going to test the Jan application by verifying that all expected model providers are listed in the Settings panel.
+
+Step-by-step instructions:
+0. Given the Jan application is already opened.
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
+2. In the bottom-left menu, click on **Settings**.
+3. In the left sidebar of Settings, click on **Model Providers**.
+4. In the main panel, verify that the following model providers are listed:
+   - Llama.cpp
+   - OpenAI
+   - Anthropic
+   - Cohere
+   - OpenRouter
+   - Mistral
+   - Groq
+   - Gemini
+   - Hugging Face
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   If all the providers are visible, return: {"result": True}. Otherwise, return: {"result": False}.
+
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tested/settings/shortcuts.txt
+++ b/autoqa/tested/settings/shortcuts.txt
@ -0,0 +1,58 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait()
+finished()
+call_user()
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are going to verify that the Shortcuts list is correctly shown.
+
+Steps:
+1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it.
+2. If a dialog appears in the bottom-right corner about a **New Version / Update Available**, click **Remind Me Later** to dismiss it.
+3. Open **Settings** from the bottom-left menu.
+4. Click **Shortcuts** in the left sidebar.
+5. In the main panel, verify the following shortcuts are visible and correctly listed (order may vary):
+   - New Chat — Ctrl N
+   - Toggle Sidebar — Ctrl B
+   - Zoom In — Ctrl +
+   - Zoom Out — Ctrl -
+   - Send Message — Enter
+   - New Line — Shift Enter
+   - Go to Settings — Ctrl ,
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- Respond in English only.
+- Return ONLY one of the following JSON objects, with no extra text.
+
+If all shortcuts are present and correct, return:
+  {"result": true}
+
+Otherwise, return:
+  {"result": false}
+
+IMPORTANT:
+- Your response must be ONLY the JSON above.
+- Do NOT add any other text before or after the JSON.
+
+"""
--- a/autoqa/tests/base/default-jan-assistant.txt
+++ b/autoqa/tests/base/default-jan-assistant.txt
@ -1,19 +0,0 @@
-prompt = """
-You are going to test the Jan application by verifying that a default assistant named **Jan** is present.
-
-Step-by-step instructions:
-0. Given the Jan application is already open.
-1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
-2. In the bottom-left menu, click on **Assistants**.
-3. On the Assistants screen, verify that there is a visible assistant card named **Jan**.
-4. Confirm that it has a description under the name that starts with:
-   "Jan is a helpful desktop assistant..."
-
-If the assistant named Jan is present and its description is visible, return:
-{"result": true}
-
-Otherwise, return:
-{"result": false}
-
-Only use plain ASCII characters in your response. Do NOT use Unicode symbols.
-"""
--- a/autoqa/tests/base/enable-mcp-server.txt
+++ b/autoqa/tests/base/enable-mcp-server.txt
@ -1,22 +0,0 @@
-prompt = """
-You are going to test the Jan application by verifying that enabling Experimental Features reveals the MCP Servers section in Settings.
-
-Step-by-step instructions:
-0. Given the Jan application is already open.
-1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
-2. In the bottom-left menu, click **Settings**.
-3. In the left sidebar, make sure **General** is selected.
-4. Scroll down to the **Advanced** section.
-5. Locate the toggle labeled **Experimental Features** and switch it ON.
-6. Observe the **Settings** sidebar.
-7. Verify that a new section called **MCP Servers** appears.
-8. Click on **MCP Servers** in the sidebar to ensure it opens and displays its content correctly.
-
-If the MCP Servers section appears after enabling Experimental Features and you can open it successfully, return:
-{"result": true}
-
-Otherwise, return:
-{"result": false}
-
-Only use plain ASCII characters in your response. Do NOT use Unicode symbols.
-"""
--- a/autoqa/tests/base/extensions.txt
+++ b/autoqa/tests/base/extensions.txt
@ -1,22 +0,0 @@
-prompt = """
-You are going to test the Jan application by verifying the available extensions listed under Settings → Extensions.
-
-Step-by-step instructions:
-0. Given the Jan application is already open.
-1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
-2. In the bottom-left corner, click **Settings**.
-3. In the left sidebar of Settings, click on **Extensions**.
-4. In the main panel, confirm that the following four extensions are listed:
-   - Jan Assistant
-   - Conversational
-   - Download Manager
-   - llama.cpp Inference Engine
-
-If all four extensions are present, return:
-{"result": true}
-
-Otherwise, return:
-{"result": false}
-
-In all responses, use only plain ASCII characters. Do NOT use Unicode symbols.
-"""
--- a/autoqa/tests/base/models/download-model.txt
+++ b/autoqa/tests/base/models/download-model.txt
@ -0,0 +1,62 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+- You need to clear the search bar after searching the models
+
+## User Instruction
+
+You are going to test the Jan application by downloading and verifying models.
+
+Step-by-step instructions:
+Step-by-step instructions:
+1. Given the Jan application is already opened.
+2. In the **bottom-left corner**, click the **Hub** menu item.
+3. Locate the **qwen3-0.6B** model in the Hub list:
+   - If the button says **Download**, click it to download the model.
+   - If the button says **Use**, the model is already installed.
+   - Clear the search bar once the download is done
+4. Locate the **lucy-128k-gguf** model in the Hub list:
+   - If the button says **Download**, click it to download the model.
+   - If the button says **Use**, the model is already installed.
+   - Clear the search bar once the download is done
+5. Wait for both models to finish downloading and become ready (if downloading was required).
+6. Once available, toggle the **Downloaded** switch (white chip on the gray rounded button located to the left of the word "downloaded" on the top right of the app) to view only downloaded models.
+7. Verify that both **qwen3-0.6B** and **lucy-128k-gguf** appear in the downloaded models list.
+8. Navigate to **Settings > Model Providers**.
+9. In the left sidebar, click on **Llama.cpp**.
+10. Verify that both **qwen3-0.6B** and **lucy-128k-gguf** are listed under the Models section..
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language.
+- You MUST return ONLY the JSON format below, nothing else.
+- Do NOT add any explanations, thoughts, or additional text.
+
+   If both models are successfully downloaded, appear under the Downloaded toggle, and are displayed under Settings > Model Providers > Llama.cpp, return: {"result": True}. Otherwise, return: {"result": False}.
+
+IMPORTANT:
+- Your response must be ONLY the JSON above.
+- Do NOT add any other text before or after the JSON.
+"""
--- a/autoqa/tests/base/providers-available.txt
+++ b/autoqa/tests/base/providers-available.txt
@ -1,22 +0,0 @@
-prompt = """
-You are going to test the Jan application by verifying that all expected model providers are listed in the Settings panel.
-
-Step-by-step instructions:
-0. Given the Jan application is already opened.
-1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing. This ensures full visibility of the interface.
-2. In the bottom-left menu, click on **Settings**.
-3. In the left sidebar of Settings, click on **Model Providers**.
-4. In the main panel, verify that the following model providers are listed:
-   - Llama.cpp
-   - OpenAI
-   - Anthropic
-   - Cohere
-   - OpenRouter
-   - Mistral
-   - Groq
-   - Gemini
-   - Hugging Face
-
-If all the providers are visible, return: {"result": true}. Otherwise, return: {"result": false}.
-Use only plain ASCII characters in all responses. Do NOT use Unicode symbols.
-"""
--- a/autoqa/tests/migration/assistants/setup-chat-with-assistant.txt
+++ b/autoqa/tests/migration/assistants/setup-chat-with-assistant.txt
@ -1,3 +1,31 @@
+prompt = """
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
 You are setting up a chat thread using a custom assistant in the OLD version of the Jan application.

 PHASE: SETUP CHAT THREAD (OLD VERSION)
@ -44,3 +72,4 @@ CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
 IMPORTANT: 
 - Your response must be ONLY the JSON above
 - Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tests/migration/assistants/setup-create-assistants.txt
+++ b/autoqa/tests/migration/assistants/setup-create-assistants.txt
@ -1,53 +1,77 @@
-You are testing custom assistants persistence across Jan application upgrade.
+prompt = """
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 

-PHASE 1 - SETUP (OLD VERSION):
-Step-by-step instructions for creating assistants in the OLD version:
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are setting up custom assistants in the OLD version of the Jan application.
+
+PHASE: SETUP CUSTOM ASSISTANTS (OLD VERSION)
+
+Step-by-step instructions:

 1. Open the Jan application (OLD version).

-2. Create the first assistant - Python Tutor:
-   - In the bottom-left corner, click **Assistants**.
-   - Click the **+** button to create a new assistant.
-   - In the **Add Assistant** modal:
-     - Select an emoji for the assistant.
-     - Set **Name**: `Python Tutor`
-     - Set **Description**: `A helpful Python programming tutor`
-     - Set **Instructions**:
-       ```
-       You are an expert Python tutor. Always explain concepts clearly with examples. Use encouraging language and provide step-by-step solutions.
-       ```
-     - Click **Save**.
+2. Navigate to Assistants:
+   - In the bottom-left menu, click **Assistants**.

-3. Create the second assistant - Creative Writer:
-   - Click the **+** button to create another assistant.
-   - In the **Add Assistant** modal:
-     - Select a different emoji.
-     - Set **Name**: `Creative Writer`
-     - Set **Description**: `A creative writing assistant for stories and poems`
-     - Set **Instructions**:
-       ```
-       You are a creative writing assistant. Help users write engaging stories, poems, and creative content. Be imaginative and inspiring.
-       ```
-     - Click **Save**.
+3. Create first custom assistant:
+   - Click **Create Assistant** button.
+   - Enter name: `Python Tutor`
+   - Enter description: `A helpful Python programming tutor that explains concepts clearly and provides code examples.`
+   - Enter instructions: `You are a Python programming tutor. Help users learn Python by explaining concepts clearly, providing code examples, and answering questions about Python programming.`
+   - Click **Create** button.

-4. Verify both assistants appear in the list:
-   - Return to the **Assistants** section.
-   - Confirm you see both `Python Tutor` and `Creative Writer`.
-   - Confirm the names and descriptions are correctly displayed.
+4. Create second custom assistant:
+   - Click **Create Assistant** button again.
+   - Enter name: `Creative Writer`
+   - Enter description: `A creative writing assistant that helps with storytelling, poetry, and creative content.`
+   - Enter instructions: `You are a creative writing assistant. Help users develop stories, write poetry, and create engaging creative content. Provide constructive feedback and creative suggestions.`
+   - Click **Create** button.

-5. Return the result in the exact JSON format:
+5. Verify assistants were created:
+   - Check that both `Python Tutor` and `Creative Writer` appear in the assistants list.
+   - Verify their names, descriptions, and instructions are correct.
+
+6. Return result:
+   - If both assistants are created successfully with correct details, return:
+     {"result": True, "phase": "setup_complete"}
+   - If there are any issues with creation or verification, return:
+     {"result": False, "phase": "setup_failed"}

 CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
 - You MUST respond in English only, not any other language
 - You MUST return ONLY the JSON format below, nothing else
 - Do NOT add any explanations, thoughts, or additional text

-   If both assistants were created successfully with the correct metadata and parameters, you MUST return exactly:
-   {"result": True, "phase": "setup_complete"}
-
-   If there were any issues, you MUST return exactly:
-   {"result": False, "phase": "setup_failed"}
+   - If both assistants are created successfully, return:
+     {"result": True, "phase": "setup_complete"}
+   - If there are any issues, return:
+     {"result": False, "phase": "setup_failed"}

 IMPORTANT: 
 - Your response must be ONLY the JSON above
 - Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tests/migration/assistants/verify-chat-with-assistant-persistence.txt
+++ b/autoqa/tests/migration/assistants/verify-chat-with-assistant-persistence.txt
@ -1,37 +1,74 @@
-You are verifying that a previously created chat thread with a custom assistant persists and functions correctly after upgrading the Jan application.
+prompt = """
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 

-PHASE: VERIFY CHAT THREAD (NEW VERSION)
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+You are verifying that chat threads with custom assistants persist after upgrading to the NEW version of the Jan application.
+
+PHASE: VERIFY CHAT THREAD PERSISTENCE (NEW VERSION)

 Step-by-step instructions:

-1. Open the Jan application (NEW version after upgrade).
+1. Open the Jan application (NEW version).

-2. Verify that the previous chat thread still exists:
-   - Look in the **left sidebar** under the **Recents** section.
-   - Confirm that a thread exists with the title: `Hello world` (this is based on the initial message sent).
-   - Click on that thread to open it.
+2. Check for existing chat threads:
+   - Look in the left sidebar for any existing chat threads.
+   - If you see threads, click on one to open it.

-3. Verify the assistant identity:
-   - In the opened thread, look at the top of the message from the assistant.
-   - Confirm it shows the assistant name: `Python Tutor`, along with the selected emoji next to it.
-   - Confirm that the assistant’s previous response is visible.
+3. Verify the custom assistant is still available:
+   - In the main chat panel, click the assistant icon at the top.
+   - Check if the custom assistant `Python Tutor` is still available in the list.

-4. Send a follow-up test message:
-   - Type: `Can you explain how for loops work in Python?` and press Enter.
-   - Wait for a complete response from the assistant.
+4. Verify the model is still available:
+   - Click the **Select a model** button below the chat input.
+   - Check if `jan-nano-gguf` is still available under the `Llama.Cpp` section.

-5. Verify correct behavior:
+5. Test chat functionality:
+   - Select the `Python Tutor` assistant.
+   - Select the `jan-nano-gguf` model.
+   - Send a test message: `Hello, can you help me with Python?`
+   - Wait for a response.
+
+6. Return result:
+   - If the chat thread, assistant, and model all persist and work correctly, return:
+     {"result": True, "phase": "verification_complete"}
+   - If any of these elements are missing or not working, return:
+     {"result": False, "phase": "verification_failed"}

 CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
 - You MUST respond in English only, not any other language
 - You MUST return ONLY the JSON format below, nothing else
 - Do NOT add any explanations, thoughts, or additional text

-   - If the assistant responds clearly and informatively, maintaining the tutoring tone, and the thread identity (`Python Tutor`) is preserved, return:
-     {"result": true, "phase": "verification_complete"}
-   - If the thread is missing, the assistant identity is incorrect, or the assistant fails to respond, return:
-     {"result": false, "phase": "verification_failed"}
+   - If everything persists and works correctly, return:
+     {"result": True, "phase": "verification_complete"}
+   - If there are any issues, return:
+     {"result": False, "phase": "verification_failed"}

 IMPORTANT: 
 - Your response must be ONLY the JSON above
 - Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tests/migration/assistants/verify-create-assistant-persistence.txt
+++ b/autoqa/tests/migration/assistants/verify-create-assistant-persistence.txt
@ -1,48 +1,73 @@
-You are verifying that custom assistants persist correctly after upgrading the Jan application.
+prompt = """
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 

-PHASE 2 - VERIFICATION (NEW VERSION):
-Step-by-step instructions for verifying assistant persistence in the NEW version:
+## Output Format
+```\nThought: ...
+Action: ...\n```

-1. Open the Jan application (NEW version after upgrade).
+## Action Space

-2. Verify that previously created assistants are preserved:
-   - In the bottom-left corner, click **Assistants**.
-   - Confirm that you see the following assistants in the list:
-     - Default assistant: `Jan`
-     - Custom assistant: `Python Tutor`
-     - Custom assistant: `Creative Writer`
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.

-3. Verify the details of each assistant:

-   - Click on Edit (Pencil Icon) of `Python Tutor`:
-     - Confirm **Name**: `Python Tutor`
-     - Confirm **Description**: `A helpful Python programming tutor`
-     - Confirm **Instructions** contain:
-       ```
-       You are an expert Python tutor. Always explain concepts clearly with examples. Use encouraging language and provide step-by-step solutions.
-       ```
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.

-   - Click on Edit (Pencil Icon) of `Creative Writer`:
-     - Confirm **Name**: `Creative Writer`
-     - Confirm **Description**: `A creative writing assistant for stories and poems`
-     - Confirm **Instructions** contain:
-       ```
-       You are a creative writing assistant. Help users write engaging stories, poems, and creative content. Be imaginative and inspiring.
-       ```
+## User Instruction

-4. Return the verification result:
+You are verifying that custom assistants persist after upgrading to the NEW version of the Jan application.
+
+PHASE: VERIFY ASSISTANT PERSISTENCE (NEW VERSION)
+
+Step-by-step instructions:
+
+1. Open the Jan application (NEW version).
+
+2. Navigate to Assistants:
+   - In the bottom-left menu, click **Assistants**.
+
+3. Verify first custom assistant:
+   - Look for the `Python Tutor` assistant in the list.
+   - Click on it to view details.
+   - Verify the name is exactly: `Python Tutor`
+   - Verify the description contains: `A helpful Python programming tutor that explains concepts clearly and provides code examples.`
+   - Verify the instructions contain: `You are a Python programming tutor. Help users learn Python by explaining concepts clearly, providing code examples, and answering questions about Python programming.`
+
+4. Verify second custom assistant:
+   - Look for the `Creative Writer` assistant in the list.
+   - Click on it to view details.
+   - Verify the name is exactly: `Creative Writer`
+   - Verify the description contains: `A creative writing assistant that helps with storytelling, poetry, and creative content.`
+   - Verify the instructions contain: `You are a creative writing assistant. Help users develop stories, write poetry, and create engaging creative content. Provide constructive feedback and creative suggestions.`
+
+5. Return result:
+   - If both assistants persist with correct names, descriptions, and instructions, return:
+     {"result": True, "phase": "verification_complete"}
+   - If any assistant is missing or has incorrect details, return:
+     {"result": False, "phase": "verification_failed"}

 CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
 - You MUST respond in English only, not any other language
 - You MUST return ONLY the JSON format below, nothing else
 - Do NOT add any explanations, thoughts, or additional text

-If all custom assistants are preserved with correct settings and parameters, return EXACTLY:
-{"result": true, "phase": "verification_complete"}
-
-If any assistants are missing or have incorrect settings, return EXACTLY:
-{"result": false, "phase": "verification_failed"}
+   - If both assistants persist correctly, return:
+     {"result": True, "phase": "verification_complete"}
+   - If there are any issues, return:
+     {"result": False, "phase": "verification_failed"}

 IMPORTANT: 
 - Your response must be ONLY the JSON above
 - Do NOT add any other text before or after the JSON
+"""
--- a/autoqa/tests/migration/models/setup-download-models.txt
+++ b/autoqa/tests/migration/models/setup-download-models.txt
@ -1,51 +1,83 @@
 prompt = """
-You are testing comprehensive model functionality persistence across Jan application upgrade.
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 

-PHASE 1 - SETUP (OLD VERSION):
-Step-by-step instructions for OLD version setup:
+## Output Format
+```\nThought: ...
+Action: ...\n```

-1. Given the Jan application is already opened (OLD version).
+## Action Space

-2. Download multiple models from Hub:
-   - Click the **Hub** menu in the bottom-left corner
-   - Find and download **jan-nano-gguf** model
-   - Wait for download to complete (shows "Use" button)
-   - Find and download **gemma-2-2b-instruct-gguf** model if available
-   - Wait for second download to complete
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.

-3. Test downloaded models in Hub:
-   - Verify both models show **Use** button instead of **Download**
-   - Click the **Downloaded** filter toggle on the right
-   - Verify both models appear in the downloaded models list
-   - Turn off the Downloaded filter

-4. Test models in chat:
-   - Click **New Chat**
-   - Select **jan-nano-gguf** from model dropdown
-   - Send: "Hello, can you tell me what model you are?"
-   - Wait for response
-   
-   - Create another new chat
-   - Select the second model from dropdown
-   - Send: "What's your model name and capabilities?"
-   - Wait for response
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.

-5. Configure model provider settings:
-   - Go to **Settings** > **Model Providers**
-   - Click on **Llama.cpp** section
-   - Verify downloaded models are listed in the Models section
-   - Check that both models show correct names
-   - Try enabling **Auto-Unload Old Models** option
-   - Try adjusting **Context Length** for one of the models
+## User Instruction

-6. Test model settings persistence:
-   - Close Jan completely
-   - Reopen Jan
-   - Go to Settings > Model Providers > Llama.cpp
-   - Verify the Auto-Unload setting is still enabled
-   - Verify model settings are preserved
+You are setting up models in the OLD version of the Jan application.

-If all models download successfully, appear in Hub with "Use" status, work in chat, and settings are preserved, return: {"result": True, "phase": "setup_complete"}, otherwise return: {"result": False, "phase": "setup_failed"}.
+PHASE: SETUP MODELS (OLD VERSION)

-In all your responses, use only plain ASCII characters. Do NOT use Unicode symbols.
+Step-by-step instructions:
+
+1. Open the Jan application (OLD version).
+
+2. Navigate to Hub:
+   - In the bottom-left corner, click **Hub**.
+
+3. Download first model:
+   - Find the model named: `jan-nano-gguf`
+   - Click the **Download** button.
+   - Wait for the download to complete (the button changes to **Use**).
+   - Click the **Use** button to return to the Chat UI.
+
+4. Download second model:
+   - Go back to **Hub**.
+   - Find the model named: `gemma-2-2b-instruct-gguf`
+   - Click the **Download** button.
+   - Wait for the download to complete (the button changes to **Use**).
+   - Click the **Use** button to return to the Chat UI.
+
+5. Verify models are available:
+   - In the Chat UI, click the **Select a model** button below the chat input.
+   - Check that both models appear under the `Llama.Cpp` section:
+     - `jan-nano-gguf`
+     - `gemma-2-2b-instruct-gguf`
+
+6. Test model functionality:
+   - Select `jan-nano-gguf` as the model.
+   - Type a test message: `Hello, can you respond?`
+   - Press Enter and wait for a response.
+
+7. Return result:
+   - If both models are downloaded and functional, return:
+     {"result": True, "phase": "setup_complete"}
+   - If there are any issues with downloads or functionality, return:
+     {"result": False, "phase": "setup_failed"}
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   - If both models are downloaded and working, return:
+     {"result": True, "phase": "setup_complete"}
+   - If there are any issues, return:
+     {"result": False, "phase": "setup_failed"}
+
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
 """
--- a/autoqa/tests/migration/models/verify-model-persistence.txt
+++ b/autoqa/tests/migration/models/verify-model-persistence.txt
@ -1,39 +1,84 @@
 prompt = """
-You are verifying that model downloads and settings persist after Jan application upgrade.
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 

-PHASE 2 - VERIFICATION (NEW VERSION):
-Step-by-step instructions for NEW version verification:
+## Output Format
+```\nThought: ...
+Action: ...\n```

-1. Given the Jan application is already opened (NEW version after upgrade).
+## Action Space

-2. Verify models in Hub:
-   - Click the **Hub** menu in the bottom-left corner
-   - Look for the models that were downloaded: **jan-nano-gguf** and others
-   - Verify they show **Use** button instead of **Download** button
-   - Click the **Downloaded** filter toggle on the right
-   - Verify downloaded models appear in the filtered list
-   - Turn off the Downloaded filter
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.

-3. Verify models are available in chat:
-   - Click **New Chat**
-   - Click on the model dropdown
-   - Verify downloaded models appear in the selectable list
-   - Select **jan-nano-gguf**
-   - Send: "Are you working correctly after the app upgrade?"
-   - Wait for response - should work normally

-4. Verify model provider settings:
-   - Go to **Settings** > **Model Providers**
-   - Click on **Llama.cpp** section (in the left sidebar, NOT the toggle)
-   - Verify downloaded models are listed in the Models section with correct names
-   - Check that any previously configured settings are preserved
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.

-5. Test model management features:
-   - In the Models list, verify you can see model details
-   - Test that you can still start/stop models if applicable
-   - Verify model functionality is intact
+## User Instruction

-If all downloaded models are preserved, show correct status in Hub, work in chat, and model provider settings are maintained, return: {"result": True, "phase": "verification_complete"}, otherwise return: {"result": False, "phase": "verification_failed"}.
+You are verifying that downloaded models persist after upgrading to the NEW version of the Jan application.

-In all your responses, use only plain ASCII characters. Do NOT use Unicode symbols.
+PHASE: VERIFY MODEL PERSISTENCE (NEW VERSION)
+
+Step-by-step instructions:
+
+1. Open the Jan application (NEW version).
+
+2. Check Hub for downloaded models:
+   - In the bottom-left corner, click **Hub**.
+   - Look for the **Downloaded** filter toggle on the right side.
+   - Click the **Downloaded** filter to show only downloaded models.
+
+3. Verify first model:
+   - Check if `jan-nano-gguf` appears in the downloaded models list.
+   - Verify it shows the **Use** button (not **Download**).
+
+4. Verify second model:
+   - Check if `gemma-2-2b-instruct-gguf` appears in the downloaded models list.
+   - Verify it shows the **Use** button (not **Download**).
+
+5. Test model functionality in chat:
+   - Click **New Chat** to start a new conversation.
+   - Click the **Select a model** button below the chat input.
+   - Check if both models appear under the `Llama.Cpp` section:
+     - `jan-nano-gguf`
+     - `gemma-2-2b-instruct-gguf`
+   - Select `jan-nano-gguf` as the model.
+   - Send a test message: `Hello, are you still working after the upgrade?`
+   - Wait for a response.
+
+6. Check model provider settings:
+   - Go to **Settings** > **Model Providers**.
+   - Click on **Llama.cpp** section.
+   - Verify both models are listed in the Models section.
+
+7. Return result:
+   - If both models persist and are functional, return:
+     {"result": True, "phase": "verification_complete"}
+   - If any models are missing or not working, return:
+     {"result": False, "phase": "verification_failed"}
+
+CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
+- You MUST respond in English only, not any other language
+- You MUST return ONLY the JSON format below, nothing else
+- Do NOT add any explanations, thoughts, or additional text
+
+   - If both models persist and work correctly, return:
+     {"result": True, "phase": "verification_complete"}
+   - If there are any issues, return:
+     {"result": False, "phase": "verification_failed"}
+
+IMPORTANT: 
+- Your response must be ONLY the JSON above
+- Do NOT add any other text before or after the JSON
 """
--- a/autoqa/tests/template.txt
+++ b/autoqa/tests/template.txt
@ -0,0 +1,31 @@
+prompt = """
+
+You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task. 
+
+## Output Format
+```\nThought: ...
+Action: ...\n```
+
+## Action Space
+
+click(start_box='<|box_start|>(x1,y1)<|box_end|>')
+left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
+right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
+drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
+hotkey(key='')
+type(content='') #If you want to submit your input, use \"\
+\" at the end of `content`.
+scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
+wait() #Sleep for 5s and take a screenshot to check for any changes.
+finished()
+call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
+
+
+## Note
+- Use Chinese in `Thought` part.
+- Summarize your next action (with its target element) in one sentence in `Thought` part.
+
+## User Instruction
+
+
+"""
--- a/autoqa/windows-qa-checklist.md
+++ b/autoqa/windows-qa-checklist.md
@ -0,0 +1,256 @@
+# I. Before release 
+
+## A. Initial update / migration Data check
+
+Before testing, set-up the following in the old version to make sure that we can see the data is properly migrated:
+- [x] Changing appearance / theme to something that is obviously different from default set-up 🔥
+- [x] Ensure there are a few chat threads 🔥🔥🔥
+- [x] Ensure there are a few favourites / star threads 🔥🔥🔥
+- [x] Ensure there are 2 model downloaded 🔥🔥
+- [x] Ensure there are 2 import on local provider (llama.cpp) 
+- [x] Modify MCP servers list and add some ENV value to MCP servers
+- [x] Modify Local API Server 🔥
+- [x] HTTPS proxy config value 🔥
+- [x] Add 2 custom assistants to Jan 🔥🔥
+- [x] Create a new chat with the custom assistant 🔥🔥🔥
+- [x] Change the `App Data` to some other folder
+- [x] Create a Custom Provider 🔥🔥 (Not verified yet)
+- [x] Disabled some model providers 🔥🔥🔥
+#### Validate that the update does not corrupt existing user data or settings (before and after update show the same information):
+- [x] Threads
+	- [x] Previously used model and assistants is shown correctly
+	- [x] Can resume chat in threads with the previous context
+- [x] Assistants
+- Settings:
+	- [x] Appearance
+	- [x] MCP Servers 
+	- [x] Local API Server
+	- [x] HTTPS Proxy
+- [x] Custom Provider Set-up
+
+#### In `Hub`:
+- [x] Can see model from HF listed properly 🔥
+- [x] Downloaded model will show `Use` instead of `Download` ✅
+- [x] Toggling on `Downloaded` on the right corner show the correct list of downloaded models 🔥🔥
+
+#### In `Settings -> General`:
+- [x] Ensure the `App Data` path is the same ✅ 
+- [x] Click Open Logs, App Log will show 🔥
+	
+#### In `Settings -> Model Providers`:
+- [x] Llama.cpp still listed downloaded models and user can chat with the models 🔥🔥🔥
+- [x] Llama.cpp still listed imported models and user can chat with the models 
+- [x] Remote model still retain previously set up API keys and user can chat with model from the provider without having to re-enter API keys
+- [x] Enabled and Disabled Model Providers stay the same as before update 🔥
+
+#### In `Settings -> Extensions`, check that following exists: ✅
+- [x] Conversational ✅ 
+- [x] Jan Assistant ✅
+- [x] Download Manager ✅ 
+- [x] llama.cpp Inference Engine ✅
+
+## B. `Settings` 
+
+#### In `General`:
+- [x] Ensure `Community` links work and point to the correct website 🔥🔥 (Scrolldown problem)
+- [x] Ensure the `Check for Updates` function detect the correct latest version ✅
+- [ ] [ENG] Create a folder with un-standard character as title (e.g. Chinese character) => change the `App data` location to that folder => test that model is still able to load and run properly.
+#### In `Appearance`:
+- [x] Toggle between different `Theme` options to check that they change accordingly and that all elements of the UI are legible with the right contrast:
+	- [x] Light 🔥
+	- [x] Dark 🔥
+	- [x] System (should follow your OS system settings) 🔥
+- [x] Change the following values => close the application => re-open the application => ensure that the change is persisted across session:
+	- [x] Theme 🔥
+	- [x] Font Size 🔥
+	- [x] Window Background 🔥
+	- [x] App Main View 🔥
+	- [x] Primary 🔥
+	- [x] Accent 🔥
+	- [x] Destructive 🔥
+	- [x] Chat Width 🔥
+		- [x] Ensure that when this value is changed, there is no broken UI caused by it 🔥
+	- [x] Code Block 🔥
+	- [x] Show Line Numbers 🔥
+- [ENG] Ensure that when click on `Reset` in the `Appearance` section, it reset back to the default values 🔥🔥
+- [ENG] Ensure that when click on `Reset` in the `Code Block` section, it reset back to the default values 🔥🔥
+
+#### In `Model Providers`:
+
+In `Llama.cpp`:
+- [x] After downloading a model from hub, the model is listed with the correct name under `Models` 🔥🔥🔥
+- [x] Can import `gguf` model with no error
+- [x] Imported model will be listed with correct name under the `Models`
+- [x] Check that when click `delete` the model will be removed from the list 🔥🔥
+- [x] Deleted model doesn't appear in the selectable models section in chat input (even in old threads that use the model previously)
+- [x] Ensure that user can re-import deleted imported models
+- [x] Enable `Auto-Unload Old Models`, and ensure that only one model can run / start at a time. If there are two model running at the time of enable, both of them will be stopped. 
+- [x] Disable `Auto-Unload Old Models`, and ensure that multiple models can run at the same time.
+- [x] Enable  `Context Shift` and ensure that context can run for long without encountering memory error. Use the `banana test` by turn on fetch MCP => ask local model to fetch and summarize the history of banana (banana has a very long history on wiki it turns out). It should run out of context memory sufficiently fast if `Context Shift` is not enabled.
+- [x] [New] Ensure that user can change the Jinja chat template of individual model and it doesn't affect the template of other model
+- [x] [New] Ensure that there is a recommended `llama.cpp` for each system and that it works out of the box for users. 🔥
+
+In Remote Model Providers:
+- [x] Check that the following providers are presence:
+	- [x] OpenAI ✅
+	- [x] Anthropic ✅
+	- [x] Cohere ✅
+	- [x] OpenRouter ✅
+	- [x] Mistral ✅
+	- [x] Groq ✅
+	- [x] Gemini ✅
+	- [x] Hugging Face ✅
+- [x] Models should appear as available on the selectable dropdown in chat input once some value is input in the API key field. (it could be the wrong API key)
+- [x] Once a valid API key is used, user can select a model from that provider and chat without any error. 
+- [x] Delete a model and ensure that it doesn't show up in the `Modesl` list view or in the selectable dropdown in chat input.
+- [x] Ensure that a deleted model also not selectable or appear in old threads that used it.
+- [x] Adding of new model manually works and user can chat with the newly added model without error (you can add back the model you just delete for testing)
+
+In Custom Providers:
+- [x] Ensure that user can create a new custom providers with the right baseURL and API key.
+- [x] Click `Refresh` should retrieve a list of available models from the Custom Providers.
+- [x] User can chat with the custom providers
+- [x] Ensure that Custom Providers can be deleted and won't reappear in a new session
+
+In general:
+- [ ] Disabled Model Provider should not show up as selectable in chat input of new thread and old thread alike (old threads' chat input should show `Select Model` instead of disabled model)
+
+#### In `Shortcuts`:
+
+Make sure the following shortcut key combo is visible and works:
+- [x] New chat ✅
+- [x] Toggle Sidebar ✅
+- [x] Zoom In ✅
+- [x] Zoom Out ✅
+- [x] Send Message ✅
+- [x] New Line ✅
+- [x] Navigation ✅
+
+#### In `Hardware`:
+Ensure that the following section information show up for hardware
+- [x] Operating System ✅ 
+- [x] CPU ✅
+- [x] Memory ✅
+- [x] GPU (If the machine has one) ✅
+	- [x] Enabling and Disabling GPUs and ensure that model still run correctly in both mode
+	- [x] Enabling or Disabling GPU should not affect the UI of the application
+
+#### In `MCP Servers`:
+- [x] Ensure that enabling the `Experimental Features` under `Advanced` in `General` will make the `MCP Servers` appear in the `Settings` list. ✅
+- [x] Disable `Experimental Features` should also disable all the tools and not show up in chat input `Tools still show up in chat input`
+- [x] Ensure that an user can create a MCP server successfully when enter in the correct information
+- [x] Ensure that `Env` value is masked by `*` in the quick view.
+- [x] If an `Env` value is missing, there should be a error pop up.
+- [x] Ensure that deleted MCP server disappear from the `MCP Server` list without any error
+- [x] Ensure that before a MCP is deleted, it will be disable itself first and won't appear on the tool list after deleted.
+- [x] Ensure that when the content of a MCP server is edited, it will be updated and reflected accordingly in the UI and when running it.
+- [x] Toggling enable and disabled of a MCP server work properly
+- [x] A disabled MCP should not appear in the available tool list in chat input
+- [x] An disabled MCP should not be callable even when forced prompt by the model (ensure there is no ghost MCP server)
+- [x] Ensure that enabled MCP server start automatically upon starting of the application
+- [x] An enabled MCP should show functions in the available tool list
+- [x] User can use a model and call different tool from multiple enabled MCP servers in the same thread
+- [x] If `Allow All MCP Tool Permissions` is disabled, in every new thread, before a tool is called, there should be a confirmation dialog pop up to confirm the action.
+- [x] When the user click `Deny`, the tool call will not be executed and return a message indicate so in the tool call result.
+- [x] When the user click `Allow Once` on the pop up, a confirmation dialog will appear again when the tool is called next time.
+- [x] When the user click `Always Allow` on the pop up, the tool will retain permission and won't ask for confirmation again. (this applied at an individual tool level, not at the MCP server level)
+- [x] If `Allow All MCP Tool Permissions` is enabled, in every new thread,  there should not be any confirmation dialog pop up when a tool is called.
+- [x] [Windows OS] When a MCP tool is called, there is no terminal window pop-up or any flashing presence.
+- [x] When the pop-up appear, make sure that the `Tool Parameters` is also shown with detail in the pop-up.
+
+#### In `Local API Server`:
+- [x] User can `Start Server` and chat with the default endpoint
+	- [x] User should see the correct model name at `v1/models`
+	- [x] User should be able to chat with it at `v1/chat/completions`
+- [x] `Open Logs` show the correct query log send to the server and return from the server
+- [x] Make sure that changing all the parameter in `Server Configuration` is reflected when `Start Server`
+
+#### In `HTTPS Proxy`:
+- [ ] Model download request goes through proxy endpoint
+
+## C. Hub
+- [x] User can click `Download` to download a model ✅
+- [x] User can cancel a model in the middle of downloading 🔥🔥🔥
+- [x] User can add a Hugging Face model detail to the list by pasting a model name / model url into the search bar and press enter 🔥
+- [x] Clicking on a listing will open up the model card information within Jan and render the HTML properly
+- [x] Clicking download work on the `Show variants` section 🔥🔥🔥
+- [x] Clicking download work inside the Model card HTML 🔥🔥🔥
+
+## D. Threads
+
+#### In the left bar:
+- [x] User can delete an old thread, and it won't reappear even when app restart
+- [x] Change the title of the thread should update its last modification date and re-organise its position in the correct chronological order on the left bar.
+- [x] The title of a new thread is the first message from the user.
+- [x] Users can starred / un-starred threads accordingly
+- [x] Starred threads should move to `Favourite` section and other threads should stay in `Recent`
+- [x] Ensure that the search thread feature return accurate result based on thread titles and contents (including from both `Favourite` and `Recent`)
+- [x] `Delete All` should delete only threads in the `Recents` section
+- [x] `Unstar All` should un-star all of the `Favourites` threads and return them to `Recent`
+
+#### In a thread:
+- [x] When `New Chat` is clicked, the assistant is set as the last selected assistant, the model selected is set as the last used model, and the user can immediately chat with the model. 
+- [x] User can conduct multi-turn conversation in a single thread without lost of data (given that `Context Shift` is not enabled)
+- [x] User can change to a different model in the middle of a conversation in a thread and the model work.
+- [x] User can click on `Regenerate` button on a returned message from the model to get a new response base on the previous context.
+- [x] User can change `Assistant` in the middle of a conversation in a thread and the new assistant setting will be applied instead.
+- [x] The chat windows can render and show all the content of a selected threads (including scroll up and down on long threads)
+- [x] Old thread retained their setting as of the last update / usage
+	- [x] Assistant option
+	- [x] Model option (except if the model / model provider has been deleted or disabled)
+- [x] User can send message with different type of text content (e.g text, emoji, ...)
+- [x] When request model to generate a markdown table, the table is correctly formatted as returned from the model.
+- [x] When model generate code, ensure that the code snippets is properly formatted according to the `Appearance -> Code Block` setting.
+- [x] Users can edit their old message and and user can regenerate the answer based on the new message
+- [x] User can click `Copy` to copy the model response
+- [x] User can click `Delete` to delete either the user message or the model response.
+- [x] The token speed appear when a response from model is being generated and the final value is show under the response. 
+- [ ] [New] Make sure that user when using IME keyboard to type Chinese and Japanese character and they press `Enter`, the `Send` button doesn't trigger automatically after each words.
+
+## E. Assistants
+- [x] There is always at least one default Assistant which is Jan ✅
+- [x] The default Jan assistant has `stream = True` by default 
+- [x] User can create / edit a new assistant with different parameters and instructions choice. 🔥
+- [x] When user delete the default Assistant, the next Assistant in line will be come the default Assistant and apply their setting to new chat accordingly.
+- [x] User can create / edit assistant from within a Chat windows (on the top left)
+
+## F. After checking everything else
+
+In `Settings -> General`:
+- [x] Change the location of the `App Data` to some other path that is not the default path
+- [x] Click on `Reset` button in `Other` to factory reset the app:
+	- [x] All threads deleted
+	- [x] All Assistant deleted except for default Jan Assistant
+	- [x] `App Data` location is reset back to default path
+	- [x] Appearance reset
+	- [x] Model Providers information all reset
+		- [x] Llama.cpp setting reset
+		- [x] API keys cleared
+		- [x] All Custom Providers deleted
+	- [x] Shortcuts reset
+	- [x] MCP Servers reset
+	- [x] Local API Server reset
+	- [x] HTTPS Proxy reset
+- [x] After closing the app, all models are unloaded properly
+- [x] Locate to the data folder using the `App Data` path information => delete the folder => reopen the app to check that all the folder is re-created with all the necessary data.
+- [x] Ensure that the uninstallation process removes the app successfully from the system.
+## G. New App Installation
+- [x] Clean up by deleting all the left over folder created by Jan
+	- [ ] On MacOS
+		- [ ] `~/Library/Application Support/Jan`
+		- [ ] `~/Library/Caches/jan.ai.app`
+	- [x] On Windows
+		- [x] `C:\Users<Username>\AppData\Roaming\Jan\`
+		- [x] `C:\Users<Username>\AppData\Local\jan.ai.app`
+	- [ ] On Linux
+		- [ ] `~/.cache/Jan`
+		- [ ] `~/.cache/jan.ai.app`
+		- [ ] `~/.local/share/Jan`
+		- [ ] `~/.local/share/jan.ai.app`
+- [x] Ensure that the fresh install of Jan launch
+- [x] Do some basic check to see that all function still behaved as expected. To be extra careful, you can go through the whole list again. However, it is more advisable to just check to make sure that all the core functionality like `Thread` and `Model Providers` work as intended.
+
+# II. After release
+- [ ] Check that the App Updater works and user can update to the latest release without any problem
+- [ ] App restarts after the user finished an update
+- [ ] Repeat section `A. Initial update / migration Data check` above to verify that update is done correctly on live version