feat: add recommended backend testcase

This commit is contained in:
Minh141120 2025-08-22 12:47:02 +07:00
parent 054f64bd54
commit 703395ae10
2 changed files with 61 additions and 1 deletions

@@ -0,0 +1,60 @@
prompt = """
You are a GUI agent. You are given a task and your action history, with screenshots. You need to perform the next action to complete the task.
## Output Format
```
Thought: ...
Action: ...
```
## Action Space
click(start_box='<|box_start|>(x1,y1)<|box_end|>')
left_double(start_box='<|box_start|>(x1,y1)<|box_end|>')
right_single(start_box='<|box_start|>(x1,y1)<|box_end|>')
drag(start_box='<|box_start|>(x1,y1)<|box_end|>', end_box='<|box_start|>(x3,y3)<|box_end|>')
hotkey(key='')
type(content='') # If you want to submit your input, use "\n" at the end of `content`.
scroll(start_box='<|box_start|>(x1,y1)<|box_end|>', direction='down or up or right or left')
wait() #Sleep for 5s and take a screenshot to check for any changes.
finished()
call_user() # Submit the task and call the user when the task is unsolvable, or when you need the user's help.
## Note
- Use Chinese in the `Thought` part.
- Summarize your next action (with its target element) in one sentence in the `Thought` part.
## User Instruction
You are going to verify that **Llama.cpp shows the recommended version & backend description** under Model Providers in Settings.
Steps:
1. If a dialog appears in the bottom-right corner titled **"Help Us Improve Jan"**, click **Deny** to dismiss it before continuing.
2. In the bottom-left menu, click **Settings**.
3. In the left sidebar, click **Model Providers**.
4. In the left sidebar menu, under **Model Providers**, click on **Llama.cpp**.
- Make sure to click the one in the **sidebar**, not the entry in the main panel.
- Click directly in the center of the "Llama.cpp" text label in the sidebar to open its configuration page.
5. In the **Version & Backend** section, check the description.
Verification rule:
- Consider the check **passed** if the description under Version & Backend contains:
- A **version string starting with `b<number>/`** (e.g., `b6097/win-avx2-cuda-cu12.0-x64`, `b6097/win-avx2-x64`, or `b6097/win-vulkan-x64`),
- Followed by the text **"Version and backend is the recommended backend"**.
- The exact version (e.g., b6097, b5857, b5833, etc.) may vary — any valid build number is acceptable as long as the description includes the phrase above.
- If this text is missing or different, the check **fails**.
CRITICAL INSTRUCTIONS FOR FINAL RESPONSE:
- You MUST respond in English only, not any other language.
- You MUST return ONLY the JSON format below, nothing else.
- Do NOT add any explanations, thoughts, or additional text.
If the description is exactly as expected, return: {"result": True}.
Otherwise, return: {"result": False}.
IMPORTANT:
- Your response must be ONLY the JSON above.
- Do NOT add any other text before or after the JSON.
"""
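The verification rule and the final-reply format above can be sketched in Python. The helper names below are illustrative assumptions, not part of the actual test harness; `ast.literal_eval` is used because the prompt asks for Python-style `True`/`False` literals rather than strict JSON:

```python
import ast
import re

# Hypothetical helpers mirroring the verification rule in the prompt above.
EXPECTED_PHRASE = "Version and backend is the recommended backend"
VERSION_RE = re.compile(r"^b\d+/\S+")  # e.g. "b6097/win-vulkan-x64"

def description_passes(description: str) -> bool:
    """Apply the rule: a build tag like 'b6097/...' followed by the phrase."""
    return bool(VERSION_RE.match(description)) and EXPECTED_PHRASE in description

def parse_agent_reply(reply: str) -> bool:
    """Read the agent's final {"result": True}/{"result": False} reply.

    The prompt specifies Python booleans (True/False), which are not valid
    JSON, so ast.literal_eval is used here instead of json.loads.
    """
    return bool(ast.literal_eval(reply.strip())["result"])
```

For example, `description_passes("b6097/win-avx2-x64 Version and backend is the recommended backend")` holds, while a description missing the build tag or the phrase does not.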

@@ -88,7 +88,7 @@ In `Llama.cpp`:
- [x] Disable `Auto-Unload Old Models`, and ensure that multiple models can run at the same time.
- [x] Enable `Context Shift` and ensure that a long-running context does not hit a memory error. Use the `banana test`: turn on the fetch MCP, then ask a local model to fetch and summarize the history of the banana (bananas have a very long history on wiki, it turns out). It should run out of context memory fairly quickly if `Context Shift` is not enabled.
- [x] [New] Ensure that the user can change the Jinja chat template of an individual model without affecting the templates of other models.
- [x] [New] Ensure that there is a recommended `llama.cpp` for each system and that it works out of the box for users. 🔥
- [x] [New] Ensure that there is a recommended `llama.cpp` for each system and that it works out of the box for users.
In Remote Model Providers:
- [x] Check that the following providers are present: