7.9 KiB
7.9 KiB
AutoQA Reliability Testing Guide
🚀 Comprehensive guide for running reliability tests with AutoQA to verify test case stability and reliability.
Overview
Reliability testing is designed to verify that your test cases are stable and reliable by running them multiple times. This helps identify flaky tests and ensures consistent behavior before deploying to production.
Two Testing Phases
1. Development Phase
- Purpose: Verify basic stability during development
- Runs: 5 times
- Success Rate Requirement: ≥80%
- Use Case: During development to catch obvious stability issues
2. Deployment Phase
- Purpose: Verify production readiness
- Runs: 20 times
- Success Rate Requirement: ≥90%
- Use Case: Before deploying to production to ensure reliability
Command Line Usage
Basic Reliability Testing
# Development phase (5 runs)
python main.py --enable-reliability-test --reliability-phase development
# Deployment phase (20 runs)
python main.py --enable-reliability-test --reliability-phase deployment
Custom Configuration
# Custom number of runs
python main.py --enable-reliability-test --reliability-runs 10
# Specific test file
python main.py --enable-reliability-test --reliability-test-path "tests/base/default-jan-assistant.txt"
# Custom max turns
python main.py --enable-reliability-test --reliability-phase development --max-turns 50
With ReportPortal Integration
# Development phase with ReportPortal
python main.py \
--enable-reliability-test \
--reliability-phase development \
--enable-reportportal \
--rp-token "YOUR_TOKEN" \
--rp-project "jan_reliability_tests"
# Deployment phase with ReportPortal
python main.py \
--enable-reliability-test \
--reliability-phase deployment \
--enable-reportportal \
--rp-token "YOUR_TOKEN" \
--rp-project "jan_reliability_tests"
Environment Variables
# Enable reliability testing
export ENABLE_RELIABILITY_TEST=true
# Set phase
export RELIABILITY_PHASE=deployment
# Custom runs (overrides phase)
export RELIABILITY_RUNS=15
# Specific test path
export RELIABILITY_TEST_PATH="tests/base/my-test.txt"
# Run with environment variables
python main.py --enable-reliability-test
Command Line Arguments
| Argument | Environment Variable | Default | Description |
|---|---|---|---|
--enable-reliability-test |
ENABLE_RELIABILITY_TEST |
false |
Enable reliability testing mode |
--reliability-phase |
RELIABILITY_PHASE |
development |
Testing phase: development or deployment |
--reliability-runs |
RELIABILITY_RUNS |
0 |
Custom number of runs (overrides phase) |
--reliability-test-path |
RELIABILITY_TEST_PATH |
- | Specific test file path |
Test Execution Flow
Single Test Reliability Testing
- Load Test File: Read the specified test file
- Run Multiple Times: Execute the test the specified number of times
- Track Results: Monitor success/failure for each run
- Calculate Success Rate: Determine overall reliability
- Generate Report: Provide detailed results and statistics
Multiple Tests Reliability Testing
- Scan Test Files: Find all test files in the specified directory
- Run Reliability Tests: Execute reliability testing on each test file
- Aggregate Results: Combine results from all tests
- Overall Assessment: Determine if the entire test suite is reliable
Output and Results
Success Rate Calculation
Success Rate = (Successful Runs / Total Runs) × 100
Development Phase Requirements
- Target: 5 runs
- Minimum Success Rate: 80%
- Result: PASS if ≥80%, FAIL if <80%
Deployment Phase Requirements
- Target: 20 runs
- Minimum Success Rate: 90%
- Result: PASS if ≥90%, FAIL if <90%
Sample Output
==========================================
RELIABILITY TEST SUMMARY
==========================================
Test: tests/base/default-jan-assistant.txt
Phase: DEVELOPMENT
Completed runs: 5/5
Successful runs: 4
Failed runs: 1
Success rate: 80.0%
Total duration: 125.3 seconds
Average duration per run: 25.1 seconds
Overall result: ✅ PASSED
Development phase requirement: ≥80% success rate
Use Cases
1. New Test Development
# Test a new test case for basic stability
python main.py \
--enable-reliability-test \
--reliability-phase development \
--reliability-test-path "tests/base/my-new-test.txt"
2. Pre-Production Validation
# Verify test suite is production-ready
python main.py \
--enable-reliability-test \
--reliability-phase deployment \
--tests-dir "tests/base"
3. Flaky Test Investigation
# Run a potentially flaky test multiple times
python main.py \
--enable-reliability-test \
--reliability-runs 25 \
--reliability-test-path "tests/base/flaky-test.txt"
4. CI/CD Integration
# Automated reliability testing in CI/CD
ENABLE_RELIABILITY_TEST=true \
RELIABILITY_PHASE=deployment \
python main.py --max-turns 40
Best Practices
1. Start with Development Phase
- Begin with 5 runs to catch obvious issues
- Use during active development
- Quick feedback on test stability
2. Use Deployment Phase for Production
- Run 20 times before production deployment
- Ensures high reliability standards
- Catches intermittent failures
3. Custom Runs for Specific Needs
- Use custom run counts for special testing scenarios
- Investigate flaky tests with higher run counts
- Balance between thoroughness and execution time
4. Monitor Execution Time
- Reliability testing takes longer than single runs
- Plan accordingly for CI/CD pipelines
- Consider parallel execution for multiple test files
Troubleshooting
Common Issues
1. Test File Not Found
# Ensure test path is correct
python main.py \
--enable-reliability-test \
--reliability-test-path "tests/base/existing-test.txt"
2. Low Success Rate
- Check test environment stability
- Verify test dependencies
- Review test logic for race conditions
3. Long Execution Time
- Reduce max turns if appropriate
- Use development phase for quick feedback
- Consider running fewer test files
Debug Mode
# Enable debug logging
export LOG_LEVEL=DEBUG
export PYTHONPATH=.
# Run with verbose output
python main.py --enable-reliability-test --reliability-phase development
Integration with Existing Workflows
Migration Testing
# Run reliability tests on migration test cases
python main.py \
--enable-reliability-test \
--reliability-phase deployment \
--tests-dir "tests/migration"
Base Testing
# Run reliability tests on base test cases
python main.py \
--enable-reliability-test \
--reliability-phase development \
--tests-dir "tests/base"
Custom Test Directories
# Run reliability tests on custom test directory
python main.py \
--enable-reliability-test \
--reliability-phase deployment \
--tests-dir "my_custom_tests"
Performance Considerations
Execution Time
- Development Phase: ~5x single test execution time
- Deployment Phase: ~20x single test execution time
- Multiple Tests: Multiply by number of test files
Resource Usage
- Screen recordings for each run
- Trajectory data for each run
- ReportPortal uploads (if enabled)
Optimization Tips
- Use development phase for quick feedback
- Run deployment phase during off-peak hours
- Consider parallel execution for multiple test files
- Clean up old recordings and trajectories regularly
Next Steps
- Start Simple: Begin with development phase on single test files
- Scale Up: Move to deployment phase for critical tests
- Automate: Integrate into CI/CD pipelines
- Monitor: Track reliability trends over time
- Improve: Use results to identify and fix flaky tests
For more information, see the main README.md, QUICK_START.md, and explore the test files in the tests/ directory.