This document outlines a comprehensive strategy for testing and stress-testing the FerrisScript test harness itself. The goal is to ensure robustness, identify edge cases, and find opportunities for improvement.
Purpose: Validate that the test harness can handle malformed input, edge cases, large-scale workloads, and failure conditions gracefully.
Approach: Multi-layered testing strategy combining unit tests, integration tests, stress tests, and property-based testing.
Test Cases (metadata parser edge cases):
// TEST: missing_category
// No CATEGORY directive - should default to "unit"
// TEST: invalid_category_value
// CATEGORY: not_a_valid_category
// Should error with InvalidCategory
// TEST: duplicate_test_names
// TEST: duplicate_test_names
// Should error with DuplicateTestName
// TEST: expect_error_without_error_expectation
// EXPECT: success
// EXPECT_ERROR: Some error
// Should error - can't have EXPECT_ERROR with EXPECT: success
// TEST: malformed_directive
// BADDIR ECTIVE: value
// Should be ignored or handled gracefully
// TEST: empty_test_name
// TEST:
// Should error with MissingTestName
// TEST: very_long_test_name_that_exceeds_reasonable_length_limits_and_might_cause_buffer_issues_or_display_problems_in_reports
// Should handle gracefully
// TEST: unicode_test_name_🚀_emoji
// DESCRIPTION: Test with emoji and unicode characters 日本語
// Should handle UTF-8 correctly
// TEST: special_chars_in_metadata
// DESCRIPTION: Contains "quotes" and 'apostrophes' and \backslashes\
// ASSERT: Output with "nested quotes"
// Should escape properly
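As a concrete illustration, the duplicate-name case above could be pinned down with a unit test like the following (a sketch; the ParseError type and its DuplicateTestName variant are assumed from the expected errors listed above):

#[test]
fn test_duplicate_test_names_rejected() {
    // Two TEST directives sharing one name should be rejected at parse time.
    let source = "// TEST: duplicate_test_names\n// TEST: duplicate_test_names\n";
    let result = MetadataParser::parse_metadata(source);
    assert!(matches!(result, Err(ParseError::DuplicateTestName(_))));
}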
Implementation: metadata_parser.rs::tests

Test Cases: scale and robustness — very large assertion counts and mixed line endings, exercised below.

Implementation:
#[test]
fn test_parse_metadata_with_many_assertions() {
    let source = generate_test_with_n_assertions(1000);
    let result = MetadataParser::parse_metadata(&source);
    assert!(result.is_ok());
    assert_eq!(result.unwrap()[0].assertions.len(), 1000);
}

#[test]
fn test_parse_metadata_mixed_line_endings() {
    let source = "// TEST: mixed\r\n// CATEGORY: unit\n// EXPECT: success\r\n";
    let result = MetadataParser::parse_metadata(&source);
    assert!(result.is_ok());
}
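The generate_test_with_n_assertions helper is not part of the harness source; a minimal sketch using the directive syntax from this document:

fn generate_test_with_n_assertions(n: usize) -> String {
    let mut source = String::from(
        "// TEST: many_assertions\n// CATEGORY: unit\n// EXPECT: success\n",
    );
    for i in 0..n {
        // One ASSERT directive per expected output line.
        source.push_str(&format!("// ASSERT: assertion {}\n", i));
    }
    source
}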
Test Cases (output parser robustness):
// Empty output
stdout: ""
stderr: ""
// Only whitespace
stdout: " \n\n \t \n"
stderr: ""
// Truncated output (interrupted execution)
stdout: "=== Test Started ===\nâś“ Step 1\nâś“ Step "
stderr: ""
// Binary/garbage data
stdout: "\x00\x01\xFF\xFE"
stderr: "Binary data"
// Extremely long lines (10K+ characters)
stdout: "A very long line..." + ("." * 10000)
// Missing markers entirely
stdout: "Test ran but forgot to print markers"
// Markers in unexpected format
stdout: "PASS test_name" (instead of âś“)
stdout: "[FAIL] test_name" (instead of âś—)
// Interleaved stdout/stderr
stdout: "Line 1\n"
stderr: "Error 1\n"
stdout: "Line 2\n"
stderr: "Error 2\n"
// UTF-8 encoding issues
stdout: "Invalid UTF-8: \xC3\x28"
// ANSI color codes in output
stdout: "\x1b[32mâś“\x1b[0m Test passed"
// Output exceeds buffer limits (1MB+)
stdout: "Logging spam..." * 100000
Implementation: output_parser.rs::tests

Test Cases: assertion-matching ambiguity when one expected string is a substring of another, exercised below.

Implementation:
#[test]
fn test_assertion_substring_ambiguity() {
    let parser = OutputParser::new();
    let assertions = vec![
        Assertion { kind: AssertionKind::Required, expected: "Error".into() },
        Assertion { kind: AssertionKind::Required, expected: "ErrorCode: 404".into() },
    ];
    let output = "ErrorCode: 404 occurred";
    let results = parser.validate_assertions(&assertions, output);
    assert!(results[0].found); // "Error" found (substring of "ErrorCode")
    assert!(results[1].found); // "ErrorCode: 404" found
}
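Both assertions pass because validation is substring-based; the core check presumably reduces to something like:

fn assertion_found(expected: &str, output: &str) -> bool {
    // Naive substring match: "Error" also matches inside "ErrorCode: 404".
    output.contains(expected)
}

If exact matching is ever needed, an assertion kind distinguishing "contains" from "equals" would resolve the ambiguity.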
Test Cases:
Implementation:
Test Cases:
Test Cases:
Implementation:
#[test]
fn test_report_with_no_tests() {
    let suite = TestSuiteResult::new("empty.ferris".to_string());
    let generator = ReportGenerator::new();
    let report = generator.generate_report(&suite);
    assert!(report.contains("Total: 0 tests"));
}

#[test]
fn test_report_with_very_long_test_names() {
    let long_name = "a".repeat(500);
    let result = create_test_result(&long_name, true);
    let mut suite = TestSuiteResult::new("long_names.ferris".to_string());
    suite.results.push(result); // assumes a public results field or equivalent add helper
    // Formatting should handle the 500-char name without panicking or corrupting layout.
    let report = ReportGenerator::new().generate_report(&suite);
    assert!(!report.is_empty());
}
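One way to keep very long names from wrecking report layout is display-time truncation (a hypothetical helper, not part of the harness):

fn format_test_name(name: &str, max_chars: usize) -> String {
    if name.chars().count() <= max_chars {
        name.to_string()
    } else {
        // Keep a prefix and mark the cut with an ellipsis.
        let prefix: String = name.chars().take(max_chars.saturating_sub(3)).collect();
        format!("{}...", prefix)
    }
}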
Test Cases:
Test Cases (integration scenarios):
Scenario 1: Happy Path
// TEST: integration_happy_path
// CATEGORY: integration
// DESCRIPTION: Complete end-to-end test
// EXPECT: success
// ASSERT: Step 1 complete
// ASSERT: Step 2 complete
// ASSERT: Step 3 complete
fn _ready() {
    print("Step 1 complete");
    print("Step 2 complete");
    print("Step 3 complete");
}
Expected: Parse metadata → Build scene → Run test → Validate → Generate report → All pass
Scenario 2: Partial Failure
// TEST: integration_partial_fail
// CATEGORY: unit
// EXPECT: success
// ASSERT: Found node A
// ASSERT: Found node B
// ASSERT: Found node C
fn _ready() {
    print("Found node A");
    print("Found node B");
    // Missing "Found node C"
}
Expected: Report shows 2/3 assertions passed, test fails
Scenario 3: Error Demo Success
// TEST: integration_error_demo
// CATEGORY: error_demo
// EXPECT: error
// EXPECT_ERROR: Node not found
fn _ready() {
    let node = get_node("MissingNode"); // Intentional error
}
Expected: Parse metadata → Run test → Extract error → Match expected → Pass
Scenario 4: Multiple Tests Per File
// TEST: test_a
// CATEGORY: unit
// EXPECT: success
// ASSERT: Test A ran
// TEST: test_b
// CATEGORY: unit
// EXPECT: success
// ASSERT: Test B ran
fn _ready() {
    print("Test A ran");
    print("Test B ran");
}
Expected: Both tests tracked separately, both pass
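Because both tests in Scenario 4 share a single _ready(), the harness presumably validates each test's assertions against the same combined output while tracking results separately. A sketch of that loop, reusing the types assumed in earlier examples:

fn validate_suite(tests: &[TestMetadata], output: &str, parser: &OutputParser) -> Vec<bool> {
    tests
        .iter()
        .map(|test| {
            // A test passes when every one of its assertions appears in the shared output.
            parser
                .validate_assertions(&test.assertions, output)
                .iter()
                .all(|result| result.found)
        })
        .collect()
}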
Test Cases:
Performance Targets: a generated suite of 100 tests should complete in under 10 seconds (see the stress script below and the CI stress step).
Test Cases:
# Generate test suite
for i in {1..100}; do
  echo "// TEST: stress_test_$i" > "test_$i.ferris"
  echo "// CATEGORY: unit" >> "test_$i.ferris"
  echo "// EXPECT: success" >> "test_$i.ferris"
  echo "// ASSERT: Test $i passed" >> "test_$i.ferris"
  echo "fn _ready() { print(\"Test $i passed\"); }" >> "test_$i.ferris"
done

# Run stress test
time ferris-test --all
Metrics to Track:
Test Cases:
Implementation:
#[test]
fn test_memory_with_large_assertions() {
    let large_assertion_list = (0..10_000)
        .map(|i| Assertion {
            kind: AssertionKind::Required,
            expected: format!("Assertion {}", i),
        })
        .collect::<Vec<_>>();
    // Build an output that satisfies every assertion, then verify that
    // parsing and validation complete without ballooning memory.
    let output = (0..10_000)
        .map(|i| format!("Assertion {}", i))
        .collect::<Vec<_>>()
        .join("\n");
    let results = OutputParser::new().validate_assertions(&large_assertion_list, &output);
    assert_eq!(results.len(), 10_000);
    // Memory release is best confirmed under a profiler (e.g. heaptrack or valgrind).
}
Platforms to Test: Windows, Linux, macOS (matching the CI pipeline below).
Platform-Specific Issues: CRLF vs. LF line endings, path separators, and terminal ANSI color support.
Test Cases:
Test Cases:
Godot Not Found:
Invalid Configuration:
Scene Generation Fails:
Timeout Handling (see the sketch below):
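Timeout handling is the easiest of these to get wrong. A sketch using std::process and try_wait() polling, assuming the harness launches Godot as a child process:

use std::process::{Child, Command, ExitStatus};
use std::time::{Duration, Instant};

/// Runs `cmd`, killing the child if it exceeds `timeout`; returns None on timeout.
fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> std::io::Result<Option<ExitStatus>> {
    let mut child: Child = cmd.spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait() polls without blocking, unlike wait().
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?; // best-effort termination
            child.wait()?; // reap to avoid leaving a zombie
            return Ok(None);
        }
        std::thread::sleep(Duration::from_millis(50));
    }
}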
Quality Criteria:
Test Cases:
#[test]
fn test_error_message_quality() {
    let result = MetadataParser::parse_metadata("// TEST:");
    assert!(result.is_err());
    let err_msg = result.unwrap_err().to_string();
    // The error should name the directive and explain what is missing.
    assert!(err_msg.contains("TEST"));
    assert!(err_msg.contains("name"));
}
Strategy: Maintain a comprehensive test suite that runs on every commit.
Components:
tests/snapshots/

CI Pipeline:
test_harness_tests:
  - cargo test -p ferrisscript_test_harness
  - cargo clippy -p ferrisscript_test_harness
  - cargo bench -p ferrisscript_test_harness (baseline)
integration_tests:
  - ./run-tests.ps1 --all (Windows)
  - ./run-tests.sh --all (Linux/macOS)
stress_tests:
  - ./generate_stress_suite.sh 100
  - time ./run-tests.sh --all
  - assert time < 10s
Properties to Test:
Property 1: Parsing Idempotency
#[test]
fn prop_metadata_parse_serialize_roundtrip() {
    // Given any valid TestMetadata,
    // when it is serialized to a string and parsed back,
    // the result should equal the original.
}

Property 2: Output Validation Consistency

#[test]
fn prop_assertion_validation_is_consistent() {
    // Given any assertion and output string,
    // if assertion.expected is a substring of the output,
    // then validation should find it.
}

Property 3: Report Generation Determinism

#[test]
fn prop_report_generation_is_deterministic() {
    // Given the same TestSuiteResult,
    // when generate_report() is called multiple times,
    // the output should be identical.
}
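Property 2 could be realized with the proptest crate (a sketch; it reuses the Assertion and OutputParser API assumed in earlier examples):

use proptest::prelude::*;

proptest! {
    #[test]
    fn prop_assertion_validation_is_consistent(
        expected in "[a-zA-Z0-9 ]{1,20}",
        pad in "[a-zA-Z0-9 ]{0,40}",
    ) {
        // Embed the expected text in an output string, then confirm validation finds it.
        let output = format!("{}{}{}", pad, expected, pad);
        let assertions = vec![Assertion {
            kind: AssertionKind::Required,
            expected: expected.clone(),
        }];
        let results = OutputParser::new().validate_assertions(&assertions, &output);
        prop_assert!(results[0].found);
    }
}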
Metadata Parser Edge Cases (metadata_parser.rs)
test_parse_metadata_with_1000_assertions()
test_parse_metadata_mixed_line_endings()
test_parse_unicode_in_metadata()
test_parse_empty_metadata_block()
test_parse_malformed_directives()
test_parse_duplicate_directives()
test_parse_very_long_test_names()
test_parse_special_chars_in_assertions()
Output Parser Robustness (output_parser.rs)
test_extract_error_from_empty_output()
test_extract_error_from_truncated_output()
test_extract_markers_with_ansi_codes()
test_validate_assertions_with_partial_matches()
test_validate_with_very_large_output()
test_extract_error_with_multiple_errors()
Report Generator Formatting (report_generator.rs)
test_report_with_1000_tests()
test_report_with_very_long_names()
test_report_with_unicode_everywhere()
test_report_colorization_edge_cases()
test_summary_statistics_accuracy()
Integration Tests (new file: tests/integration_tests.rs)
test_end_to_end_happy_path()
test_end_to_end_partial_failure()
test_end_to_end_error_demo()
test_end_to_end_multiple_tests_per_file()
test_end_to_end_with_real_godot()
Targets for Fuzzing:
Implementation:
# cargo-fuzz is a cargo subcommand, installed once per machine:
#   cargo install cargo-fuzz
# The generated fuzz crate then depends on libfuzzer-sys:
[dependencies]
libfuzzer-sys = "0.4"
// fuzz/fuzz_targets/metadata_parser.rs
#![no_main]
use libfuzzer_sys::fuzz_target;
use ferrisscript_test_harness::MetadataParser;

fuzz_target!(|data: &[u8]| {
    // Only feed the parser valid UTF-8; invalid bytes are rejected up front.
    if let Ok(s) = std::str::from_utf8(data) {
        let _ = MetadataParser::parse_metadata(s);
    }
});
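Once the target is registered, it runs with cargo fuzz run metadata_parser (cargo-fuzz currently requires a nightly toolchain).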
Strategy: All code examples in documentation should compile, run, and stay in sync with the API, so they cannot silently rot.
Implementation: Rust doctests — example blocks in /// comments are compiled and run by cargo test.
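A doctest on the parser's entry point might look like this (a sketch; the exact signature is inferred from the unit tests above):

impl MetadataParser {
    /// Parses TEST/CATEGORY/EXPECT/ASSERT directives from comment lines.
    ///
    /// ```
    /// use ferrisscript_test_harness::MetadataParser;
    ///
    /// let source = "// TEST: example\n// CATEGORY: unit\n// EXPECT: success\n";
    /// let tests = MetadataParser::parse_metadata(source).unwrap();
    /// assert_eq!(tests.len(), 1);
    /// ```
    pub fn parse_metadata(source: &str) -> Result<Vec<TestMetadata>, ParseError> {
        // ... parsing elided ...
        todo!()
    }
}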
Metrics to Track:
This comprehensive testing strategy provides a roadmap for ensuring the test harness is robust, reliable, and performant. Implementation should be prioritized based on risk and impact, starting with edge cases in core components (metadata parser, output parser) and expanding to integration and stress testing.
Next Steps: implement the metadata and output parser edge-case tests first, then the integration scenarios, then stress, memory, and fuzz testing.