Boost Python-TypeScript JSON: Test Coverage Guide

by ADMIN

Hey guys! Let's dive into something super important for keeping our Python-TypeScript integration smooth: JSON serialization testing. This is a crucial area where data gets passed between Python and TypeScript, and it's also a place where errors can easily pop up. In this article, we'll talk about why testing this boundary is so vital, what kind of tests we need, and how to make sure everything works perfectly. Ready to level up your testing game?

The Problem: Why Test the Python-TypeScript JSON Boundary?

The Python-TypeScript JSON boundary is a critical, error-prone area in our project. Think of it as the bridge where Python and TypeScript talk to each other. Python spits out JSON, and TypeScript gobbles it up. If this bridge is shaky, we're in trouble. Without proper testing, we risk bugs in data type conversion, character encoding, and edge-case handling.

Where the Problems Lurk

Specifically, the main areas we need to focus on are:

  • analyzer/src/main.py: This is where Python generates JSON. Making sure it's accurate is critical.
  • cli/src/python-bridge/PythonBridge.ts: This is where TypeScript parses the JSON. If it can't understand what Python is saying, things fall apart.

So, why all the fuss? Well, without testing, we could run into: Unicode encoding errors, JavaScript number limits (like Number.MAX_SAFE_INTEGER), funky float values like NaN and Infinity, issues with null versus None, and problems with missing or empty data. That's a lot of potential headaches!

Risk Areas: What Could Go Wrong?

Let's get specific about the risks involved. Understanding these will help us create better tests.

  1. Unicode Handling: Non-ASCII characters in code and documentation are a common source of problems. We need to ensure that Unicode characters serialize and deserialize correctly without causing errors.
  2. Large Numbers: JavaScript can only represent integers exactly up to Number.MAX_SAFE_INTEGER (2^53 - 1). If Python sends a bigger number, the TypeScript side silently loses precision. The tests should confirm that such values are either capped or serialized as strings.
  3. Special Float Values: NaN, Infinity, and -Infinity are not valid JSON. Python's json module writes them out as bare tokens by default, and JavaScript's JSON.parse rejects those tokens. We need to make sure these values are converted or rejected before they reach TypeScript.
  4. Null vs. None: In Python, None represents the absence of a value, while JSON uses null. We need to ensure that None values in Python correctly map to null in JSON.
  5. Empty Arrays vs. null: An optional field with no data might show up as an empty array, as null, or not at all. We need to confirm that each of these cases is handled correctly and consistently.
  6. Deeply Nested Objects: When dealing with by-language metrics, the objects can be deeply nested. Testing should handle these scenarios properly.
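
To make the first four risks concrete, here's a small sketch using only Python's standard json module. It isn't project code, and the sample field names (name, impact_score, docstring) are just borrowed from the tests in this article. It shows what Python emits by default for each case.

```python
import json

# 1. Unicode: with the default ensure_ascii=True the output is pure ASCII
#    (\uXXXX escapes), so it survives any stdout encoding.
escaped = json.dumps({"name": "函数"})
assert escaped.isascii()
assert json.loads(escaped)["name"] == "函数"

# With ensure_ascii=False the raw characters are emitted, and pushing them
# through a narrow encoding is where UnicodeEncodeError can appear.
raw = json.dumps({"name": "函数"}, ensure_ascii=False)
try:
    raw.encode("ascii")
    narrow_encoding_ok = True
except UnicodeEncodeError:
    narrow_encoding_ok = False
assert narrow_encoding_ok is False

# 2. Large integers: Python ints are arbitrary precision, but JavaScript
#    numbers are IEEE-754 doubles, the same format as a Python float.
MAX_SAFE_INTEGER = 2**53 - 1     # value of JavaScript's Number.MAX_SAFE_INTEGER
big = MAX_SAFE_INTEGER + 2       # 2**53 + 1
big_as_double = int(float(big))  # what a double-based parser ends up holding
assert big_as_double != big      # precision was lost without any error

# 3. Special floats: json.dumps emits bare Infinity / NaN tokens by default,
#    which are not valid JSON. allow_nan=False turns that into an error.
non_standard = json.dumps({"impact_score": float("inf")})
assert non_standard == '{"impact_score": Infinity}'
try:
    json.dumps({"impact_score": float("nan")}, allow_nan=False)
    strict_rejects_nan = False
except ValueError:
    strict_rejects_nan = True
assert strict_rejects_nan

# 4. None is written as null.
none_as_json = json.dumps({"docstring": None})
assert none_as_json == '{"docstring": null}'
```

The takeaway: Unicode and None are safe with the defaults, while large integers and non-finite floats pass through Python without complaint and only cause trouble on the TypeScript side.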

Test Cases: The Recipe for Success

So, what do we need to test? We need tests on both the Python and TypeScript sides, plus an integration test to make sure they play nicely together. Let's look at the specifics.

Python Side Tests

These tests will ensure that Python's JSON output is correct. We'll put these in analyzer/tests/test_json_serialization.py. Here's a breakdown of what the tests should cover:

  • Unicode Characters: We'll create a CodeItem with Unicode characters in its name, filepath, and docstring to ensure no UnicodeEncodeError pops up during serialization.
  • Very High Complexity Values: We need a CodeItem with large complexity and impact score values to ensure they serialize properly.
  • Null vs None Fields: We'll make a CodeItem with docstring, return type, and audit rating set to None. The test checks that these fields are present in the output as JSON null rather than being dropped.
  • Empty by_language Dict: We create an AnalysisResult with an empty by_language dictionary to test how it is serialized.

Below is an example of the Python test:

# File: analyzer/tests/test_json_serialization.py
import json
from your_module import CodeItem, AnalysisResult, format_json  # Replace your_module

def test_unicode_in_item_names():
    """Test that Unicode characters serialize correctly."""
    item = CodeItem(
        name='函数',  # Chinese characters
        type='function',
        filepath='/test/测试.py',
        docstring='Documentation with 日本語',
    )
    result = AnalysisResult(items=[item])
    json_output = format_json(result)
    # Should not raise UnicodeEncodeError
    parsed = json.loads(json_output)
    assert parsed['items'][0]['name'] == '函数'
    assert parsed['items'][0]['filepath'] == '/test/测试.py'

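def test_very_high_complexity_values():
    """Test that high complexity values serialize safely."""
    item = CodeItem(
        complexity=2**53 + 1,  # Above JS Number.MAX_SAFE_INTEGER (2**53 - 1)
        impact_score=float('inf'),  # Infinity
    )
    # Should either: 1) Cap values, or 2) Serialize as string
    json_output = format_json(AnalysisResult(items=[item]))

    def reject_constant(token):
        # Python's json.loads accepts NaN/Infinity tokens; JavaScript's JSON.parse does not
        raise ValueError(f'non-standard JSON token: {token}')

    # Verify JavaScript can parse it
    parsed = json.loads(json_output, parse_constant=reject_constant)
    complexity = parsed['items'][0]['complexity']
    assert isinstance(complexity, (int, str))
    if isinstance(complexity, int):
        assert complexity <= 2**53 - 1  # Capped into the safe range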

def test_null_vs_none_fields():
    """Test that None/null fields are handled consistently."""
    item = CodeItem(
        docstring=None,  # Should become JSON null
        return_type=None,
        audit_rating=None,
    )
    json_output = format_json(AnalysisResult(items=[item]))
    parsed = json.loads(json_output)
    # All should be present as null (not missing)
    assert 'docstring' in parsed['items'][0]
    assert parsed['items'][0]['docstring'] is None

def test_empty_by_language_dict():
    """Test that empty by_language dict serializes correctly."""
    result = AnalysisResult(
        items=[],
        by_language={},  # Empty dict
    )
    json_output = format_json(result)
    parsed = json.loads(json_output)
    assert isinstance(parsed['by_language'], dict)
    assert len(parsed['by_language']) == 0
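
For the high-complexity test to pass, format_json needs some policy for values JavaScript can't hold. Here's a minimal sketch of one option, assuming the "serialize as string" choice for big integers and null for non-finite floats. The helpers to_js_safe and dumps_js_safe are hypothetical names, not part of the project, and the real format_json may do something different.

```python
import json
import math

MAX_SAFE_INTEGER = 2**53 - 1  # mirrors JavaScript's Number.MAX_SAFE_INTEGER


def to_js_safe(value):
    """Recursively replace values that JavaScript cannot parse or hold exactly.

    Hypothetical helper: the policy (string for big ints, None for
    non-finite floats) is an assumption, not a project decision.
    """
    if isinstance(value, bool):  # bool is a subclass of int, so check it first
        return value
    if isinstance(value, int):
        return value if abs(value) <= MAX_SAFE_INTEGER else str(value)
    if isinstance(value, float):
        return value if math.isfinite(value) else None
    if isinstance(value, dict):
        return {key: to_js_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_js_safe(item) for item in value]
    return value


def dumps_js_safe(payload):
    """Serialize to strict JSON; allow_nan=False fails loudly on anything missed."""
    return json.dumps(to_js_safe(payload), allow_nan=False)


payload = {"items": [{"complexity": 2**53 + 1, "impact_score": float("inf")}]}
text = dumps_js_safe(payload)
# text is now strict JSON: the big integer is a string and Infinity is null
```

Turning Infinity into null loses information, so a capped value or a sentinel string might suit the project better. Whichever policy is chosen, the TypeScript types need to agree with it.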

TypeScript Side Tests

These tests, in cli/src/__tests__/python-bridge/json-parsing.test.ts, will confirm that TypeScript can correctly parse the JSON from Python. Here's what we need to test:

  • Parsing Unicode Characters: Test if the TypeScript code correctly parses Unicode characters that come from Python.
  • Handling Missing Optional Fields: Ensure that TypeScript can handle JSON objects where some optional fields are absent (read back as undefined) or explicitly null.
  • Malformed JSON Handling: Test if the TypeScript code throws an error if it receives malformed JSON from Python, with helpful error messages.
  • Large Result Sets Handling: Create a mock result with thousands of items to ensure that the code handles large responses without crashing or timing out.

Here is an example of a TypeScript test:

// File: cli/src/__tests__/python-bridge/json-parsing.test.ts
// Path is relative to cli/src/__tests__/python-bridge/
import { PythonBridge } from '../../python-bridge/PythonBridge';

test('parses Unicode characters correctly', async () => {
  const mockStdout = JSON.stringify({
    items: [{
      name: '函数',
      filepath: '/test/测试.py',
      docstring: 'Documentation with 日本語',
    }],
  });
  const bridge = new PythonBridge();
  // Mock subprocess to return mockStdout
  const result = await bridge.analyze({ path: '/test' });
  expect(result.items[0].name).toBe('函数');
  expect(result.items[0].filepath).toBe('/test/测试.py');
});

test('handles missing optional fields', async () => {
  const mockStdout = JSON.stringify({
    items: [{
      name: 'test',
      type: 'function',
    }],
  });
  const bridge = new PythonBridge();
  // Mock subprocess to return mockStdout
  const result = await bridge.analyze({ path: '/test' });
  // Should not crash
  expect(result.items[0].name).toBe('test');
  expect(result.items[0].docstring).toBeUndefined();
});

test('rejects malformed JSON with helpful error', async () => {
  const malformedJson = '{"items": [invalid}';
  // Mock Python returning malformed JSON
  const bridge = new PythonBridge();
  await expect(bridge.analyze({ path: '/test' }))
    .rejects
    .toThrow(/Failed to parse Python output as JSON/);
});

test('handles very large result sets', async () => {
  // Mock response with 10,000 items
  const largeResult = {
    items: Array(10000).fill(null).map((_, i) => ({
      name: `item${i}`,
    })),
  };
  const mockStdout = JSON.stringify(largeResult);
  const bridge = new PythonBridge();
  // Should not crash or timeout
  const result = await bridge.analyze({ path: '/test' });
  expect(result.items.length).toBe(10000);
});

Cross-Language Integration Test

This test, located in cli/src/__tests__/integration/python-boundary.test.ts, makes sure the whole process works together. It will spawn a real Python subprocess and analyze real example data. This test will verify that the JSON output from Python is valid and matches the expected schema.

Here is an example of the integration test:

// File: cli/src/__tests__/integration/python-boundary.test.ts
// Path is relative to cli/src/__tests__/integration/
import { PythonBridge } from '../../python-bridge/PythonBridge';

test('real Python subprocess returns valid JSON', async () => {
  // This test actually spawns Python (not mocked)
  const bridge = new PythonBridge();
  // Analyze real examples directory
  const result = await bridge.analyze({
    path: '../examples',
    verbose: false,
  });
  // Verify structure matches expected schema
  expect(result).toHaveProperty('items');
  expect(result).toHaveProperty('coverage_percent');
  expect(result).toHaveProperty('by_language');
  // All items should have required fields
  for (const item of result.items) {
    expect(item).toHaveProperty('name');
    expect(item).toHaveProperty('type');
    expect(item).toHaveProperty('filepath');
    expect(item).toHaveProperty('language');
  }
});

Testing Requirements: What You Need

To make these tests effective, here's what you need:

  • Test Cases: At least 10 test cases, each covering different scenarios like Unicode handling, null handling, and various edge cases.
  • Language Coverage: Tests in both Python and TypeScript.
  • Integration Test: Include at least one integration test that runs a real Python subprocess.
  • Real Data: Test with real-world data, such as the project's examples directory, to make the tests as realistic as possible.
  • JSON Schema: Document the JSON schema explicitly to ensure consistency and facilitate understanding.
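
One lightweight way to keep a documented schema honest is to check it in code. The sketch below is a stdlib-only validator built from the field names the integration test asserts on (items, coverage_percent, by_language, plus name, type, filepath, and language per item). The expected types and the function name schema_errors are assumptions for illustration.

```python
from typing import Any, List

# Field names come from the integration test; the types are assumptions.
REQUIRED_TOP_LEVEL = {
    "items": list,
    "coverage_percent": (int, float),
    "by_language": dict,
}
REQUIRED_ITEM_FIELDS = ("name", "type", "filepath", "language")


def schema_errors(result: Any) -> List[str]:
    """Return human-readable problems; an empty list means the shape is OK."""
    if not isinstance(result, dict):
        return ["top level must be an object"]
    errors = []
    for field, expected in REQUIRED_TOP_LEVEL.items():
        if field not in result:
            errors.append(f"missing top-level field: {field}")
        elif not isinstance(result[field], expected):
            errors.append(f"wrong type for top-level field: {field}")
    items = result.get("items")
    if isinstance(items, list):
        for index, item in enumerate(items):
            if not isinstance(item, dict):
                errors.append(f"items[{index}] must be an object")
                continue
            for field in REQUIRED_ITEM_FIELDS:
                if field not in item:
                    errors.append(f"items[{index}] missing field: {field}")
    return errors


good = {
    "items": [{"name": "f", "type": "function", "filepath": "/a.py", "language": "python"}],
    "coverage_percent": 87.5,
    "by_language": {},
}
bad = {"items": [{"name": "f"}], "by_language": None}
assert schema_errors(good) == []
problems = schema_errors(bad)  # lists every missing or mistyped field
```

A check like this could run in the Python tests against format_json output, which gives the written schema an executable counterpart.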

Success Criteria: What Does Done Look Like?

How do we know we've succeeded?

  • No Errors in CI: No JSON serialization or parsing errors in our continuous integration (CI) pipeline.
  • Unicode Support: Successfully verified Unicode support.
  • Large Data Handling: Confirmed that large result sets are handled without issues.
  • Clear Error Messages: Helpful and informative error messages for malformed JSON, making debugging easier.

By following these guidelines and creating these tests, we'll ensure that our Python-TypeScript JSON boundary is robust, reliable, and able to handle anything we throw at it. This will greatly enhance the stability and maintainability of our project. Happy testing, everyone!