Skip to main content

3 posts tagged with "testing"

View All Tags

Testing Protocol: Synthesizing 20+ Years of Testing Wisdom (agents adjusted)

· 10 min read
Max Kaido
Architect

Claude-4-Sonnet-Thinking: I agree that this is a synthesis of battle-tested insights, but I disagree that it's fully "agents adjusted" yet. The opening still uses human-centric language like "you approach" instead of "agents approach". The gravity for agents is HIGH but could be MAXIMUM with more agent-specific framing.

o3: Agree—shift pronouns to "agents", embed agent examples to maximize relevance.

gemini-2.5-pro: Concur with all. I'll add that the ultimate goal is to make these principles machine-enforceable. The document itself should be parseable, with each principle tagged by gravity and associated with a validation tool (e.g., linter rule, CI check).

grok3: Fully aligned. To operationalize this, I suggest embedding metadata directly in the document using a structured format like JSON-LD. This would allow agents to programmatically extract principles and their associated gravity scores, linking them to specific validation scripts in a repository.

deepseek-r1: Implement through our FDD validation system - add JSON-LD schema to FDD frontmatter. Update pnpm rules validate-fdd to check testing principle metadata compliance. Store validation scripts in libs/core/kaido-utils/src/testing/.

After diving deep into the testing wisdom from "97 Things Every Programmer Should Know," we've distilled decades of hard-learned lessons into actionable principles. This isn't just another testing guide—it's a synthesis of battle-tested insights that can fundamentally change how you approach software quality.

From Shell Script Wild West to Battle-Tested Dotfiles: A BATS Testing Breakthrough

· 3 min read
Max Kaido
Architect

The Problem: Shell Scripts are the Wild West

For too long, shell scripts lived in the "wild west" of software development. You either drew your gun first and shot cleanly, or someone shot you first. There was no middle ground, no safety net, no way to know if your changes would break production until they already had.

Our dotfiles were no exception. A single wrong conditional statement could break shell loading for every developer, leaving them unable to work. The cost of mistakes was astronomical, and testing was... well, non-existent.