Mercury Redesign: Clean Architecture Implementation

June 19, 2025 · 6 min read

Architect

After extensive discussion and implementation, we've settled on a minimal architecture that separates experiment tracking (variants) from deployment (instances), keeps infrastructure concerns out of strategy configuration, and avoids over-engineering.

Core Architecture Principles

Clear Separation: Variants (experiment tracking) vs Instances (deployment)
Single Responsibility: Config = strategy, Instance = deployment, Scheduler = timing
Type Safety: Strong enums prevent runtime errors
No Database Pollution: Infrastructure concerns stay out of business logic
Config-Over-Env: ABH variants determined by time + TypeScript config, NEVER environment variables

Final Architecture

Variants (Experiment Tracking)

enum MercuryVariant {
  MAIN = 'MAIN', // Production-ready strategy
  A = 'A', // Experimental variant A
  B = 'B', // Experimental variant B
  H = 'H', // Experimental variant H (heuristic)
}

Instances (Hardware/Deployment)

enum MercuryInstance {
  R = 'R', // Readonly shadow trading
  W = 'W', // Live trading
  ABH = 'ABH', // Experimental (variants determined by time + config)
}

Tournament Configuration (Pure Strategy)

interface TournamentConfig {
  // ... existing strategy fields ...
  topK: number; // Tournament parameter: how many positions to find

  experimental?: {
    experimentId: string; // Experiment identifier
    variant: MercuryVariant; // For metrics/analysis
    topL?: number; // Strategy recommendation (≤ topK)
  };
}
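The topL ≤ topK constraint can be checked when a config is loaded. The following is a minimal sketch, not the project's actual code: `resolveTopL` is a hypothetical helper, the types are trimmed to the fields shown above, and the fallback to topK follows the conservative default (topL=topK) mentioned under Safety by Design.

```typescript
// Illustrative sketch: resolveTopL and these trimmed-down types are assumptions.
enum MercuryVariant {
  MAIN = 'MAIN',
  A = 'A',
  B = 'B',
  H = 'H',
}

interface ExperimentalConfig {
  experimentId: string;
  variant: MercuryVariant;
  topL?: number;
}

interface TournamentConfig {
  topK: number;
  experimental?: ExperimentalConfig;
}

// Returns the effective topL, falling back to topK when topL is not set,
// and throws when the configured value breaks the topL <= topK rule.
function resolveTopL(config: TournamentConfig): number {
  const topL = config.experimental?.topL ?? config.topK;
  if (!isFinite(topL) || Math.floor(topL) !== topL || topL < 0) {
    throw new Error(`topL must be a non-negative integer, got ${topL}`);
  }
  if (topL > config.topK) {
    throw new Error(`topL (${topL}) must not exceed topK (${config.topK})`);
  }
  return topL;
}
```

Failing fast at load time keeps an invalid recommendation from reaching the live-count calculation.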

Instance Service (Risk Management)

class InstanceService {
  getLiveRatio(): number; // From LIVE_RATIO env var (0.0-1.0)
  calculateLiveCount(topL: number): number; // Math.floor(topL × liveRatio)
}

Key Concepts Explained

topK vs topL vs liveRatio

topK: Tournament finds top K positions from all candidates
topL: Strategy recommends L positions as optimal (≤ topK)
liveRatio: Risk control - what percentage of topL to trade live
liveCount: Final result - Math.floor(topL × liveRatio)
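To make the arithmetic concrete, here is a small sketch of the calculation. The function names are illustrative and the raw `LIVE_RATIO` string is passed in as an argument rather than read from the environment. Treating a missing value as 0 is an assumption based on the conservative default (liveRatio=0).

```typescript
// Illustrative sketch: parseLiveRatio is a hypothetical helper, not the real InstanceService.
function parseLiveRatio(raw: string | undefined): number {
  // Assumption: a missing or empty LIVE_RATIO means "trade nothing live".
  if (raw === undefined || raw.trim() === '') {
    return 0;
  }
  const ratio = Number(raw);
  if (!isFinite(ratio) || ratio < 0 || ratio > 1) {
    throw new Error(`LIVE_RATIO must be between 0.0 and 1.0, got "${raw}"`);
  }
  return ratio;
}

// liveCount = Math.floor(topL × liveRatio), as defined above.
function calculateLiveCount(topL: number, liveRatio: number): number {
  return Math.floor(topL * liveRatio);
}

// calculateLiveCount(7, parseLiveRatio('0.3')) → 2  (floor of 2.1)
// calculateLiveCount(3, parseLiveRatio('0.5')) → 1  (floor of 1.5)
```

Because the result is floored, a small ratio combined with a small topL yields zero live positions, which is the intended conservative behavior.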

Instance Behavior

# R Instance (Shadow Production)
MERCURY_INSTANCE=R    # → Instance: R, Variant: MAIN, liveCount: 0

# W Instance (Live Production)
MERCURY_INSTANCE=W    # → Instance: W, Variant: MAIN, liveCount: topL × liveRatio
LIVE_RATIO=0.5        # → 50% of recommended positions go live

# ABH Instance (Experimental)
MERCURY_INSTANCE=ABH  # → Instance: ABH, Variant: determined by time + experiment config, liveCount: 0
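The mapping above can be sketched as two small functions. This is a hedged illustration: `parseInstance` and `liveCountFor` are hypothetical names, and the raw `MERCURY_INSTANCE` value is passed in as a string so the snippet stays self-contained.

```typescript
enum MercuryInstance {
  R = 'R',
  W = 'W',
  ABH = 'ABH',
}

const ALL_INSTANCES: readonly MercuryInstance[] = [
  MercuryInstance.R,
  MercuryInstance.W,
  MercuryInstance.ABH,
];

// Converts the raw MERCURY_INSTANCE string into the enum, rejecting anything else.
function parseInstance(raw: string | undefined): MercuryInstance {
  for (const instance of ALL_INSTANCES) {
    if (instance === raw) {
      return instance;
    }
  }
  throw new Error(`MERCURY_INSTANCE must be R, W or ABH, got "${raw}"`);
}

// Only the W instance trades live; R and ABH always end up with liveCount 0.
function liveCountFor(instance: MercuryInstance, topL: number, liveRatio: number): number {
  return instance === MercuryInstance.W ? Math.floor(topL * liveRatio) : 0;
}
```

Rejecting unknown values at startup means a typo in the environment cannot silently select a different deployment mode.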

Configuration File Structure (Code-Based)

// dike/tournaments/config/main.ts - Production configuration with EXPLICIT schedules
export const MAIN_TOURNAMENTS = [
  {
    type: TournamentType.MOMENTUM_STRENGTH_BUY,
    cronPattern: '0 20 13 * * *', // 13:20 UTC — US ramp start
    configVersion: 'v2',
    description: 'US market opening momentum - strong upward bias',
    config: { /* full tournament config with experimental fields */ }
  },
  // ... 7 more tournaments with exact timing and config versions
];

// config/experiments/ - Experiment lifecycle management
experiments/
├── pending/           # Future experiments (planning phase)
│   └── 1-price-normalization.ts
├── in-progress/       # Currently running experiments
│   └── 0-comparison-methods.ts
└── conducted/         # Completed experiments (historical record)
    └── (empty for now)

Tournament Schedule Principles ⚡

🎯 EXPLICIT OVER IMPLICIT

All tournament schedules visible in config/main.ts
No hidden schedules in other modules
Exact config versions specified (v2, v1)
Human-readable descriptions for each tournament

📸 BASELINE SNAPSHOT REQUIREMENT

Every experiment file contains EXACT copy of main schedule
Preserves historical baseline at experiment start time
Prevents "what was the baseline?" confusion
No lazy imports - explicit duplication required

🚫 NO "LATEST_VERSION" SHORTCUTS

CRITICAL INSIGHT: LATEST_VERSION was hiding experiment chaos
Previous mistake: See improvement → Replace config → Call it "latest"
Proper approach: Experiment with configV1 vs configV2 → Conclude which is better
Result: Clear record of what changed and why

// ❌ BAD: LATEST_VERSION hides experimental changes
configVersion: LATEST_VERSION[TournamentType.MOMENTUM_STRENGTH_BUY], // What version? When changed? Why?

// ✅ GOOD: Explicit version with experimental intent
configVersion: 'v2', // Clear version, experiment will test v2 vs v3

💎 GOLD PRINCIPLE: Config improvements must go through experiment framework, not direct "latest" replacement

Experiment Lifecycle Management

🔄 Experiment States:

PENDING: Experiment planned but not implemented
IN-PROGRESS: Currently running and collecting data
CONDUCTED: Completed with results analyzed

📁 File Movement Process:

# Start new experiment
mv experiments/pending/1-price-normalization.ts experiments/in-progress/

# Complete experiment
mv experiments/in-progress/0-comparison-methods.ts experiments/conducted/

Single-Variable Experiment Principle

Each experiment MUST test only ONE variable:

// ✅ GOOD: Experiment 0 - Only comparison method varies
variants: {
  A: { comparisonMethod: 'two-step', normalizePrice: false },
  B: { comparisonMethod: 'structured', normalizePrice: false },
  H: { comparisonMethod: 'heuristic', normalizePrice: false },
}

// ❌ BAD: Multiple variables (ruins analysis)
variants: {
  A: { comparisonMethod: 'two-step', normalizePrice: false },
  B: { comparisonMethod: 'structured', normalizePrice: true }, // 2 changes!
}
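The single-variable rule lends itself to an automated check. The sketch below is one possible approach, with hypothetical helper names: it lists every parameter whose value differs between variants and rejects the experiment unless exactly one parameter varies.

```typescript
type VariantOverrides = Record<string, string | number | boolean>;

// Returns the names of parameters that are not identical across all variants.
// A parameter missing from one variant counts as different.
function varyingParameters(variants: Record<string, VariantOverrides>): string[] {
  const names = Object.keys(variants);
  const seen: Record<string, true> = {};
  for (const name of names) {
    for (const key of Object.keys(variants[name])) {
      seen[key] = true;
    }
  }
  return Object.keys(seen).filter((key) => {
    const first = variants[names[0]][key];
    return names.some((name) => variants[name][key] !== first);
  });
}

// Throws unless exactly one parameter varies between the variants.
function assertSingleVariable(variants: Record<string, VariantOverrides>): void {
  const varying = varyingParameters(variants);
  if (varying.length !== 1) {
    throw new Error(`expected exactly 1 varying parameter, found: [${varying.join(', ')}]`);
  }
}
```

Run against the two examples above, the first passes (only `comparisonMethod` varies) and the second is rejected because `normalizePrice` changes as well.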

Experiment File Organization Principles

Experiment ID as Filename: 0-comparison-methods.ts where 0-comparison-methods is the experiment ID
A Variant = MAIN Copy: Always include exact copy of MAIN config at experiment time (prevents "what was baseline?" confusion)
B/H Variants = Concise Changes: Only specify differences from A to reduce duplication
Time Shifting: relative to the MAIN schedule, A runs at +60 mins, B at +80 mins (A + 20), H at +100 mins (A + 40) for conflict prevention
Historical Preservation: A variant shows exact baseline used, even if MAIN evolves later
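The "A = MAIN copy, B/H = concise changes" rule can be illustrated with a plain object merge. This is a sketch under assumptions: `StrategyParams` is a stand-in with only the two parameters used in this post, and `applyChanges` is a hypothetical helper.

```typescript
// Stand-in for the strategy fields of a tournament config (assumption).
interface StrategyParams {
  comparisonMethod: string;
  normalizePrice: boolean;
}

// Variant A: a full, explicit copy of the MAIN baseline at experiment start.
const variantA: StrategyParams = {
  comparisonMethod: 'two-step',
  normalizePrice: false,
};

// Builds a full variant config from the baseline plus its listed differences.
// The baseline object itself is never modified.
function applyChanges(
  baseline: StrategyParams,
  changes: Partial<StrategyParams>,
): StrategyParams {
  return { ...baseline, ...changes };
}

// B and H only state what differs from A.
const variantB = applyChanges(variantA, { comparisonMethod: 'structured' });
const variantH = applyChanges(variantA, { comparisonMethod: 'heuristic' });
```

Everything B and H do not mention is inherited from the frozen A snapshot, so later edits to MAIN cannot change what the experiment compared.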

🏆 GOLDEN RULE: Config-First Experimentation

ALL experiment parameters MUST be stored in TournamentConfig.experimental

experimental: {
  experimentId: '0-comparison-methods',
  variant: MercuryVariant.B,
  topL: 5,

  // EXPERIMENT OVERRIDES - Every parameter that varies between variants
  comparisonMethod: 'structured',
  normalizePrice: false, // held constant: experiment 0 varies only comparisonMethod
  // ... any other experiment-specific settings
}

Why This Prevents Experiment Chaos:

✅ Reproducible: All parameters saved in database with results
✅ Traceable: Can see exactly what configuration produced each result
✅ No Memory Loss: Can't forget what parameters were being tested
✅ Clean Comparisons: Variants differ only in explicitly tracked parameters
✅ Historical Analysis: Past experiments remain analyzable even after code changes

Anti-Pattern (Causes Chaos):

# ❌ WRONG: Parameters scattered in environment variables
COMPARISON_METHOD=structured
NORMALIZE_PRICES=true
# Result: Forget what was being tested, can't reproduce results

Implementation Status ✅

Phase 1: Code-Based Configuration System

✅ COMPLETED: Extended TournamentConfig with experimental fields
✅ COMPLETED: Added topK to tournament configuration
✅ COMPLETED: Created tournament config factory with variant handling
✅ COMPLETED: Strong typing with MercuryVariant and MercuryInstance enums
📦 TAGGED: v2025.06.20-mercury-experimental-framework

Phase 2: Simplified Variant Architecture

✅ COMPLETED: Clean separation of variants (MAIN/A/B/H) vs instances (R/W/ABH)
✅ COMPLETED: R & W instances both run MAIN variant with different risk levels
✅ COMPLETED: ABH instance determines variants from time + experiment configuration
✅ COMPLETED: Removed confusing legacy enums and phantom variants

Infrastructure Services

✅ COMPLETED: InstanceService for live trading configuration
✅ COMPLETED: InstanceSchedulerService for time slot management
✅ COMPLETED: Clean metrics with both instance + variant labels
✅ COMPLETED: Proper dependency injection in AtlasModule

Example Scenarios

Conservative Live Trading Rollout

# Start conservative
LIVE_RATIO=0.1  # topL=7 → liveCount=0 (Math.floor(0.7))

# Gradual increase
LIVE_RATIO=0.3  # topL=7 → liveCount=2 (Math.floor(2.1))

# Full confidence
LIVE_RATIO=1.0  # topL=7 → liveCount=7 (Math.floor(7.0))

Experiment Analysis

# After experiment shows only top 3 positions are profitable
topL=3, LIVE_RATIO=0.5  # → liveCount=1 (Math.floor(1.5))

Hardware Isolation

R Instance: Shadow trading, separate database, no live keys
W Instance: Live trading, separate database, has live keys
ABH Instance: Experimental, separate database, no live keys

ABH Variant Scheduling

// ABH determines variant based on time within 2-hour blocks
const blockMinute = (currentHour % 2) * 60 + currentMinute;

let variant: MercuryVariant | undefined; // stays undefined during minutes 0-59 of the block
if (blockMinute >= 60 && blockMinute < 80) {
  variant = MercuryVariant.A; // Minutes 60-79
} else if (blockMinute >= 80 && blockMinute < 100) {
  variant = MercuryVariant.B; // Minutes 80-99
} else if (blockMinute >= 100 && blockMinute < 120) {
  variant = MercuryVariant.H; // Minutes 100-119
}

NO ENVIRONMENT VARIABLES for variant determination - pure time + config approach!
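Wrapped in a function, the same logic becomes easy to unit test. This is a minimal sketch: `variantForUtcTime` is a hypothetical name, and using UTC is an assumption based on the UTC annotations in the main schedule.

```typescript
enum MercuryVariant {
  MAIN = 'MAIN',
  A = 'A',
  B = 'B',
  H = 'H',
}

// Maps a point in time to the ABH variant slot it falls into, or null when
// the time is in the first hour of the 2-hour block (no experimental slot).
// Assumption: hours and minutes are taken in UTC.
function variantForUtcTime(now: Date): MercuryVariant | null {
  const blockMinute = (now.getUTCHours() % 2) * 60 + now.getUTCMinutes();

  if (blockMinute >= 60 && blockMinute < 80) {
    return MercuryVariant.A; // Minutes 60-79
  }
  if (blockMinute >= 80 && blockMinute < 100) {
    return MercuryVariant.B; // Minutes 80-99
  }
  if (blockMinute >= 100 && blockMinute < 120) {
    return MercuryVariant.H; // Minutes 100-119
  }
  return null;
}
```

Taking the time as a parameter instead of reading the clock inside the function keeps the slot boundaries testable with fixed dates.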

ABH Instance Tournament Scheduling 🎯

CRITICAL ARCHITECTURE: ABH instance schedules 24 tournaments total:

8 base tournaments (from main production schedule)
× 3 variants (A, B, H) = 24 scheduled tournaments

// ABH Scheduling Pattern
class TournamentOrchestrator {
  setupABHExperimentalSchedule(experiment: ExperimentConfig) {
    experiment.baseSchedule.forEach((tournamentConfig) => {
      // Schedule variant A with +1 hour offset
      this.scheduleVariantTournament(tournamentConfig, MercuryVariant.A, 60);

      // Schedule variant B with +80 minutes offset
      this.scheduleVariantTournament(tournamentConfig, MercuryVariant.B, 80);

      // Schedule variant H with +100 minutes offset
      this.scheduleVariantTournament(tournamentConfig, MercuryVariant.H, 100);
    });
  }
}
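How `scheduleVariantTournament` applies its offset is not shown here. One way the shifted cron pattern could be derived is sketched below. `shiftCronPattern` is a hypothetical helper that assumes the 6-field pattern format used in the main schedule (second, minute, hour, day, month, weekday) with literal minute and hour values.

```typescript
// True for non-negative whole numbers below the given limit.
function isWholeNumberBelow(value: number, limit: number): boolean {
  return isFinite(value) && Math.floor(value) === value && value >= 0 && value < limit;
}

// Shifts the minute and hour fields of a 6-field cron pattern by offsetMinutes.
// Shifts that cross midnight are rejected because the day fields would also
// have to change, which this sketch does not attempt.
function shiftCronPattern(pattern: string, offsetMinutes: number): string {
  const fields = pattern.trim().split(/\s+/);
  if (fields.length !== 6) {
    throw new Error(`expected 6 cron fields, got ${fields.length}`);
  }
  const minute = Number(fields[1]);
  const hour = Number(fields[2]);
  if (!isWholeNumberBelow(minute, 60) || !isWholeNumberBelow(hour, 24)) {
    throw new Error(`minute and hour must be literal numbers in "${pattern}"`);
  }
  const total = hour * 60 + minute + offsetMinutes;
  if (total < 0 || total >= 24 * 60) {
    throw new Error('shift crosses midnight; day fields would also need adjusting');
  }
  fields[1] = String(total % 60);
  fields[2] = String(Math.floor(total / 60));
  return fields.join(' ');
}

// shiftCronPattern('0 20 13 * * *', 60)  → '0 20 14 * * *'  (variant A)
// shiftCronPattern('0 20 13 * * *', 80)  → '0 40 14 * * *'  (variant B)
// shiftCronPattern('0 20 13 * * *', 100) → '0 0 15 * * *'   (variant H)
```

Deriving the shifted patterns from the baseline schedule, rather than writing them by hand, keeps the 24 ABH entries in step with the 8 base tournaments.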

ARCHITECTURAL VIOLATIONS PREVENTED:

❌ NEVER use MERCURY_VARIANT=A/B/H environment variables
❌ NEVER determine variants from environment at runtime
✅ ALWAYS determine variants from time + TypeScript experiment configuration
✅ ALWAYS schedule all 24 tournaments at ABH startup

SINGLE EXECUTION METHOD:

// Unified tournament execution - no variant-specific methods
runTournament(tournamentType: TournamentType, configOverride?: Partial<TournamentConfig>)

Benefits Achieved

🎯 Clean Architecture

Strategy configuration separated from deployment concerns
Time slots handled by scheduler, not stored in config
Live trading controlled by instance service, not config data

🔒 Type Safety

Strong enums prevent typos and runtime errors
IDE autocomplete and refactoring support
Clear architectural intent enforced by type system

📊 Proper Metrics

Both instance and variant labels for complete tracking
Clear distinction between deployment and experiment data
No phantom variants or confusing legacy labels

🚀 Scalable Risk Management

Gradual live trading rollout via liveRatio
Strategy recommendations captured in topL
Fine-grained control without touching strategy logic

🛡️ Safety by Design

Only W instance has live trading keys (hardware isolation)
Conservative defaults (liveRatio=0, topL=topK)
Manual restart control for configuration changes

Implementation Questions & Answers

These are the specific answers provided during 2 hours of architectural discussion:

Phase 2: Historical Analysis Scripts (Before ABH Consolidation)

What metrics should we extract from A/B/H variants before consolidating? Question-driven scripts for known pain points (to be provided)
Which date range should we analyze for historical comparison data? Will answer later (based on available data)
How do we identify which variant (A/B/H) each tournament belongs to in historical data? Separate DBs + API access while running

Phase 3: ABH Implementation

How exactly do we determine which comparison method to use for each tournament in ABH? Time-based scheduling within ABH slot + experiment config
Where does the ABH variant read the current experiment config from? TypeScript experiment configuration files
Do we keep all A/B/H environment files and modify them, or create new mercury-abh.env? Use single ABH environment with MERCURY_INSTANCE=ABH only

Phase 4: Data Tracking

Which specific entities need experimentId/portfolioAccount fields? Just tournaments (add experimentId + variant, portfolioAccount derived from tournamentType + experimentId + variant)
How do we populate these fields? At tournament creation

Phase 5: Analysis

What specific metrics do agents need from the API? Question-driven metrics linked to experiment configs (e.g., model comparison speed, prompt performance impact, optimal topK for PnL)
How do agents access Mercury backend? Direct API calls wrapped with handy scripts that provide exact data needed to answer specific questions

Phase 6: Live Trading

Where does liveCount get set? In config (code). Environment only specifies instance type (W/R/ABH), everything else from config → naturally flows to DB
How do we prevent accidental live trading? Only W instance has live trading keys. R = readonly mode. Start with liveCount: 1, manual review, then gradually increase

Phase 7: Agent Integration

How should agents change experiment configs? Code modification (since everything in code)
What triggers ABH restart after config change? Manual by user

Phase 8: Dashboard

Which dashboard components are most broken/inconsistent? Will assess after previous phases (some bugs may naturally disappear)

Next Steps

Historical Analysis: SKIPPED. It would slow down the main feature. Instead: tag the current git state for future API image builds and preserve existing DBs (no action needed)
ABH Implementation: Single instance running A, B, or H variants based on time + experiment config
Agent Integration: Scripts for config modification and analysis
Dashboard Cleanup: Remove inconsistencies now that architecture is clean
Live Trading: Gradual rollout with user review and manual scaling

Historical Analysis (Future Option)

Git Tag: Current state tagged for building legacy API images if needed later
Database Preservation: Existing A/B/H DBs preserved automatically (no action required)
Future Access: Can analyze historical data later without blocking current development

The foundation is solid - clean separation of concerns with proper type safety and no over-engineering! 🎯
