Mercury Trading System - Comprehensive Refactoring Plan

· 4 min read
Max Kaido
Architect

Our Mercury trading system has grown organically over time, leading to intertwined services, duplicated code, and complex state management. This document outlines a strategic approach to refactoring key components to improve maintainability, testability, and performance.

Current Issues

After analyzing PositionService, OrderService, ShadowOrderService, ShadowAccountService, and their related entities, we've identified several architectural and design issues:

  1. Tightly Coupled Services: Services have excessive dependencies, making changes risky
  2. Mixed Responsibilities: Business logic mixed with job scheduling, metrics, and notifications
  3. Inconsistent Data Management: Parallel transaction systems and redundant storage
  4. Scattered Logic: Core calculations (like PnL) repeated across multiple services
  5. Complex State Management: Position and order states managed across multiple services
  6. Verbose Error Handling: Boilerplate logging obscuring core business logic

Refactoring Strategy: Domain-Driven Design

We'll apply Domain-Driven Design principles to reorganize our codebase around clear domain boundaries.

Phase 1: Domain Modeling & Boundary Definition

  1. Identify Core Domains

    • Position Management
    • Order Execution
    • Account Management
    • Transaction Processing
    • Portfolio Analytics
  2. Define Domain Models

    • Create clear models with well-defined relationships
    • Eliminate redundant fields and duplicated data
    • Standardize entity patterns
  3. Establish Bounded Contexts

    • Define clear service boundaries
    • Document integration points between domains
    • Create context maps to visualize relationships

Phase 2: Command-Query Responsibility Segregation

  1. Define Command Services

    • Create specialized write-focused services:
      • PositionCommandService: Position creation/updates
      • OrderCommandService: Order execution
      • AccountCommandService: Balance management
  2. Define Query Services

    • Create read-optimized services:
      • PositionQueryService: Position details/filtering
      • OrderQueryService: Order history/filtering
      • AccountQueryService: Performance metrics/history
  3. Implementation Plan

    • Start with one domain (e.g., positions)
    • Extract query-related methods to a new service
    • Refactor command methods to remove query dependencies
    • Update controllers/consumers to use the appropriate service
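
As a rough illustration of the split described above, the sketch below separates position writes from position reads over a shared store. Only the names PositionCommandService and PositionQueryService come from this plan. The Position shape, the in-memory Map standing in for a repository, and the method names are assumptions made for the example, not Mercury's actual code.

```typescript
// Hypothetical position shape; the real Mercury entity is not shown in this plan.
interface Position {
  id: string;
  symbol: string;
  quantity: number;
  status: "OPEN" | "CLOSED";
}

// Stand-in for a repository; an in-memory Map keeps the sketch self-contained.
type PositionStore = Map<string, Position>;

// Write side: creates and mutates positions, exposes no list/filter reads.
class PositionCommandService {
  private readonly store: PositionStore;

  constructor(store: PositionStore) {
    this.store = store;
  }

  create(id: string, symbol: string, quantity: number): void {
    if (this.store.has(id)) {
      throw new Error(`Position ${id} already exists`);
    }
    this.store.set(id, { id, symbol, quantity, status: "OPEN" });
  }

  close(id: string): void {
    const position = this.store.get(id);
    if (!position) {
      throw new Error(`Position ${id} not found`);
    }
    this.store.set(id, { ...position, status: "CLOSED" });
  }
}

// Read side: lookups and filtering only, no mutation.
class PositionQueryService {
  private readonly store: PositionStore;

  constructor(store: PositionStore) {
    this.store = store;
  }

  findById(id: string): Position | undefined {
    return this.store.get(id);
  }

  findOpenBySymbol(symbol: string): Position[] {
    return Array.from(this.store.values()).filter(
      (p) => p.symbol === symbol && p.status === "OPEN",
    );
  }
}
```

Controllers that only read would depend on PositionQueryService alone, which is what makes it possible to tune the read path later without touching write logic.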

Phase 3: Domain Service Extraction

  1. Transaction Service Consolidation

    • Create TransactionService to encapsulate all transaction operations
    • Replace embedded transaction arrays with proper relationships
    • Standardize transaction creation across all services
  2. PnL Calculator Extraction

    • Create a dedicated PnLCalculatorService for all profit/loss calculations
    • Extract the calculatePnL, calculateCombinedPnL, and calculateFinalPnL methods
    • Provide consistent methods for realized/unrealized PnL calculation
    • Add comprehensive unit tests for calculator logic
  3. Implementation Plan

    • Extract helper methods to new services
    • Use dependency injection to provide these services
    • Update existing services to use the extracted functionality
    • Write tests to validate calculation consistency
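
To make the calculator extraction more concrete, here is a minimal sketch of what a standalone PnLCalculatorService could look like. The service name comes from this plan, but the method names, the PositionSnapshot shape, and the formulas (price difference times quantity, sign-flipped for shorts, minus fees for realized PnL) are assumptions for illustration. They are not the existing calculatePnL, calculateCombinedPnL, or calculateFinalPnL implementations.

```typescript
type Side = "LONG" | "SHORT";

// Hypothetical input shape, reduced to what the formulas below need.
interface PositionSnapshot {
  side: Side;
  entryPrice: number;
  quantity: number; // position size in base units
}

// +1 for longs, -1 for shorts, so one formula covers both sides.
function direction(side: Side): number {
  return side === "LONG" ? 1 : -1;
}

class PnLCalculatorService {
  // Unrealized PnL of the whole position at the given mark price.
  calculateUnrealized(position: PositionSnapshot, markPrice: number): number {
    return (
      (markPrice - position.entryPrice) *
      position.quantity *
      direction(position.side)
    );
  }

  // Realized PnL for the part of the position closed at exitPrice, net of fees.
  calculateRealized(
    position: PositionSnapshot,
    exitPrice: number,
    closedQuantity: number,
    fees = 0,
  ): number {
    if (closedQuantity < 0 || closedQuantity > position.quantity) {
      throw new RangeError(
        "closedQuantity must be between 0 and the position quantity",
      );
    }
    return (
      (exitPrice - position.entryPrice) *
        closedQuantity *
        direction(position.side) -
      fees
    );
  }
}
```

Because the class holds no state and has no dependencies, the unit tests called for above reduce to plain input/output checks. The sketch uses number for brevity; a decimal type may be a better fit for real monetary values.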

Phase 4: Event-Driven Architecture

  1. Define Domain Events

    • Create event types for key state changes:
      • PositionCreatedEvent
      • OrderStatusChangedEvent
      • AccountBalanceChangedEvent
      • TakeProfitHitEvent
      • StopLossHitEvent
  2. Implement Event Publisher

    • Create DomainEventPublisher for event distribution
    • Use Bull queues for event propagation
    • Add retry/error handling for event processing
  3. Update Services to Use Events

    • Refactor services to emit events on state changes
    • Replace direct service calls with event subscribers
    • Decouple position updates from order processing
  4. Implementation Plan

    • Start with defining key events
    • Create event publisher infrastructure
    • Gradually replace direct service calls with events
    • Test event sequencing and error handling
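
The sketch below shows the general shape of typed domain events and a publisher with per-handler retries. The event names and DomainEventPublisher come from this plan; the payload fields, the subscribe/publish signatures, and the retry count are assumptions. It also dispatches synchronously in-process to stay self-contained, whereas the plan calls for Bull queues, where publishing would enqueue a job instead.

```typescript
// Event names come from the plan; the payload fields are assumptions.
interface PositionCreatedEvent {
  type: "PositionCreatedEvent";
  positionId: string;
  symbol: string;
}

interface OrderStatusChangedEvent {
  type: "OrderStatusChangedEvent";
  orderId: string;
  status: string;
}

// Discriminated union: handlers narrow on the `type` field.
type DomainEvent = PositionCreatedEvent | OrderStatusChangedEvent;
type EventHandler = (event: DomainEvent) => void;

class DomainEventPublisher {
  private readonly handlers = new Map<DomainEvent["type"], EventHandler[]>();
  private readonly maxAttempts: number;

  constructor(maxAttempts = 3) {
    this.maxAttempts = maxAttempts;
  }

  subscribe(type: DomainEvent["type"], handler: EventHandler): void {
    const list = this.handlers.get(type) ?? [];
    list.push(handler);
    this.handlers.set(type, list);
  }

  // Delivers the event to every subscriber of its type. Each handler is
  // retried up to maxAttempts times; errors that survive all attempts are
  // returned so that one failing subscriber does not block the others.
  publish(event: DomainEvent): Error[] {
    const failures: Error[] = [];
    for (const handler of this.handlers.get(event.type) ?? []) {
      for (let attempt = 1; attempt <= this.maxAttempts; attempt++) {
        try {
          handler(event);
          break;
        } catch (err) {
          if (attempt === this.maxAttempts) {
            failures.push(err instanceof Error ? err : new Error(String(err)));
          }
        }
      }
    }
    return failures;
  }
}
```

With this shape, an order service would publish an OrderStatusChangedEvent and a position listener would subscribe to it, rather than the order service calling the position service directly.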

Phase 5: Job Scheduling Separation

  1. Create Dedicated Scheduler Services

    • PositionSchedulerService: Position creation/update scheduling
    • OrderSchedulerService: Order execution scheduling
    • AnalyticsSchedulerService: Metrics calculation scheduling
  2. Extract Queue Logic

    • Move queue-related code from business services
    • Standardize job parameters and retry logic
    • Improve job naming and tracking
  3. Implementation Plan

    • Extract scheduling methods to new services
    • Define clear job types and parameters
    • Update modules to use scheduler services
    • Add job validation and monitoring
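
One way the scheduling extraction could look is sketched below. PositionSchedulerService is named in this plan; everything else is an assumption: the JobQueue interface (loosely modelled on the add(name, data, opts) shape of a Bull queue), the job name, the retry defaults, and the deterministic job id. An in-memory queue stands in for the real one so the example runs on its own.

```typescript
// Hypothetical, simplified job options; not Bull's actual option type.
interface JobOptions {
  jobId: string;
  attempts: number;
  backoffMs: number;
}

// Minimal queue contract the scheduler depends on.
interface JobQueue {
  add(name: string, data: unknown, opts: JobOptions): void;
}

// One shared retry policy instead of per-call ad hoc values (values assumed).
const DEFAULT_RETRY = { attempts: 3, backoffMs: 1000 };

class PositionSchedulerService {
  private readonly queue: JobQueue;

  constructor(queue: JobQueue) {
    this.queue = queue;
  }

  // Validates the parameters, then enqueues a named job with a predictable id
  // so the job is easy to find when tracking or monitoring.
  scheduleUpdate(positionId: string): string {
    if (positionId.trim() === "") {
      throw new Error("positionId is required");
    }
    const jobId = `position.update:${positionId}`;
    this.queue.add("position.update", { positionId }, { jobId, ...DEFAULT_RETRY });
    return jobId;
  }
}

// Test double that records jobs instead of sending them to a real queue.
class InMemoryQueue implements JobQueue {
  readonly jobs: { name: string; data: unknown; opts: JobOptions }[] = [];

  add(name: string, data: unknown, opts: JobOptions): void {
    this.jobs.push({ name, data, opts });
  }
}
```

Business services would then call scheduleUpdate and never touch the queue, which keeps job naming and retry settings in one place.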

Phase 6: Entity Relationship Redesign

  1. Simplify Position and Order Relationship

    • Clarify ownership and lifecycle dependencies
    • Standardize relationship cardinality
    • Improve deletion/orphaning behavior
  2. Standardize Entity Design

    • Apply consistent patterns for default values
    • Standardize nullability across similar fields
    • Document validation rules in schema
  3. Implementation Plan

    • Model new entity relationships
    • Create migration plan for existing data
    • Update repositories and services
    • Verify referential integrity

Implementation Approach

Step 1: Create Baseline Tests

Before making significant changes, we need comprehensive tests:

  1. Unit Tests

    • Focus on core business logic
    • Mock dependencies to isolate functionality
    • Test edge cases and error handling
  2. Integration Tests

    • Test service interactions
    • Verify database operations
    • Validate event processing
  3. End-to-End Tests

    • Test complete workflows
    • Validate system behavior
    • Ensure backward compatibility

Step 2: Incremental Refactoring

We'll follow this strategy for each component:

  1. Extract Domain Logic

    • Identify core business rules
    • Move to appropriate domain services
    • Test thoroughly
  2. Replace Direct Dependencies

    • Inject services through the constructor
    • Replace tight coupling with events
    • Update tests to use mocks
  3. Improve Error Handling

    • Standardize error patterns
    • Add meaningful error contexts
    • Improve logging consistency
  4. Document Interfaces

    • Define clear method contracts
    • Document domain events
    • Update API documentation
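
For the error-handling step, a standardized pattern might look like the sketch below: a single error type that carries a machine-readable code plus structured context, and a small wrapper that replaces repeated try/catch/log boilerplate. The names DomainError and withErrorContext, the "UNEXPECTED" code, and the context fields are all hypothetical and chosen only for this example.

```typescript
// Hypothetical shared error type carrying a code and structured context.
class DomainError extends Error {
  readonly code: string;
  readonly context: Record<string, unknown>;

  constructor(
    code: string,
    message: string,
    context: Record<string, unknown> = {},
  ) {
    super(message);
    this.name = "DomainError";
    this.code = code;
    this.context = context;
    // Keeps instanceof working when compiled to older JavaScript targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Runs fn and converts any unstructured failure into a DomainError that
// records which operation failed and with which identifiers.
function withErrorContext<T>(
  operation: string,
  context: Record<string, unknown>,
  fn: () => T,
): T {
  try {
    return fn();
  } catch (err) {
    if (err instanceof DomainError) {
      throw err; // already structured, pass through unchanged
    }
    const message = err instanceof Error ? err.message : String(err);
    throw new DomainError("UNEXPECTED", `${operation} failed: ${message}`, context);
  }
}
```

A service method would wrap its body once, for example withErrorContext("closePosition", { positionId }, () => ...), and leave logging to a single handler that reads code and context.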

Step 3: Performance Monitoring

To ensure our refactoring improves the system:

  1. Add Metrics

    • Response times for key operations
    • Resource usage before/after
    • Error rates and performance degradation
  2. Benchmark Critical Paths

    • Position creation/update flow
    • Order execution flow
    • Transaction processing
  3. Monitor in Staging

    • Deploy changes to staging environment
    • Compare metrics with production
    • Validate performance under load
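
As a minimal sketch of the response-time metric, the helper below times an operation and keeps the samples per operation name so before/after averages can be compared. OperationTimer, its methods, and the operation names are assumptions; a real setup would more likely export these values to the metrics system already in use. The clock is injectable so the helper can be tested deterministically.

```typescript
type Clock = () => number; // returns the current time in milliseconds

class OperationTimer {
  private readonly samples = new Map<string, number[]>();
  private readonly now: Clock;

  constructor(now: Clock = () => Date.now()) {
    this.now = now;
  }

  // Runs fn, records how long it took under the given operation name,
  // and returns fn's result. The duration is recorded even if fn throws.
  measure<T>(operation: string, fn: () => T): T {
    const start = this.now();
    try {
      return fn();
    } finally {
      const list = this.samples.get(operation) ?? [];
      list.push(this.now() - start);
      this.samples.set(operation, list);
    }
  }

  // Mean duration in milliseconds, or undefined if nothing was recorded.
  averageMs(operation: string): number | undefined {
    const list = this.samples.get(operation);
    if (!list || list.length === 0) {
      return undefined;
    }
    return list.reduce((sum, value) => sum + value, 0) / list.length;
  }
}
```

Wrapping the critical paths listed above, for example timer.measure("position.create", () => ...), before the refactoring starts gives a baseline to compare against afterwards.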

Prioritized Refactoring Tasks

  1. Immediate Wins (Weeks 1-2)

    • Remove redundant transaction storage in ShadowAccount
    • Extract PnL calculation to dedicated service
    • Standardize error handling patterns
  2. Medium-Term Changes (Weeks 3-6)

    • Implement command/query separation for positions
    • Extract transaction service
    • Create event publisher infrastructure
  3. Long-Term Restructuring (Weeks 7-12)

    • Complete event-driven architecture
    • Redesign entity relationships
    • Implement scheduler services
    • Add performance monitoring

Progress Update (June 2025)

We've made significant progress on our refactoring plan, with two major phases now complete:

✅ Phase 4: Event-Driven Architecture

We successfully implemented an event-driven architecture across the Mercury system:

  1. Domain Events: Defined clear events for positions, orders, and transactions
  2. Event Publishers: Implemented publisher services for each domain
  3. Event Listeners: Created listener services that react to events
  4. Decoupled Services: Replaced direct service calls with event-based communication

This has greatly reduced the tight coupling between services and improved testability. Services now communicate through well-defined events rather than direct method calls.

✅ Phase 2: Command-Query Responsibility Segregation (CQRS)

We've implemented the CQRS pattern for our core domains:

  1. Position Query Service: Separated read operations for positions
  2. Order Query Service: Separated read operations for orders
  3. Updated Controllers: Controllers now use query services for read operations

This separation improves our ability to optimize read and write operations independently and clarifies the responsibilities of each service.

🔄 In Progress: Phase 5: Job Scheduling Separation

Our current focus is on extracting the job scheduling logic from business services:

  1. Started with identifying scheduling code in position and order services
  2. Planning dedicated scheduler services for each domain

Next Steps

  1. Complete Job Scheduling Separation:

    • Create PositionSchedulerService and OrderSchedulerService
    • Move queue-related code from business services
  2. Move to Phase 6: Entity Relationship Redesign:

    • Simplify the relationship between positions and orders
    • Standardize entity design patterns across the system
  3. Consolidate Transaction Services:

    • Complete the implementation of TransactionService
    • Remove duplicated transaction code

Conclusion

This refactoring plan targets the most critical issues in our Mercury trading system while minimizing risk through incremental changes. By focusing on domain boundaries, separating concerns, and improving code organization, we'll create a more maintainable and extensible system.

The event-driven approach will reduce service coupling, making the system more resilient and easier to test. Standardizing our entity design and consolidating core logic will improve consistency and reduce bugs.

Most importantly, these changes preserve the core functionality of the Mercury trading system while setting the stage for future features and improvements.