Clean Architecture Refactoring For MLP: A Comprehensive Guide

by SLV Team
Architecture Refactoring Proposal: Clean Architecture for MLP

Hey guys! Let's dive into a super important topic today: refactoring the architecture of our MLP (Multilayer Perceptron) repository using the Clean Architecture principles. This is a big deal because it's all about making our codebase more maintainable, extensible, and just plain awesome. We're going to break down the proposal, discuss the problems we're facing, and lay out a clear plan for how we're going to tackle this. So, buckle up, grab your favorite beverage, and let's get started!

Overview

This article is all about proposing a major revamp of the src directory in our Rust MLP repository. We're talking about implementing Clean Architecture, a design philosophy that's going to help us separate concerns, encapsulate those pesky external dependencies, and boost the overall extensibility and maintainability of our code. Right now, we've got a rather flat structure, and we're aiming for something much more organized and scalable.

Current Problems

Before we jump into the shiny new architecture, let's be real about the issues we're currently dealing with. Understanding these problems is key to appreciating the benefits of Clean Architecture.

Architectural Issues

  • Flat Structure: Imagine a single drawer crammed with all sorts of items – that's kind of what our src/ directory looks like right now. We've got 14 files, like tensor.rs, layers.rs, and trainer.rs, all sitting together with no clear organization. This makes it hard to find things and understand how everything fits together.
  • Tight Coupling: Think of it like a tangled web of code. Our data processing, training, and inference logic are all interconnected. This tight coupling means that if we change one part, it can have unpredictable effects on others. Not ideal, right?
  • External Dependency Leakage: Imagine if your walls weren't properly insulated – that's what's happening with our external dependencies. Things like CSV loading and WASM bindings are directly tied to our internal layers. This makes our core logic dependent on external factors, which isn't great for flexibility.
  • Domain-Specific: Right now, our repository is kind of a one-trick pony. It's highly specialized for the breast cancer dataset, which limits its generalizability. We want to build something that can handle a variety of tasks.
  • Extension Difficulties: Want to add a new task, like multi-class classification or regression? It's not as straightforward as it should be. The impact scope of adding new features is unclear, making the process more complex and risky.

Technical Issues

Let's get a bit more technical. Here's a peek at our current dependencies, and trust me, it's not a pretty picture:

Current Dependencies (Problematic Structure):
wasm.rs -> trainer.rs -> layers.rs -> tensor.rs
dataset.rs -> tensor.rs
trainer.rs -> loss.rs, metrics.rs, optimizer.rs

Problem: Everything cross-references, making change impact unpredictable

As you can see, everything is cross-referencing each other. This creates a tangled mess where it's tough to predict what will break when you make a change. We need to untangle this web!

Proposed Improvement: Clean Architecture

Alright, let's talk solutions! We're proposing a shift to Clean Architecture. This isn't just a buzzword; it's a proven way to structure our code so that it's more modular, testable, and maintainable. The core idea is to separate our application into layers, each with a specific responsibility.

Proposed New Directory Structure

Here's a sneak peek at the directory structure we're aiming for:

src/
├── core/                    # Infrastructure Layer (Domain-independent)
│   ├── tensor.rs           # Tensor operations & automatic differentiation
│   ├── graph.rs            # Computation graph
│   ├── ops.rs              # Basic operations
│   └── error.rs            # Common error types
├── domain/                  # Domain Layer (Business logic)
│   ├── models/
│   │   ├── mlp.rs          # MLP abstract model
│   │   └── layer.rs        # Layer abstraction
│   ├── services/
│   │   ├── loss.rs         # Loss function interfaces
│   │   ├── metrics.rs      # Evaluation metrics
│   │   └── optimizer.rs    # Optimization algorithms
│   ├── ports.rs            # External interface definitions (Repository traits)
│   └── types.rs            # Domain-specific type definitions
├── usecase/                 # Use Case Layer (Application logic)
│   ├── preprocess.rs       # Data preprocessing (normalization, one-hot)
│   ├── train_mlp.rs        # MLP training use case
│   ├── predict_mlp.rs      # MLP inference use case
│   └── evaluate.rs         # Model evaluation use case
├── adapters/                # Adapter Layer (External interface implementations)
│   ├── data/
│   │   ├── csv_repo.rs     # CSV loading implementation
│   │   └── generic_repo.rs # Generic data loading
│   └── presentation/
│       ├── wasm/
│       │   └── bindings.rs # WASM bindings
│       └── native/
│           └── visualizer.rs # Native visualization
├── app/                     # Application Layer (Integration & coordination)
│   ├── mlp.rs              # High-level API (processing → training → inference)
│   ├── config.rs           # Application configuration
│   └── facade.rs           # Integration facade
└── bin/                     # Entry Points
    └── cli.rs              # CLI implementation

Let's break down what each layer is all about:

  • core/: This is our infrastructure layer. It's the heart of our system, dealing with the fundamental operations that are domain-independent. Think of things like tensor operations, computation graphs, and error handling. It's the rock-solid foundation upon which everything else is built.
  • domain/: Here lies our business logic. This layer contains our models (MLP, layers), services (loss functions, metrics, optimizers), and interfaces (repository traits). It's the brains of the operation, defining the core concepts and rules of our application.
  • usecase/: This is where we define how our application is used. It's the application logic layer, containing specific use cases like training the MLP, making predictions, and evaluating models. It's all about orchestrating the domain logic to achieve specific tasks.
  • adapters/: This layer is our gateway to the outside world. It's the external interface implementation layer, handling things like data loading (CSV, generic data sources) and presentation (WASM bindings, native visualizations). It's responsible for adapting external systems to our internal logic.
  • app/: Think of this as the conductor of our orchestra. The application layer is responsible for integration and coordination. It provides a high-level API for processing, training, and inference, and handles application configuration.
  • bin/: This is where our entry points live. It's the launching pad for our application, containing the CLI implementation and other entry points.

Dependency Direction

One of the key principles of Clean Architecture is the direction of dependencies. We want the inner layers to be independent of the outer layers. Here's how our dependencies should flow:

bin/ → app/ → usecase/ → domain/ → core/
         ↑        ↑          ↑
         └── adapters/ ──────┘
             (implement the ports.rs interfaces defined in domain/)

The important thing to remember: Inner layers shouldn't know anything about outer layers. This is the Dependency Inversion Principle in action. It keeps our core logic clean and free from external influences.
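To make the rule concrete, here is a minimal, self-contained sketch of the dependency inversion at work. The names (`DataSource`, `InMemorySource`, `describe`) are illustrative, not items from the repository: the domain layer owns the trait, the use case depends only on that trait, and the adapter implements it from the outside.

```rust
// domain/ports.rs: the inner layer defines the interface it needs.
trait DataSource {
    fn rows(&self) -> usize;
}

// usecase: depends only on the domain trait, never on a concrete adapter.
fn describe(source: &dyn DataSource) -> String {
    format!("dataset with {} rows", source.rows())
}

// adapters: the outer layer implements the domain's trait.
struct InMemorySource {
    data: Vec<Vec<f32>>,
}

impl DataSource for InMemorySource {
    fn rows(&self) -> usize {
        self.data.len()
    }
}

fn main() {
    let source = InMemorySource {
        data: vec![vec![1.0, 2.0], vec![3.0, 4.0]],
    };
    // The use case works against the abstraction; swapping in a CSV-backed
    // implementation would require no change to `describe`.
    println!("{}", describe(&source)); // dataset with 2 rows
}
```

Because the arrow points inward (the adapter implements a trait the domain defines), replacing `InMemorySource` with a CSV repository touches only the adapter layer.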

Implementation Requirements

So, how are we going to make this Clean Architecture dream a reality? Let's break down some key implementation requirements.

1. Task Type Generalization

First, we need to make our system more flexible by introducing a TaskKind enum. This will allow us to handle different types of tasks, like binary classification, multi-class classification, and regression, with ease.

#[derive(Debug, Clone, PartialEq)]
pub enum TaskKind {
    BinaryClassification,    // sigmoid + BCE
    MultiClassification,     // softmax + CrossEntropy  
    Regression,              // identity + MSE
}

2. Domain Abstraction

Next up, we'll define repository traits in domain/ports.rs. These traits will abstract away the specifics of data access and model persistence, making our domain logic more portable and testable.

pub trait DataRepository {
    fn load_dataset(&self, config: &DataConfig) -> Result<Dataset>;
}

pub trait ModelRepository {
    fn save_model(&self, model: &MLP, path: &str) -> Result<()>;
    fn load_model(&self, path: &str) -> Result<MLP>;
}

3. Use Case Implementation

Let's take a peek at how we might implement a training use case in usecase/train_mlp.rs:

use std::sync::Arc;

pub struct TrainMLPUsecase<R: DataRepository> {
    data_repo: Arc<R>,
}

impl<R: DataRepository> TrainMLPUsecase<R> {
    // preprocess, build_model, and train are private helpers on this
    // struct (omitted here for brevity).
    pub fn execute(&self, request: TrainRequest) -> Result<TrainResponse> {
        // Load data through the repository port
        let dataset = self.data_repo.load_dataset(&request.data_config)?;

        // Preprocessing (normalization, one-hot encoding, ...)
        let processed = self.preprocess(dataset, &request.preprocess_config)?;

        // Build model from configuration
        let model = self.build_model(&request.model_config)?;

        // Execute training; train returns the model so we can hand it back
        let (model, history) = self.train(model, processed, &request.train_config)?;

        Ok(TrainResponse { model, history })
    }
}

This shows how we can use the DataRepository trait to load data, preprocess it, build a model, and train it, all within the use case.

4. High-Level API

In app/mlp.rs, we'll create a high-level API that simplifies the process of training and evaluating our models. This API will act as a facade, hiding the complexities of the underlying layers.

pub struct MLPApplication {
    train_usecase: TrainMLPUsecase<CsvRepository>,
    predict_usecase: PredictMLPUsecase<CsvRepository>,
    evaluate_usecase: EvaluateUsecase,
}

impl MLPApplication {
    /// One-stop execution: processing → training → evaluation
    pub fn train_and_evaluate(&self, config: MLPConfig) -> Result<MLPResult> {
        let train_result = self.train_usecase.execute(config.train_request)?;
        let eval_result = self.evaluate_usecase.execute(EvaluateRequest {
            // Clone so the model can also be returned below (assumes MLP: Clone)
            model: train_result.model.clone(),
            test_data: config.test_data,
        })?;

        Ok(MLPResult {
            model: train_result.model,
            training_history: train_result.history,
            evaluation: eval_result,
        })
    }
}

5. CLI Support

Finally, we'll add CLI support in bin/cli.rs to make our application easy to use from the command line. This will allow users to specify tasks, data paths, and configurations.

use clap::Parser;
use std::path::PathBuf;

#[derive(Parser)]
struct Args {
    /// Task to run; TaskKind must also derive clap::ValueEnum to parse here
    #[clap(short, long, value_enum)]
    task: TaskKind,

    #[clap(short, long)]
    data: PathBuf,

    #[clap(short, long, default_value = "config.json")]
    config: PathBuf,
}

Technical Details

Let's get a bit more granular and talk about some of the design patterns and techniques we'll be using.

Design Pattern Applications

  • Repository Pattern: We'll use this to abstract data access, as we saw earlier with the DataRepository trait.
  • Strategy Pattern: This will come in handy for selecting loss functions and optimization algorithms at runtime.
  • Factory Pattern: We'll use this to abstract the creation of our models, making it easy to swap out different architectures.
  • Facade Pattern: Our high-level API in the app/ layer will act as a facade, simplifying complex processes for the user.
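To illustrate the Strategy pattern entry above, here is a self-contained sketch of runtime loss selection. The `Loss` trait, the concrete strategies, and `select_loss` are all illustrative, not the repository's actual interfaces.

```rust
trait Loss {
    fn compute(&self, pred: f32, target: f32) -> f32;
}

struct MeanSquaredError;
impl Loss for MeanSquaredError {
    fn compute(&self, pred: f32, target: f32) -> f32 {
        (pred - target).powi(2)
    }
}

struct AbsoluteError;
impl Loss for AbsoluteError {
    fn compute(&self, pred: f32, target: f32) -> f32 {
        (pred - target).abs()
    }
}

// The trainer holds a boxed strategy and never needs to know which one
// it was handed; new losses plug in without touching the training loop.
fn select_loss(name: &str) -> Box<dyn Loss> {
    match name {
        "mse" => Box::new(MeanSquaredError),
        _ => Box::new(AbsoluteError),
    }
}

fn main() {
    let loss = select_loss("mse");
    println!("{}", loss.compute(3.0, 1.0)); // 4
}
```

The same shape works for optimizers: a trait object chosen once at configuration time, then used uniformly inside the training loop.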

Type-Level Safety

We're big fans of type-level safety, so we'll be using Rust's powerful type system to ensure that our code is as robust as possible. For example, we can use PhantomData to associate a task type with our MLPModel at compile time.

use std::marker::PhantomData;

/// Associates each task with its loss and activation at the type level
pub trait TaskType {
    type LossFunction;
    type ActivationFunction;
}

// Compile-time task type guarantee
pub struct MLPModel<T: TaskType> {
    layers: Sequential,
    task_type: PhantomData<T>,
}

pub struct BinaryClassification;
pub struct MultiClassification;
pub struct Regression;

impl TaskType for BinaryClassification {
    type LossFunction = BinaryCrossEntropy;
    type ActivationFunction = Sigmoid;
}

Error Handling

Robust error handling is crucial. We'll define specific error types for each layer, making it easier to understand and handle errors that occur in our application.

// Define appropriate error types for each layer
#[derive(Debug, thiserror::Error)]
pub enum DomainError {
    #[error("Invalid model configuration: {message}")]
    InvalidModelConfig { message: String },
    
    #[error("Training failed: {source}")]
    TrainingFailed { source: CoreError },
}

Deliverables & Acceptance Criteria

To keep things organized, we'll break down the refactoring into phases, each with its own deliverables and acceptance criteria.

Phase 1: Foundation

  • Deliverables:
    • New directory structure created.
    • core/ modules migrated (tensor.rs, graph.rs, ops.rs, error.rs).
    • Interfaces defined in domain/ports.rs.
  • Acceptance Criteria:
    • Basic compilation errors resolved.

Phase 2: Domain Layer Implementation

  • Deliverables:
    • MLP and layers abstracted in domain/models/.
    • Loss, metrics, and optimization implemented in domain/services/.
    • TaskKind enum and type-level safety implemented.
  • Acceptance Criteria:
    • Domain layer components are well-defined and loosely coupled.

Phase 3: Use Case Layer Implementation

  • Deliverables:
    • Training, inference, and evaluation use cases implemented in usecase/.
    • Data preprocessing logic separated.
  • Acceptance Criteria:
    • Use cases are clearly defined and orchestrate domain logic effectively.

Phase 4: Adapter Layer Implementation

  • Deliverables:
    • CSV and generic data loading implemented in adapters/data/.
    • WASM and visualization implemented in adapters/presentation/.
  • Acceptance Criteria:
    • Adapters provide a clean interface to external systems.

Phase 5: Application Layer & Integration

  • Deliverables:
    • High-level API implemented in app/mlp.rs.
    • CLI entry point implemented in bin/cli.rs.
  • Acceptance Criteria:
    • Application layer provides a simple and intuitive API.
    • CLI is functional and easy to use.
    • Compatibility with existing examples/ is ensured.

Phase 6: Testing & Documentation

  • Deliverables:
    • Unit tests implemented for each layer.
    • Integration tests implemented.
    • README.md updated to reflect the new structure and usage.
    • Multi-class classification and regression samples added.
  • Acceptance Criteria:
    • Code is thoroughly tested.
    • Documentation is clear and up-to-date.
    • New samples demonstrate the versatility of the refactored architecture.

Verification Items

To ensure we've hit our goals, we'll verify the following:

  • Training and inference work with the existing breast cancer dataset.
  • The new multi-class classification sample works.
  • The new regression sample works.
  • The WASM build succeeds.
  • The task type can be specified from the CLI.

Future Extensibility

One of the biggest benefits of Clean Architecture is that it makes our system much easier to extend in the future. Let's look at some ways we can leverage this.

Workspace Preparation

We can prepare for the future by organizing our project as a Cargo workspace. This will allow us to split our codebase into separate crates, each corresponding to a layer in our architecture.

# Future Cargo.toml (workspace structure)
[workspace]
members = [
    "core",           # mlp-core crate
    "domain",         # mlp-domain crate  
    "usecase",        # mlp-usecase crate
    "adapters",       # mlp-adapters crate
    "app",            # mlp-app crate
    "cli",            # mlp-cli binary crate
]

Extension Points

With our Clean Architecture in place, adding new features becomes much easier. Here are some potential extension points:

  • New optimization algorithms (Adam, RMSprop).
  • New layer types (Convolutional, LSTM).
  • New data sources (Database, API).
  • New deployment targets (Python bindings, Mobile).

Module Migration Map (Current → New Structure)

To make the migration process smoother, let's map out how our existing modules will move into the new structure.

Phase 1: Infrastructure Layer (Core)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
src/tensor.rs             →  src/core/tensor.rs          Direct migration
src/graph.rs              →  src/core/graph.rs           Direct migration
src/ops.rs                →  src/core/ops.rs             Direct migration
src/error.rs              →  src/core/error.rs           Direct migration

Phase 2: Domain Layer (Domain)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
src/layers.rs             →  src/domain/models/layer.rs  Abstract Sequential etc.
                          →  src/domain/models/mlp.rs    Separate MLP-specific logic
src/loss.rs               →  src/domain/services/loss.rs Put loss functions behind interfaces
src/metrics.rs            →  src/domain/services/metrics.rs Expose metrics as a service
src/optimizer.rs          →  src/domain/services/optimizer.rs Expose optimizers as a service
New creation              →  src/domain/ports.rs         Define Repository traits
New creation              →  src/domain/types.rs         Domain types like TaskKind

Phase 3: Use Case Layer (Usecase)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
src/trainer.rs            →  src/usecase/train_mlp.rs    Extract training logic
                          →  src/usecase/predict_mlp.rs  Extract inference logic
                          →  src/usecase/evaluate.rs     Extract evaluation logic
New creation              →  src/usecase/preprocess.rs   Preprocessing use case

Phase 4: Adapter Layer (Adapters)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
src/dataset.rs            →  src/adapters/data/csv_repo.rs Repository implementation
src/generic_dataset.rs    →  src/adapters/data/generic_repo.rs Generic data loading
src/wasm.rs               →  src/adapters/presentation/wasm/bindings.rs WASM provision
src/visualizer.rs         →  src/adapters/presentation/native/visualizer.rs Visualization

Phase 5: Application Layer (App)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
New creation              →  src/app/mlp.rs              Create high-level API
New creation              →  src/app/config.rs           Configuration management
New creation              →  src/app/facade.rs           Integration facade

Phase 6: Entry Points (Bin)

Current Location           →  New Location                 Changes
────────────────────────────────────────────────────────────────
examples/training_demo.rs →  src/bin/cli.rs              Convert to a CLI entry point
examples/prediction.rs    →  (folded into cli.rs)        Convert to a CLI subcommand

lib.rs Updates

Finally, we'll update our src/lib.rs to reflect the new module structure.

// New src/lib.rs structure
pub mod core;
pub mod domain;
pub mod usecase; 
pub mod adapters;
pub mod app;

// Re-exports for backward compatibility
pub mod prelude {
    // Maintain compatibility with existing APIs
    pub use crate::app::mlp::*;
    pub use crate::domain::models::*;
    pub use crate::core::tensor::Tensor;
    // ...
}

Discussion & Review Points

Before we dive into implementation, let's make sure we're all on the same page. Here are some key discussion and review points.

Design Decision Validity

  • Directory Structure: Does the proposed 6-layer structure make sense?
  • Responsibility Separation: Are the responsibilities of each layer clearly defined?
  • Dependencies: Is the Dependency Inversion Principle properly applied?

Implementation Approach Confirmation

  • Migration Strategy: Can we migrate from the existing code smoothly?
  • Performance: Is the abstraction overhead within an acceptable range?
  • Testability: Will it be easy to write unit tests for each layer?

Future Extensibility

  • New Task Addition: Will it be easy to add tasks beyond classification/regression?
  • New Data Formats: Will it be easy to add data sources beyond CSV?
  • Deployment Forms: Will it be easy to support targets beyond WASM?

Please review this proposal and provide your feedback before we start implementation. Your input is invaluable!