# API Reference
(Google Gemini Translation)
This document provides a detailed description of all API interfaces, classes, and functions for the emotion and physiological state change prediction model.
## Table of Contents
1. [Model Classes](#model-classes)
2. [Data Processing Classes](#data-processing-classes)
3. [Utility Classes](#utility-classes)
4. [Loss Functions](#loss-functions)
5. [Evaluation Metrics](#evaluation-metrics)
6. [Factory Functions](#factory-functions)
7. [Command-Line Interface](#command-line-interface)
## Model Classes
### `PADPredictor`
A Multi-Layer Perceptron-based predictor for emotion and physiological state changes.
```python
class PADPredictor(nn.Module):
    def __init__(self,
                 input_dim: int = 7,
                 output_dim: int = 3,
                 hidden_dims: list = [512, 256, 128],
                 dropout_rate: float = 0.3,
                 weight_init: str = "xavier_uniform",
                 bias_init: str = "zeros")
```
#### Parameters
- `input_dim` (int): Input dimension, defaults to 7 (User PAD 3D + Vitality 1D + AI Current PAD 3D)
- `output_dim` (int): Output dimension, defaults to 3 (ΔPAD 3D, Pressure is dynamically calculated via formula)
- `hidden_dims` (list): List of hidden layer dimensions, defaults to [512, 256, 128]
- `dropout_rate` (float): Dropout probability, defaults to 0.3
- `weight_init` (str): Weight initialization method, defaults to "xavier_uniform"
- `bias_init` (str): Bias initialization method, defaults to "zeros"
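The 7-dimensional input vector concatenates the three blocks listed above. A minimal sketch of assembling it, assuming the same ordering as the quick-prediction CLI example later in this document (user PAD, vitality, AI PAD):

```python
# Assembling the 7-dimensional input vector (ordering assumed from the
# quick-prediction example: user PAD, then vitality, then AI current PAD).
user_pad = [0.5, 0.3, -0.2]   # user Pleasure, Arousal, Dominance
vitality = [75.0]             # vitality (1 value)
ai_pad = [0.1, 0.4, -0.1]     # AI's current PAD state

x = user_pad + vitality + ai_pad
print(len(x))  # 7
```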
#### Methods
##### `forward(self, x: torch.Tensor) -> torch.Tensor`
Forward pass.
**Parameters:**
- `x` (torch.Tensor): Input tensor with shape (batch_size, input_dim)
**Returns:**
- `torch.Tensor`: Output tensor with shape (batch_size, output_dim)
**Example:**
```python
import torch
from src.models.pad_predictor import PADPredictor
model = PADPredictor()
input_data = torch.randn(4, 7) # batch_size=4, input_dim=7
output = model(input_data)
print(f"Output shape: {output.shape}") # torch.Size([4, 3])
```
##### `predict_components(self, x: torch.Tensor) -> Dict[str, torch.Tensor]`
Predicts and decomposes output components.
**Parameters:**
- `x` (torch.Tensor): Input tensor
**Returns:**
- `Dict[str, torch.Tensor]`: Dictionary containing various components
- `'delta_pad'`: ΔPAD (3D)
- `'delta_pressure'`: ΔPressure (1D, dynamically calculated)
- `'confidence'`: Confidence (1D, optional)
**Example:**
```python
components = model.predict_components(input_data)
print(f"ΔPAD shape: {components['delta_pad'].shape}") # torch.Size([4, 3])
print(f"ΔPressure shape: {components['delta_pressure'].shape}") # torch.Size([4, 1])
print(f"Confidence shape: {components['confidence'].shape}") # torch.Size([4, 1])
```
##### `get_model_info(self) -> Dict[str, Any]`
Retrieves model information.
**Returns:**
- `Dict[str, Any]`: Dictionary containing model information
**Example:**
```python
info = model.get_model_info()
print(f"Model type: {info['model_type']}")
print(f"Total parameters: {info['total_parameters']}")
print(f"Trainable parameters: {info['trainable_parameters']}")
```
##### `save_model(self, filepath: str, include_optimizer: bool = False, optimizer: Optional[torch.optim.Optimizer] = None)`
Saves the model to a file.
**Parameters:**
- `filepath` (str): Path to save the model
- `include_optimizer` (bool): Whether to include optimizer state, defaults to False
- `optimizer` (Optional[torch.optim.Optimizer]): Optimizer object
**Example:**
```python
model.save_model("model.pth", include_optimizer=True, optimizer=optimizer)
```
##### `load_model(cls, filepath: str, device: str = 'cpu') -> 'PADPredictor'`
Loads the model from a file.
**Parameters:**
- `filepath` (str): Path to the model file
- `device` (str): Device type, defaults to 'cpu'
**Returns:**
- `PADPredictor`: Loaded model instance
**Example:**
```python
loaded_model = PADPredictor.load_model("model.pth", device='cuda')
```
##### `freeze_layers(self, layer_names: list = None)`
Freezes parameters of specified layers.
**Parameters:**
- `layer_names` (list): List of layer names to freeze; if None, all layers are frozen
**Example:**
```python
# Freeze all layers
model.freeze_layers()
# Freeze specific layers
model.freeze_layers(['network.0.weight', 'network.2.weight'])
```
##### `unfreeze_layers(self, layer_names: list = None)`
Unfreezes parameters of specified layers.
**Parameters:**
- `layer_names` (list): List of layer names to unfreeze; if None, all layers are unfrozen
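Freezing presumably works by toggling each parameter's `requires_grad` flag. A torch-free sketch of the selection logic, with hypothetical parameter names:

```python
# Sketch of freeze/unfreeze selection; parameter names here are hypothetical.
# The boolean stands in for each parameter's requires_grad flag.
params = {
    'network.0.weight': True,
    'network.0.bias': True,
    'network.2.weight': True,
}

def set_requires_grad(params, names=None, value=False):
    """Freeze (value=False) or unfreeze (value=True) matching parameters.
    names=None matches every parameter, mirroring the documented default."""
    for name in params:
        if names is None or name in names:
            params[name] = value

set_requires_grad(params)                                     # freeze all
set_requires_grad(params, ['network.2.weight'], value=True)   # unfreeze one
print(params['network.2.weight'])  # True
```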
## Data Processing Classes
### `DataPreprocessor`
Data preprocessor responsible for feature and label scaling.
```python
class DataPreprocessor:
    def __init__(self,
                 feature_scaler: str = "standard",
                 label_scaler: str = "standard",
                 feature_range: tuple = None,
                 label_range: tuple = None)
```
#### Parameters
- `feature_scaler` (str): Feature scaling method, defaults to "standard"
- `label_scaler` (str): Label scaling method, defaults to "standard"
- `feature_range` (tuple): Feature range for MinMax scaling
- `label_range` (tuple): Label range for MinMax scaling
#### Methods
##### `fit(self, features: np.ndarray, labels: np.ndarray) -> 'DataPreprocessor'`
Fits preprocessor parameters.
**Parameters:**
- `features` (np.ndarray): Training feature data
- `labels` (np.ndarray): Training label data
**Returns:**
- `DataPreprocessor`: Self instance
##### `transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Transforms data.
**Parameters:**
- `features` (np.ndarray): Input feature data
- `labels` (np.ndarray, optional): Input label data
**Returns:**
- `tuple`: (transformed features, transformed labels)
##### `fit_transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Fits and transforms data.
##### `inverse_transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Inverse transforms data.
##### `save(self, filepath: str)`
Saves the preprocessor to a file.
##### `load(cls, filepath: str) -> 'DataPreprocessor'`
Loads the preprocessor from a file.
**Example:**
```python
from src.data.preprocessor import DataPreprocessor
# Create preprocessor
preprocessor = DataPreprocessor(
    feature_scaler="standard",
    label_scaler="standard"
)
# Fit and transform data
processed_features, processed_labels = preprocessor.fit_transform(train_features, train_labels)
# Save preprocessor
preprocessor.save("preprocessor.pkl")
# Load preprocessor
loaded_preprocessor = DataPreprocessor.load("preprocessor.pkl")
```
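For reference, "standard" scaling is the usual z-score transform. A numpy sketch of what `fit`, `transform`, and `inverse_transform` compute for this scaler:

```python
import numpy as np

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# fit: learn the per-column mean and standard deviation
mu, sigma = X.mean(axis=0), X.std(axis=0)

# transform: z-score each column
Z = (X - mu) / sigma

# inverse_transform: undo the scaling
X_back = Z * sigma + mu
print(np.allclose(X, X_back))  # True
```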
### `SyntheticDataGenerator`
Synthetic data generator for creating training and test data.
```python
class SyntheticDataGenerator:
    def __init__(self,
                 num_samples: int = 1000,
                 seed: int = 42,
                 noise_level: float = 0.1,
                 correlation_strength: float = 0.5)
```
#### Parameters
- `num_samples` (int): Number of samples to generate, defaults to 1000
- `seed` (int): Random seed, defaults to 42
- `noise_level` (float): Noise level, defaults to 0.1
- `correlation_strength` (float): Correlation strength, defaults to 0.5
#### Methods
##### `generate_data(self) -> tuple`
Generates synthetic data.
**Returns:**
- `tuple`: (feature data, label data)
##### `save_data(self, features: np.ndarray, labels: np.ndarray, filepath: str, format: str = 'csv')`
Saves data to a file.
**Example:**
```python
from src.data.synthetic_generator import SyntheticDataGenerator
# Create data generator
generator = SyntheticDataGenerator(num_samples=1000, seed=42)
# Generate data
features, labels = generator.generate_data()
# Save data
generator.save_data(features, labels, "synthetic_data.csv", format='csv')
```
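The generator's exact scheme is internal to the project; the sketch below only illustrates how `seed` and `noise_level` typically interact in a generator of this shape (the toy linear label model is an assumption, not the project's formula):

```python
import numpy as np

def generate_sketch(num_samples=1000, seed=42, noise_level=0.1):
    """Illustrative only: a toy stand-in for SyntheticDataGenerator's scheme."""
    rng = np.random.default_rng(seed)
    features = rng.uniform(-1.0, 1.0, size=(num_samples, 7))
    # A toy linear mapping plus Gaussian noise stands in for the real label model.
    weights = rng.normal(size=(7, 3))
    labels = features @ weights + noise_level * rng.normal(size=(num_samples, 3))
    return features, labels

f1, l1 = generate_sketch(seed=42)
f2, l2 = generate_sketch(seed=42)
print(np.array_equal(f1, f2))  # True: the same seed reproduces the same data
```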
### `EmotionDataset`
PyTorch Dataset class for emotion prediction tasks.
```python
class EmotionDataset(Dataset):
    def __init__(self,
                 features: np.ndarray,
                 labels: np.ndarray,
                 transform: callable = None)
```
#### Parameters
- `features` (np.ndarray): Feature data
- `labels` (np.ndarray): Label data
- `transform` (callable): Data transformation function
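The real class subclasses `torch.utils.data.Dataset`; a plain-Python sketch of the `__len__`/`__getitem__` protocol it implements (class name here is hypothetical):

```python
import numpy as np

class EmotionDatasetSketch:
    """Plain-Python sketch of the Dataset indexing protocol."""
    def __init__(self, features, labels, transform=None):
        self.features = np.asarray(features, dtype=np.float32)
        self.labels = np.asarray(labels, dtype=np.float32)
        self.transform = transform

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        if self.transform is not None:
            x = self.transform(x)  # optional per-sample transform
        return x, y

ds = EmotionDatasetSketch(np.zeros((100, 7)), np.zeros((100, 3)))
print(len(ds))  # 100
```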
## Utility Classes
### `InferenceEngine`
Inference engine providing high-performance model inference.
```python
class InferenceEngine:
    def __init__(self,
                 model: nn.Module,
                 preprocessor: DataPreprocessor = None,
                 device: str = 'auto')
```
#### Methods
##### `predict(self, input_data: Union[list, np.ndarray]) -> Dict[str, Any]`
Single sample prediction.
**Parameters:**
- `input_data`: Input data, can be a list or NumPy array
**Returns:**
- `Dict[str, Any]`: Dictionary of prediction results
**Example:**
```python
from src.utils.inference_engine import create_inference_engine
# Create inference engine
engine = create_inference_engine(
    model_path="model.pth",
    preprocessor_path="preprocessor.pkl"
)
# Single sample prediction
input_data = [0.5, 0.3, -0.2, 75.0, 0.1, 0.4, -0.1]
result = engine.predict(input_data)
print(f"ΔPAD: {result['delta_pad']}")
print(f"Confidence: {result['confidence']}")
```
##### `predict_batch(self, input_batch: Union[list, np.ndarray]) -> List[Dict[str, Any]]`
Batch prediction.
##### `benchmark(self, num_samples: int = 1000, batch_size: int = 32) -> Dict[str, float]`
Performance benchmarking.
**Returns:**
- `Dict[str, float]`: Performance statistics
**Example:**
```python
# Performance benchmarking
stats = engine.benchmark(num_samples=1000, batch_size=32)
print(f"Throughput: {stats['throughput']:.2f} samples/sec")
print(f"Average latency: {stats['avg_latency']:.2f}ms")
```
### `ModelTrainer`
Model trainer providing full training pipeline management.
```python
class ModelTrainer:
    def __init__(self,
                 model: nn.Module,
                 preprocessor: DataPreprocessor = None,
                 device: str = 'auto')
```
#### Methods
##### `train(self, train_loader: DataLoader, val_loader: DataLoader, config: Dict[str, Any]) -> Dict[str, Any]`
Trains the model.
**Parameters:**
- `train_loader` (DataLoader): Training data loader
- `val_loader` (DataLoader): Validation data loader
- `config` (Dict[str, Any]): Training configuration
**Returns:**
- `Dict[str, Any]`: Training history
**Example:**
```python
from src.utils.trainer import ModelTrainer
# Create trainer
trainer = ModelTrainer(model, preprocessor)
# Training configuration
config = {
    'epochs': 100,
    'learning_rate': 0.001,
    'weight_decay': 1e-4,
    'patience': 10,
    'save_dir': './models'
}
# Start training
history = trainer.train(train_loader, val_loader, config)
```
##### `evaluate(self, test_loader: DataLoader) -> Dict[str, float]`
Evaluates the model.
## Loss Functions
### `WeightedMSELoss`
Weighted Mean Squared Error loss function.
```python
class WeightedMSELoss(nn.Module):
    def __init__(self,
                 delta_pad_weight: float = 1.0,
                 delta_pressure_weight: float = 1.0,
                 confidence_weight: float = 0.5,
                 reduction: str = 'mean')
```
#### Parameters
- `delta_pad_weight` (float): Weight for ΔPAD loss, defaults to 1.0
- `delta_pressure_weight` (float): Weight for ΔPressure loss, defaults to 1.0
- `confidence_weight` (float): Weight for confidence loss, defaults to 0.5
- `reduction` (str): Reduction method for the loss, defaults to 'mean'
**Example:**
```python
from src.models.loss_functions import WeightedMSELoss
criterion = WeightedMSELoss(
    delta_pad_weight=1.0,
    delta_pressure_weight=1.0,
    confidence_weight=0.5
)
loss = criterion(predictions, targets)
```
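A numpy sketch of how the three weights presumably combine per-component MSE terms. The column layout assumed here (ΔPAD in columns 0-2, ΔPressure in column 3, confidence in column 4) is purely illustrative:

```python
import numpy as np

def weighted_mse_sketch(pred, target, w_pad=1.0, w_pressure=1.0, w_conf=0.5):
    """Illustrative only: columns assumed as [:3] ΔPAD, [3:4] ΔPressure,
    [4:5] confidence; the real class defines its own output layout."""
    mse = lambda a, b: np.mean((a - b) ** 2)
    return (w_pad * mse(pred[:, :3], target[:, :3])
            + w_pressure * mse(pred[:, 3:4], target[:, 3:4])
            + w_conf * mse(pred[:, 4:5], target[:, 4:5]))

pred = np.zeros((4, 5))
target = np.ones((4, 5))
print(weighted_mse_sketch(pred, target))  # 2.5  (1.0 + 1.0 + 0.5)
```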
### `ConfidenceLoss`
Confidence loss function.
```python
class ConfidenceLoss(nn.Module):
    def __init__(self, reduction: str = 'mean')
```
## Evaluation Metrics
### `RegressionMetrics`
Regression evaluation metrics calculator.
```python
class RegressionMetrics:
    def __init__(self)
```
#### Methods
##### `calculate_all_metrics(self, y_true: np.ndarray, y_pred: np.ndarray) -> Dict[str, float]`
Calculates all regression metrics.
**Parameters:**
- `y_true` (np.ndarray): True values
- `y_pred` (np.ndarray): Predicted values
**Returns:**
- `Dict[str, float]`: Dictionary containing all metrics
**Example:**
```python
from src.models.metrics import RegressionMetrics
metrics_calculator = RegressionMetrics()
metrics = metrics_calculator.calculate_all_metrics(true_labels, predictions)
print(f"MSE: {metrics['mse']:.4f}")
print(f"MAE: {metrics['mae']:.4f}")
print(f"R²: {metrics['r2']:.4f}")
```
### `PADMetrics`
PAD-specific evaluation metrics.
```python
class PADMetrics:
    def __init__(self)
```
#### Methods
##### `evaluate_predictions(self, predictions: np.ndarray, targets: np.ndarray) -> Dict[str, Any]`
Evaluates PAD prediction results.
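The exact PAD-specific metrics are not spelled out here; a hedged numpy sketch of one likely ingredient, per-axis MAE over the Pleasure/Arousal/Dominance dimensions (function name hypothetical):

```python
import numpy as np

def per_dimension_mae(predictions, targets):
    """Illustrative sketch: MAE computed separately for each PAD axis."""
    names = ['pleasure', 'arousal', 'dominance']
    errors = np.abs(predictions - targets).mean(axis=0)
    return dict(zip(names, errors))

preds = np.array([[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]])
targets = np.array([[0.2, 0.2, 0.1], [0.0, 0.2, 0.0]])
print(per_dimension_mae(preds, targets))
```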
## Factory Functions
### `create_pad_predictor(config: Optional[Dict[str, Any]] = None) -> PADPredictor`
Factory function for creating a PAD predictor.
**Parameters:**
- `config` (Dict[str, Any], optional): Configuration dictionary
**Returns:**
- `PADPredictor`: PAD predictor instance
**Example:**
```python
from src.models.pad_predictor import create_pad_predictor
# Use default configuration
model = create_pad_predictor()
# Use custom configuration
config = {
    'dimensions': {
        'input_dim': 7,
        'output_dim': 3  # 3 or 4, depending on configuration
    },
    'architecture': {
        'hidden_layers': [
            {'size': 256, 'activation': 'ReLU', 'dropout': 0.3},
            {'size': 128, 'activation': 'ReLU', 'dropout': 0.2}
        ]
    }
}
model = create_pad_predictor(config)
```
### `create_inference_engine(model_path: str, preprocessor_path: str = None, device: str = 'auto') -> InferenceEngine`
Factory function for creating an inference engine.
**Parameters:**
- `model_path` (str): Path to the model file
- `preprocessor_path` (str, optional): Path to the preprocessor file
- `device` (str): Device type
**Returns:**
- `InferenceEngine`: Inference engine instance
### `create_training_setup(config: Dict[str, Any]) -> tuple`
Factory function for creating a training setup.
**Parameters:**
- `config` (Dict[str, Any]): Training configuration
**Returns:**
- `tuple`: (model, trainer, data loader)
## Command-Line Interface
### Main CLI Tool
The project provides a unified command-line interface supporting various operations:
```bash
emotion-prediction <command> [options]
```
#### Available Commands
- `train`: Trains the model
- `predict`: Makes predictions
- `evaluate`: Evaluates the model
- `inference`: Runs the inference script
- `benchmark`: Performance benchmarking
#### Train Command
```bash
emotion-prediction train --config CONFIG_FILE [OPTIONS]
```
**Parameters:**
- `--config, -c`: Path to the training configuration file (required)
- `--output-dir, -o`: Output directory (default: ./outputs)
- `--device`: Computing device (auto/cpu/cuda, default: auto)
- `--resume`: Resume training from a checkpoint
- `--epochs`: Override number of training epochs
- `--batch-size`: Override batch size
- `--learning-rate`: Override learning rate
- `--seed`: Random seed (default: 42)
- `--verbose, -v`: Verbose output
- `--log-level`: Log level (DEBUG/INFO/WARNING/ERROR)
**Example:**
```bash
# Basic training
emotion-prediction train --config configs/training_config.yaml
# GPU training
emotion-prediction train --config configs/training_config.yaml --device cuda
# Resume from checkpoint
emotion-prediction train --config configs/training_config.yaml --resume checkpoint.pth
```
#### Predict Command
```bash
emotion-prediction predict --model MODEL_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--interactive, -i`: Interactive mode
- `--quick`: Quick prediction mode (7 numerical values)
- `--batch`: Batch prediction mode (input file)
- `--output, -o`: Output file path
- `--device`: Computing device
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Interactive prediction
emotion-prediction predict --model model.pth --interactive
# Quick prediction
emotion-prediction predict --model model.pth --quick 0.5 0.3 -0.2 75.0 0.1 0.4 -0.1
# Batch prediction
emotion-prediction predict --model model.pth --batch input.csv --output results.csv
```
#### Evaluate Command
```bash
emotion-prediction evaluate --model MODEL_FILE --data DATA_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--data, -d`: Path to the test data file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--output, -o`: Path for evaluation results output
- `--report`: Path for generating a detailed report file
- `--metrics`: List of evaluation metrics (default: mse mae r2)
- `--batch-size`: Batch size (default: 32)
- `--device`: Computing device
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Basic evaluation
emotion-prediction evaluate --model model.pth --data test_data.csv
# Generate detailed report
emotion-prediction evaluate --model model.pth --data test_data.csv --report report.html
```
#### Benchmark Command
```bash
emotion-prediction benchmark --model MODEL_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--num-samples`: Number of test samples (default: 1000)
- `--batch-size`: Batch size (default: 32)
- `--device`: Computing device
- `--report`: Path for generating a performance report file
- `--warmup`: Number of warmup iterations (default: 10)
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Standard benchmarking
emotion-prediction benchmark --model model.pth
# Custom test parameters
emotion-prediction benchmark --model model.pth --num-samples 5000 --batch-size 64
```
## Configuration File API
### Model Configuration
Model configuration files use YAML format and support the following parameters:
```yaml
# Model basic information
model_info:
  name: str     # Model name
  type: str     # Model type
  version: str  # Model version

# Input/output dimensions
dimensions:
  input_dim: int   # Input dimension
  output_dim: int  # Output dimension

# Network architecture
architecture:
  hidden_layers:
    - size: int        # Layer size
      activation: str  # Activation function
      dropout: float   # Dropout rate
  output_layer:
    activation: str    # Output activation function
  use_batch_norm: bool # Whether to use batch normalization
  use_layer_norm: bool # Whether to use layer normalization

# Initialization parameters
initialization:
  weight_init: str  # Weight initialization method
  bias_init: str    # Bias initialization method

# Regularization
regularization:
  weight_decay: float  # L2 regularization coefficient
  dropout_config:
    type: str    # Dropout type
    rate: float  # Dropout rate
```
### Training Configuration
Training configuration files support the following parameters:
```yaml
# Training information
training_info:
  experiment_name: str  # Experiment name
  description: str      # Experiment description
  seed: int             # Random seed

# Training hyperparameters
training:
  optimizer:
    type: str             # Optimizer type
    learning_rate: float  # Learning rate
    weight_decay: float   # Weight decay
  scheduler:
    type: str             # Scheduler type
  epochs: int             # Number of training epochs
  early_stopping:
    enabled: bool      # Whether to enable early stopping
    patience: int      # Patience value
    min_delta: float   # Minimum improvement
```
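The `early_stopping` block maps onto the standard patience rule: stop once `patience` consecutive epochs fail to improve the best validation loss by at least `min_delta`. A self-contained sketch (exact `min_delta` semantics assumed):

```python
def should_stop(val_losses, patience=10, min_delta=0.0):
    """Return True once `patience` consecutive epochs fail to improve the
    best validation loss by at least `min_delta` (illustrative sketch)."""
    best = float('inf')
    wait = 0
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss   # new best: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return True
    return False

print(should_stop([1.0, 0.9, 0.9, 0.9, 0.9], patience=3))  # True
```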
## Exception Handling
The project defines the following custom exceptions:
### `ModelLoadError`
Model loading error.
### `DataPreprocessingError`
Data preprocessing error.
### `InferenceError`
Inference process error.
### `ConfigurationError`
Configuration file error.
**Example:**
```python
from src.utils.exceptions import ModelLoadError, InferenceError
try:
    model = PADPredictor.load_model("invalid_model.pth")
except ModelLoadError as e:
    print(f"Model loading failed: {e}")

try:
    result = engine.predict(invalid_input)
except InferenceError as e:
    print(f"Inference failed: {e}")
```
## Logging System
The project uses a structured logging system:
```python
from src.utils.logger import setup_logger
import logging
# Set up logging
setup_logger(level='INFO', log_file='training.log')
logger = logging.getLogger(__name__)
# Use logging
logger.info("Training started")
logger.debug(f"Batch size: {batch_size}")
logger.warning("Potential overfitting detected")
logger.error("Error occurred during training")
```
## Type Hinting
The project fully supports type hinting, with detailed type annotations for all public APIs:
```python
from typing import Any, Dict, List, Optional, Union
import numpy as np
import torch

def predict_emotion(
    input_data: Union[List[float], np.ndarray],
    model_path: str,
    preprocessor_path: Optional[str] = None,
    device: str = 'auto'
) -> Dict[str, Any]:
    """
    Predicts emotional changes.

    Args:
        input_data: Input data, a 7-dimensional vector
        model_path: Path to the model file
        preprocessor_path: Path to the preprocessor file
        device: Computing device
    Returns:
        A dictionary containing prediction results
    Raises:
        InferenceError: Raised when inference fails
    """
    pass
```
---
For more details, please refer to the source code and example files. If you have any questions, please check the [Troubleshooting Guide](TUTORIAL.md#troubleshooting) or submit an Issue.