# API Reference
(Google Gemini Translation)
This document provides a detailed description of all API interfaces, classes, and functions for the emotion and physiological state change prediction model.
## Table of Contents
1. [Model Classes](#model-classes)
2. [Data Processing Classes](#data-processing-classes)
3. [Utility Classes](#utility-classes)
4. [Loss Functions](#loss-functions)
5. [Evaluation Metrics](#evaluation-metrics)
6. [Factory Functions](#factory-functions)
7. [Command-Line Interface](#command-line-interface)
## Model Classes
### `PADPredictor`
A Multi-Layer Perceptron-based predictor for emotion and physiological state changes.
```python
class PADPredictor(nn.Module):
    def __init__(self,
                 input_dim: int = 7,
                 output_dim: int = 3,
                 hidden_dims: list = [512, 256, 128],
                 dropout_rate: float = 0.3,
                 weight_init: str = "xavier_uniform",
                 bias_init: str = "zeros")
```
#### Parameters
- `input_dim` (int): Input dimension, defaults to 7 (User PAD 3D + Vitality 1D + AI Current PAD 3D)
- `output_dim` (int): Output dimension, defaults to 3 (ΔPAD 3D, Pressure is dynamically calculated via formula)
- `hidden_dims` (list): List of hidden layer dimensions, defaults to [512, 256, 128]
- `dropout_rate` (float): Dropout probability, defaults to 0.3
- `weight_init` (str): Weight initialization method, defaults to "xavier_uniform"
- `bias_init` (str): Bias initialization method, defaults to "zeros"
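The 7-dimensional input vector concatenates the three blocks listed above. A minimal sketch of assembling it, assuming the same ordering as the quick-prediction CLI example later in this document (user PAD, vitality, AI PAD):

```python
# Assembling the 7-dimensional input vector (ordering assumed from the
# quick-prediction example: user PAD, then vitality, then AI current PAD).
user_pad = [0.5, 0.3, -0.2]   # user Pleasure, Arousal, Dominance
vitality = [75.0]             # vitality (1 value)
ai_pad = [0.1, 0.4, -0.1]     # AI's current PAD state

x = user_pad + vitality + ai_pad
print(len(x))  # 7
```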
#### Methods
##### `forward(self, x: torch.Tensor) -> torch.Tensor`
Forward pass.
**Parameters:**
- `x` (torch.Tensor): Input tensor with shape (batch_size, input_dim)
**Returns:**
- `torch.Tensor`: Output tensor with shape (batch_size, output_dim)
**Example:**
```python
import torch
from src.models.pad_predictor import PADPredictor
model = PADPredictor()
input_data = torch.randn(4, 7) # batch_size=4, input_dim=7
output = model(input_data)
print(f"Output shape: {output.shape}") # torch.Size([4, 3])
```
##### `predict_components(self, x: torch.Tensor) -> Dict[str, torch.Tensor]`
Predicts and decomposes output components.
**Parameters:**
- `x` (torch.Tensor): Input tensor
**Returns:**
- `Dict[str, torch.Tensor]`: Dictionary containing various components
- `'delta_pad'`: ΔPAD (3D)
- `'delta_pressure'`: ΔPressure (1D, dynamically calculated)
- `'confidence'`: Confidence (1D, optional)
**Example:**
```python
components = model.predict_components(input_data)
print(f"ΔPAD shape: {components['delta_pad'].shape}") # torch.Size([4, 3])
print(f"ΔPressure shape: {components['delta_pressure'].shape}") # torch.Size([4, 1])
print(f"Confidence shape: {components['confidence'].shape}") # torch.Size([4, 1])
```
##### `get_model_info(self) -> Dict[str, Any]`
Retrieves model information.
**Returns:**
- `Dict[str, Any]`: Dictionary containing model information
**Example:**
```python
info = model.get_model_info()
print(f"Model type: {info['model_type']}")
print(f"Total parameters: {info['total_parameters']}")
print(f"Trainable parameters: {info['trainable_parameters']}")
```
##### `save_model(self, filepath: str, include_optimizer: bool = False, optimizer: Optional[torch.optim.Optimizer] = None)`
Saves the model to a file.
**Parameters:**
- `filepath` (str): Path to save the model
- `include_optimizer` (bool): Whether to include optimizer state, defaults to False
- `optimizer` (Optional[torch.optim.Optimizer]): Optimizer object
**Example:**
```python
model.save_model("model.pth", include_optimizer=True, optimizer=optimizer)
```
##### `load_model(cls, filepath: str, device: str = 'cpu') -> 'PADPredictor'`
Loads the model from a file.
**Parameters:**
- `filepath` (str): Path to the model file
- `device` (str): Device type, defaults to 'cpu'
**Returns:**
- `PADPredictor`: Loaded model instance
**Example:**
```python
loaded_model = PADPredictor.load_model("model.pth", device='cuda')
```
##### `freeze_layers(self, layer_names: list = None)`
Freezes parameters of specified layers.
**Parameters:**
- `layer_names` (list): List of layer names to freeze; if None, all layers are frozen
**Example:**
```python
# Freeze all layers
model.freeze_layers()
# Freeze specific layers
model.freeze_layers(['network.0.weight', 'network.2.weight'])
```
##### `unfreeze_layers(self, layer_names: list = None)`
Unfreezes parameters of specified layers.
**Parameters:**
- `layer_names` (list): List of layer names to unfreeze; if None, all layers are unfrozen
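Freezing presumably works by toggling each parameter's `requires_grad` flag. A torch-free sketch of the selection logic, with hypothetical parameter names:

```python
# Sketch of freeze/unfreeze selection; parameter names here are hypothetical.
# The boolean stands in for each parameter's requires_grad flag.
params = {
    'network.0.weight': True,
    'network.0.bias': True,
    'network.2.weight': True,
}

def set_requires_grad(params, names=None, value=False):
    """Freeze (value=False) or unfreeze (value=True) matching parameters.
    names=None matches every parameter, mirroring the documented default."""
    for name in params:
        if names is None or name in names:
            params[name] = value

set_requires_grad(params)                                     # freeze all
set_requires_grad(params, ['network.2.weight'], value=True)   # unfreeze one
print(params['network.2.weight'])  # True
```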
## Data Processing Classes
### `DataPreprocessor`
Data preprocessor responsible for feature and label scaling.
```python
class DataPreprocessor:
    def __init__(self,
                 feature_scaler: str = "standard",
                 label_scaler: str = "standard",
                 feature_range: tuple = None,
                 label_range: tuple = None)
```
#### Parameters
- `feature_scaler` (str): Feature scaling method, defaults to "standard"
- `label_scaler` (str): Label scaling method, defaults to "standard"
- `feature_range` (tuple): Feature range for MinMax scaling
- `label_range` (tuple): Label range for MinMax scaling
#### Methods
##### `fit(self, features: np.ndarray, labels: np.ndarray) -> 'DataPreprocessor'`
Fits preprocessor parameters.
**Parameters:**
- `features` (np.ndarray): Training feature data
- `labels` (np.ndarray): Training label data
**Returns:**
- `DataPreprocessor`: Self instance
##### `transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Transforms data.
**Parameters:**
- `features` (np.ndarray): Input feature data
- `labels` (np.ndarray, optional): Input label data
**Returns:**
- `tuple`: (transformed features, transformed labels)
##### `fit_transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Fits and transforms data.
##### `inverse_transform(self, features: np.ndarray, labels: np.ndarray = None) -> tuple`
Inverse transforms data.
##### `save(self, filepath: str)`
Saves the preprocessor to a file.
##### `load(cls, filepath: str) -> 'DataPreprocessor'`
Loads the preprocessor from a file.
**Example:**
```python
from src.data.preprocessor import DataPreprocessor
# Create preprocessor
preprocessor = DataPreprocessor(
    feature_scaler="standard",
    label_scaler="standard"
)
# Fit and transform data
processed_features, processed_labels = preprocessor.fit_transform(train_features, train_labels)
# Save preprocessor
preprocessor.save("preprocessor.pkl")
# Load preprocessor
loaded_preprocessor = DataPreprocessor.load("preprocessor.pkl")
```
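For reference, "standard" scaling is the usual z-score transform. A numpy sketch of what `fit`, `transform`, and `inverse_transform` compute for this scaler:

```python
import numpy as np

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# fit: learn the per-column mean and standard deviation
mu, sigma = X.mean(axis=0), X.std(axis=0)

# transform: z-score each column
Z = (X - mu) / sigma

# inverse_transform: undo the scaling
X_back = Z * sigma + mu
print(np.allclose(X, X_back))  # True
```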
### `SyntheticDataGenerator`
Synthetic data generator for creating training and test data.
```python
class SyntheticDataGenerator:
    def __init__(self,
                 num_samples: int = 1000,
                 seed: int = 42,
                 noise_level: float = 0.1,
                 correlation_strength: float = 0.5)
```
#### Parameters
- `num_samples` (int): Number of samples to generate, defaults to 1000
- `seed` (int): Random seed, defaults to 42
- `noise_level` (float): Noise level, defaults to 0.1
- `correlation_strength` (float): Correlation strength, defaults to 0.5
#### Methods
##### `generate_data(self) -> tuple`
Generates synthetic data.
**Returns:**
- `tuple`: (feature data, label data)
##### `save_data(self, features: np.ndarray, labels: np.ndarray, filepath: str, format: str = 'csv')`
Saves data to a file.
**Example:**
```python
from src.data.synthetic_generator import SyntheticDataGenerator
# Create data generator
generator = SyntheticDataGenerator(num_samples=1000, seed=42)
# Generate data
features, labels = generator.generate_data()
# Save data
generator.save_data(features, labels, "synthetic_data.csv", format='csv')
```
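The generator's exact scheme is internal to the project; the sketch below only illustrates how `seed` and `noise_level` typically interact in a generator of this shape (the toy linear label model is an assumption, not the project's formula):

```python
import numpy as np

def generate_sketch(num_samples=1000, seed=42, noise_level=0.1):
    """Illustrative only: a toy stand-in for SyntheticDataGenerator's scheme."""
    rng = np.random.default_rng(seed)
    features = rng.uniform(-1.0, 1.0, size=(num_samples, 7))
    # A toy linear mapping plus Gaussian noise stands in for the real label model.
    weights = rng.normal(size=(7, 3))
    labels = features @ weights + noise_level * rng.normal(size=(num_samples, 3))
    return features, labels

f1, l1 = generate_sketch(seed=42)
f2, l2 = generate_sketch(seed=42)
print(np.array_equal(f1, f2))  # True: the same seed reproduces the same data
```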
### `EmotionDataset`
PyTorch Dataset class for emotion prediction tasks.
```python
class EmotionDataset(Dataset):
    def __init__(self,
                 features: np.ndarray,
                 labels: np.ndarray,
                 transform: callable = None)
```
#### Parameters
- `features` (np.ndarray): Feature data
- `labels` (np.ndarray): Label data
- `transform` (callable): Data transformation function
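The real class subclasses `torch.utils.data.Dataset`; a plain-Python sketch of the `__len__`/`__getitem__` protocol it implements (class name here is hypothetical):

```python
import numpy as np

class EmotionDatasetSketch:
    """Plain-Python sketch of the Dataset indexing protocol."""
    def __init__(self, features, labels, transform=None):
        self.features = np.asarray(features, dtype=np.float32)
        self.labels = np.asarray(labels, dtype=np.float32)
        self.transform = transform

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        x, y = self.features[idx], self.labels[idx]
        if self.transform is not None:
            x = self.transform(x)  # optional per-sample transform
        return x, y

ds = EmotionDatasetSketch(np.zeros((100, 7)), np.zeros((100, 3)))
print(len(ds))  # 100
```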
## Utility Classes
### `InferenceEngine`
Inference engine providing high-performance model inference.
```python
class InferenceEngine:
    def __init__(self,
                 model: nn.Module,
                 preprocessor: DataPreprocessor = None,
                 device: str = 'auto')
```
#### Methods
##### `predict(self, input_data: Union[list, np.ndarray]) -> Dict[str, Any]`
Single sample prediction.
**Parameters:**
- `input_data`: Input data, can be a list or NumPy array
**Returns:**
- `Dict[str, Any]`: Dictionary of prediction results
**Example:**
```python
from src.utils.inference_engine import create_inference_engine
# Create inference engine
engine = create_inference_engine(
    model_path="model.pth",
    preprocessor_path="preprocessor.pkl"
)
# Single sample prediction
input_data = [0.5, 0.3, -0.2, 75.0, 0.1, 0.4, -0.1]
result = engine.predict(input_data)
print(f"ΔPAD: {result['delta_pad']}")
print(f"Confidence: {result['confidence']}")
```
##### `predict_batch(self, input_batch: Union[list, np.ndarray]) -> List[Dict[str, Any]]`
Batch prediction.
##### `benchmark(self, num_samples: int = 1000, batch_size: int = 32) -> Dict[str, float]`
Performance benchmarking.
**Returns:**
- `Dict[str, float]`: Performance statistics
**Example:**
```python
# Performance benchmarking
stats = engine.benchmark(num_samples=1000, batch_size=32)
print(f"Throughput: {stats['throughput']:.2f} samples/sec")
print(f"Average latency: {stats['avg_latency']:.2f}ms")
```
### `ModelTrainer`
Model trainer providing full training pipeline management.
```python
class ModelTrainer:
    def __init__(self,
                 model: nn.Module,
                 preprocessor: DataPreprocessor = None,
                 device: str = 'auto')
```
#### Methods
##### `train(self, train_loader: DataLoader, val_loader: DataLoader, config: Dict[str, Any]) -> Dict[str, Any]`
Trains the model.
**Parameters:**
- `train_loader` (DataLoader): Training data loader
- `val_loader` (DataLoader): Validation data loader
- `config` (Dict[str, Any]): Training configuration
**Returns:**
- `Dict[str, Any]`: Training history
**Example:**
```python
from src.utils.trainer import ModelTrainer
# Create trainer
trainer = ModelTrainer(model, preprocessor)
# Training configuration
config = {
    'epochs': 100,
    'learning_rate': 0.001,
    'weight_decay': 1e-4,
    'patience': 10,
    'save_dir': './models'
}
# Start training
history = trainer.train(train_loader, val_loader, config)
```
##### `evaluate(self, test_loader: DataLoader) -> Dict[str, float]`
Evaluates the model.
## Loss Functions
### `WeightedMSELoss`
Weighted Mean Squared Error loss function.
```python
class WeightedMSELoss(nn.Module):
    def __init__(self,
                 delta_pad_weight: float = 1.0,
                 delta_pressure_weight: float = 1.0,
                 confidence_weight: float = 0.5,
                 reduction: str = 'mean')
```
#### Parameters
- `delta_pad_weight` (float): Weight for ΔPAD loss, defaults to 1.0
- `delta_pressure_weight` (float): Weight for ΔPressure loss, defaults to 1.0
- `confidence_weight` (float): Weight for confidence loss, defaults to 0.5
- `reduction` (str): Reduction method for the loss, defaults to 'mean'
**Example:**
```python
from src.models.loss_functions import WeightedMSELoss
criterion = WeightedMSELoss(
    delta_pad_weight=1.0,
    delta_pressure_weight=1.0,
    confidence_weight=0.5
)
loss = criterion(predictions, targets)
```
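A numpy sketch of how the three weights presumably combine per-component MSE terms. The column layout assumed here (ΔPAD in columns 0-2, ΔPressure in column 3, confidence in column 4) is purely illustrative:

```python
import numpy as np

def weighted_mse_sketch(pred, target, w_pad=1.0, w_pressure=1.0, w_conf=0.5):
    """Illustrative only: columns assumed as [:3] ΔPAD, [3:4] ΔPressure,
    [4:5] confidence; the real class defines its own output layout."""
    mse = lambda a, b: np.mean((a - b) ** 2)
    return (w_pad * mse(pred[:, :3], target[:, :3])
            + w_pressure * mse(pred[:, 3:4], target[:, 3:4])
            + w_conf * mse(pred[:, 4:5], target[:, 4:5]))

pred = np.zeros((4, 5))
target = np.ones((4, 5))
print(weighted_mse_sketch(pred, target))  # 2.5  (1.0 + 1.0 + 0.5)
```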
### `ConfidenceLoss`
Confidence loss function.
```python
class ConfidenceLoss(nn.Module):
    def __init__(self, reduction: str = 'mean')
```
## Evaluation Metrics
### `RegressionMetrics`
Regression evaluation metrics calculator.
```python
class RegressionMetrics:
    def __init__(self)
```
#### Methods
##### `calculate_all_metrics(self, y_true: np.ndarray, y_pred: np.ndarray) -> Dict[str, float]`
Calculates all regression metrics.
**Parameters:**
- `y_true` (np.ndarray): True values
- `y_pred` (np.ndarray): Predicted values
**Returns:**
- `Dict[str, float]`: Dictionary containing all metrics
**Example:**
```python
from src.models.metrics import RegressionMetrics
metrics_calculator = RegressionMetrics()
metrics = metrics_calculator.calculate_all_metrics(true_labels, predictions)
print(f"MSE: {metrics['mse']:.4f}")
print(f"MAE: {metrics['mae']:.4f}")
print(f"R²: {metrics['r2']:.4f}")
```
### `PADMetrics`
PAD-specific evaluation metrics.
```python
class PADMetrics:
    def __init__(self)
```
#### Methods
##### `evaluate_predictions(self, predictions: np.ndarray, targets: np.ndarray) -> Dict[str, Any]`
Evaluates PAD prediction results.
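The exact PAD-specific metrics are not spelled out here; a hedged numpy sketch of one likely ingredient, per-axis MAE over the Pleasure/Arousal/Dominance dimensions (function name hypothetical):

```python
import numpy as np

def per_dimension_mae(predictions, targets):
    """Illustrative sketch: MAE computed separately for each PAD axis."""
    names = ['pleasure', 'arousal', 'dominance']
    errors = np.abs(predictions - targets).mean(axis=0)
    return dict(zip(names, errors))

preds = np.array([[0.1, 0.2, 0.3], [0.0, 0.0, 0.0]])
targets = np.array([[0.2, 0.2, 0.1], [0.0, 0.2, 0.0]])
print(per_dimension_mae(preds, targets))
```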
## Factory Functions
### `create_pad_predictor(config: Optional[Dict[str, Any]] = None) -> PADPredictor`
Factory function for creating a PAD predictor.
**Parameters:**
- `config` (Dict[str, Any], optional): Configuration dictionary
**Returns:**
- `PADPredictor`: PAD predictor instance
**Example:**
```python
from src.models.pad_predictor import create_pad_predictor
# Use default configuration
model = create_pad_predictor()
# Use custom configuration
config = {
    'dimensions': {
        'input_dim': 7,
        'output_dim': 3  # 3 or 4, depending on configuration
    },
    'architecture': {
        'hidden_layers': [
            {'size': 256, 'activation': 'ReLU', 'dropout': 0.3},
            {'size': 128, 'activation': 'ReLU', 'dropout': 0.2}
        ]
    }
}
model = create_pad_predictor(config)
```
### `create_inference_engine(model_path: str, preprocessor_path: str = None, device: str = 'auto') -> InferenceEngine`
Factory function for creating an inference engine.
**Parameters:**
- `model_path` (str): Path to the model file
- `preprocessor_path` (str, optional): Path to the preprocessor file
- `device` (str): Device type
**Returns:**
- `InferenceEngine`: Inference engine instance
### `create_training_setup(config: Dict[str, Any]) -> tuple`
Factory function for creating a training setup.
**Parameters:**
- `config` (Dict[str, Any]): Training configuration
**Returns:**
- `tuple`: (model, trainer, data loader)
## Command-Line Interface
### Main CLI Tool
The project provides a unified command-line interface supporting various operations:
```bash
emotion-prediction <command> [options]
```
#### Available Commands
- `train`: Trains the model
- `predict`: Makes predictions
- `evaluate`: Evaluates the model
- `inference`: Runs the inference script
- `benchmark`: Performance benchmarking
#### Train Command
```bash
emotion-prediction train --config CONFIG_FILE [OPTIONS]
```
**Parameters:**
- `--config, -c`: Path to the training configuration file (required)
- `--output-dir, -o`: Output directory (default: ./outputs)
- `--device`: Computing device (auto/cpu/cuda, default: auto)
- `--resume`: Resume training from a checkpoint
- `--epochs`: Override number of training epochs
- `--batch-size`: Override batch size
- `--learning-rate`: Override learning rate
- `--seed`: Random seed (default: 42)
- `--verbose, -v`: Verbose output
- `--log-level`: Log level (DEBUG/INFO/WARNING/ERROR)
**Example:**
```bash
# Basic training
emotion-prediction train --config configs/training_config.yaml
# GPU training
emotion-prediction train --config configs/training_config.yaml --device cuda
# Resume from checkpoint
emotion-prediction train --config configs/training_config.yaml --resume checkpoint.pth
```
#### Predict Command
```bash
emotion-prediction predict --model MODEL_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--interactive, -i`: Interactive mode
- `--quick`: Quick prediction mode (7 numerical values)
- `--batch`: Batch prediction mode (input file)
- `--output, -o`: Output file path
- `--device`: Computing device
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Interactive prediction
emotion-prediction predict --model model.pth --interactive
# Quick prediction
emotion-prediction predict --model model.pth --quick 0.5 0.3 -0.2 75.0 0.1 0.4 -0.1
# Batch prediction
emotion-prediction predict --model model.pth --batch input.csv --output results.csv
```
#### Evaluate Command
```bash
emotion-prediction evaluate --model MODEL_FILE --data DATA_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--data, -d`: Path to the test data file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--output, -o`: Path for evaluation results output
- `--report`: Path for generating a detailed report file
- `--metrics`: List of evaluation metrics (default: mse mae r2)
- `--batch-size`: Batch size (default: 32)
- `--device`: Computing device
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Basic evaluation
emotion-prediction evaluate --model model.pth --data test_data.csv
# Generate detailed report
emotion-prediction evaluate --model model.pth --data test_data.csv --report report.html
```
#### Benchmark Command
```bash
emotion-prediction benchmark --model MODEL_FILE [OPTIONS]
```
**Parameters:**
- `--model, -m`: Path to the model file (required)
- `--preprocessor, -p`: Path to the preprocessor file
- `--num-samples`: Number of test samples (default: 1000)
- `--batch-size`: Batch size (default: 32)
- `--device`: Computing device
- `--report`: Path for generating a performance report file
- `--warmup`: Number of warmup iterations (default: 10)
- `--verbose, -v`: Verbose output
- `--log-level`: Log level
**Example:**
```bash
# Standard benchmarking
emotion-prediction benchmark --model model.pth
# Custom test parameters
emotion-prediction benchmark --model model.pth --num-samples 5000 --batch-size 64
```
## Configuration File API
### Model Configuration
Model configuration files use YAML format and support the following parameters:
```yaml
# Model basic information
model_info:
  name: str     # Model name
  type: str     # Model type
  version: str  # Model version

# Input/output dimensions
dimensions:
  input_dim: int   # Input dimension
  output_dim: int  # Output dimension

# Network architecture
architecture:
  hidden_layers:
    - size: int        # Layer size
      activation: str  # Activation function
      dropout: float   # Dropout rate
  output_layer:
    activation: str    # Output activation function
  use_batch_norm: bool # Whether to use batch normalization
  use_layer_norm: bool # Whether to use layer normalization

# Initialization parameters
initialization:
  weight_init: str  # Weight initialization method
  bias_init: str    # Bias initialization method

# Regularization
regularization:
  weight_decay: float  # L2 regularization coefficient
  dropout_config:
    type: str    # Dropout type
    rate: float  # Dropout rate
```
### Training Configuration
Training configuration files support the following parameters:
```yaml
# Training information
training_info:
  experiment_name: str  # Experiment name
  description: str      # Experiment description
  seed: int             # Random seed

# Training hyperparameters
training:
  optimizer:
    type: str             # Optimizer type
    learning_rate: float  # Learning rate
    weight_decay: float   # Weight decay
  scheduler:
    type: str             # Scheduler type
  epochs: int             # Number of training epochs
  early_stopping:
    enabled: bool      # Whether to enable early stopping
    patience: int      # Patience value
    min_delta: float   # Minimum improvement
```
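The `early_stopping` block maps onto the standard patience rule: stop once `patience` consecutive epochs fail to improve the best validation loss by at least `min_delta`. A self-contained sketch (exact `min_delta` semantics assumed):

```python
def should_stop(val_losses, patience=10, min_delta=0.0):
    """Return True once `patience` consecutive epochs fail to improve the
    best validation loss by at least `min_delta` (illustrative sketch)."""
    best = float('inf')
    wait = 0
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss   # new best: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return True
    return False

print(should_stop([1.0, 0.9, 0.9, 0.9, 0.9], patience=3))  # True
```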
## Exception Handling
The project defines the following custom exceptions:
### `ModelLoadError`
Model loading error.
### `DataPreprocessingError`
Data preprocessing error.
### `InferenceError`
Inference process error.
### `ConfigurationError`
Configuration file error.
**Example:**
```python
from src.utils.exceptions import ModelLoadError, InferenceError
try:
    model = PADPredictor.load_model("invalid_model.pth")
except ModelLoadError as e:
    print(f"Model loading failed: {e}")

try:
    result = engine.predict(invalid_input)
except InferenceError as e:
    print(f"Inference failed: {e}")
```
## Logging System
The project uses a structured logging system:
```python
from src.utils.logger import setup_logger
import logging
# Set up logging
setup_logger(level='INFO', log_file='training.log')
logger = logging.getLogger(__name__)
# Use logging
logger.info("Training started")
logger.debug(f"Batch size: {batch_size}")
logger.warning("Potential overfitting detected")
logger.error("Error occurred during training")
```
## Type Hinting
The project fully supports type hinting, with detailed type annotations for all public APIs:
```python
from typing import Any, Dict, List, Optional, Union
import numpy as np
import torch

def predict_emotion(
    input_data: Union[List[float], np.ndarray],
    model_path: str,
    preprocessor_path: Optional[str] = None,
    device: str = 'auto'
) -> Dict[str, Any]:
    """
    Predicts emotional changes.

    Args:
        input_data: Input data, a 7-dimensional vector
        model_path: Path to the model file
        preprocessor_path: Path to the preprocessor file
        device: Computing device
    Returns:
        A dictionary containing prediction results
    Raises:
        InferenceError: Raised when inference fails
    """
    pass
```
---
For more details, please refer to the source code and example files. If you have any questions, please check the [Troubleshooting Guide](TUTORIAL.md#troubleshooting) or submit an Issue.