---
license: mit
datasets:
- ylecun/mnist
metrics:
- accuracy
pipeline_tag: image-classification
model-index:
- name: MoE-CNN
  results:
  - task:
      type: image-classification
    dataset:
      name: MNIST
      type: ylecun/mnist
    metrics:
    - name: Accuracy
      type: accuracy
      value: 99.75
---

# MoE-CNN

### Model Description

- **Model type:** Image Classification
- **License:** MIT

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch

# MixtureOfExperts is the model class shipped with this repository.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MixtureOfExperts(num_experts=10).to(device)

# Load the trained weights.
checkpoint_path = "FP_ML_MOE_SIMPLE_99_75.pth"
checkpoint = torch.load(checkpoint_path, map_location=device)
model.load_state_dict(checkpoint['model_state_dict'])
model.eval()

print(f"Validation Accuracy: {checkpoint['val_accuracy']:.2f}")

# Run inference on a dummy 28x28 grayscale input.
input_data = torch.randn(1, 1, 28, 28)
results = model.predict(input_data.to(device))
print("Results:", results)
```

## Training Details

### Training Data
https://huggingface.co/datasets/ylecun/mnist

### Training Procedure

#### Data Augmentation
- RandomRotation(10)
- RandomAffine(0, shear=10)
- RandomAffine(0, translate=(0.1, 0.1))
- RandomResizedCrop(28, scale=(0.8, 1.0))
- RandomPerspective(distortion_scale=0.2, p=0.5)
- Resize((28, 28))
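
Assuming these are torchvision transforms (the names match its API), the training pipeline could be composed roughly as below. The ordering shown and the trailing `ToTensor()` step are assumptions not stated in this card.

```python
import torchvision.transforms as T

# Sketch of the augmentation pipeline listed above.
# Ordering and the final ToTensor() are assumptions.
train_transform = T.Compose([
    T.RandomRotation(10),
    T.RandomAffine(0, shear=10),
    T.RandomAffine(0, translate=(0.1, 0.1)),
    T.RandomResizedCrop(28, scale=(0.8, 1.0)),
    T.RandomPerspective(distortion_scale=0.2, p=0.5),
    T.Resize((28, 28)),
    T.ToTensor(),
])
```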

#### Training Hyperparameters
- Adam with a learning rate of 0.001 for fast initial convergence
- SGD with a learning rate of 0.01, decayed to 0.001
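
A minimal sketch of how such a two-phase schedule might look, assuming training starts with Adam and hands over to SGD partway through. The switch epoch, total epoch count, cosine decay, and the `train_one_epoch` helper are all hypothetical; the card only states the optimizers and learning rates.

```python
import torch

SWITCH_EPOCH = 10  # hypothetical hand-over point
NUM_EPOCHS = 30    # hypothetical total epoch count

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = None

for epoch in range(NUM_EPOCHS):
    if epoch == SWITCH_EPOCH:
        # Switch to SGD and decay its learning rate toward 0.001.
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=NUM_EPOCHS - SWITCH_EPOCH, eta_min=0.001)
    train_one_epoch(model, optimizer)  # hypothetical training step
    if scheduler is not None:
        scheduler.step()
```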

#### Size
2,247,151 total parameters, of which 674,145 are effective (active per forward pass)
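
The total count can be verified directly from the loaded model; the effective count depends on how the gate routes inputs, so it cannot be read off this sum.

```python
# Count every parameter in the model.
total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total:,}")  # expected: 2,247,151
```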

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
https://huggingface.co/datasets/ylecun/mnist

#### Metrics
- Accuracy: 99.75%
- Error rate: 0.25%
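
A sketch of how these numbers could be reproduced on the MNIST test split, assuming the model's forward pass returns class logits; the batch size and plain `ToTensor()` preprocessing are assumptions.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

test_set = datasets.MNIST(root="data", train=False, download=True,
                          transform=transforms.ToTensor())
loader = DataLoader(test_set, batch_size=256)

model.eval()
correct = 0
with torch.no_grad():
    for images, labels in loader:
        logits = model(images.to(device))  # assumes forward() returns logits
        correct += (logits.argmax(dim=1).cpu() == labels).sum().item()

accuracy = 100.0 * correct / len(test_set)
print(f"Accuracy: {accuracy:.2f}%  Error rate: {100.0 - accuracy:.2f}%")
```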

## Technical Specifications

### Model Architecture
A Mixture-of-Experts (MoE) architecture in which each expert is a simple CNN.
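
For illustration only, a minimal sketch of this architecture family: a softmax gate weighting the outputs of several small CNN experts. Layer sizes, the gating input, and the dense (soft) mixing are assumptions; the actual `MixtureOfExperts` implementation ships with this repository's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleExpert(nn.Module):
    """Hypothetical small CNN expert for 28x28 grayscale inputs."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

class MixtureOfExperts(nn.Module):
    """Hypothetical gate: softly mixes expert logits per input."""
    def __init__(self, num_experts=10, num_classes=10):
        super().__init__()
        self.experts = nn.ModuleList(
            SimpleExpert(num_classes) for _ in range(num_experts))
        self.gate = nn.Linear(28 * 28, num_experts)

    def forward(self, x):
        weights = F.softmax(self.gate(x.flatten(1)), dim=1)     # (B, E)
        outputs = torch.stack([e(x) for e in self.experts], 1)  # (B, E, C)
        return (weights.unsqueeze(-1) * outputs).sum(dim=1)     # (B, C)
```

With hard top-k routing instead of the dense mix shown here, only the selected experts would run per forward pass, which is presumably what the "effective parameters" figure above counts.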