	---
	license: mit
	language: en
	pipeline_tag: image-classification
	library_name: keras
	tags:
	- image-classification
	- alexnet
	- cifar10
	- tensorflow
	- keras
	metrics:
	- accuracy
	datasets:
	- cifar10
	---

	# Revisiting AlexNet: Achieving High-Accuracy on CIFAR-10 with Modern Optimization Techniques

	## Model Description

	This repository contains a TensorFlow/Keras implementation of the AlexNet architecture, trained from scratch on the CIFAR-10 dataset. The original 2012 architecture is modernized by replacing its Local Response Normalization (LRN) layers with Batch Normalization and by adding regularization: L2 weight decay, dropout, and data augmentation.

	The model was developed in a Kaggle environment, demonstrating a reproducible workflow for achieving high accuracy on a benchmark computer vision task.
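
As a rough illustration of the LRN-to-Batch-Normalization swap described above, the sketch below builds a single convolutional block in Keras. The Conv → BatchNorm → ReLU ordering, the helper name `conv_bn_block`, and the first-layer settings (96 filters, 11×11 kernel, stride 4, followed by 3×3 max pooling with stride 2) are assumptions borrowed from the original AlexNet paper, not details confirmed by this repository.

```python
import tensorflow as tf


def conv_bn_block(x, filters, kernel_size, strides=1, padding="same", l2=5e-4):
    """Hypothetical helper: convolution, then BatchNorm (in place of LRN), then ReLU."""
    x = tf.keras.layers.Conv2D(
        filters,
        kernel_size,
        strides=strides,
        padding=padding,
        kernel_regularizer=tf.keras.regularizers.l2(l2),  # L2 penalty on the kernel
    )(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.Activation("relu")(x)


# Assumed first stage: 224x224 RGB input, AlexNet-style first convolution and pooling.
inputs = tf.keras.Input(shape=(224, 224, 3))
features = conv_bn_block(inputs, 96, 11, strides=4, padding="valid")
features = tf.keras.layers.MaxPooling2D(pool_size=3, strides=2)(features)
block = tf.keras.Model(inputs, features, name="alexnet_first_block")
```

Where the normalization sits relative to the activation is a design choice; the ordering here is only one common option.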

	---

	## Model Details

	| Detail | Value |
	| :--- | :--- |
	| Architecture | Modified AlexNet with Batch Normalization |
	| Parameters | ~46 million |
	| Framework | TensorFlow / Keras |
	| Task | Image Classification |
	| Paper | Revisiting AlexNet: Achieving High-Accuracy on CIFAR-10 with Modern Optimization Techniques |
	| Original AlexNet Paper | [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf) |
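
The parameter count in the table can be sanity-checked with a little arithmetic. The sketch below assumes an ungrouped AlexNet-style layer stack (filter counts and kernel sizes from the original paper), a 224×224×3 input, `valid` padding on the first convolution, `same` padding on the rest, and a 10-class output layer. None of these layer settings are stated in this model card, and Batch Normalization parameters are left out because they would add comparatively few.

```python
import math


def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a conv/pool layer, following Keras padding rules."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1


# (filters, kernel, stride, padding, followed_by_pool): assumed AlexNet-style stack
CONV_LAYERS = [
    (96, 11, 4, "valid", True),
    (256, 5, 1, "same", True),
    (384, 3, 1, "same", False),
    (384, 3, 1, "same", False),
    (256, 3, 1, "same", True),
]
DENSE_UNITS = [4096, 4096, 10]  # two hidden layers plus the CIFAR-10 output layer


def count_parameters(input_size=224, channels=3):
    """Return (total weights and biases, flattened feature count) for the assumed stack."""
    size, total = input_size, 0
    for filters, kernel, stride, padding, pooled in CONV_LAYERS:
        total += kernel * kernel * channels * filters + filters  # weights + biases
        size = conv_out(size, kernel, stride, padding)
        if pooled:
            size = conv_out(size, 3, 2)  # 3x3 max pooling with stride 2
        channels = filters
    flattened = size * size * channels
    features = flattened
    for units in DENSE_UNITS:
        total += features * units + units  # weights + biases
        features = units
    return total, flattened


total, flattened = count_parameters()
print(f"{total:,} parameters, {flattened} flattened features")
```

Under these assumptions the total comes to roughly 46.8 million, in line with the ~46 million listed above, with most of it in the first two dense layers.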

	---

	## Training Procedure

	### Data
	The model was trained on the CIFAR-10 dataset.
	* Preprocessing: All images were resized from 32×32 to 224×224 pixels, and pixel values were scaled to the [0, 1] range.
	* Augmentation: The training data was augmented on the fly with random horizontal flips, random rotations (10%), and random zooms (10%).
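
The preprocessing and augmentation steps above could be expressed with Keras preprocessing layers, roughly as follows. This is a minimal sketch, not the repository's actual pipeline: it assumes that "10%" maps to a factor of `0.1` in `RandomRotation` and `RandomZoom`, and that augmentation runs after resizing. The names `preprocess`, `augment`, and `prepare` are hypothetical.

```python
import tensorflow as tf

# Deterministic preprocessing: resize to the model's input size, scale to [0, 1].
preprocess = tf.keras.Sequential(
    [
        tf.keras.layers.Resizing(224, 224),
        tf.keras.layers.Rescaling(1.0 / 255),
    ],
    name="preprocess",
)

# Random augmentation, intended for training batches only.
augment = tf.keras.Sequential(
    [
        tf.keras.layers.RandomFlip("horizontal"),
        tf.keras.layers.RandomRotation(0.1),  # assumed factor; a fraction of a full turn
        tf.keras.layers.RandomZoom(0.1),      # assumed factor; zoom in or out by up to 10%
    ],
    name="augment",
)


def prepare(images, training=False):
    """Hypothetical helper: preprocess a batch and optionally augment it."""
    x = preprocess(tf.cast(images, tf.float32))
    if training:
        x = augment(x, training=True)
    return x
```

Note that in Keras a `RandomRotation` factor of `0.1` means up to 10% of a full rotation (about ±36°), so if the original setting was meant as ±10°, the factor would need to be smaller.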

	### Hyperparameters

	| Hyperparameter | Value |
	| :--- | :--- |
	| Optimizer | Adam |
	| Learning Rate | `1e-4` (with `ReduceLROnPlateau` callback) |
	| Batch Size | `128` (GPU) / `1024` (TPU) |
	| Epochs | Up to 100, with `EarlyStopping` (`patience=10`) |
	| Regularization | L2 weight decay (`λ=0.0005`), Dropout (`rate=0.5`) |
	| Hardware | Kaggle GPU (2x T4) or TPU (v5e-8) |
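
For orientation, the hyperparameters in the table could be wired into a Keras training setup along these lines. Only the learning rate, epoch limit, early-stopping patience, L2 coefficient, and dropout rate come from the table. The `ReduceLROnPlateau` factor and patience, the monitored metric, `restore_best_weights`, the loss function, and the helper names `dense_head` and `train` are assumptions made for illustration.

```python
import tensorflow as tf

L2_LAMBDA = 5e-4    # from the table
DROPOUT_RATE = 0.5  # from the table
MAX_EPOCHS = 100    # from the table

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

callbacks = [
    # factor and patience are NOT given in the source; placeholder values.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=3),
    # patience=10 is from the table; restore_best_weights is an assumption.
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True
    ),
]


def dense_head(x, num_classes=10):
    """Hypothetical classifier head showing where L2 and dropout would attach."""
    for _ in range(2):
        x = tf.keras.layers.Dense(
            4096,
            activation="relu",
            kernel_regularizer=tf.keras.regularizers.l2(L2_LAMBDA),
        )(x)
        x = tf.keras.layers.Dropout(DROPOUT_RATE)(x)
    return tf.keras.layers.Dense(num_classes, activation="softmax")(x)


def train(model, train_ds, val_ds):
    """Compile and fit; `model`, `train_ds`, and `val_ds` are supplied by the caller."""
    model.compile(
        optimizer=optimizer,
        loss="sparse_categorical_crossentropy",  # assumes integer class labels
        metrics=["accuracy"],
    )
    return model.fit(
        train_ds, validation_data=val_ds, epochs=MAX_EPOCHS, callbacks=callbacks
    )
```

With early stopping in place, the 100-epoch figure acts as an upper bound rather than the number of epochs actually run.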

	---

	## Evaluation

	The model achieved the following performance on the CIFAR-10 test set:

	* Test Accuracy: 95.7%
	* Test Loss: 0.6143
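
One way to re-run this evaluation is sketched below. It assumes the loaded model is already compiled with a loss and an accuracy metric that accept integer labels, and that test images need the same resize-and-scale preprocessing as the training data. The helpers `make_test_dataset` and `evaluate_on_cifar10` are hypothetical names, and the exact numbers obtained may differ from those reported above.

```python
import tensorflow as tf


def make_test_dataset(images, labels, batch_size=128):
    """Resize to 224x224, scale to [0, 1], and batch (hypothetical helper)."""

    def _prep(image, label):
        image = tf.image.resize(tf.cast(image, tf.float32), (224, 224)) / 255.0
        return image, label

    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.map(_prep, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)


def evaluate_on_cifar10(model, batch_size=128):
    """Download the CIFAR-10 test split and evaluate a compiled Keras model on it."""
    (_, _), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
    return model.evaluate(make_test_dataset(x_test, y_test, batch_size), return_dict=True)
```

Calling `evaluate_on_cifar10(model)` downloads the dataset on first use and returns a dictionary of the compiled loss and metrics.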

	---

	## How to Use

	The model can be loaded from the Hub for inference. The example below loads the model and predicts the class of a sample image.

	```python
	import tensorflow as tf
	import numpy as np
	from PIL import Image
	import requests
	from huggingface_hub import from_pretrained_keras

	# 1. Load the model from the Hub
	# Replace with your actual repo_id
	repo_id = "metanthropiclabs/alexnet-cifar10-optimized"
	model = from_pretrained_keras(repo_id)

	# 2. Define class labels
	cifar10_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

	# 3. Load and preprocess a sample image
	# Example: an image of a cat from the web
	url = 'https://storage.googleapis.com/petbacker/images/blog/2017/cat-in-a-box.jpg'
	image = Image.open(requests.get(url, stream=True).raw).convert('RGB')  # Ensure 3 channels
	image = image.resize((224, 224))  # Must match the model's input size
	image_array = np.array(image)

	# Normalize and add a batch dimension
	image_array = image_array.astype('float32') / 255.0
	image_tensor = tf.expand_dims(image_array, 0) # Create a batch of 1

	# 4. Make a prediction
	predictions = model.predict(image_tensor)
	predicted_class_index = np.argmax(predictions[0])
	predicted_class_name = cifar10_labels[predicted_class_index]

	print(f"Predicted Class: {predicted_class_name}")
	# Expected output for this image: Predicted Class: cat

	```