a-albiol committed on
Commit d57ea6e · verified · 1 Parent(s): 1f5355b

Update README.md

Files changed (1): README.md (+228 −97)
---
license: mit
library_name: transformers
tags:
- fitness
- workout
- health
- text-generation
- t5
- flan-t5
- exercise
- personalized
- seq2seq
language:
- en
- es
datasets:
- onurSakar/GYM-Exercise
- niharika41298/gym-exercise-data
- yuhonas/free-exercise-db
pipeline_tag: text2text-generation
widget:
- text: 'Generate a workout for: {"training_phase": "weight_loss", "motivation": "wellbeing", "special_situation": "none"}'
  example_title: "Weight Loss Workout"
- text: 'Generate a workout for: {"training_phase": "muscle_gain", "motivation": "self_improvement", "special_situation": "none"}'
  example_title: "Muscle Building Workout"
- text: 'Generate a workout for: {"training_phase": "cardio_improve", "motivation": "medical_recommendation", "special_situation": "injury_recovery"}'
  example_title: "Cardio Recovery Workout"
---

# AthenAI - AI-Powered Personalized Workout Generator

AthenAI is a fine-tuned FLAN-T5-base model that generates personalized workout routines from user context: training goals, motivation, and special health situations. The model produces structured workout plans, in JSON format, with multiple exercise blocks, duration estimates, and detailed instructions.

## Model Details

### Model Description

AthenAI takes user context as input (training phase, motivation, special situations) and generates workout plans tailored to individual needs. The model was trained on synthetic workout data derived from comprehensive exercise databases and can handle fitness scenarios ranging from weight loss to injury recovery.

- **Developed by:** a-albiol
- **Model type:** Text-to-Text Generation (Sequence-to-Sequence)
- **Language(s) (NLP):** English (primary), Spanish (secondary)
- **License:** MIT
- **Finetuned from model:** google/flan-t5-base

### Model Sources

- **Repository:** [AthenAI GitHub Repository](https://github.com/a-albiol/AthenAI)
- **Base Model:** [google/flan-t5-base](https://huggingface.co/google/flan-t5-base)

## Uses

### Direct Use

AthenAI is designed for direct use in generating personalized workout routines: users supply their training context and receive a structured workout plan immediately. The model is particularly useful for:

- **Personal Fitness Applications**: Generate daily workout routines
- **Fitness Apps**: Provide adaptive exercise recommendations
- **Gym Management Systems**: Create member-specific workout plans
- **Health & Wellness Platforms**: Offer personalized fitness guidance

### Downstream Use

The model can be integrated into larger fitness ecosystems:

- **Mobile Fitness Apps**: Backend workout generation service
- **Personal Training Software**: Assist trainers with plan creation
- **Rehabilitation Systems**: Generate recovery-focused exercise routines
- **Corporate Wellness Programs**: Provide employee fitness plans

### Out-of-Scope Use

AthenAI should not be used for:

- Medical diagnosis or treatment recommendations
- Professional medical or physiotherapy advice
- Unsupervised use by individuals with serious cardiovascular conditions
- Replacement for professional fitness consultation
- Legal or liability-bearing fitness recommendations

## Bias, Risks, and Limitations

### Known Limitations

1. **Exercise Database Scope**: Primarily trained on gym-based exercises; limited outdoor/home alternatives
2. **Equipment Assumptions**: May suggest exercises without considering equipment availability
3. **Medical Expertise**: Cannot replace professional medical or fitness consultation
4. **Cultural Context**: Training data may reflect Western fitness practices
5. **Language Limitations**: Optimized for English, with limited Spanish support

### Potential Biases

- **Dataset Bias**: Limited to exercises popular in online fitness communities
- **Demographic Bias**: May not adequately represent all age groups, fitness levels, or cultural backgrounds
- **Equipment Bias**: Assumes access to standard gym equipment
- **Ability Bias**: May not fully accommodate all physical limitations or disabilities

### Recommendations

Users should be aware that AthenAI:

- Provides general fitness guidance, not medical advice
- Should be used alongside professional consultation for special health conditions
- May require human review for individuals with physical limitations
- Works best when combined with proper fitness supervision

## How to Get Started with the Model

### Installation

```bash
pip install transformers torch
```

### Basic Usage

```python
import json

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load model and tokenizer
model_name = "a-albiol/AthenAI"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Define user context
context = {
    "training_phase": "weight_loss",
    "motivation": "wellbeing",
    "special_situation": "none",
}

# Serialize the context as JSON so the prompt uses double quotes, matching
# the widget examples (an f-string over the dict would emit Python repr
# with single quotes)
input_text = f"Generate a workout for: {json.dumps(context)}"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=512, do_sample=True, temperature=0.7)
workout = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(workout)
```

### Input Parameters

**Training Phases:**

- `weight_loss`: Fat-loss-focused routines
- `muscle_gain`: Strength and muscle building
- `cardio_improve`: Cardiovascular fitness enhancement
- `maintenance`: General fitness maintenance

**Motivation Types:**

- `medical_recommendation`: Health-prescribed exercise
- `self_improvement`: Personal development goals
- `competition`: Athletic/competitive training
- `rehabilitation`: Recovery and therapy
- `wellbeing`: General wellness and mood

**Special Situations:**

- `pregnancy`: Prenatal fitness routines
- `post_partum`: Postpartum recovery workouts
- `injury_recovery`: Rehabilitation exercises
- `chronic_condition`: Adapted for chronic health issues
- `elderly_population`: Senior-friendly routines
- `physical_limitation`: Modified for disabilities
- `none`: Standard training
 
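The parameter values above define the model's expected input vocabulary. They can be enforced before prompting; a minimal sketch that validates a context and serializes it into the prompt format shown in the widget examples (the `build_prompt` helper is illustrative, not part of the model's API):

```python
import json

# Allowed values, taken from the parameter lists above
TRAINING_PHASES = {"weight_loss", "muscle_gain", "cardio_improve", "maintenance"}
MOTIVATIONS = {"medical_recommendation", "self_improvement", "competition",
               "rehabilitation", "wellbeing"}
SPECIAL_SITUATIONS = {"pregnancy", "post_partum", "injury_recovery",
                      "chronic_condition", "elderly_population",
                      "physical_limitation", "none"}

def build_prompt(training_phase, motivation, special_situation="none"):
    """Validate a user context and serialize it into the model's prompt format."""
    if training_phase not in TRAINING_PHASES:
        raise ValueError(f"unknown training_phase: {training_phase!r}")
    if motivation not in MOTIVATIONS:
        raise ValueError(f"unknown motivation: {motivation!r}")
    if special_situation not in SPECIAL_SITUATIONS:
        raise ValueError(f"unknown special_situation: {special_situation!r}")
    context = {"training_phase": training_phase,
               "motivation": motivation,
               "special_situation": special_situation}
    return f"Generate a workout for: {json.dumps(context)}"

print(build_prompt("weight_loss", "wellbeing"))
# → Generate a workout for: {"training_phase": "weight_loss", "motivation": "wellbeing", "special_situation": "none"}
```

Serializing with `json.dumps` keeps the prompt's double-quoted JSON consistent with the widget examples above.
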
## Training Details

### Training Data

The model was trained on synthetic workout data generated from multiple comprehensive exercise databases:

1. **Primary Dataset**: `onurSakar/GYM-Exercise` - Structured gym exercise data with instructions and categories
2. **Kaggle Dataset**: `niharika41298/gym-exercise-data` - Comprehensive exercise descriptions with difficulty levels
3. **GitHub Dataset**: `yuhonas/free-exercise-db` - Open-source exercise database with detailed instructions

**Total Training Examples**: 2000+ synthetic workout scenarios covering diverse user contexts and exercise combinations.

### Training Procedure

#### Preprocessing

1. **Data Extraction**: Parsed exercise data from multiple sources using regex patterns
2. **Data Normalization**: Standardized column names and formats across datasets
3. **Synthetic Generation**: Created workout scenarios using template-based generation
4. **Context Mapping**: Paired user contexts with appropriate workout structures
5. **Tokenization**: Applied the FLAN-T5 tokenizer with padding and truncation
 
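Steps 3 and 4 (template-based generation and context mapping) can be sketched as follows; the exercise pool, target schema, and `make_example` helper are invented for illustration and do not reproduce the actual generation scripts:

```python
import json
import random

# Illustrative exercise pool; the real pipeline draws from the parsed datasets
EXERCISE_POOL = {
    "weight_loss": ["jumping jacks", "burpees", "mountain climbers", "rowing"],
    "muscle_gain": ["bench press", "squat", "deadlift", "overhead press"],
}

def make_example(training_phase, motivation, rng):
    """Pair a user context with a templated workout target (hypothetical schema)."""
    context = {"training_phase": training_phase,
               "motivation": motivation,
               "special_situation": "none"}
    exercises = rng.sample(EXERCISE_POOL[training_phase], 3)
    target = {"blocks": [{"name": "warmup", "duration_min": 10},
                         {"name": "main", "exercises": exercises},
                         {"name": "cooldown", "duration_min": 5}]}
    return {"input": f"Generate a workout for: {json.dumps(context)}",
            "target": json.dumps(target)}

rng = random.Random(0)
example = make_example("weight_loss", "wellbeing", rng)
print(example["input"])
```

Each synthetic example pairs a serialized context (the model input) with a JSON workout plan (the target), which is what a seq2seq fine-tune needs.
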
#### Training Hyperparameters

- **Base Model**: google/flan-t5-base (~250M parameters)
- **Training Epochs**: 1 per phase
- **Batch Size**: 8 (original) / 1 (optimized, with gradient accumulation)
- **Gradient Accumulation Steps**: 8
- **Learning Rate**: 5e-5
- **Weight Decay**: 0.01
- **Training Regime**: FP16 mixed precision
- **Optimizer**: AdamW
- **Evaluation Strategy**: Steps-based (every 200/500 steps)
- **Save Strategy**: Every 500/1000 steps
 
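Since training used the Hugging Face Trainer (see Software below), the listed hyperparameters map onto a `TrainingArguments` configuration roughly as follows; the output directory and the step counts chosen from each "/" pair are illustrative:

```python
from transformers import TrainingArguments

# Rough mapping of the listed hyperparameters; a sketch, not the actual script
args = TrainingArguments(
    output_dir="athenai-checkpoints",   # illustrative path
    num_train_epochs=1,                 # one epoch per phase
    per_device_train_batch_size=1,      # memory-optimized setting
    gradient_accumulation_steps=8,      # effective batch size: 1 x 8 = 8
    learning_rate=5e-5,
    weight_decay=0.01,
    fp16=True,                          # FP16 mixed precision
    eval_strategy="steps",
    eval_steps=200,
    save_steps=500,
)
```

With a per-device batch size of 1 and 8 accumulation steps, the effective batch size matches the original setting of 8.
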
#### Training Architecture

**Multi-Phase Training Approach:**

1. **Phase 1**: Fine-tuning on Kaggle-derived synthetic data (1000 examples)
2. **Phase 2**: Additional training on GitHub-derived synthetic data (1000 examples)

#### Speeds, Sizes, Times

- **Model Size**: ~1 GB in FP32 (FLAN-T5-base architecture)
- **Training Time**: ~2-4 hours per phase on a Google Colab GPU
- **Inference Speed**: 0.5-1 seconds per workout (GPU), 2-5 seconds (CPU)
- **Memory Requirements**: 4-8 GB RAM for inference, 12 GB+ for training

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on held-out synthetic workout data representing diverse user contexts and exercise combinations. Test cases included edge cases such as multiple special situations and complex user requirements.

#### Factors

Evaluation was performed across multiple factors:

- **Training Phase Diversity**: All four training phases (weight_loss, muscle_gain, cardio_improve, maintenance)
- **Motivation Variety**: Five motivation types, from medical to competitive
- **Special Situations**: Seven different special health/physical situations
- **Workout Complexity**: Varying block structures and exercise counts

#### Metrics

Primary evaluation metrics included:

- **JSON Format Validity**: Structural correctness of generated workout plans
- **Context Relevance**: Appropriateness of exercises for the given user context
- **Exercise Variety**: Diversity in recommended exercises across similar contexts
- **Block Structure Coherence**: Logical flow from warmup to cooldown
- **Duration Estimation Accuracy**: Realistic time estimates for workout completion
 
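A check like the JSON Format Validity metric can be automated. The sketch below assumes a workout schema with a top-level `blocks` list containing warmup/main/cooldown entries; the exact schema is not documented in this card, so the required keys are illustrative:

```python
import json

REQUIRED_BLOCKS = ("warmup", "main", "cooldown")  # assumed block structure

def is_valid_workout(text):
    """Return True if `text` parses as JSON and contains the assumed block layout."""
    try:
        plan = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(plan, dict):
        return False
    blocks = plan.get("blocks")
    if not isinstance(blocks, list):
        return False
    names = [b.get("name") for b in blocks if isinstance(b, dict)]
    return all(name in names for name in REQUIRED_BLOCKS)

sample = '{"blocks": [{"name": "warmup"}, {"name": "main"}, {"name": "cooldown"}]}'
print(is_valid_workout(sample))      # True
print(is_valid_workout("not json"))  # False
```

Running such a validator over a batch of generations gives the validity rate directly; the other metrics listed above required manual review.
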
### Results

The model successfully generates valid JSON workout structures with contextually appropriate exercises. Manual evaluation showed strong performance in:

- Context understanding and exercise selection
- Workout structure and flow
- Adaptation to special situations
- Exercise variety and avoidance of repetition

## Environmental Impact

Training AthenAI involved fine-tuning a pre-existing model rather than training from scratch, significantly reducing computational requirements and carbon footprint.

- **Hardware Type**: NVIDIA T4 GPU (Google Colab)
- **Hours used**: Approximately 6-8 hours total training time
- **Cloud Provider**: Google Cloud Platform (Colab)
- **Compute Region**: Variable (Colab allocation)
- **Carbon Emitted**: Estimated <5 kg CO2eq (due to the fine-tuning approach)

## Technical Specifications

### Model Architecture and Objective

- **Architecture**: FLAN-T5-base (Text-to-Text Transfer Transformer)
- **Parameters**: ~250 million
- **Objective**: Sequence-to-sequence generation for workout plan creation
- **Input Format**: Natural-language context description
- **Output Format**: Structured JSON workout plans
- **Context Window**: 512 tokens maximum
- **Generation Strategy**: Autoregressive text generation with temperature sampling

### Compute Infrastructure

#### Hardware

- **Training**: NVIDIA T4 GPU (Google Colab Pro)
- **Memory**: 16 GB GPU memory, 25 GB system RAM
- **Storage**: 100 GB+ for datasets and model checkpoints

#### Software

- **Framework**: Hugging Face Transformers 4.45.2
- **Training Library**: Hugging Face Trainer
- **Data Processing**: Pandas, Datasets 3.0.1
- **Environment**: Python 3.10, PyTorch 2.0+
- **Platform**: Google Colab with GPU acceleration

## Citation

If you use AthenAI in your research or applications, please cite:

**BibTeX:**

```bibtex
@misc{athenai2024,
  title={AthenAI: AI-Powered Personalized Workout Generator},
  author={a-albiol},
  year={2024},
  publisher={Hugging Face},
  journal={Hugging Face Model Hub},
  howpublished={\url{https://huggingface.co/a-albiol/AthenAI}}
}
```

**APA:**

a-albiol. (2024). *AthenAI: AI-Powered Personalized Workout Generator*. Hugging Face Model Hub. https://huggingface.co/a-albiol/AthenAI

## Glossary

- **Training Phase**: The user's current fitness goal (weight loss, muscle gain, etc.)
- **Motivation**: The underlying reason for exercising (medical, competition, etc.)
- **Special Situation**: Health or physical considerations (pregnancy, injury, etc.)
- **Workout Block**: A structured section of a workout (warmup, main, cooldown)
- **Fine-tuning**: The process of adapting a pre-trained model to a specific task
- **Synthetic Data**: Artificially generated training examples based on real exercise databases

## More Information

For detailed implementation, training notebooks, and additional examples, visit the project repository. The model continues to evolve with community feedback and additional training data.

For technical support or questions about integration, please open an issue in the repository or reach out through the Hugging Face model discussions.

## Model Card Authors

**Primary Author**: a-albiol

**Contributors**: Community feedback and testing

## Model Card Contact

For questions, feedback, or collaboration inquiries:

- **Hugging Face**: [@a-albiol](https://huggingface.co/a-albiol)
- **Model Discussions**: Use the Hugging Face model page discussion section
- **Issues**: Report technical issues through the repository issue tracker