# BitTransformerLM Scripts

This directory contains organized scripts for BitTransformerLM development, training, and evaluation.
## Directory Structure

```
scripts/
├── training/     # Training scripts and experiments
├── examples/     # Example usage and demonstrations
├── testing/      # Test scripts and validation
├── benchmarks/   # Performance benchmarks
└── tools/        # Utility scripts and data processing
```
## Training Scripts (`training/`)

- **basic_training.py** - Simple training setup for small models
- **breakthrough_training.py** - Advanced training with breakthrough techniques
- **cpu_edge_training.py** - CPU-optimized training for edge deployment
- **final_breakthrough_training.py** - Production training pipeline
- **full_attention_training.py** - Full attention mechanism training
- **full_bits_train.py** - Complete bit-level training
- **production_training.py** - Production-ready training script
- **progressive_scaleup.py** - Progressive model scaling
- **quick_training_run.py** - Fast training for development

## Example Scripts (`examples/`)

- **example.py** - Basic usage example
- **better_sampling.py** - Advanced sampling techniques
- **debug_generation.py** - Generation debugging utilities
- **raw_generation.py** - Low-level generation examples
- **simple_test.py** - Simple model testing

## Testing Scripts (`testing/`)

- **code_test.py** - Code functionality testing
- **diffusion_tests.py** - Diffusion mode testing
- **enhanced_generation_test.py** - Advanced generation testing
- **full_attention_inference_test.py** - Attention mechanism tests
- **test_conversation.py** - Conversational AI testing

## Benchmark Scripts (`benchmarks/`)

- **wikitext_benchmark.py** - WikiText dataset benchmarking
- **wikitext_schedule.py** - WikiText training schedule

## Utility Tools (`tools/`)

- **build_full_bits.py** - Bit sequence construction
- **create_dataset.py** - Dataset creation utilities
- **enhanced_checkpoint_system.py** - Advanced checkpointing
- **integration_flow.py** - Integration workflow
- **integration_schedule.py** - Integration scheduling
- **sync_to_hf.py** - HuggingFace synchronization
- **unified_workflow.py** - Unified training workflow
- **watcher.py** - File system monitoring
## Usage

All scripts support the standardized CLI interface provided by `bit_transformer.cli_standards`. Use `--help` with any script to see available options.

### Quick Start

```bash
# Train a small model
python scripts/training/basic_training.py --model-size small --epochs 5

# Run a simple test
python scripts/examples/simple_test.py --d-model 64

# Benchmark on WikiText
python scripts/benchmarks/wikitext_benchmark.py --dataset-name wikitext-2
```
### Environment Variables

Scripts also support configuration via environment variables with the `BT_` prefix:

```bash
export BT_D_MODEL=128
export BT_NUM_LAYERS=4
export BT_BATCH_SIZE=16
python scripts/training/basic_training.py
```
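The `BT_` override pattern can be sketched as follows. This is an illustrative helper, not the project's actual implementation: the function name `env_override` and the resolution order (environment variable wins over the coded default) are assumptions.

```python
import os


def env_override(name: str, default, cast=int):
    """Return BT_<NAME> from the environment if set, else the default.

    Illustrative sketch only: the real scripts may resolve overrides
    differently (e.g. inside bit_transformer.cli_standards).
    """
    raw = os.environ.get(f"BT_{name.upper()}")
    return cast(raw) if raw is not None else default


# Falls back to the coded default unless the BT_* variable is exported.
d_model = env_override("d_model", 64)
num_layers = env_override("num_layers", 2)
```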
## Development Guidelines

- All scripts should use `bit_transformer.cli_standards` for argument parsing
- Include proper logging and error handling
- Support both CPU and GPU execution
- Follow the naming conventions established in existing scripts
- Add documentation for any new hyperparameters or features
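A minimal skeleton for a new script following the guidelines above might look like this. Plain `argparse` and `logging` are used here so the sketch stays self-contained; in the actual repo the parser would come from `bit_transformer.cli_standards`, whose exact API is not shown in this document.

```python
import argparse
import logging


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for bit_transformer.cli_standards.
    parser = argparse.ArgumentParser(description="Example BitTransformerLM script")
    parser.add_argument("--d-model", type=int, default=64, help="model width")
    parser.add_argument("--num-layers", type=int, default=2, help="transformer layers")
    # Support both CPU and GPU execution, per the guidelines.
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu",
                        help="execution device")
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    # Proper logging rather than bare prints, per the guidelines.
    logging.basicConfig(level=logging.INFO)
    logging.info("starting: d_model=%d, layers=%d, device=%s",
                 args.d_model, args.num_layers, args.device)


if __name__ == "__main__":
    main()
```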