James Edmunds committed
Commit d74e599 · 1 Parent(s): f87cd97

Add data directory structure with .gitkeep files and usage instructions

README.md CHANGED
@@ -146,9 +146,29 @@ SongLift_LyrGen2/
 │   └── utils/ # Utility functions
 ├── scripts/ # Data processing & testing
 ├── data/
-│   ├── raw/lyrics/ # Original lyrics files
-│   └── processed/ # Embeddings & processed data
-└── docs/ # Documentation
+│   ├── raw/lyrics/ # Place your lyrics files here (organized by artist folders)
+│   └── processed/ # Generated embeddings & ChromaDB files
+└── .env.example # Environment variables template
+```
+
+### 📂 Data Directory Setup
+
+The `data/` directory structure is preserved for you to add your own lyrics:
+
+```
+data/raw/lyrics/
+├── artist1/
+│   ├── song1.txt
+│   └── song2.txt
+├── artist2/
+│   ├── song1.txt
+│   └── song2.txt
+└── ...
+```
+
+After adding lyrics, run the processing pipeline:
+```bash
+python scripts/process_lyrics.py
 ```
 
 ## 🔍 Browser Compatibility
data/processed/embeddings/.gitkeep ADDED
@@ -0,0 +1,3 @@
+# This file ensures the embeddings directory structure is preserved in git
+# Processed embeddings and ChromaDB files will be stored here
+# This directory is populated by running: python scripts/process_lyrics.py
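
Git does not track empty directories, which is why this commit adds `.gitkeep` placeholders. As a minimal sketch of the same idea, the two directories named in this commit could be recreated locally as shown below. The helper name `ensure_data_dirs` and its `root` parameter are hypothetical and not part of the repository.

```python
from pathlib import Path

# Directories named in this commit. Git does not track empty directories,
# so each one gets a placeholder file.
DATA_DIRS = ("data/raw/lyrics", "data/processed/embeddings")


def ensure_data_dirs(root="."):
    """Create the data directories under root, each with a .gitkeep placeholder.

    Hypothetical helper for illustration; not part of the repository.
    """
    placeholders = []
    for rel in DATA_DIRS:
        directory = Path(root) / rel
        directory.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        keep = directory / ".gitkeep"
        if not keep.exists():
            keep.touch()  # empty placeholder so git can track the directory
        placeholders.append(keep)
    return placeholders
```

Calling `ensure_data_dirs()` from the repository root is idempotent: existing directories and placeholder files are left untouched.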
data/raw/lyrics/.gitkeep ADDED
@@ -0,0 +1,10 @@
+# This file ensures the lyrics directory structure is preserved in git
+# Place your lyrics files (.txt) in this directory organized by artist folders
+# Example structure:
+# data/raw/lyrics/
+# ├── artist1/
+# │   ├── song1.txt
+# │   └── song2.txt
+# └── artist2/
+#     ├── song1.txt
+#     └── song2.txt
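
The artist-folder layout described above can be read back with a short directory walk. The sketch below assumes only what the commit states: `.txt` files grouped in one folder per artist under `data/raw/lyrics/`. The function name `collect_lyrics` is hypothetical, and this is not the implementation of `scripts/process_lyrics.py`, whose contents are not shown in this commit.

```python
from pathlib import Path


def collect_lyrics(lyrics_root="data/raw/lyrics"):
    """Map each artist folder name to a sorted list of its .txt lyric files.

    Hypothetical helper for illustration; the real pipeline may differ.
    """
    root = Path(lyrics_root)
    songs_by_artist = {}
    if not root.is_dir():
        return songs_by_artist
    # Only subdirectories count as artists; files such as .gitkeep are skipped.
    for artist_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        songs = sorted(artist_dir.glob("*.txt"))
        if songs:  # ignore artist folders that have no lyrics yet
            songs_by_artist[artist_dir.name] = songs
    return songs_by_artist
```

A quick check before running the pipeline might be `for artist, songs in collect_lyrics().items(): print(artist, len(songs))`, which lists how many lyric files each artist folder contributes.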