Spaces:
Sleeping
file_utils.py
Purpose
File I/O operations for Nano Banana Streamlit. Centralized handling of image saving/loading, metadata management, filename generation, and directory operations.
Responsibilities
- Generate safe, unique filenames with timestamps
- Save/load images to/from disk
- Save/load metadata as JSON
- Create standardized metadata dictionaries
- Compute image hashes for change detection
- Manage output directory structure
- List recent generations
Dependencies
Imports
json- JSON serializationhashlib- Image hashing (SHA-256)re- Filename sanitization (regex)datetime- Timestampspathlib.Path- Path operationsPIL.Image- Image handlingconfig.settings.Settings- Directory pathsutils.logging_utils.get_logger- Logging
Used By
- All services - Save generation results
- All pages - Load/display images
- Backend clients - Save API responses
models/generation_result.py- Metadata creation
Public Interface
Filename Utilities
sanitize_filename(name: str) -> str
Remove unsafe characters from filename.
Rules:
- Removes:
< > : " / \ | ? * - Replaces with underscore
- Strips leading/trailing spaces and dots
- Limits to 100 characters
- Falls back to "generated" if empty
Example:
safe = sanitize_filename("My Character: v2.0")
# Returns: "My_Character__v2_0"
generate_timestamp_filename(base_name: str, extension: str = "png") -> str
Generate filename with timestamp.
Format: {base_name}_{YYYYMMDD_HHMMSS}.{extension}
Example:
filename = generate_timestamp_filename("character", "png")
# Returns: "character_20251023_143052.png"
get_unique_filename(directory: Path, base_name: str, extension: str = "png") -> Path
Generate unique filename that doesn't exist in directory.
If file exists, appends counter: _1, _2, etc.
Example:
path = get_unique_filename(Settings.CHARACTER_SHEETS_DIR, "hero", "png")
# Returns: Path("outputs/character_sheets/hero_20251023_143052.png")
# If exists: Path("outputs/character_sheets/hero_20251023_143052_1.png")
Image Operations
save_image(image: Image, directory: Path, base_name: str, metadata: dict = None) -> Tuple[Path, Path]
Save image and optional metadata.
Parameters:
image: PIL Image to savedirectory: Target directory (created if doesn't exist)base_name: Base filename (will add timestamp)metadata: Optional metadata dict (saved as JSON)
Returns: (image_path, metadata_path) tuple
Example:
metadata = {"prompt": "sunset", "backend": "Gemini"}
img_path, meta_path = save_image(
image=generated_image,
directory=Settings.CHARACTER_SHEETS_DIR,
base_name="hero",
metadata=metadata
)
# Saves:
# outputs/character_sheets/hero_20251023_143052.png
# outputs/character_sheets/hero_20251023_143052.json
load_image(file_path: Path) -> Image
Load image from disk.
Raises:
FileNotFoundError: If file doesn't existIOError: If file can't be read as image
Example:
image = load_image(Path("outputs/character_sheets/hero_20251023_143052.png"))
Metadata Operations
save_metadata(file_path: Path, metadata: dict)
Save metadata dictionary as JSON.
Format: Indented JSON with UTF-8 encoding
Raises: IOError if write fails
load_metadata(file_path: Path) -> dict
Load metadata from JSON file.
Raises:
FileNotFoundError: If file doesn't existjson.JSONDecodeError: If invalid JSON
Example:
meta = load_metadata(Path("outputs/character_sheets/hero_20251023_143052.json"))
prompt = meta["prompt"]
create_generation_metadata(...) -> dict
Create standardized metadata dictionary.
Parameters:
prompt: Generation prompt (required)backend: Backend used (required)aspect_ratio: Aspect ratio (required)temperature: Temperature value (required)input_images: List of input image paths (optional)generation_time: Time taken in seconds (optional)**kwargs: Additional custom fields
Returns: Metadata dictionary with standard fields
Standard Fields:
timestamp: ISO format timestampprompt: Generation promptbackend: Backend nameaspect_ratio: Aspect ratio stringtemperature: Temperature valueversion: Application version ("2.0.0-streamlit")input_images: List of input paths (if provided)generation_time_seconds: Time taken (if provided)
Example:
metadata = create_generation_metadata(
prompt="sunset over mountains",
backend="Gemini API (Cloud)",
aspect_ratio="16:9",
temperature=0.4,
generation_time=3.5,
character_name="Hero", # Custom field
stage="front_portrait" # Custom field
)
Image Hashing
compute_image_hash(image: Image) -> str
Compute SHA-256 hash of image data.
Useful for detecting if input images have changed.
Returns: Hex string (64 characters)
Example:
hash1 = compute_image_hash(image1)
hash2 = compute_image_hash(image2)
if hash1 == hash2:
print("Images are identical")
Directory Operations
ensure_output_directories()
Ensure all output directories exist.
Creates all directories defined in Settings if they don't exist. Called on startup.
get_output_directory_for_type(generation_type: str) -> Path
Get appropriate output directory for generation type.
Types:
"character_sheet"→Settings.CHARACTER_SHEETS_DIR"wardrobe"→Settings.WARDROBE_CHANGES_DIR"composition"→Settings.COMPOSITIONS_DIR"standard"→Settings.STANDARD_DIR
Raises: ValueError if unknown type
Example:
output_dir = get_output_directory_for_type("character_sheet")
# Returns: Path("outputs/character_sheets")
list_recent_generations(generation_type: str, count: int = 10) -> list
List recent generation files in a directory.
Returns: List of (image_path, metadata_path) tuples, newest first
Metadata path is None if JSON file doesn't exist.
Example:
recent = list_recent_generations("character_sheet", count=5)
for img_path, meta_path in recent:
image = load_image(img_path)
if meta_path:
metadata = load_metadata(meta_path)
Usage Examples
Service Saving Output
from utils.file_utils import save_image, create_generation_metadata, get_output_directory_for_type
class CharacterForgeService:
def generate(self, prompt, backend, ...):
# ... generation code ...
# Create metadata
metadata = create_generation_metadata(
prompt=prompt,
backend=backend,
aspect_ratio="3:4",
temperature=0.35,
generation_time=elapsed_time,
character_name=character_name,
stage="front_portrait"
)
# Save image and metadata
output_dir = get_output_directory_for_type("character_sheet")
img_path, meta_path = save_image(
image=generated_image,
directory=output_dir,
base_name=character_name,
metadata=metadata
)
return img_path
Page Displaying Recent Generations
import streamlit as st
from utils.file_utils import list_recent_generations, load_image
st.subheader("Recent Character Sheets")
recent = list_recent_generations("character_sheet", count=4)
cols = st.columns(4)
for idx, (img_path, meta_path) in enumerate(recent):
with cols[idx]:
image = load_image(img_path)
st.image(image, caption=img_path.stem, use_container_width=True)
Loading Previous Generation
from utils.file_utils import load_image, load_metadata
# User selects a previous generation
image_path = st.selectbox("Load previous", [...])
if image_path:
# Load image
image = load_image(Path(image_path))
st.image(image)
# Load metadata (if exists)
meta_path = Path(image_path).with_suffix(".json")
if meta_path.exists():
metadata = load_metadata(meta_path)
st.json(metadata)
# Restore settings
st.session_state.prompt = metadata["prompt"]
st.session_state.backend = metadata["backend"]
Error Handling
File Operations
All functions raise appropriate exceptions:
FileNotFoundError: File doesn't existIOError: Read/write errorjson.JSONDecodeError: Invalid JSONValueError: Invalid parameters
Errors are logged before raising.
Automatic Recovery
- Directories created automatically if they don't exist
- Filename conflicts resolved with counter suffix
- Missing metadata handled gracefully (returns None)
Known Limitations
- Filename length limit: 100 characters (base name)
- No image format conversion (saves as PNG only)
- No image compression options
- No batch operations
- No cloud storage integration
- Hash only detects exact pixel matches (not perceptual similarity)
Future Improvements
- Support multiple image formats (JPEG, WEBP)
- Add image compression/quality options
- Add batch save/load operations
- Add cloud storage backends (S3, GCS)
- Add perceptual image hashing (pHash)
- Add image metadata embedding (EXIF)
- Add file cleanup/archiving utilities
- Add generation statistics tracking
Testing
- Test sanitize_filename() with various unsafe characters
- Test generate_timestamp_filename() format
- Test get_unique_filename() collision handling
- Test save_image() creates files correctly
- Test load_image() with valid/invalid files
- Test save/load_metadata() round-trip
- Test create_generation_metadata() includes all fields
- Test compute_image_hash() consistency
- Test list_recent_generations() sorting
Related Files
config/settings.py- Directory path constantsutils/logging_utils.py- Logging functions- All services - Save generation results
- All pages - Load and display files
models/generation_result.py- Uses metadata creation
Security Considerations
- Filename sanitization prevents directory traversal
- No arbitrary file paths allowed (always in Settings directories)
- JSON encoding ensures no code injection
- File permissions inherited from parent directory
Performance Considerations
- Image hashing loads full image into memory
- Large images may be slow to hash
- list_recent_generations() sorts by modification time (fast)
- JSON serialization is fast for typical metadata size
Change History
- 2025-10-23: Initial creation for Streamlit migration
- Centralized all file I/O operations
- Added comprehensive filename handling
- Added metadata standardization
- Added directory management
- Added recent generations listing
- Integrated with Settings and logging