Commit · d5841ad
Simple clean UI version
Files changed:
- README.md +118 -0
- app.py +143 -0
- inputs/test1.png +0 -0
- inputs/test2.png +0 -0
- main.py +82 -0
- outputs/test1.txt +11 -0
- outputs/test2.txt +5 -0
- packages.txt +2 -0
- preprocessing.py +193 -0
- requirements.txt +5 -0
README.md
ADDED
@@ -0,0 +1,118 @@
---
title: Text Detection Demo
emoji: 📝
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: apache-2.0
---

# 📝 Text Detection Demo

Extract text from any image using OCR (Optical Character Recognition).

## 🎯 What It Does

Upload an image → AI extracts the text → Copy and use!

## 🚀 Try It Live

**Demo:** https://huggingface.co/spaces/AlBaraa63/text_detection

## 📁 Files

```
text_detection/
├── app.py             # Gradio web demo
├── main.py            # CLI version
├── preprocessing.py   # Image processing helpers
├── requirements.txt   # Python dependencies
├── packages.txt       # System dependencies
└── README.md          # This file
```

## 🛠️ Setup

### 1. Install Tesseract OCR
- **Windows:** Download the installer from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
- Install to: `C:\Program Files\Tesseract-OCR`

### 2. Install Python Packages
```bash
pip install -r requirements.txt
```

Or manually:
```bash
pip install opencv-python pytesseract numpy
```

### 3. Test the Installation
```bash
python -c "import pytesseract; print(pytesseract.get_tesseract_version())"
```

## 🚀 Usage

### Simple - Run and Enter a Path
```bash
python main.py
```
Then enter your image path when prompted.

### Example
```bash
python main.py
# Enter: inputs/image.png
```

## 📝 Example

**Input image:** Screenshot with text
**Output:** Text file with the detected text

```
Image: image.png
Size: 869 x 296 pixels

DETECTED TEXT:
Mix - antent - homesick (super slowed)
Mixes are playlists YouTube makes for you

✅ Text saved to: outputs/image.txt
```

## 🎓 How It Works

1. **Load Image** - Read the image file
2. **Preprocess** - Convert to grayscale and binarize
3. **OCR** - Extract the text with Tesseract
4. **Save** - Write the text to `outputs/<image_name>.txt`
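The preprocessing step binarizes the image with Otsu's method (`cv2.threshold` with `THRESH_OTSU` in `app.py` and `main.py`), which picks the black/white cutoff automatically by maximizing the between-class variance of the pixel histogram. A minimal NumPy sketch of the idea, for illustration only — the actual code simply calls OpenCV:

```python
import numpy as np

def otsu_threshold(gray):
    """Return the Otsu threshold of a uint8 grayscale array: the
    level t that maximizes the between-class variance of the two
    pixel populations split at t."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum_w = np.cumsum(hist)                    # pixel count at or below t
    cum_m = np.cumsum(hist * np.arange(256))   # intensity mass at or below t
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = cum_w[t] / total                  # background weight
        w1 = 1.0 - w0                          # foreground weight
        if w0 == 0 or w1 == 0:
            continue                           # one class empty: skip
        m0 = cum_m[t] / cum_w[t]               # background mean
        m1 = (cum_m[-1] - cum_m[t]) / (total - cum_w[t])  # foreground mean
        var = w0 * w1 * (m0 - m1) ** 2         # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

For a cleanly bimodal image (say, dark text on a light background), the returned level separates the two intensity clusters, which is exactly why it helps OCR.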

## 📊 What's Included

- **2 sample images** in the `inputs/` folder for testing
- Works with any common image format (PNG, JPG, etc.)
- Clean and minimal - perfect for learning!

## 💡 Tips

- Works best with clear, high-contrast images
- Screenshots work great
- Photos may need better lighting
- Larger images usually mean better accuracy
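On that last tip: `preprocessing.py` upscales images with `cv2.resize(..., interpolation=cv2.INTER_CUBIC)` before OCR. A dependency-free sketch of what upscaling does to the pixel grid (nearest-neighbour repetition, for illustration only — cubic interpolation additionally smooths the result):

```python
import numpy as np

def upscale_nearest(gray, scale=2):
    """Upscale a 2-D image by an integer factor by repeating each
    pixel: every source pixel becomes a scale x scale block."""
    return np.repeat(np.repeat(gray, scale, axis=0), scale, axis=1)
```

Each text stroke ends up covering more pixels, which gives Tesseract more signal to work with.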

## 🚀 Next Steps

Once you understand this basic version, you can:
- Add preprocessing options
- Batch-process multiple images
- Add confidence scores
- Try different languages
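For the confidence-scores idea: `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` returns parallel lists including a `conf` column (-1 for non-word rows). A sketch of filtering words by confidence — the dict shape is pytesseract's, but the helper name is our own:

```python
def filter_by_confidence(data, min_conf=60):
    """Keep (word, confidence) pairs whose OCR confidence is at
    least min_conf. `data` is shaped like the dict returned by
    pytesseract.image_to_data(..., output_type=Output.DICT):
    parallel lists under 'text' and 'conf'."""
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # skip empty rows and structural rows (conf == -1)
        if text.strip() and float(conf) >= min_conf:
            words.append((text, float(conf)))
    return words

# Typical usage (requires pytesseract and the tesseract binary):
# data = pytesseract.image_to_data(thresh, output_type=pytesseract.Output.DICT)
# print(filter_by_confidence(data, min_conf=70))
```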

---

*Simple text detection for learning* 🎓
app.py
ADDED
@@ -0,0 +1,143 @@
"""
Text Detection Demo with Gradio
Extract text from images using OCR
"""
import gradio as gr
import cv2
import pytesseract
import numpy as np
from PIL import Image
import os

# Set the Tesseract path on Windows (in cloud deployment the binary
# installed via packages.txt is found on PATH instead)
if os.path.exists(r'C:\Program Files\Tesseract-OCR\tesseract.exe'):
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'


def extract_text_from_image(image):
    """
    Extract text from an uploaded image.

    Args:
        image: PIL Image or numpy array

    Returns:
        tuple: (processed_image, extracted_text)
    """
    try:
        # Convert PIL Image to numpy array if needed
        if isinstance(image, Image.Image):
            image = np.array(image)

        # Convert to grayscale (Gradio delivers RGB; also handle
        # RGBA and already-grayscale inputs)
        if image.ndim == 3 and image.shape[2] == 4:
            image = cv2.cvtColor(image, cv2.COLOR_RGBA2RGB)
        if image.ndim == 3:
            gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
        else:
            gray = image

        # Apply Otsu thresholding for better OCR
        _, threshold = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

        # Extract text using Tesseract
        text = pytesseract.image_to_string(threshold)

        # Clean up the text
        text = text.strip()

        if not text:
            text = ("⚠️ No text detected in the image.\n\n"
                    "Tips:\n"
                    "- Make sure the image contains clear text\n"
                    "- Try an image with higher resolution\n"
                    "- Ensure good contrast between text and background")

        # Convert the processed image back to RGB for display
        processed_display = cv2.cvtColor(threshold, cv2.COLOR_GRAY2RGB)

        return processed_display, text

    except Exception as e:
        error_msg = f"❌ Error processing image: {e}\n\nPlease try another image."
        return image, error_msg


# Create the Gradio interface
with gr.Blocks(theme=gr.themes.Soft(), title="Text Detection Demo") as demo:

    gr.Markdown(
        """
        # 📝 Text Detection Demo
        ### Extract text from any image using OCR

        Upload an image containing text, and the AI will extract all readable text from it.
        Perfect for documents, screenshots, photos of signs, and more!
        """
    )

    with gr.Row():
        with gr.Column():
            input_image = gr.Image(
                label="Upload Image",
                type="pil",
                height=400
            )

            extract_btn = gr.Button(
                "🔍 Extract Text",
                variant="primary",
                size="lg"
            )

            gr.Markdown(
                """
                ### 💡 Tips for best results:
                - Use clear, high-resolution images
                - Ensure good lighting and contrast
                - Avoid blurry or distorted text
                - Works with printed and digital text
                """
            )

        with gr.Column():
            output_image = gr.Image(
                label="Processed Image (Thresholded)",
                height=400
            )

            output_text = gr.Textbox(
                label="Extracted Text",
                lines=10,
                placeholder="Extracted text will appear here...",
                show_copy_button=True
            )

    # Example images section (only list files that actually exist)
    example_paths = [[p] for p in ("inputs/test1.png", "inputs/test2.png")
                     if os.path.exists(p)]
    if example_paths:
        gr.Markdown("### 📸 Try these examples:")
        gr.Examples(
            examples=example_paths,
            inputs=input_image,
            label="Sample Images"
        )

    # Connect the button to the function
    extract_btn.click(
        fn=extract_text_from_image,
        inputs=input_image,
        outputs=[output_image, output_text]
    )

    # Footer
    gr.Markdown(
        """
        ---
        Made with ❤️ using Gradio and Tesseract OCR
        """
    )

# Launch the app
if __name__ == "__main__":
    demo.launch(
        share=True,             # Creates a public shareable link when run locally
        server_name="0.0.0.0",  # Allow external connections
        server_port=7860
    )
inputs/test1.png
ADDED

inputs/test2.png
ADDED
main.py
ADDED
@@ -0,0 +1,82 @@
"""
Simple Text Detection - Extract text from any image
Just run: python main.py
"""
import cv2
import pytesseract
import os

# Set the Tesseract path on Windows (on Linux/macOS the binary on PATH is used)
if os.path.exists(r'C:\Program Files\Tesseract-OCR\tesseract.exe'):
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'


def extract_text(image_path):
    """Extract text from an image"""

    # Read the image
    img = cv2.imread(image_path)

    if img is None:
        print(f"Could not read image: {image_path}")
        print("Make sure the file path is correct")
        return None

    # Convert to grayscale
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply Otsu thresholding to make the text clearer
    _, threshold = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Extract text using OCR
    print("\nExtracting text...")
    text = pytesseract.image_to_string(threshold)

    # Clean up the text
    text = text.strip()

    if text:
        print("\nDETECTED TEXT:")
        print("=" * 30)
        print(text)
        print("=" * 30)

        # Create the outputs folder if it doesn't exist
        os.makedirs("outputs", exist_ok=True)

        # Save to outputs/<image_name>.txt, matching the image's name
        image_name = os.path.splitext(os.path.basename(image_path))[0]
        output_file = os.path.join("outputs", f"{image_name}.txt")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write(text)
        print(f"\nText saved to: {output_file}")

        return text
    else:
        print("\nNo text detected in the image")
        return None


def main():
    """Main function"""

    print("\nSimple Text Detection Tool")

    image_path = input("\nEnter image path: ").strip()

    # Remove quotes if the user copied the path with quotes
    image_path = image_path.strip('"').strip("'")

    # Check that the file exists
    if not os.path.exists(image_path):
        print(f"\nFile not found: {image_path}")
        print("Please check the path and try again")
        return

    # Extract text
    extract_text(image_path)


if __name__ == "__main__":
    main()
outputs/test1.txt
ADDED
@@ -0,0 +1,11 @@
Tesseract installer for Windows

Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version.
That's why we have built a Tesseract installer for Windows.

WARNING: Tesseract should be either installed in the directory which is suggested during the
installation or in a new directory. The uninstaller removes the whole installation directory. If you
installed Tesseract in an existing directory, that directory will be removed with all its subdirectories
and files.

The latest installers can be downloaded here:
outputs/test2.txt
ADDED
@@ -0,0 +1,5 @@
Tt was the best of
times, it was the worst
of times, it was the age
of wisdom, it was the
age of foolishness...
packages.txt
ADDED
@@ -0,0 +1,2 @@
tesseract-ocr
tesseract-ocr-eng
preprocessing.py
ADDED
@@ -0,0 +1,193 @@
"""
Preprocessing functions to improve OCR accuracy
Includes various image enhancement techniques
"""
import cv2
import numpy as np


def convert_to_grayscale(img):
    """Convert an image to grayscale (no-op if already single-channel)"""
    if len(img.shape) == 3:
        return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img


def apply_thresholding(img, method='otsu'):
    """
    Apply thresholding to an image

    Methods:
    - 'otsu': Otsu's automatic thresholding
    - 'adaptive': Adaptive thresholding
    - 'binary': Simple binary thresholding
    """
    gray = convert_to_grayscale(img)

    if method == 'otsu':
        # Otsu's thresholding - automatic threshold selection
        _, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    elif method == 'adaptive':
        # Adaptive thresholding - good for varying lighting
        thresh = cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY, 11, 2
        )

    elif method == 'binary':
        # Simple binary thresholding
        _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

    else:
        thresh = gray

    return thresh


def remove_noise(img, method='median'):
    """
    Remove noise from an image

    Methods:
    - 'median': Median blur (good for salt-and-pepper noise)
    - 'gaussian': Gaussian blur (general smoothing)
    - 'bilateral': Bilateral filter (preserves edges)
    """
    if method == 'median':
        return cv2.medianBlur(img, 3)

    elif method == 'gaussian':
        return cv2.GaussianBlur(img, (5, 5), 0)

    elif method == 'bilateral':
        return cv2.bilateralFilter(img, 9, 75, 75)

    return img


def dilate_text(img, kernel_size=(1, 1)):
    """Dilate text to make it thicker
    (pass a kernel larger than (1, 1) for a visible effect)"""
    kernel = np.ones(kernel_size, np.uint8)
    return cv2.dilate(img, kernel, iterations=1)


def erode_text(img, kernel_size=(1, 1)):
    """Erode text to make it thinner
    (pass a kernel larger than (1, 1) for a visible effect)"""
    kernel = np.ones(kernel_size, np.uint8)
    return cv2.erode(img, kernel, iterations=1)


def invert_image(img):
    """Invert image colors (useful if text is white on black)"""
    return cv2.bitwise_not(img)


def enhance_contrast(img):
    """Enhance image contrast using CLAHE"""
    gray = convert_to_grayscale(img)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(gray)


def resize_image(img, scale=2.0):
    """
    Resize an image for better OCR
    Larger images often work better with Tesseract
    """
    height, width = img.shape[:2]
    new_width = int(width * scale)
    new_height = int(height * scale)
    return cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_CUBIC)


def add_border(img, border_size=10, color=255):
    """Add a white border around the image"""
    return cv2.copyMakeBorder(
        img, border_size, border_size, border_size, border_size,
        cv2.BORDER_CONSTANT, value=color
    )


def preprocess_pipeline(img, config='default'):
    """
    Complete preprocessing pipeline

    Configs:
    - 'default': Standard preprocessing
    - 'aggressive': More aggressive preprocessing
    - 'light': Light preprocessing
    - 'upscale': Upscale the image first, then process
    (anything else returns the image unchanged)
    """
    if config == 'default':
        # Standard pipeline
        processed = convert_to_grayscale(img)
        processed = remove_noise(processed, 'median')
        processed = apply_thresholding(processed, 'otsu')
        processed = add_border(processed, 10)

    elif config == 'aggressive':
        # Aggressive preprocessing
        processed = convert_to_grayscale(img)
        processed = enhance_contrast(processed)
        processed = remove_noise(processed, 'bilateral')
        processed = apply_thresholding(processed, 'adaptive')
        processed = dilate_text(processed, (2, 2))
        processed = add_border(processed, 15)

    elif config == 'light':
        # Light preprocessing
        processed = convert_to_grayscale(img)
        processed = apply_thresholding(processed, 'otsu')

    elif config == 'upscale':
        # Upscale and process
        processed = resize_image(img, scale=3.0)
        processed = convert_to_grayscale(processed)
        processed = remove_noise(processed, 'median')
        processed = apply_thresholding(processed, 'otsu')
        processed = add_border(processed, 20)

    else:
        # No preprocessing
        processed = img

    return processed


def preprocess_for_ocr(img, show_steps=False):
    """
    Optimized preprocessing for OCR
    Returns a preprocessed image ready for Tesseract
    (and a dict of the intermediate steps when show_steps is True)
    """
    steps = {}

    # Step 1: Convert to grayscale
    gray = convert_to_grayscale(img)
    if show_steps:
        steps['1_grayscale'] = gray.copy()

    # Step 2: Upscale the image (Tesseract works better with larger images)
    upscaled = resize_image(gray, scale=2.5)
    if show_steps:
        steps['2_upscaled'] = upscaled.copy()

    # Step 3: Remove noise
    denoised = remove_noise(upscaled, 'bilateral')
    if show_steps:
        steps['3_denoised'] = denoised.copy()

    # Step 4: Apply thresholding
    thresh = apply_thresholding(denoised, 'otsu')
    if show_steps:
        steps['4_threshold'] = thresh.copy()

    # Step 5: Add a border
    bordered = add_border(thresh, 20)
    if show_steps:
        steps['5_bordered'] = bordered.copy()

    if show_steps:
        return bordered, steps

    return bordered
requirements.txt
ADDED
@@ -0,0 +1,5 @@
gradio
opencv-python-headless
pytesseract
numpy
pillow