metadata
title: Academic Paraphraser (AP)
emoji: 🧪
colorFrom: blue
colorTo: purple
sdk: streamlit
sdk_version: 1.28.0
app_file: app.py
pinned: false
license: mit
tags:
  - academic-writing
  - paraphrasing
  - nlp
  - engineering
  - plagiarism-detection
  - text-processing
  - streamlit
  - transformers

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

🧪 Academic Paraphraser (AP)

Advanced AI-Powered Academic Writing Assistant for Engineering / Academic Domains

Python 3.8+ | License: MIT | Transformers | Streamlit | Hugging Face Spaces

🚀 Live Demo

Try the live application on Hugging Face Spaces: https://huggingface.co/spaces/AI-Solutions-KK/Writing_Assistant

The app provides an intuitive web interface for all paraphrasing and quality assessment features.

🔬 Overview

The Academic Paraphraser is an AI-powered tool for academic and technical writing in engineering domains. It combines a T5-based neural paraphraser with rule-based, domain-aware processing to rephrase text while preserving technical terminology and meaning.

🎯 Key Objectives

  • Preserve Technical Accuracy: Maintains engineering terminology and concepts
  • Enhance Writing Quality: Improves readability and academic style
  • Reduce Similarity: Helps avoid plagiarism while retaining original meaning
  • Multi-Domain Support: Covers Mechanical, Electrical, Computer Science, and Civil Engineering

✨ Features

🚀 Core Components

| Component | Description | Technology |
|---|---|---|
| 🤖 Academic Paraphraser | T5-based neural paraphrasing | Transformer Architecture |
| 🔍 Plagiarism Remover | Rule-based similarity reduction | NLP + Linguistics |
| 📊 Quality Checker | Comprehensive assessment | Multi-metric Analysis |

🛠️ Advanced Capabilities

  • 🎓 Domain-Specific Processing

    • Mechanical Engineering terminology preservation
    • Electrical Engineering concept handling
    • Computer Science algorithm descriptions
    • Civil Engineering technical language
  • 📝 Intelligent Text Processing

    • Synonym replacement with context awareness
    • Sentence restructuring while preserving meaning
    • Technical term identification and protection
    • Academic style enhancement
  • 📈 Quality Assessment

    • Similarity analysis (lexical & structural)
    • Readability scoring
    • Word variety metrics
    • Length appropriateness checking
  • ⚡ Performance Optimized

    • Lightweight T5-small model for cloud deployment
    • Efficient rule-based processing
    • Comprehensive error handling
    • Scalable architecture
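
The quality metrics listed above can be illustrated with a minimal sketch. It assumes Jaccard overlap of word sets for lexical similarity and a type-token ratio for word variety; the README does not say which formulas the app's Quality Checker uses, and the function names below are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_similarity(original: str, paraphrased: str) -> float:
    """Jaccard overlap of the two vocabularies (0 = disjoint, 1 = identical)."""
    a, b = set(tokenize(original)), set(tokenize(paraphrased))
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def word_variety(text: str) -> float:
    """Type-token ratio: distinct words divided by total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


original = "The electrical circuit demonstrates high impedance characteristics."
paraphrased = "This electrical network exhibits elevated impedance properties."

print(f"Lexical similarity: {lexical_similarity(original, paraphrased):.2f}")  # 0.17
print(f"Word variety: {word_variety(paraphrased):.2f}")  # 1.00
```

Here only "electrical" and "impedance" are shared across 12 distinct words, so the lexical similarity is low while both technical terms survive.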

🏗️ Architecture

graph TB
    A[Input Text] --> B[Domain Detection]
    B --> C{Processing Pipeline}
    
    C --> D[Academic Paraphraser]
    C --> E[Plagiarism Remover]
    
    D --> F[Technical Term Preservation]
    E --> G[Rule-Based Transformation]
    
    F --> H[Quality Assessment]
    G --> H
    
    H --> I[Similarity Analysis]
    H --> J[Readability Check]
    H --> K[Vocabulary Assessment]
    
    I --> L[Final Output]
    J --> L
    K --> L
    
    L --> M[Quality Score]
    L --> N[Processed Text]
    L --> O[Recommendations]

System Architecture

The Academic Paraphraser follows a modular architecture with three main processing pipelines:

  1. AI Paraphraser Pipeline (T5-based)

    • Input preprocessing and domain detection
    • Technical term extraction and preservation
    • Neural paraphrasing with multiple variants
    • Post-processing and quality filtering
  2. Plagiarism Remover Pipeline (Rule-based)

    • Lexical transformation using synonyms
    • Syntactic restructuring
    • Domain-specific term protection
    • Aggressiveness-based processing levels
  3. Quality Assessment Pipeline

    • Multi-dimensional similarity analysis
    • Readability and coherence scoring
    • Vocabulary diversity metrics
    • Comprehensive recommendations

🚀 Installation

Prerequisites

  • Python 3.8+
  • PyTorch
  • Transformers library
  • Streamlit
  • NLTK
  • SpaCy

Method 1: Local Installation

git clone https://huggingface.co/spaces/AI-Solutions-KK/Writing_Assistant
cd Writing_Assistant
pip install -r requirements.txt
streamlit run app.py

Method 2: Hugging Face Spaces Deployment

# Fork this repository
# Upload to your Hugging Face Space
# The app will automatically deploy with the configuration in the header

Method 3: Google Colab Setup

# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

# Clone repository
!git clone https://huggingface.co/spaces/AI-Solutions-KK/Writing_Assistant.git
%cd Writing_Assistant

# Install dependencies
!pip install -q transformers torch nltk spacy textstat sentence-transformers
!python -m spacy download en_core_web_sm

Required Packages (requirements.txt)

streamlit>=1.28.0
transformers>=4.21.0
torch>=1.12.0
nltk>=3.7
spacy>=3.4.0
textstat>=0.7.3
sentence-transformers>=2.2.2
numpy>=1.21.0
pandas>=1.3.0
scipy>=1.7.0
scikit-learn>=1.0.0

🚀 Quick Start

Web Application

Simply use the live demo above - no installation required!

Local Development

git clone https://huggingface.co/spaces/AI-Solutions-KK/Writing_Assistant
cd Writing_Assistant
pip install -r requirements.txt
streamlit run app.py

Programmatic Usage

# The app includes three main components:
# 1. Academic Paraphraser (T5-based)
# 2. Plagiarism Remover (Rule-based)
# 3. Quality Checker (Multi-metric assessment)

# All functionality is accessible through the web interface

📚 Usage Examples

Example 1: Mechanical Engineering

Input: "The stress analysis reveals significant strain concentrations at critical junction points, requiring enhanced material properties."

Output: "The stress examination demonstrates considerable strain accumulation at vital connection locations, necessitating improved material characteristics."

Example 2: Computer Science

Input: "The algorithm implementation utilizes efficient data structures to optimize computational complexity."

Output: Multiple variants with confidence scoring and technical term preservation.

Example 3: Complete Pipeline

Process text through plagiarism removal → AI paraphrasing → quality assessment for comprehensive results.

Example 4: Quality Assessment

# Comprehensive quality check
# Assumes QualityChecker (see API Documentation) is available from app.py and
# can be constructed without arguments; adjust to match the actual class.
quality_checker = QualityChecker()

original = "The electrical circuit demonstrates high impedance characteristics."
paraphrased = "This electrical network exhibits elevated impedance properties."

quality = quality_checker.comprehensive_quality_check(original, paraphrased)

print(f"Overall Score: {quality['overall_score']:.1f}%")
print(f"Similarity: {quality['detailed_scores']['similarity']['overall_similarity']:.3f}")
print(f"Recommendations: {quality['recommendations']}")

📖 How to Use

  1. Select Domain: Choose your academic field (Mechanical, Electrical, Computer Science, Civil, or General)
  2. Choose Processing Mode:
    • 🤖 AI Paraphraser: T5-based neural paraphrasing
    • 🛠️ Plagiarism Remover: Rule-based similarity reduction
    • 🔍 Quality Checker: Comprehensive assessment
    • 🚀 Complete Pipeline: All-in-one processing
  3. Enter Text: Input your academic text (50-500 words recommended)
  4. Process: Click process and review results
  5. Quality Check: Use built-in metrics and recommendations
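
Step 3 recommends 50-500 words. A small pre-check along these lines could flag inputs outside that range before processing; the function name and messages are hypothetical, and the app may handle length differently.

```python
def check_length(text: str, min_words: int = 50, max_words: int = 500) -> str:
    """Report whether a text falls inside the recommended word range."""
    count = len(text.split())  # whitespace-delimited word count
    if count < min_words:
        return f"{count} words: shorter than the recommended {min_words}"
    if count > max_words:
        return f"{count} words: longer than the recommended {max_words}"
    return f"{count} words: within the recommended range"


print(check_length("The algorithm uses efficient data structures."))
# 6 words: shorter than the recommended 50
```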

📊 API Documentation

AcademicParaphraser Class

paraphrase(text, domain="general", num_variants=3)

Generates multiple paraphrased versions of input text.

Parameters:

  • text (str): Input text to paraphrase
  • domain (str): Domain ('general', 'mechanical', 'electrical', 'computer_science', 'civil'); defaults to 'general'
  • num_variants (int): Number of variants to generate

Returns:

  • List of dictionaries containing paraphrased variants with metadata

extract_technical_terms(text, domain)

Identifies and extracts technical terms for preservation.

PlagiarismRemover Class

remove_plagiarism(text, domain="general", aggressiveness="medium")

Applies transformations to reduce text similarity.

Parameters:

  • text (str): Input text to process
  • domain (str): Engineering domain
  • aggressiveness (str): Processing intensity ('low', 'medium', 'high')

Returns:

  • Dictionary with processed text and transformation metadata
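
To make the aggressiveness parameter concrete, here is a minimal sketch of rule-based synonym replacement in which the level controls how many eligible words are swapped. The synonym pairs come from Example 1, except "stress" to "pressure", which is a deliberately bad swap added to show protection. The rate table, function name, and signature are assumptions and do not reproduce remove_plagiarism itself.

```python
import random
import re
from typing import Dict, List, Set

# Illustrative synonym table (hypothetical; a real table would be far larger).
SYNONYMS: Dict[str, List[str]] = {
    "reveals": ["demonstrates"],
    "significant": ["considerable"],
    "critical": ["vital"],
    "requiring": ["necessitating"],
    "enhanced": ["improved"],
    "stress": ["pressure"],  # wrong in an engineering context
}

# Assumed mapping from aggressiveness level to replacement probability.
REPLACEMENT_RATE = {"low": 0.3, "medium": 0.6, "high": 1.0}


def reduce_similarity(
    text: str, protected: Set[str], aggressiveness: str = "medium", seed: int = 0
) -> str:
    """Swap unprotected words for synonyms at the rate set by the level."""
    rate = REPLACEMENT_RATE[aggressiveness]
    rng = random.Random(seed)  # seeded so results are reproducible

    def swap(match: "re.Match[str]") -> str:
        word = match.group(0)
        key = word.lower()
        if key in protected or key not in SYNONYMS:
            return word  # protected term or no known synonym
        if rng.random() >= rate:
            return word  # skipped at this aggressiveness level
        return rng.choice(SYNONYMS[key])

    return re.sub(r"[A-Za-z]+", swap, text)


sentence = (
    "The stress analysis reveals significant strain concentrations at "
    "critical junction points, requiring enhanced material properties."
)
print(reduce_similarity(sentence, protected={"stress"}, aggressiveness="high"))
# The stress analysis demonstrates considerable strain concentrations at
# vital junction points, necessitating improved material properties.
```

At "high" every eligible word is replaced, while "low" and "medium" leave a random share untouched. Without "stress" in the protected set, the same call would turn "stress analysis" into "pressure analysis".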

QualityChecker Class

comprehensive_quality_check(original_text, paraphrased_text, domain="general")

Performs detailed quality assessment.

Returns:

  • Comprehensive quality metrics and recommendations
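
Readability is one of the metrics this assessment covers. A minimal sketch of the standard Flesch Reading Ease formula follows, using a rough vowel-group syllable estimate. Whether the app uses this exact metric is an assumption; the textstat package in requirements.txt offers a more careful `flesch_reading_ease` function.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


simple = "The cat sat on the mat."
technical = "The electrical circuit demonstrates high impedance characteristics."

# Higher scores mean easier text; dense technical prose scores much lower.
print(f"{flesch_reading_ease(simple):.1f}")
print(f"{flesch_reading_ease(technical):.1f}")
```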

⚡ Performance

Benchmarks

| Component | Processing Time | Accuracy |
|---|---|---|
| Plagiarism Remover | ~0.1s per 100 words | 85-90% |
| Quality Checker | ~0.05s per assessment | 90-95% |
| T5 Paraphraser | ~2-5s per variant | 80-90% |

Optimization Features

  • 🚀 Lightweight Models: T5-small for faster processing
  • ⚡ Efficient Algorithms: Optimized rule-based transformations
  • 💾 Memory Management: Minimal resource usage
  • 🔄 Batch Processing: Support for multiple texts

🗂️ Project Structure

academic-paraphraser/
│
├── app.py                         # Complete Streamlit application
├── requirements.txt               # Python dependencies
├── README.md                      # This documentation
└── LICENSE                        # MIT License

📊 Supported Domains

  • 🔧 Mechanical Engineering: Stress analysis, materials, thermodynamics, mechanics
  • ⚡ Electrical Engineering: Circuits, power systems, signal processing, electronics
  • 💻 Computer Science: Algorithms, data structures, machine learning, software engineering
  • 🏗️ Civil Engineering: Structures, foundations, construction, geotechnical
  • 📚 General Academic: Research methodology, analysis, theory, academic writing
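
The Domain Detection step in the architecture could be approximated by counting topic keywords drawn from the list above. This is a hedged sketch: the keyword table and `detect_domain` are hypothetical, and the README does not describe how the app actually detects domains.

```python
from typing import Dict, List

# Keyword stems taken from the supported-domain descriptions (illustrative only).
DOMAIN_KEYWORDS: Dict[str, List[str]] = {
    "mechanical": ["stress", "material", "thermodynamic", "mechanic"],
    "electrical": ["circuit", "power system", "signal", "electronic"],
    "computer_science": ["algorithm", "data structure", "machine learning", "software"],
    "civil": ["structure", "foundation", "construction", "geotechnical"],
}


def detect_domain(text: str) -> str:
    """Return the domain with the most keyword hits, or 'general' if none match."""
    lowered = text.lower()
    scores = {
        domain: sum(lowered.count(keyword) for keyword in keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


print(detect_domain("The electrical circuit demonstrates high impedance characteristics."))
# electrical
print(detect_domain("The algorithm implementation utilizes efficient data structures."))
# computer_science
```

Substring matching is crude: "data structures" also scores one hit for civil through "structure", and computer science wins only because it has two hits. A real detector would likely need word boundaries or weighting.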

🧪 Testing

The application includes built-in system testing:

  • ✅ Import Tests: Verify all components load correctly
  • ✅ Initialization Tests: Check model loading and setup
  • ✅ Functionality Tests: Validate core processing capabilities
  • ✅ Pipeline Tests: Test end-to-end processing
  • ✅ Error Handling: Verify graceful error management

Use the "🧪 Testing" tab in the web interface to run comprehensive tests.

Sample Test Results

🧪 COMPREHENSIVE TEST RESULTS
════════════════════════════════════════
✅ IMPORTS: 3/3 passed (100.0%)
✅ INITIALIZATION: 3/3 passed (100.0%)
✅ BASIC_FUNCTIONALITY: 3/3 passed (100.0%)
✅ PIPELINE: 4/4 passed (100.0%)
✅ ERROR_HANDLING: 4/4 passed (100.0%)
✅ PERFORMANCE: 1/1 passed (100.0%)

🎯 OVERALL RESULT: 18/18 tests passed (100.0%)
🎉 EXCELLENT! Ready for deployment

🤝 Contributing

We welcome contributions! This is a single-file application for easy deployment and maintenance.

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add comprehensive tests for new features
  • Update documentation as needed
  • Maintain backward compatibility

🐛 Known Issues & Limitations

  • T5 Model: Uses T5-small to suit cloud deployment; output may be less fluent than with larger T5 variants
  • Processing Speed: Neural paraphrasing takes roughly 2-5s per variant
  • Domain Coverage: Currently covers 5 engineering/academic domains
  • Language Support: English only at present

🛠️ Troubleshooting

Common Issues

  • Memory Errors: App uses T5-small model for optimal performance
  • Processing Timeout: Optimized processing times for cloud deployment
  • Import Errors: All dependencies included in requirements.txt

Streamlit Memory Errors

# Use smaller model variant for Streamlit deployment:
paraphraser = AcademicParaphraser(model_name="t5-small")

Hugging Face Spaces Timeout

# Add caching for model loading in app.py:
import streamlit as st

@st.cache_resource  # load the model once and reuse it across reruns
def load_models():
    return AcademicParaphraser(model_name="t5-small")

NLTK Data Missing

import nltk
nltk.download('punkt')
nltk.download('stopwords')

Performance Tips

  • Use 50-500 words for optimal results
  • Select appropriate domain for best accuracy
  • Try different aggressiveness levels for plagiarism removal
  • Use complete pipeline for comprehensive processing

📞 Support

  • 🌐 Live Demo: Use the Hugging Face Spaces interface above
  • 💡 Built-in Help: Check the "💡 Tips" tab in the application
  • 🧪 Testing: Use built-in system tests to verify functionality
  • 📚 Documentation: Complete usage guide included in the web interface
  • 📧 Contact: karantatyasokamble@gmail.com

📜 License

This project is licensed under the MIT License - free for academic and commercial use.

🏆 Acknowledgments

  • 🤗 Hugging Face for Transformers library and Spaces platform
  • Streamlit for the amazing web app framework
  • NLTK & SpaCy for natural language processing tools
  • PyTorch for deep learning framework
  • Engineering Community for domain-specific insights

📊 Citation

If you use this work in your research, please cite:

@software{engineering_academic_paraphraser,
  title={Engineering Academic Paraphraser: AI-Powered Writing Assistant for Technical Domains},
  author={AI-Solutions-KK (KARAN KAMBLE)},
  year={2025},
  url={https://huggingface.co/spaces/AI-Solutions-KK/Writing_Assistant},
  email={karantatyasokamble@gmail.com}
}

🌟 Star this repository if you find it helpful! 🌟

Made with ❤️ for the Engineering Academic Community
