---
title: Humanizer Ai
emoji: 🦀
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
---

# TYF Sadik -- AI Text Humanizer v3.0

This is a Gradio app that takes AI-generated text and rewrites it so it doesn't get flagged by detectors like GPTZero, Turnitin, Originality.ai, Copyleaks, or ZeroGPT. It runs two engines back to back -- a local NLP pipeline that does regex-based rewriting, and then optionally sends the result through Qwen 2.5 72B for a deeper pass.

Live version: https://huggingface.co/spaces/tyfsadik/ai-text-humanizer

No word limits. No login. Free.




## The big picture

You paste in some AI-written text. The app chews through it in two stages. First, a local Python engine runs five or six transformation passes -- swapping out words that scream "AI wrote this," breaking up long sentences, injecting contractions, adding conversational filler. If you've got the Qwen toggle on (and an HF token set), the already-transformed text then goes to a 72-billion parameter model that rewrites it again from scratch, following a strict ghostwriter prompt.

At the end you get the rewritten text, a score out of 100, and a report estimating how each detector would classify it.

There's a safety check too. If the rewriting changes the meaning too much (cosine similarity drops below 0.65), the app throws out the result and gives you back the original. Better to return something accurate than something that passed a detector but says the wrong thing.


## Architecture diagram

```mermaid
graph TB
    INPUT["Paste AI text\nup to 50k chars"] --> VALIDATE{"Input valid?"}
    VALIDATE -->|Too short or too long| ERROR["Error returned"]
    VALIDATE -->|OK| LOCAL["Local NLP engine\nruns 5-6 passes"]
    LOCAL --> CHECK{"Qwen API on?"}
    CHECK -->|No| CLEAN["Cleanup pass"]
    CHECK -->|Yes| QWEN["Qwen 2.5 72B\nfull rewrite via HF API"]
    QWEN --> CLEAN
    CLEAN --> SCORE["Calculate perplexity,\nburstiness, AI word count"]
    SCORE --> OUT["Return text + report + score"]
```

## How the pipeline runs

```mermaid
flowchart LR
    subgraph STAGE1["Local Engine"]
        A["Pattern\nelimination"] --> B["Sentence\nrestructure"]
        B --> C["Vocabulary\nswap"]
        C --> D["Contraction\ninjection"]
        D --> E["Human flow\ninjection"]
        E --> F["T5 paraphrase\nheavy mode only"]
    end

    subgraph STAGE2["Qwen Engine"]
        G["Ghostwriter\nprompt"] --> H["Qwen 2.5 72B\nInference API"]
    end

    subgraph SAFETY["Safety gate"]
        I{"Similarity\n>= 0.65?"}
        I -->|Yes| J["Keep it"]
        I -->|No| K["Throw it out,\nreturn original"]
    end

    RAW["AI text"] --> STAGE1 --> STAGE2 --> SAFETY --> DONE["Done"]
```

The similarity check uses all-MiniLM-L6-v2 sentence embeddings when sentence-transformers loaded successfully. If it didn't, the check falls back to Jaccard token overlap with a 0.7 floor. Either way, if the meaning drifted too far, the output gets scrapped.
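The Jaccard fallback is simple to sketch (a minimal illustration with whitespace tokenization, not necessarily the app's exact tokenizer):

```python
def jaccard_similarity(a: str, b: str) -> float:
    # Token-set overlap: |A ∩ B| / |A ∪ B|.
    set_a = set(a.lower().split())
    set_b = set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

def passes_safety_gate(original: str, rewrite: str, floor: float = 0.7) -> bool:
    # Below the floor, the rewrite is scrapped and the original returned.
    return jaccard_similarity(original, rewrite) >= floor

print(passes_safety_gate("the cat sat on the mat", "the cat sat on a mat"))  # → True
```

Only one word changed, so the overlap is 5/6 ≈ 0.83 and the rewrite survives the gate.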


## Local engine -- the six passes

When LocalEngine initializes, it tries to load three models. All of them are optional -- if any fail, the engine keeps going without them.

```mermaid
sequenceDiagram
    participant E as LocalEngine
    participant S as SentenceTransformer
    participant T as T5-small
    participant P as spaCy

    E->>S: try load all-MiniLM-L6-v2
    S-->>E: loaded (or not)
    E->>T: try load T5 tokenizer + model
    T-->>E: loaded (or not)
    E->>P: try load en_core_web_sm
    P-->>E: loaded (or not)
```

Now here's what each pass actually does.

### Pass 1 -- Pattern elimination

There's a dictionary of about 60 regex patterns mapped to replacement lists. Stuff like `delve into` getting swapped for "explore" or "dig into" or "look into." The word `leverage` might become "use" or "tap into." Each match has a random chance of firing based on intensity -- at standard, that's 88%.

A few from the list:

| Flagged phrase | Gets replaced with (random pick) |
| --- | --- |
| delve into | explore, look into, dig into, examine |
| leverage | use, apply, tap into, make use of |
| furthermore | also, plus, on top of that, and |
| cutting-edge | advanced, modern, latest, current |
| in order to | to, so we can, aiming to |
| myriad | many, countless, various, a range of |

There are 60+ of these. Matches get processed in reverse so character positions don't shift mid-replacement.
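A minimal sketch of that reverse-order replacement idea (the two-entry pattern dict here is a tiny stand-in for the app's 60+ entries, and `fire_prob` is illustrative):

```python
import random
import re

# Tiny stand-in for the app's ~60-entry pattern table.
PATTERNS = {
    r"\bdelve into\b": ["explore", "look into", "dig into"],
    r"\bleverage\b": ["use", "apply", "tap into"],
}

def eliminate_patterns(text: str, fire_prob: float = 0.88) -> str:
    for pattern, choices in PATTERNS.items():
        matches = list(re.finditer(pattern, text, flags=re.IGNORECASE))
        # Walk matches right-to-left so earlier character offsets stay valid.
        for m in reversed(matches):
            if random.random() < fire_prob:
                text = text[:m.start()] + random.choice(choices) + text[m.end():]
    return text

random.seed(0)
print(eliminate_patterns("We delve into data and leverage tools."))
```

Replacing from the last match backwards is the key trick: each splice changes the string length, so working left-to-right would invalidate the remaining match positions.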

### Pass 2 -- Sentence restructure

Looks for compound sentences joined by `, and`, `, but`, `, so`, or `, yet`. If a sentence has more than 15 words and contains one of those conjunctions, it splits at the comma and turns it into two sentences. The second one gets a connector like "Also," or "Plus," or "On top of that," stuck on front.

```mermaid
flowchart LR
    A["The company grew fast, and\nit moved into new markets."]
    B["The company grew fast.\nPlus, it moved into new markets."]
    A -->|split at conjunction| B
```

Separately, sentences over 25 words get chopped in half at a random midpoint. Sentences under 5 words get merged with whatever comes next (if that sentence is short enough) using "and."
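The split rule can be sketched like this (the connector list is abbreviated, and the real pass applies it probabilistically per the intensity table):

```python
import random
import re

CONNECTORS = ["Also,", "Plus,", "On top of that,"]  # abbreviated list
CONJ = re.compile(r",\s+(and|but|so|yet)\s+", flags=re.IGNORECASE)

def split_compound(sentence: str) -> str:
    # Only long compound sentences (> 15 words) get split.
    if len(sentence.split()) <= 15:
        return sentence
    m = CONJ.search(sentence)
    if not m:
        return sentence
    first = sentence[:m.start()].rstrip() + "."
    rest = sentence[m.end():].strip()
    # The second half gets a connector stuck on front.
    return first + " " + random.choice(CONNECTORS) + " " + rest

random.seed(0)
s = ("The company grew fast over the last two years, and it moved "
     "into three new markets across Europe and Asia.")
print(split_compound(s))
```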

### Pass 3 -- Vocabulary swap

This only touches words that show up more than once, are longer than 3 characters, and aren't stopwords. The idea is to reduce repetition without changing meaning too much.

```mermaid
flowchart TD
    W["Word"] --> S{"Stopword\nor short?"}
    S -->|Yes| KEEP["Leave it"]
    S -->|No| F{"Appears\nmore than once?"}
    F -->|No| KEEP
    F -->|Yes| G{"In the curated\nsynonym dict?"}
    G -->|Yes| CURATED["Swap with\ncurated pick"]
    G -->|No| WN{"WordNet\nhas synonyms?"}
    WN -->|Yes| WNSWAP["Swap with\nWordNet pick"]
    WN -->|No| KEEP
```

The curated dictionary covers eight word families -- things like "analyze" mapping to [examine, study, investigate, explore, review]. WordNet is the fallback, pulling from the first two synsets and choosing randomly from up to four lemmas.
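A curated-dictionary-only sketch of that decision flow (the stopword set and synonym family below are tiny stand-ins, and the WordNet fallback is omitted):

```python
import random
from collections import Counter

STOPWORDS = {"the", "a", "and", "to", "we", "of"}  # tiny stand-in set
CURATED = {"analyze": ["examine", "study", "investigate", "explore", "review"]}

def swap_vocabulary(text: str, swap_prob: float = 0.28) -> str:
    words = text.split()
    counts = Counter(w.lower().strip(".,") for w in words)
    out = []
    for w in words:
        key = w.lower().strip(".,")
        # Only repeated, non-stopword, >3-char words are candidates.
        if (counts[key] > 1 and key not in STOPWORDS and len(key) > 3
                and key in CURATED and random.random() < swap_prob):
            out.append(random.choice(CURATED[key]))
        else:
            out.append(w)
    return " ".join(out)

random.seed(0)
print(swap_vocabulary("We analyze data, then we analyze results.", swap_prob=1.0))
```

Both occurrences of "analyze" qualify (repeated, long enough, not a stopword), so at probability 1.0 both get swapped for a curated synonym.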

### Pass 4 -- Contraction injection

34 regex rules. `it is` becomes `it's`, `cannot` becomes `can't`, `should have` becomes `should've`, and so on. Each one fires independently based on probability. At standard intensity that's 75%; at heavy it's 95%.
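Three of those 34 rules, sketched as probabilistic regex substitutions (capitalization handling is omitted for brevity; the real rules may differ):

```python
import random
import re

# Three of the app's 34 contraction rules, as a stand-in.
CONTRACTIONS = [
    (re.compile(r"\bit is\b", re.IGNORECASE), "it's"),
    (re.compile(r"\bcannot\b", re.IGNORECASE), "can't"),
    (re.compile(r"\bshould have\b", re.IGNORECASE), "should've"),
]

def inject_contractions(text: str, fire_prob: float = 0.75) -> str:
    for pattern, contraction in CONTRACTIONS:
        # Each match fires independently with probability fire_prob.
        text = pattern.sub(
            lambda m: contraction if random.random() < fire_prob else m.group(0),
            text,
        )
    return text

random.seed(0)
print(inject_contractions("It is done. We cannot stop. You should have known."))
```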

### Pass 5 -- Human flow

This is where it gets a bit messy on purpose. Three things happen:

First, some sentences get a conversational opener stuck on front -- "Actually,", "Honestly,", "The thing is,", "Well,", stuff like that. There are 22 of these. Only kicks in for sentences with more than 6 words.

Second, occasionally a transition phrase gets dropped between sentences. Things like "And here's the thing:" or "Think about it this way:" -- 9 options total.

Third, filler words get inserted mid-sentence sometimes. "You know", "basically", "I mean" -- shoved in at the halfway point with commas around them. This fires rarely, even at heavy intensity it's under 6%.
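The opener rule from the first step can be sketched like so (four of the 22 openers shown; lowercasing the original first word is an assumption about how the join reads):

```python
import random

OPENERS = ["Actually,", "Honestly,", "The thing is,", "Well,"]  # 4 of 22

def add_opener(sentence: str, fire_prob: float = 0.22) -> str:
    # Only sentences longer than 6 words get a conversational opener.
    if len(sentence.split()) > 6 and random.random() < fire_prob:
        return random.choice(OPENERS) + " " + sentence[0].lower() + sentence[1:]
    return sentence

random.seed(0)
print(add_opener("The results were far better than anyone expected.", fire_prob=1.0))
```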

### Pass 6 -- T5 paraphrase (heavy mode only)

This one only runs when intensity is set to heavy. It takes sentences longer than 12 words, feeds them to T5-small with `paraphrase:` as the prefix, and generates an alternative. But it doesn't blindly accept the output -- it checks semantic similarity first and only uses the paraphrase if it scores above 0.7.

```mermaid
flowchart TD
    S["Sentence longer\nthan 12 words"] --> R{"Random 35%\ngate"}
    R -->|Skip| ORIG["Keep original"]
    R -->|Fire| T5["T5-small generates\nparaphrase candidate"]
    T5 --> C{"Similarity\nto original > 0.7?"}
    C -->|Yes| USE["Use the paraphrase"]
    C -->|No| ORIG
```

Generation params: temperature 0.85, top_p 0.92, repetition_penalty 1.15, max length 256.
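The gate-then-check logic can be sketched without loading the model at all. Here `paraphrase` and `similarity` are stand-in hooks; in the app those are the T5-small generate call and the embedding similarity check:

```python
import random
from typing import Callable

def t5_pass(
    sentence: str,
    paraphrase: Callable[[str], str],         # stand-in for the T5-small call
    similarity: Callable[[str, str], float],  # stand-in for embedding cosine
    fire_prob: float = 0.35,
) -> str:
    # Short sentences and ~65% of long ones are skipped outright.
    if len(sentence.split()) <= 12 or random.random() >= fire_prob:
        return sentence
    candidate = paraphrase(sentence)
    # Only accept the rewrite if meaning stays close to the original.
    return candidate if similarity(sentence, candidate) > 0.7 else sentence

random.seed(0)
sent = ("The committee reviewed every proposal carefully before it released "
        "its final recommendations to the public last week.")
out = t5_pass(sent,
              paraphrase=lambda s: s.replace("carefully", "with care"),
              similarity=lambda a, b: 0.9,
              fire_prob=1.0)
print(out)
```

Swapping the lambdas for the real model calls doesn't change the control flow, which is the point: the similarity floor is what keeps a bad paraphrase from shipping.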


## Qwen API engine

The QwenEngine class is simpler. It wraps Hugging Face's InferenceClient, handles rate limiting (minimum 0.5s between calls), and sends the text off with a carefully written system prompt.

That system prompt tells the model to act like a professional ghostwriter and follow eight rules: preserve facts exactly, never use 21 specific AI-flagged words, use contractions, mix short and long sentences, vary openers, use em-dashes, occasionally start with And/But/So/Look/Honestly, and output nothing but the rewritten text.

The user prompt adds the tone. Five options:

| Tone | What it tells the model |
| --- | --- |
| Casual | write like a knowledgeable friend |
| Professional | warm but expert, like a clear blog post |
| Academic | precise but not stiff |
| Creative | natural rhythm, varied structure |
| Persuasive | confident, direct, clear-headed |

API settings: model is Qwen/Qwen2.5-72B-Instruct, max 4096 tokens, top_p 0.92, temperature defaults to 0.88 but the user can adjust it with a slider.
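A no-network sketch of the two mechanical pieces -- prompt assembly and the 0.5s rate limit. The system and tone strings below are paraphrased stand-ins, not the app's actual prompts, and the real request goes through huggingface_hub's InferenceClient:

```python
import time

TONE_PROMPTS = {  # paraphrased stand-ins for the five tone instructions
    "casual": "Write like a knowledgeable friend.",
    "professional": "Warm but expert, like a clear blog post.",
}
MIN_INTERVAL = 0.5  # minimum seconds between API calls
_last_call = 0.0

def build_messages(text: str, tone: str) -> list[dict]:
    # Stand-in ghostwriter system prompt; the real one carries eight rules.
    system = "You are a professional ghostwriter. Preserve facts exactly."
    user = f"{TONE_PROMPTS[tone]}\n\nRewrite this:\n{text}"
    return [{"role": "system", "content": system},
            {"role": "user", "content": user}]

def rate_limit() -> None:
    # Sleep just long enough to keep 0.5s between consecutive calls.
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()

msgs = build_messages("AI wrote this paragraph.", "casual")
print(msgs[1]["content"])
```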


## How scoring works

After processing, the app runs three calculations on the output text and combines them into a single human score.

```mermaid
flowchart TD
    T["Output text"] --> P["Perplexity\nunigram entropy -> 2^H"]
    T --> B["Burstiness\nsentence length variance / mean"]
    T --> A["AI word count\n35 flagged terms"]
    P --> SC["Score = min of 100,\nperplexity x 0.4\n+ burstiness x 30\n+ (100 - ai_words x 2) x 0.3"]
    B --> SC
    A --> SC
```

Perplexity -- calculated from Shannon entropy over word frequencies. The raw value gets nudged: if it lands below 20, a random 20--30 is added. If it goes past 100, it gets clamped to somewhere between 60 and 80. Target is 40+.

Burstiness -- variance of sentence word counts divided by the mean. If it falls below 0.5, it gets bumped to a random value between 0.7 and 1.5. Target is 0.5+.

AI indicators -- just a count of how many times any of 35 known AI words appear in the text. Things like delve, leverage, robust, seamless, holistic, groundbreaking, paradigm, synergy, etc.

The formula: `min(100, perplexity * 0.4 + burstiness * 30 + (100 - ai_count * 2) * 0.3)`
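Under those definitions, the scoring can be sketched in a few lines. This omits the random nudging of out-of-range perplexity and burstiness values described above, truncates the flagged-word list, and splits sentences naively on periods:

```python
import math
from collections import Counter

AI_WORDS = {"delve", "leverage", "robust", "seamless"}  # 4 of the 35 flagged terms

def human_score(text: str) -> float:
    words = [w.lower().strip(".,!?") for w in text.split()]
    # Perplexity: 2 ** Shannon entropy of the unigram distribution.
    counts = Counter(words)
    total = len(words)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    perplexity = 2 ** entropy
    # Burstiness: variance of sentence word counts divided by their mean.
    lengths = [len(s.split()) for s in text.split(".") if s.strip()]
    mean = sum(lengths) / len(lengths)
    variance = sum((n - mean) ** 2 for n in lengths) / len(lengths)
    burstiness = variance / mean
    ai_count = sum(1 for w in words if w in AI_WORDS)
    return min(100, perplexity * 0.4 + burstiness * 30 + (100 - ai_count * 2) * 0.3)

print(human_score("Short one. Then a much longer sentence follows here, "
                  "with several extra words padding it out."))
```

Note how mixing a very short sentence with a long one drives burstiness up, which is exactly what the restructure pass is fishing for.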

What the scores mean:

| Range | Verdict | What detectors probably say |
| --- | --- | --- |
| 80--100 | Bypass ready | Human across the board |
| 60--79 | Probably fine | Likely human, low risk |
| Below 60 | Needs more work | Might get flagged |

## Intensity and tone settings

Three intensity levels control how hard each pass hits. Higher means more aggressive changes.

| Pass | Light | Standard | Heavy |
| --- | --- | --- | --- |
| Pattern elimination | 70% | 88% | 97% |
| Restructure | 20% | 40% | 60% |
| Sentence splitting/merging | 30% | 50% | 70% |
| Vocabulary swap | 15% | 28% | 45% |
| Contractions | 50% | 75% | 95% |
| Human flow | 12% | 22% | 38% |
| T5 neural paraphrase | off | off | on |

Light mode barely touches the text. Heavy mode rewrites almost everything and brings in the T5 model too. Standard is the default and works for most cases.

Tone only affects the Qwen API pass. If Qwen is turned off, tone does nothing.


## Deps and fallbacks

Not everything needs to be installed. The app wraps every optional import in try/except and sets a boolean flag. If something's missing, it works around it.

```mermaid
flowchart TD
    subgraph NEED["Must have"]
        G["gradio"]
        N["nltk"]
        NP["numpy"]
        HF["huggingface_hub"]
    end

    subgraph NICE["Nice to have"]
        SP["spaCy"]
        TR["transformers + torch"]
        SE["sentence-transformers"]
        SK["scikit-learn"]
    end

    SE -->|not installed| F1["Uses Jaccard overlap\ninstead of cosine sim"]
    TR -->|not installed| F2["T5 pass gets skipped"]
    SP -->|not installed| F3["NER processing skipped"]
```

On startup, NLTK data downloads to /tmp/nltk_data. It grabs punkt, punkt_tab, averaged_perceptron_tagger, stopwords, wordnet, and omw-1.4. If spaCy is available it also tries to load en_core_web_sm and will auto-download it if missing.
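The guarded-import pattern behind those fallbacks looks roughly like this (flag and function names are illustrative, not the app's actual identifiers):

```python
import os

# Point NLTK at a writable cache before anything downloads data.
os.environ.setdefault("NLTK_DATA", "/tmp/nltk_data")

# Optional dependency: set a flag instead of crashing at import time.
try:
    from sentence_transformers import SentenceTransformer  # noqa: F401
    HAS_EMBEDDINGS = True
except ImportError:
    HAS_EMBEDDINGS = False

def similarity_backend() -> str:
    # Cosine similarity over embeddings when available, Jaccard otherwise.
    return "cosine" if HAS_EMBEDDINGS else "jaccard"

print(similarity_backend())
```

The same shape repeats for spaCy, transformers, and scikit-learn: one try/except per optional import, one boolean flag, and every downstream feature checks the flag before using the library.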


## Config reference

Everything lives in a frozen dataclass:

```python
@dataclass(frozen=True)
class Config:
    MAX_INPUT_LENGTH:   int   = 50_000
    MAX_OUTPUT_TOKENS:  int   = 4_096
    API_TIMEOUT:        float = 60.0
    RATE_LIMIT_SECONDS: float = 0.5
    DEFAULT_MODEL:      str   = "Qwen/Qwen2.5-72B-Instruct"
    VERSION:            str   = "v3.0"
```

Frozen means you can't reassign these at runtime -- any attempt raises FrozenInstanceError. If you need different values, edit the source.
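A quick demonstration of what frozen buys you (standard-library behavior, trimmed to one field):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    MAX_INPUT_LENGTH: int = 50_000

cfg = Config()
try:
    cfg.MAX_INPUT_LENGTH = 10  # frozen: attribute assignment is blocked
except FrozenInstanceError:
    print("Config is immutable at runtime")  # → this branch runs
```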


## The UI

```mermaid
flowchart TB
    subgraph TOP["Header area"]
        T["Title, version, terminal-style banner"]
        B["Author bio and links"]
    end

    subgraph MID["Main area -- two columns"]
        subgraph L["Left side"]
            I["Text input -- 15 lines"]
            D1["Intensity dropdown"]
            D2["Tone dropdown"]
            CB["Qwen toggle checkbox"]
            SL["Creativity slider -- inside accordion"]
            BT["Execute button"]
            EX["Example inputs -- inside accordion"]
        end
        subgraph R["Right side"]
            O["Output text -- 15 lines, read only"]
            CL["Clear button"]
            RP["Bypass report -- 14 lines"]
            ST["HTML stats panel -- color coded"]
        end
    end

    subgraph BOT["Footer"]
        BG["Pipeline stage badges"]
        CR["Credits"]
    end

    TOP --> MID --> BOT
```

The whole thing has a green-on-black terminal look. Two Google fonts -- Share Tech Mono for body text, Orbitron for headings. Custom CSS overrides all the Gradio defaults. The scrollbar, the buttons, the inputs -- everything is themed.


## Running it yourself

Easiest option: just go to the Hugging Face Space. Nothing to install.

If you want it local:

```bash
git clone https://huggingface.co/spaces/tyfsadik/ai-text-humanizer
cd ai-text-humanizer

pip install gradio nltk numpy torch huggingface_hub

# these are optional but make it work better
pip install spacy transformers sentence-transformers scikit-learn
python -m spacy download en_core_web_sm

# needed for the Qwen API pass
export HF_TOKEN="your_token"

python app.py
```

Runs on http://0.0.0.0:7860.

Three env vars matter:

| Variable | What it does |
| --- | --- |
| HF_TOKEN | Hugging Face token, needed if you want the Qwen pass |
| NLTK_DATA | Set automatically to /tmp/nltk_data |
| TOKENIZERS_PARALLELISM | Set to false to kill threading warnings |

## Who made this

MD. Taki Yasir Faraji Sadik, goes by TYF Sadik. Based in North York, Ontario. Studies Systems Networking and Cybersecurity at University of Toronto. Works as a SOC Analyst / Network Analyst with cloud and infrastructure experience.

Certs: CompTIA A+, Azure Fundamentals, AWS Cloud Practitioner, ISC2 CC.

Links: tyfsadik.org -- github.com/TYFSADIK


Built from reading app.py v3.0 line by line. Open source, no limits, educational use.