Spaces: can-org/Testing-AI-Contain

Pujan Neupane committed on Apr 25, 2025

Commit 228034e (unverified) · 2 parents: 9cc53a0, f70e4b3

Merge pull request #2 from cyberalertnepal/Pujan

Files changed (8)

Dockerfile +34 -0
HuggingFace/main.py → MODEL/app.py +1 -1
{HuggingFace → MODEL}/readme.md +0 -0
MODEL/requirements.txt +1 -0
README.md +13 -34
app.py +35 -51
requirements.txt +6 -209
test.sh +0 -1

Dockerfile ADDED

	@@ -0,0 +1,34 @@

+# Use the latest slim Python 3.11 image
+FROM python:3.11-slim
+# Set environment variables
+ENV HOME=/home/user \
+    PATH=/home/user/.local/bin:$PATH \
+    PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    git \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+# Create a non-root user for safety
+RUN useradd -ms /bin/bash user
+USER user
+WORKDIR $HOME/app
+# Copy app source code
+COPY --chown=user . .
+# Install Python dependencies
+RUN pip install --no-cache-dir --upgrade pip \
+ && pip install --no-cache-dir -r requirements.txt
+# Expose port
+EXPOSE 7860
+# Start the FastAPI app using uvicorn
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
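
Note that the image serves the app on port 7860 (both `EXPOSE` and the `CMD` line), while the README examples in this commit use port 8000 or the hosted Space URL. A minimal sketch of a client request against a locally run container, assuming the container is started with port 7860 mapped to the host; `build_analyze_request` and `LOCAL_BASE_URL` are hypothetical names used only for illustration, and nothing is sent until `urllib.request.urlopen` is called on the result.

```python
import json
import urllib.request

# Assumption: the container was started with `-p 7860:7860`.
LOCAL_BASE_URL = "http://127.0.0.1:7860"


def build_analyze_request(text: str, base_url: str = LOCAL_BASE_URL):
    """Build (but do not send) a POST request for the /analyze route."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send it, pass the request to `urllib.request.urlopen` and decode the JSON response.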

HuggingFace/main.py → MODEL/app.py RENAMED

@@ -7,7 +7,7 @@ def download_repo():
     if not hf_token:
         raise ValueError("HF_TOKEN not found in environment variables.")
-    repo_id = "Pujan-Dev/test"
     local_dir = "../Ai-Text-Detector/"
     repo = Repository(local_dir, clone_from=repo_id, token=hf_token)

     if not hf_token:
         raise ValueError("HF_TOKEN not found in environment variables.")
+    repo_id = "can-org/AIModel"
     local_dir = "../Ai-Text-Detector/"
     repo = Repository(local_dir, clone_from=repo_id, token=hf_token)
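
The hunk above only changes `repo_id`; the download itself still goes through `Repository`, which clones the repo with git. As a hedged alternative (not what this commit does), `huggingface_hub` also provides `snapshot_download`, which fetches the files over HTTP. In the sketch below, `download_repo` reuses the name from the hunk header, `resolve_token` is a hypothetical helper, and the default arguments are copied from the diff.

```python
import os


def resolve_token(env=os.environ):
    """Return HF_TOKEN from the environment, or raise as the original code does."""
    hf_token = env.get("HF_TOKEN")
    if not hf_token:
        raise ValueError("HF_TOKEN not found in environment variables.")
    return hf_token


def download_repo(repo_id="can-org/AIModel", local_dir="../Ai-Text-Detector/"):
    """Download the model repo without git (assumes huggingface_hub is installed)."""
    # Imported lazily so the token check can be exercised without the library.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        token=resolve_token(),
    )
```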

{HuggingFace → MODEL}/readme.md RENAMED

File without changes

MODEL/requirements.txt ADDED

	@@ -0,0 +1 @@


+huggingface_hub

README.md CHANGED

@@ -39,14 +39,6 @@ This command installs all the dependencies listed in the `requirements.txt` file
 ### **Code Overview**
-```python
-executor = ThreadPoolExecutor(max_workers=2)
-```
-- **`ThreadPoolExecutor(max_workers=2)`** limits the number of concurrent threads (tasks) per worker process to 2 for text classification. This helps control resource usage and prevent overloading the server.
----
 ### **Running and Load Balancing:**
 To run the app in production with load balancing:
@@ -55,25 +47,8 @@ To run the app in production with load balancing:
 uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4
 ```
-This command launches the FastAPI app with **4 worker processes**, allowing it to handle multiple requests concurrently.
-### **Concurrency Explained:**
-1. **`ThreadPoolExecutor(max_workers=20)`**
-   - Controls the **number of threads** within a **single worker** process.
-   - Allows up to 20 tasks (text classification requests) to be handled simultaneously per worker, improving responsiveness for I/O-bound tasks.
-2. **`--workers 4` in Uvicorn**
-   - Spawns **4 independent worker processes** to handle incoming HTTP requests.
-   - Each worker can independently handle multiple tasks, increasing the app's ability to process concurrent requests in parallel.
-### **How They Relate:**
-- **Uvicorn’s `--workers`** defines how many worker processes the server will run.
-- **`ThreadPoolExecutor`** limits how many tasks (threads) each worker can process concurrently.
-For example, with **4 workers** and **20 threads per worker**, the server can handle **80 tasks concurrently**. This provides scalable and efficient processing, balancing the load across multiple workers and threads.
 ### **Endpoints**
@@ -116,13 +91,13 @@ uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4
 You can test the FastAPI endpoint using `curl` like this:
 ```bash
-curl -X POST http://127.0.0.1:8000/analyze \
-  -H "Authorization: Bearer HelloThere" \
   -H "Content-Type: application/json" \
   -d '{"text": "This is a sample sentence for analysis."}'
 ```
-- The `-H "Authorization: Bearer HelloThere"` part is used to simulate the **handshake**.
 - FastAPI checks this token against the one loaded from the `.env` file.
 - If the token matches, the request is accepted and processed.
 - Otherwise, it responds with a `403 Unauthorized` error.
@@ -131,8 +106,8 @@ curl -X POST http://127.0.0.1:8000/analyze \
 ### **API Documentation**
-- **Swagger UI:** `http://127.0.0.1:8000/docs` -> `/docs`
-- **ReDoc:** `http://127.0.0.1:8000/redoc` -> `/redoc`
 ### **🔐 Handshake Mechanism**
@@ -180,8 +155,8 @@ nestjs-fastapi-bridge/
 Create a `.env` file at the root with the following:
 ```environment
-  FASTAPI_BASE_URL=http://localhost:8000
-  SECRET_TOKEN="HelloThere"
 ```
 #### 2. `fastapi.service.ts`
@@ -281,9 +256,13 @@ Run the server of flask and nest.js:
 Make sure your FastAPI service is running at `http://localhost:8000`.
 ### Test with CURL
 ```bash
 curl -X POST http://localhost:3000/analyze-text \
   -H 'Content-Type: application/json' \
   -d '{"text": "This is a test input"}'
 ```

 ### **Code Overview**
 ### **Running and Load Balancing:**
 To run the app in production with load balancing:
 uvicorn app:app --host 0.0.0.0 --port 8000 --workers 4
 ```
+This command launches the FastAPI app.
 ### **Endpoints**
 You can test the FastAPI endpoint using `curl` like this:
 ```bash
+curl -X POST https://can-org-canspace.hf.space/analyze \
+  -H "Authorization: Bearer SECRET_CODE" \
   -H "Content-Type: application/json" \
   -d '{"text": "This is a sample sentence for analysis."}'
 ```
+- The `-H "Authorization: Bearer SECRET_CODE"` part is used to simulate the **handshake**.
 - FastAPI checks this token against the one loaded from the `.env` file.
 - If the token matches, the request is accepted and processed.
 - Otherwise, it responds with a `403 Unauthorized` error.
 ### **API Documentation**
+- **Swagger UI:** `https://can-org-canspace.hf.space/docs` -> `/docs`
+- **ReDoc:** `https://can-org-canspace.hf.space/redoc` -> `/redoc`
 ### **🔐 Handshake Mechanism**
 Create a `.env` file at the root with the following:
 ```environment
+  FASTAPI_BASE_URL=https://can-org-canspace.hf.space/
+  SECRET_TOKEN="SECRET_CODE_TOKEN"
 ```
 #### 2. `fastapi.service.ts`
 Make sure your FastAPI service is running at `http://localhost:8000`.
 ### Test with CURL
+http://localhost:3000/-> Server of nest.js
 ```bash
 curl -X POST http://localhost:3000/analyze-text \
   -H 'Content-Type: application/json' \
   -d '{"text": "This is a test input"}'
 ```
+### MODEL
+- You can download the model from the `/MODEL/app.py` file.

app.py CHANGED

@@ -1,44 +1,40 @@
 import torch
-from transformers import GPT2LMHeadModel, GPT2TokenizerFast
-from fastapi import FastAPI, HTTPException, Header
 from pydantic import BaseModel
-import asyncio
-from concurrent.futures import ThreadPoolExecutor
 from contextlib import asynccontextmanager
-from dotenv import dotenv_values
-# FastAPI instance
 app = FastAPI()
-executor = ThreadPoolExecutor(max_workers=20)
-# Load .env file
-env = dotenv_values(".env")
-EXPECTED_TOKEN = env.get("SECRET_TOKEN")
-# Global variables for model and tokenizer
 model, tokenizer = None, None
-# Function to verify token
-def verify_token(auth: str):
-    if auth != f"Bearer {EXPECTED_TOKEN}":
-        raise HTTPException(status_code=403, detail="Unauthorized")
 # Function to load model and tokenizer
 def load_model():
     model_path = "./Ai-Text-Detector/model"
     weights_path = "./Ai-Text-Detector/model_weights.pth"
-    tokenizer = GPT2TokenizerFast.from_pretrained(model_path)
-    model = GPT2LMHeadModel.from_pretrained("gpt2")
-    model.load_state_dict(torch.load(weights_path, map_location=torch.device("cpu")))
-    model.eval()  # Set the model to evaluation mode
     return model, tokenizer
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     global model, tokenizer
@@ -46,20 +42,20 @@ async def lifespan(app: FastAPI):
     yield
-# Attach the lifespan context manager
 app = FastAPI(lifespan=lifespan)
-# Request body for input data
 class TextInput(BaseModel):
     text: str
-# Sync function to classify text
-def classify_text_sync(sentence: str):
     inputs = tokenizer(sentence, return_tensors="pt", truncation=True, padding=True)
     input_ids = inputs["input_ids"]
     attention_mask = inputs["attention_mask"]
@@ -70,35 +66,26 @@ def classify_text_sync(sentence: str):
         perplexity = torch.exp(loss).item()
     if perplexity < 60:
-        result = "AI-generated*"
     elif perplexity < 80:
-        result = "Probably AI-generated*"
     else:
-        result = "Human-written*"
     return result, perplexity
-# Async wrapper for text classification
-async def classify_text(sentence: str):
-    loop = asyncio.get_event_loop()
-    return await loop.run_in_executor(executor, classify_text_sync, sentence)
 # POST route to analyze text
 @app.post("/analyze")
-async def analyze_text(data: TextInput, authorization: str = Header(default="")):
-    verify_token(authorization)  # Token verification
     user_input = data.text.strip()
     if not user_input:
         raise HTTPException(status_code=400, detail="Text cannot be empty")
-    result, perplexity = await classify_text(user_input)
     return {
         "result": result,
@@ -119,11 +106,8 @@ async def health_check():
 @app.get("/")
 def index():
-    return {"message": "It's an API"}
-# Start the app (run with uvicorn)
-if __name__ == "__main__":
-    import uvicorn
-    uvicorn.run("main:app", host="0.0.0.0", port=8000, workers=4)

 import torch
+from transformers import GPT2LMHeadModel, GPT2TokenizerFast, GPT2Config
+from fastapi import FastAPI, HTTPException
 from pydantic import BaseModel
 from contextlib import asynccontextmanager
+import asyncio
+# FastAPI app instance
 app = FastAPI()
+# Global model and tokenizer variables
 model, tokenizer = None, None
 # Function to load model and tokenizer
 def load_model():
     model_path = "./Ai-Text-Detector/model"
     weights_path = "./Ai-Text-Detector/model_weights.pth"
+    try:
+        tokenizer = GPT2TokenizerFast.from_pretrained(model_path)
+        config = GPT2Config.from_pretrained(model_path)
+        model = GPT2LMHeadModel(config)
+        model.load_state_dict(
+            torch.load(weights_path, map_location=torch.device("cpu"))
+        )
+        model.eval()  # Set model to evaluation mode
+    except Exception as e:
+        raise RuntimeError(f"Error loading model: {str(e)}")
     return model, tokenizer
+# Load model on app startup
 @asynccontextmanager
 async def lifespan(app: FastAPI):
     global model, tokenizer
     yield
+# Attach startup loader
 app = FastAPI(lifespan=lifespan)
+# Input schema
 class TextInput(BaseModel):
     text: str
+# Sync text classification
+def classify_text(sentence: str):
     inputs = tokenizer(sentence, return_tensors="pt", truncation=True, padding=True)
     input_ids = inputs["input_ids"]
     attention_mask = inputs["attention_mask"]
         perplexity = torch.exp(loss).item()
     if perplexity < 60:
+        result = "AI-generated"
     elif perplexity < 80:
+        result = "Probably AI-generated"
     else:
+        result = "Human-written"
     return result, perplexity
 # POST route to analyze text
 @app.post("/analyze")
+async def analyze_text(data: TextInput):
     user_input = data.text.strip()
     if not user_input:
         raise HTTPException(status_code=400, detail="Text cannot be empty")
+    # Run classification asynchronously to prevent blocking
+    result, perplexity = await asyncio.to_thread(classify_text, user_input)
     return {
         "result": result,
 @app.get("/")
 def index():
+    return {
+        "message": "FastAPI API is up.",
+        "try": "/docs to test the API.",
+        "status": "OK",
+    }
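
The new `app.py` replaces the `ThreadPoolExecutor` wrapper with `asyncio.to_thread` and keeps the perplexity thresholds of 60 and 80. The sketch below isolates those two pieces so they can be run without `torch` or the model. `label_for_perplexity`, `classify_from_loss`, and `analyze` are hypothetical names, and the loss value is supplied directly instead of being computed by GPT-2.

```python
import asyncio
import math


def label_for_perplexity(perplexity: float) -> str:
    """Apply the thresholds used by classify_text in the diff (60 and 80)."""
    if perplexity < 60:
        return "AI-generated"
    if perplexity < 80:
        return "Probably AI-generated"
    return "Human-written"


def classify_from_loss(loss: float):
    """Blocking stand-in for classify_text: perplexity = exp(loss)."""
    perplexity = math.exp(loss)
    return label_for_perplexity(perplexity), perplexity


async def analyze(loss: float):
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free to serve other requests.
    return await asyncio.to_thread(classify_from_loss, loss)
```

Calling `asyncio.run(analyze(4.0))` exercises the same offloading pattern as the `/analyze` route.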

requirements.txt CHANGED

@@ -1,210 +1,7 @@
-absl-py==2.2.2
-accelerate==1.6.0
-aiohappyeyeballs==2.6.1
-aiohttp==3.11.16
-aiosignal==1.3.2
-altair==5.5.0
-annotated-types==0.7.0
-anyio==4.9.0
-argon2-cffi==23.1.0
-argon2-cffi-bindings==21.2.0
-arrow==1.3.0
-asgiref==3.8.1
-asttokens==3.0.0
-async-lru==2.0.5
-attrs==25.3.0
-babel==2.17.0
-beautifulsoup4==4.13.4
-bleach==6.2.0
-blinker==1.9.0
-cachetools==5.5.2
-certifi==2025.1.31
-cffi==1.17.1
-charset-normalizer==2.1.1
-click==8.1.8
-comm==0.2.2
-contourpy==1.3.1
-cycler==0.12.1
-datasets==3.5.0
-DateTime==4.7
-debugpy==1.8.13
-decorator==5.2.1
-defusedxml==0.7.1
-dill==0.3.8
-Django==5.2
-dotenv==0.9.9
-executing==2.2.0
-fastapi==0.115.12
-fastjsonschema==2.21.1
-filelock==3.13.1
-Flask==3.1.0
-flask-cors==5.0.1
-fonttools==4.56.0
-fqdn==1.5.1
-frozenlist==1.6.0
-fsspec==2024.6.1
-generativeai==0.0.1
-gitdb==4.0.12
-GitPython==3.1.44
-google-ai-generativelanguage==0.6.15
-google-api-core==2.24.2
-google-api-python-client==2.165.0
-google-auth==2.38.0
-google-auth-httplib2==0.2.0
-google-genai==1.7.0
-google-generativeai==0.8.4
-googleapis-common-protos==1.69.2
-grpcio==1.71.0
-grpcio-status==1.71.0
-h11==0.14.0
-h5py==3.13.0
-html5lib==1.1
-httpcore==1.0.7
-httplib2==0.22.0
-httpx==0.28.1
-huggingface-hub==0.30.2
-idna==3.10
-inquirerpy==0.3.4
-ipykernel==6.29.5
-ipython==9.0.2
-ipython_pygments_lexers==1.1.1
-isoduration==20.11.0
-itsdangerous==2.2.0
-jedi==0.19.2
-Jinja2==3.1.4
-joblib==1.4.2
-json5==0.12.0
-jsonpointer==3.0.0
-jsonschema==4.23.0
-jsonschema-specifications==2024.10.1
-jupyter-events==0.12.0
-jupyter-lsp==2.2.5
-jupyter_client==8.6.3
-jupyter_core==5.7.2
-jupyter_server==2.15.0
-jupyter_server_terminals==0.5.3
-jupyterlab==4.4.0
-jupyterlab_pygments==0.3.0
-jupyterlab_server==2.27.3
-keras==3.9.2
-kiwisolver==1.4.8
-markdown-it-py==3.0.0
-MarkupSafe==3.0.2
-matplotlib==3.10.1
-matplotlib-inline==0.1.7
-mdurl==0.1.2
-mechanize==0.4.10
-mistune==3.1.3
-ml_dtypes==0.5.1
-mpmath==1.3.0
-multidict==6.4.3
-multiprocess==0.70.16
-namex==0.0.8
-narwhals==1.35.0
-nbclient==0.10.2
-nbconvert==7.16.6
-nbformat==5.10.4
-nest-asyncio==1.6.0
-networkx==3.3
-notebook==7.4.0
-notebook_shim==0.2.4
-numpy==2.2.4
-nvidia-cublas-cu11==11.11.3.6
-nvidia-cuda-cupti-cu11==11.8.87
-nvidia-cuda-nvrtc-cu11==11.8.89
-nvidia-cuda-runtime-cu11==11.8.89
-nvidia-cudnn-cu11==9.1.0.70
-nvidia-cufft-cu11==10.9.0.58
-nvidia-curand-cu11==10.3.0.86
-nvidia-cusolver-cu11==11.4.1.48
-nvidia-cusparse-cu11==11.7.5.86
-nvidia-nccl-cu11==2.21.5
-nvidia-nvtx-cu11==11.8.86
-optree==0.15.0
-overrides==7.7.0
-packaging==24.2
-pandas==2.2.3
-pandocfilters==1.5.1
-parso==0.8.4
-pexpect==4.9.0
-pfzy==0.3.4
-pillow==11.1.0
-platformdirs==4.3.7
-prometheus_client==0.21.1
-prompt_toolkit==3.0.50
-propcache==0.3.1
-proto-plus==1.26.1
-protobuf==5.29.4
-psutil==7.0.0
-ptyprocess==0.7.0
-pure_eval==0.2.3
-pyarrow==19.0.1
-pyasn1==0.6.1
-pyasn1_modules==0.4.1
-pycparser==2.22
-pydantic==2.10.6
-pydantic_core==2.27.2
-pydeck==0.9.1
-pygame==2.6.1
-Pygments==2.19.1
-pyparsing==3.2.2
-pystyle==2.0
-python-dateutil==2.9.0.post0
-python-dotenv==1.1.0
-python-json-logger==3.3.0
-pytz==2025.1
-PyYAML==6.0.2
-pyzmq==26.3.0
-referencing==0.36.2
-regex==2024.11.6
-requests==2.32.3
-rfc3339-validator==0.1.4
-rfc3986-validator==0.1.1
-rich==14.0.0
-rpds-py==0.24.0
-rsa==4.9
-safetensors==0.5.3
-scikit-learn==1.6.1
-scipy==1.15.2
-seaborn==0.13.2
-Send2Trash==1.8.3
-setuptools==70.2.0
-six==1.17.0
-smmap==5.0.2
-sniffio==1.3.1
-soupsieve==2.6
-sqlparse==0.5.3
-stack-data==0.6.3
-starlette==0.46.2
-streamlit==1.44.1
-sympy==1.13.1
-tenacity==9.1.2
-terminado==0.18.1
-threadpoolctl==3.6.0
-tinycss2==1.4.0
-tokenizers==0.21.1
-toml==0.10.2
-torch==2.6.0+cu118
-torchaudio==2.6.0+cu118
-torchvision==0.21.0+cu118
-tornado==6.4.2
-tqdm==4.67.1
-traitlets==5.14.3
 transformers==4.51.3
-triton==3.2.0
-types-python-dateutil==2.9.0.20241206
-typing_extensions==4.12.2
-tzdata==2025.2
-uri-template==1.3.0
-uritemplate==4.1.1
-urllib3==1.26.20
-watchdog==6.0.0
-wcwidth==0.2.13
-webcolors==24.11.1
-webencodings==0.5.1
-websocket-client==1.8.0
-websockets==15.0.1
-Werkzeug==3.1.3
-xxhash==3.5.0
-yarl==1.20.0
-zope.interface==7.2

+torch==2.6.0
 transformers==4.51.3
+fastapi==0.103.0
+pydantic==1.10.12
+asyncio==3.4.3
+uvicorn[standard]==0.21.1

test.sh DELETED

	@@ -1 +0,0 @@
-echo "ok"