# CURL Test Commands for Ingestion Pipeline ## Backend Configuration - **URL**: `https://binkhoale1812-studdybuddy-ingestion1.hf.space/` - **User ID**: `44e65346-8eaa-4f95-b17a-f6219953e7a8` - **Project ID**: `496e2fad-ec7e-4562-b06a-ea2491f2460` - **Test Files**: `Lecture5_ML.pdf`, `Lecture6_ANN_DL.pdf` ## 1. Health Check ```bash curl -X GET "https://binkhoale1812-studdybuddy-ingestion1.hf.space/health" \ -H "Content-Type: application/json" ``` ## 2. Upload Files ```bash curl -X POST "https://binkhoale1812-studdybuddy-ingestion1.hf.space/upload" \ -F "user_id=44e65346-8eaa-4f95-b17a-f6219953e7a8" \ -F "project_id=496e2fad-ec7e-4562-b06a-ea2491f2460" \ -F "files=@../exefiles/Lecture5_ML.pdf" \ -F "files=@../exefiles/Lecture6_ANN_DL.pdf" ``` ## 3. Check Upload Status Replace `{JOB_ID}` with the job_id from the upload response: ```bash curl -X GET "https://binkhoale1812-studdybuddy-ingestion1.hf.space/upload/status?job_id={JOB_ID}" \ -H "Content-Type: application/json" ``` ## 4. List Uploaded Files ```bash curl -X GET "https://binkhoale1812-studdybuddy-ingestion1.hf.space/files?user_id=44e65346-8eaa-4f95-b17a-f6219953e7a8&project_id=496e2fad-ec7e-4562-b06a-ea2491f2460" \ -H "Content-Type: application/json" ``` ## 5. Get File Chunks (Lecture5_ML.pdf) ```bash curl -X GET "https://binkhoale1812-studdybuddy-ingestion1.hf.space/files/chunks?user_id=44e65346-8eaa-4f95-b17a-f6219953e7a8&project_id=496e2fad-ec7e-4562-b06a-ea2491f2460&filename=Lecture5_ML.pdf&limit=5" \ -H "Content-Type: application/json" ``` ## 6. Get File Chunks (Lecture6_ANN_DL.pdf) ```bash curl -X GET "https://binkhoale1812-studdybuddy-ingestion1.hf.space/files/chunks?user_id=44e65346-8eaa-4f95-b17a-f6219953e7a8&project_id=496e2fad-ec7e-4562-b06a-ea2491f2460&filename=Lecture6_ANN_DL.pdf&limit=5" \ -H "Content-Type: application/json" ``` ## Expected Responses ### Health Check Response ```json { "ok": true, "mongodb_connected": true, "service": "ingestion_pipeline" } ``` ### Upload Response ```json { "job_id": "uuid-string", "status": "processing", "total_files": 2 } ``` ### Status Response ```json { "job_id": "uuid-string", "status": "completed", "total": 2, "completed": 2, "progress": 100.0, "last_error": null, "created_at": 1234567890.123 } ``` ### Files List Response ```json { "files": [ { "filename": "Lecture5_ML.pdf", "summary": "Document summary..." }, { "filename": "Lecture6_ANN_DL.pdf", "summary": "Document summary..." } ], "filenames": ["Lecture5_ML.pdf", "Lecture6_ANN_DL.pdf"] } ``` ### Chunks Response ```json { "chunks": [ { "user_id": "44e65346-8eaa-4f95-b17a-f6219953e7a8", "project_id": "496e2fad-ec7e-4562-b06a-ea2491f2460", "filename": "Lecture5_ML.pdf", "topic_name": "Machine Learning Introduction", "summary": "Chunk summary...", "content": "Chunk content...", "embedding": [0.1, 0.2, ...], "page_span": [1, 3], "card_id": "lecture5_ml-c0001" } ] } ``` ## Testing Steps 1. **Run Health Check**: Verify the service is running 2. **Upload Files**: Upload both PDF files 3. **Monitor Progress**: Check job status until completion 4. **Verify Files**: List uploaded files 5. **Inspect Chunks**: Get document chunks to verify processing ## Troubleshooting - **Connection Issues**: Check if the backend URL is accessible - **File Not Found**: Ensure PDF files exist in `../exefiles/` directory - **Upload Fails**: Check file size limits and format support - **Processing Stuck**: Monitor job status and check logs