Integration Guide: Fine-tuned Model with Your Code Analyzer
This guide explains how to integrate your fine-tuned DeepSeek model with the existing code analyzer app.
What You Have Now
After completing the Colab training, you have:
- Fine-tuned DeepSeek model adapters (~20MB)
- Enhanced analyzer class supporting both models
- Original CodeT5+ model still working
- All existing UI features preserved
Integration Steps
Step 1: Download Your Fine-tuned Model from Colab
In your final Colab cell, you saved the model to Google Drive. Now download it:
Option A: From Google Drive
- Go to Google Drive → MyDrive/ai-code-analyzer/
- Download the fine-tuned-analyst folder
- Place it in your project root: C:\Users\arunk\professional\ai-code-analyzer\fine-tuned-analyst\
Option B: Download Directly from Colab
# Run this in Colab to create a downloadable ZIP
import shutil
shutil.make_archive('fine-tuned-analyst', 'zip', './fine-tuned-analyst')
from google.colab import files
files.download('fine-tuned-analyst.zip')
Then extract the ZIP in your project root.
Step 2: Install Required Dependencies
Update your requirements.txt to include PEFT:
# Add this line to requirements.txt
peft>=0.7.0
Install it:
pip install peft
Step 3: Test the Enhanced Analyzer Locally
Run the test script to verify everything works:
python optimized_code_analyzer_enhanced.py
You should see:
- CodeT5+ analysis
- Fine-tuned DeepSeek analysis
- Model comparison
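If you prefer to script the check, a minimal smoke test might look like the sketch below. The class name, constructor arguments, and analyze_code_fast method are assumed from Steps 4 and 6 of this guide, so adjust them if your module differs; the import is guarded so the script degrades gracefully when the module is not on the path yet.

```python
# Smoke test for the enhanced analyzer (names assumed from this guide).
SAMPLE = "def add(a, b):\n    return a + b\n"

try:
    from optimized_code_analyzer_enhanced import EnhancedCodeAnalyzer
except ImportError:
    EnhancedCodeAnalyzer = None  # module not installed / not on path yet

if EnhancedCodeAnalyzer is not None:
    for model_type in ("codet5", "deepseek-finetuned"):
        analyzer = EnhancedCodeAnalyzer(model_type=model_type, precision="fp16")
        # Print the first part of each result so the two models can be compared.
        print(model_type, "→", analyzer.analyze_code_fast(SAMPLE)[:120])
else:
    print("optimized_code_analyzer_enhanced not found; run from the project root.")
```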
Step 4: Update Your Streamlit UI
Replace the analyzer import in matrix_final.py:
Find this (around line 8):
from optimized_code_analyzer import OptimizedCodeAnalyzer
Replace with:
from optimized_code_analyzer_enhanced import EnhancedCodeAnalyzer
Find this (around line 287):
@st.cache_resource
def get_local_analyzer():
    return OptimizedCodeAnalyzer(
        model_id="Salesforce/codet5p-220m",
        precision="fp16",
        quick_max_new_tokens=180,
        detailed_max_new_tokens=240,
    )
Replace with:
@st.cache_resource
def get_local_analyzer(model_type="codet5"):
    return EnhancedCodeAnalyzer(
        model_type=model_type,
        precision="fp16",
        quick_max_new_tokens=180,
        detailed_max_new_tokens=300,
    )
Step 5: Add Model Selector to Sidebar
Add this to your sidebar (around line 490, in the sidebar section):
# Model Selection
st.sidebar.markdown("---")
st.sidebar.markdown("### AI Model Selection")
model_choice = st.sidebar.radio(
    "Choose Analysis Model:",
    ["CodeT5+ (Fast)", "Fine-tuned DeepSeek (Accurate)"],
    help="CodeT5+ is faster, Fine-tuned model gives more detailed analysis",
)
model_type = "codet5" if "CodeT5+" in model_choice else "deepseek-finetuned"
Step 6: Update the Analysis Call
Find where the analyzer is called (around line 600+) and update it:
Find something like:
local_analyzer = get_local_analyzer()
result = local_analyzer.analyze_code_fast(code)
Replace with:
local_analyzer = get_local_analyzer(model_type)
result = local_analyzer.analyze_code_fast(code)
Step 7: Test Everything
Run your Streamlit app:
streamlit run matrix_final.py
Test both models:
- Select "CodeT5+ (Fast)" → Run analysis → Should work as before
- Select "Fine-tuned DeepSeek (Accurate)" → Run analysis → Should give detailed analysis with quality scores
What Each Model Does
CodeT5+ (Base Model)
- Speed: Fast (2-3 seconds)
- Memory: ~1GB
- Analysis: General code analysis
- Best for: Quick checks, batch processing
- Quality: Good for basic issues
Fine-tuned DeepSeek (Your Model)
- Speed: Moderate (3-5 seconds)
- Memory: ~1.5GB
- Analysis: Detailed with quality scores (1-100)
- Best for: Deep analysis, learning, production code
- Quality: Excellent - trained on your specific patterns
- Output format:
- Quality Score (1-100)
- Bugs section
- Performance issues
- Security concerns
- Improvement suggestions with code examples
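If you want to surface the quality score separately in the UI, a small parser over the structured output can pull it out. The "Quality Score" heading is assumed from the output format listed above, and the sample report here is illustrative, not real model output:

```python
import re

# Illustrative report in the structured format described above.
SAMPLE_REPORT = """Quality Score: 82/100
Bugs: none found
Performance: consider caching the compiled regex
Security: no issues detected
Suggestions: add type hints to public functions"""

def extract_quality_score(report):
    """Return the integer quality score, or None if the heading is absent."""
    m = re.search(r"Quality Score:\s*(\d+)", report)
    return int(m.group(1)) if m else None

print(extract_quality_score(SAMPLE_REPORT))  # → 82
```

A parser like this also gives you a cheap sanity check that the model actually emitted the expected sections.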
Key Features of the Enhanced System
1. Dual Model Support
- Seamlessly switch between models
- Separate caching for each model
- No breaking changes to existing code
2. Improved Analysis Quality
Your fine-tuned model provides:
- Structured output: Quality score, bugs, performance, security
- Code examples: Shows how to fix issues
- Contextual understanding: Trained on your dataset patterns
- Consistent formatting: Always includes all sections
3. Memory Efficient
- LoRA adapters are tiny (~20MB vs 1GB+ full model)
- Base model shared, adapters loaded on demand
- Can deploy both models without doubling memory
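The "~20MB" figure falls out of simple arithmetic: LoRA adds two low-rank factors A (d×r) and B (r×d) per adapted weight matrix, i.e. 2·r·d parameters each, instead of touching the full d×d matrix. The sketch below uses illustrative values (rank, hidden size, layer count, and target matrices are assumptions, not the exact DeepSeek config):

```python
# Rough LoRA adapter size estimate with assumed hyperparameters.
d_model = 2048           # hidden size (assumed)
r = 16                   # LoRA rank (assumed)
n_layers = 24            # transformer layers (assumed)
matrices_per_layer = 4   # q/k/v/o projections targeted (assumed)

# Each adapted matrix gains A (d x r) and B (r x d): 2*r*d parameters.
params = 2 * r * d_model * matrices_per_layer * n_layers
size_mb = params * 2 / 1024**2  # fp16 = 2 bytes per parameter
print(f"{params:,} params ≈ {size_mb:.0f} MB")
```

Even with generous settings the adapters stay in the tens of megabytes, which is why shipping both models barely moves the memory footprint.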
Deployment Options
Option 1: Local Deployment (Current)
Pros:
- Free
- Fast
- Full control
- Easy testing
Cons:
- Only you can use it
- Needs your computer running
Setup: Already working! Just use Streamlit locally.
Option 2: Hugging Face Spaces (Recommended)
Pros:
- FREE hosting
- Automatic HTTPS
- Share with anyone
- GPU available (paid tier)
Setup:
- Create account on huggingface.co
- Create new Space (Streamlit)
- Upload files: matrix_final.py, optimized_code_analyzer_enhanced.py, requirements.txt, and the fine-tuned-analyst/ folder
- Add app.py:
# app.py (for HF Spaces)
import subprocess
subprocess.run(["streamlit", "run", "matrix_final.py"])
Option 3: Railway.app
Cost: $5/month
Memory: Up to 8GB
Speed: Faster than HF Spaces
Setup:
- Connect GitHub repo
- Set start command: streamlit run matrix_final.py --server.port $PORT
- Deploy
Option 4: Render.com
Cost: FREE tier available
Memory: 512MB (might be tight)
Speed: Good
Setup:
- Connect repo
- Use Docker:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD streamlit run matrix_final.py --server.port $PORT --server.address 0.0.0.0
Troubleshooting
Issue: "fine-tuned-analyst folder not found"
Solution: Make sure the folder is in your project root with these files:
fine-tuned-analyst/
├── adapter_config.json
├── adapter_model.bin (or adapter_model.safetensors)
├── tokenizer_config.json
└── special_tokens_map.json
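A small helper can verify the folder before you try to load it with PEFT. The file names follow the layout shown above; the helper itself is a suggested convenience, not part of the analyzer code:

```python
from pathlib import Path

# Files every adapter folder should contain (per the layout above).
REQUIRED = [
    "adapter_config.json",
    "tokenizer_config.json",
    "special_tokens_map.json",
]

def missing_adapter_files(folder):
    """Return the required files that are absent from the adapter folder."""
    root = Path(folder)
    missing = [name for name in REQUIRED if not (root / name).exists()]
    # The weights may use either serialization format.
    if not any((root / w).exists()
               for w in ("adapter_model.bin", "adapter_model.safetensors")):
        missing.append("adapter_model.bin / adapter_model.safetensors")
    return missing

# An empty list means the folder is complete.
print(missing_adapter_files("fine-tuned-analyst"))
```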
Issue: "PEFT not installed"
Solution:
pip install peft
Issue: "Model too slow"
Solution:
- Use "quick" mode instead of "detailed"
- Reduce max_new_tokens to 150
- Use INT8 or INT4 quantization
Issue: "Out of memory"
Solution:
- Close other applications
- Use CodeT5+ instead (smaller)
- Enable quantization:
precision="int8"
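Internally, a precision flag like this typically translates into model-loading keyword arguments. The mapping below is a hypothetical sketch of that translation, based on the quantization options exposed by transformers/bitsandbytes; it is not code copied from optimized_code_analyzer_enhanced.py:

```python
# Hypothetical precision-flag to load-kwargs mapping (illustrative only).
def quantization_kwargs(precision):
    if precision == "int8":
        return {"load_in_8bit": True}    # bitsandbytes 8-bit quantization
    if precision == "int4":
        return {"load_in_4bit": True}    # bitsandbytes 4-bit quantization
    if precision == "fp16":
        return {"torch_dtype": "float16"}
    return {}  # default: full precision

print(quantization_kwargs("int8"))
```

INT8 roughly halves memory relative to fp16 at a small quality cost, which is why it is the first thing to try when the model does not fit.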
Understanding the Libraries Used
Core Libraries
Transformers (transformers)
- What: Hugging Face's library for AI models
- Does: Loads models, tokenizers, handles generation
- Used for: Loading DeepSeek and CodeT5+ models
PEFT (peft)
- What: Parameter Efficient Fine-Tuning
- Does: Loads LoRA adapters efficiently
- Used for: Your fine-tuned model adapters
PyTorch (torch)
- What: Deep learning framework
- Does: Runs neural networks on GPU/CPU
- Used for: Model inference, tensor operations
Streamlit (streamlit)
- What: Web app framework for Python
- Does: Creates interactive UI
- Used for: Your code analyzer interface
How They Work Together
User Input (Streamlit)
    ↓
EnhancedCodeAnalyzer
    ↓
Transformers (loads base model)
    ↓
PEFT (loads adapters)
    ↓
PyTorch (runs inference)
    ↓
Result → Streamlit UI
Next Steps
- Test both models with various code samples
- Compare quality - which model works better for your use cases?
- Expand dataset - Add more samples and retrain (only takes 20 minutes!)
- Deploy - Choose a hosting option and share with others
- Iterate - Collect feedback and improve
Tips for Best Results
When to Use CodeT5+
- Quick syntax checks
- Batch processing many files
- Resource-constrained environments
- Simple code reviews
When to Use Fine-tuned DeepSeek
- Production code reviews
- Learning/education
- Complex analysis needed
- When quality > speed
- Security audits
Congratulations!
You've successfully:
- Fine-tuned a language model
- Integrated it with your app
- Created a dual-model system
- Learned about model deployment
- Built a production-ready tool
Your code analyzer now has:
- 2 AI models to choose from
- Professional quality analysis
- Scalable architecture for future improvements
- Production-ready code
Support
If you need help:
- Check error messages carefully
- Review this guide
- Test with simple code first
- Compare with working examples
- Ask for help with specific errors
Happy coding!