
RemiAI Framework - Technical Report

1. Executive Summary

This report details the internal architecture and operational flow of the RemiAI Framework, an open-source, Electron-based desktop application designed for offline AI interaction. The framework allows users to run Large Language Models (LLMs) locally in GGUF format without requiring Python or complex dependencies (like CUDA or Torch) on the client machine. It automatically manages hardware acceleration (AVX/AVX2) and provides a seamless "Download & Run" experience for students and developers.

2. System Architecture

The application follows a standard Electron Multi-Process Architecture, enhanced with a custom Native AI Backend.

2.1 Block Diagram

```mermaid
graph TD
    subgraph "User Machine (Windows)"
        A[Electron Main Process] -- Controls --> B[Window Management]
        A -- Spawns/Manages --> C[Native AI Engine Backend]
        B -- Renders --> D["Electron Renderer Process (UI)"]
        C -- "HTTP/API (Port 5000)" --> D

        subgraph "Hardware Layer"
            E["CPU (AVX/AVX2)"]
            F[RAM]
            G[Start-up Check]
        end

        G -- Detects Flags --> A
        A -- Selects Binary --> C
        C -- Loads --> H[model.gguf]
    end
```
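The "Detects Flags → Selects Binary" path in the diagram can be sketched as a small helper. In main.js the flag string would come from the `systeminformation` package's `cpu()` call; the helper below and its name are illustrative, not the actual implementation:

```javascript
// Sketch: choose the engine directory from the CPU flag string.
// In main.js this string would come from systeminformation, e.g.:
//   const si = require('systeminformation');
//   const { flags } = await si.cpu();
function selectEngineDir(flags) {
  const set = new Set(flags.toLowerCase().split(/\s+/));
  if (set.has('avx2')) return 'engine/cpu_avx2'; // high-performance build
  if (set.has('avx')) return 'engine/cpu_avx';   // compatibility fallback
  return null; // unsupported CPU: show an error dialog instead of launching
}
```

Returning `null` (rather than throwing) lets the main process show a friendly error dialog on unsupported hardware.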

2.2 Component Breakdown

  1. Electron Main Process (main.js):

    • Role: The application entry point.
    • Responsibilities:
      • Lifecycle management (Start, Stop, Quit).
      • Hardware Detection using systeminformation to check for AVX/AVX2 support.
      • Engine Selection: Dynamically chooses the correct binary (cpu_avx2 or cpu_avx) to maximize performance or ensure compatibility.
      • Backend Spawning: Launches the bujji_engine.exe (optimized llama.cpp server) as a child process.
      • Window Creation: Loads index.html.
  2. Native AI Engine (Backend):

    • Role: The "Brain" of the application.
    • Technology: Pre-compiled binaries (likely based on llama.cpp) optimized for CPU inference.
    • Operation: Runs a local server on port 5000.
    • Model: Loads weights strictly from a file named model.gguf.
    • No Python Required: The binary is self-contained with all necessary DLLs.
    • Git LFS integration: Large binaries (.exe, .dll) are tracked via Git LFS to keep the repo clean. The main.js includes a startup check to ensure these files are fully downloaded (and not just LFS pointers) before launching.
  3. Renderer Process (index.html + renderer.js):

    • Role: The User Interface.
    • Responsibilities:
      • Displays the chat interface.
      • Sends user prompts to localhost:5000.
      • Receives and streams AI responses.
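The renderer's streaming step can be sketched as a small parser. The payload shape here is an assumption based on llama.cpp's server-sent-event format (`data: {"content": ...}` lines from its streaming completion endpoint); bujji_engine's actual API may differ:

```javascript
// Sketch: collect streamed tokens from a chunk of the backend's response.
// Assumes llama.cpp-style SSE lines such as:
//   data: {"content":"Hello","stop":false}
function extractTokens(sseChunk) {
  const tokens = [];
  for (const line of sseChunk.split('\n')) {
    if (!line.startsWith('data: ')) continue; // skip blanks and comments
    const payload = JSON.parse(line.slice(6));
    if (payload.content) tokens.push(payload.content);
  }
  return tokens;
}
```

In renderer.js this would be called on each chunk read from a `fetch` to `http://localhost:5000`, appending the returned tokens to the chat window as they arrive.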

3. Operational Flow Chart

Detailed step-by-step process of the application startup:

```mermaid
sequenceDiagram
    participant U as User
    participant M as Main Process
    participant S as System Check
    participant E as AI Engine (Backend)
    participant W as UI Window

    U->>M: Launches Application (npm start / exe)
    M->>S: Request CPU Flags (AVX2?)
    S-->>M: Returns Flags (e.g., "avx2 enabled")

    alt AVX2 Supported
        M->>M: Select "engine/cpu_avx2/bujji_engine.exe"
    else Only AVX
        M->>M: Select "engine/cpu_avx/bujji_engine.exe"
    end

    M->>M: Validate Engine File Size (Check for Git LFS pointers)
    M-->>U: Error Dialog if File Missing/Small

    M->>E: Spawn Process (model.gguf, port 5000, 4 threads)
    E-->>M: Server Started (Background)
    M->>W: Create Window (Load index.html)
    W->>E: Check Connection (Health Check)
    E-->>W: Ready
    W-->>U: Display Chat Interface
```

4. Technical Specifications & Requirements

4.1 Prerequisites

  • Operating System: Windows (10/11) 64-bit.
  • Software: Git & Git LFS (required for downloading the engine binaries).
  • Runtime: Node.js (LTS version recommended).
  • Hardware:
    • Any modern CPU (Intel/AMD) with AVX support.
    • Minimum 8GB RAM (16GB recommended for larger models).
    • Disk space proportional to the model size (e.g., 4GB for a 7B model).

4.2 File Structure

The critical file structure required for the app to function:

```
Root/
├── engine/                 # AI backend binaries
│   ├── cpu_avx/            # Fallback binaries
│   └── cpu_avx2/           # High-performance binaries
├── model.gguf              # The AI model (must be named exactly this)
├── main.js                 # Core logic
├── index.html              # UI layer
├── package.json            # Dependencies
└── node_modules/           # Installed via npm install
```

5. Development & Open Source Strategy

5.1 Licensing

This project is released under the MIT License. This allows any student or developer to:

  • Use the code freely.
  • Modify the interface (rename "RemiAI" to their own brand).
  • Distribute their own versions.

5.2 Hosting Strategy

  • GitHub: Contains the source code (JS, HTML, CSS).
  • Hugging Face: Hosts the large model.gguf file and the zipped release builds, since GitHub's storage limits make it unsuitable for AI weights. In effect, Hugging Face serves as "Large File Storage" for the model.

6. Conclusion

The RemiAI/Bujji framework democratizes access to local AI. By removing the complex Python environment setup and packaging the inference engine directly with the app, we enable any student with a laptop to run powerful AI models simply by typing npm start.