🚀 Intel Arc + DuckDB Quick Reference

Get started with local AI legislative analysis in 5 minutes

⚡ Performance at a Glance

| Task | Standard (Postgres + CPU) | Optimized (DuckDB + Arc GPU) | Speedup |
| --- | --- | --- | --- |
| Context injection (100 bills) | 500ms | 20ms | 25x |
| Vector search (10K records) | 800ms | 18ms | 44x |
| LLM inference (3B model) | 350 tok/s | 1,200 tok/s | 3.4x |
| Full testimony analysis | 2,000ms | 80ms | 25x |
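
The Speedup column can be re-derived from the two timing columns. The sketch below is only arithmetic on the figures in the table; the helper name `speedup` is illustrative and not part of the project. Latency rows divide the baseline by the optimized time, while the throughput row (tok/s) divides the other way.

```python
# Re-derive the Speedup column from the table's own figures.
# `speedup` is an illustrative helper, not a project function.

def speedup(baseline: float, optimized: float, higher_is_better: bool = False) -> float:
    """Return how many times faster the optimized run is."""
    if higher_is_better:          # throughput, e.g. tokens per second
        return optimized / baseline
    return baseline / optimized   # latency, e.g. milliseconds

# (task, standard, optimized, higher_is_better)
rows = [
    ("Context injection (100 bills)", 500, 20, False),
    ("Vector search (10K records)", 800, 18, False),
    ("LLM inference (3B model)", 350, 1200, True),
    ("Full testimony analysis", 2000, 80, False),
]

speedups = {name: round(speedup(b, o, hib), 1) for name, b, o, hib in rows}
for name, value in speedups.items():
    print(f"{name}: {value}x")
```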

🎯 Three-Step Setup

1. Install (5 minutes)

cd /path/to/open-navigator
./scripts/intel_llm_setup.sh
source .venv-intel/bin/activate

2. Test DuckDB VSS (30 seconds)

python scripts/duckdb_vss_demo.py

Expected output:

📊 Creating demo DuckDB database with VSS...
✅ Demo database created!
📈 Results (searching 1,000 bills):
   Average: 18.45ms
🎯 Top 3 most similar bills: ...

3. Run Analysis (1 minute)

python scripts/legislative_analysis_intel.py

🧠 Code Examples

Example 1: Fast Bill Search

from scripts.legislative_analysis_intel import DuckDBLegislativeAnalyzer

with DuckDBLegislativeAnalyzer() as analyzer:
    # Get bill context in < 50ms
    bill = analyzer.get_bill_context("HB1234")
    testimony = analyzer.get_all_testimony_for_bill("HB1234")
    
    print(f"Bill: {bill['title']}")
    print(f"Testimony records: {len(testimony)}")

Example 2: Vector Similarity Search

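from sentence_transformers import SentenceTransformer
from scripts.legislative_analysis_intel import DuckDBLegislativeAnalyzer

# Assumption: any 384-dimensional sentence-transformers model fits here;
# all-MiniLM-L6-v2 is one such model.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Your query embedding (384 dimensions from sentence-transformers)
query_embedding = model.encode("water fluoridation policy")

with DuckDBLegislativeAnalyzer() as analyzer:
    # Fast vector search (< 20ms for 10K bills)
    similar_bills = analyzer.search_similar_testimony(
        query_embedding.tolist(),
        limit=10
    )

for bill in similar_bills:
    print(f"{bill['bill_id']}: {bill['text'][:100]}... (similarity: {bill['similarity']:.2f})")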
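
For intuition about what a similarity search returns, here is a brute-force sketch in plain Python. It is not how `search_similar_testimony` works internally (that runs inside DuckDB), and the choice of cosine similarity is an assumption, since the metric is not stated here. The helper names and the toy 3-dimensional vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k_similar(query, records, limit=10):
    """Score every record against the query and keep the best `limit`."""
    scored = [
        {**rec, "similarity": cosine_similarity(query, rec["embedding"])}
        for rec in records
    ]
    scored.sort(key=lambda r: r["similarity"], reverse=True)
    return scored[:limit]

# Toy 3-dimensional embeddings; real ones would have 384 dimensions.
records = [
    {"bill_id": "HB1", "text": "Water fluoridation standards", "embedding": [1.0, 0.0, 0.0]},
    {"bill_id": "HB2", "text": "Highway funding", "embedding": [0.0, 1.0, 0.0]},
    {"bill_id": "HB3", "text": "Dental health grants", "embedding": [0.8, 0.6, 0.0]},
]

results = top_k_similar([1.0, 0.0, 0.0], records, limit=2)
for r in results:
    print(f"{r['bill_id']}: {r['text']} (similarity: {r['similarity']:.2f})")
```

A scan like this touches every record on every query, which is the cost an index-backed search is meant to avoid.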

Example 3: Extract Interest Groups

from scripts.legislative_analysis_intel import IntelOptimizedLLM, InterestGroup

# Initialize Intel-optimized LLM (uses Arc GPU)
llm = IntelOptimizedLLM(model_name="meta-llama/Llama-3.2-3B-Instruct")
llm.load_model(use_openvino=True)  # OpenVINO = best Arc GPU performance

# Extract structured data
# (bill_context and testimony are the bill and testimony records fetched as in Example 1)
groups = llm.extract_interest_groups(bill_context, testimony)

# Results
for group in groups:
    print(f"""
    Group: {group.group_name}
    Lobbyist: {group.lobbyist}
    Stance: {group.stance} (score: {group.stance_score})
    Tradeoffs: {group.tradeoff_notes}
    Confidence: {group.confidence}
    """)

Example 4: Query Hugging Face Datasets Directly

import duckdb

conn = duckdb.connect()

# No download needed - streams from HF!
df = conn.execute("""
    SELECT * 
    FROM read_parquet(
        'hf://datasets/CommunityOne/states-al-nonprofits-locations/data/train-*.parquet'
    )
    WHERE city = 'Birmingham'
    LIMIT 100
""").fetchdf()

print(f"Found {len(df)} organizations in Birmingham, AL")

🎨 Output Schema

Interest Group Extraction:

{
  "groups": [
    {
      "group_name": "Alabama Dental Association",
      "lobbyist": "John Smith, DDS",
      "stance": "conditional",
      "stance_score": 0.6,
      "tradeoff_notes": "Support if Section 4 amended to include rural exemption and phased implementation timeline",
      "testimony_excerpt": "While we have concerns about Section 4's implementation timeline, we support the overall goals if rural communities receive proper resources...",
      "bill_id": "HB1234",
      "confidence": 0.85
    },
    {
      "group_name": "Sierra Club Alabama Chapter",
      "lobbyist": null,
      "stance": "oppose",
      "stance_score": -0.9,
      "tradeoff_notes": null,
      "testimony_excerpt": "We strongly oppose this bill due to environmental concerns...",
      "bill_id": "HB1234",
      "confidence": 0.92
    }
  ]
}
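
One way to consume this JSON is to load it into typed records and sanity-check the numeric fields. The sketch below is a stand-in under stated assumptions: `InterestGroupRecord` and `parse_groups` are hypothetical names (the project's own `InterestGroup` class may differ), and the ranges checked (`stance_score` in [-1, 1], `confidence` in [0, 1]) are inferred from the sample values, not documented.

```python
import json
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class InterestGroupRecord:
    """Hypothetical mirror of one entry in the "groups" array."""
    group_name: str
    lobbyist: Optional[str]
    stance: str
    stance_score: float
    tradeoff_notes: Optional[str]
    testimony_excerpt: str
    bill_id: str
    confidence: float

def parse_groups(payload: str) -> List[InterestGroupRecord]:
    """Parse the JSON payload and validate the assumed numeric ranges."""
    data = json.loads(payload)
    records = [InterestGroupRecord(**item) for item in data["groups"]]
    for rec in records:
        # Assumed ranges, inferred from the sample output only.
        if not -1.0 <= rec.stance_score <= 1.0:
            raise ValueError(f"stance_score out of range for {rec.group_name}")
        if not 0.0 <= rec.confidence <= 1.0:
            raise ValueError(f"confidence out of range for {rec.group_name}")
    return records

sample = """
{"groups": [
  {"group_name": "Sierra Club Alabama Chapter", "lobbyist": null,
   "stance": "oppose", "stance_score": -0.9, "tradeoff_notes": null,
   "testimony_excerpt": "We strongly oppose this bill due to environmental concerns...",
   "bill_id": "HB1234", "confidence": 0.92}
]}
"""

parsed = parse_groups(sample)
print(parsed[0].group_name, parsed[0].stance, parsed[0].stance_score)
```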

🔧 Environment Variables

# Enable Level Zero Sysman so runtimes can query Intel GPU memory and device info
export ZES_ENABLE_SYSMAN=1

# Ollama GPU usage (if using Ollama)
export OLLAMA_NUM_GPU=999

# IPEX-LLM optimizations
export IPEX_LLM_NUM_GPU=1
export ONEAPI_DEVICE_SELECTOR=level_zero:0
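
A quick way to confirm the variables are visible to the Python process is to compare `os.environ` against the values listed above. This is a minimal sketch; `EXPECTED_VARS` and `mismatched_vars` are hypothetical names, and the expected values are simply copied from this section.

```python
import os

# Values copied from the exports above.
EXPECTED_VARS = {
    "ZES_ENABLE_SYSMAN": "1",
    "OLLAMA_NUM_GPU": "999",
    "IPEX_LLM_NUM_GPU": "1",
    "ONEAPI_DEVICE_SELECTOR": "level_zero:0",
}

def mismatched_vars(env=None):
    """Return {name: actual_value} for every variable that is unset or differs."""
    env = os.environ if env is None else env
    return {
        name: env.get(name)
        for name, wanted in EXPECTED_VARS.items()
        if env.get(name) != wanted
    }

problems = mismatched_vars()
if problems:
    for name, actual in problems.items():
        print(f"{name}: expected {EXPECTED_VARS[name]!r}, found {actual!r}")
else:
    print("All Intel GPU environment variables are set as expected.")
```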

💡 Best Practices

1. Cache Embeddings

DON'T recompute every time:

# Slow - recomputes embeddings every run
for bill in bills:
    embedding = model.encode(bill['text'])
    analyze(embedding)

DO cache in DuckDB:

# Fast - compute once, reuse forever
conn.execute("""
    CREATE TABLE bill_embeddings AS
    SELECT bill_id, embedding
    FROM ... -- computed once
""")

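# Then just query. array_distance needs two fixed-size arrays, so cast the
# parameter (this assumes the embedding column is FLOAT[384]).
similar = conn.execute("""
    SELECT * FROM bill_embeddings
    ORDER BY array_distance(embedding, CAST(? AS FLOAT[384]))
    LIMIT 10
""", [query]).fetchall()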

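The compute-once idea can be shown without any database. In the sketch below a plain dict stands in for the DuckDB table and `fake_encode` stands in for `model.encode`; both substitutions, and the `EmbeddingCache` class itself, are illustrative assumptions rather than project code. Keys are content hashes, so an identical text is only encoded once.

```python
import hashlib

class EmbeddingCache:
    """In-memory stand-in for a persistent embeddings table."""

    def __init__(self, encode):
        self._encode = encode   # function: text -> embedding
        self._store = {}        # sha256(text) -> embedding
        self.misses = 0         # number of real encode calls

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._encode(text)
        return self._store[key]

def fake_encode(text):
    # Deterministic placeholder, NOT a real embedding model.
    return [float(len(text)), float(text.count(" "))]

cache = EmbeddingCache(fake_encode)
texts = ["water fluoridation policy", "rural exemption", "water fluoridation policy"]
embeddings = [cache.get(t) for t in texts]
print(f"{len(texts)} lookups, {cache.misses} encode calls")
```

A persistent table adds what the dict lacks: the cache survives across runs.
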
2. Batch Processing

DON'T process one at a time:

for bill_id in bill_ids:  # Slow!
    result = analyze_single_bill(bill_id)

DO batch efficiently:

# Fast - processes bills in batches of 32 instead of one call per bill
results = llm.extract_interest_groups_batch(
    bill_contexts=bills,
    testimony_batches=all_testimony,
    batch_size=32  # Fits in Arc GPU memory
)
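
To make the batching concrete, the sketch below splits a list of bill IDs into chunks of at most `batch_size`. The `batched` helper is a hypothetical illustration and does not show how `extract_interest_groups_batch` is implemented.

```python
from typing import Iterator, List, Sequence

def batched(items: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

bill_ids = [f"HB{n}" for n in range(1, 101)]   # 100 illustrative bill IDs
batch_sizes = [len(batch) for batch in batched(bill_ids, 32)]
print(batch_sizes)  # [32, 32, 32, 4]
```

With 100 bills and a batch size of 32, the work becomes four calls instead of one hundred.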

3. Monitor GPU Usage

# Linux: intel_gpu_top
sudo apt install intel-gpu-tools
intel_gpu_top

# Windows: Task Manager → Performance → GPU
# Look for "GPU 0 - Intel Arc Graphics"

🐛 Troubleshooting

Issue: "ModuleNotFoundError: optimum"

pip install "optimum[openvino]"

Issue: Slow inference (still using CPU)

Check device:

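import openvino as ov
from optimum.intel import OVModelForCausalLM

# torch.cuda only reports NVIDIA devices; ask OpenVINO which devices it sees
print(f"Devices: {ov.Core().available_devices}")  # Should include 'GPU' for Arc

# Force GPU
model = OVModelForCausalLM.from_pretrained(
    model_name,
    device="GPU"  # Explicitly set
)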

Issue: Out of memory

Use smaller model or reduce batch size:

# Use 3B instead of 8B
model_name = "meta-llama/Llama-3.2-3B-Instruct"

# Reduce context
testimony = testimony[:10]  # First 10 records only
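
A plain slice keeps the first ten records in whatever order they arrived. If each record carries a relevance score, keeping the highest-scoring ones may preserve more useful context. The sketch below assumes a `similarity` field like the one in the vector search results; whether testimony records actually include it is an assumption, and `top_n_by_score` is a hypothetical helper.

```python
import heapq

def top_n_by_score(records, n=10, key="similarity"):
    """Return the n records with the highest score, best first."""
    return heapq.nlargest(n, records, key=lambda r: r[key])

# Illustrative records; real ones would come from the analyzer.
testimony = [
    {"text": f"statement {i}", "similarity": score}
    for i, score in enumerate([0.2, 0.9, 0.5, 0.7])
]

trimmed = top_n_by_score(testimony, n=2)
print([t["text"] for t in trimmed])  # ['statement 1', 'statement 3']
```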

📚 Resources

🎯 Next Steps

  1. ✅ Run the demo: python scripts/duckdb_vss_demo.py
  2. ✅ Test analysis: python scripts/legislative_analysis_intel.py
  3. 📚 Read full guide: Intel Arc Optimization Guide
  4. 🚀 Build your own: Use the DuckDBLegislativeAnalyzer class
  5. 🤝 Share results: Open an issue with your findings!

💬 Questions?


Built with ❤️ for Data Engineering Managers who want local, private, fast legislative intelligence.