Upload 13 files
- README.md +127 -10
- extract_error_features.py +24 -0
- image.png +0 -0
- ingest_docs.py +54 -0
- main.py +13 -0
- preload_model.py +9 -0
- requirements.txt +10 -0
- retrieve_docs.py +53 -0
README.md
CHANGED
# Jenkins Error Explainer

A documentation-grounded system that explains Jenkins pipeline errors using official Jenkins documentation.

This project analyzes raw Jenkins build logs, extracts structured error signals, retrieves relevant documentation sections, and generates clear, human-readable explanations without relying on supervised training data.

---

## Motivation

Jenkins error logs are often verbose, difficult to interpret, and highly contextual. There is no standardized dataset of Jenkins errors or canonical explanations.

This project addresses that gap by:

- Using heuristic-based error feature extraction
- Grounding explanations strictly in official Jenkins documentation
- Avoiding hallucinated or unsafe advice

---

## What This Project Does

Given a Jenkins pipeline error log, the system:

1. Extracts key error signals (syntax errors, missing agents, missing plugins, etc.)
2. Retrieves relevant sections from Jenkins documentation
3. Generates a structured explanation including:
   - Error summary
   - Likely causes
   - Links to official documentation

---
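The three steps above can be sketched end to end in a few lines. This is a minimal, self-contained illustration: the regex patterns, the `classify_log`/`explain` helpers, and the canned documentation links are hypothetical stand-ins for `error_taxonomy.py`, `extract_error_features.py`, and the FAISS-backed retriever, not the project's actual rules.

```python
import re

# Hypothetical stand-ins for the real taxonomy and documentation index.
TOY_PATTERNS = {
    "groovy_syntax_error": [r"MultipleCompilationErrorsException", r"expecting '\}'"],
    "missing_plugin": [r"No such DSL method"],
}
TOY_DOCS = {
    "groovy_syntax_error": ["pipeline_syntax.txt (https://www.jenkins.io/doc/)"],
    "missing_plugin": ["using_a_jenkinsfile.txt (https://www.jenkins.io/doc/)"],
}

def classify_log(log_text: str) -> str:
    # Step 1: extract the error signal with regex heuristics.
    for category, patterns in TOY_PATTERNS.items():
        if any(re.search(p, log_text) for p in patterns):
            return category
    return "unknown"

def explain(log_text: str) -> dict:
    category = classify_log(log_text)   # step 1: error signal
    docs = TOY_DOCS.get(category, [])   # step 2: retrieve documentation
    if category == "unknown":
        summary = "No known error signal found in the log."
    else:
        summary = f"The pipeline failed with a {category.replace('_', ' ')}."
    return {"category": category, "summary": summary, "documentation": docs}  # step 3

log = "WorkflowScript: 10: expecting '}', found '' @ line 10, column 1."
print(explain(log)["category"])  # groovy_syntax_error
```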

## What This Project Does NOT Do

- Does not train or fine-tune a language model
- Does not rely on labeled error datasets
- Does not scrape community forums or StackOverflow

---

## Project Structure

![Project structure](image.png)

## Error Categories Covered

- Pipeline syntax errors (invalid Groovy, missing braces)
- Missing agent or unavailable nodes
- Missing plugins / undefined DSL methods
- Missing credentials
- Workspace and file system errors
- Git / SCM authentication failures during checkout

---
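These categories are matched against the console log by a regex taxonomy (`error_taxonomy.py`, which `extract_error_features.py` imports but which is not part of this upload). A plausible sketch of its shape, with illustrative patterns that are assumptions rather than the project's actual rules:

```python
import re

# Hypothetical sketch of ERROR_CATEGORIES (error_taxonomy.py): each
# category maps to regexes that are searched in the raw console log.
ERROR_CATEGORIES = {
    "groovy_syntax_error": [r"MultipleCompilationErrorsException", r"expecting '\}'"],
    "missing_agent": [r"Required context class hudson\.FilePath is missing"],
    "no_node_available": [r"There are no nodes with the label"],
    "missing_plugin": [r"No such DSL method"],
    "missing_credentials": [r"CredentialNotFoundException"],
    "file_not_found": [r"No such file or directory"],
    "git_authentication_error": [r"Authentication failed"],
}

# Mirror the pre-compilation step in extract_error_features.py.
COMPILED = {cat: [re.compile(p) for p in pats] for cat, pats in ERROR_CATEGORIES.items()}
print(len(COMPILED))  # 7
```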
## Model Usage

This project does not train or fine-tune any machine learning models.

Pre-trained sentence embedding models are used solely for semantic retrieval over Jenkins documentation. No Jenkins error logs are used for training.

This design avoids the need for labeled datasets and ensures predictable, reproducible behavior.

---
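Concretely, semantic retrieval means embedding the documentation chunks and the query into vectors and taking nearest neighbours. The project does this with `sentence-transformers` embeddings in a FAISS index; the sketch below swaps in a toy bag-of-words "embedding" and brute-force cosine similarity so the mechanism is visible without downloading a model.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a sentence-embedding model: word counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "Declarative pipeline syntax requires matching braces in the Jenkinsfile",
    "Credentials can be bound to pipeline steps with the credentials helper",
]

query = "Jenkins pipeline Groovy syntax error missing brace"
q = embed(query)
# Rank chunks by similarity to the query; FAISS does the same at scale.
best = max(chunks, key=lambda chunk: cosine(q, embed(chunk)))
print(best)
```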
## Example

**Input:**
Raw Jenkins console output containing a Groovy syntax error.

**Output:**

    Error Category:
    groovy_syntax_error

    Error Summary:
    The pipeline failed due to a Groovy syntax error, most likely caused by an invalid or incomplete Jenkinsfile.

    Likely Causes:
    - Missing or mismatched braces in the Jenkinsfile
    - Invalid declarative pipeline structure

    Relevant Documentation:
    - pipeline_syntax.txt (https://www.jenkins.io/doc/)
    - using_a_jenkinsfile.txt (https://www.jenkins.io/doc/)

---
## Design Principles

- Documentation-first retrieval (RAG)
- Heuristic-driven error understanding
- Explicit handling of uncertainty
- Reproducible and explainable behavior

---
## Future Extensions

- Jenkinsfile explanation support
- Plugin-aware error analysis
- CLI or web-based interface
- Version-aware documentation indexing

---
## Guarantees and Limitations

This tool does NOT act as a general-purpose AI assistant.

Guarantees:

- Explanations are grounded in official Jenkins documentation only.
- Error classification is deterministic and rule-based.
- If relevant documentation is not found, the tool explicitly reports uncertainty.

Limitations:

- The tool does not attempt to fix errors automatically.
- It does not infer causes beyond documented Jenkins behavior.
- Unknown or plugin-specific errors may be reported as unsupported.

---
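The third guarantee implies a retrieval-quality check. One common way to implement it, sketched here with a hypothetical L2-distance threshold (the project's actual cutoff, if any, is not shown in this upload): if even the best match is too far away, report uncertainty instead of explaining.

```python
# Hypothetical uncertainty gate over FAISS retrieval results.
# `distances` plays the role of the first row returned by index.search
# (L2 distances, smaller = more relevant).
MAX_L2_DISTANCE = 1.2  # assumed threshold, not taken from the project

def grounded_or_uncertain(distances: list, results: list) -> dict:
    if not distances or min(distances) > MAX_L2_DISTANCE:
        # Explicitly report uncertainty instead of fabricating an answer.
        return {"status": "uncertain",
                "message": "No sufficiently relevant Jenkins documentation found."}
    return {"status": "grounded", "results": results}

print(grounded_or_uncertain([2.7, 3.1], [])["status"])  # uncertain
```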
## Disclaimer

This project is an independent personal project and is not affiliated with or endorsed by the Jenkins project.
extract_error_features.py
ADDED
import re
from error_taxonomy import ERROR_CATEGORIES

# Compile the taxonomy's regexes once at import time.
COMPILED_PATTERNS = {
    cat: [re.compile(p) for p in pats]
    for cat, pats in ERROR_CATEGORIES.items()
}

def extract_error_features(log_text: str) -> dict:
    result = {
        "category": "unknown",
        "matched_signals": [],
        "line_numbers": []
    }

    for category, patterns in COMPILED_PATTERNS.items():
        for pattern in patterns:
            if pattern.search(log_text):
                # First matching category wins; later matches still
                # contribute their signals.
                if result["category"] == "unknown":
                    result["category"] = category
                result["matched_signals"].append(pattern.pattern)

    result["line_numbers"] = re.findall(r"line (\d+)", log_text)

    return result
image.png
ADDED
ingest_docs.py
ADDED
# ingest_docs.py

import os
import json

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

RAW_DOCS_DIR = "data/docs/raw"
INDEX_PATH = "data/docs/docs.index"
META_PATH = "data/docs/docs_meta.json"

# Fixed-size character chunks: simple, at the cost of splitting sentences.
CHUNK_SIZE = 400

def chunk_text(text, size):
    chunks = []
    for i in range(0, len(text), size):
        chunk = text[i:i + size].strip()
        if chunk:
            chunks.append(chunk)
    return chunks

def main():
    model = SentenceTransformer("paraphrase-MiniLM-L3-v2", cache_folder="./model_cache")

    documents = []
    metadata = []

    for fname in os.listdir(RAW_DOCS_DIR):
        path = os.path.join(RAW_DOCS_DIR, fname)
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()

        for chunk in chunk_text(text, CHUNK_SIZE):
            documents.append(chunk)
            metadata.append({
                "text": chunk,  # keep the chunk text so retrieval can return it
                "source_file": fname,
                "source": "https://www.jenkins.io/doc/"
            })

    embeddings = model.encode(documents)
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(np.asarray(embeddings, dtype="float32"))

    os.makedirs("data/docs", exist_ok=True)
    faiss.write_index(index, INDEX_PATH)

    with open(META_PATH, "w", encoding="utf-8") as f:
        json.dump(metadata, f, indent=2)

    print(f"Ingested {len(documents)} document chunks.")

if __name__ == "__main__":
    main()
main.py
ADDED
from fastapi import FastAPI
from explain_error import explain_error

app = FastAPI()

@app.get("/")
def home():
    return {"message": "Jenkins Error Explainer API running"}

@app.post("/explain")
def explain(payload: dict):
    # Avoid a KeyError (500 response) when the client omits "log_text".
    log = payload.get("log_text", "")
    return explain_error(log)
preload_model.py
ADDED
from sentence_transformers import SentenceTransformer

print("Downloading model...")
MODEL = SentenceTransformer(
    "paraphrase-MiniLM-L3-v2",
    cache_folder="./model_cache"
)

print("Model cached successfully!")
requirements.txt
ADDED
fastapi
uvicorn
sentence-transformers==2.2.2
huggingface_hub==0.19.4
faiss-cpu
numpy
pydantic
torch
transformers==4.35.2
tokenizers==0.15.2
retrieve_docs.py
ADDED
# retrieve_docs.py

import json

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

INDEX_PATH = "data/docs/docs.index"
META_PATH = "data/docs/docs_meta.json"
TOP_K = 5

# One hand-written retrieval query per error category; unknown
# categories fall back to a generic query.
QUERY_TEMPLATES = {
    "groovy_syntax_error": "Jenkins pipeline Groovy syntax error missing brace",
    "missing_agent": "Jenkins pipeline agent none requires node context",
    "no_node_available": "Jenkins no nodes with label scheduling executor",
    "missing_plugin": "Jenkins No such DSL method pipeline step",
    "missing_credentials": "Jenkins credentials not found pipeline",
    "file_not_found": "Jenkins pipeline workspace file not found",
    "git_authentication_error": "Jenkins git authentication failed checkout"
}

model = SentenceTransformer(
    "paraphrase-MiniLM-L3-v2",
    cache_folder="./model_cache"
)

def retrieve_docs(error_category: str):
    index = faiss.read_index(INDEX_PATH)
    with open(META_PATH, "r", encoding="utf-8") as f:
        metadata = json.load(f)

    query = QUERY_TEMPLATES.get(error_category, "Jenkins pipeline error")

    query_embedding = np.asarray(model.encode([query]), dtype="float32")
    distances, indices = index.search(query_embedding, TOP_K)

    results = []
    for idx in indices[0]:
        results.append({
            # Chunk text, if ingest_docs stored it in the metadata.
            "text": metadata[idx].get("text"),
            "meta": metadata[idx]
        })

    return results

if __name__ == "__main__":
    # retrieve_docs expects an error category (a QUERY_TEMPLATES key),
    # not a raw console log.
    results = retrieve_docs("groovy_syntax_error")
    for r in results:
        print(r)