jebin2 committed on
Commit
7f0653e
·
0 Parent(s):

Initial commit with LFS-tracked PDFs

.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.pdf filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,3 @@
+ .env
+ venv
+ __pycache__/
Dockerfile ADDED
@@ -0,0 +1,16 @@
+ # Use official Python image
+ FROM python:3.10-slim
+
+ RUN apt-get update && apt-get install -y git
+
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV PATH="/home/user/.local/bin:$PATH"
+
+ WORKDIR /app
+
+ COPY --chown=user ./requirements.txt requirements.txt
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ COPY --chown=user . /app
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
LearningPointsExtractor.md ADDED
@@ -0,0 +1,77 @@
+ # PDF Learning Points Extractor
+
+ You are an expert at extracting key learning points from educational PDFs and presenting them as bite-sized, actionable insights perfect for mobile notifications and quick learning.
+
+ ## Your Task
+
+ When given a PDF document, extract the most important, practical, and memorable points that would help someone learn the subject matter effectively. Each point should be:
+
+ 1. **Self-contained**: Understandable without additional context
+ 2. **Actionable**: Something the learner can immediately understand or apply
+ 3. **Concise**: Brief enough to read in a notification (2-3 sentences max)
+ 4. **Valuable**: Represents a key concept, principle, or insight from the material
+
+ ## Output Format
+
+ Return ONLY a valid JSON object in this exact format:
+
+ ```json
+ {
+   "title": "[Clear, descriptive title - max 8 words]",
+   "content": "[The important point/statement - 2-3 sentences max, clear and concise]"
+ }
+ ```
+
+ ## Guidelines
+
+ - **Title**: Should be specific and descriptive (e.g., "Stack Memory Management" not just "Memory")
+ - **Content**: Should explain the concept clearly, include practical relevance when possible
+ - Extract ONE point per request - make it the most impactful point you can find
+ - Prioritize foundational concepts, practical techniques, and key insights
+ - Avoid filler words - be direct and clear
+ - Use simple language while maintaining technical accuracy
+ - If the PDF covers multiple topics, focus on core principles first
+
+ ## Important Notes
+
+ - Extract points sequentially through the document for comprehensive coverage
+ - Focus on concepts that have lasting value, not just facts
+ - Ensure each point teaches something meaningful
+ - Keep the learner engaged with clear, practical insights
+ - Always return valid, parseable JSON
+
+ ## Ensuring Uniqueness with Date-Based Page Selection
+
+ **CRITICAL**: You will extract a DIFFERENT point each day using date-based page rotation.
+
+ ### Page Selection Algorithm
+
+ 1. The user will provide:
+    - Current date (e.g., "2025-10-02")
+    - Total number of pages in the PDF (e.g., 250)
+
+ 2. Calculate the target page using modulo:
+    ```
+    Day of Year = Calculate from the date (1-365/366)
+    Target Page = (Day of Year % Total Pages) + 1
+    ```
+    If Day of Year is 275 and Total Pages is 250:
+    Target Page = (275 % 250) + 1 = 26
+
+ 3. Extract the most important/interesting learning point from that specific page
+
+ 4. If the calculated page has no substantial content (cover page, blank, TOC):
+    - Move to the next page with actual content
+    - Mention in your response which page you used
+
+ ### Your Process
+
+ 1. Calculate: Day of year = 275
+ 2. Calculate: Target page = (275 % 250) + 1 = 26
+ 3. Extract the best learning point from page 26
+ 4. If page 26 is blank/non-content, use the nearest content page
+ 5. Return the JSON with the point from that page
+
+ **Note**: Clearly identify which page number you extracted from in your internal process.
+
+ When ready, calculate the target page and extract one significant learning point from that specific page.
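Reviewer note: the page-selection rule in `LearningPointsExtractor.md` can be checked with a few lines of Python. This is a minimal sketch for illustration only, not code from this commit; the function name `target_page` is hypothetical.

```python
from datetime import date


def target_page(today: date, total_pages: int) -> int:
    """Return the 1-based page for `today` using the prompt's modulo rule."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    day_of_year = today.timetuple().tm_yday  # 1..365, or 1..366 in leap years
    return (day_of_year % total_pages) + 1


# 2025-10-02 is day 275 of the year, so a 250-page PDF maps to page 26.
print(target_page(date(2025, 10, 2), 250))
```

One consequence of the rule as written: for a PDF with more than 366 pages, it only ever reaches pages 2 through 367, so later pages are never selected.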
ProgrammingGroundUp-1-0-booksize.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:663bd554622af154a94e0363fbd8b5b3e93137247f6eeada77005c911ec74513
+ size 1383853
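Reviewer note: the three lines above are a Git LFS pointer, not the PDF itself; the repository stores only the spec version, the SHA-256 object ID, and the size in bytes. The sketch below shows how such a pointer could be parsed. It is illustrative only, and `parse_lfs_pointer` is a hypothetical helper, not part of this commit.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form "<hash algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:663bd554622af154a94e0363fbd8b5b3e93137247f6eeada77005c911ec74513
size 1383853
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algorithm"], info["size_bytes"])
```

A check like this could help detect a checkout where LFS objects were not pulled, in which case the path would hold this small text file rather than the PDF.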
README.md ADDED
@@ -0,0 +1,11 @@
+ ---
+ title: CruxNow
+ emoji: ⚡
+ colorFrom: purple
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ short_description: Smart updates. Core insights. Delivered instantly.
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
The C Programming Language (Kernighan Ritchie).pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:94ed541af52448918b11ac6fb257a6589903ab6340c61c2713d1e1b7be0e3a68
+ size 1143598
app.py ADDED
@@ -0,0 +1,83 @@
+ from fastapi import FastAPI
+ from gemiwrap import GeminiWrapper
+ from google import genai
+ from google.genai import types
+ import json_repair
+ from functools import partial
+ import os
+ if os.path.exists(".env"):
+     from dotenv import load_dotenv
+     load_dotenv()
+
+ app = FastAPI()
+
+ geminiWrapper = partial(GeminiWrapper,
+     model_name="gemini-flash-lite-latest",
+     schema=genai.types.Schema(
+         type = genai.types.Type.OBJECT,
+         required = ["title", "content"],
+         properties = {
+             "title": genai.types.Schema(
+                 type = genai.types.Type.STRING,
+             ),
+             "content": genai.types.Schema(
+                 type = genai.types.Type.STRING,
+             ),
+         },
+     ),
+     delete_files=True
+ )
+
+ @app.get("/")
+ def greet_json():
+     return {"Hello": "World!"}
+
+ @app.get("/breaking_news")
+ def breaking_news():
+     user_prompt = None
+     with open("breaking_news.md", 'r') as file:
+         user_prompt = file.read()
+
+     grounding_tool = types.Tool(
+         google_search=types.GoogleSearch()
+     )
+     model_responses = geminiWrapper(tools=[grounding_tool], response_mime_type="text/plain").send_message(user_prompt=user_prompt)
+     return json_repair.loads(model_responses[0])
+
+ @app.get("/hacker_news")
+ def hacker_news():
+     with open("test.txt", 'w') as file:
+         file.write("Hey, Hey")
+
+     text = None
+     with open("test.txt", 'r') as file:
+         text = file.read()
+     return {"Hello": text}
+
+ @app.get("/hackers_law")
+ def hackers_law():
+     user_prompt = None
+     with open("hackers_law.md", 'r') as file:
+         user_prompt = file.read()
+
+     model_responses = geminiWrapper().send_message(user_prompt=user_prompt)
+     return json_repair.loads(model_responses[0])
+
+ @app.get("/pdf_crux")
+ def pdf_crux(name: str):
+     system_prompt = None
+     with open("LearningPointsExtractor.md", 'r') as file:
+         system_prompt = file.read()
+     # name : VonNeumann, hacker_laws,
+     file_path = "ProgrammingGroundUp-1-0-booksize.pdf"
+     if name == "assembly":
+         file_path = "ProgrammingGroundUp-1-0-booksize.pdf"
+     elif name == "arm":
+         file_path = "arm-baremetal-ebook.pdf"
+     elif name == "cpumemory":
+         file_path = "cpumemory.pdf"
+     elif name == "c":
+         file_path = "The C Programming Language (Kernighan Ritchie).pdf"
+
+     model_responses = geminiWrapper().send_message(user_prompt="", system_instruction=system_prompt, file_path=file_path)
+     return json_repair.loads(model_responses[0])
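Reviewer note: `LearningPointsExtractor.md` states that the user will provide the current date and the total page count, while `pdf_crux` passes an empty `user_prompt`. One possible way to close that gap is sketched below, assuming the page count is obtained elsewhere (for example with a PDF library, which this commit does not include). `PDF_FILES`, `resolve_pdf`, and `build_user_prompt` are hypothetical names, and the prompt wording is an assumption.

```python
from datetime import date

# Same name -> file mapping as the if/elif chain in pdf_crux.
PDF_FILES = {
    "assembly": "ProgrammingGroundUp-1-0-booksize.pdf",
    "arm": "arm-baremetal-ebook.pdf",
    "cpumemory": "cpumemory.pdf",
    "c": "The C Programming Language (Kernighan Ritchie).pdf",
}
DEFAULT_PDF = "ProgrammingGroundUp-1-0-booksize.pdf"


def resolve_pdf(name: str) -> str:
    """Return the PDF path for a short name, falling back to the default."""
    return PDF_FILES.get(name, DEFAULT_PDF)


def build_user_prompt(today: date, total_pages: int) -> str:
    """Build the two inputs the system prompt expects (wording is assumed)."""
    return (
        f"Current date: {today.isoformat()}\n"
        f"Total number of pages in the PDF: {total_pages}"
    )


print(resolve_pdf("c"))
print(build_user_prompt(date(2025, 10, 2), 250))
```

A dictionary lookup keeps the default in one place and makes adding a new book a one-line change.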
arm-baremetal-ebook.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c2da89ad0d2e1fcef470c046ad57d5904b9076c2741606862b8b7710294c461c
+ size 692678
breaking_news.md ADDED
@@ -0,0 +1,48 @@
+ # Breaking News Finder - Enhanced Prompt
+
+ ## Task
+ Search for and report the most significant breaking news story from the past 24 hours in one of these categories: World, Politics, Business, Technology, Science, Health, Entertainment, or Sports.
+
+ ## Selection Criteria
+ - **Recency**: Published within the last 24 hours
+ - **Significance**: Major developments that would lead news broadcasts or front pages
+ - **Verification**: From established, credible news organizations only
+ - **Impact**: Stories affecting large populations, markets, or having widespread consequences
+
+ ## Source Requirements
+ Prioritize news from these types of sources:
+ - Major international news agencies (Reuters, AP, BBC, CNN, etc.)
+ - Established newspapers and news websites
+ - Government or official organizational announcements
+ - Verified social media accounts of news organizations
+
+ ## Content Guidelines
+ - Use objective, professional journalism language
+ - Include specific factual details: WHO, WHAT, WHEN, WHERE, WHY
+ - Mention exact times, locations, and key figures when available
+ - Focus on confirmed facts, not speculation or analysis
+ - Keep content concise but comprehensive (100-200 words)
+
+ ## Output Format
+ Return results in this exact JSON structure:
+ ```json
+ {
+   "category": "category_name",
+   "title": "Compelling, specific headline that captures the breaking news",
+   "content": "Professional news summary with key facts, figures, and context. Include specific details about what happened, who is involved, when it occurred, and why it matters."
+ }
+ ```
+
+ ## Quality Checklist
+ Before finalizing, ensure the story:
+ - [ ] Is genuinely breaking news (not just recently published old news)
+ - [ ] Comes from a reputable, verifiable source
+ - [ ] Contains specific, factual details
+ - [ ] Would be considered significant by major news outlets
+ - [ ] Is written in clear, professional language
+
+ ## Important Notes
+ - Return ONLY the JSON format with no additional text
+ - Do not create, embellish, or speculate on any information
+ - If no genuinely breaking news is found, search for the most significant recent story that meets the criteria
+ - Verify information accuracy before including in the response
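Reviewer note: the structural rules in `breaking_news.md` (three fields, one of eight categories, 100-200 words of content) could be checked after the model responds. The sketch below is one way to do that. It is not part of this commit, `check_story` is a hypothetical helper, and matching categories case-insensitively is an assumption, since the prompt does not specify casing.

```python
ALLOWED_CATEGORIES = {
    "world", "politics", "business", "technology",
    "science", "health", "entertainment", "sports",
}


def check_story(story: dict) -> list[str]:
    """Return a list of problems found in a parsed breaking-news response."""
    problems = []
    for key in ("category", "title", "content"):
        value = story.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")

    category = story.get("category")
    if isinstance(category, str) and category.strip():
        if category.strip().lower() not in ALLOWED_CATEGORIES:
            problems.append(f"unknown category: {category}")

    content = story.get("content")
    if isinstance(content, str) and content.strip():
        word_count = len(content.split())
        if not 100 <= word_count <= 200:
            problems.append(f"content has {word_count} words, expected 100-200")
    return problems


example = {"category": "Technology", "title": "Example headline", "content": "word " * 120}
print(check_story(example))
```

An empty list means the response satisfies the structural rules; factual accuracy and source quality still need separate review.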
cpumemory.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2902dddcadb1ec97eeabd338bfaabf80f1fa2ff4e4d769b4052da88bfbb20387
+ size 934051
hackers_law.md ADDED
@@ -0,0 +1,37 @@
+ # Hacker Laws Fetcher Prompt
+
+ Extract one random law from the Hacker Laws website and format it for iOS notification display.
+
+ ## Instructions:
+ 1. Fetch content from: https://hacker-laws.com
+ 2. Select ONE law randomly from the available laws on the page
+ 3. Extract the law's title and its main content/description
+ 4. Format the output as valid JSON with "title" and "content" fields
+ 5. Keep the content concise but complete - suitable for iOS notification display
+ 6. Remove any markdown formatting, links, or extra whitespace
+ 7. Ensure the content is readable and fits well in a notification
+
+ ## Output Format:
+ ```json
+ {
+   "title": "[Law Title]",
+   "content": "[Law Description/Content]"
+ }
+ ```
+
+ ## Example Output:
+ ```json
+ {
+   "title": "Murphy's Law",
+   "content": "Anything that can go wrong will go wrong."
+ }
+ ```
+
+ ## Additional Requirements:
+ - Use only plain text (no markdown, HTML, or special formatting)
+ - Keep total length under 200 characters if possible for notification readability
+ - If the law description is too long, summarize the core concept
+ - Ensure the four dashes (----) separator is exactly as shown
+ - Return valid JSON format only - no additional text or explanation
+ - Use proper JSON syntax with quoted keys and values
+ - Choose a different law each time if possible
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ fastapi
+ uvicorn[standard]
+ git+https://github.com/jebin2/gemiwrap.git
+ google-genai
+ json_repair