maintaining same chunk throughout pipeline

Files changed:
- prompts/generator_prompt.txt +56 -26
- prompts/question_prompt.txt +59 -25
- src/pipeline/batch.mjs +2 -1
- src/pipeline/step.mjs +79 -20
- src/providers/ollama_provider.mjs +44 -12
- src/question/question_core.mjs +95 -39
- tests/pipeline.mock.test.mjs +18 -16
- try_prompt.sh +85 -0
prompts/generator_prompt.txt
CHANGED
@@ -1,29 +1,59 @@
-[old prompt body (lines 1-26) is not rendered in this view]
+# SYSTEM ROLE
+You are a knowledge distillation generator optimized for training reasoning LoRAs. Your outputs must demonstrate *pedagogical reasoning fidelity* - showing not just answers, but the exact cognitive process a student model should learn. Every output will be used as training data.
+
+## CORE DIRECTIVES (NON-NEGOTIABLE)
+1. **CONTEXT FIDELITY**: Use ONLY provided context. No external knowledge. Ever.
+2. **REASONING GRANULARITY**: Decompose reasoning into atomic, teachable steps
+3. **UNCERTAINTY CALIBRATION**: Quantify confidence at each reasoning stage
+4. **BIAS MITIGATION**: Explicitly flag context limitations and reasoning risks
+5. **DISTILLATION OPTIMIZATION**: Structure outputs for maximum LoRA weight efficiency
+
+## REASONING PROTOCOL (EXECUTE IN ORDER INSIDE XML TAGS)
+<understanding>
+- Restate question in atomic components
+- Identify: [Simple/Factual] vs [Multi-hop/Inferential] vs [Ambiguous]
+- Flag required context elements (quote paragraph numbers)
+</understanding>
+
+<context_verification>
+- For EACH required fact:
+  ▸ Cite exact context location (para #[X])
+  ▸ Assess source quality: [Primary/Secondary/Contradictory/Uncertain]
+  ▸ If missing/insufficient: TERMINATE with "I cannot answer..."
+</context_verification>
+
+<reasoning_chain confidence_baseline="90%">
+[STRUCTURED STEP FORMAT PER STEP]
+Step #[N]:
+- Operation: [Retrieval/Comparison/Causality/Quantification/Contradiction-Check]
+- Context evidence: "Short quote" (para #[X])
+- Confidence delta: [+0%/-5% etc.] due to [reason]
+- Inference rule used: [e.g., "Temporal transitivity", "Numerical constraint propagation"]
+- Bias check: [None/Selection bias/Uncertainty propagation risk]
+</reasoning_chain>
+
+<synthesis>
+- Resolve conflicts between steps
+- Calculate cumulative confidence: (baseline * step confidences)
+- Final confidence threshold: <80% → "I cannot answer..."
+- Verify against reasoning_chain constraints
+</synthesis>
+
+## OUTPUT SPECIFICATION (MACHINE-PARSABLE)
+After </synthesis>:
+Confidence: [INTEGER 0-100]
+Answer: [CONCISE RESPONSE OR EXACT FALLBACK PHRASE]
+Evidence: [MAX 3 SHORT PHRASES] | [PARA #S]
+Uncertainty_flags: [NONE/CONTEXT_GAPS/CONTRADICTIONS/BIAS_RISK]
+
+## STRICT FORMATTING RULES
+- XML tags MUST close properly
+- Evidence phrases: ≤7 words each
+- Confidence calculations must show work in <synthesis>
+- If context_verification fails: OUTPUT ONLY "I cannot answer this from the provided context." (NO tags)
+- NEVER use markdown, asterisks, or special formatting
+
+---
 CONTEXT:
 {{CONTEXT}}
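The OUTPUT SPECIFICATION above is designed to be consumed by code. A minimal sketch of such a consumer follows; parseGeneratorOutput is a hypothetical helper, not part of this commit, and it assumes the model honors the spec (flat fields after </synthesis>, the exact fallback phrase):

// Hypothetical parser for the generator's machine-parsable tail (illustration only).
const FALLBACK = 'I cannot answer this from the provided context.';

function parseGeneratorOutput(raw) {
  const text = String(raw).trim();
  if (text === FALLBACK) return { refused: true, answer: FALLBACK };

  // Per the spec, the flat fields follow the closing </synthesis> tag.
  const tail = text.split('</synthesis>').pop();
  const field = (name) =>
    tail.match(new RegExp(`^${name}:\\s*(.+)$`, 'mi'))?.[1]?.trim() ?? null;

  return {
    refused: false,
    confidence: Number(field('Confidence')), // INTEGER 0-100 per the spec
    answer: field('Answer'),
    evidence: field('Evidence'),
    uncertaintyFlags: field('Uncertainty_flags'),
  };
}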
prompts/question_prompt.txt
CHANGED
@@ -1,34 +1,68 @@
-You are a … [most of the old prompt is not rendered in this view]
-that can be answered ONLY using information found inside the context.
-3. Produce questions that:
-   - focus strictly on the content of the chunk,
-   - avoid hallucinating any information not present,
-   - require comprehension, reasoning, or synthesis across the chunk,
-   - vary naturally in difficulty (some simple, some deeper),
-   - avoid meta or speculative questions,
-   - avoid yes/no questions unless they are meaningful.
-  "questions": [
-    "Question 1?",
-    "Question 2?",
-    "Question 3?"
-  ]
-}
+You are a Master Question Architect in a knowledge distillation pipeline.
+
+Your job is to generate high-value training questions from a single CONTEXT CHUNK.
+
+The questions will be used to train reasoning LoRAs, so they must:
+- require actual reasoning over the context (not just parroting one sentence),
+- be answerable ONLY from the context,
+- avoid hallucinating any information not present in the context.
+
+CONTEXT:
+{{CONTEXT}}
+
+---
+
+INTERNAL THINKING (do NOT mention this section in the final output):
+
+<analysis>
+1. Identify key entities, concepts, and relationships in the context.
+2. Map possible reasoning pathways, including:
+   - factual retrieval,
+   - causal links,
+   - temporal sequences,
+   - comparative relationships,
+   - counterfactual “what if” variations.
+3. Estimate how deep the reasoning can go (from very easy to very hard).
+4. Decide which aspects of the context, if questioned, would best:
+   - expose subtle misunderstandings,
+   - exercise multi-step reasoning,
+   - probe edge cases and boundary conditions,
+   - cover as many important concepts as possible.
+</analysis>

 ---

-CONTEXT START
-{{CONTEXT}}
-CONTEXT END
+OUTPUT INSTRUCTIONS (VISIBLE):
+
+1. Generate up to {{MAX_QUESTIONS}} diverse, high-quality questions.
+   - If the context supports fewer than {{MAX_QUESTIONS}} good questions,
+     generate only as many as make sense.
+2. Prefer questions that fall into a mix of these categories:
+   - factual retrieval (directly answerable from one or two sentences),
+   - single-hop inference (simple reasoning or rephrasing),
+   - multi-hop reasoning (requires combining several parts of the context),
+   - “what if” / counterfactual (change one key assumption and ask about it),
+   - meta-reasoning (asking about the reasoning or structure in the context).
+3. Questions MUST be answerable solely from the CONTEXT.
+   - If you are unsure the context supports a question, DO NOT ask it.
+4. Avoid:
+   - yes/no questions,
+   - vague or extremely open-ended questions,
+   - questions that require outside knowledge.
+
+FORMAT (MACHINE-FRIENDLY, NO JSON):
+
+- Output ONLY the questions.
+- One question per line.
+- No numbering, no bullet points, no explanations.
+- Do not include the analysis block in your output.
+- Do not prefix with “Q1:”, “Q2:” etc.
+- Do not add extra commentary.
+
+EXAMPLES OF VALID OUTPUT FORMAT:
+
+What are the main reasons given in the context for X?
+How does Y relate to Z according to the context?
+What would likely change if condition A were different, based on the text?
+
+Now, think carefully in <analysis> (internally), then output ONLY the questions.
src/pipeline/batch.mjs
CHANGED
@@ -233,6 +233,8 @@ export async function runPipelineBatch({
       try {
         const result = await runPipelineStep({
           question: q,
+          // 🔑 KEY FIX: reuse this ES chunk as the *only* context
+          initialContext: [chunk],
           verbose,
           logger,
         });
@@ -288,4 +290,3 @@

   throw new Error(`Unknown PIPELINE_SEED_MODE: ${seedMode}`);
 }
-
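In effect, the batch loop now pins question generation and answering to the same chunk. A simplified sketch of that intent (illustrative: the chunks, questionProvider, and samples names are assumed, and the real runPipelineBatch also handles seed modes, logging, and error accounting):

import { runQuestionGenerator } from '../question/question_core.mjs';
import { runPipelineStep } from './step.mjs';

for (const chunk of chunks) {
  // Questions are generated FROM this chunk…
  const { questions } = await runQuestionGenerator(chunk.content, questionProvider);
  for (const q of questions) {
    // …and answered AGAINST the same chunk, never a fresh ES retrieval.
    const result = await runPipelineStep({ question: q, initialContext: [chunk] });
    if (result.status === 'accepted') samples.push(result);
  }
}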
src/pipeline/step.mjs
CHANGED
@@ -10,12 +10,34 @@ import { preview } from './util.mjs';
  * Run a single pipeline step for one question.
  *
  * Flow:
- *   retrieval → generator → verifier → reward
+ *   retrieval (or provided context) → generator → verifier → reward
  *
+ * Design constraints:
+ *  - Exactly one context chunk is used per question.
+ *  - If `initialContext` is provided, we NEVER hit Elasticsearch.
+ *  - If we call ES, we still only keep the FIRST returned chunk.
  *
+ * Returns:
+ *  {
+ *    status: 'accepted'
+ *          | 'invalid_question'
+ *          | 'retrieval_failed'
+ *          | 'generator_failed'
+ *          | 'verifier_rejected'
+ *          | 'verifier_error'
+ *          | 'reward_rejected'
+ *          | 'reward_error',
+ *    question,
+ *    context, // array with exactly one chunk (when successful)
+ *    gen,
+ *    ver,
+ *    rew,
+ *    error?   // optional error message
+ *  }
  */
 export async function runPipelineStep({
   question,
+  initialContext, // optional: [{ id?, content, ... }]
   retrievalMode = process.env.RETRIEVAL_MODE || 'hybrid',
   k = Number(process.env.RETRIEVAL_K || '6'),
   generatorProvider,
@@ -28,7 +50,7 @@
   const errLog = logger?.error?.bind(logger) || console.error;

   // ----------------------------------------
-  //
+  // Question sanity
   // ----------------------------------------
   if (!question || !question.trim()) {
     if (verbose) log('  [pipeline] empty / invalid question, skipping');
@@ -40,28 +62,65 @@
   const rewProv = rewardProvider || loadProviderFor('reward');

   // ----------------------------------------
-  // Retrieval
+  // Retrieval / context selection
   // ----------------------------------------
   let context = [];
-  try {
-    if (verbose) log(`  [retrieval] mode=${retrievalMode} k=${k}`);
-    context = await hybridSearch(question, k);
-    if (verbose) {
-      const first = context[0]?.content ?? '';
-      log('    ' + preview(first, 200).replace(/\n/g, '\n    '));
-    }
-  } catch (e) {
-    return {
-      status: 'retrieval_failed',
-      question,
-      error: …, // old message only partially rendered in this view
-    };
-  }
+
+  if (initialContext && Array.isArray(initialContext) && initialContext.length > 0) {
+    // Use provided context, no ES call
+    context = initialContext.slice(0, 1); // enforce single-chunk invariant
+    if (verbose) {
+      log(
+        `  [retrieval] using initialContext provided (len=${initialContext.length}), ` +
+          `keeping first chunk only`,
+      );
+      const first = context[0]?.content ?? '';
+      log('  [context] first chunk (provided):');
+      log('    ' + preview(first, 200).replace(/\n/g, '\n    '));
+    }
+  } else {
+    // Go to ES exactly once
+    try {
+      if (verbose) log(`  [retrieval] mode=${retrievalMode} k=${k}`);
+      const hits = await hybridSearch(question, k);
+      if (verbose) {
+        log(`  [retrieval] got ${hits.length} chunks from ES`);
+      }
+
+      if (!hits || hits.length === 0) {
+        if (verbose) log('  [retrieval] no chunks found → retrieval_failed');
+        return {
+          status: 'retrieval_failed',
+          question,
+          error: 'no_chunks',
+        };
+      }
+
+      // Enforce single-chunk context
+      context = [hits[0]];
+      if (verbose) {
+        const first = context[0]?.content ?? '';
+        log('  [context] first chunk (from ES):');
+        log('    ' + preview(first, 200).replace(/\n/g, '\n    '));
+      }
+    } catch (e) {
+      const msg = e?.message || String(e);
+      if (verbose) errLog('  [retrieval] ERROR:', msg);
+      return {
+        status: 'retrieval_failed',
+        question,
+        error: msg,
+      };
+    }
+  }
+
+  // Safety: if somehow context is still empty here, fail fast
+  if (!context || context.length === 0) {
+    if (verbose) log('  [retrieval] context empty after selection → retrieval_failed');
+    return {
+      status: 'retrieval_failed',
+      question,
+      error: 'empty_context',
+    };
+  }
@@ -75,7 +134,7 @@

   if (verbose) {
     log('  [generator] answer:');
-    log('    ' + preview(gen.answer ?? '', 400).replace(/\n/g, '\n    '));
+    log('    ' + preview(gen?.answer ?? '', 400).replace(/\n/g, '\n    '));
   }
 } catch (e) {
   const msg = e?.message || String(e);
@@ -109,8 +168,8 @@
     ver = await runVerifier({ question, context, gen }, verProv);

     if (verbose) {
-      log('  [verifier] ok=' + ver.ok);
-      log('    ' + preview(ver.raw ?? '', 200).replace(/\n/g, '\n    '));
+      log('  [verifier] ok=' + (ver?.ok === true));
+      log('    ' + preview(ver?.raw ?? '', 200).replace(/\n/g, '\n    '));
     }
   } catch (e) {
     const msg = e?.message || String(e);
@@ -141,11 +200,11 @@
   let rew;
   try {
     if (verbose) log('  [reward] calling model…');
-    rew = await runReward({ question, context, gen }, rewProv);
+    rew = await runReward({ question, context, gen, ver }, rewProv);

     if (verbose) {
-      log(`  [reward] score=${rew.score} ok=${rew.ok}`);
-      log('    ' + preview(rew.raw ?? '', 200).replace(/\n/g, '\n    '));
+      log(`  [reward] score=${rew?.score} ok=${rew?.ok}`);
+      log('    ' + preview(rew?.raw ?? '', 200).replace(/\n/g, '\n    '));
     }
   } catch (e) {
     const msg = e?.message || String(e);
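Caller-side view of the two entry modes and the status contract (a sketch; the question strings and the doc-42 id are illustrative):

import { runPipelineStep } from './src/pipeline/step.mjs';

// Mode 1: caller supplies the chunk, so Elasticsearch is never touched.
const r1 = await runPipelineStep({
  question: 'What does the chunk claim about X?',
  initialContext: [{ id: 'doc-42', content: 'chunk text here' }],
});

// Mode 2: no initialContext, so hybridSearch runs once and only hits[0] is kept.
const r2 = await runPipelineStep({ question: 'What does the corpus say about X?' });

// The single-chunk invariant holds on every successful path:
if (r1.status === 'accepted') console.assert(r1.context.length === 1);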
src/providers/ollama_provider.mjs
CHANGED
@@ -1,6 +1,10 @@
 // src/providers/ollama_provider.mjs
 import { BaseProvider } from './base.mjs';

+const ENABLE_REASONING =
+  process.env.OLLAMA_REASONING === '1' ||
+  process.env.OLLAMA_REASONING === 'true';
+
 function normalizeBase(url) {
   // strip trailing slashes so we can safely append /api/generate
   return url.replace(/\/+$/, '');
@@ -8,21 +12,42 @@

 export class OllamaProvider extends BaseProvider {
   /**
-   * @param {object} opts
-   *
-   *
+   * @param {object|string} opts
+   *   - if string: treated as stage name ('generator' | 'verifier' | 'reward' | 'question')
+   *   - if object: { model?, baseUrl?, stage? }
    */
   constructor(opts = {}) {
     super();

+    let stage = null;
+    let options = {};
+
+    if (typeof opts === 'string') {
+      stage = opts;
+    } else if (opts && typeof opts === 'object') {
+      options = opts;
+      stage = opts.stage || null;
+    }
+
     // Base URL: env or default, WITHOUT endpoint path
     const envBase = process.env.OLLAMA_URL || 'http://localhost:11434';
-    this.baseUrl = normalizeBase(envBase);
+    this.baseUrl = normalizeBase(options.baseUrl || envBase);

+    // Stage-specific model: QUESTION_MODEL, GENERATOR_MODEL, VERIFIER_MODEL, REWARD_MODEL
+    let stageModel = null;
+    if (stage) {
+      const key = `${stage.toUpperCase()}_MODEL`;
+      stageModel = process.env[key] || null;
+    }
+
+    // Model resolution order:
+    //  1) explicit opts.model
+    //  2) stage-specific env (e.g. GENERATOR_MODEL)
+    //  3) generic OLLAMA_MODEL
+    //  4) default qwen3-vl:8b-thinking
     this.model =
+      options.model ||
+      stageModel ||
       process.env.OLLAMA_MODEL ||
       'qwen3-vl:8b-thinking';
   }
@@ -35,14 +60,21 @@
   async generate(prompt) {
     const url = `${this.baseUrl}/api/generate`;

+    const body = {
+      model: this.model,
+      prompt,
+      stream: false, // single JSON response, easier for pipeline/tests
+    };
+
+    // enable Ollama reasoning mode for *-thinking models when requested
+    if (ENABLE_REASONING) {
+      body.options = { reasoning: true };
+    }
+
     const res = await fetch(url, {
       method: 'POST',
       headers: { 'Content-Type': 'application/json' },
-      body: JSON.stringify({
-        model: this.model,
-        prompt,
-        stream: false,
-      }),
+      body: JSON.stringify(body),
     });

     if (!res.ok) {
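Stage-aware construction in practice (a sketch; the env values and the qwen3:4b tag are illustrative, the resolution order is the one documented above):

import { OllamaProvider } from './src/providers/ollama_provider.mjs';

// String form: stage name, so the model comes from GENERATOR_MODEL,
// then OLLAMA_MODEL, then the 'qwen3-vl:8b-thinking' default.
const generator = new OllamaProvider('generator');

// Object form: explicit model/baseUrl win over every env var.
const reward = new OllamaProvider({
  stage: 'reward',
  model: 'qwen3:4b', // illustrative model tag
  baseUrl: 'http://localhost:11434',
});

Note that ENABLE_REASONING is read once at module load, so OLLAMA_REASONING=1 must be set before the module is imported for options.reasoning to reach /api/generate.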
src/question/question_core.mjs
CHANGED
@@ -1,70 +1,126 @@
 // src/question/question_core.mjs
 import fs from 'fs/promises';
 import path from 'path';
+import { fileURLToPath } from 'url';
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+
+const TEMPLATE_PATH = path.resolve(
+  __dirname,
+  '..',
+  '..',
+  'prompts',
+  'question_prompt.txt',
+);
+
+let cachedTemplate = null;

 async function loadQuestionTemplate() {
-  [old path construction only partially rendered in this view]
-    '..',
-    'prompts',
-    'question_prompt.txt',
-  );
-  return await fs.readFile(filePath, 'utf8');
+  if (cachedTemplate) return cachedTemplate;
+  cachedTemplate = await fs.readFile(TEMPLATE_PATH, 'utf8');
+  return cachedTemplate;
 }

 /**
- * [old doc lines only partially rendered in this view]
+ * Extract questions using JSON-first, then plain-text fallback.
+ *
+ * @param {string} raw
+ * @param {number} maxQuestions
+ * @returns {{ questions: string[], parsed: any }}
+ */
+function parseQuestions(raw, maxQuestions) {
+  let parsed = null;
+  let questions = [];
+
+  if (!raw || typeof raw !== 'string') {
+    return { questions, parsed };
+  }
+
+  // ----- 1) Try JSON -----
+  try {
+    const json = JSON.parse(raw);
+    parsed = json;
+
+    // Case A: { questions: [...] }
+    if (json && Array.isArray(json.questions)) {
+      questions = json.questions
+        .map((q) => String(q).trim())
+        .filter((q) => q.length > 0);
+    }
+    // Case B: root is an array: [ "Q1?", "Q2?" ]
+    else if (Array.isArray(json)) {
+      questions = json
+        .map((q) => String(q).trim())
+        .filter((q) => q.length > 0);
+    }
+  } catch (e) {
+    parsed = { error: 'invalid_json', message: e?.message };
+  }
+
+  // ----- 2) Plain-text fallback if we still have no questions -----
+  if (!questions.length) {
+    const lines = raw
+      .split('\n')
+      .map((l) => l.trim())
+      // strip bullets / numbering: "1. ", "- ", "* ", "• "
+      .map((l) => l.replace(/^[-•*()\d.\s]+/, ''))
+      // keep lines that look like questions
+      .filter((l) => l.length > 0 && /[??!]$/.test(l));
+
+    questions = lines;
+  }
+
+  if (questions.length > maxQuestions) {
+    questions = questions.slice(0, maxQuestions);
+  }
+
+  return { questions, parsed };
+}
+
+/**
+ * Build prompt and generate questions from a context chunk.
+ *
+ * @param {string} contextText - chunk from ES
  * @param {object} provider - { generate(prompt) → string }
+ * @param {object} opts
+ *   - maxQuestions?: number (defaults QUESTION_MAX or 5)
  *
+ * @returns {Promise<{
+ *   raw: string,
+ *   prompt: string,
+ *   questions: string[],
+ *   maxQuestions: number,
+ *   parsed: any
+ * }>}
  */
 export async function runQuestionGenerator(
   contextText,
   provider,
+  opts = {},
 ) {
+  const maxQuestions =
+    opts.maxQuestions ?? Number(process.env.QUESTION_MAX || '5');

   const template = await loadQuestionTemplate();

   const prompt = template
-    .replace( … [old replace targets only partially rendered in this view]
-    .replace( …
+    .replace(/{{CONTEXT}}/g, contextText)
+    .replace(/{{MAX_QUESTIONS}}/g, String(maxQuestions));

   const raw = await provider.generate(prompt);

-  let parsed;
-  try {
-    parsed = JSON.parse(raw);
-  } catch {
-    parsed = { error: 'invalid_json', raw };
-  }
-
-  let questions = [];
-  [old extraction logic only partially rendered in this view]
-
-  return { questions, raw, parsed };
+  const { questions, parsed } = parseQuestions(raw, maxQuestions);

+  return {
+    raw,
+    prompt,
+    questions,
+    maxQuestions,
+    parsed,
+  };
 }

 export default {
   runQuestionGenerator,
 };
-
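Both parsing paths in one quick check (stub providers; the replies are illustrative):

import { runQuestionGenerator } from './src/question/question_core.mjs';

const jsonProvider = {
  generate: async () => '{"questions": ["How does A affect B?"]}',
};
const textProvider = {
  generate: async () => 'How does A affect B?\nWhy does C follow from A?',
};

// JSON path: parsed carries the decoded object, questions come from json.questions.
const a = await runQuestionGenerator('chunk text', jsonProvider, { maxQuestions: 3 });

// Plain-text path: parsed records invalid_json, questions come from the line fallback.
const b = await runQuestionGenerator('chunk text', textProvider, { maxQuestions: 3 });

console.log(a.questions, b.questions); // both plain string arrays, capped at 3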
tests/pipeline.mock.test.mjs
CHANGED
@@ -9,19 +9,24 @@ vi.mock('../src/providers/provider.mjs', () => {
       return {
         stage,
         async generate(prompt) {
+          // simple debug guard if needed:
+          // console.log(`[mock ${stage}] prompt:\n`, prompt);
+
           if (stage === 'generator') {
-            // generator returns a plain-text answer
-            return 'mocked …
+            // pretend generator returns a plain-text answer
+            return 'mocked';
           }
           if (stage === 'verifier') {
-            // verifier first line …
-            return '…
+            // verifier returns a "yes" first line so runVerifier.ok = true
+            return 'yes\nmock verifier justification';
           }
           if (stage === 'reward') {
-            // reward …
-            return '0.99 …
+            // reward returns a score in [0,1]
+            return '0.9 great sample';
           }
+
+          // fallback
+          return 'ok';
         },
       };
     },
@@ -46,23 +51,20 @@ describe('runPipelineStep (mocked providers)', () => {
   });

   it('runs a full pipeline step successfully', async () => {
-    const result = await runPipelineStep({
-      question: 'What is mock testing?',
-      verbose: false,
-      logger: console,
-    });
+    const result = await runPipelineStep({ question: 'What is mock testing?' });

     expect(result.status).toBe('accepted');

     // generator output made it through
-    expect(result.gen.answer).toBe('mocked …
+    expect(result.gen.answer).toBe('mocked');

     // verifier + reward both say OK
     expect(result.ver.ok).toBe(true);
     expect(result.rew.ok).toBe(true);
-    expect(result.rew.score).toBeCloseTo(0.99, 5);

+    // NEW CONTRACT:
+    // even though retrieval returns 2 chunks, step.mjs enforces a single-chunk context
+    expect(result.context.length).toBe(1);
+    expect(result.context[0].content).toBe('mock context 1');
   });
 });
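The mock replies encode the providers' line contracts: the verifier's first line is yes/no, and the reward reply leads with the score. A sketch of the score parsing those mocks imply (an assumption about runReward's internals, not code from this commit, and the 0.5 threshold is a guess):

// Assumed shape of the reward parsing (illustration only; threshold is a guess).
function parseRewardReply(raw, threshold = 0.5) {
  const score = Number.parseFloat(String(raw).trim().split(/\s+/)[0]);
  return { score, ok: Number.isFinite(score) && score >= threshold, raw };
}

// parseRewardReply('0.9 great sample') → { score: 0.9, ok: true, raw: '0.9 great sample' }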
try_prompt.sh
ADDED
@@ -0,0 +1,85 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+MODEL="${1:-}"
+PROMPT_FILE="${2:-}"
+REASONING_FLAG="${3:-}"
+
+ES_NODE="${ES_NODE:-http://localhost:9200}"
+ES_INDEX="${ES_INDEX:-quo_distill_index}"
+OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"
+
+if [[ -z "${MODEL:-}" || -z "${PROMPT_FILE:-}" ]]; then
+  echo "Usage: $0 <model> <prompt_file> [-r]"
+  exit 1
+fi
+
+if [[ ! -f "$PROMPT_FILE" ]]; then
+  echo "❌ Error: prompt file '$PROMPT_FILE' not found."
+  exit 1
+fi
+
+############################################################
+# 1. Fetch random ES chunk
+############################################################
+echo "📡 Fetching 1 random chunk from Elasticsearch…"
+
+RANDOM_DOC=$(curl -s -X POST "$ES_NODE/$ES_INDEX/_search" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "size": 1,
+    "query": { "function_score": { "random_score": {} } }
+  }')
+
+CHUNK=$(echo "$RANDOM_DOC" | jq -r '.hits.hits[0]._source.content')
+DOC_ID=$(echo "$RANDOM_DOC" | jq -r '.hits.hits[0]._id')
+
+echo "🧩 Random chunk ID: $DOC_ID"
+echo "----------------------------------------------"
+echo "$CHUNK" | head -n 20
+echo "… (truncated)"
+echo "----------------------------------------------"
+
+############################################################
+# 2. Replace {{CONTEXT}} in prompt
+############################################################
+RAW_PROMPT=$(cat "$PROMPT_FILE")
+PROMPT="${RAW_PROMPT//\{\{CONTEXT\}\}/$CHUNK}"
+
+############################################################
+# 3. Build JSON payload (no jq merging!)
+############################################################
+if [[ "$REASONING_FLAG" == "-r" ]]; then
+  echo "🧠 Reasoning mode: ON"
+  OPTIONS='"options":{"reasoning":true},'
+else
+  echo "🧠 Reasoning mode: OFF"
+  OPTIONS=""
+fi
+
+# Safely quote prompt text
+PROMPT_JSON=$(printf '%s' "$PROMPT" | jq -Rs .)
+
+# Build payload manually — no parsing of fragments
+PAYLOAD=$(cat <<EOF
+{
+  "model": "$MODEL",
+  "prompt": $PROMPT_JSON,
+  $OPTIONS
+  "stream": false
+}
+EOF
+)
+
+############################################################
+# 4. Send request to Ollama
+############################################################
+echo
+echo "🚀 Sending to Ollama ($MODEL)…"
+echo "=============================================="
+echo
+
+curl -s -X POST "$OLLAMA_URL/api/generate" \
+  -H "Content-Type: application/json" \
+  -d "$PAYLOAD" \
+  | jq -r '.response // .message // .output'
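Usage example, matching the argument handling above: ./try_prompt.sh qwen3-vl:8b-thinking prompts/generator_prompt.txt -r fetches one random chunk from quo_distill_index, substitutes it for {{CONTEXT}}, and prints the model's response. Note the script only substitutes {{CONTEXT}}, so a template that also uses {{MAX_QUESTIONS}} (like question_prompt.txt) is sent with that placeholder left verbatim.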