chanuk committed on
Commit d37aa9e · verified · 1 Parent(s): 121423d

Update README.md

Files changed (1)
  1. README.md +1 -4
README.md CHANGED
@@ -7,17 +7,14 @@ tags: []
 
 **Comp4Cls** is a retrieval-augmented classification framework that uses **entity-centric semantic compression** to turn long scientific/technical documents into short, task-focused representations for both retrieval and labeling. Documents (papers, patents, and R&D reports) are first compressed into structured summaries that preserve discriminative signals (e.g., core concepts, methods, problems, findings), embedded, and stored in a vector DB. At inference, a query is compressed the same way, nearest neighbors are retrieved, and a small LLM assigns the final class label using the compressed evidence.
 
-
 The end-to-end workflow—**Phase 1: compression + indexing, Phase 2: retrieval + classification**—is illustrated in the framework diagram on *page 2*. Experiments on a large bilingual corpus with hierarchical, multi-label taxonomies show that a **4B-scale** Comp4Cls matches or outperforms **8B–14B** models, especially in fine-grained categories, while cutting token usage and compute. Moderate compression (often **~20% of entities**) preserves retrieval fidelity and boosts downstream F1, enabling lightweight, low-latency deployment in production pipelines. See *Table II on page 8* (compression vs. length), *Figure 6 on page 9* (retrieval quality under compression), and *Figure 7 on page 10* (accuracy vs. larger LLMs).
 
-## Framework Diagram
-
 <h2>Framework Diagram</h2>
 
 <p align="center">
 <img src="comp4cls_framework.jpg" width="720" alt="Comp4Cls framework diagram">
 <br>
-<em>Figure 1. Two-phase pipeline: compression/indexing then retrieval/classification.</em>
+<em>Figure 1. Overview of the **Comp4Cls** framework. The system operates in two phases: (i) documents with predefined class labels are semantically compressed, embedded, and stored in a vector database; (ii) when a new query arrives, it is compressed and used to retrieve the top-$k$ most similar documents from the vector store. The large language model (LLM) then determines the final class label based on the retrieved context. Finally, the compressed query and its assigned label are stored back into the database, enabling downstream services such as document categorization, semantic search, and TL;DR summarization.</em>
 </p>
 
 ## Model Details
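The two-phase workflow the README describes (Phase 1: compress + index with known labels; Phase 2: compress the query, retrieve top-k neighbors, then label) can be sketched in miniature. This is an illustrative stand-in only: `compress`, `VectorStore`, and `classify` are hypothetical names, keyword extraction stands in for entity-centric semantic compression, token-set overlap stands in for dense-vector retrieval, and a majority vote over neighbor labels stands in for the small LLM labeler.

```python
import re
from collections import Counter

def compress(text, ratio=0.5):
    """Stand-in for entity-centric semantic compression: keep roughly
    `ratio` of the document's distinct content words. (The README
    reports ~20% of entities often suffices; 0.5 keeps this tiny demo
    from degenerating.)"""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3]
    counts = Counter(words)
    k = max(1, int(len(counts) * ratio))
    return {w for w, _ in counts.most_common(k)}

def jaccard(a, b):
    """Set-overlap similarity, standing in for cosine similarity
    between dense embeddings."""
    return len(a & b) / len(a | b) if a | b else 0.0

class VectorStore:
    """Phase 1: compress and index documents with predefined labels."""
    def __init__(self):
        self.items = []  # list of (compressed token set, label)

    def index(self, text, label):
        self.items.append((compress(text), label))

    def retrieve(self, query_tokens, k):
        # Rank stored documents by similarity to the compressed query.
        return sorted(self.items,
                      key=lambda it: jaccard(query_tokens, it[0]),
                      reverse=True)[:k]

def classify(store, query, k=1):
    """Phase 2: compress the query, retrieve top-k neighbors, assign a
    label. In the real system a small LLM reads the compressed evidence;
    a majority vote over neighbor labels stands in for it here."""
    neighbors = store.retrieve(compress(query), k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

A usage sketch: index a few labeled documents, then classify an unseen query by its nearest compressed neighbor, mirroring the Phase 1 / Phase 2 split in Figure 1.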