# Llama-3.1-8B-Poster-Extraction

## Model Description
This repository hosts Meta Llama 3.1 8B Instruct configured for Machine Actionable Poster Extraction tasks. The model is designed to extract structured, machine-readable information from scientific conference posters, enabling automated processing and analysis of poster content.
## Intended Use Cases

### Machine Actionable Poster Extraction
- Structured Data Extraction: Convert unstructured poster content into structured JSON/XML formats
- Section Identification: Identify and segment poster sections (Title, Authors, Abstract, Methods, Results, Conclusions)
- Entity Recognition: Extract key scientific entities, including:
  - Author names and affiliations
  - Research methodologies
  - Statistical findings
  - Citations and references
- Semantic Understanding: Interpret relationships between extracted elements
- Metadata Generation: Create machine-readable metadata for poster cataloging
### Scientific Document Processing
- Conference poster digitization
- Research content aggregation
- Automated poster summarization
- Cross-poster comparative analysis
## Model Specifications
| Attribute | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Parameters | 8 Billion |
| Context Length | 128K tokens |
| Architecture | Llama 3.1 |
| Precision | bfloat16 |
| License | Llama 3.1 Community License |
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jimnoneill/Llama-3.1-8B-Poster-Extraction"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Example: extract structured data from poster text
prompt = """Extract structured information from the following scientific poster content.
Return the extracted information in JSON format with fields: title, authors, affiliations,
abstract, methods, results, conclusions.

Poster Content:
[Your poster text here]
"""

messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Low temperature keeps the extraction close to deterministic;
# do_sample=True is required for temperature to take effect.
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.1)

# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
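The model is asked to return JSON, but chat models often wrap it in prose or a markdown fence. A minimal helper for pulling the JSON object out of the raw response might look like this (the `extract_json` function is illustrative, not part of the repository):

```python
import json
import re

def extract_json(response: str) -> dict:
    """Pull the first JSON object out of a model response that may
    contain surrounding prose or a markdown code fence."""
    # Prefer the contents of an optional ```json ... ``` fence.
    fenced = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", response, re.DOTALL)
    candidate = fenced.group(1) if fenced else response
    # Fall back to the outermost braces in the raw text.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in response")
    return json.loads(candidate[start : end + 1])

raw = 'Here is the extraction:\n```json\n{"title": "Example Poster", "authors": []}\n```'
data = extract_json(raw)
print(data["title"])  # → Example Poster
```

Failures of `json.loads` (truncated output, trailing commas) are worth catching and retrying with a lower `max_new_tokens` budget per section.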
### Extraction Schema Example

```json
{
  "title": "Extracted poster title",
  "authors": [
    {"name": "Author Name", "affiliation": "Institution"}
  ],
  "abstract": "Extracted abstract text",
  "sections": {
    "background": "Background content",
    "methods": "Methodology description",
    "results": "Key findings",
    "conclusions": "Summary conclusions"
  },
  "entities": {
    "methods": ["method1", "method2"],
    "metrics": ["metric1", "metric2"],
    "findings": ["finding1", "finding2"]
  }
}
```
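Since LLM output is not guaranteed to conform to the schema, downstream pipelines typically validate each extracted record before cataloging it. A sketch of such a check, using the field names from the schema above (the `validate_extraction` helper is an assumption, not shipped with the model):

```python
REQUIRED_FIELDS = {"title", "authors", "abstract", "sections", "entities"}

def validate_extraction(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record
    matches the top-level schema."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - record.keys())]
    # Each author entry should be an object with at least a name.
    for author in record.get("authors", []):
        if not isinstance(author, dict) or "name" not in author:
            problems.append(f"malformed author entry: {author!r}")
    return problems

record = {
    "title": "T",
    "authors": [{"name": "A", "affiliation": "X"}],
    "abstract": "...",
    "sections": {},
    "entities": {},
}
print(validate_extraction(record))  # → []
```

Records that fail validation can be re-prompted with the error messages appended, which often repairs the output in one extra pass.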
## Performance Considerations
- Optimized for GPU inference (recommended: NVIDIA RTX 4090 or equivalent)
- Supports quantization for memory-constrained environments
- Compatible with vLLM and other inference optimization frameworks
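A back-of-envelope weight-memory calculation explains the 24 GB-class GPU recommendation and why quantization helps; the overhead note in the docstring is a rough assumption, since the KV cache grows with context length:

```python
PARAMS = 8e9  # 8 billion parameters

def weight_memory_gb(bytes_per_param: float) -> float:
    """Approximate memory for model weights alone (excludes the KV
    cache and activations, which add several GB at long contexts)."""
    return PARAMS * bytes_per_param / 1024**3

print(f"bfloat16: {weight_memory_gb(2):.1f} GB")   # ~14.9 GB
print(f"8-bit:    {weight_memory_gb(1):.1f} GB")   # ~7.5 GB
print(f"4-bit:    {weight_memory_gb(0.5):.1f} GB") # ~3.7 GB
```

At bfloat16 the weights alone nearly fill a 16 GB card, so an RTX 4090 (24 GB) or quantized loading is the practical floor for long-context extraction.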
## Citation

If you use this model for poster extraction research, please cite:

```bibtex
@misc{oneill2025poster,
  title={Machine Actionable Poster Extraction with Llama 3.1},
  author={O'Neill, James},
  year={2025},
  publisher={HuggingFace}
}
```
## License
This model is released under the Llama 3.1 Community License.
## Acknowledgments
- Meta AI for the Llama 3.1 base model
- HuggingFace for the model hosting infrastructure