Llama-3.1-8B-Poster-Extraction

Model Description

This repository hosts Meta's Llama 3.1 8B Instruct model configured for Machine Actionable Poster Extraction. The model extracts structured, machine-readable information from scientific conference posters, enabling automated processing and analysis of poster content.

Intended Use Cases

Machine Actionable Poster Extraction

  • Structured Data Extraction: Convert unstructured poster content into structured JSON/XML formats (a minimal Python sketch of such a target structure follows this list)
  • Section Identification: Identify and segment poster sections (Title, Authors, Abstract, Methods, Results, Conclusions)
  • Entity Recognition: Extract key scientific entities including:
    • Author names and affiliations
    • Research methodologies
    • Statistical findings
    • Citations and references
  • Semantic Understanding: Interpret relationships between extracted elements
  • Metadata Generation: Create machine-readable metadata for poster cataloging
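
As an illustration of what "machine-readable" output can look like downstream, here is a minimal Python sketch (not part of this repository) that models the extraction targets as typed structures. The field names mirror the extraction schema example shown later in this card and are illustrative assumptions, not a fixed contract.

from typing import Dict, List, TypedDict

class Author(TypedDict):
    name: str
    affiliation: str

class PosterExtraction(TypedDict):
    title: str
    authors: List[Author]
    abstract: str
    sections: Dict[str, str]        # e.g. background, methods, results, conclusions
    entities: Dict[str, List[str]]  # e.g. methods, metrics, findings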

Scientific Document Processing

  • Conference poster digitization
  • Research content aggregation
  • Automated poster summarization
  • Cross-poster comparative analysis

Model Specifications

Attribute         Value
Base Model        meta-llama/Llama-3.1-8B-Instruct
Parameters        8 Billion
Context Length    128K tokens
Architecture      LLaMA 3.1
Precision         bfloat16
License           Llama 3.1 Community License

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jimnoneill/Llama-3.1-8B-Poster-Extraction"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

# Example: Extract structured data from poster text
prompt = """Extract structured information from the following scientific poster content.
Return the extracted information in JSON format with fields: title, authors, affiliations, 
abstract, methods, results, conclusions.

Poster Content:
[Your poster text here]
"""

messages = [{"role": "user", "content": prompt}]
# Add the generation prompt so the model replies as the assistant
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=2048, do_sample=True, temperature=0.1)
# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
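
The fine-tune's exact output format is not specified in this card, so the snippet below is a hedged sketch that assumes the model follows the prompt and emits a single JSON object. It locates the JSON in the response, parses it, and falls back gracefully if parsing fails.

import json

# Parse the model's response as JSON (assumes one JSON object in the output)
try:
    start = response.index("{")
    end = response.rindex("}") + 1
    extracted = json.loads(response[start:end])
    print(extracted.get("title"))
except (ValueError, json.JSONDecodeError) as e:
    print(f"Could not parse structured output: {e}")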

Extraction Schema Example

{
  "title": "Extracted poster title",
  "authors": [
    {"name": "Author Name", "affiliation": "Institution"}
  ],
  "abstract": "Extracted abstract text",
  "sections": {
    "background": "Background content",
    "methods": "Methodology description",
    "results": "Key findings",
    "conclusions": "Summary conclusions"
  },
  "entities": {
    "methods": ["method1", "method2"],
    "metrics": ["metric1", "metric2"],
    "findings": ["finding1", "finding2"]
  }
}
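
To check that a parsed response actually matches this shape, one option is the third-party jsonschema package (an assumption, not a stated dependency of this repository). The sketch below assumes `extracted` is the parsed output from the Usage section above.

from jsonschema import ValidationError, validate

# Minimal JSON Schema covering the top-level fields shown above
POSTER_SCHEMA = {
    "type": "object",
    "required": ["title", "authors", "abstract", "sections", "entities"],
    "properties": {
        "title": {"type": "string"},
        "authors": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "affiliation"],
                "properties": {
                    "name": {"type": "string"},
                    "affiliation": {"type": "string"},
                },
            },
        },
        "abstract": {"type": "string"},
        "sections": {"type": "object"},
        "entities": {"type": "object"},
    },
}

try:
    validate(instance=extracted, schema=POSTER_SCHEMA)
    print("Extraction matches the expected schema")
except ValidationError as e:
    print(f"Schema violation: {e.message}")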

Performance Considerations

  • Optimized for GPU inference (recommended: NVIDIA RTX 4090 or equivalent)
  • Supports quantization for memory-constrained environments (see the sketch after this list)
  • Compatible with vLLM and other inference optimization frameworks
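
As one example of memory-constrained loading, the sketch below uses 4-bit quantization via bitsandbytes through transformers' BitsAndBytesConfig. This is a sketch, not a recommendation specific to this model: bitsandbytes is an extra dependency, a CUDA GPU is assumed, and the quantization settings are illustrative.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "jimnoneill/Llama-3.1-8B-Poster-Extraction"

# 4-bit NF4 quantization with bfloat16 compute; roughly quarters the memory footprint
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)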

Citation

If you use this model for poster extraction research, please cite:

@misc{oneill2025poster,
  title={Machine Actionable Poster Extraction with Llama 3.1},
  author={O'Neill, James},
  year={2025},
  publisher={HuggingFace}
}

License

This model is released under the Llama 3.1 Community License.

Acknowledgments

  • Meta AI for the Llama 3.1 base model
  • HuggingFace for the model hosting infrastructure