Sustainability Technology Filter v2

A Qwen2.5-1.5B model fine-tuned with LoRA adapters for multi-dimensional sustainability technology assessment. It scores news articles on 6 LCSA-based dimensions to separate genuinely impactful sustainability technologies from greenwashing and speculation.

Model Description

Purpose

This model is designed for automated filtering of sustainability and clean technology news. It scores articles on multiple dimensions derived from the Life Cycle Sustainability Assessment (LCSA) framework, enabling:

Content curation: Identify high-quality sustainability technology articles
Trend analysis: Track technology readiness and deployment patterns
Research filtering: Separate substantive innovations from hype

Key Features

Multi-dimensional scoring: 6 independent LCSA dimensions (0-10 scale; Technology Readiness Level uses 0-9)
Explicit scope boundaries: Trained to reject AI/ML papers, consumer electronics, programming tutorials
Multilingual support: prefilter keywords in 21 languages, including EN, DE, FR, ES, PT, NL, and ZH (training data is predominantly English)
Evidence-based: Focuses on documented deployments and metrics, not announcements

What's New in v2

Improved scope filtering: Explicit exclusions for off-topic content (AI/ML infrastructure, consumer electronics, military tech)
Better dimension independence: Max correlation 0.61 (down from 0.80+ in v1)
Multilingual prefilter: Keywords in 21 languages for global coverage
Lower validation MAE: 0.654 (vs 0.712 in v1); test MAE rose slightly, from 0.690 to 0.717
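
As a rough illustration of the multilingual prefilter idea, the sketch below matches an article against per-language keyword lists before it is sent to the model. The keyword lists, the `KEYWORDS` and `PATTERN` names, and the `passes_prefilter` function are all hypothetical; the actual prefilter vocabulary is not listed in this card.

```python
import re

# Hypothetical keyword lists for three of the 21 languages (assumption, not the real vocabulary)
KEYWORDS = {
    "en": ["solar", "wind turbine", "battery storage", "carbon capture"],
    "de": ["solarenergie", "windkraft", "batteriespeicher"],
    "fr": ["énergie solaire", "éolienne", "captage du carbone"],
}

# One case-insensitive pattern over every keyword; \b avoids matches inside longer words
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for kws in KEYWORDS.values() for k in kws) + r")\b",
    re.IGNORECASE,
)


def passes_prefilter(text: str) -> bool:
    """Return True if the text mentions at least one sustainability keyword."""
    return PATTERN.search(text) is not None
```

Word-boundary matching suits space-delimited languages; languages such as ZH would likely need plain substring matching instead.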

Dimensions

The model scores articles on 6 dimensions from the LCSA framework:

Technology Assessment

Dimension	Weight	Range	Question
Technology Readiness Level	15%	0-9	Lab concept to Commercial deployment?
Technical Performance	15%	0-10	Proven efficiency, reliability, scalability?
Economic Competitiveness	20%	0-10	Cost-competitive with incumbents?

Sustainability Impact

Dimension	Weight	Range	Question
Life Cycle Environmental Impact	30%	0-10	Full lifecycle benefits (not just use phase)?
Social Equity Impact	10%	0-10	Job creation, accessibility, community benefit?
Governance & Systemic Impact	10%	0-10	Policy alignment, infrastructure readiness?

Dimension Descriptions

Technology Readiness Level (TRL)

Based on the NASA/DOE TRL scale, with 0 added for out-of-scope content:

0: Out of scope (not technology)
1-3: Lab/proof of concept
4-5: Validated in relevant environment
6-7: Demonstrated in operational environment
8-9: Commercial deployment at scale
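
The bands above can be turned into a small lookup for reporting. This is a minimal sketch: the `trl_stage` name is hypothetical, and rounding the model's continuous prediction to the nearest integer level is an assumption, since the card does not say how fractional TRL scores map to bands.

```python
def trl_stage(trl: float) -> str:
    """Map a predicted TRL score to the stage label from the list above."""
    # Assumption: round to the nearest level and clamp to the 0-9 scale
    level = max(0, min(9, round(trl)))
    if level == 0:
        return "Out of scope (not technology)"
    if level <= 3:
        return "Lab/proof of concept"
    if level <= 5:
        return "Validated in relevant environment"
    if level <= 7:
        return "Demonstrated in operational environment"
    return "Commercial deployment at scale"
```

The label is for display only; the gatekeeper rule compares the raw score against 3.0, not the rounded level.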

Technical Performance

Measures documented real-world performance: efficiency improvements, reliability data, and scalability evidence.

Economic Competitiveness

Life Cycle Cost (LCC) assessment: CAPEX/OPEX competitiveness, learning curve trajectory, market adoption, subsidy dependence.

Life Cycle Environmental Impact

Holistic environmental assessment: cradle-to-grave emissions, resource extraction impacts, manufacturing footprint, end-of-life recyclability.

Social Equity Impact

Human-centered sustainability: job creation, geographic accessibility, affordability, community acceptance, just transition.

Governance & Systemic Impact

System-level readiness: policy alignment, infrastructure compatibility, supply chain maturity, standards and certification.

Performance

Overall Metrics

Metric	Validation	Test
MAE	0.654	0.717
RMSE	1.14	1.22

Per-Dimension Performance (Test Set)

Dimension	MAE	RMSE
social_equity_impact	0.63	1.08
economic_competitiveness	0.67	1.15
life_cycle_environmental_impact	0.69	1.10
governance_systemic_impact	0.77	1.28
technical_performance	0.77	1.30
technology_readiness_level	0.78	1.38
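
For readers who want to reproduce these metrics on their own labels, the sketch below shows how MAE and RMSE are computed for one dimension. The `oracle` and `student` values are made-up toy numbers, not data from the evaluation. The last lines check that the mean of the six per-dimension test MAEs in the table is consistent with the overall test MAE.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between oracle labels and model predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalises large misses more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy example for a single dimension (illustrative values only)
oracle = [7.0, 2.0, 5.0, 0.0]
student = [6.0, 2.5, 5.0, 2.0]
toy_mae = mae(oracle, student)    # 0.875
toy_rmse = rmse(oracle, student)  # about 1.146

# Per-dimension test MAEs from the table above
per_dimension_mae = [0.63, 0.67, 0.69, 0.77, 0.77, 0.78]
mean_mae = sum(per_dimension_mae) / len(per_dimension_mae)  # about 0.718
```

The gap between RMSE and MAE in the table suggests that a minority of articles are missed by a wide margin, while most errors are small.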

Comparison with v1

Metric	v1	v2	Change
Validation MAE	0.712	0.654	-8.1%
Test MAE	0.690	0.717	+3.9%
Max dimension correlation	0.80	0.61	Better independence
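
The "max dimension correlation" figure can be read as the largest pairwise correlation between any two score columns. The sketch below shows one way to compute it; the use of Pearson correlation and of absolute values is an assumption, as the card does not state the exact definition, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (both must vary)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def max_dimension_correlation(columns):
    """Largest absolute pairwise correlation across named score columns."""
    return max(
        abs(pearson(columns[a], columns[b])) for a, b in combinations(columns, 2)
    )


# Toy columns (illustrative values only): two fully correlated, one uncorrelated
toy_columns = {
    "technical_performance": [1.0, 2.0, 3.0, 4.0],
    "economic_competitiveness": [2.0, 4.0, 6.0, 8.0],
    "social_equity_impact": [1.0, -1.0, -1.0, 1.0],
}
toy_max = max_dimension_correlation(toy_columns)
```

A lower maximum means the dimensions carry more distinct information, which is the sense in which v2 is described as more independent.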

Gatekeeper Rules

TRL Gatekeeper

If technology_readiness_level < 3.0, the overall weighted average is capped at 2.9.

Rationale: Lab-only technologies cannot achieve high sustainability scores regardless of theoretical potential.

Tier Classification

Tier	Weighted Average	Description
High	>= 6.0	Commercial deployment, proven sustainability
Medium	>= 4.0 and < 6.0	Pilot/early commercial, promising sustainability
Low	< 4.0	Lab stage or poor sustainability profile
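
The gatekeeper rule and the tier thresholds can be combined into one post-processing step. This is a minimal sketch using the weights from the Dimensions tables; the function names `weighted_average` and `tier` are hypothetical and not part of any published package.

```python
# Dimension weights as listed in the Dimensions section
WEIGHTS = {
    "technology_readiness_level": 0.15,
    "technical_performance": 0.15,
    "economic_competitiveness": 0.20,
    "life_cycle_environmental_impact": 0.30,
    "social_equity_impact": 0.10,
    "governance_systemic_impact": 0.10,
}


def weighted_average(scores: dict) -> float:
    """Weighted average of the six scores, with the TRL gatekeeper applied."""
    avg = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    if scores["technology_readiness_level"] < 3.0:
        avg = min(avg, 2.9)  # lab-stage technologies are capped
    return avg


def tier(avg: float) -> str:
    """Map a weighted average to the High/Medium/Low tier."""
    if avg >= 6.0:
        return "High"
    if avg >= 4.0:
        return "Medium"
    return "Low"
```

For example, an article scoring 9 on every dimension except a TRL of 2 has an uncapped average of 7.95, but the gatekeeper caps it at 2.9, so it lands in the Low tier.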

Scope Exclusions

The model is trained to score 0 on all dimensions for off-topic content:

Excluded Categories

AI/ML Infrastructure - Model architectures, LLMs, benchmarks (without sustainability application)
Consumer Electronics - Smartphone reviews, gaming hardware, GPUs
Programming - Tutorials, frameworks, developer tools
Other - Military tech, travel, crypto speculation, entertainment

In-Scope Topics

Renewable and low-carbon energy (solar, wind, hydro, geothermal, nuclear)
Electric vehicles and sustainable transport
Energy storage (batteries, hydrogen, grid storage)
Carbon capture and emissions reduction
Circular economy and waste reduction
Green building and energy efficiency
Sustainable agriculture and food tech
AI/ML applied to sustainability
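
Because off-topic content is scored 0 on every dimension, a downstream pipeline can flag likely out-of-scope articles from the predictions alone. The sketch below is one possible check; the `looks_out_of_scope` name and the 0.5 tolerance are assumptions chosen for illustration and should be tuned on labelled data.

```python
def looks_out_of_scope(scores, tol=0.5):
    """Return True when every predicted dimension is at or near zero.

    `tol` is a hypothetical tolerance: regression outputs rarely hit 0 exactly
    and can even be slightly negative.
    """
    return all(score <= tol for score in scores)
```

An article with a high score on even one dimension is treated as in scope by this check.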

Training Details

Parameter	Value
Base Model	Qwen/Qwen2.5-1.5B
Training Mode	Knowledge Distillation
Oracle Model	Gemini 2.0 Flash
Trainable Parameters	18.5M (1.18% LoRA)
Epochs	3
Batch Size	8
Learning Rate	2e-5
Max Length	512 tokens

Data Split

Split	Examples
Training	4,358
Validation	547
Test	543

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch
import numpy as np

# Load model
base_model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    num_labels=6,
    problem_type="regression"
)
model = PeftModel.from_pretrained(base_model, "jeergrvgreg/sustainability-technology-v2")
tokenizer = AutoTokenizer.from_pretrained("jeergrvgreg/sustainability-technology-v2")

# The classification head needs a pad token id to find the last real token in padded batches
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

model.eval()

# Score an article
article = "Title: Solar Panel Achieves 30% Efficiency\n\nResearchers developed..."
inputs = tokenizer(article, return_tensors="pt", max_length=512, truncation=True, padding=True)

with torch.no_grad():
    scores = model(**inputs).logits[0].numpy()

dimensions = ["technology_readiness_level", "technical_performance",
              "economic_competitiveness", "life_cycle_environmental_impact",
              "social_equity_impact", "governance_systemic_impact"]
weights = [0.15, 0.15, 0.20, 0.30, 0.10, 0.10]

for dim, score in zip(dimensions, scores):
    print(f"{dim}: {score:.1f}")

weighted_avg = np.average(scores, weights=weights)
if scores[0] < 3.0:
    weighted_avg = min(weighted_avg, 2.9)
print(f"Weighted Average: {weighted_avg:.2f}")

Limitations

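Language: Training data is predominantly English; the 21-language coverage applies to the keyword prefilter
High-Tier Data: Only 0.4% of training examples are high-tier (roughly 17 of 4,358), so high-tier predictions rest on very few examples
Precision: An MAE of ~0.7 is sufficient for tier classification but not for fine-grained ranking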
Context: 512 token limit may truncate long articles
Temporal: Trained on 2025-2026 news
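
One common workaround for the 512-token limit is to split a long article into overlapping windows, score each window, and average the results. The sketch below shows only the windowing and averaging logic on plain token id lists. The function names, the 256-token stride, and mean aggregation are assumptions; the model was trained on truncated inputs, so scores on later windows have not been validated.

```python
def token_windows(token_ids, max_len=512, stride=256):
    """Split token ids into overlapping windows of at most max_len tokens."""
    if len(token_ids) <= max_len:
        return [list(token_ids)]
    windows = []
    start = 0
    while True:
        windows.append(list(token_ids[start:start + max_len]))
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the article
        start += stride
    return windows


def mean_scores(per_window_scores):
    """Average each dimension across windows (one score list per window)."""
    n = len(per_window_scores)
    return [sum(column) / n for column in zip(*per_window_scores)]
```

Each window would be scored like a single article in the Usage section, and `mean_scores` then combines the six-dimensional outputs.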

Intended Use

Primary Use Cases

News aggregation and filtering
Research monitoring for clean tech
Content curation for sustainability dashboards
Trend analysis across sectors

Out-of-Scope

Investment decisions (scores content quality, not viability)
Policy recommendations (requires expert interpretation)
Academic paper assessment
Real-time trading

Technical Specifications

Architecture: Qwen/Qwen2.5-1.5B + LoRA (r=8, alpha=16)
GPU VRAM: 4GB minimum, 8GB recommended
Inference: ~30ms/article on RTX 3060
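
As a sanity check on adapter size, the sketch below counts LoRA parameters for a given rank. It assumes the published Qwen2.5-1.5B dimensions (hidden size 1536, intermediate size 8960, 28 layers, 2 key/value heads of dimension 128) and assumes adapters on all seven linear projections in each block; the card does not state which modules were adapted, so the module set is hypothetical.

```python
# Assumed Qwen2.5-1.5B dimensions (from the base model's public config)
HIDDEN = 1536
INTERMEDIATE = 8960
LAYERS = 28
KV_DIM = 2 * 128  # 2 key/value heads x head dimension 128

# (input dim, output dim) of each adapted projection; the module set is an assumption
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(rank, modules=PROJECTIONS):
    """LoRA adds rank * (d_in + d_out) parameters per adapted matrix."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in modules.values())
    return per_layer * LAYERS


params_r8 = lora_params(8)    # 9,232,384
params_r16 = lora_params(16)  # 18,464,768
```

Under these assumptions rank 8 gives about 9.2M adapter parameters and rank 16 about 18.5M, so the reported 18.5M trainable parameters may reflect a different rank, module set, or additional trainable layers than this sketch assumes.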

Environmental Impact

Hardware: NVIDIA RTX 4080
Training Time: ~1 hour
Carbon: < 0.1 kg CO2eq

Citation

@misc{sustainability_technology_v2,
  title={Sustainability Technology Filter v2},
  author={NexusMind},
  year={2026},
  url={https://huggingface.co/jeergrvgreg/sustainability-technology-v2}
}

Version History

Version	Date	Changes
v2.0	2026-01-14	Scope exclusions, multilingual prefilter, improved independence
v1.0	2025-11-27	Initial LCSA-based model

Framework Versions: PEFT 0.17.1, Transformers 4.47+, PyTorch 2.0+
