KevSun/climate-attitude-LM



	---
	license: apache-2.0
	---
	This language model is designed to assess the attitude expressed in texts about climate change.
	It categorizes the attitude into three types: risk, neutral, and opportunity.
	These categories correspond to the negative, neutral, and positive classifications commonly used in sentiment analysis.


	Compared with similar existing models, such as "climatebert/distilroberta-base-climate-sentiment" and "XerOpred/twitter-climate-sentiment-model", which typically achieve accuracies of 10% to 30% and F1 scores of around 15%, this model performs considerably better. When evaluated on the test dataset from "climatebert/climate_sentiment", it achieves an accuracy of 89% and an F1 score of 89%.
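For readers who want to compute the same kind of scores on their own predictions, the sketch below shows one way to calculate accuracy and F1 for the three classes. It is only an illustration: the card does not say which F1 averaging was used, so macro averaging is an assumption here, and the function names and toy labels are hypothetical.

```python
from collections import Counter

LABELS = ["risk", "neutral", "opportunity"]


def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro averaging is assumed)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true class but was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels for illustration only; they are not drawn from any dataset.
gold = ["risk", "risk", "neutral", "opportunity"]
pred = ["risk", "neutral", "neutral", "opportunity"]
print(f"accuracy={accuracy(gold, pred):.4f}, macro F1={macro_f1(gold, pred):.4f}")
```

With a real evaluation, `gold` would hold the dataset labels and `pred` the model's predicted labels, both mapped to the same label names.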

	Note that the input should be a text about climate change, whether it is pasted or typed into the API input bar or passed to the testing code.
	Otherwise, the model does not work as well. An example input could be: "Major oil companies have misled Americans for decades about the threat of human-caused climate change, according to a new report released Tuesday by Democrats in Congress.
	The 65-page report was the result of a three-year investigation and was made public hours before a Senate Budget Committee hearing about the role that oil and gas companies have played in global warming."

	If you use this model, please cite: "Sun, K., and Wang, R. 2024. The fine-tuned language model for detecting human attitudes to climate changes".

	The project on GitHub (including training code) is available at: https://github.com/fivehills/climate_attitude_LM/

	The following code shows how to test the model.

	```
	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_path = "KevSun/climate-attitude-LM" # Ensure this path points to the correct directory
	model = AutoModelForSequenceClassification.from_pretrained(model_path)
	tokenizer = AutoTokenizer.from_pretrained(model_path)

	# Define the path to your text file
	file_path = 'yourtext.txt'

	# Read the content of the file
	with open(file_path, 'r', encoding='utf-8') as file:
	    new_text = file.read()

	# Encode the text using the tokenizer used during training
	encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=64)

	# Move the model to the correct device (CPU or GPU if available)
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model = model.to(device) # Move model to the correct device
	encoded_input = {k: v.to(device) for k, v in encoded_input.items()} # Move tensor to the correct device

	model.eval() # Set the model to evaluation mode

	# Perform the prediction
	with torch.no_grad():
	    outputs = model(**encoded_input)

	# Get the predictions (assumes classification with labels)
	predictions = outputs.logits.squeeze()

	# Assuming softmax is needed to interpret the logits as probabilities
	probabilities = torch.softmax(predictions, dim=0)

	# Define labels for each class index based on your classification categories
	labels = ["risk", "neutral", "opportunity"]
	predicted_index = torch.argmax(probabilities).item() # Get the index of the max probability
	predicted_label = labels[predicted_index]
	predicted_probability = probabilities[predicted_index].item()

	# Print the predicted label and its probability
	print(f"Predicted Label: {predicted_label}, Probability: {predicted_probability:.4f}")

	# Example output: Predicted Label: neutral, Probability: 0.8377

	```
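The last step of the testing code, turning the three logits into a label and a probability, can be illustrated without loading the model. The sketch below reimplements softmax and argmax in plain Python; the helper names and the logit values are made up for illustration, and the label order is assumed to match the `labels` list in the testing code.

```python
import math

# Label order assumed to match the `labels` list in the testing code
LABELS = ["risk", "neutral", "opportunity"]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, labels=LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return labels[best], probs[best]


# Hypothetical logits, not produced by the model
label, prob = decode([0.2, 2.1, -0.5])
print(f"Predicted Label: {label}, Probability: {prob:.4f}")
```

The middle logit is the largest here, so the sketch reports `neutral`. The same decoding applies to each row of `outputs.logits` if several texts are classified in one batch.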