---
language:
- en
tags:
- regression
- similarity
- sql
- natural-language
- reward-model
license: mit
datasets:
- custom
metrics:
- mse
- mae
- rmse
model-index:
- name: BERT Reward Model for CoT Filtering
  results:
  - task:
      type: regression
      name: Similarity Score Prediction
    dataset:
      name: Custom CoT Dataset
      type: custom
    metrics:
    - type: mse
      value: 0.0238
    - type: mae
      value: 0.1229
    - type: rmse
      value: 0.1543
---

# BERT Reward Model for CoT Filtering

A BERT-based regression model fine-tuned to predict similarity scores between SQL queries, reasoning chains (Chain-of-Thought), and natural language descriptions.

## Model Description

This model is based on `bert-base-uncased` and has been fine-tuned for regression to predict similarity scores in the range [0, 1]. The model takes as input a concatenation of:

- SQL query
- Reasoning/Chain-of-Thought explanation
- Predicted natural language description

The model outputs a similarity score indicating how well the predicted NL matches the ground truth.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("DarianNLP/bert_sequel_beagles")
model = AutoModelForSequenceClassification.from_pretrained(
    "DarianNLP/bert_sequel_beagles",
    num_labels=1,
    problem_type="regression",
)
model.eval()

# Prepare input
sql = "SELECT movie_title FROM movies WHERE movie_release_year = 1945"
reasoning = "think: The SQL selects the movie title..."
predicted_nl = "What was the most popular movie released in 1945?"

input_text = f"SQL: {sql}\nReasoning: {reasoning}\nNL: {predicted_nl}"

# Tokenize and predict
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    # Map the raw logit to a [0, 1] similarity score
    similarity_score = torch.sigmoid(outputs.logits).item()

print(f"Predicted similarity: {similarity_score:.3f}")
```
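
Since the model is intended for filtering chains-of-thought, a natural pattern is to score several candidate (CoT, NL) pairs for the same SQL query and keep the highest-scoring one. The sketch below reuses `model`, `tokenizer`, and `sql` from the snippet above; the candidate strings and the 0.5 threshold are illustrative choices, not values from the released pipeline.

```python
def score_candidate(sql, reasoning, predicted_nl):
    """Score a single (SQL, CoT, NL) triple with the reward model."""
    text = f"SQL: {sql}\nReasoning: {reasoning}\nNL: {predicted_nl}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.sigmoid(logits).item()

# Score several candidate reasoning chains for the same query and keep the best one.
candidates = [
    ("think: The SQL selects the movie title...", "Which movies were released in 1945?"),
    ("think: The query filters by year...", "How many movies came out in 1945?"),
]
scores = [score_candidate(sql, cot, nl) for cot, nl in candidates]
best_idx = max(range(len(scores)), key=lambda i: scores[i])
print(f"Best candidate: {candidates[best_idx][1]} (score={scores[best_idx]:.3f})")

# Optionally drop low-scoring chains (0.5 is an arbitrary illustrative threshold).
kept = [c for c, s in zip(candidates, scores) if s >= 0.5]
```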

## Training Details

- **Base Model**: bert-base-uncased
- **Training Dataset**: Custom CoT dataset with corruptions (7,342 examples)
- **Train/Val/Test Split**: 75% / 12.5% / 12.5%
- **Training Loss**: MSE (Mean Squared Error; see the sketch below)
- **Evaluation Metrics**:
  - MSE: 0.0238
  - MAE: 0.1229
  - RMSE: 0.1543
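
The exact training script and hyperparameters are not published here. The following is a minimal sketch of an equivalent setup with the Hugging Face `Trainer`: with `num_labels=1` and `problem_type="regression"`, Transformers applies an MSE loss to the single output logit. The dataset layout, preprocessing, and hyperparameters below are illustrative assumptions, and whether the released model applied a sigmoid inside the loss (as the Usage example does at inference time) is not stated.

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Single-logit regression head; Transformers uses MSELoss when
# num_labels=1 and problem_type="regression".
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1, problem_type="regression"
)

def preprocess(example):
    # "text" is the concatenated "SQL: ...\nReasoning: ...\nNL: ..." string,
    # "label" a float similarity score in [0, 1].
    enc = tokenizer(example["text"], truncation=True, max_length=512)
    enc["labels"] = float(example["label"])
    return enc

# train_ds and val_ds are assumed to be datasets.Dataset splits with
# "text" and "label" columns, e.g. train_ds = train_ds.map(preprocess).

args = TrainingArguments(
    output_dir="bert-cot-reward",
    num_train_epochs=3,              # illustrative values, not the released run's
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)
# trainer = Trainer(model=model, args=args, train_dataset=train_ds,
#                   eval_dataset=val_ds, tokenizer=tokenizer)
# trainer.train()
```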

## Limitations

- Maximum input length: 512 tokens (BERT's limit); with `truncation=True` as in the example above, longer inputs are silently truncated (see the check below)
- Trained on a specific domain (SQL-to-NL translation with CoT)
- Performance may vary on out-of-domain data
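
As a quick guard against silent truncation, the token count can be checked before scoring. This reuses `tokenizer` and `input_text` from the Usage snippet; the warning message is only illustrative.

```python
# Flag inputs that exceed BERT's 512-token limit before scoring them.
n_tokens = len(tokenizer(input_text, truncation=False)["input_ids"])
if n_tokens > 512:
    print(f"Warning: input is {n_tokens} tokens; only the first 512 will be scored.")
```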

## Citation

If you use this model, please cite:

```bibtex
@misc{bert_cot_reward_model,
  title={BERT Reward Model for Chain-of-Thought Filtering},
  author={Darian Lee},
  year={2025}
}
```