Text Classification
Transformers
PyTorch
JAX
Safetensors
code
English
roberta
text-embeddings-inference
How to use Fsoft-AIC/Codebert-docstring-inconsistency with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="Fsoft-AIC/Codebert-docstring-inconsistency")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
```
Update README.md
README.md
CHANGED
```diff
@@ -112,7 +112,7 @@ model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-d
 ## Limitations
 This model is trained on 5M subset of The Vault in a self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
 
-It is hard to evaluate the model due to the unavailable labeled datasets.
+It is hard to evaluate the model due to the unavailable labeled datasets. GPT-3.5-turbo is adopted as a reference to measure the correlation between the model and GPT-3.5-turbo's scores. However, the result could be influenced by GPT-3.5-turbo's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and fine-tune this model to achieve the best result.
 
 ## Additional information
 ### Licensing Information
```
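The added Limitations text describes measuring the correlation between the model's scores and GPT-3.5-turbo's scores. The sketch below shows one way such a comparison could be computed. It assumes Spearman rank correlation, since the README does not say which correlation measure was used, and the two score lists are hypothetical placeholders, not results from the model or from GPT-3.5-turbo.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical scores for five code/docstring pairs (illustrative only).
model_scores = [0.91, 0.15, 0.62, 0.40, 0.78]
reference_scores = [0.85, 0.20, 0.70, 0.30, 0.60]

rho = spearman(model_scores, reference_scores)
print(f"Spearman correlation: {rho:.3f}")
```

A rank-based measure is used here because the two scorers need not share a scale. If a human-labeled dataset becomes available, as the README recommends, the same function can compare the model against those labels instead.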