Fsoft-AIC
/

Codebert-docstring-inconsistency

Text Classification

text-embeddings-inference

Model card Files Files and versions

NamCyan commited on May 11, 2023

Commit

5d56c2d

·

1 Parent(s): e9a7954

Update README.md

Files changed (1) hide show

README.md +4 -4

README.md CHANGED Viewed

@@ -102,17 +102,17 @@ Using model with Jax and Pytorch
 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification, FlaxAutoModelForSequenceClassification
-#Load jax model
 model = FlaxAutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
-#Load torch model
 model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
 ```
 ## Limitations
-This model is trained on a subset of 5M data in The Vault in the self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
-It is hard to evaluate the model due to the unavailable labeled datasets. ChatGPT is adopted as a reference to measure the correlation between the model and ChatGPT's scores. However, the result could be influenced by ChatGPT's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and finetune this model to achieve the best result.
 ## Additional information
 ### Licensing Information

 ```python
 from transformers import AutoTokenizer, AutoModelForSequenceClassification, FlaxAutoModelForSequenceClassification
+#Load model with jax
 model = FlaxAutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
+#Load model with torch
 model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
 ```
 ## Limitations
+This model is trained on 5M subset of The Vault in a self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
+It is hard to evaluate the model due to the unavailable labeled datasets. ChatGPT is adopted as a reference to measure the correlation between the model and ChatGPT's scores. However, the result could be influenced by ChatGPT's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and fine-tune this model to achieve the best result.
 ## Additional information
 ### Licensing Information