Add model weights

Files changed (8)

README.md +32 -0
config.json +3 -0
pytorch_model.bin +3 -0
special_tokens_map.json +3 -0
tokenizer.json +3 -0
tokenizer_config.json +3 -0
training_args.bin +3 -0
vocab.txt +0 -0

README.md ADDED

	@@ -0,0 +1,32 @@

+# Financial Relation Extraction
+## Process
+The task is to detect whether a relationship exists between financial terms in a sentence and, if it does, to classify the type of that relationship. Examples:
+* An A-B trust is a joint trust created by a married couple for the purpose of minimizing estate taxes. (<em>Relationship **exists**, type: **is**</em>)
+* There are no withdrawal penalties. (<em>Relationship **does not exist**, type: **x**</em>)
+## Data
+The data consists of financial definitions collected from different sources (Wikimedia, IFRS, Investopedia) for financial indicators. Each definition has been split up into sentences, and term relationships in a sentence have been extracted using the [Stanford Open Information Extraction](https://nlp.stanford.edu/software/openie.html) module.
+A typical row in the dataset consists of a definition sentence and its corresponding relationship label.
+The labels were restricted to the five most widely identified relationships, namely: **x** (no relationship), **has**, **is in**, **is** and **are**.
+## Model
+The model used is a standard DistilBERT base (uncased) transformer model from the Hugging Face library. See the [Hugging Face DistilBERT base model](https://huggingface.co/distilbert-base-uncased) card for more details about the model.
+In addition, the model has been pretrained to initialize weights that would otherwise be unused if loaded from an existing pretrained stock model.
+## Metrics
+The evaluation metrics used are: Precision, Recall and F1-score. The following is the classification report on the test set.
+| relation  | precision        | recall           | f1-score  | support  |
+| ------------- |:-------------:|:-------------:|:-------------:| -----:|
+| has | 0.7416      | 0.9674 | 0.8396 | 2362 |
+| is in | 0.7813      | 0.7925 | 0.7869 | 2362 |
+| is | 0.8650     | 0.6863 | 0.7653 | 2362 |
+| are | 0.8365     | 0.8493 | 0.8429 | 2362 |
+| x | 0.9515     | 0.8302 | 0.8867 | 2362 |
+|  |      |  |  |  |
+| macro avg | 0.8352     | 0.8251 | 0.8243 | 11810 |
+| weighted avg | 0.8352     | 0.8251 | 0.8243 | 11810 |
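
Note on the classification report in the README above: the two average rows can be reproduced from the per-class rows. The following is a minimal sketch that copies the five rows from the table into a dictionary (the names `report`, `macro_avg` and `weighted_avg` are illustrative and not part of this commit) and recomputes both averages. Because every class has the same support (2362), the macro and weighted averages coincide.

```python
# Per-class scores copied from the classification report in the README.
report = {
    "has":   {"precision": 0.7416, "recall": 0.9674, "f1": 0.8396, "support": 2362},
    "is in": {"precision": 0.7813, "recall": 0.7925, "f1": 0.7869, "support": 2362},
    "is":    {"precision": 0.8650, "recall": 0.6863, "f1": 0.7653, "support": 2362},
    "are":   {"precision": 0.8365, "recall": 0.8493, "f1": 0.8429, "support": 2362},
    "x":     {"precision": 0.9515, "recall": 0.8302, "f1": 0.8867, "support": 2362},
}


def macro_avg(rows, metric):
    """Unweighted mean of a metric over all classes."""
    return sum(row[metric] for row in rows.values()) / len(rows)


def weighted_avg(rows, metric):
    """Mean of a metric over all classes, weighted by class support."""
    total = sum(row["support"] for row in rows.values())
    return sum(row[metric] * row["support"] for row in rows.values()) / total


for metric in ("precision", "recall", "f1"):
    print(metric, round(macro_avg(report, metric), 4), round(weighted_avg(report, metric), 4))
```

The balanced support also suggests the test set was sampled evenly across the five labels, although the README does not state this explicitly.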

config.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:07f7615afabda7ff754ea77e3a06a2d218132bfcc3aa42e22f22ac1585bd7718
+size 774
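
Note: the three added lines above are a Git LFS pointer, not the JSON content itself; the same three-field layout (`version`, `oid`, `size`) recurs for the other LFS-tracked files in this commit. As a rough sketch, a pointer can be parsed and a downloaded file checked against it as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, info):
    """Return True if the raw bytes have the size and SHA-256 digest named by the pointer."""
    if info["hash_algo"] != "sha256":
        raise ValueError("only sha256 pointers are handled in this sketch")
    return len(data) == info["size"] and hashlib.sha256(data).hexdigest() == info["digest"]


# Pointer text copied from the config.json diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:07f7615afabda7ff754ea77e3a06a2d218132bfcc3aa42e22f22ac1585bd7718
size 774
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Passing the bytes of the actual `config.json` (for example, read with `open(path, "rb").read()`) to `matches_pointer` would confirm that a download is complete and unmodified.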

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ff72da51e34eb2d892c303c6f5f7beed57b13965a4c9a0e1379fb95654da8d30
+size 267872407

special_tokens_map.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:303df45a03609e4ead04bc3dc1536d0ab19b5358db685b6f3da123d05ec200e3
+size 112

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ecafc709dc78a0d00e3bc20477606e97ccfd239fdfddd7e53fcb24300ba0bc13
+size 466247

tokenizer_config.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:87fab29eb94840215d6b277994841550362ceff337f16bbf44e9af30fd2fb62d
+size 291

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e2ab6d3f261a834531ac404acb765265a52a8016338c732b78daa1f299bf6002
+size 2415

vocab.txt ADDED

The diff for this file is too large to render. See raw diff