yfyangd committed on
Commit
a0c8563
·
1 Parent(s): fab200c

Update README.md

  1. README.md +42 -1
README.md CHANGED
@@ -10,4 +10,45 @@ pinned: false
10
  license: apache-2.0
11
  ---
12
 
13
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # PaperSimilarity
14
+ ## Background
15
+ National Taiwan University on Tuesday (August 9) announced a decision to rescind a master's degree it gave to Lin Chih-chien (林智堅) in 2017, citing plagiarism after a meeting by the school's academic ethics committee.
16
+
17
+ The review meeting was held in late July and committee members decided to conduct a probe into plagiarism allegations against Lin, Taoyuan candidate of the Democratic Progressive Party (DPP) for the local elections in November. The committee members advised that Lin should be stripped of his master's degree, a decision later approved by the school's Office of Academic Affairs, according to the statement.
18
+
19
+ The school explained that it had received multiple plagiarism complaints against Lin since early July and that the committee later convened to launch the probe, "in due process with no delay." The committee also collaborated with a third party composed of scholars and experts in the field and interviewed stakeholders, including Lin himself and the victim, before reaching its conclusion.
20
+
21
+ "The act sullied the reputation of National Taiwan University...and the school will reinforce the importance of academic integrity and ethics, not letting it happen again."
22
+
23
+ In this study, we propose a machine learning method to analyze the similarity between Mr. Lin's and Mr. Yu's papers and to provide an objective indicator for reference.
24
+
25
+ News: https://news.pts.org.tw/article/594307
26
+
27
+ ## Method
28
+ ```Jaccard Similarity``` ref: https://medium.com/data-science-bootcamp/understand-jaccard-index-jaccard-similarity-in-minutes-25a703fbf9d7
29
+
30
+ ```Cosine Similarity``` ref: https://medium.com/geekculture/cosine-similarity-and-cosine-distance-48eed889a5c4
31
+
32
+ ```Pearson Similarity``` ref: https://medium.com/@cavaldovinos/human-pose-estimation-pose-similarity-dc8bf9f78556
33
+
34
+
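The three measures listed above can be illustrated on plain token lists. The following is a minimal sketch, not the code from this repository: the function names are hypothetical, documents are assumed to be already tokenized (for example by whitespace splitting; Chinese text would need a word segmenter first), and cosine and Pearson similarity are assumed to be computed over raw word-count vectors.

```python
import math
from collections import Counter


def jaccard_similarity(tokens_a, tokens_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def _count_vectors(tokens_a, tokens_b):
    """Build word-count vectors for both documents over their joint vocabulary."""
    counts_a, counts_b = Counter(tokens_a), Counter(tokens_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    return [counts_a[w] for w in vocab], [counts_b[w] for w in vocab]


def _normalized_dot(vec_a, vec_b):
    """Dot product divided by the product of the Euclidean norms."""
    dot = sum(x * y for x, y in zip(vec_a, vec_b))
    norm_a = math.sqrt(sum(x * x for x in vec_a))
    norm_b = math.sqrt(sum(y * y for y in vec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined case (empty or constant vector), reported as 0 here
    return dot / (norm_a * norm_b)


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between the two word-count vectors."""
    vec_a, vec_b = _count_vectors(tokens_a, tokens_b)
    return _normalized_dot(vec_a, vec_b)


def pearson_similarity(tokens_a, tokens_b):
    """Pearson correlation: cosine similarity of the mean-centered count vectors."""
    vec_a, vec_b = _count_vectors(tokens_a, tokens_b)
    if not vec_a:
        return 0.0
    mean_a = sum(vec_a) / len(vec_a)
    mean_b = sum(vec_b) / len(vec_b)
    centered_a = [x - mean_a for x in vec_a]
    centered_b = [y - mean_b for y in vec_b]
    return _normalized_dot(centered_a, centered_b)


if __name__ == "__main__":
    doc_a = "the cat sat on the mat".split()
    doc_b = "the cat lay on the rug".split()
    print(round(jaccard_similarity(doc_a, doc_b), 3))  # 0.429
    print(round(cosine_similarity(doc_a, doc_b), 3))   # 0.75
    print(round(pearson_similarity(doc_a, doc_b), 3))  # 0.3
```

Jaccard looks only at which words occur, while cosine and Pearson also weigh how often they occur, so the three scores can differ noticeably for the same pair of documents.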
35
+ ## Install dependencies
36
+
37
+ ```
38
+ python -m pip install -r requirements.txt
39
+ ```
40
+
41
+ This code was tested with Python 3.7.
42
+
43
+ ## Code
44
+ ```Paper_Similarity.ipynb``` explains the program in detail, including data exploration, data preprocessing, and the model.
45
+
46
+ ```demo.py``` demonstrates the three types of similarity search. Follow the script below. ```Lin.txt``` and ```Yu.txt``` can be replaced with the papers you want to compare, but first follow ```Paper_Similarity.ipynb``` to convert the files from PDF to TXT.
47
+
48
+ ```
49
+ sim = TextSimilarity('/Lin.txt','/Yu.txt')
50
+ sim.JaccardSim(sim.str_a,sim.str_b)
51
+ # Out[20]: 0.38188640627665016
52
+
53
+ sim.splitWordSimlaryty(sim.str_a,sim.str_b,sim=sim.cos_sim)
54
+ # Out[21]: 0.9356422914890785