Commit 1cb6605 (verified) by gmay29 · 1 parent: 3c8e7b8

Update README.md

  1. README.md +32 -23
README.md CHANGED
@@ -29,7 +29,7 @@ Not designed for parsing **candidate resumes**.
29
  ### Training Data
30
 
31
  * **Synthetic internship and job description dataset** generated with **mostly.ai**
32
- * \~X,000 labeled samples (replace with actual dataset size if you know it)
33
  * Labels: `SKILL`, `DISCIPLINE`, `COURSE`, `ROLE`
34
 
35
  ---
@@ -40,27 +40,35 @@ Not designed for parsing **candidate resumes**.
40
  import torch
41
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
42
 
43
- # Load model
44
  model_name = "gmay29/ner_model_final"
45
  tokenizer = AutoTokenizer.from_pretrained(model_name)
46
  model = AutoModelForTokenClassification.from_pretrained(model_name)
47
 
48
- device = 0 if torch.cuda.is_available() else -1
49
 
50
- # Create NER pipeline
51
- ner_pipeline = pipeline(
52
- "ner",
53
- model=model,
54
- tokenizer=tokenizer,
55
- device=device,
56
- aggregation_strategy="simple"
57
- )
58
 
59
- # Example internship/job description
60
  text = """
 
61
  Responsibilities of the Intern:
62
- Design, develop, and implement AI agents for various applications.
63
- Strong programming skills in Python and experience with TensorFlow or PyTorch.
64
  """
65
 
66
  # Run inference
@@ -74,15 +82,16 @@ for ent in entities:
74
  **Example Output:**
75
 
76
  ```
77
- ROLE | Intern | score=0.991
78
- SKILL | design | score=0.874
79
- SKILL | develop | score=0.862
80
- SKILL | implement | score=0.849
81
- SKILL | AI agents | score=0.921
82
- SKILL | Python | score=0.982
83
- SKILL | TensorFlow | score=0.976
84
- SKILL | PyTorch | score=0.973
85
- DISCIPLINE | AI | score=0.955
 
86
  ```
87
 
88
  ---
 
29
  ### Training Data
30
 
31
  * **Synthetic internship and job description dataset** generated with **mostly.ai**
32
+ * \~20,000 labeled samples (replace with actual dataset size if you know it)
33
  * Labels: `SKILL`, `DISCIPLINE`, `COURSE`, `ROLE`
34
 
35
  ---
 
40
  import torch
41
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
42
 
43
+ # Load from Hugging Face Hub
44
  model_name = "gmay29/ner_model_final"
45
  tokenizer = AutoTokenizer.from_pretrained(model_name)
46
  model = AutoModelForTokenClassification.from_pretrained(model_name)
47
 
48
+ device = 0 if torch.cuda.is_available() else -1 # pipeline expects 0 for GPU, -1 for CPU
49
 
50
+ # Create NER pipeline (handles context automatically)
51
+ ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer, device=device, aggregation_strategy="simple")
52
 
53
+ # Example job description text
54
  text = """
55
+ Details
56
  Responsibilities of the Intern:
57
+
58
+ Accurately enter and update data into the company's databases and systems.
59
+ Maintain and organize digital and physical records.
60
+ Assist in generating and compiling reports using data from various sources.
61
+ Perform data quality checks to ensure accuracy and completeness.
62
+ Support the team with other administrative tasks as needed.
63
+ Requirements:
64
+
65
+ Strong data entry skills with high accuracy and attention to detail.
66
+ Proficiency in Microsoft Excel and other data management tools.
67
+ Basic understanding of data analysis and reporting.
68
+ Excellent organizational and time-management skills.
69
+ Good communication skills, both written and verbal.
70
+ Ability to work independently and manage time effectively.
71
+ A proactive approach to problem-solving.
72
  """
73
 
74
  # Run inference
 
82
  **Example Output:**
83
 
84
  ```
85
+ Device set to use cpu
86
+ Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.
87
+ SKILL | detail | score=1.000
88
+ SKILL | Excel | score=1.000
89
+ SKILL | management | score=1.000
90
+ SKILL | data analysis | score=1.000
91
+ SKILL | reporting | score=1.000
92
+ SKILL | time | score=0.993
93
+ SKILL | management | score=1.000
94
+ SKILL | communication skills | score=1.000
95
  ```
96
 
97
  ---
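The diff does not show the loop between `# Run inference` and the example output (only `for ent in entities:` is visible in the hunk header). The following is a minimal sketch of how lines in the `LABEL | text | score=...` format could be produced. It assumes the pipeline returns dictionaries with `entity_group`, `word`, and `score` keys, which is the shape the `transformers` token-classification pipeline uses when `aggregation_strategy` is set. The helper name `format_entities` and the `sample` data are hypothetical, chosen so the sketch runs without downloading the model.

```python
from typing import Dict, List


def format_entities(entities: List[Dict]) -> List[str]:
    """Render aggregated NER results as 'LABEL | text | score=0.000' lines."""
    lines = []
    for ent in entities:
        # Aggregated pipeline results carry "entity_group", "word", and "score".
        lines.append(
            f"{ent['entity_group']} | {ent['word']} | score={ent['score']:.3f}"
        )
    return lines


# Hypothetical sample shaped like the pipeline's aggregated output;
# these values are illustrative, not real model predictions.
sample = [
    {"entity_group": "SKILL", "word": "Excel", "score": 0.99996, "start": 10, "end": 15},
    {"entity_group": "SKILL", "word": "time", "score": 0.9931, "start": 30, "end": 34},
]

for line in format_entities(sample):
    print(line)
```

With the real model, `entities = ner_pipeline(text)` would take the place of `sample`. Because scores are rounded to three decimals, a value such as 0.99996 prints as `score=1.000`, which would explain the many `1.000` entries in the example output without implying the model is certain.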