<!DOCTYPE html>
<html>
<head>
    <title>Lesk Algorithm Explained</title>
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
    <style>
        body {
            background: linear-gradient(135deg, #f8fafc 0%, #e3e6f3 100%);
            min-height: 100vh;
            display: flex;
            flex-direction: column;
        }
        .navbar {
            box-shadow: 0 2px 8px rgba(0,0,0,0.07);
        }
        .main-container {
            max-width: 800px;
            margin: 0 auto 2rem auto;
            padding: 2rem;
            background-color: white;
            border-radius: 16px;
            box-shadow: 0 4px 24px rgba(79,140,255,0.07);
        }
        .code-block {
            background-color: #f5f5f5;
            padding: 1rem;
            border-radius: 8px;
            font-family: monospace;
            white-space: pre-wrap;
            border-left: 4px solid #4f8cff;
            font-size: 0.9rem;
        }
        .algorithm-step {
            background-color: #e3e6f3;
            padding: 1.2rem;
            border-radius: 12px;
            margin-bottom: 1.2rem;
            transition: transform 0.2s;
        }
        .algorithm-step:hover {
            transform: translateY(-3px);
        }
        .enhancement {
            background-color: #e3f2fd;
            border-left: 4px solid #4f8cff;
            padding: 1rem;
            margin-bottom: 1rem;
            border-radius: 0 8px 8px 0;
        }
        .section-title {
            margin-bottom: 1.2rem;
            font-weight: 600;
            color: #333;
            display: flex;
            align-items: center;
        }
        .section-title i {
            margin-right: 0.5rem;
            color: #4f8cff;
        }
        .step-number {
            display: inline-flex;
            align-items: center;
            justify-content: center;
            width: 32px;
            height: 32px;
            background-color: #4f8cff;
            color: white;
            border-radius: 50%;
            margin-right: 0.8rem;
            font-weight: bold;
        }
        .advantage-item {
            display: flex;
            align-items: flex-start;
            margin-bottom: 0.8rem;
        }
        .advantage-item i {
            color: #10b981;
            margin-right: 0.8rem;
            margin-top: 0.25rem;
        }
        .footer {
            margin-top: auto;
            background: #f8fafc;
            color: #6c757d;
            text-align: center;
            padding: 1rem 0 0.5rem 0;
            font-size: 0.95rem;
        }
    </style>
</head>
<body>

    <nav class="navbar navbar-expand-lg navbar-light bg-white mb-4">
        <div class="container">
            <a class="navbar-brand fw-bold" href="/">
                <i class="fa-solid fa-brain me-2"></i>WSD Tool
            </a>
            <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbarNav">
                <span class="navbar-toggler-icon"></span>
            </button>
            <div class="collapse navbar-collapse" id="navbarNav">
                <ul class="navbar-nav ms-auto">
                    <li class="nav-item">
                        <a href="{{ url_for('index') }}" class="btn btn-outline-primary">
                            <i class="fa-solid fa-arrow-left me-1"></i>Back to Tool
                        </a>
                    </li>
                </ul>
            </div>
        </div>
    </nav>

    <div class="container main-container">
        <h2 class="text-center mb-4">Understanding the Enhanced Lesk Algorithm</h2>

        <div class="mb-4">
            <h4 class="section-title"><i class="fa-solid fa-info-circle"></i>What is Word Sense Disambiguation?</h4>
            <p>Word Sense Disambiguation (WSD) is the task of identifying which meaning of a word is intended in a sentence when the word has several meanings. For example, a WSD system decides whether "bank" refers to a financial institution or the side of a river.</p>
        </div>

        <div class="mb-4">
            <h4 class="section-title"><i class="fa-solid fa-book"></i>The Original Lesk Algorithm</h4>
            <p>The Lesk algorithm, introduced by Michael Lesk in 1986, is a classical approach to WSD based on dictionary definitions. In the widely used simplified form described here, it compares the definition of each possible sense with the words in the context; Lesk's original formulation compared the definitions of neighboring words with one another.</p>

            <div class="enhancement">
                <p class="mb-0"><strong>Basic Idea:</strong> Choose the sense whose dictionary definition shares the most words with the context in which the target word appears.</p>
            </div>
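            <p>The overlap idea can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the glosses, the stopword list, and the function names tokenize and simplified_lesk are assumptions made for the example.</p>
            <div class="code-block">
```python
# Minimal sketch of the overlap idea: pick the sense whose gloss
# shares the most words with the context. The glosses and stopword
# list are illustrative placeholders, not real dictionary data.

STOPWORDS = {"a", "an", "the", "in", "of", "for", "with", "she", "used"}

def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip('.,;:!?"').lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}

def simplified_lesk(context, glosses):
    """Return (best_sense, overlap_counts) for a dict of sense name to gloss."""
    context_words = tokenize(context)
    overlaps = {
        sense: len(context_words.intersection(tokenize(gloss)))
        for sense, gloss in glosses.items()
    }
    best = max(overlaps, key=overlaps.get)
    return best, overlaps

glosses = {
    "animal": "a nocturnal flying mammal with wings",
    "sports": "an implement used for hitting a ball in sports",
}
best, overlaps = simplified_lesk("She saw a bat flying in the dark", glosses)
print(best, overlaps)
```
            </div>
            <p>With these placeholder glosses the animal sense wins on the single shared word "flying". With the shorter gloss "a nocturnal mammal with wings", the overlap would be zero for both senses, which is the kind of gap the enhancements described on this page aim to close.</p>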
        </div>

        <div class="mb-5">
            <h4 class="section-title"><i class="fa-solid fa-rocket"></i>Our Enhanced Lesk Algorithm</h4>
            <p>We extend the original Lesk algorithm with five enhancements:</p>

            <div class="algorithm-step">
                <h5><span class="step-number">1</span>BERT Semantic Similarity</h5>
                <p>Instead of only counting overlapping words, we use BERT embeddings to calculate the semantic similarity between the context and each sense definition, so that related words (such as "dark" and "nocturnal") can contribute even when they do not match exactly.</p>
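                <p>As a rough sketch of the comparison step, assuming the context and each sense definition have already been turned into embedding vectors: cosine similarity is one common way to compare such vectors, though this page does not say which measure the tool uses. The three-dimensional vectors below are made-up placeholders; real BERT-base embeddings have 768 dimensions.</p>
                <div class="code-block">
```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Placeholder vectors standing in for real sentence embeddings.
context_vec = [0.9, 0.1, 0.3]
sense_vecs = {
    "animal": [0.8, 0.2, 0.4],
    "sports": [0.1, 0.9, 0.2],
}

similarities = {
    sense: cosine_similarity(context_vec, vec)
    for sense, vec in sense_vecs.items()
}
best_sense = max(similarities, key=similarities.get)
print(best_sense)
```
                </div>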
            </div>

            <div class="algorithm-step">
                <h5><span class="step-number">2</span>Context Weighting</h5>
                <p>Words closer to the target word are given higher weight, as they're more likely to be relevant to its meaning. This proximity-based weighting improves accuracy.</p>
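                <p>One simple way to express proximity weighting is an inverse-distance weight. The sketch below assumes that scheme purely for illustration; the exact weighting function is not specified on this page, and the name proximity_weights is hypothetical.</p>
                <div class="code-block">
```python
def proximity_weights(tokens, target_index):
    """Map each context position to 1 / (distance from the target word)."""
    return {
        i: 1.0 / abs(i - target_index)
        for i in range(len(tokens))
        if i != target_index
    }

tokens = ["she", "saw", "a", "bat", "flying", "in", "the", "dark"]
weights = proximity_weights(tokens, tokens.index("bat"))

# Immediate neighbours get weight 1.0; weights shrink with distance.
for i, w in sorted(weights.items()):
    print(tokens[i], round(w, 2))
```
                </div>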
            </div>

            <div class="algorithm-step">
                <h5><span class="step-number">3</span>Rich Sense Signatures</h5>
                <p>We expand sense definitions with examples, hypernyms, hyponyms, and other related terms from WordNet to create richer signatures for comparison.</p>
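                <p>The following sketch shows how such a signature might be assembled. To stay self-contained it uses a small hand-written dictionary in place of WordNet; the entries, the sense keys, and the name build_signature are all illustrative assumptions.</p>
                <div class="code-block">
```python
# Toy stand-in for lexical database entries. In a real system these
# fields might come from WordNet; every value here is made up.
TOY_SENSES = {
    "bat.animal": {
        "definition": "nocturnal mammal with wings",
        "examples": ["bats fly out of the cave at night"],
        "hypernyms": ["mammal", "animal"],
        "hyponyms": ["fruit bat", "vampire bat"],
    },
    "bat.sports": {
        "definition": "implement used for hitting a ball in sports",
        "examples": ["he swung the bat at the ball"],
        "hypernyms": ["implement", "equipment"],
        "hyponyms": ["baseball bat", "cricket bat"],
    },
}

def build_signature(entry):
    """Collect every word from the definition, examples, and related terms."""
    texts = (
        [entry["definition"]]
        + entry["examples"]
        + entry["hypernyms"]
        + entry["hyponyms"]
    )
    signature = set()
    for text in texts:
        signature.update(text.lower().split())
    return signature

signatures = {sense: build_signature(e) for sense, e in TOY_SENSES.items()}
print(sorted(signatures["bat.animal"]))
```
                </div>
                <p>This sketch does no stemming or stopword removal, so "wings" and "wing" count as different words; a fuller version would normalize the words first.</p>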
            </div>

            <div class="algorithm-step">
                <h5><span class="step-number">4</span>Collocation Detection</h5>
                <p>We identify common word combinations (like "river bank" or "baseball bat") that strongly indicate specific senses.</p>
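                <p>A collocation check can be as simple as a lookup table of word pairs. The sketch below assumes adjacent-word pairs and a hand-written table; the table contents, the sense labels, and the name find_collocation_senses are hypothetical.</p>
                <div class="code-block">
```python
# Hypothetical table mapping adjacent word pairs to the sense they indicate.
COLLOCATIONS = {
    ("river", "bank"): "bank.river",
    ("baseball", "bat"): "bat.sports",
    ("bat", "flying"): "bat.animal",
}

def find_collocation_senses(tokens, target):
    """Return the senses indicated by adjacent pairs that contain the target word."""
    hits = []
    for pair in zip(tokens, tokens[1:]):
        if target in pair and pair in COLLOCATIONS:
            hits.append(COLLOCATIONS[pair])
    return hits

tokens = "she saw a bat flying in the dark".split()
hits = find_collocation_senses(tokens, "bat")
print(hits)
```
                </div>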
            </div>

            <div class="algorithm-step">
                <h5><span class="step-number">5</span>User Feedback Learning</h5>
                <p>The system learns from user corrections, improving its accuracy over time by adjusting sense scores based on feedback.</p>
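                <p>One possible shape for this adjustment, sketched under assumptions: each correction lowers a stored bonus for the wrongly predicted sense and raises it for the corrected one, and the bonus is added to later scores. The class name FeedbackStore, the step size, and the update rule are illustrative and not taken from the tool.</p>
                <div class="code-block">
```python
from collections import defaultdict

class FeedbackStore:
    """Keeps a per-(word, sense) bonus that is nudged by user corrections."""

    def __init__(self, step=0.5):
        self.step = step  # hypothetical adjustment size per correction
        self.bonus = defaultdict(float)

    def record_correction(self, word, predicted, correct):
        """Penalize the wrongly predicted sense and reward the corrected one."""
        if predicted != correct:
            self.bonus[(word, predicted)] -= self.step
            self.bonus[(word, correct)] += self.step

    def adjust(self, word, scores):
        """Return a copy of the scores with the learned bonuses added."""
        return {s: v + self.bonus[(word, s)] for s, v in scores.items()}

store = FeedbackStore()
store.record_correction("bat", predicted="sports", correct="animal")
adjusted = store.adjust("bat", {"animal": 2.0, "sports": 2.25})
print(adjusted)
```
                </div>
                <p>In this example a single correction is enough to flip the ranking, because the two raw scores were close to begin with.</p>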
            </div>
        </div>

        <div class="mb-4">
            <h4 class="section-title"><i class="fa-solid fa-code"></i>Example</h4>
            <p>For the sentence "She saw a bat flying in the dark":</p>

            <div class="code-block">
Target word: "bat"

Possible senses:
1. "a nocturnal mammal with wings"
2. "an implement used for hitting a ball in sports"

Context words: [she, saw, flying, dark]

Collocation check: "bat flying" → strong indicator of animal sense
Rule application: "flying" → animal sense rule triggered

Sense 1 signature: [nocturnal, mammal, wing, fly, night, animal, cave, ...]
Sense 2 signature: [implement, hit, ball, sport, game, baseball, cricket, ...]

Overlap scores:
- Sense 1: High overlap with "flying" and "dark" (related to nocturnal, night)
- Sense 2: Low overlap with context words

BERT similarity:
- Sense 1: High similarity between "bat flying in the dark" and "nocturnal mammal with wings"
- Sense 2: Lower similarity with sports equipment definition

Final scores:
- Sense 1 (animal): 8.7
- Sense 2 (sports): 2.3

Result: Sense 1 is selected as the correct meaning.</div>
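            <p>This page does not state how the individual signals are merged into a final score. A weighted sum is one plausible scheme, sketched below; the weights and component values are invented for illustration and are not meant to reproduce the 8.7 and 2.3 shown above.</p>
            <div class="code-block">
```python
# Hypothetical weights for each signal; not taken from the tool.
WEIGHTS = {"overlap": 1.0, "bert": 5.0, "collocation": 3.0}

def combine(components, weights=WEIGHTS):
    """Weighted sum of the per-signal values for one sense."""
    return sum(weights[name] * value for name, value in components.items())

# Made-up component values for the two senses of "bat".
candidates = {
    "animal": {"overlap": 2, "bert": 0.75, "collocation": 1},
    "sports": {"overlap": 0, "bert": 0.25, "collocation": 0},
}

final = {sense: combine(parts) for sense, parts in candidates.items()}
winner = max(final, key=final.get)
print(winner, final)
```
            </div>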
        </div>

        <div class="mb-4">
            <h4 class="section-title"><i class="fa-solid fa-chart-line"></i>Advantages Over Basic Lesk</h4>

            <div class="advantage-item">
                <i class="fa-solid fa-check-circle fa-lg"></i>
                <div>Higher accuracy for common ambiguous words</div>
            </div>
            <div class="advantage-item">
                <i class="fa-solid fa-check-circle fa-lg"></i>
                <div>Better handling of contextual nuances</div>
            </div>
            <div class="advantage-item">
                <i class="fa-solid fa-check-circle fa-lg"></i>
                <div>Integration of modern NLP techniques</div>
            </div>
            <div class="advantage-item">
                <i class="fa-solid fa-check-circle fa-lg"></i>
                <div>Adaptive learning from user feedback</div>
            </div>
            <div class="advantage-item">
                <i class="fa-solid fa-check-circle fa-lg"></i>
                <div>Combination of statistical and rule-based approaches</div>
            </div>
        </div>

        <div class="text-center mt-5">
            <a href="{{ url_for('index') }}" class="btn btn-primary px-4 py-2">
                <i class="fa-solid fa-flask me-2"></i>Try the WSD Tool
            </a>
        </div>
    </div>

    <footer class="footer">
        <div>Made with <i class="fa-solid fa-heart text-danger"></i> by <a href="https://github.com/Gunjankumar55" target="_blank">Gunjankumar Choudhari</a> | <a href="{{ url_for('index') }}">Home</a></div>
        <div class="mt-1">© 2024 WSD Tool. All rights reserved.</div>
    </footer>

    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>