AutoPage

Runtime error

App Files Files Community

AutoPage / templates /deep-ltl.github.io /index.html

Mqleet

[update] templates

a3d3755 about 2 months ago

raw

history blame contribute delete

12.8 kB

	<!DOCTYPE html>
	<html lang="en">

	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>DeepLTL: Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL</title>
	<link rel="stylesheet" href="style.css">
	<link rel="preconnect" href="https://fonts.googleapis.com">
	<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
	<link
	href="https://fonts.googleapis.com/css2?family=IBM+Plex+Sans:ital,wght@0,100..700;1,100..700&family=Inter:ital,opsz,wght@0,14..32,100..900;1,14..32,100..900&display=swap"
	rel="stylesheet">
	<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
	</head>

	<body>

	<header>
	<div class="content">
	<h1>DeepLTL</h1>
	<h2>Learning to Efficiently Satisfy Complex LTL Specifications for Multi-Task RL</h2>
	<p class="authors">Mathias Jackermeier,<br class="wide-hidden" /> Alessandro Abate <br />
	University of Oxford</p>
	<p>ICLR 2025 (Oral)</p>
	<div class="links">
	<a href="https://openreview.net/pdf?id=9pW2J49flQ" target="_blank">Paper</a> <span class="separator">/</span>
	<a href="https://deep-ltl.github.io/slides.pdf" target="_blank">Slides</a> <span class="separator">/</span>
	<a href="https://github.com/mathiasj33/deep-ltl" target="_blank">GitHub</a>
	</div>
	</div>
	<div class="tldr">
	<p>💡<i>TL;DR: We develop a novel approach to train RL agents to zero-shot follow arbitrary instructions specified in a formal language.</i></p>
	</div>
	</header>

	<main>
	<section id="abstract">
	<div class="content">
	<h2>Abstract</h2>
	<p>
	Linear temporal logic (LTL) has recently been adopted as a powerful formalism
	for specifying complex, temporally extended tasks in multi-task reinforcement
	learning (RL). However, learning policies that efficiently satisfy arbitrary specifications not
	observed
	during training remains a challenging problem. Existing approaches suffer from several shortcomings:
	they are often only applicable to
	finite-horizon fragments of LTL, are restricted to suboptimal solutions, and do not
	adequately handle safety constraints. In this work, we propose a novel learning
	approach to address these concerns. Our method leverages the structure of Büchi
	automata, which explicitly represent the semantics of LTL specifications, to learn
	policies conditioned on sequences of truth assignments that lead to satisfying the
	desired formulae. Experiments in a variety of discrete and continuous domains
	demonstrate that our approach is able to zero-shot satisfy a wide range of finite- and
	infinite-horizon specifications, and outperforms existing methods in terms of both
	satisfaction probability and efficiency.
	</p>
	</div>
	</section>

	<section>
	<div class="content">
	<h2>DeepLTL</h2>
	<div class="image overview">
	<img src="images/overview.png" alt="Schematic overview of DeepLTL">
	</div>
	<p>
	We develop a novel approach, called <i>DeepLTL</i>, to learn policies that efficiently zero-shot
	satisfy complex LTL specifications.
	We first employ goal-conditioned RL to learn a general policy that can execute reach-avoid
	sequences
	of high-level propositions, such as "reach goal A while avoiding region B, and subsequently stay
	in
	region C". At test time, we first translate a given LTL specification into a limit-deterministic
	Büchi automaton (LDBA), from which we extract possible reach-avoid sequences that satisfy the
	formula. We select the optimal sequence according to the learned value function, and execute it
	with
	the trained policy.
	</p>
	</div>
	</section>

	<section>
	<div class="content">
	<h2>Quantitative results</h2>
	<div class="image">
	<img src="images/curves.png" alt="Schematic overview of DeepLTL">
	</div>
	<p>
	We find that DeepLTL is able to efficiently zero-shot satisfy a wide range of finite- and
	infinite-horizon LTL specifications in a variety of discrete and continuous domains. Our
	approach
	outperforms existing methods in terms of both satisfaction probability and efficiency.
	</p>
	</div>
	</section>

	<section>
	<div class="content">
	<h2>Qualitative results</h2>
	<div class="videos">
	<div class="video-with-caption">
	<div class="video-container">
	<video muted loop disablePictureInPicture preload="metadata">
	<source src="images/output.mp4#t=0.1" type="video/mp4">
	</video>
	<button class="play-button">
	<svg class="play-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
	<path d="M8 5v14l11-7z" />
	</svg>
	<svg class="pause-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
	<path d="M6 19h4V5H6v14zm8-14v14h4V5h-4z" />
	</svg>
	</button>
	</div>
	<p>
	\((\neg(\mathsf{yellow}\lor \mathsf{green})\;\mathsf{U}\; \mathsf{magenta}) \land
	\mathsf{F}\,\mathsf{blue}\)
	</p>
	</div>
	<div class="video-with-caption">
	<div class="video-container">
	<video muted loop disablePictureInPicture preload="metadata">
	<source src="images/infinite.mp4#t=0.1" type="video/mp4">
	</video>
	<button class="play-button">
	<svg class="play-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
	<path d="M8 5v14l11-7z" />
	</svg>
	<svg class="pause-icon" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">
	<path d="M6 19h4V5H6v14zm8-14v14h4V5h-4z" />
	</svg>
	</button>
	</div>
	<p>
	\(\mathsf{G}\mathsf{F}\,\mathsf{green} \land \mathsf{G}\mathsf{F}\,\mathsf{yellow} \land
	\mathsf{G}\,\neg\mathsf{blue}\)
	</p>
	</div>
	</div>
	<p>
	We demonstrate the behaviour of the task-conditioned policy learned by DeepLTL in the zone
	environment. DeepLTL consistently achieves the desired behaviour, even with complex,
	infinite-horizon specifications.
	</p>
	</div>
	</section>

	<section>
	<div class="content">
	<h2>Citation</h2>
	<div class="citation">
	<pre>
	@inproceedings{deepltl,
	title = {Deep{LTL}: Learning to Efficiently Satisfy Complex {LTL} Specifications for Multi-Task {RL}},
	author = {Mathias Jackermeier and Alessandro Abate},
	booktitle = {International Conference on Learning Representations ({ICLR})},
	year = {2025}
	}</pre>
	<div class="copy">
	<span class="copy-text">Copied!</span>
	<div class="copy-icon" onclick="copyCitation()">
	<svg xmlns="http://www.w3.org/2000/svg" class="copy-click"
	viewBox="0 0 448 512"><!--!Font Awesome Free 6.7.2 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free Copyright 2025 Fonticons, Inc.-->
	<path fill="currentColor"
	d="M384 336l-192 0c-8.8 0-16-7.2-16-16l0-256c0-8.8 7.2-16 16-16l140.1 0L400 115.9 400 320c0 8.8-7.2 16-16 16zM192 384l192 0c35.3 0 64-28.7 64-64l0-204.1c0-12.7-5.1-24.9-14.1-33.9L366.1 14.1c-9-9-21.2-14.1-33.9-14.1L192 0c-35.3 0-64 28.7-64 64l0 256c0 35.3 28.7 64 64 64zM64 128c-35.3 0-64 28.7-64 64L0 448c0 35.3 28.7 64 64 64l192 0c35.3 0 64-28.7 64-64l0-32-48 0 0 32c0 8.8-7.2 16-16 16L64 464c-8.8 0-16-7.2-16-16l0-256c0-8.8 7.2-16 16-16l32 0 0-48-32 0z" />
	</svg>
	<svg xmlns="http://www.w3.org/2000/svg" class="copy-confirm"
	viewBox="0 0 448 512"><!--!Font Awesome Free 6.7.2 by @fontawesome - https://fontawesome.com License - https://fontawesome.com/license/free Copyright 2025 Fonticons, Inc.-->
	<path fill="currentColor"
	d="M438.6 105.4c12.5 12.5 12.5 32.8 0 45.3l-256 256c-12.5 12.5-32.8 12.5-45.3 0l-128-128c-12.5-12.5-12.5-32.8 0-45.3s32.8-12.5 45.3 0L160 338.7 393.4 105.4c12.5-12.5 32.8-12.5 45.3 0z" />
	</svg>
	</div>
	</div>
	</div>
	</div>
	</section>
	</main>

	<footer>
	<div class="content">
	Header background by <a href="http://www.freepik.com">Harryarts / Freepik</a>
	</div>
	</footer>

	<script>
	document.addEventListener('DOMContentLoaded', () => {
	document.querySelectorAll('.video-container').forEach(container => {
	const video = container.querySelector('video');
	const button = container.querySelector('.play-button');
	const playIcon = button.querySelector('.play-icon');
	const pauseIcon = button.querySelector('.pause-icon');

	container.addEventListener('click', () => {
	if (video.paused) {
	video.play();
	playIcon.style.opacity = 0;
	pauseIcon.style.opacity = 1;

	setTimeout(() => {
	button.style.opacity = 0;
	button.style.cursor = 'auto';
	}, 50);
	} else {
	video.pause();
	button.style.opacity = 1;
	playIcon.style.opacity = 1;
	pauseIcon.style.opacity = 0;
	button.style.cursor = 'pointer';
	}
	});
	});
	});

	const copyCitation = () => {
	const citationText = document.querySelector('.citation pre').textContent;
	navigator.clipboard.writeText(citationText).then(showCopyConfirm, (err) => console.error(err));
	}

	let copyEnabled = true;

	const showCopyConfirm = () => {
	if (!copyEnabled) return;
	copyEnabled = false;
	const copyIcon = document.querySelector('.copy-click');
	const copyConfirm = document.querySelector('.copy-confirm');
	const copyText = document.querySelector('.copy-text');
	copyIcon.style.opacity = 0;
	copyConfirm.style.opacity = 1;
	copyText.classList.add('fade-in');
	setTimeout(() => {
	copyIcon.style.opacity = 1;
	copyConfirm.style.opacity = 0;
	copyText.classList.remove('fade-in');
	copyText.classList.add('fade-out');
	setTimeout(() => {
	copyText.classList.remove('fade-out');
	copyEnabled = true;
	}, 300);
	}, 2000);
	}
	</script>
	</body>

	</html>