# Pagination in FastAPI
Pagination is essential for any API that returns collections of resources. Without pagination, endpoints serving large datasets would consume excessive memory, bandwidth, and time. FastAPI supports multiple pagination strategies, each suited to different use cases.
## Offset/Limit Pagination (Skip/Limit Pattern)
The most common approach uses `skip` and `limit` query parameters:
```python
from fastapi import FastAPI, Query, Depends
from pydantic import BaseModel

app = FastAPI()


class Item(BaseModel):
    id: int
    name: str
    price: float


# Simulated database with 10,000 items
all_items = [Item(id=i, name=f"Item {i}", price=round(i * 1.5, 2)) for i in range(1, 10001)]


class PaginationParams:
    def __init__(
        self,
        skip: int = Query(default=0, ge=0, description="Number of items to skip"),
        limit: int = Query(default=20, ge=1, le=100, description="Number of items to return"),
    ):
        self.skip = skip
        self.limit = limit


@app.get("/items/")
async def list_items(pagination: PaginationParams = Depends()):
    items = all_items[pagination.skip : pagination.skip + pagination.limit]
    return {
        "items": items,
        "total": len(all_items),
        "skip": pagination.skip,
        "limit": pagination.limit,
    }
```
This implementation uses a default page size of 20 items, a minimum of 1 item per page, and a maximum of 100 items per page. For a dataset of 10,000 items with the default page size of 20, there are 500 total pages. Requesting page 3 would use `skip=40&limit=20` to retrieve items 41 through 60.
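As a quick check of that arithmetic, FastAPI's `TestClient` can exercise the endpoint above (a minimal sketch, assuming the `app` and `all_items` from the previous block are in scope):

```python
from fastapi.testclient import TestClient

client = TestClient(app)

# Page 3 at the default page size: skip = (3 - 1) * 20 = 40
resp = client.get("/items/", params={"skip": 40, "limit": 20})
data = resp.json()

assert [item["id"] for item in data["items"]] == list(range(41, 61))  # items 41-60
assert data["total"] == 10000
```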
The offset/limit pattern is simple to implement but has performance drawbacks for large offsets. A query with `skip=9000` on a SQL database must scan and discard 9,000 rows before returning the requested 20, resulting in O(n) performance where n is the offset value.
## Cursor-Based Pagination
Cursor-based pagination uses an opaque token (cursor) pointing to the last item in the previous page. This avoids the performance degradation of large offsets:
```python
import base64

from fastapi import FastAPI, Query

app = FastAPI()

# Item and all_items are the model and simulated database from the first example.


def encode_cursor(item_id: int) -> str:
    return base64.urlsafe_b64encode(f"id:{item_id}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    decoded = base64.urlsafe_b64decode(cursor.encode()).decode()
    return int(decoded.split(":")[1])


@app.get("/items/")
async def list_items(
    cursor: str | None = Query(default=None, description="Pagination cursor"),
    limit: int = Query(default=20, ge=1, le=100),
):
    if cursor:
        last_id = decode_cursor(cursor)
        # In a real DB: SELECT * FROM items WHERE id > last_id ORDER BY id LIMIT limit
        items = [item for item in all_items if item.id > last_id][:limit]
    else:
        items = all_items[:limit]
    next_cursor = None
    if len(items) == limit:
        next_cursor = encode_cursor(items[-1].id)
    return {
        "items": items,
        "next_cursor": next_cursor,
        "limit": limit,
        "has_more": len(items) == limit,
    }
```
Cursor-based pagination keeps the cost of each page roughly constant no matter how deep the client has paginated: an indexed `WHERE id > last_id` lookup is a single index seek, so the database never scans and discards skipped rows the way a large offset does. It is the recommended approach for datasets exceeding 100,000 records or for real-time feeds where items may be inserted or deleted between page requests.
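To illustrate the contract from the client side, here is a sketch that walks the entire collection by following `next_cursor` until the server reports no more items (again using `TestClient` as a stand-in for a real HTTP client):

```python
from fastapi.testclient import TestClient

client = TestClient(app)

# Follow next_cursor until has_more is False.
cursor = None
seen = 0
while True:
    params = {"limit": 100}
    if cursor:
        params["cursor"] = cursor
    data = client.get("/items/", params=params).json()
    seen += len(data["items"])
    if not data["has_more"]:
        break
    cursor = data["next_cursor"]

assert seen == 10000  # every item visited exactly once
```

Note that when the total is an exact multiple of the limit, the last full page still reports `has_more: true`; the client discovers the end on the following (empty) request. This is the usual trade-off of the `len(items) == limit` heuristic, which avoids a second count query.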
## Pagination with Total Count and Link Headers
Include total count metadata and RFC 8288 Link headers (the standard that obsoleted RFC 5988) for discoverability:
```python
from math import ceil

from fastapi import FastAPI, Query, Response

app = FastAPI()

# all_items is the simulated database from the first example.


@app.get("/items/")
async def list_items(
    response: Response,
    page: int = Query(default=1, ge=1, description="Page number"),
    per_page: int = Query(default=20, ge=1, le=100, description="Items per page"),
):
    total = len(all_items)
    total_pages = ceil(total / per_page)
    skip = (page - 1) * per_page
    items = all_items[skip : skip + per_page]

    # Build Link headers
    base_url = "/items/"
    links = []
    if page > 1:
        links.append(f'<{base_url}?page=1&per_page={per_page}>; rel="first"')
        links.append(f'<{base_url}?page={page - 1}&per_page={per_page}>; rel="prev"')
    if page < total_pages:
        links.append(f'<{base_url}?page={page + 1}&per_page={per_page}>; rel="next"')
        links.append(f'<{base_url}?page={total_pages}&per_page={per_page}>; rel="last"')
    if links:  # only emit the header when at least one relation applies
        response.headers["Link"] = ", ".join(links)
    response.headers["X-Total-Count"] = str(total)
    response.headers["X-Total-Pages"] = str(total_pages)
    return {
        "items": items,
        "page": page,
        "per_page": per_page,
        "total": total,
        "total_pages": total_pages,
    }
```
With 10,000 items and a default page size of 20, the `X-Total-Pages` header returns 500. At 50 items per page, there are 200 total pages. The Link header follows RFC 8288 (formerly RFC 5988), the convention used by the GitHub API and other major REST APIs.
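For example, requesting page 2 at 50 items per page against the endpoint above should produce headers like these (a sketch; the Link value in the comment is the string the handler builds):

```python
from fastapi.testclient import TestClient

client = TestClient(app)

resp = client.get("/items/", params={"page": 2, "per_page": 50})

assert resp.headers["X-Total-Count"] == "10000"
assert resp.headers["X-Total-Pages"] == "200"
# Link header (wrapped here for readability):
#   </items/?page=1&per_page=50>; rel="first",
#   </items/?page=1&per_page=50>; rel="prev",
#   </items/?page=3&per_page=50>; rel="next",
#   </items/?page=200&per_page=50>; rel="last"
```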
## Pagination Response Model
Standardize pagination responses across endpoints with a generic response model:
```python
from math import ceil
from typing import Generic, List, TypeVar

from fastapi import Query
from pydantic import BaseModel

# app, Item, and all_items are defined as in the first example.

T = TypeVar("T")


# Pydantic v2 supports generic models by subclassing BaseModel and Generic[T]
# directly (v1 required pydantic.generics.GenericModel instead).
class PaginatedResponse(BaseModel, Generic[T]):
    items: List[T]
    total: int
    page: int
    per_page: int
    total_pages: int


@app.get("/items/", response_model=PaginatedResponse[Item])
async def list_items(
    page: int = Query(default=1, ge=1),
    per_page: int = Query(default=20, ge=1, le=100),
):
    total = len(all_items)
    skip = (page - 1) * per_page
    return PaginatedResponse(
        items=all_items[skip : skip + per_page],
        total=total,
        page=page,
        per_page=per_page,
        total_pages=ceil(total / per_page),
    )
```
This generic model ensures every paginated endpoint returns a consistent structure. The `total_pages` field is always calculated as `ceil(total / per_page)`. For 10,000 items at 20 per page, that is `ceil(10000 / 20) = 500` pages. For 10,000 items at 30 per page, that is `ceil(10000 / 30) = 334` pages (with the last page containing only 10 items).
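Because the model is generic, the same envelope can back any resource. Here is a sketch with a hypothetical `User` resource (the model name, fields, and `all_users` list are illustrative, not part of the original example):

```python
from math import ceil

from fastapi import Query
from pydantic import BaseModel


class User(BaseModel):  # hypothetical second resource
    id: int
    username: str


all_users = [User(id=i, username=f"user{i}") for i in range(1, 251)]


@app.get("/users/", response_model=PaginatedResponse[User])
async def list_users(
    page: int = Query(default=1, ge=1),
    per_page: int = Query(default=20, ge=1, le=100),
):
    total = len(all_users)
    skip = (page - 1) * per_page
    return PaginatedResponse(
        items=all_users[skip : skip + per_page],
        total=total,
        page=page,
        per_page=per_page,
        total_pages=ceil(total / per_page),
    )
```

Both endpoints now document the same response structure in the generated OpenAPI schema, differing only in the item type.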