Spaces:
Sleeping
Sleeping
File size: 4,158 Bytes
980dc8d |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 |
---
title: "Database: SQLite"
original_url: "https://tds.s-anand.net/#/sqlite?id=database-sqlite"
downloaded_at: "2025-06-08T23:26:00.500923"
---
[Database: SQLite](#/sqlite?id=database-sqlite)
-----------------------------------------------
Relational databases are used to store data in a structured way. You’ll often access databases created by others for analysis.
PostgreSQL, MySQL, MS SQL, Oracle, etc. are popular databases. But the most installed database is [SQLite](https://www.sqlite.org/index.html). It’s embedded into many devices and apps (e.g. your phone, browser, etc.). It’s lightweight but very scalable and powerful.
Watch these introductory videos to understand SQLite and how it’s used in Python (34 min):
[](https://youtu.be/8Xyn8R9eKB8)
[](https://youtu.be/Ohj-CqALrwk)
There are many non-relational databases (NoSQL) like [ElasticSearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html), [MongoDB](https://www.mongodb.com/docs/manual/), [Redis](https://redis.io/docs/latest/), etc. that you should know about and we may cover later.
Core Concepts:
```
-- Create a table
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE,
created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
-- Insert data
INSERT INTO users (name, email) VALUES
('Alice', 'alice@example.com'),
('Bob', 'bob@example.com');
-- Query data
SELECT name, COUNT(*) as count
FROM users
GROUP BY name
HAVING count > 1;
-- Join tables
SELECT u.name, o.product
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE o.status = 'pending';Copy to clipboardErrorCopied
```
Python Integration:
```
import sqlite3
from pathlib import Path
import pandas as pd
async def query_database(db_path: Path, query: str) -> pd.DataFrame:
"""Execute SQL query and return results as DataFrame.
Args:
db_path: Path to SQLite database
query: SQL query to execute
Returns:
DataFrame with query results
"""
try:
conn = sqlite3.connect(db_path)
return pd.read_sql_query(query, conn)
finally:
conn.close()
# Example usage
db = Path('data.db')
df = await query_database(db, '''
SELECT date, COUNT(*) as count
FROM events
GROUP BY date
''')Copy to clipboardErrorCopied
```
Common Operations:
1. **Database Management**
```
-- Backup database
.backup 'backup.db'
-- Import CSV
.mode csv
.import data.csv table_name
-- Export results
.headers on
.mode csv
.output results.csv
SELECT * FROM table;Copy to clipboardErrorCopied
```
2. **Performance Optimization**
```
-- Create index
CREATE INDEX idx_user_email ON users(email);
-- Analyze query
EXPLAIN QUERY PLAN
SELECT * FROM users WHERE email LIKE '%@example.com';
-- Show indexes
SELECT * FROM sqlite_master WHERE type='index';Copy to clipboardErrorCopied
```
3. **Data Analysis**
```
-- Time series aggregation
SELECT
date(timestamp),
COUNT(*) as events,
AVG(duration) as avg_duration
FROM events
GROUP BY date(timestamp);
-- Window functions
SELECT *,
AVG(amount) OVER (
PARTITION BY user_id
ORDER BY date
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
) as moving_avg
FROM transactions;Copy to clipboardErrorCopied
```
Tools to work with SQLite:
* [SQLiteStudio](https://sqlitestudio.pl/): Lightweight GUI
* [DBeaver](https://dbeaver.io/): Full-featured GUI
* [sqlite-utils](https://sqlite-utils.datasette.io/): CLI tool
* [Datasette](https://datasette.io/): Web interface
[Previous
Spreadsheet: Excel, Google Sheets](#/spreadsheets)
[Next
Version Control: Git, GitHub](#/git) |