File size: 4,158 Bytes
980dc8d
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
---

title: "Database: SQLite"
original_url: "https://tds.s-anand.net/#/sqlite?id=database-sqlite"
downloaded_at: "2025-06-08T23:26:00.500923"
---


[Database: SQLite](#/sqlite?id=database-sqlite)
-----------------------------------------------

Relational databases are used to store data in a structured way. You’ll often access databases created by others for analysis.

PostgreSQL, MySQL, MS SQL, Oracle, etc. are popular databases. But the most installed database is [SQLite](https://www.sqlite.org/index.html). It’s embedded into many devices and apps (e.g. your phone, browser, etc.). It’s lightweight but very scalable and powerful.

Watch these introductory videos to understand SQLite and how it’s used in Python (34 min):

[![SQLite Introduction - Beginners Guide to SQL and Databases (22 min)](https://i.ytimg.com/vi_webp/8Xyn8R9eKB8/sddefault.webp)](https://youtu.be/8Xyn8R9eKB8)

[![SQLite Backend for Beginners - Create Quick Databases with Python and SQL (13 min)](https://i.ytimg.com/vi_webp/Ohj-CqALrwk/sddefault.webp)](https://youtu.be/Ohj-CqALrwk)

There are many non-relational databases (NoSQL) like [ElasticSearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html), [MongoDB](https://www.mongodb.com/docs/manual/), [Redis](https://redis.io/docs/latest/), etc. that you should know about and we may cover later.

Core Concepts:

```

-- Create a table

CREATE TABLE users (

    id INTEGER PRIMARY KEY,

    name TEXT NOT NULL,

    email TEXT UNIQUE,

    created_at DATETIME DEFAULT CURRENT_TIMESTAMP

);



-- Insert data

INSERT INTO users (name, email) VALUES

    ('Alice', 'alice@example.com'),

    ('Bob', 'bob@example.com');



-- Query data

SELECT name, COUNT(*) as count

FROM users

GROUP BY name

HAVING count > 1;



-- Join tables

SELECT u.name, o.product

FROM users u

LEFT JOIN orders o ON u.id = o.user_id

WHERE o.status = 'pending';Copy to clipboardErrorCopied

```

Python Integration:

```

import sqlite3

from pathlib import Path

import pandas as pd



async def query_database(db_path: Path, query: str) -> pd.DataFrame:

    """Execute SQL query and return results as DataFrame.



    Args:

        db_path: Path to SQLite database

        query: SQL query to execute



    Returns:

        DataFrame with query results

    """

    try:

        conn = sqlite3.connect(db_path)

        return pd.read_sql_query(query, conn)

    finally:

        conn.close()



# Example usage

db = Path('data.db')

df = await query_database(db, '''

    SELECT date, COUNT(*) as count

    FROM events

    GROUP BY date

''')Copy to clipboardErrorCopied

```

Common Operations:

1. **Database Management**

   ```

   -- Backup database

   .backup 'backup.db'



   -- Import CSV

   .mode csv

   .import data.csv table_name



   -- Export results

   .headers on

   .mode csv

   .output results.csv

   SELECT * FROM table;Copy to clipboardErrorCopied

   ```
2. **Performance Optimization**

   ```

   -- Create index

   CREATE INDEX idx_user_email ON users(email);



   -- Analyze query

   EXPLAIN QUERY PLAN

   SELECT * FROM users WHERE email LIKE '%@example.com';



   -- Show indexes

   SELECT * FROM sqlite_master WHERE type='index';Copy to clipboardErrorCopied

   ```
3. **Data Analysis**

   ```

   -- Time series aggregation

   SELECT

       date(timestamp),

       COUNT(*) as events,

       AVG(duration) as avg_duration

   FROM events

   GROUP BY date(timestamp);



   -- Window functions

   SELECT *,

       AVG(amount) OVER (

           PARTITION BY user_id

           ORDER BY date

           ROWS BETWEEN 3 PRECEDING AND CURRENT ROW

       ) as moving_avg

   FROM transactions;Copy to clipboardErrorCopied

   ```

Tools to work with SQLite:

* [SQLiteStudio](https://sqlitestudio.pl/): Lightweight GUI
* [DBeaver](https://dbeaver.io/): Full-featured GUI
* [sqlite-utils](https://sqlite-utils.datasette.io/): CLI tool
* [Datasette](https://datasette.io/): Web interface

[Previous

Spreadsheet: Excel, Google Sheets](#/spreadsheets)

[Next

Version Control: Git, GitHub](#/git)