File size: 4,660 Bytes
e00e744
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
# Lineagentic-flow CLI

A command-line interface for the Lineagentic-flow framework that provides agentic data lineage parsing across various data processing script types.

## Installation

The CLI is automatically installed when you install the lineagentic-flow package:

```bash
pip install -e .
```

## Usage

The CLI provides two main commands: `analyze` and `field-lineage`.

### Basic Commands

#### Analyze Query/Code for Lineage
```bash
lineagentic analyze --agent-name sql-lineage-agent --query "your code here"
```


### Running Analysis

#### Using a Specific Agent
```bash
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT a,b FROM table1"
```

#### Using a File as Input
```bash
lineagentic analyze --agent-name python-lineage-agent --query-file path/to/your/script.py
```

#### Specifying a Different Model
```bash
lineagentic analyze --agent-name airflow-lineage-agent --model-name gpt-4o --query "your code here"
```

#### With Lineage Configuration
```bash
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --job-namespace "my-namespace" --job-name "my-job"
```

### Output Options

#### Pretty Print Results
```bash
lineagentic analyze --agent-name sql --query "your code" --pretty
```

#### Save Results to File
```bash 
lineagentic analyze --agent-name sql --query "your code" --output results.json
```

#### Save Results with Pretty Formatting
```bash
lineagentic analyze --agent-name python --query "your code" --output results.json --pretty
```
 
#### Enable Verbose Output
```bash
lineagentic analyze --agent-name sql --query "your code" --verbose
```

## Available Agents

- **sql-lineage-agent**: Analyzes SQL queries and scripts (default)
- **airflow-lineage-agent**: Analyzes Apache Airflow DAGs and workflows
- **spark-lineage-agent**: Analyzes Apache Spark jobs
- **python-lineage-agent**: Analyzes Python data processing scripts
- **java-lineage-agent**: Analyzes Java data processing code

## Commands

### `analyze` Command

Analyzes a query or code for lineage information.

#### Required Arguments
- Either `--query` or `--query-file` must be specified

### Basic Query Analysis
```bash
# Simple SQL query analysis
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT user_id, name FROM users WHERE active = true"

# Analyze with specific agent
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT a, b FROM table1 JOIN table2 ON table1.id = table2.id"

# Analyze Python code
lineagentic analyze --agent-name python-lineage-agent --query "import pandas as pd; df = pd.read_csv('data.csv'); result = df.groupby('category').sum()"

# Analyze Java code
lineagentic analyze --agent-name java-lineage-agent --query "public class DataProcessor { public void processData() { // processing logic } }"

# Analyze Spark code
lineagentic analyze --agent-name spark-lineage-agent --query "val df = spark.read.csv('data.csv'); val result = df.groupBy('category').agg(sum('value'))"

# Analyze Airflow DAG
lineagentic analyze --agent-name airflow-lineage-agent --query "from airflow import DAG; from airflow.operators.python import PythonOperator; dag = DAG('my_dag')"
```


### Reading from File
```bash
# Analyze query from file
lineagentic analyze --agent-name sql-lineage-agent --query-file "queries/user_analysis.sql"

# Analyze Python script from file
lineagentic analyze --agent-name python-lineage-agent --query-file "scripts/data_processing.py"
```

### Output Options
```bash
# Save results to file
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --output "results.json"

# Pretty print results
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --pretty

# Verbose output
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --verbose

# Don't save to database
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --no-save

# Don't save to Neo4j
lineagentic analyze --agent-name sql-lineage-agent --query "SELECT * FROM users" --no-neo4j
```



## Common Output Options

Both commands support these output options:

- `--output`: Output file path for results (JSON format)
- `--pretty`: Pretty print the output
- `--verbose`: Enable verbose output

## Error Handling

The CLI provides clear error messages for common issues:

- Missing required arguments
- File not found errors
- Agent execution errors
- Invalid agent names

## Development

To run the CLI in development mode:

```bash
python -m cli.main --help
```

To run a specific command:

```bash
python -m cli.main analyze --agent-name sql --query "SELECT 1" --pretty
```