# AI Provider Test Report

**Generated:** 2026-04-05 02:23:10  
**Test Duration:** 23.50s  

## Summary

- **Total Tests:** 10
- **Passed:** 9
- **Failed:** 1
- **Success Rate:** 90.0%

## Test Results

### 1. Code Generation  PASS

**Task Type:** code  
**Provider:** nvidia  
**Model:** llama-3.3-70b  
**Duration:** 5.41s  

**Prompt:**
```
Write a Python function to calculate fibonacci numbers recursively.
```

**Response:**
```
### Recursive Fibonacci Function in Python
#### Overview

The Fibonacci sequence is a series of numbers where a number is the sum of the two preceding ones, usually starting with 0 and 1. This function calculates the nth Fibonacci number using recursion.

#### Code

```python
def fibonacci(n):
    """
    Calculate the nth Fibonacci number recursively.

    Args:
        n (int): The position of the Fibonacci number to calculate.

    Returns:
        int: The nth Fibonacci number.

    Raises:
...
```

**Metadata:**
- model_used: llama-3.3-70b
- provider_used: nvidia
- tokens: 429

---

### 2. Data Extraction  PASS

**Task Type:** extraction  
**Provider:** groq  
**Model:** llama-3.3-70b-versatile  
**Duration:** 0.78s  

**Prompt:**
```
Extract the key information from this text: 'John Doe, age 35, lives in New York and works as a software engineer at Tech Corp since 2020.'
```

**Response:**
```
The key information extracted from the text is:

1. **Name**: John Doe
2. **Age**: 35
3. **Location**: New York
4. **Occupation**: Software Engineer
5. **Employer**: Tech Corp
6. **Employment Start Date**: 2020
```

**Metadata:**
- model_used: llama-3.3-70b-versatile
- provider_used: groq
- tokens: 132

---

### 3. Reasoning Task  PASS

**Task Type:** reasoning  
**Provider:** nvidia  
**Model:** devstral-2-123b  
**Duration:** 5.25s  

**Prompt:**
```
If a train travels 120 miles in 2 hours, and another train travels 180 miles in 3 hours, which train is faster and by how much?
```

**Response:**
```
To determine which train is faster and by how much, we'll calculate the speed of each train using the formula:

\[
\text{Speed} = \frac{\text{Distance}}{\text{Time}}
\]

### First Train
- **Distance:** 120 miles
- **Time:** 2 hours

\[
\text{Speed}_1 = \frac{120 \text{ miles}}{2 \text{ hours}} = 60 \text{ mph}
\]

### Second Train
- **Distance:** 180 miles
- **Time:** 3 hours

\[
\text{Speed}_2 = \frac{180 \text{ miles}}{3 \text{ hours}} = 60 \text{ mph}
\]

### Comparison
Both tr...
```

**Metadata:**
- model_used: devstral-2-123b
- provider_used: nvidia
- tokens: 251

---

### 4. General Question  PASS

**Task Type:** general  
**Provider:** groq  
**Model:** llama-3.3-70b-versatile  
**Duration:** 0.92s  

**Prompt:**
```
What are the three primary colors?
```

**Response:**
```
The three primary colors are:
1. Red
2. Blue
3. Yellow

These colors cannot be created by mixing other colors together, and they are the base for creating all other colors.
```

**Metadata:**
- model_used: llama-3.3-70b-versatile
- provider_used: groq
- tokens: 83

---

### 5. JSON Generation  PASS

**Task Type:** code  
**Provider:** nvidia  
**Model:** devstral-2-123b  
**Duration:** 4.44s  

**Prompt:**
```
Generate a JSON object representing a user profile with name, email, age, and interests (array).
```

**Response:**
````
Here's a JSON object representing a user profile with the specified fields:

```json
{
  "userProfile": {
    "name": "John Doe",
    "email": "john.doe@example.com",
    "age": 30,
    "interests": ["hiking", "photography", "reading", "cooking"]
  }
}
```

### Explanation
- **"name"**: A string representing the user's full name.
- **"email"**: A string representing the user's email address.
- **"age"**: A number representing the user's age.
- **"interests"**: An array of strings listing the us...
````

**Metadata:**
- model_used: devstral-2-123b
- provider_used: nvidia
- tokens: 187

---

### 6. Text Summarization  PASS

**Task Type:** general  
**Provider:** groq  
**Model:** llama-3.3-70b-versatile  
**Duration:** 0.33s  

**Prompt:**
```
Summarize in one sentence: Artificial intelligence is transforming industries by automating tasks, improving decision-making, and creating new opportunities for innovation.
```

**Response:**
```
Artificial intelligence is revolutionizing various industries by automating tasks, enhancing decision-making, and driving innovation, ultimately transforming the way businesses operate and creating new opportunities.
```

**Metadata:**
- model_used: llama-3.3-70b-versatile
- provider_used: groq
- tokens: 96

---

### 7. Math Problem  PASS

**Task Type:** reasoning  
**Provider:** nvidia  
**Model:** llama-3.3-70b  
**Duration:** 2.71s  

**Prompt:**
```
Solve: 2x + 5 = 15. Show your work.
```

**Response:**
```
To solve for x, we need to isolate x on one side of the equation. 

2x + 5 = 15

First, subtract 5 from both sides of the equation:
2x + 5 - 5 = 15 - 5
2x = 10

Next, divide both sides of the equation by 2:
2x / 2 = 10 / 2
x = 5

Therefore, the value of x is 5.
```

**Metadata:**
- model_used: llama-3.3-70b
- provider_used: nvidia
- tokens: 152

---

### 8. Creative Writing  PASS

**Task Type:** general  
**Provider:** nvidia  
**Model:** llama-3.3-70b  
**Duration:** 1.33s  

**Prompt:**
```
Write a haiku about coding at night.
```

**Response:**
```
Moonlit code unfurls
Midnight keyboard symphony
Glowing screen delight
```

**Metadata:**
- model_used: llama-3.3-70b
- provider_used: nvidia
- tokens: 62

---

### 9. Code Debug  PASS

**Task Type:** code  
**Provider:** groq  
**Model:** llama-3.3-70b-versatile  
**Duration:** 0.76s  

**Prompt:**
```
Find the bug in this code: def add(a, b): return a + b + 1
```

**Response:**
````
**Bug Identification**

The bug in the given code is that the function `add(a, b)` is supposed to return the sum of two numbers `a` and `b`, but it is currently adding an extra `1` to the result.

**Corrected Code**

```python
def add(a, b):
    """Returns the sum of two numbers."""
    return a + b
```

**Explanation**

The original code `return a + b + 1` is incrementing the sum by `1`, which is not the expected behavior of an `add` function. The corrected code simply returns the sum of `a` an...
````

**Metadata:**
- model_used: llama-3.3-70b-versatile
- provider_used: groq
- tokens: 219

---

### 10. Complex Reasoning  FAIL

**Task Type:** reasoning  
**Provider:** nvidia  
**Model:** devstral-2-123b  
**Duration:** 1.56s  

**Prompt:**
```
If all roses are flowers, and some flowers fade quickly, can we conclude that some roses fade quickly?
```

**Error:**
```
[router] All models failed. Last error: [nvidia] NVIDIA API error (500): {"error":{"message":"EngineCore encountered an issue. See stack trace (above) for the root cause.","type":"Internal Server Error","param":null,"code":500}}
```
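
The error above reports that every candidate model failed, with the last failure being an HTTP 500 from the nvidia provider. The report does not show how the router retries or falls back, so the following is only a minimal sketch of one common approach, assuming a hypothetical `ProviderError` type and a hypothetical `route_with_fallback` helper that takes plain callables. None of these names come from the tested system.

```python
import time


class ProviderError(Exception):
    """Hypothetical error type carrying an HTTP-style status code."""

    def __init__(self, provider, status, message):
        super().__init__(f"[{provider}] API error ({status}): {message}")
        self.provider = provider
        self.status = status


def route_with_fallback(prompt, providers, retries=1, backoff=0.0):
    """Try each (name, call) pair in order, retrying 5xx errors before moving on.

    `providers` is a list of (name, callable) pairs; each callable takes the
    prompt and returns the response text or raises ProviderError.
    """
    last_error = None
    for name, call in providers:
        for attempt in range(retries + 1):
            try:
                return name, call(prompt)
            except ProviderError as err:
                last_error = err
                if err.status < 500:
                    break  # client-side errors are unlikely to succeed on retry
                if attempt < retries:
                    time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError(f"[router] All models failed. Last error: {last_error}")
```

With a provider list such as `[("nvidia", call_nvidia), ("groq", call_groq)]` (both callables hypothetical), a transient 500 from the first provider would be retried once and then handed to the second provider instead of failing the test outright.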

---

## Provider Performance

| Provider | Tests | Passed | Failed | Success Rate | Avg Duration |
|----------|-------|--------|--------|--------------|-------------|
| groq | 4 | 4 | 0 | 100.0% | 0.70s |
| nvidia | 6 | 5 | 1 | 83.3% | 3.45s |
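
The figures in this table can be reproduced from the per-test provider, duration, and status values listed under each test. The sketch below is an illustration rather than the report generator's actual code; the `RESULTS` list and the `summarize` function are names introduced here. Note that the nvidia average of 3.45s only matches when the failed test's 1.56s is included in the mean.

```python
from collections import defaultdict

# (provider, duration in seconds, passed) for tests 1-10, copied from this report
RESULTS = [
    ("nvidia", 5.41, True),
    ("groq", 0.78, True),
    ("nvidia", 5.25, True),
    ("groq", 0.92, True),
    ("nvidia", 4.44, True),
    ("groq", 0.33, True),
    ("nvidia", 2.71, True),
    ("nvidia", 1.33, True),
    ("groq", 0.76, True),
    ("nvidia", 1.56, False),
]


def summarize(results):
    """Group results by provider and compute counts, success rate, and mean duration."""
    grouped = defaultdict(list)
    for provider, duration, passed in results:
        grouped[provider].append((duration, passed))

    summary = {}
    for provider, rows in grouped.items():
        tests = len(rows)
        passed = sum(1 for _, ok in rows if ok)
        summary[provider] = {
            "tests": tests,
            "passed": passed,
            "failed": tests - passed,
            "success_rate": round(100 * passed / tests, 1),
            # Failed tests are included in the mean, as in the table above
            "avg_duration": round(sum(d for d, _ in rows) / tests, 2),
        }
    return summary


SUMMARY = summarize(RESULTS)
for provider in sorted(SUMMARY):
    print(provider, SUMMARY[provider])
```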

## Related API Reference

| item | value |
| --- | --- |
| api-reference | `api-reference.md` |