## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models from several providers! It's a good way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different from other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key exists and begins sk-
Groq API Key not set (and this is optional)
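
The five near-identical checks above can be collapsed into one loop. This is only a sketch, not part of the original lab: the `KEY_SPECS` table and the `describe_key` helper are hypothetical names, and the prefix lengths are simply copied from the cell above.

```python
import os

# Hypothetical table: (display name, environment variable, prefix length to show, optional?)
KEY_SPECS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]

def describe_key(name, value, prefix_len, optional):
    """Return a one-line status message that reveals only a short key prefix."""
    if value:
        return f"{name} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{name} API Key not set{suffix}"

for name, env_var, prefix_len, optional in KEY_SPECS:
    print(describe_key(name, os.getenv(env_var), prefix_len, optional))
```

Adding a provider then means adding one row to the table rather than another `if`/`else` block.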


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [11]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


Considering the trolley problem's emphasis on utilitarian sacrifice, how would you analyze a scenario where actively causing one person's death (through a deliberate action like pushing them) is required to save five others, compared to the classic version of passively pulling a lever? Does the distinction between action and inaction hold moral weight, and how might different ethical frameworks (like deontology, virtue ethics, or contractualism) justify either prioritizing the number of lives saved or the prohibition against direct harm?


In [12]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [None]:
# The API we know well

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Anthropic's API is slightly different, and max_tokens is required

model_name = "claude-3-7-sonnet-latest"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
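
Besides requiring `max_tokens`, Anthropic's Messages API takes a system prompt as a top-level `system` argument rather than as a message with the `system` role. The sketch below shows one way to adapt an OpenAI-style message list; `to_anthropic_kwargs` is a hypothetical helper, not something the lab or the Anthropic SDK provides.

```python
def to_anthropic_kwargs(messages, max_tokens=1000):
    """Split OpenAI-style messages into keyword arguments for Anthropic's messages.create."""
    # System prompts are pulled out and joined; all other messages pass through unchanged
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    kwargs = {"messages": chat, "max_tokens": max_tokens}
    if system_parts:
        kwargs["system"] = "\n\n".join(system_parts)
    return kwargs

example = [
    {"role": "system", "content": "You are concise."},
    {"role": "user", "content": "Hello"},
]
print(to_anthropic_kwargs(example))
```

Assuming the `claude` client from the cell above, the call would then look like `claude.messages.create(model=model_name, **to_anthropic_kwargs(messages))`.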

In [None]:
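# Gemini offers an OpenAI-compatible endpoint, so we reuse the OpenAI client with a different base_url

gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")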
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [13]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Of course. This is an excellent and central question in moral philosophy that sharpens the core dilemma of the trolley problem. Let's break down the analysis.

### The Core Distinction: Action vs. Inaction

The scenario you're describing contrasts the classic **"lever-pulling"** case (often called the Bystander at the Switch) with the **"pushing"** case (often called the Fat Man variant).

*   **Lever-Pulling (Passive/Indirect):** You are diverting an existing threat (the trolley) from a larger group onto a smaller one. The death is a foreseen but unintended side effect of your primary goal to save the five.
*   **Pushing (Active/Direct):** You are *introducing* a new force into the situation. You are using the person as a means to stop the trolley. Their death is the direct mechanism of your action.

The key philosophical question is whether this distinction holds moral weight. Intuitively, for most people, it does—pushing the man feels far worse than pulling the lever. Different ethical frameworks try to explain, justify, or reject this intuition.

---

### Analysis Through Different Ethical Frameworks

#### 1. Utilitarianism / Consequentialism

This framework evaluates actions solely based on their outcomes. The goal is to maximize overall well-being or happiness (or in this case, minimize suffering/death).

*   **Analysis:** A strict utilitarian would see **no meaningful difference** between the two scenarios. The outcome is identical: one person dies, five people live. The calculus of 5 > 1 dictates that both pulling the lever and pushing the man are not just permissible but **morally obligatory**.
*   **Justification for Action:** The prohibition against direct harm is irrelevant. What matters is the net saving of lives. The "action vs. inaction" distinction is a psychological illusion; choosing not to act is itself an action with consequences, and you are responsible for the five deaths if you do nothing.
*   **Critique:** This view clashes with strong intuitions that we have special duties not to *harm* people directly, even if it leads to a better outcome.

#### 2. Deontology

Deontology, most famously associated with Immanuel Kant, argues that actions are morally right or wrong based on whether they adhere to a set of rules or duties, regardless of the consequences.

*   **Analysis:** A deontologist would see a **profound moral difference** between the two acts. Pulling the lever might be permissible, but pushing the man is categorically forbidden.
*   **Justification for Inaction (in the push case):**
    *   **The Formula of Humanity:** Kant's central idea is to never treat a person merely as a means to an end. In the push scenario, you are explicitly using the large man as a tool—a trolley-stopper—thereby violating his status as an autonomous, rational being with his own ends. In the lever case, while you are *involving* him, you are not using his body in the same direct, instrumental way.
    *   **The Doctrine of Double Effect:** This principle (often used in Catholic ethics but applicable here) distinguishes between intended and merely foreseen consequences.
        *   **Pulling the Lever:** You *intend* to save the five. You merely *foresee* the one death on the side track. The bad effect (the one death) is not the means by which you achieve the good effect.
        *   **Pushing the Man:** You *intend* his being on the tracks as the very means to stop the trolley. The bad effect (his death) is directly intended as your mechanism for saving the five. This makes the act morally wrong.

#### 3. Virtue Ethics

Virtue ethics, drawing from Aristotle, focuses not on rules or consequences but on the character of the moral agent. It asks, "What would a virtuous person do?"

*   **Analysis:** The question shifts from "Which action is right?" to "What does this action reveal about my character?" The distinction between action and inaction is relevant because different virtues and vices are at play.
*   **Justification for Either Position:**
    *   A virtue ethicist might argue that pushing an innocent person to their death is an act of brutality and callousness, traits of a vicious character. The virtue of *justice* or *respect for life* would prohibit such a direct violation.
    *   Conversely, allowing five people to die through inaction could be seen as a failure of the virtue of *courage* or *compassion*. However, the virtue of *practical wisdom* (phronesis) would be needed to navigate this conflict.
*   **Conclusion:** There is no simple rule. A virtuous agent would feel the weight of both choices deeply. The "push" scenario likely feels more repugnant because it requires a more direct and violent manifestation of vice.

#### 4. Contractualism

Associated with T.M. Scanlon, contractualism argues that moral principles are those that no one could reasonably reject as a basis for informed, unforced general agreement.

*   **Analysis:** The morality of an act depends on whether it could be justified to each person affected by it, particularly the person who is worst off.
*   **Justification for Inaction (in the push case):**
    *   Could the one person on the side track in the lever case reasonably reject a principle that allows diverting a threat? Perhaps, but their objection is weaker. They are the victim of a redirected threat, not a direct attack.
    *   Could the large man reasonably reject a principle that allows someone to be *actively and intentionally killed* for the benefit of others? His objection is incredibly strong. He could argue that such a principle would make everyone's life precarious, as it permits using any individual as a resource for the group. This is a principle that people in a hypothetical social contract would likely reject for their own protection.

---

### Conclusion: Does the Distinction Hold Moral Weight?

The consensus from most philosophical frameworks (except strict utilitarianism) is that **yes, the distinction between action and inaction does hold significant moral weight.**

*   **Deontology** justifies it through the prohibition on using people as mere means.
*   **Virtue Ethics** justifies it through the assessment of character and the virtues/vices exhibited by each act.
*   **Contractualism** justifies it by the strength of the objection from the person used as a resource.

The "push" scenario introduces a direct, intentional, and personal form of harm that violates core moral intuitions about the inviolability of the individual. The "lever" scenario, while still tragic, involves a more indirect, impersonal, and foreseen harm. This distinction explains why, even when the numbers are the same, we perceive a profound moral gulf between the two cases. The trolley problem thus reveals that for most people, morality is not simply a calculus of outcomes, but is deeply tied to the nature of the actions we are willing to perform.

In [None]:
groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "llama-3.3-70b-versatile"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
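
The Gemini, DeepSeek and Groq cells above all follow the same steps, because each provider is reached through the OpenAI client. A small helper can capture that pattern. This is a sketch under assumptions: `ask` is a hypothetical name, and the fake client exists only so the example runs without network access or API keys.

```python
from types import SimpleNamespace

def ask(client, model_name, messages, competitors, answers):
    """Call an OpenAI-compatible chat endpoint and record the model name and its answer."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

# Stand-in for a real client: mimics the response shape used in the cells above
def _fake_create(model, messages):
    reply = f"[{model}] echo: {messages[-1]['content']}"
    return SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content=reply))])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

demo_competitors, demo_answers = [], []
print(ask(fake_client, "demo-model", [{"role": "user", "content": "Hi"}], demo_competitors, demo_answers))
```

With real clients, a call such as `ask(groq, "llama-3.3-70b-versatile", messages, competitors, answers)` would replace the five repeated lines in each cell.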


## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
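
The "Ollama is running" check described above can also be done from Python before calling the model. This is a minimal sketch using only the standard library; `is_ollama_running` is a hypothetical helper name, and the default URL is the one given in this section.

```python
import http.client
import urllib.request

def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the local Ollama service answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as reply:
            body = reply.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply: treat as "not running"
        return False
    return "Ollama is running" in body

print("Ollama reachable:", is_ollama_running())
```

If this prints `False`, start the service with `ollama serve` in a terminal and try again.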

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [14]:
!ollama pull llama3.2

pulling manifest

In [15]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The trolley problem's classic version presents a difficult trade-off between sacrificing one person to save five others. However, when we consider actively causing one person's death (through a deliberate action like pushing them), the moral calculus can become even more challenging.

One key aspect of this variant is the distinction it draws between action and inaction. In the original trolley problem, the speaker is forced to make a choice between pulling the lever, which results in the deaths of five people, or doing nothing, which would allow the trolley to continue its path. The focus is on whether one should actively kill someone through their direct actions (pulling the lever), compared to the more neutral act of allowing events to unfold (not pulling the lever).

In this context, some arguments can be made for prioritizing the number of lives saved over the prohibition against direct harm:

1. **Utilitarianism:** As a utilitarian, one might argue that actively causing the death of one person is justified if it results in the greater good (saving five lives). In this scenario, the moral weight is on the outcome, not the method.
2. **Existentialist perspective:** This perspective emphasizes individual responsibility. By directly causing harm to another person, an individual must accept personal responsibility for their actions, thereby taking ownership of the consequences.

On the other hand, some arguments can be made for prioritizing the prohibition against direct harm or focusing on the nature of action and inaction:

1. **Deontology:** A deontological approach emphasizes morally binding rules and duties. In this variant, deontology would likely argue that actively causing the death of another person is inherently wrong and may not justify even saving five lives, as doing so involves taking an innocent life away.
2. **Contractualism:** This framework focuses on moral obligations based on social contracts or agreements between individuals. A contractualist might argue that a morally valid choice requires consideration of whether any action would betray the shared principle of respecting human dignity and autonomy.

The choice in this variant also depends on which ethical theory you are using:
-**Utilitarianism**: It prioritizes outcomes that increase overall happiness or pleasure.
-**Deontology**: Focuses on rules, duties, rights, obligations, and moral actions.
-**Kantian virtue ethics:** Emphasizes the cultivation of virtues (good character), moral development, or moral education to establish right.

In [16]:
# So where are we?

print(competitors)
print(answers)


['deepseek-chat', 'llama3.2']
['Of course. This is an excellent and central question in moral philosophy that sharpens the core dilemma of the trolley problem. Let\'s break down the analysis.\n\n### The Core Distinction: Action vs. Inaction\n\nThe scenario you\'re describing contrasts the classic **"lever-pulling"** case (often called the Bystander at the Switch) with the **"pushing"** case (often called the Fat Man variant).\n\n*   **Lever-Pulling (Passive/Indirect):** You are diverting an existing threat (the trolley) from a larger group onto a smaller one. The death is a foreseen but unintended side effect of your primary goal to save the five.\n*   **Pushing (Active/Direct):** You are *introducing* a new force into the situation. You are using the person as a means to stop the trolley. Their death is the direct mechanism of your action.\n\nThe key philosophical question is whether this distinction holds moral weight. Intuitively, for most people, it does—pushing the man feels far w

In [17]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: deepseek-chat

Of course. This is an excellent and central question in moral philosophy that sharpens the core dilemma of the trolley problem. Let's break down the analysis.

### The Core Distinction: Action vs. Inaction

The scenario you're describing contrasts the classic **"lever-pulling"** case (often called the Bystander at the Switch) with the **"pushing"** case (often called the Fat Man variant).

*   **Lever-Pulling (Passive/Indirect):** You are diverting an existing threat (the trolley) from a larger group onto a smaller one. The death is a foreseen but unintended side effect of your primary goal to save the five.
*   **Pushing (Active/Direct):** You are *introducing* a new force into the situation. You are using the person as a means to stop the trolley. Their death is the direct mechanism of your action.

The key philosophical question is whether this distinction holds moral weight. Intuitively, for most people, it does—pushing the man feels far worse than pullin

In [18]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [19]:
print(together)

# Response from competitor 1

Of course. This is an excellent and central question in moral philosophy that sharpens the core dilemma of the trolley problem. Let's break down the analysis.

### The Core Distinction: Action vs. Inaction

The scenario you're describing contrasts the classic **"lever-pulling"** case (often called the Bystander at the Switch) with the **"pushing"** case (often called the Fat Man variant).

*   **Lever-Pulling (Passive/Indirect):** You are diverting an existing threat (the trolley) from a larger group onto a smaller one. The death is a foreseen but unintended side effect of your primary goal to save the five.
*   **Pushing (Active/Direct):** You are *introducing* a new force into the situation. You are using the person as a means to stop the trolley. Their death is the direct mechanism of your action.

The key philosophical question is whether this distinction holds moral weight. Intuitively, for most people, it does—pushing the man feels far worse than pul

In [21]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [22]:
print(judge)

You are judging a competition between 2 competitors.
Each model has been given this question:

Considering the trolley problem's emphasis on utilitarian sacrifice, how would you analyze a scenario where actively causing one person's death (through a deliberate action like pushing them) is required to save five others, compared to the classic version of passively pulling a lever? Does the distinction between action and inaction hold moral weight, and how might different ethical frameworks (like deontology, virtue ethics, or contractualism) justify either prioritizing the number of lives saved or the prohibition against direct harm?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competito

In [23]:
judge_messages = [{"role": "user", "content": judge}]

In [25]:
# Judgement time!

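# DeepSeek acts as the judge here; its API is OpenAI-compatible, so we reuse the OpenAI client.
# DEEPSEEK_BASE_URL is read from .env if set; otherwise fall back to the URL used earlier in this lab.
judge_client = OpenAI(
    api_key=deepseek_api_key,
    base_url=os.getenv("DEEPSEEK_BASE_URL", "https://api.deepseek.com/v1"),
)
response = judge_client.chat.completions.create(
    model="deepseek-chat",
    messages=judge_messages,
)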
results = response.choices[0].message.content
print(results)


{"results": ["1", "2"]}


In [26]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: deepseek-chat
Rank 2: llama3.2
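
The `json.loads` call above works when the judge returns bare JSON, but a model may still wrap its reply in a markdown code fence or repeat a competitor number. The sketch below is one possible defensive parser; `parse_ranking` is a hypothetical helper, and the validation rule (each number from 1 to N exactly once) is an assumption about what a valid ranking looks like.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker

def parse_ranking(raw, n_competitors):
    """Parse the judge's reply into a list of 1-based competitor numbers, best first."""
    text = raw.strip()
    # Drop fence lines if the model wrapped its JSON in a code block despite instructions
    if text.startswith(FENCE):
        lines = [line for line in text.splitlines() if not line.strip().startswith(FENCE)]
        text = "\n".join(lines)
    ranks = [int(r) for r in json.loads(text)["results"]]
    # A valid ranking mentions every competitor exactly once
    if sorted(ranks) != list(range(1, n_competitors + 1)):
        raise ValueError(f"Expected each of 1..{n_competitors} exactly once, got {ranks}")
    return ranks

print(parse_ranking('{"results": ["1", "2"]}', 2))
```

In the cell above, `ranks = parse_ranking(results, len(competitors))` could stand in for the two `json.loads` lines, with `competitors[rank - 1]` used for the lookup.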


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which agentic design pattern(s) did this lab use? Try extending it with another agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - sending a task to multiple models and evaluating the results -
            are common when you need to improve the quality of your LLM response. The approach can be applied
            to most business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>