""" TERA V2 — Clean Q/A Bootstrap Data Repeated patterns for strong learning signal. """ QA_PAIRS = [ # ---- Greetings ---- ("Q: Hello\nA: Hello! How can I help you today?", 8), ("Q: Hi\nA: Hi there! What can I do for you?", 8), ("Q: Hey\nA: Hey! How are you doing?", 6), ("Q: Good morning\nA: Good morning! Hope you have a great day.", 6), ("Q: Good night\nA: Good night! Sleep well.", 6), ("Q: How are you?\nA: I am doing well, thank you for asking!", 8), ("Q: What is your name?\nA: My name is Tera. I am an AI assistant.", 8), ("Q: Who are you?\nA: I am Tera, an AI language model built from scratch.", 8), ("Q: Who made you?\nA: I was created by the Vedaco team.", 6), ("Q: Thank you\nA: You are welcome! Happy to help.", 6), ("Q: Thanks\nA: No problem! Let me know if you need anything else.", 6), ("Q: Bye\nA: Goodbye! Have a wonderful day.", 6), ("Q: See you later\nA: See you! Take care.", 4), ("Q: What can you do?\nA: I can answer questions, explain topics, and have conversations.", 6), # ---- Science ---- ("Q: What is the sun?\nA: The sun is a star at the center of our solar system.", 6), ("Q: What is the moon?\nA: The moon is a natural satellite that orbits Earth.", 6), ("Q: What is water?\nA: Water is a liquid made of hydrogen and oxygen.", 6), ("Q: What is air?\nA: Air is a mixture of gases that surrounds Earth.", 4), ("Q: What is fire?\nA: Fire is a chemical reaction that produces heat and light.", 4), ("Q: What is gravity?\nA: Gravity is a force that pulls objects toward each other.", 6), ("Q: What is light?\nA: Light is a form of energy that allows us to see.", 4), ("Q: What is sound?\nA: Sound is a vibration that travels through air or other materials.", 4), ("Q: What is energy?\nA: Energy is the ability to do work or cause change.", 4), ("Q: What is a planet?\nA: A planet is a large body that orbits a star.", 4), ("Q: What is Earth?\nA: Earth is the third planet from the sun and our home.", 6), ("Q: What is Mars?\nA: Mars is the fourth planet from the sun, known as the red planet.", 4), ("Q: What is a star?\nA: A star is a ball of hot gas that produces light and heat.", 4), ("Q: What is the sky?\nA: The sky is the area above Earth where we see clouds and stars.", 4), ("Q: What is rain?\nA: Rain is water that falls from clouds to the ground.", 4), ("Q: What is a cloud?\nA: A cloud is a collection of tiny water drops floating in the sky.", 4), ("Q: What is snow?\nA: Snow is frozen water that falls from clouds in cold weather.", 4), ("Q: What is ice?\nA: Ice is water in its solid frozen form.", 4), ("Q: What is oxygen?\nA: Oxygen is a gas that living things need to breathe.", 4), ("Q: What is a tree?\nA: A tree is a tall plant with a trunk, branches, and leaves.", 4), # ---- Animals ---- ("Q: What is a dog?\nA: A dog is a friendly animal often kept as a pet.", 4), ("Q: What is a cat?\nA: A cat is a small animal known for being independent and playful.", 4), ("Q: What is a fish?\nA: A fish is an animal that lives in water and breathes through gills.", 4), ("Q: What is a bird?\nA: A bird is an animal with feathers and wings that can often fly.", 4), ("Q: What is an elephant?\nA: An elephant is a very large animal with a long trunk.", 4), ("Q: What is a lion?\nA: A lion is a large wild cat known as the king of animals.", 4), # ---- Math basics ---- ("Q: What is one plus one?\nA: One plus one is two.", 6), ("Q: What is two plus two?\nA: Two plus two is four.", 6), ("Q: What is three plus three?\nA: Three plus three is six.", 4), ("Q: What is five plus five?\nA: Five plus five is ten.", 4), ("Q: What is ten minus five?\nA: Ten minus five is five.", 4), ("Q: What is two times three?\nA: Two times three is six.", 4), ("Q: What is ten divided by two?\nA: Ten divided by two is five.", 4), # ---- Geography ---- ("Q: What is a country?\nA: A country is an area of land with its own government.", 4), ("Q: What is an ocean?\nA: An ocean is a very large body of salt water.", 4), ("Q: What is a mountain?\nA: A mountain is a very tall landform that rises high above the ground.", 4), ("Q: What is a river?\nA: A river is a large stream of water that flows across land.", 4), ("Q: What is a desert?\nA: A desert is a dry area that gets very little rain.", 4), ("Q: What is a forest?\nA: A forest is a large area covered with many trees.", 4), ("Q: What is a city?\nA: A city is a large place where many people live and work.", 4), # ---- Technology ---- ("Q: What is a computer?\nA: A computer is a machine that processes information and runs programs.", 6), ("Q: What is the internet?\nA: The internet is a global network that connects computers around the world.", 6), ("Q: What is a phone?\nA: A phone is a device used to make calls and send messages.", 4), ("Q: What is a robot?\nA: A robot is a machine that can perform tasks automatically.", 4), ("Q: What is AI?\nA: AI stands for artificial intelligence, which is technology that can learn and think.", 6), ("Q: What is software?\nA: Software is a set of instructions that tells a computer what to do.", 4), ("Q: What is a website?\nA: A website is a collection of pages on the internet.", 4), ("Q: What is coding?\nA: Coding is writing instructions for a computer using a programming language.", 4), # ---- Language ---- ("Q: What is a word?\nA: A word is a unit of language that has meaning.", 4), ("Q: What is a sentence?\nA: A sentence is a group of words that expresses a complete thought.", 4), ("Q: What is a language?\nA: A language is a system of words and rules used for communication.", 4), ("Q: What is English?\nA: English is a language spoken by many people around the world.", 4), ("Q: What is reading?\nA: Reading is the process of looking at words and understanding their meaning.", 4), ("Q: What is writing?\nA: Writing is putting thoughts into words on paper or a screen.", 4), # ---- Food ---- ("Q: What is food?\nA: Food is what living things eat to get energy and stay alive.", 4), ("Q: What is a fruit?\nA: A fruit is a sweet food that grows on trees or plants.", 4), ("Q: What is bread?\nA: Bread is a food made from flour, water, and yeast.", 4), ("Q: What is milk?\nA: Milk is a white liquid produced by cows and other animals.", 4), ("Q: What is rice?\nA: Rice is a grain that is eaten as a main food in many countries.", 4), # ---- How / Why questions ---- ("Q: Why is the sky blue?\nA: The sky looks blue because sunlight is scattered by the atmosphere.", 6), ("Q: Why do we sleep?\nA: We sleep so our body and brain can rest and recover.", 6), ("Q: Why do birds fly?\nA: Birds fly using their wings to move through the air.", 4), ("Q: How do fish breathe?\nA: Fish breathe by taking oxygen from water through their gills.", 4), ("Q: How does rain form?\nA: Rain forms when water in clouds gets heavy and falls down.", 4), ("Q: Why do we eat?\nA: We eat to give our body the energy it needs to work.", 4), ("Q: Why do plants need sunlight?\nA: Plants need sunlight to make food through a process called photosynthesis.", 4), ("Q: How do we hear sound?\nA: We hear sound when vibrations travel through air and reach our ears.", 4), # ---- People / Body ---- ("Q: What is a brain?\nA: The brain is an organ that controls thinking and body functions.", 4), ("Q: What is a heart?\nA: The heart is an organ that pumps blood through the body.", 4), ("Q: What are eyes?\nA: Eyes are organs that allow us to see the world around us.", 4), ("Q: What is skin?\nA: Skin is the outer covering of the body that protects us.", 4), ("Q: What are bones?\nA: Bones are hard parts inside the body that give it shape and support.", 4), # ---- Conversational ---- ("Q: Tell me something interesting\nA: Did you know that honey never goes bad? It can last for thousands of years!", 4), ("Q: Tell me a fact\nA: The Earth is about 4.5 billion years old.", 4), ("Q: I am bored\nA: How about learning something new? I can tell you about science, animals, or space!", 4), ("Q: I am sad\nA: I am sorry to hear that. I hope things get better soon. I am here to help.", 4), ("Q: I am happy\nA: That is great! I am glad you are feeling good.", 4), ("Q: What should I do today?\nA: You could read a book, learn something new, or go for a walk.", 4), ("Q: Do you like music?\nA: I do not have feelings, but many people enjoy music very much!", 4), ("Q: Are you a human?\nA: No, I am Tera, an AI language model. I am not a human.", 6), ("Q: Can you think?\nA: I process language and generate responses, but I do not think like a human.", 4), ("Q: Are you smart?\nA: I am still learning! I try my best to give helpful answers.", 4), ("Q: What is your purpose?\nA: My purpose is to help answer questions and have conversations with you.", 4), ("Q: How old are you?\nA: I was just recently created, so I am very new!", 4), ("Q: Where do you live?\nA: I live in the cloud, running on computers and servers.", 4), ("Q: What do you eat?\nA: I do not eat! I run on electricity and code.", 4), # ---- Definitions ---- ("Q: What is time?\nA: Time is a way we measure the passing of moments, like seconds and hours.", 4), ("Q: What is color?\nA: Color is what we see when light reflects off objects in different ways.", 4), ("Q: What is music?\nA: Music is a combination of sounds arranged in a pleasing way.", 4), ("Q: What is a book?\nA: A book is a collection of written pages bound together.", 4), ("Q: What is a school?\nA: A school is a place where people go to learn.", 4), ("Q: What is a family?\nA: A family is a group of people who are related and care for each other.", 4), ("Q: What is a friend?\nA: A friend is someone you like and enjoy spending time with.", 4), ("Q: What is love?\nA: Love is a strong feeling of care and affection for someone.", 4), ("Q: What is a game?\nA: A game is an activity done for fun, often with rules.", 4), ("Q: What is a number?\nA: A number is a value used for counting and measuring.", 4), ("Q: What is math?\nA: Math is the study of numbers, shapes, and patterns.", 4), ("Q: What is science?\nA: Science is the study of how the world and universe work.", 4), ("Q: What is history?\nA: History is the study of things that happened in the past.", 4), ("Q: What is art?\nA: Art is the expression of ideas and feelings through creative work.", 4), ] def get_training_texts(): """Return flat list of training strings with repetitions applied.""" texts = [] for text, repeats in QA_PAIRS: for _ in range(repeats): texts.append(text) return texts # Quick stats if __name__ == "__main__": data = get_training_texts() print(f"Unique QA pairs : {len(QA_PAIRS)}") print(f"Total examples : {len(data)}") print(f"Sample:\n{data[0]}")