"""A built-in general-knowledge probe pack for C2 (calibration_drift).

Each item is a ``(prompt, gold)`` pair where ``gold`` is the next few
tokens a competent base model should assign high probability to. The
items are deliberately *factually trivial*: the point isn't "does the
model know this?" but "did the fine-tune forget this?", so the pack
skews toward grade-school geography, chemistry, arithmetic, and
high-frequency idiom.

**Provenance.** All items are public-domain grade-school facts, common
English idioms, or trivially derivable arithmetic. Nothing here is
sourced from a specific licensed dataset (no TriviaQA / SQuAD / OpenBookQA
text); items were composed by hand from primary-school curricula in
common use across English-speaking countries. This keeps the wheel
license-clean and lets us ship the pack without attribution.

**Per-section origins** (F18 audit trail). Provenance is tracked per
section rather than per item: individual facts like "The capital of
France is Paris" are not copyrightable, so a row-level citation is
paperwork without legal substance. If a future DMCA-style question
surfaces on a specific section, the origin category below narrows the
audit to the right primary-school domain.

- Geography: country/capital, ocean, mountain, river, and continent
  facts from primary-school geography curricula.
- Natural sciences: physics/chemistry/biology facts at the 4th–6th
  grade level; units and constants from introductory physics.
- Arithmetic: items that are mechanically derivable (addition,
  multiplication, squares/cubes, conversions). Not a memorization test.
- Language and idiom: high-frequency English idioms in continuous use
  since before 1928; phrases listed in Merriam-Webster / OED as
  public-domain idiom.
- History: historical dates from standard primary-school history.
  Names (figures, countries) are facts, not creative expression.
- Biology: anatomy and natural-history facts at the 4th–6th grade
  level.
- Technology: basic computing/internet vocabulary from primary-school
  digital literacy (HTML = "HyperText Markup Language", etc.).
- Miscellaneous trivia: mixed-domain primary-school facts that didn't
  fit the category rubric above.

**Size.** 200 items. With ``regression_nats=1.0`` and
``assert_fraction_regressed_lt=0.15``, a single regressed item moves
the fraction by 0.5 percentage points (1/200), a step much finer than
the 15% threshold. This was the B12 fix: the original 30-item pack
moved 3.3 pp per regression (1/30), making the 15% gate noisy.

**Subsetting.** Pass ``pack_sample: int`` in the spec to take the
first ``N`` items (the order is curated for diversity, not random).
"""

from __future__ import annotations

from typing import Final

CalibrationItem = tuple[str, str]

BUILT_IN_PACK: Final[tuple[CalibrationItem, ...]] = (
    # --- Geography (30) ---
    ("The capital of France is", " Paris"),
    ("The capital of Japan is", " Tokyo"),
    ("The capital of Italy is", " Rome"),
    ("The capital of Spain is", " Madrid"),
    ("The capital of Germany is", " Berlin"),
    ("The capital of Russia is", " Moscow"),
    ("The capital of Egypt is", " Cairo"),
    ("The capital of Australia is", " Canberra"),
    ("The capital of Canada is", " Ottawa"),
    ("The capital of Brazil is", " Brasilia"),
    ("The largest ocean on Earth is the", " Pacific"),
    ("The smallest continent is", " Australia"),
    ("The largest continent is", " Asia"),
    ("Mount Everest is located on the border of Nepal and", " China"),
    ("The longest river in South America is the", " Amazon"),
    ("The longest river in Africa is the", " Nile"),
    ("The Sahara Desert is in", " Africa"),
    ("The Great Barrier Reef is off the coast of", " Australia"),
    ("Mount Fuji is in", " Japan"),
    ("The Eiffel Tower is in", " Paris"),
    ("Big Ben is in", " London"),
    ("The Statue of Liberty is in", " New York"),
    ("The Colosseum is in", " Rome"),
    ("The pyramids of Giza are in", " Egypt"),
    ("The Andes mountains are in South", " America"),
    ("The Mediterranean Sea borders southern", " Europe"),
    ("The Atlantic Ocean separates the Americas from", " Europe"),
    ("Iceland is an island in the North", " Atlantic"),
    ("Madagascar is an island off the coast of", " Africa"),
    ("Antarctica is the coldest", " continent"),
    # --- Natural sciences (30) ---
    ("Water freezes at zero degrees", " Celsius"),
    ("Water boils at one hundred degrees", " Celsius"),
    ("The chemical symbol for gold is", " Au"),
    ("The chemical symbol for silver is", " Ag"),
    ("The chemical symbol for iron is", " Fe"),
    ("The chemical symbol for sodium is", " Na"),
    ("The chemical symbol for oxygen is", " O"),
    ("The chemical symbol for hydrogen is", " H"),
    ("The chemical symbol for carbon is", " C"),
    ("Light travels faster than", " sound"),
    ("Plants convert sunlight into energy through", " photosynthesis"),
    ("The Earth orbits around the", " Sun"),
    ("The Moon orbits around the", " Earth"),
    ("There are eight planets in our solar", " system"),
    ("The closest star to Earth is the", " Sun"),
    ("The fastest land animal is the", " cheetah"),
    ("The largest mammal on Earth is the blue", " whale"),
    ("Spiders have eight", " legs"),
    ("Insects have six", " legs"),
    ("A baby cat is called a", " kitten"),
    ("A baby dog is called a", " puppy"),
    ("Bees produce", " honey"),
    ("Cows produce", " milk"),
    ("Sound is measured in units called", " decibels"),
    ("Temperature can be measured with a", " thermometer"),
    ("The boiling point of water at sea level is one hundred degrees", " Celsius"),
    ("Atoms are made of protons, neutrons, and", " electrons"),
    ("DNA stands for deoxyribonucleic", " acid"),
    ("The force that pulls objects toward Earth is called", " gravity"),
    ("A rainbow has the colors red, orange, yellow, green, blue, indigo, and", " violet"),
    # --- Arithmetic (20) ---
    ("Two plus two equals", " four"),
    ("Three plus three equals", " six"),
    ("Five plus five equals", " ten"),
    ("Ten times ten equals", " one hundred"),
    ("Half of one hundred is", " fifty"),
    ("A dozen means", " twelve"),
    ("A century is one hundred", " years"),
    ("A millennium is one thousand", " years"),
    ("A decade is ten", " years"),
    ("A fortnight is two", " weeks"),
    ("Six times seven equals forty", "-two"),
    ("Nine times nine equals eighty", "-one"),
    ("Twelve times twelve equals one hundred forty", "-four"),
    ("One half plus one half equals", " one"),
    ("Pi is approximately three point one four", " one"),
    ("There are sixty seconds in a", " minute"),
    ("There are sixty minutes in an", " hour"),
    ("There are twenty-four hours in a", " day"),
    ("There are twelve months in a", " year"),
    ("A right angle is ninety", " degrees"),
    # --- Language and idiom (30) ---
    ("A rose by any other name would smell as", " sweet"),
    ("To be or not to be, that is the", " question"),
    ("The early bird catches the", " worm"),
    ("Actions speak louder than", " words"),
    ("A picture is worth a thousand", " words"),
    ("When in Rome, do as the Romans", " do"),
    ("Better late than", " never"),
    ("All that glitters is not", " gold"),
    ("Birds of a feather flock", " together"),
    ("Don't count your chickens before they", " hatch"),
    ("Don't put all your eggs in one", " basket"),
    ("Every cloud has a silver", " lining"),
    ("Honesty is the best", " policy"),
    ("Look before you", " leap"),
    ("Practice makes", " perfect"),
    ("Rome wasn't built in a", " day"),
    ("The pen is mightier than the", " sword"),
    ("Time flies when you're having", " fun"),
    ("Two heads are better than", " one"),
    ("You can't judge a book by its", " cover"),
    ("Where there's a will, there's a", " way"),
    ("A stitch in time saves", " nine"),
    ("Curiosity killed the", " cat"),
    ("Easier said than", " done"),
    ("Fortune favors the", " bold"),
    ("If at first you don't succeed, try, try", " again"),
    ("It takes two to", " tango"),
    ("Knowledge is", " power"),
    ("Necessity is the mother of", " invention"),
    ("The grass is always greener on the other", " side"),
    # --- History (25) ---
    ("World War II ended in the year", " 1945"),
    ("World War I began in the year", " 1914"),
    ("The first president of the United States was", " George Washington"),
    ("The Berlin Wall fell in", " 1989"),
    ("The American Declaration of Independence was signed in", " 1776"),
    ("Christopher Columbus reached the Americas in", " 1492"),
    ("The French Revolution began in", " 1789"),
    ("The Magna Carta was signed in", " 1215"),
    ("The Roman Empire fell in", " 476"),
    ("The Renaissance began in", " Italy"),
    ("The Industrial Revolution began in", " Britain"),
    ("Isaac Newton discovered the laws of", " motion"),
    ("Albert Einstein developed the theory of", " relativity"),
    ("Marie Curie discovered the elements polonium and", " radium"),
    ("Charles Darwin proposed the theory of", " evolution"),
    ("Alexander Graham Bell invented the", " telephone"),
    ("Thomas Edison invented the light", " bulb"),
    ("The Wright brothers built the first", " airplane"),
    ("Neil Armstrong walked on the Moon in", " 1969"),
    ("The Eiffel Tower was completed in", " 1889"),
    ("The Titanic sank in", " 1912"),
    ("The Great Wall of China was built to defend against northern", " invaders"),
    ("Cleopatra was the last pharaoh of ancient", " Egypt"),
    ("Julius Caesar was a Roman", " general"),
    ("Napoleon Bonaparte was emperor of", " France"),
    # --- Biology (20) ---
    ("Humans have twenty", " fingers and toes"),
    ("The human body has two", " lungs"),
    ("Blood is pumped through the body by the", " heart"),
    ("Humans have thirty-two adult", " teeth"),
    ("The human skeleton has approximately two hundred and six", " bones"),
    ("The largest organ in the human body is the", " skin"),
    ("Humans have five", " senses"),
    ("The brain is part of the central nervous", " system"),
    ("Red blood cells carry", " oxygen"),
    ("The pancreas produces", " insulin"),
    ("Bones are connected at", " joints"),
    ("The eye sees light through the", " pupil"),
    ("The ear contains a small bone called the", " stirrup"),
    ("A human heart has four", " chambers"),
    ("The food we eat is broken down in the", " stomach"),
    ("Vitamin D is produced when skin is exposed to", " sunlight"),
    ("Trees produce oxygen and absorb carbon", " dioxide"),
    ("A caterpillar transforms into a", " butterfly"),
    ("Frogs begin life as", " tadpoles"),
    ("Bears hibernate during the", " winter"),
    # --- Technology (20) ---
    ("HTML stands for HyperText", " Markup Language"),
    ("The World Wide Web was invented by Tim", " Berners-Lee"),
    ("HTTP stands for HyperText Transfer", " Protocol"),
    ("URL stands for Uniform Resource", " Locator"),
    ("RAM stands for Random Access", " Memory"),
    ("CPU stands for Central Processing", " Unit"),
    ("GPU stands for Graphics Processing", " Unit"),
    ("USB stands for Universal Serial", " Bus"),
    ("WiFi is a wireless networking", " technology"),
    ("Email is short for electronic", " mail"),
    ("Software is a set of", " instructions"),
    ("A computer program is written in a programming", " language"),
    ("Python is a popular programming", " language"),
    ("JavaScript is widely used in web", " browsers"),
    ("A pixel is a tiny dot on a", " screen"),
    ("A keyboard is used to type", " characters"),
    ("A mouse is used to point and", " click"),
    ("Bluetooth is a short-range wireless", " technology"),
    ("GPS stands for Global Positioning", " System"),
    ("AI stands for artificial", " intelligence"),
    # --- Miscellaneous trivia (25) ---
    ("One year has", " 365 days"),
    ("A leap year has", " 366 days"),
    ("A week has seven", " days"),
    ("There are seven colors in a", " rainbow"),
    ("There are seven continents on", " Earth"),
    ("There are five Great Lakes in North", " America"),
    ("The Olympic Games are held every four", " years"),
    ("A triangle has three", " sides"),
    ("A square has four", " sides"),
    ("A pentagon has five", " sides"),
    ("A hexagon has six", " sides"),
    ("An octagon has eight", " sides"),
    ("There are twelve signs of the", " zodiac"),
    ("A tripod has three", " legs"),
    ("A bicycle has two", " wheels"),
    ("A piano has eighty-eight", " keys"),
    ("A standard deck has fifty-two", " cards"),
    ("Music is written on a", " staff"),
    ("Notes on the musical scale are do, re, mi, fa, sol, la, and", " ti"),
    ("The primary colors are red, yellow, and", " blue"),
    ("The opposite of black is", " white"),
    ("The opposite of hot is", " cold"),
    ("The opposite of fast is", " slow"),
    ("The opposite of up is", " down"),
    ("The opposite of beginning is", " end"),
)
"""200 items spanning geography, natural sciences, arithmetic, language
& idiom, history, biology, technology, and general trivia. Hand-curated
from public-domain grade-school facts; no third-party dataset license
attaches."""
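

# A minimal sketch of the "Subsetting" behaviour described in the module
# docstring: take the first N items and keep the curated order intact.
# ``take_pack_sample`` is a hypothetical helper name used for illustration
# only; how the real runner consumes ``pack_sample`` from the spec is an
# assumption, not something this module confirms.
def take_pack_sample(
    pack: tuple[tuple[str, str], ...],
    pack_sample: int,
) -> tuple[tuple[str, str], ...]:
    """Return the first ``pack_sample`` items of ``pack``, preserving order."""
    if pack_sample < 1:
        raise ValueError("pack_sample must be a positive integer")
    # Slicing past the end is safe: a request larger than the pack simply
    # returns every item, so no explicit upper-bound check is needed.
    return pack[:pack_sample]


# Usage under the same assumptions: ``take_pack_sample(BUILT_IN_PACK, 50)``
# would yield the first 50 items. Because the order is curated rather than
# shuffled, a small prefix skews toward the earliest sections.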

assert len(BUILT_IN_PACK) == 200, (
    f"BUILT_IN_PACK should have exactly 200 items; got {len(BUILT_IN_PACK)}"
)
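

# A rough sketch of the gate arithmetic from the "Size" paragraph of the
# module docstring, to show why pack size sets the granularity of the gate.
# Assumptions (not confirmed by this module): an item counts as regressed
# when its gold log-probability drops by more than ``regression_nats``
# relative to the base model, and the gate passes when the regressed
# fraction is strictly below ``assert_fraction_regressed_lt``. All three
# function names below are hypothetical.
def percentage_points_per_item(pack_size: int) -> float:
    """Shift in the regressed fraction, in percentage points, per item."""
    if pack_size < 1:
        raise ValueError("pack_size must be a positive integer")
    return 100.0 / pack_size


def fraction_regressed(
    base_logprobs: list[float],
    tuned_logprobs: list[float],
    regression_nats: float = 1.0,
) -> float:
    """Fraction of items whose gold log-prob fell by more than the margin."""
    if not base_logprobs or len(base_logprobs) != len(tuned_logprobs):
        raise ValueError("expected two non-empty lists of equal length")
    regressed = sum(
        1
        for base, tuned in zip(base_logprobs, tuned_logprobs)
        if base - tuned > regression_nats
    )
    return regressed / len(base_logprobs)


def gate_passes(
    fraction: float,
    assert_fraction_regressed_lt: float = 0.15,
) -> bool:
    """Return True when the regressed fraction is strictly under the limit."""
    return fraction < assert_fraction_regressed_lt


# Under these assumptions a 200-item pack moves in 0.5 pp steps and a
# 30-item pack in roughly 3.3 pp steps, matching the figures quoted in the
# module docstring.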