On suboptimal optimization.

I’ve been helping a friend learn the math behind optimization so that she can pass a graduation-requirement course in linear algebra. 

Optimization is a wonderful mathematical tool.  Biochemists love it – progression toward an energy minimum directs protein folding, among other physical phenomena.  Economists love it – whenever you’re trying to make money, you’re solving for a constrained maximum.  Philosophers love it – how can we provide the most happiness for a population?  Computer scientists love it – self-taught translation algorithms use this same methodology (I still believe that you could mostly replace Ludwig Wittgenstein’s Philosophical Investigations with this New York Times Magazine article on machine learning and a primer on principal component analysis).

But, even though optimization problems are useful, the math behind them can be tricky.  I’m skeptical that this mathematical technique is essential for everyone who wants a B.A. to grasp – my friend, for example, is a wonderful preschool teacher who hopes to finally finish a degree in child psychology.  She would have graduated two years ago except that she’s failed this math class three times.

I could understand if the university wanted her to take statistics, as that would help her understand psychology research papers … and the science underlying contemporary political debates … and value-added models for education … and more.  A basic understanding of statistics might make people better citizens.

Whereas … linear algebra?  This is a beautiful but counterintuitive field of mathematics.  If you’re interested in certain subjects – if you want to become a physicist, for example – you really should learn this math.  A deep understanding of linear algebra can enliven your study of quantum mechanics.

The summary of quantum mechanics: animation by Templaton.

Then again, Werner Heisenberg, who was a brilliant physicist, had a limited grasp of linear algebra.  He made huge contributions to our understanding of quantum mechanics, but his lack of mathematical expertise occasionally held him back.  He never quite understood the implications of the Heisenberg Uncertainty Principle, and he failed to provide Adolf Hitler with an atomic bomb.

In retrospect, maybe it’s good that Heisenberg didn’t know more linear algebra.

While I doubt that Heisenberg would have made a great preschool teacher, I don’t think that deficits in linear algebra were deterring him from that profession.  After each evening that I spend working with my friend, I do feel that she understands matrices a little better … but her ability to nurture children isn’t improving.

And yet.  Somebody in an office decided that all university students here need to pass this class.  I don’t think this rule optimizes the educational outcomes for their students, but perhaps they are maximizing something else, like the registration fees that can be extracted.

Optimization is a wonderful mathematical tool, but it’s easy to misuse.  Numbers will always do what they’re supposed to, but each such problem begins with a choice.  What exactly do you hope to optimize?

Choose the wrong thing and you’ll make the world worse.

#

Figure 1 from Eykholt et al., 2018.

Most automobile companies are researching self-driving cars.  They’re the way of the future!  In a previous essay, I included links to studies showing that unremarkable-looking graffiti could confound self-driving cars … but the issue I want to discuss today is both more mundane and more perfidious.

After all, using graffiti to make a self-driving car interpret a stop sign as “Speed Limit 45” is a design flaw.  A car that accelerates instead of braking in that situation is not operating as intended.

But passenger-less self-driving cars that roam the city all day, intentionally creating as many traffic jams as possible?  That’s a feature.  That’s what self-driving cars are designed to do.

A machine designed to create traffic jams?

Despite my wariness about automation and algorithms run amok, I hadn’t considered this problem until I read Adam Millard-Ball’s recent research paper, “The Autonomous Vehicle Parking Problem.” Millard-Ball begins with a simple assumption: what if a self-driving car is designed to maximize utility for its owner?

This assumption seems reasonable.  After all, the AI piloting a self-driving car must include an explicit response to the trolley problem.  Should the car intentionally crash and kill its passenger in order to save the lives of a group of pedestrians?  This ethical quandary is notoriously tricky to answer … but a computer scientist designing a self-driving car will probably answer, “no.” 

Otherwise, the manufacturers won’t sell cars.  Would you ride in a vehicle that was programmed to sacrifice you?

Luckily, the AI will not have to make that sort of life and death decision often.  But here’s a question that will arise daily: if you commute in a self-driving car, what should the car do while you’re working?

If the car was designed to maximize public utility, perhaps it would spend those hours serving as a low-cost taxi.  If demand for transportation happened to be lower than the quantity of available, unoccupied self-driving cars, it might use its elaborate array of sensors to squeeze into as small a space as possible inside a parking garage.

But what if the car is designed to benefit its owner?

Perhaps the owner would still want the car to work as a taxi, just as an extra source of income.  But some people – especially the people wealthy enough to afford to purchase the first wave of self-driving cars – don’t like the idea of strangers mucking around in their vehicles.  Some self-driving cars would spend those hours unoccupied.

But they won’t park.  In most cities, parking costs between $2 and $10 per hour, depending on whether it’s street or garage parking, whether you purchase a long-term contract, etc. 

The cost to just keep driving is generally going to be lower than $2 per hour.  Worse, this cost is a function of the car’s speed.  If the car is idling at a dead stop, it will use approximately 0.1 gallon per hour, costing 25 cents per hour at today’s prices.  If the car is traveling at 30 mph without breaks, it will use approximately 1 gallon per hour, costing $2.50 per hour.

To save money, the car wants to stay on the road … but it wants traffic to be as close to a standstill as possible.
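The arithmetic behind that conclusion fits in a few lines of Python.  This is only a rough sketch using the approximate figures quoted above: the price per gallon is inferred from them (0.1 gallon per hour costing 25 cents), the option names are made up for illustration, and everything except fuel and parking fees (wear, tolls, insurance) is ignored.

```python
# Figures quoted in the essay. The price per gallon is implied by them:
# 0.1 gallon/hour costing $0.25/hour means $2.50 per gallon.
PRICE_PER_GALLON = 2.50

# Approximate fuel use, in gallons per hour, for an unoccupied car.
FUEL_USE = {
    "idling in a jam": 0.1,
    "cruising at 30 mph": 1.0,
}

# Ends of the hourly parking range mentioned in the essay, in dollars.
PARKING = {
    "cheapest parking": 2.00,
    "priciest parking": 10.00,
}


def hourly_costs():
    """Return a dict mapping each option to its cost in dollars per hour."""
    costs = {name: gallons * PRICE_PER_GALLON for name, gallons in FUEL_USE.items()}
    costs.update(PARKING)
    return costs


costs = hourly_costs()
cheapest = min(costs, key=costs.get)

# List the options from cheapest to most expensive.
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f}/hour")
print("Cheapest option:", cheapest)
```

Under these assumptions, steady cruising at 30 mph actually costs a bit more than the cheapest parking; it is the near-standstill that makes staying on the road the bargain.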

Luckily for the car, this is an easy optimization problem.  It can consult its onboard GPS to find nearby areas where traffic is slow, then drive over there.  As more and more self-driving cars converge on the same jammed streets, they’ll slow traffic more and more, allowing them to consume the workday with as little motion as possible.

Photo by walidhassanein on Flickr.

Pity the person sitting behind the wheel of an occupied car on those streets.  All the self-driving cars will be having a great time stuck in that traffic jam: we’re saving money!, they get to think.  Meanwhile the human is stuck swearing at empty shells, cursing a bevy of computer programmers who made their choices months or years ago.

And all those idling engines exhale carbon dioxide.  But it doesn’t cost money to pollute, because one political party’s worth of politicians willfully ignore the fact that capitalism, by philosophical design, requires we set prices for scarce resources … like clean air, or habitable planets.

On the Tower of Babel and beneficial curses.

In Jack Vance’s The Eyes of the Overworld, a bumbling anti-hero named Cugel the Clever is beset by one misfortune after another.  He attempts to burglarize a wizard’s palace but is caught in the act.  The wizard Iucounu forces Cugel to retrieve an ancient artifact – a seemingly suicidal quest.  To ensure that Cugel does not shirk his duties, Iucounu subjects him to the torments of Firx, a subcutaneous parasite who entwines searingly with nerve endings in Cugel’s abdomen, and whose desire to reunite with his mate in Iucounu’s palace will spur Cugel ever onward.

Early in his journey, Cugel is chased by a gang of bandits.  He escapes into a crumbling fortress – only to find that the fortress is haunted.

The ghost spoke: “Demolish this fort.  While stone joins stone I must stay, even while Earth grows cold and swings through darkness.”

          “Willingly,” croaked Cugel, “if it were not for those outside who seek my life.”

          “To the back of the hall is a passage.  Use stealth and strength, then do my behest.”

          “The fort is as good as razed,” declared Cugel fervently.  “But what circumstances bound you to so unremitting a post?”

          “They are forgotten; I remain.  Perform my charge, or I curse you with an everlasting tedium like my own!”

“Everlasting tedium” sounds like a raw deal, so Cugel figures he’d better slay his assailants and get to wrecking this haunted edifice.  He kills three bandits and mortally wounds the fourth with a boulder to the head:

Cugel came cautiously forward.  “Since you face death, tell me what you know of hidden treasure.”

          “I know of none,” said the bandit.  “Were there such you would be the last to learn, for you have killed me.”

          “This is no fault of mine,” said Cugel.  “You pursued me, not I you.  Why did you do so?”

          “To eat, to survive, though life and death are equally barren and I despise both equally.”

          Cugel reflected.  “In this case you need not resent my part in the transition which you now face.  The question regarding hidden valuables again becomes relevant.  Perhaps you have a final word on this matter?”

          “I have a final word.  I display my single treasure.”  The creature groped in its pouch and withdrew a round white pebble.  “This is the skull-stone of a grue, and at this moment trembles with force.  I use this force to curse you, to bring upon you the immediate onset of cankerous death.”

“Immediate onset of cankerous death” sounds grim.  Dude’s day has gone from bad to worse.

          Cugel hastily killed the bandit, then heaved a dismal sigh.  The night had brought only difficulty.  “Iucounu, if I survive, there shall be a reckoning indeed!”

          Cugel turned to examine the fort.  Certain of the stones would fall at a touch; others would require much more effort.  He might well not survive to perform the task.  What were the terms of the bandit’s curse?  “ – immediate onset of cankerous death.”  Sheer viciousness.  The ghost-king’s curse was no less oppressive: how had it gone?  “ – everlasting tedium.”

          Cugel rubbed his chin and nodded gravely.  Raising his voice, he called, “Lord ghost, I may not stay to do your bidding: I have killed the bandits and now I depart.  Farewell and may the eons pass with dispatch.”

          From the depths of the fort came a moan, and Cugel felt the pressure of the unknown.  “I activate my curse!” came a whisper to Cugel’s brain.

          Cugel strode quickly away to the southeast.  “Excellent; all is well.  The ‘everlasting tedium’ exactly countervenes the ‘immediate onset of death’ and I am left only with the ‘canker’ which, in the person of Firx, already afflicts me.  One must use his wits in dealing with maledictions.”

At times, one curse can save us from another.

#

In the biblical story of the Tower of Babel, humans are cursed for building a bridge to heaven.  Implicit in this story is the idea that humans nearly succeeded: our edifice of bricks and stone was threatening God.

In part, this story was written to disparage other religious beliefs.  In the beginning, Yahweh was worshiped by a small tribe of relatively powerless people, and so the Old Testament seems to be riddled with rebuttals (some of which I’ve discussed previously, here).  In From Gods to God (translated by Valerie Zakovitch), Avigdor Shinan and Yair Zakovitch write that:

The derivation of “Babel” from b-l-l seems to have originated as a response to the widely accepted Babylonian explanation of that place’s name, Bab-ilu, “God’s Gate,” or Bab I-lani, “Gate of the Gods” – a meaning that, we’ll soon see, was known in Israel.  Indeed, the story of the Tower of Babel in its entirety polemicizes against a Babylonian tradition according to which the tower-temple in Babylon, which was dedicated to the god Marduk, was built as a tribute both to him and to the belief that Babylon was the earthly passageway between heaven and earth.  According to ancient Babylonian belief, the tower in Babylon – Babel – was Heaven’s Gate.

It seems that the biblical writer, unwilling to accept that Babylon – a pagan city – was the entryway to heaven, found various ways to counter this Babylonian tradition that was well known in Israel.  First, he converted the story of the building into one of ultimate failure and human conceit.  At the same time, though, he introduced an alternative story about the gate to heaven.  This time the gate’s location was in Israel, the Land of One God.  This replacement story is found in Genesis 28: the story of Jacob’s dream.

The Bible succeeded in its propaganda campaign: by now the standard interpretation of the Tower of Babel is that humans approached the world with insufficient humility, we began a technological campaign that ultimately ended in failure, and Yahweh cursed us such that we could not cooperate well enough to attempt a similar project in the future.  Babel – Babylon – was not a passageway to heaven.  The gateway was never finished.  Because we’ve lost the ability to communicate with each other, it never will be finished.

#

The story of the Tower of Babel implies that all humans shared a single language before our brash undertaking.  The world’s current multitude of tongues were spawned by Yahweh’s curse.  But… what if languages are good?  What if we need diversity?

In 1940, Benjamin Lee Whorf speculated that the language we speak shapes the way we think.  His idea was egregiously overstated – creatures with no spoken language seem to be perfectly capable of thought, so there’s no reason to assume that humans who speak a language that lacks a certain word or verb tense can’t understand the underlying concepts.

But Whorf’s basic idea is reasonable.  It is probably easier to have thoughts that can be expressed in your language.

For example, the best language we’ve developed to discuss quantum mechanics is linear algebra; because Werner Heisenberg had only passing familiarity with this language, he had some misconceptions about the Heisenberg Uncertainty Principle.

Or there’s the case of my first Ph.D. advisor, who told me that he spent time working construction in Germany after high school.  He said that he spoke extremely poor German… but still, after he’d been in the country long enough, this was the language he reflexively thought in.  He said that he could feel his impoverished language lulling him into impoverished thought.

His language was probably more like a headwind than a cage – we constantly invent words as we struggle to express ourselves, so it’s clear that the lack of a word can’t prevent a thought – but he felt his mind to be steered all the same.

Whorf’s theory of language is also a major motif in Elif Batuman’s The Idiot, in which the characters’ English-language miscommunication is partly attributed to their different linguistic upbringings.  The narrator is perpetually tentative: did her years speaking Turkish instill this in her?

I wrote a research paper about the Turkish suffix -miş.  I learned from a book about comparative linguistics that it was called the inferential or evidential tense, and that similar structures existed in the languages of Estonia and Tibet.  The Turkish inferential tense, I read, was used in various forms associated with oral transmission and hearsay: fairy tales, epics, jokes, and gossip.

… [-miş] was a curse, condemning you to the awareness that everything you said was potentially encroaching on someone else’s experience, that your own subjectivity was booby-trapped and set you up to have conflicting stories with others.  … There was no way to go through life, in Turkish or any other language, making only factual statements about direct observations.  You were forced to use -miş, just by the human condition – just by existing in relation to other people.

She felt cursed by the need to constantly consider why she held her beliefs.  And yet.  Wouldn’t we all be better off if more people considered the provenance of their beliefs?

#

Most languages have good features and bad.  English has its flaws – I wish it had a subjunctive tense – but I like that it isn’t as gendered as most European languages – which treat every object as either masculine or feminine – or Thai – in which men and women are expected to use different words to say a simple “thank you.”  Although Thai culture is in many ways more accepting of those who were born with the wrong genitalia than we are in the U.S., I imagine every “thank you” would be fraught for a kid striving to establish his or her authentic identity.

And, Turkish?  I know nothing about the language except what I learned from Batuman’s novel.  So I’d never argue that speaking Turkish gives people a better view of the world.

But I think that our world as a whole is made better by hosting a diversity of perspectives.  Perhaps no language is better than any other … but, if different languages allow for different ways of thinking … then a world with several languages seems better than a world with only one.

This is the central idea explored by Abdelfattah Kilito in his recent essay, The Tongue of Adam (translated by Robyn Creswell).  After an acquaintance was dismissive of Kilito, who is Moroccan, for composing an academic text in Arabic instead of French, he meditated on the value of different languages and the benefits of living in a world with many.

Here is Kilito’s description of the curse Yahweh used to stop humans from completing the Tower of Babel:

After Babel, men cannot seek to rival God as they seemed to do when they began building the tower.  They cannot, because they’ve lost the original language.  God’s confusion of tongues ensures his supremacy.  The idea may seem odd, but consider the story of Babel as we find it in Genesis: “And they said, Go to, let us build us a city and a tower, whose top may reach unto heaven; and let us make us a name, lest we be scattered abroad upon the face of the whole earth” [11:4].  A tower whose top would touch the heavens: taken literally, the expression suggests a desire to reach the sky, to become like gods.  A rather worrisome project: “And the Lord came down to see the city and the tower, which the children of men builded” [11:5].  Man’s attempt to rise up is answered by the Lord’s descent: “Go to, let us go down, and there confound their language, that they may not understand one another’s speech” [11:7].  God does not destroy the work.  He punishes men by confounding their language, the only language, the one that unites them.  For Yahweh, the root of the menace is this tongue, which gives men tremendous power in their striving toward a single goal, an assault on the heavens.  The confusion of tongues brings this work to a stop; it is a symbolic demolition, the end of mankind’s hopes and dreams.  Deprived of its original language, mankind breaks into groups and scatters across the surface of the earth.  With its route to the heavens cut off, mankind turns its eyes to the horizon.

And here is Kilito’s description of this same dispersal as a blessing:

The expression, “the diversity of your languages,” in [Quran 30:22, which states that “Among His wonders is the creation of the heavens and the earth, and the diversity of your languages and colors.  In these are signs for mankind”], means not only the diversity of spoken tongues, but also, according to some commentators, the diversity of articulated sounds and pronunciation of words.  Voice, like the color of the skin, varies from one individual to the next.  This is a divine gift.  Otherwise, ambiguity, disorder, and misunderstanding would reign. … Plurality and heterogeneity are the conditions of knowledge.

Kilito endorses Whorf’s theory of language.  Here is his analysis of the birth of Arabic as told in the Quran:

According to Jumahi, “Ismael is the first to have forgotten the language of his father.” This rupture in language must have been brutal: in a blinding instant, one language is erased and cedes its place to another.  According to Jahiz, Ismael acquired Arabic without having to learn it.  And because the ancient language disappeared without a trace, he had no trouble expressing himself in the new one.  This alteration, due to divine intervention, also affected his character and his nature, in such a way that his whole personality changed.

His personality is changed because his language is changed: new words meant a new way of thinking, a new way of seeing the world.  If humans had not built the Tower of Babel – if we had never been cursed – we would share a single perspective… an ideological monoculture like a whole world paved over with strip mall after strip mall … the same four buildings, over and over … Starbucks, McDonalds, Walmart, CAFO … Starbucks, McDonalds …

#

The current occupancy of the White House … and Congress … and the U.S. Supreme Court … seems a curse.  The health care proposals will allow outrageous medical debt to wreck a lot of people’s lives, and each of us has only a single life to live.  Those who complete their educations in the midst of the impending recession will have lifelong earnings far lower than those who chance to graduate during boom years.  Our vitriolic attorney general will devastate entire communities by demanding that children and parents and neighbors and friends be buried alive for low-level, non-violent criminal offenses.  Innocent kids whose parents are needlessly yanked away will suffer for the entirety of their lives.

I can’t blithely compare this plague to fantasy tales in the Bible.  Real people are going to suffer egregiously.

At the same time, I do think that kind-hearted citizens of the United States needed to be saved from our own complacency.  Two political parties dominate discourse in this country – since the Clinton years, these parties have espoused very similar economic and punitive policies.  I have real sympathy for voters who couldn’t bear to vote for another Clinton in the last election because they’d seen their families steadily decline in a nation helmed by smug elitists.

Worse, all through the Obama years, huge numbers of people deplored our world’s problems – widespread ignorance, mediocre public education, ever-more-precarious climate destabilization, an unfair mental toll exacted on marginalized communities – without doing anything about it.  Some gave money, but few people – or so it seemed to me – saw those flaws as a demand to change their lives.

Anyone who cares deeply about climate change can choose to eat plants, drive less, drive a smaller car, buy used, and simply buy less.  Anyone embarrassed by the quality of education available in this country… can teach.  We can find those who need care, and care for them.

After the 45th stepped into office – or so it has seemed to me – more people realized that change, and hope, and whatnot … falls to us.  Our choices, as individuals, make the world.  I’ve seen more people choosing to be better, and for that I am grateful.

Obviously, I wish it hadn’t come to this.  But complacency is a curse.  Sometimes we need a new curse to countervene another.

On uncertainty (with cartoon ending).

See this monstrosity, in its entirety, at the end of this essay.

Reading about the uncertainty principle in popular literature almost always sets my teeth on edge.

I assume most people have a few qualms like that, things they often see done incorrectly that infuriate them.  After a few pointed interactions with our thesis advisor, a friend of mine started going berserk whenever he saw “it’s” and “its” misused on signs.  My middle school algebra teacher fumed whenever he saw store prices marked “.25% off!” when they meant you’d pay three quarters of the standard price, not 99.75%.  A violinist friend with perfect pitch called me (much too early) on a Sunday morning to complain that the birds on her windowsill were out of tune… how could she sleep when they couldn’t hit an F#??

“Ha,” I say.  “That’s silly… they should just let it go.”  But then I start frowning and sputtering when I read about the uncertainty principle.  Anytime somebody writes a line to the effect of, we’ve learned from quantum mechanics that measurement obscures the world, so we will always be uncertain what reality might have been had we not measured it.

My ire is risible in part because the idea isn’t so bad.  It even holds in some fields.  Like social psychology, I’d say.  If a research group identifies a peculiarity of the human mind and then widely publicizes their findings, that peculiarity might go away.  There was a study published shortly before I got my first driver’s license concluding that the rightmost lanes of toll booths were almost always fastest.  Now that’s no longer true.  Humans can correct their mistakes, but first they have to realize they’re mistaken.

That’s not the uncertainty principle, though.

And, silly me, I’d always thought that this misconception was due to liberal arts professors wanting to cite some fancy-sounding physics they didn’t understand.  I didn’t realize the original misconception was due to Heisenberg himself.  In The Physical Principles of Quantum Theory, he wrote (and please note that this is not the correct explanation for the uncertainty principle):

Thus suppose that the velocity of a free electron is precisely known, while the position is completely unknown.  Then the principle states that every subsequent observation of the position will alter the momentum by an unknown and undeterminable amount such that after carrying out the experiment our knowledge of the electronic motion is restricted by the uncertainty relation.  This may be expressed in concise and general terms by saying that every experiment destroys some of the knowledge of the system which was obtained by previous experiments.

Most of this isn’t so bad, despite not being the uncertainty principle.  The next line is worse, if what you’re hoping for is an accurate translation of quantum mechanics into English.

This formulation makes it clear that the uncertainty relation does not refer to the past; if the velocity of the electron is at first known and the position then exactly measured, the position for times previous to the measurement may be calculated.  Then for these past times ∆p∆q [“p” stands for momentum and “q” stands for position in most mathematical expressions of quantum mechanics] is smaller than the usual limiting value, but this knowledge of the past is of a purely speculative character, since it can never (because of the unknown change in momentum caused by the position measurement) be used as an initial condition in any calculation of the future progress of the electron and thus cannot be subjected to experimental verification.

That’s not correct.  The uncertainty principle is not about measurement; it’s about the world and what states the world itself can possibly adopt.  We can’t trace both position and momentum backward through time to learn where an electron was and how fast it was moving, because the interactions that define a measurement create discrete properties; they do not reveal crisp properties that pre-existed the measurement.

Heisenberg was a brilliant man, but he made two major mistakes (that I know of, at least.  Maybe he had his own running tally of things he wished he’d done differently).  One mistake may have saved us all, as was depicted beautifully in Michael Frayn’s Copenhagen (also… they made a film of this?  I was lucky enough to see the play in person, but I’ll have to watch it again!) — who knows what would’ve happened if Germany had the bomb?

Heisenberg’s other big mistake was his word-based interpretation of the uncertainty principle he discovered.

His misconception is understandable, though.  It’s very hard to translate from mathematics into words.  I’ll try my best with this essay, but I might botch it too – it’s going to be extra-hard for me because my math is so rusty.  I studied quantum mechanics from 2003 to 2007 but since then haven’t had professional reasons to work through any of the equations.  Eight years of lassitude is a long time, long enough to forget a lot, especially because my mathematical grounding was never very good.  I skipped several prerequisite math courses because I had good intuition for numbers, but this meant that when my study groups solved problem sets together we often divided the labor such that I’d write down the correct answer then they’d work backwards from it and teach me why it was correct.

I solved equations Robert Johnson crossroads style, except I had a Texas Instruments graphing calculator instead of a guitar.

The other major impediment Heisenberg was up against is that the uncertainty principle is most intuitive when expressed in matrix mechanics… and Heisenberg had no formal training in linear algebra.  I hadn’t realized this until I read Jagdish Mehra’s The Formulation of Matrix Mechanics and Its Modifications from his Historical Development of Quantum Theory.  A charming book, citing many of the letters the researchers sent to one another, providing mini-biographies of everyone who contributed to the theory.  The chapter describing Heisenberg’s rush to learn matrices in order to collaborate with Max Born and Pascual Jordan before the former left for a lecture series in the United States has a surprising amount of action for a history book about mathematics… but the outcome seems to be that Heisenberg’s rushed autodidacticism left him with some misconceptions.

Which is too bad.  The key idea was Heisenberg’s, the idea that non-commuting variables might underlie quantum behavior.

Commuting? I should probably explain that, at least briefly.  My algebra teacher, the same one who turned apoplectic when he saw miswritten grocery store discount signs, taught the subject like it was gym class (which I mean as a compliment, despite hating gym class).  Each operation was its own sport with a set of rules.  Multiplication, for instance, had rules that let you commute, and distribute, and associate.  When you commute, you get to shuffle your players around.  7 • 5 will give you the same answer as 5 • 7.

But just because kicks to the head are legal in MMA doesn’t mean you can do ’em in soccer.  You’re allowed to commute when you’re playing multiplication, but you can’t do it in quantum mechanics.  You can’t commute matrices either, which was why Born realized that they might be the best way to express quantum phenomena algebraically.  If you have a matrix A and another matrix B, then A • B will often not be the same as B • A.

That difference underlies the uncertainty principle.
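If you’d like to see non-commuting in action, here is a minimal Python sketch.  The two matrices A and B are arbitrary examples chosen for illustration (they have nothing to do with position or momentum), and the matmul helper is just a hand-rolled 2×2 multiplication so that nothing beyond plain Python is needed.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Ordinary numbers commute: the order of the factors doesn't matter.
print(7 * 5 == 5 * 7)

# Two arbitrary example matrices (hypothetical, chosen only for illustration).
A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]

AB = matmul(A, B)  # B on the right swaps the columns of A
BA = matmul(B, A)  # B on the left swaps the rows of A

print("A • B =", AB)
print("B • A =", BA)
print("Same answer?", AB == BA)
```

With these particular matrices, multiplying by B on one side swaps columns and on the other side swaps rows, so the two products come out different.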

So, here’s the part of the essay wherein I will try my very best to make the math both comprehensible and accurate.  But I might fail at one or the other or both… if so, my apologies!

A matrix is an array of numbers that represents an operation.  I think the easiest way to understand matrices is to start by imagining operators that work in two dimensions.

Just like surgeons all dressed up in their scrubs and carrying a gleaming scalpel and peering down the corridors searching for a next victim, every operator needs something to operate on.  In the case of surgeons, it’s moneyed sick people.  In the case of matrices, it’s “vectors.”

As a first approximation, you can imagine vectors are just coordinate pairs.  Dots on a graph.  Typically the term “vector” implies something with a starting point, a direction, and a length… but it’s not a big deal to imagine a whole bunch of vectors that all start from the origin, so then all you need to know is the point at which the tip of an arrow might end.

It’ll be easiest to show you some operations if we have a bunch of vectors.  So here’s a list of them, always with the x coordinate written above the y coordinate.

3        4        5        2        6        1         7         3          5

0 ,      0 ,      0 ,      1 ,      1 ,      2 ,       2 ,       5 ,        5

That set of points makes a crude smiley face.

graph-1

And we can operate on that set of points with a matrix in order to change the image in a predictable way.  I’ve always thought the way the math works here is cute… you have to imagine a vector leaping out of the water like a dolphin or killer whale and then splashing down horizontally onto the matrix.  Then the vector sinks down through the rows.

It won’t be as fun when I depict it statically, but the math works like this:

Picture 2

Does it make sense why I imagine the vector, the (x,y) thing, flopping over sideways?
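If it helps to see the arithmetic spelled out, here is a minimal Python sketch of that flop-and-sink rule.  The function name `apply` and the example matrix are my own inventions for illustration; the smiley points are the ones listed earlier.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    # The vector flops sideways onto each row: multiply pairwise, then add.
    return (a * x + b * y, c * x + d * y)

# The smiley-face points from the list above, as (x, y) pairs.
smiley = [(3, 0), (4, 0), (5, 0), (2, 1), (6, 1), (1, 2), (7, 2), (3, 5), (5, 5)]

# A made-up matrix, chosen only so the arithmetic is easy to follow.
example = ((1, 2), (3, 4))
print(apply(example, (5, 6)))  # (1*5 + 2*6, 3*5 + 4*6) = (17, 39)

# Operating on the whole picture just means operating on every point.
transformed = [apply(example, point) for point in smiley]
```

The top row of the matrix decides the new x coordinate and the bottom row decides the new y coordinate.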

The simplest matrix is something called an “identity” matrix.  It looks like this:

Picture 4

When we multiply a vector by the identity matrix, it isn’t changed.  The zeros mean the y term of our initial vector won’t affect the x term of our result, and the x term of our initial vector won’t affect the y term of our result.  Here:

Picture 5

And there are a couple other simple matrices we might consider (you’ll only need to learn a little more before I get back to that “matrices don’t commute” idea).

If we want to make our smiling face twice as big, we can use this operator:

2   0

0   2

Hopefully that matrix makes a little bit of sense.  The x and y terms still do not affect each other, which is why we have the zeros on the upward diagonal, and every coordinate must become twice as large to scoot everything farther from the origin, making the entire picture bigger.
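Here is a short Python sketch, assuming the same row-by-row multiplication rule, that checks both claims: the identity matrix leaves every point alone, and the doubling matrix pushes every point twice as far from the origin.  The helper name `apply` is hypothetical.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

identity = ((1, 0), (0, 1))  # ones on the main diagonal, zeros elsewhere
double = ((2, 0), (0, 2))    # the "twice as big" operator

smiley = [(3, 0), (4, 0), (5, 0), (2, 1), (6, 1), (1, 2), (7, 2), (3, 5), (5, 5)]

unchanged = [apply(identity, point) for point in smiley]
bigger = [apply(double, point) for point in smiley]

print(unchanged == smiley)     # True
print(apply(double, (3, 5)))   # (6, 10)
```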

We could instead make a mirror image of our picture by reflecting across the y axis:

-1   0

0    1

Or rotate our picture 90º counterclockwise:

0  -1

1   0

The rotation matrix has those terms because the previous Y axis spins down to align with the negative X axis, and the X axis rotates up to become the positive Y axis.
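You can check that description of the axes directly.  This is a small sketch with a hypothetical helper, `apply`, that feeds the two axis directions through each matrix.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

reflect = ((-1, 0), (0, 1))  # mirror image across the y axis
rotate = ((0, -1), (1, 0))   # quarter turn counterclockwise

x_axis = (1, 0)
y_axis = (0, 1)

print(apply(rotate, x_axis))   # (0, 1): the old X axis now points along positive Y
print(apply(rotate, y_axis))   # (-1, 0): the old Y axis now points along negative X
print(apply(reflect, (3, 5)))  # (-3, 5): an eye hops to the other side of the y axis
```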

And those last two operators, mirror reflection and rotation, will let us see why the commutative property does not hold in linear algebra.  Why A • B is not necessarily equal to B • A if both A & B are matrices.

Here are some nifty pictures showing what happens when we first reflect our smile then rotate, versus first rotating then reflecting.  If the matrices did commute, if A • B = B • A, the outcome of the pair of operations would be the same no matter what order they were applied in.  And it isn’t!  The top row of the image below shows reflection then rotation; the bottom row shows rotating our smile then reflecting it.

graph-2
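The same comparison can be done with arithmetic instead of pictures.  Below is a minimal sketch, using a hypothetical helper `matmul`, that multiplies the reflection and rotation matrices in both orders.  One convention to keep in mind: in the product A • B, the matrix on the right acts on the vector first.

```python
def matmul(m, n):
    """Product m • n of two 2x2 matrices; applying it means "do n, then m"."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

reflect = ((-1, 0), (0, 1))  # mirror image across the y axis
rotate = ((0, -1), (1, 0))   # quarter turn counterclockwise

reflect_then_rotate = matmul(rotate, reflect)
rotate_then_reflect = matmul(reflect, rotate)

print(reflect_then_rotate)  # ((0, -1), (-1, 0))
print(rotate_then_reflect)  # ((0, 1), (1, 0))
print(reflect_then_rotate == rotate_then_reflect)  # False: the order matters
```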

And that, in essence, is where the uncertainty principle comes from.  Although there is one more mathematical concept that I should tell you about, the other rationale for using matrices to understand quantum mechanics in the first place.

You can write a matrix that would represent any operation or any set of forces.  One important class of matrices are those that use the positions of each relevant object, like the locations of each electron around a nucleus, in order to calculate the total energy of a system.  The electrons have kinetic energy based on their momentum (their mass times the derivative of their position with respect to time) and potential energy related to their position itself, due to interaction with the protons in the nucleus and, if there are multiple electrons, repulsive forces between each other…

(I assume you’ve heard the term “two-body problem” before, used by couples who are trying to find a pair of jobs in the same city so they can move there together.  It’s a big issue in science and medicine, double matching for residencies, internships, post-docs, etc.  Well, it turns out that nobody thinks it’s funny to make a math joke out of this and say, “At least two-body problems are solvable.  Three-body problems have to be approximated numerically.”)

…but once you have a wavefunction (which is basically just a fancy vector, now with a stack of functions instead of a stack of numbers), you can imagine acting upon it with any matrix you want.  Any measurement you make, for instance, can be represented by a matrix.  And the cute thing about quantum mechanics, the thing that makes it quantized, is that only a discrete set of answers can come out of most measurements.  This is because a measurement causes the system to adopt an eigenfunction of the matrix representing that measurement.

An eigenfunction is a vector that still looks the same, apart from possibly being stretched, shrunk, or flipped, after it’s been operated upon by a particular matrix (from the German word “eigen,” which means something like “own” or “self”).  If we consider the operator for reflection that I jotted out above, you can see that a vector pointing straight up will still resemble itself after it’s been acted upon.
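A quick way to test whether a vector is an eigenvector is to operate on it and see whether it still lies along its own line.  The sketch below does that for the reflection matrix; the helper `apply` and the three test vectors are my own picks for illustration.  Mathematicians also count a vector that merely gets multiplied by a number (including -1) as an eigenvector.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (given as two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)

reflect = ((-1, 0), (0, 1))  # mirror image across the y axis

up = (0, 3)
sideways = (2, 0)
diagonal = (1, 1)

print(apply(reflect, up))        # (0, 3): unchanged, so an eigenvector
print(apply(reflect, sideways))  # (-2, 0): flipped, i.e. multiplied by -1
print(apply(reflect, diagonal))  # (-1, 1): a new direction, so not an eigenvector
```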

And a neat property of quantum mechanics is that every operator representing a measurement has a set of eigenfunctions that spans whatever space you’re working with.  For instance, the X & Y axes together span all of two-dimensional space… but so does any pair of non-parallel lines.  You could pick any pair of lines that cross and use them as a basis set to describe two-dimensional space.  Any point you want to reach can indeed be arrived at by moving some distance along your first line and then some distance along your second.
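To make “some distance along your first line and then some distance along your second” concrete, here is a small sketch that solves for those two distances.  The function name `coordinates` and the two direction vectors are hypothetical choices; both lines are assumed to pass through the origin, and the solving method is Cramer’s rule.

```python
from fractions import Fraction

def coordinates(point, u, v):
    """Find s and t such that point = s*u + t*v, using Cramer's rule."""
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("u and v are parallel, so they do not span the plane")
    s = Fraction(point[0] * v[1] - point[1] * v[0], det)
    t = Fraction(u[0] * point[1] - u[1] * point[0], det)
    return s, t

u = (1, 1)    # direction of the first line (made up for illustration)
v = (-1, 2)   # direction of the second line (made up for illustration)

s, t = coordinates((3, 5), u, v)
print(s, t)  # 11/3 2/3

# Walking s steps along u and then t steps along v lands on the point.
print((s * u[0] + t * v[0], s * u[1] + t * v[1]) == (3, 5))  # True
```

Exact fractions are used so the check at the end isn’t thrown off by rounding.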

This is relevant to quantum mechanics because any measurement collapses the system into an eigenfunction of its representative matrix, and the probability that it will end up in any one state is determined by the amount of that eigenfunction you need to describe its previous wavefunction in your new basis set.

That is one ugly sentence.

Maybe it’s not so surprising that Heisenberg described this incorrectly in words, because this is somewhat arduous…

Here, I’ll draw another nifty picture.  We’ll have to imagine two different operations (you could even get ahead of me and imagine that these represent measuring position and momentum, since that’s the pair of famous variables that don’t commute), and the eigenvectors for these operations are represented by either the blue arrows or the red arrows below.

graph-3

If we make a measurement with the blue matrix, it’ll collapse the system into one of the two blue eigenvectors.  If we decide to measure the same property again, i.e. act upon the system with the blue matrix again, we’re sure to see that same blue eigenvector.  We’ll know what we’ll be getting.

But once the system has collapsed into a blue arrow, if we measure with the red matrix the system has to shift to align with one of the red arrows.  And our probability of getting each red answer depends upon how similar each red arrow is to the blue arrows… the one that looks more like our current state is more likely to occur, but because neither red arrow matches a blue arrow perfectly, there’s a chance we’ll end up with either answer.
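“How similar” has a precise meaning here: for arrows of length one, the probability is the square of their dot product (this is the Born rule).  The sketch below assumes, purely for illustration, that the red arrows are the blue arrows tilted by 30 degrees; the real angle depends on the measurements involved.

```python
import math

def overlap_probability(state, outcome):
    """Born rule for real unit vectors: probability = (dot product) squared."""
    dot = state[0] * outcome[0] + state[1] * outcome[1]
    return dot ** 2

theta = math.radians(30)  # hypothetical tilt between the blue and red arrows

blue = [(1.0, 0.0), (0.0, 1.0)]
red = [
    (math.cos(theta), math.sin(theta)),
    (-math.sin(theta), math.cos(theta)),
]

state = blue[0]  # the system has just collapsed into the first blue arrow
probabilities = [overlap_probability(state, arrow) for arrow in red]

print(probabilities)  # roughly 0.75 and 0.25; the two always add up to 1
```

The red arrow that sits closer to the current state gets the larger share, but neither probability is zero.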

And if we want to make a blue measurement, then red, then blue… the two blue measurements won’t necessarily be the same.  After we’re in a state that matches a red eigenvector, we have some probability to flop back to either blue eigenvector, depending, again, on how similar each is to the red eigenvector we land in.
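Here is a toy simulation of that blue, red, blue sequence.  It is only a sketch under stated assumptions: the red arrows are the blue arrows tilted by a made-up 30 degrees, the function `measure` is hypothetical, and “collapse” is modeled as a weighted coin flip using the squared dot product.

```python
import math
import random

def measure(state, basis, rng):
    """Collapse `state` onto one arrow of `basis`, chosen with Born-rule odds."""
    first, second = basis
    p_first = (state[0] * first[0] + state[1] * first[1]) ** 2
    return first if rng.random() < p_first else second

theta = math.radians(30)  # hypothetical tilt between the blue and red arrows
blue = ((1.0, 0.0), (0.0, 1.0))
red = (
    (math.cos(theta), math.sin(theta)),
    (-math.sin(theta), math.cos(theta)),
)

rng = random.Random(0)  # seeded so the run is repeatable
trials = 10_000
flipped = 0
for _ in range(trials):
    state = measure(blue[0], blue, rng)  # blue again: certain to stay put
    state = measure(state, red, rng)     # red: forced onto a red arrow
    final = measure(state, blue, rng)    # blue once more: no longer certain
    if final != blue[0]:
        flipped += 1

print(flipped / trials)  # fraction of runs where the blue answer changed
```

With this particular tilt, the two blue answers should disagree a bit more than a third of the time, even though nothing “disturbed” the system except the red measurement in between.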

That’s the uncertainty principle.  That position is simply not well-defined when momentum is precisely known, and vice versa.  The eigenfunctions for one type of measurement do not resemble the eigenfunctions for the other measurement.  Which means that the type of measurement you have to make in order to know one or the other property invariably changes the system and gives you an unpredictable result… it’s like you’re rolling dice every time you switch which flavor of measurement you’re making.

But the measurement isn’t causing error.  It’s revealing an underlying probability distribution.  That is, there is no conceivable “gentle” way of measuring that will give a predictable answer, because the phenomenon itself is probabilistic.  Because the mechanics are quantized, because there are no in-between states, the system flops like a landbound fish from eigenvectors of one measurement to eigenvectors of the other.

Which is why it bothers me so much to see the uncertainty principle described as measurement obscuring reality when the idea crops up in philosophy or literature.  Those allusions also tend to place too much import on the idea of “observers,” like the old adage about a tree making or not making sound when it falls in an empty forest.  Perhaps I did a bad job of this too by writing “measurement” so often.  Maybe that word makes it sound as though quantum collapse requires intentional human involvement.  It doesn’t.  Any interaction between a quantum system and a semi-classical one will couple them and can cause the probabilistic distribution of wavefunctions to condense into particle-like behavior.

And I think the biggest difference between the uncertainty principle and the way it’s often portrayed in literature is that, rather than measurements obscuring reality, you could almost say that measurements create reality.  There wasn’t a discrete state until the measurement was made.  It’s like asking an inebriated collegiate friend who just learned something troubling about his romantic partner, “Well, what are you going to do?”  He’ll probably answer.  While you’re talking about it, it’ll seem like he’s going to stick to that answer.  But if you hadn’t asked he probably would’ve continued to mull things over, continued to exist in that seemingly in-between state where there’s a chance that he’ll either break up or try to work things out.  By asking, you learn his plan… but you also forced him to come up with a plan.

And it’s important that our collegian be drunk in this analogy… because making a different measurement has to re-randomize behavior.  Even after he resolves to break up, if you ask “Where should we go for our midnight snack,” mulling that over would make him forget what he’d planned to do about the whole dating situation.  The next time you ask, he might decide to ride it out.  It’s only when allowed to keep the one answer in the forefront of his mind that the answer stays consistent.

The uncertainty principle says that position and momentum can’t both be known precisely not because measurement is difficult, but because elementary particles are too drunk to remember where they are when you ask how fast they’re moving.

And, here, a treat!  As a reward for wading through all this, I’ve drawn a cartoon version of Heisenberg’s misconception.  Note that this is not, in fact, the correct explanation for the uncertainty principle… but do you really need me to sketch a bunch of besotted electrons?

cartoon-title

cartoon-1003

cartoon-summary