On suboptimal optimization.

I’ve been helping a friend learn the math behind optimization so that she can pass a graduation-requirement course in linear algebra. 

Optimization is a wonderful mathematical tool.  Biochemists love it – progression toward an energy minimum directs protein folding, among other physical phenomena.  Economists love it – whenever you’re trying to make money, you’re solving for a constrained maximum.  Philosophers love it – how can we provide the most happiness for a population?  Computer scientists love it – self-taught translation algorithms use this same methodology (I still believe that you could mostly replace Ludwig Wittgenstein’s Philosophical Investigations with this New York Times Magazine article on machine learning and a primer on principal component analysis).

But, even though optimization problems are useful, the math behind them can be tricky.  I’m skeptical that this mathematical technique is essential for everyone who wants a B.A. to grasp – my friend, for example, is a wonderful preschool teacher who hopes to finally finish a degree in child psychology.  She would have graduated two years ago except that she’s failed this math class three times.

I could understand if the university wanted her to take statistics, as that would help her understand psychology research papers … and the science underlying contemporary political debates … and value-added models for education … and more.  A basic understanding of statistics might make people better citizens.

Whereas … linear algebra?  This is a beautiful but counterintuitive field of mathematics.  If you’re interested in certain subjects – if you want to become a physicist, for example – you really should learn this math.  A deep understanding of linear algebra can enliven your study of quantum mechanics.

The summary of quantum mechanics: animation by Templaton.

Then again, Werner Heisenberg, who was a brilliant physicist, had a limited grasp of linear algebra.  He made huge contributions to our understanding of quantum mechanics, but his lack of mathematical expertise occasionally held him back.  He never quite understood the implications of the Heisenberg Uncertainty Principle, and he failed to provide Adolf Hitler with an atomic bomb.

In retrospect, maybe it’s good that Heisenberg didn’t know more linear algebra.

While I doubt that Heisenberg would have made a great preschool teacher, I don’t think that deficits in linear algebra were deterring him from that profession.  After each evening that I spend working with my friend, I do feel that she understands matrices a little better … but her ability to nurture children isn’t improving.

And yet.  Somebody in an office decided that all university students here need to pass this class.  I don’t think this rule optimizes the educational outcomes for their students, but perhaps they are maximizing something else, like the registration fees that can be extracted.

Optimization is a wonderful mathematical tool, but it’s easy to misuse.  Numbers will always do what they’re supposed to, but each such problem begins with a choice.  What exactly do you hope to optimize?

Choose the wrong thing and you’ll make the world worse.

#

Figure 1 from Eykholt et al., 2018.

Most automobile companies are researching self-driving cars.  They’re the way of the future!  In a previous essay, I included links to studies showing that unremarkable-looking graffiti could confound self-driving cars … but the issue I want to discuss today is both more mundane and more perfidious.

After all, using graffiti to make a self-driving car interpret a stop sign as “Speed Limit 45” is a design flaw.  A car that accelerates instead of braking in that situation is not operating as intended.

But passenger-less self-driving cars that roam the city all day, intentionally creating as many traffic jams as possible?  That’s a feature.  That’s what self-driving cars are designed to do.

A machine designed to create traffic jams?

Despite my wariness about automation and algorithms run amok, I hadn’t considered this problem until I read Adam Millard-Ball’s recent research paper, “The Autonomous Vehicle Parking Problem.” Millard-Ball begins with a simple assumption: what if a self-driving car is designed to maximize utility for its owner?

This assumption seems reasonable.  After all, the AI piloting a self-driving car must include an explicit response to the trolley problem.  Should the car intentionally crash and kill its passenger in order to save the lives of a group of pedestrians?  This ethical quandary is notoriously tricky to answer … but a computer scientist designing a self-driving car will probably answer, “no.” 

Otherwise, the manufacturers won’t sell cars.  Would you ride in a vehicle that was programmed to sacrifice you?

Luckily, the AI will not have to make that sort of life and death decision often.  But here’s a question that will arise daily: if you commute in a self-driving car, what should the car do while you’re working?

If the car was designed to maximize public utility, perhaps it would spend those hours serving as a low-cost taxi.  If demand for transportation happened to be lower than the quantity of available, unoccupied self-driving cars, it might use its elaborate array of sensors to squeeze into as small a space as possible inside a parking garage.

But what if the car is designed to benefit its owner?

Perhaps the owner would still want for the car to work as a taxi, just as an extra source of income.  But some people – especially the people wealthy enough to afford to purchase the first wave of self-driving cars – don’t like the idea of strangers mucking around in their vehicles.  Some self-driving cars would spend those hours unoccupied.

But they won’t park.  In most cities, parking costs between $2 and $10 per hour, depending on whether it’s street or garage parking, whether you purchase a long-term contract, etc. 

The cost to just keep driving is generally going to be lower than $2 per hour.  Worse, this cost is a function of the car’s speed.  If the car is idling at a dead stop, it will use approximately 0.1 gallon per hour, costing 25 cents per hour at today’s prices.  If the car is traveling at 30 mph without breaks, it will use approximately 1 gallon per hour, costing $2.50 per hour.

To save money, the car wants to stay on the road … but it wants for traffic to be as close to a standstill as possible.

Luckily for the car, this is an easy optimization problem.  It can consult its onboard GPS to find nearby areas where traffic is slow, then drive over there.  As more and more self-driving cars converge on the same jammed streets, they’ll slow traffic more and more, allowing them to consume the workday with as little motion as possible.
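For anyone who wants to poke at the numbers, here is a minimal sketch of the car's reasoning. The fuel figures are the ones quoted above (which imply gas at $2.50 per gallon); the straight-line interpolation between idling and 30 mph, the function names, and the street names are all assumptions made for illustration, not anything from Millard-Ball's paper.

```python
# Figures quoted in the essay. The linear model between them is an assumption.
GAS_PRICE = 2.50                 # dollars per gallon (0.1 gal/h costing 25 cents)
IDLE_GALLONS_PER_HOUR = 0.1      # car at a dead stop
CRUISE_GALLONS_PER_HOUR = 1.0    # car moving steadily at 30 mph
CRUISE_SPEED_MPH = 30.0
PARKING_COST_RANGE = (2.00, 10.00)  # dollars per hour


def fuel_cost_per_hour(speed_mph):
    """Hourly fuel cost, interpolated linearly between idling and 30 mph."""
    fraction = min(max(speed_mph / CRUISE_SPEED_MPH, 0.0), 1.0)
    gallons = IDLE_GALLONS_PER_HOUR + fraction * (
        CRUISE_GALLONS_PER_HOUR - IDLE_GALLONS_PER_HOUR
    )
    return gallons * GAS_PRICE


def cheapest_street(street_speeds):
    """Return the street whose current traffic speed is cheapest to sit in."""
    return min(street_speeds, key=lambda name: fuel_cost_per_hour(street_speeds[name]))


# Hypothetical traffic report: street name -> current speed in mph.
traffic = {"Main": 25.0, "Elm": 2.0, "Oak": 12.0}
target = cheapest_street(traffic)  # "Elm", the most congested street
saves_money = fuel_cost_per_hour(traffic[target]) < PARKING_COST_RANGE[0]
```

Under this toy model the slowest street always wins, which is the whole problem: every car running the same calculation heads for the same jam.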

Photo by walidhassanein on Flickr.

Pity the person sitting behind the wheel of an occupied car on those streets.  All the self-driving cars will be having a great time stuck in that traffic jam: we’re saving money!, they get to think.  Meanwhile the human is stuck swearing at empty shells, cursing a bevy of computer programmers who made their choices months or years ago.

And all those idling engines exhale carbon dioxide.  But it doesn’t cost money to pollute, because one political party’s worth of politicians willfully ignore the fact that capitalism, by philosophical design, requires we set prices for scarce resources … like clean air, or habitable planets.

On the water-fueled car.

“I heard there was, like, a car that runs on water …”

“Dude, no, there’ve been, like, six of them.  But oil companies bought all the patents.”

A lot of the people who attend my poetry class in jail believe in freaky conspiracy theories.  Somebody started telling me that the plots of various Berenstain Bears books are different from when he was a child, which is evidence that the universe bifurcated and that he’s now trapped in an alternate timeline from the path he was on before …

(New printings of some Berenstain Bears books really are different.  Take Old Hat New Hat, a charming story about shopping and satisfaction: after the protagonist realizes that he prefers the old, beat-up hat he already owns to any of the newer, fancier models, a harried salesperson reacts with a mix of disgust and disbelief.  This scene has been excised from the board book version that you could buy today.  Can’t have anything that tarnishes the joy of consumerism!)

I’ve written about conspiracy theories previously, but I think it’s worth re-iterating, in the interest of fairness, that the men in jail are correct when they assume that vast numbers of people are “breathing together” against them.  Politicians, judges, police, corporate CEOs and more have cooperated to build a world in which men like my students are locked away.  Not too long ago, it would have been fairly easy for them to carve out a meaningful existence, but advances in automation, the ease of international shipping, and changes to tax policy have dismantled the opportunities of the past.

Which means that I often find myself seriously debating misinterpretations of Hugh Everett’s “many worlds” theory (described midway through my essay, “Ashes”), or Biblical prophecies, or Jung-like burblings of the collective unconscious.

Or, last week, the existence of water cars.

In 2012, government officials from Pakistan announced that a local scientist had invented a process for using water as fuel.  At the time, I was still running a webcomic – one week’s Evil Dave vs. Regular Dave focused on news of the invention.

When scientists argue that a water-powered car can’t exist, they typically reference the Second Law of Thermodynamics (also discussed in “Ashes”).  The Second Law asserts that extremely unlikely events occur so rarely that you can safely assume their probability to be zero.

If something is disallowed by the Second Law, there’s nothing actually preventing it from happening.  For an oversimplified example, imagine there are 10 molecules of a gas randomly whizzing about inside a box.  The Second Law says that all 10 will never be traveling in the exact same direction at the same time.  If they were, you’d get energy from nothing.  They might all strike the north-facing wall at the same time, causing the box to move, instead of an equal number hitting the northern and southern facing walls.

But, just like flipping eight coins and seeing them all land heads, sometimes the above scenario will occur.  It violates the Second Law, and it can happen.  Perpetual motion machines can exist.  They are just very, very rare.  (Imagine a fraction where the denominator is a one followed by as many zeros as you could write before you die.  That number will be bigger than the chance of a water-fueled car working for even several seconds.)
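To put rough numbers on "very, very rare," here is a small sketch. The coin case is exact. The molecule case assumes, as a gross oversimplification of my own, that each molecule independently has a 50% chance of heading "northward"; the function names are hypothetical.

```python
import math
from fractions import Fraction


def chance_all_agree(n, p=Fraction(1, 2)):
    """Exact probability that n independent events, each with chance p, all occur."""
    return p ** n


def log10_chance_all_agree(n, p=0.5):
    """Base-10 logarithm of the same probability, usable when n is enormous."""
    return n * math.log10(p)


eight_heads = chance_all_agree(8)       # Fraction(1, 256)
ten_molecules = chance_all_agree(10)    # Fraction(1, 1024) under the toy model

# For a realistic amount of gas (roughly 6e23 molecules) the exact fraction is
# too large to write down, so only the exponent is computed.
exponent_for_a_mole = log10_chance_all_agree(6.022e23)
```

Eight coins all landing heads happens about once in 256 tries, which is why you might actually see it. The exponent for a mole of gas is a negative number with twenty-some digits, which is why you won't see the box jump.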

When chemists talk about fuel, they think about diagrams that look roughly like this:

[Figure: an energy diagram.  Reactants sit at a higher energy than products, with a hill between them.]

The y axis on this graph is energy, and the x axis is mostly meaningless – here it’s labeled “reaction coordinate,” but you wouldn’t be so far off if you just think of it as time.

For a gasoline powered car, the term “reactants” refers to octane and oxygen.  Combined, these have a higher amount of energy stored in their chemical bonds than an equivalent mass of the “products,” carbon dioxide and water, so you can release energy through combustion.  The released energy moves your car forward.

And there’s a hill in the middle.  This is generally called the “activation barrier” of the reaction.  Basically, the universe thinks it’s a good idea to turn octane and oxygen into CO2 and H2O … but the universe is lazy.  Left to its own devices, it can’t be bothered.  Which is good – because this reaction has a high activation barrier, we rarely explode while refueling at the gas station.

Your car uses a battery to provide the energy needed to start this process, after which the energy of the first reaction can be used to activate the next.  The net result is that you’re soon cruising the highway with nary a care, dribbling water from your tailpipe, pumping carbon into the air.

(Your car also uses a “catalyst” – this component doesn’t change how much energy you’ll extract per molecule of octane, but it lowers the height of the activation barrier, which makes it easier for the car to start.  Maybe you’ve heard the term “cold fusion.”  If we could harness a reaction combining hydrogen nuclei to form helium, that would be a great source of power.  Hydrogen fusion is what our sun uses.  This reaction chucks out a lot of energy and has non-toxic byproducts.

But the “cold” part of “cold fusion” refers to the fact that, without a catalyst, this reaction has an extremely steep activation barrier.  It works on the sun because hydrogen nuclei are crammed together at high temperature and pressure.  Something like millions of degrees.  I personally get all sweaty and miserable at 80 degrees, and am liable to burn myself when futzing about near an oven at 500 degrees … I’d prefer not to drive a 1,000,000 degree hydrogen-fusion-powered automobile.)

[Image: a coronal mass ejection erupting on the Sun.]
Seriously, I would not want this to be happening beneath the hood of the family ride.

With any fuel source, you can guess at its workings by comparing the energy of its inputs and outputs.  Octane and oxygen have high chemical energies, carbon dioxide and water have lower energies, so that’s why your car goes forward.  Our planet, too, can be viewed as a simple machine.  High frequency (blue-ish) light streams toward us from the sun, then something happens here that increases the order of molecules on Earth, after which we release a bunch of low-frequency (red-ish) light.

(We release low-frequency “infrared” light as body heat – night vision goggles work by detecting this.)

Our planet is an order-creating machine fueled by changing the color of photons from the sun.

A water-fueled car is impractical because other molecules that contain hydrogen and oxygen have higher chemical energy than an equivalent mass of water.  There’s no energy available for you to siphon away into movement.

If you were worried that major oil companies are conspiring against you by hiding the existence of water-fueled cars, you can breathe a sigh of relief.  But don’t let yourself get too complacent, because these companies really are conspiring against you.  They’re trying to starve your children.

On a guaranteed basic income.

For several months, a friend and I have volleyed emails about a sprawling essay on consciousness, free will, and literature.

The essay will explore the idea that humans feel we have free will because our conscious mind grafts narrative explanations (“I did this because…”) onto our actions. It seems quite clear that our conscious minds do not originate all the choices that we then take credit for. With an electroencephalogram, you could predict when someone is about to raise an arm, for instance, before the person has even consciously decided to do so.

Which is still free will, of course. If we are choosing an action, it hardly matters whether our conscious or subconscious mind makes the choice. But then again, we might not be “free.” If an outside observer were able to scan a person’s brain to sufficient detail, all of that person’s future choices could probably be predicted (as long as our poor study subject is imprisoned in an isolation chamber). Our brains dictate our thoughts and choices, but these brains are composed of salts and such that follow the same laws of physics as all other matter.

That’s okay. It is almost certainly impossible that any outside observer could (non-destructively) scan a brain to sufficient detail. If quantum mechanical detail is implicated in the workings of our brains, it is definitely impossible: quantum mechanical information can’t be duplicated. Wikipedia has a proof of this “no cloning theorem” involving lots of bras and kets, but this is probably unreadable for anyone who hasn’t done much matrix math. An easier way to reason through it might be this: if you agree with the Heisenberg uncertainty principle, the idea that certain pairs of variables cannot be simultaneously measured to arbitrary precision, the no cloning theorem has to be true. Otherwise you could simply make many copies of a system and measure one variable precisely for each copy.
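For readers who do want the bras and kets, the usual textbook argument is short. This is a sketch under the standard assumption that a hypothetical copying machine would be a single unitary operation $U$ that works on every input state; the symbols $U$, $|\psi\rangle$, $|\phi\rangle$, and the blank state $|0\rangle$ are notation introduced here, not taken from the Wikipedia proof.

```latex
\begin{align*}
U\bigl(|\psi\rangle \otimes |0\rangle\bigr) &= |\psi\rangle \otimes |\psi\rangle,
&
U\bigl(|\phi\rangle \otimes |0\rangle\bigr) &= |\phi\rangle \otimes |\phi\rangle .
\end{align*}
% Unitary operations preserve inner products, so comparing the two lines gives
\begin{align*}
\langle \phi | \psi \rangle \, \langle 0 | 0 \rangle
  &= \langle \phi | \psi \rangle^{2}
\quad\Longrightarrow\quad
\langle \phi | \psi \rangle \in \{0,\, 1\}.
\end{align*}
```

So the machine could only copy states that are either identical or perfectly distinguishable from one another, never an arbitrary unknown state.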

So, no one will ever be able to prove to me that I am not free. But let’s just postulate, for a moment, that the laws of physics that, so far, have correctly described the behavior of all matter outside my brain also correctly describe the movement of matter inside my brain. In which case, those inviolable laws of physics are dictating my actions as I type this essay. And yet, I feel free. Each word I type feels like a choice. My brain is constantly concocting a story that explains why I am choosing each word.

Does the same neural circuitry that deludes me into feeling free – that has evolved, it seems, to constantly sculpt narratives that make sense of our actions, the same way our dreams often burgeon to include details like a too hot room or a ringing telephone – also give me the ability to write fiction?

In other words, did free will spawn The Iliad?

The essay is obviously rather speculative. I’m incorporating relevant findings from neuroscience, but, as I’ve mentioned, it’s quite likely that no feasible experiments could ever test some of these ideas.

The essay is also unfinished. No laws of physics forbid me from finishing it. I’m just slow because K & I have two young kids. At the end of each day, once our 2.5 year old and our 3 month old are finally asleep, we exhaustedly glance at each other and murmur, “Where did the time go?”

But I am very fortunate to have a collaborator always ready to nudge me back into action. My friend recently sent me an article by Tim Christiaens on the philosophy of financial markets. He sent it because the author argues – correctly, in my opinion – that for many stock market actions it’s sensible to consider the Homo sapiens trader + the nearby multi-monitor computer as a single decision-making entity. Tool-wielding is known to change our brains – even something as simple as a pointing stick alters our self-perception of our reach. And the algorithms churned through by stock traders’ computers are incredibly complex. There’s not a good way for the human to check a computer’s results; the numbers it spits out have to be trusted. So it seems reasonable to consider the two together as a single super-entity that collaborates in choosing when to buy or sell. If something in the room has free will, it would be the tools & trader together.

Which isn’t as weird as it might initially sound. After all, each Homo sapiens shell is already a multi-species super-entity. As I type this essay, the choice of which word to write next is made inside my brain, then signals are sent through my nervous system to my hands and fingers commanding them to tap the appropriate keys. The choice is influenced by all the hormones and signaling molecules inside my brain. It so happens that bacteria and other organisms living in my body excrete signaling molecules that can cross the blood-brain barrier and influence my choice.

The milieu of intestinal bacteria living inside each of us gets to vote on our moods and actions. People with depression seem to harbor noticeably different sets of bacteria than people without. And it seems quite possible that parasites like Toxoplasma gondii can have major influences on our personalities.

Indeed, in his article on stock markets, Christiaens mentions the influence of small molecules on financial behavior, reporting that “some researchers study the trader’s body through the prism of testosterone levels as an indicator of performance. It turns out that traders who regularly visit prostitutes consequently have higher testosterone levels and outperform other traders.”

Now, I could harp on the fact that we designed these markets. That they could have been designed in many different ways. And that it seems pretty rotten to have designed a system in which higher testosterone (and the attendant impulsiveness and risky decision-making) would correlate with success. Indeed, a better, more equitable market design would probably quell the performance boost of testosterone.

I could rant about all that. But I won’t. Instead I’ll simply mention that Toxoplasma seems to boost testosterone. Instead of popping into brothels after work, traders could snack on cat shit.

On the topic of market design, Christiaens also includes a lovely description of the interplay between the structure of our economy and the ways that people are compelled to live:

The reason why financial markets are able to determine the viability of lifestyles is because most individuals and governments are indebted and therefore need a ‘creditworthy’ reputation. As the [U.S.] welfare state declined during the 1980s, access to credit was facilitated in order to sustain high consumption, avoid overproduction and stimulate economic growth. For Lazzarato [a referenced writer], debt is not an obligation emerging from a contract between free and equal individuals, but is from the start an unequal power relation where the creditor can assert his force over the debtor. As long as he is indebted, the latter’s rights are virtually suspended. For instance, a debtor’s property rights can be superseded when he fails to reimburse the creditor by evicting him from his home or selling his property at a public auction. State violence is called upon to force non-creditworthy individuals to comply. We [need] not even jump to these extreme cases of state enforcement to see that debt entails a disequilibrium of power. Even the peaceful house loan harbors a concentration of risk on the side of the debtor. When I take a $100,000 loan for a house that, during an economic crisis, loses its value, I still have to pay $100,000 plus interests to the bank. The risk of a housing crash is shifted to the debtor’s side of the bargain. During a financial crisis this risk concentration makes it possible for the creditors to demand a change of lifestyle from the debtor, without the former having to reform themselves.

Several of my prior essays have touched upon the benefits of a guaranteed basic income for all people, but I think this paragraph is a good lead-in for a reprise. As Christiaens implies, there is violence behind all loans – both the violence that led to initial ownership claims and the threat of state violence that compels repayment. Not that I’m against the threat of state violence to compel people to follow rules in general – without this threat we would have anarchy, in which case actual violence tends to predominate over the threat of incipient enforcement.

We all need wealth to live. After all, land holdings are wealth, and at the very least each human needs access to a place to collect fresh water, a place to grow food, a place to stand and sleep. But no one is born wealthy. A fortunate few people receive gifts of wealth soon after birth, but many people foolishly choose to be born to less well-off parents.

The need for wealth curtails the choices people can make. They need to maintain their “creditworthiness,” as in Christiaens’s passage, or their hire-ability. Wealth has to come from somewhere, and, starting from zero, we rely on others choosing to give it to us. Yes, often in recompense for labor, but just because you are willing and able to do a form of work does not mean that anyone will pay you for it.

Unless people are already wealthy enough to survive, they are at the mercy of others choosing to give them things. Employers are not forced to trade money for salaried working hours. And there isn’t wealth simply waiting around to be claimed. It all starts from something – I’d argue that all wealth stems originally from land holdings – but the world’s finite allotment of land was claimed long ago through violence.

A guaranteed basic income would serve to acknowledge the brutal baselessness of those initial land grabs. It is an imperfect solution, I know. It doesn’t make sense to me that everyone’s expenses should rise whenever a new child is born. But a world where people received a guaranteed basic income would be better than one without. The unluckily-born populace would be less compelled to enter into subjugating financial arrangements. We’d have less misery – feeling poor causes a lot of stress. We’d presumably have less crime and drug abuse, too, for similar reasons.

And, of course, less hypocrisy. It’s worth acknowledging that our good fortune comes from somewhere. No one among us created the world.

On uncertainty (with cartoon ending).

See this monstrosity, in its entirety, at the end of this essay.

Reading about the uncertainty principle in popular literature almost always sets my teeth on edge.

I assume most people have a few qualms like that, things they often see done incorrectly that infuriate them.  After a few pointed interactions with our thesis advisor, a friend of mine started going berserk whenever he saw “it’s” and “its” misused on signs.  My middle school algebra teacher fumed whenever he saw store prices marked “.25% off!” when they meant you’d pay three quarters of the standard price, not 99.75%.  A violinist friend with perfect pitch called me (much too early) on a Sunday morning to complain that the birds on her windowsill were out of tune… how could she sleep when they couldn’t hit an F#??

“Ha,” I say.  “That’s silly… they should just let it go.”  But then I start frowning and sputtering when I read about the uncertainty principle.  Anytime somebody writes a line to the effect of, we’ve learned from quantum mechanics that measurement obscures the world, so we will always be uncertain what reality might have been had we not measured it.

My ire is risible in part because the idea isn’t so bad.  It even holds in some fields.  Like social psychology, I’d say.  If a research group identifies a peculiarity of the human mind and then widely publicizes their findings, that peculiarity might go away.  There was a study published shortly before I got my first driver’s license concluding that the rightmost lanes of toll booths were almost always fastest.  Now that’s no longer true.  Humans can correct their mistakes, but first they have to realize they’re mistaken.

That’s not the uncertainty principle, though.

And, silly me, I’d always thought that this misconception was due to liberal arts professors wanting to cite some fancy-sounding physics they didn’t understand.  I didn’t realize the original misconception was due to Heisenberg himself.  In The Physical Principles of the Quantum Theory, he wrote (and please note that this is not the correct explanation for the uncertainty principle):

Thus suppose that the velocity of a free electron is precisely known, while the position is completely unknown.  Then the principle states that every subsequent observation of the position will alter the momentum by an unknown and undeterminable amount such that after carrying out the experiment our knowledge of the electronic motion is restricted by the uncertainty relation.  This may be expressed in concise and general terms by saying that every experiment destroys some of the knowledge of the system which was obtained by previous experiments.

Most of this isn’t so bad, despite not being the uncertainty principle.  The next line is worse, if what you’re hoping for is an accurate translation of quantum mechanics into English.

This formulation makes it clear that the uncertainty relation does not refer to the past; if the velocity of the electron is at first known and the position then exactly measured, the position for times previous to the measurement may be calculated.  Then for these past times ∆p∆q [“p” stands for momentum and “q” stands for position in most mathematical expressions of quantum mechanics] is smaller than the usual limiting value, but this knowledge of the past is of a purely speculative character, since it can never (because of the unknown change in momentum caused by the position measurement) be used as an initial condition in any calculation of the future progress of the electron and thus cannot be subjected to experimental verification.

That’s not correct, because the uncertainty principle is not about measurement.  It’s about the world and what states the world itself can possibly adopt.  We can’t trace both the position & momentum backward through time to learn where an electron was & how fast it was moving, because the interactions that define a measurement create discrete properties, i.e. they are not revealing crisp properties that pre-existed the measurement.

Heisenberg was a brilliant man, but he made two major mistakes (that I know of, at least.  Maybe he had his own running tally of things he wished he’d done differently).  One mistake may have saved us all, as was depicted beautifully in Michael Frayn’s Copenhagen (also… they made a film of this?  I was lucky enough to see the play in person, but I’ll have to watch it again!) — who knows what would’ve happened if Germany had the bomb?

Heisenberg’s other big mistake was his word-based interpretation of the uncertainty principle he discovered.

His misconception is understandable, though.  It’s very hard to translate from mathematics into words.  I’ll try my best with this essay, but I might botch it too; it’s going to be extra-hard for me because my math is so rusty.  I studied quantum mechanics from 2003 to 2007 but since then haven’t had professional reasons to work through any of the equations.  Eight years of lassitude is a long time, long enough to forget a lot, especially because my mathematical grounding was never very good.  I skipped several prerequisite math courses because I had good intuition for numbers, but this meant that when my study groups solved problem sets together we often divided the labor such that I’d write down the correct answer then they’d work backwards from it and teach me why it was correct.

I solved equations Robert Johnson crossroads style, except I had a Texas Instruments graphing calculator instead of a guitar.

The other major impediment Heisenberg was up against is that the uncertainty principle is most intuitive when expressed in matrix mechanics… and Heisenberg had no formal training in linear algebra.  I hadn’t realized this until I read Jagdish Mehra’s The Formulation of Matrix Mechanics and Its Modifications from his Historical Development of Quantum Theory.  A charming book, citing many of the letters the researchers sent to one another, providing mini-biographies of everyone who contributed to the theory.  The chapter describing Heisenberg’s rush to learn matrices in order to collaborate with Max Born and Pascual Jordan before the former left for a lecture series in the United States has a surprising amount of action for a history book about mathematics… but the outcome seems to be that Heisenberg’s rushed autodidacticism left him with some misconceptions.

Which is too bad.  The key idea was Heisenberg’s, the idea that non-commuting variables might underlie quantum behavior.

Commuting? I should probably explain that, at least briefly.  My algebra teacher, the same one who turned apoplectic when he saw miswritten grocery store discount signs, taught the subject like it was gym class (which I mean as a compliment, despite hating gym class).  Each operation was its own sport with a set of rules.  Multiplication, for instance, had rules that let you commute, and distribute, and associate.  When you commute, you get to shuffle your players around.  7 • 5 will give you the same answer as 5 • 7.

But just because kicks to the head are legal in MMA doesn’t mean you can do ’em in soccer.  You’re allowed to commute when you’re playing multiplication, but you can’t do it in quantum mechanics.  You can’t commute matrices either, which was why Born realized that they might be the best way to express quantum phenomena algebraically.  If you have a matrix A and another matrix B, then A • B will often not be the same as B • A.

That difference underlies the uncertainty principle.

So, here’s the part of the essay wherein I will try my very best to make the math both comprehensible and accurate.  But I might fail at one or the other or both… if so, my apologies!

A matrix is an array of numbers that represents an operation.  I think the easiest way to understand matrices is to start by imagining operators that work in two dimensions.

Just like surgeons all dressed up in their scrubs and carrying a gleaming scalpel and peering down the corridors searching for a next victim, every operator needs something to operate on.  In the case of surgeons, it’s moneyed sick people.  In the case of matrices, it’s “vectors.”

As a first approximation, you can imagine vectors are just coordinate pairs.  Dots on a graph.  Typically the term “vector” implies something with a starting point, a direction, and a length… but it’s not a big deal to imagine a whole bunch of vectors that all start from the origin, so then all you need to know is the point at which the tip of an arrow might end.

It’ll be easiest to show you some operations if we have a bunch of vectors.  So here’s a list of them, always with the x coordinate written above the y coordinate.

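$$\begin{pmatrix}3\\0\end{pmatrix},\ \begin{pmatrix}4\\0\end{pmatrix},\ \begin{pmatrix}5\\0\end{pmatrix},\ \begin{pmatrix}2\\1\end{pmatrix},\ \begin{pmatrix}6\\1\end{pmatrix},\ \begin{pmatrix}1\\2\end{pmatrix},\ \begin{pmatrix}7\\2\end{pmatrix},\ \begin{pmatrix}3\\5\end{pmatrix},\ \begin{pmatrix}5\\5\end{pmatrix}$$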

That set of points makes a crude smiley face.

[Figure: the nine points plotted on a graph, forming a crude smiley face.]

And we can operate on that set of points with a matrix in order to change the image in a predictable way.  I’ve always thought the way the math works here is cute… you have to imagine a vector leaping out of the water like a dolphin or killer whale and then splashing down horizontally onto the matrix.  Then the vector sinks down through the rows.

It won’t be as fun when I depict it statically, but the math works like this:

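$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$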

Does it make sense why I imagine the vector, the (x,y) thing, flopping over sideways?
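For anyone who would rather see the flop-and-sink rule as arithmetic, here is a minimal sketch in Python.  The function name `apply` and the example matrix are my own inventions for illustration, not anything standard; the only assumption is that a matrix is written as a list of two rows.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (a list of two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    # The vector flops sideways and is paired with each row in turn:
    # the top row produces the new x, the bottom row produces the new y.
    return (a * x + b * y, c * x + d * y)


# Hypothetical example matrix, chosen only to show the arithmetic.
example = [[1, 2],
           [3, 4]]

print(apply(example, (5, 6)))  # (1*5 + 2*6, 3*5 + 4*6) = (17, 39)
```

Each row of the matrix gets one pass over the whole vector, which is why the vector has to lie flat before it can sink through.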

The simplest matrix is something called an “identity” matrix.  It looks like this:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

When we multiply a vector by the identity matrix, it isn’t changed.  The zeros mean the y term of our initial vector won’t affect the x term of our result, and the x term of our initial vector won’t affect the y term of our result.  Here:

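$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 \cdot x + 0 \cdot y \\ 0 \cdot x + 1 \cdot y \end{pmatrix} = \begin{pmatrix} x \\ y \end{pmatrix}$$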

And there are a couple other simple matrices we might consider (you’ll only need to learn a little more before I get back to that “matrices don’t commute” idea).

If we want to make our smiling face twice as big, we can use this operator:

$$\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$$

Hopefully that matrix makes a little bit of sense.  The x and y terms still do not affect each other, which is why we have the zeros on the upward diagonal, and every coordinate must become twice as large to scoot everything farther from the origin, making the entire picture bigger.
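To check the doubling operator against the smiley-face points listed earlier, here is a rough sketch.  The helper `apply` and the variable names are hypothetical, made up for this example; the points and the matrix are the ones from the essay.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (a list of two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


# The nine smiley-face points, written as (x, y) pairs.
smiley = [(3, 0), (4, 0), (5, 0), (2, 1), (6, 1),
          (1, 2), (7, 2), (3, 5), (5, 5)]

double = [[2, 0],
          [0, 2]]

bigger = [apply(double, point) for point in smiley]

# The last two points are the eyes, (3, 5) and (5, 5).
print(bigger[-2:])  # [(6, 10), (10, 10)]
```

Every coordinate doubles, so each point ends up twice as far from the origin and the whole face grows without changing shape.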

We could instead make a mirror image of our picture by reflecting across the y axis:

$$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$$

Or rotate our picture 90º counterclockwise:

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

The rotation matrix has those terms because the previous Y axis spins down to align with the negative X axis, and the X axis rotates up to become the positive Y axis.

And those last two operators, mirror reflection and rotation, will let us see why the commutative property does not hold in linear algebra.  Why A • B is not necessarily equal to B • A if both A & B are matrices.

Here are some nifty pictures showing what happens when we first reflect our smile then rotate, versus first rotating then reflecting.  If the matrices did commute, if A • B = B • A, the outcome of the pair of operations would be the same no matter what order they were applied in.  And it isn’t!  The top row of the image below shows reflection then rotation; the bottom row shows rotating our smile then reflecting it.

[Figure: top row, the smile reflected and then rotated; bottom row, the smile rotated and then reflected.]
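The same comparison can be run as arithmetic.  This is only a sketch: `matmul` and `apply` are hypothetical helper names, and the two matrices are the reflection and rotation operators given above.  One thing to watch is that the operation applied first sits on the right of the product.

```python
def matmul(A, B):
    """Product of two 2x2 matrices. Applying the result means applying B first, then A."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(matrix, vector):
    """Multiply a 2x2 matrix (a list of two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


reflect = [[-1, 0],
           [0, 1]]   # mirror image across the y axis
rotate = [[0, -1],
          [1, 0]]    # 90 degrees counterclockwise

reflect_then_rotate = matmul(rotate, reflect)
rotate_then_reflect = matmul(reflect, rotate)

print(reflect_then_rotate)  # [[0, -1], [-1, 0]]
print(rotate_then_reflect)  # [[0, 1], [1, 0]]

# One of the smiley's eyes lands in a different place depending on the order.
eye = (3, 5)
print(apply(reflect_then_rotate, eye))  # (-5, -3)
print(apply(rotate_then_reflect, eye))  # (5, 3)
```

The two products are different matrices, so the order of operations matters.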

And that, in essence, is where the uncertainty principle comes from.  Although there is one more mathematical concept that I should tell you about, the other rationale for using matrices to understand quantum mechanics in the first place.

You can write a matrix that would represent any operation or any set of forces.  One important class of matrices are those that use the positions of each relevant object, like the locations of each electron around a nucleus, in order to calculate the total energy of a system.  The electrons have kinetic energy based on their momentum (the derivative of their position with respect to time) and potential energy related to their position itself, due to interaction with the protons in the nucleus and, if there are multiple electrons, repulsive forces between each other…

(I assume you’ve heard the term “two-body problem” before, used by couples who are trying to find a pair of jobs in the same city so they can move there together.  It’s a big issue in science and medicine, double matching for residencies, internships, post-docs, etc.  Well, it turns out that nobody thinks it’s funny to make a math joke out of this and say, “At least two-body problems are solvable.  Three-body problems have to be approximated numerically.”)

…but once you have a wavefunction (which is basically just a fancy vector, now with a stack of functions instead of a stack of numbers), you can imagine acting upon it with any matrix you want.  Any measurement you make, for instance, can be represented by a matrix.  And the cute thing about quantum mechanics, the thing that makes it quantized, is that only a discrete set of answers can come out of most measurements.  This is because a measurement causes the system to adopt an eigenfunction of the matrix representing that measurement.

An eigenfunction is a vector that still looks the same after it’s been operated upon by a particular matrix (from the German word “eigen,” which means something like “own” or “self”).  If we consider the operator for reflection that I jotted out above, you can see that a vector pointing straight up will still resemble itself after it’s been acted upon.
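A quick way to test that claim about the reflection operator is to feed it a few arrows and see which ones come back pointing along their own line.  The helper name `apply` and the three test vectors are my own choices for this sketch.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (a list of two rows) by a vector (x, y)."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (a * x + b * y, c * x + d * y)


reflect = [[-1, 0],
           [0, 1]]   # mirror image across the y axis

up = (0, 1)
right = (1, 0)
diagonal = (1, 1)

print(apply(reflect, up))        # (0, 1): unchanged, so "up" is an eigenvector
print(apply(reflect, right))     # (-1, 0): flipped, but still along the x axis
print(apply(reflect, diagonal))  # (-1, 1): knocked onto a different line
```

In the usual textbook definition, an arrow that is only stretched or flipped along its own line also counts as an eigenvector, so the arrow pointing right qualifies too (with a multiplier of -1).  The diagonal arrow does not.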

And a neat property of quantum mechanics is that every operator representing a measurement has a set of eigenfunctions that spans whatever space you’re working with.  For instance, the X & Y axes together span all of two-dimensional space… but so do any pair of non-parallel lines.  You could pick any pair of lines that cross and use them as a basis set to describe two-dimensional space.  Any point you want to reach can indeed be arrived at by moving some distance along your first line and then some distance along your second.
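Here is a small sketch of that last idea, finding how far to travel along each of two crossing lines to reach a chosen point.  The function name, the choice of lines, and the target point are all assumptions made for illustration; the arithmetic is Cramer’s rule for a two-by-two system.

```python
def coordinates_in_basis(point, first, second):
    """Return (s, t) such that s*first + t*second equals point."""
    px, py = point
    ax, ay = first
    bx, by = second
    det = ax * by - ay * bx  # zero exactly when the two lines are parallel
    if det == 0:
        raise ValueError("parallel lines cannot span the plane")
    s = (px * by - py * bx) / det
    t = (ax * py - ay * px) / det
    return s, t


# Assumed example: reach one of the smiley's eyes using two diagonal lines.
s, t = coordinates_in_basis((3, 5), first=(1, 1), second=(1, -1))
print(s, t)  # 4.0 -1.0, since 4*(1, 1) + (-1)*(1, -1) = (3, 5)
```

If the two lines are parallel the calculation breaks down, which is the arithmetic version of saying they can’t span the plane.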

This is relevant to quantum mechanics because any measurement collapses the system into an eigenfunction of its representative matrix, and the probability that it will end up in any one state is determined by the amount of that eigenfunction you need to describe its previous wavefunction in your new basis set.

That is one ugly sentence.

Maybe it’s not so surprising that Heisenberg described this incorrectly in words, because this is somewhat arduous…

Here, I’ll draw another nifty picture.  We’ll have to imagine two different operations (you could even get ahead of me and imagine that these represent measuring position and momentum, since that’s the pair of famous variables that don’t commute), and the eigenvectors for these operations are represented by either the blue arrows or the red arrows below.

[Figure: two pairs of eigenvectors, one pair drawn as blue arrows and the other as red arrows.]

If we make a measurement with the blue matrix, it’ll collapse the system into one of the two blue eigenvectors.  If we decide to measure the same property again, i.e. act upon the system with the blue matrix again, we’re sure to see that same blue eigenvector.  We’ll know what we’ll be getting.

But once the system has collapsed into a blue arrow, if we measure with the red matrix the system has to shift to align with one of the red arrows.  And our probability of getting each red answer depends upon how similar each red arrow is to the blue arrows… the one that looks more like our current state is more likely to occur, but because neither red arrow matches a blue arrow perfectly, there’s a chance we’ll end up with either answer.

And if we want to make a blue measurement, then red, then blue… the two blue measurements won’t necessarily be the same.  After we’re in a state that matches a red eigenvector, we have some probability to flop back to either blue eigenvector, depending, again, on how similar each is to the red eigenvector we land in.
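To put numbers on “how similar,” the standard rule (the Born rule) says the probability is the square of the overlap between the current arrow and the arrow it might land on.  The sketch below assumes a particular geometry that is not taken from the picture: blue arrows along the axes, red arrows tilted by 30 degrees.  The function names are hypothetical.

```python
import math


def unit(angle_degrees):
    """Arrow of length one pointing at the given angle."""
    a = math.radians(angle_degrees)
    return (math.cos(a), math.sin(a))


def probability(state, outcome):
    """Born rule for real unit vectors: the squared dot product of the two arrows."""
    dot = state[0] * outcome[0] + state[1] * outcome[1]
    return dot ** 2


# Assumed geometry, for illustration only.
blue = [unit(0), unit(90)]
red = [unit(30), unit(120)]

# Start in the first blue arrow, then measure with the red matrix.
state = blue[0]
p_red = [probability(state, r) for r in red]
print([round(p, 2) for p in p_red])  # [0.75, 0.25]

# Blue, then red, then blue: chance the second blue answer matches the first.
p_same_blue = sum(probability(blue[0], r) * probability(r, blue[0]) for r in red)
print(round(p_same_blue, 3))  # 0.625
```

With these assumed angles, the red arrow that looks more like the starting blue arrow wins three times out of four, and sandwiching a red measurement between two blue ones leaves only a 62.5% chance of getting the same blue answer twice.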

That’s the uncertainty principle.  That position is simply not well-defined when momentum is precisely known, and vice versa.  The eigenfunctions for one type of measurement do not resemble the eigenfunctions for the other measurement.  Which means that the type of measurement you have to make in order to know one or the other property invariably changes the system and gives you an unpredictable result… it’s like you’re rolling dice every time you switch which flavor of measurement you’re making.

But the measurement isn’t causing error.  It’s revealing an underlying probability distribution.  That is, there is no conceivable “gentle” way of measuring that will give a predictable answer, because the phenomenon itself is probabilistic.  Because the mechanics are quantized, because there are no in-between states, the system flops like a landbound fish from eigenvectors of one measurement to eigenvectors of the other.

Which is why it bothers me so much to see the uncertainty principle described as measurement obscuring reality when the idea crops up in philosophy or literature.  Those allusions also tend to place too much import on the idea of “observers,” like the old adage about a tree making or not making sound when it falls in an empty forest.  Perhaps I did a bad job of this too by writing “measurement” so often.  Maybe that word makes it sound as though quantum collapse requires intentional human involvement.  It doesn’t.  Any interaction between quantum mechanics and a semi-classical system will couple them and can cause the probabilistic distribution of wavefunctions to condense into particle-like behavior.

And I think the biggest difference between the uncertainty principle and the way it’s often portrayed in literature is that, rather than measurements obscuring reality, you could almost say that measurements create reality.  There wasn’t a discrete state until the measurement was made.  It’s like asking an inebriated collegiate friend who just learned something troubling about his romantic partner, “Well, what are you going to do?”  He’ll probably answer.  While you’re talking about it, it’ll seem like he’s going to stick to that answer.  But if you hadn’t asked he probably would’ve continued to mull things over, continued to exist in that seemingly in-between state where there’s both a chance that he’ll break up or try to work things out.  By asking, you learn his plan… but you also forced him to come up with a plan.

And it’s important that our collegian be drunk in this analogy… because making a different measurement has to re-randomize behavior.  Even after he resolves to break up, if you ask “Where should we go for our midnight snack,” mulling that over would make him forget what he’d planned to do about the whole dating situation.  The next time you ask, he might decide to ride it out.  It’s only when allowed to keep the one answer in the forefront of his mind that the answer stays consistent.

The uncertainty principle says that position and momentum can’t both be known precisely not because measurement is difficult, but because elementary particles are too drunk to remember where they are when you ask how fast they’re moving.

And, here, a treat!  As a reward for wading through all this, I’ve drawn a cartoon version of Heisenberg’s misconception.  Note that this is not, in fact, the correct explanation for the uncertainty principle… but do you really need me to sketch a bunch of besotted electrons?

[Cartoon: Heisenberg’s misconception of the uncertainty principle.]

On time-traveling information and quantum mechanics.

Screenshot from that Facebook-linked article you might have seen recently.

K (who is better at reading the internet than I am) asked me, “Have you seen all those reports about future actions dictating the past?”

I promptly rolled my eyes.  Thinking, which ones?  Because there are a lot of “scientific” studies of that ilk.  One of my favorites (“favorite” here meaning “most laughably silly”) is the psychology study demonstrating that people remember words better if they will study them after being quizzed as to which they remember.

Which would be a neat trick — a kid could say, “Please, God, let me know the right answers on this test and I promise I’ll study the material as soon as I get home,” and it would work!

It doesn’t.  Of course not.  What Bem demonstrated in his paper, “Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect,” is that our current academic publishing system (wherein researchers are rewarded only for novel results, and particularly counter-intuitive novel results) is suboptimal for the real pursuit of scientific knowledge.  If researchers are allowed to collect lots of data, analyze that data with statistical tests for p-values, and report only what works… then it’s easy to find counter-intuitive results.  Those results will also generally not be true.
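A back-of-the-envelope calculation shows how easy it is.  This sketch is a simplification and not a reconstruction of Bem’s actual analysis: it assumes every test is independent, that no real effect exists, and that the usual cutoff of p < 0.05 is being used.  The function name is made up.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests of a
    nonexistent effect still comes out with p < alpha."""
    return 1 - (1 - alpha) ** num_tests


for n in (1, 5, 20, 100):
    print(n, round(chance_of_false_positive(n), 2))
```

Under those assumptions, a researcher who quietly runs twenty analyses has roughly a 64% chance that at least one looks “significant,” and after a hundred analyses a spurious hit is all but guaranteed.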

The other interesting finding that came from Bem’s work was also related to academic publishing: even if a result is blatantly untrue, it’s difficult to correct the scientific literature.  Several researchers wasted their time attempting to reproduce Bem’s result, and as expected they found that none of the work was correct … but then they could not publish their findings.  Their rejection from the Journal of Personality and Social Psychology read, “This journal does not publish replication studies, whether successful or unsuccessful.”

Anyway, that’s the kind of “science” I was expecting when K asked if I’d seen the new study on future events dictating the past.

I was wrong.  She was talking about a pretty standard quantum mechanics experiment, one postulated a few decades ago, conducted with photons in 2007, and conducted with helium atoms recently.

The basic gist of why these are described as “mind blowing”: there are numerous results in quantum mechanics that can seem silly if you think of objects as being either particle or wave and somehow “choosing” which to be at any given time.  Matter has a wave nature, and the behavior we think of as particle-like arises from the state of an object being linked to the state of other objects.  The common phrasing for this is to say that observation causes a shift from wave-like to particle-like behavior, but the underlying explanation is that our observational techniques result in a state-restricting coupling.

Quantum mechanics is difficult to write about using English-language metaphors — translating from the language of mathematics into English seems to have all the problems of translating between two spoken languages, and then some — but here’s a crude way to think about this type of result:

If you’re standing with your back to two narrow hallways (sufficient for only one person to walk through at a time) and a friend walks through and taps you on the shoulder, you won’t know which hallway your friend came through.  Unless your friend tells you.  Let’s just imagine that your friend is as cagey with his or her secrets as the average helium atom tends to be.

Here is a Roguelike diagram of our thought experiment… in case you haven’t played many roguelikes (for shame! You should try Brogue! https://sites.google.com/site/broguegame/), you are the @, the F is your friend, the B is your buddy, and those octothorpes are single-person-wide hallways.

If your friend then leaves, however, and at the same time a second buddy of yours walks through to tap you on the shoulder and say hello, then your friend’s history becomes coupled to this second buddy’s.  If your friend walked through the northern hallway, your buddy had to be in the southern, and vice versa.  Their positions are coupled because they can’t occupy the same space at the same time. If you never ask who walked where, though, there’s a residual probability that each walked through each hallway — and if you ever query one, because their histories are coupled, the other’s history suddenly snaps into focus. No matter how far away that second person might be.  Learning which route either took tells you immediately about the other.

Not that this information is necessarily useful.  But perhaps you saw reports about faster-than-light-speed information travel between entangled objects.  The above example applies just as well (or as poorly, if you’re a stickler for accuracy or truth or what have you) to those studies as well.

In some ways this reminds me of the scene from Bottle Rocket, wherein a character is told “You’re like paper.  You know, you’re trash,” and then, “You know, you’re like paper falling by, you know… It doesn’t sound that bad in Spanish.”

A lot of results from quantum mechanics sound weird, but they don’t sound that weird in mathematics.

But I’ll admit that the way some of these results are written up in the popular press is bizarre.  Here’s a quote from Jay Kuo’s article (which K alerted me to after it was featured on George Takei’s webpage) about the recent helium atom experiment:

Screen shot from Tim Wogan’s article.

“What they found is weirder than anything seen to date: Every time the two grates were in place, the helium atom passed through, on many paths in many forms, just like a wave.  But whenever the second grate was not present, the atom invariably passed through the first grate like a particle.  The fascinating part was, the second grate’s very existence in the path was random.  And what’s more, it hadn’t happened yet.”

From a passage like that, it’d be hard to tell that this is an experiment that was first conducted nearly a decade ago, and a result that was exactly what you’d expect.  Honestly, I had trouble even parsing the above paragraph, and could barely understand the experiment from the description given in the article. And I studied quantum mechanics! I spent my junior and senior years of college doing research in the field! (My research was on the electronic structure of DNA bases, not entanglement specifically, but still.) I don’t know how people without that background were supposed to follow the science here. Or get through it without their eyes glazing over.

So, as to people’s excitement about this result: it’s a little bit weirder to think about the wavelength of big things (“big” here meaning the helium atoms; they’re big compared to photons), but it’s mostly weird in English.  Or any other metaphor-based language.  Our day-to-day perceptions don’t yield the metaphorical fodder we’d need to properly describe these phenomena in words.

Because, yeah, I like to think that I’m sitting still in a chair, typing this.  But I have a wavelength too.  So do you.  You might be anywhere within the boundaries roughly circumscribed by your wavelength!  And of course, there aren’t really any boundaries, because the probability of finding you in a place never quite drops to zero.  Even if we consider locations far away from your moments-prior center of mass.  But your probability peak on a likelihood vs. location graph is very, very steep.  You, my friend, are rather large: your wavelength is very small.
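For a sense of scale, the standard de Broglie relation (which I haven’t spelled out above) says wavelength equals Planck’s constant divided by mass times speed.  The masses and the shared speed of one meter per second below are assumptions picked for illustration, not values from the helium experiment.

```python
PLANCK = 6.62607015e-34  # Planck's constant, in joule-seconds


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Wavelength = h / (mass * speed), in meters."""
    return PLANCK / (mass_kg * speed_m_per_s)


# Assumed values, for illustration only.
helium_atom = de_broglie_wavelength(6.65e-27, 1.0)  # one helium-4 atom drifting at 1 m/s
person = de_broglie_wavelength(70.0, 1.0)           # a 70 kg person strolling at 1 m/s

print(f"helium atom: {helium_atom:.1e} m")
print(f"person:      {person:.1e} m")
```

With those assumed numbers the helium atom’s wavelength comes out around a ten-millionth of a meter, while the person’s is around 10^-35 meters, some twenty-eight orders of magnitude smaller.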

******************

p.s. If you happened across Jay Kuo’s article and were baffled, and would like an explanation that describes the experimental set-up used (I purposefully left out all the experimental details because I thought they’d distract from my two main points, that translating from mathematics to English is hard and inevitably introduces inaccuracies, and that for coupled pairs of objects [the real word for this is “entangled”] information can be transferred instantaneously), you could check out Tim Wogan’s summary on Physics World.  Wogan alludes to the idea that identifying the state of one object out of an entangled pair causes something reminiscent of faster-than-light travel:

“Indeed, the results of both Truscott and Aspect’s experiments [show] that [an object]’s wave or particle nature is most likely undefined until a measurement is made.  The other less likely option would be that of backward causation — that the particle somehow has information from the future — but this involves sending a message faster than light, which is forbidden by the rules of relativity.”

I don’t really like the use of the word “measurement” above (sure, I changed a few other words in that quotation, but only to improve readability — I didn’t want to change anything that might alter Wogan’s ideas), because to me this sounds excessively human-centric, as though quantum collapse couldn’t happen without us.

Over time, the state of an object can become coupled to the states of others (if two blue billiard balls collide, for instance, then you know that at some point in time they were in the same place) or uncoupled from the states of prior interaction partners (if one of those blue billiard balls then collides with a third red ball, the trajectories of the two blue balls will no longer be coupled).

In this double-slit experiment, it is the coupling between helium atom and detector (when the detector either chirps or doesn’t, that making-sound-or-not state is coupled to the position of the helium atom) that unveils information about objects entangled with the helium.

Maybe this seems less confusing if you think about it in terms of progressively revealing clues instead of causing behavior?  But, again, the English descriptions are never going to exactly match the math.