Heisenberg uncertainty principle, proved with just math
This was a homework problem that I had for my mathematical methods for physics class, and I really enjoyed it! It was good practice using bra-ket notation, and it derives a really, really fundamental property of the world. It’s really amazing that physical observables (the position of an object, its momentum, etc.) have a profound relationship that can be expressed as an inequality using pure math. This comes about because, in quantum mechanics, these quantities are treated as “operators” (like multiplication and division) acting on a wave function.
Part 0 – A crash course in bra-ket notation
Basics
“Bra-ket” notation was invented by Dirac as a way to format linear algebra in a way that was convenient for many of the usual operations done in quantum mechanics. Here is a super brief rundown of what we will need to deal with:
A ket $|v\rangle$ is a vector.
A bra $\langle u|$ is a linear map that maps a vector into a scalar number.
I like thinking about this kind of abstract idea in this way: bras are just row vectors, and kets are just column vectors. The dot product of a row vector with a column vector just gives us a single number, so in this sense we mapped the ket (the column vector) into a scalar number according to the “linear map” defined by the bra (the row vector).
Dot products and operators
Now in linear algebra, taking dot products (more generally, inner products) is a pretty common thing to do. We usually write this with a dot between the two vectors: $\vec{u} \cdot \vec{v}$. In bra-ket notation, we write this by squishing the bra and the ket together:
$$\langle u | v \rangle$$
(remember that a dot product will spit out a scalar number!)
We can multiply a vector by a scalar value $c$ by appending it to the left of a ket, which is written as one might expect:
$$c\,|v\rangle$$
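We could also perform an operation on a vector with an operator $A$, written $A|v\rangle$ (e.g. $A$ could be the operator “double every component of the vector”; note that this should return another vector of the same dimension). The operators we care about in quantum mechanics are linear, so they can be written as matrices.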
Thus, we could chain a bra, an operator, and a ket together, $\langle u|A|v\rangle$, which would amount to just using the operator $A$ on the ket $|v\rangle$, and then taking the resulting vector’s inner product with $|u\rangle$. This would, as explained above, leave us with just a scalar.
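To make the row-vector/column-vector picture concrete, here is a minimal NumPy sketch. The particular vectors `u`, `v` and the matrix `A` are made-up examples of mine, not anything special; the only thing to notice is that `np.vdot` complex-conjugates its first argument, which is exactly what turning a ket into a bra does.

```python
import numpy as np

# Kets are stored as 1-D complex arrays (think: column vectors).
u = np.array([1 + 1j, 2, 0])
v = np.array([3, 1j, 1])

# <u|v>: np.vdot conjugates its first argument, playing the role of the bra.
inner = np.vdot(u, v)

# A linear operator is just a matrix acting on a ket.
# This example swaps the first two components and doubles the third.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 2]], dtype=complex)

Av = A @ v                      # A|v> is another 3-component vector
sandwich = np.vdot(u, A @ v)    # <u|A|v> is a single complex number

print(inner, Av, sandwich)
```

Both `inner` and `sandwich` are plain complex scalars, while `Av` keeps the shape of the original ket.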
Hermitian conjugates
We get the “Hermitian conjugate” of a matrix by taking its transpose, and then taking the complex conjugate of each of the elements of the matrix. The notation for taking the Hermitian conjugate is to raise it to the power of “dagger”:
$$A^\dagger = (A^T)^*$$
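Crucially, the Hermitian conjugate of a ket is the corresponding bra, so $(A|v\rangle)^\dagger = \langle v|A^\dagger$, and if a matrix equals its own Hermitian conjugate, $A^\dagger = A$, the matrix is “Hermitian”.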
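Here is a small sketch of the Hermitian conjugate in NumPy. The helper name `dagger` and the example matrix `M` are my own choices for illustration. It also checks numerically that $\langle u|H|v\rangle^* = \langle v|H^\dagger|u\rangle$, a standard identity that follows from the definition.

```python
import numpy as np

def dagger(M):
    """Hermitian conjugate: transpose, then complex-conjugate each entry."""
    return M.T.conj()

M = np.array([[1, 2 - 1j],
              [3j, 4]])

# M + M^dagger is Hermitian for any square M.
H = M + dagger(M)
is_hermitian = np.allclose(H, dagger(H))

# Conjugating a sandwich flips the order and daggers the operator.
u = np.array([1, 1j])
v = np.array([2, 1 - 1j])
lhs = np.conj(np.vdot(u, H @ v))     # <u|H|v>*
rhs = np.vdot(v, dagger(H) @ u)      # <v|H^dagger|u>

print(is_hermitian, np.isclose(lhs, rhs))
```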
Dirac was one devious guy … but armed with our new tools, we are ready to begin!
Part 1 – Schwarz inequality
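For Hermitian operators $A$, $B$, we will first prove that
$$|\langle\psi|AB|\psi\rangle|^2 \le \langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$$
The Schwarz inequality for vectors $|u\rangle$, $|v\rangle$ of an inner product space states that
$$|\langle u|v\rangle|^2 \le \langle u|u\rangle\,\langle v|v\rangle$$
Since $A$ and $B$ are operators, when we operate on some vector $|\psi\rangle$ by either of these, we get a vector. Thus, consider $|u\rangle = A|\psi\rangle$ and $|v\rangle = B|\psi\rangle$. The corresponding bras are $\langle u| = \langle\psi|A^\dagger = \langle\psi|A$ and $\langle v| = \langle\psi|B^\dagger = \langle\psi|B$, since $A$ and $B$ are Hermitian. Plugging this into the Schwarz inequality yields
$$|\langle\psi|AB|\psi\rangle|^2 \le \langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$$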
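A quick numerical sanity check of this step, as a sketch: I draw two random Hermitian matrices and a random normalized state (the size, the seed, and the helper names `random_hermitian` and `expval` are all arbitrary choices of mine) and compare $|\langle\psi|AB|\psi\rangle|^2$ against $\langle\psi|A^2|\psi\rangle\langle\psi|B^2|\psi\rangle$.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Random n x n Hermitian matrix, built as (M + M^dagger) / 2."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

def expval(op, psi):
    """<psi|op|psi> for a ket stored as a 1-D array."""
    return np.vdot(psi, op @ psi)

n = 4
A, B = random_hermitian(n), random_hermitian(n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi /= np.linalg.norm(psi)           # normalize the state

lhs = abs(expval(A @ B, psi)) ** 2
rhs = (expval(A @ A, psi) * expval(B @ B, psi)).real

print(lhs <= rhs)
```

One random draw proves nothing, of course, but it is a cheap way to catch a sign or ordering mistake in the algebra.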
Part 2 – Triangle inequality
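Our next goal is to prove that $|\langle\psi|[A,B]|\psi\rangle|^2 \le 4\,\langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$, where $[A,B]$ is the commutator of the operators $A$, $B$, given by
$$[A,B] = AB - BA$$
(From linear algebra, you may recall that matrix multiplication is generally not commutative: $AB \ne BA$. The degree to which this order matters is the “commutator”.)
Expanding the left side of the inequality gives
$$|\langle\psi|[A,B]|\psi\rangle|^2 = |\langle\psi|AB|\psi\rangle - \langle\psi|BA|\psi\rangle|^2$$
Now, consider the triangle inequality for complex numbers $z_1$, $z_2$. We have $|z_1 + z_2| \le |z_1| + |z_2|$, and therefore also $|z_1 - z_2| \le |z_1| + |z_2|$ (just replace $z_2$ with $-z_2$). Since the two terms on the left hand side, $\langle\psi|AB|\psi\rangle$ and $\langle\psi|BA|\psi\rangle$, are just complex numbers, we can use the triangle inequality:
$$|\langle\psi|AB|\psi\rangle - \langle\psi|BA|\psi\rangle| \le |\langle\psi|AB|\psi\rangle| + |\langle\psi|BA|\psi\rangle|$$
Squaring both sides,
$$|\langle\psi|[A,B]|\psi\rangle|^2 \le |\langle\psi|AB|\psi\rangle|^2 + 2\,|\langle\psi|AB|\psi\rangle|\,|\langle\psi|BA|\psi\rangle| + |\langle\psi|BA|\psi\rangle|^2$$
We can combine the terms on the right hand side since, again, $\langle\psi|AB|\psi\rangle$ and $\langle\psi|BA|\psi\rangle$ are just complex numbers, so they commute, and they also obey the same inequality from part 1: both $|\langle\psi|AB|\psi\rangle|^2$ and $|\langle\psi|BA|\psi\rangle|^2$ are at most $\langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$, and so is the product $|\langle\psi|AB|\psi\rangle|\,|\langle\psi|BA|\psi\rangle|$. Thus,
$$|\langle\psi|[A,B]|\psi\rangle|^2 \le 4\,\langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$$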
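The chain of inequalities in this part can also be checked numerically. This is only a sketch with random Hermitian matrices (seed, size, and helper names are my own assumptions), tracking each intermediate quantity: $\langle\psi|AB|\psi\rangle$, $\langle\psi|BA|\psi\rangle$, the commutator expectation, and the final bound.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_hermitian(n):
    """Random n x n Hermitian matrix, built as (M + M^dagger) / 2."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

def expval(op, psi):
    """<psi|op|psi> for a ket stored as a 1-D array."""
    return np.vdot(psi, op @ psi)

A, B = random_hermitian(3), random_hermitian(3)
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi /= np.linalg.norm(psi)

ab = expval(A @ B, psi)              # <psi|AB|psi>
ba = expval(B @ A, psi)              # <psi|BA|psi>
comm = expval(A @ B - B @ A, psi)    # <psi|[A,B]|psi>
bound = 4 * (expval(A @ A, psi) * expval(B @ B, psi)).real

print(abs(comm) <= abs(ab) + abs(ba))   # triangle inequality step
print(abs(comm) ** 2 <= bound)          # final result of this part
```

As a side observation, for Hermitian $A$ and $B$ the two numbers `ab` and `ba` are complex conjugates of each other, so they have the same magnitude.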
Part 3 – Defining some useful operators
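Consider two operators $A' = A - aI$, $B' = B - bI$, where $A$ and $B$ are Hermitian, $I$ is the identity matrix and $a$, $b$ are real numbers. We will show that $A'$ and $B'$ are both Hermitian, and also that $[A', B'] = [A, B]$.
Take the Hermitian conjugate of $A'$:
$$(A')^\dagger = (A - aI)^\dagger = A^\dagger - a^* I^\dagger$$
$A^\dagger = A$, since $A$ is Hermitian, $I^\dagger = I$, and $a^* = a$ since $a$ is real. Thus,
$(A')^\dagger = A - aI = A'$, so $A'$ is Hermitian. By symmetry, the same argument applies to $B'$: $(B')^\dagger = B'$.
Now,
$$[A', B'] = (A - aI)(B - bI) - (B - bI)(A - aI) = (AB - bA - aB + abI) - (BA - aB - bA + abI) = AB - BA = [A, B]$$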
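Both claims of this part are easy to confirm on a concrete example. In the sketch below the matrices `A`, `B` and the real shifts `a`, `b` are arbitrary values I picked; the names `A_shift` and `B_shift` stand for the shifted operators $A - aI$ and $B - bI$.

```python
import numpy as np

# Two Hermitian 2x2 matrices (arbitrary examples).
A = np.array([[2, 1 - 1j],
              [1 + 1j, -1]])
B = np.array([[0, 3j],
              [-3j, 5]])
I = np.eye(2)
a, b = 0.4, -1.7                 # any real numbers work

def commutator(X, Y):
    return X @ Y - Y @ X

def is_hermitian(X):
    return np.allclose(X, X.conj().T)

A_shift = A - a * I              # A - aI
B_shift = B - b * I              # B - bI

same_commutator = np.allclose(commutator(A_shift, B_shift), commutator(A, B))
print(is_hermitian(A_shift), is_hermitian(B_shift), same_commutator)
```

Subtracting a multiple of the identity only moves the diagonal, which is why neither Hermiticity nor the commutator notices the shift.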
Part 4 – Putting it all together
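The expected value of an operator $A$ in state $|\psi\rangle$ is defined as $\langle A\rangle = \langle\psi|A|\psi\rangle$. Thus, the standard deviation of this measurement, or the “uncertainty”, of $A$ in the normalized state $|\psi\rangle$ is given as such:
$$\Delta A = \sqrt{\langle\psi|(A - \langle A\rangle I)^2|\psi\rangle} = \sqrt{\langle\psi|A'^2|\psi\rangle}$$
using the operators from part 3, $A' = A - aI$ and $B' = B - bI$, with $a = \langle A\rangle$ and $b = \langle B\rangle$ (these are real numbers because $A$ and $B$ are Hermitian).
Now, in part 2 we found that
$$|\langle\psi|[A,B]|\psi\rangle|^2 \le 4\,\langle\psi|A^2|\psi\rangle\,\langle\psi|B^2|\psi\rangle$$
Since $A'$ and $B'$ are Hermitian, we can insert them into this inequality.
$$|\langle\psi|[A',B']|\psi\rangle|^2 \le 4\,\langle\psi|A'^2|\psi\rangle\,\langle\psi|B'^2|\psi\rangle$$
$[A', B'] = [A, B]$ from part 3, and $\langle\psi|A'^2|\psi\rangle = (\Delta A)^2$ and $\langle\psi|B'^2|\psi\rangle = (\Delta B)^2$ by definition. Thus, combining all of these into the above inequality and taking the square root of both sides yields
$$\Delta A\,\Delta B \ge \frac{1}{2}\,|\langle\psi|[A,B]|\psi\rangle|$$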
for any two generic Hermitian operators $A$ and $B$! Any two Hermitian operators obey an “uncertainty” relation whose lower bound is set by the commutator of the two operators. This does seem to make sense; a commutator is just an expression of how much two operators fail to commute, so logically the uncertainty relation should be affected by this quantity.
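As an illustration of the general relation $\Delta A\,\Delta B \ge \frac{1}{2}|\langle\psi|[A,B]|\psi\rangle|$, here is a sketch using the Pauli matrices $\sigma_x$ and $\sigma_y$, which are Hermitian and satisfy $[\sigma_x, \sigma_y] = 2i\sigma_z$. The choice of operators, the angles `theta` and `phi` defining the state, and the helper names are all mine, picked only for the example.

```python
import numpy as np

# Pauli matrices: a standard pair of non-commuting Hermitian operators.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def expval(op, psi):
    """<psi|op|psi> for a normalized ket psi."""
    return np.vdot(psi, op @ psi)

def uncertainty(op, psi):
    """Standard deviation sqrt(<(op - <op> I)^2>) in state psi."""
    mean = expval(op, psi).real
    shifted = op - mean * np.eye(len(psi))
    return np.sqrt(expval(shifted @ shifted, psi).real)

# An arbitrary normalized spin state.
theta, phi = 0.7, 1.3
psi = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

commutator = sx @ sy - sy @ sx
lhs = uncertainty(sx, psi) * uncertainty(sy, psi)
rhs = 0.5 * abs(expval(commutator, psi))

print(lhs, rhs, lhs >= rhs)
```

Trying other values of `theta` and `phi` is a nice way to see how close to equality the bound can get.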
To derive the so-called “Heisenberg uncertainty principle”, which expresses the minimum uncertainty in position and momentum of a particle, we note that $[x, p] = i\hbar$ (this is called the canonical commutation relation), and that the position and momentum operators $x$ and $p$ are Hermitian. Thus,
$$\Delta x\,\Delta p \ge \frac{1}{2}\,|\langle\psi|i\hbar|\psi\rangle| = \frac{\hbar}{2}$$
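The bound $\Delta x\,\Delta p \ge \hbar/2$ can be probed numerically with a wave function on a grid. The sketch below assumes a Gaussian wave packet (a state known to sit right at the bound), units where $\hbar = 1$, and the position-space form of the momentum operator, $p = -i\hbar\,d/dx$. The grid size and width `sigma` are arbitrary choices, and the derivative is only a finite-difference approximation.

```python
import numpy as np

hbar = 1.0                       # natural units (assumption)
sigma = 1.0                      # width of the Gaussian (assumption)
x = np.linspace(-10, 10, 2001)   # grid wide enough that psi is ~0 at the edges
dx = x[1] - x[0]

# Normalized real Gaussian wave packet centered at x = 0.
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (4 * sigma**2))

# Position uncertainty from the probability density |psi|^2.
prob = np.abs(psi) ** 2
mean_x = np.sum(x * prob) * dx
delta_x = np.sqrt(np.sum((x - mean_x) ** 2 * prob) * dx)

# psi is real, so <p> = 0, and integrating by parts gives
# <p^2> = hbar^2 * integral of |dpsi/dx|^2 dx.
dpsi = np.gradient(psi, dx)
delta_p = hbar * np.sqrt(np.sum(np.abs(dpsi) ** 2) * dx)

product = delta_x * delta_p
print(product, hbar / 2)
```

Up to discretization error, the printed product should land very close to $\hbar/2$; replacing the Gaussian with some other normalized shape should only push the product up.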
Conclusion
I once read somewhere that in some deep sense, everything strange about quantum mechanics is derived from the fact that the position and momentum operators do not commute. Indeed, given this assumption/observation/axiom/whatever you want to call it, the uncertainty principle is a mathematical necessity. What an idea!