Heisenberg uncertainty principle, proved with just math

2023-09-24

This was a homework problem that I had for my mathematical methods for physics class, and I really enjoyed it! It was good practice using bra-ket notation, and it derives a really, really fundamental property of the world. It’s amazing that physical observables – the position of an object, its momentum, etc. – have a profound relationship that can be expressed as an inequality through pure math. This comes about because these quantities are treated as “operators” (like multiplication and division) acting on a wave function $\psi$ in quantum mechanics.

Part 0 – A crash course in bra-ket notation

Basics

“Bra-ket” notation was invented by Dirac as a way to format linear algebra in a way that was convenient for many of the usual operations done in quantum mechanics. Here is a super brief rundown of what we will need to deal with:

A ket $\ket{b}$ is a vector.
A bra $\bra{a}$ is a linear map that maps a vector into a scalar number.

I like thinking about this kind of abstract idea in this way: bras are just row vectors, and kets are just column vectors. The dot product of a row vector with a column vector just gives us a single number, so in this sense we mapped the ket (the column vector) into a scalar number according to the “linear map” defined by the bra (the row vector).

Dot products and operators

Now in linear algebra, taking dot products (more generally, inner products) is a pretty common thing to do. We usually write this with a dot between the two vectors: $\overrightarrow{a} \cdot \overrightarrow{b}$. In bra-ket notation, we write this by squishing the bra and the ket together:
$$\braket{a|b} \text{ is a dot product}$$
(remember that a dot product will spit out a scalar number!)

We can multiply a vector by a scalar value by writing it to the left of a ket, which looks as one might expect: $\lambda \overrightarrow{a} = \lambda \ket{a}$

We could also perform an operation $O$ on a vector with $O\ket{a}$ (e.g. $O$ could be the operator “rotate the vector by 90 degrees” – note that this should return another vector of the same dimension). The operators used in quantum mechanics are linear, so you can think of them as matrices.

Thus, we could chain a bra, an operator, and a ket together as $\bra{a} O \ket{b}$, which amounts to applying the operator to the ket $\ket{b}$, and then taking the resulting vector’s inner product with $\bra{a}$. This would, as explained above, leave us with just a scalar.
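If it helps to see the row-vector/column-vector picture concretely, here is a small NumPy sketch. The vectors and the matrix `O` are made-up example values, not anything from the homework problem.

```python
import numpy as np

# Kets as 1-D complex arrays (hypothetical example values).
ket_a = np.array([1 + 1j, 2 + 0j])
ket_b = np.array([0 + 1j, 1 + 0j])

# The bra <a| is the conjugate transpose of |a>; for a 1-D array
# that is just the complex conjugate.
bra_a = ket_a.conj()

# <a|b>: np.vdot conjugates its first argument, so we pass the ket itself.
inner_ab = np.vdot(ket_a, ket_b)

# An operator is a square matrix; this one swaps the two components.
O = np.array([[0, 1], [1, 0]], dtype=complex)

# <a|O|b>: apply O to |b>, then take the inner product with <a|.
sandwich = bra_a @ (O @ ket_b)

print(inner_ab)  # a single complex number
print(sandwich)  # also a single complex number
```

Both results are plain scalars, which is the whole point of sandwiching things between a bra and a ket.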

Hermitian conjugates

We get the “Hermitian conjugate” of a matrix by taking its transpose, and then taking the complex conjugate of each of the elements of the matrix. The notation for taking the Hermitian conjugate is to raise it to the power of “dagger”: $\dagger$
Crucially, $\ket{a}^\dagger = \bra{a}$, and if a matrix equals its own Hermitian conjugate, the matrix is “Hermitian”.
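As a quick illustration, the dagger operation is one line in NumPy. The helper name `dagger` and the matrix `H` below are my own example choices.

```python
import numpy as np

def dagger(M):
    """Hermitian conjugate: transpose, then complex-conjugate every entry."""
    return M.conj().T

# Hypothetical 2x2 matrix, chosen so that it equals its own Hermitian conjugate.
H = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])
is_hermitian = np.allclose(dagger(H), H)

# A ket written as a column vector turns into a bra (a row vector) under the dagger.
ket = np.array([[1j],
                [2]])
bra = dagger(ket)

print(is_hermitian)
print(bra.shape)
```

Note that the diagonal entries of `H` have to be real for this to work, since each one must equal its own complex conjugate.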

Dirac was one devious guy … but armed with our new tools, we are ready to begin!

Part 1 – Schwarz inequality

For Hermitian operators $A$, $B$, we will first prove that

$$|\bra{\psi}AB\ket{\psi}|^2 \le \bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$$

The Schwarz inequality for vectors $a$, $b$ of an inner product space states that

$$|\braket{a|b}| \le \sqrt{\braket{a|a}\braket{b|b}}$$

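Since $A$ and $B$ are operators, when we operate on some vector $\ket{\psi}$ with either of them, we get another vector. Thus, consider $\ket{a}=A\ket{\psi}$ and $\ket{b}=B\ket{\psi}$. Plugging these into the Schwarz inequality yields

$$|\braket{A \psi | B \psi}| \le \sqrt{\braket{A\psi|A\psi}\braket{B \psi | B\psi}}$$

The bra corresponding to $A\ket{\psi}$ is $\bra{\psi}A^\dagger = \bra{\psi}A$, since $A$ is Hermitian. So $\braket{A\psi|B\psi} = \bra{\psi}AB\ket{\psi}$ and $\braket{A\psi|A\psi} = \bra{\psi}A^2\ket{\psi}$, and likewise $\braket{B\psi|B\psi} = \bra{\psi}B^2\ket{\psi}$. Squaring both sides then gives

$$|\bra{\psi}AB\ket{\psi}|^2 \le \bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$$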

Part 2 – Triangle inequality

Our next goal is to prove that $|\bra{\psi}[A,B]\ket{\psi}|^2 \le 4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$, where $[A,B]$ is the commutator of the operators $A$, $B$ given by $[A,B] = AB - BA$
(From linear algebra, you may recall that matrix multiplication is generally not commutative: $AB \neq BA$. The degree to which this order matters is the “commutator”.)

Expanding the left side of the inequality gives

$$|\bra{\psi}(AB-BA)\ket{\psi}|^2 \le 4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$$

$$|\bra{\psi}AB\ket{\psi}-\bra{\psi}BA\ket{\psi}|^2 \le 4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$$

Now, consider the triangle inequality for complex numbers $a$, $b$. We have

$$|a+b| \le |a| + |b| \quad \forall a, b \in \mathbf{C}$$

Since the two terms on the left hand side, $\bra{\psi}AB\ket{\psi}$ and $\bra{\psi}BA\ket{\psi}$, are just complex numbers, we can use the triangle inequality (with $b$ replaced by $-b$, which does not change $|b|$).

$$|\bra{\psi}AB\ket{\psi}-\bra{\psi}BA\ket{\psi}| \le |\bra{\psi}AB\ket{\psi}| +|\bra{\psi}BA\ket{\psi}|$$

Squaring both sides,

$$|\bra{\psi}AB\ket{\psi}-\bra{\psi}BA\ket{\psi}|^2 \le |\bra{\psi}AB\ket{\psi}|^2 +|\bra{\psi}BA\ket{\psi}|^2+2|\bra{\psi}AB\ket{\psi}||\bra{\psi}BA\ket{\psi}|$$

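We can combine the terms on the right hand side. Since $A$ and $B$ are Hermitian, $(AB)^\dagger = B^\dagger A^\dagger = BA$, so $\bra{\psi}BA\ket{\psi}$ is the complex conjugate of $\bra{\psi}AB\ket{\psi}$ and the two have the same magnitude. The right hand side is therefore $4|\bra{\psi}AB\ket{\psi}|^2$, and the inequality from part 1 bounds this by $4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$. Thus,

$$|\bra{\psi}[A,B]\ket{\psi}|^2 \le 4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}$$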
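A numerical spot check is no substitute for the proof, but it is a nice sanity test. The sketch below draws random Hermitian matrices and random normalized states and checks the inequalities from parts 1 and 2. The helper names, the matrix size, the seed, and the tolerance are all my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def random_hermitian(n):
    """Random n x n Hermitian matrix, built as (M + M^dagger) / 2."""
    M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (M + M.conj().T) / 2

def random_state(n):
    """Random normalized ket of dimension n."""
    psi = rng.normal(size=n) + 1j * rng.normal(size=n)
    return psi / np.linalg.norm(psi)

def expect(op, psi):
    """<psi| op |psi>."""
    return np.vdot(psi, op @ psi)

holds = True
for _ in range(1000):
    A, B = random_hermitian(4), random_hermitian(4)
    psi = random_state(4)
    # <psi|A^2|psi><psi|B^2|psi> is real and non-negative.
    bound = (expect(A @ A, psi) * expect(B @ B, psi)).real
    part1 = abs(expect(A @ B, psi)) ** 2 <= bound + 1e-9
    part2 = abs(expect(A @ B - B @ A, psi)) ** 2 <= 4 * bound + 1e-9
    holds = bool(holds and part1 and part2)

print(holds)
```

The small `1e-9` slack only guards against floating-point rounding.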

Part 3 – Defining some useful operators

Consider two operators $A'\equiv A-\alpha\mathbf{1}$, $B'\equiv B-\beta\mathbf{1}$, where $\mathbf{1}$ is the identity matrix and $\alpha$, $\beta$ are real numbers. We will show that $A'$ and $B'$ are both Hermitian, and also that $[A',B']=[A,B]$.

Take the Hermitian conjugate of $A'$:

$${A'}^{\dagger}=A^{\dagger}-{\alpha}^{\star}\mathbf{1}^{\dagger}$$

$\mathbf{1}^{\dagger}=\mathbf{1}$, ${A}^{\dagger}=A$ since $A$ is Hermitian, and ${\alpha}^{\star}=\alpha$ since $\alpha$ is real. Thus,

${A'}^{\dagger}=A-{\alpha}\mathbf{1}=A'$, so $A'$ is Hermitian. By symmetry, the same argument applies to $B'$: ${B'}^{\dagger}=B'$.

Now,

$$\begin{aligned}
[A', B'] &= A'B'-B'A' \\
&= (A-\alpha\mathbf{1})(B-\beta\mathbf{1})-(B-\beta\mathbf{1})(A-\alpha\mathbf{1}) \\
&= AB-\beta A-\alpha B + \alpha\beta\mathbf{1}-(BA-\alpha B-\beta A + \alpha\beta\mathbf{1}) \\
&= AB-BA \\
&= [A,B]
\end{aligned}$$
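Here is a tiny numerical version of the same statement, assuming some arbitrary example matrices and shifts that I picked by hand: subtracting a real multiple of the identity keeps an operator Hermitian and leaves the commutator untouched.

```python
import numpy as np

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    return X @ Y - Y @ X

# Hypothetical Hermitian matrices and real shifts.
A = np.array([[1, 2 - 1j],
              [2 + 1j, -1]])
B = np.array([[0, 1j],
              [-1j, 2]])
alpha, beta = 0.7, -1.3

identity = np.eye(2)
A_shift = A - alpha * identity  # plays the role of A'
B_shift = B - beta * identity   # plays the role of B'

still_hermitian = np.allclose(A_shift.conj().T, A_shift)
same_commutator = np.allclose(commutator(A_shift, B_shift), commutator(A, B))

print(still_hermitian, same_commutator)
```

The identity commutes with everything, which is why every term involving `alpha` or `beta` cancels.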

Part 4 – Putting it all together

The expected value $A_{avg}$ of an operator $A$ in the normalized state $\ket{\psi}$ is defined as $A_{avg}=\bra{\psi}A\ket{\psi}$, which is a real number because $A$ is Hermitian. The standard deviation of this measurement, or the “uncertainty”, of $A$ in the state $\ket{\psi}$ is then given by

$$\Delta A = \sqrt{\bra{\psi}(A-A_{avg}\mathbf{1})^2\ket{\psi}} = \sqrt{\bra{\psi}A'^2\ket{\psi}}$$

where $A' \equiv A - A_{avg}\mathbf{1}$ is the operator from part 3 with $\alpha = A_{avg}$.

Now, in part 2 we found that

$$4\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi} \ge |\bra{\psi}[A,B]\ket{\psi}|^2$$

$$\rightarrow 2\sqrt{\bra{\psi}A^2\ket{\psi}\bra{\psi}B^2\ket{\psi}} \ge |\bra{\psi}[A,B]\ket{\psi}|$$

Since $A'$ and $B'$ are Hermitian, we can insert them into this inequality.

$$2\sqrt{\bra{\psi}A'^2\ket{\psi}\bra{\psi}B'^2\ket{\psi}} \ge |\bra{\psi}[A',B']\ket{\psi}|$$

$$\sqrt{\bra{\psi}A'^2\ket{\psi}\bra{\psi}B'^2\ket{\psi}} \ge \frac{1}{2}|\bra{\psi}[A',B']\ket{\psi}|$$

From part 3, $[A', B']=[A,B]$, and by definition $\sqrt{\bra{\psi}A'^2\ket{\psi}}=\Delta A$ and $\sqrt{\bra{\psi}B'^2\ket{\psi}}=\Delta B$. Thus, combining all of these into the above inequality yields

$$\Delta A \Delta B \ge \frac{1}{2}|\bra{\psi}[A,B]\ket{\psi}|$$

for any two generic Hermitian operators $A$ and $B$! The product of the uncertainties of any two such operators is bounded below by the expectation value of their commutator. This does seem to make sense; a commutator is just an expression of how much two operators fail to commute, so logically the uncertainty relation should be affected by this quantity.
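To see the general relation in action on something other than position and momentum, here is a sketch using the spin-1/2 operators $S_x$ and $S_y$ (Pauli matrices divided by 2, in units where $\hbar = 1$). This example and the helper names are my own additions, not part of the homework problem.

```python
import numpy as np

# Spin-1/2 operators in units where hbar = 1 (an assumption of this sketch).
Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = 0.5 * np.array([[0, -1j], [1j, 0]])

# The "spin up along z" state, already normalized.
psi = np.array([1, 0], dtype=complex)

def expect(op, psi):
    """<psi| op |psi>."""
    return np.vdot(psi, op @ psi)

def uncertainty(op, psi):
    """Delta A = sqrt(<psi| (A - A_avg 1)^2 |psi>)."""
    mean = expect(op, psi).real
    shifted = op - mean * np.eye(len(psi))
    return np.sqrt(expect(shifted @ shifted, psi).real)

lhs = uncertainty(Sx, psi) * uncertainty(Sy, psi)
rhs = 0.5 * abs(expect(Sx @ Sy - Sy @ Sx, psi))

print(lhs, rhs)
```

For this particular state both sides come out to $\frac{1}{4}$, so the inequality holds with equality.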

To derive the so-called “Heisenberg uncertainty principle”, which expresses the minimum uncertainty in position and momentum of a particle, we note that $[x,p]=i \hbar$ (this is called the canonical commutation relation), and that the position and momentum operators are Hermitian. Thus,

$$\Delta x \Delta p \ge \frac{1}{2}|\bra{\psi}i\hbar\ket{\psi}|$$

Since $i\hbar$ is just a number, it comes out of the bracket, and $\braket{\psi|\psi}=1$ for a normalized state, so $|\bra{\psi}i\hbar\ket{\psi}| = \hbar$. This gives

$$\Delta x \Delta p \ge \frac{\hbar}{2}$$
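A Gaussian wave packet is the standard example of a state that sits right at this bound. The sketch below checks that numerically on a grid, assuming units where $\hbar = 1$, the position-space momentum operator $p = -i\hbar \, d/dx$, and an arbitrary width `sigma`. The grid size and range are my own choices.

```python
import numpy as np

hbar = 1.0    # assumed units
sigma = 0.8   # hypothetical width of the wave packet

x = np.linspace(-12, 12, 4001)
dx = x[1] - x[0]

# Normalized real Gaussian wave function centered at x = 0.
psi = (2 * np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (4 * sigma**2))
prob = np.abs(psi) ** 2

# Position uncertainty from the probability density.
mean_x = np.sum(x * prob) * dx
delta_x = np.sqrt(np.sum((x - mean_x) ** 2 * prob) * dx)

# For a real wave function <p> = 0, and integrating by parts gives
# <p^2> = hbar^2 * integral of |dpsi/dx|^2.
dpsi = np.gradient(psi, dx)
delta_p = hbar * np.sqrt(np.sum(np.abs(dpsi) ** 2) * dx)

product = delta_x * delta_p
print(product, hbar / 2)
```

The two printed numbers should agree up to the discretization error of the grid.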

Conclusion

I once read somewhere that in some deep sense, everything strange about quantum mechanics is derived from the fact that the position and momentum operators do not commute. Indeed, once we take the commutation relation as an assumption/observation/axiom/whatever you want to call it, the uncertainty principle is a mathematical necessity. What an idea!