Quantum Information

This note refers to the lecture notes of Professor John Preskill.
To get the original lecture notes, click 📖.

Chap 1     Introduction and Overview 📖

1.1 Physics of information

  • Landauer’s principle: Erasing one bit of information requires work of at least $W = kT\ln 2$.

  • Reversible computation: No information is erased. Inputs and outputs are in one-to-one correspondence, so the computation avoids the energy cost of erasure.

    e.g. NAND gate:

    | Form | Type |
    | --- | --- |
    | $(a,b)\to\neg(a\wedge b)$ | irreversible |
    | $(a,b,c)\to(a,b,c\oplus a\wedge b)$, $c=1$ | reversible |

  • Maxwell’s demon: Information entropy.
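
The reversible form of NAND in the table above can be checked by brute force. The following is a minimal sketch in plain Python; the function names `nand` and `toffoli` are illustrative choices, not taken from the lecture notes.

```python
from itertools import product

def nand(a, b):
    """Irreversible NAND: two input bits collapse to one output bit."""
    return 1 - (a & b)

def toffoli(a, b, c):
    """Reversible map (a, b, c) -> (a, b, c XOR (a AND b))."""
    return a, b, c ^ (a & b)

# With c = 1 the third output bit equals NAND(a, b).
for a, b in product((0, 1), repeat=2):
    assert toffoli(a, b, 1)[2] == nand(a, b)

# The map permutes the eight 3-bit strings and is its own inverse,
# so no input information is erased.
inputs = list(product((0, 1), repeat=3))
outputs = [toffoli(*bits) for bits in inputs]
is_bijection = sorted(outputs) == sorted(inputs)
is_self_inverse = all(toffoli(*toffoli(*bits)) == bits for bits in inputs)
print(is_bijection, is_self_inverse)
```

Because the first two bits are carried through unchanged, the input can always be recovered from the output, which is what makes the three-bit version reversible.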

1.2 Quantum information

  • True randomness

    Measurement in quantum mechanics is inherently probabilistic. If we measure a general state $\ket \psi = \sum_{x} a_x \ket x$ by projecting onto the $\{\ket 0,\ket 1\}$ basis, the probability of obtaining the outcome $\ket x$ is $|a_x|^2$. This randomness is a fundamental feature of quantum theory.

  • Uncertainty principle

    In quantum theory, noncommuting observables cannot simultaneously have precisely defined values (the uncertainty principle).

  • Acquiring information causes disturbance

    Performing a measurement of one observable $A$ will necessarily influence (disturb) the outcome of a subsequent measurement of an observable $B$, if $A$ and $B$ do not commute.

  • No-cloning principle

    It is impossible to create an independent and identical copy of an arbitrary unknown quantum state.

  • Nonlocal correlation

    In theoretical physics, quantum nonlocality refers to the phenomenon by which the measurement statistics of a multipartite quantum system do not allow an interpretation with local realism.

1.3 Efficient quantum algorithms

  • Quantum bits have some properties that classical bits don’t have, which can reduce the computational complexity of specific problems.

    e.g. Shor’s Algorithm:

    This is a quantum algorithm for integer factorization. Classically, factoring a large number is considered a “hard” problem: it is in NP but is not believed to be in P, and the best-known classical algorithms require super-polynomial time. Shor’s algorithm runs in polynomial time on a quantum computer, a super-polynomial speedup over the best-known classical methods.

    e.g. Grover’s Algorithm:

    This is a quantum search algorithm. For an unstructured database with $N$ items, a classical search requires $O(N)$ operations on average to find a target. Grover’s algorithm achieves this in $O(\sqrt{N})$ operations, a quadratic speedup. This can be applied to brute-force search for solutions of NP-complete problems over $N$-bit candidates, reducing a classical $O(2^N)$ search time to a quantum $O(2^{N/2})$ time.

1.4 Quantum complexity

  • Quantum states: The state of $N$ qubits can be expressed as a vector in a space of dimension $2^N$. An orthonormal basis for the space can be labeled by binary strings such as

    $$\ket{\underbrace{ 0110 \cdots 10}_{N}}.$$

    A general normalized vector can be expanded in this basis as

    $$\ket \psi = \sum_{x=0}^{2^N-1} a_x \ket x,$$

    where the $a_x$ are complex numbers satisfying $\sum_x |a_x|^2=1$. If we measure all $N$ qubits by projecting each onto the $\{\ket 0,\ket 1\}$ basis, the probability of obtaining the outcome $\ket x$ is $|a_x|^2$.

  • Quantum gate: Apply a unitary transformation $\bm{U}$ to the $N$ qubits,

    $$\ket \psi \to \bm{U} \ket \psi.$$

  • Nonlocal correlation: If we divide a quantum system into subsystems and restrict each measurement to a single subsystem, so that no collective measurement spanning the boundaries between subsystems is allowed, the measurements reveal very little about the state of the whole system.

    Most of the information is stored in correlations. By measuring the correlations, which is considered to be a “collective” measurement, we can learn much more; in principle, we can completely reconstruct the state.

1.5 Quantum parallelism

  • Consider a function that maps an input $x$ to an output $f(x)$. If the input has $N$ bits, there are $2^N$ possible arguments.

    Classically, evaluating $f$ on every argument takes $2^N$ computations; a quantum computer acting on a superposition of all arguments needs only a single computation.

    e.g. Deutsch’s problem

    There is a black box that computes a function that takes a single bit $x$ to a single bit $f(x)$. We want to know whether $f(x)$ is constant ($f(0)=f(1)$) or balanced ($f(0)\neq f(1)$).

    Classically, two evaluations of $f$ are required to get the answer. With qubits, a single evaluation suffices.

    All we need is a transformation $U_f$ that operates on two qubits:

    $$U_f:\ket x \ket y \to \ket x \ket{y \oplus f(x)}.$$

    We can choose the second qubit of the input state to be the superposition $\frac{1}{\sqrt{2}}(\ket 0-\ket 1)$ of $\ket 0$ and $\ket 1$; then

    $$\begin{align*} U_f:\ket x \frac{1}{\sqrt{2}}(\ket 0 - \ket 1) & \to \ket x \frac{1}{\sqrt{2}}(\ket{0 \oplus f(x)} - \ket{1 \oplus f(x)})\\ &=\ket x(-1)^{f(x)} \frac{1}{\sqrt{2}}(\ket 0-\ket 1). \end{align*}$$

    Now suppose we prepare the first qubit as $\frac{1}{\sqrt{2}}(\ket 0+\ket 1)$. Then the black box acts as

    $$\begin{align*} U_f:\ & \frac{1}{\sqrt{2}}(\ket 0+\ket 1) \frac{1}{\sqrt{2}}(\ket 0-\ket 1) \to\\ & \frac{1}{\sqrt{2}} \left[(-1)^{f(0)} \ket 0+(-1)^{f(1)}\ket 1 \right] \frac{1}{\sqrt{2}}(\ket 0-\ket 1). \end{align*}$$

    Finally, we can perform a measurement that projects the first qubit onto the basis

    $$\ket \pm = \frac{1}{\sqrt{2}}(\ket 0\pm\ket 1).$$

    Evidently, we will always obtain $\ket +$ if the function is constant ($f(0) = f(1)$), and $\ket -$ if the function is balanced ($f(0) \ne f(1)$).
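
The steps of Deutsch’s problem above can be simulated directly with state vectors. This is a minimal sketch assuming NumPy; the helper names `oracle` and `deutsch` and the basis ordering (index $2x + y$) are choices made for this illustration, not part of the lecture notes.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

def oracle(f):
    """4x4 matrix of U_f: |x>|y> -> |x>|y XOR f(x)>, basis index 2*x + y."""
    U = np.zeros((4, 4))
    for x in (0, 1):
        for y in (0, 1):
            U[2 * x + (y ^ f(x)), 2 * x + y] = 1
    return U

def deutsch(f):
    """Classify f as 'constant' or 'balanced' with one application of U_f."""
    plus = np.array([1, 1]) / np.sqrt(2)
    minus = np.array([1, -1]) / np.sqrt(2)
    state = oracle(f) @ np.kron(plus, minus)
    # Projecting the first qubit onto |+>, |-> is equivalent to applying H
    # to it and then reading it out in the |0>, |1> basis.
    state = np.kron(H, np.eye(2)) @ state
    prob_plus = np.sum(np.abs(state[:2]) ** 2)  # first qubit found in |+>
    return "constant" if np.isclose(prob_plus, 1.0) else "balanced"

functions = {
    "f(x)=0": lambda x: 0,
    "f(x)=1": lambda x: 1,
    "f(x)=x": lambda x: x,
    "f(x)=1-x": lambda x: 1 - x,
}
results = {name: deutsch(f) for name, f in functions.items()}
print(results)
```

Each call to `deutsch` builds and applies `oracle(f)` exactly once, mirroring the single black-box query in the text.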

  • Let the quantum computer act as

    $$U_f:\ket x \ket 0 \to \ket x \ket{f(x)}.$$

    We could choose the input register to be in the state

    $$\left[\frac{1}{\sqrt{2}}(\ket 0+\ket 1) \right]^N = \frac{1}{2^{N/2}} \sum_{x=0}^{2^N-1} \ket x,$$

    and by computing $f(x)$ only once, we can generate the state

    $$\frac{1}{2^{N/2}} \sum_{x=0}^{2^N-1} \ket x \ket{f(x)}.$$

    If we measure the input register and obtain $\ket{x_0}$, this procedure prepares the state

    $$\ket{x_0} \ket{f(x_0)}.$$

    By measuring the output register, we can find the value of $f(x_0)$. However, we cannot determine $f(y_0)$ for any $y_0\ne x_0$: we would have to go back to the step before the input register was measured, run the computation again, and hope that the measurement of the input register yields $y_0$. Used in this way, the quantum computation provides no advantage over a classical one.

1.6 A new classification of complexity

  • Classical complexity: for an algorithm $A$ acting on an $N$-bit input, $T_A(N)$ is the maximum number of elementary steps the algorithm needs to finish.

    We call $A$ polynomial time if $T_A(N) \le \text{Poly}(N)$, where $\text{Poly}(N)$ denotes a polynomial in $N$. Otherwise, we say it is exponential time (loosely speaking; strictly, super-polynomial).

  • The quantum classification of complexity is suspected, but not proved, to differ from the classical classification.

1.7 What about errors?

  • Information is encoded in a superposition of states $\ket \psi$. Unfortunately, these nonlocal correlations are extremely fragile and tend to decay very rapidly in practice. The problem is that a quantum system is inevitably in contact with a much larger system, its environment. Interactions between a quantum system and its environment establish nonlocal correlations between the two. Eventually the quantum information that we initially encoded in the system becomes encoded in correlations between the system and the environment, which means we can no longer access the information by observing only the system; for practical purposes, the information is irrevocably lost.

    Even if we could achieve perfect isolation from the environment, we cannot expect a quantum computer to operate with perfect accuracy. Each quantum gate may introduce a small error, for example $U_0 \to U_0(1 + O(\varepsilon))$, and although each error is small, the errors accumulate.

  • Overall, correcting quantum errors raises four difficulties.

    Phase error: $\ket 0 + \ket 1 \to \ket 0 - \ket 1$.

    Small error: the initial state $a \ket 0 + b \ket 1$ changes by an amount of order $\varepsilon$.

    Measurement causes disturbance.

    No cloning: quantum information cannot be copied with perfect fidelity.

1.8 Quantum error-correcting codes

  • Before we turn to quantum error correction, let’s look at how classical error correction works. The simplest example of a classical error-correcting code is a repetition code: we replace the bit we wish to protect by 3 copies of the bit,

    $$0 \to (000), \quad 1 \to (111).$$

    With a quantum code, we should be mindful of the requirement that we will need to be able to correct the errors without measuring any of the encoded information.

    Suppose that we encode a single qubit with 3 qubits:

    $$\ket{0} \to \ket{\overline{0}} = \ket{000}, \quad \ket{1} \to \ket{\overline{1}} = \ket{111}.$$

    For a 3-qubit state $\ket{xyz}$ we could measure the two-qubit observables $y \oplus z$ and $x \oplus z$. For both $\ket{xyz} = \ket{000}$ and $\ket{111}$ these would be 0, but if any one bit flips, then at least one of these quantities will be 1. In fact, if there is a single bit flip, the two bits

    $$(y \oplus z, x \oplus z)$$

    just designate in binary notation the position of the bit that flipped. For example, if the first qubit flips,

    $$\ket{000} \to \ket{100}, \quad \ket{111} \to \ket{011},$$

    the measurement of $(y \oplus z, x \oplus z)$ yields the result $(0, 1)$, which instructs us to flip the first bit.

    Now consider small errors, such as

    $$\ket{000} \to \ket{000} + \varepsilon \ket{100}, \quad \ket{111} \to \ket{111} + \varepsilon \ket{011}.$$

    In measuring $(y \oplus z, x \oplus z)$, we project onto an eigenstate of this observable. Most of the time (probability $1 - |\varepsilon|^2$) we obtain the result $(0, 0)$ and project the damaged state back to the original state, and so correct the error. Occasionally (probability $|\varepsilon|^2$) we obtain the result $(0, 1)$ and project onto the state in which the first qubit is flipped. But then the outcome instructs us to flip the first bit, which restores the original state.
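
The syndrome rule for the 3-qubit code above can be sketched as follows, assuming NumPy. The function names and the basis ordering (index $4x + 2y + z$) are assumptions of this illustration. It only handles states that are already syndrome eigenstates, that is, a definite single bit flip or none; the projection step for small errors is not simulated.

```python
import numpy as np

def encode(a, b):
    """a|0> + b|1>  ->  a|000> + b|111>, basis index 4x + 2y + z."""
    state = np.zeros(8, dtype=complex)
    state[0b000], state[0b111] = a, b
    return state

def flip(state, qubit):
    """Apply a bit flip to qubit 1, 2 or 3 (qubit 1 is the leftmost bit x)."""
    mask = 1 << (3 - int(qubit))
    out = np.empty_like(state)
    for idx in range(8):
        out[idx ^ mask] = state[idx]
    return out

def syndrome(state):
    """Return (y XOR z, x XOR z), shared by every populated basis state."""
    found = set()
    for idx in np.flatnonzero(np.abs(state) > 1e-12):
        idx = int(idx)
        x, y, z = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        found.add((y ^ z, x ^ z))
    assert len(found) == 1, "state is not a syndrome eigenstate"
    return found.pop()

def correct(state):
    """Read the syndrome as a binary position and undo that flip."""
    s = syndrome(state)
    position = 2 * s[0] + s[1]  # 0 means no error detected
    return state if position == 0 else flip(state, position)

logical = encode(0.6, 0.8j)
recovered = [correct(flip(logical, q)) for q in (1, 2, 3)]
print(all(np.allclose(r, logical) for r in recovered))
```

Note that `syndrome` never looks at the amplitudes `a` and `b`, only at parities, which reflects the requirement that errors be corrected without measuring the encoded information.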

  • To address phase errors, we encode a single qubit using nine qubits, according to

    $$\ket{+} \to \ket{\overline{+}} = \frac{1}{2^{3/2}} (\ket{000} + \ket{111})(\ket{000} + \ket{111})(\ket{000} + \ket{111}),$$

    $$\ket{-} \to \ket{\overline{-}} = \frac{1}{2^{3/2}} (\ket{000} - \ket{111})(\ket{000} - \ket{111})(\ket{000} - \ket{111}).$$

    Now suppose that a phase flip occurs in one of the clusters:

    $$\ket{000} + \ket{111} \to \ket{000} - \ket{111}.$$

    In this case, we need to measure a six-qubit observable to compare the signs of two clusters, for example, the observable that flips qubits 1 through 6. A pair of clusters with the same sign is an eigenstate with eigenvalue $+1$, and a pair of clusters with opposite signs is an eigenstate with eigenvalue $-1$. By measuring the six-qubit observable for a second pair of clusters, we can determine which cluster has a different sign than the others. For example, for

    $$\begin{align*} & \frac{1}{2^{3/2}} (\ket{000} + \ket{111})(\ket{000} + \ket{111})(\ket{000} + \ket{111}) \\ \to & \frac{1} {2^{3/2}}(\ket{000} - \ket{111})(\ket{000} + \ket{111})(\ket{000} + \ket{111}), \end{align*}$$

    measuring the observable that flips qubits 1 through 6 and the observable that flips qubits 4 through 9 gives the result $(-1, +1)$, which instructs us to change the sign of the first cluster.
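
The sign comparison above can be checked numerically on the full 512-dimensional state. This is a minimal sketch assuming NumPy; `cluster`, `codeword` and `flip_expectation` are hypothetical helper names, and qubits are numbered 1 to 9 from the left.

```python
import numpy as np
from functools import reduce

def cluster(sign):
    """(|000> + sign * |111>) / sqrt(2) as an 8-component vector."""
    v = np.zeros(8)
    v[0], v[7] = 1.0, sign
    return v / np.sqrt(2)

def codeword(signs):
    """Tensor product of three clusters with the given relative signs."""
    return reduce(np.kron, [cluster(s) for s in signs])

def flip_expectation(state, qubits):
    """Expectation value of the observable that flips every listed qubit."""
    mask = sum(1 << (9 - q) for q in qubits)
    flipped = state[np.arange(state.size) ^ mask]
    return float(np.vdot(state, flipped))

damaged = codeword([-1, +1, +1])  # phase flip in the first cluster
s12 = round(flip_expectation(damaged, range(1, 7)))   # compares clusters 1 and 2
s23 = round(flip_expectation(damaged, range(4, 10)))  # compares clusters 2 and 3
print(s12, s23)
```

Since the damaged state is an eigenstate of both six-qubit observables, each expectation value is exactly the eigenvalue that a measurement would return.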

1.9 Quantum hardware

  • To build hardware for a quantum computer, we’ll need technology that enables us to manipulate qubits. The hardware will need to meet some stringent specifications:

    1. Storage: We’ll need to store qubits for a long time, long enough to complete an interesting computation.
    2. Isolation: The qubits must be well isolated from the environment, to minimize decoherence errors.
    3. Readout: We’ll need to measure the qubits efficiently and reliably.
    4. Gates: We’ll need to manipulate the quantum states of individual qubits, and to induce controlled interactions among qubits, so that we can perform quantum gates.
    5. Precision: The quantum gates should be implemented with high precision if the device is to perform reliably.
  • There are many kinds of quantum hardware, such as ion traps and superconducting circuits. The introduction in the lecture notes seems outdated. To better understand these platforms, it is recommended to read relevant reviews from recent years.

 

 

 

Chap 2     States and Ensembles 📖

2.1 Axioms of quantum mechanics

  • Axiom 1. States.

    A state is a complete description of a physical system. In quantum mechanics, a state is a ray in a Hilbert space.

  • Axiom 2. Observables.

    An observable is a property of a physical system that in principle can be measured. In quantum mechanics, an observable is a self-adjoint operator.

  • Axiom 3. Measurement.

    A measurement is a process in which information about the state of a physical system is acquired by an observer. In quantum mechanics, the measurement of an observable $\bm{A}$ prepares an eigenstate of $\bm{A}$, and the observer learns the value of the corresponding eigenvalue. If the quantum state just prior to the measurement is $\ket{\psi}$, then the outcome $a_n$ is obtained with a priori probability

    $$\text{Prob}(a_n) = \|\bm{E_n} \ket{\psi}\|^2 = \bra{\psi} \bm{E_n} \ket{\psi},$$

    where $\bm{E_n}$ is the orthogonal projector onto the eigenspace of $\bm{A}$ with eigenvalue $a_n$; if the outcome $a_n$ is attained, then the (normalized) quantum state just after the measurement is

    $$\frac{\bm{E_n} \ket{\psi}}{\|\bm{E_n} \ket{\psi}\|}.$$

  • Axiom 4. Dynamics.

    Dynamics describes how a state evolves over time. In quantum mechanics, the time evolution of a closed system is described by a unitary operator.

  • Axiom 5. Composite Systems.

    If the Hilbert space of system $A$ is $\mathcal{H}_A$ and the Hilbert space of system $B$ is $\mathcal{H}_B$, then the Hilbert space of the composite system $AB$ is the tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$. If system $A$ is prepared in the state $\ket{\psi}_{A}$ and system $B$ is prepared in the state $\ket{\varphi}_{B}$, then the composite system’s state is the product $\ket{\psi}_A \otimes \ket{\varphi}_B$.
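
The measurement axiom (Axiom 3) can be illustrated numerically. The sketch below assumes NumPy; `measure` is a hypothetical helper that builds the projector onto each eigenspace of the observable and returns the outcome probability and the normalized post-measurement state. The example state and observable are arbitrary choices for this illustration.

```python
import numpy as np

def measure(observable, psi):
    """Return a list of (eigenvalue, probability, post-measurement state)."""
    eigvals, eigvecs = np.linalg.eigh(observable)
    outcomes = []
    for a in np.unique(np.round(eigvals, 12)):
        cols = eigvecs[:, np.isclose(eigvals, a)]
        E = cols @ cols.conj().T  # projector onto the eigenspace of a
        projected = E @ psi
        prob = float(np.vdot(psi, projected).real)
        post = projected / np.linalg.norm(projected) if prob > 1e-12 else None
        outcomes.append((float(a), prob, post))
    return outcomes

sigma_3 = np.array([[1, 0], [0, -1]], dtype=complex)
psi = np.array([0.6, 0.8j])  # normalized: 0.36 + 0.64 = 1
outcomes = measure(sigma_3, psi)
for value, prob, post in outcomes:
    print(value, round(prob, 2), post)
```

Grouping equal eigenvalues before building the projector matters for degenerate observables, where the post-measurement state is the projection onto the whole eigenspace rather than onto a single eigenvector.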

2.2 The qubit

  • The smallest nontrivial Hilbert space is two-dimensional. We may denote an orthonormal basis for a two-dimensional vector space as $\{\ket{0}, \ket{1}\}$. A qubit is a quantum system described by a two-dimensional Hilbert space, whose state can take any value of the form

    $$\ket{\psi} = \alpha \ket{0} + \beta \ket{1},$$

    where $\alpha$ and $\beta$ are complex numbers satisfying the normalization condition $|\alpha|^2 + |\beta|^2 = 1$.

2.2.1 Spin-1/2

  • $\ket{0}$ and $\ket{1}$ are the spin up ($\ket{\uparrow}$) and spin down ($\ket{\downarrow}$) states along a particular axis such as the z-axis.

    A finite rotation is expressed as (here $\hbar = 1$)

    $$\bm{R}(\hat{n}, \theta) = e^{-i \theta \hat{n} \cdot \bm{J}}.$$

    Rotations about distinct axes don’t commute. From elementary properties of rotations, we find the commutation relations

    $$[\bm{J_i}, \bm{J_j}] = i \varepsilon_{ijk} \bm{J_k},$$

    where $\varepsilon_{ijk}$ is the totally antisymmetric tensor with $\varepsilon_{123} = 1$, and repeated indices are summed. The operators $\bm{J_i}$ are the generators of rotations. For a spin-1/2 system, the operators $\bm{J_i}$ can be represented by the Pauli matrices:

    $$\bm{J_i} = \frac{\bm{\sigma_i}}{2},$$

    where

    $$\bm{\sigma}_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, ~~~ \bm{\sigma}_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, ~~~ \bm{\sigma}_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

    It is easy to check that the Pauli matrices satisfy the relations

    $$[\bm{\sigma_i}, \bm{\sigma_j}] = 2 i \varepsilon_{ijk} \bm{\sigma_k}, ~~~ \{\bm{\sigma_i}, \bm{\sigma_j}\} = 2 \delta_{ij} \bm{I}.$$

    Then we have

    $$\bm{\sigma_i} \bm{\sigma_j} = \frac{1}{2} (\{\bm{\sigma_i}, \bm{\sigma_j}\} + [\bm{\sigma_i}, \bm{\sigma_j}]) = \delta_{ij} \bm{I} + i \varepsilon_{ijk} \bm{\sigma_k}.$$

    Now consider the dot product of a vector and the Pauli operators

    $$\hat{n} \cdot \bm{\sigma} = n_1 \bm{\sigma}_1 + n_2 \bm{\sigma}_2 + n_3 \bm{\sigma}_3 = \begin{pmatrix} n_3 & n_1 - i n_2 \\ n_1 + i n_2 & -n_3 \end{pmatrix},$$

    where $\hat{n} = (n_1, n_2, n_3)$ is a unit vector. Then we have

    $$(\hat{n} \cdot \bm{\sigma}) (\hat{n} \cdot \bm{\sigma}) = \hat{n}^2 \bm{I} = \bm{I}.$$

    By expanding the exponential series and using $(\hat{n} \cdot \bm{\sigma})^2 = \bm{I}$ to separate even and odd powers, we see that a finite rotation about the axis $\hat{n}$ by an angle $\theta$ is represented as

    $$\begin{align*} \bm{R}(\hat{n}, \theta) &= \sum_{k=0}^\infty \frac{1}{k!} (-i \hat{n} \cdot \bm{\sigma})^k (\theta/2)^k \\ &= \bm{I} \sum_{k=0}^\infty \frac{(-1)^k}{(2k)!} (\theta/2)^{2k} - i \hat{n} \cdot \bm{\sigma} \sum_{k=0}^\infty \frac{(-1)^k}{(2k+1)!} (\theta/2)^{2k+1} \\ &= \cos(\theta/2) \bm{I} - i \hat{n} \cdot \bm{\sigma}\sin(\theta/2), \end{align*}$$

    which means that a rotation about the axis $\hat{n}$ by an angle $\theta$ is represented by a unitary operator that is a linear combination of the identity operator and the operator $\hat{n} \cdot \bm{\sigma}$.
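
The Pauli product rule and the closed form of the rotation operator can be verified numerically. This is a sketch assuming NumPy; the helper names, the test axis and the test angle are arbitrary choices for this illustration, and indices run over 0, 1, 2 instead of 1, 2, 3.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# Check sigma_i sigma_j = delta_ij I + i eps_ijk sigma_k for all i, j.
product_rule_holds = all(
    np.allclose(
        sigma[i] @ sigma[j],
        (i == j) * I2 + 1j * sum(levi_civita(i, j, k) * sigma[k] for k in range(3)),
    )
    for i in range(3) for j in range(3)
)

def rotation(n, theta):
    """Closed form cos(theta/2) I - i sin(theta/2) n.sigma for a unit vector n."""
    n_sigma = np.einsum("i,ijk->jk", n, sigma)
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * n_sigma

def rotation_series(n, theta, terms=40):
    """Partial sum of the series for exp(-i theta n.sigma / 2)."""
    A = -1j * (theta / 2) * np.einsum("i,ijk->jk", n, sigma)
    total, term = I2.copy(), I2.copy()
    for m in range(1, terms):
        term = term @ A / m
        total = total + term
    return total

n = np.array([1.0, 2.0, 2.0]) / 3.0  # unit vector
theta = 0.7
print(product_rule_holds, np.allclose(rotation(n, theta), rotation_series(n, theta)))
```

Comparing against the truncated series, rather than a library matrix exponential, keeps the check tied to the derivation in the text.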

    Therefore, we can construct eigenstates of angular momentum along the axis $\hat{n} = (\sin \theta \cos \phi, \sin \theta \sin \phi, \cos \theta)$ as

    $$\ket{\hat{n}+} = \bm{R}(\hat{y'}, \theta) \ket{0} = \begin{pmatrix} \cos(\theta/2) \\ e^{i \phi} \sin(\theta/2) \end{pmatrix}, \quad \ket{\hat{n}-} = \bm{R}(\hat{y'}, \theta) \ket{1} = \begin{pmatrix} - e^{- i \phi} \sin(\theta/2) \\ \cos(\theta/2) \end{pmatrix},$$

    where $\hat{y'} = (-\sin \phi, \cos \phi, 0)$ is a unit vector orthogonal to $\hat{n}$ and $\hat{z}$.

    In order to make the form more symmetrical, we can rewrite the states as

    $$\ket{\hat{n'}+} = \bm{R}(\hat{y'}, \theta) \bm{R}(\hat{z}, \phi) \ket{0} = \begin{pmatrix} e^{- i \phi /2} \cos(\theta/2) \\ e^{i \phi /2} \sin(\theta/2) \end{pmatrix}, \quad \ket{\hat{n'}-} = \bm{R}(\hat{y'}, \theta) \bm{R}(\hat{z}, \phi) \ket{1} = \begin{pmatrix} - e^{- i \phi /2} \sin(\theta/2) \\ e^{i \phi /2} \cos(\theta/2) \end{pmatrix}.$$

    These are two different representations of the same state, differing by at most a phase. The states $\ket{\hat{n}+}$ and $\ket{\hat{n}-}$ are orthonormal eigenstates of the operator $\hat{n} \cdot \bm{\sigma}$ with eigenvalues $+1$ and $-1$, respectively.
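
The eigenvalue claim for the first pair of states above can be checked at arbitrary angles. A minimal sketch assuming NumPy follows; the function names and the test angles are illustrative choices.

```python
import numpy as np

def n_dot_sigma(theta, phi):
    """Matrix of n.sigma for n = (sin t cos p, sin t sin p, cos t)."""
    n1 = np.sin(theta) * np.cos(phi)
    n2 = np.sin(theta) * np.sin(phi)
    n3 = np.cos(theta)
    return np.array([[n3, n1 - 1j * n2], [n1 + 1j * n2, -n3]])

def spin_up(theta, phi):
    """Column vector (cos(t/2), e^{ip} sin(t/2))."""
    return np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])

def spin_down(theta, phi):
    """Column vector (-e^{-ip} sin(t/2), cos(t/2))."""
    return np.array([-np.exp(-1j * phi) * np.sin(theta / 2), np.cos(theta / 2)])

theta, phi = 1.1, 2.3  # arbitrary test angles
M = n_dot_sigma(theta, phi)
up, down = spin_up(theta, phi), spin_down(theta, phi)

print(np.allclose(M @ up, up), np.allclose(M @ down, -down))
print(np.isclose(abs(np.vdot(up, down)), 0.0))
```

Multiplying `up` and `down` by the phase factors $e^{-i\phi/2}$ and $e^{i\phi/2}$ respectively gives the symmetric form, and leaves both checks unchanged.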

    In the special case $\theta = \pi/2, \phi = 0$ (the $\hat{x}$-axis), we have

    $$\ket{\hat{x}+} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, ~~~ \ket{\hat{x}-} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

    and for $\theta = \pi/2, \phi = \pi/2$ (the $\hat{y}$-axis), we have

    $$\ket{\hat{y}+} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, ~~~ \ket{\hat{y}-} = \frac{1}{\sqrt{2}} \begin{pmatrix} i \\ 1 \end{pmatrix}.$$

2.2.2 Photon polarization

  • Another important two-state system is provided by a photon, which can have two independent polarizations.

    We can choose the basis states to be the states of horizontal polarization $\ket{H}$ and vertical polarization $\ket{V}$. Under a rotation about the axis of propagation, the two linear polarization states transform as

    $$\ket{H} \to \cos \theta \ket{H} + \sin \theta \ket{V}, ~~~ \ket{V} \to -\sin \theta \ket{H} + \cos \theta \ket{V},$$

    which can be represented by a $2 \times 2$ rotation matrix

    $$\begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}.$$

    The matrix has two eigenstates, the states of circular polarization:

    $$\ket{R} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix}, ~~~ \ket{L} = \frac{1}{\sqrt{2}} \begin{pmatrix} i \\ 1 \end{pmatrix},$$

    with eigenvalues $e^{-i\theta}$ and $e^{i\theta}$, respectively; these are the states of right and left circular polarization. Since the rotation matrix equals $e^{-i\theta \bm{\sigma}_2}$, these are the eigenstates of the rotation generator $\bm{\sigma}_2$ with eigenvalues $\pm 1$. Because the eigenvalues are $\pm 1$ (not $\pm 1/2$) we say that the photon has spin-1.

2.3 The density operator

2.3.1 The bipartite quantum system

  • Having understood everything about a single qubit, we are ready to address systems with two qubits. Interaction among multiple qubits is essential for quantum computation, and the most distinctive feature of quantum mechanics, nonlocal correlation, arises only in systems with two or more qubits.

    When we study open systems, that is, when we limit our attention to just part of a larger system, it is generally impossible to assign a single state vector to system $A$ alone. Thus, we must use a different powerful tool, the density operator (which will be introduced later), to describe the state of the system. We will find that (contrary to the axioms):

    1. States are not rays.

    2. Measurements are not orthogonal projections.

    3. Evolution is not unitary.

    To arrive at the laws obeyed by open quantum systems, we must recall our fifth axiom, which relates the description of a composite quantum system to the description of its component parts. As a first step toward understanding the quantum description of an open system, we consider a bipartite system consisting of two subsystems, $A$ and $B$.

    The Hilbert space of a bipartite system $AB$ is the tensor product $\mathcal{H}_A \otimes \mathcal{H}_B$ of the Hilbert spaces of the two subsystems. For two qubits, we may denote an orthonormal basis for the space as $\{\ket{00}, \ket{01}, \ket{10}, \ket{11}\}$. A general state of the composite system can be expressed as

    $$\ket{\psi} = \sum_{i,j} c_{ij} \ket{ij},$$

    where $\ket{ij} = \ket{i}_A \otimes \ket{j}_B$ and the complex coefficients $c_{ij}$ satisfy the normalization condition $\sum_{i,j} |c_{ij}|^2 = 1$.

    An observable acting on both qubits $A$ and $B$, $\bm{M} = \bm{M}_A \otimes \bm{M}_B$, has expectation value

    $$\langle \bm{M} \rangle = \bra{\psi} (\bm{M}_A \otimes \bm{M}_B) \ket{\psi} = \text{tr}(\bm{M} \bm{\rho}),$$

    where $\bm{\rho} = \ket{\psi} \bra{\psi}$ is the density operator for the composite system.

    An observable acting on qubit $A$ only can be expressed as

    $$\bm{M}_A \otimes \bm{I}_B,$$

    where $\bm{M}_A$ is a Hermitian operator acting on $\mathcal{H}_A$, and $\bm{I}_B$ is the identity operator acting on $\mathcal{H}_B$.

    The expectation value of the observable in the state $\ket{\psi}$ is:

    $$\begin{align*} \langle \bm{M}_A \otimes \bm{I}_B \rangle &= \bra{\psi} (\bm{M}_A \otimes \bm{I}_B) \ket{\psi} \\ &= \sum_{i,j,i',j'} c_{ij}^* c_{i'j'} \bra{ij} (\bm{M}_A \otimes \bm{I}_B) \ket{i'j'} \\ &= \sum_{i,j,i',j'} c_{ij}^* c_{i'j'} \bra{i} \bm{M}_A \ket{i'} \bra{j} \bm{I}_B \ket{j'} \\ &= \sum_{i,i'} \alpha_{ii'} \bra{i} \bm{M}_A \ket{i'} \\ &= \text{tr}(\bm{M}_A \bm{\rho}_A), \end{align*}$$

    where $\alpha_{ii'} = \sum_{j} c_{ij}^* c_{i'j}$ and $\bm{\rho}_A = \sum_{ii'} \alpha_{ii'} \ket{i'} \bra{i}$. The operator $\bm{\rho}_A$ is called the density operator (or density matrix) for qubit $A$.

    $\bm{\rho}_A$ can be obtained from $\bm{\rho}$ by taking the partial trace over subsystem $B$:

    $$\bm{\rho}_A = \text{tr}_B(\bm{\rho}) = \sum_{j} (\bm{I}_A \otimes \bra{j}_B) \bm{\rho} (\bm{I}_A \otimes \ket{j}_B) = \sum_{i,j,i'} c_{ij}^* c_{i'j} \ket{i'} \bra{i}.$$

    The density operator $\bm{\rho}$ has the following properties:

    1. $\bm{\rho}$ is Hermitian: $\bm{\rho} = \bm{\rho}^\dagger$.
    2. $\bm{\rho}$ is positive: for any vector $\ket{\phi}$, $\bra{\phi} \bm{\rho} \ket{\phi} \ge 0$.
    3. $\bm{\rho}$ has unit trace: $\text{tr}(\bm{\rho}) = 1$.
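
The partial trace and the three properties listed above can be checked numerically. This is a minimal sketch assuming NumPy; `partial_trace_B` is a hypothetical helper name, the coefficients $c_{ij}$ are drawn at random purely for illustration, and the composite basis index is $2i + j$.

```python
import numpy as np

def partial_trace_B(rho, d_A, d_B):
    """Trace out subsystem B from a (d_A*d_B) x (d_A*d_B) density matrix."""
    return np.einsum("ijkj->ik", rho.reshape(d_A, d_B, d_A, d_B))

# Random normalized two-qubit state with coefficients c[i, j].
rng = np.random.default_rng(1)
c = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
c /= np.linalg.norm(c)
psi = c.reshape(-1)  # composite index 2*i + j
rho = np.outer(psi, psi.conj())
rho_A = partial_trace_B(rho, 2, 2)

# Expectation value of M_A (x) I_B computed two ways.
M_A = np.array([[0, 1], [1, 0]], dtype=complex)
lhs = np.vdot(psi, np.kron(M_A, np.eye(2)) @ psi).real
rhs = np.trace(M_A @ rho_A).real
print(np.isclose(lhs, rhs))

# Hermitian, positive, unit trace.
print(np.allclose(rho_A, rho_A.conj().T),
      bool(np.all(np.linalg.eigvalsh(rho_A) > -1e-12)),
      np.isclose(np.trace(rho_A).real, 1.0))
```

In matrix form the reduced state is simply `c @ c.conj().T`, which matches the sum $\sum_j c_{i'j} c_{ij}^*$ in the formula for $\bm{\rho}_A$.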

  • If the state is a pure state $\ket{\psi}$, then the density matrix $\bm{\rho} = \ket{\psi} \bra{\psi}$ is the projection onto the one-dimensional space spanned by $\ket{\psi}$. Otherwise the state is mixed.

    A general density matrix, expressed in the basis $\{\ket{i}\}$ in which it is diagonal, has the form

    $$\bm{\rho} = \sum_{i} p_i \ket{i} \bra{i},$$

    where $0 < p_i \le 1$ and $\sum_{i} p_i = 1$. A useful criterion for distinguishing pure states from mixed states is provided by the trace of the square of the density operator. If $\bm{\rho}$ represents a pure state, then $\bm{\rho}^2 = \bm{\rho}$, and so $\text{tr}(\bm{\rho}^2) = \text{tr}(\bm{\rho}) = 1$. On the other hand, if $\bm{\rho}$ represents a mixed state, then at least two of the $p_i$ are nonzero and $\text{tr}(\bm{\rho}^2) = \sum_i p_i^2 < 1$. Thus we have the criterion

    $$\begin{cases} \text{tr}(\bm{\rho}^2) = 1 & \text{pure state}\\ \text{tr}(\bm{\rho}^2) < 1 & \text{mixed state} \end{cases}$$

    The expectation value of any observable $\bm{M}$ in a state described by the density operator $\bm{\rho}$ is given by

    $$\langle \bm{M} \rangle = \text{tr}(\bm{M} \bm{\rho}) = \sum_{i} p_i \bra{i} \bm{M} \ket{i}.$$

2.3.2 Bloch sphere

  • Let’s return to the case in which system $A$ is a single qubit, and consider the form of the general density matrix. The most general self-adjoint $2 \times 2$ matrix has four real parameters, and can be expanded in the basis $\{\bm{I}, \bm{\sigma}_1, \bm{\sigma}_2, \bm{\sigma}_3\}$. Since each $\bm{\sigma_i}$ is traceless, the coefficient of $\bm{I}$ in the expansion of a density matrix $\bm{\rho}$ must be $\frac{1}{2}$ (so that $\text{tr}(\bm{\rho}) = 1$), and $\bm{\rho}$ may be expressed as

    $$\bm{\rho} = \bm{\rho} (\bm{P}) = \frac{1}{2} (\bm{I} + \bm{P} \cdot \bm{\sigma}) = \frac{1}{2} \begin{pmatrix} 1 + P_3 & P_1 - i P_2 \\ P_1 + i P_2 & 1 - P_3 \end{pmatrix},$$

    where $P_1, P_2, P_3$ are real numbers. We can compute $\det \bm{\rho} = \frac{1}{4} (1 - \bm{P}^2)$. Therefore, a necessary condition for $\bm{\rho}$ to have nonnegative eigenvalues is $\det \bm{\rho} \geq 0$, or $\bm{P}^2 \leq 1$. This condition is also sufficient: since $\text{tr} \bm{\rho} = 1$, it is not possible for $\bm{\rho}$ to have two negative eigenvalues. Thus, there is a one-to-one correspondence between the possible density matrices of a single qubit and the points of the unit 3-dimensional ball $0 \le |\bm{P}| \le 1$. This ball is usually called the Bloch sphere (although it is really a ball, not a sphere).

    The pure states correspond to the points on the surface of the Bloch sphere, where $|\bm{P}| = 1$. The mixed states correspond to the points in the interior of the Bloch sphere, where $|\bm{P}| < 1$. The completely mixed state $\bm{\rho} = \frac{1}{2} \bm{I}$ corresponds to the center of the Bloch sphere, where $\bm{P} = 0$.

    The vector $\bm{P}$ is called the polarization vector or the Bloch vector. The expectation value of the spin operator $\hat{n} \cdot \bm{\sigma}$ in the state $\bm{\rho}$ is

    $$\langle \hat{n} \cdot \bm{\sigma} \rangle _{\bm{\rho}} = \text{tr}(\hat{n} \cdot \bm{\sigma} \bm{\rho} (\bm{P})) = \hat{n} \cdot \bm{P}.$$

    The derivation of the above formula uses the identity

    $$\text{tr}(\bm{\sigma_i} \bm{\sigma_j}) = 2 \delta_{ij}.$$
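
The correspondence between density matrices and Bloch vectors can be made concrete. The sketch below assumes NumPy; `rho_from_bloch`, `bloch_from_rho` and the sample vector `P` are choices made for this illustration. It also evaluates the purity, which for a single qubit works out to $\text{tr}(\bm{\rho}^2) = \frac{1}{2}(1 + \bm{P}^2)$.

```python
import numpy as np

SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def rho_from_bloch(P):
    """rho(P) = (I + P.sigma) / 2."""
    P = np.asarray(P, dtype=float)
    return 0.5 * (np.eye(2) + np.einsum("i,ijk->jk", P, SIGMA))

def bloch_from_rho(rho):
    """P_i = tr(sigma_i rho), using tr(sigma_i sigma_j) = 2 delta_ij."""
    return np.real([np.trace(s @ rho) for s in SIGMA])

P = np.array([0.3, -0.4, 0.5])  # |P|^2 = 0.5, a mixed state
rho = rho_from_bloch(P)

det = np.linalg.det(rho).real
purity = np.trace(rho @ rho).real
print(np.allclose(bloch_from_rho(rho), P))
print(np.isclose(det, (1 - P @ P) / 4), np.isclose(purity, (1 + P @ P) / 2))
```

A vector with $|\bm{P}| = 1$ gives determinant 0 and purity 1, which is the pure-state boundary of the ball.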

2.4 Schmidt decomposition

  • A bipartite pure state can be expressed in a standard form (the Schmidt decomposition) that is often very useful.

    Let $\mathcal{H}_A$ and $\mathcal{H}_B$ be the Hilbert spaces of systems $A$ and $B$ with dimensions $d_A$ and $d_B$, respectively. Let $\{\ket{i}_A\}$ and $\{\ket{j}_B\}$ be orthonormal bases for $\mathcal{H}_A$ and $\mathcal{H}_B$. An arbitrary vector in $\mathcal{H}_A \otimes \mathcal{H}_B$ can be expanded as

    $$\ket{\psi}_{AB} = \sum_{i} \sum_{j} c_{ij} \ket{i}_A \otimes \ket{j}_B = \sum_{i} \ket{i}_A \otimes \ket{\tilde{i}}_B,$$

    where the complex coefficients $c_{ij}$ satisfy the normalization condition $\sum_{i,j} |c_{ij}|^2 = 1$ and $\ket{\tilde{i}}_B = \sum_{j} c_{ij} \ket{j}_B$ are (not necessarily normalized or orthogonal) vectors in $\mathcal{H}_B$.

    Suppose the $\{\ket{i}_A\}$ basis is chosen to be the basis in which $\bm{\rho}_A$ is diagonal,

    $$\bm{\rho}_A = \sum_{i} p_i \ket{i}_A \bra{i}_A,$$

    where $p_i$ are the eigenvalues of $\bm{\rho}_A$ and $\ket{i}_A$ are the corresponding eigenvectors. We can also obtain $\bm{\rho}_A$ from $\ket{\psi}$ by taking the partial trace over subsystem $B$:

    $$\bm{\rho}_A = \text{tr}_B(\ket{\psi} \bra{\psi}) = \sum_{i,j,k} \ket{i}_A \bra{j}_A \langle k | \tilde{i} \rangle_B \langle \tilde{j} | k \rangle_B = \sum_{i,j} \ket{i}_A \bra{j}_A \langle \tilde{j} | \tilde{i} \rangle_B,$$

    where we have used the completeness relation $\sum_k \ket{k}_B \bra{k}_B = \bm{I}_B$. Comparing the two expressions for $\bm{\rho}_A$, we see that

    $$\langle \tilde{j} | \tilde{i} \rangle_B = p_i \delta_{ij},$$

    which means that the vectors $\ket{\tilde{i}}_B$ are orthogonal, and their norms are $\sqrt{p_i}$. For each $p_i \neq 0$ we obtain orthonormal vectors by rescaling,

    $$\ket{i}_B = \frac{1}{\sqrt{p_i}} \ket{\tilde{i}}_B.$$

    Thus we can express $\ket{\psi}$ as

    $$\ket{\psi} = \sum_{i} \sqrt{p_i} \ket{i}_A \otimes \ket{i}_B.$$

    The above equation is the Schmidt decomposition of the bipartite pure state $\ket{\psi}_{AB}$. It is instructive to compare it with the expansion of $\ket{\psi}_{AB}$ in a generic orthonormal basis

    $$\ket{\psi}_{AB} = \sum_{a} \sum_{b} c_{ab} \ket{a}_A \otimes \ket{b}_B.$$

    Here $\ket{a}_A$ and $\ket{b}_B$ are not necessarily eigenstates of their respective density matrices. The coefficients $c_{ab}$ can be regarded as the matrix elements of a $d_A \times d_B$ matrix $C$. By performing a singular value decomposition of the matrix $C$, we can express it as

    $$C = U_A \Lambda U_B^T,$$

    where $U_A$ is a $d_A \times d_A$ unitary matrix, $U_B$ is a $d_B \times d_B$ unitary matrix, and $\Lambda$ is a $d_A \times d_B$ diagonal matrix with nonnegative real entries. The diagonal entries of $\Lambda$ are called the singular values of $C$. If we define new orthonormal bases for $\mathcal{H}_A$ and $\mathcal{H}_B$ by

    $$\ket{i}_A = \sum_{a} (U_A)_{ai} \ket{a}_A, ~~~ \ket{i}_B = \sum_{b} (U_B)_{bi} \ket{b}_B, ~~~ \Lambda_{ij} = \sqrt{p_i} \delta_{ij},$$

    then we can express the state $\ket{\psi}$ in the new bases as

    $$\ket{\psi}_{AB} = \sum_{i=1}^{r} \sqrt{p_i} \ket{i}_A \otimes \ket{i}_B,$$

    where $\sqrt{p_i}$ are the nonzero singular values of $C$, and $r$ is the rank of the matrix $C$. This reproduces the Schmidt decomposition of the state $\ket{\psi}$.

  • The Schmidt decomposition has several important properties:

    1. The number of nonzero terms in the Schmidt decomposition, $r$, is called the Schmidt rank of the state $\ket{\psi}_{AB}$. The Schmidt rank is at most $\min(d_A, d_B)$.

    2. The reduced density matrices $\bm{\rho}_A$ and $\bm{\rho}_B$ have the same nonzero eigenvalues $p_i$, which are the squares of the Schmidt coefficients $\sqrt{p_i}$.

    3. The Schmidt decomposition is unique, up to compensating phases of the basis vectors, if $\bm{\rho}_A$ (and hence $\bm{\rho}_B$) has no degenerate eigenvalues other than zero.

    4. A bipartite pure state $\ket{\psi}_{AB}$ is entangled if and only if its Schmidt rank is greater than 1.
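
The SVD route to the Schmidt decomposition can be checked numerically. The following is a minimal sketch, assuming NumPy and the ordering convention that the amplitude of $\ket{a}_A \otimes \ket{b}_B$ sits at index $a\,d_B + b$ of the state vector; the function name `schmidt_decomposition` is hypothetical and chosen only for illustration.

```python
import numpy as np

def schmidt_decomposition(psi, d_a, d_b):
    """Return (schmidt_coeffs, basis_a, basis_b) for a bipartite state vector."""
    # Reshape the amplitudes c_ab into the d_A x d_B coefficient matrix C.
    c = np.asarray(psi, dtype=complex).reshape(d_a, d_b)
    # NumPy returns C = u @ diag(s) @ vh, so U_A = u and U_B^T = vh.
    u, s, vh = np.linalg.svd(c, full_matrices=False)
    # Column i of u and of vh.T holds the i-th Schmidt vector of A and B.
    return s, u, vh.T

# Entangled example: the Bell state (|00> + |11>)/sqrt(2).
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
s_bell, a_vecs, b_vecs = schmidt_decomposition(bell, 2, 2)
schmidt_rank_bell = int(np.sum(s_bell > 1e-12))

# Product example: |0> ⊗ |+>.
product = np.kron([1, 0], np.array([1, 1]) / np.sqrt(2))
s_prod, _, _ = schmidt_decomposition(product, 2, 2)
schmidt_rank_product = int(np.sum(s_prod > 1e-12))

# Rebuild the Bell state from its Schmidt form sum_i sqrt(p_i) |i>_A |i>_B.
rebuilt = sum(s_bell[i] * np.kron(a_vecs[:, i], b_vecs[:, i])
              for i in range(len(s_bell)))
```

The Schmidt rank separates the two examples: it is 2 for the Bell state and 1 for the product state, in line with property 4.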

2.5 Ensemble#

2.5.1 Convexity#

  • A density operator $\bm{\rho}$ can be expressed as a convex combination of pure states:

    $$\bm{\rho} = \sum_{i} p_i \ket{\psi_i} \bra{\psi_i},$$

    where $0 < p_i \le 1$ and $\sum_{i} p_i = 1$. The states $\{\ket{\psi_i}\}$ are all normalized vectors, but we do not assume that they are mutually orthogonal. The set of all density operators is convex, meaning that if $\bm{\rho}_1$ and $\bm{\rho}_2$ are density operators, then any convex combination of them is also a density operator:

    $$\bm{\rho} = \lambda \bm{\rho}_1 + (1 - \lambda) \bm{\rho}_2, \quad 0 \le \lambda \le 1.$$

    The extreme points of this convex set are the pure states, which cannot be expressed as a nontrivial convex combination of other density operators.

2.5.2 Ensemble preparation#

  • A density operator $\bm{\rho}$ can be prepared by an ensemble of pure states $\{(p_i, \ket{\psi_i})\}$, where $p_i$ is the probability of preparing the pure state $\ket{\psi_i}$. The ensemble preparation can be achieved by a classical random process that selects the state $\ket{\psi_i}$ with probability $p_i$.

    Different ensembles can yield the same density operator. For example, the completely mixed state $\bm{\rho} = \frac{1}{2} \bm{I}$ for a single qubit can be prepared by the ensemble $\{(\frac{1}{2}, \ket{0}), (\frac{1}{2}, \ket{1})\}$,

    $$\bm{\rho} = \frac{1}{2} \ket{0} \bra{0} + \frac{1}{2} \ket{1} \bra{1},$$

    or by the ensemble $\{(\frac{1}{2}, \ket{+}), (\frac{1}{2}, \ket{-})\}$,

    $$\bm{\rho} = \frac{1}{2} \ket{+} \bra{+} + \frac{1}{2} \ket{-} \bra{-},$$

    where $\ket{+} = \frac{1}{\sqrt{2}} (\ket{0} + \ket{1})$ and $\ket{-} = \frac{1}{\sqrt{2}} (\ket{0} - \ket{1})$.

    The non-uniqueness of ensemble preparation reflects the fact that the density operator contains all the information about the statistical properties of the quantum system, but does not specify how the system was prepared.
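
As a quick numerical illustration of this non-uniqueness, the sketch below builds the density operator from both qubit ensembles and compares them. It assumes NumPy; the helper name `ensemble_to_rho` is hypothetical.

```python
import numpy as np

def ensemble_to_rho(ensemble):
    """Sum p_i |psi_i><psi_i| over a list of (p_i, psi_i) pairs."""
    return sum(p * np.outer(psi, np.conj(psi)) for p, psi in ensemble)

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_minus = (ket0 - ket1) / np.sqrt(2)

# Two different preparations of the same qubit state.
rho_z = ensemble_to_rho([(0.5, ket0), (0.5, ket1)])
rho_x = ensemble_to_rho([(0.5, ket_plus), (0.5, ket_minus)])

# Both reduce to the completely mixed state I/2.
same_state = np.allclose(rho_z, rho_x) and np.allclose(rho_z, np.eye(2) / 2)
```

No measurement on the qubit alone can distinguish the two preparations, since every outcome probability depends only on the density operator.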

2.5.3 The HJW theorem#

  • Any density operator $\bm{\rho}_A$ acting on a Hilbert space $\mathcal{H}_A$ can be represented as the reduced density operator of a pure state $\ket{\Psi}_{AB}$ in a larger Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$. This process is called purification.

    To purify a density operator $\bm{\rho}_A$, we can perform the following steps:

    1. For a density matrix $\bm{\rho}_A$, consider one ensemble realization:

      $$\bm{\rho}_A = \sum_{i} q_i \ket{\varphi_i}_A \bra{\varphi_i}_A, \quad \sum_{i} q_i = 1.$$

      Here the states $\{\ket{\varphi_i}_A\}$ are all normalized vectors, but we do not assume that they are mutually orthogonal.

    2. Introduce an auxiliary Hilbert space $\mathcal{H}_B$ with an orthonormal basis $\{\ket{\alpha_i}_B\}$; its dimension must be at least the number of states in the ensemble.

    3. Construct the purified state $\ket{\Psi_1}_{AB}$ in the composite Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ as follows:

      $$\ket{\Psi_1}_{AB} = \sum_{i} \sqrt{q_i} \ket{\varphi_i}_A \otimes \ket{\alpha_i}_B.$$

    The reduced density operator of subsystem A is obtained by taking the partial trace over subsystem B:

    $$\bm{\rho}_A = \text{tr}_B(\ket{\Psi_1}_{AB} \bra{\Psi_1}_{AB}) = \sum_{i} q_i \ket{\varphi_i}_A \bra{\varphi_i}_A.$$

  • The purified state $\ket{\Psi_1}_{AB}$ is not unique; any unitary transformation applied to subsystem $B$ will yield a different purified state that still reduces to the same density operator $\bm{\rho}_A$ when traced over $B$. This non-uniqueness reflects the fact that there are many different ways to represent the same mixed state as a pure state in a larger Hilbert space.

    We can realize a different ensemble interpretation of $\bm{\rho}_A$ by performing a different measurement of $B$. So let

    $$\bm{\rho}_A = \sum_\mu r_\mu \ket{\phi_\mu}_A \bra{\phi_\mu}_A, \quad \sum_\mu r_\mu = 1,$$

    be another ensemble interpretation of $\bm{\rho}_A$. Then we can construct another purified state $\ket{\Psi_2}_{AB}$ as follows:

    $$\ket{\Psi_2}_{AB} = \sum_\mu \sqrt{r_\mu} \ket{\phi_\mu}_A \otimes \ket{\beta_\mu}_B,$$

    where $\{\ket{\beta_\mu}_B\}$ is another orthonormal basis in $\mathcal{H}_B$. $\ket{\Psi_1}_{AB}$ and $\ket{\Psi_2}_{AB}$ are two different purifications of the same density operator $\bm{\rho}_A$, so we can find their relation by using the Schmidt decomposition. Let

    $$\bm{\rho}_A = \sum_{i} p_i \ket{i}_A \bra{i}_A, \quad \sum_{i} p_i = 1,$$

    be the spectral decomposition of $\bm{\rho}_A$. Then we can express $\ket{\Psi_1}_{AB}$ and $\ket{\Psi_2}_{AB}$ in their Schmidt decompositions:

    $$\ket{\Psi_1}_{AB} = \sum_{i} \sqrt{p_i} \ket{i}_A \otimes \ket{i}_B,$$

    $$\ket{\Psi_2}_{AB} = \sum_{i} \sqrt{p_i} \ket{i}_A \otimes \ket{i'}_B,$$

    where $\{\ket{i}_B\}$ and $\{\ket{i'}_B\}$ are orthonormal bases in $\mathcal{H}_B$. Since both $\{\ket{i}_B\}$ and $\{\ket{i'}_B\}$ are orthonormal bases, there exists a unitary transformation $\bm{U}_B = \sum_{i} \ket{i'}_B \bra{i}_B$ such that

    $$\ket{\Psi_2}_{AB} = (\bm{I}_A \otimes \bm{U}_B) \ket{\Psi_1}_{AB}.$$

    This result is known as the Hughston-Jozsa-Wootters (HJW) theorem. It shows that any two ensemble interpretations of a given density operator can be related by a suitable unitary transformation on the auxiliary system used in the purification. The HJW theorem has important implications for quantum information theory, as it highlights the flexibility in representing mixed states and the role of auxiliary systems in quantum state preparation and manipulation.
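
The purification recipe and the freedom to act with a unitary on $B$ can be sketched numerically. This is an illustrative example under stated assumptions: NumPy, a two-state non-orthogonal ensemble picked arbitrarily, the standard basis of $B$ playing the role of $\{\ket{\alpha_i}_B\}$, and a Hadamard matrix as the unitary on $B$. The helper names are hypothetical.

```python
import numpy as np

def purify(ensemble, d_b):
    """Amplitudes psi[a, b] of sum_i sqrt(q_i) |phi_i>_A |alpha_i>_B,
    with |alpha_i>_B taken to be the i-th standard basis vector of B."""
    d_a = len(ensemble[0][1])
    psi = np.zeros((d_a, d_b), dtype=complex)
    for i, (q, phi) in enumerate(ensemble):
        psi[:, i] += np.sqrt(q) * np.asarray(phi)
    return psi

def reduced_rho_a(psi):
    """Partial trace over B of |Psi><Psi| for amplitudes psi[a, b]."""
    return psi @ psi.conj().T

ket0 = np.array([1, 0], dtype=complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
ensemble = [(0.7, ket0), (0.3, ket_plus)]   # non-orthogonal states

rho_a = sum(q * np.outer(phi, phi.conj()) for q, phi in ensemble)
psi_1 = purify(ensemble, d_b=2)

# Acting with a unitary on B alone gives a second purification.
hadamard = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
psi_2 = psi_1 @ hadamard.T   # amplitudes of (I_A ⊗ U_B)|Psi_1>
```

Both `psi_1` and `psi_2` reduce to the same `rho_a`, even though they are different vectors in the composite space.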

2.5.4 Quantum erasure#

  • The non-uniqueness of ensemble preparations for a given density matrix, as explained by the HJW theorem, is beautifully illustrated by the phenomenon of quantum erasure. It demonstrates that the information we can potentially acquire about a quantum system dictates its behavior (e.g., whether it exhibits wave-like interference), and that “erasing” this information can restore the seemingly lost quantum effects.

    Let’s consider a mixed state for a qubit A, which represents a particle that can take one of two paths, $\ket{0}_A$ or $\ket{1}_A$, with equal probability and no coherence between them:

    $$\bm{\rho}_A = \frac{1}{2} \ket{0}\bra{0} + \frac{1}{2} \ket{1}\bra{1} = \frac{1}{2}\bm{I}.$$

    This state will not produce an interference pattern because the off-diagonal elements of the density matrix are zero.

    As per the HJW theorem, we can “purify” this state by introducing an auxiliary system B and constructing an entangled pure state $\ket{\Psi}_{AB}$ such that $\text{tr}_B(\ket{\Psi}_{AB}\bra{\Psi}_{AB}) = \bm{\rho}_A$. One such purification is:

    $$\ket{\Psi}_{AB} = \frac{1}{\sqrt{2}} (\ket{0}_A \otimes \ket{0}_B + \ket{1}_A \otimes \ket{1}_B).$$

    Here, system B acts as a “which-path” detector. If system B is measured in the state $\ket{0}_B$, we know with certainty that system A is in state $\ket{0}_A$ (took path 0). If B is in $\ket{1}_B$, then A is in $\ket{1}_A$ (took path 1). As long as this which-path information exists in system B, system A will not show interference.

    The “erasure” happens when we perform a measurement on system B that makes it impossible to know whether B was in state $\ket{0}_B$ or $\ket{1}_B$. This is achieved by measuring B in a different basis, for example, the Hadamard basis $\{\ket{+}_B, \ket{-}_B\}$, where $\ket{\pm}_B = \frac{1}{\sqrt{2}}(\ket{0}_B \pm \ket{1}_B)$.

    To see the effect on A, we can rewrite the entangled state $\ket{\Psi}_{AB}$ in terms of this new basis for B. By substituting $\ket{0}_B = \frac{1}{\sqrt{2}}(\ket{+}_B + \ket{-}_B)$ and $\ket{1}_B = \frac{1}{\sqrt{2}}(\ket{+}_B - \ket{-}_B)$, we get:

    $$\begin{align*} \ket{\Psi}_{AB} &= \frac{1}{\sqrt{2}} \left[ \ket{0}_A \otimes \frac{1}{\sqrt{2}}(\ket{+}_B + \ket{-}_B) + \ket{1}_A \otimes \frac{1}{\sqrt{2}}(\ket{+}_B - \ket{-}_B) \right] \\ &= \frac{1}{2} \left[ (\ket{0}_A + \ket{1}_A) \otimes \ket{+}_B + (\ket{0}_A - \ket{1}_A) \otimes \ket{-}_B \right] \\ &= \frac{1}{\sqrt{2}} \left[ \ket{+}_A \otimes \ket{+}_B + \ket{-}_A \otimes \ket{-}_B \right] \end{align*}$$

    Now, if we measure system B and get the result $\ket{+}_B$, the state of system A collapses to the pure state $\ket{+}_A = \frac{1}{\sqrt{2}}(\ket{0}_A + \ket{1}_A)$. This state has maximum coherence between the two paths and will produce a specific interference pattern.

    If we instead measure system B and get the result $\ket{-}_B$, the state of system A collapses to the pure state $\ket{-}_A = \frac{1}{\sqrt{2}}(\ket{0}_A - \ket{1}_A)$. This state will also produce an interference pattern, but one that is phase-shifted relative to the first.

    By measuring B in the Hadamard basis, we have “erased” the which-path information. We can’t tell which path particle A took. In doing so, we can sort the detection events of A based on the measurement outcomes of B. The sub-ensemble of A-particles corresponding to the B-measurement $\ket{+}_B$ shows one interference pattern, and the sub-ensemble corresponding to $\ket{-}_B$ shows another. The sum of these two patterns washes out the interference, which is consistent with the initial mixed state $\bm{\rho}_A$. Thus, the interference is “recovered” through correlation, showcasing how different measurements on an auxiliary system can reveal different ensemble interpretations of a density operator.

  • Delayed Choice: The quantum eraser experiment becomes even more striking when considering the “delayed choice” aspect. The decision of which basis to use for measuring system B (the which-path detector) can be made after system A has already been detected. For example, we can let system A hit a screen, recording its position, and only later choose whether to measure system B in the path basis $\{\ket{0}_B, \ket{1}_B\}$ or the erasure basis $\{\ket{+}_B, \ket{-}_B\}$. When we later analyze the data from system A, we sort its detection events according to the subsequent measurement outcomes of B. If we chose the path basis for B, the sorted A-data shows no interference. If we chose the erasure basis for B, the sorted A-data reveals the interference patterns. At first sight, this suggests that the behavior of particle A (whether it acts like a particle or a wave) is not determined until a future measurement is made on its entangled partner B.

    It is crucial to emphasize that this doesn’t mean information travels back in time; rather, it highlights that we cannot attribute a definite classical history to a quantum system before the measurement is fully completed. An observer looking only at the total detection results for system A will never see an interference pattern in real time. The interference fringes only emerge during classical post-processing, when the results from A are correlated with the measurement outcomes from B. Here is why:

    1. Two possible outcomes for A: When we choose to measure B in the erasure basis $\{\ket{+}_B, \ket{-}_B\}$, there are two possible collapsed states for A, each occurring with 50% probability:

      • If B is measured as $\ket{+}_B$, system A collapses to $\ket{\psi_+}_A = \frac{1}{\sqrt{2}} (\ket{0}_A + \ket{1}_A)$.

      • If B is measured as $\ket{-}_B$, system A collapses to $\ket{\psi_-}_A = \frac{1}{\sqrt{2}} (\ket{0}_A - \ket{1}_A)$.

    2. Opposite interference patterns: Let’s denote the wavefunction at a position $x$ on the screen from path 0 and path 1 as $\psi_0(x)$ and $\psi_1(x)$ respectively. The probability distribution (i.e., the interference pattern) for each case is:

      $$P(x | +_B) = |\langle x | \psi_+ \rangle_A|^2 = \frac{1}{2} |\psi_0(x) + \psi_1(x)|^2 = \frac{1}{2} (|\psi_0|^2 + |\psi_1|^2 + 2\text{Re}(\psi_0^* \psi_1)),$$

      $$P(x | -_B) = |\langle x | \psi_- \rangle_A|^2 = \frac{1}{2} |\psi_0(x) - \psi_1(x)|^2 = \frac{1}{2} (|\psi_0|^2 + |\psi_1|^2 - 2\text{Re}(\psi_0^* \psi_1)).$$

      Notice the opposite sign of the interference term $2\text{Re}(\psi_0^* \psi_1)$. This means the two patterns are perfectly out of phase: the bright fringes of one pattern correspond exactly to the dark fringes of the other.

    3. Total pattern washes out: The observer of A, without knowledge of B’s measurement, sees the sum of these two mutually exclusive outcomes, weighted by their probabilities (each 50%):

      $$\begin{align*} P_{\text{total}}(x) &= \frac{1}{2} P(x | +_B) + \frac{1}{2} P(x | -_B) \\ &= \frac{1}{4} (|\psi_0|^2 + |\psi_1|^2 + 2\text{Re}(\psi_0^* \psi_1)) + \frac{1}{4} (|\psi_0|^2 + |\psi_1|^2 - 2\text{Re}(\psi_0^* \psi_1)) \\ &= \frac{1}{2} (|\psi_0|^2 + |\psi_1|^2) \end{align*}$$

    The interference terms cancel perfectly. The resulting total pattern is a featureless distribution, identical to the one described by the mixed state $\bm{\rho}_A = \frac{1}{2}\bm{I}$. The choice made on B only determines which patterns can be retrospectively sorted and revealed from the total, seemingly random data set. This also explains why faster-than-light communication is impossible using entanglement and quantum erasure: the interference patterns only emerge when the data from both systems are compared after the fact, which requires classical communication.
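
The sorting argument can be reproduced for the two-qubit state itself. The sketch below is illustrative only: it assumes NumPy, stores the amplitudes of $\ket{\Psi}_{AB}$ as a matrix `psi[a, b]`, and uses the off-diagonal element $\bra{0}\bm{\rho}\ket{1}$ of A's conditional state as a stand-in for fringe visibility. The helper names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ket_plus = (ket0 + ket1) / np.sqrt(2)
ket_minus = (ket0 - ket1) / np.sqrt(2)

# psi[a, b]: amplitudes of (|0>_A|0>_B + |1>_A|1>_B)/sqrt(2).
psi = (np.outer(ket0, ket0) + np.outer(ket1, ket1)) / np.sqrt(2)

def condition_on_b(psi, outcome):
    """Return (probability, normalized state of A) when B is found in `outcome`."""
    unnormalized = psi @ outcome.conj()      # apply <outcome|_B to |Psi>_AB
    prob = np.vdot(unnormalized, unnormalized).real
    return prob, unnormalized / np.sqrt(prob)

def coherence(state):
    """Off-diagonal element <0|rho|1> of the pure state's density matrix."""
    return np.outer(state, state.conj())[0, 1]

results = {name: condition_on_b(psi, b) for name, b in
           [("0", ket0), ("1", ket1), ("+", ket_plus), ("-", ket_minus)]}

# Unsorted A data: average over the two erasure-basis sub-ensembles.
rho_total = sum(p * np.outer(s, s.conj())
                for p, s in (results["+"], results["-"]))
```

The two erasure-basis sub-ensembles carry coherences of opposite sign, and their average `rho_total` is the coherence-free state $\frac{1}{2}\bm{I}$, whichever basis was chosen for B.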

2.6 Distance measures for quantum information#

2.6.1 Fidelity and Uhlmann’s theorem#

  • The distinguishability of two pure states $\ket{\psi}$ and $\ket{\phi}$ is quantified by the deviation from 1 of their overlap $|\langle \phi | \psi \rangle|^2$, also called the fidelity. For two density operators $\bm{\rho}$ and $\bm{\sigma}$, the fidelity $F(\bm{\rho}, \bm{\sigma})$ is defined as

    $$F(\bm{\rho}, \bm{\sigma}) = \left(\text{tr} \sqrt{\bm{\rho}^{\frac{1}{2}} \bm{\sigma} \bm{\rho}^{\frac{1}{2}}} \right)^2.$$

    The square root of a density operator $\bm{\rho}$, denoted as $\bm{\rho}^{\frac{1}{2}}$ or $\sqrt{\bm{\rho}}$, is the unique positive semidefinite operator that, when multiplied by itself, yields $\bm{\rho}$. The most straightforward way to compute it is through spectral decomposition. If the spectral decomposition of $\bm{\rho}$ is

    $$\bm{\rho} = \sum_{i} p_i \ket{i} \bra{i},$$

    where $p_i$ are the eigenvalues and $\ket{i}$ are the corresponding eigenvectors, then its square root is defined as:

    $$\bm{\rho}^{\frac{1}{2}} = \sum_{i} \sqrt{p_i} \ket{i} \bra{i}.$$

    The fidelity has the following properties:

    1. $0 \le F(\bm{\rho}, \bm{\sigma}) \le 1$.
    2. $F(\bm{\rho}, \bm{\sigma}) = F(\bm{\sigma}, \bm{\rho})$ (symmetry).
    3. $F(\bm{\rho}, \bm{\sigma}) = 1$ if and only if $\bm{\rho} = \bm{\sigma}$.
    4. If $\bm{\rho} = \ket{\psi} \bra{\psi}$ is a pure state, then $F(\ket{\psi} \bra{\psi}, \bm{\sigma}) = \bra{\psi} \bm{\sigma} \ket{\psi}$.

    We may also express the fidelity in terms of the $L^1$ norm,

    $$F(\bm{\rho}, \bm{\sigma}) = \left\| \bm{\sigma}^{\frac{1}{2}} \bm{\rho}^{\frac{1}{2}} \right\|^2_1,$$

    where $\|\bm{A}\|_1 = \text{tr} \sqrt{\bm{A}^\dagger \bm{A}}$.
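
The definition translates directly into a few lines of linear algebra. The following is a minimal sketch, assuming NumPy and computing matrix square roots through the spectral decomposition as described above; `psd_sqrt` and `fidelity` are hypothetical helper names, and the example states are chosen arbitrarily.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite matrix via its eigendecomposition."""
    evals, evecs = np.linalg.eigh(rho)
    evals = np.clip(evals, 0.0, None)         # remove tiny negative round-off
    return (evecs * np.sqrt(evals)) @ evecs.conj().T

def fidelity(rho, sigma):
    """F(rho, sigma) = (tr sqrt(rho^{1/2} sigma rho^{1/2}))^2."""
    root = psd_sqrt(rho)
    inner = root @ sigma @ root
    evals = np.clip(np.linalg.eigvalsh(inner), 0.0, None)
    return float(np.sum(np.sqrt(evals)) ** 2)

# Example: a mixed qubit state and the pure state |+><+|.
rho = np.diag([0.75, 0.25]).astype(complex)
ket_plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
plus_proj = np.outer(ket_plus, ket_plus.conj())

f_same = fidelity(rho, rho)              # property 3
f_pure = fidelity(plus_proj, rho)        # property 4: <+|rho|+>
f_swapped = fidelity(rho, plus_proj)     # property 2: symmetry
expectation = np.vdot(ket_plus, rho @ ket_plus).real
```

For this pair, the fidelity with the pure state agrees with the expectation value $\bra{+}\bm{\rho}\ket{+}$ in either argument order, up to floating-point error.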

  • It is useful to know how the fidelity of two density operators is related to the overlap of their purifications. A particular purification of $\bm{\rho}$ has the form

    $$\ket{\Psi_{\bm{\rho}}} = \sum_{i} \sqrt{p_i} \ket{i}_A \otimes \ket{i}_B,$$

    where $\{\ket{i}_A\}$ and $\{\ket{i}_B\}$ are orthonormal bases for systems A and B, respectively. According to the HJW theorem, a general purification has the form

    $$\ket{\Psi_{\bm{\rho}}(\bm{V})} = (\bm{I}_A \otimes \bm{V}_B) \ket{\Psi_{\bm{\rho}}},$$

    where $\bm{V}_B$ is an arbitrary unitary operator acting on system B. It can be expressed equivalently as

    $$\ket{\Psi_{\bm{\rho}}(\bm{V})} = (\bm{\rho}^{\frac{1}{2}} \otimes \bm{V}_B) \ket{\tilde{\Psi}},$$

    where $\ket{\tilde{\Psi}} = \sum_{i} \ket{i}_A \otimes \ket{i}_B$ is the unconventionally normalized maximally entangled state.

    If $\bm{\rho}$ and $\bm{\sigma}$ are two density operators on $A$, the inner product of their purifications on $AB$ can be expressed as

    $$\langle \Psi_{\bm{\sigma}}(\bm{W}) | \Psi_{\bm{\rho}}(\bm{V}) \rangle = \langle \tilde{\Psi} | \bm{\sigma}^{\frac{1}{2}} \bm{\rho}^{\frac{1}{2}} \otimes \bm{W}^\dagger \bm{V} | \tilde{\Psi} \rangle.$$

    Noting that

    $$\bm{I} \otimes \bm{U} \ket{\tilde{\Psi}} = \sum_{ij} \ket{i}_A \otimes U_{ji} \ket{j}_B = \sum_{ij} U^T_{ij} \ket{i}_A \otimes \ket{j}_B = \bm{U}^T \otimes \bm{I} \ket{\tilde{\Psi}},$$

    we have

    $$\langle \Psi_{\bm{\sigma}}(\bm{W}) | \Psi_{\bm{\rho}}(\bm{V}) \rangle = \langle \tilde{\Psi} | \bm{\sigma}^{\frac{1}{2}} \bm{\rho}^{\frac{1}{2}} \bm{U} \otimes \bm{I} | \tilde{\Psi} \rangle = \text{tr}(\bm{\sigma}^{\frac{1}{2}} \bm{\rho}^{\frac{1}{2}} \bm{U}),$$

    where $\bm{U} = (\bm{W}^\dagger \bm{V})^T$.

    Now we may use the polar decomposition

    $$\bm{A} = \bm{U'} \sqrt{\bm{A}^\dagger \bm{A}},$$

    where $\bm{U'}$ is a unitary operator, applied to $\bm{A} = \bm{\sigma}^{\frac{1}{2}} \bm{\rho}^{\frac{1}{2}}$ (so that $\bm{A}^\dagger \bm{A} = \bm{\rho}^{\frac{1}{2}} \bm{\sigma} \bm{\rho}^{\frac{1}{2}}$). Together with the cyclicity of the trace, this lets us rewrite the inner product as

    $$\langle \Psi_{\bm{\sigma}}(\bm{W}) | \Psi_{\bm{\rho}}(\bm{V}) \rangle = \text{tr}\left(\sqrt{\bm{\rho}^{\frac{1}{2}} \bm{\sigma} \bm{\rho}^{\frac{1}{2}}} \, \bm{U} \bm{U'}\right) = \sum_a \lambda_a \bra{a} \bm{U} \bm{U'} \ket{a},$$

    where $\lambda_a \ge 0$ are the eigenvalues of $\sqrt{\bm{\rho}^{\frac{1}{2}} \bm{\sigma} \bm{\rho}^{\frac{1}{2}}}$ and $\{\ket{a}\}$ is the corresponding orthonormal basis. Since $\bm{U} \bm{U'}$ is unitary, $|\bra{a} \bm{U} \bm{U'} \ket{a}| \le 1$, so the magnitude of the inner product is maximized by choosing $\bm{U} = \bm{U'}^\dagger$. Thus we have:

    $$F(\bm{\rho}, \bm{\sigma}) = \max_{\ket{\Psi_{\bm{\rho}}}, \ket{\Psi_{\bm{\sigma}}}} |\langle \Psi_{\bm{\sigma}} | \Psi_{\bm{\rho}} \rangle|^2.$$

    The fidelity of two density operators is the maximal possible overlap of their purifications, a result called Uhlmann’s theorem. One corollary of Uhlmann’s theorem is the monotonicity of fidelity:

    $$F(\bm{\rho}_{AB}, \bm{\sigma}_{AB}) \leq F(\bm{\rho}_{A}, \bm{\sigma}_{A}),$$

    which says that tracing out a subsystem cannot decrease the fidelity of two density operators.

2.6.2 Relationships between distance measures#

  • There are other possible ways besides fidelity for quantifying the difference between quantum states $\bm{\rho}$ and $\bm{\sigma}$, such as the distance between the states using the $L^1$ or $L^2$ norm,

    $$\|\bm{\rho} - \bm{\sigma}\|_1 \text{ or } \|\bm{\rho} - \bm{\sigma}\|_2,$$

    where the $L^2$ norm of an operator is defined by

    $$\|\bm{A}\|_2 = \sqrt{\text{tr} \bm{A}^\dagger \bm{A}}.$$

    If $\{|\lambda_i|, i = 0, 1, 2, \ldots, d-1\}$ denotes the eigenvalues of $\sqrt{\bm{A}^\dagger \bm{A}}$, then

    $$\|\bm{A}\|_1 = \sum_{i=0}^{d-1} |\lambda_i|, \quad \|\bm{A}\|_2 = \sqrt{\sum_{i=0}^{d-1} |\lambda_i|^2}.$$

    According to the Cauchy-Schwarz inequality, we have

    $$\|\bm{A}\|_1 \le \sqrt{d} \|\bm{A}\|_2.$$

    Because of the factor of $\sqrt{d}$ on the right hand side, for a high-dimensional system density operators which are close together in the $L^2$ norm might not be close in the $L^1$ norm.

  • We can derive a dimension-independent inequality relating the $L^1$ distance between $\bm{\rho}$ and $\bm{\sigma}$ to the $L^2$ distance between their square roots. Write the spectral decomposition

    $$\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}} = \sum_{i} \lambda_i \ket{i} \bra{i},$$

    where $\lambda_i$ are the eigenvalues of the operator $\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}$ and $\{\ket{i}\}$ is the corresponding orthonormal basis. Then we note that the absolute value of this difference may be written as

    $$|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}| = \sum_{i} |\lambda_i| \ket{i} \bra{i} = (\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}) \bm{U} = \bm{U} (\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}),$$

    where $\bm{U} = \sum_{i} \text{sign}(\lambda_i) \ket{i} \bra{i}$ is a unitary operator. Using

    $$\bm{\rho} - \bm{\sigma} = \frac{1}{2} (\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}})(\sqrt{\bm{\rho}} + \sqrt{\bm{\sigma}}) + \frac{1}{2} (\sqrt{\bm{\rho}} + \sqrt{\bm{\sigma}})(\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}),$$

    and the cyclicity of the trace, we find

    $$\begin{align*} \text{tr} [(\bm{\rho} - \bm{\sigma}) \bm{U}] &= \text{tr} [|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}|(\sqrt{\bm{\rho}} + \sqrt{\bm{\sigma}})] = \sum_{i} |\lambda_i| \bra{i} (\sqrt{\bm{\rho}} + \sqrt{\bm{\sigma}}) \ket{i} \\ &\geq \sum_{i} |\lambda_i| \, |\bra{i} (\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}) \ket{i}| = \sum_{i} |\lambda_i|^2 = \|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}\|^2_2. \end{align*}$$

    The inequality holds because $\sqrt{\bm{\rho}}$ and $\sqrt{\bm{\sigma}}$ are both positive semidefinite, so the expectation value of their sum is at least the magnitude of the expectation value of their difference. Finally, using

    $$\|\bm{\rho} - \bm{\sigma}\|_1 = \text{tr} |\bm{\rho} - \bm{\sigma}| \geq \text{tr} [(\bm{\rho} - \bm{\sigma}) \bm{U}],$$

    we have

    $$\|\bm{\rho} - \bm{\sigma}\|_1 \geq \|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}\|_2^2.$$

    This $L^2$ distance between square roots can be related to fidelity. First we note that

    $$\|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}\|_2^2 = \text{tr}[(\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}})^2] = \text{tr}(\bm{\rho} + \bm{\sigma} - \sqrt{\bm{\rho}} \sqrt{\bm{\sigma}} - \sqrt{\bm{\sigma}} \sqrt{\bm{\rho}}) = 2 - 2 \text{tr}(\sqrt{\bm{\rho}} \sqrt{\bm{\sigma}}),$$

    since $\text{tr} \bm{\rho} = \text{tr} \bm{\sigma} = 1$. From the polar decomposition $\bm{A} = \bm{U} \sqrt{\bm{A}^\dagger \bm{A}}$ (where $\bm{U}$ is unitary), we see that $\text{tr} \sqrt{\bm{A}^\dagger \bm{A}} \geq |\text{tr} \bm{A}|$, and therefore

    $$\sqrt{F(\bm{\rho}, \bm{\sigma})} = \text{tr} \sqrt{\bm{\rho}^{\frac{1}{2}} \bm{\sigma} \bm{\rho}^{\frac{1}{2}}} \geq |\text{tr}(\sqrt{\bm{\rho}} \sqrt{\bm{\sigma}})|;$$

    hence,

    $$\sqrt{F(\bm{\rho}, \bm{\sigma})} \geq 1 - \frac{1}{2} \|\sqrt{\bm{\rho}} - \sqrt{\bm{\sigma}}\|_2^2 \geq 1 - \frac{1}{2} \|\bm{\rho} - \bm{\sigma}\|_1.$$

    The $L^1$ distance also provides an upper bound on fidelity. To derive this upper bound, we first need to prove that the trace norm is contractive under the partial trace. Let $\bm{X}_A = \text{tr}_B(\bm{X}_{AB})$. There exists a unitary matrix $\bm{U}_A$ such that $\|\bm{X}_A\|_1 = \text{tr}(\bm{X}_A \bm{U}_A)$. By properties of the partial trace, we have

    $$\begin{align*} \text{tr}(\bm{X}_A \bm{U}_A) &= \sum_{i} \bra{i}_A \bm{X}_A \bm{U}_A \ket{i}_A \\ &= \sum_{i} \bra{i}_A \text{tr}_B(\bm{X}_{AB}) \bm{U}_A \ket{i}_A \\ &= \sum_{ij} \bra{i}_A \bra{j}_B \bm{X}_{AB} \ket{j}_B \bm{U}_A \ket{i}_A \\ &= \sum_{ij} \bra{ij}_{AB} \bm{X}_{AB} (\bm{U}_A \otimes \bm{I}_B) \ket{ij}_{AB} \\ &= \text{tr}_{AB}[\bm{X}_{AB} (\bm{U}_A \otimes \bm{I}_B)]. \end{align*}$$

    Since $\bm{U}_A \otimes \bm{I}_B$ is unitary, we have $\text{tr}_{AB}[\bm{X}_{AB} (\bm{U}_A \otimes \bm{I}_B)] \le \|\bm{X}_{AB}\|_1$. Combining these steps gives $\|\text{tr}_B(\bm{X}_{AB})\|_1 = \text{tr}_{AB}[\bm{X}_{AB} (\bm{U}_A \otimes \bm{I}_B)] \le \|\bm{X}_{AB}\|_1$.

    Let $|\Psi_{\bm{\rho}*}\rangle$ and $|\Psi_{\bm{\sigma}*}\rangle$ be the optimal purifications of $\bm{\rho}$ and $\bm{\sigma}$, that is, the ones achieving the maximal overlap in Uhlmann’s theorem. Since $\bm{\rho}$ and $\bm{\sigma}$ are the partial traces of these purifications, the contractivity of the trace norm gives:

    $$\|\bm{\rho} - \bm{\sigma}\|_1 = \|\text{tr}_B(|\Psi_{\bm{\rho}*}\rangle\langle\Psi_{\bm{\rho}*}| - |\Psi_{\bm{\sigma}*}\rangle\langle\Psi_{\bm{\sigma}*}|)\|_1 \le \||\Psi_{\bm{\rho}*}\rangle\langle\Psi_{\bm{\rho}*}| - |\Psi_{\bm{\sigma}*}\rangle\langle\Psi_{\bm{\sigma}*}|\|_1$$

    Apply the formula for the trace distance between two pure states, $\||\psi\rangle\langle\psi| - |\phi\rangle\langle\phi|\|_1 = 2\sqrt{1 - |\langle \psi | \phi \rangle|^2}$:

    $$\|\bm{\rho} - \bm{\sigma}\|_1 \le 2\sqrt{1 - |\langle \Psi_{\bm{\sigma}*} | \Psi_{\bm{\rho}*} \rangle|^2}$$

    By Uhlmann’s theorem, the fidelity is the maximal overlap of purifications, $\sqrt{F(\bm{\rho}, \bm{\sigma})} = |\langle \Psi_{\bm{\sigma}*} | \Psi_{\bm{\rho}*} \rangle|$. Substituting this gives the intermediate relation:

    $$\|\bm{\rho} - \bm{\sigma}\|_1 \le 2\sqrt{1 - F(\bm{\rho}, \bm{\sigma})}$$

    Squaring both sides yields $\|\bm{\rho} - \bm{\sigma}\|_1^2 \le 4(1 - F(\bm{\rho}, \bm{\sigma}))$. Rearranging this gives the final inequality:

    $$F(\bm{\rho}, \bm{\sigma}) \leq 1 - \frac{1}{4} \|\bm{\rho} - \bm{\sigma}\|_1^2.$$

    Combining the lower and upper bounds, we have

    $$1 - \sqrt{F(\bm{\rho}, \bm{\sigma})} \leq \frac{1}{2} \|\bm{\rho} - \bm{\sigma}\|_1 \leq \sqrt{1- F(\bm{\rho}, \bm{\sigma})}.$$
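
The two-sided bound can be spot-checked on random states. The sketch below is a numerical sanity check, not a proof. It assumes NumPy, samples density matrices as $GG^\dagger/\text{tr}(GG^\dagger)$ with Gaussian $G$ (an arbitrary sampling choice made here for illustration), and allows a small floating-point tolerance. All helper names are hypothetical.

```python
import numpy as np

def random_density_matrix(d, rng):
    """Sample a density matrix as G G^dagger / tr(G G^dagger)."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def psd_sqrt(m):
    """Square root of a positive semidefinite matrix via eigendecomposition."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.conj().T

def fidelity(rho, sigma):
    """F = (tr sqrt(rho^{1/2} sigma rho^{1/2}))^2."""
    root = psd_sqrt(rho)
    evals = np.linalg.eigvalsh(root @ sigma @ root)
    return float(np.sum(np.sqrt(np.clip(evals, 0.0, None))) ** 2)

def trace_norm(a):
    """||A||_1 as the sum of singular values of A."""
    return float(np.sum(np.linalg.svd(a, compute_uv=False)))

def bounds_hold(rho, sigma, tol=1e-9):
    """Check 1 - sqrt(F) <= (1/2)||rho - sigma||_1 <= sqrt(1 - F)."""
    f = fidelity(rho, sigma)
    half_dist = 0.5 * trace_norm(rho - sigma)
    lower_ok = 1 - np.sqrt(f) <= half_dist + tol
    upper_ok = half_dist <= np.sqrt(max(1 - f, 0.0)) + tol
    return bool(lower_ok and upper_ok)

rng = np.random.default_rng(0)
all_ok = all(
    bounds_hold(random_density_matrix(d, rng), random_density_matrix(d, rng))
    for d in (2, 3, 5) for _ in range(50)
)
```

A failed check would point to a bug in the helpers rather than to a counterexample, since both inequalities are derived above.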

 

 

 

Chap 3     Measurement and Evolution 📖#

3.1 Orthogonal measurement and generalized measurement#

  • An axiom of quantum theory asserts that a measurement may be described as an orthogonal projection operator. However, if we realize a measurement of system S by performing an orthogonal measurement on a larger system that contains S, the resulting operation performed on S alone need not be an orthogonal projection.

    We would like to find a mathematical description of such “generalized measurements” on system S. But first, let’s recall how measurement of an arbitrary Hermitian operator can be achieved in principle, following the classic treatment of Von Neumann.

3.1.1 The Von Neumann pointer model#

  • To measure an observable $\bm{M}$, we will modify the Hamiltonian by turning on a coupling between that observable and another variable that represents the apparatus. This auxiliary system can be referred to as the “pointer”, the “meter”, or the “ancilla”. This coupling establishes a correlation between the eigenstates of the observable and the distinguishable states of the pointer, so that we can prepare an eigenstate of the observable by “observing” the pointer. Von Neumann’s model treats the pointer as a particle of mass $m$. We intend to measure the position of the pointer, so it should be prepared initially in a wavepacket state that is narrow in position space. But the wavepacket cannot be too narrow, because according to the uncertainty principle, a very narrow wavepacket will spread too rapidly. If the initial width of the wavepacket is $\Delta x$, then the uncertainty in its velocity will be $\Delta v = \Delta p / m \sim \hbar / (m \Delta x)$. After a time $t$, the wavepacket will spread to a width:

    $$\Delta x(t) \sim \Delta x + \frac{\hbar t}{m \Delta x}.$$

    This width is minimized when $(\Delta x)^2 \sim \hbar t / m$. Therefore, if the experiment takes a time $t$, the resolution we can achieve for the final position of the pointer is limited by:

    $$\Delta x \ge (\Delta x)_{SQL} \sim \sqrt{\frac{\hbar t}{m}}.$$

    This is known as the “standard quantum limit” (SQL). In the Von Neumann model, we choose a pointer that is sufficiently heavy ($m$ is large enough) that this limitation is not serious, and the spreading of its wavepacket can be neglected.
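
The minimization behind the standard quantum limit can be written out explicitly. The worked step below treats the order-of-magnitude estimate for the width as an exact function $w(\Delta x)$, which is an assumption made only for illustration; the numerical factor of 2 it produces is not meaningful at this level of precision.

```latex
\begin{align*}
w(\Delta x) &= \Delta x + \frac{\hbar t}{m\,\Delta x}, \\
\frac{dw}{d(\Delta x)} &= 1 - \frac{\hbar t}{m\,(\Delta x)^2} = 0
\quad\Longrightarrow\quad (\Delta x)^2 = \frac{\hbar t}{m}, \\
w_{\min} &= 2\sqrt{\frac{\hbar t}{m}} \sim \sqrt{\frac{\hbar t}{m}}.
\end{align*}
```

As a rough sense of scale, for a pointer with $m = 1\,\text{kg}$ and a measurement lasting $t = 1\,\text{s}$, using $\hbar \approx 1.05 \times 10^{-34}\,\text{J\,s}$ gives $\sqrt{\hbar t / m} \approx 10^{-17}\,\text{m}$, which is why a heavy pointer makes the limit harmless.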

    The Hamiltonian describing the coupling of the quantum system to the pointer has the form:

    $$\bm{H} = \bm{H}_0 + \frac{1}{2m}\bm{P}^2 + \lambda(t) \bm{M} \otimes \bm{P},$$

    where $\bm{H}_0$ is the system’s Hamiltonian, $\bm{P}^2/2m$ is the pointer’s Hamiltonian, $\lambda(t)$ is a tunable coupling constant, and the observable $\bm{M}$ is coupled to the pointer’s momentum $\bm{P}$.

    To simplify the analysis, we assume:

    1. The pointer is very heavy, so the $\frac{\bm{P}^2}{2m}$ term can be neglected.
    2. The measurement is very fast, or $[\bm{M}, \bm{H}_0] = 0$, so the system’s free evolution under $\bm{H}_0$ can be neglected.

    Under these conditions, the Hamiltonian is approximated as $\bm{H} \simeq \lambda(t) \bm{M} \otimes \bm{P}$. If the coupling is turned on with constant strength $\lambda$ between time 0 and $T$, the evolution operator (in units with $\hbar = 1$) is:

    $$\bm{U}(T) \simeq \exp(-i \lambda T \bm{M} \otimes \bm{P}).$$

    We expand in the eigenbasis of $\bm{M}$, $\{\ket{a}\}$, where $\bm{M} = \sum_a \ket{a} M_a \bra{a}$. The evolution operator becomes:

    $$\bm{U}(T) = \sum_a \ket{a} \exp(-i \lambda T M_a \bm{P}) \bra{a}.$$

    The momentum operator $\bm{P}$ is the generator of position translations ($\bm{P} = -i \frac{d}{dx}$ for $\hbar = 1$), so $e^{-ix_0 \bm{P}} \psi(x) = \psi(x - x_0)$. Assume the initial state of the system and pointer is $(\sum_a \alpha_a \ket{a}) \otimes \ket{\psi(x)}$. After evolving for time $T$, the state becomes:

    $$\bm{U}(T) \left( \sum_a \alpha_a \ket{a} \otimes \ket{\psi(x)} \right) = \sum_a \alpha_a \ket{a} \otimes \ket{\psi(x - \lambda T M_a)}.$$

    The result of the evolution is that the shift in the pointer’s position, $\lambda T M_a$, becomes correlated with the eigenvalue $M_a$ of the system’s observable $\bm{M}$. If the pointer’s wavepacket is narrow enough for us to resolve the position shifts corresponding to different $M_a$, then when we “observe” that the pointer has shifted by $\lambda T M_a$, which happens with probability $|\alpha_a|^2$, we have prepared the system in the corresponding eigenstate $\ket{a}$. This is Von Neumann’s model of orthogonal measurement.

    e.g. Stern-Gerlach Apparatus

    This is a classic example of measuring $\bm{\sigma}_3$ for a spin-1/2 object. The object passes through an inhomogeneous magnetic field $B_3 = \lambda z$. The coupling Hamiltonian is:

    $$\bm{H} = -\lambda \mu z \bm{\sigma}_3.$$

    In this case, the observable $\bm{\sigma}_3$ (system) is coupled to the position $z$ (pointer). This coupling imparts an impulse (change in momentum) to the pointer that is correlated with the spin, allowing us to project the spin onto the $\ket{\uparrow_z}$ or $\ket{\downarrow_z}$ state by observing whether the object is pushed up or down.

3.1.2 Orthogonal measurements#

  • Thinking more abstractly, suppose that {Pa,a=0,1,2,...N1}\{\bm{P}_a, a=0,1,2,...N-1\} is a complete set of orthogonal projectors satisfying

    PaPb=δabPa,Pa=Pa,aPa=I.\bm{P}_a \bm{P}_b = \delta_{ab} \bm{P}_a, \quad \bm{P}_a = \bm{P}_a^\dagger, \quad \sum_a \bm{P}_a = \bm{I}.

    To perform an orthogonal measurement with these outcomes, we introduce an N-dimensional pointer system with fiducial orthonormal basis states {a,a=0,1,2,...,N1}\{\ket{a}, a=0,1,2,...,N-1\}, and, by coupling the system to the pointer, perform the unitary transformation

    U=a,bPab+ab.\bm{U} = \sum_{a,b} \bm{P}_a \otimes \ket{b+a}\bra{b}.

    Thus the pointer advances by an amount aa if the state of the system is within the support of the projector Pa\bm{P}_a. (The addition in b+a\ket{b+a} is understood to be modulo N). The unitarity of U\bm{U} is easy to verify:

    UU=(a,bPab+ab)(c,dPcdd+c)=a,b,c,dδacPaδbdb+ad+c=aPabb+ab+a=II.\begin{align*} \bm{U} \bm{U}^\dagger &= \left( \sum_{a,b} \bm{P}_a \otimes \ket{b+a}\bra{b} \right) \left( \sum_{c,d} \bm{P}_c \otimes \ket{d}\bra{d+c} \right) \\ &= \sum_{a,b,c,d} \delta_{ac} \bm{P}_a \otimes \delta_{bd} \ket{b+a}\bra{d+c} \\ &= \sum_a \bm{P}_a \otimes \sum_b \ket{b+a}\bra{b+a} = \bm{I} \otimes \bm{I}. \end{align*}

    This unitary transformation acts on an initial product state of system and pointer according to

    U:Ψ=ψ0Ψ=aPaψa;\bm{U}: \ket{\Psi} = \ket{\psi} \otimes \ket{0} \mapsto \ket{\Psi'} = \sum_a \bm{P}_a \ket{\psi} \otimes \ket{a};

    if the pointer is then measured in the fiducial basis, the measurement postulate implies that the outcome aa occurs with probability

    Prob(a)=Ψ(Iaa)Ψ=ψPaψ,\text{Prob}(a) = \bra{\Psi'} (\bm{I} \otimes \ket{a}\bra{a}) \ket{\Psi'} = \bra{\psi} \bm{P}_a \ket{\psi},

    and that when this outcome occurs the normalized post-measurement state is

    PaψPaψ.\frac{\bm{P}_a \ket{\psi}}{\| \bm{P}_a \ket{\psi} \|}.

    If the measurement is performed and its outcome is not known, the initial pure state of the system becomes a mixture of these post-measurement states:

    aProb(a)PaψψPaψPaψ=aPaψψPa.\sum_a \text{Prob}(a) \frac{\bm{P}_a \ket{\psi} \bra{\psi} \bm{P}_a}{\bra{\psi} \bm{P}_a \ket{\psi}} = \sum_a \bm{P}_a \ket{\psi} \bra{\psi} \bm{P}_a.

    In fact, the system is described by this density operator once it becomes entangled with the pointer, whether we bother to observe the pointer or not.

    If the initial state of the system before the measurement is a mixed state with density matrix ρ\bm{\rho}, then by expressing ρ\bm{\rho} as an ensemble of pure states we conclude that the measurement modifies the state according to

    ρaPaρPa.\bm{\rho} \mapsto \sum_a \bm{P}_a \bm{\rho} \bm{P}_a.

    We see that if, by coupling the system to our pointer, we can execute suitable unitary transformations correlating the system and the pointer, and if we can observe the pointer in its fiducial basis, then we are empowered to perform any conceivable orthogonal measurement on the system.
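
    The construction above can be checked numerically. What follows is a minimal NumPy sketch rather than anything from the original notes: the helper name `pointer_unitary`, the choice of a qubit with projectors onto $\ket{0}$ and $\ket{1}$, and the input state are all illustrative assumptions.

```python
import numpy as np

def pointer_unitary(projectors):
    """Build U = sum_{a,b} P_a (x) |b+a><b|, with b+a taken modulo N."""
    n = len(projectors)
    dim = projectors[0].shape[0]
    U = np.zeros((dim * n, dim * n), dtype=complex)
    for a, P in enumerate(projectors):
        # Rolling the rows of the identity by a gives sum_b |b+a mod N><b|.
        shift = np.roll(np.eye(n), a, axis=0)
        U += np.kron(P, shift)
    return U

# Assumed example: projectors onto |0> and |1> of a qubit.
P0 = np.diag([1.0, 0.0]).astype(complex)
P1 = np.diag([0.0, 1.0]).astype(complex)
U = pointer_unitary([P0, P1])

psi = np.array([0.6, 0.8j])          # hypothetical normalized input state
pointer0 = np.array([1.0, 0.0])      # pointer starts in |0>
Psi_out = U @ np.kron(psi, pointer0)

# C[i, a] is the amplitude of |i>_system (x) |a>_pointer.
C = Psi_out.reshape(2, 2)
probs = [np.linalg.norm(C[:, a]) ** 2 for a in range(2)]

# Unread outcome: sum_a P_a rho P_a should equal the partial trace over the pointer.
rho = np.outer(psi, psi.conj())
rho_dephased = sum(P @ rho @ P for P in (P0, P1))
rho_reduced = C @ C.conj().T
```

    In this sketch `probs` reproduces $\bra{\psi}\bm{P}_a\ket{\psi}$, and `rho_dephased` agrees with `rho_reduced`, which illustrates the remark that the system is described by the mixed state as soon as it is entangled with the pointer.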


3.1.3 Generalized measurements (POVM)#

  • In the discussion of orthogonal measurement, the pointer’s fiducial basis was used for two things: correlating with the system’s projectors {Pa}\{\bm{P}_a\} and being the basis in which the pointer is measured.

    A generalized measurement arises when these two roles are separated. For example, the pointer might be measured in a different basis than the one used to establish the correlation.

    e.g. Qubit-Qubit Measurement

    Suppose a qubit system A and a qubit pointer B interact via the unitary map:

    U:(α0+β1)A0Bα0A0B+β1A1B.\bm{U}: (\alpha\ket{0}+\beta\ket{1})_{A}\otimes\ket{0}_{B} \mapsto \alpha\ket{0}_{A}\otimes\ket{0}_{B}+\beta\ket{1}_{A}\otimes\ket{1}_{B}.

    If we measure the pointer B in the {0,1}\{\ket{0}, \ket{1}\} basis, this induces an orthogonal measurement on system A in its {0,1}\{\ket{0}, \ket{1}\} basis. However, if we instead measure the pointer B in the Hadamard basis {±=12(0±1)}\{\ket{\pm} = \frac{1}{\sqrt{2}}(\ket{0} \pm \ket{1}) \}, the two possible outcomes yield two post-measurement states for system A:

    α0±β1.\alpha\ket{0} \pm \beta\ket{1}.

    These two post-measurement states are not orthogonal (unless α=β|\alpha|=|\beta|). This is a generalized measurement. If two such measurements are performed in succession, the outcomes need not be the same.

    In the general case, we entangle system A with a pointer B (initially in state 0B\ket{0}_B) via a unitary U\bm{U}, and then perform an orthogonal measurement on B in the basis {a}\{\ket{a}\}. We can expand the action of U\bm{U} as:

    U:ψA0BaMaψAaB.\bm{U}: \ket{\psi}_A \otimes \ket{0}_B \mapsto \sum_a \bm{M}_a \ket{\psi}_A \otimes \ket{a}_B.

    The operators {Ma}\{\bm{M}_a\} are the measurement operators for the generalized measurement on A. Since U\bm{U} is unitary, it preserves the norm, which implies the completeness relation:

    aMaMa=I.\sum_a \bm{M}_a^\dagger \bm{M}_a = \bm{I}.

    The measurement postulate asserts that outcome aa occurs with probability:

    Prob(a)=Maψ2.\text{Prob}(a) = \| \bm{M}_a \ket{\psi} \|^2.

    If outcome aa occurs, the post-measurement state of the system is:

    MaψMaψ.\frac{\bm{M}_a \ket{\psi}}{\| \bm{M}_a \ket{\psi} \|}.

    If the initial state is a density operator ρ\bm{\rho}, the probability of outcome aa is given by

    Prob(a)=tr(Eaρ),\text{Prob}(a) = \text{tr}(\bm{E}_a \bm{\rho}),

    where Ea=MaMa\bm{E}_a = \bm{M}_a^\dagger \bm{M}_a. The set of operators {Ea}\{\bm{E}_a\} is called a positive operator-valued measure (POVM). These operators satisfy the following properties:

    1. Hermiticity: Ea=Ea\bm{E}_a = \bm{E}_a^\dagger.
    2. Positivity: Ea0\bm{E}_a \ge 0 (i.e., ψEaψ0\bra{\psi} \bm{E}_a \ket{\psi} \ge 0 for any ψ\ket{\psi}).
    3. Completeness: aEa=I\sum_a \bm{E}_a = \bm{I}.

    Any set of operators {Ea}\{\bm{E}_a\} satisfying these properties (a POVM) can be realized by an orthogonal measurement on an auxiliary system. We can define measurement operators Ma=UaEa\bm{M}_a = \bm{U}_a \sqrt{\bm{E}_a}, where Ua\bm{U}_a is an arbitrary unitary operator. This is the polar decomposition of Ma\bm{M}_a. The POVM {Ea}\{\bm{E}_a\} only determines the probability of each outcome. The post-measurement state,

    UaEaψEaψ,\bm{U}_a \frac{\sqrt{\bm{E}_a} \ket{\psi}}{\| \sqrt{\bm{E}_a} \ket{\psi} \|},

    is not uniquely determined by the POVM, as it depends on the arbitrary choice of the unitaries {Ua}\{\bm{U}_a\} for each outcome.
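
    As a concrete check of the qubit-qubit example, the sketch below extracts the measurement operators $\bm{M}_a = (\bm{I} \otimes \bra{a}) \bm{U} (\bm{I} \otimes \ket{0})$ when the pointer is read out in the Hadamard basis. It is only an illustration: the helper name `kraus_from_unitary` and the input amplitudes are assumptions, and the unitary is taken to be a CNOT with A as control, which reproduces the map given above.

```python
import numpy as np

def kraus_from_unitary(U, pointer_basis, dim_sys):
    """Return M_a = (I (x) <a|) U (I (x) |0>) for each pointer basis vector |a>."""
    dim_ptr = U.shape[0] // dim_sys
    ket0 = np.zeros(dim_ptr)
    ket0[0] = 1.0
    embed = np.kron(np.eye(dim_sys), ket0.reshape(-1, 1))      # I (x) |0>
    return [np.kron(np.eye(dim_sys), a.conj().reshape(1, -1)) @ U @ embed
            for a in pointer_basis]

# CNOT (A controls B) maps |0>|0> -> |0>|0> and |1>|0> -> |1>|1>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)
plus = np.array([1, 1]) / np.sqrt(2)
minus = np.array([1, -1]) / np.sqrt(2)

M_plus, M_minus = kraus_from_unitary(CNOT, [plus, minus], dim_sys=2)
E = [M.conj().T @ M for M in (M_plus, M_minus)]                # POVM elements

psi = np.array([0.6, 0.8])                                     # hypothetical alpha, beta
probs = [np.linalg.norm(M @ psi) ** 2 for M in (M_plus, M_minus)]
post = [M @ psi / np.linalg.norm(M @ psi) for M in (M_plus, M_minus)]
overlap = np.vdot(post[0], post[1])                            # |alpha|^2 - |beta|^2
```

    Here both POVM elements come out as $\bm{I}/2$, so the outcome probabilities are independent of the input state, while the two post-measurement states $\alpha\ket{0} \pm \beta\ket{1}$ overlap unless $|\alpha| = |\beta|$.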

  • The POVM formalism is not just a mathematical generalization; it describes measurement strategies that no orthogonal measurement on the system alone can implement. A key example is the task of distinguishing non-orthogonal states. If one is given a state known to be either $\ket{\psi_1}$ or $\ket{\psi_2}$ (where $\langle \psi_1 | \psi_2 \rangle \neq 0$), no measurement can identify the state with certainty every time, and an orthogonal measurement that always commits to an answer has a nonzero probability of error. However, the flexibility of POVMs allows us to design a measurement that never makes an error, at the cost of sometimes returning an inconclusive result. The following example illustrates this technique, known as unambiguous state discrimination.

    e.g. Distinguishing Non-Orthogonal States

    Suppose Alice gives Bob a qubit that is in one of two non-orthogonal states:

    ψ1=0orψ2=12(0+1)=+.\ket{\psi_1} = \ket{0} \quad \text{or} \quad \ket{\psi_2} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1}) = \ket{+}.

    Bob cannot perform any measurement that perfectly distinguishes these two states. However, he can perform a measurement that sometimes identifies the state perfectly, without ever making an error.

    The strategy is to construct a POVM where some outcomes are “forbidden” for one of the states.

    1. We need an outcome E1\bm{E}_1 that proves the state was ψ2\ket{\psi_2}. This requires the probability of this outcome for ψ1\ket{\psi_1} to be zero: ψ1E1ψ1=0E10=0\bra{\psi_1} \bm{E}_1 \ket{\psi_1} = \bra{0} \bm{E}_1 \ket{0} = 0. This means E1\bm{E}_1 must be proportional to the projector onto the state orthogonal to 0\ket{0}, which is 11\ket{1}\bra{1}.
    2. We need an outcome E2\bm{E}_2 that proves the state was ψ1\ket{\psi_1}. This requires ψ2E2ψ2=+E2+=0\bra{\psi_2} \bm{E}_2 \ket{\psi_2} = \bra{+} \bm{E}_2 \ket{+} = 0. This means E2\bm{E}_2 must be proportional to the projector onto the state orthogonal to +\ket{+}, which is \ket{-}\bra{-}.
    3. The third outcome, E3\bm{E}_3, represents the “inconclusive” result.

    This logic leads to the following POVM (where the pre-factors are chosen to maximize the success probability while ensuring all Ei\bm{E}_i are positive operators):

    $$\begin{align*}
    \bm{E}_1 &= \frac{\sqrt{2}}{1+\sqrt{2}} \ket{1}\bra{1} \\
    \bm{E}_2 &= \frac{\sqrt{2}}{1+\sqrt{2}} \ket{-}\bra{-} \quad \text{where } \ket{-} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1}) \\
    \bm{E}_3 &= \bm{I} - \bm{E}_1 - \bm{E}_2.
    \end{align*}$$

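    Completeness, $\bm{E}_1 + \bm{E}_2 + \bm{E}_3 = \bm{I}$, holds by construction. The condition that actually needs checking is positivity of $\bm{E}_3$: the operator $\ket{1}\bra{1} + \ket{-}\bra{-}$ has eigenvalues $1 \pm \frac{1}{\sqrt{2}}$, so the largest eigenvalue of $\bm{E}_1 + \bm{E}_2$ is $\frac{\sqrt{2}}{1+\sqrt{2}}\left(1 + \frac{1}{\sqrt{2}}\right) = 1$ and $\bm{E}_3 \ge 0$. This is therefore a valid POVM, and any larger pre-factor would make $\bm{E}_3$ non-positive.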

    Now, let’s analyze the measurement outcomes for each possible state Bob receives:

    1. If Bob receives ψ1=0\ket{\psi_1} = \ket{0}:
      • Prob(E1)=ψ1E1ψ1=21+20110=0\text{Prob}(E_1) = \bra{\psi_1} \bm{E}_1 \ket{\psi_1} = \frac{\sqrt{2}}{1+\sqrt{2}} \bra{0} \ket{1}\bra{1} \ket{0} = 0.
      • Prob(E2)=ψ1E2ψ1=21+202=21+2122=22(1+2)\text{Prob}(E_2) = \bra{\psi_1} \bm{E}_2 \ket{\psi_1} = \frac{\sqrt{2}}{1+\sqrt{2}} |\langle 0 | - \rangle|^2 = \frac{\sqrt{2}}{1+\sqrt{2}} \left| \frac{1}{\sqrt{2}} \right|^2 = \frac{\sqrt{2}}{2(1+\sqrt{2})}.
    2. If Bob receives ψ2=+\ket{\psi_2} = \ket{+}:
      • Prob(E1)=ψ2E1ψ2=21+2+12=21+2122=22(1+2)\text{Prob}(E_1) = \bra{\psi_2} \bm{E}_1 \ket{\psi_2} = \frac{\sqrt{2}}{1+\sqrt{2}} |\langle + | 1 \rangle|^2 = \frac{\sqrt{2}}{1+\sqrt{2}} \left| \frac{1}{\sqrt{2}} \right|^2 = \frac{\sqrt{2}}{2(1+\sqrt{2})}.
      • Prob(E2)=ψ2E2ψ2=21+2+2=0\text{Prob}(E_2) = \bra{\psi_2} \bm{E}_2 \ket{\psi_2} = \frac{\sqrt{2}}{1+\sqrt{2}} |\langle + | - \rangle|^2 = 0.

    The key is in the zero probabilities:

    • If Bob’s measurement yields outcome E1\bm{E}_1, he knows with certainty that the state must have been ψ2\ket{\psi_2} (because Prob(E1ψ1)=0\text{Prob}(E_1|\psi_1) = 0).
    • If Bob’s measurement yields outcome E2\bm{E}_2, he knows with certainty that the state must have been ψ1\ket{\psi_1} (because Prob(E2ψ2)=0\text{Prob}(E_2|\psi_2) = 0).
    • If Bob’s measurement yields outcome E3\bm{E}_3, he gains no information and cannot distinguish the states.

    In this procedure, Bob never makes an error in identifying the state. The trade-off is that he only succeeds some of the time; in the case of outcome E3\bm{E}_3, his measurement is inconclusive.
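
    The numbers in this example are easy to verify by direct computation. The following NumPy sketch (variable names such as `table` and `prob` are illustrative, not from the notes) builds the three POVM elements, checks that $\bm{E}_3$ is positive, and tabulates the outcome probabilities for both input states.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# POVM elements with the pre-factor sqrt(2) / (1 + sqrt(2)) used in the text.
c = np.sqrt(2) / (1 + np.sqrt(2))
E1 = c * np.outer(ket1, ket1)
E2 = c * np.outer(minus, minus)
E3 = np.eye(2) - E1 - E2

def prob(E, psi):
    """Probability <psi|E|psi> of the outcome associated with E."""
    return float(np.vdot(psi, E @ psi).real)

# Positivity of the inconclusive element: its smallest eigenvalue should be ~0.
eig_E3 = np.linalg.eigvalsh(E3)

# Rows: input state; columns: outcomes E1, E2, E3.
table = {name: [prob(E, psi) for E in (E1, E2, E3)]
         for name, psi in (("psi_1", ket0), ("psi_2", plus))}
```

    With this pre-factor each state is identified with probability $\frac{\sqrt{2}}{2(1+\sqrt{2})} = 1 - \frac{1}{\sqrt{2}} \approx 0.29$, and the inconclusive outcome occurs with probability $\frac{1}{\sqrt{2}} \approx 0.71$ for either input.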

3.2 Quantum channels#

3.2.1 The operator-sum representation#

  • We now ask how to describe the evolution of a subsystem A when the total system AB undergoes unitary evolution. This is the natural next step after generalized measurements. We model the process by supposing that system A starts in a state $\bm{\rho}$, unentangled with an environment B that starts in the state $\ket{0}_B$, and then interacts with B via a unitary $\bm{U}$. For a pure initial state $\ket{\psi}_A$, the joint state after the interaction, expanded in an orthonormal basis $\{\ket{a}\}$ of the environment, is $\sum_a \bm{M}_a \ket{\psi}_A \otimes \ket{a}_B$. If we do not measure the environment, but instead “trace out” (ignore) its degrees of freedom, the initial state $\bm{\rho}$ of system A evolves to a new state $\mathcal{E}(\bm{\rho})$. This evolution is given by a linear map $\mathcal{E}$:

    E(ρ)=aMaρMa.\mathcal{E}(\bm{\rho}) = \sum_a \bm{M}_a \bm{\rho} \bm{M}_a^\dagger.

    This map is called a quantum channel. It is also known as a superoperator (as it maps operators to operators) or, more formally, a trace-preserving completely positive (TPCP) map. The expression E(ρ)=aMaρMa\mathcal{E}(\bm{\rho}) = \sum_a \bm{M}_a \bm{\rho} \bm{M}_a^\dagger is called the operator-sum representation of the channel. The operators {Ma}\{\bm{M}_a\} are called the Kraus operators or operation elements. For E\mathcal{E} to be a valid channel, the Kraus operators must satisfy the completeness relation, which follows from the fact that the total evolution U\bm{U} is unitary:

    aMaMa=I.\sum_a \bm{M}_a^\dagger \bm{M}_a = \bm{I}.

    A quantum channel maps density operators to density operators. This means the map E\mathcal{E} has the following properties:

    1. Linearity: E(αρ1+βρ2)=αE(ρ1)+βE(ρ2).\mathcal{E}(\alpha \bm{\rho}_1 + \beta \bm{\rho}_2) = \alpha \mathcal{E}(\bm{\rho}_1) + \beta \mathcal{E}(\bm{\rho}_2).
    2. Preserves Hermiticity: If ρ=ρ\bm{\rho} = \bm{\rho}^\dagger, then E(ρ)=E(ρ)\mathcal{E}(\bm{\rho}) = \mathcal{E}(\bm{\rho})^\dagger.
    3. Preserves positivity: If ρ0\bm{\rho} \ge 0, then E(ρ)0\mathcal{E}(\bm{\rho}) \ge 0.
    4. Preserves trace: tr(E(ρ))=tr(ρ)\text{tr}(\mathcal{E}(\bm{\rho})) = \text{tr}(\bm{\rho}).

    Any map $\mathcal{E}$ that has an operator-sum representation with Kraus operators obeying this completeness relation can be physically realized by a unitary transformation $\bm{U}$ on an extended system AB, followed by a partial trace over B.

  • The operator-sum representation of a given channel $\mathcal{E}$ is not unique, because we can perform the partial trace over the environment B in any orthonormal basis we please. Start from the joint state $\sum_a \bm{M}_a \ket{\psi}_A \otimes \ket{a}_B$ and choose a different basis $\{\ket{\mu}\}$ for the environment, related to the first by a unitary matrix $\bm{V}$:

    $$\ket{a} = \sum_\mu \ket{\mu} V_{\mu a}.$$

    The joint state can then be rewritten as:

    aMaψA(μμBVμa)=μ(aVμaMa)ψAμB.\sum_a \bm{M}_a \ket{\psi}_A \otimes \left( \sum_\mu \ket{\mu}_B V_{\mu a} \right) = \sum_\mu \left( \sum_a V_{\mu a} \bm{M}_a \right) \ket{\psi}_A \otimes \ket{\mu}_B.

    This gives a new set of Kraus operators {Nμ}\{\bm{N}_\mu\}, where

    Nμ=aVμaMa.\bm{N}_\mu = \sum_a V_{\mu a} \bm{M}_a.

    This new set describes the exact same quantum channel, as μNμρNμ=aMaρMa\sum_\mu \bm{N}_\mu \bm{\rho} \bm{N}_\mu^\dagger = \sum_a \bm{M}_a \bm{\rho} \bm{M}_a^\dagger for all ρ\bm{\rho}. Two channels E1\mathcal{E}_1 and E2\mathcal{E}_2 can be composed to obtain another channel E2E1\mathcal{E}_2 \circ \mathcal{E}_1. If E1\mathcal{E}_1 has operators {Ma}\{\bm{M}_a\} and E2\mathcal{E}_2 has operators {Nμ}\{\bm{N}_\mu\}, the composed channel E2E1\mathcal{E}_2 \circ \mathcal{E}_1 has operators {NμMa}\{\bm{N}_\mu \bm{M}_a\}. Because they can be composed this way, quantum channels form a dynamical semigroup.

    Quantum channels are the formal mechanism for describing decoherence, the evolution of pure states into mixed states. Unitary evolution is the special case of a channel with only one Kraus operator.
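
    The statements in this subsection (completeness, the unitary freedom in the Kraus operators, and composition) can be illustrated with a small NumPy sketch. The channel used here, a qubit dephasing channel with $\bm{M}_0 = \sqrt{1-p}\,\bm{I}$ and $\bm{M}_1 = \sqrt{p}\,\bm{\sigma}_3$, is an assumed example chosen for simplicity; the helper names and the value of `p` are likewise hypothetical.

```python
import numpy as np

def apply_channel(kraus, rho):
    """Operator-sum representation: E(rho) = sum_a M_a rho M_a^dagger."""
    return sum(M @ rho @ M.conj().T for M in kraus)

def is_complete(kraus):
    """Check the completeness relation sum_a M_a^dagger M_a = I."""
    dim = kraus[0].shape[1]
    return np.allclose(sum(M.conj().T @ M for M in kraus), np.eye(dim))

p = 0.3                                             # assumed dephasing probability
I2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
M = [np.sqrt(1 - p) * I2, np.sqrt(p) * Z]

# A second Kraus set N_mu = sum_a V_{mu a} M_a, with V a unitary matrix.
V = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
N = [sum(V[mu, a] * M[a] for a in range(2)) for mu in range(2)]

rho = np.full((2, 2), 0.5, dtype=complex)           # the pure state |+><+|
out_M = apply_channel(M, rho)
out_N = apply_channel(N, rho)                       # same channel, different Kraus set

# Composition: the channel applied twice has Kraus operators {M_b M_a}.
composed = [B @ A for B in M for A in M]
out_twice = apply_channel(composed, rho)
```

    In this sketch `out_M` and `out_N` coincide, and the off-diagonal entries of the output shrink from $0.5$ to $0.5(1-2p)$, a simple instance of a pure state decohering into a mixed one.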

3.2.2 Linearity#

  • A quantum channel specifies how an initial density operator evolves to a final density operator. On general grounds, we should expect this evolution to be described by a linear map E\mathcal{E}. This requirement stems from the interpretation of the density operator as an ensemble of possible states.

    Suppose that at time t=0t=0, an initial state ρi\bm{\rho}_i is prepared with probability pip_i. The initial state of the ensemble is described by the convex combination:

    ρ=ipiρi.\bm{\rho} = \sum_i p_i \bm{\rho}_i.

    If this ensemble evolves to time t=Tt=T, the final state will be:

    ρ=E(ρ)=E(ipiρi).\bm{\rho}' = \mathcal{E}(\bm{\rho}) = \mathcal{E}\left(\sum_i p_i \bm{\rho}_i\right).

    Alternatively, we can consider the ensemble of the final states. At time t=Tt=T, the state will be E(ρi)\mathcal{E}(\bm{\rho}_i) with probability pip_i. The density operator describing this final ensemble is:

    ρ=ipiE(ρi).\bm{\rho}' = \sum_i p_i \mathcal{E}(\bm{\rho}_i).

    Equating these two equivalent descriptions of the final state ρ\bm{\rho}' forces the map E\mathcal{E} to be linear:

    E(ipiρi)=ipiE(ρi).\mathcal{E}\left(\sum_i p_i \bm{\rho}_i\right) = \sum_i p_i \mathcal{E}(\bm{\rho}_i).

3.2.3 Complete positivity#

  • A quantum channel is a linear map that takes density operators to density operators. In particular, it maps nonnegative operators to nonnegative operators. We therefore say that a channel is a positive map.

    However, a channel has a stronger property than mere positivity; it is completely positive. This means that the channel remains positive even when we consider it to be acting on just part of a larger system. If a channel $\mathcal{E}$ maps system A to system $A'$, we can extend it to an auxiliary system B by considering the map $\mathcal{E} \otimes \bm{I}$, which acts on the composite system AB. A map $\mathcal{E}$ is completely positive if every such extension $\mathcal{E} \otimes \bm{I}$ is a positive map, whatever the dimension of B.

    Quantum channels (which have an operator-sum representation) are clearly completely positive. If E\mathcal{E} has Kraus operators {Ma}\{\bm{M}_a\}, then the extended map EI\mathcal{E} \otimes \bm{I} also has an operator-sum representation with Kraus operators {MaI}\{\bm{M}_a \otimes \bm{I}\}. Since EI\mathcal{E} \otimes \bm{I} maps ρABa(MaI)ρAB(MaI)\bm{\rho}_{AB} \mapsto \sum_a (\bm{M}_a \otimes \bm{I}) \bm{\rho}_{AB} (\bm{M}_a^\dagger \otimes \bm{I}), it will always map positive operators to positive operators.

    This is a crucial physical requirement. Even if our channel only describes system A, we must ensure it can map a valid (positive) state of the entire universe (AB) to another valid (positive) state.

  • Not all positive maps are completely positive.

    e.g. The Transpose Map

    Consider the transpose map T\bm{T}, which acts on a dd-dimensional system A as:

    T:ρρT.\bm{T}: \bm{\rho} \mapsto \bm{\rho}^T.

    This map T\bm{T} is positive, because for any vector ψ\ket{\psi}:

    ψρTψ=i,jψj(ρT)jiψi=i,jψi(ρ)ijψj=ψρψ0,\bra{\psi} \bm{\rho}^T \ket{\psi} = \sum_{i,j} \psi_j^* (\bm{\rho}^T)_{ji} \psi_i = \sum_{i,j} \psi_i (\bm{\rho})_{ij} \psi_j^* = \bra{\psi^*} \bm{\rho} \ket{\psi^*} \ge 0,

    so ρT\bm{\rho}^T is non-negative if ρ\bm{\rho} is.

    However, $\bm{T}$ is not completely positive. Consider its extension $\bm{T} \otimes \bm{I}$ acting on the (unnormalized) maximally entangled state of two $d$-dimensional systems A and B:

    Φ~AB=i=0d1iAiB.\ket{\tilde{\Phi}}_{AB} = \sum_{i=0}^{d-1} \ket{i}_A \otimes \ket{i}_B.

    The extended map acts on the projector Φ~Φ~\ket{\tilde{\Phi}}\bra{\tilde{\Phi}} as:

    $$\begin{align*}
    (\bm{T} \otimes \bm{I}) (\ket{\tilde{\Phi}}\bra{\tilde{\Phi}}) &= (\bm{T} \otimes \bm{I}) \left( \sum_{i,j} \ket{i}\bra{j}_A \otimes \ket{i}\bra{j}_B \right) \\
    &= \sum_{i,j} \bm{T}(\ket{i}\bra{j}_A) \otimes \ket{i}\bra{j}_B \\
    &= \sum_{i,j} \ket{j}\bra{i}_A \otimes \ket{i}\bra{j}_B.
    \end{align*}$$

    The resulting operator is the SWAP\textbf{SWAP} operator, which interchanges the systems A and B.

    SWAP:ψAφBφAψB.\textbf{SWAP} : \ket{\psi}_A \otimes \ket{\varphi}_B \mapsto \ket{\varphi}_A \otimes \ket{\psi}_B.

    Since the square of SWAP is the identity, its eigenvalues are ±1\pm 1. States which are symmetric under interchange of A and B have eigenvalue +1, while antisymmetric states have eigenvalue -1.

    Since the output operator SWAP\textbf{SWAP} is not a positive operator (it has negative eigenvalues), the map TI\bm{T} \otimes \bm{I} is not positive. Therefore, the transpose map T\bm{T} is not completely positive.
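
    This argument can be reproduced numerically for $d = 2$. The sketch below is an illustration under stated assumptions (the index bookkeeping via `reshape` and `transpose`, the variable names, and the test density matrix are not from the notes): it applies the transpose to subsystem A of $\ket{\tilde{\Phi}}\bra{\tilde{\Phi}}$ and compares the result with the SWAP operator.

```python
import numpy as np

d = 2
phi = np.eye(d).reshape(d * d)          # unnormalized sum_i |i>_A |i>_B
proj = np.outer(phi, phi)               # |Phi~><Phi~|

# Axes after reshape: (row_A, row_B, col_A, col_B). Transposing on A alone
# exchanges row_A and col_A while leaving the B indices untouched.
pt = proj.reshape(d, d, d, d).transpose(2, 1, 0, 3).reshape(d * d, d * d)

# SWAP |i>_A |j>_B = |j>_A |i>_B.
swap = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        swap[j * d + i, i * d + j] = 1.0

eigs = np.linalg.eigvalsh(pt)           # eigenvalues of (T (x) I)(|Phi~><Phi~|)

# For contrast: the plain transpose of a single-system density matrix stays positive.
rho = np.array([[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]])
eigs_rho_T = np.linalg.eigvalsh(rho.T)
```

    The spectrum of `pt` contains the eigenvalue $-1$, belonging to the antisymmetric state, whereas `eigs_rho_T` stays non-negative: the transpose is positive but not completely positive.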

3.2.4 Reversibility#

  • A unitary transformation U\bm{U} is reversible, as it has a unitary inverse U\bm{U}^\dagger. If a state evolves by U\bm{U}, we can recover the original state by applying U\bm{U}^\dagger. Is the same true for general quantum channels? If a channel E1\mathcal{E}_1 (with Kraus operators {Ma}\{\bm{M}_a\}) is inverted by a channel E2\mathcal{E}_2 (with Kraus operators {Nμ}\{\bm{N}_\mu\}), then their composition must be the identity map for any pure state ψ\ket{\psi}:

    E2E1(ψψ)=μ,aNμMaψψMaNμ=ψψ.\mathcal{E}_2 \circ \mathcal{E}_1 (\ket{\psi}\bra{\psi}) = \sum_{\mu, a} \bm{N}_\mu \bm{M}_a \ket{\psi}\bra{\psi} \bm{M}_a^\dagger \bm{N}_\mu^\dagger = \ket{\psi}\bra{\psi}.

    Since the left-hand side is a sum of positive terms, this equation can only hold if each term is proportional to ψψ\ket{\psi}\bra{\psi}, which implies:

    NμMa=λμaI,\bm{N}_\mu \bm{M}_a = \lambda_{\mu a} \bm{I},

    for each μ\mu and aa. Using the completeness relation for E2\mathcal{E}_2 (μNμNμ=I\sum_\mu \bm{N}_\mu^\dagger \bm{N}_\mu = \bm{I}), we find:

    MbMa=Mb(μNμNμ)Ma=μ(NμMb)(NμMa)=μλμbλμaIβbaI.\bm{M}_b^\dagger \bm{M}_a = \bm{M}_b^\dagger \left( \sum_\mu \bm{N}_\mu^\dagger \bm{N}_\mu \right) \bm{M}_a = \sum_\mu (\bm{N}_\mu \bm{M}_b)^\dagger (\bm{N}_\mu \bm{M}_a) = \sum_\mu \lambda_{\mu b}^* \lambda_{\mu a} \bm{I} \equiv \beta_{ba} \bm{I}.

    This means that each Kraus operator Ma\bm{M}_a must be proportional to a unitary matrix. Using the polar decomposition Ma=UaMaMa\bm{M}_a = \bm{U}_a \sqrt{\bm{M}_a^\dagger \bm{M}_a}, we get:

    Ma=Uaβaa.\bm{M}_a = \bm{U}_a \sqrt{\beta_{aa}}.

    Substituting this back into the relation $\bm{M}_b^\dagger \bm{M}_a = \beta_{ba} \bm{I}$ gives $\sqrt{\beta_{aa}\beta_{bb}}\, \bm{U}_b^\dagger \bm{U}_a = \beta_{ba} \bm{I}$, so for nonzero Kraus operators $\bm{U}_a = \frac{\beta_{ba}}{\sqrt{\beta_{aa}\beta_{bb}}} \bm{U}_b$: all the unitaries $\bm{U}_a, \bm{U}_b, \dots$ are proportional to each other. Every Kraus operator of $\mathcal{E}_1$ is therefore a multiple of one and the same unitary matrix, so the channel acts as a single unitary transformation. We conclude that a quantum channel can be inverted by another quantum channel if and only if it is a unitary transformation.

    This means that decoherence is irreversible. Once system A becomes entangled with system B (the environment), we cannot undo the evolution of A if we don’t have access to B. Decoherence causes quantum information to leak to the environment, and because we cannot control the environment, this information cannot be recovered.

    This argument applies to channels mapping a system A to another system A’ as long as they have the same dimension. This conclusion can be evaded if dim(A)>dim(A)\dim(A') > \dim(A). In this case, the Kraus operators are rectangular, and the same application of the polar decomposition and its consequences does not hold. This exception, where information can be encoded into a larger Hilbert space, is the basis for quantum error correction.

3.2.5 Quantum channels in the Heisenberg picture#

  • We have described quantum channels in the Schrödinger picture, where the quantum state evolves:

    ρ=E(ρ)=aMaρMa.\bm{\rho}' = \mathcal{E}(\bm{\rho}) = \sum_a \bm{M}_a \bm{\rho} \bm{M}_a^\dagger.

    Alternatively, we can use the Heisenberg picture, where the state is stationary and the operators (observables) evolve. This evolution is described by the dual map, E\mathcal{E}^*:

    A=E(A)=aMaAMa.\bm{A}' = \mathcal{E}^*(\bm{A}) = \sum_a \bm{M}_a^\dagger \bm{A} \bm{M}_a.

    These two pictures give the same expectation values, as tr(AE(ρ))=tr(E(A)ρ)\text{tr}(\bm{A} \mathcal{E}(\bm{\rho})) = \text{tr}(\mathcal{E}^*(\bm{A}) \bm{\rho}).

    Note that the dual of a channel need not be a channel; that is, it might not be trace preserving (aMaMaI\sum_{a} \bm{M}_a \bm{M}_a^\dagger \ne \bm{I}). Instead, the completeness property of the Kraus operators {Ma}\{\bm{M}_a\} implies that:

    E(I)=I,\mathcal{E}^*(\bm{I}) = \bm{I},

    if $\mathcal{E}$ is a channel. We say that a map is unital if it preserves the identity operator, and conclude that the dual of a channel is a unital map. Not all quantum channels are unital, but some are. If the Kraus operators of $\mathcal{E}$ satisfy:

    $$\sum_a \bm{M}_a^\dagger \bm{M}_a = \bm{I} = \sum_a \bm{M}_a \bm{M}_a^\dagger,$$

    then $\mathcal{E}$ is unital and its dual $\mathcal{E}^*$ is also a unital channel. A unital quantum channel maps a maximally mixed density operator to itself.
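
    The duality relation and the unital property can be checked on a channel that is not unital. The sketch below uses qubit amplitude damping, with Kraus operators $\bm{M}_0 = \ket{0}\bra{0} + \sqrt{1-p}\,\ket{1}\bra{1}$ and $\bm{M}_1 = \sqrt{p}\,\ket{0}\bra{1}$, as an assumed example; the state `rho`, the observable `A`, and the value of `p` are arbitrary illustrative choices.

```python
import numpy as np

def channel(kraus, rho):
    """Schrodinger picture: E(rho) = sum_a M_a rho M_a^dagger."""
    return sum(M @ rho @ M.conj().T for M in kraus)

def dual(kraus, A):
    """Heisenberg picture: E*(A) = sum_a M_a^dagger A M_a."""
    return sum(M.conj().T @ A @ M for M in kraus)

p = 0.25                                                   # assumed damping probability
M0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)
M1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)
kraus = [M0, M1]

rho = np.array([[0.3, 0.2 - 0.1j], [0.2 + 0.1j, 0.7]])     # hypothetical density matrix
A = np.array([[1.0, 0.5j], [-0.5j, -1.0]])                 # hypothetical Hermitian observable

schrodinger = np.trace(A @ channel(kraus, rho))            # tr(A E(rho))
heisenberg = np.trace(dual(kraus, A) @ rho)                # tr(E*(A) rho)

dual_of_identity = dual(kraus, np.eye(2))                  # always I for a channel
image_of_identity = channel(kraus, np.eye(2))              # equals I only if E is unital
```

    For this channel the two expectation values agree and $\mathcal{E}^*(\bm{I}) = \bm{I}$, but $\mathcal{E}(\bm{I}) = \mathrm{diag}(1+p,\, 1-p) \ne \bm{I}$, so the channel itself is not unital and its dual is not trace preserving.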

3.2.6 Quantum operations#

  • Generalized measurements and quantum channels are special cases of a more general notion called a quantum operation. This concept arises when we entangle a system with a meter, measure the meter, and then retain some of the information about the outcome while discarding the rest.

    We can consider a generalized measurement described by Kraus operators {Maμ}\{\bm{M}_{a\mu}\} which carry two labels, aa and μ\mu. These obey the usual completeness relation:

    a,μMaμMaμ=I.\sum_{a,\mu} \bm{M}_{a\mu}^\dagger \bm{M}_{a\mu} = \bm{I}.

    Suppose that after the measurement, we remember the outcome aa but forget the outcome μ\mu. If the quantum state before the measurement is ρ\bm{\rho}, we sum over the outcomes μ\mu that we forgot. The resulting map, Ea\mathcal{E}_a, associated with outcome aa, is:

    Ea(ρ)μMaμρMaμ.\mathcal{E}_a(\bm{\rho}) \equiv \sum_\mu \bm{M}_{a\mu} \bm{\rho} \bm{M}_{a\mu}^\dagger.

    The outcome aa occurs with probability:

    Prob(a)=tr(Ea(ρ)).\mathrm{Prob}(a) = \text{tr}(\mathcal{E}_a(\bm{\rho})).

    The map Ea\mathcal{E}_a looks like a quantum channel, except that its Kraus operators satisfy an inequality constraint instead of an equality:

    μMaμMaμI.\sum_\mu \bm{M}_{a\mu}^\dagger \bm{M}_{a\mu} \le \bm{I}.

    This means that a quantum operation is not, in general, trace-preserving.

    • If aa takes just one value (all information is discarded), the map is a quantum channel.

    • If μ\mu takes just one value (all information is retained), the map corresponds to a generalized measurement.

    Because the map Ea\mathcal{E}_a is not trace-preserving, the post-measurement state must be renormalized:

    ρEa(ρ)tr(Ea(ρ)).\bm{\rho} \mapsto \frac{\mathcal{E}_a(\bm{\rho})}{\text{tr}(\mathcal{E}_a(\bm{\rho}))}.

    This is a nonlinear map on the state. However, it is often convenient to regard the operation as the linear map Ea\mathcal{E}_a that takes ρ\bm{\rho} to a subnormalized state. For example, in a sequence of nn measurements with outcomes {a1,a2,...,an}\{a_1, a_2, ..., a_n\}, we can apply the linear maps in order and renormalize only at the very end:

    ρEanEa1(ρ)tr(EanEa1(ρ)).\bm{\rho} \mapsto \frac{\mathcal{E}_{a_n} \circ \cdots \circ \mathcal{E}_{a_1}(\bm{\rho})}{\text{tr}(\mathcal{E}_{a_n} \circ \cdots \circ \mathcal{E}_{a_1}(\bm{\rho}))}.

    The denominator in this expression is the total probability of observing that specific sequence of measurement outcomes.
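
    A short NumPy sketch can make the bookkeeping concrete. The measurement chosen here is an assumption made for illustration: a projective readout of $\bm{\sigma}_3$ (remembered label $a$) followed by a bit flip that happens with probability $p$ and is not recorded (forgotten label $\mu$), so that $\bm{M}_{a\mu} = \bm{K}_\mu \bm{P}_a$. All names in the code are hypothetical.

```python
import numpy as np
from itertools import product

def operation(kraus_a, rho):
    """Linear map E_a(rho) = sum_mu M_{a mu} rho M_{a mu}^dagger (not trace preserving)."""
    return sum(M @ rho @ M.conj().T for M in kraus_a)

p = 0.2                                                    # assumed flip probability
X = np.array([[0, 1], [1, 0]], dtype=complex)
K = [np.sqrt(1 - p) * np.eye(2, dtype=complex), np.sqrt(p) * X]      # forgotten label mu
P = [np.diag([1.0, 0.0]).astype(complex),
     np.diag([0.0, 1.0]).astype(complex)]                            # remembered label a
M = {a: [K_mu @ P[a] for K_mu in K] for a in (0, 1)}

def sequence_probability(outcomes, rho):
    """Apply E_{a_n} o ... o E_{a_1} without renormalizing in between."""
    for a in outcomes:
        rho = operation(M[a], rho)
    return np.trace(rho).real, rho

rho0 = np.full((2, 2), 0.5, dtype=complex)                 # initial state |+><+|
joint = {seq: sequence_probability(seq, rho0)[0]
         for seq in product((0, 1), repeat=2)}

# Renormalize only once, at the end of the sequence (0, 0).
prob_00, unnormalized = sequence_probability((0, 0), rho0)
final_state = unnormalized / prob_00
```

    The trace of the unnormalized output is the joint probability of the recorded sequence, so the entries of `joint` sum to one, and dividing by `prob_00` at the end gives the same state as renormalizing after every step.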


Quantum Information
https://zhouzi-wanderder-blog.vercel.app/posts/quantum_information/
Author
Zhouzi
Published at
2025-04-11