\documentclass{book}
\usepackage[utf8]{inputenc}
\usepackage[margin=1in]{geometry}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{amsthm}
\usepackage{physics}
\usepackage{bbold}
\usepackage{appendix}
\usepackage[makeroom]{cancel}
\usepackage[linesnumbered,ruled]{algorithm2e}
\usepackage{tikz}
\usetikzlibrary{quantikz}
\setlength{\parskip}{1em}
\DeclareMathOperator{\diag}{diag}
\newcommand{\mc}[1]{\mathcal{#1}}
\newcommand{\innerprod}[2]{\langle#1,#2\rangle}
\title{Guided solutions to Nielsen and Chuang: Quantum Computation and Quantum Information}
\author{Jacob Watkins}
\date{Last updated May 2023}
\begin{document}
\maketitle
\setcounter{chapter}{-1}
\chapter{Plan and Progress}
\section{Purpose of this solution manual}
The textbook \emph{Quantum Computation and Quantum Information} by Nielsen and Chuang (NC) is often considered the authoritative textbook in the subject by those in the field. Since the book was published, the field has grown and evolved enormously, and rich subfields have branched off to be of interest in their own right. In parallel with the science, more resources have become available for those interested in learning quantum computing. These include textbooks, tutorials, blog posts, software packages, and small quantum devices available to the public for playful exploration. Despite these developments, NC retains much of its original value and relevance as a comprehensive introduction to this subject.
Surprisingly, there seems to be a lack of complete, easily available, and guided solutions to NC problems. Partial solutions exist in several places on the internet, including GitHub. Taken together, they appear to cover Chapters 2, 3, 4, and 9 almost completely, with scattered solutions for the other chapters. In many cases, these solutions are not worked through pedagogically. The goal of this document is to provide a comprehensive set of worked-through solutions to as many (if not all) NC exercises as possible, as a supplement to the textbook for those who need help getting unstuck on a problem.
Under what conditions can a student best learn a technical subject such as quantum computing? For those students who learn in a formal class setting at a university, or under mentorship from an expert, the student, ideally, has access not only to the relevant information, but also \emph{feedback} that allows them to refine their skills. In today's age of the internet, almost everyone has some access to the same raw information on quantum computing. However, this may not be enough to truly equalize the playing field for those outside academic institutions, or for those who attend a university with little to no expertise in the subject. A lack of feedback and guidance for independent learners could hamper their progress, since they are unaware of conceptual traps or useful mental shortcuts that are often discussed in informal settings such as office hours.
It has sometimes been suggested to me, by my previous professors or certain textbook authors, that a bright, motivated student can learn on their own directly from a textbook, without recourse to much else. While this might be true in some sense, a more important question is how efficiently this learning occurs, compared to those who have guided practice with feedback. I hope that these worked-through solutions will provide a different kind of resource to independent quantum learners, a kind which is sorely needed.
\textbf{I would welcome any involvement from the community on this project. If anyone wants to help contribute, please send me a message!}
\section{Progress to completion}
This progress was last updated May 26th, 2023.
\begin{itemize}
\item Chapter 1: All in-text exercises done. Two open ended problems left alone.
\item Chapter 2: All complete! Would like some editorial review.
\item Chapter 3: 1 "complete".
\item Chapter 4: All but 27-30 (not my fav). Some final probs (not research questions!)
\item Chapter 5: 1-14 complete, 15-29 remain, along with 6 practice problems.
\item Chapter 6: 1-8 complete.
\item Chapter 7: 1-19 complete, except 16.
\item Chapter 8: 1-6 complete.
\item Chapter 9: All exercises complete (end-of-chapter problems remain).
\item Chapter 10: 1-11 and 29-40 complete.
\item Chapter 11: Only first exercise.
\item Chapter 12: 1-4.
\item Appendix 1: All exercises complete!
\item Appendix 2: Some.
\item Appendix 3: Only the first.
\item Appendix 4: All exercises complete except 17 and part 2 of problem 1.
\item Appendix 5: Almost done. Problem 1 needs to be cleaned up.
\item Appendix 6: All exercises complete!
\end{itemize}
\chapter{Introduction and overview}
\section*{Exercise 1.1 (Probabilistic classical algorithm)}
Suppose that the problem is not to distinguish between the constant and balanced functions with certainty, but rather, with some probability of error $\epsilon < 1/2$. What is the performance of the best classical algorithm for this problem?
\textbf{Solution}: Intuitively, we should be able to check whether a function $f$ is constant or balanced by looking at the output of a few randomly chosen inputs. If $f$ were balanced, it would be very unlucky if all of the inputs we tried gave the same answer, just as flipping a coin ten times rarely gives all heads (or tails).
Let's make these ideas more precise. Suppose I sample $s$ inputs at random from the $2^n$ possible bitstrings of length $n$. If any two of these inputs have a different output under $f$, I can be certain that $f$ is balanced. If all of my inputs give the same answer, it makes sense to guess that $f$ is constant. Of course, there is a chance that $f$ is balanced, and I merely got unlucky. Conditioned on the first output, each of the remaining $s-1$ samples matches it with probability $1/2$, so the probability that all my outputs are the same, assuming $f$ is in fact balanced, is
\begin{align}
\frac{1}{2^{s-1}}
\end{align}
where we've assumed each input is sampled uniformly at random and independently. We can make this ``failure probability" as small as we like. By choosing $s > 1 + \log_2(1/\epsilon)$, the failure probability is less than $\epsilon$. Therefore, we will be able to distinguish $f$ constant vs.\ balanced with an arbitrarily small chance of failure.
Notice that not only can we make $\epsilon$ arbitrarily small, we can do so ``exponentially quickly": each new sample shrinks our failure probability by a factor of $1/2$. This suggests that, in practice, we should have no trouble solving this problem on regular computers with a high probability of success.
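For a concrete sense of scale, suppose we want the failure probability below $\epsilon = 10^{-3}$. Then $s = 11$ samples already suffice, since
\begin{align}
\frac{1}{2^{s-1}} = \frac{1}{2^{10}} = \frac{1}{1024} < 10^{-3}.
\end{align}
Compare this with the $2^{n-1}+1$ evaluations needed in the worst case to decide the question with certainty.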
\section*{Exercise 1.2}
Explain how a device which, upon input of one of two non-orthogonal quantum states $\ket{\psi}$ or $\ket{\varphi}$ correctly identified the state, could be used to build a device which cloned the states $\ket{\psi}$ and $\ket{\varphi}$, in violation of the no-cloning theorem. Conversely, explain how a device for cloning could be used to distinguish non-orthogonal quantum states.
\textbf{Solution}: Suppose we could distinguish any two quantum states $\ket{\psi}$ and $\ket{\varphi}$ perfectly, given only a single copy. Such a device could store the result in a single (classical or quantum) bit, with, say, zero for $\ket{\psi}$ and one for $\ket{\varphi}$. Then we can define a unitary operation which will reconstruct the state from the result. Specifically, let $CU$ be the controlled unitary which acts on a ``blank" quantum system $S$ by preparing the state $\ket{\psi}$ or $\ket{\varphi}$ depending on whether the result bit was zero or one. Applying this operation after the measurement would prepare another copy of the original state, effectively cloning it.
Conversely, cloning could be used to distinguish any two quantum states. By creating many copies of the state of interest and performing suitable measurements on the copies, we can determine the quantum state to arbitrary precision, or at least well enough to tell the two candidates apart.
Together, this shows that the ability to clone and the ability to distinguish non-orthogonal states are equivalent in power. Since the no-cloning theorem rules out the first, the second must be impossible as well!
\chapter{Introduction to quantum mechanics}
\section*{Exercise 2.1: (Linear dependence: example)}
Show that $(1,-1)$, $(1, 2)$ and $(2, 1)$ are linearly dependent.
\textbf{Solution}: By inspection (i.e. staring long enough), one might be able to see that $(1,-1) + (1,2) = (2,1)$, hence,
\begin{align}
(1,-1)+(1,2) -(2,1) = 0.
\end{align}
This demonstrates linear dependence, the last rearrangement being more of a formality. Essentially, one of the vectors can be reached by ``moving along" the directions of the other two.
A more systematic approach would be to cast this problem as a system of linear equations. We want to know if one of the three vectors is in the span of the other two. It turns out that this is equivalent to asking whether the matrix equation
\begin{align}
\begin{pmatrix}
1 & 1 & 2 \\
-1 & 2 & 1
\end{pmatrix}
\begin{pmatrix}
a \\
b \\
c
\end{pmatrix}
=\begin{pmatrix}
0 \\
0
\end{pmatrix}
\end{align}
has any ``nontrivial" solutions (nontrivial meaning at least one of $a, b, c$ is nonzero). This can be solved using your favorite method for linear equations (row reduction, substitution, computers). In fact, there are infinitely many solutions: give it a try!
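For instance, the combination found by inspection above corresponds to $(a, b, c) = (1, 1, -1)$, which we can check is indeed a nontrivial solution:
\begin{align}
\begin{pmatrix}
1 & 1 & 2 \\
-1 & 2 & 1
\end{pmatrix}
\begin{pmatrix}
1 \\
1 \\
-1
\end{pmatrix}
=
\begin{pmatrix}
1 + 1 - 2 \\
-1 + 2 - 1
\end{pmatrix}
=
\begin{pmatrix}
0 \\
0
\end{pmatrix}.
\end{align}
Any scalar multiple of $(1, 1, -1)$ works just as well, which is where the infinitude of solutions comes from.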
\section*{Exercise 2.2: (Matrix representations: example)}
Suppose $V$ is a vector space with basis vectors $\ket{0}$ and $\ket{1}$, and $A$ is a linear operator from $V$ to $V$ such that $A\ket{0}=\ket{1}$ and $A\ket{1}=\ket{0}$. Give a matrix representation for $A$, with respect to the input basis $\ket{0}, \ket{1}$, and the output basis $\ket{0}, \ket{1}$. Find input and output bases which give rise to a different matrix representation of $A$.
\textbf{Solution}: Making the identification
\begin{align}
\ket{0} =
\begin{pmatrix}
1 \\
0
\end{pmatrix}, \quad
\ket{1} =
\begin{pmatrix}
0 \\
1
\end{pmatrix}
\end{align}
it is easy to see that the matrix representation $M$ of $A$ must be
\begin{align}
M =
\begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix}.
\end{align}
On the other hand, say we choose the output basis to be $\ket{1},\ket{0}$ instead of $\ket{0},\ket{1}$ (note that order matters). In this case, $M$ would take the form of the identity matrix.
\section*{Exercise 2.3: (Matrix representation for operator products)}
Suppose $A$ is a linear operator from vector space $V$ to vector space $W$, and $B$ is a linear operator from vector space $W$ to vector space $X$. Let $\ket{v_i}, \ket{w_j}$, and $\ket{x_k}$ be bases for the vector spaces $V, W$, and $X$ respectively. Show that the matrix representation for the linear transformation $BA$ is the matrix product of the matrix representations for $B$ and $A$, with respect to the appropriate bases.
\textbf{Solution}: Let us compute the action of $BA$ on a basis vector $\ket{v_i}$.
\begin{align}
\begin{aligned}
BA\ket{v_i} &= B(A\ket{v_i}) \\
&= B(\sum_j A_{ji}\ket{w_j}) \\
&= \sum_j A_{ji}B\ket{w_j} \\
&= \sum_j\sum_k A_{ji}B_{kj}\ket{x_k} \\
&= \sum_k \left(\sum_j B_{kj}A_{ji}\right) \ket{x_k} \\
&= \sum_k (BA)_{ki} \ket{x_k}.
\end{aligned}
\end{align}
In the last step, we identified the coefficient as the product of the two matrix representations of $A$ and $B$. This gives the result.
\section*{Exercise 2.4: (Matrix representation for the identity)}
Show that the identity operator on a vector space $V$ has a matrix representation which is one along the diagonal and zero everywhere else, if the matrix representation is taken with respect to the same input and output bases. This matrix is known as the \emph{identity matrix}.
\textbf{Solution}: Let $M$ be the matrix representation of the identity operator $I$ with respect to some basis $\left\{\ket{v_i}\right\}$. Then,
\begin{align}
I\ket{v_i} = \ket{v_i} = \sum_j \delta_{ji}\ket{v_j},
\end{align}
where $\delta_{ji}$ is the Kronecker delta. The coefficients $\delta_{ji}$ are exactly those of the identity matrix $\mathbb{1}$, hence $M = \mathbb{1}$.
\section*{Exercise 2.5}
Verify that $(\cdot,\cdot)$ just defined is an inner product on $\mathbb{C}^n$.
\textbf{Solution}: Let's use letters $x, y, z$ to denote elements of $\mathbb{C}^n$ (lists of $n$ complex entries), and let $\lambda \in \mathbb{C}$ be a complex number. To check linearity, let's first consider $(x, y+z)$. Using the definition,
\begin{align}
\begin{aligned}
(x, y+z) &= \sum_{i=1}^n x_i^* (y_i + z_i) \\
&= \sum_{i=1}^n (x_i^* y_i + x_i^* z_i) \\
&= \sum_{i=1}^n x_i^* y_i + \sum_{i=1}^n x_i^* z_i \\
&= (x, y) + (x,z).
\end{aligned}
\end{align}
This shows that $(\cdot, \cdot)$ satisfies the additive portion of linearity. To show the multiplication part, consider $(x, \lambda y)$.
\begin{align}
\begin{aligned}
(x, \lambda y) &= \sum_{i=1}^n x_i^* \lambda y_i \\
&= \lambda \sum_{i=1}^n x_i^* y_i \\
&= \lambda (x,y).
\end{aligned}
\end{align}
The two properties just shown can be combined to demonstrate linearity as given in the text. Can you see why? As a starting point, can we generalize to sums of more than two vectors?
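To sketch why: applying additivity repeatedly across a finite sum, then pulling out each scalar, gives
\begin{align}
\left(x, \sum_i \lambda_i y_i\right) = \sum_i (x, \lambda_i y_i) = \sum_i \lambda_i (x, y_i),
\end{align}
which is exactly the linearity property stated in the text.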
For the ``skew-symmetry" property (2), we perform the following calculation.
\begin{align}
\begin{aligned}
(x, y) &= \sum_{i=1}^n x_i^* y_i \\
&= \left(\sum_{i=1}^n x_i y_i^*\right)^* \\
&= (y, x)^*
\end{aligned}
\end{align}
We used a nice property of complex conjugates: they distribute over sums and products of complex numbers. Finally, we demonstrate property (3) with another calculation from the definition of $(\cdot, \cdot)$.
\begin{align}
\begin{aligned}
(x, x) &= \sum_{i=1}^n x_i^* x_i \\
&= \sum_{i=1}^n \abs{x_i}^2
\end{aligned}
\end{align}
We see that $(x,x)$ is a sum of $n$ nonnegative entries $\abs{x_i}^2$, and is therefore also nonnegative. Moreover, $(x,x)$ will be positive unless every term in the sum is zero, that is, if $x_i = 0$ for every $i$. In this case, we simply have $x = 0$. Therefore $(x,x) = 0$ if and only if $x = 0$. We've now shown that $(\cdot, \cdot)$ defines a valid inner product.
\section*{Exercise 2.6}
Show that any inner product $(\cdot, \cdot)$ is conjugate-linear in the first argument,
\begin{align}
\left(\sum_i \lambda_i \ket{w_i}, \ket{v}\right) = \sum_i \lambda_i^* (\ket{w_i}, \ket{v})
\end{align}
\textbf{Solution}: For simplicity, I will drop the ket notation, using $w_i$ instead of $\ket{w_i}$, for example. The conjugate-linearity follows from two of the three properties in the definition of an inner product. The following calculation shows this.
\begin{align}
\begin{aligned}
\left(\sum_i \lambda_i w_i, v\right) &= \left(v, \sum_i \lambda_i w_i\right)^* \\
&= \sum_i \lambda_i^* (v, w_i)^* \\
&= \sum_i \lambda_i^* (w_i, v).
\end{aligned}
\end{align}
In particular, we only needed to use properties (1) and (2)! Some properties of complex conjugation were invoked such as distribution over multiplication: $(ab)^* = a^* b^*$.
\section*{Exercise 2.7}
Verify that $\ket{w} \equiv (1, 1)$ and $\ket{v} \equiv (1, -1)$ are orthogonal. What are the normalized forms of these vectors?
\textbf{Solution}: To see they are orthogonal, we compute their inner product as defined in the text.
\begin{align}
(w,v) := \sum_{i=1}^2 w_i^* v_i = (1)(1) + (1)(-1) = 0
\end{align}
Note that the complex conjugate in the definition becomes unimportant when both vectors are real. To normalize the vectors, let's first compute their norms.
\begin{align}
\norm{w} = \sqrt{1^2 + 1^2} = \sqrt{2} \\
\norm{v} = \sqrt{1^2 + (-1)^2} = \sqrt{2}
\end{align}
Hence, the normalized versions $\hat{w}$, $\hat{v}$ are $(1,1)/\sqrt{2}$ and $(1,-1)/\sqrt{2}$ respectively. As a check, can you verify that each of these really has unit norm?
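If you would like to compare against your own check, for $\hat{w}$ (the calculation for $\hat{v}$ is identical),
\begin{align}
\norm{\hat{w}} = \sqrt{\left(\tfrac{1}{\sqrt{2}}\right)^2 + \left(\tfrac{1}{\sqrt{2}}\right)^2} = \sqrt{\tfrac{1}{2} + \tfrac{1}{2}} = 1.
\end{align}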
\section*{Exercise 2.8}
Prove that the Gram-Schmidt procedure produces an orthonormal basis for $V$.
\textbf{Solution}: There are several things we need to check that are packed into this sentence. We need to check that the $\ket{v_k}$ are orthonormal, meaning they are orthogonal and normalized. Then we need to check that they are a basis for $V$.
These facts are not necessarily separate. If $d=\dim{V}$ is the dimension of $V$, a set of $d$ orthogonal, nonzero vectors will always form a basis. If it helps, this can be illustrated from the $xy$ axes of the plane, or an $xyz$ grid in three dimensions. We will take this statement for granted to solve the problem, but if you'd like to learn how to prove this, consult a linear algebra textbook with chapters on linear dependence and orthogonality.
The Gram-Schmidt procedure works by subtracting off the component of each $\ket{w_{k+1}}$ that is parallel to the preceding $k$ vectors, then normalizing the result. The ``normalization" step means dividing by the norm, which we see is done in equation 2.17 of the text. We just need to make sure that the norm is not actually zero. If it was zero, then from the numerator of 2.17 (in text) we would have
\begin{align}
\ket{w_{k+1}} = \sum_{i=1}^k \braket{v_i}{w_{k+1}} \ket{v_i}.
\end{align}
Since each $\ket{v_i}$ is, by construction, in the subspace spanned by the $\ket{w_j}$ for $j \leq i$, this would imply that $\ket{w_{k+1}}$ is a superposition of the other $\ket{w_j}$, contradicting our assumption that the $\ket{w_j}$ are linearly independent! Since this can't happen, the norm is not zero, so $\ket{v_{k+1}}$ as written is well defined and normalized.
Now let's check that the vectors are orthogonal. Because each $\ket{v_i}$ is constructed from the previous ones, it's easiest to proceed by checking that each $\ket{v_{k+1}}$ is orthogonal to all $\ket{v_j}$ for $j \leq k$. Since normalization does not matter here, we can just consider the numerator of equation 2.17, which we'll call $\ket{x_{k+1}}$. For any $j \leq k$, we have
\begin{align}\label{eq:component_innprod}
\braket{v_j}{x_{k+1}} = \braket{v_j}{w_{k+1}} - \sum_{i=1}^k \braket{v_i}{w_{k+1}} \braket{v_j}{v_i}.
\end{align}
Let's assume we've already successfully shown that all pairs $\ket{v_i}, \ket{v_j}$ are orthogonal for $i,j \leq k$, $i\neq j$. Then $\braket{v_j}{v_i} = \delta_{ij}$ in equation~\eqref{eq:component_innprod}, where $\delta_{ij}$ is the Kronecker delta. This captures the orthonormality. We'll now show that $\ket{v_{k+1}}$ should be included in the set as well, extending the set by one. Using the orthonormality,
\begin{align}
\begin{aligned}
\braket{v_j}{x_{k+1}} &= \braket{v_j}{w_{k+1}} - \braket{v_j}{w_{k+1}} \\
&= 0
\end{aligned}
\end{align}
This shows $\ket{v_1}, \ket{v_2}, \dots, \ket{v_{k+1}}$ form an orthonormal set. Repeating for $k+2, k+3$ and so on up until the dimension $d$, we can show that all the $\ket{v_k}$ form an orthonormal basis. This shows the Gram-Schmidt procedure works.
For those interested, one can formalize the above argument into a proof by mathematical induction. For those who found the argument confusing, try drawing some pictures to better understand the first step of the Gram-Schmidt procedure. What is going on and why does it work?
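To make the first step concrete: we set $\ket{v_1} = \ket{w_1}/\norm{\ket{w_1}}$ and then remove from $\ket{w_2}$ its ``shadow" along $\ket{v_1}$,
\begin{align}
\ket{x_2} = \ket{w_2} - \braket{v_1}{w_2}\ket{v_1},
\qquad
\braket{v_1}{x_2} = \braket{v_1}{w_2} - \braket{v_1}{w_2}\braket{v_1}{v_1} = 0,
\end{align}
so that normalizing $\ket{x_2}$ yields a unit vector $\ket{v_2}$ orthogonal to $\ket{v_1}$. Every later step repeats this picture, with more directions to subtract off.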
\section*{Exercise 2.9 (Pauli operators and the outer product)}
The Pauli matrices (Figure 2.2 on page 65) can be considered as operators with respect to an orthonormal basis $\ket{0}, \ket{1}$ for a two-dimensional Hilbert space. Express each of the Pauli operators in the outer product notation.
\textbf{Solution}: The identity matrix, as was shown in the text, is given by the completeness relation for a single qubit.
\begin{align}
I = \sigma_0 = \ket{0}\bra{0} + \ket{1}\bra{1}
\end{align}
To construct the other ones, we'll use the following trick. Any matrix can be broken into a linear combination of matrices, each with only one nonzero entry that is equal to one. For two dimensions, it looks like this.
\begin{align} \label{eq:out_prod_decomp}
\begin{pmatrix}
a & b \\
c & d \\
\end{pmatrix}
= a
\begin{pmatrix}
1 & 0 \\
0 & 0
\end{pmatrix}
+ b
\begin{pmatrix}
0 & 1 \\
0 & 0
\end{pmatrix}
+ c
\begin{pmatrix}
0 & 0 \\
1 & 0
\end{pmatrix}
+ d
\begin{pmatrix}
0 & 0 \\
0 & 1
\end{pmatrix}
\end{align}
What is the outer product representation of each such matrix? Let $E^{(ij)}$ represent the operator in which only the $ij$th matrix element is one, the rest being zero. This implies that almost all basis vectors, when multiplied by $E^{(ij)}$, are sent to zero. The only one which isn't is $\ket{j}$, and this is sent to $\ket{i}$. This leads to the outer product formula
\begin{align}
E^{(ij)} = \ket{i}\bra{j},
\end{align}
which can be checked. Thus, using the decomposition~\eqref{eq:out_prod_decomp}, we can convert any matrix into an outer product representation. We do that now for each Pauli matrix.
\begin{align}
\begin{aligned}
\sigma_1 &=
\begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix} \rightarrow
\ket{0}\bra{1} + \ket{1}\bra{0} \\
\sigma_2 &=
\begin{pmatrix}
0 & -i \\
i & 0
\end{pmatrix} \rightarrow
-i \ket{0}\bra{1} + i \ket{1}\bra{0} \\
\sigma_3 &=
\begin{pmatrix}
1 & 0 \\
0 & -1
\end{pmatrix} \rightarrow
\ket{0}\bra{0} - \ket{1}\bra{1}
\end{aligned}
\end{align}
This suggests a general method: notice that the $ij$th element gets mapped to a coefficient for the term with operator $\ket{i}\bra{j}$. Can you write the outer product form of the matrix given in equation~\eqref{eq:out_prod_decomp}?
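Following that observation, the general matrix in equation~\eqref{eq:out_prod_decomp} has the outer product form
\begin{align}
\begin{pmatrix}
a & b \\
c & d
\end{pmatrix}
= a\ket{0}\bra{0} + b\ket{0}\bra{1} + c\ket{1}\bra{0} + d\ket{1}\bra{1}.
\end{align}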
\section*{Exercise 2.10}
Suppose $\ket{v_i}$ is an orthonormal basis for an inner product space $V$. What is the matrix representation for the operator $\ket{v_j}\bra{v_k}$ with respect to the $\ket{v_i}$ basis?
\textbf{Solution}: The operator $\ket{v_j}\bra{v_k}$ sends every $\ket{v_i}$ to zero except when $i = k$, in which case
\begin{align}
\left(\ket{v_j}\bra{v_k}\right) \ket{v_k} = \ket{v_j}.
\end{align}
This means the matrix representation $M$ is zero everywhere except in the $k$th column, and within that column the only nonzero entry is in the $j$th row, because that is what sends $\ket{v_k} \rightarrow \ket{v_j}$ as column vectors. Hence, $M$ has a single nonzero entry, the $jk$th entry, with value one. This agrees with our discussion from the previous exercise.
\section*{Exercise 2.11 (Eigendecomposition of the Pauli matrices)}
Find the eigenvectors, eigenvalues, and diagonal representations of the Pauli matrices $X$, $Y$, and $Z$.
\textbf{Solution}: Let's start with $Z$ since this is the simplest case. Why is it simple? Because $Z$ is already a diagonal matrix! We can read off the eigenvalues as the diagonal entries: $-1$ and $1$. Moreover, as for any diagonal matrix, the eigenvectors are just the standard basis vectors.
\begin{align}
Z \begin{pmatrix}
1 \\
0
\end{pmatrix}
= \begin{pmatrix}
1 \\
0
\end{pmatrix}
\quad Z
\begin{pmatrix}
0 \\
1
\end{pmatrix}
= -\begin{pmatrix}
0 \\
1
\end{pmatrix}
\end{align}
The diagonal representation is given in the text, equation 2.29.
I will give a complete solution for $X$, and I suggest you try to solve for $Y$ in the same manner. Let's proceed with the standard method of the characteristic equation. Suppose $\lambda$ is an eigenvalue of $X$, with eigenvector $\ket{v}$, such that
\begin{align} \label{eq:char_matrix}
\begin{aligned}
X \ket{v} &= \lambda \ket{v} \\
(X - \lambda I) \ket{v} &= 0.
\end{aligned}
\end{align}
Take a minute to think about why the second line follows from the first. What rearranging did I do?
The fact that there is a nonzero $\ket{v}$ such that the second line of equation~\eqref{eq:char_matrix} holds means that $X - \lambda I$ is a ``singular matrix". Singularity can be characterized in many equivalent ways, one of which is that the determinant is zero.
\begin{align}
\det (X - \lambda I) = 0
\end{align}
As a reminder, all of this applies more generally to determining the eigenvalues and eigenvectors of any square matrix. Let's calculate the determinant to solve for $\lambda$.
\begin{align}
\begin{aligned}
\det (X - \lambda I) = \det \begin{pmatrix}
-\lambda & 1 \\
1 & -\lambda
\end{pmatrix} &= 0 \\
\lambda^2 - 1 &= 0 \\
\lambda &= \pm 1.
\end{aligned}
\end{align}
Thus, the eigenvalues of $X$ are $\pm 1$. Since $X$ acts on a two-dimensional space and has two distinct eigenvalues, each eigenvalue has exactly one eigenvector (up to an overall constant). How do we find them? Let $\ket{+}$ and $\ket{-}$ correspond to the eigenvectors for $+1$ and $-1$ respectively. Going back to equation~\eqref{eq:char_matrix}, this means
\begin{align} \label{eq:eigenvecs_X}
(X-I)\ket{+} = 0 \\
(X + I) \ket{-} = 0.
\end{align}
These are linear equations we can solve, using our favorite method! Gaussian elimination is a great one in general. Solving each of these we will find the answer to be
\begin{align}
\begin{aligned}
\ket{+} &\propto \begin{pmatrix}
1 \\
1
\end{pmatrix} \\
\ket{-} &\propto \begin{pmatrix}
1 \\
-1
\end{pmatrix} \\
\end{aligned}
\end{align}
Here, $\propto$ means ``proportional to" by some constant $c$ which is undetermined. Any $c$ will solve~\eqref{eq:eigenvecs_X}, so there are an infinite number of solutions. However, we want \emph{normalized} solutions, so that $\braket{+}{+} = \braket{-}{-} = 1$. Since the norm of $\ket{+}$ and $\ket{-}$ is $\sqrt{2}$ (check it!) a good choice for $c$ then is $1/\sqrt{2}$. But $-1/\sqrt{2}$ would also work, wouldn't it? Anyways, with this normalization, our eigenvectors are determined, and our diagonal decomposition is given as
\begin{align}
X = \ket{+}\bra{+} - \ket{-}\bra{-}.
\end{align}
(Sorry if the notation strains your eyes a bit, it does mine as well!)
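As a quick consistency check, expanding the outer products with the normalization $c = 1/\sqrt{2}$ recovers the original matrix:
\begin{align}
\ket{+}\bra{+} - \ket{-}\bra{-}
= \frac{1}{2}\begin{pmatrix}
1 & 1 \\
1 & 1
\end{pmatrix}
- \frac{1}{2}\begin{pmatrix}
1 & -1 \\
-1 & 1
\end{pmatrix}
= \begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix} = X.
\end{align}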
This concludes our analysis for $X$. I encourage you to try solving for $Y$, referencing the above steps as needed. I will simply list the solution. The eigenvalues are still $+1$ and $-1$, and the corresponding eigenvectors, which we'll call $\ket{+i}$ and $\ket{-i}$, are given by
\begin{align}
\begin{aligned}
\ket{+i} &= \frac{1}{\sqrt{2}} \begin{pmatrix}
1 \\
i
\end{pmatrix} \\
\ket{-i} &= \frac{1}{\sqrt{2}} \begin{pmatrix}
1 \\
-i
\end{pmatrix}.
\end{aligned}
\end{align}
Therefore, the diagonal decomposition is given by
\begin{align}
Y = \ket{+i}\bra{+i} - \ket{-i}\bra{-i}.
\end{align}
Having solved the problem stated, let me just leave some parting comments. It is no coincidence that all the Paulis have the same eigenvalues, because they are related by a change of basis. Intuitively, changing the basis (``point of view") should not affect the stretching/compressing properties of the matrix. Once the change of basis is known, it can be used to relate all the eigenvectors together as well. These concepts will become more transparent in Chapter 4!
\section*{Exercise 2.12}
Prove that the matrix
\begin{align}
\begin{pmatrix}
1 & 0 \\
1 & 1
\end{pmatrix}
\end{align}
is not diagonalizable.
\textbf{Solution}: To investigate this claim, let's find the eigenvalues and eigenvectors. Let $A$ be this matrix. The determinant of $A - \lambda I$ is then given by
\begin{align}
\det \begin{pmatrix}
1-\lambda & 0 \\
1 & 1-\lambda
\end{pmatrix} = (1-\lambda)^2.
\end{align}
This is equal to zero only when $\lambda = 1$, so there is only one eigenvalue. It may be tempting to conclude here that $A$ is not diagonalizable: don't we only have one eigenvector? Not so: there could be \emph{two} independent eigenvectors corresponding to this single eigenvalue. This is sometimes called ``degeneracy", especially within physics. Let's determine which case is true by calculating the eigenvectors. If $\ket{v}$ is an eigenvector, it must satisfy
\begin{align}
\begin{aligned}
(A-I)\ket{v} &= 0 \\
\begin{pmatrix}
0 & 0 \\
1 & 0
\end{pmatrix} \ket{v} = 0.
\end{aligned}
\end{align}
We can solve this system any way we like. There is only one (linearly independent) solution.\footnote{By this I mean any two solutions are related by a constant.}
\begin{align}
\ket{v} = \begin{pmatrix}
0 \\
1
\end{pmatrix}
\end{align}
This is the unique eigenvector of $A$. But $A$ cannot be written in terms of $\ket{v}\bra{v}$ alone (why?). This shows that $A$ is not diagonalizable.
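To answer the ``why": a diagonal representation built from this single (normalized) eigenvector would have to take the form $\lambda\ket{v}\bra{v}$, but
\begin{align}
\lambda \ket{v}\bra{v} = \begin{pmatrix}
0 & 0 \\
0 & \lambda
\end{pmatrix}
\neq \begin{pmatrix}
1 & 0 \\
1 & 1
\end{pmatrix}
\end{align}
for any value of $\lambda$. A diagonalizable operator needs a full set of linearly independent eigenvectors, and $A$ simply does not have one.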
\section*{Exercise 2.13}
If $\ket{w}$ and $\ket{v}$ are any two vectors, show that $(\ket{w}\bra{v})^\dagger = (\ket{v}\bra{w})$.
\textbf{Solution}: We could solve this from the definition, but it also falls out from a more interesting consideration. We can think of $\bra{v}$ as a linear operator from the Hilbert space $H$ to $\mathbb{C}$, and $\ket{w}$ as another linear operator from $\mathbb{C}$ back into $H$. Then $\ket{w}\bra{v}$ is just the composition of the two maps, and since the adjoint reverses the order of a composition, we have $(\ket{w}\bra{v})^\dagger = \bra{v}^\dagger \ket{w}^\dagger = \ket{v}\bra{w}$.
But for sake of clarity, let's also work from the definition. For any two vectors $\phi, \psi$, we have
\begin{align}
\begin{aligned}
((\ket{w}\bra{v})^\dagger \phi, \psi) &=(\phi, \ket{w}\bra{v} \psi) \\
&= \bra{\phi}\ket{w}\bra{v}\ket{\psi} \\
&=(\bra{\psi}\ket{v}\bra{w}\ket{\phi})^* \\
&= (\psi, \ket{v}\bra{w} \phi)^* \\
&= (\ket{v}\bra{w} \phi, \psi).
\end{aligned}
\end{align}
Comparing the first and last lines, since this holds for arbitrary $\phi, \psi$, we have
\begin{align}
(\ket{w}\bra{v})^\dagger = \ket{v}\bra{w}.
\end{align}
\section*{Exercise 2.14 (Anti-linearity of the adjoint)}
Show that the adjoint operation is anti-linear,
\begin{align}
\left(\sum_i a_i A_i\right)^\dagger = \sum_i a_i^* A_i^\dagger.
\end{align}
\textbf{Solution}: Like before, we will work directly with the definition from the inner product, making use of conjugate-linearity in the first argument (see Exercise 2.6). For any two $\phi, \psi$,
\begin{align}
\begin{aligned}
( (\sum_i a_i A_i)^\dagger \phi, \psi) &= (\phi, \sum_i a_i A_i \psi) \\
&= \sum_i a_i (\phi, A_i \psi) \\
&=\sum_i a_i (A_i^\dagger \phi, \psi) \\
&= ((\sum_i a_i^* A_i^\dagger) \phi, \psi) \\
\end{aligned}
\end{align}
Comparing the beginning to end, since $\phi$ and $\psi$ were completely arbitrary, we must have the operators are equal. This gives our result.
\section*{Exercise 2.15}
Show that $(A^\dagger)^\dagger = A$.
\textbf{Solution}: Once again, from the definition! Choosing arbitrary $\phi, \psi$,
\begin{align}
(\phi, A\psi) = (A^\dagger \phi, \psi) = (\psi, A^\dagger\phi)^* = ((A^\dagger)^\dagger \psi, \phi)^* = (\phi, (A^\dagger)^\dagger \psi).
\end{align}
Thus, comparing the first and last of this string of equations, we see that $A$ and $(A^\dagger)^\dagger$ are necessarily equal.
\section*{Exercise 2.16}
Show that any projector $P$ satisfies $P^2 = P$.
\textbf{Solution}: In some treatments, this is taken as the definition of a projector! Let's prove it from the textbook's definition, using equation (2.35) from the text.
\begin{align}
\begin{aligned}
P^2 = P P &= \left(\sum_i \ket{i}\bra{i}\right)\left(\sum_j \ket{j}\bra{j}\right) \\
&= \sum_{i,j} \ket{i}\bra{i}\ket{j}\bra{j} \\
&= \sum_{i,j} \ket{i}\bra{j} \delta_{i,j} \\
&= \sum_{i} \ket{i}\bra{i} = P.
\end{aligned}
\end{align}
Note we took care to use different indices for each copy of $P$, and exploited the orthonormality of the $\ket{i}$ via $\braket{i}{j} = \delta_{i,j}$.
\section*{Exercise 2.17}
Show that a normal matrix is Hermitian if and only if it has real eigenvalues.
\textbf{Solution}: $(\implies)$ Suppose $H$ is Hermitian, and let $v$ be an eigenvector of $H$ with eigenvalue $\lambda$. For simplicity, take $v$ to be normalized. Then,
\begin{align}
(v, Hv) = (v, \lambda v) = \lambda (v, v) = \lambda.
\end{align}
On the other hand, we can take the adjoint of $H$, which is just $H$ itself, to compute the same quantity.
\begin{align}
(v, Hv) = (H^\dagger v, v) = (Hv,v) = (\lambda v, v) = \lambda^* (v,v) = \lambda^*
\end{align}
where we've used conjugate-linearity in the first argument. Taken together these equations imply $\lambda = \lambda^*$. Hence, $\lambda$ is real.
$(\impliedby)$ Suppose $H$ is normal with real eigenvalues. By the spectral theorem, $H$ has a diagonal representation $H = \sum_i \lambda_i \ket{v_i}\bra{v_i}$ where $\ket{v_i}$ are orthonormal and $\lambda_i$ real. Taking the Hermitian conjugate,
\begin{align}
H^\dagger = \left(\sum_i \lambda_i \ket{v_i}\bra{v_i}\right)^\dagger
= \sum_i \lambda_i^* (\ket{v_i}\bra{v_i})^\dagger = \sum_i \lambda_i \ket{v_i}\bra{v_i} = H.
\end{align}
Hence, $H$ is equal to its own adjoint; that is, it is Hermitian.
\section*{Exercise 2.18}
Show that all eigenvalues of a unitary matrix have modulus 1, that is, can be written in the form $e^{i\theta}$ for some real $\theta$.
\textbf{Solution}: Let $v$ be a (normalized) eigenvector of the unitary $U$ with eigenvalue $\lambda$. Since unitary operators preserve the norm, $(Uv, Uv) = (v,v) = 1$, and so
\begin{align}
(Uv, Uv) = (\lambda v, \lambda v) = \lambda^* \lambda \, (v, v) = \abs{\lambda}^2 = 1.
\end{align}
Every complex number $\lambda$ has a polar representation $r e^{i\theta}$ for $r\geq 0$ and real $\theta$. Moreover, $\abs{\lambda}^2 = r^2 = 1$, so $r = 1$. This proves our result: $\lambda = e^{i\theta}$ for some real $\theta$.
\section*{Exercise 2.19 (Pauli matrices: Hermitian and unitary)}
Show that the Pauli matrices are Hermitian and unitary.
\textbf{Solution}: There are many ways to show this. In previous exercises we've shown that the Pauli matrices have real eigenvalues $\pm 1$ with orthogonal eigenvectors. By Exercise 2.17, this means the Paulis are Hermitian. Moreover, each Pauli squares to the identity: $\sigma_i^2 = I$. This means the Paulis are their own inverses. Thus
\begin{align}
\sigma_i^\dagger \sigma_i = \sigma_i \sigma_i = \sigma_i^2 = I.
\end{align}
Hence, the Paulis are also unitary.
As an aside, operators which are both Hermitian and unitary are \emph{reflections}: their eigenvalues are necessarily $\pm 1$, with mutually orthogonal eigenspaces playing the role of the reflection axes.
\section*{Exercise 2.20 (Basis change)}
Suppose $A'$ and $A''$ are matrix representations of an operator $A$ on a vector space $V$ with respect to two different orthonormal bases, $\ket{v_i}$ and $\ket{w_i}$. Then the elements of $A'$ and $A''$ are $A'_{ij} = \bra{v_i}A\ket{v_j}$ and $A''_{ij} = \bra{w_i}A\ket{w_j}$. Characterize the relationship between $A'$ and $A''$.
\textbf{Solution}: Consider the matrix elements of $A'$. We would like to express them in terms of the elements of $A''$. One way to do this is to suitably insert two copies of the identity matrix into the matrix element expression, with the copies represented in the basis of $A''$.
\begin{align}
\begin{aligned}
A'_{ij} = \bra{v_i}A\ket{v_j} &= \bra{v_i}\left(\sum_k \ket{w_k}\bra{w_k}\right) A \left(\sum_l \ket{w_l}\bra{w_l}\right)\ket{v_j} \\
&= \sum_{k,l} \bra{v_i}\ket{w_k}\bra{w_k}A\ket{w_l}\bra{w_l}\ket{v_j} \\
&= \sum_{k,l}A''_{kl} \braket{v_i}{w_k}\braket{w_l}{v_j}
\end{aligned}
\end{align}
This gives a direct relationship between the two sets of matrix elements, but might not be so illuminating. However, the double sum along with the format of the inner products suggests some kind of matrix multiplication is occurring. To facilitate this interpretation, let's define the inner products as a matrix $M$.
\begin{align}
M_{ij} := \bra{v_i}\ket{w_j}
\end{align}
Then we have
\begin{align}
\begin{aligned}
A'_{ij} &= \sum_{k,l} A''_{k,l} M_{ik} M_{jl}^* \\
&= \sum_{k,l} M_{ik} A''_{k,l} (M^\dagger)_{lj} \\
&= (M A'' M^\dagger)_{ij}
\end{aligned}
\end{align}
Hence, $A' = M A'' M^\dagger$, where $M$ is a matrix of inner products. Can we deduce any further properties of $M$? Well, if $A= I$, then all matrix representations must be the identity matrix. This implies $M I M^\dagger = M M^\dagger = I$, so $M$ is unitary.
We can summarize these results by saying $A'$ and $A''$ are related via a \emph{unitary similarity transformation}. Since $A'$ and $A''$ were arbitrary matrix representations in terms of orthonormal bases, all such representations are related this way.
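For a concrete single-qubit example, take $\ket{v_1}, \ket{v_2}$ to be the computational basis $\ket{0}, \ket{1}$ and $\ket{w_1}, \ket{w_2}$ to be $\ket{+}, \ket{-}$. Then the matrix of inner products $M_{ij} = \braket{v_i}{w_j}$ is the familiar Hadamard matrix,
\begin{align}
M = \frac{1}{\sqrt{2}}\begin{pmatrix}
1 & 1 \\
1 & -1
\end{pmatrix} = H,
\end{align}
and, for instance, the two representations of the operator $Z$ (which look like the matrix $Z$ in the computational basis and like the matrix $X$ in the $\ket{\pm}$ basis) are related by $Z = H X H^\dagger$, consistent with $A' = M A'' M^\dagger$.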
\section*{Exercise 2.21}
Repeat the proof of the spectral decomposition in Box 2.2 for the case when $M$ is Hermitian, simplifying the proof wherever possible.
\textbf{Solution}: I will just point out the places where simplifications can occur assuming $M$ is Hermitian. Once $QMP = 0$ is known, it follows immediately that $PMQ = 0$ by taking the adjoint. It is also immediate that $QMQ$ is normal, since it is Hermitian! Then the inductive step allows for diagonalizing $PMP$ and $QMQ$, and the rest of the proof carries through.
\section*{Exercise 2.22}
Prove that two eigenvectors of a Hermitian operator with different eigenvalues are necessarily orthogonal.
\textbf{Solution}: Let $u, v$ be normalized eigenvectors of a Hermitian $H$ with eigenvalues $\lambda, \mu$ respectively. We've already proven in Exercise 2.17 that $\lambda, \mu$ are real. Hence
\begin{align}
(u, Hv) = (u, \mu v) = \mu (u,v)
\end{align}
and also
\begin{align}
(u, Hv) = (H u, v) = (\lambda u, v) = \lambda (u,v).
\end{align}
Setting these two equal, we see that if $\lambda \neq \mu$, $(u,v) = 0$. Hence, $u$ and $v$ are orthogonal.
\section*{Exercise 2.23}
Show that the eigenvalues of a projector $P$ are all either 0 or 1.
\textbf{Solution}: Let $v$ be an eigenvector of $P$ with eigenvalue $\lambda$. By Exercise 2.16, $P^2 = P$, so
\begin{align}
P^2 v = P(Pv) = \lambda P v = \lambda^2 v, \quad \text{while also} \quad P^2 v = P v = \lambda v.
\end{align}
Hence $\lambda^2 = \lambda$. There are exactly two solutions to this quadratic equation: zero and one.
\section*{Exercise 2.24 (Hermiticity of positive operators)}
Show that a positive operator is necessarily Hermitian. (Hint: Show that an arbitrary operator $A$ can be written $A = B + iC$ where $B$ and $C$ are Hermitian.)
\textbf{Solution}: There is a clever trick to do the decomposition given in the hint, and it indeed works for any square operator $A$. Taking inspiration from Euler's formula for complex numbers, we define
\begin{align}
B := \frac{A + A^\dagger}{2}, \quad C := \frac{A - A^\dagger }{2i}.
\end{align}
Indeed, we find that $A = B + i C$, and moreover both $B$ and $C$ are Hermitian!
Now let's assume $A$ is a positive operator. Then for any vector $\psi$, we have $(\psi, A \psi) = (\psi, B\psi) + i (\psi, C\psi) \geq 0$; in particular, it must be real valued. Since $B$ and $C$ are Hermitian, both $(\psi, B\psi)$ and $(\psi, C\psi)$ are real, so we must have $(\psi, C\psi) = 0$ for every choice of vector $\psi$. The only way this condition can be met is if $C = 0$: taking $\psi$ to be any eigenvector of the Hermitian operator $C$ shows that every eigenvalue of $C$ vanishes. Thus, $A = B$, and since $B$ is Hermitian, so is $A$.
\section*{Exercise 2.25}
Show that for any operator $A$, $A^\dagger A$ is positive.
\textbf{Solution}: We need to show that $(v, A^\dagger A v) \geq 0$ for any vector $v$. This follows from the definition of the adjoint.
\begin{align}
(v, A^\dagger A v) = (A v, A v).
\end{align}
Since the inner product of a vector with itself is always nonnegative, $(A v, A v) \geq 0$ and we've proven our result.
\section*{Exercise 2.26}
Let $\ket{\psi} = (\ket{0}+\ket{1})/\sqrt{2}$. Write out $\ket{\psi}^{\otimes 2}$ and $\ket{\psi}^{\otimes{3}}$ explicitly, both in terms of tensor products like $\ket{0}\ket{1}$, and using the Kronecker product.
\textbf{Solution}: I will not do all steps because it is a bit tedious. I will just outline some things. Consider $\ket{\psi}^{\otimes n}$. Since the dimension of the Hilbert space is the product of the individual dimensions, the full space is of dimension $2^{n}$ (hence dimension four and eight for $n = 2, 3$). Moreover, the overall normalization factor will just be a product of all the $1/\sqrt{2}$, so it is $1/2^{n/2}$. Now we can just focus on the unnormalized $\ket{0} + \ket{1}$ and their tensor product.
I find it easier to work with the tensor product form instead of the Kronecker product, since we don't need to worry about ``stacking" everything properly. We have a basis in terms of all possible bitstrings of length $n$. When we take the $n$-fold tensor product of $\ket{0} + \ket{1}$, we will find that we obtain a uniform superposition of all length $n$ bitstrings. Try to reason this out yourself ($n = 2$ makes a good example case). Hence,
\begin{align}
\ket{\psi}^{\otimes n} = \frac{1}{2^{n/2}} \sum_j \ket{j}
\end{align}
where the sum is taken over all bitstrings $j$ of length $n$. This means, in the Kronecker product column vector, every single entry will be 1, with an overall normalization factor.
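Written out explicitly for $n = 2$ (the $n = 3$ case is analogous, with eight terms and prefactor $1/2^{3/2}$),
\begin{align}
\ket{\psi}^{\otimes 2} = \frac{1}{2}\left(\ket{0}\ket{0} + \ket{0}\ket{1} + \ket{1}\ket{0} + \ket{1}\ket{1}\right)
= \frac{1}{2}\begin{pmatrix}
1 \\
1 \\
1 \\
1
\end{pmatrix},
\end{align}
where the column vector is the Kronecker product form in the basis $\ket{00}, \ket{01}, \ket{10}, \ket{11}$.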
\section*{Exercise 2.27}
Calculate the matrix representation of the tensor products of the Pauli operators (a) $X$ and $Z$; (b) $I$ and $X$; (c) $X$ and $I$. Is the tensor product commutative?
\textbf{Solution}: I will work out part (a) explicitly and just give the answer for the other cases. It's important to remember that the basis we are using is $\ket{00}, \ket{01}, \ket{10}, \ket{11}$, in that order. For larger numbers of qubits, by convention we will always count up in binary to order the basis vectors. As for any matrix representation, the ordering of the basis vectors matters!
With this convention in mind, the usual procedure is to ``insert" the second matrix into each element of the first (scaled by that element). So for $X\otimes Z$ we have
\begin{align}
X \otimes Z = \begin{pmatrix}
0 & Z \\
Z & 0
\end{pmatrix} = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & -1 \\
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}.
\end{align}
We can (and should) check that this representation matches what we expect from the simpler tensor product representation. For example,
\begin{align}
\bra{1}\bra{1} X\otimes Z \ket{0}\ket{1} = \bra{1}X\ket{0} \bra{1}Z\ket{1} = -1.
\end{align}
Thus, a $-1$ should be located in the 4th row and 2nd column (the bra labels the row, the ket the column), and indeed there is. Note that $00$ corresponds to the 1st row/column. This check is helpful for making sure the right matrix was inserted into the other.
Let's now work out the matrix reps for (b) and (c).
\begin{align}
\begin{aligned}
I\otimes X &= \begin{pmatrix}
X & 0 \\
0 & X
\end{pmatrix} = \begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0
\end{pmatrix} \\
X \otimes I &= \begin{pmatrix}
0 & I \\
I & 0
\end{pmatrix} = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0
\end{pmatrix}.
\end{aligned}
\end{align}
From these matrix representations it is evident that the tensor product is not commutative: $I \otimes X \neq X \otimes I$. However, this is perhaps more transparent from the tensor product directly: $A\otimes B$ differs from $B \otimes A$ because we care whether $A$ acts on the first or the second space. They are two different spaces!
\section*{Exercise 2.28}
Show that the transpose, complex conjugation, and adjoint operations distribute over the tensor product,
\begin{align}
(A\otimes B)^* = A^* \otimes B^*; (A\otimes B)^T = A^T \otimes B^T; (A\otimes B)^\dagger = A^\dagger \otimes B^\dagger.
\end{align}
\textbf{Solution}: As an initial caution, I'll mention that complex conjugation and transposing are only defined with respect to a \emph{specified basis}. Only the adjoint $\dagger$ is basis independent, which we can see from its definition. So we will assume these operations are with respect to bases $\ket{v_j}$ and $\ket{w_k}$ on the first and second spaces respectively.
Let's consider matrix elements in (the tensor product of) the bases above. We have
\begin{align}
\begin{aligned}
\bra{v_j}\bra{w_k} (A\otimes B)^* \ket{v_{j'}}\ket{w_{k'}} &:= (\bra{v_j}\bra{w_k} A\otimes B \ket{v_{j'}}\ket{w_{k'}})^* \\
&= (\bra{v_j}A\ket{v_{j'}} \bra{w_k}B\ket{w_{k'}})^* \\
&= (\bra{v_j}A\ket{v_{j'}})^* (\bra{w_k}B\ket{w_{k'}})^* \\
&:=\bra{v_j}A^*\ket{v_{j'}}\bra{w_k}B^*\ket{w_{k'}}\\
&= \bra{v_j}\bra{w_k} A^*\otimes B^* \ket{v_{j'}}\ket{w_{k'}}.
\end{aligned}
\end{align}
Looking at the first and last line, we see that the matrix elements of $(A\otimes B)^*$ and $A^* \otimes B^*$ are the same in the basis of interest. Hence, they are also equal as operators in general, by linearity. Let's carry out a similar argument for the transpose.
\begin{align}
\begin{aligned}
\bra{v_j}\bra{w_k} (A\otimes B)^T \ket{v_{j'}}\ket{w_{k'}} &:= \bra{v_{j'}}\bra{w_{k'}} A\otimes B \ket{v_{j}}\ket{w_{k}} \\
&= \bra{v_{j'}}A\ket{v_j} \bra{w_{k'}}B\ket{w_k} \\
&:=\bra{v_j}A^T\ket{v_{j'}}\bra{w_k}B^T\ket{w_{k'}}\\
&= \bra{v_j}\bra{w_k} A^T\otimes B^T \ket{v_{j'}}\ket{w_{k'}}.
\end{aligned}
\end{align}
Again, this must imply that $(A\otimes B)^T = A^T \otimes B^T$ by linearity.
The adjoint operator is just complex conjugation and transposition in sequence: $A^\dagger = A^{T*}$. We've shown both of these operations distribute over the tensor product. Therefore, so must the adjoint.
\section*{Exercise 2.29}
Show that the tensor product of two unitary operators is unitary.
\textbf{Solution}: Let $U$ and $V$ be operators on the first and second space, respectively, and consider the tensor product $U\otimes V$. We've shown in the previous exercise that $(U \otimes V)^\dagger = U^\dagger \otimes V^\dagger$. Hence,
\begin{align}
(U\otimes V)(U \otimes V)^\dagger = (U\otimes V)(U^\dagger \otimes V^\dagger) = U U^\dagger \otimes V V^\dagger = I.
\end{align}
Here we've used the fact that $U$ and $V$ are unitary, and that $I\otimes I = I$ (and being somewhat sloppy by not labelling the identity differently on each space). This shows that $U\otimes V$ is unitary, and therefore the tensor product of any unitaries remains unitary.
\section*{Exercise 2.30}
Show that the tensor product of two Hermitian operators is Hermitian.
\textbf{Solution}: If $H$ and $G$ are Hermitian on the first and second spaces, we have
\begin{align}
(H\otimes G)^\dagger = H^\dagger \otimes G^\dagger = H \otimes G.
\end{align}
Hence, $H\otimes G$ is also Hermitian.
\section*{Exercise 2.31}
Show that the tensor product of two positive operators is positive.
\textbf{Solution}: Let $P$ and $Q$ be positive operators on the first and second space, respectively. As shown in the previous exercise, $P\otimes Q$ is Hermitian since $P$ and $Q$ are. Moreover, the eigenvalues of $P\otimes Q$ are just the products of the eigenvalues of $P$ and $Q$ (why is this true?). Since the eigenvalues of $P$ and $Q$ are nonnegative, so are those of $P\otimes Q$. This means $P \otimes Q$ is a positive operator.
\section*{Exercise 2.32}
Show that the tensor product of two projectors is a projector.
\textbf{Solution}: Let $P$ and $Q$ be projectors and consider $P\otimes Q$. Since $P$ and $Q$ are Hermitian, so is $P\otimes Q$. Moreover,
\begin{align}
(P \otimes Q)^2 = P^2 \otimes Q^2 = P \otimes Q.
\end{align}
Altogether, this implies $P\otimes Q$ is a projection operator.
\section*{Exercise 2.33}
The Hadamard operator on one qubit may be written as
\begin{align}
H = \frac{1}{\sqrt{2}}\big[(\ket{0}+\ket{1})\bra{0}+ (\ket{0}-\ket{1})\bra{1}\big].
\end{align}
Show explicitly that the Hadamard transform on n qubits, $H^{\otimes n}$, may be written as
\begin{align}
H^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x,y} (-1)^{x\cdot y} \ket{x}\bra{y}.
\end{align}
Write out an explicit matrix representation for $H^{\otimes 2}$.
\textbf{Solution}: First, let's express $H$ as
\begin{align}
H = \frac{1}{\sqrt{2}}\sum_{x,y = 0}^1 (-1)^{x\cdot y} \ket{x}\bra{y}.
\end{align}
We see then that the statement holds for $n = 1$. The $x\cdot y$ here is just the product of the two bits $x, y$, which ensures that only the last matrix element is negative. Moving now to the tensor product,
\begin{align}
H^{\otimes n} &= \frac{1}{\sqrt{2^n}} \bigotimes_{j = 1}^n \sum_{x_j, y_j = 0}^1 (-1)^{x_j\cdot y_j} \ket{x_j}\bra{y_j} \\
&= \frac{1}{\sqrt{2^n}} \sum_{x_1, y_1 = 0}^1 \sum_{x_2, y_2 = 0}^1 \dots \sum_{x_n, y_n = 0}^1 (-1)^{x_1 y_1 + x_2 y_2 + \dots + x_n y_n} \ket{x_1 x_2 \dots x_n} \bra{y_1 y_2 \dots y_n},
\end{align}
where we used the distributive property of the tensor product to expand the sums. Identifying $x$ with the length $n$ binary string $x_1 x_2 \dots x_n$, and similar for $y$, we see that the many sums in the above equation represent a sum over all possible bitstrings $x$ and $y$. Moreover, the exponent of $-1$ is just the bitwise dot product $x\cdot y$. Making these identifications in the above equation,
\begin{align}
H^{\otimes n} = \frac{1}{\sqrt{2^n}} \sum_{x,y \in \text{bitstrings}} (-1)^{x\cdot y} \ket{x}\bra{y}.
\end{align}
This is the result we sought.
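Finally, the exercise asks for the explicit matrix of $H^{\otimes 2}$. Applying the formula with $n = 2$ and the basis ordering $\ket{00}, \ket{01}, \ket{10}, \ket{11}$ (the entry in row $x$, column $y$ is $(-1)^{x\cdot y}/2$),
\begin{align}
H^{\otimes 2} = \frac{1}{2}\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}.
\end{align}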
\section*{Exercise 2.34}
Find the square root and logarithm of the matrix
\begin{align}
\begin{bmatrix}
4 & 3 \\
3 & 4
\end{bmatrix}.
\end{align}
\textbf{Solution}: As described in the text, these functions can be seen as acting on the eigenvalues of the matrix. Let's begin by diagonalizing this matrix. We can take some shortcuts because this matrix has a very simple form: it is Hermitian and real, so it is a real linear combination of $I$, $Z$, and $X$. In fact, we can readily see it is just $4 I + 3 X$. This has the same eigenstates as $X$, with eigenvalues $4 \pm 3 = 7, 1$. If $\ket{+}$ and $\ket{-}$ are the eigenstates of $X$, we can write this matrix as
\begin{align}
\begin{bmatrix}
4 & 3 \\
3 & 4
\end{bmatrix} = 7 \ket{+}\bra{+} + 1 \ket{-}\bra{-}.
\end{align}
where the projectors are given by
\begin{align}
\begin{aligned}
\ket{+}\bra{+} = \frac{1}{2} \begin{bmatrix}
1 \\
1
\end{bmatrix} \begin{bmatrix}
1 & 1
\end{bmatrix} = \frac{1}{2} \begin{bmatrix}
1 & 1 \\
1 & 1
\end{bmatrix} \\
\ket{-}\bra{-} = \frac{1}{2} \begin{bmatrix}
1 \\
-1
\end{bmatrix} \begin{bmatrix}
1 & -1
\end{bmatrix} = \frac{1}{2} \begin{bmatrix}
1 & -1 \\
-1 & 1
\end{bmatrix}.
\end{aligned}
\end{align}
For any function $f$ on a single variable, we can evaluate $f(M)$ for this particular matrix as
\begin{align}
f(7) \ket{+}\bra{+} + f(1) \ket{-}\bra{-} = \frac{1}{2} \begin{bmatrix}
f(7) + f(1) & f(7)-f(1) \\
f(7)-f(1) & f(7) + f(1)
\end{bmatrix}.
\end{align}
Using square root and logarithm functions, we obtain
\begin{align}
\frac{1}{2}\begin{bmatrix}
\sqrt{7} + 1 & \sqrt{7}-1 \\
\sqrt{7}-1 & \sqrt{7} + 1
\end{bmatrix}
\end{align}
and
\begin{align}
\frac{1}{2}\begin{bmatrix}
\log 7 & \log 7 \\
\log 7 & \log 7
\end{bmatrix}
\end{align}
respectively.
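As a sanity check on the square root, squaring it should return the original matrix, and indeed
\begin{align}
\frac{(\sqrt{7}+1)^2 + (\sqrt{7}-1)^2}{4} = \frac{16}{4} = 4,
\qquad
\frac{2(\sqrt{7}+1)(\sqrt{7}-1)}{4} = \frac{2\cdot 6}{4} = 3,
\end{align}
which match the diagonal and off-diagonal entries we started with.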
\section*{Exercise 2.35 (Exponential of the Pauli matrices)}
Let $\vec{v}$ be any real, three-dimensional unit vector and $\theta$ a real number. Prove that
\begin{align}
\exp(i\theta \vec{v}\cdot \vec{\sigma}) = \cos \theta I + i \sin\theta \vec{v}\cdot\vec{\sigma},
\end{align}
where $\vec{v}\cdot\vec{\sigma}\equiv \sum_{i=1}^3 v_i\sigma_i$. This exercise is generalized in Problem 2.1 on page 117.
\textbf{Solution}: To understand this, it will be helpful to use the fact that $(\vec{v}\cdot\vec{\sigma})^2 = I$. The essential reason for this is that, just as the individual Pauli matrices square to the identity, any other choice of axis $\vec{v}$ should give this property as well, being essentially equivalent. We can also verify this directly from the anti-commutation relations of Exercise 2.41.
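Here is a sketch of how the rest of the argument goes, using that identity together with the power series of the exponential:
\begin{align}
\begin{aligned}
(\vec{v}\cdot\vec{\sigma})^2 &= \sum_{j,k} v_j v_k \sigma_j \sigma_k
= \sum_j v_j^2 \sigma_j^2 + \sum_{j<k} v_j v_k \left(\sigma_j\sigma_k + \sigma_k\sigma_j\right)
= \abs{\vec{v}}^2 I = I, \\
\exp(i\theta\, \vec{v}\cdot\vec{\sigma}) &= \sum_{n=0}^\infty \frac{(i\theta)^n}{n!} (\vec{v}\cdot\vec{\sigma})^n
= \left(\sum_{n \text{ even}} \frac{(i\theta)^n}{n!}\right) I + \left(\sum_{n \text{ odd}} \frac{(i\theta)^n}{n!}\right) \vec{v}\cdot\vec{\sigma} \\
&= \cos\theta \, I + i \sin\theta \, \vec{v}\cdot\vec{\sigma}.
\end{aligned}
\end{align}
The even powers of $\vec{v}\cdot\vec{\sigma}$ collapse to $I$ and the odd powers to $\vec{v}\cdot\vec{\sigma}$, while the two numerical series are exactly those of $\cos\theta$ and $i\sin\theta$.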
\section*{Exercise 2.36}
Show that the Pauli matrices except for $I$ have trace zero.
\textbf{Solution}: We can simply see that the diagonal elements of each matrix add to zero.
\begin{align}
\begin{aligned}
\tr(X) = 0 + 0 = 0 \\
\tr(Y) = 0 + 0 = 0 \\
\tr(Z) = 1 - 1 = 0.
\end{aligned}
\end{align}
There are other ways to understand why the trace is zero, which are insightful. First, suppose we know at least one of the Paulis has trace zero. Then, because all the Paulis are equivalent up to a change of basis, the other two Paulis have trace zero as well. This can be seen as a consequence of the cyclic property of the trace. For example,
\begin{align}
\tr(X) = \tr(H Z H^{-1}) = \tr( H^{-1} H Z) = \tr(Z).
\end{align}
From the perspective of the Lie algebra (commutation relations), each Pauli is proportional to the commutator of the other two. Since a commutator always has trace zero (by the cyclic property of the trace), so does each Pauli.
\section*{Exercise 2.37 (Cyclic property of trace)}
If $A$ and $B$ are two linear operators show that
\begin{align}
\tr(AB) = \tr(BA).
\end{align}
\textbf{Solution}: Let's compute the trace with respect to some basis indexed by $i$. We have, starting from the definition,
\begin{align}
\begin{aligned}
\tr(AB) &= \sum_{i} (AB)_{ii} \\
&= \sum_{i}\sum_{j} A_{ij} B_{ji} \\
&= \sum_{j} \sum_{i}B_{ji} A_{ij} \\
&= \sum_{j} (BA)_{jj} \\
&:= \tr(BA).
\end{aligned}
\end{align}
In the above, we expanded the diagonal elements of $AB$ in terms of elements of $A$ and $B$, and rearranged the sums.
It's important to realize that, despite how it appears, we can't just \emph{arbitrarily} rearrange terms in the trace, only cyclically. So $\tr(ABC) = \tr(CAB) = \tr(BCA)$, but in general $\tr(ABC) \neq \tr(BAC)$. The result we proved only implies invariance under \emph{cyclic} permutations.
\section*{Exercise 2.38 (Linearity of the trace)}
If $A$ and $B$ are two linear operators, show that
\begin{align}
\tr(A+B) = \tr(A) + \tr(B)
\end{align}
and if $z$ is an arbitrary complex number show that
\begin{align}
\tr(z A) = z \tr(A).
\end{align}
\textbf{Solution}: Since $(A+B)_{ii} = A_{ii} + B_{ii}$, $\tr(A+B) = \sum_i (A+B)_{ii} = \sum_i (A_{ii} + B_{ii}) = \tr(A) + \tr(B)$.
For the second property, observe that
\begin{align}
\tr(z A) = \sum_i z A_{ii} = z \sum_i A_{ii} = z\tr(A).
\end{align}
Together with the previous result, this shows that the trace is a \emph{linear map} to the complex numbers.
\section*{Exercise 2.39 (The Hilbert-Schmidt inner product on operators)}
The set $L_V$ of linear operators on a Hilbert space $V$ is obviously a vector space – the sum of two linear operators is a linear operator, $zA$ is a linear operator if $A$ is a linear operator and $z$ is a complex number, and there is a zero element $0$. An important additional result is that the vector space $L_V$ can be given a natural inner product structure, turning it into a Hilbert space.
\begin{enumerate}
\item Show that the function $(\cdot, \cdot)$ on $L_V \times L_V$ defined by
\begin{align}
(A,B) \equiv \tr(A^\dagger B)
\end{align}
is an inner product function. This inner product is known as the \emph{Hilbert-Schmidt} or \emph{trace} inner product.
\item If $V$ has dimension $d$, show that $L_V$ has dimension $d^2$.
\item Find an orthonormal basis of Hermitian matrices for the Hilbert space $L_V$.
\end{enumerate}
\textbf{Solution}: We need to show that $(\cdot, \cdot)$ is linear in the second argument. This follows directly from linearity of the trace.
\begin{align}
(A, c_1 B_1 + c_2 B_2) = \tr(A^\dagger (c_1 B_1 + c_2 B_2)) = c_1 \tr(A^\dagger B_1) + c_2 \tr(A^\dagger B_2) = c_1 (A,B_1) + c_2 (A, B_2)
\end{align}
Next we need to show that $(A,B)$ is conjugate-symmetric: $(A,B) = (B,A)^*$. To see this, note that for any operator $M$, $\tr(M^\dagger) = \tr(M)^*$. This follows because, in a given matrix representation, the adjoint is just the transpose followed by entrywise complex conjugation; transposition leaves the diagonal in place, so each diagonal entry simply picks up a complex conjugate before being summed. Using this fact, we have
\begin{align}
(A, B) = \tr(A^\dagger B) = \tr((B^\dagger A)^\dagger) = \tr(B^\dagger A)^* = (B,A)^*,
\end{align}
which shows conjugate symmetry. Finally, we need to show $(A,A) \geq 0$ for all $A \in L_V$, with equality iff $A = 0$. We can see this holds because for any $A$, $A^\dagger A$ is a positive operator. Hence, $\tr(A^\dagger A)$, when viewed in the eigenbasis, is the sum over all eigenvalues, all of which are nonnegative. Thus, the trace is also nonnegative. The only way it equals zero is if every eigenvalue of $A^\dagger A$ is zero, which implies $A^\dagger A = 0$. In fact, this directly implies $A = 0$, because if $A^\dagger A = 0$, then for every choice of vectors $v$, we have $(Av, Av) = 0$, so that $Av = 0$. This completes part 1: we've shown $(A,B)$ as defined is truly an inner product.
To show that $L_V$ is a Hilbert space of dimension $d^2$, we can think about expanding an operator $A$ into the $d^2$ entries. Specifically, any operator $A$ can be written as
\begin{align}
A = \sum_{ij} A_{ij} \ket{i}\bra{j}
\end{align}
for some orthonormal basis $\ket{i}$, where $A_{ij} = \bra{i}A\ket{j}$ are the matrix elements. The operators $\ket{i}\bra{j}$ form an orthonormal basis. Indeed,
\begin{align}
(\ket{i}\bra{j}, \ket{k}\bra{l}) = \tr(\ket{j}\bra{i}\ket{k}\bra{l}) = \braket{i}{k} \braket{j}{l} = \delta_{ik}\delta_{jl}.
\end{align}
Thus the inner product is zero unless the two operators are equal, in which case it is one. Hence, we've found a set of $d^2$ orthonormal operators which span $L_V$, so $L_V$ has dimension $d^2$.
Although we have found a basis, we have not found a Hermitian basis, as asked in part (3). The operators of the form $\ket{i}\bra{i}$ are Hermitian automatically, but those of the form $\ket{i}\bra{j}$ for $i\neq j$ are not. However, as discussed in Exercise 2.24, we can still write $\ket{i}\bra{j}$ as $B + iC$ for Hermitian $B$ and $C$. In this case, we have
\begin{align} \label{eq:B_C_Hermitian}
B^{(ij)} := \frac{\ket{i}\bra{j}+\ket{j}\bra{i}}{2} \quad C^{(ij)} := \frac{\ket{i}\bra{j}-\ket{j}\bra{i}}{2i}.
\end{align}
Thus, using these operators, we can express the original basis of $\ket{i}\bra{j}$ in terms of the $B^{(ij)}$ and $C^{(ij)}$. Since $B^{(ij)}$ and $B^{(ji)}$ are not independent (and similarly for $C^{(ij)}$), we should only take $i < j$. Hence, our candidate basis consists of the $d$ operators $\ket{i}\bra{i}$ and the $d^2 - d$ operators $B^{(ij)}$ and $C^{(ij)}$ for $i < j$. However, these are not properly normalized: we have $(B^{(ij)}, B^{(ij)}) = 1/2$, and similarly for $C^{(ij)}$. Multiplying these operators by $\sqrt{2}$ fixes this.
We need to check that these operators are all orthogonal. This can be checked using the orthonormality of the $\ket{i}\bra{j}$ under the Hilbert-Schmidt inner product.
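For the single-qubit case $d = 2$, one such orthonormal Hermitian basis is simply the set of normalized Pauli operators,
\begin{align}
\left\{ \frac{I}{\sqrt{2}}, \frac{X}{\sqrt{2}}, \frac{Y}{\sqrt{2}}, \frac{Z}{\sqrt{2}} \right\},
\qquad
\left(\frac{\sigma_i}{\sqrt{2}}, \frac{\sigma_j}{\sqrt{2}}\right) = \frac{1}{2}\tr(\sigma_i \sigma_j) = \delta_{ij},
\end{align}
which follows from the Paulis being Hermitian, squaring to $I$, and (apart from $I$ itself) being traceless, with the product of two different Paulis proportional to a third.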
\section*{Exercise 2.40 (Commutation relations for the Pauli matrices)}
Verify the commutation relations
\begin{align}
[X, Y] = 2iZ; \quad [Y, Z] = 2iX; \quad [Z, X] = 2iY.
\end{align}
There is an elegant way of writing this using $\epsilon_{jkl}$, the antisymmetric tensor on three indices, for which $\epsilon_{jkl} = 0$ except for $\epsilon_{123} = \epsilon_{231} = \epsilon_{312} = 1$, and $\epsilon_{321} = \epsilon_{213} = \epsilon_{132} = -1$:
\begin{align}
[\sigma_j, \sigma_k] = 2i \sum_{l=1}^3 \epsilon_{jkl} \sigma_l.
\end{align}
\textbf{Solution}: We could check this with direct matrix multiplication, of course. We won't perform that calculation here. A deeper explanation comes from the algebraic properties of the Pauli operators, which are independent of a specific matrix representation. To start, the Pauli operators \emph{anticommute}, which implies, for instance,
\begin{align}
[X,Y] := XY-YX = 2 XY.
\end{align}
The product of two (different) Pauli operators is $i$ times the third Pauli, with a plus or minus depending on the cyclic ordering according to $XYZ$. Thus, for example, $XY = iZ$. These properties show the commutator results, and also show why the antisymmetric tensor $\epsilon_{jkl}$ captures these relationships.
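As a quick explicit check of one of these products,
\begin{align}
XY = \begin{pmatrix}
0 & 1 \\
1 & 0
\end{pmatrix}
\begin{pmatrix}
0 & -i \\
i & 0
\end{pmatrix}
= \begin{pmatrix}
i & 0 \\
0 & -i
\end{pmatrix} = iZ,
\qquad
[X, Y] = XY - YX = 2XY = 2iZ.
\end{align}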
\section*{Exercise 2.41 (Anti-commutation relations for the Pauli matrices)}
Verify the anti-commutation relations
\begin{align}
\{\sigma_i, \sigma_j\} = 0
\end{align}
where $i\neq j$ are both chosen from the set $1, 2, 3$. Also verify that ($i = 0,1,2,3$)
\begin{align}
\sigma_i^2 = I.
\end{align}
\textbf{Solution}: We won't go through the verification of these properties in this solution set: just compute, say, $XY + YX$ and verify that it equals zero.
As a general comment, at some level the statements we are being asked to show here should be considered as \emph{definitions} for the set of Pauli operators. Any set of Pauli operators must satisfy the anti-commutation relations and square to the identity. Any such set gives a valid set of operators to be used for describing qubit states. This abstraction may seem peculiar, but one benefit is that it removes some of the arbitrariness in the form of the Pauli matrices (e.g. making $Z$ diagonal).
\section*{Exercise 2.42}
Verify that
\begin{align}
AB = \frac{[A,B] + \{A,B\}}{2}.
\end{align}