Ten Lectures and Forty-Two Open Problems in the Mathematics of Data Science

Afonso S. Bandeira

December, 2015

Preface

These are notes from a course I gave at MIT in the Fall of 2015, entitled “18.S096: Topics in Mathematics of Data Science”. These notes are not in final form and will be continuously edited and/or corrected (as I am sure they contain many typos). Please use at your own risk and do let me know if you find any typo or mistake.

Part of the content of this course is greatly inspired by a course I took from Amit Singer while a graduate student at Princeton. Amit’s course was inspiring and influential on my research interests. I can only hope that these notes may one day inspire someone’s research in the same way that Amit’s course inspired mine.

These notes also include a total of forty-two open problems (now 41, as Open Problem 1.3 has meanwhile been solved [MS15]!). This list of problems does not necessarily contain the most important problems in the field (although some are rather important). I have tried to select a mix of important, perhaps approachable, and fun problems. Hopefully you will enjoy thinking about these problems as much as I do!

I would like to thank all the students who took my course; they were a great and interactive audience! I would also like to thank Nicolas Boumal, Ludwig Schmidt, and Jonathan Weed for letting me know of several typos. Thank you also to Nicolas Boumal, Dustin G. Mixon, Bernat Guillen Pegueroles, Philippe Rigollet, and Francisco Unda for suggesting open problems.

Contents

  0.1 List of open problems
  0.2 A couple of Open Problems
    0.2.1 Komlós Conjecture
    0.2.2 Matrix AM-GM inequality
  0.3 Brief Review of some linear algebra tools
    0.3.1 Singular Value Decomposition
    0.3.2 Spectral Decomposition
    0.3.3 Trace and norm
  0.4 Quadratic Forms

1 Principal Component Analysis in High Dimensions and the Spike Model
  1.1 Dimension Reduction and PCA
    1.1.1 PCA as best d-dimensional affine fit
    1.1.2 PCA as d-dimensional projection that preserves the most variance
    1.1.3 Finding the Principal Components
    1.1.4 Which d should we pick?
    1.1.5 A related open problem
  1.2 PCA in high dimensions and Marcenko-Pastur
    1.2.1 A related open problem
  1.3 Spike Models and BBP transition
    1.3.1 A brief mention of Wigner matrices
    1.3.2 An open problem about spike models

2 Graphs, Diffusion Maps, and Semi-supervised Learning
  2.1 Graphs
    2.1.1 Cliques and Ramsey numbers
  2.2 Diffusion Maps
    2.2.1 A couple of examples
    2.2.2 Diffusion Maps of point clouds
    2.2.3 A simple example
    2.2.4 Similar non-linear dimensional reduction techniques
  2.3 Semi-supervised learning
    2.3.1 An interesting experience and the Sobolev Embedding Theorem

3 Spectral Clustering and Cheeger’s Inequality
  3.1 Clustering
    3.1.1 k-means Clustering
  3.2 Spectral Clustering
  3.3 Two clusters
    3.3.1 Normalized Cut
    3.3.2 Normalized Cut as a spectral relaxation
  3.4 Small Clusters and the Small Set Expansion Hypothesis
  3.5 Computing Eigenvectors
  3.6 Multiple Clusters

4 Concentration Inequalities, Scalar and Matrix Versions
  4.1 Large Deviation Inequalities
    4.1.1 Sums of independent random variables
  4.2 Gaussian Concentration
    4.2.1 Spectral norm of a Wigner Matrix
    4.2.2 Talagrand’s concentration inequality
  4.3 Other useful large deviation inequalities
    4.3.1 Additive Chernoff Bound
    4.3.2 Multiplicative Chernoff Bound
    4.3.3 Deviation bounds on χ² variables
  4.4 Matrix Concentration
  4.5 Optimality of matrix concentration result for gaussian series
    4.5.1 An interesting observation regarding random matrices with independent matrices
  4.6 A matrix concentration inequality for Rademacher Series
    4.6.1 A small detour on discrepancy theory
    4.6.2 Back to matrix concentration
  4.7 Other Open Problems
    4.7.1 Oblivious Sparse Norm-Approximating Projections
    4.7.2 k-lifts of graphs
  4.8 Another open problem

5 Johnson-Lindenstrauss Lemma and Gordon’s Theorem
  5.1 The Johnson-Lindenstrauss Lemma
    5.1.1 Optimality of the Johnson-Lindenstrauss Lemma
    5.1.2 Fast Johnson-Lindenstrauss
  5.2 Gordon’s Theorem
    5.2.1 Gordon’s Escape Through a Mesh Theorem
    5.2.2 Proof of Gordon’s Theorem
  5.3 Sparse vectors and Low-rank matrices
    5.3.1 Gaussian width of k-sparse vectors
    5.3.2 The Restricted Isometry Property and a couple of open problems
    5.3.3 Gaussian width of rank-r matrices

6 Compressed Sensing and Sparse Recovery
  6.1 Duality and exact recovery
  6.2 Finding a dual certificate
  6.3 A different approach
  6.4 Partial Fourier matrices satisfying the Restricted Isometry Property
  6.5 Coherence and Gershgorin Circle Theorem
    6.5.1 Mutually Unbiased Bases
    6.5.2 Equiangular Tight Frames
    6.5.3 The Paley ETF
  6.6 The Kadison-Singer problem

7 Group Testing and Error-Correcting Codes
  7.1 Group Testing
  7.2 Some Coding Theory and the proof of Theorem 7.3
    7.2.1 Boolean Classification
    7.2.2 The proof of Theorem 7.3
  7.3 In terms of linear Bernoulli algebra
    7.3.1 Shannon Capacity
    7.3.2 The deletion channel

8 Approximation Algorithms and Max-Cut
  8.1 The Max-Cut problem
  8.2 Can α_GW be improved?
  8.3 A Sums-of-Squares interpretation
  8.4 The Grothendieck Constant
  8.5 The Paley Graph
  8.6 An interesting conjecture regarding cuts and bisections

9 Community detection and the Stochastic Block Model
  9.1 Community Detection
  9.2 Stochastic Block Model
  9.3 What does the spike model suggest?
    9.3.1 Three or more communities
  9.4 Exact recovery
  9.5 The algorithm
  9.6 The analysis
    9.6.1 Some preliminary definitions
  9.7 Convex Duality
  9.8 Building the dual certificate
  9.9 Matrix Concentration
  9.10 More communities
  9.11 Euclidean Clustering
  9.12 Probably Certifiably Correct algorithms
  9.13 Another conjectured instance of tightness

10 Synchronization Problems and Alignment
  10.1 Synchronization-type problems
  10.2 Angular Synchronization
    10.2.1 Orientation estimation in Cryo-EM
    10.2.2 Synchronization over Z_2
  10.3 Signal Alignment
    10.3.1 The model bias pitfall
    10.3.2 The semidefinite relaxation
    10.3.3 Sample complexity for multireference alignment

0.1 List of open problems

• 0.1: Komlós Conjecture
• 0.2: Matrix AM-GM Inequality
• 1.1: Mallat and Zeitouni’s problem
