This presentation is delivered by the Stanford Center for Professional Development.

Hi, there. It's Wednesday. We are coming to the end of this fine quarter. So what I'm going to do today is, I brought a cup filled with candy. We'll start with that because that's going to be helpful. What I'd like to do is just a couple bits of loose ends, some things that I wanted to touch on that have come and gone this quarter, that I wanted to pull together and reach a bit of closure on. Then hopefully, if we have time, we'll move on to what happens after this. If you liked 106B, what do you do? If you didn't like 106B, what do you do? What kind of options and future opportunities are there after this ends?

I'm going to talk a little about some of the things that [inaudible], but I do want to talk about the final just for a second, because it is kind of a last opportunity for you to show us your mastery of the material and get the grade that you've been working hard for all quarter. It does carry a certain amount of weight: 30 percent of the weight in the default system is placed on the final, and more of it can move to the final if you have a better showing there than you did on the midterm. We're happy to move some of that weight over, so it could actually account for even more of your total grade. So there's a lot riding on it, and you want to be sure you come ready to show us the things that you know and that you have comprehension and mastery over.

To note, though, it is going to look a little bit different than the midterm in terms of its emphasis. We are going to see a lot more of the nitty-gritty dynamic data structures - linked lists, trees, graphs, pointer-based structures - with the inherent complications in dealing with that kind of dynamic structure. So bringing your best game to this exam is certainly going to pay off here. The coding that you will see will look a lot like what's in the assignments.
If you already had a chance to look at the practice exam, you'll see that there are themes there that echo the things you did on the assignments. We want to make sure we're connecting to the effort you've put in all along. There will also be some opportunity to think a little bit at a higher level about making and analyzing choices for design and implementation. I'm going to talk a little bit about that today, because I feel like at any given moment we're talking about how we might implement a stack or how we might implement a queue, and sometimes it helps to step back a level and think about the bigger picture: when to use a stack or a queue, and how to make decisions in the large about those tradeoffs that are intelligent and grounded. So I'll do a little bit of that today, and some of that has been built up all quarter long.

It is open book and open notes. You can bring all your printouts, your readers, sections, all those sorts of things. For some people, that translates to this idea that, well, I don't need to prepare because I'll have all my stuff there; if I need to know something in the exam, I'll just go look it up. That's certainly true for looking up details - you want to remember how this works, or you had a piece of code that you think was similar and it could help you brainstorm what you're trying to do this time. But you're not likely to have time to learn new things in the exam. So if there's something that you really feel you have not gotten your head around, the investment up front, before you get into the exam, is where to get that understanding in place, rather than trying, in the exam, to flip through chapter ten trying to learn something. That may not really play out. I highly recommend practicing on paper in an exam-like environment, really working the way you will be working in the exam, to give yourself the best, most consistent prep for what's happening there.

Then some people have asked about extra problems, just wanting more of them to work on. I'll give you a couple of places where you can look for things. One is that CS106X is being offered this same quarter, and that is just the honors version of our class, so they have some slightly notched-up problems.
But their practice and real midterms are posted on their website, as well as their practice final. They have very similar coverage. They did some assignments that were like ours and some that were a little bit different, but they're still good places to just grab actual exam problems with their solutions. The chapter exercises and section problems often are old exam problems, or eventually became exam problems, so they kind of come and go in terms of being used as examples or used as exam problems. So those are definitely places to pick up extra mileage. Okay. So let me tell you a little bit about - any questions about the final?

Okay. Let me talk about this line just for a second, because I feel like we have touched on these themes again and again, but maybe it's good to have at least one moment of trying to pull this together and think a little bit about the closure and the big picture of all these things. A lot of what we were doing in the beginning was giving you these abstractions, these cool things to do stuff with - a map, a stack, a queue, a vector - and problems to solve where those abstractions have a role to play. Then we spent a long time saying, okay, now that it's our job to make those abstractions work, what techniques can we use? What algorithms and data structures will help us manipulate and model those structures in efficient ways?

We certainly spent a lot of time talking about runtime performance, maybe so much that you might think it was the only thing that really mattered. How fast can we make something? O of N is better than N squared. Log N is better than N. Driving toward these holy grails of bringing the time down. It certainly is a very important factor: often the difference between something being linear and something being quadratic is a matter of it being tractable at all for the problem at hand. You cannot sort a million numbers with a quadratic sort in anywhere near the same time. For large enough data sets, it'd be completely infeasible.
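Just to put rough numbers on that claim (a back-of-the-envelope sketch, not a measured timing), for $n = 10^6$:

    \[
      n^2 = 10^{12} \text{ steps (quadratic sort)}
      \qquad\text{vs.}\qquad
      n\log_2 n \approx 10^6 \cdot 20 = 2\times 10^7 \text{ steps},
    \]

so the quadratic sort is doing on the order of 50,000 times as much work on the same input.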
So it is important to have that big picture, to be thinking about it and know about those tradeoffs, to understand issues of scale here. But we also did some stuff on the homework with actual empirical time results, which is another way to bolster our understanding. The big O tells us these things, but what really happens in real life matches it only imprecisely. Those constant factors we threw away, and other artifacts that happen in the real world, often give us new insights about things and challenges that we're facing.

We also need to think about things like the mix of expected operations. Having some operations be slow as the consequence of other operations being fast is often a tradeoff we're willing to make. On the editor buffer, we drove toward getting all the operations to be fast. In an ideal world, that's certainly the best place to be, but it's also a matter of at what cost, in terms of code complexity and memory used and effort put into it. It may be that we're willing to tolerate some operations being less efficient as long as the most commonly used ones are very streamlined.

We talked about worst cases and degenerate cases, about knowing when we can expect our algorithms to degrade and whether we want to do something about that, or choose a different algorithm, or provide some protection against those kinds of cases coming through.

To a lesser extent, we also talked about memory use, how much overhead there is. When we went to a pointer-based linked list structure, we added a four-byte overhead for every element. We went to an eight-byte overhead once we got to binary trees, and maybe even a little bit more if we had balance factors included. Those things ride along with our data, increasing its size and memory footprint. So there's a certain amount of overhead built into the system. There's also this notion of excess capacity: to what extent do we over-allocate our data structures early, planning for future growth, and in some sense save ourselves the trouble of enlarging on demand? It's setting aside a large allotment and then using it up as we go.
Then, when we do exceed that capacity, we have to do it again. So where do we make those tradeoffs about how much excess capacity to keep and when to do the enlarging? It has a lot to do with what our general memory constraints are. I would say that memory has gotten to be quite free in recent processors and computer systems. You don't really think about 100 bytes here, 200 bytes there, 1,000 bytes there. Even megabytes you can talk about as though they're a drop in the bucket. So this is something that isn't given as much weight nowadays as it was ten years ago, when there were a lot more constraints. But with the move toward a lot more embedded device programming - looking at the iPhone or cell phones or things like that, where they don't have that same liberty to be wanton about memory - it's come back to being a little more careful and keeping an eye on it.

There's also an issue here which trades off a lot with runtime, which is redundancy versus sharing. Having two different ways to get at the same piece of data often will provide you with quick access. For example, if you have a card catalogue for a library where you can look up by author and you can look up by title, you probably are maintaining two data structures: one that manipulates them sorted by title, one that manipulates them sorted by author. They're both looking at the same books in the end. So maybe there's a set of pointers in one map over here, indexed by author, and another keyed by title over there. That means I'm repeating all my data - well, hopefully I'm sharing it with a pointer, so I'm not actually repeating all the data, but I do have two pointers, or potentially two or more pointers, to the same book, depending on the different ways you can get to it. That gives me that fast access. I want to know all the books written by Leo Lionni, I can find them easily, as well as finding all the books that have King in the title. I don't have to do searches on a data set that's been optimized for only one kind of access. So it's definitely a tradeoff: more space thrown at it to get fast access in different ways.
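As a rough sketch of that card-catalogue idea (the Book struct and the two map names are made up for illustration, and it uses the standard library containers rather than the class library's Map and Vector):

    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical record; both indexes point at the same Book objects.
    struct Book {
        std::string title;
        std::string author;
    };

    int main() {
        std::vector<Book> library = {
            {"Frederick", "Leo Lionni"},
            {"Swimmy", "Leo Lionni"},
            {"The Once and Future King", "T. H. White"}
        };

        // Two redundant indexes over the same shared data: extra space spent,
        // but fast look-up from either direction.
        std::map<std::string, std::vector<const Book*> > byAuthor;
        std::map<std::string, std::vector<const Book*> > byTitle;
        for (const Book& b : library) {
            byAuthor[b.author].push_back(&b);
            byTitle[b.title].push_back(&b);
        }

        // All books by Leo Lionni, found without scanning the whole library.
        for (const Book* b : byAuthor["Leo Lionni"]) {
            std::cout << b->title << std::endl;
        }
        return 0;
    }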
Then this last one is one that I had mentioned along the way, but I think it's one that's easy to overlook, which is that getting very excited about how fast you can make it, how small you can make it, how fancy it is, has a real downside in terms of how hard it was to write. Typically, when you go to these fancier strategies that are space efficient and time efficient, you had to pay for it somewhere. It wasn't for free that the easy, obvious code would do that. It tends to be complicated code that may use bit operations, that probably uses more pointers, that has a kind of density to it that means it's going to take you longer to get right. It's going to take you more time to debug. It's going to be more likely to have lurking bugs. It's going to be harder to maintain. If somebody comes along and wants to add a new operation, they open up your code and they're frightened away. It might be that that code will never get moved forward. Everybody will just open it and close it immediately and say, whatever, we'll make it work the way it is, rather than trying to improve it or move it forward, because it actually is kind of scary to look at.

So think about what's the simplest thing that could work. There's a movement that I think is pretty neat to think about, this idea of the agile programming methodology. It's based on the idea that, rather than planning for building the most super-stellar infrastructure for solving all the world's problems - say you're about to write a chess program, or actually something even simpler, you're going to write a checkers program. You say, I'm going to sit down and design the pieces and the board. Then you think, hey, checkers is just one embodiment of this. Why don't I try to make the uber-board that can be used for strategic conquest or for chess or Chinese checkers? You could have these hexagonal things. You start building this crazy infrastructure that could handle all of these cases equally well, even though all you really needed was checkers, a very simple 2D grid.
But you imagine the future needs way in advance of having them, and then you design something overly complicated. Then what happens is you get bogged down in that. It never happens, and the project dies. You lead a lonely life by yourself, eating oatmeal. I like oatmeal.

So one of the mottos is: design the simplest thing that could possibly work. I think that's a neat gift to yourself, to realize that the simplest thing that could possibly work - the thing that meets the needs you have, that has a decent enough [inaudible] and memory constraint right there, that meets your needs in the simplest form - is going to be the easiest thing to get working and running. If you later decide you want to go fancy, you can swap it out using the principles of abstraction and [inaudible]. It should be independent, when you decide you need that fancier thing.

I've got a couple of questions here, because I thought this would help us frame a little bit of thinking about these things. These are questions that I'm hoping you can answer for me. That's why I brought a cup full of candy. I'm prepared to offer bribery for people who help me answer these questions.

So let's talk about array versus vector. The C++ built-in array is what's behind the scenes of a vector, and then the vector adds convenience on the outside of it. Early on, I had said, once I tell you about arrays, put that in the back of your memory and just use vector instead, because everything array does, vector does nicely and cleanly, and it adds all sorts of features. Somebody give me a list of things that vector does much better than array. Way in the back. Bounds checking. Bounds checking. What else? Come on, a Blow Pop is on the line. I need to interrupt your - help us out. You've got bounds checking. What else? More? I want more. You've got nothing else for me? I don't even want a Blow Pop. You don't want a Blow Pop? All right.
I'm going to give your Blow Pop to somebody else. Right here, I'm going to give the Blow Pop to you. Tell me what makes vector good. It has operations that shuffle elements for you. Yeah, it does the insert, delete, moving stuff down. Also, it does the growing. You keep adding, and it just grows. Arrays don't do that. You want an array to grow, you grow it: you take the pointer, you copy stuff over, you delete the old one, you rearrange things. You do all the work. So vector buys you kind of array with benefits.

So it seems like, then, the naive takeaway would be, well, just use vector. Always use vector. Vector's always better than array. Is there ever a situation where array still is a pretty valid choice? What is it that array has that vector maybe doesn't? Well, when you add things dynamically to the vector, it actually doubles the size [inaudible] increasing it by one. If you know the exact size, and you know it's not going to change, you're [inaudible]. That's it exactly - one of its big advantages. Let's say you have a very small array planned for. The one on the practice exam was, if you're doing a calendar where you have 365 days in a year, and you're keeping track of the events on a particular day, you know how many days are in the year. There's never going to be any change in that. Well, leap year. But you know that, and you know it in advance. So why fuss with a data structure that does all this growing and shrinking and has some overhead associated with that, when in fact you know there's exactly this number? That's all you need. In fact, it's almost a little bit more convenient to set up. You can just declare it that size, and it's ready to go. With the vector, you have to go through and add all the elements and stuff like that. So there are a few situations where you might just say, I don't need the advantages. It adds some overhead that I don't want to pay, and I'm perfectly happy making it a small fixed array.
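A minimal sketch of that calendar case (the names are made up for illustration, and a leap-year slot is included so the size really is fixed):

    const int DAYS_IN_YEAR = 366;   // fixed and known in advance, leap year included

    // A plain built-in array fits here: the size never changes, so there's no need
    // for a structure that grows and shrinks on demand. Declare it and it's ready.
    int eventCount[DAYS_IN_YEAR] = {0};

    void recordEvent(int dayOfYear) {
        eventCount[dayOfYear]++;    // direct indexing, no resizing machinery involved
    }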
You'll still lose bounds checking, but if you are being careful, maybe you're comfortable with making that tradeoff. By losing bounds checking, you actually get a little bit of speed improvement back. The bounds check does cost you a little bit on every vector access that an array would not pay. That's really in the noise for most programs - I hesitate to even mention it, lest you think it matters much - but it is part of the thinking of a professional programmer. Sometimes that may be the thing it comes down to: making your constant factors streamlined enough for the operation you need.

So I put up here stack and queue versus vector. Stack and queue are really just vectors in disguise - vectors with fewer features, vectors that can't do as much - and one of the implementations for stack and queue could easily be layered on top of a vector. So why is it good to take away things? [Inaudible] tampering with contents of [inaudible]. Yeah, so it's enforcing discipline on how you use it. Stacks are LIFO, queues are FIFO. The way things go in, the way they go out, are very rigid. By using stack and queue, you are guaranteeing that you won't go in and accidentally put something at the head of the line, or remove something out of the middle of the stack, because you just don't have access to that. It's been tidied up and packaged into stack, which has push and pop, and queue, which has enqueue and dequeue. No other access implied or given, totally encapsulated from you. So, in fact, it allows you to maintain a discipline that the vector otherwise didn't strongly provide for you. Question? [Inaudible]. Another thing you could do is optimize a stack or a queue to be very good at the only things you're actively doing, which is pushing and popping, or enqueuing and dequeuing. I think that's worth a Smartie, don't you? I think that is. Exactly.
So by restricting what you have available, it actually gives you choices in the implementation that wouldn't have worked out as well with the full array of operations. That's exactly true. Stack and queue only adding and removing from one end means that something like a linked list, which isn't a good choice for implementing the general case of vector, is a perfectly good choice for the stack and queue case. So we're giving ourselves some freedom as the implementer; information about how it's being used can pay off.

It also means that when you read code, if you see code that's using something called a stack, or using a queue, you immediately know how it works. You can say, here's this search that's stuffing a bunch of things into a stack - I know they come out last-in, first-out. If you see code using a vector, you have to verify where they put things into the vector and where they pull them out. The code doesn't tell you the same things. It doesn't tell you the same story as it does when you see a word that has a strong meaning to it, like stack and queue.

Then there are these decisions to make about sorting things or not sorting things. We spent a long time talking about sorting, and how various algorithms approach the sorting problem and what they come up with. I have a question here: what sorting algorithm is best? Is there an answer to that question? Why would I ask a question that has no answer? It depends on what you expect it to be giving you. It depends - that's a great answer. It depends on what you expect it to give you. How big is it? What are its contents? Is it random? Is it sorted? Is it almost sorted? Is it reverse sorted? Is it drawn from a range of one to 1,000 values, or is it drawn from one to three values? A huge array that has just one, two, three in it might call for a different approach than one that has one to max int in it. So there is no answer to that; it depends. What do you know? If you don't know anything, give it to someone else. If you don't know, there are certain things that perform well, where you know that in the average case you don't have degenerate behaviors.
For example, [inaudible] is a great sort for the general case: an N log N sort that doesn't use any extra space the way merge sort does, and it doesn't have any degenerate cases. For an unknown set of inputs, it works pretty well. If you happen to have a little bit of data that's almost sorted, even Barack Obama's not-favorite bubble sort has a chance of being in the running, because it's already sorted.

Is it worth sorting? Calculating the payoff - we had a problem on one of the section exercises once, and I think it's a really good one to think through and revisit. The idea is that many operations are much more efficiently implemented if the data's sorted. You can think of them off the top of your head. You want the minimum? If it's sorted, it's very easy. You want the maximum? Very easy if it's sorted. You want to find the median? Also very easy if it's sorted. You want to find duplicates? Once they're sorted, they're all lined up together. You want to find the mode, which is the most frequent element? Also easier if it's sorted, because they'll all be grouped together in a run, so you can do a linear pass, counting what you see, to find the longest run. If you had two arrays and you wanted to merge them, or intersect them to find their common elements, it's much easier if they're sorted. All these things that are hard to do when the data's in a random order have much more streamlined algorithms once you invest in sorting it.

So there was a problem that asked, how do you know it's worth doing that? If you only plan on doing one merge, and you could do it in N squared, is it worth it to sort, at N log N, so you can get an O of N merge? It has a lot to do with how often you plan on doing the merge, how big your data set is, and what your constraints on the times are. It is interesting to think about that relationship.
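One rough way to put symbols on that question (a sketch; $k$ here just stands for however many of those fast look-ups you plan to do over $n$ elements):

    \[
      \text{unsorted: } k \text{ linear searches} \approx k\,n
      \qquad\text{vs.}\qquad
      \text{sorted: } n\log_2 n + k\log_2 n .
    \]

Sorting starts to win once $k\,n > n\log_2 n + k\log_2 n$, which works out to roughly $k > \log_2 n$ - about 20 look-ups for a million elements. The same style of estimate applies to repeated merges, just with different per-operation costs.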
It isn't for free, but potentially it's an investment: by sorting it, you can do a lot of fast searches later.

This one is interesting, because the two actually solve a lot of the same problems and differ in one small way. Consider a set - the set is backed by a balanced binary search tree, which is a sorted data structure internally - versus a sorted vector. If your goal were to keep track of all the students at Stanford, and you wanted to be able to look up people by name, then the contains operation of the set is doing a walk down the tree, and a binary search on the sorted vector follows the same kind of path, just working through a vector. So we can make the searching operations, like contains and other look-up based things, logarithmic in both cases.

So what advantage is there - this sorted vector and the set, which I've just been using interchangeably - is there something one can do that the other can't, or some advantage or disadvantage that's assumed in that decision? You don't have duplicates in a set, but you can have duplicates in your sorted vector. Yeah, so the set has another concern with just how it operates: you can't put duplicates in, so if you have two students with the same name, then in the set you're all coalesced down to one. Your transcripts all get merged, to your advantage or not. What else does set buy you that the vector doesn't? How hard is it to edit a sorted vector? You want to put something new in there? You're shuffling. You want to take something out? You're shuffling. The move to the pointer-based structure - the binary search tree behind the set - is giving us editability of the data structure, the convenient logarithmic time to remove and add elements using the pointer wiring, that the vector doesn't have. That tradeoff matters less when your situation is that you just don't do a lot of edits, and that's actually a very common situation.
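For the look-up half of that comparison, here is a minimal sketch of what the logarithmic contains on a sorted vector might look like (using std::vector and std::string rather than the class library's types):

    #include <string>
    #include <vector>

    // Binary search over a vector kept in sorted order: an O(log n) membership
    // test that halves the remaining range each step, much like walking down a
    // balanced search tree, but with no per-element pointer overhead.
    bool contains(const std::vector<std::string>& sorted, const std::string& key) {
        int low = 0, high = (int)sorted.size() - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;
            if (sorted[mid] == key) return true;
            if (sorted[mid] < key) low = mid + 1;
            else high = mid - 1;
        }
        return false;
    }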
The sorted vector happens to be a very underutilized, or under-appreciated, arrangement for data, because for most uses that aren't getting a lot of edits, having a linear-time edit isn't going to be a painful consequence. You're doing a lot of searches, you're getting that fast binary search where you really care about it, and there's very little overhead - a little bit of excess capacity, but none of the pointer-based overhead that came with the set. The set is adding on another eight bytes, potentially even a little bit more, of overhead on every element to give us that editability, which maybe isn't that important to us.

So if you're looking at it and saying, I'd like to have a sorted structure - well, both the set and the sorted vector let you access the elements in order. The iterator of the set goes in sorted order; going from zero to the end of the vector gives me order. I can browse them in sorted order. I can search them using that sortedness. Where they differ is what it takes to edit one. The one that invested more memory got faster editing.

The P-queue. This is kind of interesting. The P-queue is also a sorted structure, but not quite - it's a much more weakly sorted structure. If you're using the binary heap like we did on the P-queue assignment, it does give you access to the maximum value, and then a quick reshuffling to find the next one, and so on. But it doesn't really give you the full range. For example, if you were looking for the minimum in a P-queue, where is it? It's somewhere in the bottom. It's in the bottom-most level, but where across the bottom? No knowledge, right? It could be at any of the nodes that are at the bottom of the heap. So in fact it gives you access to a partial ordering of the data - in this case, relative to the prioritization - but not the full form of sorting the way a sorted vector has.
For example, the P-queue doesn't give you things like finding duplicates or finding the minimum or computing the mode; those aren't actually easy operations, because you have the data in heap form. I can get to the max quickly. So if I cared about pulling them out in sorted order to browse them, I could stuff things into a P-queue and then pull them out, but what's interesting about that is that it requires destroying the P-queue. The P-queue isn't really a place you store things and plan on using them again. It's the place you stick things in order to get them out in an order whose only value is in the dequeuing of them, pulling them out - finding the most promising option to look at in a search, or finding the highest-priority activity to take on, or something like that. If I wanted to be able to look at all the students in my class in sorted order, well, I could stick them in a P-queue and then pull them out, but that's sort of funny: I have to put them in the P-queue just to pull them back out. It wasn't a good way to store them for that browsing operation to be easily repeated. If I put them into a sorted vector, I could just keep iterating over it whenever I need to see it again.

So a lot of what it comes down to is having a pretty good visualization of what going to a pointer-based data structure buys you, and what it costs you, relative to contiguous memory. That's one of the tensions underneath most of these things: putting more memory into it by virtue of having a pointer wire, as opposed to a location-based arrangement. There's no information stored to get you from array index zero to array index two; they're just contiguous in memory. So it saves you space, because you only have to know where it starts and access contiguously from there. The pointer means there are more traversals, there's more memory, there's more opportunity to err, but it gives you this flexibility. It breaks down this contiguous block, which is kind of a monolith, into pieces that are more flexible in how you can rearrange them.
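Here is a small sketch of that "put them in just to pull them back out" pattern, using the standard library's priority_queue in place of the course's PQueue class:

    #include <iostream>
    #include <queue>
    #include <string>
    #include <vector>

    int main() {
        std::vector<std::string> students = {"Nick", "Ana", "Leland", "Bea"};

        // Stuff everything into the priority queue...
        std::priority_queue<std::string> pq(students.begin(), students.end());

        // ...then pull it back out. Each pop hands over the current maximum, so
        // the names come out in (reverse alphabetical) sorted order, but notice
        // the queue is emptied in the process: a one-shot ordering, not a
        // structure you can browse again the way you can a sorted vector.
        while (!pq.empty()) {
            std::cout << pq.top() << std::endl;
            pq.pop();
        }
        return 0;
    }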
So thinking about these things is an important skill to come away from 106B with: that, yeah, there's never one right answer. You're investing to solve this or that problem, and there are going to be downsides. There's no obvious triumph-over-all solution out there that's fast and cheap and small and easy to write.

So I put up here - I had to whittle this down to the five or six things that, at the end of 106B, I want you to feel like, that's what I got. That's the heart of the goal. The MVPs, the most valuable players of the 106B curriculum. You're looking back at it, and you go, where did I start, where did I end up, and how did my knowledge grow? I'm hoping these are the things that stand out for you as real growth areas.

One is this idea of abstraction. That was stressed early on. Here's a stack, here's a queue, here's a set. They do things for you. We don't know how they work yet, but you can make cool things happen with them. So you solve maze problems and random writers and do neat things with dictionaries without knowing how they work. You're leveraging existing material without knowledge of how it works, which helps to keep the complexity down, and you can solve much more interesting problems than you could without them. Try to go back and imagine the Markov random-writer problem, and now imagine doing it without the map in play. You'd have had to figure out, given this character, where to look it up and find a place to store it, and then store the character in some other array that was growing on demand as you put stuff in there. You would never have completed that program. You would still be writing it today, having pointer errors and whatnot behind the scenes.
Having those tools already there is like having the microwave and the food processor when you're about to cook, instead of doing all your work with a dull knife and a fire. You actually have some real power there.

Recursion was a big theme in the weeks that followed: how to solve these problems that have this self-similar structure, looking at exhaustive traversals and choice problems that can be solved with backtracking, having this idea of how to think about things using recursive problem solving and how to model that process on the computer. That was a very important thing that keeps coming back.

We looked at algorithm analysis, this idea of doing the sloppy math that computer scientists are fond of, talking about scalability and growth, knowing about the curves and the relationships, and how to look at algorithms. There is more formal analysis than this - quite a bit more formalism than we really went into - and if you were to go on in CS and look at it, you would see there's quite a lot of rigor that can be brought to that area. We were building more of an intuition for how to gauge things about algorithms and their growth.

Then this back half was all about implementation strategies and tradeoffs. Now that we've benefited from those abstractions and solved some cool problems, what's it like to actually be the person building them? What are the techniques? What does it take to wrestle a pointer or a linked list or a tree, heap, or graph into doing something really cool for you? How does that inform your intuition about what to expect from it, and the coding complexity inherent in achieving that result?

Hopefully, continuing with the theme that 106A started you on, there's this appreciation for [inaudible]. In the end, getting code that works but that's ugly just isn't, hopefully, what you're satisfied with.
You want to approach problem solving to produce beautiful, elegant, unified, clean, readable code that you would be happy to show to other people, be proud of, maintain, and live with - the kind of work that the other people you work with would appreciate being involved with.

Then there are things that are not so much - I don't think these are the most valuable players, but they came along the way - the idea of pointers and C++. C++ and its pointers are pretty much one and the same. We chose C++ for a language, so as a result we get to learn some C++, and because it's so heavily invested with pointers, we also have to do some practice with pointers to get our jobs done. I think of those as being off to the side. I don't think of them as the platform of things I worship; they are things we need to get our job done.

I did put a little note about pointers here, because there were quite a lot of comments on the mid-quarter evals that said: pointers, pointers still scare me. They make me crazy. I don't know enough about pointers. Here's a fact for you: you'll probably never feel like you know enough about pointers. Today, tomorrow, ten years from now, you'll still kind of feel like there's some real subtlety and trickiness in there. They're an extremely powerful facility that gives you direct control over allocation and deallocation and all the wiring and sharing that's important to building these complicated data structures. But there are so many pitfalls. I'm sure you've run into most of these, if not all of them, at some point - where you mishandled a pointer, or failed to allocate or deallocate - and whether the compiler or runtime error you see is helpful in sorting it out is a real question; it's a cause for long hours to disappear in the name of mastery of this. If you're still wary of pointers, you probably should be. Think of this as your first go-around. We're still at the journeyman's stage for programming here.
719 00:32:10,000 --> 00:32:14,000 So although there were pointers in 106a, they were very hidden. This 720 00:32:14,000 --> 00:32:17,000 is the first time you really see it and are exposed to it and trying to get your head 721 00:32:17,000 --> 00:32:18,000 around it. 722 00:32:18,000 --> 00:32:20,000 If you go on to 107 and 108, you'll just 723 00:32:20,000 --> 00:32:23,000 get more and more practice. It will 724 00:32:23,000 --> 00:32:26,000 become more solid as you keep working your way through it, but 725 00:32:26,000 --> 00:32:30,000 at this stage, it's okay 726 00:32:30,000 --> 00:32:31,000 and 727 00:32:31,000 --> 00:32:35,000 commonplace to still feel a little bit - you see the star, and you do a 728 00:32:35,000 --> 00:32:37,000 double take. That's just where you should be. 729 00:32:37,000 --> 00:32:40,000 It's just like in the 730 00:32:40,000 --> 00:32:43,000 Chinese restaurants. They put the little red star by food you want to be 731 00:32:43,000 --> 00:32:49,000 careful of. That's why they put the star for the 732 00:32:49,000 --> 00:32:52,000 pointer. I also wanted to 733 00:32:52,000 --> 00:32:54,000 zoom out farther, 734 00:32:54,000 --> 00:32:56,000 a decade from 735 00:32:56,000 --> 00:32:59,000 now, when you are 736 00:32:59,000 --> 00:33:01,000 thinking back on the 737 00:33:01,000 --> 00:33:04,000 whole of your education and what you learned. Do I really want to 738 00:33:04,000 --> 00:33:06,000 remember how to 739 00:33:06,000 --> 00:33:08,000 type in the keywords for the C++ generic 740 00:33:08,000 --> 00:33:10,000 template facility? No. 741 00:33:10,000 --> 00:33:13,000 If you code in C++ every day, you'll know that. If you don't, you 742 00:33:13,000 --> 00:33:14,000 can forget it. It's long gone. 743 00:33:14,000 --> 00:33:16,000 What I hope you'll carry away with you 744 00:33:16,000 --> 00:33:18,000 is a couple of 745 00:33:18,000 --> 00:33:19,000 concepts that 746 00:33:19,000 --> 00:33:21,000 broaden 747 00:33:21,000 --> 00:33:23,000 outside of CS. One is this idea of algorithmic thinking, 748 00:33:23,000 --> 00:33:27,000 how you approach something logically and step-wise and draw pictures and analyze it to understand 749 00:33:27,000 --> 00:33:29,000 the cases 750 00:33:29,000 --> 00:33:32,000 and to debug when things aren't working well. I think that's a skill that 751 00:33:32,000 --> 00:33:33,000 transcends 752 00:33:33,000 --> 00:33:37,000 the art of computer programming into other domains, any kind of thinking 753 00:33:37,000 --> 00:33:39,000 logically and problem solving and analyzing in that 754 00:33:39,000 --> 00:33:43,000 way. I also hope that some of the experimental things we did with the sort 755 00:33:43,000 --> 00:33:45,000 lab and the P-queue 756 00:33:45,000 --> 00:33:48,000 and some of the big O things we tried to do in class together help 757 00:33:48,000 --> 00:33:51,000 you to [inaudible] back-of-the-envelope calculations. This is something I'm 758 00:33:51,000 --> 00:33:53,000 a little bit of a - 759 00:33:53,000 --> 00:33:55,000 it's a pet issue of mine, which is that, 760 00:33:55,000 --> 00:33:57,000 I think, in the era of computers 761 00:33:57,000 --> 00:34:00,000 and whatnot, it's really easy to get distanced from numbers and start 762 00:34:00,000 --> 00:34:03,000 thinking, I just punch numbers into formulas and one comes out. It must be 763 00:34:03,000 --> 00:34:06,000 the truth.
Not having any intuition as to whether that's the right 764 00:34:06,000 --> 00:34:10,000 number, whether that number makes sense, whether it's grounded, 765 00:34:10,000 --> 00:34:14,000 and whether you actually are [inaudible] or so off because you've 766 00:34:14,000 --> 00:34:17,000 actually made an entry error, and you don't even notice it. 767 00:34:17,000 --> 00:34:20,000 So becoming comfortable with this idea of using numbers to drive things, 768 00:34:20,000 --> 00:34:21,000 being able to 769 00:34:21,000 --> 00:34:26,000 take some time trials, do some predictions, do some more time trials, 770 00:34:26,000 --> 00:34:28,000 match those things up, do a little sketching on your numbers and see the 771 00:34:28,000 --> 00:34:32,000 things that are working out. So being comfortable having the math and the theory 772 00:34:32,000 --> 00:34:34,000 reinforce each other 773 00:34:34,000 --> 00:34:34,000 and 774 00:34:34,000 --> 00:34:36,000 not getting too 775 00:34:36,000 --> 00:34:39,000 distanced from the fact that those numbers do play out in 776 00:34:39,000 --> 00:34:42,000 the real world in ways that are interesting. Don't just let the numbers themselves 777 00:34:42,000 --> 00:34:44,000 tell the story. There really is more 778 00:34:44,000 --> 00:34:47,000 to connect it up with. 779 00:34:47,000 --> 00:34:49,000 This idea of tradeoffs. 780 00:34:49,000 --> 00:34:51,000 You may not remember 781 00:34:51,000 --> 00:34:54,000 how you can write a linked list or how the hash table does 782 00:34:54,000 --> 00:34:56,000 the things it does 783 00:34:56,000 --> 00:35:00,000 and how to get one working, but hopefully you will take away this idea that 784 00:35:00,000 --> 00:35:03,000 the answer to most questions begins with the phrase, it depends. 785 00:35:03,000 --> 00:35:05,000 If I say, is it better to use a hash table or [inaudible] search tree, the answer 786 00:35:05,000 --> 00:35:09,000 will be, it depends. There's no, 787 00:35:09,000 --> 00:35:12,000 it's always better to use this sort, or, it's always 788 00:35:12,000 --> 00:35:14,000 wrong to use this approach. 789 00:35:14,000 --> 00:35:15,000 It's, it depends. Do 790 00:35:15,000 --> 00:35:18,000 you have the code written for that, and it works well, and you don't care how 791 00:35:18,000 --> 00:35:19,000 long it takes? 792 00:35:19,000 --> 00:35:23,000 Then bubble sort actually could be the right answer, 793 00:35:23,000 --> 00:35:25,000 Barack Obama notwithstanding. 794 00:35:25,000 --> 00:35:27,000 But knowing whether memory is 795 00:35:27,000 --> 00:35:32,000 at a premium, time is at a premium, what your mix of operations is, 796 00:35:32,000 --> 00:35:33,000 what 797 00:35:33,000 --> 00:35:34,000 language you're working with, what 798 00:35:34,000 --> 00:35:38,000 programming expertise you have, what timeline you have for developing it - all 799 00:35:38,000 --> 00:35:42,000 can play a role in deciding what choices to make and what strategy to pursue. 800 00:35:42,000 --> 00:35:45,000 So being very comfortable with this idea that it's about gathering information 801 00:35:45,000 --> 00:35:46,000 before you 802 00:35:46,000 --> 00:35:49,000 can commit to one strategy, as opposed to it always being 803 00:35:49,000 --> 00:35:52,000 this or that. 804 00:35:52,000 --> 00:35:54,000 So that's the 106b thing. 805 00:35:54,000 --> 00:35:57,000 I have a couple slides here on 806 00:35:57,000 --> 00:35:59,000 things that happen after 106b. I'm 807 00:35:59,000 --> 00:36:01,000 just going to go ahead and show you.
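As a sketch of the kind of back-of-the-envelope check described a moment ago - the selection sort, the sizes, and the timeSortMs helper are just an illustration, not the course's sort lab code - you can time an O(n^2) sort at a few input sizes and check that doubling n multiplies the running time by roughly four:

    #include <algorithm>
    #include <chrono>
    #include <cstdlib>
    #include <iostream>
    #include <utility>
    #include <vector>
    using namespace std;

    // A deliberately simple O(n^2) selection sort to time against the theory.
    void selectionSort(vector<int>& v) {
        for (size_t i = 0; i < v.size(); i++) {
            size_t smallest = i;
            for (size_t j = i + 1; j < v.size(); j++)
                if (v[j] < v[smallest]) smallest = j;
            swap(v[i], v[smallest]);
        }
    }

    // Fill a vector with random values, sort it, and report the elapsed milliseconds.
    double timeSortMs(int n) {
        vector<int> v(n);
        for (int& x : v) x = rand();
        auto start = chrono::steady_clock::now();
        selectionSort(v);
        auto stop = chrono::steady_clock::now();
        return chrono::duration<double, milli>(stop - start).count();
    }

    int main() {
        // Back-of-the-envelope prediction: for an O(n^2) algorithm, doubling n
        // should multiply the running time by roughly 4.
        for (int n = 2000; n <= 16000; n *= 2)
            cout << "n = " << n << "  ->  " << timeSortMs(n) << " ms" << endl;
        return 0;
    }

If the measured ratios come out near four, the math and the measurements are reinforcing each other; if they don't, that mismatch is exactly the kind of thing worth chasing down before trusting either one.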
808 00:36:01,000 --> 00:36:03,000 809 00:36:03,000 --> 00:36:06,000 This would be a great time to ask questions. In particular, if there's anything you're 810 00:36:06,000 --> 00:36:07,000 curious about. 811 00:36:07,000 --> 00:36:10,000 Where do we go from here, and 812 00:36:10,000 --> 00:36:13,000 how do we make good decisions about things that follow? The 813 00:36:13,000 --> 00:36:17,000 most obvious follow-on courses in CS - if you took 106b 814 00:36:17,000 --> 00:36:19,000 and you hated it, I'm very sorry. 815 00:36:19,000 --> 00:36:23,000 Hopefully 816 00:36:23,000 --> 00:36:25,000 the scarring 817 00:36:25,000 --> 00:36:26,000 will heal over and you'll lead a 818 00:36:26,000 --> 00:36:28,000 productive and happy life 819 00:36:28,000 --> 00:36:30,000 elsewhere. If you did like it and you think, 820 00:36:30,000 --> 00:36:33,000 I'm kind of interested in this, what do I do next to explore further? 821 00:36:33,000 --> 00:36:35,000 Where else can I go with this? 822 00:36:35,000 --> 00:36:38,000 I put up the intro 823 00:36:38,000 --> 00:36:40,000 courses with their numbers and names, and I'll talk a little bit 824 00:36:40,000 --> 00:36:44,000 about what each of these does to give you an idea of what opportunity it 825 00:36:44,000 --> 00:36:45,000 offers to you. 826 00:36:45,000 --> 00:36:49,000 The obvious follow-on is on the programming side - 106a and b 827 00:36:49,000 --> 00:36:51,000 are programming courses, building your programming mastery. 828 00:36:51,000 --> 00:36:54,000 107 follows in the same vein. We typically call those systems 829 00:36:54,000 --> 00:36:56,000 courses in CS nomenclature. 830 00:36:56,000 --> 00:36:59,000 That means practical, hands-on programming, getting more exposure 831 00:36:59,000 --> 00:37:00,000 to the system. 832 00:37:00,000 --> 00:37:05,000 So 107 builds on 106b's knowledge in a class called programming paradigms. 833 00:37:05,000 --> 00:37:08,000 Part of what it's trying to explore is different languages. 834 00:37:08,000 --> 00:37:11,000 So you actually look at the language C, kind of stepping back to old school. 835 00:37:11,000 --> 00:37:15,000 You look at the language Scheme, which is a functional language that 836 00:37:15,000 --> 00:37:17,000 was developed for AI purposes 837 00:37:17,000 --> 00:37:17,000 that 838 00:37:17,000 --> 00:37:21,000 has a long history coming from the math side of things. Probably looking at some 839 00:37:21,000 --> 00:37:25,000 scripting and flexible [inaudible] like Python. We kind of mix them up a 840 00:37:25,000 --> 00:37:28,000 little bit, so it's not exactly the same things, but looking at other 841 00:37:28,000 --> 00:37:30,000 languages, other ways of doing things. 842 00:37:30,000 --> 00:37:33,000 The other component of 107 is also looking at 843 00:37:33,000 --> 00:37:35,000 some more low-level things. 844 00:37:35,000 --> 00:37:37,000 What happens when you 845 00:37:37,000 --> 00:37:41,000 pass your code off to the compiler? What is it doing? What kind of analysis does it do of your 846 00:37:41,000 --> 00:37:44,000 code? How does it actually generate machine code? What happens in the machine layer? 847 00:37:44,000 --> 00:37:47,000 Understanding some more of the performance implications 848 00:37:47,000 --> 00:37:50,000 and how things are expressed. Getting a better understanding of pointers actually 849 00:37:50,000 --> 00:37:53,000 comes up because there is a certain amount of low-level coding that's being 850 00:37:53,000 --> 00:37:54,000 explored in this class.
851 00:37:54,000 --> 00:37:58,000 So building on the programming skill as we move toward a Unix 852 00:37:58,000 --> 00:38:00,000 Linux-based 853 00:38:00,000 --> 00:38:02,000 environment, so away from the Mac and Windows. 854 00:38:02,000 --> 00:38:03,000 855 00:38:03,000 --> 00:38:08,000 Just continuing to build larger programs with complicated 856 00:38:08,000 --> 00:38:10,000 features to grow your mastery. 857 00:38:10,000 --> 00:38:13,000 Internally, we call this class programming maturity, even though 858 00:38:13,000 --> 00:38:14,000 it's not its title. 859 00:38:14,000 --> 00:38:18,000 We see it that way in the sequence in terms of just growing you into 860 00:38:18,000 --> 00:38:19,000 a 861 00:38:19,000 --> 00:38:22,000 more versatile programmer, seeing different languages, different things 862 00:38:22,000 --> 00:38:25,000 and getting more understanding of what happens under the hood. 863 00:38:25,000 --> 00:38:29,000 It is a five-unit class. It's typically offered fall and spring. Most 864 00:38:29,000 --> 00:38:30,000 people think it is 865 00:38:30,000 --> 00:38:33,000 about as much work as 106b. Some a 866 00:38:33,000 --> 00:38:37,000 little more or a little less in different parts of the quarter, depending on what your 867 00:38:37,000 --> 00:38:38,000 background is and 868 00:38:38,000 --> 00:38:39,000 869 00:38:39,000 --> 00:38:41,000 how 106b worked for you. 870 00:38:41,000 --> 00:38:44,000 On the whole, I think most people think of it as being - would you guys 871 00:38:44,000 --> 00:38:45,000 say 107 is 872 00:38:45,000 --> 00:38:48,000 about as hard as 106b? 873 00:38:48,000 --> 00:38:51,000 Some ways, it's harder. Some ways, it's not. Maybe that's 874 00:38:51,000 --> 00:38:55,000 the deal. Then there's the 108 class, which follows onto 107, but that 875 00:38:55,000 --> 00:38:57,000 prerequisite isn't particularly strong. It's a Java class. 876 00:38:57,000 --> 00:39:00,000 So it's moving in the other direction. Rather than moving down in abstraction, it's moving 877 00:39:00,000 --> 00:39:02,000 up in abstraction, 878 00:39:02,000 --> 00:39:05,000 looking at more large-scale design, modern programming technique, 879 00:39:05,000 --> 00:39:06,000 using 880 00:39:06,000 --> 00:39:10,000 object-oriented facilities, thinking about large-scale programming. You do a 881 00:39:10,000 --> 00:39:11,000 big team project at the 882 00:39:11,000 --> 00:39:13,000 end, so a three-week, 883 00:39:13,000 --> 00:39:15,000 three-person, massive effort 884 00:39:15,000 --> 00:39:17,000 that is the first 885 00:39:17,000 --> 00:39:26,000 really big-scale thing that you'll be exposed to in the lower-division 886 00:39:26,000 --> 00:39:30,000 curriculum. [Inaudible]. Sometimes there's a 107l, which has some relationship to the 887 00:39:30,000 --> 00:39:33,000 106l in the way that it's more hands-on, more 888 00:39:33,000 --> 00:39:34,000 C++. 889 00:39:34,000 --> 00:39:37,000 I don't know whether it will be offered this spring or not, 890 00:39:37,000 --> 00:39:38,000 but 891 00:39:38,000 --> 00:39:41,000 if you look in Axess, it will tell you at least whether we have it tentatively on 892 00:39:41,000 --> 00:39:42,000 the schedule. 893 00:39:42,000 --> 00:39:46,000 I think that comes and goes with the interest from the students and the 894 00:39:46,000 --> 00:39:51,000 availability of someone to teach 895 00:39:51,000 --> 00:39:52,000 it. 896 00:39:52,000 --> 00:39:55,000 [Inaudible] can we watch [inaudible]?
897 00:39:55,000 --> 00:39:58,000 Yeah, 107 and 108 are offered on TV usually 898 00:39:58,000 --> 00:40:02,000 once a year. 107 is offered on TV this spring. The one in fall is 899 00:40:02,000 --> 00:40:03,000 not taped. 900 00:40:03,000 --> 00:40:06,000 It might be that the old lectures from spring are somewhere that 901 00:40:06,000 --> 00:40:09,000 you could dig them out, but I actually don't happen to know for sure. 902 00:40:09,000 --> 00:40:11,000 108 is on TV, I think, 903 00:40:11,000 --> 00:40:13,000 in the fall, not in the winter. So the fall lectures are probably still around for 904 00:40:13,000 --> 00:40:16,000 108, so you could take a look at them. You 905 00:40:16,000 --> 00:40:19,000 could certainly look at their websites, and 906 00:40:19,000 --> 00:40:23,000 there's often a lot of old materials and handouts 907 00:40:23,000 --> 00:40:26,000 and stuff that gives you at least some idea of things that have been 908 00:40:26,000 --> 00:40:29,000 covered in the most recent offering of it. 909 00:40:29,000 --> 00:40:34,000 Certainly showing up in the first week is one way to get an 910 00:40:34,000 --> 00:40:36,000 idea of what you're in for. 911 00:40:36,000 --> 00:40:38,000 108 is a four-unit class. 912 00:40:38,000 --> 00:40:41,000 Because it's in Java, there are certain things about Java, 913 00:40:41,000 --> 00:40:44,000 the safety and 914 00:40:44,000 --> 00:40:48,000 the attentiveness of the compiler, that make the error handling a little 915 00:40:48,000 --> 00:40:52,000 bit easier. So I don't think most people think of 108 as quite as intense as 916 00:40:52,000 --> 00:40:55,000 107 or 106b, but that project at the end has quite a reputation 917 00:40:55,000 --> 00:40:58,000 as being a 918 00:40:58,000 --> 00:41:02,000 real time suck. So it has lots to do with staying 919 00:41:02,000 --> 00:41:04,000 scheduled and motivated and working through that team project at 920 00:41:04,000 --> 00:41:05,000 the end. 921 00:41:05,000 --> 00:41:09,000 I think that's where most people find the big challenge of 922 00:41:09,000 --> 00:41:12,000 108. Then on the math side, in the 923 00:41:12,000 --> 00:41:15,000 nomenclature for CS, we have this idea of theory, introducing you to more of 924 00:41:15,000 --> 00:41:16,000 925 00:41:16,000 --> 00:41:19,000 the formalism for big O and discrete structures and doing proofs and 926 00:41:19,000 --> 00:41:21,000 thinking about logic. 927 00:41:21,000 --> 00:41:25,000 So we have a 103a and b and a 103x, which is a combined, 928 00:41:25,000 --> 00:41:28,000 accelerated form of a and b. They are math classes 929 00:41:28,000 --> 00:41:29,000 for introducing 930 00:41:29,000 --> 00:41:33,000 the kind of mathematics that is important for computer scientists. 931 00:41:33,000 --> 00:41:34,000 So that is, 932 00:41:34,000 --> 00:41:38,000 for people who are thinking about a CS major, another good course to get in 933 00:41:38,000 --> 00:41:39,000 early. Pretty 934 00:41:39,000 --> 00:41:43,000 much everything in the upper division of the CS major has 935 00:41:43,000 --> 00:41:47,000 these things layered down as prereqs. So these are the intro courses that serve as the 936 00:41:47,000 --> 00:41:50,000 gateway to prepping you to go on, and then looking at things in a more topical 937 00:41:50,000 --> 00:41:51,000 way.
938 00:41:51,000 --> 00:41:53,000 Looking at networks or security, 939 00:41:53,000 --> 00:41:54,000 HCI or 940 00:41:54,000 --> 00:41:59,000 artificial intelligence layers on a grounding in theory and programming 941 00:41:59,000 --> 00:42:03,000 that came from the intro courses. So 942 00:42:03,000 --> 00:42:07,000 that said, we're in the 943 00:42:07,000 --> 00:42:08,000 midst of a really 944 00:42:08,000 --> 00:42:10,000 serious curriculum revision, 945 00:42:10,000 --> 00:42:13,000 which is 946 00:42:13,000 --> 00:42:15,000 on the table and being designed right now, 947 00:42:15,000 --> 00:42:18,000 and was voted on by our faculty and approved, and has to go 948 00:42:18,000 --> 00:42:22,000 through the school of engineering and get approval. Then there's going to be a little bit of 949 00:42:22,000 --> 00:42:23,000 jostling as everything gets worked out. 950 00:42:23,000 --> 00:42:27,000 But starting next year, you're going to see some changes in the 951 00:42:27,000 --> 00:42:29,000 department. Some of these courses are going to 952 00:42:29,000 --> 00:42:33,000 get morphed, and 953 00:42:33,000 --> 00:42:36,000 they won't be exactly the way they are today. 954 00:42:36,000 --> 00:42:39,000 As a Stanford undergrad, you actually have the option of graduating under any set of 955 00:42:39,000 --> 00:42:42,000 requirements that are in play any time you're an undergrad. So you could graduate under this year's 956 00:42:42,000 --> 00:42:43,000 requirements, 957 00:42:43,000 --> 00:42:45,000 or you could wait. 958 00:42:45,000 --> 00:42:48,000 I think for most, the right answer's going to be to wait because I think the new 959 00:42:48,000 --> 00:42:51,000 arrangement is very much more student-friendly. 960 00:42:51,000 --> 00:42:53,000 It has more flexibility 961 00:42:53,000 --> 00:42:57,000 in the program. If you look at the CS program as it is, it has a lot of pick 962 00:42:57,000 --> 00:43:00,000 one of two, pick one of three, pick one of two, 963 00:43:00,000 --> 00:43:04,000 where you're very constrained on what you can choose. The new program is going to be 964 00:43:04,000 --> 00:43:07,000 more track-based, where you pick an area of interest. You say, I'm really interested in 965 00:43:07,000 --> 00:43:08,000 graphics. You do a little 966 00:43:08,000 --> 00:43:12,000 depth concentration in that, and then you just have room for a 967 00:43:12,000 --> 00:43:15,000 very wide selection of electives to fill up the program, as opposed to 968 00:43:15,000 --> 00:43:19,000 a smorgasbord where we force you to pick through a bunch of different areas. 969 00:43:19,000 --> 00:43:24,000 So for most students, I think it gives you flexibility and 970 00:43:24,000 --> 00:43:25,000 the options 971 00:43:25,000 --> 00:43:26,000 to 972 00:43:26,000 --> 00:43:30,000 pick the classes that are most interesting to you and get the material that you want. It's a growing 973 00:43:30,000 --> 00:43:32,000 recognition of the fact that CS 974 00:43:32,000 --> 00:43:35,000 has really broadened as a field. Some say that our major looks the same way 975 00:43:35,000 --> 00:43:37,000 it did ten years ago, 976 00:43:37,000 --> 00:43:40,000 a classic definition of here are these 977 00:43:40,000 --> 00:43:42,000 seven things that all matter. The 978 00:43:42,000 --> 00:43:45,000 things aren't seven anymore. There's now 15 of them, 25 of them.
If 979 00:43:45,000 --> 00:43:49,000 you look at the spectrum of things that computer science is now 980 00:43:49,000 --> 00:43:52,000 embracing, we can't say you have to take one of all these things without keeping you here 981 00:43:52,000 --> 00:43:53,000 for eight years. 982 00:43:53,000 --> 00:43:55,000 So we are 983 00:43:55,000 --> 00:43:58,000 allowing that our field's footprint has gotten so large that we need to 984 00:43:58,000 --> 00:44:02,000 let you pick and choose a little bit to get the slice that seems most 985 00:44:02,000 --> 00:44:03,000 appropriate for where you're headed. 986 00:44:03,000 --> 00:44:06,000 So you'll see more about that. 987 00:44:06,000 --> 00:44:10,000 If you are thinking about being a CS major 988 00:44:10,000 --> 00:44:13,000 and have not yet declared, 989 00:44:13,000 --> 00:44:17,000 I would advise 990 00:44:17,000 --> 00:44:22,000 that you add yourself to what's called the considering CS list, 991 00:44:22,000 --> 00:44:26,000 and you can get to this from the CS course advisor's page, 992 00:44:26,000 --> 00:44:27,000 which, 993 00:44:27,000 --> 00:44:30,000 off the top of my head, I think 994 00:44:30,000 --> 00:44:32,000 might be 995 00:44:32,000 --> 00:44:36,000 CSadvising, but I'm not positive. I'll put the link on 996 00:44:36,000 --> 00:44:39,000 our webpage when I remember what it is. 997 00:44:39,000 --> 00:44:42,000 It is a mailing list. So we have a mailing list of the declared undergrads, but we 998 00:44:42,000 --> 00:44:45,000 also have a list for people who are 999 00:44:45,000 --> 00:44:48,000 at least potentially interested in the major and just want to be kept up to date on 1000 00:44:48,000 --> 00:44:49,000 things that are happening. 1001 00:44:49,000 --> 00:44:52,000 There's going to be a lot of activity on this list in the 1002 00:44:52,000 --> 00:44:55,000 upcoming months as we publish the information about the new 1003 00:44:55,000 --> 00:44:56,000 curriculum. 1004 00:44:56,000 --> 00:44:57,000 So 1005 00:44:57,000 --> 00:45:00,000 you'll know what's coming down the pipe so you can make good decisions for 1006 00:45:00,000 --> 00:45:01,000 preparing for 1007 00:45:01,000 --> 00:45:04,000 what will be [inaudible], what the transition plan will be as we move into the new 1008 00:45:04,000 --> 00:45:05,000 courses 1009 00:45:05,000 --> 00:45:06,000 and how to make sure you 1010 00:45:06,000 --> 00:45:09,000 land 1011 00:45:09,000 --> 00:45:13,000 in the right place when all is said 1012 00:45:13,000 --> 00:45:14,000 and done. Then I put 1013 00:45:14,000 --> 00:45:15,000 a little note about 1014 00:45:15,000 --> 00:45:17,000 other majors that exist. 1015 00:45:17,000 --> 00:45:19,000 CS is 1016 00:45:19,000 --> 00:45:20,000 1017 00:45:20,000 --> 00:45:22,000 the home base for people who are 1018 00:45:22,000 --> 00:45:25,000 solving problems using computers. It spans a large variety of people 1019 00:45:25,000 --> 00:45:27,000 doing a lot of different things. 1020 00:45:27,000 --> 00:45:30,000 Then there's a couple of other majors that we participate in, in the interdisciplinary 1021 00:45:30,000 --> 00:45:32,000 sense, 1022 00:45:32,000 --> 00:45:34,000 where CS is part of a larger 1023 00:45:34,000 --> 00:45:37,000 context for doing things. So computer systems engineering, which is 1024 00:45:37,000 --> 00:45:40,000 between us and EE, is a little bit more about hardware.
1025 00:45:40,000 --> 00:45:43,000 The symbolic systems program, which is 1026 00:45:43,000 --> 00:45:47,000 run by linguistics and includes cooperation from psychology, philosophy, communications, and 1027 00:45:47,000 --> 00:45:51,000 CS, looking at artificial intelligence and modeling human 1028 00:45:51,000 --> 00:45:52,000 thought and 1029 00:45:52,000 --> 00:45:53,000 understanding 1030 00:45:53,000 --> 00:45:55,000 1031 00:45:55,000 --> 00:45:57,000 intelligence as expressed in that way. 1032 00:45:57,000 --> 00:45:59,000 Then there's a mathematical and computational science program that's homed 1033 00:45:59,000 --> 00:46:03,000 in the statistics department and is joint between math, computer science and 1034 00:46:03,000 --> 00:46:06,000 MS&E. It's more of an applied math degree. So taking 1035 00:46:06,000 --> 00:46:08,000 mathematical thinking and 1036 00:46:08,000 --> 00:46:09,000 1037 00:46:09,000 --> 00:46:12,000 using computers and statistical things to analyze and 1038 00:46:12,000 --> 00:46:16,000 think about things. A lot of financial people end up 1039 00:46:16,000 --> 00:46:20,000 being drawn to that particular one, or people who are doing risk analysis or 1040 00:46:20,000 --> 00:46:21,000 that sort of 1041 00:46:21,000 --> 00:46:24,000 combination of things. CS has a role to play in all of these because it's a 1042 00:46:24,000 --> 00:46:27,000 great tool for solving these kinds of problems. So 1043 00:46:27,000 --> 00:46:29,000 there are different ways to get at 1044 00:46:29,000 --> 00:46:30,000 different pieces 1045 00:46:30,000 --> 00:46:33,000 of the major. 1046 00:46:33,000 --> 00:46:35,000 I've put a couple of notes of things that 1047 00:46:35,000 --> 00:46:38,000 I think are worth 1048 00:46:38,000 --> 00:46:39,000 looking into 1049 00:46:39,000 --> 00:46:40,000 in case they, 1050 00:46:40,000 --> 00:46:43,000 at some point, are the right thing for you to look at. Section leading, 1051 00:46:43,000 --> 00:46:45,000 so joining the 106 staff 1052 00:46:45,000 --> 00:46:46,000 and shepherding the next generation 1053 00:46:46,000 --> 00:46:49,000 of students through the program, 1054 00:46:49,000 --> 00:46:52,000 is a great way to strengthen your own knowledge. There's nothing like 1055 00:46:52,000 --> 00:46:54,000 trying to explain recursion to someone else who 1056 00:46:54,000 --> 00:46:55,000 doesn't understand it to make sure you understand it yourself. 1057 00:46:55,000 --> 00:46:56,000 It 1058 00:46:56,000 --> 00:46:59,000 can really do a lot for 1059 00:46:59,000 --> 00:47:01,000 your personal connection to the craft, 1060 00:47:01,000 --> 00:47:04,000 as well as just being involved in a great community. The section leaders are a really 1061 00:47:04,000 --> 00:47:05,000 wonderful, 1062 00:47:05,000 --> 00:47:08,000 dedicated group of students who are 1063 00:47:08,000 --> 00:47:08,000 1064 00:47:08,000 --> 00:47:12,000 fun to be around and great people to make friends with. So I 1065 00:47:12,000 --> 00:47:14,000 highly encourage you to take a look at that. 1066 00:47:14,000 --> 00:47:16,000 We interview for that every quarter, 1067 00:47:16,000 --> 00:47:17,000 so usually about week six, 1068 00:47:17,000 --> 00:47:19,000 we put out 1069 00:47:19,000 --> 00:47:21,000 applications and bring people in for interviews. We're always looking for new people 1070 00:47:21,000 --> 00:47:24,000 to join on a quarterly basis. 1071 00:47:24,000 --> 00:47:27,000 There are research opportunities all around the campus.
One that the CS 1072 00:47:27,000 --> 00:47:30,000 department sponsors in particular is their [inaudible] summer institute. 1073 00:47:30,000 --> 00:47:33,000 So matching professors with undergrads to do work in the summer for 1074 00:47:33,000 --> 00:47:35,000 slave wages. 1075 00:47:35,000 --> 00:47:38,000 But it's a great opportunity. You can get started on 1076 00:47:38,000 --> 00:47:40,000 thinking about how research might be part of your future. 1077 00:47:40,000 --> 00:47:43,000 It's one of the best preparations for thinking about grad school. 1078 00:47:43,000 --> 00:47:46,000 We do have an honors program. 1079 00:47:46,000 --> 00:47:50,000 Like most, it's a chance to strike out on an independent project and 1080 00:47:50,000 --> 00:47:50,000 show your mettle. 1081 00:47:50,000 --> 00:47:53,000 We have a co-terminal master's program 1082 00:47:53,000 --> 00:47:54,000 where you can 1083 00:47:54,000 --> 00:47:57,000 get some of the depth 1084 00:47:57,000 --> 00:47:58,000 before you leave this place. 1085 00:47:58,000 --> 00:48:01,000 Not to mention, there's a million fabulous 1086 00:48:01,000 --> 00:48:03,000 things to do at Stanford, getting involved with 1087 00:48:03,000 --> 00:48:06,000 arts or nonprofit organizations or sports or the student leadership 1088 00:48:06,000 --> 00:48:07,000 or whatever. 1089 00:48:07,000 --> 00:48:10,000 Maybe the most important memory I have from my undergrad is that the 1090 00:48:10,000 --> 00:48:13,000 trickiest part of it was learning what to say no to. 1091 00:48:13,000 --> 00:48:14,000 Realizing that there are 1092 00:48:14,000 --> 00:48:17,000 10,000 fabulous, worthwhile opportunities, 1093 00:48:17,000 --> 00:48:21,000 and doing all of them only means sacrificing your sanity and 1094 00:48:21,000 --> 00:48:23,000 the quality of the things you do. 1095 00:48:23,000 --> 00:48:24,000 1096 00:48:24,000 --> 00:48:29,000 So be careful and thoughtful about choosing the things you do and embracing them. 1097 00:48:29,000 --> 00:48:32,000 Be okay with saying, I won't do this. Even though it's great to go overseas, and I 1098 00:48:32,000 --> 00:48:35,000 would've liked that opportunity, it's not going to fit, and I'm okay with that. Don't 1099 00:48:35,000 --> 00:48:38,000 beat yourself up over it and cause 1100 00:48:38,000 --> 00:48:41,000 yourself too much grief trying to do too many things. 1101 00:48:41,000 --> 00:48:44,000 I won't be here on Friday. Keith is going to come. He's going to teach you about C++ in the real 1102 00:48:44,000 --> 00:48:45,000 world. 1103 00:48:45,000 --> 00:48:47,000 I'm going to see you guys Friday at the final. 1104 00:48:47,000 --> 00:48:57,000 I think that's all I need to say for today.