1 00:00:00,000 --> 00:00:09,000 2 00:00:09,000 --> 00:00:11,000 3 00:00:11,000 --> 00:00:14,000 This presentation is delivered by the Stanford Center for Professional Development. 4 00:00:14,000 --> 00:00:23,000 5 00:00:23,000 --> 00:00:25,000 Hi there. Good afternoon. 6 00:00:25,000 --> 00:00:27,000 Welcome to Monday. 7 00:00:27,000 --> 00:00:28,000 Today's a big day. 8 00:00:28,000 --> 00:00:31,000 A big day, right? We've got a couple of recursive backtracking examples that we 9 00:00:31,000 --> 00:00:35,000 didn't get to on Friday that I'm gonna 10 00:00:35,000 --> 00:00:36,000 talk you through today, 11 00:00:36,000 --> 00:00:40,000 and then we're gonna talk a little bit about pointers, the long anticipated introduction to 12 00:00:40,000 --> 00:00:42,000 one of the scarier parts of the C++ language 13 00:00:42,000 --> 00:00:47,000 as kind of a step toward building linked lists and the recursive data idea 14 00:00:47,000 --> 00:00:49,000 that we will study today and 15 00:00:49,000 --> 00:00:51,000 continue into Wednesday. 16 00:00:51,000 --> 00:00:54,000 So the two samples I want to do are both based on the same backtracking pseudocode that we 17 00:00:54,000 --> 00:00:56,000 were using on Friday. I 18 00:00:56,000 --> 00:00:59,000 just want to go through them. I'm gonna do a little bit less attention to the code and 19 00:00:59,000 --> 00:01:03,000 a little bit more attention to the problem solving because at some point I think the problem 20 00:01:03,000 --> 00:01:03,000 solving 21 00:01:03,000 --> 00:01:05,000 is really where the tricky stuff comes on. 22 00:01:05,000 --> 00:01:08,000 And then the kind of - turning it into code, there's some details there but 23 00:01:08,000 --> 00:01:11,000 they're not actually as important, so I'm gonna de-emphasize that just a little bit here and 24 00:01:11,000 --> 00:01:13,000 think more about solving it. 25 00:01:13,000 --> 00:01:16,000 So this is the pseudocode for any form of a backtracker, right, that has 26 00:01:16,000 --> 00:01:17,000 some sort of, 27 00:01:17,000 --> 00:01:20,000 you know, failure and success cases, like when we run out of choices, we've hit a 28 00:01:20,000 --> 00:01:24,000 dead end, or we've hit a goal state, we've - you know, there's no more decisions to make, 29 00:01:24,000 --> 00:01:26,000 right, is this want to be, yes or no, 30 00:01:26,000 --> 00:01:28,000 and then otherwise there are some decisions to make. 31 00:01:28,000 --> 00:01:33,000 And for all the possible decisions we could make, we're gonna try one, be optimistic 32 00:01:33,000 --> 00:01:37,000 about it working out, make that recursive call that says that choice was where I 33 00:01:37,000 --> 00:01:42,000 wanted to be. If it returns true, or whatever the success return value is, 34 00:01:42,000 --> 00:01:44,000 then we return true, 35 00:01:44,000 --> 00:01:47,000 right? No need to look any further. That choice was good enough. 36 00:01:47,000 --> 00:01:51,000 If it didn't work then we gotta unmake - try some other choices. If we try all the 37 00:01:51,000 --> 00:01:55,000 things available to us and no case, right, did solve ever return true, 38 00:01:55,000 --> 00:01:59,000 then we can only conclude that the configuration as given to this call was 39 00:01:59,000 --> 00:02:00,000 unsolvable, 40 00:02:00,000 --> 00:02:02,000 which is where that return false 41 00:02:02,000 --> 00:02:06,000 causes it to back up to some earlier call and start unmaking 42 00:02:06,000 --> 00:02:08,000 that next layer of decisions, 43 00:02:08,000 --> 00:02:11,000 and then recur further down, eventually either 44 00:02:11,000 --> 00:02:15,000 unwinding the entire recursive [inaudible] all the way back to the beginning and saying it 45 00:02:15,000 --> 00:02:18,000 was totally unsolvable no matter what, or eventually finding 46 00:02:18,000 --> 00:02:21,000 some sequence of decisions that will lead to one that will get us to a success 47 00:02:21,000 --> 00:02:23,000 case. 48 00:02:23,000 --> 00:02:27,000 So the one I'm gonna show you here is the sudoku, 49 00:02:27,000 --> 00:02:31,000 which is those little puzzles that show up in 50 00:02:31,000 --> 00:02:34,000 newspapers everywhere. They're actually not 51 00:02:34,000 --> 00:02:37,000 attributed to - apparently, to the Japanese, but it apparently got a lot more popular 52 00:02:37,000 --> 00:02:40,000 under its Japanese name than it did under the original 53 00:02:40,000 --> 00:02:41,000 English name for it. 54 00:02:41,000 --> 00:02:44,000 And the goal of a sudoku, if you haven't ever done one, right, is it's 55 00:02:44,000 --> 00:02:48,000 a nine by nine grid in which you're trying to place numbers, 56 00:02:48,000 --> 00:02:51,000 and the requirement for the numbers is such that 57 00:02:51,000 --> 00:02:53,000 within any particular row, 58 00:02:53,000 --> 00:02:56,000 or any particular column, or any particular block, which is a three by 59 00:02:56,000 --> 00:02:58,000 three subsection of that, 60 00:02:58,000 --> 00:03:00,000 the numbers one through nine each appear once. 61 00:03:00,000 --> 00:03:02,000 So you can never have more than two ones in a row, 62 00:03:02,000 --> 00:03:04,000 two twos in a column, 63 00:03:04,000 --> 00:03:06,000 three threes in a block, or anything like that. 64 00:03:06,000 --> 00:03:09,000 And so there has to be some rearrangement, or, in fact, 65 00:03:09,000 --> 00:03:11,000 permutation 66 00:03:11,000 --> 00:03:14,000 of the numbers one through nine in each row, in each column, and each block, 67 00:03:14,000 --> 00:03:17,000 such that the whole puzzle kind of works out logically. 68 00:03:17,000 --> 00:03:22,000 So when it's given to you, you know, usually some fraction of the 69 00:03:22,000 --> 00:03:23,000 slots are already filled in, 70 00:03:23,000 --> 00:03:27,000 and the goal for you is to fill in those remaining slots without violating any of 71 00:03:27,000 --> 00:03:29,000 these rules. Now 72 00:03:29,000 --> 00:03:33,000 the sort of pure sudoku solvers 73 00:03:33,000 --> 00:03:35,000 don't really use guessing. 74 00:03:35,000 --> 00:03:37,000 It's considered, actually, poor form, 75 00:03:37,000 --> 00:03:39,000 you know, that you're supposed to actually logically work it out by constraints 76 00:03:39,000 --> 00:03:43,000 about what has to be true here versus what has to be true there to kind of 77 00:03:43,000 --> 00:03:44,000 realize that 78 00:03:44,000 --> 00:03:46,000 - what choices you have. We're 79 00:03:46,000 --> 00:03:48,000 actually not gonna be a pure, 80 00:03:48,000 --> 00:03:51,000 artistic, you know, sudoku solver. What we're gonna do is we're actually gonna use a brute force 81 00:03:51,000 --> 00:03:54,000 recursive algorithm that's just gonna try recursive backtracking, which is to 82 00:03:54,000 --> 00:03:55,000 say, 83 00:03:55,000 --> 00:03:56,000 make an assignment, 84 00:03:56,000 --> 00:03:59,000 be optimistic about it, and if it works out, great, and if not, 85 00:03:59,000 --> 00:04:02,000 we'll eventually come back to that decision and revisit it. 86 00:04:02,000 --> 00:04:04,000 So what we have here basically is a big decision problem. 87 00:04:04,000 --> 00:04:06,000 You know, of the 88 00:04:06,000 --> 00:04:08,000 81 squares on here, 89 00:04:08,000 --> 00:04:10,000 you know, about 50 of them, right, need to be 90 00:04:10,000 --> 00:04:12,000 chosen. 91 00:04:12,000 --> 00:04:15,000 For each of those 50 squares, right, we're gonna do them one at a time and recur 92 00:04:15,000 --> 00:04:19,000 on the remaining ones. So, you know, choose one then we'll just go left to right from the 93 00:04:19,000 --> 00:04:20,000 top. 94 00:04:20,000 --> 00:04:23,000 Choose that one at the top, make an assignment that works, and so that's what 95 00:04:23,000 --> 00:04:26,000 we'll use the context we have in problem here. So, for example, if you look at 96 00:04:26,000 --> 00:04:28,000 this first row, 97 00:04:28,000 --> 00:04:31,000 there's a one in this column so we can't use one. There's 98 00:04:31,000 --> 00:04:33,000 a two in that block so we can't use two, 99 00:04:33,000 --> 00:04:37,000 but there's not a three in either that row, or that column, or that block, so we'll say, well, three looks good, you 100 00:04:37,000 --> 00:04:39,000 know, just trying the numbers in order. 101 00:04:39,000 --> 00:04:43,000 We'll be optimistic, say that works, and say, well if we planted a three here, could 102 00:04:43,000 --> 00:04:46,000 we recursively solve the remaining 49 holes 103 00:04:46,000 --> 00:04:48,000 and work it out? 104 00:04:48,000 --> 00:04:49,000 And so 105 00:04:49,000 --> 00:04:51,000 we get to that next one - we look at this one and say, okay, well 106 00:04:51,000 --> 00:04:53,000 we could put a one here, 107 00:04:53,000 --> 00:04:57,000 right, because there's not a one in this column, not a one in that row, not a one in 108 00:04:57,000 --> 00:05:01,000 that block, so we'll kind of move on. I'm gonna do a little demo of that for you. And maybe it's 109 00:05:01,000 --> 00:05:05,000 to kind of keep moving our way across, and only when 110 00:05:05,000 --> 00:05:06,000 we get to a dead end 111 00:05:06,000 --> 00:05:08,000 based on some of our earlier decisions will we unmake 112 00:05:08,000 --> 00:05:10,000 and come back. So 113 00:05:10,000 --> 00:05:15,000 let me - okay. 114 00:05:15,000 --> 00:05:20,000 So that's the same set of numbers there. So I put the three up in the corner. 115 00:05:20,000 --> 00:05:20,000 116 00:05:20,000 --> 00:05:23,000 And so it puts the one here thinking, okay, that looks good. So it gets to the 117 00:05:23,000 --> 00:05:25,000 next square over here 118 00:05:25,000 --> 00:05:26,000 and then 119 00:05:26,000 --> 00:05:29,000 the one can't be used because it's actually already in use both in that column and 120 00:05:29,000 --> 00:05:33,000 in the row we're building so far, but two can be used. Two doesn't conflict with anything we 121 00:05:33,000 --> 00:05:34,000 have so far. 122 00:05:34,000 --> 00:05:36,000 And so it just keeps going optimistically, 123 00:05:36,000 --> 00:05:39,000 right? At that stage over there it turns out almost all the numbers are in 124 00:05:39,000 --> 00:05:43,000 use. Most of the numbers all the way up through seven and nine are there, and seven's 125 00:05:43,000 --> 00:05:46,000 in its column. So, in fact, the only number that that could possibly be is nine, so 126 00:05:46,000 --> 00:05:49,000 the only choice we have here to try is nine, 127 00:05:49,000 --> 00:05:51,000 and then we'll place the seven next 128 00:05:51,000 --> 00:05:54,000 to it. And so now we have a whole row that doesn't conflict with any of its blocks this 129 00:05:54,000 --> 00:05:56,000 way, and then we just keep moving on, right, 130 00:05:56,000 --> 00:05:57,000 so that is to keep kind of 131 00:05:57,000 --> 00:05:59,000 going from top to bottom, left to right. 132 00:05:59,000 --> 00:06:01,000 We'll place that four. We'll 133 00:06:01,000 --> 00:06:04,000 place another one. Place a three. 134 00:06:04,000 --> 00:06:07,000 So it's actually choosing them - it's actually going in order from one to nine just picking 135 00:06:07,000 --> 00:06:09,000 the first one that fits, 136 00:06:09,000 --> 00:06:11,000 right, so 137 00:06:11,000 --> 00:06:13,000 when we get to the next square here, right, 138 00:06:13,000 --> 00:06:17,000 it can't use one, it can't use two, it can't use three, it can't use four, it can't use five because they're all in 139 00:06:17,000 --> 00:06:18,000 that row. It can't 140 00:06:18,000 --> 00:06:20,000 use six because it's in the column. It 141 00:06:20,000 --> 00:06:22,000 can't use seven 142 00:06:22,000 --> 00:06:24,000 because it's in that row, but it should be able to use eight, 143 00:06:24,000 --> 00:06:26,000 and so it'll place an eight there. 144 00:06:26,000 --> 00:06:29,000 So it just kind of examines them in sequence until it finds the first one that 145 00:06:29,000 --> 00:06:32,000 doesn't violate any things already decided, 146 00:06:32,000 --> 00:06:35,000 and then kind of moves optimistically forward. 147 00:06:35,000 --> 00:06:38,000 So about this point, right, we're doing pretty well, but we're starting to 148 00:06:38,000 --> 00:06:40,000 run into some troubles if you look at 149 00:06:40,000 --> 00:06:42,000 the next one over here. 150 00:06:42,000 --> 00:06:44,000 It'll place the six there, 151 00:06:44,000 --> 00:06:46,000 but then it will - 152 00:06:46,000 --> 00:06:50,000 once the six is placed there, then it looks to the right and it says, oh, I need to put a 153 00:06:50,000 --> 00:06:52,000 nine there. That's the only number left. It tries all the numbers one, two, 154 00:06:52,000 --> 00:06:53,000 three, four, 155 00:06:53,000 --> 00:06:55,000 and it says that isn't gonna work. 156 00:06:55,000 --> 00:06:59,000 And so it actually fails on the rightmost column, 157 00:06:59,000 --> 00:07:02,000 which causes it to back up to the one right before and it says, well, why don't you try 158 00:07:02,000 --> 00:07:05,000 something else here? Well, it looks at seven, eight, and nine, none of those can work either, 159 00:07:05,000 --> 00:07:07,000 so it's actually gonna back up even further 160 00:07:07,000 --> 00:07:10,000 and then say, well what if we try to put a nine here? That also doesn't work. So now it's 161 00:07:10,000 --> 00:07:14,000 gonna start seeing that it's kind of unwinding as the constraints we have made have kind of 162 00:07:14,000 --> 00:07:17,000 got us down a dead end and it's slowly working its way back out to where the 163 00:07:17,000 --> 00:07:21,000 original bad decision was made. 164 00:07:21,000 --> 00:07:24,000 So it tries again on moving nine in here, and moving across, right, 165 00:07:24,000 --> 00:07:26,000 but again, kind of, 166 00:07:26,000 --> 00:07:27,000 167 00:07:27,000 --> 00:07:30,000 you know, working its way forward but then kind of backing its way up. And 168 00:07:30,000 --> 00:07:32,000 let me go ahead 169 00:07:32,000 --> 00:07:37,000 and just run it faster so you can kind of see it. 170 00:07:37,000 --> 00:07:41,000 But, you know, it's working on that row for a while, but essentially to note that the top row stays 171 00:07:41,000 --> 00:07:44,000 relatively constant. It kind of believes that, well, that first three must have been good because, you 172 00:07:44,000 --> 00:07:47,000 know, we're getting somewhere on that, and it keeps kind of going. You can see 173 00:07:47,000 --> 00:07:50,000 that the activity is kind of always at the kind of 174 00:07:50,000 --> 00:07:54,000 tail end of that decision making, which eventually, right, 175 00:07:54,000 --> 00:07:56,000 worked its way out. And so it turns out, like, those three ones that we did put in the 176 00:07:56,000 --> 00:07:59,000 first spots were fine. 177 00:07:59,000 --> 00:08:01,000 That is, choices, right, it did work out. 178 00:08:01,000 --> 00:08:04,000 We didn't know that when we started, right, but it was optimistic, right, it put it down 179 00:08:04,000 --> 00:08:05,000 and then kept moving, and 180 00:08:05,000 --> 00:08:08,000 then eventually, right, 181 00:08:08,000 --> 00:08:12,000 worked out how the other things that had to get placed to make the whole puzzle 182 00:08:12,000 --> 00:08:13,000 be solvable. 183 00:08:13,000 --> 00:08:18,000 And so this thing can solve, actually, any solvable sudoku, right, and if it's not animating, 184 00:08:18,000 --> 00:08:19,000 instantaneously, 185 00:08:19,000 --> 00:08:21,000 even though it is really doing it in a fairly crude way, 186 00:08:21,000 --> 00:08:24,000 right, it's basically just trying everything it can, 187 00:08:24,000 --> 00:08:27,000 moving forward, and only when it kind of reaches a contradiction 188 00:08:27,000 --> 00:08:30,000 based on some of those choices that they will - that can't possibly be because at 189 00:08:30,000 --> 00:08:33,000 this point I'm forced into a situation where there's nothing that works in this 190 00:08:33,000 --> 00:08:33,000 square, 191 00:08:33,000 --> 00:08:36,000 so it must be that some earlier decision was wrong. 192 00:08:36,000 --> 00:08:39,000 And you notice that when it backs up, it backs up to the most immediate decision. 193 00:08:39,000 --> 00:08:42,000 So if you think of it in terms of recursive call, here's your first decision, your second 194 00:08:42,000 --> 00:08:44,000 decision, your third decision, your fourth decision, your fifth decision. If you get down 195 00:08:44,000 --> 00:08:47,000 here and you're, like, trying to make your eighth decision, 196 00:08:47,000 --> 00:08:49,000 and there's nothing that works, right, 197 00:08:49,000 --> 00:08:53,000 the decision that you come back to will be your seventh one. The one right 198 00:08:53,000 --> 00:08:56,000 before it. You don't go throw everything away and start over. You don't go all the way 199 00:08:56,000 --> 00:08:59,000 back to the beginning and say, oh, well that didn't work, let's try again. 200 00:08:59,000 --> 00:09:01,000 It's that you actually use the kind of context to say, well, 201 00:09:01,000 --> 00:09:04,000 the last decision I made was probably the one that needs a little fixing. Let me 202 00:09:04,000 --> 00:09:08,000 just back right up to that one. That's the way the calls unwind, and it says we'll pick up 203 00:09:08,000 --> 00:09:11,000 trying some other options there. Which ones have we not tried on that one? And 204 00:09:11,000 --> 00:09:15,000 then go forward again, and again, if you get down to that eighth decision and you're still 205 00:09:15,000 --> 00:09:18,000 stuck, you come back to the seventh decision again, and only after kind of the seventh decision 206 00:09:18,000 --> 00:09:21,000 has gone back and forth with the eighth unsuccessfully 207 00:09:21,000 --> 00:09:25,000 through all its options would we eventually return to that sixth decision, and 208 00:09:25,000 --> 00:09:29,000 potentially back to the fifth, and fourth, and so on. 209 00:09:29,000 --> 00:09:31,000 The code for this guy, 210 00:09:31,000 --> 00:09:32,000 211 00:09:32,000 --> 00:09:36,000 a little abstracted 212 00:09:36,000 --> 00:09:39,000 that should very much fit the pattern of what you think recursive backtracking 213 00:09:39,000 --> 00:09:40,000 looks like, and then the kind of 214 00:09:40,000 --> 00:09:43,000 sort of goofier parts that are about, well, what does it mean to test a 215 00:09:43,000 --> 00:09:47,000 sudoku for being a sign of having conflicts is actually then 216 00:09:47,000 --> 00:09:50,000 passed out into these helper functions that manage the more 217 00:09:50,000 --> 00:09:52,000 domain specific parts of the problem. 218 00:09:52,000 --> 00:09:55,000 So at the very beginning it's like, find an unassigned location on the grid, and 219 00:09:55,000 --> 00:09:58,000 it returns the row and column by reference. It turns out in this case 220 00:09:58,000 --> 00:10:01,000 those are reference parameters. So [inaudible] searches from top to bottom to find the 221 00:10:01,000 --> 00:10:05,000 first slot that actually does not have a value currently in it. 222 00:10:05,000 --> 00:10:08,000 If it never finds one, so exhaustively searched the grid and didn't find one, then it 223 00:10:08,000 --> 00:10:11,000 must be that we have a working sudoku because we never put a number in 224 00:10:11,000 --> 00:10:17,000 unless it worked for us, and so if we've managed to assign them all, we're done. 225 00:10:17,000 --> 00:10:19,000 If this didn't return true, 226 00:10:19,000 --> 00:10:21,000 that meant it found one, and it assigned them a row and column, 227 00:10:21,000 --> 00:10:24,000 and then what we're gonna go through the process of is assigning that row and 228 00:10:24,000 --> 00:10:24,000 column. 229 00:10:24,000 --> 00:10:26,000 So we look at the numbers one through nine. 230 00:10:26,000 --> 00:10:28,000 If there are no conflicts for that number, 231 00:10:28,000 --> 00:10:32,000 so it doesn't conflict with the row, column, or block, 232 00:10:32,000 --> 00:10:35,000 that number isn't already in use in one of those places, then we go ahead and make 233 00:10:35,000 --> 00:10:36,000 the assignment, 234 00:10:36,000 --> 00:10:40,000 and then we see if we can solve it from here. So having updated the grid 235 00:10:40,000 --> 00:10:42,000 to show that new number's in play, 236 00:10:42,000 --> 00:10:46,000 you know, if we move on, the next [inaudible] of sudoku will then do a 237 00:10:46,000 --> 00:10:50,000 search for find unassigned location. This time the one that we previously found, right, 238 00:10:50,000 --> 00:10:52,000 has been assigned, so it actually won't get triggered on that one. 239 00:10:52,000 --> 00:10:55,000 It'll look past it further down into the puzzle, 240 00:10:55,000 --> 00:10:57,000 and eventually either find 241 00:10:57,000 --> 00:11:00,000 the next one to make a call on, and kind of work its way through, or to - you have to 242 00:11:00,000 --> 00:11:02,000 solve the whole thing. 243 00:11:02,000 --> 00:11:06,000 If it didn't work, so we made that assignment to the number nine, and we went to solve, 244 00:11:06,000 --> 00:11:07,000 and eventually this 245 00:11:07,000 --> 00:11:11,000 tried all its possibilities from here and nothing came up good, then this 246 00:11:11,000 --> 00:11:13,000 unassigned 247 00:11:13,000 --> 00:11:15,000 constant is used to unmake that decision, 248 00:11:15,000 --> 00:11:18,000 and come back around, and try 249 00:11:18,000 --> 00:11:21,000 assigning it a different number. If we try all of our examples, so for 250 00:11:21,000 --> 00:11:25,000 example, if we never find one that doesn't already conflict, or 251 00:11:25,000 --> 00:11:29,000 if we try each one and it comes back false, right, that this return false here is 252 00:11:29,000 --> 00:11:30,000 what's triggering 253 00:11:30,000 --> 00:11:33,000 the backtracking up to the previous recursive call 254 00:11:33,000 --> 00:11:37,000 to reconsider some earlier decision - the most recent early decision 255 00:11:37,000 --> 00:11:38,000 for this one, 256 00:11:38,000 --> 00:11:40,000 and say that was really our mistake, right, 257 00:11:40,000 --> 00:11:42,000 we've got to unmake that one. 258 00:11:42,000 --> 00:11:45,000 So it should look like kind of all the recursive backtracking all through looks the same, 259 00:11:45,000 --> 00:11:47,000 right? You know, if 260 00:11:47,000 --> 00:11:48,000 we're at the end, 261 00:11:48,000 --> 00:11:49,000 here's our base cases 262 00:11:49,000 --> 00:11:53,000 for all our options. If we can make this choice, make it, try to solve it, be 263 00:11:53,000 --> 00:11:56,000 optimistic, if it works out, return. 264 00:11:56,000 --> 00:11:59,000 Otherwise, unmake it, allow the loop to iterate a 265 00:11:59,000 --> 00:12:02,000 few more times trying again. If all of those things 266 00:12:02,000 --> 00:12:06,000 fail to find a solution then that return false here will cause the backtracking 267 00:12:06,000 --> 00:12:12,000 out of this decision. I just let it 268 00:12:12,000 --> 00:12:15,000 become constant. You know, I made it up. I used negative one, in fact, 269 00:12:15,000 --> 00:12:16,000 just to know that 270 00:12:16,000 --> 00:12:20,000 it has no contents. 271 00:12:20,000 --> 00:12:26,000 Just specific to sudoku in this case. Now would be a great time 272 00:12:26,000 --> 00:12:30,000 to ask a question. You guys kind of have that sort of half-okay 273 00:12:30,000 --> 00:12:34,000 look on your face. It could be I'm totally bored. It could be I'm totally lost. Where do row and column 274 00:12:34,000 --> 00:12:35,000 get assigned? So they 275 00:12:35,000 --> 00:12:38,000 - finding assigned location takes [inaudible] reference. 276 00:12:38,000 --> 00:12:41,000 So if you look at the full code for this - this is actually in the handout I gave out last time, and so 277 00:12:41,000 --> 00:12:44,000 there's a pass by reference in that function, and it returns true if it assigned 278 00:12:44,000 --> 00:12:46,000 them something, and then they have the 279 00:12:46,000 --> 00:12:49,000 coordinates of that 280 00:12:49,000 --> 00:12:51,000 unassigned location. Question over here. How'd you know 281 00:12:51,000 --> 00:12:55,000 to write sol sudoku as a bool rather than as, like, a void function? Well, in this 282 00:12:55,000 --> 00:12:57,000 case - that's a great question, right, 283 00:12:57,000 --> 00:13:01,000 in most cases in recursive backtracking I am trying to 284 00:13:01,000 --> 00:13:05,000 discover the success or failure of an operation, and so that's a good way to tell 285 00:13:05,000 --> 00:13:07,000 because otherwise I need to know did it work. 286 00:13:07,000 --> 00:13:10,000 And so if I made it void then there had to be some other way I figured it out. Maybe, 287 00:13:10,000 --> 00:13:13,000 you know, that you have to check the grid when you're done and see that it has - no longer has 288 00:13:13,000 --> 00:13:16,000 any unassigned locations, but the bool is actually just the easiest way to get that 289 00:13:16,000 --> 00:13:17,000 information out. 290 00:13:17,000 --> 00:13:19,000 So typically your recursive backtracking machine 291 00:13:19,000 --> 00:13:21,000 will probably return something. 292 00:13:21,000 --> 00:13:22,000 293 00:13:22,000 --> 00:13:25,000 Often it's a true or false. It could be, you know, in some other case, just some other 294 00:13:25,000 --> 00:13:27,000 good know value, and some other 295 00:13:27,000 --> 00:13:31,000 sentinel that says, you know, bad value. So, for example, in the finding an anagram of the 296 00:13:31,000 --> 00:13:34,000 words it might be that it returned the word if it found one, or returned an empty string if 297 00:13:34,000 --> 00:13:36,000 it didn't. So using some sort of state that says 298 00:13:36,000 --> 00:13:39,000 here's how - if you made the call, you'll know whether it worked because we 299 00:13:39,000 --> 00:13:40,000 really do need to know. 300 00:13:40,000 --> 00:13:44,000 Make the call and find out whether it worked. Well, how are we gonna find out? One of the best ways to 301 00:13:44,000 --> 00:13:47,000 get that information is from a return value. Way in 302 00:13:47,000 --> 00:13:52,000 the back? Exactly does the return calls trigger in the backtracking 303 00:13:52,000 --> 00:13:52,000 [inaudible]? 304 00:13:52,000 --> 00:13:56,000 So think about this as that, you know, it's hard to think about because you have to kind of have kind of 305 00:13:56,000 --> 00:13:58,000 both sides of the recursion in your head, but 306 00:13:58,000 --> 00:14:00,000 when - let's say we're on 307 00:14:00,000 --> 00:14:02,000 the fifth decision, right? 308 00:14:02,000 --> 00:14:05,000 We make the call to solve the sudoku that's gonna look at the sixth 309 00:14:05,000 --> 00:14:06,000 decision, right? 310 00:14:06,000 --> 00:14:10,000 If the sixth decision fails, it says, I looked at all my choices and none of them 311 00:14:10,000 --> 00:14:11,000 worked, right? 312 00:14:11,000 --> 00:14:12,000 It's that 313 00:14:12,000 --> 00:14:15,000 that causes the fifth decision to say, well I - here go solve for the sixth decision 314 00:14:15,000 --> 00:14:18,000 down. Well the sixth decision came back with a false. It got to this case after 315 00:14:18,000 --> 00:14:19,000 trying all its options, 316 00:14:19,000 --> 00:14:23,000 then it's the fifth decision that then says, okay, well then I'd better unmake the decision I 317 00:14:23,000 --> 00:14:24,000 made and try again. 318 00:14:24,000 --> 00:14:28,000 And so at any given stage you can think about the caller and the callee, it's saying, 319 00:14:28,000 --> 00:14:31,000 I'm making a decision. You tell me if you can make it work from here. 320 00:14:31,000 --> 00:14:35,000 If the delegate that you passed on the smaller form of the recursive problem comes 321 00:14:35,000 --> 00:14:39,000 back with a return false, and then I've tried everything I could possibly do, but 322 00:14:39,000 --> 00:14:42,000 there was no way. The situation you gave me was unworkable. 323 00:14:42,000 --> 00:14:46,000 And that says, oh, okay, well you know what, it must have been my fault, right? I put 324 00:14:46,000 --> 00:14:49,000 the wrong number in my box. Let me try a different number and now you try again. 325 00:14:49,000 --> 00:14:52,000 And so they're constantly kind of these - you have to keep active in your mind the 326 00:14:52,000 --> 00:14:52,000 idea 327 00:14:52,000 --> 00:14:55,000 that there's this whole chain of these things being stacked up, each of them 328 00:14:55,000 --> 00:14:59,000 being optimistic and then delegating down. And if your delegate comes back 329 00:14:59,000 --> 00:15:00,000 with the - 330 00:15:00,000 --> 00:15:02,000 there was nothing I could do, 331 00:15:02,000 --> 00:15:06,000 then you have to revisit your earlier optimistic decision, unmake it, and 332 00:15:06,000 --> 00:15:07,000 try again. 333 00:15:07,000 --> 00:15:09,000 And so 334 00:15:09,000 --> 00:15:12,000 that - this is the return false to the outer call, 335 00:15:12,000 --> 00:15:16,000 the one that made the solved call 336 00:15:16,000 --> 00:15:17,000 that unwinds from. 337 00:15:17,000 --> 00:15:21,000 Way over here? [Inaudible]? 338 00:15:21,000 --> 00:15:26,000 Pretty much any recursive piece of thing can be reformed into a backtracking form, right? 339 00:15:26,000 --> 00:15:29,000 You need to - to do a little bit like the [inaudible] we tried last time, 340 00:15:29,000 --> 00:15:32,000 we'll take a void returning thing and make it return some true or false, and there's kind of, 341 00:15:32,000 --> 00:15:36,000 like, make a decision and be optimistic, see if it worked out. 342 00:15:36,000 --> 00:15:40,000 But that kind of rearrangement of the code should work with pretty much all 343 00:15:40,000 --> 00:15:43,000 your standard recursion stuff. So any kind of puzzle that actually, 344 00:15:43,000 --> 00:15:46,000 like, all these kind of jumble puzzles, and crossword puzzles, and things 345 00:15:46,000 --> 00:15:48,000 that involve kind of choosing, right, 346 00:15:48,000 --> 00:15:52,000 can be solved with recursive backtracking. It's not necessarily the fastest, right, way 347 00:15:52,000 --> 00:15:56,000 to get to a solution, but it will exhaustively try all the options until 348 00:15:56,000 --> 00:16:00,000 a sequence of decisions, right, leads you to a goal if there is a goal to be found. 349 00:16:00,000 --> 00:16:03,000 And so that if you can think of it as - if you can think of your problem as a 350 00:16:03,000 --> 00:16:04,000 decision problem - 351 00:16:04,000 --> 00:16:07,000 I need to make some decisions, and then from there make some more decisions and so 352 00:16:07,000 --> 00:16:09,000 on, and eventually, right, I will bottom out 353 00:16:09,000 --> 00:16:11,000 either at a goal or dead end, 354 00:16:11,000 --> 00:16:14,000 then you can write a recursive problem solving - 355 00:16:14,000 --> 00:16:18,000 backtracker to solve it. Way in 356 00:16:18,000 --> 00:16:23,000 the back? This doesn't have anything to do with recursion, but how did you slow down the simulation? How did I slow it down? I'm just using pause. There's a pause 357 00:16:23,000 --> 00:16:28,000 function in our stenographics library, and you just give it a time. You say one second, half a second, and 358 00:16:28,000 --> 00:16:29,000 just - I 359 00:16:29,000 --> 00:16:31,000 use that a lot in our demos, for example, for the maze, so that you can actually see what's 360 00:16:31,000 --> 00:16:35,000 going on, and that way it just animates a little more. Stenographics, CSTgraph.h. [Inaudible] me 361 00:16:35,000 --> 00:16:42,000 show you 362 00:16:42,000 --> 00:16:43,000 one more 363 00:16:43,000 --> 00:16:45,000 kind 364 00:16:45,000 --> 00:16:47,000 of just in the same theme. The idea of taking 365 00:16:47,000 --> 00:16:50,000 sort of some, you know, puzzle that somebody might give you, 366 00:16:50,000 --> 00:16:53,000 trying to cast it in terms of a decision problem, right, where you make decisions 367 00:16:53,000 --> 00:16:55,000 and then you move on, 368 00:16:55,000 --> 00:16:58,000 can I solve something like these little cryptographic puzzles? 369 00:16:58,000 --> 00:17:01,000 So here's send plus more equals money. 370 00:17:01,000 --> 00:17:03,000 I've got eight 371 00:17:03,000 --> 00:17:08,000 digits in there - eight letters in there. You know, D E M N O R S 372 00:17:08,000 --> 00:17:10,000 Y across all of them, 373 00:17:10,000 --> 00:17:13,000 and the goal of this puzzle is to assign these letters - 374 00:17:13,000 --> 00:17:17,000 eight letters to the digits zero through nine without replacement 375 00:17:17,000 --> 00:17:18,000 so that 376 00:17:18,000 --> 00:17:21,000 if D's assigned to three, it's assigned to three in all places. For example, O shows up 377 00:17:21,000 --> 00:17:24,000 two places in the puzzle. Both of those are either a two or both are three. 378 00:17:24,000 --> 00:17:27,000 There's not one that's one and one that's another. And each 379 00:17:27,000 --> 00:17:33,000 digit is used once. So, like, if I assigned two to O, then I will not assign two to D. So 380 00:17:33,000 --> 00:17:35,000 what we've got here is eight letters. 381 00:17:35,000 --> 00:17:38,000 We've got 10 digits. 382 00:17:38,000 --> 00:17:41,000 We want to make an assignment. 383 00:17:41,000 --> 00:17:44,000 What we've got here is some choices. The choices are, 384 00:17:44,000 --> 00:17:48,000 for each letter, what digit are we gonna map it to? 385 00:17:48,000 --> 00:17:51,000 Another way to think about it is if you took these eight letters, and then you 386 00:17:51,000 --> 00:17:53,000 attach two dashes on the end, 387 00:17:53,000 --> 00:17:56,000 then you considered that the letter's position in the string was - the index 388 00:17:56,000 --> 00:17:58,000 of it was its digit. 389 00:17:58,000 --> 00:18:00,000 It's really just like trying the permutations, 390 00:18:00,000 --> 00:18:02,000 right, rearranging those 391 00:18:02,000 --> 00:18:06,000 letters in any of those sequences. So right now, for example, maybe D is assigned 392 00:18:06,000 --> 00:18:09,000 zero, and one, and two, and so on, and these guys are eight, nine. Well, if 393 00:18:09,000 --> 00:18:12,000 you rearrange the letters into some other permutation then you've made a 394 00:18:12,000 --> 00:18:13,000 different assignment. 395 00:18:13,000 --> 00:18:16,000 So in effect, right, what you're looking at is a problem that has a lot in common with 396 00:18:16,000 --> 00:18:17,000 permutations, right? 397 00:18:17,000 --> 00:18:21,000 I'm gonna make an assignment, take a letter off, decide what index it's gonna 398 00:18:21,000 --> 00:18:24,000 be placed at, so it's kind of like deciding where it would go in the permutation, 399 00:18:24,000 --> 00:18:27,000 and then that leaves us with a smaller form of the problem 400 00:18:27,000 --> 00:18:30,000 which has one fewer letters to be assigned, 401 00:18:30,000 --> 00:18:33,000 and then recursively explore the options from there to see if we can make 402 00:18:33,000 --> 00:18:37,000 something that makes the addition add up correctly that D plus E equals Y 403 00:18:37,000 --> 00:18:37,000 means that 404 00:18:37,000 --> 00:18:39,000 if D was assigned five, 405 00:18:39,000 --> 00:18:43,000 and E was three, then Y better be eight for this to work out. 406 00:18:43,000 --> 00:18:47,000 So the first form I'm gonna show you is actually just the dumb, 407 00:18:47,000 --> 00:18:49,000 exhaustive recursive backtracking 408 00:18:49,000 --> 00:18:53,000 that works very much like the sudoku problem where it just - it finds the next unassigned 409 00:18:53,000 --> 00:18:57,000 letter, it assigns it a digit of the next auto sign digits, right, and then just 410 00:18:57,000 --> 00:18:59,000 optimistically figures that'll work out. 411 00:18:59,000 --> 00:19:02,000 After it makes an assignment for all the letters it says, 412 00:19:02,000 --> 00:19:07,000 take the puzzle, convert all the letters into their digit equivalents, and see if it adds 413 00:19:07,000 --> 00:19:08,000 together correctly. 414 00:19:08,000 --> 00:19:10,000 If so, we're done. If not, 415 00:19:10,000 --> 00:19:13,000 then let's backtrack. 416 00:19:13,000 --> 00:19:16,000 So let me - I mean, actually, I'll show you 417 00:19:16,000 --> 00:19:18,000 the code first 418 00:19:18,000 --> 00:19:20,000 because it's actually quite easy to 419 00:19:20,000 --> 00:19:21,000 take a look at. 420 00:19:21,000 --> 00:19:25,000 Again, it has helper routines that kind of 421 00:19:25,000 --> 00:19:29,000 try to abstract the pieces of the puzzle that actually aren't interesting 422 00:19:29,000 --> 00:19:31,000 from kind of looking at just the core of the recursive algorithm, so it looks a 423 00:19:31,000 --> 00:19:33,000 lot like sudoku 424 00:19:33,000 --> 00:19:36,000 in that if letters do assign, so it actually keeps the string of the letters that 425 00:19:36,000 --> 00:19:38,000 haven't yet been assigned. 426 00:19:38,000 --> 00:19:41,000 If there are no more letters in that string, we take one off at each time we 427 00:19:41,000 --> 00:19:42,000 assign it, 428 00:19:42,000 --> 00:19:45,000 then we check and see if the puzzle's solved. So that kind of does the substitution, and does 429 00:19:45,000 --> 00:19:48,000 the math, and comes back saying yes or no. 430 00:19:48,000 --> 00:19:50,000 431 00:19:50,000 --> 00:19:53,000 If it worked out, we'll get a true. If it didn't work out we get a false. 432 00:19:53,000 --> 00:19:57,000 If we still have letters to assign then it goes through the process of making a choice, and 433 00:19:57,000 --> 00:19:59,000 that choice is looking at the 434 00:19:59,000 --> 00:20:00,000 435 00:20:00,000 --> 00:20:04,000 digits zero through nine if we can assign it. So looking at that first letter in 436 00:20:04,000 --> 00:20:07,000 that digit, and then that's making sure that we don't already have an assignment for 437 00:20:07,000 --> 00:20:10,000 that letter, that we don't have an assignment for that digit. Sort of make sure that just the 438 00:20:10,000 --> 00:20:12,000 constraints of the problem are being met. 439 00:20:12,000 --> 00:20:14,000 If we're able to make that assignment 440 00:20:14,000 --> 00:20:16,000 then we go ahead and make a recursive call, 441 00:20:16,000 --> 00:20:18,000 having one fewer letters 442 00:20:18,000 --> 00:20:20,000 to make a decision for, and 443 00:20:20,000 --> 00:20:23,000 if that worked out true, otherwise we do an unassignment and come back around 444 00:20:23,000 --> 00:20:24,000 that loop 445 00:20:24,000 --> 00:20:27,000 and then eventually the same return false at the bottom, which said, well 446 00:20:27,000 --> 00:20:31,000 given the previous assignments of the letters before we got to this call, 447 00:20:31,000 --> 00:20:33,000 there was nothing I could do from here 448 00:20:33,000 --> 00:20:36,000 to make this work out. So 449 00:20:36,000 --> 00:20:40,000 let me show you this guy working because you're gonna see that it is actually 450 00:20:40,000 --> 00:20:42,000 a crazy way to try to solve this problem 451 00:20:42,000 --> 00:20:45,000 in terms of what you know about 452 00:20:45,000 --> 00:20:48,000 stuff. So I say C S plus U equals fun, 453 00:20:48,000 --> 00:20:54,000 a well-known equation. 454 00:20:54,000 --> 00:20:57,000 So I did this little animation to try to get you visualized what's going on at any 455 00:20:57,000 --> 00:20:59,000 given stage. 456 00:20:59,000 --> 00:21:00,000 So 457 00:21:00,000 --> 00:21:03,000 it has the letters down across the bottom, S U N C O Y F. 458 00:21:03,000 --> 00:21:06,000 That's the string of letters to assign. 459 00:21:06,000 --> 00:21:08,000 It's gonna assign it the next available digit 460 00:21:08,000 --> 00:21:11,000 when it makes that recursive call, so the first recursive call is 461 00:21:11,000 --> 00:21:12,000 gonna be about assigning S, 462 00:21:12,000 --> 00:21:16,000 and then the next call will be assign U, and the next call - and so on. So it always 463 00:21:16,000 --> 00:21:18,000 gets up to seven deep 464 00:21:18,000 --> 00:21:19,000 on any particular thing. 465 00:21:19,000 --> 00:21:21,000 And so it says, okay, well 466 00:21:21,000 --> 00:21:25,000 first digit available is zero, go ahead and assign that to S. So it's gonna make a 467 00:21:25,000 --> 00:21:28,000 call there. And then it says, okay, well look at U. Well we can't use 468 00:21:28,000 --> 00:21:33,000 zero because zero's already in use. What don't we assign U to be one? Okay. Sounds good. Keep going. And 469 00:21:33,000 --> 00:21:36,000 then it'll say, okay, let's get to N. Let's make an assignment for N. It says I'll look around. 470 00:21:36,000 --> 00:21:40,000 Okay, well zero's in use, one's in use, how about two? Okay. Good. 471 00:21:40,000 --> 00:21:41,000 And then it assigns 472 00:21:41,000 --> 00:21:44,000 this to three, this to four, this to five, 473 00:21:44,000 --> 00:21:47,000 and this to six. Okay. It gets to here and it says, 474 00:21:47,000 --> 00:21:48,000 hey, does that work, 475 00:21:48,000 --> 00:21:50,000 30 plus 541 equals 476 00:21:50,000 --> 00:21:52,000 612? No? 477 00:21:52,000 --> 00:21:55,000 Okay. Well, you know what the problem was? It was F, 478 00:21:55,000 --> 00:22:01,000 right? I assigned F to six. How stupid of me. It can't be six. Let's make it seven. 479 00:22:01,000 --> 00:22:02,000 480 00:22:02,000 --> 00:22:05,000 And then it says, oh, oh no, oh I'm sorry. I'm sorry, 30 plus 541 that's not 481 00:22:05,000 --> 00:22:07,000 712. How silly. 482 00:22:07,000 --> 00:22:09,000 I need to assign F to be eight. 483 00:22:09,000 --> 00:22:11,000 And it's gonna do this for a little while. 484 00:22:11,000 --> 00:22:12,000 A long while [inaudible]. 485 00:22:12,000 --> 00:22:13,000 And then it says, oh, okay. Well, 486 00:22:13,000 --> 00:22:17,000 you know what, given the letters you'd assigned to the first six things, when you got to 487 00:22:17,000 --> 00:22:20,000 F, I tried everything I could and there was nothing I could do to fix this. 488 00:22:20,000 --> 00:22:23,000 You know what the problem was? It's not me. It's not me. I'm not the problem. You're the 489 00:22:23,000 --> 00:22:24,000 problem. 490 00:22:24,000 --> 00:22:26,000 So it returns false 491 00:22:26,000 --> 00:22:29,000 from this call at the bottom, having tried all its options, 492 00:22:29,000 --> 00:22:31,000 which causes Y to say, oh, yeah, yeah, 493 00:22:31,000 --> 00:22:36,000 I know. I know I said five. Did I said five? I didn't meant to say five. I meant to say six. 494 00:22:36,000 --> 00:22:39,000 And so it moves up to the six option. 495 00:22:39,000 --> 00:22:43,000 Again, optimistically saying that's good. Go for it. See what you can do. 496 00:22:43,000 --> 00:22:44,000 So it picks five. 497 00:22:44,000 --> 00:22:44,000 That won't work, 498 00:22:44,000 --> 00:22:46,000 right? It picks seven. It's gonna 499 00:22:46,000 --> 00:22:48,000 go for a long time, right, because it turns out, right, 500 00:22:48,000 --> 00:22:52,000 this is one of those cases where that very first decision, that was one of your problems, 501 00:22:52,000 --> 00:22:52,000 right? 502 00:22:52,000 --> 00:22:54,000 If you assign S to be zero, 503 00:22:54,000 --> 00:22:58,000 there's nothing you can assign U and N to be that are gonna work out. 504 00:22:58,000 --> 00:23:01,000 So what it's gonna be going through is this process though of 505 00:23:01,000 --> 00:23:05,000 having, you know, committed to this early decision and kind of moving on it's gonna try 506 00:23:05,000 --> 00:23:10,000 every other variation over here before it gives up on that. So let me 507 00:23:10,000 --> 00:23:13,000 508 00:23:13,000 --> 00:23:16,000 set it to going. 509 00:23:16,000 --> 00:23:22,000 Even though C S plus U does equal fun. I guarantee it. We'll 510 00:23:22,000 --> 00:23:23,000 let it do some work for a while. 511 00:23:23,000 --> 00:23:26,000 So those bars is they grow up are desperation. 512 00:23:26,000 --> 00:23:29,000 You can think of that as, like, it's running out of options. It's running out of 513 00:23:29,000 --> 00:23:30,000 time, you know, and it's like oh, 514 00:23:30,000 --> 00:23:33,000 oh, wait, earlier, back up, back up. And so, okay, you can kind of see 515 00:23:33,000 --> 00:23:36,000 how far its backed up but sort of how tall 516 00:23:36,000 --> 00:23:40,000 some of the deeper recursive calls, right, the earlier ones in the sequence. And 517 00:23:40,000 --> 00:23:41,000 so 518 00:23:41,000 --> 00:23:44,000 it doesn't, you know, revisit any of these decisions because it's really busy over 519 00:23:44,000 --> 00:23:47,000 here, but you can see that C is now up to seven. 520 00:23:47,000 --> 00:23:49,000 It'll get up to eight. It'll get up to nine. 521 00:23:49,000 --> 00:23:52,000 And that was when it will cause itself to say, you know, I tried everything that was 522 00:23:52,000 --> 00:23:54,000 possible from C on down, 523 00:23:54,000 --> 00:23:58,000 and there was no way to make this thing work out. It must be that the bad 524 00:23:58,000 --> 00:24:00,000 decision was made earlier. 525 00:24:00,000 --> 00:24:04,000 Let's back up and look at that. And so it'll come back to the decision N, 526 00:24:04,000 --> 00:24:09,000 bump it up by one, and then go again. It's gonna do this a long time, right? 527 00:24:09,000 --> 00:24:12,000 Go through all the options for N and its neighbors before it comes back and revisits 528 00:24:12,000 --> 00:24:14,000 the U. It's gonna have to get U all the way up, 529 00:24:14,000 --> 00:24:18,000 right, through having tried one, two, three, four. So adding zero to one, and two, and three, 530 00:24:18,000 --> 00:24:19,000 and four, 531 00:24:19,000 --> 00:24:21,000 and discovering it can never make a match over there 532 00:24:21,000 --> 00:24:24,000 before it will eventually decide that the S 533 00:24:24,000 --> 00:24:25,000 is really where 534 00:24:25,000 --> 00:24:27,000 we got ourselves in trouble. 535 00:24:27,000 --> 00:24:29,000 So this is kind of in its most naive 536 00:24:29,000 --> 00:24:32,000 form, right? The recursive backtracking is doing basically permutations, 537 00:24:32,000 --> 00:24:34,000 right? You can see this is actually just 538 00:24:34,000 --> 00:24:37,000 permuting the digits as they're assigned to the numbers, and there are, in this 539 00:24:37,000 --> 00:24:41,000 case, you know, seven factorial different ways that we can make this assignment, 540 00:24:41,000 --> 00:24:45,000 and there are a lot of them that are wrong, right? It's not being at all clever 541 00:24:45,000 --> 00:24:46,000 about 542 00:24:46,000 --> 00:24:49,000 how to pick and choose among which to explore. 543 00:24:49,000 --> 00:24:51,000 So in 544 00:24:51,000 --> 00:24:52,000 its most naive form, right, 545 00:24:52,000 --> 00:24:54,000 recursive backtracking can be very expensive because often you're looking 546 00:24:54,000 --> 00:24:59,000 at things that have very high branching, and very long depth to them, 547 00:24:59,000 --> 00:25:02,000 which can add up to a lot of different things tried. 548 00:25:02,000 --> 00:25:04,000 Just using some simple - in this case, heuristics, 549 00:25:04,000 --> 00:25:08,000 sort of information you know about the domain 550 00:25:08,000 --> 00:25:11,000 can help you to guide your choices instead of just making the choices in order, trying the 551 00:25:11,000 --> 00:25:14,000 numbers zero through nine as though they're equally likely, 552 00:25:14,000 --> 00:25:16,000 or 553 00:25:16,000 --> 00:25:19,000 kind of waiting to make all the assignments before you look at 554 00:25:19,000 --> 00:25:21,000 anything to see if it's actually good. There's actually some ways we can kind of 555 00:25:21,000 --> 00:25:24,000 shape our decision making to look at the most likely options before we look at 556 00:25:24,000 --> 00:25:26,000 these more 557 00:25:26,000 --> 00:25:30,000 first. Finding the problem instead of niggling around this dead end. 558 00:25:30,000 --> 00:25:33,000 So I'm gonna let that guy work for a little bit 559 00:25:33,000 --> 00:25:34,000 while I 560 00:25:34,000 --> 00:25:36,000 show you another slide over here. 561 00:25:36,000 --> 00:25:39,000 So what the smarter solver does, 562 00:25:39,000 --> 00:25:43,000 is that it doesn't really consider all permutations equally plausible. We're 563 00:25:43,000 --> 00:25:47,000 gonna use a little grade school addition knowledge. If I look at the rightmost column, the 564 00:25:47,000 --> 00:25:49,000 least significant digit, 565 00:25:49,000 --> 00:25:51,000 I'm gonna assign D first. 566 00:25:51,000 --> 00:25:52,000 I assign D to zero. I assign 567 00:25:52,000 --> 00:25:53,000 E to one, 568 00:25:53,000 --> 00:25:54,000 and then I look at Y - 569 00:25:54,000 --> 00:25:59,000 I don't try Y is five, or seven, or anything like that. I say there's 570 00:25:59,000 --> 00:26:01,000 exactly one thing Y has to be, right, if 571 00:26:01,000 --> 00:26:04,000 D is zero and E is one, then Y has to be one, and I can say, well that won't work 572 00:26:04,000 --> 00:26:06,000 because I already used one for E. 573 00:26:06,000 --> 00:26:07,000 So it'll say, 574 00:26:07,000 --> 00:26:10,000 well that's impossible. Something must be wrong 575 00:26:10,000 --> 00:26:13,000 early. Rather than keep going on this kind of dumb 576 00:26:13,000 --> 00:26:15,000 dead end, it's to realize right away 577 00:26:15,000 --> 00:26:18,000 that one of the decisions I've already made, D and E, has gotta be wrong. So it'll back up 578 00:26:18,000 --> 00:26:22,000 to E. It'll try two, three, four, very quickly realizing 579 00:26:22,000 --> 00:26:25,000 that of all the things you could assign to E, once you have assigned D to 580 00:26:25,000 --> 00:26:29,000 zero, you're in trouble. And so it will quickly unmake that decision about D and work its way 581 00:26:29,000 --> 00:26:30,000 down. 582 00:26:30,000 --> 00:26:34,000 So using the kind of structure of the problem. So it makes it a little bit more 583 00:26:34,000 --> 00:26:37,000 housekeeping about where I'm at, and what I'm doing, and 584 00:26:37,000 --> 00:26:37,000 585 00:26:37,000 --> 00:26:38,000 what's going on, 586 00:26:38,000 --> 00:26:43,000 but it is using some real smarts about what part of the tree to explore, 587 00:26:43,000 --> 00:26:45,000 rather than letting it kind of just go 588 00:26:45,000 --> 00:26:49,000 willy nilly across the whole thing. Let me get that 589 00:26:49,000 --> 00:26:52,000 out of the way. And I'll 590 00:26:52,000 --> 00:27:00,000 run this one. I say C S plus U equals fun. Okay. 591 00:27:00,000 --> 00:27:03,000 So it goes zero here, and 592 00:27:03,000 --> 00:27:04,000 tries the one, and it 593 00:27:04,000 --> 00:27:06,000 says no, that won't work. How about the two? 594 00:27:06,000 --> 00:27:09,000 No, that won't work, right, because there's nothing you can assign the N that'll 595 00:27:09,000 --> 00:27:13,000 make this work. And so it immediately is kind of failing on this, 596 00:27:13,000 --> 00:27:16,000 and even after it tries kind of all nine of these, it 597 00:27:16,000 --> 00:27:18,000 says none of these are looking good, 598 00:27:18,000 --> 00:27:22,000 then it comes back to this decision and says, no, no, actually, S's a zero. That wasn't so 599 00:27:22,000 --> 00:27:24,000 hot. How about we try S as one? 600 00:27:24,000 --> 00:27:27,000 And then kind of works its way 601 00:27:27,000 --> 00:27:32,000 further down. 602 00:27:32,000 --> 00:27:34,000 Hello? Okay. So let me try this again. Let me get it 603 00:27:34,000 --> 00:27:40,000 to just go. Okay. 604 00:27:40,000 --> 00:27:42,000 So it took 30 assignments 605 00:27:42,000 --> 00:27:44,000 - 30 different things it tried 606 00:27:44,000 --> 00:27:47,000 before it was able to come up with 41 plus 582 equals 623, 607 00:27:47,000 --> 00:27:49,000 which does add correctly. 608 00:27:49,000 --> 00:27:53,000 It didn't have to unmake once it decided that S was one. It turns out that was a 609 00:27:53,000 --> 00:27:57,000 workable solution so it didn't have to go very far once it made that commitment, 610 00:27:57,000 --> 00:27:59,000 and then you can see it kind of working its way up. 611 00:27:59,000 --> 00:28:03,000 So 30 assignments, right, across this tree that has, you know, 612 00:28:03,000 --> 00:28:05,000 hundreds of thousands in the factorial, very large number, 613 00:28:05,000 --> 00:28:08,000 but very quickly kind of pruning down those things that aren't worth looking 614 00:28:08,000 --> 00:28:09,000 at, and 615 00:28:09,000 --> 00:28:12,000 so focusing its attention on those things that are more likely to 616 00:28:12,000 --> 00:28:14,000 work out using information about the problem. 617 00:28:14,000 --> 00:28:17,000 It doesn't really change what the recursion does. It's kind of interesting if you 618 00:28:17,000 --> 00:28:20,000 think that the same backtracking and recursive strategy's the same in all these 619 00:28:20,000 --> 00:28:23,000 problems, but what you're trying to do is pick your options, like, 620 00:28:23,000 --> 00:28:24,000 looking at the choices you have 621 00:28:24,000 --> 00:28:28,000 and trying to decide what are the more likely ones, which ones are not worth trying, right, so 622 00:28:28,000 --> 00:28:29,000 sort of directing 623 00:28:29,000 --> 00:28:32,000 your decision making 624 00:28:32,000 --> 00:28:33,000 from the 625 00:28:33,000 --> 00:28:36,000 standpoint of trying to make sure you don't violate constraints that later will come 626 00:28:36,000 --> 00:28:38,000 back to bite you, and things like that. 627 00:28:38,000 --> 00:28:40,000 628 00:28:40,000 --> 00:28:41,000 This one's back here. 629 00:28:41,000 --> 00:28:43,000 It's at 630 00:28:43,000 --> 00:28:47,000 13,000 and working hard. Getting desperate. Doing its thing. Still has not unmade the 631 00:28:47,000 --> 00:28:52,000 decision about S is zero, so you can get an idea of how much longer it's gonna take. We'll 632 00:28:52,000 --> 00:28:53,000 just let it keep going. 633 00:28:53,000 --> 00:28:55,000 I'm in no hurry - and let 634 00:28:55,000 --> 00:28:56,000 it 635 00:28:56,000 --> 00:29:07,000 do its thing. 636 00:29:07,000 --> 00:29:11,000 So let me - before I move away from talking about recursion, just try 637 00:29:11,000 --> 00:29:12,000 638 00:29:12,000 --> 00:29:15,000 to get you thinking just a little bit about how the patterns we're seeing, 639 00:29:15,000 --> 00:29:16,000 right, 640 00:29:16,000 --> 00:29:18,000 are more alike than they are different. 641 00:29:18,000 --> 00:29:21,000 That solving a sudoku, solving the eight queens, 642 00:29:21,000 --> 00:29:26,000 you know, solving the find an anagram in a sequence of letters, that 643 00:29:26,000 --> 00:29:31,000 they all have this general idea of there being these decisions that you're making, 644 00:29:31,000 --> 00:29:33,000 and you're working your way down to where there's, you know 645 00:29:33,000 --> 00:29:37,000 - fewer and fewer of those decisions until eventually you end up, sort of, okay, I'm done. I've 646 00:29:37,000 --> 00:29:41,000 made all the decisions I can. What letter goes next, or whether something goes in or out. 647 00:29:41,000 --> 00:29:44,000 And the two problems that I call the kind of master or mother problems, 648 00:29:44,000 --> 00:29:46,000 right, of permutations and subsets 649 00:29:46,000 --> 00:29:49,000 are kind of fundamental to 650 00:29:49,000 --> 00:29:52,000 adapting them to these other domains. And so I'm gonna give you a couple 651 00:29:52,000 --> 00:29:55,000 examples of some things that come up that actually are just kind of permutations, 652 00:29:55,000 --> 00:29:58,000 or subsets, or some variation that you can help me think about a little bit. 653 00:29:58,000 --> 00:30:03,000 One that the CS people are fond of is this idea of knapsack feeling. 654 00:30:03,000 --> 00:30:07,000 It's often cast as someone breaking into your house. All criminals at 655 00:30:07,000 --> 00:30:09,000 heart apparently in computer science. 656 00:30:09,000 --> 00:30:12,000 And you've got this sack, and you can put 50 pounds of stuff in it, 657 00:30:12,000 --> 00:30:15,000 right, and you're looking around, you know, at all the stuff that's 658 00:30:15,000 --> 00:30:18,000 up for grabs in this house that you've broken into, 659 00:30:18,000 --> 00:30:22,000 and you want to try to pack your sack, right, so that you've got 50 pounds of the high value 660 00:30:22,000 --> 00:30:23,000 stuff. 661 00:30:23,000 --> 00:30:26,000 So let's say there's, like, 100 things, right, you could pick up, right, 662 00:30:26,000 --> 00:30:29,000 and they weigh different amounts, and they're worth different amounts. 663 00:30:29,000 --> 00:30:33,000 What's the strategy for going about finding the optimal 664 00:30:33,000 --> 00:30:35,000 combination of things to stick into your sack, right, 665 00:30:35,000 --> 00:30:38,000 so that you got the maximum value from your heist? 666 00:30:38,000 --> 00:30:44,000 What problem does that look like that you've seen? 667 00:30:44,000 --> 00:30:47,000 It's a subset. It's a subset. [Inaudible]. 668 00:30:47,000 --> 00:30:48,000 Right? So if you look at the, 669 00:30:48,000 --> 00:30:52,000 you know, the Wii, and you say, oh, should I take the Wii? You know, it weighs this much, and it's this much 670 00:30:52,000 --> 00:30:53,000 value, well 671 00:30:53,000 --> 00:30:56,000 let's try it in and see what happens, right, and see what else I can stuff in the sack with 672 00:30:56,000 --> 00:30:59,000 that, right, and then see how well that works out. I should also try it with it out. 673 00:30:59,000 --> 00:31:02,000 So while you're standing there trying to decide what to 674 00:31:02,000 --> 00:31:05,000 steal, you have to type in all the values of things in every computer program. 675 00:31:05,000 --> 00:31:08,000 Go through the kind of machinations of well, try this with that because some things 676 00:31:08,000 --> 00:31:10,000 are big, but have a lot of value, 677 00:31:10,000 --> 00:31:13,000 but they only leave a little bit of odd space left over you might not be able to 678 00:31:13,000 --> 00:31:15,000 use well, or something. 679 00:31:15,000 --> 00:31:18,000 But what we're looking for is the optimal combination, the optimal subset. So 680 00:31:18,000 --> 00:31:21,000 trying the different subsets tells you how much value and weight 681 00:31:21,000 --> 00:31:25,000 you can get in a combination, and then you're looking for that best value you can get. 682 00:31:25,000 --> 00:31:27,000 You're the traveling salesman. 683 00:31:27,000 --> 00:31:29,000 You've got 10 cities to visit. 684 00:31:29,000 --> 00:31:33,000 Boston, New York, Phoenix, Minneapolis, right? 685 00:31:33,000 --> 00:31:36,000 You want to cover them in such a way that you spend the minimal amount of time, 686 00:31:36,000 --> 00:31:40,000 you know, in painful air travel across the U.S. Well you certainly don't want to be 687 00:31:40,000 --> 00:31:41,000 going 688 00:31:41,000 --> 00:31:47,000 Boston, New York, right, like, Boston, to L.A., to New York, to Seattle, to 689 00:31:47,000 --> 00:31:48,000 D.C., 690 00:31:48,000 --> 00:31:51,000 to Los Angeles back and forth, right? There's gotta be a pattern where you 691 00:31:51,000 --> 00:31:55,000 kind of visit the close cities and work your way to the far cities to kind of 692 00:31:55,000 --> 00:31:58,000 minimize your total distance overall. 693 00:31:58,000 --> 00:32:02,000 What problem was that really in disguise? We've 694 00:32:02,000 --> 00:32:06,000 got 10 cities. What are you trying to do? 695 00:32:06,000 --> 00:32:08,000 Help me out a little. I hear it almost. Louder. 696 00:32:08,000 --> 00:32:12,000 Permutations. You've got a permutation problem there, all right? 697 00:32:12,000 --> 00:32:14,000 You got 10 cities. They all have to show up, 698 00:32:14,000 --> 00:32:17,000 right? And it's a permutation. Where do you start, where do you end, where do you go in the 699 00:32:17,000 --> 00:32:20,000 middle, right? What sequencing, right, do you need to take? 700 00:32:20,000 --> 00:32:21,000 So what you're looking at is 701 00:32:21,000 --> 00:32:24,000 try the permutations, tell me which ones come up short. 702 00:32:24,000 --> 00:32:27,000 There are things, right, about heuristics that can help this, right? 703 00:32:27,000 --> 00:32:28,000 So the idea that 704 00:32:28,000 --> 00:32:31,000 certainly, like, the ones that are closer to you are likely to make a better 705 00:32:31,000 --> 00:32:33,000 choice than the longer one. So kind of using 706 00:32:33,000 --> 00:32:36,000 some information about the problem can help you decide which ones are more 707 00:32:36,000 --> 00:32:40,000 promising avenues to explore. But in the end it's a permutation problem. 708 00:32:40,000 --> 00:32:43,000 I'm trying to divide you guys into fair teams. I want to, you know, divide you up into 709 00:32:43,000 --> 00:32:46,000 10 teams to have a kind of head-to-head programming competition. I happen to know 710 00:32:46,000 --> 00:32:48,000 all your Iqs. 711 00:32:48,000 --> 00:32:50,000 I don't know. Some other, you know, useless fact that 712 00:32:50,000 --> 00:32:53,000 perhaps we could use as some judge of your worth. 713 00:32:53,000 --> 00:32:54,000 714 00:32:54,000 --> 00:32:58,000 And I want to make sure that each time has kind of 715 00:32:58,000 --> 00:33:01,000 a fair amount of IQ power in it 716 00:33:01,000 --> 00:33:03,000 relative to the others that I didn't put, 717 00:33:03,000 --> 00:33:09,000 you know, all the superstars on one team. What am I doing? How do 718 00:33:09,000 --> 00:33:12,000 I divide you guys up? It's a subset problem, right? So 719 00:33:12,000 --> 00:33:14,000 if - 720 00:33:14,000 --> 00:33:17,000 in this case it's a subset that's not just in or out. It's like which team am I gonna 721 00:33:17,000 --> 00:33:21,000 put you in? So I've got 10 subsets to build. But in the end it's an in out 722 00:33:21,000 --> 00:33:22,000 problem that looks like, well, 723 00:33:22,000 --> 00:33:23,000 I have the next student. 724 00:33:23,000 --> 00:33:27,000 Which of the 10 teams can I try them in? I can try them in each of them, right? So 725 00:33:27,000 --> 00:33:30,000 in some sense it's trying in the first team, the second team, the third team, 726 00:33:30,000 --> 00:33:32,000 right, and then pull the next student, try him in those teams, 727 00:33:32,000 --> 00:33:35,000 and then see whether I can come up with something that appears to get a balance ratio when I'm 728 00:33:35,000 --> 00:33:37,000 done. 729 00:33:37,000 --> 00:33:41,000 It turns out you can take the letters from Richard Millhouse Dickson and you can 730 00:33:41,000 --> 00:33:44,000 extract and rearrange them to spell the word criminal. I 731 00:33:44,000 --> 00:33:48,000 am making no conclusion about anything, having done that, 732 00:33:48,000 --> 00:33:48,000 just 733 00:33:48,000 --> 00:33:50,000 an observation of fact. 734 00:33:50,000 --> 00:33:51,000 735 00:33:51,000 --> 00:33:55,000 But that sort of process, right, I'm trying to find, well what is the longest word 736 00:33:55,000 --> 00:33:58,000 that you can extract from a sequence of letters? 737 00:33:58,000 --> 00:34:05,000 What kind of things is it using? 738 00:34:05,000 --> 00:34:13,000 Anything you've seen before? 739 00:34:13,000 --> 00:34:18,000 Oh, somebody come on. I hear the whispering but no one wants to just stand up and be 740 00:34:18,000 --> 00:34:20,000 committed. 741 00:34:20,000 --> 00:34:23,000 How about 742 00:34:23,000 --> 00:34:24,000 a 743 00:34:24,000 --> 00:34:27,000 little of both, right? The what? Like the sudoku puzzle? It's a little bit like the sudoku, right? It's got a permutation sort of backing, and it's got a little bit of 744 00:34:27,000 --> 00:34:29,000 a subset backing, right? That is that 745 00:34:29,000 --> 00:34:32,000 I'm not choosing all the letters from this. And in fact there's a subset process, 746 00:34:32,000 --> 00:34:35,000 which is deciding whether it's in or out, and then there's a permutation problem, 747 00:34:35,000 --> 00:34:37,000 which is actually rearranging them, right? 748 00:34:37,000 --> 00:34:40,000 That picking the C, and the R, and the I, and the M, and then kind of rearranging them. So in fact it 749 00:34:40,000 --> 00:34:44,000 has kind of a little bit of both, right? Which is what sequence of letters can I 750 00:34:44,000 --> 00:34:46,000 extract and then rearrange to make a word - would 751 00:34:46,000 --> 00:34:50,000 end up using kind of elements of both of those kinds of recursions sort of 752 00:34:50,000 --> 00:34:51,000 mixed together 753 00:34:51,000 --> 00:34:55,000 so that when you look at a lot of recursion problems, they end up actually just mapping down to 754 00:34:55,000 --> 00:34:59,000 one or the other, or maybe a little bit of both. And so feeling very comfortable with 755 00:34:59,000 --> 00:35:00,000 those problems, 756 00:35:00,000 --> 00:35:03,000 how you turn them into a recursive backtracker, 757 00:35:03,000 --> 00:35:05,000 and how you can recognize, right, 758 00:35:05,000 --> 00:35:08,000 their roots inside these other problems, it's kind of a really good first step to 759 00:35:08,000 --> 00:35:12,000 becoming kind of a journeyman, in a way, or recursion. 760 00:35:12,000 --> 00:35:14,000 So just whenever you see a new problem you start thinking, okay, 761 00:35:14,000 --> 00:35:17,000 is it permutations? Is it subset? Or is it something totally new? 762 00:35:17,000 --> 00:35:21,000 It's probably much more likely to be the first two then the third, and so 763 00:35:21,000 --> 00:35:24,000 trying to use the ones that you feel really comfortable with to kind of branch 764 00:35:24,000 --> 00:35:25,000 out to solve 765 00:35:25,000 --> 00:35:29,000 similar problems, right, where the domain is a little different - the structure 766 00:35:29,000 --> 00:35:35,000 of the code is still very much the same. All right. 767 00:35:35,000 --> 00:35:37,000 I've got 10 minutes to teach you about pointers. 768 00:35:37,000 --> 00:35:39,000 This should be no problem. Let's 769 00:35:39,000 --> 00:35:42,000 go see what that guy back there's doing. Oh, look at 770 00:35:42,000 --> 00:35:47,000 that, 38,000 assignments, still has not given up on that S is zero. 771 00:35:47,000 --> 00:35:53,000 It's a trooper. Okay. 772 00:35:53,000 --> 00:35:57,000 So this is definitely gonna be your first introduction on kind of the first 773 00:35:57,000 --> 00:36:01,000 day of talking about this. This is not the only day. So 774 00:36:01,000 --> 00:36:03,000 don't get to worried about me trying to do it in 10 minutes. What I'm gonna do is give you 775 00:36:03,000 --> 00:36:07,000 the kind of - a little bit of the basics today, and we're gonna follow up with 776 00:36:07,000 --> 00:36:10,000 some more stuff tomorrow where we start kind of making use of them and doing something 777 00:36:10,000 --> 00:36:11,000 kind of interesting with them. 778 00:36:11,000 --> 00:36:13,000 So there is this notion 779 00:36:13,000 --> 00:36:17,000 in C++, which is inherited from the C language in fact, 780 00:36:17,000 --> 00:36:20,000 of using a variable type called a pointer 781 00:36:20,000 --> 00:36:22,000 as part of the things that you operate on. 782 00:36:22,000 --> 00:36:26,000 Okay. People tend to have a lot of trepidation. Just the word pointer causes a little bit 783 00:36:26,000 --> 00:36:28,000 of fear, kind of 784 00:36:28,000 --> 00:36:30,000 to immediately rise in a new programmer. 785 00:36:30,000 --> 00:36:34,000 But hopefully we can demystify a little bit, and then also get a little bit of understanding 786 00:36:34,000 --> 00:36:38,000 of why there are reasons to be a little bit 787 00:36:38,000 --> 00:36:40,000 wary of working with pointers. 788 00:36:40,000 --> 00:36:42,000 A pointer is actually an address. 789 00:36:42,000 --> 00:36:45,000 So here's a couple things we have to kind of think about a little bit how the 790 00:36:45,000 --> 00:36:48,000 machine works to have a vision of what's going on here, 791 00:36:48,000 --> 00:36:52,000 that when you declare variables, you 792 00:36:52,000 --> 00:36:54,000 know, here I am in main. 793 00:36:54,000 --> 00:36:56,000 794 00:36:56,000 --> 00:36:58,000 And I declare, 795 00:36:58,000 --> 00:37:02,000 you know, int Num, 796 00:37:02,000 --> 00:37:04,000 and string S, 797 00:37:04,000 --> 00:37:05,000 that there 798 00:37:05,000 --> 00:37:09,000 is space set aside as part of the working memory of this program 799 00:37:09,000 --> 00:37:13,000 that is gonna record whatever I assign to Num or S. So if I say Num equals 800 00:37:13,000 --> 00:37:19,000 45, what is that? S equals hello, that there 801 00:37:19,000 --> 00:37:20,000 has to be memory 802 00:37:20,000 --> 00:37:22,000 that's being used to hold onto that. 803 00:37:22,000 --> 00:37:27,000 804 00:37:27,000 --> 00:37:28,000 805 00:37:28,000 --> 00:37:32,000 And as you declare variables, right, more of the space is set aside, and, you know, when 806 00:37:32,000 --> 00:37:36,000 you initialize it, when you read to it, when you write to it, right, that 807 00:37:36,000 --> 00:37:40,000 space is being accessed and updated as you go through. When you open 808 00:37:40,000 --> 00:37:44,000 a new scope, right, new variables come in a scope. When you close that scope, really that 809 00:37:44,000 --> 00:37:47,000 function, that space gets de-allocated. All this happens on your behalf without you 810 00:37:47,000 --> 00:37:50,000 actually taking a really active role in it, but in fact you do have to have a little 811 00:37:50,000 --> 00:37:53,000 bit of understanding of that to kind of idea of what a pointer's about, 812 00:37:53,000 --> 00:37:56,000 is that there is this just big sequence of 813 00:37:56,000 --> 00:37:57,000 data, 814 00:37:57,000 --> 00:38:00,000 starting from an address zero. You can think of these things as actually 815 00:38:00,000 --> 00:38:02,000 having a little number on them. 816 00:38:02,000 --> 00:38:06,000 So maybe this is number 1,000, and this is number 817 00:38:06,000 --> 00:38:08,000 1,004. In this case, assuming that 818 00:38:08,000 --> 00:38:10,000 we're numbering by the byte, 819 00:38:10,000 --> 00:38:11,000 which is 820 00:38:11,000 --> 00:38:12,000 821 00:38:12,000 --> 00:38:15,000 the smallest chunk of memory we can talk about, 822 00:38:15,000 --> 00:38:19,000 and that an integer requires four of those bytes to store all the different 823 00:38:19,000 --> 00:38:20,000 patterns for different kinds of numbers 824 00:38:20,000 --> 00:38:24,000 so that the bytes from 1,000 to 1,003 are reserved for holding 825 00:38:24,000 --> 00:38:27,000 onto the information about 45. And then from 1,000 forward - 826 00:38:27,000 --> 00:38:31,000 to however big the string is, which we don't even really know how big it is, so we'll 827 00:38:31,000 --> 00:38:33,000 kind of leave it a little bit vague here - 828 00:38:33,000 --> 00:38:37,000 is gonna be set aside for storing information about that string. 829 00:38:37,000 --> 00:38:38,000 Okay. 830 00:38:38,000 --> 00:38:41,000 So usually it's of interest to the compiler, right, to know where things are being 831 00:38:41,000 --> 00:38:43,000 stored and what's going on. 832 00:38:43,000 --> 00:38:47,000 But it also turns out to be somewhat useful for us to be able to talk about 833 00:38:47,000 --> 00:38:50,000 something by address. Instead of saying the variable 834 00:38:50,000 --> 00:38:51,000 whose name is Num, 835 00:38:51,000 --> 00:38:56,000 I can talk about the variable who lives at location 1,000. And say 836 00:38:56,000 --> 00:38:58,000 there's an integer, and 837 00:38:58,000 --> 00:39:01,000 I can point to it, or refer to it 838 00:39:01,000 --> 00:39:05,000 by saying go look at memory address 1,000, and read or write the contents 839 00:39:05,000 --> 00:39:09,000 there that are of integer type. Okay. 840 00:39:09,000 --> 00:39:13,000 It seems a little bit odd at first. You're like, well I already have other ways to get to Num. 841 00:39:13,000 --> 00:39:16,000 Why is it I want to go out of my way to find another different mechanism that can 842 00:39:16,000 --> 00:39:18,000 get me access to that same variable? 843 00:39:18,000 --> 00:39:21,000 And we're gonna see that actually there's a lot of flexibility 844 00:39:21,000 --> 00:39:23,000 that is being bought by us 845 00:39:23,000 --> 00:39:27,000 adding this to our feature set. We can say, well, I could actually have 846 00:39:27,000 --> 00:39:29,000 one copy of something. 847 00:39:29,000 --> 00:39:32,000 So imagine that you have 848 00:39:32,000 --> 00:39:32,000 849 00:39:32,000 --> 00:39:33,000 a system, 850 00:39:33,000 --> 00:39:37,000 like, an access system where you have student records, and they're enrolled in a bunch of different 851 00:39:37,000 --> 00:39:38,000 courses, that your 852 00:39:38,000 --> 00:39:41,000 enrolled in four, five different courses, 853 00:39:41,000 --> 00:39:44,000 that when it comes time for a class to know who's in their list, it might be 854 00:39:44,000 --> 00:39:45,000 handy for them, 855 00:39:45,000 --> 00:39:48,000 instead of actually having a copy of that student's information to have a pointer 856 00:39:48,000 --> 00:39:52,000 to one student record, and have a lot of different placers where there are 857 00:39:52,000 --> 00:39:55,000 additional pointers taken out to that same location and says, go look at the student who's 858 00:39:55,000 --> 00:39:56,000 at address 1,000. 859 00:39:56,000 --> 00:39:59,000 I have the students at address 1,000, at 1,026, 860 00:39:59,000 --> 00:40:02,000 at, you know, 1,035, 861 00:40:02,000 --> 00:40:05,000 whatever those places are, and that that's one way I can talk about what 862 00:40:05,000 --> 00:40:09,000 students I have, and those same student addresses might show up in someone else's 863 00:40:09,000 --> 00:40:11,000 class list in math 51, or 864 00:40:11,000 --> 00:40:13,000 physics, you know, 43 or whatever, 865 00:40:13,000 --> 00:40:15,000 and that we are 866 00:40:15,000 --> 00:40:18,000 all referring to one copy of the student data without duplicating it. 867 00:40:18,000 --> 00:40:22,000 So this idea of being able to refer to stuff not just by name, but by where it lives, 868 00:40:22,000 --> 00:40:28,000 is gonna give us some flexibility in terms of how we build things out of that. 869 00:40:28,000 --> 00:40:31,000 So let me kind of show you a little bit of the basic operations, 870 00:40:31,000 --> 00:40:35,000 and then - and I'll talk a little bit along the way about this thing about why they're 871 00:40:35,000 --> 00:40:36,000 scary 872 00:40:36,000 --> 00:40:37,000 because 873 00:40:37,000 --> 00:40:40,000 using, you know, memory access as your way to get to something is a little 874 00:40:40,000 --> 00:40:44,000 bit more error prone and a little bit harder to deal with than 875 00:40:44,000 --> 00:40:47,000 some of the more - other operations we have. 876 00:40:47,000 --> 00:40:50,000 So what I'm gonna show you here is a little piece of code. 877 00:40:50,000 --> 00:40:57,000 It shows some simple use of pointers. All right. 878 00:40:57,000 --> 00:40:58,000 So 879 00:40:58,000 --> 00:40:59,000 880 00:40:59,000 --> 00:41:02,000 I'm gonna draw some of the variables that are going on here. This 881 00:41:02,000 --> 00:41:04,000 is main and it declared 882 00:41:04,000 --> 00:41:05,000 an integer 883 00:41:05,000 --> 00:41:08,000 whose name is Num, so I 884 00:41:08,000 --> 00:41:09,000 draw a box for that, and it 885 00:41:09,000 --> 00:41:13,000 declares two pointer variables. So the addition of the asterisk by the name 886 00:41:13,000 --> 00:41:14,000 there 887 00:41:14,000 --> 00:41:19,000 says that P and Q are pointers to integers. 888 00:41:19,000 --> 00:41:23,000 So P and Q themselves are variables 889 00:41:23,000 --> 00:41:26,000 that live in the stack. So all the local variables we say 890 00:41:26,000 --> 00:41:30,000 live in the stack. 891 00:41:30,000 --> 00:41:33,000 They are automatically allocated and de-allocated when you enter a routine. The space for 892 00:41:33,000 --> 00:41:36,000 them comes into being. When you leave it, it goes away, 893 00:41:36,000 --> 00:41:40,000 and that P and Q are designed not to hold integers themselves. They don't hold numbers, 894 00:41:40,000 --> 00:41:44,000 but they hold the address of a number somewhere else. 895 00:41:44,000 --> 00:41:47,000 And so the first thing I did with 896 00:41:47,000 --> 00:41:48,000 P there was to assign it 897 00:41:48,000 --> 00:41:52,000 the address of a new integer variable that came out of the heap. So the 898 00:41:52,000 --> 00:41:55,000 new operator is like the new operator in Java. 899 00:41:55,000 --> 00:41:59,000 It takes the thing you want to create one of, it makes one of those in the heap, 900 00:41:59,000 --> 00:42:02,000 and it returns to you its address. In that way it works exactly like in Java. So, in 901 00:42:02,000 --> 00:42:04,000 fact, Java actually 902 00:42:04,000 --> 00:42:07,000 has pointers despite what anybody ever told you, 903 00:42:07,000 --> 00:42:11,000 that the way you create objects and you use new to access them and stuff like that 904 00:42:11,000 --> 00:42:15,000 is exactly the same as it is in C++ as it is 905 00:42:15,000 --> 00:42:16,000 pointers behind the scene. 906 00:42:16,000 --> 00:42:18,000 So I say P 907 00:42:18,000 --> 00:42:20,000 gets the value of a new integer. 908 00:42:20,000 --> 00:42:24,000 This memory over here is called the heap. 909 00:42:24,000 --> 00:42:27,000 So this is not 910 00:42:27,000 --> 00:42:29,000 to confuse you with the idea of 911 00:42:29,000 --> 00:42:32,000 the stack ADT. We've been using the [inaudible] 912 00:42:32,000 --> 00:42:35,000 but it does kind of - it helps you to remember that the stack actually kind of, like, by 913 00:42:35,000 --> 00:42:38,000 the virtue of the way function calls get made, main calls A, which 914 00:42:38,000 --> 00:42:42,000 calls B, which calls C, that that memory actually kind of is laid out like a stack. 915 00:42:42,000 --> 00:42:46,000 The heap is just this unordered crazy pile of stuff. 916 00:42:46,000 --> 00:42:49,000 I ask for new integer, so this might be address 1,000. 917 00:42:49,000 --> 00:42:52,000 This might be address 1,004. This might be address 1,008. Typically 918 00:42:52,000 --> 00:42:56,000 the stack variables are laid out right next to each other. This could be, like, 919 00:42:56,000 --> 00:42:56,000 address 920 00:42:56,000 --> 00:42:59,000 32,016 or something over here. 921 00:42:59,000 --> 00:43:02,000 Some other larger address. So I've 922 00:43:02,000 --> 00:43:03,000 assigned P to be that 923 00:43:03,000 --> 00:43:07,000 thing. So what I actually gotten written into P's box is behind the scenes there really is the number, 924 00:43:07,000 --> 00:43:10,000 you know, the address, 32,016. What 925 00:43:10,000 --> 00:43:14,000 I'm gonna draw is an arrow to remind myself that what it is is it's a 926 00:43:14,000 --> 00:43:18,000 location of an integer stored elsewhere. 927 00:43:18,000 --> 00:43:20,000 The de-reference operator, 928 00:43:20,000 --> 00:43:22,000 which is the first thing we see on this next line, 929 00:43:22,000 --> 00:43:25,000 says to follow P 930 00:43:25,000 --> 00:43:26,000 and assign it a 10. 931 00:43:26,000 --> 00:43:29,000 So this is taking that address that's in P, 932 00:43:29,000 --> 00:43:33,000 using it as a location to go look up something, and it says, and go write to that 933 00:43:33,000 --> 00:43:35,000 location in address 32,016 934 00:43:35,000 --> 00:43:42,000 the number 10. 935 00:43:42,000 --> 00:43:47,000 And I said Q equals no int. So Q gets an 936 00:43:47,000 --> 00:43:47,000 [inaudible] 937 00:43:47,000 --> 00:43:53,000 maybe this is at 23,496. 938 00:43:53,000 --> 00:43:55,000 Some other place out there. And so 939 00:43:55,000 --> 00:43:58,000 that's actually kind of what's being stored here, 940 00:43:58,000 --> 00:44:00,000 23,496. 941 00:44:00,000 --> 00:44:03,000 And then I did this thing where I assigned from 942 00:44:03,000 --> 00:44:05,000 D referencing P 943 00:44:05,000 --> 00:44:09,000 to get an integer, and assign that onto what Q is that has the effect of kind of 944 00:44:09,000 --> 00:44:15,000 following P, reading its 10, and then writing it over here. So copying the integers 945 00:44:15,000 --> 00:44:19,000 at the end of those two pointers to make them point to the same value. 946 00:44:19,000 --> 00:44:22,000 So they point to two different locations, but those locations hold the same integer 947 00:44:22,000 --> 00:44:23,000 value 948 00:44:23,000 --> 00:44:25,000 that is different 949 00:44:25,000 --> 00:44:27,000 than the next line. 950 00:44:27,000 --> 00:44:29,000 Without the stars, 951 00:44:29,000 --> 00:44:34,000 where I said Q equals P, 952 00:44:34,000 --> 00:44:37,000 without the stars on it, 953 00:44:37,000 --> 00:44:43,000 it's saying take the address that's in P, and assign it to Q, 954 00:44:43,000 --> 00:44:45,000 causing Q and P now to be aliases 955 00:44:45,000 --> 00:44:48,000 for the same location. 956 00:44:48,000 --> 00:44:52,000 So now I have two different ways of getting to that same piece of memory, 957 00:44:52,000 --> 00:44:55,000 either by reaching through P and de-referencing, or reaching through Q and de-referencing 958 00:44:55,000 --> 00:44:57,000 it - both of them 959 00:44:57,000 --> 00:45:01,000 are reading and writing to that same location in the heap, where the 10 is, 960 00:45:01,000 --> 00:45:05,000 and then this one down here's no longer accessible. I sort of lost track of it 961 00:45:05,000 --> 00:45:06,000 when I 962 00:45:06,000 --> 00:45:08,000 overwrote Q with the 963 00:45:08,000 --> 00:45:10,000 copy of P. 964 00:45:10,000 --> 00:45:12,000 When I am done 965 00:45:12,000 --> 00:45:14,000 in C++ it is my job 966 00:45:14,000 --> 00:45:15,000 to delete things 967 00:45:15,000 --> 00:45:20,000 that are coming out of the heap. 968 00:45:20,000 --> 00:45:24,000 Yes, it should. I'll take that one at 32,016. 969 00:45:24,000 --> 00:45:28,000 Whereas in Java, when you say new, and then you stop using something, 970 00:45:28,000 --> 00:45:30,000 it figures it out and it does what's called garbage collection to kind of clean up 971 00:45:30,000 --> 00:45:31,000 behind you. 972 00:45:31,000 --> 00:45:35,000 In C++, things that you new out of the heap are your responsibility to 973 00:45:35,000 --> 00:45:36,000 delete. 974 00:45:36,000 --> 00:45:40,000 So you delete something when you're done with it. If I'm no longer using this 975 00:45:40,000 --> 00:45:42,000 new integer I created in the heap, 976 00:45:42,000 --> 00:45:44,000 I say to delete P 977 00:45:44,000 --> 00:45:47,000 to allow that memory to be reclaimed. 978 00:45:47,000 --> 00:45:51,000 And so that causes this piece of memory to get marked as freed, 979 00:45:51,000 --> 00:45:56,000 or reusable, so that a subsequent new call can have that space again 980 00:45:56,000 --> 00:45:58,000 and use it for things. 981 00:45:58,000 --> 00:46:01,000 I will note that right now, the code as written 982 00:46:01,000 --> 00:46:05,000 has a little bit of an error in it because it says delete P. And 983 00:46:05,000 --> 00:46:09,000 delete P says we'll follow out to 32 [inaudible] and mark it as 984 00:46:09,000 --> 00:46:10,000 available. 985 00:46:10,000 --> 00:46:13,000 The next thing I did was said delete Q. Well Q points to the same 986 00:46:13,000 --> 00:46:17,000 place P did. So in fact I'm saying take that piece of freed memory and mark it freed 987 00:46:17,000 --> 00:46:17,000 again. 988 00:46:17,000 --> 00:46:19,000 989 00:46:19,000 --> 00:46:22,000 There is no real guarantee about whether that's gonna do something 990 00:46:22,000 --> 00:46:25,000 sensible, or whether it's just gonna ignore me. One thing it could do is just say, well look, that 991 00:46:25,000 --> 00:46:27,000 memory's already freed, so 992 00:46:27,000 --> 00:46:29,000 you saying to free it twice is kind of stupid. 993 00:46:29,000 --> 00:46:33,000 On the other hand, it could still cause some more problems depending on how the 994 00:46:33,000 --> 00:46:37,000 heap allocater works, and there's no guarantee. So it becomes very [inaudible] on the 995 00:46:37,000 --> 00:46:39,000 programmer to be very careful about this matching, that 996 00:46:39,000 --> 00:46:41,000 if you make a new call, 997 00:46:41,000 --> 00:46:43,000 you make a delete call. And 998 00:46:43,000 --> 00:46:45,000 if you already made a delete call, you don't make another one 999 00:46:45,000 --> 00:46:49,000 accidentally. So I really should have that line out of there. 1000 00:46:49,000 --> 00:46:54,000 The piece of memory that Q originally pointed to, right, I no longer have any way to get to, 1001 00:46:54,000 --> 00:46:57,000 and so I have no way of making a delete call to it and freeing it. And so this little 1002 00:46:57,000 --> 00:47:01,000 piece of memory we call an orphan. 1003 00:47:01,000 --> 00:47:03,000 He's stranded out there in the heap, 1004 00:47:03,000 --> 00:47:04,000 no way to get back to it, 1005 00:47:04,000 --> 00:47:09,000 and C++ will not automatically reclaim it for us. So we have 1006 00:47:09,000 --> 00:47:12,000 created a little bit of a mess for ourselves. If we did that a lot, right, 1007 00:47:12,000 --> 00:47:16,000 we could end up clogging our heap filled with these orphans, 1008 00:47:16,000 --> 00:47:19,000 and have to keep getting new memory because the old memory, right, was not 1009 00:47:19,000 --> 00:47:22,000 being properly reclaimed. 1010 00:47:22,000 --> 00:47:24,000 We're gonna talk more about this, so this is not 1011 00:47:24,000 --> 00:47:27,000 the all - end all of pointers, but just a little bit to think about today. We'll 1012 00:47:27,000 --> 00:47:30,000 come back on Wednesday, talk more about it, and then talk about how we can use 1013 00:47:30,000 --> 00:47:33,000 this to build linked lists, and that will be 1014 00:47:33,000 --> 00:47:36,000 fun times for all.