1 00:00:00,000 --> 00:00:10,000 2 00:00:10,000 --> 00:00:11,000 3 00:00:11,000 --> 00:00:14,000 This presentation is delivered by the Stanford Center for Professional 4 00:00:14,000 --> 00:00:24,000 Development. 5 00:00:24,000 --> 00:00:26,000 Welcome to Friday. I keep telling you I'm gonna talk 6 00:00:26,000 --> 00:00:29,000 about linked list and then we don't ever get very far, but 7 00:00:29,000 --> 00:00:33,000 today really we are gonna talk about linked list and do some tracing and see some 8 00:00:33,000 --> 00:00:36,000 recursive work we can do with linked list and then 9 00:00:36,000 --> 00:00:40,000 I'll probably give you a little preview of the next topic, which is the introduction 10 00:00:40,000 --> 00:00:44,000 algorithm analysis in Big O toward the end. We won't get very far in that, but we will be talking 11 00:00:44,000 --> 00:00:47,000 about that all next week, which is Chapter 7 in its full glory. That 12 00:00:47,000 --> 00:00:48,000 will be 13 00:00:48,000 --> 00:00:51,000 pretty much everything we cover next week is coming out of Chapter 7, so it's all 14 00:00:51,000 --> 00:00:53,000 about algorithm and sorting and different 15 00:00:53,000 --> 00:00:54,000 algorithms. 16 00:00:54,000 --> 00:00:58,000 I will not be able to go - hang out with you today after class because I have an undergraduate 17 00:00:58,000 --> 00:00:59,000 counsel meeting. 18 00:00:59,000 --> 00:01:00,000 19 00:01:00,000 --> 00:01:03,000 But it's a nice day so hopefully you'll have some more 20 00:01:03,000 --> 00:01:06,000 fun outdoor thing to do anyway. 21 00:01:06,000 --> 00:01:09,000 We will meet again next Friday. So 22 00:01:09,000 --> 00:01:11,000 put that on your calendar if you are 23 00:01:11,000 --> 00:01:16,000 the type who keeps a calendar. Okay. 24 00:01:16,000 --> 00:01:20,000 So actually I'm gonna not even really work in the things. I'm gonna go back to my 25 00:01:20,000 --> 00:01:22,000 editing and talk about kind of where we were at 26 00:01:22,000 --> 00:01:25,000 at the end of 27 00:01:25,000 --> 00:01:27,000 Wednesday's lecture. Was 28 00:01:27,000 --> 00:01:29,000 we were doing a little bit of 29 00:01:29,000 --> 00:01:31,000 coding with the linked list itself. 30 00:01:31,000 --> 00:01:35,000 So having defined that recursive structure that has the information from 31 00:01:35,000 --> 00:01:37,000 one entry plus a pointer to the next 32 00:01:37,000 --> 00:01:40,000 and then so far the little pieces we have is something that we will print an individual 33 00:01:40,000 --> 00:01:41,000 entry, 34 00:01:41,000 --> 00:01:44,000 something that will create a new entry out in the heap, filling in with data 35 00:01:44,000 --> 00:01:47,000 that was read from the console typed in by the user, 36 00:01:47,000 --> 00:01:51,000 and then the last bit of code that we were looking at was this idea of building the 37 00:01:51,000 --> 00:01:52,000 list 38 00:01:52,000 --> 00:01:55,000 by getting a new entry. 39 00:01:55,000 --> 00:01:58,000 So a pointer coming back from there that points to a new heap allocated 40 00:01:58,000 --> 00:02:01,000 structure that has the name and the address of someone 41 00:02:01,000 --> 00:02:03,000 and then attaching it to the 42 00:02:03,000 --> 00:02:06,000 front part of the list. So we talked about why that was 43 00:02:06,000 --> 00:02:08,000 the easiest thing to do and I'm gonna 44 00:02:08,000 --> 00:02:11,000 just re-draw this picture again because we're gonna need it 45 00:02:11,000 --> 00:02:16,000 as we go through and do stuff. Here's the main stack frame. 46 00:02:16,000 --> 00:02:19,000 It's got this pointer list, which 47 00:02:19,000 --> 00:02:23,000 is expected to point to an entry out in the heap. It starts pointing at null. 48 00:02:23,000 --> 00:02:26,000 We have this local variable down here, new one 49 00:02:26,000 --> 00:02:29,000 that is also a pointer to an entry 50 00:02:29,000 --> 00:02:32,000 and based on the result from get new entry, which will either return null to say 51 00:02:32,000 --> 00:02:36,000 there's no more entries or a new heap allocated thing if it gives us this new 52 00:02:36,000 --> 00:02:38,000 entry that says 53 00:02:38,000 --> 00:02:43,000 here's Jason and his phone number. 54 00:02:43,000 --> 00:02:48,000 That the lines there - the new ones next equals list so 55 00:02:48,000 --> 00:02:52,000 new ones next field, right? Gets assigned the value of list, which affectively just copies 56 00:02:52,000 --> 00:02:54,000 a null on top of a null 57 00:02:54,000 --> 00:02:59,000 and then sets list to point to the same place that new 1 does. 58 00:02:59,000 --> 00:03:01,000 So on the first iteration, 59 00:03:01,000 --> 00:03:04,000 Jason is the new front most cell on the list, 60 00:03:04,000 --> 00:03:08,000 has a null terminator in the next field, which says there's no further cells. So we 61 00:03:08,000 --> 00:03:10,000 have a singleton list of just one cell. 62 00:03:10,000 --> 00:03:13,000 Now the subsequent iteration 63 00:03:13,000 --> 00:03:17,000 we call get new entry returns us a new 64 00:03:17,000 --> 00:03:23,000 pointer to another cell out here on the list. This one's gonna be Joel. 65 00:03:23,000 --> 00:03:27,000 He starts off as his own singleton list and then here's the attach again. 66 00:03:27,000 --> 00:03:29,000 New ones 67 00:03:29,000 --> 00:03:32,000 next field gets the value of list, 68 00:03:32,000 --> 00:03:35,000 so that changes 69 00:03:35,000 --> 00:03:38,000 Joel's next to 70 00:03:38,000 --> 00:03:41,000 point to where list does now. So I have two different ways to get to Jason's cell 71 00:03:41,000 --> 00:03:45,000 now. Directly, it's the front most cell of the list still, but it's now following Joel. 72 00:03:45,000 --> 00:03:47,000 And then I update list 73 00:03:47,000 --> 00:03:49,000 to point to new one, 74 00:03:49,000 --> 00:03:54,000 so doing pointer assignment on this. Copying the pointers, making an alias, 75 00:03:54,000 --> 00:03:55,000 and now 76 00:03:55,000 --> 00:03:59,000 at the bottom of that loop, right? I now have list pointing to Joel, which points 77 00:03:59,000 --> 00:04:02,000 to Jason, which points to null. So my two entry list. 78 00:04:02,000 --> 00:04:05,000 Every subsequent cell that's created is 79 00:04:05,000 --> 00:04:06,000 prepended, right? 80 00:04:06,000 --> 00:04:08,000 Placed in the front most position of a list 81 00:04:08,000 --> 00:04:11,000 and then the remainder of the list trails behind it. 82 00:04:11,000 --> 00:04:13,000 So we should expect that if we 83 00:04:13,000 --> 00:04:16,000 put in the list A, B, C that if we were to go back and traverse it we'll 84 00:04:16,000 --> 00:04:18,000 discover it goes C, B, A. And 85 00:04:18,000 --> 00:04:22,000 so I was just about to show you that and then we ran out of time. So I'm gonna go ahead and 86 00:04:22,000 --> 00:04:22,000 finish that because I'm gonna 87 00:04:22,000 --> 00:04:31,000 write something that will print an entire linked list. 88 00:04:31,000 --> 00:04:33,000 It will take the list that we have and I'm gonna 89 00:04:33,000 --> 00:04:35,000 show you 90 00:04:35,000 --> 00:04:39,000 that. 91 00:04:39,000 --> 00:04:42,000 So the idea is to 92 00:04:42,000 --> 00:04:44,000 traverse a pointer 93 00:04:44,000 --> 00:04:47,000 down the list 94 00:04:47,000 --> 00:04:49,000 and print each one in 95 00:04:49,000 --> 00:04:51,000 turn. I'm gonna do that using a four loop. It 96 00:04:51,000 --> 00:04:55,000 may seem a little bit like an odd use of the four loop, but in fact what 97 00:04:55,000 --> 00:04:59,000 we're doing is really comparable to how you would do iteration down a link, an 98 00:04:59,000 --> 00:05:00,000 array, 99 00:05:00,000 --> 00:05:03,000 using kind of zero to n, you know, I++. 100 00:05:03,000 --> 00:05:06,000 We're doing the same thing, but instead of advancing an integer, which 101 00:05:06,000 --> 00:05:10,000 indexes into that, we're actually keeping track of a pointer, which moves down. 102 00:05:10,000 --> 00:05:14,000 So the initial state of that pointer as we assign it to be the value of the list, 103 00:05:14,000 --> 00:05:17,000 so we alias to the first cell of the list. 104 00:05:17,000 --> 00:05:20,000 While that pointer points this invalid cell not to null, 105 00:05:20,000 --> 00:05:23,000 so as long as there are still cells to process 106 00:05:23,000 --> 00:05:25,000 then we'll print to 107 00:05:25,000 --> 00:05:28,000 cell. Then we'll advance the kind of equivalent of the I++. 108 00:05:28,000 --> 00:05:30,000 Here's the Kerr equals Kerr next. 109 00:05:30,000 --> 00:05:34,000 So advancing - so starting from Kerr equals Joel, right? To Kerr equals Jason and then Kerr 110 00:05:34,000 --> 00:05:38,000 equals null, which will terminate that. So as long as our linked list is properly 111 00:05:38,000 --> 00:05:40,000 terminated, right? We should print all the cells 112 00:05:40,000 --> 00:05:41,000 from front to back 113 00:05:41,000 --> 00:05:43,000 using this one loop. 114 00:05:43,000 --> 00:05:47,000 And so if I change this down here to be print list instead of just print to the front 115 00:05:47,000 --> 00:05:48,000 most entry 116 00:05:48,000 --> 00:05:56,000 and then I run a little test on this. 117 00:05:56,000 --> 00:05:58,000 So I enter Jake 118 00:05:58,000 --> 00:06:00,000 and his phone number, 119 00:06:00,000 --> 00:06:02,000 and I enter 120 00:06:02,000 --> 00:06:04,000 Carl and his phone number, and then I 121 00:06:04,000 --> 00:06:06,000 enter Ilia and 122 00:06:06,000 --> 00:06:07,000 his phone number, and I 123 00:06:07,000 --> 00:06:10,000 see that's all I have. Then they come back out in the opposite order, 124 00:06:10,000 --> 00:06:13,000 right? That Ilia, Carl, Jake 125 00:06:13,000 --> 00:06:17,000 because of the way I was placing them in the front of the list, right? Kind of effectively reversed, right? Them 126 00:06:17,000 --> 00:06:22,000 from the order they were inserted. Do you have a question? Do people ever 127 00:06:22,000 --> 00:06:22,000 make blank lists so that you can traverse backwards 128 00:06:22,000 --> 00:06:25,000 through them? 129 00:06:25,000 --> 00:06:28,000 They certainly do. So the simplest form of the linked list, right? Has only these 130 00:06:28,000 --> 00:06:33,000 forward links, right? So each cell gets to the one that follows it in the list, 131 00:06:33,000 --> 00:06:36,000 but that is gonna create certain asymmetries in how you process that list. 132 00:06:36,000 --> 00:06:38,000 You're always gonna start at the front and move your way to the back. Well, what happens if you're 133 00:06:38,000 --> 00:06:40,000 at a cell and you'd like to back up, right? 134 00:06:40,000 --> 00:06:42,000 It doesn't have the information 135 00:06:42,000 --> 00:06:46,000 in the single chain to do that. So you can build lists that either 136 00:06:46,000 --> 00:06:49,000 link in the other direction or link in both directions, right? To where you 137 00:06:49,000 --> 00:06:53,000 can, from a particular cell, find who proceeded it and who followed it. It just makes 138 00:06:53,000 --> 00:06:56,000 more pointers, right? And more complication. We'll see reasons why that actually 139 00:06:56,000 --> 00:06:59,000 becomes valuable. At this stage it turns out that it would just be 140 00:06:59,000 --> 00:07:02,000 complication that we wouldn't really be taking advantage of, but 141 00:07:02,000 --> 00:07:05,000 in a week or two we're gonna get to some place where that turns out to be a very, very 142 00:07:05,000 --> 00:07:08,000 useful addition to the list because it turns out that we really will need to be 143 00:07:08,000 --> 00:07:11,000 able to kind of move easily in both directions and right now we're not too 144 00:07:11,000 --> 00:07:12,000 worried about 145 00:07:12,000 --> 00:07:18,000 any direction other than the front ways. 146 00:07:18,000 --> 00:07:21,000 So this idea of 147 00:07:21,000 --> 00:07:23,000 the four loop, right? Is 148 00:07:23,000 --> 00:07:27,000 just one of the ways I could have done this. I could do this with a Y loop, 149 00:07:27,000 --> 00:07:28,000 right? Other sort of forms of iteration. 150 00:07:28,000 --> 00:07:31,000 I could also 151 00:07:31,000 --> 00:07:35,000 capitalize on the recursive nature of the list, and this is partly why I've chosen to introduce 152 00:07:35,000 --> 00:07:37,000 this topic now, is because I really do want to stress that this idea that it's a 153 00:07:37,000 --> 00:07:39,000 recursive structure 154 00:07:39,000 --> 00:07:42,000 gives us a little bit of an insight into ways that operations on it 155 00:07:42,000 --> 00:07:45,000 might be able to take advantage of that recursive nature. 156 00:07:45,000 --> 00:07:48,000 So let me write this alternate print list, and I'll 157 00:07:48,000 --> 00:07:52,000 call it recursive print list, 158 00:07:52,000 --> 00:07:55,000 that is designed to use recursion to get the job done. 159 00:07:55,000 --> 00:08:00,000 So in terms of thinking about it recursively, you can think of, well, a linked list is 160 00:08:00,000 --> 00:08:03,000 the sequence of cells and iteration kind of thinks of the whole sequence at one 161 00:08:03,000 --> 00:08:04,000 time. 162 00:08:04,000 --> 00:08:08,000 Recursion tends to take the strategy of, well, I'm just gonna separate it into 163 00:08:08,000 --> 00:08:12,000 something small I can deal with right now and then some instance of the same 164 00:08:12,000 --> 00:08:14,000 problem, but in a slightly smaller simpler form. 165 00:08:14,000 --> 00:08:18,000 The easy way to do that with a linked list is imagine there's the front 166 00:08:18,000 --> 00:08:20,000 most cell. 167 00:08:20,000 --> 00:08:23,000 Which I could print by calling print entry, which prints a single entry. 168 00:08:23,000 --> 00:08:26,000 And then there is this recursive 169 00:08:26,000 --> 00:08:29,000 structure left behind, which is the remainder of the list and I can say, well, 170 00:08:29,000 --> 00:08:32,000 I can recursively print the list 171 00:08:32,000 --> 00:08:34,000 that follows. 172 00:08:34,000 --> 00:08:36,000 So print the front most cell, 173 00:08:36,000 --> 00:08:40,000 let recursion work its magic on what remains. 174 00:08:40,000 --> 00:08:43,000 Looks pretty good. Something really critical missing? 175 00:08:43,000 --> 00:08:45,000 Base case. Better have a base case here. 176 00:08:45,000 --> 00:08:49,000 And the simplest possible base case is list equals equals null. 177 00:08:49,000 --> 00:08:53,000 If list equals equals null then it won't have nothing to do, so I can just return or I 178 00:08:53,000 --> 00:08:55,000 can actually just do it like this. Say if list 179 00:08:55,000 --> 00:08:57,000 does not equal null, so it has some 180 00:08:57,000 --> 00:08:59,000 valid contents, 181 00:08:59,000 --> 00:09:00,000 something to look at, 182 00:09:00,000 --> 00:09:03,000 then we'll print the front most one and then let recursion take care of the 183 00:09:03,000 --> 00:09:05,000 rest. So 184 00:09:05,000 --> 00:09:11,000 I'll change my called print list to be a recursive print list. And I should 185 00:09:11,000 --> 00:09:12,000 see some [inaudible] results here. 186 00:09:12,000 --> 00:09:15,000 I say that 187 00:09:15,000 --> 00:09:17,000 Nathan is here, I 188 00:09:17,000 --> 00:09:20,000 say Jason is here, I 189 00:09:20,000 --> 00:09:21,000 Sara, 190 00:09:21,000 --> 00:09:23,000 Sara are you an H? I don't 191 00:09:23,000 --> 00:09:24,000 remember. 192 00:09:24,000 --> 00:09:26,000 Today it doesn't. 193 00:09:26,000 --> 00:09:30,000 Sara, Jason, Nathan coming out the other way. Okay. 194 00:09:30,000 --> 00:09:33,000 So I want to do something kind of funny with this recursive call. 195 00:09:33,000 --> 00:09:33,000 So it's 196 00:09:33,000 --> 00:09:37,000 not really in this case - both of these look equally simple to write. We're 197 00:09:37,000 --> 00:09:40,000 gonna start to look at some things where this actually might buy 198 00:09:40,000 --> 00:09:43,000 us something. For example, lets imagine that what I wanted to do was print the 199 00:09:43,000 --> 00:09:47,000 list in the other order. So I don't have the list 200 00:09:47,000 --> 00:09:48,000 that links backwards. 201 00:09:48,000 --> 00:09:50,000 So if I really did want to print them 202 00:09:50,000 --> 00:09:52,000 with the last most element first, 203 00:09:52,000 --> 00:09:56,000 the way that they were added at the console, 204 00:09:56,000 --> 00:09:58,000 and the way I'm building the list, right? Is putting them in the other order. So, well, could I just take 205 00:09:58,000 --> 00:10:00,000 the list and print it back to front? 206 00:10:00,000 --> 00:10:03,000 I mean, it's simple enough to do that on a vector, if you had one, to work from 207 00:10:03,000 --> 00:10:06,000 the way back. Could I do it with the linked list? 208 00:10:06,000 --> 00:10:09,000 In the iterate formulation it's gonna actually get pretty messy because I'm 209 00:10:09,000 --> 00:10:12,000 gonna have to walk my way all the way down to the end and then print the last one, and then I'm 210 00:10:12,000 --> 00:10:15,000 gonna have to walk my way down to the one that points to the last one and 211 00:10:15,000 --> 00:10:18,000 print it, and then I'm gonna have to walk my way down to the one that printed to the penultimate 212 00:10:18,000 --> 00:10:19,000 one and print it, 213 00:10:19,000 --> 00:10:21,000 but it does involve a lot of traversal and a lot of work. 214 00:10:21,000 --> 00:10:24,000 I can make the recursive print one do it 215 00:10:24,000 --> 00:10:28,000 with just one tiny change of the code. 216 00:10:28,000 --> 00:10:30,000 Take this line 217 00:10:30,000 --> 00:10:34,000 and put it there. So 218 00:10:34,000 --> 00:10:37,000 what I'm doing here is letting recursion do the work for me of 219 00:10:37,000 --> 00:10:42,000 saying, well, print everything that follows me in the list before you print this cell. So 220 00:10:42,000 --> 00:10:46,000 if the list is A, B, C it says when you get to the A node it says 221 00:10:46,000 --> 00:10:49,000 okay, please print the things that follow me before you get back to me. Well, recursion 222 00:10:49,000 --> 00:10:52,000 being applied to the B, C list says, well, B says, well, hold onto B, print everything 223 00:10:52,000 --> 00:10:56,000 that follows me before you print B, when C gets there it says, well, print everything that 224 00:10:56,000 --> 00:10:59,000 follows me, which turns out to be nothing, which causes it to come back and 225 00:10:59,000 --> 00:11:02,000 say, well, okay, I'm done with the part that followed C, why don't we print C, 226 00:11:02,000 --> 00:11:03,000 and then print B, and then print A. 227 00:11:03,000 --> 00:11:09,000 So if I do this and 228 00:11:09,000 --> 00:11:10,000 print it through I put 229 00:11:10,000 --> 00:11:12,000 in A, 230 00:11:12,000 --> 00:11:16,000 B, C. 231 00:11:16,000 --> 00:11:18,000 Then they get printed out in the order I inserted them. 232 00:11:18,000 --> 00:11:23,000 They happen to be stored internally as C, B, A, but then I just do a 233 00:11:23,000 --> 00:11:25,000 change-a-roo was printing 234 00:11:25,000 --> 00:11:29,000 to print from the back of the list to the front of the list in the other order. 235 00:11:29,000 --> 00:11:32,000 But just the magic of recursion, right? A very simple little change, right? In terms of 236 00:11:32,000 --> 00:11:34,000 doing the work before the recursive call 237 00:11:34,000 --> 00:11:36,000 versus after 238 00:11:36,000 --> 00:11:39,000 gave me exactly what I wanted without any real craziness 239 00:11:39,000 --> 00:11:42,000 inserted in the code. We'll do a couple 240 00:11:42,000 --> 00:11:44,000 of other little things with you 241 00:11:44,000 --> 00:11:46,000 and 242 00:11:46,000 --> 00:11:49,000 I'm gonna do them recursively just 243 00:11:49,000 --> 00:11:50,000 for practice. 244 00:11:50,000 --> 00:11:53,000 245 00:11:53,000 --> 00:11:57,000 If I wanted to count them, I wanted to know how many cells 246 00:11:57,000 --> 00:12:02,000 are in my list, then thinking about 247 00:12:02,000 --> 00:12:03,000 it recursively 248 00:12:03,000 --> 00:12:06,000 is a matter of saying, well, there's a cell in the front 249 00:12:06,000 --> 00:12:06,000 250 00:12:06,000 --> 00:12:10,000 and then there's some count of the cells that follow it. If I add those together, 251 00:12:10,000 --> 00:12:12,000 that tells me how long this list is. 252 00:12:12,000 --> 00:12:13,000 253 00:12:13,000 --> 00:12:16,000 Having a base case there that says when I get to the completely empty list where the list 254 00:12:16,000 --> 00:12:17,000 is null 255 00:12:17,000 --> 00:12:20,000 then there are no more cells to count that returns my zero. So it will work its way all the way down 256 00:12:20,000 --> 00:12:22,000 to the bottom, 257 00:12:22,000 --> 00:12:25,000 find that trailing null that sentinels the end, and then kind of count on its way 258 00:12:25,000 --> 00:12:28,000 back out. Okay, there was one before that and then one before that and add those 259 00:12:28,000 --> 00:12:29,000 all up 260 00:12:29,000 --> 00:12:33,000 to tell me how many entries are in my address book. So I can do 261 00:12:33,000 --> 00:12:36,000 that here. I can say what is the count in 262 00:12:36,000 --> 00:12:38,000 my 263 00:12:38,000 --> 00:12:43,000 address book? And maybe I'll stop printing it because I'm - and so 264 00:12:43,000 --> 00:12:46,000 I can test it on a couple of things. Well, what if I put 265 00:12:46,000 --> 00:12:50,000 an empty cell in so I have a null? If I get 266 00:12:50,000 --> 00:12:54,000 one cell in there, right? Then I should have one and I can even do a couple more. A, B, C. 267 00:12:54,000 --> 00:12:55,000 Got 268 00:12:55,000 --> 00:12:59,000 three cells. I'll 269 00:12:59,000 --> 00:13:00,000 do 270 00:13:00,000 --> 00:13:06,000 another little one while I'm here. 271 00:13:06,000 --> 00:13:08,000 Is that when I'm done with the linked list, 272 00:13:08,000 --> 00:13:11,000 all those heap allocated cells, right? 273 00:13:11,000 --> 00:13:14,000 Are just out there clogging up my heap. So when I'm done with this list 274 00:13:14,000 --> 00:13:16,000 the appropriate thing to do is to deallocate it. 275 00:13:16,000 --> 00:13:18,000 I will note, actually, just to be clear though, 276 00:13:18,000 --> 00:13:23,000 that any memory that's allocated by main and kind of in the process of working the program that 277 00:13:23,000 --> 00:13:27,000 when you actually exit the program it automatically deallocated. So 278 00:13:27,000 --> 00:13:29,000 when you are completed with the program and you're exiting 279 00:13:29,000 --> 00:13:33,000 you can go around and be tidy and clean stuff up, but, in fact, there's not a lot of point 280 00:13:33,000 --> 00:13:33,000 to it. 281 00:13:33,000 --> 00:13:36,000 The more important reason, right? To deallocation would be during the 282 00:13:36,000 --> 00:13:40,000 running of the program as you're playing games or monitoring things or doing the 283 00:13:40,000 --> 00:13:41,000 data. 284 00:13:41,000 --> 00:13:43,000 If you are not deallocating - if the program, especially if it's 285 00:13:43,000 --> 00:13:44,000 long 286 00:13:44,000 --> 00:13:45,000 running, 287 00:13:45,000 --> 00:13:49,000 will eventually have problems related to its kind of gargantuan memory size if it's 288 00:13:49,000 --> 00:13:52,000 not being careful about releasing memory. 289 00:13:52,000 --> 00:13:55,000 When you're done deleting it manually or just letting the system take it down as 290 00:13:55,000 --> 00:13:57,000 part of the process 291 00:13:57,000 --> 00:13:58,000 there's not much 292 00:13:58,000 --> 00:14:00,000 advantage to. 293 00:14:00,000 --> 00:14:03,000 So if I'm gonna deallocate a list, 294 00:14:03,000 --> 00:14:07,000 if the list does not equal null there's something deallocate. 295 00:14:07,000 --> 00:14:07,000 I'm 296 00:14:07,000 --> 00:14:11,000 gonna do this wrong first and then we'll talk about why 297 00:14:11,000 --> 00:14:15,000 it's not what I want to do. 298 00:14:15,000 --> 00:14:19,000 That's good exercise remembering things. So I think, okay, well, what I need to do 299 00:14:19,000 --> 00:14:22,000 is delete the front most cell 300 00:14:22,000 --> 00:14:26,000 and then deallocate all those cells that follow it. So 301 00:14:26,000 --> 00:14:29,000 using recursion, right? To kind of divide the problem up. 302 00:14:29,000 --> 00:14:32,000 It's a matter of deleting the front cell and then saying, okay, whatever else needs to 303 00:14:32,000 --> 00:14:35,000 be done needs to be done to everything that remains. List dot next gives me a pointer to that 304 00:14:35,000 --> 00:14:37,000 smaller sub list to work on. 305 00:14:37,000 --> 00:14:41,000 As written, this does try to make the right delete calls, 306 00:14:41,000 --> 00:14:44,000 but it does something really dangerous in terms of use of free memory. Anybody want 307 00:14:44,000 --> 00:14:46,000 to help me out with that? After it 308 00:14:46,000 --> 00:14:48,000 deletes the first 309 00:14:48,000 --> 00:14:50,000 element it doesn't know where to find the rest of it. 310 00:14:50,000 --> 00:14:51,000 That's exactly right. 311 00:14:51,000 --> 00:14:56,000 So this one said delete list. So if I think of it in terms of a picture here 312 00:14:56,000 --> 00:14:59,000 coming into the call 313 00:14:59,000 --> 00:15:01,000 list to some pointer 314 00:15:01,000 --> 00:15:05,000 to these nodes 315 00:15:05,000 --> 00:15:07,000 that are out here, A, B, and C, 316 00:15:07,000 --> 00:15:09,000 followed by null and I 317 00:15:09,000 --> 00:15:11,000 said delete list. 318 00:15:11,000 --> 00:15:14,000 So delete list says follow list to find that piece of memory out there in 319 00:15:14,000 --> 00:15:18,000 the heap and mark this cell 320 00:15:18,000 --> 00:15:19,000 as deleted, 321 00:15:19,000 --> 00:15:22,000 no longer in use, ready to be reclaimed. 322 00:15:22,000 --> 00:15:24,000 The very next line 323 00:15:24,000 --> 00:15:26,000 says deallocate list arrow next. 324 00:15:26,000 --> 00:15:30,000 And so that is actually using list, right? To point back into this piece of freed 325 00:15:30,000 --> 00:15:31,000 memory 326 00:15:31,000 --> 00:15:33,000 and try to read this number out of here. 327 00:15:33,000 --> 00:15:37,000 It may work in some situations. It may work long enough that you almost 328 00:15:37,000 --> 00:15:40,000 think it's actually correct and then some day cause some problem, 329 00:15:40,000 --> 00:15:45,000 right? Just in a different circumstances where this didn't succeed by accident. 330 00:15:45,000 --> 00:15:47,000 That I'm digging into 331 00:15:47,000 --> 00:15:51,000 that piece of freed memory and trying to access a field out of it, right? Which is 332 00:15:51,000 --> 00:15:54,000 not a reliable thing to do. C++ will let me do it, 333 00:15:54,000 --> 00:15:58,000 it doesn't complain about this. Ether it will compile time or run time in an obvious way, 334 00:15:58,000 --> 00:16:01,000 but it's just, sort of, a little bit of ticking time bomb 335 00:16:01,000 --> 00:16:04,000 to have that kind of code that's there. 336 00:16:04,000 --> 00:16:06,000 So there's two different ways I could fix this. 337 00:16:06,000 --> 00:16:08,000 One way would be to, before 338 00:16:08,000 --> 00:16:10,000 I do the delete 339 00:16:10,000 --> 00:16:14,000 is to just hold onto the pointer that I'm gonna need. 340 00:16:14,000 --> 00:16:21,000 So pull it out of the memory before we're done here. 341 00:16:21,000 --> 00:16:22,000 So I read it in there. 342 00:16:22,000 --> 00:16:26,000 So the piece of memory that's behind it is still good. 343 00:16:26,000 --> 00:16:29,000 The delete actually just deleted that individual cell, so whatever is previously 344 00:16:29,000 --> 00:16:32,000 allocated with new, which was one entry structure, 345 00:16:32,000 --> 00:16:34,000 was what was deleted by that call. 346 00:16:34,000 --> 00:16:36,000 But what I was 347 00:16:36,000 --> 00:16:38,000 needing here was to keep track of something in 348 00:16:38,000 --> 00:16:42,000 that piece of memory before, right? I obliterated it, 349 00:16:42,000 --> 00:16:44,000 so that I can make a further use of it. 350 00:16:44,000 --> 00:16:46,000 The other alternative to how to fix this 351 00:16:46,000 --> 00:16:48,000 would be to just merely 352 00:16:48,000 --> 00:16:53,000 rearrange these two lines. Same kind of fix I did up above where 353 00:16:53,000 --> 00:16:54,000 go ahead and delete 354 00:16:54,000 --> 00:16:58,000 everything that follows so when I'm doing a list like this it would say, well, 355 00:16:58,000 --> 00:17:02,000 delete the D and C and only after all of that has been cleaned up 356 00:17:02,000 --> 00:17:05,000 come back and delete the one on the front. So it would actually delete from the rear 357 00:17:05,000 --> 00:17:08,000 forward. Work its way all the way down to the bottom, delete the last cell, then the next to 358 00:17:08,000 --> 00:17:09,000 last, 359 00:17:09,000 --> 00:17:10,000 and work its way out. 360 00:17:10,000 --> 00:17:12,000 Which would also be 361 00:17:12,000 --> 00:17:13,000 a completely 362 00:17:13,000 --> 00:17:20,000 correct way to solve this problem. Okay. Any questions 363 00:17:20,000 --> 00:17:22,000 on that one so far? Let me go 364 00:17:22,000 --> 00:17:25,000 back over here and see what 365 00:17:25,000 --> 00:17:27,000 I want to do with you next. 366 00:17:27,000 --> 00:17:31,000 So I talked about this, talked about these. Okay. 367 00:17:31,000 --> 00:17:32,000 Now, 368 00:17:32,000 --> 00:17:34,000 I'm gonna do something that's actually 369 00:17:34,000 --> 00:17:36,000 really pretty tricky 370 00:17:36,000 --> 00:17:47,000 that I want you guys to watch with me. All right. 371 00:17:47,000 --> 00:17:51,000 So this is the same code that we had for building the address book. 372 00:17:51,000 --> 00:17:52,000 373 00:17:52,000 --> 00:17:54,000 The listhead, the new one, and so on. 374 00:17:54,000 --> 00:17:57,000 And what I did was, I just did a little bit of code factoring. Where I took the steps 375 00:17:57,000 --> 00:18:01,000 that were designed to splice that new cell onto the front of the list 376 00:18:01,000 --> 00:18:04,000 and I looped them into their own function called prepend 377 00:18:04,000 --> 00:18:08,000 that takes two pointers. The pointer to the new entry and the pointer to the current first 378 00:18:08,000 --> 00:18:11,000 cell of the list and it's designed to wire them 379 00:18:11,000 --> 00:18:12,000 up by putting it in the front. 380 00:18:12,000 --> 00:18:15,000 So it looks like the code that was here, just moved up here, and then the 381 00:18:15,000 --> 00:18:19,000 variable names change because the parameters have slightly different names. 382 00:18:19,000 --> 00:18:22,000 The code as I'm showing it right here is buggy. 383 00:18:22,000 --> 00:18:24,000 It's almost, but not quite, what we want 384 00:18:24,000 --> 00:18:28,000 and I want to do this very carefully. Kind of trace through what's happening here, so you 385 00:18:28,000 --> 00:18:29,000 can watch with me 386 00:18:29,000 --> 00:18:32,000 what's happening in this process. So 387 00:18:32,000 --> 00:18:37,000 the variable is actually called listhead and this - I guess let me go ahead and do this. 388 00:18:37,000 --> 00:18:40,000 Okay. So 389 00:18:40,000 --> 00:18:43,000 let's imagine that we have gone through a couple of alliterations and we've got a good linked 390 00:18:43,000 --> 00:18:43,000 list 391 00:18:43,000 --> 00:18:44,000 392 00:18:44,000 --> 00:18:46,000 kind of in place, so I have something to work with her. 393 00:18:46,000 --> 00:18:49,000 So let's imagine that listhead 394 00:18:49,000 --> 00:18:52,000 points to a 395 00:18:52,000 --> 00:18:53,000 cell A to B 396 00:18:53,000 --> 00:18:56,000 and then its got two cells, let's say, 397 00:18:56,000 --> 00:18:58,000 and then it's empty. 398 00:18:58,000 --> 00:19:00,000 Let's say I get a new one 399 00:19:00,000 --> 00:19:01,000 400 00:19:01,000 --> 00:19:06,000 and this ones gonna be a J, let's say. Okay. 401 00:19:06,000 --> 00:19:09,000 So that's the state, let's say, that I'm coming into here and hit the 402 00:19:09,000 --> 00:19:09,000 prepend 403 00:19:09,000 --> 00:19:13,000 and I'm ready to take this J cell and put it on the front so that I have J, A, B on 404 00:19:13,000 --> 00:19:16,000 my list. Okay. So let me 405 00:19:16,000 --> 00:19:20,000 do a little dotted line here that distinguishes my two stack 406 00:19:20,000 --> 00:19:25,000 rings. I'm gonna make this call to prepend 407 00:19:25,000 --> 00:19:31,000 and prepend has two variables, ENT and first, 408 00:19:31,000 --> 00:19:33,000 that were copied from 409 00:19:33,000 --> 00:19:37,000 the variables new one and listhead that came out of the build address book. Actually, 410 00:19:37,000 --> 00:19:40,000 it's not called main here. Let me 411 00:19:40,000 --> 00:19:43,000 call this build 412 00:19:43,000 --> 00:19:47,000 book. Okay. 413 00:19:47,000 --> 00:19:50,000 So I pass the value of new one as 414 00:19:50,000 --> 00:19:51,000 ENT. 415 00:19:51,000 --> 00:19:53,000 So what we get is a pointer 416 00:19:53,000 --> 00:19:55,000 to this same cell, 417 00:19:55,000 --> 00:19:57,000 a copy, and so this case, right? 418 00:19:57,000 --> 00:20:00,000 We copied the pointer, so whatever address was being held there is actually 419 00:20:00,000 --> 00:20:03,000 copied into here as this parameter. 420 00:20:03,000 --> 00:20:04,000 Similarly, 421 00:20:04,000 --> 00:20:07,000 listhead is copied into the second parameter, which is the pointer to the 422 00:20:07,000 --> 00:20:12,000 first cell. Okay. 423 00:20:12,000 --> 00:20:15,000 I want to do this in such a way that you can follow what's going on. So let's see 424 00:20:15,000 --> 00:20:16,000 if I can get this 425 00:20:16,000 --> 00:20:19,000 to work out. So it's pointing up there to first. Okay. 426 00:20:19,000 --> 00:20:21,000 So I've got 427 00:20:21,000 --> 00:20:24,000 two pointers to this cell A. 428 00:20:24,000 --> 00:20:26,000 One that came off the original listhead and one that came there from the copy in first. 429 00:20:26,000 --> 00:20:29,000 And then I've got two pointers to this J. 430 00:20:29,000 --> 00:20:32,000 One that came through the variable new one in the build an address book frame and 431 00:20:32,000 --> 00:20:35,000 one that was ENT in the prepend frame. Now, I'm gonna 432 00:20:35,000 --> 00:20:37,000 trace through the code of prepend. 433 00:20:37,000 --> 00:20:39,000 It says ent 434 00:20:39,000 --> 00:20:41,000 next field equals first. Okay. 435 00:20:41,000 --> 00:20:43,000 So ent's 436 00:20:43,000 --> 00:20:44,000 next field 437 00:20:44,000 --> 00:20:48,000 is right here. 438 00:20:48,000 --> 00:20:53,000 It gets the value of first. Well, first points to this cell. Okay. 439 00:20:53,000 --> 00:20:56,000 So we go ahead 440 00:20:56,000 --> 00:20:59,000 and make the next field of J point to that cell. 441 00:20:59,000 --> 00:21:03,000 So that looks pretty good. That first line worked out just fine. It changed 442 00:21:03,000 --> 00:21:05,000 the cell J 443 00:21:05,000 --> 00:21:08,000 to point to what was previously the front of the list. So it looks like now it's the 444 00:21:08,000 --> 00:21:10,000 first cell that's followed by those. 445 00:21:10,000 --> 00:21:13,000 And then the next line I need to do is I need to update the listhead 446 00:21:13,000 --> 00:21:16,000 to point to J. 447 00:21:16,000 --> 00:21:17,000 So it says first 448 00:21:17,000 --> 00:21:19,000 equals ENT. 449 00:21:19,000 --> 00:21:20,000 So first 450 00:21:20,000 --> 00:21:29,000 gets the value of ENT. Well this just does pointer assignment. 451 00:21:29,000 --> 00:21:31,000 452 00:21:31,000 --> 00:21:34,000 Sorry, I erased where it used to point to. 453 00:21:34,000 --> 00:21:36,000 And then I make it point to there, too. 454 00:21:36,000 --> 00:21:39,000 So at the bottom of prepend, 455 00:21:39,000 --> 00:21:43,000 if I were to use first as the pointer to the initial cell of the list it looks good. 456 00:21:43,000 --> 00:21:47,000 It points to J, which points to A, which points to B, which points to null. 457 00:21:47,000 --> 00:21:49,000 Everything looks fine 458 00:21:49,000 --> 00:21:50,000 here, 459 00:21:50,000 --> 00:21:51,000 right? At the end of prepend. 460 00:21:51,000 --> 00:21:53,000 But when I get back here 461 00:21:53,000 --> 00:21:55,000 to come back around on this loop; 462 00:21:55,000 --> 00:22:03,000 where is listhead pointing? It's pointing to A. 463 00:22:03,000 --> 00:22:09,000 It's pointing to A. Did anything happen to listhead in the process of this? No. 464 00:22:09,000 --> 00:22:12,000 That listhead points where it always pointed, which was whatever cell, right? Was 465 00:22:12,000 --> 00:22:14,000 previously pointed to. 466 00:22:14,000 --> 00:22:18,000 This attempt to pass it into prepend and have prepend update it didn't 467 00:22:18,000 --> 00:22:21,000 stick. 468 00:22:21,000 --> 00:22:22,000 Prepend 469 00:22:22,000 --> 00:22:24,000 has two pointers 470 00:22:24,000 --> 00:22:27,000 that are copies of these pointers. 471 00:22:27,000 --> 00:22:31,000 So like other variables, ents, and strings and 472 00:22:31,000 --> 00:22:32,000 vectors and anything else we have, 473 00:22:32,000 --> 00:22:36,000 if we just pass the variable itself into a function call 474 00:22:36,000 --> 00:22:41,000 then that pass by value mechanism kicks in, which causes there to be two new 475 00:22:41,000 --> 00:22:43,000 variables. In this case, the ENT and the first variables, 476 00:22:43,000 --> 00:22:47,000 that are copies of the two variables listhead and new one 477 00:22:47,000 --> 00:22:49,000 that were present in build address book. 478 00:22:49,000 --> 00:22:54,000 And the changes I make down here, if I really try to change ENT or change first, 479 00:22:54,000 --> 00:22:58,000 so I try to make ENT point somewhere new or make first point somewhere new, 480 00:22:58,000 --> 00:22:59,000 don't stick. 481 00:22:59,000 --> 00:23:03,000 They change the copies, not the originals. 482 00:23:03,000 --> 00:23:05,000 This is tricky. 483 00:23:05,000 --> 00:23:08,000 This is very tricky because it's entering that the first line 484 00:23:08,000 --> 00:23:10,000 was actually okay. 485 00:23:10,000 --> 00:23:14,000 That what the first line tried to do actually did have a persistent affect and 486 00:23:14,000 --> 00:23:15,000 the affect we wanted, 487 00:23:15,000 --> 00:23:18,000 which is it followed the ENT pointer 488 00:23:18,000 --> 00:23:21,000 to this shared location and changed its next field. So 489 00:23:21,000 --> 00:23:25,000 dereferencing that pointer and mucking around with the struct itself 490 00:23:25,000 --> 00:23:28,000 did have a persistent affect. It wasn't another copy of the entry struck. There 491 00:23:28,000 --> 00:23:33,000 really is just this one entry struck that new one and ENT are both pointing 492 00:23:33,000 --> 00:23:34,000 to. 493 00:23:34,000 --> 00:23:35,000 So both of them, right? 494 00:23:35,000 --> 00:23:39,000 Are viewing the same piece of heap memory. Changes made out there at the 495 00:23:39,000 --> 00:23:42,000 structure, right? Are being perceived from both ends, 496 00:23:42,000 --> 00:23:45,000 but the pointers themselves - the fact that I had two different pointers that 497 00:23:45,000 --> 00:23:48,000 pointed to the same place, I changed one of my pointers, 498 00:23:48,000 --> 00:23:51,000 the one whose name is first, to 499 00:23:51,000 --> 00:23:55,000 point somewhere new. I didn't change listhead by that action. That listhead and 500 00:23:55,000 --> 00:23:55,000 first 501 00:23:55,000 --> 00:23:59,000 were just two aliases of the same location, but they had no further 502 00:23:59,000 --> 00:24:04,000 relationship that is implied by that parameter passing there. How do 503 00:24:04,000 --> 00:24:07,000 you feel about that? Question? And 504 00:24:07,000 --> 00:24:09,000 first and ent 505 00:24:09,000 --> 00:24:11,000 are gonna disappear now, right, afterwards? No. First and ent would go away, 506 00:24:11,000 --> 00:24:15,000 but that just means their pointers would go. So when prepend goes away, 507 00:24:15,000 --> 00:24:18,000 right? This space gets deallocated, so it's 508 00:24:18,000 --> 00:24:20,000 like this 509 00:24:20,000 --> 00:24:23,000 pointer goes away 510 00:24:23,000 --> 00:24:26,000 and all this space down here goes 511 00:24:26,000 --> 00:24:30,000 away. Then what we have is list still pointing to A and B and then new one, 512 00:24:30,000 --> 00:24:33,000 right? Is pointing to this J, which points to A, but, in fact, right? 513 00:24:33,000 --> 00:24:36,000 Is effectively gonna be orphaned by the next come around of the loop when we 514 00:24:36,000 --> 00:24:37,000 reassign new one. 515 00:24:37,000 --> 00:24:39,000 So that J didn't get prepended. 516 00:24:39,000 --> 00:24:43,000 It got half of its attachment done. The outgoing attachment from J got done, 517 00:24:43,000 --> 00:24:45,000 but the incoming one 518 00:24:45,000 --> 00:24:48,000 didn't stick. 519 00:24:48,000 --> 00:24:51,000 This is pretty tricky to think about because it really requires really kind of 520 00:24:51,000 --> 00:24:53,000 carefully analyzing what the code is doing 521 00:24:53,000 --> 00:24:57,000 and not letting this idea of the pointer confuse you from what you all 522 00:24:57,000 --> 00:25:00,000 ready know about variables. If I pass - if you see a function call 523 00:25:00,000 --> 00:25:01,000 where I say 524 00:25:01,000 --> 00:25:03,000 binky 525 00:25:03,000 --> 00:25:05,000 of 526 00:25:05,000 --> 00:25:07,000 A and B, where A and B are integer values, 527 00:25:07,000 --> 00:25:10,000 if these are not by reference coming into binky then you know that when they 528 00:25:10,000 --> 00:25:13,000 come back A and B are what they were before. If A was 10 and B was 20, 529 00:25:13,000 --> 00:25:16,000 they're still that. That the only way that binky can have some persistent affect on 530 00:25:16,000 --> 00:25:20,000 the values of A and B would be that they were being passed by reference. 531 00:25:20,000 --> 00:25:22,000 If I change this piece of code, 532 00:25:22,000 --> 00:25:25,000 and it's just one tiny change I'm gonna 533 00:25:25,000 --> 00:25:31,000 make here, to add an ampersand on that first, 534 00:25:31,000 --> 00:25:38,000 then I'm gonna get what I wanted. So I want 535 00:25:38,000 --> 00:25:41,000 to draw my reference a little bit differently to remind myself about 536 00:25:41,000 --> 00:25:45,000 what it is. When I call prepend passing new one it got passed normally. New one 537 00:25:45,000 --> 00:25:49,000 is still just an ordinary pointer. So now I have two pointers 538 00:25:49,000 --> 00:25:52,000 where ENT and new one both point to that J. 539 00:25:52,000 --> 00:25:56,000 This one currently doesn't yet point up there. It's gonna in a minute, but I'll go ahead 540 00:25:56,000 --> 00:25:58,000 and put it back to its original state. 541 00:25:58,000 --> 00:25:59,000 Now, first 542 00:25:59,000 --> 00:26:03,000 is gonna be an alias for listhead, so that first 543 00:26:03,000 --> 00:26:09,000 is not going to be a copy of what listhead is. It's actually gonna be a 544 00:26:09,000 --> 00:26:11,000 reference back to here. 545 00:26:11,000 --> 00:26:15,000 And a reference looks a lot like a pointer in the way that I draw it, 546 00:26:15,000 --> 00:26:18,000 but the difference is gonna be I'm gonna shade or, sort of, cross hatch this box 547 00:26:18,000 --> 00:26:21,000 to remind myself that 548 00:26:21,000 --> 00:26:24,000 first is attached a listhead and there's nothing you can do to change it. 549 00:26:24,000 --> 00:26:27,000 That first becomes a synonym for listhead. 550 00:26:27,000 --> 00:26:31,000 That in the context of the prepend function, any access to first is really 551 00:26:31,000 --> 00:26:35,000 just an access to listhead. So trying to read from it or write to it, you know, do you 552 00:26:35,000 --> 00:26:37,000 reference it, it all goes back to the original 553 00:26:37,000 --> 00:26:40,000 that this is just standing in for. 554 00:26:40,000 --> 00:26:43,000 That there actually is a relationship that's permanent from the 555 00:26:43,000 --> 00:26:47,000 time of that call that means that yeah, there really is now new variable first. First is 556 00:26:47,000 --> 00:26:51,000 actually just another name for an existing variable, this one of listhead. 557 00:26:51,000 --> 00:26:55,000 So when I got through the ENT it gets the value - ENT's next gets the value of first. 558 00:26:55,000 --> 00:26:56,000 Well, 559 00:26:56,000 --> 00:26:57,000 first's value really is 560 00:26:57,000 --> 00:27:01,000 what listhead is. So that means it points up to A, 561 00:27:01,000 --> 00:27:02,000 so that part 562 00:27:02,000 --> 00:27:04,000 works as we were hoping before. 563 00:27:04,000 --> 00:27:08,000 Then when I make the change to first equals ENT, 564 00:27:08,000 --> 00:27:10,000 first is a synonym for listhead 565 00:27:10,000 --> 00:27:12,000 and that made listhead 566 00:27:12,000 --> 00:27:15,000 stop pointing to A and 567 00:27:15,000 --> 00:27:18,000 start pointing straight to J. 568 00:27:18,000 --> 00:27:22,000 And so this is still attached here. Let's do that. Let 569 00:27:22,000 --> 00:27:25,000 me see if I can just erase this part and make it a little clearer what's going on. 570 00:27:25,000 --> 00:27:28,000 So new one's pointing to J now, ENT 571 00:27:28,000 --> 00:27:31,000 is pointing to J, 572 00:27:31,000 --> 00:27:33,000 and listhead is pointing to J, 573 00:27:33,000 --> 00:27:34,000 and then this is still 574 00:27:34,000 --> 00:27:37,000 the reference pointing back to there. 575 00:27:37,000 --> 00:27:40,000 So now when prepend goes away, 576 00:27:40,000 --> 00:27:42,000 this extra pointer 577 00:27:42,000 --> 00:27:43,000 is removed, 578 00:27:43,000 --> 00:27:45,000 this reference is removed, 579 00:27:45,000 --> 00:27:47,000 all this space 580 00:27:47,000 --> 00:27:48,000 is reclaimed here in the stack, 581 00:27:48,000 --> 00:27:51,000 but listhead now really points to J, 582 00:27:51,000 --> 00:27:53,000 which points to A, which points to B as before. 583 00:27:53,000 --> 00:27:55,000 So we wired in two pointers, right? 584 00:27:55,000 --> 00:27:59,000 One going out of the new cell into what was previously the first cell and then the 585 00:27:59,000 --> 00:28:00,000 first cell 586 00:28:00,000 --> 00:28:04,000 pointer now points to the new one rather than the original first cell. 587 00:28:04,000 --> 00:28:07,000 It was all a matter of this one little ampersand 588 00:28:07,000 --> 00:28:09,000 that makes the difference, right? 589 00:28:09,000 --> 00:28:12,000 Without that, right? Then we are only operating on a copy. 590 00:28:12,000 --> 00:28:15,000 We're changing our local copy of a pointer to point somewhere new; 591 00:28:15,000 --> 00:28:17,000 having no permanent affect 592 00:28:17,000 --> 00:28:17,000 593 00:28:17,000 --> 00:28:19,000 on the actual state of it. 594 00:28:19,000 --> 00:28:22,000 So, actually, without that ampersand in there it turns out 595 00:28:22,000 --> 00:28:26,000 nothing ever gets added to the list. The listhead starts as null and stays null, 596 00:28:26,000 --> 00:28:29,000 but all these things will copy the value of listhead as a 597 00:28:29,000 --> 00:28:33,000 null, change it in the local context of the called function, but then listhead 598 00:28:33,000 --> 00:28:36,000 will stay null. So, in fact, if I just ran this a bunch of times and I threw in a bunch of 599 00:28:36,000 --> 00:28:37,000 things without the ampersand 600 00:28:37,000 --> 00:28:41,000 I would find that all my cells just get discarded. I'd end up with an 601 00:28:41,000 --> 00:28:42,000 empty list when I was done. All the orphaned out 602 00:28:42,000 --> 00:28:45,000 in the heap 603 00:28:45,000 --> 00:28:49,000 attached and not really recorded permanently. 604 00:28:49,000 --> 00:28:52,000 That's pretty tricky. Question? Why is the 605 00:28:52,000 --> 00:28:55,000 ampersand after this [inaudible]? 606 00:28:55,000 --> 00:28:56,000 Because it is the - the 607 00:28:56,000 --> 00:29:00,000 ampersand applies to the - think of it more as applying to the name and it's saying this 608 00:29:00,000 --> 00:29:04,000 name is a reference to something of that type. So on the left-hand side 609 00:29:04,000 --> 00:29:05,000 is the type. 610 00:29:05,000 --> 00:29:09,000 What type of thing is it? Is it a string of anti-vector events or whatever? And then the 611 00:29:09,000 --> 00:29:13,000 ampersand can go between the type and the name to say and it is a reference to an 612 00:29:13,000 --> 00:29:17,000 existing variable of that type as opposed to a copy of a variable of that type. So 613 00:29:17,000 --> 00:29:19,000 without the ampersand it's always a copy. 614 00:29:19,000 --> 00:29:21,000 With the ampersand, you're saying 615 00:29:21,000 --> 00:29:25,000 I'm introducing first as a synonym for an existing entry star variable somewhere 616 00:29:25,000 --> 00:29:28,000 else and that in the context of this function 617 00:29:28,000 --> 00:29:31,000 access to this named parameter really is reaching out 618 00:29:31,000 --> 00:29:35,000 to change a variable stored elsewhere. 619 00:29:35,000 --> 00:29:41,000 It's quite, quite tricky to kind of get your head around what's going on here. 620 00:29:41,000 --> 00:29:44,000 I'm gonna use this, actually, in running another piece of code, so I want to 621 00:29:44,000 --> 00:29:46,000 be sure that at this point everybody feels 622 00:29:46,000 --> 00:29:54,000 at least reasonably okay with what I just showed you. Question? What happens if we stop 623 00:29:54,000 --> 00:29:58,000 ampersand [inaudible]? Jordan, you want to invert them or something like that? Yes. It turns out that that won't work. 624 00:29:58,000 --> 00:30:02,000 Basically, it won't compile because in this case you can't take a pointer out to an 625 00:30:02,000 --> 00:30:04,000 entry reference. In fact, it just won't let you. So 626 00:30:04,000 --> 00:30:07,000 it doesn't really make sense is the truth. 627 00:30:07,000 --> 00:30:08,000 But think of it as like the - sometimes people 628 00:30:08,000 --> 00:30:12,000 actually will draw it with the ampersand attached even without the 629 00:30:12,000 --> 00:30:15,000 space onto the variable name. It's just a notation. A really good clue that this name, 630 00:30:15,000 --> 00:30:18,000 right? Is a reference to an existing variable and that its type is kind 631 00:30:18,000 --> 00:30:19,000 of 632 00:30:19,000 --> 00:30:21,000 over here. So type, 633 00:30:21,000 --> 00:30:25,000 name, and then name as a reference has that ampersand attached. Question? Is 634 00:30:25,000 --> 00:30:27,000 there a wrong side to passing both the pointers by 635 00:30:27,000 --> 00:30:29,000 reference? Nope, 636 00:30:29,000 --> 00:30:32,000 not really. If I pass it by reference, right? 637 00:30:32,000 --> 00:30:35,000 Then it - nothing would change about what really happens, right? It would still leach into 638 00:30:35,000 --> 00:30:39,000 the thing and copy it. There is - for variables that are large, sometimes we 639 00:30:39,000 --> 00:30:42,000 pass them by reference just to avoid the overhead of a copy. Since a pointer is pretty small, that 640 00:30:42,000 --> 00:30:45,000 making a copy versus taking a reference it turns out doesn't really save you anything and 641 00:30:45,000 --> 00:30:48,000 actually it makes a little bit more work 642 00:30:48,000 --> 00:30:52,000 to access it because it has to kind of go one level out to get it. But 643 00:30:52,000 --> 00:30:54,000 effectively no. In 644 00:30:54,000 --> 00:30:56,000 general though, I would say that 645 00:30:56,000 --> 00:30:58,000 passing things by reference, 646 00:30:58,000 --> 00:31:01,000 as a habit, is probably not one you want to assume for other kinds of variables. 647 00:31:01,000 --> 00:31:02,000 Just 648 00:31:02,000 --> 00:31:05,000 because that once you've passed it by reference you've opened up the access to where if it 649 00:31:05,000 --> 00:31:08,000 changes it it does persistent and if you didn't intend for that that could be kind of a mystical thing 650 00:31:08,000 --> 00:31:09,000 to try to track down. So I 651 00:31:09,000 --> 00:31:11,000 think you should be quite deliberate about - 652 00:31:11,000 --> 00:31:14,000 I plan on changing this and I want to be sure I see that persistent 653 00:31:14,000 --> 00:31:22,000 effect. That I need that pass by reference there. Okay. 654 00:31:22,000 --> 00:31:26,000 So let me go on to kind of solve a problem with the linked list that helps to motivate what a linked 655 00:31:26,000 --> 00:31:30,000 list is good for. Why would you use a linked list versus a 656 00:31:30,000 --> 00:31:35,000 vector or an array style access for something that was a collection? 657 00:31:35,000 --> 00:31:39,000 And so the built in array, which is what vector is built on, so they have the same 658 00:31:39,000 --> 00:31:40,000 characteristics in this respect. 659 00:31:40,000 --> 00:31:42,000 They store elements in contiguous memory. 660 00:31:42,000 --> 00:31:44,000 You allocate an array of ten elements. 661 00:31:44,000 --> 00:31:48,000 There's a block that is ten integers wide, let's say, 662 00:31:48,000 --> 00:31:52,000 where they were all sitting next to each other. In address 1,000 as one, 663 00:31:52,000 --> 00:31:54,000 1,004 is the next, 1,008 and so on. 664 00:31:54,000 --> 00:31:58,000 What that gives you is fast direct access by index. You can ask for the third 665 00:31:58,000 --> 00:32:02,000 member, the seventh member, the 28th member, the 6 millionth member 666 00:32:02,000 --> 00:32:03,000 depending on how big it is, right? 667 00:32:03,000 --> 00:32:05,000 By, sort of, directly computing 668 00:32:05,000 --> 00:32:07,000 where it will be in memory. You know where it starts and you know how big each 669 00:32:07,000 --> 00:32:11,000 element is. Then you can say here's where I can find the 10th element. 670 00:32:11,000 --> 00:32:15,000 That's an advantage, right? To the array vector style of arrangement. 671 00:32:15,000 --> 00:32:16,000 However, 672 00:32:16,000 --> 00:32:20,000 the fact that it's in this big contiguous block is actually very bulky. 673 00:32:20,000 --> 00:32:21,000 It means that 674 00:32:21,000 --> 00:32:26,000 if you were to want to put a new element in the front of an array or vector that 675 00:32:26,000 --> 00:32:27,000 has 1,000 elements 676 00:32:27,000 --> 00:32:30,000 you have to move over the 1,000 elements to make space. If 677 00:32:30,000 --> 00:32:33,000 it's starting to address 1,000 and you have them all laid out and you want to put a new 678 00:32:33,000 --> 00:32:37,000 one in the front, everybody's gotta get picked up and moved over a slot to make 679 00:32:37,000 --> 00:32:39,000 space. So imagine you're all 680 00:32:39,000 --> 00:32:40,000 681 00:32:40,000 --> 00:32:43,000 sitting across a lecture hall and I decided I wanted to put a new person there, everybody 682 00:32:43,000 --> 00:32:45,000 gets up and moves one chair to the left, 683 00:32:45,000 --> 00:32:47,000 but there's not a way to just stick a new chair 684 00:32:47,000 --> 00:32:49,000 on one end of it 685 00:32:49,000 --> 00:32:51,000 using the array style 686 00:32:51,000 --> 00:32:52,000 of layout. 687 00:32:52,000 --> 00:32:55,000 Same for removing, right? So any kind of access to where you want to take 688 00:32:55,000 --> 00:32:59,000 somebody out of the array and close over the gap or insert in the middle and the beginning and 689 00:32:59,000 --> 00:32:59,000 whatnot 690 00:32:59,000 --> 00:33:02,000 requires everybody else moving around in memory, 691 00:33:02,000 --> 00:33:06,000 which gets expensive. Especially for large things, right? That's a lot of work. 692 00:33:06,000 --> 00:33:09,000 It also can't easily grow and shrink. 693 00:33:09,000 --> 00:33:12,000 You allocate an array, your vector, to be a certain size 694 00:33:12,000 --> 00:33:13,000 695 00:33:13,000 --> 00:33:15,000 underneath it. That's what vector's doing on your behalf. 696 00:33:15,000 --> 00:33:19,000 If you have ten elements and now you need to put in ten more 697 00:33:19,000 --> 00:33:22,000 what you need to do is go allocate a piece of memory that's twice as big, copy over the 698 00:33:22,000 --> 00:33:23,000 ten you have, 699 00:33:23,000 --> 00:33:26,000 and now have bigger space. If you need to shrink it down you'll have to make a 700 00:33:26,000 --> 00:33:29,000 smaller piece, copy it down. There's not a way to take a piece of memory and just kind 701 00:33:29,000 --> 00:33:30,000 of 702 00:33:30,000 --> 00:33:33,000 in place in memory where it is kind of extended or shrink it 703 00:33:33,000 --> 00:33:35,000 in the C++ language. 704 00:33:35,000 --> 00:33:38,000 So that the process of growing and shrinking, right? Requires this extra 705 00:33:38,000 --> 00:33:41,000 effort. So behind the scenes that's what vector is doing. When you keep adding to a 706 00:33:41,000 --> 00:33:43,000 vector, eventually it will fill to capacity. It 707 00:33:43,000 --> 00:33:46,000 will internally make more space and copy things over and you're 708 00:33:46,000 --> 00:33:49,000 paying that cost behind the scenes 709 00:33:49,000 --> 00:33:54,000 when you're doing a lot of additions and removals from a vector. The 710 00:33:54,000 --> 00:33:57,000 linked list, right? Uses this wiring them together using pointers, 711 00:33:57,000 --> 00:33:59,000 so it has a lot of flexibility 712 00:33:59,000 --> 00:34:00,000 in it that the array does not. 713 00:34:00,000 --> 00:34:04,000 Each element individually allocated. So that means you can pick and choose when and 714 00:34:04,000 --> 00:34:07,000 where you're ready to take some more space on and when you don't. 715 00:34:07,000 --> 00:34:10,000 So when you're ready to add a new person to the vector there's no worry 716 00:34:10,000 --> 00:34:12,000 about if there's a million elements there and now you don't have any 717 00:34:12,000 --> 00:34:15,000 space you have to copy them. It's like you don't - you can leave the other million elements alone. 718 00:34:15,000 --> 00:34:18,000 You just take the heap, ask for one new element, 719 00:34:18,000 --> 00:34:20,000 and then splice it in. 720 00:34:20,000 --> 00:34:23,000 So the only allocation of the allocation concerns the individual element you're 721 00:34:23,000 --> 00:34:26,000 trying to do something with. You want to take one out of the middle; you just need to delete 722 00:34:26,000 --> 00:34:30,000 that one and close around the gap by wiring the pointers around it. 723 00:34:30,000 --> 00:34:33,000 So all the insert and remove actions, right? Are 724 00:34:33,000 --> 00:34:35,000 only a matter of wiring pointers. 725 00:34:35,000 --> 00:34:37,000 Wiring around something you're taking out, 726 00:34:37,000 --> 00:34:41,000 attaching one to the end or into the middle is splicing a pointer in, and is splicing the pointer 727 00:34:41,000 --> 00:34:45,000 out. So basically you typically have two pointers you need to reassign 728 00:34:45,000 --> 00:34:46,000 to do that kind of adjustment. 729 00:34:46,000 --> 00:34:49,000 If the element has ten members, it has a million members, right? Same amount of work to 730 00:34:49,000 --> 00:34:51,000 put a cell in there. 731 00:34:51,000 --> 00:34:52,000 732 00:34:52,000 --> 00:34:57,000 No dependency on how big it is, right? Causing those operations to bog down. 733 00:34:57,000 --> 00:35:00,000 The downside then is 734 00:35:00,000 --> 00:35:02,000 exactly the, sort of, opposite of where it came out in the array in the vector, 735 00:35:02,000 --> 00:35:06,000 which is you don't have direct access to the 10th element, to the 15th 736 00:35:06,000 --> 00:35:09,000 element, to the 260th element. If I want to see what that 260 of 737 00:35:09,000 --> 00:35:13,000 my linked list is I've got to start at the beginning and walk. 738 00:35:13,000 --> 00:35:17,000 I go next, next, next, next, next, next 260 times and then I will get 739 00:35:17,000 --> 00:35:19,000 to the 260th element. 740 00:35:19,000 --> 00:35:20,000 So I don't have 741 00:35:20,000 --> 00:35:22,000 that easy access to say 742 00:35:22,000 --> 00:35:25,000 right here at this spot because we don't know. They're all over the place 743 00:35:25,000 --> 00:35:28,000 in memory. Each linked list cell is allocated individually and the only way to get to 744 00:35:28,000 --> 00:35:29,000 them 745 00:35:29,000 --> 00:35:32,000 is to follow those links. 746 00:35:32,000 --> 00:35:35,000 Often that's not as bad a disadvantage as it sounds because 747 00:35:35,000 --> 00:35:36,000 typically, right? 748 00:35:36,000 --> 00:35:38,000 That you're doing things like walking down the array or the vector to look at each of 749 00:35:38,000 --> 00:35:41,000 the elements at the end you would also be walking down the linked list while you did it isn't 750 00:35:41,000 --> 00:35:42,000 really that bad. 751 00:35:42,000 --> 00:35:45,000 So in the case where you happen to be storing stuff in an index and you really want to 752 00:35:45,000 --> 00:35:46,000 reach back in there 753 00:35:46,000 --> 00:35:50,000 is where the linked list starts to be a bad call. Can you take it on, like, 754 00:35:50,000 --> 00:35:52,000 from the list and find something? 755 00:35:52,000 --> 00:35:55,000 Well, it can take - the thing is it - with a question with me like a long time. Like if I were 756 00:35:55,000 --> 00:35:59,000 looking through to see if there was an existence of a particular score in a 757 00:35:59,000 --> 00:36:02,000 vector of integers I have to look at them all from the beginning to the end. 758 00:36:02,000 --> 00:36:05,000 Now, the way I can access them is subzero, subone, subtwo, right? 759 00:36:05,000 --> 00:36:09,000 But if I'm walking down a linked list I'm doing the same sort of work. I have to look at each element 760 00:36:09,000 --> 00:36:12,000 but the way I get to them is by traversing these pointers. 761 00:36:12,000 --> 00:36:15,000 It might be that that's a little bit more expensive relative to the 762 00:36:15,000 --> 00:36:18,000 vector, but they're still gonna be about the same. If there's a million elements 763 00:36:18,000 --> 00:36:20,000 they're all gonna look at a million elements and whether it looked at them in contiguous memory 764 00:36:20,000 --> 00:36:24,000 or looked at them by jumping around it ends up being in the end kind 765 00:36:24,000 --> 00:36:27,000 of a very comparable amount of time spent doing those things. 766 00:36:27,000 --> 00:36:31,000 What would be tricky is if I knew I wanted to get to the 100th element. 767 00:36:31,000 --> 00:36:34,000 Like I wasn't interested in looking at the other 99 that preceded it. I really just wanted 768 00:36:34,000 --> 00:36:37,000 to go straight to the thing at slot 100. 769 00:36:37,000 --> 00:36:39,000 The array gives me that access immediately. 770 00:36:39,000 --> 00:36:43,000 Just doing a little calculation it says it's here. Go to this place in memory. 771 00:36:43,000 --> 00:36:47,000 And the linked list would require this walking of 100 steps to get there. 772 00:36:47,000 --> 00:36:49,000 And so if there are situations, right? Where one of these is 773 00:36:49,000 --> 00:36:53,000 just clearly the better answer because you know you're gonna do that operation a lot, right? You're 774 00:36:53,000 --> 00:36:54,000 gonna definitely prefer this. 775 00:36:54,000 --> 00:36:58,000 If you know you're gonna do a lot of inserting and rearranging within the list, 776 00:36:58,000 --> 00:37:00,000 the linked list may buy you 777 00:37:00,000 --> 00:37:03,000 the ease of flexibility of a kind of rewiring 778 00:37:03,000 --> 00:37:10,000 that does not require all this shuffling to move everything around. Question here? But if you were to 779 00:37:10,000 --> 00:37:14,000 know the size of the linked list? Typically a linked list won't limit the size unless you tell it, but that's actually true about an 780 00:37:14,000 --> 00:37:15,000 array too, right? Which is you 781 00:37:15,000 --> 00:37:18,000 - the vector happens to it on your behalf, but underneath it, right? That the 782 00:37:18,000 --> 00:37:22,000 array is tracking how many elements are in there, so would a linked list. So 783 00:37:22,000 --> 00:37:25,000 you could do a manual count when you needed, which would be expensive, or you could 784 00:37:25,000 --> 00:37:27,000 just track along with that 785 00:37:27,000 --> 00:37:29,000 outer most pointer. Well, how many elements have I put in there? 786 00:37:29,000 --> 00:37:31,000 And then update it as you add and remove. So 787 00:37:31,000 --> 00:37:34,000 you'd likely want to cache that if it was important to you to know that. If 788 00:37:34,000 --> 00:37:38,000 a lot of your operations depended on knowing that size you'd probably want to keep it stored 789 00:37:38,000 --> 00:37:41,000 somewhere. Over here? 790 00:37:41,000 --> 00:37:42,000 Why does 791 00:37:42,000 --> 00:37:45,000 the vector use arrays then instead of linked 792 00:37:45,000 --> 00:37:46,000 list? 793 00:37:46,000 --> 00:37:49,000 That is a great question. It's one that we actually will talk about kind of in about a week. 794 00:37:49,000 --> 00:37:52,000 It's like, well, why? So sometimes array - the vector is kind of taking the 795 00:37:52,000 --> 00:37:56,000 path of, well, it's just presenting you to the array in a slightly more convenient 796 00:37:56,000 --> 00:37:56,000 form. 797 00:37:56,000 --> 00:37:58,000 We will see situations where 798 00:37:58,000 --> 00:38:01,000 then the vector is just as bad as a choice as an array and then we'll 799 00:38:01,000 --> 00:38:05,000 see, well, why we might prefer to use a linked list and we will have - we will build linked 800 00:38:05,000 --> 00:38:06,000 lists instead basically. 801 00:38:06,000 --> 00:38:10,000 It has to make a choice and it turns out that for most usages 802 00:38:10,000 --> 00:38:11,000 the array 803 00:38:11,000 --> 00:38:15,000 is a pretty good match for a lot of ordinary tasks. The linked list happens to be a match 804 00:38:15,000 --> 00:38:18,000 for other tasks, right? And then we will build stuff out of that when 805 00:38:18,000 --> 00:38:20,000 that context comes up. Well, if you just, like, 806 00:38:20,000 --> 00:38:22,000 then you just, sort of, like 807 00:38:22,000 --> 00:38:25,000 in a vector class, like encapsulate all that and I - its final 808 00:38:25,000 --> 00:38:29,000 functionality would be the same. You certainly could. So you could totally build a vector class that 809 00:38:29,000 --> 00:38:31,000 actually was backed by a linked list and we will see how to do that, 810 00:38:31,000 --> 00:38:34,000 right? And then it would have different characteristics in terms of what 811 00:38:34,000 --> 00:38:36,000 operations were efficient and which ones were inefficient 812 00:38:36,000 --> 00:38:39,000 relative to do an array backed one. And 813 00:38:39,000 --> 00:38:41,000 over here was also. Yeah? 814 00:38:41,000 --> 00:38:43,000 Well, I, just to - for using linked 815 00:38:43,000 --> 00:38:44,000 lists, like, 816 00:38:44,000 --> 00:38:48,000 so first in, first out or first in, last out 817 00:38:48,000 --> 00:38:48,000 seems to make 818 00:38:48,000 --> 00:38:51,000 really good sense, but are there - am I 819 00:38:51,000 --> 00:38:52,000 like 820 00:38:52,000 --> 00:38:55,000 missing - You are way ahead of us, but, yeah, you're right on. 821 00:38:55,000 --> 00:38:57,000 The things like the stack and the Q, right? 822 00:38:57,000 --> 00:39:00,000 Seem like they're gonna be natural things that will work very, very well with 823 00:39:00,000 --> 00:39:01,000 a linked list. 824 00:39:01,000 --> 00:39:04,000 Things that required a lot of shuffling in the middle, right? It's a 825 00:39:04,000 --> 00:39:08,000 little unclear because you have to find the place in the middle, which is expensive, 826 00:39:08,000 --> 00:39:11,000 but then the insert is fast. The question is, well, which of these is actually gonna 827 00:39:11,000 --> 00:39:13,000 work out better? They both have 828 00:39:13,000 --> 00:39:16,000 some things they can do well and some things they can't do well. So it will have to have 829 00:39:16,000 --> 00:39:19,000 know maybe a little bit more about the context to be sure which ones seem 830 00:39:19,000 --> 00:39:22,000 like the best solution and maybe in the end it's just a coin toss, right? 831 00:39:22,000 --> 00:39:25,000 Some parts will be fast and some parts could be slow and anyway I can get the inverted form 832 00:39:25,000 --> 00:39:26,000 of that 833 00:39:26,000 --> 00:39:28,000 as an alternative that 834 00:39:28,000 --> 00:39:32,000 doesn't dominate in any clear sense. It's just like, well, they both have advantages 835 00:39:32,000 --> 00:39:35,000 and disadvantages. Question 836 00:39:35,000 --> 00:39:36,000 back here? Why 837 00:39:36,000 --> 00:39:37,000 was 838 00:39:37,000 --> 00:39:38,000 that infused 839 00:39:38,000 --> 00:39:41,000 to be based on a vector instead of a linked list? 840 00:39:41,000 --> 00:39:44,000 So they - I think your question is kind of like, yeah, they likely are built on linked lists. 841 00:39:44,000 --> 00:39:48,000 We're gonna see how we can do it both ways, right? We'll see how we can do it on array, we'll see how we can do it 842 00:39:48,000 --> 00:39:50,000 on a linked list, and then we'll talk about what those tradeoffs look like. Well, which one 843 00:39:50,000 --> 00:39:54,000 ends up being the better call 844 00:39:54,000 --> 00:39:57,000 when we're there? So that will be about a week. So don't worry too much about it now, 845 00:39:57,000 --> 00:40:00,000 but when we get there we'll definitely see kind of both variants and then we'll 846 00:40:00,000 --> 00:40:06,000 talk about them in great detail. Okay. 847 00:40:06,000 --> 00:40:10,000 So I'm gonna show you, just the last operation I want to show you on a linked list 848 00:40:10,000 --> 00:40:12,000 is doing an insert in sorted order and this 849 00:40:12,000 --> 00:40:15,000 is gonna require some pretty careful 850 00:40:15,000 --> 00:40:17,000 understanding of what's going on with the pointers and then we're also gonna 851 00:40:17,000 --> 00:40:20,000 see a recursive form of it that actually is a 852 00:40:20,000 --> 00:40:22,000 really clever 853 00:40:22,000 --> 00:40:25,000 and very elegant way to solve the same problem. So I'm gonna first do it 854 00:40:25,000 --> 00:40:25,000 iteratively 855 00:40:25,000 --> 00:40:28,000 just to kind of talk about. What's the mechanism for doing stuff iteratively on 856 00:40:28,000 --> 00:40:32,000 a linked list? It gets a little bit messy and so just be prepared for that. 857 00:40:32,000 --> 00:40:36,000 That I'm planning on taking my address book and building it up inserted order. So 858 00:40:36,000 --> 00:40:40,000 instead of doing a prepend where I just take each cell and put it on the front I'm actually assume 859 00:40:40,000 --> 00:40:42,000 that I'm gonna keep them all in order by name 860 00:40:42,000 --> 00:40:45,000 and whenever I get a new cell I want to put it in the right position relative to 861 00:40:45,000 --> 00:40:48,000 the other neighbors in the list that are all ready there. 862 00:40:48,000 --> 00:40:51,000 So this is one of those places where it seems like the linked list is actually gonna do very well. I have 863 00:40:51,000 --> 00:40:53,000 to search to find the spot. 864 00:40:53,000 --> 00:40:55,000 Okay? Well, you're gonna have to search to find the spot in a vector, too. 865 00:40:55,000 --> 00:40:59,000 Once I find the spot it should be very easy to just wire it in 866 00:40:59,000 --> 00:41:02,000 without shuffling and rearranging anything around. It should be a pointer 867 00:41:02,000 --> 00:41:04,000 in and a pointer out. 868 00:41:04,000 --> 00:41:06,000 So I'm gonna start this insert sorted. 869 00:41:06,000 --> 00:41:08,000 It's gonna take the 870 00:41:08,000 --> 00:41:11,000 head of the list, right? As a bi-reference parameter because there are 871 00:41:11,000 --> 00:41:14,000 situations, right? Where we're gonna be changing where the list points to when the cells in the 872 00:41:14,000 --> 00:41:15,000 first one 873 00:41:15,000 --> 00:41:18,000 and then the new cell that's coming in is the second parameter. 874 00:41:18,000 --> 00:41:21,000 So we're gonna search to find the place where it goes. 875 00:41:21,000 --> 00:41:24,000 So let's watch this go down it's 876 00:41:24,000 --> 00:41:28,000 loop. I do a four loop here that's like, well, Kerr starts this list while it's not null, 877 00:41:28,000 --> 00:41:29,000 advance by this, 878 00:41:29,000 --> 00:41:34,000 and if the cell that I'm trying to place, which in this case is the cell Rain, 879 00:41:34,000 --> 00:41:36,000 proceeds the one we're currently looking at, 880 00:41:36,000 --> 00:41:38,000 then this must be the place that goes in the list. 881 00:41:38,000 --> 00:41:41,000 So I look at Rain, I compare it to Cullaf 882 00:41:41,000 --> 00:41:44,000 and if Rain goes in front of Cullaf then this is where Rain should go in the list. 883 00:41:44,000 --> 00:41:47,000 Well, it doesn't actually proceed it, 884 00:41:47,000 --> 00:41:48,000 so I advance again. Then 885 00:41:48,000 --> 00:41:51,000 I look at Cullaf - Rain versus Lowery. 886 00:41:51,000 --> 00:41:54,000 Does Rain proceed Lowery? It does not. So I just keep going. So I'm gonna break out of 887 00:41:54,000 --> 00:41:56,000 the loop as soon as I get to 888 00:41:56,000 --> 00:41:59,000 the place I compare Rain to Matt. 889 00:41:59,000 --> 00:42:00,000 Doesn't come out. 890 00:42:00,000 --> 00:42:03,000 I compare Rain to Thomas. 891 00:42:03,000 --> 00:42:07,000 Now, this is the first time when new ones name came up as being less than 892 00:42:07,000 --> 00:42:08,000 Kerr's name. 893 00:42:08,000 --> 00:42:13,000 So that says, okay, well, Rain needs to go in front of Thomas. 894 00:42:13,000 --> 00:42:15,000 Okay. All is well and good 895 00:42:15,000 --> 00:42:18,000 almost. 896 00:42:18,000 --> 00:42:20,000 I have a pointer to Thomas. 897 00:42:20,000 --> 00:42:23,000 That's where my curve points to, right? 898 00:42:23,000 --> 00:42:29,000 And I want to put Rain kind of in the list right in front of Thomas. How do 899 00:42:29,000 --> 00:42:38,000 I get from Thomas to Matt? But it's 900 00:42:38,000 --> 00:42:41,000 not just enough to know Thomas, right? Because Thomas tells me like what's gonna follow 901 00:42:41,000 --> 00:42:42,000 Rain. 902 00:42:42,000 --> 00:42:44,000 So this is the cell where 903 00:42:44,000 --> 00:42:47,000 Rain's next is gonna point to where Thomas is because Rain is kind of gonna replace 904 00:42:47,000 --> 00:42:51,000 Thomas in the list and push Thomas one element down, 905 00:42:51,000 --> 00:42:55,000 but the situation is that I also need to know what proceeds Thomas. The 906 00:42:55,000 --> 00:42:58,000 cell that led into Thomas is the one whose next field needs to get reassigned 907 00:42:58,000 --> 00:43:00,000 to point to Rain. 908 00:43:00,000 --> 00:43:01,000 So the way the loop is going, right? 909 00:43:01,000 --> 00:43:04,000 It has to go kind of one too far 910 00:43:04,000 --> 00:43:07,000 to know that it was there because, like, if I look at Matt I can say, well, does Rain go 911 00:43:07,000 --> 00:43:10,000 after Matt? Yes, it does. Right? But that's not good enough to know that it goes 912 00:43:10,000 --> 00:43:13,000 right after Matt. The only way I can tell if it goes right after Matt is to say, 913 00:43:13,000 --> 00:43:17,000 well, if the cell that comes up next, right? Is behind Rain is that it goes in between 914 00:43:17,000 --> 00:43:21,000 these two. 915 00:43:21,000 --> 00:43:22,000 [Inaudible] Thomas and then 916 00:43:22,000 --> 00:43:24,000 [inaudible] Thomas? 917 00:43:24,000 --> 00:43:27,000 Yes. So at least I can do it by maybe kind of 918 00:43:27,000 --> 00:43:30,000 overwriting Thomas with Rain and then overwriting Rain with Thomas and kind of 919 00:43:30,000 --> 00:43:31,000 wiring them up. I'm gonna try to avoid that. 920 00:43:31,000 --> 00:43:34,000 I'm gonna actually say, well, let's just think about can we do it with 921 00:43:34,000 --> 00:43:37,000 leaving the information in the cells where it is? 922 00:43:37,000 --> 00:43:40,000 I don't know who else might point to Thomas, right? It would seem kind of weird if all of 923 00:43:40,000 --> 00:43:44,000 a sudden Thomas turned into Rain in some other way. That I don't know for sure 924 00:43:44,000 --> 00:43:47,000 who else is using this pointer. So lets assume that I can't do that 925 00:43:47,000 --> 00:43:50,000 variation of it, but I'd like to put Rain in between Matt and Thomas 926 00:43:50,000 --> 00:43:53,000 and the pointer I have is not quite good enough. 927 00:43:53,000 --> 00:43:56,000 The cells as is, right? Have this asymmetry. They know what follows or 928 00:43:56,000 --> 00:43:58,000 not. What precedes them. 929 00:43:58,000 --> 00:44:02,000 So, in fact, what I'm gonna change here is I'm going to run two pointers down 930 00:44:02,000 --> 00:44:03,000 the list, 931 00:44:03,000 --> 00:44:05,000 kind of in parallel, 932 00:44:05,000 --> 00:44:08,000 where they're always one step apart. 933 00:44:08,000 --> 00:44:11,000 Then I'm gonna track what was the one I last looked at 934 00:44:11,000 --> 00:44:13,000 and what is the one I'm currently looking at 935 00:44:13,000 --> 00:44:16,000 and I call this dragging a previous pointer 936 00:44:16,000 --> 00:44:19,000 or trailing a previous pointer behind. 937 00:44:19,000 --> 00:44:20,000 So 938 00:44:20,000 --> 00:44:24,000 each time I'm about to advance Kerr equals Kerr next, the thing I'll do right 939 00:44:24,000 --> 00:44:27,000 before it is to assign previous to be what Kerr is now. 940 00:44:27,000 --> 00:44:31,000 So then it'll advance to the next one, test against null, and keep going. So at 941 00:44:31,000 --> 00:44:33,000 any given state, right? 942 00:44:33,000 --> 00:44:36,000 Previous is exactly one behind where Kerr is 943 00:44:36,000 --> 00:44:39,000 and there's one special case, right? Which is that 944 00:44:39,000 --> 00:44:44,000 at the very first time through the loop, right? Kerr is set to be the list. Well, 945 00:44:44,000 --> 00:44:48,000 what's the previous of the head of the list? There actually is no previous, so actually 946 00:44:48,000 --> 00:44:50,000 initialize previous to be null. 947 00:44:50,000 --> 00:44:52,000 So the first time through, right? 948 00:44:52,000 --> 00:44:55,000 We've assigned previous to null and Kerr gets to go to the first one and then we 949 00:44:55,000 --> 00:44:57,000 say, well, is Cullaf - 950 00:44:57,000 --> 00:44:58,000 does 951 00:44:58,000 --> 00:45:00,000 Rain come before Cullaf? No. 952 00:45:00,000 --> 00:45:04,000 And so I'll advance both so move previous up to where Kerr is and then move 953 00:45:04,000 --> 00:45:06,000 Kerr up to the next one. 954 00:45:06,000 --> 00:45:08,000 So now they're pointing at Cullaf and Lowery together 955 00:45:08,000 --> 00:45:12,000 and, again, I'm comparing Lowery to Rain. Does Rain go in front of Lowery? No. So then I 956 00:45:12,000 --> 00:45:15,000 move up previous and move up Kerr. 957 00:45:15,000 --> 00:45:17,000 Does Rain go in front of Matt? 958 00:45:17,000 --> 00:45:19,000 No. I advance them both, right? 959 00:45:19,000 --> 00:45:21,000 And then I have 960 00:45:21,000 --> 00:45:22,000 961 00:45:22,000 --> 00:45:25,000 previous at the Matt entry, Kerr at the Thomas entry, 962 00:45:25,000 --> 00:45:28,000 and then Rain does go right in between those, right? 963 00:45:28,000 --> 00:45:32,000 We know that it didn't precede Matt because we all ready made it through the loop. 964 00:45:32,000 --> 00:45:37,000 We know it does precede Thomas and that tells us it goes exactly in between them, right? 965 00:45:37,000 --> 00:45:41,000 That Matt should lead into Rain and Rain should lead into Thomas and that will have 966 00:45:41,000 --> 00:45:44,000 spliced it in the list at the right position 967 00:45:44,000 --> 00:45:45,000 relative to the 968 00:45:45,000 --> 00:45:47,000 other sorted entries. So I 969 00:45:47,000 --> 00:45:55,000 give myself just a little note here. Well, what are the possible values for previous? 970 00:45:55,000 --> 00:46:00,000 If the cell is gonna go anywhere in the list other than the very front, 971 00:46:00,000 --> 00:46:05,000 so as long as Rain follows at least one cell on the list, it's not the 972 00:46:05,000 --> 00:46:05,000 alphabetically 973 00:46:05,000 --> 00:46:09,000 first cell of all the ones I've seen, then previous will be some valid cell 974 00:46:09,000 --> 00:46:12,000 and potentially Kerr could be null, 975 00:46:12,000 --> 00:46:17,000 but there is a non-null prev in all of those cases. 976 00:46:17,000 --> 00:46:20,000 There's also the possibility though that previous is null 977 00:46:20,000 --> 00:46:23,000 in the case where the new cell was the alphabetically first. So if that one actually began 978 00:46:23,000 --> 00:46:27,000 with an A, let's say it was Alena. 979 00:46:27,000 --> 00:46:30,000 Then the first test would be if Alena is less then Cullaf. It is. It would 980 00:46:30,000 --> 00:46:31,000 immediately break and 981 00:46:31,000 --> 00:46:35,000 so previous would be null, Kerr would point to the front most cell, 982 00:46:35,000 --> 00:46:38,000 and I've got a little bit of a special case there 983 00:46:38,000 --> 00:46:43,000 where inserting at the front of the list isn't quite the same thing as inserting 984 00:46:43,000 --> 00:46:44,000 further down. 985 00:46:44,000 --> 00:46:49,000 So let me look at the code that I added to this and I ran out of room to draw pictures, so we'll have to do a little 986 00:46:49,000 --> 00:46:51,000 bit on the board here. 987 00:46:51,000 --> 00:46:54,000 That there are two pointers we need to assign 988 00:46:54,000 --> 00:46:55,000 for that 989 00:46:55,000 --> 00:47:01,000 new cell that's coming in. One is it's outgoing one 990 00:47:01,000 --> 00:47:06,000 and then another is it's incoming one. And so 991 00:47:06,000 --> 00:47:11,000 I have this list and, 992 00:47:11,000 --> 00:47:16,000 let's say, this new cell is going here. So this is like K, L, M, 993 00:47:16,000 --> 00:47:17,000 T, R. 994 00:47:17,000 --> 00:47:22,000 That the first line splices in the outgoing pointer in all cases. It says 995 00:47:22,000 --> 00:47:24,000 that the new ones next field is Kerr. 996 00:47:24,000 --> 00:47:29,000 So Kerr is the first cell, right? That we found that follows the cell we're trying 997 00:47:29,000 --> 00:47:32,000 to insert. It might be that Kerr is null, 998 00:47:32,000 --> 00:47:33,000 but that's totally fine too. 999 00:47:33,000 --> 00:47:37,000 In this case it happened to point - this is where Kerr was and this is where previous 1000 00:47:37,000 --> 00:47:39,000 was. 1001 00:47:39,000 --> 00:47:41,000 That it will make 1002 00:47:41,000 --> 00:47:44,000 R point to T, so that is the outgoing pointer for the new cell 1003 00:47:44,000 --> 00:47:46,000 and now there are two cases here. 1004 00:47:46,000 --> 00:47:51,000 The ordinary case is that there is some previous cell. In which case the previous 1005 00:47:51,000 --> 00:47:54,000 cell's next field needs to get updated to point to the new one. 1006 00:47:54,000 --> 00:47:58,000 So we use the pointer to this one. It's next field 1007 00:47:58,000 --> 00:48:00,000 no longer points to T, 1008 00:48:00,000 --> 00:48:05,000 but instead points to R. And so I inserted R in between M and T with that 1009 00:48:05,000 --> 00:48:06,000 adjustment. 1010 00:48:06,000 --> 00:48:09,000 In the case where the new cell, 1011 00:48:09,000 --> 00:48:11,000 let's say, was the A, 1012 00:48:11,000 --> 00:48:13,000 then Kerr would be here 1013 00:48:13,000 --> 00:48:17,000 and previous would be null. 1014 00:48:17,000 --> 00:48:20,000 I will set A as next field to the Kerr, 1015 00:48:20,000 --> 00:48:23,000 which spliced the outgoing pointer of A 1016 00:48:23,000 --> 00:48:25,000 to attach to the rest of the list. 1017 00:48:25,000 --> 00:48:29,000 But here's the special case. There is no previous next field that's gonna point to A. 1018 00:48:29,000 --> 00:48:32,000 A actually needs to get reassigned to be the front most cell of the list. So it 1019 00:48:32,000 --> 00:48:36,000 looks like our prepend operation where I'm saying list equals new one 1020 00:48:36,000 --> 00:48:40,000 and the list is some reference parameter 1021 00:48:40,000 --> 00:48:42,000 to some list elsewhere 1022 00:48:42,000 --> 00:48:46,000 that then gets permanently set to the new cell. So I would call insert sorted 1023 00:48:46,000 --> 00:48:49,000 from, like, the build address book function 1024 00:48:49,000 --> 00:48:52,000 to attach it to the very front in this case. 1025 00:48:52,000 --> 00:48:56,000 This is very common in linked list code, right? Is to have there be a bit of a special 1026 00:48:56,000 --> 00:49:00,000 case about either the first cell or the last cell or on the empty list, right? But 1027 00:49:00,000 --> 00:49:03,000 that there is a large 1028 00:49:03,000 --> 00:49:06,000 - all the cells in the interior kind of behave the same. They have a cell in 1029 00:49:06,000 --> 00:49:10,000 front of them and a cell in back of them and that insert has certain properties, but 1030 00:49:10,000 --> 00:49:13,000 there's a little bit of an oddness with that maybe that first cell is sometimes the 1031 00:49:13,000 --> 00:49:16,000 very last cell that requires a little special case handling. 1032 00:49:16,000 --> 00:49:20,000 In this case, right? Putting A in the front means somebody doesn't point 1033 00:49:20,000 --> 00:49:22,000 into A. It's the listhead that points to A 1034 00:49:22,000 --> 00:49:24,000 and so it's not somebody's next field 1035 00:49:24,000 --> 00:49:30,000 that we're updating. It really is the listhead pointer itself. 1036 00:49:30,000 --> 00:49:32,000 Questions about that one? 1037 00:49:32,000 --> 00:49:35,000 So I need to show you the recursive one. I'm not gonna be able to talk about it very much, but 1038 00:49:35,000 --> 00:49:38,000 I'm gonna encourage you to take a look at it. 1039 00:49:38,000 --> 00:49:40,000 Which is the 1040 00:49:40,000 --> 00:49:41,000 1041 00:49:41,000 --> 00:49:46,000 same activity of inserting a cell 1042 00:49:46,000 --> 00:49:48,000 into a list that's being kept in sorted order. 1043 00:49:48,000 --> 00:49:53,000 Now, not using this iterative, not using this dragging of our previous pointer, 1044 00:49:53,000 --> 00:49:58,000 but instead using a strong dense recursive powerful formulation here 1045 00:49:58,000 --> 00:50:01,000 that basically has the idea that says, well, 1046 00:50:01,000 --> 00:50:03,000 given a list that I'm inserting into, 1047 00:50:03,000 --> 00:50:06,000 let's say it's the M through Z part of the list. 1048 00:50:06,000 --> 00:50:11,000 I'm trying to insert R is I compare R to the front of the list and I say, 1049 00:50:11,000 --> 00:50:14,000 well, does that become the new front of this list? That's what my base case is 1050 00:50:14,000 --> 00:50:18,000 looking at. Is the list empty or does this one go in front of it? If so, prepend it. 1051 00:50:18,000 --> 00:50:19,000 Otherwise 1052 00:50:19,000 --> 00:50:22,000 just insert it in the remainder of this list. 1053 00:50:22,000 --> 00:50:24,000 So if it doesn't insert 1054 00:50:24,000 --> 00:50:28,000 at the front of M then we pass the N or the O whatever is following it and 1055 00:50:28,000 --> 00:50:31,000 have it kind of recursively insert into the 1056 00:50:31,000 --> 00:50:34,000 remainder of the list. Eventually it will hit the space case 1057 00:50:34,000 --> 00:50:38,000 when it either gets to the end of the list or when there is 1058 00:50:38,000 --> 00:50:40,000 the cell we're trying to insert 1059 00:50:40,000 --> 00:50:43,000 belongs in front of these remaining cells. In which case, we do a prepend right 1060 00:50:43,000 --> 00:50:44,000 there 1061 00:50:44,000 --> 00:50:46,000 and stop the recursion. 1062 00:50:46,000 --> 00:50:49,000 A dense little piece of code. One I'll 1063 00:50:49,000 --> 00:50:53,000 encourage you to kind of trace a little bit on your own to kind of 1064 00:50:53,000 --> 00:50:56,000 get an idea of how it's working, but it kind of shows in this case that the recursion is 1065 00:50:56,000 --> 00:50:57,000 really buying you something 1066 00:50:57,000 --> 00:51:00,000 by making it, I think, a much simpler and more direct way to 1067 00:51:00,000 --> 00:51:02,000 express what you want to do and then kind of the 1068 00:51:02,000 --> 00:51:06,000 more mechanical means of doing it iteratively. 1069 00:51:06,000 --> 00:51:08,000 What we will talk about on Monday 1070 00:51:08,000 --> 00:51:12,000 will be algorithms. We'll talk about algorithms, Chapter 7. Start talking about 1071 00:51:12,000 --> 00:51:16,000 Big O and formal announcements algorithms and do a little bit of sorting. So I will see 1072 00:51:16,000 --> 00:51:18,000 you on Monday 1073 00:51:18,000 --> 00:51:23,000 and have a good weekend.