1 00:00:00,000 --> 00:00:08,000 2 00:00:08,000 --> 00:00:12,000 This presentation is delivered by the Stanford Center for Professional 3 00:00:12,000 --> 00:00:16,000 Development. 4 00:00:22,000 --> 00:00:26,000 Okay. Well welcome, welcome. We got a couple parents who have braved the 5 00:00:26,000 --> 00:00:29,000 other activities of Parents Weekend to come here, so thank you for coming to see what 6 00:00:29,000 --> 00:00:32,000 it is that we do here in 7 00:00:32,000 --> 00:00:36,000 CS106B. We are still in the process of grading the midterm. In fact, most of our grading, we're planning on 8 00:00:36,000 --> 00:00:40,000 taking care of tomorrow, no, Sunday, in some big, massive grading sessions. So with any 9 00:00:40,000 --> 00:00:42,000 luck, if it all goes well, we'll actually have them 10 00:00:42,000 --> 00:00:45,000 ready to go back on Monday, but if somewhere it turns into some disaster, it may take us a 11 00:00:45,000 --> 00:00:48,000 little longer, but hopefully. That's the plan. 12 00:00:48,000 --> 00:00:51,000 So I'm gonna finish up a couple slides from last time, 13 00:00:51,000 --> 00:00:54,000 and then I'm gonna mostly go on to do some code with you together. 14 00:00:54,000 --> 00:00:56,000 But I just wanted to 15 00:00:56,000 --> 00:00:56,000 16 00:00:56,000 --> 00:01:00,000 try to give you some perspective on the value of abstraction and the idea of an 17 00:01:00,000 --> 00:01:03,000 abstract data type, or an ADT, and why 18 00:01:03,000 --> 00:01:07,000 this is such a powerful and important concept in management of complexity. 19 00:01:07,000 --> 00:01:09,000 So we saw this, for example, 20 00:01:09,000 --> 00:01:13,000 in those first couple assignments that the client uses classes as 21 00:01:13,000 --> 00:01:14,000 an abstraction. 22 00:01:14,000 --> 00:01:18,000 You have the need to manage something that has a queue-like behavior: first in, 23 00:01:18,000 --> 00:01:19,000 first out. 24 00:01:19,000 --> 00:01:23,000 So what you want is something that you can put things in that will enforce that: 25 00:01:23,000 --> 00:01:26,000 put things at the end of the line, always return to the head of the line. 26 00:01:26,000 --> 00:01:29,000 And that how it managed what it did and what went on behind the scenes 27 00:01:29,000 --> 00:01:33,000 wasn't something we had to worry about or even concern ourselves with. And in 28 00:01:33,000 --> 00:01:36,000 fact, even if we wanted to, we couldn't muck around in there. Like 29 00:01:36,000 --> 00:01:39,000 it wasn't our business, it was maintained privately. 30 00:01:39,000 --> 00:01:44,000 And that is a real advantage for both sides, right? That the client 31 00:01:44,000 --> 00:01:47,000 doesn't have to worry about it and can't even get in the way of us, that we 32 00:01:47,000 --> 00:01:50,000 can work independently and get our things done. 33 00:01:50,000 --> 00:01:54,000 And so one sort of piece of terminology we often use here, we talk about this wall of 34 00:01:54,000 --> 00:01:55,000 abstraction. 35 00:01:55,000 --> 00:01:57,000 That there is kind of a 36 00:01:57,000 --> 00:01:58,000 real 37 00:01:58,000 --> 00:02:00,000 block 38 00:02:00,000 --> 00:02:04,000 that prevents the two of us from interfering with each other's process, as 39 00:02:04,000 --> 00:02:06,000 part of, you know, combining to build a program together. 40 00:02:06,000 --> 00:02:10,000 And there's a little tiny chink in that wall that we call the interface. 41 00:02:10,000 --> 00:02:13,000 And that's the way that you speak to the stack and ask it to do things on your 42 00:02:13,000 --> 00:02:16,000 behalf, and it listens to your requests and performs them. 43 00:02:16,000 --> 00:02:19,000 And so if you think about the lexicon class, which we used in the Boggle and 44 00:02:19,000 --> 00:02:22,000 in the recursion assignment that managed a word list, 45 00:02:22,000 --> 00:02:26,000 the abstraction is, yeah, you say, ''Is this word in there? Add this word to it. Load 46 00:02:26,000 --> 00:02:29,000 the words from a file.'' How does it store them? How does it do stuff? Did anybody 47 00:02:29,000 --> 00:02:33,000 go open up the lexicon.cpp file just to see? 48 00:02:33,000 --> 00:02:34,000 Anybody who was curious? And 49 00:02:34,000 --> 00:02:37,000 what did you find out? I just - 50 00:02:37,000 --> 00:02:40,000 I 51 00:02:40,000 --> 00:02:40,000 think 52 00:02:40,000 --> 00:02:44,000 it ended up in there [inaudible]. You ended up in there and when you got in there, what did you decide to do? 53 00:02:44,000 --> 00:02:45,000 Leave. Leave. 54 00:02:45,000 --> 00:02:48,000 Yeah. Did anybody else open it up and have the same sort of reaction? Over 55 00:02:48,000 --> 00:02:49,000 here, 56 00:02:49,000 --> 00:02:54,000 what did you think? I didn't really understand. Yeah. And what did you see? It was a mess. It was a mess. Who wrote that code? My, gosh, she should 57 00:02:54,000 --> 00:02:57,000 be fired. 58 00:02:57,000 --> 00:03:00,000 It's scary. It does something kind of scary. We'll talk about it. Actually, at the end, we'll come back here because I think it's actually 59 00:03:00,000 --> 00:03:02,000 a really neat class to study. 60 00:03:02,000 --> 00:03:04,000 But in fact, like you open it up and you're like, 61 00:03:04,000 --> 00:03:09,000 ''I don't want to be here. I want to use a word list. Let me close this file and let me go back to word list. Add word, 62 00:03:09,000 --> 00:03:11,000 contains word, okay.'' 63 00:03:11,000 --> 00:03:14,000 And you're happy about that. Right? It does something very complex that turns out to 64 00:03:14,000 --> 00:03:18,000 be very efficient and optimize for the task at hand. But back to Boggle, 65 00:03:18,000 --> 00:03:20,000 you don't want to be worrying about that; you got other things on your 66 00:03:20,000 --> 00:03:25,000 plate and that's a fine place to just poke your nose right back out of. So if you haven't had a 67 00:03:25,000 --> 00:03:28,000 chance to look at it, because we will talk about it later, but the person who wrote it, 68 00:03:28,000 --> 00:03:30,000 was a crummy commoner. I'm just 69 00:03:30,000 --> 00:03:32,000 saying 70 00:03:32,000 --> 00:03:36,000 that they would definitely not be getting a check-plus on style, so. 71 00:03:36,000 --> 00:03:38,000 So I drew this picture. It's like this wall. Right? 72 00:03:38,000 --> 00:03:42,000 So when you made a lexicon, you say, ''Oh, I want a lexicon. Add this word. Add that word. Does it 73 00:03:42,000 --> 00:03:44,000 contain this word?'' 74 00:03:44,000 --> 00:03:47,000 And that there is this big wall that - and you think of what's on the other side as a black box. The 75 00:03:47,000 --> 00:03:50,000 black box is the microwave that you push buttons and food gets hot. How does 76 00:03:50,000 --> 00:03:53,000 it really work? Ah, you know, who knows. 77 00:03:53,000 --> 00:03:55,000 You don't open up the back. You don't dig around in it. 78 00:03:55,000 --> 00:03:58,000 And this little chink here, that's the interface. And the 79 00:03:58,000 --> 00:04:01,000 lexicon provides a very small interface, if you remember, adding the words, checking 80 00:04:01,000 --> 00:04:04,000 contains word and prefix, reading 81 00:04:04,000 --> 00:04:07,000 the words from a file, really not much else beyond that. 82 00:04:07,000 --> 00:04:11,000 And then now what we're starting to think about is, ''Well, what's this other side look like?'' What goes over 83 00:04:11,000 --> 00:04:13,000 here 84 00:04:13,000 --> 00:04:15,000 is that there is this implementer 85 00:04:15,000 --> 00:04:18,000 who does know the internal structure. Who does know what shenanigans are being 86 00:04:18,000 --> 00:04:21,000 pulled on that inside, who does have access to that private data, 87 00:04:21,000 --> 00:04:25,000 and who, upon request when you ask it add a word or look up a word or look up a 88 00:04:25,000 --> 00:04:29,000 prefix, internally trolls through that private information and returns 89 00:04:29,000 --> 00:04:31,000 information back to the client, 90 00:04:31,000 --> 00:04:35,000 having mucked around in those internals. So when you say add a word, 91 00:04:35,000 --> 00:04:38,000 maybe what its got is some array with a count of how many words are 92 00:04:38,000 --> 00:04:41,000 used, and it sticks it in the next slot of the array 93 00:04:41,000 --> 00:04:43,000 and updates its counter. 94 00:04:43,000 --> 00:04:46,000 And we don't know what it does but it's not really our business as the client, but the 95 00:04:46,000 --> 00:04:47,000 implementer has 96 00:04:47,000 --> 00:04:48,000 kind of the full 97 00:04:48,000 --> 00:04:50,000 picture of that. 98 00:04:50,000 --> 00:04:53,000 And then over here on this side, if the client attempts to do the kind of things 99 00:04:53,000 --> 00:04:57,000 that are really implementer specific, it tries to access directly the num words 100 00:04:57,000 --> 00:04:58,000 and the words array 101 00:04:58,000 --> 00:05:01,000 to go in and say, ''Yeah, I'd like to put pig in the array. How about I do this? How 102 00:05:01,000 --> 00:05:04,000 about I change the number of words? Or how about I stick it in the slot at the beginning 103 00:05:04,000 --> 00:05:06,000 and overwrite whatever's there.'' 104 00:05:06,000 --> 00:05:09,000 That an attempt to do this 105 00:05:09,000 --> 00:05:12,000 should fail. It should not compile because these fields, 106 00:05:12,000 --> 00:05:14,000 that are part of the lexicon, 107 00:05:14,000 --> 00:05:17,000 should have been declared private to begin with, 108 00:05:17,000 --> 00:05:20,000 to make sure that that wall's maintained. Everything that's private is like 109 00:05:20,000 --> 00:05:24,000 over here on this side of the wall, 110 00:05:24,000 --> 00:05:27,000 inaccessible outside of that. And 111 00:05:27,000 --> 00:05:30,000 so why do we think this is important? 112 00:05:30,000 --> 00:05:34,000 Of all the ideas that come away from 106B, I think this is one of the top 113 00:05:34,000 --> 00:05:38,000 three is this idea of abstraction. We actually even call the whole class 114 00:05:38,000 --> 00:05:39,000 Programming Abstractions. 115 00:05:39,000 --> 00:05:40,000 Because 116 00:05:40,000 --> 00:05:45,000 that the advantage of using this as a strategy for solving larger and 117 00:05:45,000 --> 00:05:49,000 more complicated problems is that you can divide up your tasks. 118 00:05:49,000 --> 00:05:52,000 That you can say this is an abstraction, like a word list, and kind of have it be as fancy as 119 00:05:52,000 --> 00:05:54,000 it needs to be behind the scenes 120 00:05:54,000 --> 00:05:56,000 but very easy to use from the outside. 121 00:05:56,000 --> 00:05:59,000 So that makes it easy for me to write some piece and give it to you 122 00:05:59,000 --> 00:06:01,000 in a form that's easy for you to incorporate. 123 00:06:01,000 --> 00:06:04,000 We can work on our things without stepping on each other. As you get to 124 00:06:04,000 --> 00:06:06,000 larger and larger projects 125 00:06:06,000 --> 00:06:07,000 beyond this class, 126 00:06:07,000 --> 00:06:09,000 you'll need ways of making it so three people can work together without 127 00:06:09,000 --> 00:06:12,000 stepping on each other's toes the whole time. And classes provide a really good 128 00:06:12,000 --> 00:06:15,000 way of managing those boundaries 129 00:06:15,000 --> 00:06:16,000 to keep each other 130 00:06:16,000 --> 00:06:18,000 out of each other's hair. 131 00:06:18,000 --> 00:06:21,000 And there's a lot of flexibility 132 00:06:21,000 --> 00:06:24,000 given to us by this. And we're gonna see this actually as we go forward, that we can talk 133 00:06:24,000 --> 00:06:25,000 about what a vector is. 134 00:06:25,000 --> 00:06:29,000 It keeps things in index order. Or a stack is, it does LIFO. 135 00:06:29,000 --> 00:06:30,000 But there is no guarantee there about 136 00:06:30,000 --> 00:06:34,000 how it internally is implemented, no guarantee expressed or implied, and 137 00:06:34,000 --> 00:06:37,000 that actually gives you a lot of flexibility as the implementer. 138 00:06:37,000 --> 00:06:40,000 You can decide to do it one way today, and if upon later learning about some 139 00:06:40,000 --> 00:06:43,000 other new technique or some way to save memory or time, 140 00:06:43,000 --> 00:06:45,000 you can swap it out, 141 00:06:45,000 --> 00:06:47,000 replace an implementation with something better, 142 00:06:47,000 --> 00:06:51,000 and all the code that depends on it shouldn't require any changes. 143 00:06:51,000 --> 00:06:54,000 That suddenly add word just runs twice as fast or ten times as fast, 144 00:06:54,000 --> 00:06:57,000 would be something everybody could appreciate without having to do anything in their own 145 00:06:57,000 --> 00:07:00,000 code to take advantage of that. 146 00:07:00,000 --> 00:07:01,000 So these are good things to know. 147 00:07:01,000 --> 00:07:05,000 So what I'm gonna do actually today is I'm gonna just stop doing slides, because I'm sick of doing 148 00:07:05,000 --> 00:07:08,000 slides. We do way too many slides; I'm bored with slides. 149 00:07:08,000 --> 00:07:11,000 And what I'm gonna do is I'm gonna actually go through the process of 150 00:07:11,000 --> 00:07:13,000 developing vector 151 00:07:13,000 --> 00:07:15,000 from just the ground up. 152 00:07:15,000 --> 00:07:16,000 So my plan here is to say 153 00:07:16,000 --> 00:07:17,000 our goal is to make 154 00:07:17,000 --> 00:07:19,000 the vector abstraction 155 00:07:19,000 --> 00:07:23,000 real, so to get dirty, get behind the scenes and say we know what vector is. It 156 00:07:23,000 --> 00:07:24,000 acts like an array. It 157 00:07:24,000 --> 00:07:26,000 has these indexed 158 00:07:26,000 --> 00:07:29,000 slots. It's this linear collection. It lets you put things anywhere you like in 159 00:07:29,000 --> 00:07:30,000 the vector. We're gonna 160 00:07:30,000 --> 00:07:33,000 go through the process of making that whole thing. 161 00:07:33,000 --> 00:07:35,000 And I'm gonna start 162 00:07:35,000 --> 00:07:39,000 at the - with a simple form that actually is not templatized and then I'm gonna 163 00:07:39,000 --> 00:07:42,000 change it to one that is templatized. So we're gonna see kind of the whole process from 164 00:07:42,000 --> 00:07:44,000 start to end. So 165 00:07:44,000 --> 00:07:48,000 all I have are empty files right now. I have myvector.h and myvector.cpp that 166 00:07:48,000 --> 00:07:51,000 really have sort of nothing other than sort of boilerplate stuff in them. 167 00:07:51,000 --> 00:07:54,000 So let me look at myvector.h to start with. 168 00:07:54,000 --> 00:07:58,000 So I'm calling this class myvector so that it doesn't confuse with the existing 169 00:07:58,000 --> 00:08:00,000 vector class, so that we have that name there. 170 00:08:00,000 --> 00:08:05,000 And I'm gonna start by just putting in 171 00:08:05,000 --> 00:08:06,000 the 172 00:08:06,000 --> 00:08:09,000 simple parts of the interface, and then we'll see how many other things we 173 00:08:09,000 --> 00:08:12,000 have time to kind of add into it. But I'm gonna get the kinda basic skeletal functions 174 00:08:12,000 --> 00:08:15,000 down and show how they get implemented, and then we'll see what else we'll try. 175 00:08:15,000 --> 00:08:19,000 So having the size is probably pretty important, being able to say, ''Well how many things will I 176 00:08:19,000 --> 00:08:22,000 put in there,'' get that back out, 177 00:08:22,000 --> 00:08:23,000 being able to add an element. 178 00:08:23,000 --> 00:08:27,000 And I'm gonna assume that right now the vector's gonna hold just scranks. 179 00:08:27,000 --> 00:08:29,000 Certainly that's not what we're gonna want in the long run, but I'm just gonna pick something 180 00:08:29,000 --> 00:08:32,000 for now so that we have something to practice with. 181 00:08:32,000 --> 00:08:36,000 And then I'm gonna have the get at, 182 00:08:36,000 --> 00:08:38,000 which give it an index and return something. Okay, 183 00:08:38,000 --> 00:08:41,000 so I think these are the only ones I'm gonna have in the first version. 184 00:08:41,000 --> 00:08:47,000 And then the other ones, you remember, there's like a remove at, there's an insert at, 185 00:08:47,000 --> 00:08:47,000 there's an 186 00:08:47,000 --> 00:08:55,000 overloaded bracket operator, and things like that I'm not gonna show right away. Question? [Inaudible]. Oh yeah, yeah. This is 187 00:08:55,000 --> 00:08:58,000 actually you know kind of just standard boilerplate for C or C++ 188 00:08:58,000 --> 00:09:01,000 header files. And you'll see this again and again and again. I should really have pointed 189 00:09:01,000 --> 00:09:02,000 this out at some point along the way 190 00:09:02,000 --> 00:09:08,000 is that the compiler does not like to see two definitions of the same thing, ever. 191 00:09:08,000 --> 00:09:11,000 Even if those definitions exactly match, it just gets its 192 00:09:11,000 --> 00:09:15,000 total knickers in a twist about having seen a class myvector, another 193 00:09:15,000 --> 00:09:16,000 class myvector. 194 00:09:16,000 --> 00:09:18,000 And so if you 195 00:09:18,000 --> 00:09:21,000 include the header file, myvector, one place and you include it some other place, it's gonna end up 196 00:09:21,000 --> 00:09:24,000 thinking there's two myvectors classes out there and it's gonna have a problem. 197 00:09:24,000 --> 00:09:28,000 So this little bit of boilerplate is to tell the compiler, ''If you haven't already 198 00:09:28,000 --> 00:09:32,000 seen this header, now's the time to go in there.'' So this ifindex, if not 199 00:09:32,000 --> 00:09:34,000 defined, and then the name there is something I made up, some symbol name 200 00:09:34,000 --> 00:09:36,000 that's unique to this file. When I say, 201 00:09:36,000 --> 00:09:38,000 ''Define that symbol,'' so 202 00:09:38,000 --> 00:09:41,000 it's like saying, ''I've seen it,'' and then down here at the very bottom, there's an end if 203 00:09:41,000 --> 00:09:43,000 that matches it. And so, 204 00:09:43,000 --> 00:09:44,000 in the case where we have - 205 00:09:44,000 --> 00:09:47,000 this is the first time we've seen the file it'll say, ''If not to define the symbol it's 206 00:09:47,000 --> 00:09:48,000 not.'' 207 00:09:48,000 --> 00:09:51,000 It'll say, ''Define that symbol, see all this stuff, and then the end if.'' The second time 208 00:09:51,000 --> 00:09:54,000 it gets here it'll say, ''If not define that symbol, say that symbol's already defined,'' so it'll skip 209 00:09:54,000 --> 00:09:59,000 down to the end if. And so, every subsequent attempt to look at myvector 210 00:09:59,000 --> 00:10:00,000 will be passed over. 211 00:10:00,000 --> 00:10:03,000 If you don't have it, you'll get a lot of errors about, 212 00:10:03,000 --> 00:10:05,000 ''I've seen this thing before. And it looks like the one I saw before but I don't like 213 00:10:05,000 --> 00:10:06,000 it. You 214 00:10:06,000 --> 00:10:08,000 know smells suspicious to me.'' 215 00:10:08,000 --> 00:10:13,000 So that is sort of standard boilerplate for any header file is to have this 216 00:10:13,000 --> 00:10:16,000 multiple include protection on it. Anything else you 217 00:10:16,000 --> 00:10:18,000 want to ask about the - way in the back? Why 218 00:10:18,000 --> 00:10:20,000 would it look at it multiple times, though? 219 00:10:20,000 --> 00:10:23,000 Well because sometimes you include it and sometimes - for example, think about genlib. Like I might 220 00:10:23,000 --> 00:10:25,000 include genlib but then 221 00:10:25,000 --> 00:10:27,000 I include something else that includes genlib. So there's these ways that 222 00:10:27,000 --> 00:10:29,000 you could accidentally get in there more than once, 223 00:10:29,000 --> 00:10:32,000 just because some other things depend on it, and the next thing you know. So 224 00:10:32,000 --> 00:10:34,000 it's better to just have the header file never 225 00:10:34,000 --> 00:10:38,000 complain about that, never let that happen, than to make it somebody else's responsibility 226 00:10:38,000 --> 00:10:44,000 to make sure it never happened through the includes. 227 00:10:44,000 --> 00:10:45,000 So I've got five 228 00:10:45,000 --> 00:10:48,000 member functions here that I'm gonna have to implement. 229 00:10:48,000 --> 00:10:52,000 And now I need to think about what the private data, this guy's gonna look like. So 230 00:10:52,000 --> 00:10:55,000 now, we are the low-level implementer. 231 00:10:55,000 --> 00:10:58,000 We're not building on anything else right now, because myvector is kind 232 00:10:58,000 --> 00:11:00,000 of the bottommost piece of things here. 233 00:11:00,000 --> 00:11:05,000 We've got nothing out there except for the raw array to do the job for us. So 234 00:11:05,000 --> 00:11:06,000 typically, the most 235 00:11:06,000 --> 00:11:07,000 236 00:11:07,000 --> 00:11:08,000 compatible 237 00:11:08,000 --> 00:11:12,000 mechanism to map something like vector onto is the array. You know it has 238 00:11:12,000 --> 00:11:16,000 contiguous memory, it allows you to do direct access to things by index, 239 00:11:16,000 --> 00:11:19,000 and so that's where we're gonna start. And we'll come back and talk about that, whether 240 00:11:19,000 --> 00:11:23,000 that's the only option we have here. But what I'm gonna put in here 241 00:11:23,000 --> 00:11:24,000 is a pointer 242 00:11:24,000 --> 00:11:25,000 to 243 00:11:25,000 --> 00:11:26,000 244 00:11:26,000 --> 00:11:27,000 a string, or 245 00:11:27,000 --> 00:11:31,000 in this case it's gonna be a pointer to a string 246 00:11:31,000 --> 00:11:34,000 array. And so in C++ those both look the same, this says arr is a single pointer that points 247 00:11:34,000 --> 00:11:35,000 to 248 00:11:35,000 --> 00:11:38,000 a string in memory, which in our situation is actually gonna be a whole sequence 249 00:11:38,000 --> 00:11:41,000 of strings in memory kind of one after another. 250 00:11:41,000 --> 00:11:44,000 The array has no knowledge of its length 251 00:11:44,000 --> 00:11:47,000 so we're gonna build is using new and things like that. It will not know how big it is. 252 00:11:47,000 --> 00:11:50,000 It will be our job to manually track that. 253 00:11:50,000 --> 00:11:52,000 So I'm gonna go ahead and have 254 00:11:52,000 --> 00:11:55,000 two fields to go with it, and I'm gonna show you why I have two in a 255 00:11:55,000 --> 00:12:00,000 minute. But instead of having just a length field that says the array is this long, 256 00:12:00,000 --> 00:12:02,000 I'm actually gonna track two integers on it 257 00:12:02,000 --> 00:12:05,000 because likely the thing I'm gonna do with the array is I'm gonna allocate it to a 258 00:12:05,000 --> 00:12:07,000 certain size and then kind of fill it up. 259 00:12:07,000 --> 00:12:09,000 And then when it gets 260 00:12:09,000 --> 00:12:13,000 totally filled, then I'm gonna do a process of resizing it and enlarging it. 261 00:12:13,000 --> 00:12:15,000 And so, at any given point, there may be 262 00:12:15,000 --> 00:12:18,000 a little bit excess capacity, a little slack in the system. So we might start by 263 00:12:18,000 --> 00:12:20,000 making it ten long 264 00:12:20,000 --> 00:12:21,000 and then fill it up, 265 00:12:21,000 --> 00:12:24,000 and then as we get to size 10, make it 20 or something like that. 266 00:12:24,000 --> 00:12:27,000 So at any point, we'll need to know things about that array: how many of 267 00:12:27,000 --> 00:12:31,000 the slots are actually used, so that'll be the first five slots are in use; and then 268 00:12:31,000 --> 00:12:33,000 the number of allocated will tell me how many other - how many 269 00:12:33,000 --> 00:12:37,000 total slots do I have, so how many slots can I use before 270 00:12:37,000 --> 00:12:37,000 I 271 00:12:37,000 --> 00:12:40,000 will have run out of space. 272 00:12:40,000 --> 00:12:42,000 Both of those. Okay, 273 00:12:42,000 --> 00:12:45,000 so there's my interface. So these things all public because anybody can use 274 00:12:45,000 --> 00:12:46,000 them; 275 00:12:46,000 --> 00:12:49,000 these things private because they're gonna be part of my implementation; I don't want anybody mucking with 276 00:12:49,000 --> 00:12:50,000 those 277 00:12:50,000 --> 00:12:53,000 or directly accessing them at all, 278 00:12:53,000 --> 00:12:55,000 so I put them down there. All 279 00:12:55,000 --> 00:12:57,000 right, now I go to myvector.cpp. 280 00:12:57,000 --> 00:13:01,000 So it includes myvector.h, so that it already has seen the class interface so it knows 281 00:13:01,000 --> 00:13:04,000 when we're talking about when we start trying to implement methods on the myvector 282 00:13:04,000 --> 00:13:06,000 class. And I'm 283 00:13:06,000 --> 00:13:07,000 gonna start with 284 00:13:07,000 --> 00:13:10,000 myvector's constructor, 285 00:13:10,000 --> 00:13:13,000 and the goal of the constructor will be to setup the instance variables of this 286 00:13:13,000 --> 00:13:17,000 object into a state that makes sense. 287 00:13:17,000 --> 00:13:18,000 So what I'm gonna choose to do 288 00:13:18,000 --> 00:13:24,000 is allocate my array to some size. I'm gonna pick ten. So 289 00:13:24,000 --> 00:13:28,000 I'm gonna say I want space for ten strings. 290 00:13:28,000 --> 00:13:31,000 I want to record that I made space for ten strings so I know the num 291 00:13:31,000 --> 00:13:33,000 allocated, the size of my array right now is ten, 292 00:13:33,000 --> 00:13:37,000 and the number used is zero. 293 00:13:37,000 --> 00:13:39,000 So that means that when someone 294 00:13:39,000 --> 00:13:46,000 makes a call, like a declaration, that says myvector V, so if I'm back over here in my main. Say 295 00:13:46,000 --> 00:13:49,000 myvector V, 296 00:13:49,000 --> 00:13:54,000 but the process of constructing that vector V will cause the constructor to get 297 00:13:54,000 --> 00:13:58,000 called, will cause a ten-member string element array to get allocated out in the 298 00:13:58,000 --> 00:13:58,000 heap 299 00:13:58,000 --> 00:14:01,000 that's pointed to by arr, and then it will set those two integers to know that there's ten 300 00:14:01,000 --> 00:14:03,000 of them and zero of them are used. 301 00:14:03,000 --> 00:14:07,000 So just to kind of all as part of the machinery of declaring, the 302 00:14:07,000 --> 00:14:11,000 constructor is just wired into that so we get setup, ready to go, with some empty 303 00:14:11,000 --> 00:14:16,000 space set aside. 304 00:14:16,000 --> 00:14:18,000 So to go with that, I'm 305 00:14:18,000 --> 00:14:22,000 gonna go ahead and add the destructor right along side of it, 306 00:14:22,000 --> 00:14:26,000 which is I need to be in charge of cleaning up my 307 00:14:26,000 --> 00:14:28,000 dynamic allocation. I 308 00:14:28,000 --> 00:14:30,000 allocated that with the new bracket, 309 00:14:30,000 --> 00:14:33,000 the new that allocates out in the heap that uses the bracket form has a 310 00:14:33,000 --> 00:14:37,000 matching delete bracket that says delete a whole array's worth of data, so not just 311 00:14:37,000 --> 00:14:40,000 one string out there but a sequence of them. 312 00:14:40,000 --> 00:14:43,000 We don't have to tell it the size; it actually does - as much as I said, it doesn't 313 00:14:43,000 --> 00:14:47,000 its size. Well somewhere in the internals, it does know the size but it never 314 00:14:47,000 --> 00:14:49,000 exposes it to us. So in fact, once I delete [inaudible] array, 315 00:14:49,000 --> 00:14:52,000 it knows how much space is there to cleanup. Yeah? 316 00:14:52,000 --> 00:14:54,000 Are you just temporarily 317 00:14:54,000 --> 00:14:58,000 setting it up so the vector only works on strings? Yes, we are. Okay. Yes. We're gonna 318 00:14:58,000 --> 00:14:59,000 come back and fix that, but 319 00:14:59,000 --> 00:15:02,000 I think it's easier maybe to see it one time on a fixed type and then say, ''Well, what happens when you 320 00:15:02,000 --> 00:15:05,000 go to template? What things change?'' And we'll see all the places that we have to make 321 00:15:05,000 --> 00:15:08,000 modifications. 322 00:15:08,000 --> 00:15:11,000 So I have myvector 323 00:15:11,000 --> 00:15:13,000 size. 324 00:15:13,000 --> 00:15:17,000 Which variable's the one that tells us about size? 325 00:15:17,000 --> 00:15:20,000 I got none used. I got none allocated. Which is the right one? Num 326 00:15:20,000 --> 00:15:24,000 used. Num used, that is exactly right. So num allocated turned out to be something we only will 327 00:15:24,000 --> 00:15:29,000 use internally. That's not gonna - no one's gonna see that or know about it, but it is - 328 00:15:29,000 --> 00:15:33,000 the num used tracks how many elements have actually been put in there. Then 329 00:15:33,000 --> 00:15:35,000 I write 330 00:15:35,000 --> 00:15:39,000 myvector add. 331 00:15:39,000 --> 00:15:42,000 So I'm gonna write one line of code, then I'm gonna come back and think about it for a 332 00:15:42,000 --> 00:15:44,000 second. I could 333 00:15:44,000 --> 00:15:45,000 say 334 00:15:45,000 --> 00:15:49,000 arr num 335 00:15:49,000 --> 00:15:50,000 used ++ = S, so tight little line that 336 00:15:50,000 --> 00:15:54,000 has a lot of stuff wrapped up in it. 337 00:15:54,000 --> 00:15:55,000 Using 338 00:15:55,000 --> 00:15:59,000 the brackets on the dynamic array here are saying, ''Take the array 339 00:15:59,000 --> 00:16:03,000 and right to the num used slot, post-incrementing it so it's a slide effect.'' So if 340 00:16:03,000 --> 00:16:08,000 num used is zero to start with, this has the effect of saying array of, 341 00:16:08,000 --> 00:16:13,000 and then the num used ++ returns the value before incrementing. So it evaluates to zero, 342 00:16:13,000 --> 00:16:17,000 but as a slide effect the next use of num used will now be one. So that's exactly 343 00:16:17,000 --> 00:16:22,000 what we want, we want to write the slot zero and then have num used be one in subsequent usage. 344 00:16:22,000 --> 00:16:25,000 And then we assign that to be S, so it'll put it in the right spot of the array. So once num 345 00:16:25,000 --> 00:16:27,000 used is five, so that 346 00:16:27,000 --> 00:16:31,000 means the zero through four slots are used. It'll write to the fifth slot and then 347 00:16:31,000 --> 00:16:35,000 up the num used to be six, so we'll know our size is now up by one. 348 00:16:35,000 --> 00:16:41,000 What else does add need to do? Is it good? Needs to make sure that we have that [inaudible]. It'd 349 00:16:41,000 --> 00:16:43,000 better 350 00:16:43,000 --> 00:16:44,000 make sure we have some space. 351 00:16:44,000 --> 00:16:48,000 So I'm gonna do this right now. I'm gonna say if num used 352 00:16:48,000 --> 00:16:50,000 is equal to num allocated, I'm gonna raise an error. I'm 353 00:16:50,000 --> 00:16:56,000 gonna come back to this but I'm gonna say - 354 00:16:56,000 --> 00:16:57,000 or 355 00:16:57,000 --> 00:17:00,000 for this first pass, we're gonna make it so it just doesn't grow. Picked some arbitrary size, it 356 00:17:00,000 --> 00:17:04,000 got that big, and then it ran out of space. Question? 357 00:17:04,000 --> 00:17:07,000 So when the - for the 358 00:17:07,000 --> 00:17:09,000 vector zero, first time it gets called it's actually gonna 359 00:17:09,000 --> 00:17:11,000 be placed in index one of the array? 360 00:17:11,000 --> 00:17:16,000 So it's in place of index zero. So num used ++, any variable ++ returns the - it 361 00:17:16,000 --> 00:17:20,000 evaluates to the value before it increments, and then as a side effect, kind of in the next 362 00:17:20,000 --> 00:17:23,000 step, it will update the value. So there is a difference between 363 00:17:23,000 --> 00:17:26,000 the ++ num used and the num ++ form. 364 00:17:26,000 --> 00:17:28,000 Sometimes it's a little bit of an obscure detail we don't spend a lot of time on, but 365 00:17:28,000 --> 00:17:30,000 ++ num says 366 00:17:30,000 --> 00:17:33,000 first increment it and then tell me what the newly incremented value is. 367 00:17:33,000 --> 00:17:36,000 Num ++ says tell me what the value is right now and then, as a side effect, increment it 368 00:17:36,000 --> 00:17:41,000 for later use. So the num used gets changed in that? It does get changed in that expression, but the 369 00:17:41,000 --> 00:17:46,000 expression happened to evaluate to the value before it did that change. 370 00:17:46,000 --> 00:17:49,000 Question? And 371 00:17:49,000 --> 00:17:54,000 is it really necessary to have myvector:: or is - Oh yeah, yeah, yeah, yeah. - there any way for - So if I take this out, 372 00:17:54,000 --> 00:17:55,000 and I could show you what happens if I do 373 00:17:55,000 --> 00:17:59,000 this, is what the compiler's gonna think that is, is it thinks, ''Oh, there's just a function 374 00:17:59,000 --> 00:18:03,000 called size that take the arguments and returns them in.'' And it doesn't exist in any 375 00:18:03,000 --> 00:18:05,000 context; it's just a global function. 376 00:18:05,000 --> 00:18:08,000 And then I try to compile it, if I go ahead and do it, 377 00:18:08,000 --> 00:18:11,000 it's gonna say, ''Num used? What's num used? Where did num used come from?'' 378 00:18:11,000 --> 00:18:14,000 So if I click on this it'll say, ''Yeah, num used was not declared in this scope.'' It has no idea where that 379 00:18:14,000 --> 00:18:18,000 came from. So there is no mechanism in C++ to identify this as being a 380 00:18:18,000 --> 00:18:21,000 member function other than scoping it with the right names. You say this is 381 00:18:21,000 --> 00:18:23,000 the size 382 00:18:23,000 --> 00:18:25,000 function 383 00:18:25,000 --> 00:18:27,000 that is part of the myvector class. And now 384 00:18:27,000 --> 00:18:31,000 it has the right context to interpret that and it knows that when you talk about num used, 385 00:18:31,000 --> 00:18:34,000 ''Oh, there's a private data member that matches that. That must be the receiver's 386 00:18:34,000 --> 00:18:37,000 num used.'' So 387 00:18:37,000 --> 00:18:38,000 one other 388 00:18:38,000 --> 00:18:42,000 member function we got to go, 389 00:18:42,000 --> 00:18:45,000 which is the one that returns, 390 00:18:45,000 --> 00:18:47,000 of I do this, 391 00:18:47,000 --> 00:18:49,000 it almost but not quite, 392 00:18:49,000 --> 00:18:51,000 looks like what I wanted. So I 393 00:18:51,000 --> 00:18:53,000 say return arrays of index. 394 00:18:53,000 --> 00:18:56,000 They said get in the next and returned the string that was at that index. 395 00:18:56,000 --> 00:19:03,000 Is there anything else I might want to do here? Check and see if the index is [inaudible]. Well, let me 396 00:19:03,000 --> 00:19:05,000 check and see if that index is valid. 397 00:19:05,000 --> 00:19:06,000 Now, this is one of those cases where it's like you 398 00:19:06,000 --> 00:19:10,000 could just sort of take the stance that says, ''Well, it's not my job.'' If you ask me to get you 399 00:19:10,000 --> 00:19:14,000 know something at index 25, and there are only four of them, that's your own fault and you 400 00:19:14,000 --> 00:19:14,000 deserve 401 00:19:14,000 --> 00:19:16,000 what you get. 402 00:19:16,000 --> 00:19:18,000 And that is certainly 403 00:19:18,000 --> 00:19:21,000 the way a lot of professional libraries work because this, if I added a step here 404 00:19:21,000 --> 00:19:24,000 that does a check, it means that every check costs a little bit more. 405 00:19:24,000 --> 00:19:26,000 When you go in to get something out of the vector, it's gonna have to look and 406 00:19:26,000 --> 00:19:28,000 make sure you were doing it correctly. 407 00:19:28,000 --> 00:19:31,000 And there are people who believe that if it's like riding a motorcycle without a helmet, 408 00:19:31,000 --> 00:19:32,000 that's your own 409 00:19:32,000 --> 00:19:33,000 choice. 410 00:19:33,000 --> 00:19:35,000 We're gonna actually bullet proof; 411 00:19:35,000 --> 00:19:36,000 we're gonna make sure 412 00:19:36,000 --> 00:19:39,000 that the index 413 00:19:39,000 --> 00:19:41,000 isn't whack. 414 00:19:41,000 --> 00:19:45,000 And I'm gonna go ahead and use my own size there. I'm gonna say if the index is less zero or it's greater 415 00:19:45,000 --> 00:19:47,000 than or equal to, and I'm gonna use size. 416 00:19:47,000 --> 00:19:52,000 I could use num used there but there are also reasons that if I use my own member functions 417 00:19:52,000 --> 00:19:55,000 and then later somehow I change that name or I change how I calculate the 418 00:19:55,000 --> 00:19:59,000 size, let's say I change it to where I use a linked list within an array and I'm managing the size 419 00:19:59,000 --> 00:20:00,000 differently, 420 00:20:00,000 --> 00:20:03,000 that if I have used size in this context, it will just change along with that 421 00:20:03,000 --> 00:20:06,000 because it's actually depending on the 422 00:20:06,000 --> 00:20:08,000 other part of the interface that I'll update. 423 00:20:08,000 --> 00:20:09,000 And so I'm gonna go ahead and do that 424 00:20:09,000 --> 00:20:12,000 rather than directly asking my variable. I'll say if that happens I say error, out of bounds. 425 00:20:12,000 --> 00:20:15,000 And on 426 00:20:15,000 --> 00:20:18,000 that, that error will cause the program to halt right down there, so I don't have to worry about 427 00:20:18,000 --> 00:20:21,000 it, in that case, getting down to the return. So I feel 428 00:20:21,000 --> 00:20:22,000 okay about this. 429 00:20:22,000 --> 00:20:25,000 Not too much code to get kind of a little bit of something up and 430 00:20:25,000 --> 00:20:26,000 running here. 431 00:20:26,000 --> 00:20:39,000 Let's go over and write something to test. Add Jason, and here we go. Okay, so I 432 00:20:39,000 --> 00:20:41,000 put some things in there and I'm 433 00:20:41,000 --> 00:20:43,000 gonna see if 434 00:20:43,000 --> 00:20:50,000 it'll let me get them back out. 435 00:20:50,000 --> 00:20:51,000 And 436 00:20:51,000 --> 00:20:54,000 before I get any further, I might as well test the code I have. Right? 437 00:20:54,000 --> 00:20:58,000 This is - one of the things about designing a class is it's pretty hard to write 438 00:20:58,000 --> 00:21:02,000 any one piece and test it by itself because often there's a relationship: the constructor 439 00:21:02,000 --> 00:21:04,000 and then some adding and then some getting stuff back out. 440 00:21:04,000 --> 00:21:07,000 So it's a little bit more code than I typically would like to write and not have 441 00:21:07,000 --> 00:21:10,000 a chance to test, having all five of these member functions kind of at first. 442 00:21:10,000 --> 00:21:11,000 So 443 00:21:11,000 --> 00:21:14,000 if I run this test and it doesn't work, it's like well which part what was wrong? Was the 444 00:21:14,000 --> 00:21:17,000 constructor wrong? Was the add wrong? Was the size wrong? You know there's a bunch of 445 00:21:17,000 --> 00:21:19,000 places to look, but 446 00:21:19,000 --> 00:21:22,000 unfortunately, they're kind of all interrelated in a way that makes it a little hard to have them 447 00:21:22,000 --> 00:21:23,000 independently 448 00:21:23,000 --> 00:21:26,000 worked out. But then subsequent thing, hopefully I can add the like the 449 00:21:26,000 --> 00:21:28,000 insert at method without 450 00:21:28,000 --> 00:21:33,000 having to add a lot more before I test. 451 00:21:33,000 --> 00:21:37,000 Okay. So I run this guy and it says Jason, Illia, Nathan. Feel good, I feel good, 452 00:21:37,000 --> 00:21:41,000 I feel smart. 453 00:21:41,000 --> 00:21:42,000 Put them in, in that order. See 454 00:21:42,000 --> 00:21:44,000 me get them out in that order. I might 455 00:21:44,000 --> 00:21:46,000 try some things that I'm hoping will cause my thing 456 00:21:46,000 --> 00:21:48,000 to blow up. 457 00:21:48,000 --> 00:21:53,000 Why don't I get at ten? Let's see, I 458 00:21:53,000 --> 00:21:55,000 like to be sure, 459 00:21:55,000 --> 00:21:59,000 so I want you to tell me what's at the tenth slot in that vector. 460 00:21:59,000 --> 00:22:02,000 And it's, ooh, out of bounds, just like I was hoping for. 461 00:22:02,000 --> 00:22:05,000 Oh, I like to see that. Right? 462 00:22:05,000 --> 00:22:07,000 What happens if I put 463 00:22:07,000 --> 00:22:10,000 in 464 00:22:10,000 --> 00:22:11,000 enough strings 465 00:22:11,000 --> 00:22:12,000 that I 466 00:22:12,000 --> 00:22:16,000 run out of memory? And we talked about a thing - it's set it to be ten right now. Why don't I just make it 467 00:22:16,000 --> 00:22:19,000 smaller, that'll way it'll make it easier for me. So I say, ''Oh how 468 00:22:19,000 --> 00:22:21,000 about this?'' I'm 469 00:22:21,000 --> 00:22:23,000 only gonna make 470 00:22:23,000 --> 00:22:26,000 space for two things. And it just: 471 00:22:26,000 --> 00:22:30,000 error, out of space. Right? That happened before it got to printing anything. It tried to add the first one. It 472 00:22:30,000 --> 00:22:33,000 added the second one. Then it went to add that third one and it said, ''Oh okay, we run out of space. We 473 00:22:33,000 --> 00:22:35,000 only had space for two set aside, you 474 00:22:35,000 --> 00:22:40,000 asked me to put a third in. I had no room.'' 475 00:22:40,000 --> 00:22:42,000 So at least the kind of simple behaviors that I put in here seem to kind of 476 00:22:42,000 --> 00:22:46,000 show evidence that we've got a little part of this up and running. 477 00:22:46,000 --> 00:22:48,000 What I'm gonna fix first is this out of space thing. 478 00:22:48,000 --> 00:22:54,000 So it would be pretty bogus and pretty unusual for a 479 00:22:54,000 --> 00:22:56,000 collection class like this to have some sort of fixed limit. 480 00:22:56,000 --> 00:22:59,000 Right? It wouldn't - you know it'd be very unusual to say well it's always gonna hold exactly 10 481 00:22:59,000 --> 00:23:02,000 things or 100 things or even a 1,000 things. Right? 482 00:23:02,000 --> 00:23:05,000 You know one way you might design it is you could imagine adding an 483 00:23:05,000 --> 00:23:07,000 argument to the constructor that said, ''Well, how many things do you plan on putting in 484 00:23:07,000 --> 00:23:11,000 there? And then I'll allocate it to that. And then when we run out of space, you know you're hosed.'' 485 00:23:11,000 --> 00:23:14,000 But certainly a better strategy that kind of solves more general case problems would 486 00:23:14,000 --> 00:23:16,000 be like, ''Oh let's grow. 487 00:23:16,000 --> 00:23:20,000 When we run out of space, let's make some more space.'' Okay, 488 00:23:20,000 --> 00:23:23,000 let's think about what it takes to make more space in 489 00:23:23,000 --> 00:23:24,000 using pointers. 490 00:23:24,000 --> 00:23:31,000 So what a vector really looks like 491 00:23:31,000 --> 00:23:33,000 is it has three fields, 492 00:23:33,000 --> 00:23:34,000 the arr, 493 00:23:34,000 --> 00:23:39,000 the num used, and the num allocated. When 494 00:23:39,000 --> 00:23:43,000 I declare one, the way it's being setup right now, it's over here and it's allocating space 495 00:23:43,000 --> 00:23:45,000 for 496 00:23:45,000 --> 00:23:47,000 some number, let's say it's ten again, 497 00:23:47,000 --> 00:23:48,000 and then marking it as zero. 498 00:23:48,000 --> 00:23:53,000 Then as it fills these things up, it puts strings in each of these slots and it 499 00:23:53,000 --> 00:23:58,000 starts updating this number, eventually it gets to where all of them are filled. 500 00:23:58,000 --> 00:24:02,000 That when num used equals num allocated, it means that however much space I set 501 00:24:02,000 --> 00:24:05,000 aside, every one of those slots is now in use. 502 00:24:05,000 --> 00:24:08,000 So when that happens, it's gonna be time to make a bigger array. 503 00:24:08,000 --> 00:24:11,000 There is not a mechanism in C++ that says take my array and just make 504 00:24:11,000 --> 00:24:13,000 it bigger where it is. 505 00:24:13,000 --> 00:24:16,000 That the way we'll have to do this is, we'll have to make a bigger array, copy over 506 00:24:16,000 --> 00:24:17,000 what we have, 507 00:24:17,000 --> 00:24:18,000 and then, 508 00:24:18,000 --> 00:24:22,000 you know, have it add it on by making a bigger array full of space. So what we'll do is we'll make 509 00:24:22,000 --> 00:24:23,000 something that's like 510 00:24:23,000 --> 00:24:27,000 twice as big, I'm just gonna draw it this way since I'm running out of board space, 511 00:24:27,000 --> 00:24:31,000 and it's got ten slots and then ten more. And 512 00:24:31,000 --> 00:24:36,000 then I will copy over all these guys that I have 513 00:24:36,000 --> 00:24:38,000 up to the end, 514 00:24:38,000 --> 00:24:40,000 and then I have these new spaces at the end. 515 00:24:40,000 --> 00:24:41,000 And so I will have 516 00:24:41,000 --> 00:24:44,000 to reset my pointer to point to there, 517 00:24:44,000 --> 00:24:49,000 update my num allocated, let's say to be twice as big or something, 518 00:24:49,000 --> 00:24:52,000 and then delete this old region that 519 00:24:52,000 --> 00:24:54,000 I'm no longer using. 520 00:24:54,000 --> 00:24:57,000 So we're gonna see an allocate, a copy, a delete, 521 00:24:57,000 --> 00:25:00,000 and then kind of resetting our fields 522 00:25:00,000 --> 00:25:01,000 to know what we just did. 523 00:25:01,000 --> 00:25:05,000 So I'm gonna make a private helper to do all this, and 524 00:25:05,000 --> 00:25:12,000 I'll just call that enlarge capacity here. Question? Is this like [inaudible]? Well 525 00:25:12,000 --> 00:25:14,000 it is - it can be, is the truth. 526 00:25:14,000 --> 00:25:16,000 So you're getting a little 527 00:25:16,000 --> 00:25:18,000 bit ahead of us. But in the sense that like, 528 00:25:18,000 --> 00:25:20,000 you know, if the array has a thousand elements 529 00:25:20,000 --> 00:25:23,000 and now we got to put that thousand-first thing in, it's gonna take 530 00:25:23,000 --> 00:25:26,000 all thousand and copy them over and enlarge the space. 531 00:25:26,000 --> 00:25:30,000 So in effect what is typically an O of one operation, just tacking something on 532 00:25:30,000 --> 00:25:34,000 the end, every now and then is gonna take a whole hit of an end 533 00:25:34,000 --> 00:25:37,000 operation when it does the copy. But 534 00:25:37,000 --> 00:25:39,000 one way to think about that is that is 535 00:25:39,000 --> 00:25:42,000 every now it's really expensive but kind of if you think of it across the whole 536 00:25:42,000 --> 00:25:45,000 space, that you got a - let's say you started at 1,000 and then you 537 00:25:45,000 --> 00:25:46,000 doubled to 2,000, that 538 00:25:46,000 --> 00:25:48,000 the first 1,000 never paid that cost. 539 00:25:48,000 --> 00:25:51,000 And then all of a sudden one of them paid it but then now you don't pay it again for 540 00:25:51,000 --> 00:25:55,000 another 1,000. But if you kind of divided that cost, sort of amortized it across all 541 00:25:55,000 --> 00:25:56,000 those adds, 542 00:25:56,000 --> 00:25:57,000 that it was a - 543 00:25:57,000 --> 00:25:59,000 it didn't change the overall constant running time. 544 00:25:59,000 --> 00:26:02,000 So you have to - you kind of think of it maybe in the picture. It does mean every now and 545 00:26:02,000 --> 00:26:05,000 then though one of them is gonna surprise you with how slow it is, but 546 00:26:05,000 --> 00:26:06,000 547 00:26:06,000 --> 00:26:07,000 hopefully that's 548 00:26:07,000 --> 00:26:12,000 few and far between, if we've chosen our allocation strategy well. 549 00:26:12,000 --> 00:26:15,000 So I'm gonna say - actually, I'm gonna do this. I'm gonna call this 550 00:26:15,000 --> 00:26:17,000 thing actually double capacity. 551 00:26:17,000 --> 00:26:21,000 So one strategy that often is used is you could go in little chunks. You could go like 552 00:26:21,000 --> 00:26:23,000 ten at a time all the time. 553 00:26:23,000 --> 00:26:24,000 But 554 00:26:24,000 --> 00:26:27,000 if the array's gonna get very big, 555 00:26:27,000 --> 00:26:30,000 going by tens will take you a lot of time. You might sort of use sometimes the indication of, ''Well 556 00:26:30,000 --> 00:26:33,000 how big is it already?'' Then if that's about the size it seems to be, why don't we go ahead 557 00:26:33,000 --> 00:26:34,000 and just double. 558 00:26:34,000 --> 00:26:36,000 So if we start at 10, 559 00:26:36,000 --> 00:26:39,000 you know, and then we go to 20, then we go to 40, then we to 80, then 560 00:26:39,000 --> 00:26:42,000 as we get to bigger and bigger sizes, we'll make the kind of assumption that: well, 561 00:26:42,000 --> 00:26:45,000 given how big it is already, it's not - it wouldn't surprise us if it got a lot bigger. So 562 00:26:45,000 --> 00:26:48,000 let's just go ahead and allocate 563 00:26:48,000 --> 00:26:50,000 twice what we have as a way of kind of 564 00:26:50,000 --> 00:26:54,000 predicting that it's pretty big now, it might get even bigger in the future. That 565 00:26:54,000 --> 00:26:55,000 isn't necessarily 566 00:26:55,000 --> 00:26:58,000 the only strategy we could use, but it's one that actually often makes 567 00:26:58,000 --> 00:27:01,000 pretty decent sense. And then at any given point, you'll know that the vector will 568 00:27:01,000 --> 00:27:03,000 be somewhere between 569 00:27:03,000 --> 00:27:05,000 full and half capacity, 570 00:27:05,000 --> 00:27:07,000 is what you're setting up for. 571 00:27:07,000 --> 00:27:10,000 So let's do this. 572 00:27:10,000 --> 00:27:12,000 Oops, it's not ints; 573 00:27:12,000 --> 00:27:14,000 string bigger equals, 574 00:27:14,000 --> 00:27:18,000 and I make a new string array that is num allocated times two, 575 00:27:18,000 --> 00:27:22,000 so twice as big as the one we have. 576 00:27:22,000 --> 00:27:30,000 And now I go through and I copy. I'm 577 00:27:30,000 --> 00:27:34,000 copying from my old array to my bigger one, 578 00:27:34,000 --> 00:27:37,000 all of num used. In this case, num used and num allocated are exactly the same thing, so. And 579 00:27:37,000 --> 00:27:38,000 then I'm 580 00:27:38,000 --> 00:27:43,000 gonna reset my fields as needed. I'm gonna delete my old array because I'm 581 00:27:43,000 --> 00:27:45,000 now done with it. I'm gonna 582 00:27:45,000 --> 00:27:49,000 set my pointer to this, so that's doing pointer assignment only. Right? So I deleted 583 00:27:49,000 --> 00:27:52,000 the old space, I've copied everything I needed out of it, I'm 584 00:27:52,000 --> 00:27:54,000 now resetting my pointer to the bigger one, 585 00:27:54,000 --> 00:27:56,000 and then I'm setting my num allocated 586 00:27:56,000 --> 00:28:00,000 to be 587 00:28:00,000 --> 00:28:01,000 twice as big, 588 00:28:01,000 --> 00:28:05,000 num used doesn't change. Okay, 589 00:28:05,000 --> 00:28:06,000 so 590 00:28:06,000 --> 00:28:09,000 went through the process of building all these things. And as noted, that is gonna 591 00:28:09,000 --> 00:28:11,000 cost us something 592 00:28:11,000 --> 00:28:15,000 linear in the number of elements that we currently have, so. 593 00:28:15,000 --> 00:28:19,000 So this guy is intended to be a private method. 594 00:28:19,000 --> 00:28:22,000 I don't want people outside of the vector being able to call this, so I'm actually 595 00:28:22,000 --> 00:28:24,000 gonna list this 596 00:28:24,000 --> 00:28:27,000 in the private section. It's not as 597 00:28:27,000 --> 00:28:29,000 common that you'll see member functions listed down here. But in this case, it's 598 00:28:29,000 --> 00:28:32,000 appropriate for something that you plan to use internally as a helper but you don't want 599 00:28:32,000 --> 00:28:39,000 anybody outside just to be calling double capacity when they feel like it. 600 00:28:39,000 --> 00:28:41,000 Question? So normally [inaudible] an array, you couldn't actually 601 00:28:41,000 --> 00:28:45,000 declare a size like that, though, right, with a variable? 602 00:28:45,000 --> 00:28:48,000 You know I don't understand the question. Try that again. You 603 00:28:48,000 --> 00:28:52,000 know in this case to enlarge it, you use a 604 00:28:52,000 --> 00:28:53,000 variable as the - for the 605 00:28:53,000 --> 00:28:57,000 size of the array. When you normally declare an array, you couldn't do that, right? Well if 606 00:28:57,000 --> 00:29:03,000 you did it - It has to be a constant. So like if you used this form of an array, you know, 607 00:29:03,000 --> 00:29:05,000 where you declared it like this. Did you see what I just did? 608 00:29:05,000 --> 00:29:05,000 Yeah. Yeah. 609 00:29:05,000 --> 00:29:06,000 That way won't work, 610 00:29:06,000 --> 00:29:10,000 right? That way is fixed size, nothing you can do about it. So I'm usually totally the 611 00:29:10,000 --> 00:29:14,000 dynamic array at all times, so that everything - So you do it when you're declaring it on a heap? 612 00:29:14,000 --> 00:29:15,000 Yes. Okay. 613 00:29:15,000 --> 00:29:19,000 Yes, exactly. All I have in the - stored in the vector itself is a pointer to an 614 00:29:19,000 --> 00:29:22,000 array elsewhere. And that array in the heap gives me the flexibility to make it as big as 615 00:29:22,000 --> 00:29:26,000 I want, as small as I want, to change its size, to change where it 616 00:29:26,000 --> 00:29:29,000 points to, you know all the - the dynamic arrays typically just give you a lot more flexibility 617 00:29:29,000 --> 00:29:32,000 than a static array. That's really stack heap. It is. Array is a 618 00:29:32,000 --> 00:29:35,000 stack heap thing. When you put on the stack, you had to say how big it is at compile time 619 00:29:35,000 --> 00:29:38,000 and you can't change it. The heap let's you say, ''I need it bigger, I need it smaller, I need to 620 00:29:38,000 --> 00:29:40,000 move it,'' 621 00:29:40,000 --> 00:29:44,000 in a way that the stack just doesn't give you that at all. 622 00:29:44,000 --> 00:29:47,000 So when I go back to myvector.cpp, the place I want to put in my 623 00:29:47,000 --> 00:29:54,000 double capacity here is when num used is equal to num allocated, 624 00:29:54,000 --> 00:29:58,000 but what I want to do is call double capacity there. 625 00:29:58,000 --> 00:30:02,000 After I've done that, num allocated should've gone up by a factor of two. 626 00:30:02,000 --> 00:30:04,000 Space will be there at that point. I know that num used is a 627 00:30:04,000 --> 00:30:08,000 fine slot to now use to assign that next thing. So whenever we're out of space, we'll 628 00:30:08,000 --> 00:30:10,000 make some more space. 629 00:30:10,000 --> 00:30:13,000 And so I'm gonna - right now, I think my allocated is still set at two. That's a fine place. 630 00:30:13,000 --> 00:30:16,000 I'd like it to be kind of small because I'd like to test kind of 631 00:30:16,000 --> 00:30:18,000 the - 632 00:30:18,000 --> 00:30:25,000 some of the initial allocations. So I'll go ahead and add a 633 00:30:25,000 --> 00:30:28,000 couple more people so I can see that I know that - at that point, if I've gotten to 634 00:30:28,000 --> 00:30:32,000 five, I'm gonna have to double once and then again to get there. 635 00:30:32,000 --> 00:30:37,000 And let's, I'll take out my error case here, see that I've managed 636 00:30:37,000 --> 00:30:38,000 to 637 00:30:38,000 --> 00:30:40,000 allocate and move stuff around. 638 00:30:40,000 --> 00:30:42,000 Got my five names back out 639 00:30:42,000 --> 00:30:46,000 without running into anything crazy, so that makes me feel good about what I got 640 00:30:46,000 --> 00:30:49,000 there. So 641 00:30:49,000 --> 00:30:53,000 I could go on to show you what insert and remove 642 00:30:53,000 --> 00:30:53,000 do; I think 643 00:30:53,000 --> 00:30:57,000 I'm probably gonna skip that because I'd rather talk about the template thing. But I could 644 00:30:57,000 --> 00:31:01,000 just tell you - I could sketch the [inaudible]: what does insert at do, what does remove at do? 645 00:31:01,000 --> 00:31:04,000 Basically, that they're doing the string - the array shuffling for you. If you say 646 00:31:04,000 --> 00:31:08,000 insert at some position, it has to move everything down by one and then 647 00:31:08,000 --> 00:31:11,000 put in there. Whereas at is actually just tacking it onto the end of the existing 648 00:31:11,000 --> 00:31:12,000 ones. 649 00:31:12,000 --> 00:31:16,000 The insert and remove have to do the shuffle to either close up the space or open up a space. 650 00:31:16,000 --> 00:31:21,000 They'll probably both need to look at the capacity as well. That the - if you're 651 00:31:21,000 --> 00:31:25,000 inserting and you're already at capacity, you better double before you start. And then the 652 00:31:25,000 --> 00:31:28,000 remove at, also may actually want to have a shrink capacity. Where when we realize 653 00:31:28,000 --> 00:31:31,000 we previously were allocated much larger and we've gotten a lot smaller, should we 654 00:31:31,000 --> 00:31:35,000 take away some of that space. A lot of times the implementations don't bother 655 00:31:35,000 --> 00:31:38,000 with that case. They figure, ''Ah, it's already allocated, just keep it around. You 656 00:31:38,000 --> 00:31:41,000 might need it again later.'' So it may be that actually we just leave it over-allocated, 657 00:31:41,000 --> 00:31:44,000 even when we've deleted a lot of elements, but 658 00:31:44,000 --> 00:31:47,000 if we were being tidy we could take an effort there. 659 00:31:47,000 --> 00:31:50,000 What I want to do is switch it to a template. So if you have questions 660 00:31:50,000 --> 00:31:53,000 about the code I have right here, now would be 661 00:31:53,000 --> 00:31:55,000 a really good time to ask before I start 662 00:31:55,000 --> 00:31:57,000 mucking it up. Way in the back? [Inaudible]. 663 00:31:57,000 --> 00:32:00,000 I will. You know that's a really good idea because if I - at this point, I'll 664 00:32:00,000 --> 00:32:03,000 start to change it and then it's gonna 665 00:32:03,000 --> 00:32:06,000 be something else before we're all done. So let me 666 00:32:06,000 --> 00:32:09,000 take a snapshot of what we have here 667 00:32:09,000 --> 00:32:12,000 so that I - before I destroy it. 668 00:32:12,000 --> 00:32:14,000 Question? When does the deconstructor 669 00:32:14,000 --> 00:32:16,000 get 670 00:32:16,000 --> 00:32:19,000 called? Okay, so the destructor gets called 671 00:32:19,000 --> 00:32:23,000 in the most - there are two cases it gets called. One is when the 672 00:32:23,000 --> 00:32:25,000 - so the constructor gets called when you declare 673 00:32:25,000 --> 00:32:29,000 it, and then destructor gets called when it goes out of scope. So at the opening brace 674 00:32:29,000 --> 00:32:32,000 of the block where you declared it in, is when the constructor's happening, 675 00:32:32,000 --> 00:32:35,000 and then when you leave that. So in the case of this program, it's at the end of 676 00:32:35,000 --> 00:32:36,000 May, 677 00:32:36,000 --> 00:32:38,000 but and if it were in some called function or in the body of a for loop or 678 00:32:38,000 --> 00:32:42,000 something, it would get called when you enter the loop and then called as - 679 00:32:42,000 --> 00:32:43,000 destroyed as it left. 680 00:32:43,000 --> 00:32:47,000 For things that were allocated out of the heap, so if I had a myvector new myvector, it 681 00:32:47,000 --> 00:32:50,000 would explicitly when I called delete that thing when I was done with it. 682 00:32:50,000 --> 00:32:55,000 So especially when the variable that holds it is going away, 683 00:32:55,000 --> 00:32:58,000 right? Either because you're deleting it out of the heap or because it's going out of scope on the 684 00:32:58,000 --> 00:33:00,000 stack. Here? 685 00:33:00,000 --> 00:33:04,000 When you have string star array, arr, it's 686 00:33:04,000 --> 00:33:07,000 a pointer to a single string and then later you can use a new statement 687 00:33:07,000 --> 00:33:08,000 to 688 00:33:08,000 --> 00:33:09,000 link it to - Yeah. - 689 00:33:09,000 --> 00:33:13,000 an array. So how does it know when it - How does it know? It doesn't, is the truth, is that 690 00:33:13,000 --> 00:33:16,000 when you say something's a pointer to a string, 691 00:33:16,000 --> 00:33:20,000 the only guarantee is really there and says it's an address of someplace where a 692 00:33:20,000 --> 00:33:23,000 string lives in memory. Whether there's one string there or a whole sequence of strings 693 00:33:23,000 --> 00:33:27,000 is something you decide and you maintain, and so the compiler doesn't distinguish 694 00:33:27,000 --> 00:33:28,000 those two cases. 695 00:33:28,000 --> 00:33:30,000 It does not know that you made 696 00:33:30,000 --> 00:33:32,000 exactly 1 string at the other end of this or 500. 697 00:33:32,000 --> 00:33:36,000 And so, that's why this idea of tracking the num used and num allocated becomes our 698 00:33:36,000 --> 00:33:39,000 job; that they look exactly the same in terms of how it interprets them. It says, ''Well it's the 699 00:33:39,000 --> 00:33:43,000 address of where a string lives in memory, or maybe it's a sequence. I don't know.'' And 700 00:33:43,000 --> 00:33:47,000 so it lets you use the brackets and the new and stuff on a pointer without 701 00:33:47,000 --> 00:33:51,000 distinguishing - having any mechanism in the language to say this is a single 702 00:33:51,000 --> 00:33:53,000 pointer and this is an array pointer. They look the same. So 703 00:33:53,000 --> 00:33:55,000 you could use the bracket notion on a single pointer? 704 00:33:55,000 --> 00:34:00,000 You certainly can. And then - It's not a good idea but you can do it. So 705 00:34:00,000 --> 00:34:03,000 legally in the language, it just makes no distinction between those. 706 00:34:03,000 --> 00:34:06,000 They are really commingled in way that - and so that's one of the more surprising features of C 707 00:34:06,000 --> 00:34:08,000 and C++ and one's that a little bit hard to get your head around is 708 00:34:08,000 --> 00:34:12,000 it doesn't track that it's a pointer to a single thing versus a pointer to an array. 709 00:34:12,000 --> 00:34:14,000 That they're both just 710 00:34:14,000 --> 00:34:15,000 pointers and it's mine. 711 00:34:15,000 --> 00:34:18,000 And that allows opportunity for errors, when you mistreat them as the other type 712 00:34:18,000 --> 00:34:21,000 for example. Question? Can you go 713 00:34:21,000 --> 00:34:25,000 over in your secret view file in the [inaudible] 714 00:34:25,000 --> 00:34:27,000 of what's going on 715 00:34:27,000 --> 00:34:32,000 with [inaudible] real quick because it's just - Yeah, yeah, yeah, so let me - it was basically the thing I drew over here, but I'll do it again just to watch 716 00:34:32,000 --> 00:34:33,000 it, 717 00:34:33,000 --> 00:34:35,000 is that let's start - let's imagine I have a slightly smaller 718 00:34:35,000 --> 00:34:37,000 num allocated so it's a little 719 00:34:37,000 --> 00:34:38,000 less 720 00:34:38,000 --> 00:34:42,000 for me to write. So let's say that I'm gonna use a num allocated of two, so this allocates 721 00:34:42,000 --> 00:34:43,000 two. So 722 00:34:43,000 --> 00:34:45,000 when I construct it, it makes a 723 00:34:45,000 --> 00:34:48,000 block that holds two things and num used is 724 00:34:48,000 --> 00:34:51,000 zero. So I do two adds: I add A, it 725 00:34:51,000 --> 00:34:53,000 increments num used to one; 726 00:34:53,000 --> 00:34:55,000 I had B, it increments num used to two. 727 00:34:55,000 --> 00:34:57,000 I try to add C. 728 00:34:57,000 --> 00:34:59,000 It says, ''Oh, well num used equals num allocated. 729 00:34:59,000 --> 00:35:01,000 We're gonna go to double capacity now.'' 730 00:35:01,000 --> 00:35:07,000 So double capacity has this little local variable called bigger. 731 00:35:07,000 --> 00:35:08,000 And it says bigger 732 00:35:08,000 --> 00:35:10,000 is gonna be something that is 733 00:35:10,000 --> 00:35:13,000 four strings worth in an array, so it 734 00:35:13,000 --> 00:35:14,000 gets four out there. 735 00:35:14,000 --> 00:35:15,000 It does a full loop 736 00:35:15,000 --> 00:35:19,000 to copy the contents of the old array on top of the initial part of this array; so it 737 00:35:19,000 --> 00:35:22,000 copies over the A and the B, 738 00:35:22,000 --> 00:35:23,000 into there. 739 00:35:23,000 --> 00:35:27,000 And then it goes, ''Okay, I'm done with this old part. So let me go ahead 740 00:35:27,000 --> 00:35:30,000 and delete that.'' And then it resets the arr to 741 00:35:30,000 --> 00:35:34,000 point to this new one down here, where bigger was. So now, we got to aliases of the 742 00:35:34,000 --> 00:35:35,000 same location. 743 00:35:35,000 --> 00:35:38,000 And then it sets my num allocated to say and now 744 00:35:38,000 --> 00:35:39,000 what you've got there 745 00:35:39,000 --> 00:35:41,000 is something that holds four slots. 746 00:35:41,000 --> 00:35:45,000 And then that used call here says, ''Okay and now writer the C into 747 00:35:45,000 --> 00:35:48,000 the slot at three.'' 748 00:35:48,000 --> 00:35:51,000 So the process here is the only way to enlarge an array in C++ is to make 749 00:35:51,000 --> 00:35:54,000 a bigger one, copy what you had, 750 00:35:54,000 --> 00:35:57,000 and then by virtue of you having made a bigger array to start with, you have some more slack that 751 00:35:57,000 --> 00:35:59,000 you didn't have before. 752 00:35:59,000 --> 00:36:02,000 Daniel? How does it delete arr with a star? 753 00:36:02,000 --> 00:36:05,000 You know it has to do with just delete takes a pointer. 754 00:36:05,000 --> 00:36:09,000 It does it - so a star arr is a string, arr is a pointer to a string. So 755 00:36:09,000 --> 00:36:13,000 both forms of delete, delete and delete bracket - So conceptually - - a pointer. - there is a start 756 00:36:13,000 --> 00:36:16,000 there because it's delete - Well effectively, yeah. It's delete the thing at the other end of the pointer, really. 757 00:36:16,000 --> 00:36:19,000 But it's funny. Delete says 758 00:36:19,000 --> 00:36:21,000 take this address and reclaim its contents. 759 00:36:21,000 --> 00:36:25,000 And so it doesn't really operate on a string, per se, it operates on the storage 760 00:36:25,000 --> 00:36:29,000 where that string is. And so I don't know whether you want to call that is there an implicit star there or 761 00:36:29,000 --> 00:36:33,000 not, it really is about the pointer though rather than the contents. 762 00:36:33,000 --> 00:36:38,000 So saying that address has some memory associated with it, reclaim that memory. If I could raise - 763 00:36:38,000 --> 00:36:39,000 Uh-huh. 764 00:36:39,000 --> 00:36:41,000 So when you're 765 00:36:41,000 --> 00:36:45,000 first declaring or when you're making a pointer like string bigger, string star bigger, 766 00:36:45,000 --> 00:36:47,000 you have to declare it with 767 00:36:47,000 --> 00:36:50,000 the star notion. But then later on, you don't ever have to use 768 00:36:50,000 --> 00:36:53,000 that again? You pretty much won't see that star used again. 769 00:36:53,000 --> 00:36:56,000 Right? It's interesting that things like bigger sub I and erase sub I 770 00:36:56,000 --> 00:37:00,000 implicitly have a D reference in them. And that can be misleading. You 771 00:37:00,000 --> 00:37:02,000 think, ''Well how come I'm never actually using that star again on that thing to 772 00:37:02,000 --> 00:37:04,000 get back to the strings that are out there?'' 773 00:37:04,000 --> 00:37:08,000 And it has to do with the fact that the bracket notation kind of implicitly D 774 00:37:08,000 --> 00:37:09,000 references in it. 775 00:37:09,000 --> 00:37:13,000 If I did a star bigger, it would actually have the effect of giving me bigger 776 00:37:13,000 --> 00:37:16,000 sub zero, it turns out. You can use that notation but it's 777 00:37:16,000 --> 00:37:18,000 not that common to need to. And 778 00:37:18,000 --> 00:37:19,000 so down on 779 00:37:19,000 --> 00:37:20,000 the last, 780 00:37:20,000 --> 00:37:23,000 one line up 781 00:37:23,000 --> 00:37:24,000 from the bottom, it says array equals bigger. You don't have to - Yeah, 782 00:37:24,000 --> 00:37:27,000 if you did that, if I did say - If you 783 00:37:27,000 --> 00:37:33,000 said array - Star arr equals star bigger, 784 00:37:33,000 --> 00:37:35,000 I would not be getting what I want. 785 00:37:35,000 --> 00:37:37,000 Right? What it would be doing is it would say 786 00:37:37,000 --> 00:37:40,000 follow bigger and see what's at the other end, so that would follow bigger and 787 00:37:40,000 --> 00:37:44,000 get that string A. And then it would say follow ARR and overwrite it 788 00:37:44,000 --> 00:37:45,000 with that A, 789 00:37:45,000 --> 00:37:49,000 so it would actually have the effect of only copying the first string 790 00:37:49,000 --> 00:37:51,000 from bigger on top of the first string of array. But array would still point to 791 00:37:51,000 --> 00:37:54,000 where it was, bigger would still point to where it was, and they would - we 792 00:37:54,000 --> 00:37:56,000 would've not have updated our, 793 00:37:56,000 --> 00:38:00,000 the pointer we really wanted to point to the new array. So 794 00:38:00,000 --> 00:38:02,000 there is a difference. Without the star, 795 00:38:02,000 --> 00:38:05,000 we're talking about the changing the pointers; with the star, we're 796 00:38:05,000 --> 00:38:08,000 talking about the strings at the other end. And so we're - this is a string 797 00:38:08,000 --> 00:38:10,000 assignment. It says assign one string to another. 798 00:38:10,000 --> 00:38:12,000 Without 799 00:38:12,000 --> 00:38:15,000 the star on it, it's like assign one pointer to another; 800 00:38:15,000 --> 00:38:18,000 make two pointers point to the same place. When you're done with this, 801 00:38:18,000 --> 00:38:23,000 bigger and arr will be aliases for the same location. 802 00:38:23,000 --> 00:38:25,000 That's a very important question though to get kind of 803 00:38:25,000 --> 00:38:30,000 what that star's doing for you. Here? After 804 00:38:30,000 --> 00:38:34,000 arr is bigger, can 805 00:38:34,000 --> 00:38:35,000 you delete bigger after that? If I deleted 806 00:38:35,000 --> 00:38:39,000 bigger, at that point arr is pointing to the same place. And so remember that 807 00:38:39,000 --> 00:38:42,000 having two or three or ten pointers all at the same place, if you delete one of them, 808 00:38:42,000 --> 00:38:46,000 they actually effectively are deleted. The delete really deletes the 809 00:38:46,000 --> 00:38:49,000 storage out here. And then if I did that, it would cause arr to then be pointing to this 810 00:38:49,000 --> 00:38:51,000 piece of memory, 811 00:38:51,000 --> 00:38:54,000 and not a good scene will come from that. It means that when it later 812 00:38:54,000 --> 00:38:57,000 goes back in there and starts trying to read and write to that contents at 813 00:38:57,000 --> 00:39:00,000 any moment it could kind of shift underneath you. You don't own it any 814 00:39:00,000 --> 00:39:02,000 more; it's not reserved for your use. 815 00:39:02,000 --> 00:39:05,000 So if we did that, we'd get ourselves into trouble. All 816 00:39:05,000 --> 00:39:08,000 right? So there should basically be a one-to-one correspondence between 817 00:39:08,000 --> 00:39:10,000 things you new and things you 818 00:39:10,000 --> 00:39:11,000 delete. And so 819 00:39:11,000 --> 00:39:14,000 in the myvector case, we newed something in the constructor that we're gonna delete in the 820 00:39:14,000 --> 00:39:18,000 destructor. If at some point along the way we got rid of our old one and get 821 00:39:18,000 --> 00:39:21,000 a new one, that's the new the one that's gonna get deleted. If we deleted it 822 00:39:21,000 --> 00:39:23,000 midstream here, we would just be asking for 823 00:39:23,000 --> 00:39:29,000 havoc when we start accessing that deleted memory. Way in the back? Is 824 00:39:29,000 --> 00:39:31,000 it possible to create a pointer 825 00:39:31,000 --> 00:39:33,000 to something - 826 00:39:33,000 --> 00:39:37,000 a pointer to the address that's one off the end of the original 827 00:39:37,000 --> 00:39:42,000 array and then just create an array just off the end there? Not really. So new doesn't let 828 00:39:42,000 --> 00:39:43,000 you decide where you want something. 829 00:39:43,000 --> 00:39:45,000 So you're point being to think, 830 00:39:45,000 --> 00:39:48,000 ''Well I can tell you what this address is, can I just make space right there 831 00:39:48,000 --> 00:39:53,000 and then I won't have to copy.'' And it turns out new just doesn't give you that kind of control. You ask it for space, it 832 00:39:53,000 --> 00:39:55,000 finds it wherever it has and you can't - 833 00:39:55,000 --> 00:39:58,000 there isn't even a mechanism where you could suggest where you'd like it to be. You could say, ''Well 834 00:39:58,000 --> 00:40:01,000 let that place right there would be really handy. Could you please give me that one?'' It 835 00:40:01,000 --> 00:40:02,000 just doesn't give it you. 836 00:40:02,000 --> 00:40:03,000 So 837 00:40:03,000 --> 00:40:05,000 you're - this is the way 838 00:40:05,000 --> 00:40:08,000 that typically you have to manage a dynamic array. And this is actually one of the big 839 00:40:08,000 --> 00:40:10,000 drawbacks 840 00:40:10,000 --> 00:40:13,000 to continuous memory as a reason for implementing things is that the 841 00:40:13,000 --> 00:40:16,000 fact that it has to maintain contiguousness. Means you have to shuffle and move 842 00:40:16,000 --> 00:40:18,000 and copy 843 00:40:18,000 --> 00:40:19,000 this block 844 00:40:19,000 --> 00:40:20,000 845 00:40:20,000 --> 00:40:23,000 without the flexibility of something like a link list where every cell is independently 846 00:40:23,000 --> 00:40:27,000 manipulated. 847 00:40:27,000 --> 00:40:29,000 There? Why does [inaudible] the delete 848 00:40:29,000 --> 00:40:34,000 brackets arr as delete just arr? So the difference is that if you allocated 849 00:40:34,000 --> 00:40:38,000 something with new string bracket, new something bracket, you need a delete bracket. 850 00:40:38,000 --> 00:40:41,000 If you actually use delete without the brackets, it thinks there's a single 851 00:40:41,000 --> 00:40:42,000 pointer and there's only one string at the other end. 852 00:40:42,000 --> 00:40:47,000 Where delete brackets says delete the whole gob of strings that are there. 853 00:40:47,000 --> 00:40:49,000 If you don't do it, it's not - the consequence is not that big; 854 00:40:49,000 --> 00:40:52,000 it's like some memory gets orphaned, some things don't happen. But 855 00:40:52,000 --> 00:40:56,000 to be totally correct, they go hand in hand: if you use new with 856 00:40:56,000 --> 00:40:59,000 brackets, use delete with no brackets, if you use new with brackets, use delete with brackets. 857 00:40:59,000 --> 00:41:02,000 So it's either both with brackets or both without. 858 00:41:02,000 --> 00:41:04,000 So even though arr is just point 859 00:41:04,000 --> 00:41:07,000 to the first place, the 860 00:41:07,000 --> 00:41:08,000 brackets knows the - 861 00:41:08,000 --> 00:41:12,000 Yeah, it does. And so that kind of makes me feel like I'm a lair because I said well the array 862 00:41:12,000 --> 00:41:15,000 doesn't know its length. Well it does somehow. Internally it is 863 00:41:15,000 --> 00:41:18,000 maintaining some housekeeping but it doesn't expose it to you. 864 00:41:18,000 --> 00:41:21,000 So when you say delete bracket arr it knows, ''Oh, there's a bunch of strings and I got 865 00:41:21,000 --> 00:41:25,000 to do a bunch of cleanup on them.'' But it doesn't ever expose that information back 866 00:41:25,000 --> 00:41:29,000 to you. It doesn't let you depend on it, so it's up to you to maintain that information redundantly 867 00:41:29,000 --> 00:41:34,000 with it. All 868 00:41:34,000 --> 00:41:36,000 right, let me see if I can make it a template. 869 00:41:36,000 --> 00:41:39,000 I probably can't do this actually fast enough to get it all done today, but we can at least get started 870 00:41:39,000 --> 00:41:41,000 on it. 871 00:41:41,000 --> 00:41:42,000 So then, I 872 00:41:42,000 --> 00:41:46,000 introduce a template header and 873 00:41:46,000 --> 00:41:49,000 I make up the name that I want here, so 874 00:41:49,000 --> 00:41:50,000 same class header now 875 00:41:50,000 --> 00:41:52,000 other than typing elem type. Then 876 00:41:52,000 --> 00:41:55,000 I look through my interface and I see places where I previously had said it's 877 00:41:55,000 --> 00:41:59,000 strings, it's strings, it's storing strings. And I say it's not actually storing strings; 878 00:41:59,000 --> 00:42:00,000 it's gonna store 879 00:42:00,000 --> 00:42:02,000 elem type things, 880 00:42:02,000 --> 00:42:06,000 it's gonna return elem type things 881 00:42:06,000 --> 00:42:09,000 and it's going to have an array of 882 00:42:09,000 --> 00:42:11,000 elem type things. 883 00:42:11,000 --> 00:42:13,000 So I think that's everything that happened 884 00:42:13,000 --> 00:42:17,000 to the interface. Let me see if I see any other places that I - so the 885 00:42:17,000 --> 00:42:18,000 interface part 886 00:42:18,000 --> 00:42:20,000 is kind of small. 887 00:42:20,000 --> 00:42:23,000 There's one other change I'm gonna have to make to it but I'm gonna come back to it. I'm gonna look 888 00:42:23,000 --> 00:42:25,000 at the code at the other side for a second. And I say, ''Okay, well 889 00:42:25,000 --> 00:42:28,000 that wasn't so bad.'' Now 890 00:42:28,000 --> 00:42:31,000 it turns out that it gets a little bit goopier over here 891 00:42:31,000 --> 00:42:33,000 because that template type name 892 00:42:33,000 --> 00:42:38,000 has to go on every one of these: 893 00:42:38,000 --> 00:42:40,000 introduce them to the template type and elem type. 894 00:42:40,000 --> 00:42:44,000 And now there's another place where it needs to show up. 895 00:42:44,000 --> 00:42:49,000 So the full syntax for this is now saying this is a template function, 896 00:42:49,000 --> 00:42:50,000 depending on elem type, 897 00:42:50,000 --> 00:42:52,000 and it's actually for the myvector 898 00:42:52,000 --> 00:42:54,000 who is being - 899 00:42:54,000 --> 00:42:58,000 we are writing the myvector constructor for something whose name is myvector 900 00:42:58,000 --> 00:43:00,000 angle bracket elem type. 901 00:43:00,000 --> 00:43:02,000 So there's gonna be a lot of this goo. Every one of these is kinda change 902 00:43:02,000 --> 00:43:05,000 its form, from just looking like the ordinary myvector class scope doesn't 903 00:43:05,000 --> 00:43:06,000 really exist any more. 904 00:43:06,000 --> 00:43:10,000 Myvector is now a template for which there's a lot of different class scopes, 905 00:43:10,000 --> 00:43:13,000 one for each kind of thing being stored. So myvector int is different than myvector 906 00:43:13,000 --> 00:43:14,000 string. 907 00:43:14,000 --> 00:43:15,000 So we say, ''Well, 908 00:43:15,000 --> 00:43:18,000 if you were building the myvector constructor for myvector string, it 909 00:43:18,000 --> 00:43:21,000 looks like this.'' Or you know 910 00:43:21,000 --> 00:43:24,000 having filled an elem type with those strings. So 911 00:43:24,000 --> 00:43:30,000 everywhere I was using string, I got to change to elem type in the body as well. And 912 00:43:30,000 --> 00:43:32,000 then I kind of take this guy 913 00:43:32,000 --> 00:43:38,000 and use it in a bunch of places. I'm gonna use it here 914 00:43:38,000 --> 00:43:39,000 and then I'm gonna have to do it down 915 00:43:39,000 --> 00:43:50,000 here, on that side, do it 916 00:43:50,000 --> 00:43:54,000 here, and it's gonna return something of elem type, 917 00:43:54,000 --> 00:43:56,000 here. It's a little bit of 918 00:43:56,000 --> 00:43:59,000 a mess to do this, and the code definitely gets a little bit goopier as a result 919 00:43:59,000 --> 00:44:00,000 of this. It doesn't 920 00:44:00,000 --> 00:44:08,000 look quite as pretty as it did when it wasn't a template, 921 00:44:08,000 --> 00:44:11,000 but it becomes a lot more useful. Okay. 922 00:44:11,000 --> 00:44:14,000 Then I need to look for places that I used a string. 923 00:44:14,000 --> 00:44:17,000 And every place where I was using string, assuming that's what I was storing, it 924 00:44:17,000 --> 00:44:20,000 now actually turns into elem type. So 925 00:44:20,000 --> 00:44:23,000 my pointers and the kind of array I'm allocating is actually now made into elem 926 00:44:23,000 --> 00:44:24,000 type. 927 00:44:24,000 --> 00:44:27,000 The rest of the code actually didn't say anything specific about what's its doing, just copying 928 00:44:27,000 --> 00:44:30,000 things from one array to another. And now, depending on what the arrays are, it's 929 00:44:30,000 --> 00:44:32,000 copying ints or strings or doubles. 930 00:44:32,000 --> 00:44:36,000 And then other places in the interface where I'm doing add or I'm going get at, I have to be 931 00:44:36,000 --> 00:44:40,000 describing the things that are coming in and out as elem type so that they can be 932 00:44:40,000 --> 00:44:42,000 matched to whatever the client's using. I 933 00:44:42,000 --> 00:44:45,000 think the rest of it looks okay. Why do you have to write 934 00:44:45,000 --> 00:44:47,000 template type name, and elem type above 935 00:44:47,000 --> 00:44:50,000 every - Because you just have to, because it's C++. Because the thing is, 936 00:44:50,000 --> 00:44:53,000 that piece of code is, itself, a template, so 937 00:44:53,000 --> 00:44:56,000 these are like little mini-templates. So that I had the interface, which said here's the 938 00:44:56,000 --> 00:44:58,000 template pattern for the 939 00:44:58,000 --> 00:45:02,000 interface, and each of these says when you're ready to make the size member function for a vector of int, 940 00:45:02,000 --> 00:45:05,000 it comes off this template. So this template describes what the size member function 941 00:45:05,000 --> 00:45:09,000 looks like for any of the myvectors you might instantiate. And it describes the 942 00:45:09,000 --> 00:45:12,000 template because, in fact, we need to build a new size for 943 00:45:12,000 --> 00:45:14,000 ints versus doubles versus strings. 944 00:45:14,000 --> 00:45:17,000 It's even funny because you think of my size like, ''Well size doesn't even 945 00:45:17,000 --> 00:45:21,000 use anything related to the elem type.'' But in fact, each of the member functions 946 00:45:21,000 --> 00:45:25,000 is kinda specific. It's not just a myvector size; it's the myvector int size, the 947 00:45:25,000 --> 00:45:26,000 myvector string size. 948 00:45:26,000 --> 00:45:30,000 And that for some of the member functions it's quite obvious why you need a distinct 949 00:45:30,000 --> 00:45:33,000 copy. Get at returns an int in some cases and a double in others; 950 00:45:33,000 --> 00:45:36,000 but even though ones that don't appear to have any dependence on the elem type, 951 00:45:36,000 --> 00:45:37,000 actually are 952 00:45:37,000 --> 00:45:40,000 separated into their own 953 00:45:40,000 --> 00:45:41,000 individual versions. 954 00:45:41,000 --> 00:45:45,000 So I think I got all of that fixed, and then I'm gonna have to do one thing that's gonna seem 955 00:45:45,000 --> 00:45:49,000 really quirky. And it is very quirky but it is C++. Let me show 956 00:45:49,000 --> 00:45:51,000 you what I'm gonna do. 957 00:45:51,000 --> 00:45:59,000 Is I'm going [inaudible] out of the project. Okay, 958 00:45:59,000 --> 00:46:01,000 stop compiling that. 959 00:46:01,000 --> 00:46:01,000 And I'm gonna change 960 00:46:01,000 --> 00:46:12,000 how it is that myvector gets compiled by doing this. Okay. Take 961 00:46:12,000 --> 00:46:14,000 a deep breath. 962 00:46:14,000 --> 00:46:15,000 This is really 963 00:46:15,000 --> 00:46:18,000 just an oddity of C++. 964 00:46:18,000 --> 00:46:20,000 So the situation is this: that 965 00:46:20,000 --> 00:46:23,000 templates aren't really compiled ahead of time, 966 00:46:23,000 --> 00:46:25,000 templates are just patterns. 967 00:46:25,000 --> 00:46:26,000 You know? They like describe 968 00:46:26,000 --> 00:46:30,000 a recipe for how you would build a myvector class. 969 00:46:30,000 --> 00:46:34,000 But you can't just compile myvector and be done with it because until the client 970 00:46:34,000 --> 00:46:37,000 uses it, you don't know what kind of myvectors you're building. Are they myvectors of ints 971 00:46:37,000 --> 00:46:39,000 or strings or pseudo structures? 972 00:46:39,000 --> 00:46:43,000 So it turns out that the myvector really needs to get compiled 973 00:46:43,000 --> 00:46:46,000 at the usage, at the instantiation. When you're ready to make a myvector of 974 00:46:46,000 --> 00:46:50,000 students, it then needs to see all the code for myvector so it can go build you a 975 00:46:50,000 --> 00:46:53,000 myvector for students on the fly. 976 00:46:53,000 --> 00:46:56,000 In order to see that code, it actually has to be present in a different way than 977 00:46:56,000 --> 00:46:59,000 most code. Most code is compiled, instead of .cpp, it just 978 00:46:59,000 --> 00:47:02,000 gets compiled once and once for all. The random library, random integer doesn't 979 00:47:02,000 --> 00:47:06,000 change for anybody usage, there's a random.cpp. It compiled the function. 980 00:47:06,000 --> 00:47:07,000 You're done. 981 00:47:07,000 --> 00:47:11,000 So the template code does not get compiled ahead of time. It doesn't get listed in the 982 00:47:11,000 --> 00:47:13,000 project. What happens is the .h 983 00:47:13,000 --> 00:47:17,000 typically has not only the interface, but actually all the code. 984 00:47:17,000 --> 00:47:21,000 And so the two ways to get the code in here, one way is I could've just put all the 985 00:47:21,000 --> 00:47:24,000 code down here. And that's the way a lot of professional code gets written, it has 986 00:47:24,000 --> 00:47:28,000 the interface followed by all the template code right after it. I 987 00:47:28,000 --> 00:47:32,000 like to keep us thinking about the interface and the implementation being 988 00:47:32,000 --> 00:47:32,000 separate, 989 00:47:32,000 --> 00:47:35,000 so I'm actually taking the interface and keeping the .h, keeping this [inaudible] 990 00:47:35,000 --> 00:47:36,000 over here 991 00:47:36,000 --> 00:47:42,000 in a .cpp. And then I'm using the #include mechanism in a very unusual way. That almost 992 00:47:42,000 --> 00:47:45,000 never would you want to, in a regular usage, to #include another 993 00:47:45,000 --> 00:47:49,000 .cpp file. For templates, we're making an exception. And we're saying, ''Well 994 00:47:49,000 --> 00:47:53,000 in this case, because I really need that code there,'' the #include mechanism is 995 00:47:53,000 --> 00:47:57,000 basically saying go take the contents of this thing and just dump it in here. 996 00:47:57,000 --> 00:48:01,000 It really is an include mechanism. It says, ''Go get this file and take its text 997 00:48:01,000 --> 00:48:03,000 contents and dump it right into this file.'' 998 00:48:03,000 --> 00:48:06,000 So that when somebody's trying to import the myvector.h, 999 00:48:06,000 --> 00:48:09,000 they're getting both the interface plus all the code that we'll generate a 1000 00:48:09,000 --> 00:48:11,000 pattern from. 1001 00:48:11,000 --> 00:48:13,000 So this is definitely just a quirk. 1002 00:48:13,000 --> 00:48:14,000 There's no 1003 00:48:14,000 --> 00:48:16,000 1004 00:48:16,000 --> 00:48:18,000 consistency between how other languages that do stuff like this expect 1005 00:48:18,000 --> 00:48:19,000 this. 1006 00:48:19,000 --> 00:48:22,000 This is just unique to C++ and its compilation mechanisms that require 1007 00:48:22,000 --> 00:48:24,000 this kind of sort of slight 1008 00:48:24,000 --> 00:48:26,000 variation in handling. 1009 00:48:26,000 --> 00:48:29,000 So we'll see this for all the templates we'll use is that 1010 00:48:29,000 --> 00:48:33,000 they will not be included as normal cpp files, they will get included in the .h. 1011 00:48:33,000 --> 00:48:35,000 And there is this exact pattern, which is 1012 00:48:35,000 --> 00:48:39,000 reproduced for every one of the ones in the text. You'll see it on stack and queue and 1013 00:48:39,000 --> 00:48:44,000 integer. That it becomes the kind of boilerplate we'll use when making a template. So 1014 00:48:44,000 --> 00:48:45,000 in general, I'd say 1015 00:48:45,000 --> 00:48:49,000 be very wary of anything that looks like this. This is not a normal thing to do 1016 00:48:49,000 --> 00:48:52,000 and we're doing it just specifically to kind of keep up the illusion that the 1017 00:48:52,000 --> 00:48:55,000 interface and the implementation are kept separate because there's actually 1018 00:48:55,000 --> 00:48:57,000 some good thinking that comes from that. 1019 00:48:57,000 --> 00:49:00,000 But the way the compiler sees it, it doesn't want them to be separate, and so 1020 00:49:00,000 --> 00:49:06,000 we have to accommodate it with this little hack, let's say, here. 1021 00:49:06,000 --> 00:49:08,000 So once I've done that, 1022 00:49:08,000 --> 00:49:10,000 I go back to lecture. 1023 00:49:10,000 --> 00:49:14,000 If I change this to be myvector string, 1024 00:49:14,000 --> 00:49:16,000 I'm hoping 1025 00:49:16,000 --> 00:49:19,000 that everything will still work. 1026 00:49:19,000 --> 00:49:20,000 Which it did, kind 1027 00:49:20,000 --> 00:49:21,000 of amazingly. Daniel? 1028 00:49:21,000 --> 00:49:23,000 So 1029 00:49:23,000 --> 00:49:27,000 where is the myvector.cpp file at? So it's actually just living in the same directory with this, 1030 00:49:27,000 --> 00:49:30,000 the way myvector.h is. So typically, like your .h files are just sitting in the directory - .cpp is sitting in the 1031 00:49:30,000 --> 00:49:34,000 same directory. That's where it's gonna look for it when it 1032 00:49:34,000 --> 00:49:38,000 goes #including is in the kind of local contents. But like 1033 00:49:38,000 --> 00:49:39,000 where is that? Like is it in 1034 00:49:39,000 --> 00:49:42,000 resources? No, it's just - if you look - you know this is the directory I have, this 1035 00:49:42,000 --> 00:49:43,000 is the 1036 00:49:43,000 --> 00:49:47,000 contents of my, you know all my files, my project files, where the thing gets dumped. Oh, okay. It's just sitting there with all 1037 00:49:47,000 --> 00:49:50,000 the code. 1038 00:49:50,000 --> 00:49:53,000 And I should be able to 1039 00:49:53,000 --> 00:49:57,000 change this now to put some numbers in 1040 00:49:57,000 --> 00:50:03,000 and have it do both. I just did it 1041 00:50:03,000 --> 00:50:05,000 with strings and now I'm gonna do it with ints. 1042 00:50:05,000 --> 00:50:06,000 And 1043 00:50:06,000 --> 00:50:10,000 voila, we now have something that holds ints. So a certain 1044 00:50:10,000 --> 00:50:12,000 amount of goo that went from 1045 00:50:12,000 --> 00:50:14,000 the simple form to the template form, 1046 00:50:14,000 --> 00:50:18,000 but a lot of power gained. Suddenly we took this thing that was one purpose, 1047 00:50:18,000 --> 00:50:21,000 that held strings only, and you just made it to where it can hold anything you 1048 00:50:21,000 --> 00:50:23,000 can think of to stick in there 1049 00:50:23,000 --> 00:50:24,000 1050 00:50:24,000 --> 00:50:26,000 by making 1051 00:50:26,000 --> 00:50:28,000 little machinations in the 1052 00:50:28,000 --> 00:50:29,000 syntax there. 1053 00:50:29,000 --> 00:50:34,000 So we'll see a lot more of this. It's not the first, nor the last, syntax that - for templates that we're gonna be playing 1054 00:50:34,000 --> 00:50:36,000 with, but that will be next week. Come to Cafe with me, if you 1055 00:50:36,000 --> 00:50:45,000 got some time. Otherwise, I will see you on Monday.