1 00:00:00,000 --> 00:00:11,000 2 00:00:11,000 --> 00:00:14,000 This presentation is delivered by the Stanford Center for Professional Development. 3 00:00:14,000 --> 00:00:24,000 4 00:00:24,000 --> 00:00:26,000 Any administrative questions on your mind? 5 00:00:26,000 --> 00:00:28,000 How many people have actually successfully installed a compiler? 6 00:00:28,000 --> 00:00:31,000 Have stuff working - okay, so that's like a 7 00:00:31,000 --> 00:00:34,000 third of you, good to know. 8 00:00:34,000 --> 00:00:36,000 Remaining two thirds, you 9 00:00:36,000 --> 00:00:40,000 want to get on it. Okay, 10 00:00:40,000 --> 00:00:45,000 so we started to talk about this on Monday, and I'm gonna 11 00:00:45,000 --> 00:00:48,000 try to finish off the things that I had started to get you thinking about; 12 00:00:48,000 --> 00:00:52,000 about how input/output works in C++. We've seen the simple 13 00:00:52,000 --> 00:00:53,000 forms of 14 00:00:53,000 --> 00:00:57,000 using stream insertion, the less-than less-than (<<) operator, to push things on to cout, 15 00:00:57,000 --> 00:00:59,000 the Console Output Stream. 16 00:00:59,000 --> 00:01:02,000 cout is capable of writing all the 17 00:01:02,000 --> 00:01:06,000 basic types that are built into C++, ints and doubles and chars and strings, 18 00:01:06,000 --> 00:01:06,000 right, 19 00:01:06,000 --> 00:01:09,000 by virtue of just sort of putting the stream on the left and the thing you want on 20 00:01:09,000 --> 00:01:10,000 the right, it will kind of 21 00:01:10,000 --> 00:01:14,000 take that thing and push it out onto the stream. You can chain those 22 00:01:14,000 --> 00:01:17,000 together with lots and lots of those << to get a whole bunch 23 00:01:17,000 --> 00:01:20,000 of things, and then the endl is the - what's called a stream 24 00:01:20,000 --> 00:01:24,000 manipulator that produces a new line, starts the next line of text, a line 25 00:01:24,000 --> 00:01:25,000 beneath that.
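As a small illustration of the chaining just described, here is a sketch that inserts several basic types in one statement. The names `printSummary` and `summaryText` are made up for this example; `printSummary` accepts any `std::ostream&`, of which `cout` is one instance.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: chains several << insertions onto one stream.
// The stream goes on the left, each value to write goes on the right.
void printSummary(std::ostream& out, const std::string& name, int count, double average) {
    out << name << " has " << count << " scores, average "
        << average << std::endl;   // endl ends the line
}

// The same chain captured in a string, which makes the result easy to inspect.
std::string summaryText(const std::string& name, int count, double average) {
    std::ostringstream buffer;
    printSummary(buffer, name, count, average);
    return buffer.str();
}
```

Calling `printSummary(std::cout, "Julie", 3, 9.5);` would write the same single line to the console.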
26 00:01:25,000 --> 00:01:30,000 The analog to that on the reading side is the stream 27 00:01:30,000 --> 00:01:33,000 extraction operator, which is the >>. And then when applied 28 00:01:33,000 --> 00:01:36,000 to an input stream it attempts to sort of take where the cursor position is in 29 00:01:36,000 --> 00:01:42,000 the input stream and read the next characters using the expected format 30 00:01:42,000 --> 00:01:46,000 given by the type of the thing you're trying to extract. So in this case 31 00:01:46,000 --> 00:01:50,000 what I'm saying is cin >> x, extract an integer here, x being an integer. 32 00:01:50,000 --> 00:01:53,000 What it's gonna look for in the input stream is it's going to skip 33 00:01:53,000 --> 00:01:55,000 over white space. So by default 34 00:01:55,000 --> 00:01:58,000 the stream extraction always skips over any leading white space. That means tabs, 35 00:01:58,000 --> 00:01:59,000 new lines, 36 00:01:59,000 --> 00:02:02,000 and ordinary space characters. So 37 00:02:02,000 --> 00:02:05,000 it scans up to that, gets to the first non-space character 38 00:02:05,000 --> 00:02:09,000 and then starts assuming that what should be there is a number, and so 39 00:02:09,000 --> 00:02:12,000 a number being a sequence of digit characters. And in this case, because it's an 40 00:02:12,000 --> 00:02:16,000 integer, it shouldn't have a dot or any of the exponent sort of things that 41 00:02:16,000 --> 00:02:17,000 a real number 42 00:02:17,000 --> 00:02:20,000 would. If it runs into something that's not an integer, it 43 00:02:20,000 --> 00:02:24,000 runs into a character, it runs into punctuation, it runs into a 44 00:02:24,000 --> 00:02:26,000 39.5, 45 00:02:26,000 --> 00:02:29,000 what happens is that the stream goes into a fail state 46 00:02:29,000 --> 00:02:33,000 where it says, I - you told me to expect an integer. What I read next wasn't an 47 00:02:33,000 --> 00:02:33,000 integer.
48 00:02:33,000 --> 00:02:36,000 I don't know how to make heads or tails of this. So it basically just 49 00:02:36,000 --> 00:02:38,000 throws up its hands. 50 00:02:38,000 --> 00:02:39,000 And so it - 51 00:02:39,000 --> 00:02:43,000 at that point the stream is - it requires you to kind of intervene, 52 00:02:43,000 --> 00:02:45,000 check the fail state, see that something's wrong, 53 00:02:45,000 --> 00:02:47,000 clear that fail state, 54 00:02:47,000 --> 00:02:50,000 decide what to do about it, kind of restart, and kind of pick up where you left off. It 55 00:02:50,000 --> 00:02:52,000 makes for kind of messy handling 56 00:02:52,000 --> 00:02:53,000 57 00:02:53,000 --> 00:02:57,000 to have all that code kind of in your face when you're trying to do that reading, 58 00:02:57,000 --> 00:03:01,000 and that's actually why we've provided the things like GetInteger, GetLine and 59 00:03:01,000 --> 00:03:03,000 GetReal, in the simpio library 60 00:03:03,000 --> 00:03:05,000 to just manage that for you. 61 00:03:05,000 --> 00:03:09,000 Basically what they're doing is in a loop they're trying to read that integer 62 00:03:09,000 --> 00:03:09,000 63 00:03:09,000 --> 00:03:13,000 off the console. And if it fails, right, resetting the stream, 64 00:03:13,000 --> 00:03:16,000 going back around asking the user to type in 65 00:03:16,000 --> 00:03:20,000 - give it another try, until they get something that's well formed. So 66 00:03:20,000 --> 00:03:22,000 typically we're just going to use these, 67 00:03:22,000 --> 00:03:24,000 because they just provide conveniences. You could certainly use this, but it would just 68 00:03:24,000 --> 00:03:28,000 require more effort on your part to kind of manage the error conditions and retry 69 00:03:28,000 --> 00:03:31,000 and whatnot. So 70 00:03:31,000 --> 00:03:35,000 that's why it's there.
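The check, clear, and retry sequence described above could look roughly like the sketch below. This is not the source of the course library's helper; `readIntWithRetry` and `readIntFromText` are hypothetical names, and returning 0 when the input runs out is an assumption made so the loop always terminates.

```cpp
#include <iostream>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical stand-in for a read-an-integer helper that retries on bad input.
int readIntWithRetry(std::istream& in, std::ostream& out) {
    int value = 0;
    while (true) {
        in >> value;                 // skips leading whitespace, then expects digits
        if (!in.fail()) break;       // a well-formed integer was read
        if (in.eof()) return 0;      // input ran out: give up (assumed fallback)
        in.clear();                  // reset the fail state so the stream works again
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n'); // drop the bad line
        out << "Not an integer, try again: ";
    }
    return value;
}

// Runs the loop on canned text standing in for what a user might type.
int readIntFromText(const std::string& typed) {
    std::istringstream fakeConsole(typed);
    std::ostringstream prompts;      // prompts are collected and discarded
    return readIntWithRetry(fakeConsole, prompts);
}
```

With a real console the call would be `readIntWithRetry(std::cin, std::cout)`.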
71 00:03:35,000 --> 00:03:38,000 The C++ file I/O; so 72 00:03:38,000 --> 00:03:42,000 the console is actually just a particular instance of a stream. Cout and cin 73 00:03:42,000 --> 00:03:47,000 are the streams that are attached to the user's interface console there. 74 00:03:47,000 --> 00:03:51,000 The same sort of mechanism is used to read files on disk, so text files on 75 00:03:51,000 --> 00:03:53,000 disk that have contents you'd like to 76 00:03:53,000 --> 00:03:56,000 pull into a database, or you want to write some information out to a file, you 77 00:03:56,000 --> 00:03:58,000 use the file stream for that. 78 00:03:58,000 --> 00:04:01,000 There is a header called fstream, a standard C++ header 79 00:04:01,000 --> 00:04:02,000 in this case, so 80 00:04:02,000 --> 00:04:04,000 enclosed in angle brackets, 81 00:04:04,000 --> 00:04:08,000 that declares the ifstream and the ofstream. The input file stream for reading, 82 00:04:08,000 --> 00:04:12,000 the output file stream for writing. 83 00:04:12,000 --> 00:04:14,000 Declaring these variables; this [inaudible] 84 00:04:14,000 --> 00:04:18,000 just sets up a default stream that is not connected to anything on disk. 85 00:04:18,000 --> 00:04:22,000 Before you do anything with it you really do need to attach it to some 86 00:04:22,000 --> 00:04:25,000 named location, some file by name on your disk 87 00:04:25,000 --> 00:04:27,000 to have the right thing happen, to read from some 88 00:04:27,000 --> 00:04:29,000 contents, or to write the contents somewhere. 89 00:04:29,000 --> 00:04:32,000 The operation that does that is open, 90 00:04:32,000 --> 00:04:34,000 so the ifstream and the ofstream are objects, 91 00:04:34,000 --> 00:04:39,000 so dot notation is used to send messages to them. In this case, telling the 92 00:04:39,000 --> 00:04:40,000 input stream 93 00:04:40,000 --> 00:04:45,000 to open the file whose name is "names.txt."
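A minimal sketch of declaring the two file stream types and attaching them with open. The helper names are invented for illustration, and failure checks are left out here to keep the focus on the declare-then-open pattern.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: writes two names, one per line, to the named file.
void writeSampleNames(const char* filename) {
    std::ofstream out;          // declared, not yet attached to any file
    out.open(filename);         // attach it to a file by name
    out << "Julie" << std::endl << "Diane" << std::endl;
}   // the stream closes its file when it goes out of scope

// Hypothetical helper: opens the named file and extracts its first word.
std::string readFirstName(const char* filename) {
    std::ifstream in;           // declared, not yet attached to any file
    in.open(filename);          // attach it; dot notation sends open to the object
    std::string name;
    in >> name;                 // same extraction operator that cin uses
    return name;
}
```

A literal such as `"names.txt"` can be passed directly as the `filename` argument.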
94 00:04:45,000 --> 00:04:48,000 The behavior for open is to assume that you meant the file in the current 95 00:04:48,000 --> 00:04:52,000 directory if you don't otherwise give a more fully specified path. So 96 00:04:52,000 --> 00:04:56,000 this is almost always the way we're going to do this, we're just going to open a file by name. It's going to look 97 00:04:56,000 --> 00:04:59,000 for it in the project directory, where your code is, where your project is, so 98 00:04:59,000 --> 00:05:01,000 kind of right there locally. 99 00:05:01,000 --> 00:05:04,000 Now this will look for a file whose name is exactly names.txt, 100 00:05:04,000 --> 00:05:09,000 and then from that point the file position, the cursor, we 101 00:05:09,000 --> 00:05:12,000 call it, is positioned at the beginning of the input stream. The first character read 102 00:05:12,000 --> 00:05:16,000 will be the first character of names.txt, and as you move forward 103 00:05:16,000 --> 00:05:18,000 it will read its way all the way to the end. 104 00:05:18,000 --> 00:05:21,000 Similarly, doing an out.open, 105 00:05:21,000 --> 00:05:25,000 it opens a file and kind of positions the writing at the very beginning 106 00:05:25,000 --> 00:05:27,000 that will - the first character written will be the first character in that file when 107 00:05:27,000 --> 00:05:32,000 you finish, and they'll be written in sequence. 108 00:05:32,000 --> 00:05:36,000 So this is one of those places, actually, probably the only one that this 109 00:05:36,000 --> 00:05:39,000 direction is going to be relevant for. I talked a little bit last time about C-strings 110 00:05:39,000 --> 00:05:42,000 and C++ strings, and you might have been a little bit 111 00:05:42,000 --> 00:05:43,000 worried about why 112 00:05:43,000 --> 00:05:45,000 I'm telling you you need to know that both exist.
113 00:05:45,000 --> 00:05:47,000 And so last time I talked a little about 114 00:05:47,000 --> 00:05:51,000 one way in which C-strings don't do what you think, in that one case of 115 00:05:51,000 --> 00:05:55,000 concatenation, and how you can do a - force a conversion from the old to the new. 116 00:05:55,000 --> 00:05:57,000 Now, I also mentioned that there was a conversion that went in the 117 00:05:57,000 --> 00:06:01,000 opposite direction. You had a new string, and you wanted the old one. 118 00:06:01,000 --> 00:06:02,000 And 119 00:06:02,000 --> 00:06:04,000 one of the first questions you might ask is well why would I ever want to do that? Why 120 00:06:04,000 --> 00:06:08,000 would I ever want to go backwards? Why do I want to move back to the older yucky thing? 121 00:06:08,000 --> 00:06:10,000 This is the case that comes up; 122 00:06:10,000 --> 00:06:12,000 the open operation 123 00:06:12,000 --> 00:06:15,000 on ifstream and ofstream 124 00:06:15,000 --> 00:06:19,000 expects its argument to be specified as an old style string. 125 00:06:19,000 --> 00:06:22,000 This is actually just an artifact; it has to do with - 126 00:06:22,000 --> 00:06:24,000 the group that was working on 127 00:06:24,000 --> 00:06:27,000 designing the stream package and the group that was designing the string package were 128 00:06:27,000 --> 00:06:31,000 not in sync, and they were not working together. The stream package was 129 00:06:31,000 --> 00:06:33,000 finalized before the string package was ready 130 00:06:33,000 --> 00:06:37,000 and so it depended on what was available at the time and that was only the old style 131 00:06:37,000 --> 00:06:37,000 string. 132 00:06:37,000 --> 00:06:41,000 So as a result, it wants an old style string, and that's what it takes, and you 133 00:06:41,000 --> 00:06:43,000 can't give it a C++ 134 00:06:43,000 --> 00:06:47,000 
So in double quotes - so this is the case where the double quotes 135 00:06:47,000 --> 00:06:49,000 are actually old style strings, 136 00:06:49,000 --> 00:06:52,000 in almost all situations gets converted on your behalf automatically. 137 00:06:52,000 --> 00:06:55,000 In this case it's not being converted and it's exactly what's wanted. 138 00:06:55,000 --> 00:06:59,000 So if you have a name that's a string constant or a literal, you can just pass it 139 00:06:59,000 --> 00:07:01,000 in double quotes to open. 140 00:07:01,000 --> 00:07:03,000 If you have a C++ variable, 141 00:07:03,000 --> 00:07:07,000 so you've asked the user for what file to open, and you've used getline to 142 00:07:07,000 --> 00:07:08,000 read it into a string, 143 00:07:08,000 --> 00:07:13,000 if you try to pass that C++ string variable to open, it will not match 144 00:07:13,000 --> 00:07:14,000 what it's expecting. 145 00:07:14,000 --> 00:07:18,000 I do need to do that conversion asking it to go .c_str 146 00:07:18,000 --> 00:07:21,000 to convert itself into the old style format. 147 00:07:21,000 --> 00:07:24,000 So that was sort of where I was getting to when I kind of 148 00:07:24,000 --> 00:07:26,000 positioned you to realize this was gonna 149 00:07:26,000 --> 00:07:29,000 someday come up. This is the one piece of the interface that will interact with this 150 00:07:29,000 --> 00:07:31,000 quarter that requires that old string, 151 00:07:31,000 --> 00:07:33,000 where you'll have to make that effort to 152 00:07:33,000 --> 00:07:36,000 convert it backwards. 153 00:07:36,000 --> 00:07:39,000 Both of these operations can fail. 154 00:07:39,000 --> 00:07:43,000 When you open a file and [inaudible] - question here? So how hard 155 00:07:43,000 --> 00:07:44,000 [inaudible]? 156 00:07:44,000 --> 00:07:48,000 You know it's obviously extremely easy to do it; 157 00:07:48,000 --> 00:07:50,000 the issue has to do with compatibility. 
158 00:07:50,000 --> 00:07:54,000 They announced it this way, people wrote code that expected it this way 159 00:07:54,000 --> 00:07:57,000 and then you change it out from under them and all this code breaks that used 160 00:07:57,000 --> 00:07:58,000 to work. 161 00:07:58,000 --> 00:08:01,000 And so as a result of this [inaudible] 162 00:08:01,000 --> 00:08:02,000 compatibility, it's an issue of 163 00:08:02,000 --> 00:08:04,000 once we kind of published it and we told people this was how it works, we can't 164 00:08:04,000 --> 00:08:07,000 really take it away from them. And so part of that's - sort of part of what we're dealing with in C++ too, 165 00:08:07,000 --> 00:08:10,000 which is things that used to work in C still need to work in C++, 166 00:08:10,000 --> 00:08:11,000 and so as a result 167 00:08:11,000 --> 00:08:14,000 there's a certain amount of history that we're all carrying forward with us in a very 168 00:08:14,000 --> 00:08:16,000 annoying way. I totally agree 169 00:08:16,000 --> 00:08:19,000 that it seems like we could just fix it, but we would break a lot of code in the process 170 00:08:19,000 --> 00:08:21,000 and anger a lot of 171 00:08:21,000 --> 00:08:24,000 existing programmers. 172 00:08:24,000 --> 00:08:28,000 So both of these open calls could fail; you might try to open a file and it 173 00:08:28,000 --> 00:08:31,000 doesn't exist, you don't have the permissions for it, you spelled the name wrong. 174 00:08:31,000 --> 00:08:34,000 Similarly trying to open it for writing, it's like you might not have write 175 00:08:34,000 --> 00:08:36,000 permission in the directory. 176 00:08:36,000 --> 00:08:38,000 And 177 00:08:38,000 --> 00:08:41,000 in either situation you need to know, well did it open or did it not? 178 00:08:41,000 --> 00:08:44,000 There's not a return value from open that tells you that.
179 00:08:44,000 --> 00:08:47,000 What there is is a member function called 180 00:08:47,000 --> 00:08:52,000 .fail, that you can ask the stream at any point, are you in a fail state. So for 181 00:08:52,000 --> 00:08:54,000 operations that actually kinda have a chance of succeeding or failing on the 182 00:08:54,000 --> 00:08:57,000 stream, you'll tend to actually almost write the code as a 183 00:08:57,000 --> 00:08:57,000 try it 184 00:08:57,000 --> 00:09:01,000 then check in.fail. So try to read this thing, check in.fail. Try to 185 00:09:01,000 --> 00:09:03,000 open this file, check in.fail as your way 186 00:09:03,000 --> 00:09:07,000 of following up on did it work and making sure that you 187 00:09:07,000 --> 00:09:09,000 have good contents before you keep going. 188 00:09:09,000 --> 00:09:12,000 If the in.open has failed, 189 00:09:12,000 --> 00:09:14,000 then every subsequent read on it will fail. 190 00:09:14,000 --> 00:09:17,000 Once the stream is in a fail state, nothing works. You can't read 191 00:09:17,000 --> 00:09:20,000 or write or do anything with it until you fix the error, 192 00:09:20,000 --> 00:09:23,000 and that's the in.clear 193 00:09:23,000 --> 00:09:26,000 command that kind of resets the state back into a known good state, 194 00:09:26,000 --> 00:09:29,000 and then you have a chance to retry. So for example, if you were trying to open a 195 00:09:29,000 --> 00:09:32,000 file that the user gave you a name for, 196 00:09:32,000 --> 00:09:35,000 they might type the name wrong. So you could try in.open, check 197 00:09:35,000 --> 00:09:36,000 in.fail. 198 00:09:36,000 --> 00:09:40,000 If it failed, say no, no, I couldn't open that file, why don't you try again, get a new 199 00:09:40,000 --> 00:09:41,000 name, 200 00:09:41,000 --> 00:09:45,000 and then you'd clear the state, come back around and try another in.open 201 00:09:45,000 --> 00:09:51,000 to - until you get one that succeeds.
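The open, check, clear, and retry loop just described might be sketched as follows. `openWithRetry` is a hypothetical name, and reading whitespace-free file names with `>>` is a simplification of this sketch. The `.c_str()` call supplies the old-style string that `open` expects in this lecture; C++11 and later also provide an `open` overload that takes a `std::string` directly.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: keeps asking for a file name until one opens.
// Names come from `names` (for example std::cin); complaints go to `messages`.
// Returns false only if the supply of names runs out first.
bool openWithRetry(std::ifstream& in, std::istream& names, std::ostream& messages) {
    std::string filename;
    while (true) {
        names >> filename;               // next candidate name (no spaces allowed)
        if (names.fail()) return false;  // nothing left to try
        in.open(filename.c_str());       // convert to the old-style string for open
        if (!in.fail()) return true;     // opened: ready for reading
        messages << "Couldn't open " << filename << ", try again." << std::endl;
        in.clear();                      // reset the fail state before the next attempt
    }
}
```

Interactive use would be `openWithRetry(in, std::cin, std::cout)` with `in` declared as a `std::ifstream`.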
202 00:09:51,000 --> 00:09:53,000 Once you have one of those guys open 203 00:09:53,000 --> 00:09:55,000 for reading or writing, 204 00:09:55,000 --> 00:09:56,000 there are three 205 00:09:56,000 --> 00:10:01,000 main ways that you can do your input/output. 206 00:10:01,000 --> 00:10:06,000 We have seen this form a little bit, this one with the insertion/extraction, 207 00:10:06,000 --> 00:10:09,000 these other two are more likely to be useful in the file-reading case as 208 00:10:09,000 --> 00:10:12,000 opposed to the interacting-with-the-user case, and they have to do with just 209 00:10:12,000 --> 00:10:15,000 breaking down the input 210 00:10:15,000 --> 00:10:17,000 in a more 211 00:10:17,000 --> 00:10:19,000 fine-grained way. 212 00:10:19,000 --> 00:10:21,000 Let's say this first one is reading and writing single characters. It might be 213 00:10:21,000 --> 00:10:24,000 that all I want to do is just go through the file and read it character by character. 214 00:10:24,000 --> 00:10:27,000 Maybe what I'm trying to write is something that will just count the characters and 215 00:10:27,000 --> 00:10:30,000 produce a frequency count across 216 00:10:30,000 --> 00:10:33,000 the file, tell me how many A's and B's and C's are in it, 217 00:10:33,000 --> 00:10:35,000 or just tell me how many characters are in the file at all. 218 00:10:35,000 --> 00:10:37,000 in.get 219 00:10:37,000 --> 00:10:40,000 is the member function that you send to an input file stream 220 00:10:40,000 --> 00:10:42,000 that will retrieve the next character. 221 00:10:42,000 --> 00:10:46,000 If [inaudible] the next character from the stream it returns EOF when there are no 222 00:10:46,000 --> 00:10:49,000 more characters. EOF is the end of file marker, it's actually capital 223 00:10:49,000 --> 00:10:53,000 EOF, it's the constant that's defined with the class.
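A character-by-character pass of the kind described here could be sketched like this. The helper name `countChar` and its parameters are invented for the example, and the sketch takes `EOF` from `<cstdio>`.

```cpp
#include <cstdio>    // EOF
#include <fstream>

// Hypothetical helper: counts how often `target` appears in an open file and
// reports the total number of characters through `total`.
int countChar(std::ifstream& in, char target, int& total) {
    int matches = 0;
    total = 0;
    while (true) {
        int ch = in.get();          // held in an int so EOF stays distinct from real characters
        if (ch == EOF) break;       // no more characters
        total++;
        if (static_cast<char>(ch) == target) matches++;
    }
    return matches;
}
```

Calling it once per letter of interest would build up a simple frequency table.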
And 224 00:10:53,000 --> 00:10:56,000 so you could read till EOF as a way of just getting them character by 225 00:10:56,000 --> 00:10:58,000 character. 226 00:10:58,000 --> 00:11:02,000 Similarly there is a put on the other side, which is for when you're writing and you just 227 00:11:02,000 --> 00:11:04,000 want to write a single character. 228 00:11:04,000 --> 00:11:06,000 You could also do this with 229 00:11:06,000 --> 00:11:09,000 out << ch, which writes the character. This actually just does a 230 00:11:09,000 --> 00:11:11,000 put of the character, just 231 00:11:11,000 --> 00:11:15,000 kind of matching functions, put being the analog to get, 232 00:11:15,000 --> 00:11:19,000 that do single-character I/O. 233 00:11:19,000 --> 00:11:22,000 Sometimes what you're trying to do is process it line by line. Each line is the 234 00:11:22,000 --> 00:11:25,000 name of somebody and you're kind of putting those names into a database. You 235 00:11:25,000 --> 00:11:28,000 don't want to just assemble it character by character, and you don't know how 236 00:11:28,000 --> 00:11:30,000 many 237 00:11:30,000 --> 00:11:31,000 tokens there might be, 238 00:11:31,000 --> 00:11:35,000 with the white space; sometimes there's Julie Diane Zelenski, sometimes 239 00:11:35,000 --> 00:11:38,000 there might be Julie Zelenski, you don't know how many name pieces might appear to be 240 00:11:38,000 --> 00:11:39,000 there. 241 00:11:39,000 --> 00:11:42,000 You can use getline to read an entire line in one chunk. 242 00:11:42,000 --> 00:11:45,000 So it'll read everything up to 243 00:11:45,000 --> 00:11:49,000 the first new line character it finds. It actually discards the new line and advances 244 00:11:49,000 --> 00:11:52,000 past it. So what you will get is - 245 00:11:52,000 --> 00:11:56,000 the sequence of characters that you will have read will be everything up to and not including 246 00:11:56,000 --> 00:11:59,000 the new line.
The new line will be consumed though so that reading will 247 00:11:59,000 --> 00:12:00,000 pick up 248 00:12:00,000 --> 00:12:02,000 on the next line and go forward. 249 00:12:02,000 --> 00:12:06,000 Getline is a free function. 250 00:12:06,000 --> 00:12:12,000 It is not a member function on the stream. It takes a stream as its first 251 00:12:12,000 --> 00:12:12,000 argument. 252 00:12:12,000 --> 00:12:16,000 It takes a string by reference as its second argument, 253 00:12:16,000 --> 00:12:20,000 and it fills in the line with the text of 254 00:12:20,000 --> 00:12:24,000 the characters from here to the next line read in the file. 255 00:12:24,000 --> 00:12:26,000 256 00:12:26,000 --> 00:12:29,000 If it fails the way you will find out is by checking the fail 257 00:12:29,000 --> 00:12:31,000 state. You can do a getline 258 00:12:31,000 --> 00:12:34,000 of in and line, and then in.fail after it to see, well did it write something 259 00:12:34,000 --> 00:12:39,000 in the line that was valid? If it failed, then the contents of line are 260 00:12:39,000 --> 00:12:40,000 261 00:12:40,000 --> 00:12:44,000 unchanged, so they'll be whatever nonsense they were. So 262 00:12:44,000 --> 00:12:46,000 it's a way of just pulling it line by line. 263 00:12:46,000 --> 00:12:51,000 This name has the same words in it as 264 00:12:51,000 --> 00:12:52,000 our GetLine 265 00:12:52,000 --> 00:12:55,000 in simpio, which shows that it's kind of a reasonable name for the kind 266 00:12:55,000 --> 00:13:00,000 of thing that reads line by line, but there is a different arrangement to how it's - what 267 00:13:00,000 --> 00:13:03,000 it's used for and how it's used. So our GetLine takes no arguments and returns a line read 268 00:13:03,000 --> 00:13:04,000 from the console.
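Reading a file line by line with the free function, then checking the fail state, might look like the sketch below. The helper name `readAllLines` and the choice of `std::vector` to hold the result are assumptions of this example.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line of an open file, newlines stripped.
std::vector<std::string> readAllLines(std::ifstream& in) {
    std::vector<std::string> lines;
    while (true) {
        std::string line;
        std::getline(in, line);   // free function: stream first, string second
        if (in.fail()) break;     // nothing more could be read
        lines.push_back(line);    // the newline was consumed but is not stored
    }
    return lines;
}
```

Because a whole line comes back as one string, a name such as "Julie Diane Zelenski" stays together no matter how many pieces it has.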
269 00:13:04,000 --> 00:13:08,000 The lower case getline takes the file stream to read from and the string to write 270 00:13:08,000 --> 00:13:10,000 it into 271 00:13:10,000 --> 00:13:11,000 and 272 00:13:11,000 --> 00:13:13,000 does not have a return value. 273 00:13:13,000 --> 00:13:16,000 You check in.fail if you 274 00:13:16,000 --> 00:13:18,000 want to know how it went. So to write the entire line out, there [inaudible] a putline equivalent, so 275 00:13:18,000 --> 00:13:22,000 in fact you could just use the out 276 00:13:22,000 --> 00:13:26,000 stream insertion here, stick that line back out with an endl to kind of reproduce 277 00:13:26,000 --> 00:13:28,000 the same line you just read. 278 00:13:28,000 --> 00:13:32,000 And then these we've talked a little about, this idea of formatted 279 00:13:32,000 --> 00:13:35,000 read and write, where it's expecting things by format. It's expecting to see a character, 280 00:13:35,000 --> 00:13:38,000 it's expecting to see an integer, and it's expecting to see a 281 00:13:38,000 --> 00:13:39,000 string. 282 00:13:39,000 --> 00:13:43,000 It uses white space as the default delimiter between those things. So it's kind of 283 00:13:43,000 --> 00:13:46,000 scanning over white space and discarding it and then trying to pull the next thing out. 284 00:13:46,000 --> 00:13:47,000 285 00:13:47,000 --> 00:13:51,000 These are definitely much trickier to use because if the format that you're 286 00:13:51,000 --> 00:13:54,000 expecting doesn't show up, it causes the stream to get into a fail state, and you 287 00:13:54,000 --> 00:13:56,000 have to kind of fix it and recreate it.
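To make the formatted style concrete, here is a sketch that reads "name score" pairs separated by arbitrary white space. The helper name `sumScores` and the file layout are invented for illustration; the sketch simply stops at the first pair that does not match the expected format rather than trying to recover.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: sums the scores in a file of "name score" pairs and
// reports how many complete pairs were read through `pairsRead`.
int sumScores(std::ifstream& in, int& pairsRead) {
    int sum = 0;
    pairsRead = 0;
    while (true) {
        std::string name;
        int score = 0;
        in >> name >> score;      // skips white space, expects a word, then an integer
        if (in.fail()) break;     // end of file, or text that wasn't in the expected format
        sum += score;
        pairsRead++;
    }
    return sum;
}
```

A line such as `Grace oops` puts the stream into the fail state on the second extraction, which is why everything after it is left unread in this sketch.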
288 00:13:56,000 --> 00:14:00,000 So often even when you expect that things are going to be, let's say, a sequence of 289 00:14:00,000 --> 00:14:02,000 numbers or a name followed by a number, 290 00:14:02,000 --> 00:14:05,000 you might instead choose to pull it as a string 291 00:14:05,000 --> 00:14:09,000 and then use operations on the string itself to kinda divide it up 292 00:14:09,000 --> 00:14:12,000 rather than depending on stream I/O because with stream I/O it's just a little bit harder to get 293 00:14:12,000 --> 00:14:14,000 that same effect. 294 00:14:14,000 --> 00:14:17,000 And then in all these cases, right, in.fail. 295 00:14:17,000 --> 00:14:19,000 There is also - 296 00:14:19,000 --> 00:14:22,000 you could check out.fail. It's just much less common that the 297 00:14:22,000 --> 00:14:25,000 writing will fail, so you don't see it as much, but it is true for example, if 298 00:14:25,000 --> 00:14:28,000 you had run out of disk space and you were writing, a 299 00:14:28,000 --> 00:14:31,000 write operation could fail because it had 300 00:14:31,000 --> 00:14:35,000 run out of space or some media error had happened on the disk, so 301 00:14:35,000 --> 00:14:36,000 both of those 302 00:14:36,000 --> 00:14:40,000 have reasons to check fail. 303 00:14:40,000 --> 00:14:43,000 So let me do just a little bit of 304 00:14:43,000 --> 00:14:45,000 live coding 305 00:14:45,000 --> 00:14:49,000 to show you that I - 306 00:14:49,000 --> 00:14:52,000 it works the way I'm 307 00:14:52,000 --> 00:14:54,000 telling you. Yeah? So the 308 00:14:54,000 --> 00:14:57,000 fail 309 00:14:57,000 --> 00:14:58,000 function, is it 310 00:14:58,000 --> 00:14:59,000 going 311 00:14:59,000 --> 00:15:02,000 to always be the stream that's failing and not 312 00:15:02,000 --> 00:15:04,000 the function that's failing? Yes, 313 00:15:04,000 --> 00:15:08,000 pretty much.
There are a couple of rare cases where the function actually also 314 00:15:08,000 --> 00:15:11,000 tells you a little bit about it, but a general fail just covers the whole general 315 00:15:11,000 --> 00:15:14,000 case of anything I have just done on the stream failed, 316 00:15:14,000 --> 00:15:16,000 so any of the operations 317 00:15:16,000 --> 00:15:20,000 that could potentially run into some error condition will set the fail in such a way 318 00:15:20,000 --> 00:15:22,000 that your next call to in.fail will tell you about it. 319 00:15:22,000 --> 00:15:26,000 And so that's the - the general model will be; make the call, check the fail, 320 00:15:26,000 --> 00:15:28,000 if you know that there was a chance that something could have gone 321 00:15:28,000 --> 00:15:34,000 wrong and then you want to clean up after it and do something [inaudible]. 322 00:15:34,000 --> 00:15:38,000 So I'm gonna show you that I'm gonna get the name of the file from the 323 00:15:38,000 --> 00:15:40,000 user here, 324 00:15:40,000 --> 00:15:43,000 I'm going to use in.open of that, 325 00:15:43,000 --> 00:15:46,000 and I'm going to show you the error that you're gonna get when you forget to convert 326 00:15:46,000 --> 00:15:47,000 it, while I'm at it. 327 00:15:47,000 --> 00:15:50,000 And then I'll have like an in 328 00:15:50,000 --> 00:15:52,000 329 00:15:52,000 --> 00:15:53,000 .fail, error, 330 00:15:53,000 --> 00:15:55,000 wouldn't - 331 00:15:55,000 --> 00:15:57,000 file didn't open. 332 00:15:57,000 --> 00:16:02,000 First I just want to show you this little simple stuff; I've got 333 00:16:02,000 --> 00:16:06,000 my ifstream declared, my attempt to open it and then my check for seeing that it 334 00:16:06,000 --> 00:16:09,000 failed.
I'm gonna 335 00:16:09,000 --> 00:16:13,000 anticipate the fact that the compiler's gonna be 336 00:16:13,000 --> 00:16:16,000 complaining about the fact that it hasn't heard about fstream, so I'm gonna tell it about 337 00:16:16,000 --> 00:16:16,000 fstream. 338 00:16:16,000 --> 00:16:20,000 And I'm gonna let this go ahead and compile, although I know it has an error in it, 339 00:16:20,000 --> 00:16:22,000 because I want to show you sort of the things that are happening. So the first thing 340 00:16:22,000 --> 00:16:25,000 it's complaining about actually is this one, which is 341 00:16:25,000 --> 00:16:29,000 the fact that getline is not declared in this scope, which meant I forgot one more of my 342 00:16:29,000 --> 00:16:31,000 headers that I wanted. Let 343 00:16:31,000 --> 00:16:36,000 me move this up a little bit because it's sitting down a little far. 344 00:16:36,000 --> 00:16:39,000 And then the second thing it's complaining about is right here. 345 00:16:39,000 --> 00:16:43,000 This is pretty hard to see, but I'll read it to you so you can tell what it says; it says error, 346 00:16:43,000 --> 00:16:45,000 there's no matching function call 347 00:16:45,000 --> 00:16:48,000 and then it has sort of some gobbledygook that's a little bit scary, 348 00:16:48,000 --> 00:16:51,000 but includes the name ifstream. It's actually - the full name for ifstream is a 349 00:16:51,000 --> 00:16:53,000 lot bigger than you think, 350 00:16:53,000 --> 00:16:54,000 but it's saying that there's - 351 00:16:54,000 --> 00:16:58,000 the ifstream's open, and it says that it does not 352 00:16:58,000 --> 00:17:02,000 have a match to that, that there is no open call on the ifstream class, so no 353 00:17:02,000 --> 00:17:05,000 member function of the ifstream class whose name is open, 354 00:17:05,000 --> 00:17:07,000 whose argument is a string.
355 00:17:07,000 --> 00:17:11,000 And so that cryptic little bit of information is gonna be your reminder 356 00:17:11,000 --> 00:17:15,000 to jog your memory about the fact that open doesn't deal in the 357 00:17:15,000 --> 00:17:18,000 new string world, it wants the old string world. It will 358 00:17:18,000 --> 00:17:20,000 not take a new string, 359 00:17:20,000 --> 00:17:21,000 so I will convert it 360 00:17:21,000 --> 00:17:26,000 to my old string, 361 00:17:26,000 --> 00:17:30,000 and then be able to get this thing compiling. 362 00:17:30,000 --> 00:17:35,000 And so when it runs, if I enter a file name of, I say, [inaudible], 363 00:17:35,000 --> 00:17:37,000 it'll say error, file didn't open, some file that 364 00:17:37,000 --> 00:17:40,000 I don't have access to. It happens that I have one sitting here, I think, whose name is 365 00:17:40,000 --> 00:17:41,000 366 00:17:41,000 --> 00:17:45,000 handout.txt. I took the text of some handout and then I just 367 00:17:45,000 --> 00:17:47,000 left it there. So 368 00:17:47,000 --> 00:17:48,000 let me 369 00:17:48,000 --> 00:17:49,000 do something with that file. Let's 370 00:17:49,000 --> 00:17:53,000 just do something simple where we just count the number of lines in it. Let's say - actually I'll make 371 00:17:53,000 --> 00:17:55,000 a little function that - 372 00:17:55,000 --> 00:17:58,000 just to talk a little bit about one of the things that's a little quirky about 373 00:17:58,000 --> 00:18:00,000 ifstreams 374 00:18:00,000 --> 00:18:01,000 is that 375 00:18:01,000 --> 00:18:04,000 when you pass an ifstream you will 376 00:18:04,000 --> 00:18:07,000 typically want to do so by reference. 377 00:18:07,000 --> 00:18:08,000 Not only is this kind of a good idea, 378 00:18:08,000 --> 00:18:12,000 because the ifstream is kind of changing in the process of being read.
It's 379 00:18:12,000 --> 00:18:15,000 updating its internal state and you want to be sure that we're not 380 00:18:15,000 --> 00:18:19,000 missing this update that's going on. It's also the case that most 381 00:18:19,000 --> 00:18:22,000 libraries require you to pass it by reference. It doesn't have a model for how 382 00:18:22,000 --> 00:18:25,000 to take a copy of a stream and make another copy that's distinct. It really 383 00:18:25,000 --> 00:18:28,000 is always referring to the same file, so in fact in most libraries you have 384 00:18:28,000 --> 00:18:31,000 to pass it by reference. 385 00:18:31,000 --> 00:18:34,000 So I'll go ahead and pass it by reference. I'm gonna go in here and I'm just gonna do a line-by-line 386 00:18:34,000 --> 00:18:39,000 read and count as I go. I'm 387 00:18:39,000 --> 00:18:43,000 gonna write this as a while true, 388 00:18:43,000 --> 00:18:45,000 and I'm gonna say 389 00:18:45,000 --> 00:18:46,000 read the next line 390 00:18:46,000 --> 00:18:48,000 from the file into the variable, 391 00:18:48,000 --> 00:18:53,000 and then if in.fail - so if it was unable to read another line, 392 00:18:53,000 --> 00:18:57,000 the - my assumption here is gonna be that 393 00:18:57,000 --> 00:19:00,000 we're done, so it will fail at EOF. It's the most common reason it could 394 00:19:00,000 --> 00:19:04,000 fail. It could also fail if there was some sort of more catastrophic error, you're reading a file from a 395 00:19:04,000 --> 00:19:06,000 network and the network's gone down or something like that. In our 396 00:19:06,000 --> 00:19:09,000 case, right, the in.fail is going to tell us yeah, there's nothing more to read 397 00:19:09,000 --> 00:19:11,000 from this file, which means we've gotten to the end. 398 00:19:11,000 --> 00:19:14,000 We've advanced the count.
Whenever we get a good line we go back 399 00:19:14,000 --> 00:19:15,000 around, so we're 400 00:19:15,000 --> 00:19:18,000 using kind of the while true in this case because we have a little bit of work to do 401 00:19:18,000 --> 00:19:20,000 before we're ready to decide whether to keep going, 402 00:19:20,000 --> 00:19:22,000 in this case, reading that line. 403 00:19:22,000 --> 00:19:28,000 And then I return the count at the end, 404 00:19:28,000 --> 00:19:30,000 and then I can then down 405 00:19:30,000 --> 00:19:33,000 here print num lines 406 00:19:33,000 --> 00:19:37,000 = my call to countLines of in, 407 00:19:37,000 --> 00:19:39,000 endl. Okay. Let 408 00:19:39,000 --> 00:19:40,000 me move that up a little bit. 409 00:19:40,000 --> 00:19:45,000 Last time I posted the code that I wrote in the 410 00:19:45,000 --> 00:19:49,000 editor here, and I'll be happy to do that again today, so you 411 00:19:49,000 --> 00:19:52,000 shouldn't need to worry about copying it down, I will post it later if you want to 412 00:19:52,000 --> 00:19:53,000 have a copy of it for your records, but 413 00:19:53,000 --> 00:19:56,000 just showing, okay, yeah, we're just doing a line-by-line read, 414 00:19:56,000 --> 00:19:59,000 counting, and then a little bit more of the how do you open something, how do you 415 00:19:59,000 --> 00:20:03,000 check for failure. And 416 00:20:03,000 --> 00:20:06,000 when I put this together, what does it complain about? Well I think it complains about the fact that 417 00:20:06,000 --> 00:20:16,000 I told it my function returned void, but then I made it return a value. And that 418 00:20:16,000 --> 00:20:19,000 should be okay now. And so if I read the handout.txt file, 419 00:20:19,000 --> 00:20:23,000 the number of lines in it happens to be 28.
It's just some text I'd cut out of the handout, so 420 00:20:23,000 --> 00:20:25,000 there are 28 newline characters 421 00:20:25,000 --> 00:20:31,000 is basically what it's telling me 422 00:20:31,000 --> 00:20:32,000 there. 423 00:20:32,000 --> 00:20:36,000 So I can just do more things, like I could use - change this loop and instead use like get to 424 00:20:36,000 --> 00:20:39,000 do a single character count. I could say how many characters were in there. 425 00:20:39,000 --> 00:20:41,000 If I used 426 00:20:41,000 --> 00:20:42,000 the 427 00:20:42,000 --> 00:20:45,000 tokenization and I said, well just tell me how many strings I find using string 428 00:20:45,000 --> 00:20:48,000 extraction, it would kind of count the number of non-space things that it found and 429 00:20:48,000 --> 00:20:49,000 things like that. 430 00:20:49,000 --> 00:20:51,000 Typically the 431 00:20:51,000 --> 00:20:55,000 IO is one of those areas, I said, where there's like a vast array of nuances to all 432 00:20:55,000 --> 00:21:00,000 the different things you can do with it, but the simple things actually are 433 00:21:00,000 --> 00:21:03,000 usually fairly easy, and those are the only ones that are really going to matter to us as being 434 00:21:03,000 --> 00:21:05,000 able to do a little bit of simple reading and 435 00:21:05,000 --> 00:21:09,000 file reading/writing to get information into our programs. How do 436 00:21:09,000 --> 00:21:11,000 you feel about that? Question? Sorry, why do 437 00:21:11,000 --> 00:21:12,000 you have getline 438 00:21:12,000 --> 00:21:14,000 with an empty string? 439 00:21:14,000 --> 00:21:16,000 So getline, 440 00:21:16,000 --> 00:21:19,000 the one that was down here? This one? No, 441 00:21:19,000 --> 00:21:21,000 the one that - Oh, the 442 00:21:21,000 --> 00:21:26,000 one that's up here. So yeah, let's talk about that.
The getline that's here is - 443 00:21:26,000 --> 00:21:29,000 the second argument to getline is being passed by reference, and so it's 444 00:21:29,000 --> 00:21:32,000 filling in that line with the information it read from the file. 445 00:21:32,000 --> 00:21:36,000 So I just declared the variable so I had a place to store it 446 00:21:36,000 --> 00:21:37,000 and I said, 447 00:21:37,000 --> 00:21:40,000 okay, read the next line from the file, store the thing you read in the line. It turns 448 00:21:40,000 --> 00:21:42,000 out I don't actually care about that information, but there's no way to tell 449 00:21:42,000 --> 00:21:44,000 getline to just throw it away anyway. Oh. 450 00:21:44,000 --> 00:21:47,000 So I'm using it to just kinda move through line-by-line, but it happens to 451 00:21:47,000 --> 00:21:51,000 be that getline requires me to store the answer somewhere, and I'm storing it. 452 00:21:51,000 --> 00:21:54,000 Instead of returning it, it happens to use the design where it fills it in by 453 00:21:54,000 --> 00:21:56,000 reference. 454 00:21:56,000 --> 00:21:59,000 There's actually - it turns out to be a little bit more efficient 455 00:21:59,000 --> 00:22:02,000 to do a pass by reference and fill something in, than to return it. And the 456 00:22:02,000 --> 00:22:05,000 C++ libraries in general prefer that style of 457 00:22:05,000 --> 00:22:06,000 458 00:22:06,000 --> 00:22:09,000 getting information back out of a function as opposed to the function return, 459 00:22:09,000 --> 00:22:13,000 which you might think of as being a little more natural design. There's a slight 460 00:22:13,000 --> 00:22:16,000 inefficiency to that relative to the pass by reference and the libraries tend to be very 461 00:22:16,000 --> 00:22:19,000 hyper-conscious of that efficiency, so they tend to prefer this 462 00:22:19,000 --> 00:22:26,000 slightly more awkward style. Question?
463 00:22:26,000 --> 00:22:28,000 Why in the 464 00:22:28,000 --> 00:22:31,000 main [inaudible] does the 465 00:22:31,000 --> 00:22:32,000 error 466 00:22:32,000 --> 00:22:36,000 open [inaudible] file didn't open with [inaudible] like print error: file didn't open? You 467 00:22:36,000 --> 00:22:37,000 know 468 00:22:37,000 --> 00:22:40,000 it's just the way that error works. Error wants to make sure that you don't mistake what 469 00:22:40,000 --> 00:22:44,000 it does, and so it actually prefixes whatever you ask it to write with this big 470 00:22:44,000 --> 00:22:47,000 ERROR in uppercase letters, and so 471 00:22:47,000 --> 00:22:50,000 the purpose of error is twofold: to report what happened and to halt 472 00:22:50,000 --> 00:22:54,000 processing. And so when it reports that it actually prefixes it with this big red 473 00:22:54,000 --> 00:22:58,000 E-R-R-O-R just to say don't miss this, and then it halts processing 474 00:22:58,000 --> 00:23:01,000 there. And it's just - the error [inaudible] library function, which is your way of handling any 475 00:23:01,000 --> 00:23:04,000 kind of catastrophic I can't recover from this. And it's certainly 476 00:23:04,000 --> 00:23:07,000 something we don't want anybody to overlook, and so we try to make it 477 00:23:07,000 --> 00:23:11,000 really jump out at you when 478 00:23:11,000 --> 00:23:15,000 it tells you that. So this is in simpio? It is in genlib actually. Oh. So error's actually declared out of genlib. And can we use it - 479 00:23:15,000 --> 00:23:18,000 so it's global basically? It is global. It's a free function, and you will definitely have occasion 480 00:23:18,000 --> 00:23:22,000 to use it. Right, it's just - it's your way of saying something happened that there's just no 481 00:23:22,000 --> 00:23:26,000 recovery from and continuing on would not make sense.
Here's a 482 00:23:26,000 --> 00:23:28,000 - stop and help 483 00:23:28,000 --> 00:23:32,000 and alert the user something's really wrong, so you don't 484 00:23:32,000 --> 00:23:34,000 want to keep going after this because there's no way to kind of 485 00:23:34,000 --> 00:23:36,000 patch things back together. In 486 00:23:36,000 --> 00:23:38,000 this case probably a more likely thing we'd do, is I should say 487 00:23:38,000 --> 00:23:42,000 give me another name, let's go back around and try again, would be a 488 00:23:42,000 --> 00:23:44,000 sort of better way to handle that. I can 489 00:23:44,000 --> 00:23:46,000 even show you how I would do that. 490 00:23:46,000 --> 00:23:49,000 I could say, well while true, 491 00:23:49,000 --> 00:23:51,000 enter the name, 492 00:23:51,000 --> 00:23:53,000 and maybe I could change this to be well 493 00:23:53,000 --> 00:23:55,000 if it didn't fail 494 00:23:55,000 --> 00:23:59,000 then go ahead and break out of the loop. Otherwise, just report that the file 495 00:23:59,000 --> 00:24:02,000 didn't open, 496 00:24:02,000 --> 00:24:04,000 and say try again. 497 00:24:04,000 --> 00:24:06,000 And then the last thing I will need to do 498 00:24:06,000 --> 00:24:09,000 is clear that state. 499 00:24:09,000 --> 00:24:11,000 So now it's prompting, 500 00:24:11,000 --> 00:24:12,000 trying to open it. 501 00:24:12,000 --> 00:24:15,000 If it didn't fail it will break and then it will move forward to counting the lines. 502 00:24:15,000 --> 00:24:16,000 503 00:24:16,000 --> 00:24:19,000 If it did fail it'll continue on through here reporting this message, and then 504 00:24:19,000 --> 00:24:22,000 that clear, very important, because that clear kind of gets us 505 00:24:22,000 --> 00:24:26,000 back in the state where we can try again. 
If we don't clear the error and we try to do 506 00:24:26,000 --> 00:24:29,000 another in.open, once the stream is in a fail state it stays in a fail 507 00:24:29,000 --> 00:24:33,000 state until you clear it, and no subsequent operation will work whatsoever. 508 00:24:33,000 --> 00:24:35,000 It's just ignoring everything you ask it to do 509 00:24:35,000 --> 00:24:38,000 until you have acknowledged you have done something about the problem, which 510 00:24:38,000 --> 00:24:42,000 in this case was as simple as clearing and asking to open again. 511 00:24:42,000 --> 00:24:46,000 So if I do it this way 512 00:24:46,000 --> 00:24:50,000 and I enter some name, it'll say that didn't open, try again. And then if I say 513 00:24:50,000 --> 00:24:51,000 handout.txt, 514 00:24:51,000 --> 00:24:52,000 it'll open it and 515 00:24:52,000 --> 00:24:57,000 go ahead and read. All right, 516 00:24:57,000 --> 00:25:01,000 any questions about iostreams? We're 517 00:25:01,000 --> 00:25:07,000 gonna move away from this [inaudible], if there's anything about it you'd like to know I'd be happy to answer it. 518 00:25:07,000 --> 00:25:08,000 So let me 519 00:25:08,000 --> 00:25:11,000 get us back to 520 00:25:11,000 --> 00:25:15,000 our slides, 521 00:25:15,000 --> 00:25:17,000 and I'll kind 522 00:25:17,000 --> 00:25:21,000 of move on to the more object-oriented features of the things we're going to be 523 00:25:21,000 --> 00:25:25,000 depending on and using this quarter. 524 00:25:25,000 --> 00:25:27,000 So the libraries that we have been looking at, 525 00:25:27,000 --> 00:25:30,000 many of them are just provided as what we call free functions.
Global functions that 526 00:25:30,000 --> 00:25:34,000 aren't assigned to a particular object, they aren't part of a class, so asking for a random 527 00:25:34,000 --> 00:25:34,000 integer, 528 00:25:34,000 --> 00:25:37,000 reading a line, computing the square root, 529 00:25:37,000 --> 00:25:40,000 gobs of things are there that just kind of have 530 00:25:40,000 --> 00:25:43,000 functionality that you can use anywhere and everywhere procedurally. 531 00:25:43,000 --> 00:25:46,000 We've just started to see some things that are provided in terms of classes, 532 00:25:46,000 --> 00:25:50,000 the string is a class, that means that you have string objects that you're messaging and 533 00:25:50,000 --> 00:25:52,000 having them manipulate themselves. 534 00:25:52,000 --> 00:25:55,000 The stream object also is a class, ifstream, ofstream, those are all classes 535 00:25:55,000 --> 00:25:57,000 that you send messages like open to 536 00:25:57,000 --> 00:26:04,000 and fail to, to ask about that stream's state or reset its state. This idea of 537 00:26:04,000 --> 00:26:08,000 a class is one that's hopefully not new to you. Most of you coming from Java 538 00:26:08,000 --> 00:26:11,000 have - this is pretty much the only mechanism for writing code for 539 00:26:11,000 --> 00:26:14,000 Java is in the context of a class. Those 540 00:26:14,000 --> 00:26:16,000 of you who haven't seen that as much, we're going to definitely be practicing 541 00:26:16,000 --> 00:26:20,000 on this in our - some simple things you need to know to kind of just get up to speed 542 00:26:20,000 --> 00:26:24,000 vocabulary-wise: a class is just a way of taking a set of 543 00:26:24,000 --> 00:26:25,000 fields or data 544 00:26:25,000 --> 00:26:29,000 and attaching operations to it to where it kind of creates a kind of an 545 00:26:29,000 --> 00:26:33,000 entity that has both its state and its functionality kind of packaged 546 00:26:33,000 --> 00:26:34,000 together.
547 00:26:34,000 --> 00:26:38,000 So in the class interface you'll say here is a time object, and a time object has an 548 00:26:38,000 --> 00:26:39,000 hour and a minute 549 00:26:39,000 --> 00:26:40,000 and you can do things like 550 00:26:40,000 --> 00:26:42,000 tell me if this time's before that time or what the 551 00:26:42,000 --> 00:26:46,000 duration starting at this time and this end time would - there would be all these 552 00:26:46,000 --> 00:26:50,000 behaviors that are like [inaudible] to do. Can you print a time, sure. Can I read a time from a 553 00:26:50,000 --> 00:26:51,000 file, sure. 554 00:26:51,000 --> 00:26:54,000 As long as the interface for the time class provides those things, it's kinda this fully 555 00:26:54,000 --> 00:26:56,000 fleshed out 556 00:26:56,000 --> 00:26:58,000 new data type 557 00:26:58,000 --> 00:27:02,000 where you then use time objects whenever you need to work with time. 558 00:27:02,000 --> 00:27:06,000 The idea is that as the client using the object, which is the first role we're 559 00:27:06,000 --> 00:27:08,000 gonna be in for a couple weeks here, 560 00:27:08,000 --> 00:27:11,000 you learn what the abstraction is. What does the class provide? It provides the notion of a 561 00:27:11,000 --> 00:27:14,000 sequence of characters, that's what string does. And so that sequence 562 00:27:14,000 --> 00:27:17,000 has all these operations; like well tell me what characters are at this position, or 563 00:27:17,000 --> 00:27:19,000 find this substring, 564 00:27:19,000 --> 00:27:21,000 or insert these characters, remove those characters. And 565 00:27:21,000 --> 00:27:24,000 internally it's obviously doing some machinations to keep track of what you 566 00:27:24,000 --> 00:27:28,000 asked it to do and how to update its internal state.
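As a small illustration of the client view of a string, the operations just listed (character at a position, find a substring, insert characters, remove characters) can be used without knowing anything about the internals. The helper names replaceFirst and charAt are hypothetical and exist only for this sketch. The calls find, erase, insert and at are standard std::string member functions.

```cpp
#include <string>

// Replace the first occurrence of `target` in `text` with `replacement`,
// using only operations the string class offers to its clients.
std::string replaceFirst(std::string text, const std::string& target,
                         const std::string& replacement) {
    std::size_t pos = text.find(target);        // find this substring
    if (pos == std::string::npos) return text;  // not found: nothing to change
    text.erase(pos, target.length());           // remove those characters
    text.insert(pos, replacement);              // insert these characters
    return text;
}

// Character at a position; at() throws std::out_of_range on a bad index.
char charAt(const std::string& text, std::size_t index) {
    return text.at(index);
}
```

For example, `replaceFirst("the cat sat", "cat", "dog")` yields `"the dog sat"`, and the caller never sees how the characters are stored.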
But what's neat is that from 567 00:27:28,000 --> 00:27:30,000 the outside as a client you just think well there's a sequence of 568 00:27:30,000 --> 00:27:34,000 characters there and I can ask that sequence of characters to do these 569 00:27:34,000 --> 00:27:35,000 operations, and 570 00:27:35,000 --> 00:27:37,000 it does what I ask, 571 00:27:37,000 --> 00:27:38,000 and that I don't need to know 572 00:27:38,000 --> 00:27:42,000 how it's implemented internally. What mechanisms it uses and how it responds 573 00:27:42,000 --> 00:27:44,000 to those things to update its state 574 00:27:44,000 --> 00:27:46,000 is very much 575 00:27:46,000 --> 00:27:50,000 kind of behind the abstraction or inside that black box, sometimes we'll call it to kind 576 00:27:50,000 --> 00:27:51,000 of 577 00:27:51,000 --> 00:27:52,000 578 00:27:52,000 --> 00:27:54,000 suggest to ourselves that we can't see inside of it, we don't know how it works. It's 579 00:27:54,000 --> 00:27:57,000 like the microwave, you go up and you punch on the microwave and you say cook for a minute. Like 580 00:27:57,000 --> 00:28:01,000 what does the microwave do? I don't know, I have no idea, but things get hot, that's what I 581 00:28:01,000 --> 00:28:02,000 know. 582 00:28:02,000 --> 00:28:04,000 So the nice thing about [inaudible] is you can say, yeah, 583 00:28:04,000 --> 00:28:08,000 if you push this button things get hot and that's what I need to know. 584 00:28:08,000 --> 00:28:12,000 [Inaudible] has become widely 585 00:28:12,000 --> 00:28:13,000 industry standard in sort 586 00:28:13,000 --> 00:28:16,000 of all existing languages that are out there. It seems like there's been 587 00:28:16,000 --> 00:28:19,000 somebody who's gone to the trouble of trying to extend it to add these 588 00:28:19,000 --> 00:28:22,000 object [inaudible] features and languages like Java that are fully object 589 00:28:22,000 --> 00:28:25,000 oriented, are very much all the rage now.
590 00:28:25,000 --> 00:28:29,000 And I thought it was interesting to take just a minute to talk about well why is it so 591 00:28:29,000 --> 00:28:32,000 successful? Why is object oriented like the next big thing in programming? 592 00:28:32,000 --> 00:28:36,000 And there are some really good valid reasons for why it is a very 593 00:28:36,000 --> 00:28:39,000 sensible approach to writing programs 594 00:28:39,000 --> 00:28:41,000 that is 595 00:28:41,000 --> 00:28:43,000 worth thinking a little bit about. 596 00:28:43,000 --> 00:28:46,000 Probably the largest sort of 597 00:28:46,000 --> 00:28:49,000 motivation for the industry has to do with this idea of taming complexity, 598 00:28:49,000 --> 00:28:51,000 that certainly one of the 599 00:28:51,000 --> 00:28:53,000 weaknesses of 600 00:28:53,000 --> 00:28:54,000 ourselves as a discipline is that 601 00:28:54,000 --> 00:28:57,000 the complexity kinda can quickly spiral out of control. 602 00:28:57,000 --> 00:28:58,000 The programs that - 603 00:28:58,000 --> 00:29:01,000 as they get larger and larger, their interactions get harder and harder to 604 00:29:01,000 --> 00:29:02,000 model and we have more 605 00:29:02,000 --> 00:29:06,000 and more issues where we have bugs and security flaws and viruses 606 00:29:06,000 --> 00:29:09,000 and whatnot that exploit holes in these things. 607 00:29:09,000 --> 00:29:12,000 That we need a way as engineers to kind of 608 00:29:12,000 --> 00:29:15,000 tighten down our discipline and really produce things that actually 609 00:29:15,000 --> 00:29:17,000 don't have those kinds of holes in them. 610 00:29:17,000 --> 00:29:20,000 And that object oriented is probably one of the ways to try to manage the complexities of 611 00:29:20,000 --> 00:29:22,000 systems.
612 00:29:22,000 --> 00:29:25,000 That instead of having lots and lots of code that [inaudible] things, if you can 613 00:29:25,000 --> 00:29:28,000 break it down into these objects, and each 614 00:29:28,000 --> 00:29:30,000 class that represents that object can be 615 00:29:30,000 --> 00:29:33,000 designed and tested and worked on independently, 616 00:29:33,000 --> 00:29:35,000 there's some hope that you can have a team of programmers working together, 617 00:29:35,000 --> 00:29:38,000 each managing their own classes 618 00:29:38,000 --> 00:29:41,000 and have them be able to not interfere with each other too much to kind of 619 00:29:41,000 --> 00:29:42,000 accomplish - 620 00:29:42,000 --> 00:29:45,000 get the whole end result done by having people collaborate, but without them kind 621 00:29:45,000 --> 00:29:48,000 of stepping on top of each other. 622 00:29:48,000 --> 00:29:51,000 It has a - the advantage of modeling the real world, that we tend to talk about 623 00:29:51,000 --> 00:29:54,000 classes that kind of have names that speak to us, what's a ballot, what's 624 00:29:54,000 --> 00:29:59,000 a class list, what's a database, what is a 625 00:29:59,000 --> 00:30:01,000 time, a string, 626 00:30:01,000 --> 00:30:04,000 that - a fraction? These things kind of - we have ideas about what those things are 627 00:30:04,000 --> 00:30:06,000 in the real world, and having the class 628 00:30:06,000 --> 00:30:09,000 model that abstraction makes it easier to understand what the code is doing and 629 00:30:09,000 --> 00:30:11,000 what that object's role is 630 00:30:11,000 --> 00:30:13,000 in solving the problem. 631 00:30:13,000 --> 00:30:17,000 It also has the advantage of reuse.
That once you build a class and its 632 00:30:17,000 --> 00:30:19,000 operations, the idea is that it can 633 00:30:19,000 --> 00:30:24,000 be pulled out of the - neatly out of the one program and used in another if the 634 00:30:24,000 --> 00:30:25,000 design has been done, 635 00:30:25,000 --> 00:30:29,000 and can be changed or extended fairly easily in the future if the design was 636 00:30:29,000 --> 00:30:30,000 637 00:30:30,000 --> 00:30:32,000 good to begin with. 638 00:30:32,000 --> 00:30:34,000 So let me tell you what 639 00:30:34,000 --> 00:30:38,000 kind of things we're going to be doing in our class library 640 00:30:38,000 --> 00:30:40,000 that will help you to kind of just become a big fan 641 00:30:40,000 --> 00:30:43,000 of having a bunch of pre-written classes around. 642 00:30:43,000 --> 00:30:44,000 We have, 643 00:30:44,000 --> 00:30:49,000 I think, seven classes - I think there's eight actually in our class library 644 00:30:49,000 --> 00:30:51,000 that just look at certain problems that either 645 00:30:51,000 --> 00:30:53,000 C++ 646 00:30:53,000 --> 00:30:56,000 provides in a way that's not as convenient for us, or is kind of missing, 647 00:30:56,000 --> 00:30:58,000 or that can be improved on where we've 648 00:30:58,000 --> 00:31:01,000 tackled those things and given you seven classes that you just get to use from 649 00:31:01,000 --> 00:31:02,000 the get go 650 00:31:02,000 --> 00:31:06,000 that solve problems that are likely to come up for you.
651 00:31:06,000 --> 00:31:07,000 One of them is the scanner, 652 00:31:07,000 --> 00:31:10,000 which I kind of separated by itself because it's a little bit of an unusual class, and 653 00:31:10,000 --> 00:31:14,000 then there's a bunch of container classes on that next line, the vector, 654 00:31:14,000 --> 00:31:16,000 grid, stack, queue, map and set 655 00:31:16,000 --> 00:31:19,000 that are used for storing data, different kinds of collections, 656 00:31:19,000 --> 00:31:21,000 and they differ in kind of 657 00:31:21,000 --> 00:31:24,000 what their usage pattern is and what they're storing, 658 00:31:24,000 --> 00:31:26,000 how they're storing it for you. 659 00:31:26,000 --> 00:31:29,000 But that most programs need to do stuff like this, need to store some kind of 660 00:31:29,000 --> 00:31:30,000 collection of data, 661 00:31:30,000 --> 00:31:32,000 why not have some good tools to do it. 662 00:31:32,000 --> 00:31:33,000 663 00:31:33,000 --> 00:31:36,000 These tools kinda let you live higher on the food chain. They're very efficient, 664 00:31:36,000 --> 00:31:39,000 they're debugged, they're commented, the abstraction's been thought about and 665 00:31:39,000 --> 00:31:40,000 kind of worked out 666 00:31:40,000 --> 00:31:43,000 and so they provide kinda this very useful piece of function [inaudible] kinda written for you 667 00:31:43,000 --> 00:31:46,000 ready to go. 668 00:31:46,000 --> 00:31:49,000 And then I - a little note here is that we study these - we are going to study these 669 00:31:49,000 --> 00:31:51,000 abstractions twice. 670 00:31:51,000 --> 00:31:53,000 We're gonna look at these seven classes 671 00:31:53,000 --> 00:31:57,000 today and Friday as a client, and then start using them all through the quarter. 672 00:31:57,000 --> 00:32:01,000 In about a week or so after the mid-term we're gonna come back to them 673 00:32:01,000 --> 00:32:02,000 and say, well how are they implemented?
674 00:32:02,000 --> 00:32:06,000 That after having used them and appreciated what they provided to you, it 675 00:32:06,000 --> 00:32:08,000 will be interesting, I think, to open up the hood 676 00:32:08,000 --> 00:32:11,000 and look down in there and see how they work. 677 00:32:11,000 --> 00:32:15,000 I think this is - there is an interesting pedagogical 678 00:32:15,000 --> 00:32:17,000 679 00:32:17,000 --> 00:32:20,000 debate going on about this, about 680 00:32:20,000 --> 00:32:21,000 whether 681 00:32:21,000 --> 00:32:22,000 682 00:32:22,000 --> 00:32:24,000 it's better to first know how to implement these things and then get to 683 00:32:24,000 --> 00:32:27,000 use them, or to use them and then later know how to implement them. 684 00:32:27,000 --> 00:32:30,000 And I liken it to a little bit if you think about some things we do 685 00:32:30,000 --> 00:32:33,000 very clearly one way or the other in our curriculum, and it's interesting to think about 686 00:32:33,000 --> 00:32:34,000 why. 687 00:32:34,000 --> 00:32:37,000 That when you learn, for example, arithmetic as a 688 00:32:37,000 --> 00:32:38,000 primary schooler, 689 00:32:38,000 --> 00:32:41,000 they don't give you a calculator and say, here, go do some division and multiplication, 690 00:32:41,000 --> 00:32:43,000 and then later try to teach you long division. 691 00:32:43,000 --> 00:32:46,000 You'll never do it. You'll be like, why would I ever do this, this little box 692 00:32:46,000 --> 00:32:48,000 does it for me, the black box. 693 00:32:48,000 --> 00:32:52,000 So in fact they drill you on your multiplication tables and 694 00:32:52,000 --> 00:32:56,000 your long division long before they let you touch a calculator, 695 00:32:56,000 --> 00:33:00,000 which I think is one way of doing it. 
And, so - and for example, it's like 696 00:33:00,000 --> 00:33:03,000 we could do that with you, make you do it the kind of painful way and then 697 00:33:03,000 --> 00:33:06,000 later say, okay, well here's this way you can avoid 698 00:33:06,000 --> 00:33:07,000 699 00:33:07,000 --> 00:33:09,000 being bogged down by that tedium. 700 00:33:09,000 --> 00:33:12,000 On the other hand, think about the way we teach you to drive. 701 00:33:12,000 --> 00:33:14,000 We do not say, here's a wheel and 702 00:33:14,000 --> 00:33:15,000 then they say, 703 00:33:15,000 --> 00:33:18,000 let me tell you a little bit about the combustion engine, you 704 00:33:18,000 --> 00:33:21,000 know, we give you some spark plugs and 705 00:33:21,000 --> 00:33:25,000 try to get you to build your car from the ground up. It's like you learn to drive 706 00:33:25,000 --> 00:33:25,000 707 00:33:25,000 --> 00:33:26,000 and then if you 708 00:33:26,000 --> 00:33:29,000 are more interested in that you might learn what's under the hood, how to 709 00:33:29,000 --> 00:33:31,000 take care of your car, and eventually how to do 710 00:33:31,000 --> 00:33:35,000 more serious repairs or design of your own car. 711 00:33:35,000 --> 00:33:38,000 Where I think of that as being a client-first model, like you learn how to use the car 712 00:33:38,000 --> 00:33:41,000 and drive and get places and then if it 713 00:33:41,000 --> 00:33:45,000 intrigues you, you can dig further to learn more about how the car works. 714 00:33:45,000 --> 00:33:48,000 So that's definitely - our model is more of the drive one than the arithmetic one, in that 715 00:33:48,000 --> 00:33:52,000 it's really nice to be able to drive places first.
Like if I - we spent all quarter 716 00:33:52,000 --> 00:33:54,000 learning how to build a combustion engine and you didn't get to go 717 00:33:54,000 --> 00:33:55,000 anywhere, 718 00:33:55,000 --> 00:33:56,000 719 00:33:56,000 --> 00:33:59,000 I'd feel like you wouldn't have tasted what - where you're trying to get, and why that's 720 00:33:59,000 --> 00:34:01,000 so fabulous. So 721 00:34:01,000 --> 00:34:04,000 we will see them first as a client, and you'll get to do really neat things. You'll discover this thing called 722 00:34:04,000 --> 00:34:08,000 the map where you can put thousands, millions of entries in and 723 00:34:08,000 --> 00:34:11,000 have instantaneous look-up access on that. 724 00:34:11,000 --> 00:34:14,000 That you can put these things in a stack or a queue and then have them maintained 725 00:34:14,000 --> 00:34:15,000 for you and popped back out 726 00:34:15,000 --> 00:34:19,000 and all the storage of that being managed and the safety of that being managed without 727 00:34:19,000 --> 00:34:23,000 you having to kinda take any active role in that. That they provide functionality to 728 00:34:23,000 --> 00:34:24,000 you, that you just get 729 00:34:24,000 --> 00:34:25,000 to - 730 00:34:25,000 --> 00:34:29,000 leverage from the get go, and hopefully it will cause you to be 731 00:34:29,000 --> 00:34:32,000 curious though, like how does it work, why does it work so well, 732 00:34:32,000 --> 00:34:33,000 and what kind 733 00:34:33,000 --> 00:34:36,000 of things must happen behind the scenes and under the hood 734 00:34:36,000 --> 00:34:39,000 so that when we get to that you're actually kind of inspired to know 735 00:34:39,000 --> 00:34:43,000 how it did it, what it did. 
736 00:34:43,000 --> 00:34:44,000 So I'm gonna tell you about the scanner 737 00:34:44,000 --> 00:34:48,000 and maybe even tell you a little bit about the vector today, and then we'll do the remaining 738 00:34:48,000 --> 00:34:52,000 ones on Friday, perhaps even carrying over a little bit into the weeks 739 00:34:52,000 --> 00:34:55,000 to get ourselves used to what we've got. 740 00:34:55,000 --> 00:34:58,000 The scanner I kind of separated because the scanner's more of a task-based object than it 741 00:34:58,000 --> 00:34:59,000 is a 742 00:34:59,000 --> 00:35:02,000 collection or a container for storing things. The scanner's job is to break 743 00:35:02,000 --> 00:35:07,000 apart input into tokens. To take a string in this case that either you read from 744 00:35:07,000 --> 00:35:10,000 the file or you got from the user, or you constructed some way, and just tokenize 745 00:35:10,000 --> 00:35:12,000 it. It's called a tokenizer or parser. 746 00:35:12,000 --> 00:35:13,000 747 00:35:13,000 --> 00:35:18,000 That this is something a little bit like - stream extraction kind of does this, 748 00:35:18,000 --> 00:35:22,000 but stream extraction, as I said, isn't very flexible, 749 00:35:22,000 --> 00:35:23,000 that it doesn't 750 00:35:23,000 --> 00:35:25,000 make it easy for you to kind of - you 751 00:35:25,000 --> 00:35:28,000 have to sort of fully anticipate what's coming up on the stream. There's not 752 00:35:28,000 --> 00:35:29,000 any way you can sort of 753 00:35:29,000 --> 00:35:31,000 take a look at it and then decide what to do with it and 754 00:35:31,000 --> 00:35:35,000 decide how to change your parsing strategy. And scanner has a kind of flexibility that 755 00:35:35,000 --> 00:35:36,000 lets it be a little bit more 756 00:35:36,000 --> 00:35:40,000 configurable about what you expect coming up and how it works.
757 00:35:40,000 --> 00:35:43,000 So the idea is that basically it just takes your input, you know, this line contains ten 758 00:35:43,000 --> 00:35:44,000 tokens, 759 00:35:44,000 --> 00:35:46,000 and as you go into a loop saying, 760 00:35:46,000 --> 00:35:48,000 give me the next token, it will 761 00:35:48,000 --> 00:35:52,000 substring out and return to you this four character string followed by this single 762 00:35:52,000 --> 00:35:54,000 character space and then this four character line 763 00:35:54,000 --> 00:35:58,000 and space, and so the default behavior is to extract all the tokens that come up, 764 00:35:58,000 --> 00:36:03,000 to use white-space and punctuation as delimiters. So it will kind of 765 00:36:03,000 --> 00:36:05,000 aggregate letters and numbers together 766 00:36:05,000 --> 00:36:09,000 and then individual spaces and newlines and tabs will come out as single 767 00:36:09,000 --> 00:36:14,000 character tokens. The parentheses and dots and number signs would all come out as single character 768 00:36:14,000 --> 00:36:15,000 tokens, 769 00:36:15,000 --> 00:36:17,000 770 00:36:17,000 --> 00:36:20,000 and it just kind of divides it up for you. 771 00:36:20,000 --> 00:36:21,000 Okay. 772 00:36:21,000 --> 00:36:25,000 It has fancy options though that let you do things like discard those space 773 00:36:25,000 --> 00:36:29,000 tokens because you don't care about them. To do things like read 774 00:36:29,000 --> 00:36:31,000 the fancy number formats. So it can read 775 00:36:31,000 --> 00:36:35,000 integer formats and real formats, it can do the real format with exponentiation 776 00:36:35,000 --> 00:36:38,000 in it with leading minuses, things like that, 777 00:36:38,000 --> 00:36:40,000 that 778 00:36:40,000 --> 00:36:41,000 you can control 779 00:36:41,000 --> 00:36:44,000 with these setters and getters, like what it is you wanted to do about those things.
780 00:36:44,000 --> 00:36:48,000 You can tell it things like when I see an opening quote, I want you to gather everything to 781 00:36:48,000 --> 00:36:50,000 the closing quote, and so it does kind of 782 00:36:50,000 --> 00:36:53,000 gather 783 00:36:53,000 --> 00:36:56,000 phrases out of sequence if that's what you want. And so you have control over 784 00:36:56,000 --> 00:37:00,000 when and where it decides to do those things that lets you kind of 785 00:37:00,000 --> 00:37:05,000 handle a variety of kind of parsing and dividing tasks by using the scanner 786 00:37:05,000 --> 00:37:08,000 to get that job done. So I listed some things you might need, if you're 787 00:37:08,000 --> 00:37:11,000 reading text files, you're parsing expressions, you're processing some kind of commands, that 788 00:37:11,000 --> 00:37:15,000 this scanner is a very handy way to just divide that [inaudible] up. 789 00:37:15,000 --> 00:37:17,000 You could certainly do this kind of stuff manually, 790 00:37:17,000 --> 00:37:18,000 for example, 791 00:37:18,000 --> 00:37:23,000 like using the find on the string and finding those spaces and dividing it up, but 792 00:37:23,000 --> 00:37:25,000 the idea is just doing that 793 00:37:25,000 --> 00:37:27,000 in a more convenient way for you 794 00:37:27,000 --> 00:37:33,000 than you having to handle that process manually. 795 00:37:33,000 --> 00:37:35,000 This is what its interface looks like. 796 00:37:35,000 --> 00:37:39,000 So this is a C++ class definition. It looks 797 00:37:39,000 --> 00:37:42,000 very similar to a Java class definition, but there's a little bit of 798 00:37:42,000 --> 00:37:46,000 variation in some of the ways the syntax comes through in the class.
799 00:37:46,000 --> 00:37:48,000 The class here is scanner, 800 00:37:48,000 --> 00:37:52,000 the public colon introduces a section where everything from 801 00:37:52,000 --> 00:37:55,000 here until the next access modifier is 802 00:37:55,000 --> 00:37:58,000 public. So I don't actually have public repeated again and again on all the 803 00:37:58,000 --> 00:38:00,000 804 00:38:00,000 --> 00:38:01,000 individual entries here. 805 00:38:01,000 --> 00:38:04,000 It tells us that the scanner has a constructor 806 00:38:04,000 --> 00:38:08,000 that takes no arguments; it just initializes a new empty scanner. 807 00:38:08,000 --> 00:38:12,000 I'm gonna skip the destructor for a second; I'll come back to it. 808 00:38:12,000 --> 00:38:15,000 There is a set input member function where you give it the string that you want 809 00:38:15,000 --> 00:38:15,000 810 00:38:15,000 --> 00:38:19,000 scanned, and then there are these two 811 00:38:19,000 --> 00:38:21,000 operations that tend to be used in a loop where you keep asking are there more 812 00:38:21,000 --> 00:38:23,000 tokens and if so, give me the next token, so it 813 00:38:23,000 --> 00:38:27,000 just kind of pulls them out one by one. I picked 814 00:38:27,000 --> 00:38:31,000 just one of the space - of the particular advanced options to show you 815 00:38:31,000 --> 00:38:34,000 the format for them. There's actually about six more that deal with 816 00:38:34,000 --> 00:38:35,000 817 00:38:35,000 --> 00:38:37,000 some other more obscure things. 818 00:38:37,000 --> 00:38:38,000 This one is 819 00:38:38,000 --> 00:38:40,000 how is it you'd like it to deal with spaces, 820 00:38:40,000 --> 00:38:44,000 when you see space tokens, should they be returned as ordinary tokens or should you 821 00:38:44,000 --> 00:38:48,000 just discard them entirely and not even bother with them?
822 00:38:48,000 --> 00:38:51,000 The default is what's called preserve spaces, so it really does return them, so if 823 00:38:51,000 --> 00:38:54,000 you ask and there's only spaces left in the file, it will say there are more tokens 824 00:38:54,000 --> 00:38:58,000 and as you call next token it will return those spaces as individual tokens. 825 00:38:58,000 --> 00:39:01,000 If you instead have set the space option to ignore spaces, then it will just 826 00:39:01,000 --> 00:39:05,000 skip over all of those, and if all that was left in the file was white space 827 00:39:05,000 --> 00:39:07,000 when you ask for more tokens, it will say no. 828 00:39:07,000 --> 00:39:10,000 And when you ask for a token and there's some spaces leading up to 829 00:39:10,000 --> 00:39:15,000 something it will just skip right over those and return the next non-space token. 830 00:39:15,000 --> 00:39:19,000 There's a variety of these other ones that exist 831 00:39:19,000 --> 00:39:23,000 that handle the floating point and the double quote and other kind of 832 00:39:23,000 --> 00:39:25,000 fancy behaviors. 833 00:39:25,000 --> 00:39:29,000 There's one little detail I'll show you that's a C++-ism that isn't 834 00:39:29,000 --> 00:39:31,000 - doesn't really have a Java analog, 835 00:39:31,000 --> 00:39:34,000 which is the constructor which is used as the initialization function for a 836 00:39:34,000 --> 00:39:35,000 class 837 00:39:35,000 --> 00:39:36,000 has a 838 00:39:36,000 --> 00:39:38,000 corresponding destructor. 839 00:39:38,000 --> 00:39:41,000 Every class has the option of doing this. 840 00:39:41,000 --> 00:39:41,000 That is 841 00:39:41,000 --> 00:39:45,000 the - kind of when the object is being created, the constructor is being called. When the 842 00:39:45,000 --> 00:39:50,000 object is being de-allocated or destroyed, going out of scope, the destructor is 843 00:39:50,000 --> 00:39:51,000 called.
844 00:39:51,000 --> 00:39:54,000 And the pairing allows sort of the constructor to do any kind of set up that needs to be 845 00:39:54,000 --> 00:39:58,000 done and the destructor to do any kind of tear down that needs to be done. 846 00:39:58,000 --> 00:40:01,000 In most cases there's not that much that needs to be there, but 847 00:40:01,000 --> 00:40:06,000 it is part of the mechanism that allows all classes to have an option kind of at 848 00:40:06,000 --> 00:40:09,000 birth and death to do what it needs to do. For example, my file 849 00:40:09,000 --> 00:40:10,000 stream 850 00:40:10,000 --> 00:40:13,000 object, when you - 851 00:40:13,000 --> 00:40:16,000 when it goes away, closes its file automatically. So it's a place where the 852 00:40:16,000 --> 00:40:24,000 destructor gets used to do cleanup as that object is no longer valid. 853 00:40:24,000 --> 00:40:25,000 So a little bit of 854 00:40:25,000 --> 00:40:26,000 scanner 855 00:40:26,000 --> 00:40:28,000 code 856 00:40:28,000 --> 00:40:32,000 showing kind of the most common access pattern, is you declare the 857 00:40:32,000 --> 00:40:37,000 scanner. So at this point the scanner is empty, it has no contents to scan. 858 00:40:37,000 --> 00:40:39,000 Before I start pulling stuff out of it, 859 00:40:39,000 --> 00:40:42,000 I'm typically gonna call set input on it, passing some string. In this case the 860 00:40:42,000 --> 00:40:46,000 string I'm passing is the one that was entered by the user, using getline. 861 00:40:46,000 --> 00:40:48,000 And then the 862 00:40:48,000 --> 00:40:51,000 ubiquitous loop that says well while the scanner has more tokens, get the next 863 00:40:51,000 --> 00:40:52,000 token. 864 00:40:52,000 --> 00:40:55,000 And in this case I'm not even actually paying attention to what those tokens are, I'm 865 00:40:55,000 --> 00:40:56,000 just counting them.
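The declare, set input, loop-and-count pattern just described can be sketched roughly as follows. The course scanner itself is not reproduced here; `MiniScanner`, its member names, and the `PreserveSpaces` / `IgnoreSpaces` constants are hypothetical stand-ins modeled on the wording of the lecture, and the real library's signatures may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the course scanner; names follow the lecture's
// wording (set input, has more tokens, next token, space option) only.
class MiniScanner {
public:
    enum SpaceOption { PreserveSpaces, IgnoreSpaces };

    MiniScanner() : pos_(0), spaceOption_(PreserveSpaces) {}  // empty scanner
    ~MiniScanner() {}  // nothing to tear down in this sketch

    void setInput(const std::string& text) { input_ = text; pos_ = 0; }
    void setSpaceOption(SpaceOption option) { spaceOption_ = option; }

    bool hasMoreTokens() {
        if (spaceOption_ == IgnoreSpaces) skipSpaces();
        return pos_ < input_.size();
    }

    std::string nextToken() {
        if (!hasMoreTokens()) return "";
        std::size_t start = pos_;
        if (isWordChar(input_[pos_])) {
            while (pos_ < input_.size() && isWordChar(input_[pos_])) ++pos_;
        } else {
            ++pos_;  // a space or punctuation mark is a one-character token
        }
        assert(pos_ > start);  // each call consumes at least one character
        return input_.substr(start, pos_ - start);
    }

private:
    static bool isWordChar(char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) != 0;
    }
    void skipSpaces() {
        while (pos_ < input_.size() &&
               std::isspace(static_cast<unsigned char>(input_[pos_]))) {
            ++pos_;
        }
    }

    std::string input_;
    std::size_t pos_;
    SpaceOption spaceOption_;
};

// The "ubiquitous loop": count tokens without looking at what they are.
int countTokens(const std::string& line, MiniScanner::SpaceOption option) {
    MiniScanner scanner;             // starts with no contents to scan
    scanner.setSpaceOption(option);
    scanner.setInput(line);
    int count = 0;
    while (scanner.hasMoreTokens()) {
        scanner.nextToken();
        ++count;
    }
    return count;
}
```

In this sketch the same line produces a larger count when spaces are preserved than when they are ignored, and a line of only spaces produces zero tokens in the ignoring mode.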
866 00:40:56,000 --> 00:40:59,000 So this one is kind of a 867 00:40:59,000 --> 00:41:03,000 very simple access that just says just call the next token as many times as you can 868 00:41:03,000 --> 00:41:04,000 until there 869 00:41:04,000 --> 00:41:10,000 are no more tokens to pull out. Way in the back? [Inaudible] I 870 00:41:10,000 --> 00:41:11,000 mean, like 871 00:41:11,000 --> 00:41:16,000 in the beginning when it says scanner, scanner, do we write scanner scanner = new 872 00:41:16,000 --> 00:41:17,000 scanner () or [inaudible]? 873 00:41:17,000 --> 00:41:19,000 Yes. 874 00:41:19,000 --> 00:41:22,000 Not exactly. So that's a very good example of like where Java and C++ are gonna 875 00:41:22,000 --> 00:41:25,000 conspire to trip you up just a little bit, 876 00:41:25,000 --> 00:41:30,000 that in Java objects were always created using the syntax of new. You say new 877 00:41:30,000 --> 00:41:33,000 this thing, and in fact that actually does an allocation 878 00:41:33,000 --> 00:41:35,000 out in what's called the heap 879 00:41:35,000 --> 00:41:37,000 of that object and then from there you use it. 880 00:41:37,000 --> 00:41:40,000 In C++ you actually don't have to put things in the heap, and in fact 881 00:41:40,000 --> 00:41:43,000 we will rarely put things in the heap, and that's what new is for. 882 00:41:43,000 --> 00:41:47,000 So we're gonna use the stack to allocate them. So when I say scanner scanner, 883 00:41:47,000 --> 00:41:50,000 that really declares a scanner object right there 884 00:41:50,000 --> 00:41:53,000 and in this case there are no [inaudible] my constructor, so I don't have anything in 885 00:41:53,000 --> 00:41:56,000 parens. If there were some arguments I would put parens and put the 886 00:41:56,000 --> 00:41:57,000 information there, 887 00:41:57,000 --> 00:42:01,000 but the constructor is being called even without this new. New actually is 888 00:42:01,000 --> 00:42:04,000 more about where the memory comes from.
The constructor is called regardless of 889 00:42:04,000 --> 00:42:07,000 where the memory came from. And so the mechanism in C++ to get 890 00:42:07,000 --> 00:42:11,000 yourself an object tends to be, say the class name, say the name of the variable. 891 00:42:11,000 --> 00:42:14,000 If you have arguments for the constructor, they will go in parens 892 00:42:14,000 --> 00:42:16,000 after the variable's name. 893 00:42:16,000 --> 00:42:18,000 So if scanner had 894 00:42:18,000 --> 00:42:20,000 something, I would be putting it right here, 895 00:42:20,000 --> 00:42:25,000 open paren, yada, yada. 896 00:42:25,000 --> 00:42:27,000 So that's a little 897 00:42:27,000 --> 00:42:30,000 C++/Java 898 00:42:30,000 --> 00:42:32,000 difference. Oh, that's good. Question over 899 00:42:32,000 --> 00:42:35,000 here? 900 00:42:35,000 --> 00:42:37,000 When do we have to use the destructor? 901 00:42:37,000 --> 00:42:41,000 So typically you will not ever make a call that explicitly calls the 902 00:42:41,000 --> 00:42:43,000 destructor. It happens for you automatically. So you're - [inaudible] you're gonna 903 00:42:43,000 --> 00:42:47,000 see it in the interface as part of the completeness of the class: here's how I 904 00:42:47,000 --> 00:42:49,000 set up, here's how I tear down. 905 00:42:49,000 --> 00:42:51,000 When we start implementing classes we'll have a reason to think more seriously about 906 00:42:51,000 --> 00:42:55,000 what goes in the destructor. But for now you will never explicitly call it. Just know that 907 00:42:55,000 --> 00:42:57,000 it automatically gets called for you. 908 00:42:57,000 --> 00:43:01,000 The constructor kinda gets automatically called; the destructor gets automatically called, so 909 00:43:01,000 --> 00:43:04,000 just know that they're there.
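A small sketch of the point about construction and destruction, under the assumption of a made-up class `Tracker` that simply records when it is set up and torn down. It shows the class-name, variable-name, arguments-in-parens declaration on the stack next to the Java-like `new` form on the heap; neither the class nor the function name comes from the lecture.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class that logs its own setup and teardown.
class Tracker {
public:
    Tracker(const std::string& name, std::vector<std::string>& log)
        : name_(name), log_(log) {
        log_.push_back("construct " + name_);   // constructor: setup
    }
    ~Tracker() {
        log_.push_back("destroy " + name_);     // destructor: teardown
    }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

std::vector<std::string> lifetimeDemo() {
    std::vector<std::string> log;
    {
        // Stack object: class name, variable name, arguments in parens.
        Tracker onStack("stack", log);

        // Heap object: new only decides where the memory comes from.
        Tracker* onHeap = new Tracker("heap", log);
        delete onHeap;  // a heap object is destroyed when it is deleted
    }   // onStack goes out of scope here; its destructor runs automatically
    assert(log.size() == 4);
    return log;
}
```

Neither destructor is called by name anywhere in the sketch; one runs at the `delete`, the other at the closing brace.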
One 910 00:43:04,000 --> 00:43:08,000 of the things that's - I just want to encourage you not to get too 911 00:43:08,000 --> 00:43:11,000 bogged down in is that there's a lot of syntax to C++. I'm trying to give 912 00:43:11,000 --> 00:43:15,000 you the important parts that are going to matter early on, and we'll see more and 913 00:43:15,000 --> 00:43:16,000 more as we go through. 914 00:43:16,000 --> 00:43:18,000 Don't let it get you too overwhelmed, the feeling of it's 915 00:43:18,000 --> 00:43:21,000 almost but not quite like Java and it's going to make me crazy. 916 00:43:21,000 --> 00:43:25,000 Realize that 917 00:43:25,000 --> 00:43:27,000 there are just a few differences that you kinda got to absorb, and once you 918 00:43:27,000 --> 00:43:30,000 get your head around them actually you will find yourself very able to 919 00:43:30,000 --> 00:43:33,000 express yourself without getting too tripped up by it. But it's just at the beginning I'm sure 920 00:43:33,000 --> 00:43:35,000 it feels like you've got this big list of here's a thousand things that are a 921 00:43:35,000 --> 00:43:37,000 little bit different that - 922 00:43:37,000 --> 00:43:41,000 and it will not be long before it will feel like your native language, so 923 00:43:41,000 --> 00:43:46,000 hang in there with us. 924 00:43:46,000 --> 00:43:47,000 So 925 00:43:47,000 --> 00:43:50,000 I wanted to show you the vector before we get done today and then we'll 926 00:43:50,000 --> 00:43:55,000 have a lot more chance to talk about this on Friday. The other six 927 00:43:55,000 --> 00:43:57,000 classes that come in [inaudible] class library 928 00:43:57,000 --> 00:44:00,000 are all container classes. So containers are these things like they're buckets 929 00:44:00,000 --> 00:44:04,000 or shells or bags. They hold things for you. You stick things into the 930 00:44:04,000 --> 00:44:07,000 container and then later you can retrieve them.
931 00:44:07,000 --> 00:44:11,000 This turns out to be the most common need in all programs. If you look 932 00:44:11,000 --> 00:44:14,000 at all the things programs do, [inaudible] manipulating information, where are 933 00:44:14,000 --> 00:44:18,000 they putting that information, where are they storing it? 934 00:44:18,000 --> 00:44:22,000 One of the sort of obvious needs is something that is just kind of a 935 00:44:22,000 --> 00:44:25,000 linear collection. I need to put together the 100 students that are in 936 00:44:25,000 --> 00:44:29,000 this class in a list, well what do I do - what do I use to do that? 937 00:44:29,000 --> 00:44:33,000 There is a built-in kind of raw array, or primitive array in C++. I'm not 938 00:44:33,000 --> 00:44:35,000 even gonna show it to you right now. 939 00:44:35,000 --> 00:44:37,000 The truth is 940 00:44:37,000 --> 00:44:42,000 it's functional, it does kinda what it sets out to do, but it's very weak. 941 00:44:42,000 --> 00:44:44,000 It has constraints on how big it is 942 00:44:44,000 --> 00:44:48,000 and how you access it. For example, you can make an array that has 10 members 943 00:44:48,000 --> 00:44:51,000 and then you can access the 12th member or the 1,500th member 944 00:44:51,000 --> 00:44:55,000 without any good error reporting from either the compiler or the runtime 945 00:44:55,000 --> 00:44:55,000 system. 946 00:44:55,000 --> 00:44:58,000 It's designed kind of to be a professional's tool and it's very efficient, 947 00:44:58,000 --> 00:45:00,000 but it's not very safe. 948 00:45:00,000 --> 00:45:04,000 It doesn't have any convenience attached to it whatsoever.
If you have a - you 949 00:45:04,000 --> 00:45:07,000 create a ten-member array and later you decide you need to put 12 things into 950 00:45:07,000 --> 00:45:07,000 it, 951 00:45:07,000 --> 00:45:11,000 then your only recourse is to go create a new 12-member array and copy over 952 00:45:11,000 --> 00:45:13,000 those ten things 953 00:45:13,000 --> 00:45:16,000 and get rid of your old array and make a totally new one, that you can't take the 954 00:45:16,000 --> 00:45:18,000 one you have and just grow it 955 00:45:18,000 --> 00:45:19,000 956 00:45:19,000 --> 00:45:20,000 in the standard language. 957 00:45:20,000 --> 00:45:23,000 So we'll come back to see it because it turns out there's some reasons we're gonna need to 958 00:45:23,000 --> 00:45:27,000 know how it works. But for now if you say I need to make a list, what I want 959 00:45:27,000 --> 00:45:28,000 to use is the vector. 960 00:45:28,000 --> 00:45:31,000 So we have a vector class 961 00:45:31,000 --> 00:45:32,000 in our class library 962 00:45:32,000 --> 00:45:36,000 that just solves this problem of you need to collect up this sequence of 963 00:45:36,000 --> 00:45:39,000 things, a bunch of scores on a test, 964 00:45:39,000 --> 00:45:41,000 a bunch of students who are in a class, 965 00:45:41,000 --> 00:45:44,000 a bunch of names 966 00:45:44,000 --> 00:45:46,000 that are being invited to a party. 967 00:45:46,000 --> 00:45:50,000 And what it does for you is the things that array does but with safety 968 00:45:50,000 --> 00:45:52,000 and convenience built into it. 969 00:45:52,000 --> 00:45:55,000 So it does bounds checking.
If you created a vector and you put ten things 970 00:45:55,000 --> 00:45:56,000 into it, 971 00:45:56,000 --> 00:45:59,000 then you can ask for the zero through 9th entries, but you cannot ask 972 00:45:59,000 --> 00:46:02,000 for the 22nd entry, it will raise an error and 973 00:46:02,000 --> 00:46:05,000 it will use that error function, you will get a big red error message, you will not 974 00:46:05,000 --> 00:46:07,000 975 00:46:07,000 --> 00:46:09,000 blunder on unknowingly. 976 00:46:09,000 --> 00:46:12,000 You can add things and insert them and then remove them. So I can go into the array and 977 00:46:12,000 --> 00:46:15,000 say I'd like to put something in slot zero, it will shuffle everything over and make 978 00:46:15,000 --> 00:46:19,000 that space. If I say delete the element that's at zero it will move everything 979 00:46:19,000 --> 00:46:21,000 down. So it just does all this kind of handling of 980 00:46:21,000 --> 00:46:24,000 keeping the integrity of the list 981 00:46:24,000 --> 00:46:26,000 and its ordering maintained 982 00:46:26,000 --> 00:46:28,000 on your behalf. 983 00:46:28,000 --> 00:46:30,000 It also does all the 984 00:46:30,000 --> 00:46:33,000 management of how much storage space is needed. So if I put ten things into 985 00:46:33,000 --> 00:46:36,000 the vector and I put the 11th or the 12th or the - add 986 00:46:36,000 --> 00:46:37,000 100 more, 987 00:46:37,000 --> 00:46:39,000 it knows how to make the space necessary for it. 988 00:46:39,000 --> 00:46:42,000 Behind the scenes it's figuring out where it can get that space and how to take 989 00:46:42,000 --> 00:46:46,000 care of it. It always knows what count it has and what's going on there, but 990 00:46:46,000 --> 00:46:50,000 it's doing this on our behalf in a way that that raw array just does not, that becomes 991 00:46:50,000 --> 00:46:55,000 very tedious and error prone if it's our responsibility to deal with it.
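The behaviors just listed (bounds checking, inserting at slot zero, removing from slot zero, growing on demand) can be sketched with the standard library's `std::vector`, which is swapped in here for the course's own vector class. The course class reports a bad index through its error function; `std::vector::at` instead throws `std::out_of_range`, so the error path below is an analogy, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading scores.at(index) is rejected as out of bounds.
bool isOutOfBounds(const std::vector<int>& scores, std::size_t index) {
    try {
        (void)scores.at(index);  // at() checks the index; operator[] does not
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

std::vector<int> buildScores() {
    std::vector<int> scores;
    for (int i = 0; i < 10; ++i) {
        scores.push_back(i * 10);        // storage grows as needed
    }
    scores.insert(scores.begin(), -1);   // put at slot 0; the rest shuffle over
    scores.erase(scores.begin());        // remove slot 0; the rest move down
    assert(scores.size() == 10);         // the count is always known
    return scores;
}
```

With ten entries, indices 0 through 9 are accepted and index 22 is rejected rather than silently reading unrelated memory.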
992 00:46:55,000 --> 00:46:59,000 So what the vector is kind of providing is an abstraction. And this is a key word for us in 993 00:46:59,000 --> 00:47:01,000 things that we're going to be talking about this quarter 994 00:47:01,000 --> 00:47:01,000 is that 995 00:47:01,000 --> 00:47:03,000 what you really wanted was a list. 996 00:47:03,000 --> 00:47:07,000 I want a list of students and I want to be able to put it in sorted order or 997 00:47:07,000 --> 00:47:08,000 find this person or print them. 998 00:47:08,000 --> 00:47:11,000 The fact of where the memory came from and how it's keeping track of it is really 999 00:47:11,000 --> 00:47:15,000 a tedious detail that I'd rather not have to deal with. And that's exactly 1000 00:47:15,000 --> 00:47:17,000 what the vector's gonna do for you, is make it so 1001 00:47:17,000 --> 00:47:21,000 you store things and the storage is somebody else's problem. 1002 00:47:21,000 --> 00:47:23,000 You use a list, 1003 00:47:23,000 --> 00:47:27,000 you get an abstraction. 1004 00:47:27,000 --> 00:47:30,000 How that - there's one little quirk, and this is 1005 00:47:30,000 --> 00:47:33,000 not so startling to those of you who have 1006 00:47:33,000 --> 00:47:35,000 worked on a recent version of Java, 1007 00:47:35,000 --> 00:47:38,000 is in order to make the vector generally useful, 1008 00:47:38,000 --> 00:47:40,000 it cannot store just one type of thing. 1009 00:47:40,000 --> 00:47:43,000 That you can't make a vector that stores [inaudible] and 1010 00:47:43,000 --> 00:47:46,000 service everyone's needs, that it has to be able to hold vectors of doubles 1011 00:47:46,000 --> 00:47:49,000 or vectors of strings or vectors of student structures 1012 00:47:49,000 --> 00:47:51,000 equally well.
1013 00:47:51,000 --> 00:47:55,000 And so the way the vector class is actually supplied is using a 1014 00:47:55,000 --> 00:47:58,000 feature in the C++ language called templates where 1015 00:47:58,000 --> 00:48:02,000 the vector describes what it's storing using a placeholder. It says, well this is a 1016 00:48:02,000 --> 00:48:04,000 vector of something and 1017 00:48:04,000 --> 00:48:08,000 when you put these things in they all have to be the same type of thing 1018 00:48:08,000 --> 00:48:11,000 and when you get one out you'll get the thing you put in, 1019 00:48:11,000 --> 00:48:14,000 but I will not commit, in the interface, to saying it's always an integer, 1020 00:48:14,000 --> 00:48:16,000 it's always a double. 1021 00:48:16,000 --> 00:48:19,000 It's left open and then the client has to describe what they want when they're 1022 00:48:19,000 --> 00:48:20,000 ready to use it. 1023 00:48:20,000 --> 00:48:24,000 So this is like the Java generics. When you're using an ArrayList you said, well what 1024 00:48:24,000 --> 00:48:27,000 kind of things am I sticking in my ArrayList, and then that way 1025 00:48:27,000 --> 00:48:34,000 the compiler can keep track of it for you and help you to use it correctly. 1026 00:48:34,000 --> 00:48:36,000 The interface part of this kinda 1027 00:48:36,000 --> 00:48:38,000 looks like what 1028 00:48:38,000 --> 00:48:40,000 we've seen before. It's a class vector, 1029 00:48:40,000 --> 00:48:42,000 it has a constructor and destructor 1030 00:48:42,000 --> 00:48:44,000 and it has some operations that 1031 00:48:44,000 --> 00:48:46,000 return things like the number of elements; you can find out whether it 1032 00:48:46,000 --> 00:48:50,000 has zero elements, you can get the element at index, you can set the element at 1033 00:48:50,000 --> 00:48:51,000 index, 1034 00:48:51,000 --> 00:48:54,000 you can add, insert and remove 1035 00:48:54,000 --> 00:48:55,000 things within there.
1036 00:48:55,000 --> 00:48:56,000 1037 00:48:56,000 --> 00:48:59,000 The one thing that's a little bit unusual about it is that every time it's 1038 00:48:59,000 --> 00:49:02,000 talking about the type of something that's going into the vector or 1039 00:49:02,000 --> 00:49:04,000 something that's coming out of the vector, 1040 00:49:04,000 --> 00:49:06,000 it uses this elem type 1041 00:49:06,000 --> 00:49:11,000 which traces its origin back to this template header up there, 1042 00:49:11,000 --> 00:49:14,000 that is the clue to you that the vector 1043 00:49:14,000 --> 00:49:15,000 doesn't 1044 00:49:15,000 --> 00:49:19,000 commit to I'm storing ints, I'm storing doubles, I'm storing strings, it stores some 1045 00:49:19,000 --> 00:49:22,000 generic elem type thing, 1046 00:49:22,000 --> 00:49:26,000 which, when the client is ready to create a vector, they will have to make 1047 00:49:26,000 --> 00:49:30,000 that commitment and say this vector is gonna hold doubles, this vector is 1048 00:49:30,000 --> 00:49:31,000 gonna hold ints, 1049 00:49:31,000 --> 00:49:34,000 and from that point forward that vector knows that the 1050 00:49:34,000 --> 00:49:39,000 get at on a vector of ints returns something of int type. And then add on a vector of ints 1051 00:49:39,000 --> 00:49:41,000 expects a parameter of int type, 1052 00:49:41,000 --> 00:49:44,000 which is distinct from a vector of strings or a vector 1053 00:49:44,000 --> 00:49:45,000 of doubles. So I'll 1054 00:49:45,000 --> 00:49:49,000 show you a little code and we'll have to just really talk about this more deeply on 1055 00:49:49,000 --> 00:49:49,000 Friday.
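To make the placeholder idea concrete, here is a rough miniature of a templated container. `SimpleVector` and its members are hypothetical; they borrow the names mentioned in the lecture (elem type, get at, add) but are not the course library's vector, and the bounds check here uses `assert` rather than the library's error function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature container. ElemType is a placeholder that the
// client fills in when declaring a SimpleVector.
template <typename ElemType>
class SimpleVector {
public:
    int size() const { return static_cast<int>(items_.size()); }
    bool isEmpty() const { return items_.empty(); }

    // add expects a parameter of whatever ElemType the client chose.
    void add(const ElemType& elem) { items_.push_back(elem); }

    // getAt returns something of that same ElemType.
    ElemType getAt(int index) const {
        assert(index >= 0 && index < size());  // bounds check in this sketch
        return items_[static_cast<std::size_t>(index)];
    }

private:
    std::vector<ElemType> items_;  // storage is the container's problem
};

int sumOfInts() {
    SimpleVector<int> nums;        // the client commits: this one holds ints
    nums.add(3);
    nums.add(4);
    // nums.add("five");           // would not compile: not an int
    return nums.getAt(0) + nums.getAt(1);
}

std::string joinWords() {
    SimpleVector<std::string> words;  // a distinct type from SimpleVector<int>
    words.add("hello");
    words.add("world");
    return words.getAt(0) + " " + words.getAt(1);
}
```

The same class text serves ints, strings, and doubles, while the compiler rejects attempts to mix element types within one container.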
1056 00:49:49,000 --> 00:49:51,000 A 1057 00:49:51,000 --> 00:49:54,000 little bit of the syntax for how I make a vector of [inaudible] how I make a vector of 1058 00:49:54,000 --> 00:49:56,000 strings, and 1059 00:49:56,000 --> 00:49:56,000 then 1060 00:49:56,000 --> 00:49:58,000 some of the things that you could try to mix up 1061 00:49:58,000 --> 00:50:01,000 that the template will actually 1062 00:50:01,000 --> 00:50:02,000 not let you get away with, 1063 00:50:02,000 --> 00:50:05,000 mixing those types. So 1064 00:50:05,000 --> 00:50:07,000 we'll see this on Friday, so don't worry, 1065 00:50:07,000 --> 00:50:08,000 1066 00:50:08,000 --> 00:50:10,000 there will be time to look at it 1067 00:50:10,000 --> 00:50:17,000 and meanwhile good luck getting your compiler set up.