1
00:00:00,000 --> 00:00:10,000


2
00:00:10,000 --> 00:00:12,000

3
00:00:12,000 --> 00:00:15,000
This presentation is delivered by the Stanford Center for Professional

4
00:00:15,000 --> 00:00:23,000
Development.

5
00:00:23,000 --> 00:00:27,000
Any questions administratively? 

6
00:00:27,000 --> 00:00:29,000
I had one 

7
00:00:29,000 --> 00:00:33,000
student come and hang out with me on Friday. 

8
00:00:33,000 --> 00:00:37,000
I'm hoping that was because it was so last minute. 

9
00:00:37,000 --> 00:00:39,000


10
00:00:39,000 --> 00:00:42,000
This Friday, I don't have a meeting, so 

11
00:00:42,000 --> 00:00:47,000
put it on your calendars now for Friday hanging out. 

12
00:00:47,000 --> 00:00:48,000
C++ libraries: the notion of a

13
00:00:48,000 --> 00:00:53,000
library is really nothing more than saying you've got some functionality 

14
00:00:53,000 --> 00:00:57,000
that you want to provide to all users of C++ or all 

15
00:00:57,000 --> 00:00:59,000
students enrolled 

16
00:00:59,000 --> 00:01:00,000
in CS106. There is some 

17
00:01:00,000 --> 00:01:03,000
reasonable grouping to these things. You have a bunch of operations that operate on 

18
00:01:03,000 --> 00:01:07,000
strings or that allow you to do graphics work or allow you to do event handling

19
00:01:07,000 --> 00:01:08,000
or something. 

20
00:01:08,000 --> 00:01:12,000
The library is the packaging device in C++ where you 

21
00:01:12,000 --> 00:01:14,000
say here's a set of routines. 

22
00:01:14,000 --> 00:01:18,000
Typically, it comes with two pieces. One is the interface or 

23
00:01:18,000 --> 00:01:20,000
declaration or header file. 

24
00:01:20,000 --> 00:01:23,000
It tells you about what routines are in there: what are their names? What are the

25
00:01:23,000 --> 00:01:24,000
prototypes? How do you use them? 

26
00:01:24,000 --> 00:01:27,000
It often contains good comments about 

27
00:01:27,000 --> 00:01:31,000
the things that you would need to know as a client using that facility:

28
00:01:31,000 --> 00:01:33,000
how to use it effectively 

29
00:01:33,000 --> 00:01:36,000
and correctly. Then there is the code that really implements it: when you make

30
00:01:36,000 --> 00:01:40,000
the call to a substring operation, how does it actually work? Well, there's 

31
00:01:40,000 --> 00:01:43,000
some code that backs it that does the operation that gets called at 

32
00:01:43,000 --> 00:01:47,000
runtime when you make a call to that function. The 

33
00:01:47,000 --> 00:01:50,000
libraries that we're going to see this quarter 

34
00:01:50,000 --> 00:01:52,000
form roughly two big groups. There is the C++ 

35
00:01:52,000 --> 00:01:56,000
standard library, so things that come with every C++ compiler. 

36
00:01:56,000 --> 00:01:59,000
No matter where you continue coding in C++, 

37
00:01:59,000 --> 00:02:03,000
you will always find things like the C++ string, the C++

38
00:02:03,000 --> 00:02:06,000
iostream, the file stream which is the fstream.

39
00:02:06,000 --> 00:02:07,000
There's a 

40
00:02:07,000 --> 00:02:10,000
math header. There are headers that deal with 

41
00:02:10,000 --> 00:02:13,000
all sorts of other facilities that are beyond what we're doing here. 

42
00:02:13,000 --> 00:02:16,000
The ones that we'll see most commonly in the early part are

43
00:02:16,000 --> 00:02:18,000
string and stream. 

44
00:02:18,000 --> 00:02:22,000
The typical include for them is going to be the angle brackets. That's the sign to 

45
00:02:22,000 --> 00:02:25,000
the compiler we're looking for something from the standard header locations. 

46
00:02:25,000 --> 00:02:30,000
One way to remind yourself about how to distinguish these from our 

47
00:02:30,000 --> 00:02:31,000
special libraries is that you're 

48
00:02:31,000 --> 00:02:34,000
going to see these very terse and lowercase names. Part of the legacy of C 

49
00:02:34,000 --> 00:02:37,000
and C++ was as a professional programmer's 

50
00:02:37,000 --> 00:02:40,000
tool, they tended to value terseness over any kind of 

51
00:02:40,000 --> 00:02:43,000
verbose and descriptive names, making it a little easier to type a little faster to get 

52
00:02:43,000 --> 00:02:47,000
your point across. Things like the cout, which is the console out stream, 

53
00:02:47,000 --> 00:02:52,000
getline, the [inaudible] call. There's the substr function name there. Those

54
00:02:52,000 --> 00:02:55,000
tend to be short and throw away vowels where they can. They tend to be all lowercase. 

55
00:02:55,000 --> 00:02:56,000


56
00:02:56,000 --> 00:02:59,000
Whenever you're looking at a routine, you might wonder where it comes from. 

57
00:02:59,000 --> 00:03:01,000
If it has this 

58
00:03:01,000 --> 00:03:03,000
capitalization scheme, it's likely to be something coming out of the standard 

59
00:03:03,000 --> 00:03:05,000
libraries. 

60
00:03:05,000 --> 00:03:08,000
In addition to what we have present in the standard, 

61
00:03:08,000 --> 00:03:12,000
we also have about seven libraries that we've included as part 

62
00:03:12,000 --> 00:03:16,000
of CS106 mostly to make our lives a little bit easier. 

63
00:03:16,000 --> 00:03:16,000


64
00:03:16,000 --> 00:03:20,000
Things like the random library or the simple I/O library actually layer

65
00:03:20,000 --> 00:03:23,000
on existing functionality that is already present in the standard libraries, 

66
00:03:23,000 --> 00:03:26,000
but the way the functionality is expressed in the standard is just a little bit 

67
00:03:26,000 --> 00:03:28,000
awkward or 

68
00:03:28,000 --> 00:03:30,000
unhelpful for the task we need to do. 

69
00:03:30,000 --> 00:03:34,000
We've provided a layer that cleans it up for you. 

70
00:03:34,000 --> 00:03:38,000
The graphics is a good example of where there is no graphics library included in 

71
00:03:38,000 --> 00:03:42,000
standard C++. If you're working on graphics on Windows, you have access to 

72
00:03:42,000 --> 00:03:46,000
a different toolkit than you do on the Mac or on Linux or some other platform. 

73
00:03:46,000 --> 00:03:47,000
We have 

74
00:03:47,000 --> 00:03:51,000
tried to abstract out a very simple graphics library that we can run on 

75
00:03:51,000 --> 00:03:53,000
both Mac and Windows. We provide one interface that

76
00:03:53,000 --> 00:03:57,000
in turn talks to your platform in its native language 

77
00:03:57,000 --> 00:04:01,000
to make those windows and drawing things happen. Our

78
00:04:01,000 --> 00:04:04,000
header files are always in the double quotes. We 

79
00:04:04,000 --> 00:04:08,000
typically use a strategy of having capitalized verbose names. We don't 

80
00:04:08,000 --> 00:04:11,000
throw away our vowels. We try to make a little bit of sense to describe 

81
00:04:11,000 --> 00:04:13,000
the action that's being taken. 

82
00:04:13,000 --> 00:04:14,000


83
00:04:14,000 --> 00:04:17,000


84
00:04:17,000 --> 00:04:21,000
Today, I'm going to look at one of the

85
00:04:21,000 --> 00:04:22,000
106 libraries here at the beginning 

86
00:04:22,000 --> 00:04:25,000
because it's a nice, easy one to get our head around. I'm going to go and look through 

87
00:04:25,000 --> 00:04:28,000
the C++ string library 

88
00:04:28,000 --> 00:04:31,000
and then I'm going to hopefully get a chance to even start talking a little about the 

89
00:04:31,000 --> 00:04:35,000
C++ stream library. Along the way, there will be 

90
00:04:35,000 --> 00:04:38,000
two additional 106 libraries that help out with string and stream that provide a 

91
00:04:38,000 --> 00:04:40,000
little bit more functionality than what's already there. 

92
00:04:40,000 --> 00:04:42,000
Randomness 

93
00:04:42,000 --> 00:04:45,000
comes up in all kinds of simulations and game playing. You want

94
00:04:45,000 --> 00:04:47,000
the computer to simulate 

95
00:04:47,000 --> 00:04:51,000
some random behavior: flipping a coin, rolling a die or shuffling

96
00:04:51,000 --> 00:04:54,000
an array or a deck of cards. 

97
00:04:54,000 --> 00:04:57,000
Computers actually aren't capable of true randomness in the sense 

98
00:04:57,000 --> 00:04:59,000
that you might think in the real world, 

99
00:04:59,000 --> 00:05:02,000
but they have what's called pseudorandom behavior, where they can generate

100
00:05:02,000 --> 00:05:05,000
numbers in a sequence that appears effectively random from the outside even 

101
00:05:05,000 --> 00:05:09,000
though there actually is some determinism in how it operates. 

102
00:05:09,000 --> 00:05:13,000
There is a set in our library: there are four functions

103
00:05:13,000 --> 00:05:16,000
that form the CS106 random library 

104
00:05:16,000 --> 00:05:17,000
that are used for 

105
00:05:17,000 --> 00:05:20,000
all kinds of random behavior that you want. 

106
00:05:20,000 --> 00:05:23,000
I note here that they are free functions, and by free functions, I mean 

107
00:05:23,000 --> 00:05:24,000
functions that aren't 

108
00:05:24,000 --> 00:05:28,000
on a particular class. They're actually globally accessible. You don't call them by 

109
00:05:28,000 --> 00:05:29,000
sending a message to a particular object.

110
00:05:29,000 --> 00:05:33,000
They just exist at the top-level namespace and you can just call them

111
00:05:33,000 --> 00:05:36,000
anywhere and any time. When you're ready to get some random behavior, you make a call 

112
00:05:36,000 --> 00:05:38,000
to one of these 

113
00:05:38,000 --> 00:05:39,000
routines. 

114
00:05:39,000 --> 00:05:42,000
There is an initialization routine for this library. 

115
00:05:42,000 --> 00:05:44,000
The Randomize call,

116
00:05:44,000 --> 00:05:47,000
that's called once and exactly once in your program, usually at the very 

117
00:05:47,000 --> 00:05:48,000
beginning 

118
00:05:48,000 --> 00:05:51,000
to set up a new random sequence. That's what's called seeding the

119
00:05:51,000 --> 00:05:54,000
generator to get it started in a new place. 

120
00:05:54,000 --> 00:05:56,000
Then, once you've made that call, 

121
00:05:56,000 --> 00:06:00,000
you can intersperse any number of these calls to simulate certain random 

122
00:06:00,000 --> 00:06:01,000
events. 

123
00:06:01,000 --> 00:06:05,000
The standard random number generator that's in C++ 

124
00:06:05,000 --> 00:06:08,000
provides all of this through one call. It will generate a random number

125
00:06:08,000 --> 00:06:11,000
from zero to the largest number possible, 

126
00:06:11,000 --> 00:06:14,000
and then you can decide how to map that to other things. If you want 

127
00:06:14,000 --> 00:06:18,000
the ability to flip a coin, you want to say half the numbers are odd and half the 

128
00:06:18,000 --> 00:06:20,000
numbers are even. 

129
00:06:20,000 --> 00:06:23,000
You could do something like generate a number and see if it's odd or even, or see if 

130
00:06:23,000 --> 00:06:23,000
it's 

131
00:06:23,000 --> 00:06:27,000
from the bottom half of the range versus the top half. 
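
A minimal sketch of the coin flip just described, assuming the standard generator in question is std::rand from <cstdlib>. The helper names isHeads and flipCoin are made up for illustration and are not part of any library.

```cpp
#include <cstdlib>   // std::rand, RAND_MAX

// Hypothetical helper: maps a raw value in [0, RAND_MAX] to heads (true)
// or tails (false) by testing whether it falls in the bottom half of the range.
bool isHeads(int raw) {
    return raw <= RAND_MAX / 2;
}

// Hypothetical helper: flips a coin using the standard generator.
// Seed the generator once with std::srand before the first flip if you
// want a different sequence on each run.
bool flipCoin() {
    return isHeads(std::rand());
}
```

Testing the low bit (odd versus even) would also work in principle; splitting the range in half is the other option mentioned above.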

132
00:06:27,000 --> 00:06:30,000
What these functions do is just provide that functionality and 

133
00:06:30,000 --> 00:06:33,000
package it up for you neatly. You can say things like random integer low to 

134
00:06:33,000 --> 00:06:37,000
high: you say one to five. It will give you a number from one to five inclusive.

135
00:06:37,000 --> 00:06:40,000
If you keep calling that, you should see 

136
00:06:40,000 --> 00:06:45,000
an unpredictable sequence of the one, two, three, four, five coming in 

137
00:06:45,000 --> 00:06:49,000
jumbled and mixed up. There are no real guarantees about what order you'll see them in;

138
00:06:49,000 --> 00:06:51,000
that 

139
00:06:51,000 --> 00:06:53,000
will allow you to simulate random events. 

140
00:06:53,000 --> 00:06:56,000
Random real is the same sort of idea, but in this case using boundaries that are

141
00:06:56,000 --> 00:06:59,000
expressed as real numbers and returning a real number.

142
00:06:59,000 --> 00:07:03,000
Again, the bounds are inclusive, so it can actually return the number low or high or

143
00:07:03,000 --> 00:07:05,000
anything in between. 

144
00:07:05,000 --> 00:07:06,000
The last one is 

145
00:07:06,000 --> 00:07:09,000
simulating a probability 

146
00:07:09,000 --> 00:07:11,000
true/false value. 

147
00:07:11,000 --> 00:07:14,000
Given the probability of 0.5, half the time it should return true 

148
00:07:14,000 --> 00:07:16,000
and half the time it should return false. 

149
00:07:16,000 --> 00:07:20,000
If you give it a probability of .25, one quarter of the time it will 

150
00:07:20,000 --> 00:07:23,000
return true and the other three quarters of the time it will return false. It allows you 

151
00:07:23,000 --> 00:07:27,000
to simulate coin flips or other random events where you have a [inaudible] 

152
00:07:27,000 --> 00:07:28,000


153
00:07:28,000 --> 00:07:30,000
distribution. The point of a 

154
00:07:30,000 --> 00:07:35,000
library is to take some set 

155
00:07:35,000 --> 00:07:37,000
of facilities that are needed, 

156
00:07:37,000 --> 00:07:39,000
package it up, have a vision of 

157
00:07:39,000 --> 00:07:43,000
how they work together, a naming convention and design convention that makes them 

158
00:07:43,000 --> 00:07:44,000
coherent, 

159
00:07:44,000 --> 00:07:48,000
so that they provide some convenience and they're complete. They cover all the

160
00:07:48,000 --> 00:07:50,000
bases. These three 

161
00:07:50,000 --> 00:07:52,000
provide a pretty good range of different kinds of random events. 

162
00:07:52,000 --> 00:07:55,000
There are still other things you might need to simulate, but you can typically do it in 

163
00:07:55,000 --> 00:07:59,000
terms of using one of these. You could also have left one out and had to

164
00:07:59,000 --> 00:08:01,000
simulate the others from it, 

165
00:08:01,000 --> 00:08:04,000
but each of them has a 

166
00:08:04,000 --> 00:08:05,000
client use that's pretty 

167
00:08:05,000 --> 00:08:07,000
handy, so it actually has all three of them 

168
00:08:07,000 --> 00:08:08,000


169
00:08:08,000 --> 00:08:10,000
for your use. 
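
To make the layering idea concrete, here is a rough sketch of how routines like these could sit on top of std::rand. The function names are my guess at the spelling of the spoken names, and the bodies are stand-ins written for illustration; they are not the CS106 library's actual code. The inclusive bounds follow the description in lecture, and the real library may differ in details.

```cpp
#include <cassert>
#include <cstdlib>   // std::rand, std::srand, RAND_MAX
#include <ctime>     // std::time

// Stand-in for the initialization routine: seeds the generator from the clock.
void Randomize() {
    std::srand(static_cast<unsigned>(std::time(nullptr)));
}

// Stand-in: returns an integer in [low, high], both ends inclusive.
int RandomInteger(int low, int high) {
    assert(low <= high);
    double unit = std::rand() / (RAND_MAX + 1.0);            // in [0, 1)
    double span = static_cast<double>(high) - low + 1.0;     // count of values
    return low + static_cast<int>(unit * span);
}

// Stand-in: returns a real number in [low, high], as described in lecture.
double RandomReal(double low, double high) {
    assert(low <= high);
    double unit = std::rand() / static_cast<double>(RAND_MAX);  // in [0, 1]
    return low + unit * (high - low);
}

// Stand-in: returns true with the given probability (0.0 never, 1.0 always).
bool RandomChance(double probability) {
    return std::rand() / (RAND_MAX + 1.0) < probability;
}
```

A client would call Randomize() once at startup and then something like RandomInteger(1, 6) for a die roll or RandomChance(0.5) for a coin flip.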

170
00:08:10,000 --> 00:08:18,000
[Inaudible] comes out of random.h in the 

171
00:08:18,000 --> 00:08:22,000
CS106 library. The .h is just a convention for the 

172
00:08:22,000 --> 00:08:25,000
extension for header files. .txt gets used on text files and .cpp gets used 

173
00:08:25,000 --> 00:08:29,000
on source files. .h is for header files, which are 

174
00:08:29,000 --> 00:08:32,000
descriptions of routines but no real code is typically in the .h file. 

175
00:08:32,000 --> 00:08:34,000
It's an interface file they call 

176
00:08:34,000 --> 00:08:37,000
it. In Java, there isn't that distinction. Everything is all in 

177
00:08:37,000 --> 00:08:38,000
one file, 

178
00:08:38,000 --> 00:08:39,000
but the 

179
00:08:39,000 --> 00:08:42,000
definition of a class serves as both the description for a client using it as 

180
00:08:42,000 --> 00:08:44,000
well as the implementer implementing it. 

181
00:08:44,000 --> 00:08:50,000
C++ has them separated. 

182
00:08:50,000 --> 00:08:53,000
Let me look at C++ string 

183
00:08:53,000 --> 00:08:53,000


184
00:08:53,000 --> 00:08:56,000
as the next example of something we have to 

185
00:08:56,000 --> 00:08:58,000
make use of in getting things done. The C++ 

186
00:08:58,000 --> 00:09:02,000
string type is actually defined in a header file, and it's a 

187
00:09:02,000 --> 00:09:04,000
library that's added into the language. 

188
00:09:04,000 --> 00:09:08,000
Unlike int and bool and double that are part of the language and can't be 

189
00:09:08,000 --> 00:09:12,000
separated from it, string is kind of an add on that's defined through a 

190
00:09:12,000 --> 00:09:13,000
library. 

191
00:09:13,000 --> 00:09:15,000
It models a sequence of 

192
00:09:15,000 --> 00:09:18,000
characters including everything: letters, numbers, digits,

193
00:09:18,000 --> 00:09:20,000
punctuation,

194
00:09:20,000 --> 00:09:24,000
and the string is defined as a class. In the same way that in Java 

195
00:09:24,000 --> 00:09:28,000
you're used to the class being the pattern from which you can declare and initialize 

196
00:09:28,000 --> 00:09:31,000
objects that you can then message and do things with, 

197
00:09:31,000 --> 00:09:33,000
string in the C++ world is the same sort of deal. 

198
00:09:33,000 --> 00:09:37,000
You have a string class. You initialize string objects. You send messages to those 

199
00:09:37,000 --> 00:09:40,000
string objects to ask them to do things for you. 

200
00:09:40,000 --> 00:09:43,000
Asking a string to give you the character at a particular position 

201
00:09:43,000 --> 00:09:44,000
or the number of characters 

202
00:09:44,000 --> 00:09:48,000
or to insert some characters or change some characters within the body of the 

203
00:09:48,000 --> 00:09:50,000
string are all done by 

204
00:09:50,000 --> 00:09:53,000
messaging the string. A 

205
00:09:53,000 --> 00:09:54,000
couple simple operations ? I 

206
00:09:54,000 --> 00:09:57,000
put a little bit of string code to get started. The variable name is actually 

207
00:09:57,000 --> 00:09:59,000
string itself. 

208
00:09:59,000 --> 00:10:01,000
There is something a little bit different about string 

209
00:10:01,000 --> 00:10:04,000
when you declare it and you don't initialize it relative to the things we know 

210
00:10:04,000 --> 00:10:06,000
about primitives. 

211
00:10:06,000 --> 00:10:09,000
When I say string S and I don't say anything else, 

212
00:10:09,000 --> 00:10:11,000


213
00:10:11,000 --> 00:10:14,000
you might assume then 

214
00:10:14,000 --> 00:10:16,000
similar to the primitive types that S is garbage. It has some sequence of 

215
00:10:16,000 --> 00:10:19,000


216
00:10:19,000 --> 00:10:20,000
characters. 

217
00:10:20,000 --> 00:10:24,000
In fact, string has what's called a default constructor, 

218
00:10:24,000 --> 00:10:26,000
one that's invoked when you don't specify otherwise 

219
00:10:26,000 --> 00:10:29,000
such that when you initialize a string with no other explicit 

220
00:10:29,000 --> 00:10:33,000
information, it will assume you meant to set it up to be the empty string. 

221
00:10:33,000 --> 00:10:37,000
Writing string S actually declares and initializes a string

222
00:10:37,000 --> 00:10:41,000
with no characters. 

223
00:10:41,000 --> 00:10:43,000
If I were to ask it for its length, 

224
00:10:43,000 --> 00:10:45,000
which is the way we 

225
00:10:45,000 --> 00:10:47,000
ask for the number of characters in it, as is being used right here, for

226
00:10:47,000 --> 00:10:49,000
example, S.length 

227
00:10:49,000 --> 00:10:51,000
and then the open and close parens there.

228
00:10:51,000 --> 00:10:55,000
It would return zero on the empty string. 
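
A small sketch of that point, using only std::string from the standard library. The function name lengthOfDefaultString is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: declares a string without an initializer and
// reports how many characters it holds.
std::size_t lengthOfDefaultString() {
    std::string s;        // default constructor runs: s is "", not garbage
    return s.length();    // number of characters, 0 for the empty string
}
```

Contrast this with a plain int declared the same way inside a function, whose value is indeterminate until you assign to it.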

229
00:10:55,000 --> 00:10:57,000
We can use square brackets 

230
00:10:57,000 --> 00:11:00,000
like the array notation you're probably familiar with to 

231
00:11:00,000 --> 00:11:03,000
access individual characters of the string. 

232
00:11:03,000 --> 00:11:05,000
Applying the square brackets to S,

233
00:11:05,000 --> 00:11:08,000
S sub i, as I'll sometimes call this,

234
00:11:08,000 --> 00:11:11,000
accesses the ith character within the string.

235
00:11:11,000 --> 00:11:15,000
The characters are indexed starting from zero, so if I have a ten-character string, they are actually

236
00:11:15,000 --> 00:11:17,000
indexed zero through nine. 

237
00:11:17,000 --> 00:11:19,000
With the C++ string,

238
00:11:19,000 --> 00:11:23,000
the square brackets allow you to access that character both to read it and to write 

239
00:11:23,000 --> 00:11:27,000
it. A C++ string is mutable. 

240
00:11:27,000 --> 00:11:31,000
The Java string is immutable. Once you create a string, it has a certain 

241
00:11:31,000 --> 00:11:32,000
sequence of characters, 

242
00:11:32,000 --> 00:11:33,000
and although you can 

243
00:11:33,000 --> 00:11:37,000
make a new string and overwrite that one, you can't go in and just manipulate the string 

244
00:11:37,000 --> 00:11:39,000
in place and change its contents. 

245
00:11:39,000 --> 00:11:41,000
In C++, you can do that.

246
00:11:41,000 --> 00:11:44,000
I initialized the string in this case to the string literal or string

247
00:11:44,000 --> 00:11:47,000
constant CS106, and then I ran a loop 

248
00:11:47,000 --> 00:11:49,000
over the index of the 

249
00:11:49,000 --> 00:11:51,000
proper range of indices for this string, 

250
00:11:51,000 --> 00:11:53,000
and then I used toupper,

251
00:11:53,000 --> 00:11:55,000
which is a 

252
00:11:55,000 --> 00:11:58,000
function from the standard C library that takes a character and returns its 

253
00:11:58,000 --> 00:12:01,000
uppercase equivalent or unchanged if it's not a letter, 

254
00:12:01,000 --> 00:12:05,000
and then assigned to S of I the result of toupper.

255
00:12:05,000 --> 00:12:08,000
The effect of this was for each lowercase character in the string, we overwrote

256
00:12:08,000 --> 00:12:12,000
it with its uppercase equivalent. Any other existing uppercase or 

257
00:12:12,000 --> 00:12:14,000
punctuation characters 

258
00:12:14,000 --> 00:12:17,000
were left unchanged. 
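
The loop on the slide is not reproduced in this transcript, so here is a sketch of what such a loop could look like. The function name toUpperCase is made up for illustration; std::toupper comes from <cctype>.

```cpp
#include <cctype>    // std::toupper
#include <string>

// Hypothetical helper: returns a copy of s with every lowercase letter
// overwritten in place by its uppercase equivalent.
std::string toUpperCase(std::string s) {
    for (std::size_t i = 0; i < s.length(); i++) {
        // Square brackets both read and write the character at index i.
        // The unsigned char cast avoids undefined behavior for negative chars.
        s[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s[i])));
    }
    return s;
}
```

Digits, punctuation and letters that are already uppercase pass through toupper unchanged.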

259
00:12:17,000 --> 00:12:21,000
You can make assignment into that, which is something you cannot do 

260
00:12:21,000 --> 00:12:22,000


261
00:12:22,000 --> 00:12:26,000
with the Java string. 

262
00:12:26,000 --> 00:12:31,000
Student: Can you insert [inaudible]? Instructor (Julie Zelenski): I certainly can. I'm going to show you that in about two slides. There are a whole set of

263
00:12:31,000 --> 00:12:34,000
member functions that then do these things. This one allows 

264
00:12:34,000 --> 00:12:35,000
you to have the sequence of 

265
00:12:35,000 --> 00:12:38,000
five characters. What if I want to put one in

266
00:12:38,000 --> 00:12:41,000
the middle? I'll use something called insert. If I want to take one out in the middle, I use something called replace 

267
00:12:41,000 --> 00:12:42,000
or 

268
00:12:42,000 --> 00:12:50,000
erase to pull it out and put something else in. 

269
00:12:50,000 --> 00:12:53,000
Many of the built in operators 

270
00:12:53,000 --> 00:13:00,000
like equals, less than, less than or equal to, and not equal have extended

271
00:13:00,000 --> 00:13:03,000
meanings that apply to strings when they're used as the operands for those 

272
00:13:03,000 --> 00:13:04,000
types. 

273
00:13:04,000 --> 00:13:09,000
I can assign two strings using equals. If I say string S equals T 

274
00:13:09,000 --> 00:13:10,000
as I'm doing right here, 

275
00:13:10,000 --> 00:13:14,000
then whatever value T is, 

276
00:13:14,000 --> 00:13:16,000
S becomes a copy of that. S 

277
00:13:16,000 --> 00:13:20,000
and T have the same value, but they're not related in any 

278
00:13:20,000 --> 00:13:23,000
important way going forward. We have two copies that both happen to have 

279
00:13:23,000 --> 00:13:24,000
the same five characters. 

280
00:13:24,000 --> 00:13:28,000
For example, the first thing I did after this was change the first 

281
00:13:28,000 --> 00:13:30,000
character of T to be J, 

282
00:13:30,000 --> 00:13:32,000
so now T is jello. S 

283
00:13:32,000 --> 00:13:33,000
is still hello. 

284
00:13:33,000 --> 00:13:37,000
It was initialized from the same sequence, but they don't retain any kind of 

285
00:13:37,000 --> 00:13:39,000
aliasing from that point forward. 
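
A short sketch of the copy behavior just described, under the assumption that the slide used the strings hello and jello as spoken. The function name copyThenModify is made up for illustration.

```cpp
#include <string>
#include <utility>   // std::pair, std::make_pair

// Hypothetical helper: copies t into s, changes t, and returns (s, t)
// so the caller can see that the two strings are independent.
std::pair<std::string, std::string> copyThenModify() {
    std::string t = "hello";
    std::string s = t;    // s gets its own copy of the five characters
    t[0] = 'j';           // only t changes: t is "jello", s is still "hello"
    return std::make_pair(s, t);
}
```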

286
00:13:39,000 --> 00:13:42,000
I'm able to compare two strings directly to see whether they're lexicographically 

287
00:13:42,000 --> 00:13:46,000
equal or less than according to ASCII ordering. 

288
00:13:46,000 --> 00:13:51,000
I can say if S == T. In Java, that didn't do what you wanted. It

289
00:13:51,000 --> 00:13:54,000
did compile, but it didn't test the thing you were hoping for. In C++, 

290
00:13:54,000 --> 00:13:56,000
it does do what you're expecting, which is to say 

291
00:13:56,000 --> 00:14:00,000
take two strings and say do they have the same sequence of characters. 

292
00:14:00,000 --> 00:14:01,000
If I have 

293
00:14:01,000 --> 00:14:04,000
assigned S to T, if I do S == T, it's going to say yes, they have

294
00:14:04,000 --> 00:14:07,000
the same five characters in the same order. Once I've changed one of them, 

295
00:14:07,000 --> 00:14:10,000
then they'll come up as not equal. I 

296
00:14:10,000 --> 00:14:14,000
could do less than and less than or equal to to see in ASCII ordering which 

297
00:14:14,000 --> 00:14:15,000
one precedes the other to

298
00:14:15,000 --> 00:14:17,000
do sorting of strings. 
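
As a sketch of those comparisons, the two wrappers below do nothing more than apply == and < to std::string operands. The names sameCharacters and comesBefore are made up for illustration.

```cpp
#include <string>

// Hypothetical helper: true when a and b hold the same characters in the same order.
bool sameCharacters(const std::string& a, const std::string& b) {
    return a == b;
}

// Hypothetical helper: true when a sorts before b in character-code order.
bool comesBefore(const std::string& a, const std::string& b) {
    return a < b;
}
```

Because the ordering follows character codes, every uppercase letter sorts before every lowercase one in ASCII, so comesBefore("Zebra", "apple") is true.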

299
00:14:17,000 --> 00:14:21,000
Just as you'd expect for the integer and double

300
00:14:21,000 --> 00:14:24,000
types, those operators have

301
00:14:24,000 --> 00:14:27,000
reasonable meanings applied to strings. The 

302
00:14:27,000 --> 00:14:31,000
plus and plus equals are what's called overloaded, so extended

303
00:14:31,000 --> 00:14:35,000
beyond its usual meaning for addition to do concatenation of strings. 

304
00:14:35,000 --> 00:14:39,000
I can take S and I can add to it a character space at the end, so now 

305
00:14:39,000 --> 00:14:42,000
instead of being just hello, it's hello space. 

306
00:14:42,000 --> 00:14:44,000
I can also add 

307
00:14:44,000 --> 00:14:47,000
strings to strings, so I can take T and use the shorthand plus equals, 

308
00:14:47,000 --> 00:14:51,000
which takes jello and turns it into jello jello there, 

309
00:14:51,000 --> 00:14:53,000
attaching another one on the end. 

310
00:14:53,000 --> 00:14:57,000
The concatenation for the C++ string only operates on strings and 

311
00:14:57,000 --> 00:14:58,000
characters 

312
00:14:58,000 --> 00:15:01,000
whereas in Java, there's this kind of automatic mechanism where things 

313
00:15:01,000 --> 00:15:04,000
like doubles and integers are converted to string and 

314
00:15:04,000 --> 00:15:06,000
added into the concatenation. 

315
00:15:06,000 --> 00:15:09,000
That does not happen in C++. Concatenation is just for strings and 

316
00:15:09,000 --> 00:15:12,000
characters. If you have something that's in numeric form and you want 

317
00:15:12,000 --> 00:15:16,000
to add it into a string, you'll have to first convert it to a string. I'll show you 

318
00:15:16,000 --> 00:15:21,000
a routine that does that a little bit later. 
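
A sketch of concatenation with characters and strings, plus one way to handle a number. The helper names are made up for illustration. For the numeric case I am assuming a C++11 compiler and using std::to_string; that is my substitute and not necessarily the routine promised above.

```cpp
#include <string>

// Hypothetical helper: appends one space character ("hello" becomes "hello ").
std::string withTrailingSpace(std::string s) {
    s += ' ';
    return s;
}

// Hypothetical helper: appends a string to a copy of itself
// ("jello" becomes "jellojello").
std::string doubled(const std::string& t) {
    std::string result = t;
    result += t;
    return result;
}

// Hypothetical helper: numbers are not converted automatically,
// so convert explicitly before concatenating.
std::string labelWithNumber(const std::string& label, int n) {
    return label + std::to_string(n);   // std::to_string requires C++11
}
```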

319
00:15:21,000 --> 00:15:24,000


320
00:15:24,000 --> 00:15:26,000


321
00:15:26,000 --> 00:15:30,000
I would be happy to do that. Most of the things that I'm talking about actually are in 

322
00:15:30,000 --> 00:15:32,000
handout four as well as

323
00:15:32,000 --> 00:15:35,000
repeated in handout ten and in the reader. There are a million places 

324
00:15:35,000 --> 00:15:39,000
you can look for information on strings. 

325
00:15:39,000 --> 00:15:44,000


326
00:15:44,000 --> 00:15:49,000
Most of the heavy lifting on the strings is done via these member functions. 

327
00:15:49,000 --> 00:15:52,000
These are part of the string class, and so these are operations that 

328
00:15:52,000 --> 00:15:56,000
apply to string receiver objects. They're not free functions. You can't call them 

329
00:15:56,000 --> 00:15:58,000
outside of a 

330
00:15:58,000 --> 00:15:59,000
usage where you're saying 

331
00:15:59,000 --> 00:16:03,000
on some receiver string, apply this function 

332
00:16:03,000 --> 00:16:05,000
using these arguments. 

333
00:16:05,000 --> 00:16:07,000
For example, the length 

334
00:16:07,000 --> 00:16:11,000
member function is applied to a string. Just to note here, the word member 

335
00:16:11,000 --> 00:16:11,000
function 

336
00:16:11,000 --> 00:16:13,000
is vocabulary-wise 

337
00:16:13,000 --> 00:16:17,000
the same thing as method. Java programmers tend to call the 

338
00:16:17,000 --> 00:16:19,000
functions that are defined as part of a class methods. C++ programmers tend to call 

339
00:16:19,000 --> 00:16:21,000
them member functions. 

340
00:16:21,000 --> 00:16:24,000
They really mean the same thing, but I do try to use the word member function because we 

341
00:16:24,000 --> 00:16:27,000
are a C++ class, and that is kind of the convention. 

342
00:16:27,000 --> 00:16:30,000
I'll probably end up using both accidentally 

343
00:16:30,000 --> 00:16:31,000
without 

344
00:16:31,000 --> 00:16:33,000
even noticing it. 

345
00:16:33,000 --> 00:16:35,000
Hopefully, it won't cause you too much grief there. 

346
00:16:35,000 --> 00:16:38,000
The member function syntax here, str.function(args), is saying apply the

347
00:16:38,000 --> 00:16:42,000
function, send the message function to this particular string with these 

348
00:16:42,000 --> 00:16:44,000
arguments and then get its 

349
00:16:44,000 --> 00:16:46,000
answer back or have that operation happen. 

350
00:16:46,000 --> 00:16:49,000
I can ask a string for its length in terms of an 

351
00:16:49,000 --> 00:16:50,000
integer. It tells you the number of characters. 

352
00:16:50,000 --> 00:16:52,000
I can ask a string 

353
00:16:52,000 --> 00:16:57,000
to look for a particular character or string sequence substring 

354
00:16:57,000 --> 00:16:59,000
within the characters that 

355
00:16:59,000 --> 00:17:01,000
that string maintains right 

356
00:17:01,000 --> 00:17:04,000
now. It will return the index of the first occurrence found, scanning from 

357
00:17:04,000 --> 00:17:05,000
left to right 

358
00:17:05,000 --> 00:17:10,000
or string::npos. It's a little bit of a funny return value, but

359
00:17:10,000 --> 00:17:12,000
it is the 

360
00:17:12,000 --> 00:17:15,000
return value that says I didn't find it. It's string::npos. It's an integer

361
00:17:15,000 --> 00:17:16,000
value 

362
00:17:16,000 --> 00:17:17,000
that is 

363
00:17:17,000 --> 00:17:20,000
distinct from any other valid index within the 

364
00:17:20,000 --> 00:17:22,000
string itself to tell you it didn't find it. 

365
00:17:22,000 --> 00:17:24,000
Both of these 

366
00:17:24,000 --> 00:17:26,000
have a default argument on them. We 

367
00:17:26,000 --> 00:17:28,000
talked a little bit about that last time. 

368
00:17:28,000 --> 00:17:33,000
If I do not specify that second argument when I'm making a find call, 

369
00:17:33,000 --> 00:17:35,000
it will assume that you want to start looking from the beginning. 

370
00:17:35,000 --> 00:17:39,000
If I do specify it, then it will start from that position 

371
00:17:39,000 --> 00:17:42,000
and scan from there to the end of the string. It's a way of 

372
00:17:42,000 --> 00:17:45,000
targeting the place you're looking for a little more precisely than just 

373
00:17:45,000 --> 00:17:49,000
starting from the beginning and going to the end. 
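
Here is a sketch that uses both forms of find: the first call relies on the default starting position, and later calls pass an explicit one. The function name countChar is made up for illustration.

```cpp
#include <string>

// Hypothetical helper: counts how many times ch occurs in s.
int countChar(const std::string& s, char ch) {
    int count = 0;
    std::string::size_type pos = s.find(ch);    // no second argument: start at 0
    while (pos != std::string::npos) {          // npos means "not found"
        count++;
        pos = s.find(ch, pos + 1);              // resume just past the last match
    }
    return count;
}
```

Comparing against std::string::npos, rather than against -1 or some other literal, keeps the test correct whatever the underlying integer type happens to be.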

374
00:17:49,000 --> 00:17:52,000
C++ does allow what's called overloading. 

375
00:17:52,000 --> 00:17:56,000
In this case, the function find that finds a char and the function find that 

376
00:17:56,000 --> 00:17:58,000
finds a string both have the same name, 

377
00:17:58,000 --> 00:18:01,000
and so that name can be used for multiple purposes 

378
00:18:01,000 --> 00:18:03,000
as long as there's a sequence of arguments that distinguishes them so 

379
00:18:03,000 --> 00:18:06,000
that when I make a call to find, it knows whether to use the first

380
00:18:06,000 --> 00:18:08,000
version or the second version 

381
00:18:08,000 --> 00:18:11,000
by virtue of whether the first argument is a character or the first 

382
00:18:11,000 --> 00:18:12,000
argument is a string. 

383
00:18:12,000 --> 00:18:14,000
That can be extended to other types. 

384
00:18:14,000 --> 00:18:16,000
This is typically used when you have 

385
00:18:16,000 --> 00:18:20,000
an operation that really has the same behavior but some slightly 

386
00:18:20,000 --> 00:18:23,000
different sequence of arguments is required to invoke it. It 

387
00:18:23,000 --> 00:18:26,000
is not something you want to use a lot to make a bunch of similar named functions 

388
00:18:26,000 --> 00:18:28,000
that don't have similar operations. 

389
00:18:28,000 --> 00:18:31,000
It allows for a convenience when there are 

390
00:18:31,000 --> 00:18:34,000
two or three variations of the same theme. They might all come under the same 

391
00:18:34,000 --> 00:18:38,000
name by virtue of overloading. 
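
A minimal sketch of the two ideas just described, overloading and a defaulted start position. The name locate is a hypothetical wrapper invented for illustration; only std::string::find and std::string::npos come from the standard library.

```cpp
#include <string>

// Two functions share the name locate; the type of the second
// argument (char or string) tells the compiler which one to call.
// The third parameter defaults to 0, so the search starts at the
// beginning unless the caller says otherwise.
std::string::size_type locate(const std::string& s, char ch,
                              std::string::size_type start = 0) {
    return s.find(ch, start);
}

std::string::size_type locate(const std::string& s, const std::string& pattern,
                              std::string::size_type start = 0) {
    return s.find(pattern, start);
}
```

Calling locate("hello", 'l') picks the char version and searches from index 0, while locate("hello", 'l', 3) begins the scan at index 3. A failed search gives back std::string::npos.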

392
00:18:38,000 --> 00:18:41,000


393
00:18:41,000 --> 00:18:43,000


394
00:18:43,000 --> 00:18:46,000
Substr is something that, given a receiver string and a position and a length, 

395
00:18:46,000 --> 00:18:48,000
will extract a new substring 

396
00:18:48,000 --> 00:18:52,000
out of the middle of the string that was received. If I take the hello 

397
00:18:52,000 --> 00:18:57,000
string and starting from position zero take two characters, I get the string he. 

398
00:18:57,000 --> 00:19:00,000
It copies them. It's distinct from the original, 

399
00:19:00,000 --> 00:19:03,000
and so all it did was get its initial sequence by copying characters 

400
00:19:03,000 --> 00:19:04,000
from there. 

401
00:19:04,000 --> 00:19:08,000
If I go in to change the hello string into jello, that he string stays he. 

402
00:19:08,000 --> 00:19:09,000
They're not 

403
00:19:09,000 --> 00:19:13,000
attached in any long-term way. 
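
The copy behavior of substr can be sketched as follows. The helper name firstTwo is made up for this example.

```cpp
#include <string>

// Returns the first two characters of s as a new, independent string.
// Later changes to s do not affect the value returned here.
std::string firstTwo(const std::string& s) {
    return s.substr(0, 2);
}
```

If the caller takes firstTwo of a string holding "hello" and then changes that string to "jello", the extracted string still reads "he".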

404
00:19:13,000 --> 00:19:15,000
Insert, replace and erase 

405
00:19:15,000 --> 00:19:19,000
are all of the family of something that I call modifiers or mutators that change 

406
00:19:19,000 --> 00:19:20,000
the receiver string. 

407
00:19:20,000 --> 00:19:23,000
You can send these messages to a string 

408
00:19:23,000 --> 00:19:27,000
to cause new text to get added into the string, text to be removed or text to be deleted and 

409
00:19:27,000 --> 00:19:29,000
replaced with something else. 

410
00:19:29,000 --> 00:19:33,000
Inserting: someone asks, well, how can I put new characters in the middle? Well, I put 

411
00:19:33,000 --> 00:19:36,000
the position where I'd like them to go. If I say position zero and I say put the 

412
00:19:36,000 --> 00:19:38,000
string 

413
00:19:38,000 --> 00:19:41,000
I in there, then it would bump everything down and put I in the front 

414
00:19:41,000 --> 00:19:44,000
and replace it. If it was "hello", it would be "I 

415
00:19:44,000 --> 00:19:48,000
said hello". I inserted the string "I said". 

416
00:19:48,000 --> 00:19:49,000
The replace 

417
00:19:49,000 --> 00:19:50,000
at a position 

418
00:19:50,000 --> 00:19:54,000
removes length characters starting at that position and then replaces it with that 

419
00:19:54,000 --> 00:19:58,000
character. It's a way to take a chunk out and put something else in instead. Erase 

420
00:19:58,000 --> 00:20:00,000
does a straight remove 

421
00:20:00,000 --> 00:20:03,000
at a position. Take 

422
00:20:03,000 --> 00:20:07,000
this number of characters and throw them away, deleting them from the string and 

423
00:20:07,000 --> 00:20:09,000
making it shorter. 
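
A short sketch of the three modifiers applied in sequence. The function name modifierDemo and the particular positions are chosen only for illustration.

```cpp
#include <string>

// Applies insert, replace and erase to one string, in that order.
std::string modifierDemo() {
    std::string s = "hello";
    s.insert(0, "I said ");  // s is now "I said hello"
    s.replace(7, 1, "j");    // swap the 1 character at index 7: "I said jello"
    s.erase(0, 7);           // drop the first 7 characters: "jello"
    return s;
}
```

Each call changes s itself rather than producing a new string.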

424
00:20:09,000 --> 00:20:13,000
All of these change the receiver string. When you say str.insert, 

425
00:20:13,000 --> 00:20:15,000
str.replace or str.erase, 

426
00:20:15,000 --> 00:20:18,000
after that call, str now actually has new contents based on what you've asked it 

427
00:20:18,000 --> 00:20:20,000
to do about changing 

428
00:20:20,000 --> 00:20:24,000
and mutating its contents. Here's something I 

429
00:20:24,000 --> 00:20:28,000
should tell you a little bit about C++ string relative to 

430
00:20:28,000 --> 00:20:30,000
Java string. C++ 

431
00:20:30,000 --> 00:20:34,000
is kind of an industrial strength language that's targeted 

432
00:20:34,000 --> 00:20:35,000
at professional programmers. It 

433
00:20:35,000 --> 00:20:39,000
does not make any guarantees to you about what happens if you misuse these 

434
00:20:39,000 --> 00:20:40,000
calls. 

435
00:20:40,000 --> 00:20:45,000
If you give it a position that isn't valid for this string or a length that isn't 

436
00:20:45,000 --> 00:20:46,000
valid for the string, 

437
00:20:46,000 --> 00:20:49,000
there is no 

438
00:20:49,000 --> 00:20:52,000
contract in the C++ libraries that says this is what will definitely 

439
00:20:52,000 --> 00:20:55,000
happen. It doesn't say oh, it's definitely going to throw an exception or throw some sort of 

440
00:20:55,000 --> 00:20:55,000
error. 

441
00:20:55,000 --> 00:20:57,000
It doesn't say it's just going to truncate it at the end. 

442
00:20:57,000 --> 00:21:01,000
It says that the library is free to do whatever is convenient for it up to and including 

443
00:21:01,000 --> 00:21:03,000
just crashing. 

444
00:21:03,000 --> 00:21:03,000


445
00:21:03,000 --> 00:21:08,000
It does mean that as the programmer using these calls, 

446
00:21:08,000 --> 00:21:10,000
it is a little bit more on you to be careful that you're using them 

447
00:21:10,000 --> 00:21:13,000
correctly and making the numbers in bounds for the string 

448
00:21:13,000 --> 00:21:15,000
in ways that will 

449
00:21:15,000 --> 00:21:17,000
produce correct results. 

450
00:21:17,000 --> 00:21:20,000
It might be that it will produce a nice error message, 

451
00:21:20,000 --> 00:21:23,000
but there are no guarantees. You wouldn't want to come to depend on that. You 

452
00:21:23,000 --> 00:21:25,000
want to just be careful about 

453
00:21:25,000 --> 00:21:26,000
knowing what 

454
00:21:26,000 --> 00:21:28,000
the right numbers are. 
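
One way to stay careful is to check the numbers yourself before making the call. This is only a sketch; eraseChecked is a hypothetical helper, not something the string library provides.

```cpp
#include <string>

// Erases count characters starting at pos, but only when that whole
// range lies inside s. Returns false and leaves s untouched otherwise.
bool eraseChecked(std::string& s, std::string::size_type pos,
                  std::string::size_type count) {
    if (pos > s.size() || count > s.size() - pos) {
        return false;
    }
    s.erase(pos, count);
    return true;
}
```

The same check could be placed in front of insert, replace or substr calls when the position and length come from somewhere you do not fully trust.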

455
00:21:28,000 --> 00:21:31,000
Unlike Java, which is very attentive to those things 

456
00:21:31,000 --> 00:21:35,000
and on your case when you're a little bit out of bounds, 

457
00:21:35,000 --> 00:21:38,000
in the name of efficiency, C++ tends to just 

458
00:21:38,000 --> 00:21:45,000
breeze through that stuff. I'm going 

459
00:21:45,000 --> 00:21:46,000
to show you 

460
00:21:46,000 --> 00:21:49,000
a little bit of coding 

461
00:21:49,000 --> 00:21:51,000
together just for fun. 

462
00:21:51,000 --> 00:21:53,000


463
00:21:53,000 --> 00:21:54,000
I like 

464
00:21:54,000 --> 00:21:57,000


465
00:21:57,000 --> 00:22:00,000
to sit and show you 

466
00:22:00,000 --> 00:22:01,000
some things. 

467
00:22:01,000 --> 00:22:04,000
If I were to do something like want to count 

468
00:22:04,000 --> 00:22:06,000
the occurrences 

469
00:22:06,000 --> 00:22:10,000
of a particular character within a string, 

470
00:22:10,000 --> 00:22:14,000
I could write a loop that looks like this. I could say 

471
00:22:14,000 --> 00:22:15,000
int count 

472
00:22:15,000 --> 00:22:17,000
= zero 

473
00:22:17,000 --> 00:22:20,000
for, and this is a very ubiquitous loop for 

474
00:22:20,000 --> 00:22:22,000
operating over a 

475
00:22:22,000 --> 00:22:25,000
collection, in this case, the collection being the characters in there from zero 

476
00:22:25,000 --> 00:22:26,000
to this 

477
00:22:26,000 --> 00:22:29,000
length. If S sub I equals the character I'm looking for, we would 

478
00:22:29,000 --> 00:22:31,000
increment the count 

479
00:22:31,000 --> 00:22:33,000
and then return it. 
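
The counting loop described here might look like the sketch below. The name countOccurrences and the parameter order are assumptions, since the code on screen is not part of the transcript.

```cpp
#include <string>

// Counts how many characters of s are equal to ch.
int countOccurrences(char ch, const std::string& s) {
    int count = 0;
    for (std::string::size_type i = 0; i < s.length(); i++) {
        if (s[i] == ch) {
            count++;
        }
    }
    return count;
}
```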

480
00:22:33,000 --> 00:22:34,000
I 

481
00:22:34,000 --> 00:22:38,000
put this down here in my code 

482
00:22:38,000 --> 00:22:41,000
and do a little testing 

483
00:22:41,000 --> 00:22:45,000
of looking for the character C in 

484
00:22:45,000 --> 00:22:48,000
Chihuahua 

485
00:22:48,000 --> 00:22:50,000


486
00:22:50,000 --> 00:22:50,000


487
00:22:50,000 --> 00:22:54,000
cheese crackers. 

488
00:22:54,000 --> 00:22:57,000
Let's take a 

489
00:22:57,000 --> 00:22:58,000
look 

490
00:22:58,000 --> 00:23:01,000
at that and see if we 

491
00:23:01,000 --> 00:23:04,000
manage to count the number of Cs 

492
00:23:04,000 --> 00:23:05,000
in my list. There 

493
00:23:05,000 --> 00:23:06,000
are four, apparently. 

494
00:23:06,000 --> 00:23:10,000
Let's go check and see if that comes up. It looks good. We did a 

495
00:23:10,000 --> 00:23:12,000
little bit of counting. 

496
00:23:12,000 --> 00:23:15,000
We're feeling okay about that part. Let 

497
00:23:15,000 --> 00:23:19,000
me do something where for example I want to remove all of the occurrences from 

498
00:23:19,000 --> 00:23:22,000
that. I'm going to write this two different ways to 

499
00:23:22,000 --> 00:23:24,000
highlight a little bit about how things work. 

500
00:23:24,000 --> 00:23:26,000
I'm going to 

501
00:23:26,000 --> 00:23:28,000
design a remove occurrences that given a character and a string 

502
00:23:28,000 --> 00:23:30,000
will return to you a new string 

503
00:23:30,000 --> 00:23:33,000
where all the occurrences of CH have been 

504
00:23:33,000 --> 00:23:36,000
removed. 

505
00:23:36,000 --> 00:23:38,000
Easy enough. 

506
00:23:38,000 --> 00:23:38,000


507
00:23:38,000 --> 00:23:43,000
It's not going to modify the original string. It's going to return a new one. 

508
00:23:43,000 --> 00:23:46,000
Here's my strategy. The way to build these things up is 

509
00:23:46,000 --> 00:23:49,000
I could go through the manipulations of 

510
00:23:49,000 --> 00:23:52,000
trying to take the characters out in place and figuring out where I'm at, but often, 

511
00:23:52,000 --> 00:23:56,000
the easier way to do this is to build up the result: decide when to 

512
00:23:56,000 --> 00:23:59,000
append or concatenate a character from the original string 

513
00:23:59,000 --> 00:24:01,000
and when to ignore it and go past it. 

514
00:24:01,000 --> 00:24:05,000
I can do something like this where it's like if 

515
00:24:05,000 --> 00:24:08,000
the character I've just seen is 

516
00:24:08,000 --> 00:24:10,000
not the one that I'm trying to avoid, 

517
00:24:10,000 --> 00:24:11,000
then I can just add it 

518
00:24:11,000 --> 00:24:14,000
into the result. When I'm 

519
00:24:14,000 --> 00:24:15,000
done, I have the result. 
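
A sketch of the build-up-the-result strategy. The name removeOccurrences follows the spoken description; the exact signature is an assumption.

```cpp
#include <string>

// Returns a new string holding every character of s except ch.
// The original string is left unchanged.
std::string removeOccurrences(char ch, const std::string& s) {
    std::string result;  // a default-constructed string starts out empty
    for (std::string::size_type i = 0; i < s.length(); i++) {
        if (s[i] != ch) {
            result += s[i];  // keep this character
        }
    }
    return result;
}
```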

520
00:24:15,000 --> 00:24:19,000
If I do this and I change this call down to 

521
00:24:19,000 --> 00:24:23,000
remove occurrences... 

522
00:24:23,000 --> 00:24:25,000
I'm counting on the fact that 

523
00:24:25,000 --> 00:24:26,000


524
00:24:26,000 --> 00:24:28,000
result is initialized to the empty 

525
00:24:28,000 --> 00:24:32,000
string. I didn't actually say anything there. I could, for example, do this 

526
00:24:32,000 --> 00:24:35,000
and that doesn't change anything about it and you might feel a little better 

527
00:24:35,000 --> 00:24:38,000
about seeing that explicit initialization, but C++ programmers are very used 

528
00:24:38,000 --> 00:24:39,000
to 

529
00:24:39,000 --> 00:24:40,000


530
00:24:40,000 --> 00:24:42,000
seeing uninitialized strings and knowing that that means they got the default 

531
00:24:42,000 --> 00:24:44,000
initialization to 

532
00:24:44,000 --> 00:24:48,000
the empty string. When I didn't find the character I was looking for, I [inaudible] the result. I'm going 

533
00:24:48,000 --> 00:24:51,000
to switch this up just a little bit. I'm 

534
00:24:51,000 --> 00:24:53,000
going to change remove occurrences 

535
00:24:53,000 --> 00:24:54,000
to instead of 

536
00:24:54,000 --> 00:24:59,000
making a new string to actually modify the string that I have. 

537
00:24:59,000 --> 00:25:01,000
I'm going to 

538
00:25:01,000 --> 00:25:04,000
change my code down here to match what's going to happen. I'm going to set it to 

539
00:25:04,000 --> 00:25:08,000
do this, 

540
00:25:08,000 --> 00:25:12,000
and then I'm going to call remove occurrences C of S, and 

541
00:25:12,000 --> 00:25:13,000
then I'm going 

542
00:25:13,000 --> 00:25:17,000
to print out S afterwards. In this case, I don't expect there to be a second 

543
00:25:17,000 --> 00:25:22,000
string created. I expect us to go in and modify that string in place, 

544
00:25:22,000 --> 00:25:26,000
truncating some of those characters and taking them out to make this work. 

545
00:25:26,000 --> 00:25:30,000
I could kind of do this thing where I'm walking down the string character by character and then 

546
00:25:30,000 --> 00:25:33,000
deciding whether to collapse over it. I'm actually going to change my strategy 

547
00:25:33,000 --> 00:25:36,000
entirely just so I have a little practice using some of the other routines, and I'm 

548
00:25:36,000 --> 00:25:40,000
going to end up using the string find. We'll 

549
00:25:40,000 --> 00:25:42,000
start with 

550
00:25:42,000 --> 00:25:44,000
this. I can actually do this. 

551
00:25:44,000 --> 00:25:48,000
S.find of CH, and if I don't give that second argument, it's going to start from the 

552
00:25:48,000 --> 00:25:49,000
very beginning 

553
00:25:49,000 --> 00:25:52,000
and look all the way through and see if it finds it. 

554
00:25:52,000 --> 00:25:55,000
I'm going to 

555
00:25:55,000 --> 00:25:56,000
put a 

556
00:25:56,000 --> 00:25:59,000
hold that result in a variable 

557
00:25:59,000 --> 00:26:02,000
and I'm going to say while found equals 

558
00:26:02,000 --> 00:26:04,000
the result of calling find 

559
00:26:04,000 --> 00:26:08,000
and then I'm going to stick it in. This is a very C++ way of coding. It's 

560
00:26:08,000 --> 00:26:13,000
tightly combining this up. In this case, I have an assignment and a 

561
00:26:13,000 --> 00:26:15,000
comparison all in the test of the while 

562
00:26:15,000 --> 00:26:18,000
loop. I'm making the call to ask the string to find a particular character, 

563
00:26:18,000 --> 00:26:22,000
storing that result in an integer here so I can use it and then comparing that 

564
00:26:22,000 --> 00:26:25,000
result to string::npos. So string::npos is a little bit of a funny C++ syntax 

565
00:26:25,000 --> 00:26:27,000
there, but 

566
00:26:27,000 --> 00:26:32,000
the way to read that is within the string class; string:: says 

567
00:26:32,000 --> 00:26:35,000
within the string class, scope within the string. There's a particular constant 

568
00:26:35,000 --> 00:26:38,000
called npos, which is used as the 

569
00:26:38,000 --> 00:26:41,000
return value in cases like find when it's looking for something. 

570
00:26:41,000 --> 00:26:44,000
npos being part of the class is a way of avoiding it 

571
00:26:44,000 --> 00:26:48,000
conflicting and interfering with any other usages where you might have variables 

572
00:26:48,000 --> 00:26:51,000
named npos or similar functionalities. It's tied to the string class 

573
00:26:51,000 --> 00:26:53,000
through the scoping mechanism. 

574
00:26:53,000 --> 00:26:55,000
I check and see if it's not string::npos, 

575
00:26:55,000 --> 00:27:00,000
and then if it's not, then I go into the loop here 

576
00:27:00,000 --> 00:27:02,000
and I can do an erase 

577
00:27:02,000 --> 00:27:04,000
of one character at position. 

578
00:27:04,000 --> 00:27:06,000
Erase takes the position and the count 

579
00:27:06,000 --> 00:27:07,000
and removes 

580
00:27:07,000 --> 00:27:10,000
the number of characters I specified from that position. 

581
00:27:10,000 --> 00:27:12,000
Then, it will come back around. 

582
00:27:12,000 --> 00:27:13,000


583
00:27:13,000 --> 00:27:17,000
I have passed string by reference coming into here, 

584
00:27:17,000 --> 00:27:20,000
and that's a very important part of what's happening here because 

585
00:27:20,000 --> 00:27:24,000
these calls to erase are modifying string, and if I had not passed string by 

586
00:27:24,000 --> 00:27:26,000
reference, they'd be operating on my copy. 

587
00:27:26,000 --> 00:27:29,000
I'd go through all the trouble of erasing all the Cs in my copy, 

588
00:27:29,000 --> 00:27:31,000
but when I got back out to the main call, 

589
00:27:31,000 --> 00:27:35,000
none of those effects would have been permanent. Passing by reference really means that 

590
00:27:35,000 --> 00:27:38,000
what remove occurrences got was access to the original S. I should make 

591
00:27:38,000 --> 00:27:39,000
these names 

592
00:27:39,000 --> 00:27:43,000
... I'll call this my string out here so 

593
00:27:43,000 --> 00:27:45,000
that we don't get any confusion about 

594
00:27:45,000 --> 00:27:46,000
the two names. 

595
00:27:46,000 --> 00:27:50,000
The my string variable in main is really being accessed by remove 

596
00:27:50,000 --> 00:27:53,000
occurrences without a copy. It's reaching back out into main and 

597
00:27:53,000 --> 00:27:57,000
making changes to the my string itself. 
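
The in-place version can be sketched like this. The function name removeOccurrencesInPlace is invented here to keep it apart from the earlier version, and the variable is declared as std::string::size_type rather than int so that the comparison against npos is exact.

```cpp
#include <string>

// Removes every ch from s itself. The & makes s a reference to the
// caller's string, so the erase calls change the caller's variable.
void removeOccurrencesInPlace(char ch, std::string& s) {
    std::string::size_type found;
    // Assignment and comparison together in the loop test: store the
    // result of find, then stop once it reports "not found".
    while ((found = s.find(ch)) != std::string::npos) {
        s.erase(found, 1);  // remove one character at that position
    }
}
```

Without the & in the parameter list the function would work on a copy, and the caller's string would come back unchanged.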

598
00:27:57,000 --> 00:27:58,000


599
00:27:58,000 --> 00:28:00,000


600
00:28:00,000 --> 00:28:06,000


601
00:28:06,000 --> 00:28:09,000
What does it not like about that? Oh, 

602
00:28:09,000 --> 00:28:19,000
pos. I called it found. 

603
00:28:19,000 --> 00:28:22,000
I've achieved the same thing. 

604
00:28:22,000 --> 00:28:24,000
That's one of the 

605
00:28:24,000 --> 00:28:27,000
things about the string library is it's so big and has so many different ways of 

606
00:28:27,000 --> 00:28:28,000
doing things 

607
00:28:28,000 --> 00:28:32,000
that often two people or ten people running the 

608
00:28:32,000 --> 00:28:36,000
same task won't even come up with the same solutions. I could have used a replace where I 

609
00:28:36,000 --> 00:28:37,000
replaced 

610
00:28:37,000 --> 00:28:39,000
the character it found with the empty string. 

611
00:28:39,000 --> 00:28:43,000
I can build it up through concatenation. I can take it down with erase and 

612
00:28:43,000 --> 00:28:45,000
replace. I could insert the other way around. There are a 

613
00:28:45,000 --> 00:28:47,000
bunch of things I can do 

614
00:28:47,000 --> 00:28:50,000
that in the end will achieve the same effect but show that there are a 

615
00:28:50,000 --> 00:28:52,000
lot of ways to accomplish the same things. The library is pretty 

616
00:28:52,000 --> 00:28:55,000
rich and has a wide variety of tools in it. I'm going to make 

617
00:28:55,000 --> 00:28:58,000
one change to this 

618
00:28:58,000 --> 00:29:01,000
to show you how I can make it slightly more efficient. This is silly 

619
00:29:01,000 --> 00:29:04,000
because strings are typically very short, so it doesn't really matter, 

620
00:29:04,000 --> 00:29:06,000
but 

621
00:29:06,000 --> 00:29:10,000
I'm going to do this and I'm going to 

622
00:29:10,000 --> 00:29:13,000
use found as my index on subsequent calls. I can say starting 

623
00:29:13,000 --> 00:29:15,000
at found from zero, do 

624
00:29:15,000 --> 00:29:17,000
my search from zero and then found, 

625
00:29:17,000 --> 00:29:20,000
and then any subsequent calls will pick up where I left off. 

626
00:29:20,000 --> 00:29:23,000
The next time around through the loop, found is at the place where I found a 

627
00:29:23,000 --> 00:29:25,000
previous occurrence of that character 

628
00:29:25,000 --> 00:29:28,000
and it says starting from that position now, look forward and see if you see any more 

629
00:29:28,000 --> 00:29:32,000
from here to the end. For a very long string like this, it ends up doing a lot 

630
00:29:32,000 --> 00:29:35,000
less work. It doesn't start at the beginning each time. 
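
A sketch of the same loop with the start position passed along. The name removeOccurrencesFrom is hypothetical.

```cpp
#include <string>

// Same effect as searching from the front every time, but each find
// resumes at the index of the previous match instead of index 0.
void removeOccurrencesFrom(char ch, std::string& s) {
    std::string::size_type found = 0;
    while ((found = s.find(ch, found)) != std::string::npos) {
        // After the erase, index found holds the next character,
        // so searching from found again skips nothing.
        s.erase(found, 1);
    }
}
```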

631
00:29:35,000 --> 00:29:37,000
It just picks up where the 

632
00:29:37,000 --> 00:29:40,000
previous occurrence was found and goes from there to the end. It's a 

633
00:29:40,000 --> 00:29:41,000
small change, but no 

634
00:29:41,000 --> 00:29:46,000
big 

635
00:29:46,000 --> 00:29:49,000
deal. Basically, what you're saying is 

636
00:29:49,000 --> 00:29:51,000
find needs to return something that says I didn't find it. 

637
00:29:51,000 --> 00:29:54,000
It could return to you zero, one, two, three, all these indices. It actually needs to 

638
00:29:54,000 --> 00:29:56,000
return to you, and it uses a 

639
00:29:56,000 --> 00:29:58,000
special sentinel value that says I didn't find it. 

640
00:29:58,000 --> 00:30:02,000
You might think that might be negative one or some other thing. A good programming 

641
00:30:02,000 --> 00:30:03,000
form would be to have a constant for it 

642
00:30:03,000 --> 00:30:07,000
so that you don't have any magic numbers embedded in your code. That 

643
00:30:07,000 --> 00:30:07,000
constant 

644
00:30:07,000 --> 00:30:10,000
is defined as part of the string class. Just the syntax for 

645
00:30:10,000 --> 00:30:14,000
accessing that constant of the string class is using the string class name :: 

646
00:30:14,000 --> 00:30:15,000
npos. 

647
00:30:15,000 --> 00:30:17,000
It's basically the syntax for 

648
00:30:17,000 --> 00:30:21,000
I have a constant that was defined within a class. How do I get to it? I use its 

649
00:30:21,000 --> 00:30:24,000
class name, two colons and then the name of the constant. It's just C++ 

650
00:30:24,000 --> 00:30:26,000
for something that in Java looks a little bit more like 

651
00:30:26,000 --> 00:30:28,000
class name dot. Question? 

652
00:30:28,000 --> 00:30:30,000
Student: Sometimes I 

653
00:30:30,000 --> 00:30:36,000
see a function declaration before [inaudible] and then the definition afterwards. Is that just a matter of preference? Instructor (Julie 

654
00:30:36,000 --> 00:30:40,000
Zelenski): It totally is. Probably I'm being a little bit lazy in class, which is 

655
00:30:40,000 --> 00:30:43,000
if I put the function definition up here, 

656
00:30:43,000 --> 00:30:47,000
then I can call it down here because it's already been seen. If I put it down here, 

657
00:30:47,000 --> 00:30:49,000
then I need a prototype up there. The prototype means I have to be a little bit 

658
00:30:49,000 --> 00:30:53,000
more careful. When I change the name, I have to change it in both places. If I change the 

659
00:30:53,000 --> 00:30:54,000
argument, I have to change it in both places. 

660
00:30:54,000 --> 00:30:57,000
The problem, of course, is that when you read the code, it probably reads a 

661
00:30:57,000 --> 00:31:01,000
little better to say here's the main which makes calls to A, 

662
00:31:01,000 --> 00:31:02,000
B 

663
00:31:02,000 --> 00:31:05,000
and C. Some of it has to do with it's a little bit harder to maintain in that form, 

664
00:31:05,000 --> 00:31:07,000
but I think it's easier when you're done to read it. 

665
00:31:07,000 --> 00:31:10,000
You're totally free to do it either way. You should probably pick a strategy and go 

666
00:31:10,000 --> 00:31:11,000
with it. 

667
00:31:11,000 --> 00:31:14,000
Maintaining the prototypes is not really that much work once you get used 

668
00:31:14,000 --> 00:31:17,000
to it, and I think in the end, it probably is a little bit cleaner. When I'm being lazy 

669
00:31:17,000 --> 00:31:21,000
in class, I'm much more likely to just throw them up there to save myself some time. 
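
The two layouts discussed in this answer can be sketched as follows. Both function names are made up for illustration.

```cpp
#include <string>

// Prototype: announces the function so code above its definition can call it.
int countLetterA(const std::string& s);

// This caller sits above the definition, which works only because
// of the prototype.
int twiceCount(const std::string& s) {
    return 2 * countLetterA(s);
}

// Definition: its name and arguments must stay in step with the prototype.
int countLetterA(const std::string& s) {
    int count = 0;
    for (std::string::size_type i = 0; i < s.length(); i++) {
        if (s[i] == 'a') count++;
    }
    return count;
}
```

Moving the definition of countLetterA above twiceCount would make the prototype unnecessary, which is the shortcut described for in-class coding.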

670
00:31:21,000 --> 00:31:23,000
It's good to note that there are a lot of 

671
00:31:23,000 --> 00:31:26,000
things that will slip by me if I'm not being careful. Let 

672
00:31:26,000 --> 00:31:28,000
me 

673
00:31:28,000 --> 00:31:32,000


674
00:31:32,000 --> 00:31:34,000
go back and 

675
00:31:34,000 --> 00:31:38,000
pick up a few last details about string that I don't want to overlook 

676
00:31:38,000 --> 00:31:42,000
before I move away from this. 

677
00:31:42,000 --> 00:31:45,000
There are library functions that are need to know. I have them 

678
00:31:45,000 --> 00:31:49,000
sketched out in a couple places, and you can look at them and see 

679
00:31:49,000 --> 00:31:52,000
what they do. Knowing they're there and then learning about them 

680
00:31:52,000 --> 00:31:52,000


681
00:31:52,000 --> 00:31:55,000
as you encounter them is a fine strategy. 

682
00:31:55,000 --> 00:31:57,000
There are a couple additions 

683
00:31:57,000 --> 00:32:01,000
in our [inaudible] which is a 106 specific header file which are just some 

684
00:32:01,000 --> 00:32:03,000
things that for one reason or another 

685
00:32:03,000 --> 00:32:04,000
are a little bit harder or 

686
00:32:04,000 --> 00:32:07,000
more annoying to do using the standard tools than we think is 

687
00:32:07,000 --> 00:32:09,000
worth putting on your plate for now. 

688
00:32:09,000 --> 00:32:11,000
We have two 

689
00:32:11,000 --> 00:32:14,000
convert to upper and lowercase that given a string just convert it to its upper and 

690
00:32:14,000 --> 00:32:17,000
lowercase equivalent. There are some things that do conversion between 

691
00:32:17,000 --> 00:32:20,000
string and integer and 

692
00:32:20,000 --> 00:32:22,000
string and real 

693
00:32:22,000 --> 00:32:25,000
when you have it in one form and you need it in the other. 

694
00:32:25,000 --> 00:32:28,000
Here's something that just does that for you as part of the string library. 

695
00:32:28,000 --> 00:32:30,000
It's just some simple things that 

696
00:32:30,000 --> 00:32:32,000
you might find yourself needing and you just want 

697
00:32:32,000 --> 00:32:36,000
to know they're there. 

698
00:32:36,000 --> 00:32:40,000
Here is something that is a little bit of a bummer. 

699
00:32:40,000 --> 00:32:44,000
Part of the legacy of C++ being built on C 

700
00:32:44,000 --> 00:32:46,000
means that every now and then, there's a little bit of a history 

701
00:32:46,000 --> 00:32:48,000
in our deep, dark past 

702
00:32:48,000 --> 00:32:52,000
that pops its head up in ways that are a little bit surprising. 

703
00:32:52,000 --> 00:32:55,000
For string, it turns out there is a little bit of a weirdness here that I 

704
00:32:55,000 --> 00:32:59,000
want to point out before you run into it the hard way. 

705
00:32:59,000 --> 00:33:03,000
There is a notion of the old style C string. 

706
00:33:03,000 --> 00:33:06,000
The original C language didn't have a string class. It actually doesn't have any object 

707
00:33:06,000 --> 00:33:07,000
oriented features at all. 

708
00:33:07,000 --> 00:33:10,000
It did have, though, some other more primitive handling of sequences of 

709
00:33:10,000 --> 00:33:14,000
characters. This is a very common [inaudible] to have something. I've 

710
00:33:14,000 --> 00:33:17,000
put in parenthesis what it actually is. It's [inaudible] an alternative. 

711
00:33:17,000 --> 00:33:19,000
Don't worry about what that phrase means. 

712
00:33:19,000 --> 00:33:23,000
That's just for those of you who have seen it a little bit before. That 

713
00:33:23,000 --> 00:33:26,000
would be 

714
00:33:26,000 --> 00:33:29,000
fine. We have this better string object that has all these 

715
00:33:29,000 --> 00:33:32,000
fancy features, so you'd think we could just use that and ignore the fact that the 

716
00:33:32,000 --> 00:33:34,000
other one is there. 

717
00:33:34,000 --> 00:33:37,000
It almost... 99 percent of the time, that's exactly how it's going to 

718
00:33:37,000 --> 00:33:41,000
work. It does turn out that there are a few situations where this old style 

719
00:33:41,000 --> 00:33:44,000
string pops its head up and gets a little bit in our way. 

720
00:33:44,000 --> 00:33:47,000
One way that may be a little bit of a surprise is that the string literals are 

721
00:33:47,000 --> 00:33:48,000
actually C strings. 

722
00:33:48,000 --> 00:33:50,000
When you see an open quote, 

723
00:33:50,000 --> 00:33:52,000
some characters and then a closed quote, 

724
00:33:52,000 --> 00:33:56,000
the compiler interprets that as a C style string. 

725
00:33:56,000 --> 00:33:57,000


726
00:33:57,000 --> 00:34:01,000
It also has a mechanism by which if you tried to use it in a context 

727
00:34:01,000 --> 00:34:04,000
where you needed a C++ string, it will automatically convert it for you. It 

728
00:34:04,000 --> 00:34:08,000
will take the old style string and make a C++ string out of it. 

729
00:34:08,000 --> 00:34:10,000
That means that basically I can use them wherever I want and it 

730
00:34:10,000 --> 00:34:12,000
will mostly work out. 

731
00:34:12,000 --> 00:34:15,000
There is a way you can deliberately force it, 

732
00:34:15,000 --> 00:34:18,000
if you use what looks like the typecast here. This is actually calling the 

733
00:34:18,000 --> 00:34:22,000
string constructor, and you pass a string literal or string constant. It will turn 

734
00:34:22,000 --> 00:34:25,000
it into a C++ string manually there. 

735
00:34:25,000 --> 00:34:28,000
It's going to turn out that you might need to know this. 

736
00:34:28,000 --> 00:34:30,000
There's also 

737
00:34:30,000 --> 00:34:33,000
the other problem of what if I have it in one form and I want it in the old form? 

738
00:34:33,000 --> 00:34:36,000
I have the old form. I want a new form. I have the new form. I want an 

739
00:34:36,000 --> 00:34:37,000
old form. 

740
00:34:37,000 --> 00:34:41,000
There is a member function on the string class that will return to you an old style 

741
00:34:41,000 --> 00:34:45,000
string from a new style C++ string, and it's called the c_str 

742
00:34:45,000 --> 00:34:46,000
[inaudible]. 

743
00:34:46,000 --> 00:34:47,000
They let you 

744
00:34:47,000 --> 00:34:49,000
convert. Why do you 

745
00:34:49,000 --> 00:34:52,000
care? It turns out there's one thing you'll definitely run into, which is when 

746
00:34:52,000 --> 00:34:54,000


747
00:34:54,000 --> 00:34:55,000


748
00:34:55,000 --> 00:34:56,000


749
00:34:56,000 --> 00:35:00,000
you're opening a file stream, you want to say this is the file on disk that 

750
00:35:00,000 --> 00:35:04,000
you want to identify. It turns out that that library requires the use of a C string as the 

751
00:35:04,000 --> 00:35:05,000


752
00:35:05,000 --> 00:35:08,000
name. There was a little bit of an 

753
00:35:08,000 --> 00:35:09,000


754
00:35:09,000 --> 00:35:12,000
issue trying to get all the libraries to come together at the right time, and it turns out 

755
00:35:12,000 --> 00:35:15,000
the stream library got finalized before the string library was done, and so it depended 

756
00:35:15,000 --> 00:35:18,000
on what was available at the time, which was the old style string. 

757
00:35:18,000 --> 00:35:21,000
Even years later when they're both happily debugged and working, 

758
00:35:21,000 --> 00:35:22,000


759
00:35:22,000 --> 00:35:25,000
it is still the case that when you use the stream library, you 

760
00:35:25,000 --> 00:35:26,000
have to 

761
00:35:26,000 --> 00:35:30,000
describe the file you want by using the old style string. If you had a C++ 

762
00:35:30,000 --> 00:35:32,000
string variable that held the name you wanted, 

763
00:35:32,000 --> 00:35:37,000
you'll actually have to convert it. 
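
A sketch of the conversion when opening a file stream. The helper name canOpen is an assumption made for this example. Compilers that implement C++11 or later also accept a std::string directly in open, but the c_str() form shown here works in both cases.

```cpp
#include <fstream>
#include <string>

// Tries to open the named file for reading and reports whether it worked.
bool canOpen(const std::string& filename) {
    std::ifstream in;
    in.open(filename.c_str());  // c_str() hands the stream an old style C string
    return in.is_open();
}
```

Going the other way, writing std::string("some literal") builds a C++ string from a C string literal.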

764
00:35:37,000 --> 00:35:39,000
Converting in the other direction comes up 

765
00:35:39,000 --> 00:35:42,000
in one case, and I'm going to show you this one. 

766
00:35:42,000 --> 00:35:45,000
It has to do with concatenation. 

767
00:35:45,000 --> 00:35:48,000
The plus operator that does concatenation 

768
00:35:48,000 --> 00:35:51,000
really wants to work on C++ style strings, 

769
00:35:51,000 --> 00:35:54,000
so if one of your operands is a C++ string, 

770
00:35:54,000 --> 00:35:55,000
it's all fine, 

771
00:35:55,000 --> 00:36:00,000
as long as the left or the right side is a C++ string. The 

772
00:36:00,000 --> 00:36:03,000
other side can be a string literal, a constant, another string, a 

773
00:36:03,000 --> 00:36:06,000
character variable. All those things work fine. As long as at least one of the 

774
00:36:06,000 --> 00:36:10,000
operands really is a true C++ string already, you're good. 

775
00:36:10,000 --> 00:36:13,000
That's almost always the case. 

776
00:36:13,000 --> 00:36:14,000
But 

777
00:36:14,000 --> 00:36:16,000
in the case where you somehow have two things 

778
00:36:16,000 --> 00:36:18,000
on either side of the plus, 

779
00:36:18,000 --> 00:36:21,000
neither of which is already a C++ string 

780
00:36:21,000 --> 00:36:25,000
(typically, that means you have a C string on one or both sides, a 

781
00:36:25,000 --> 00:36:26,000
character on one or both sides), 

782
00:36:26,000 --> 00:36:29,000
you are not going to get concatenation. 

783
00:36:29,000 --> 00:36:34,000
If you try to add two C style strings, it actually won't compile. 

784
00:36:34,000 --> 00:36:35,000


785
00:36:35,000 --> 00:36:39,000
The sad thing about these two things, about taking a string literal and adding 

786
00:36:39,000 --> 00:36:40,000
either a 

787
00:36:40,000 --> 00:36:42,000
character constant or a character variable 

788
00:36:42,000 --> 00:36:46,000
is that it does compile and it just does not do what you want at all. 

789
00:36:46,000 --> 00:36:48,000
It does so in a 

790
00:36:48,000 --> 00:36:50,000
silent but deadly way. 

791
00:36:50,000 --> 00:36:51,000
I'm not going to tell you 

792
00:36:51,000 --> 00:36:56,000
what it does, but if you are curious, you can come and talk to me and I'll lay it out for you. 

793
00:36:56,000 --> 00:37:00,000
What I want you to come away with is this: when you're using 

794
00:37:00,000 --> 00:37:01,000
concatenation, 

795
00:37:01,000 --> 00:37:04,000
be sure that one of the two operands is a 

796
00:37:04,000 --> 00:37:06,000
C++ string. 

797
00:37:06,000 --> 00:37:09,000
If you have to, force one. If you have 

798
00:37:09,000 --> 00:37:12,000
a string literal and you want it to be a C++ string, then make it one to 

799
00:37:12,000 --> 00:37:15,000
avoid running into this. The mistake that you get from this is 

800
00:37:15,000 --> 00:37:17,000
actually quite mystical and very confusing. 
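A small sketch of forcing one operand to be a C++ string, assuming the standard `<string>` header. The function names `helloPlus` and `greet` are hypothetical, chosen only for illustration.

```cpp
#include <string>

// Hypothetical helper: builds "hello" followed by one character.
std::string helloPlus(char ch) {
    // Neither "hello" nor ch is a C++ string, so force one with std::string(...).
    // Written as "hello" + ch, this would still compile, but it would do
    // pointer arithmetic on the literal's address instead of concatenating.
    return std::string("hello") + ch;
}

// Fine as written: the left operand is already a C++ string, so each +
// is string concatenation.
std::string greet(const std::string& name) {
    return name + ", hello" + '!';
}
```

Here `helloPlus('!')` yields `"hello!"`, and `greet` needs no explicit conversion because `name` is already a C++ string.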

801
00:37:17,000 --> 00:37:19,000


802
00:37:19,000 --> 00:37:22,000
Probably 95 percent of you would never run into this, and so mostly, I've 

803
00:37:22,000 --> 00:37:25,000
just confused you for reasons that seem unclear, but for the five 

804
00:37:25,000 --> 00:37:27,000
percent that are going to run into this, 

805
00:37:27,000 --> 00:37:29,000
I'm really trying to do you a favor 

806
00:37:29,000 --> 00:37:32,000
by giving you a heads up 

807
00:37:32,000 --> 00:37:35,000
before it causes you a lot of grief later. 

808
00:37:35,000 --> 00:37:38,000
Just a little bit of a legacy. C and C++ go back a 

809
00:37:38,000 --> 00:37:42,000
long way, and as a result, we sometimes have little quirks we have to deal with 

810
00:37:42,000 --> 00:37:44,000
even in the modern world. Student: Is that the only way 

811
00:37:44,000 --> 00:37:48,000
you can 

812
00:37:48,000 --> 00:37:51,000
convert a C string into a C++ string? Instructor (Julie Zelenski): Not exactly. A lot of times, it just happens automatically. In almost all situations, 

813
00:37:51,000 --> 00:37:54,000
if you had a routine that expected a string argument and you passed a 

814
00:37:54,000 --> 00:37:57,000
string literal, it will automatically convert. Mostly, 

815
00:37:57,000 --> 00:38:01,000
you won't need to do this is the truth. It happens all the time behind 

816
00:38:01,000 --> 00:38:05,000
the scenes without any effort on your part. This is the official way to say I've 

817
00:38:05,000 --> 00:38:06,000
got a string 

818
00:38:06,000 --> 00:38:08,000
and I really want to force it and I'm not waiting for the compiler to do it 

819
00:38:08,000 --> 00:38:10,000
on my behalf. 
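The difference between the automatic conversion and the forced one can be sketched as follows, assuming the standard `<string>` header. The names `countChars`, `demoAutomatic`, and `demoExplicit` are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical routine that expects a C++ string argument.
std::size_t countChars(const std::string& s) {
    return s.length();
}

// The literal is converted to a C++ string automatically at the call.
std::size_t demoAutomatic() {
    return countChars("hello");
}

// The same call with the conversion spelled out explicitly.
std::size_t demoExplicit() {
    return countChars(std::string("hello"));
}
```

Both calls behave identically; the explicit form only matters in expressions such as concatenation, where the compiler has no reason to convert on its own.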

820
00:38:10,000 --> 00:38:13,000
In particular, for example, in a situation like this, 

821
00:38:13,000 --> 00:38:16,000
it doesn't realize what you really wanted to do was convert this and then do concatenation. 

822
00:38:16,000 --> 00:38:19,000
It does something kind of goofy based on what the old meaning of taking 

823
00:38:19,000 --> 00:38:29,000
a C string and adding a character to it was, which was not concatenation. 

824
00:38:29,000 --> 00:38:34,000
That's a little moment of silence for old language C that 

825
00:38:34,000 --> 00:38:40,000
comes back to haunt us a little bit like a ghost in the attic. Student: What do 

826
00:38:40,000 --> 00:38:44,000
you mean by a string literal? Instructor (Julie Zelenski): A string literal 

827
00:38:44,000 --> 00:38:48,000
just means a string constant. It's something in quotes. 

828
00:38:48,000 --> 00:38:53,000
Student: Without explicitly declaring it to be a string. Instructor (Julie Zelenski): Yeah. A string literal is when you see open double quotes, some characters, 

829
00:38:53,000 --> 00:38:55,000
close quote, that's a string constant or a string literal. 

830
00:38:55,000 --> 00:38:58,000
In any situation where you see exactly that, 

831
00:38:58,000 --> 00:39:02,000
not a string variable 

832
00:39:02,000 --> 00:39:05,000


833
00:39:05,000 --> 00:39:08,000
is basically 

834
00:39:08,000 --> 00:39:09,000


835
00:39:09,000 --> 00:39:12,000
what I'm saying there. 

836
00:39:12,000 --> 00:39:14,000


837
00:39:14,000 --> 00:39:17,000
How do we do IO? How do we do input/output 

838
00:39:17,000 --> 00:39:18,000
in 

839
00:39:18,000 --> 00:39:23,000
C++? Let me first say that input/output is probably one of the more 

840
00:39:23,000 --> 00:39:26,000
distinctive features of any language. C's IO, for example, looks very different 

841
00:39:26,000 --> 00:39:29,000
than C++'s IO, which looks kind of different from Java's 

842
00:39:29,000 --> 00:39:31,000
IO. These are 

843
00:39:31,000 --> 00:39:35,000
areas where for some reason, even though they all do the same things 

844
00:39:35,000 --> 00:39:37,000
underneath it all: they let you print stuff. They let you read stuff. 

845
00:39:37,000 --> 00:39:39,000
They have some formatting features. 

846
00:39:39,000 --> 00:39:43,000
For one reason or another, these are the areas where they're widely divergent in their 

847
00:39:43,000 --> 00:39:46,000
syntax and the way you express what you want to do. That makes them particularly 

848
00:39:46,000 --> 00:39:50,000
annoying to learn is the truth. I know a lot of IO systems, and they're all 

849
00:39:50,000 --> 00:39:52,000
very jumbled up in my head. 

850
00:39:52,000 --> 00:39:54,000
At any given moment if you asked me how could I print 

851
00:39:54,000 --> 00:39:57,000
a decimal number with three digits of precision in this language, I'm going 

852
00:39:57,000 --> 00:40:01,000
to have to go look it up. My motto is look it up. 

853
00:40:01,000 --> 00:40:03,000
Don't worry about memorizing 

854
00:40:03,000 --> 00:40:05,000
these details, because they are 

855
00:40:05,000 --> 00:40:08,000
very tied to any particular language and its formatting system. 

856
00:40:08,000 --> 00:40:10,000


857
00:40:10,000 --> 00:40:13,000


858
00:40:13,000 --> 00:40:16,000
That said, we're going to use a little bit of IO. We'll need to be able to read and 

859
00:40:16,000 --> 00:40:20,000
write things to the console to interact with the user. We're going to do a little bit of 

860
00:40:20,000 --> 00:40:25,000
file reading, reading numbers and strings from files, maybe even producing some files. We're going to 

861
00:40:25,000 --> 00:40:29,000
use some very simple set of features. We're not going to go too deep. When 

862
00:40:29,000 --> 00:40:33,000
you need to know more, there are great resources to go check into for that. 

863
00:40:33,000 --> 00:40:35,000
I wouldn't in advance go make yourself 

864
00:40:35,000 --> 00:40:37,000
an expert on any form of 

865
00:40:37,000 --> 00:40:39,000


866
00:40:39,000 --> 00:40:42,000
IO. Figure it out when you need to. 

867
00:40:42,000 --> 00:40:43,000


868
00:40:43,000 --> 00:40:48,000
IO is actually handled in C++ using stream objects. There are stream 

869
00:40:48,000 --> 00:40:48,000
classes. 

870
00:40:48,000 --> 00:40:52,000
The ostream is the output stream class that's used for writing. The istream is the class 

871
00:40:52,000 --> 00:40:54,000
used for reading. 

872
00:40:54,000 --> 00:40:57,000
There are variants; for example, the ifstream and the ofstream are the 

873
00:40:57,000 --> 00:41:00,000
file equivalents of the input/output streams. 

874
00:41:00,000 --> 00:41:02,000
Cout and 

875
00:41:02,000 --> 00:41:06,000
cin are these two basically global variables, effectively, that 

876
00:41:06,000 --> 00:41:10,000
give you access to the console: cout for the console output stream and cin for the console input 

877
00:41:10,000 --> 00:41:11,000
stream. 

878
00:41:11,000 --> 00:41:14,000
That means the little text window that pops up that you get to type and 

879
00:41:14,000 --> 00:41:17,000
print things for the user to see and interact with. 

880
00:41:17,000 --> 00:41:20,000
The standard operators for reading and writing to the stream 

881
00:41:20,000 --> 00:41:24,000
in the default sense are the <<, which is stream insertion, and 

882
00:41:24,000 --> 00:41:26,000
>>, which is stream extraction. 

883
00:41:26,000 --> 00:41:29,000
You stick things onto a stream and then retrieve things back from a stream 

884
00:41:29,000 --> 00:41:32,000
that you're reading from. 

885
00:41:32,000 --> 00:41:33,000
A very simple example of this would be 

886
00:41:33,000 --> 00:41:39,000
I have the variables X and Y declared here. I asked the user to enter two 

887
00:41:39,000 --> 00:41:40,000
numbers, 

888
00:41:40,000 --> 00:41:44,000
and then I use extraction that says from the console input stream 

889
00:41:44,000 --> 00:41:48,000
to pull an integer out followed by another integer, and then I repeat back what 

890
00:41:48,000 --> 00:41:50,000
they 

891
00:41:50,000 --> 00:41:54,000
said. In its simplest form, the kind of things you can print out are 

892
00:41:54,000 --> 00:41:58,000
very related to things you can read in. When I ask cin to read an integer 

893
00:41:58,000 --> 00:42:01,000
here, it looks for a sequence of digits upcoming in the stream that form 

894
00:42:01,000 --> 00:42:03,000
a valid integer 

895
00:42:03,000 --> 00:42:06,000
which it assembles and puts into the value X. Then it looks for another one. 
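The example being described is not reproduced in the transcript, so what follows is a reconstruction under assumptions: the function name `echoTwoNumbers`, the prompt wording, and the `demoEcho` helper are all hypothetical. The streams are passed in as parameters so the same code can run against `cin`/`cout` or against string streams.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Prompts for two integers, extracts them, and repeats them back.
void echoTwoNumbers(std::istream& in, std::ostream& out) {
    int x = 0;
    int y = 0;
    out << "Enter two numbers: ";   // stream insertion
    in >> x >> y;                   // stream extraction, skipping whitespace
    out << "You said: " << x << " and " << y << std::endl;
}

// Hypothetical demo: string streams stand in for the console.
std::string demoEcho(const std::string& typed) {
    std::istringstream in(typed);
    std::ostringstream out;
    echoTwoNumbers(in, out);
    return out.str();
}
```

With the real console the call would be `echoTwoNumbers(std::cin, std::cout)`. Any spaces, tabs, or newlines between the two numbers are skipped by the extraction.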

896
00:42:06,000 --> 00:42:11,000
It typically uses whitespace as a delimiter, so any returns, tabs or spaces 

897
00:42:11,000 --> 00:42:15,000
will be skipped over in between. Anything that led up to X, it will 

898
00:42:15,000 --> 00:42:18,000
skip over all the white space, look for some digits and then skip over any 

899
00:42:18,000 --> 00:42:24,000
intervening white space, and look for some more digits to pull Y. Of course, 

900
00:42:24,000 --> 00:42:24,000
what's 

901
00:42:24,000 --> 00:42:29,000
likely to happen here is users are bad typists. They make mistakes. So 

902
00:42:29,000 --> 00:42:33,000
when I go to read this, what happens if they've typed the letter A or 

903
00:42:33,000 --> 00:42:37,000
72A45? 

904
00:42:37,000 --> 00:42:40,000
This causes a little bit of havoc because when it goes to extract that, it looks for 

905
00:42:40,000 --> 00:42:44,000
some digits and it finds this thing and it doesn't match its expectation. That puts the stream in 

906
00:42:44,000 --> 00:42:46,000
what is called an error or fail state, 

907
00:42:46,000 --> 00:42:50,000
which then requires you digging around, realizing it went into a fail 

908
00:42:50,000 --> 00:42:53,000
state, cleaning it up and resetting and starting over. 
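A minimal sketch of that cleanup, assuming the standard `<iostream>`, `<limits>`, and `<sstream>` headers. The function names `tryReadInt` and `demoRecovery` are hypothetical; `clear()` and `ignore()` are the standard stream members for resetting the state and discarding input.

```cpp
#include <iostream>
#include <limits>
#include <sstream>
#include <string>

// Tries to extract one integer. On failure it clears the fail state,
// discards the rest of the current line, and returns false.
bool tryReadInt(std::istream& in, int& value) {
    if (in >> value) {
        return true;
    }
    in.clear();  // reset the fail state so the stream is usable again
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    return false;
}

// Hypothetical demo: feeds typed text through a string stream and returns
// the first integer that reads cleanly, or -1 after maxTries failures.
int demoRecovery(const std::string& typed, int maxTries) {
    std::istringstream in(typed);
    int value = 0;
    for (int i = 0; i < maxTries; i++) {
        if (tryReadInt(in, value)) {
            return value;
        }
    }
    return -1;
}
```

Without the `clear()` call every later extraction on the stream would keep failing, which is the annoying part being described.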

909
00:42:53,000 --> 00:42:56,000
It's not that it can't be done, but it's a little bit annoying. 

910
00:42:56,000 --> 00:43:01,000
We just made this task a little bit less onerous by providing in the 

911
00:43:01,000 --> 00:43:05,000
simpio library, which is our CS106 simple IO library. It has 

912
00:43:05,000 --> 00:43:07,000
GetInteger, GetReal 

913
00:43:07,000 --> 00:43:11,000
and GetLine. They all read from the console, so reading from cin, and they deal with 

914
00:43:11,000 --> 00:43:15,000
all that error handling. They make sure that the input given was 

915
00:43:15,000 --> 00:43:18,000
well formed. If it's not, it reprompts and has them try again. It does that until 

916
00:43:18,000 --> 00:43:21,000
they get an integer. When you call GetInteger, you know that eventually, 

917
00:43:21,000 --> 00:43:25,000
the user will have typed in a well-formed integer and you will get that value back 

918
00:43:25,000 --> 00:43:27,000
when you make that call. You don't have to be worried about 

919
00:43:27,000 --> 00:43:31,000
all the machinations to check for errors. 

920
00:43:31,000 --> 00:43:35,000
Retry is actually bundled up behind that routine for you. 
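To give a feel for what such a routine bundles up, here is a rough sketch of a retry loop. This is not the CS106 library's implementation: the name `getIntegerSketch`, the reprompt message, and the fallback value returned when input runs out are all assumptions made for illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Reads whole lines until one holds exactly one well-formed integer.
int getIntegerSketch(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream parser(line);
        int value = 0;
        char leftover;
        // Accept only if an integer parses and nothing but whitespace follows,
        // so input like "72A45" is rejected rather than read as 72.
        if ((parser >> value) && !(parser >> leftover)) {
            return value;
        }
        out << "Illegal integer format. Try again: ";  // hypothetical wording
    }
    return 0;  // assumed fallback when the input ends without a valid integer
}
```

Reading a line at a time and parsing it separately means a bad entry never leaves the console stream itself in a fail state, so the caller sees only the final, valid value.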

921
00:43:35,000 --> 00:43:38,000
Most of our console input will end up using these functions just for convenience. 

922
00:43:38,000 --> 00:43:43,000
They save us a certain amount of hassle. 

923
00:43:43,000 --> 00:43:49,000
If I wanted X and Y, I would say GetInteger for one, GetInteger for the other. 

924
00:43:49,000 --> 00:43:53,000
I'd have to call it twice. There's not a 

925
00:43:53,000 --> 00:43:57,000
combined form of it. It saves us a lot of trouble. I 

926
00:43:57,000 --> 00:44:00,000
can't do it this way. I'd have to stop it after one anyway, check to see if 

927
00:44:00,000 --> 00:44:05,000
it failed, if not, go back in. It's kind of misleading to even show this form, because 

928
00:44:05,000 --> 00:44:09,000
that form assumes that the user is a perfect typist and never makes mistakes, which is 

929
00:44:09,000 --> 00:44:12,000
in this day and age not too likely. We'll talk more 

930
00:44:12,000 --> 00:44:18,000
about file streams on Wednesday. 

931
00:44:18,000 --> 00:44:22,000
On your way out, look for handout five and seven for Mac or PC depending 

932
00:44:22,000 --> 00:44:36,000
on what you're using and good luck getting your compiler set up. I'll see you on Wednesday.