vector
TypeA vector
is a collection of objects, all of which have the same type. Every object in the collection has an associated index, which gives access to that object. A vector
is often referred to as a container because it “contains” other objects. We’ll have much more to say about containers in Part II.
To use a vector
, we must include the appropriate header. In our examples, we also assume that an appropriate using
declaration is made:
#include <vector>
using std::vector;
A vector
is a class template. C++ has both class and function templates. Writing a template requires a fairly deep understanding of C++. Indeed, we won’t see how to create our own templates until Chapter 16! Fortunately, we can use templates without knowing how to write them.
Templates are not themselves functions or classes. Instead, they can be thought of as instructions to the compiler for generating classes or functions. The process that the compiler uses to create classes or functions from templates is called instantiation. When we use a template, we specify what kind of class or function we want the compiler to instantiate.
For a class template, we specify which class to instantiate by supplying additional information, the nature of which depends on the template. How we specify the information is always the same: We supply it inside a pair of angle brackets following the template’s name.
In the case of vector
, the additional information we supply is the type of the objects the vector
will hold:
vector<int> ivec; // ivec holds objects of type int
vector<Sales_item> Sales_vec; // holds Sales_items
vector<vector<string>> file; // vector whose elements are vectors
In this example, the compiler generates three distinct types from the vector
template: vector<int>
, vector<Sales_item>
, and vector<vector<string>>
.
vector
is a template, not a type. Types generated fromvector
must include the element type, for example,vector<int>
.
We can define vector
s to hold objects of most any type. Because references are not objects (§ 2.3.1, p. 50), we cannot have a vector
of references. However, we can have vector
s of most other (nonreference) built-in types and most class types. In particular, we can have vector
s whose elements are themselves vector
s.
It is worth noting that earlier versions of C++ used a slightly different syntax to define a vector
whose elements are themselves vector
s (or another template type). In the past, we had to supply a space between the closing angle bracket of the outer vector
and its element type—vector<vector<int> >
rather than vector<vector<int>>
.
Some compilers may require the old-style declarations for a
vector
ofvector
s, for example,vector<vector<int> >
.
vector
sAs with any class type, the vector
template controls how we define and initialize vector
s. Table 3.4 (p. 99) lists the most common ways to define vector
s.
Table 3.4. Ways to Initialize a vector
We can default initialize a vector
(§ 2.2.1, p. 44), which creates an empty vector
of the specified type:
vector<string> svec; // default initialization; svec has no elements
It might seem that an empty vector
would be of little use. However, as we’ll see shortly, we can (efficiently) add elements to a vector
at run time. Indeed, the most common way of using vector
s is to define an initially empty vector
to which elements are added as their values become known at run time.
We can also supply initial value(s) for the element(s) when we define a vector
. For example, we can copy elements from another vector
. When we copy a vector
, each element in the new vector
is a copy of the corresponding element in the original vector
. The two vector
s must be the same type:
vector<int> ivec; // initially empty
// give ivec some values
vector<int> ivec2(ivec); // copy elements of ivec into ivec2
vector<int> ivec3 = ivec; // copy elements of ivec into ivec3
vector<string> svec(ivec2); // error: svec holds strings, not ints
vector
Another way to provide element values, is that under the new standard, we can list initialize (§ 2.2.1, p. 43) a vector
from a list of zero or more initial element values enclosed in curly braces:
vector<string> articles = {"a", "an", "the"};
The resulting vector
has three elements; the first holds the string "a"
, the second holds "an"
, and the last is "the"
.
As we’ve seen, C++ provides several forms of initialization (§ 2.2.1, p. 43). In many, but not all, cases we can use these forms of initialization interchangably. So far, we have seen two examples where the form of initialization matters: when we use the copy initialization form (i.e., when we use =
) (§ 3.2.1, p. 84), we can supply only a single initializer; and when we supply an in-class initializer (§ 2.6.1, p. 73), we must either use copy initialization or use curly braces. A third restriction is that we can supply a list of element values only by using list initialization in which the initializers are enclosed in curly braces. We cannot supply a list of initializers using parentheses:
vector<string> v1{"a", "an", "the"}; // list initialization
vector<string> v2("a", "an", "the"); // error
We can also initialize a vector
from a count and an element value. The count determines how many elements the vector
will have; the value provides the initial value for each of those elements:
vector<int> ivec(10, -1); // ten int elements, each initialized to -1
vector<string> svec(10, "hi!"); // ten strings; each element is "hi!"
We can usually omit the value and supply only a size. In this case the library creates a value-initialized element initializer for us. This library-generated value is used to initialize each element in the container. The value of the element initializer depends on the type of the elements stored in the vector
.
If the vector
holds elements of a built-in type, such as int
, then the element initializer has a value of 0. If the elements are of a class type, such as string
, then the element initializer is itself default initialized:
vector<int> ivec(10); // ten elements, each initialized to 0
vector<string> svec(10); // ten elements, each an empty string
There are two restrictions on this form of initialization: The first restriction is that some classes require that we always supply an explicit initializer (§ 2.2.1, p. 44). If our vector
holds objects of a type that we cannot default initialize, then we must supply an initial element value; it is not possible to create vector
s of such types by supplying only a size.
The second restriction is that when we supply an element count without also supplying an initial value, we must use the direct form of initialization:
vector<int> vi = 10; // error: must use direct initialization to supply a size
Here we are using 10 to instruct vector
how to create the vector
—we want a vector
with ten value-initialized elements. We are not “copying” 10 into the vector
. Hence, we cannot use the copy form of initialization. We’ll see more about how this restriction works in § 7.5.4 (p. 296).
In a few cases, what initialization means depends upon whether we use curly braces or parentheses to pass the initializer(s). For example, when we initialize a vector<int>
from a single int
value, that value might represent the vector
’s size or it might be an element value. Similarly, if we supply exactly two int
values, those values could be a size and an initial value, or they could be values for a two-element vector
. We specify which meaning we intend by whether we use curly braces or parentheses:
vector<int> v1(10); // v1 has ten elements with value 0
vector<int> v2{10}; // v2 has one element with value 10
vector<int> v3(10, 1); // v3 has ten elements with value 1
vector<int> v4{10, 1}; // v4 has two elements with values 10 and 1
When we use parentheses, we are saying that the values we supply are to be used to construct the object. Thus, v1
and v3
use their initializers to determine the vector
’s size, and its size and element values, respectively.
When we use curly braces, {...}
, we’re saying that, if possible, we want to list initialize the object. That is, if there is a way to use the values inside the curly braces as a list of element initializers, the class will do so. Only if it is not possible to list initialize the object will the other ways to initialize the object be considered. The values we supply when we initialize v2
and v4
can be used as element values. These objects are list initialized; the resulting vector
s have one and two elements, respectively.
On the other hand, if we use braces and there is no way to use the initializers to list initialize the object, then those values will be used to construct the object. For example, to list initialize a vector
of string
s, we must supply values that can be used as string
s. In this case, there is no confusion about whether to list initialize the elements or construct a vector
of the given size:
vector<string> v5{"hi"}; // list initialization: v5 has one element
vector<string> v6("hi"); // error: can't construct a vector from a string literal
vector<string> v7{10}; // v7 has ten default-initialized elements
vector<string> v8{10, "hi"}; // v8 has ten elements with value "hi"
Although we used braces on all but one of these definitions, only v5
is list initialized. In order to list initialize the vector
, the values inside braces must match the element type. We cannot use an int
to initialize a string
, so the initializers for v7
and v8
can’t be element initializers. If list initialization isn’t possible, the compiler looks for other ways to initialize the object from the given values.
Exercises Section 3.3.1
Exercise 3.12: Which, if any, of the following
vector
definitions are in error? For those that are legal, explain what the definition does. For those that are not legal, explain why they are illegal.(a)
vector<vector<int>> ivec;
(b)
vector<string> svec = ivec;
(c)
vector<string> svec(10, "null");
Exercise 3.13: How many elements are there in each of the following
vector
s? What are the values of the elements?(a)
vector<int> v1;
(b)
vector<int> v2(10);
(c)
vector<int> v3(10, 42);
(d)
vector<int> v4{10};
(e)
vector<int> v5{10, 42};
(f)
vector<string> v6{10};
(g)
vector<string> v7{10, "hi"};
vector
Directly initializing the elements of a vector
is feasible only if we have a small number of known initial values, if we want to make a copy of another vector
, or if we want to initialize all the elements to the same value. More commonly, when we create a vector
, we don’t know how many elements we’ll need, or we don’t know the value of those elements. Even if we do know all the values, if we have a large number of different initial element values, it can be cumbersome to specify them when we create the vector
.
As one example, if we need a vector
with values from 0 to 9, we can easily use list initialization. What if we wanted elements from 0 to 99 or 0 to 999? List initialization would be too unwieldy. In such cases, it is better to create an empty vector
and use a vector
member named push_back
to add elements at run time. The push_back
operation takes a value and “pushes” that value as a new last element onto the “back” of the vector
. For example:
vector<int> v2; // empty vector
for (int i = 0; i != 100; ++i)
v2.push_back(i); // append sequential integers to v2
// at end of loop v2 has 100 elements, values 0 . . . 99
Even though we know we ultimately will have 100 elements, we define v2
as empty. Each iteration adds the next sequential integer as a new element in v2
.
We use the same approach when we want to create a vector
where we don’t know until run time how many elements the vector
should have. For example, we might read the input, storing the values we read in the vector
:
// read words from the standard input and store them as elements in a vector
string word;
vector<string> text; // empty vector
while (cin >> word) {
text.push_back(word); // append word to text
}
Again, we start with an initially empty vector
. This time, we read and store an unknown number of values in text
.
The standard requires that
vector
implementations can efficiently add elements at run time. Becausevector
s grow efficiently, it is often unnecessary—and can result in poorer performance—to define avector
of a specific size. The exception to this rule is if all the elements actually need the same value. If differing element values are needed, it is usually more efficient to define an emptyvector
and add elements as the values we need become known at run time. Moreover, as we’ll see in § 9.4 (p. 355),vector
offers capabilities to allow us to further enhance run-time performance when we add elements.Starting with an empty
vector
and adding elements at run time is distinctly different from how we use built-in arrays in C and in most other languages. In particular, if you are accustomed to using C or Java, you might expect that it would be best to define thevector
at its expected size. In fact, the contrary is usually the case.
vector
The fact that we can easily and efficiently add elements to a vector
greatly simplifies many programming tasks. However, this simplicity imposes a new obligation on our programs: We must ensure that any loops we write are correct even if the loop changes the size of the vector
.
Other implications that follow from the dynamic nature of vector
s will become clearer as we learn more about using them. However, there is one implication that is worth noting already: For reasons we’ll explore in § 5.4.3 (p. 188), we cannot use a range for
if the body of the loop adds elements to the vector
.
The body of a range
for
must not change the size of the sequence over which it is iterating.
Exercises Section 3.3.2
Exercise 3.14: Write a program to read a sequence of
int
s fromcin
and store those values in avector
.Exercise 3.15: Repeat the previous program but read
string
s this time.
vector
OperationsIn addition to push_back, vector
provides only a few other operations, most of which are similar to the corresponding operations on string
s. Table 3.5 lists the most important ones.
We access the elements of a vector
the same way that we access the characters in a string:
through their position in the vector
. For example, we can use a range for
(§ 3.2.3, p. 91) to process all the elements in a vector
:
vector<int> v{1,2,3,4,5,6,7,8,9};
for (auto &i : v) // for each element in v (note: i is a reference)
i *= i; // square the element value
for (auto i : v) // for each element in v
cout << i << " "; // print the element
cout << endl;
In the first loop, we define our control variable, i
, as a reference so that we can use i
to assign new values to the elements in v
. We let auto
deduce the type of i
. This loop uses a new form of the compound assignment operator (§ 1.4.1, p. 12). As we’ve seen, +=
adds the right-hand operand to the left and stores the result in the left-hand operand. The *=
operator behaves similarly, except that it multiplies the left- and right-hand operands, storing the result in the left-hand one. The second range for
prints each element.
The empty
and size
members behave as do the corresponding string
members (§ 3.2.2, p. 87): empty
returns a bool
indicating whether the vector
has any elements, and size
returns the number of elements in the vector
. The size
member returns a value of the size_type
defined by the corresponding vector
type.
To use
size_type
, we must name the type in which it is defined. Avector
type always includes its element type (§ 3.3, p. 97):vector<int>::size_type // ok
vector::size_type // error
The equality and relational operators have the same behavior as the corresponding string
operations (§ 3.2.2, p. 88). Two vector
s are equal if they have the same number of elements and if the corresponding elements all have the same value. The relational operators apply a dictionary ordering: If the vector
s have differing sizes, but the elements that are in common are equal, then the vector
with fewer elements is less than the one with more elements. If the elements have differing values, then the relationship between the vector
s is determined by the relationship between the first elements that differ.
We can compare two vector
s only if we can compare the elements in those vector
s. Some class types, such as string
, define the meaning of the equality and relational operators. Others, such as our Sales_item
class, do not. The only operations Sales_item
supports are those listed in § 1.5.1 (p. 20). Those operations did not include the equality or relational operators. As a result, we cannot compare two vector<Sales_item>
objects.
vector
IndexWe can fetch a given element using the subscript operator (§ 3.2.3, p. 93). As with string
s, subscripts for vector
start at 0; the type of a subscript is the corresponding size_type
; and—assuming the vector
is nonconst
—we can write to the element returned by the subscript operator. In addition, as we did in § 3.2.3 (p. 95), we can compute an index and directly fetch the element at that position.
As an example, let’s assume that we have a collection of grades that range from 0 through 100. We’d like to count how many grades fall into various clusters of 10. Between zero and 100 there are 101 possible grades. These grades can be represented by 11 clusters: 10 clusters of 10 grades each plus one cluster for the perfect score of 100. The first cluster will count grades of 0 through 9, the second will count grades from 10 through 19, and so on. The final cluster counts how many scores of 100 were achieved.
Clustering the grades this way, if our input is
42 65 95 100 39 67 95 76 88 76 83 92 76 93
then the output should be
0 0 0 1 1 0 2 3 2 4 1
which indicates that there were no grades below 30, one grade in the 30s, one in the 40s, none in the 50s, two in the 60s, three in the 70s, two in the 80s, four in the 90s, and one grade of 100.
We’ll use a vector
with 11 elements to hold the counters for each cluster. We can determine the cluster index for a given grade by dividing that grade by 10. When we divide two integers, we get an integer in which the fractional part is truncated. For example, 42/10 is 4, 65/10 is 6 and 100/10 is 10. Once we’ve computed the cluster index, we can use it to subscript our vector
and fetch the counter we want to increment:
// count the number of grades by clusters of ten: 0--9, 10--19, . .. 90--99, 100
vector<unsigned> scores(11, 0); // 11 buckets, all initially 0
unsigned grade;
while (cin >> grade) { // read the grades
if (grade <= 100) // handle only valid grades
++scores[grade/10]; // increment the counter for the current cluster
}
We start by defining a vector
to hold the cluster counts. In this case, we do want each element to have the same value, so we allocate all 11 elements, each of which is initialized to 0
. The while
condition reads the grades. Inside the loop, we check that the grade we read has a valid value (i.e., that it is less than or equal to 100). Assuming the grade is valid, we increment the appropriate counter for grade
.
The statement that does the increment is a good example of the kind of terse code characteristic of C++ programs. This expression
++scores[grade/10]; // increment the counter for the current cluster
is equivalent to
auto ind = grade/10; // get the bucket index
scores[ind] = scores[ind] + 1; // increment the count
We compute the bucket index by dividing grade
by 10 and use the result of the division to index scores
. Subscripting scores
fetches the appropriate counter for this grade. We increment the value of that element to indicate the occurrence of a score in the given range.
As we’ve seen, when we use a subscript, we should think about how we know that the indices are in range (§ 3.2.3, p. 95). In this program, we verify that the input is a valid grade in the range between 0
and 100
. Thus, we know that the indices we can compute are between 0
and 10
. These indices are between 0
and scores.size() - 1
.
Programmers new to C++ sometimes think that subscripting a vector
adds elements; it does not. The following code intends to add ten elements to ivec
:
vector<int> ivec; // empty vector
for (decltype(ivec.size()) ix = 0; ix != 10; ++ix)
ivec[ix] = ix; // disaster: ivec has no elements
However, it is in error: ivec
is an empty vector
; there are no elements to subscript! As we’ve seen, the right way to write this loop is to use push_back
:
for (decltype(ivec.size()) ix = 0; ix != 10; ++ix)
ivec.push_back(ix); // ok: adds a new element with value ix
The subscript operator on
vector
(andstring
) fetches an existing element; it does not add an element.
It is crucially important to understand that we may use the subscript operator (the
[]
operator) to fetch only elements that actually exist. For example,vector<int> ivec; // empty vector
cout << ivec[0]; // error: ivec has no elements!
vector<int> ivec2(10); // vector with ten elements
cout << ivec2[10]; // error: ivec2 has elements 0 . . . 9It is an error to subscript an element that doesn’t exist, but it is an error that the compiler is unlikely to detect. Instead, the value we get at run time is undefined.
Attempting to subscript elements that do not exist is, unfortunately, an extremely common and pernicious programming error. So-called buffer overflow errors are the result of subscripting elements that don’t exist. Such bugs are the most common cause of security problems in PC and other applications.
A good way to ensure that subscripts are in range is to avoid subscripting altogether by using a range
for
whenever possible.
Exercises Section 3.3.3
Exercise 3.16: Write a program to print the size and contents of the
vector
s from exercise 3.13. Check whether your answers to that exercise were correct. If not, restudy § 3.3.1 (p. 97) until you understand why you were wrong.Exercise 3.17: Read a sequence of words from
cin
and store the values avector
. After you’ve read all the words, process thevector
and change each word to uppercase. Print the transformed elements, eight words to a line.Exercise 3.18: Is the following program legal? If not, how might you fix it?
vector<int> ivec;
ivec[0] = 42;Exercise 3.19: List three ways to define a
vector
and give it ten elements, each with the value 42. Indicate whether there is a preferred way to do so and why.Exercise 3.20: Read a set of integers into a
vector
. Print the sum of each pair of adjacent elements. Change your program so that it prints the sum of the first and last elements, followed by the sum of the second and second-to-last, and so on.