3.3. Library vector Type

3.3. Library `vector` Type

A vector is a collection of objects, all of which have the same type. Every object in the collection has an associated index, which gives access to that object. A vector is often referred to as a container because it “contains” other objects. We’ll have much more to say about containers in Part II.

To use a vector, we must include the appropriate header. In our examples, we also assume that an appropriate using declaration is made:

#include <vector> using std::vector;

A vector is a class template. C++ has both class and function templates. Writing a template requires a fairly deep understanding of C++. Indeed, we won’t see how to create our own templates until Chapter 16! Fortunately, we can use templates without knowing how to write them.

Templates are not themselves functions or classes. Instead, they can be thought of as instructions to the compiler for generating classes or functions. The process that the compiler uses to create classes or functions from templates is called instantiation. When we use a template, we specify what kind of class or function we want the compiler to instantiate.

For a class template, we specify which class to instantiate by supplying additional information, the nature of which depends on the template. How we specify the information is always the same: We supply it inside a pair of angle brackets following the template’s name.

In the case of vector, the additional information we supply is the type of the objects the vector will hold:

vector<int> ivec; // ivec holds objects of type int vector<Sales_item> Sales_vec; // holds Sales_items vector<vector<string>> file; // vector whose elements are vectors

In this example, the compiler generates three distinct types from the vector template: vector<int>, vector<Sales_item>, and vector<vector<string>>.

Note

vector is a template, not a type. Types generated from vector must include the element type, for example, vector<int>.

We can define vectors to hold objects of most any type. Because references are not objects (§ 2.3.1, p. 50), we cannot have a vector of references. However, we can have vectors of most other (nonreference) built-in types and most class types. In particular, we can have vectors whose elements are themselves vectors.

It is worth noting that earlier versions of C++ used a slightly different syntax to define a vector whose elements are themselves vectors (or another template type). In the past, we had to supply a space between the closing angle bracket of the outer vector and its element type—vector<vector<int> > rather than vector<vector<int>>.

Warning

Some compilers may require the old-style declarations for a vector of vectors, for example, vector<vector<int> >.

3.3.1. Defining and Initializing `vector`s

As with any class type, the vector template controls how we define and initialize vectors. Table 3.4 (p. 99) lists the most common ways to define vectors.

Table 3.4. Ways to Initialize a vector

We can default initialize a vector (§ 2.2.1, p. 44), which creates an empty vector of the specified type:

vector<string> svec; // default initialization; svec has no elements

It might seem that an empty vector would be of little use. However, as we’ll see shortly, we can (efficiently) add elements to a vector at run time. Indeed, the most common way of using vectors is to define an initially empty vector to which elements are added as their values become known at run time.

We can also supply initial value(s) for the element(s) when we define a vector. For example, we can copy elements from another vector. When we copy a vector, each element in the new vector is a copy of the corresponding element in the original vector. The two vectors must be the same type:

vector<int> ivec; // initially empty // give ivec some values vector<int> ivec2(ivec); // copy elements of ivec into ivec2 vector<int> ivec3 = ivec; // copy elements of ivec into ivec3 vector<string> svec(ivec2); // error: svec holds strings, not ints

List Initializing a `vector`

Another way to provide element values, is that under the new standard, we can list initialize (§ 2.2.1, p. 43) a vector from a list of zero or more initial element values enclosed in curly braces:

vector<string> articles = {"a", "an", "the"};

The resulting vector has three elements; the first holds the string "a", the second holds "an", and the last is "the".

As we’ve seen, C++ provides several forms of initialization (§ 2.2.1, p. 43). In many, but not all, cases we can use these forms of initialization interchangably. So far, we have seen two examples where the form of initialization matters: when we use the copy initialization form (i.e., when we use =) (§ 3.2.1, p. 84), we can supply only a single initializer; and when we supply an in-class initializer (§ 2.6.1, p. 73), we must either use copy initialization or use curly braces. A third restriction is that we can supply a list of element values only by using list initialization in which the initializers are enclosed in curly braces. We cannot supply a list of initializers using parentheses:

vector<string> v1{"a", "an", "the"}; // list initialization vector<string> v2("a", "an", "the"); // error

Creating a Specified Number of Elements

We can also initialize a vector from a count and an element value. The count determines how many elements the vector will have; the value provides the initial value for each of those elements:

vector<int> ivec(10, -1); // ten int elements, each initialized to -1 vector<string> svec(10, "hi!"); // ten strings; each element is "hi!"

Value Initialization

We can usually omit the value and supply only a size. In this case the library creates a value-initialized element initializer for us. This library-generated value is used to initialize each element in the container. The value of the element initializer depends on the type of the elements stored in the vector.

If the vector holds elements of a built-in type, such as int, then the element initializer has a value of 0. If the elements are of a class type, such as string, then the element initializer is itself default initialized:

vector<int> ivec(10); // ten elements, each initialized to 0 vector<string> svec(10); // ten elements, each an empty string

There are two restrictions on this form of initialization: The first restriction is that some classes require that we always supply an explicit initializer (§ 2.2.1, p. 44). If our vector holds objects of a type that we cannot default initialize, then we must supply an initial element value; it is not possible to create vectors of such types by supplying only a size.

The second restriction is that when we supply an element count without also supplying an initial value, we must use the direct form of initialization:

vector<int> vi = 10; // error: must use direct initialization to supply a size

Here we are using 10 to instruct vector how to create the vector—we want a vector with ten value-initialized elements. We are not “copying” 10 into the vector. Hence, we cannot use the copy form of initialization. We’ll see more about how this restriction works in § 7.5.4 (p. 296).

List Initializer or Element Count?

In a few cases, what initialization means depends upon whether we use curly braces or parentheses to pass the initializer(s). For example, when we initialize a vector<int> from a single int value, that value might represent the vector’s size or it might be an element value. Similarly, if we supply exactly two int values, those values could be a size and an initial value, or they could be values for a two-element vector. We specify which meaning we intend by whether we use curly braces or parentheses:

vector<int> v1(10); // v1 has ten elements with value 0 vector<int> v2{10}; // v2 has one element with value 10 vector<int> v3(10, 1); // v3 has ten elements with value 1 vector<int> v4{10, 1}; // v4 has two elements with values 10 and 1

When we use parentheses, we are saying that the values we supply are to be used to construct the object. Thus, v1 and v3 use their initializers to determine the vector’s size, and its size and element values, respectively.

When we use curly braces, {...}, we’re saying that, if possible, we want to list initialize the object. That is, if there is a way to use the values inside the curly braces as a list of element initializers, the class will do so. Only if it is not possible to list initialize the object will the other ways to initialize the object be considered. The values we supply when we initialize v2 and v4 can be used as element values. These objects are list initialized; the resulting vectors have one and two elements, respectively.

On the other hand, if we use braces and there is no way to use the initializers to list initialize the object, then those values will be used to construct the object. For example, to list initialize a vector of strings, we must supply values that can be used as strings. In this case, there is no confusion about whether to list initialize the elements or construct a vector of the given size:

vector<string> v5{"hi"}; // list initialization: v5 has one element vector<string> v6("hi"); // error: can't construct a vector from a string literal vector<string> v7{10}; // v7 has ten default-initialized elements vector<string> v8{10, "hi"}; // v8 has ten elements with value "hi"

Although we used braces on all but one of these definitions, only v5 is list initialized. In order to list initialize the vector, the values inside braces must match the element type. We cannot use an int to initialize a string, so the initializers for v7 and v8 can’t be element initializers. If list initialization isn’t possible, the compiler looks for other ways to initialize the object from the given values.

Exercises Section 3.3.1

Exercise 3.12: Which, if any, of the following vector definitions are in error? For those that are legal, explain what the definition does. For those that are not legal, explain why they are illegal.

(a) vector<vector<int>> ivec;

(b) vector<string> svec = ivec;

(c) vector<string> svec(10, "null");

Exercise 3.13: How many elements are there in each of the following vectors? What are the values of the elements?

(a) vector<int> v1;

(b) vector<int> v2(10);

(c) vector<int> v3(10, 42);

(d) vector<int> v4{10};

(e) vector<int> v5{10, 42};

(f) vector<string> v6{10};

(g) vector<string> v7{10, "hi"};

3.3.2. Adding Elements to a `vector`

Directly initializing the elements of a vector is feasible only if we have a small number of known initial values, if we want to make a copy of another vector, or if we want to initialize all the elements to the same value. More commonly, when we create a vector, we don’t know how many elements we’ll need, or we don’t know the value of those elements. Even if we do know all the values, if we have a large number of different initial element values, it can be cumbersome to specify them when we create the vector.

As one example, if we need a vector with values from 0 to 9, we can easily use list initialization. What if we wanted elements from 0 to 99 or 0 to 999? List initialization would be too unwieldy. In such cases, it is better to create an empty vector and use a vector member named push_back to add elements at run time. The push_back operation takes a value and “pushes” that value as a new last element onto the “back” of the vector. For example:

vector<int> v2; // empty vector for (int i = 0; i != 100; ++i) v2.push_back(i); // append sequential integers to v2 // at end of loop v2 has 100 elements, values 0 . . . 99

Even though we know we ultimately will have 100 elements, we define v2 as empty. Each iteration adds the next sequential integer as a new element in v2.

We use the same approach when we want to create a vector where we don’t know until run time how many elements the vector should have. For example, we might read the input, storing the values we read in the vector:

// read words from the standard input and store them as elements in a vector string word; vector<string> text; // empty vector while (cin >> word) { text.push_back(word); // append word to text }

Again, we start with an initially empty vector. This time, we read and store an unknown number of values in text.

Key Concept: vectors Grow Efficiently

The standard requires that vector implementations can efficiently add elements at run time. Because vectors grow efficiently, it is often unnecessary—and can result in poorer performance—to define a vector of a specific size. The exception to this rule is if all the elements actually need the same value. If differing element values are needed, it is usually more efficient to define an empty vector and add elements as the values we need become known at run time. Moreover, as we’ll see in § 9.4 (p. 355), vector offers capabilities to allow us to further enhance run-time performance when we add elements.

Starting with an empty vector and adding elements at run time is distinctly different from how we use built-in arrays in C and in most other languages. In particular, if you are accustomed to using C or Java, you might expect that it would be best to define the vector at its expected size. In fact, the contrary is usually the case.

Programming Implications of Adding Elements to a `vector`

The fact that we can easily and efficiently add elements to a vector greatly simplifies many programming tasks. However, this simplicity imposes a new obligation on our programs: We must ensure that any loops we write are correct even if the loop changes the size of the vector.

Other implications that follow from the dynamic nature of vectors will become clearer as we learn more about using them. However, there is one implication that is worth noting already: For reasons we’ll explore in § 5.4.3 (p. 188), we cannot use a range for if the body of the loop adds elements to the vector.

Warning

The body of a range for must not change the size of the sequence over which it is iterating.

Exercises Section 3.3.2

Exercise 3.14: Write a program to read a sequence of ints from cin and store those values in a vector.

Exercise 3.15: Repeat the previous program but read strings this time.

3.3.3. Other `vector` Operations

In addition to push_back, vector provides only a few other operations, most of which are similar to the corresponding operations on strings. Table 3.5 lists the most important ones.

Table 3.5. vector Operations

We access the elements of a vector the same way that we access the characters in a string: through their position in the vector. For example, we can use a range for (§ 3.2.3, p. 91) to process all the elements in a vector:

vector<int> v{1,2,3,4,5,6,7,8,9}; for (auto &i : v) // for each element in v (note: i is a reference) i *= i; // square the element value for (auto i : v) // for each element in v cout << i << " "; // print the element cout << endl;

In the first loop, we define our control variable, i, as a reference so that we can use i to assign new values to the elements in v. We let auto deduce the type of i. This loop uses a new form of the compound assignment operator (§ 1.4.1, p. 12). As we’ve seen, += adds the right-hand operand to the left and stores the result in the left-hand operand. The *= operator behaves similarly, except that it multiplies the left- and right-hand operands, storing the result in the left-hand one. The second range for prints each element.

The empty and size members behave as do the corresponding string members (§ 3.2.2, p. 87): empty returns a bool indicating whether the vector has any elements, and size returns the number of elements in the vector. The size member returns a value of the size_type defined by the corresponding vector type.

Note

To use size_type, we must name the type in which it is defined. A vector type always includes its element type (§ 3.3, p. 97):

vector<int>::size_type // ok vector::size_type // error

The equality and relational operators have the same behavior as the corresponding string operations (§ 3.2.2, p. 88). Two vectors are equal if they have the same number of elements and if the corresponding elements all have the same value. The relational operators apply a dictionary ordering: If the vectors have differing sizes, but the elements that are in common are equal, then the vector with fewer elements is less than the one with more elements. If the elements have differing values, then the relationship between the vectors is determined by the relationship between the first elements that differ.

We can compare two vectors only if we can compare the elements in those vectors. Some class types, such as string, define the meaning of the equality and relational operators. Others, such as our Sales_item class, do not. The only operations Sales_item supports are those listed in § 1.5.1 (p. 20). Those operations did not include the equality or relational operators. As a result, we cannot compare two vector<Sales_item> objects.

Computing a `vector` Index

We can fetch a given element using the subscript operator (§ 3.2.3, p. 93). As with strings, subscripts for vector start at 0; the type of a subscript is the corresponding size_type; and—assuming the vector is nonconst—we can write to the element returned by the subscript operator. In addition, as we did in § 3.2.3 (p. 95), we can compute an index and directly fetch the element at that position.

As an example, let’s assume that we have a collection of grades that range from 0 through 100. We’d like to count how many grades fall into various clusters of 10. Between zero and 100 there are 101 possible grades. These grades can be represented by 11 clusters: 10 clusters of 10 grades each plus one cluster for the perfect score of 100. The first cluster will count grades of 0 through 9, the second will count grades from 10 through 19, and so on. The final cluster counts how many scores of 100 were achieved.

Clustering the grades this way, if our input is

42 65 95 100 39 67 95 76 88 76 83 92 76 93

then the output should be

0 0 0 1 1 0 2 3 2 4 1

which indicates that there were no grades below 30, one grade in the 30s, one in the 40s, none in the 50s, two in the 60s, three in the 70s, two in the 80s, four in the 90s, and one grade of 100.

We’ll use a vector with 11 elements to hold the counters for each cluster. We can determine the cluster index for a given grade by dividing that grade by 10. When we divide two integers, we get an integer in which the fractional part is truncated. For example, 42/10 is 4, 65/10 is 6 and 100/10 is 10. Once we’ve computed the cluster index, we can use it to subscript our vector and fetch the counter we want to increment:

// count the number of grades by clusters of ten: 0--9, 10--19, . .. 90--99, 100 vector<unsigned> scores(11, 0); // 11 buckets, all initially 0 unsigned grade; while (cin >> grade) { // read the grades if (grade <= 100) // handle only valid grades ++scores[grade/10]; // increment the counter for the current cluster }

We start by defining a vector to hold the cluster counts. In this case, we do want each element to have the same value, so we allocate all 11 elements, each of which is initialized to 0. The while condition reads the grades. Inside the loop, we check that the grade we read has a valid value (i.e., that it is less than or equal to 100). Assuming the grade is valid, we increment the appropriate counter for grade.

The statement that does the increment is a good example of the kind of terse code characteristic of C++ programs. This expression

++scores[grade/10]; // increment the counter for the current cluster

is equivalent to

auto ind = grade/10; // get the bucket index scores[ind] = scores[ind] + 1; // increment the count

We compute the bucket index by dividing grade by 10 and use the result of the division to index scores. Subscripting scores fetches the appropriate counter for this grade. We increment the value of that element to indicate the occurrence of a score in the given range.

As we’ve seen, when we use a subscript, we should think about how we know that the indices are in range (§ 3.2.3, p. 95). In this program, we verify that the input is a valid grade in the range between 0 and 100. Thus, we know that the indices we can compute are between 0 and 10. These indices are between 0 and scores.size() - 1.

Subscripting Does Not Add Elements

Programmers new to C++ sometimes think that subscripting a vector adds elements; it does not. The following code intends to add ten elements to ivec:

vector<int> ivec; // empty vector for (decltype(ivec.size()) ix = 0; ix != 10; ++ix) ivec[ix] = ix; // disaster: ivec has no elements

However, it is in error: ivec is an empty vector; there are no elements to subscript! As we’ve seen, the right way to write this loop is to use push_back:

for (decltype(ivec.size()) ix = 0; ix != 10; ++ix) ivec.push_back(ix); // ok: adds a new element with value ix

Warning

The subscript operator on vector (and string) fetches an existing element; it does not add an element.

Caution: Subscript Only Elements that are Known to Exist!

It is crucially important to understand that we may use the subscript operator (the [] operator) to fetch only elements that actually exist. For example,

vector<int> ivec; // empty vector cout << ivec[0]; // error: ivec has no elements! vector<int> ivec2(10); // vector with ten elements cout << ivec2[10]; // error: ivec2 has elements 0 . . . 9

It is an error to subscript an element that doesn’t exist, but it is an error that the compiler is unlikely to detect. Instead, the value we get at run time is undefined.

Attempting to subscript elements that do not exist is, unfortunately, an extremely common and pernicious programming error. So-called buffer overflow errors are the result of subscripting elements that don’t exist. Such bugs are the most common cause of security problems in PC and other applications.

Tip

A good way to ensure that subscripts are in range is to avoid subscripting altogether by using a range for whenever possible.

Exercises Section 3.3.3

Exercise 3.16: Write a program to print the size and contents of the vectors from exercise 3.13. Check whether your answers to that exercise were correct. If not, restudy § 3.3.1 (p. 97) until you understand why you were wrong.

Exercise 3.17: Read a sequence of words from cin and store the values a vector. After you’ve read all the words, process the vector and change each word to uppercase. Print the transformed elements, eight words to a line.

Exercise 3.18: Is the following program legal? If not, how might you fix it?

vector<int> ivec; ivec[0] = 42;

Exercise 3.19: List three ways to define a vector and give it ten elements, each with the value 42. Indicate whether there is a preferred way to do so and why.

Exercise 3.20: Read a set of integers into a vector. Print the sum of each pair of adjacent elements. Change your program so that it prints the sum of the first and last elements, followed by the sum of the second and second-to-last, and so on.