Data Structures in C++

C++ offers a wide range of data structures. From a performance perspective, the right data structure has to be chosen. This post briefly outline the main feature of every data structure available in C++.

ToC ToC ToC
1. vector 2. set 3. multiset
4. unordered_set 5. unordered_multiset 6. map
7. multimap 8. unordered_map 9. unordered_multimap
10. deque 11. forward_list 12. list
13. queue 14. stack 15. array
16. priority_queue    

1. std::vector


2. std::set

Present in header file set.

Set container stores unique elements whose values once entered cannot be modified later, they can only be inserted and removed. The elements in a set are accessed by their key instead of position. The elements are sorted according to a particular order as they are inserted.

Set is most commonly implemented as a binary search tree (red-black trees). Therefore, it also has the provision to make use of a custom comparison function as a template parameter (requiring signature bool comp(T lhs, T rhs)). It can also take an allocator object when default memory allocation needs to be modified.

Binary Search Tree

Since they are implemented in a binary search trees, the time to insert, erase and find an element is O(log n) (the size of the tree). It is worth mentioning that set can make use of hints to decrease the operation times.

Set allows the following operations:

  • Iterators: begin, end, rbegin, rend, cbegin, cend, crbegin, crend
  • Capacity: empty, size
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Operations: find, lower_bound, upper_bound

3. std::multiset

Present in header file set.

Multiset container stores multiple equal elements whose values once entered cannot be modified later, they can only be inserted and removed. The elements in a set are accessed by their key instead of position. The elements are sorted according to a particular order as they are inserted.

Set is most commonly implemented as a binary search tree (red-black trees). Therefore, it also has the provision to make use of a custom comparison function as a template parameter (requiring signature bool comp(T lhs, T rhs)). It can also take an allocator object when default memory allocation needs to be modified.

Since they are implemented in a binary search trees, the time to insert, erase and find an element is O(log n) (the size of the tree). It is worth mentioning that set can make use of hints to decrease the operation times. Count takes O(log n) to find the matching element and linear time to count the number of elements.

Set allows the following operations:

  • Iterators: begin, end, rbegin, rend, cbegin, cend, crbegin, crend
  • Capacity: empty, size
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Operations: find, count, lower_bound, upper_bound

4. std::unordered_set

Present in the header file unordered_set.

Unordered set container stores unique elements whose values once entered cannot be modified later (immutable), they can only be inserted and removed. The elements in a set are accessed by their key instead of position. The elements are placed in no particular order, which makes it easier for faster retrieval.

Unordered set is implemented as a chained hash table, which arranges the elements into buckets based on the hash of the element. Therefore, it also has the provision to make use of a custom hash function as a template parameter. It can also take an allocator object when default memory allocation needs to be modified.

Since they are implemented as a chained hash table, the time to insert, erase and find an element is O(1). It is worth mentioning that set can make use of hints to decrease the operation times.

Set allows the following operations:

  • Iterators: begin, end, cbegin, cend
  • Capacity: empty, size
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Operations: find, count, equal_range
  • Observers: hash_function, key_eq, get_allocator
  • Hash Policy: load_factor, max_load_factor, rehash, reserve
  • Buckets: bucket_count, max_bucket_count, bucket_size, bucket

A small demo program to show how to implement an unordered_set for custom classes is shown here


5. std::unordered_multiset

Present in the header file unordered_set.

The only difference between this container and unorder set is that this container can hold multiple keys.

Unordered set container stores multiple equal elements whose values once entered cannot be modified later (immutable), they can only be inserted and removed. The elements in a set are accessed by their key instead of position. The elements are placed in no particular order, which makes it easier for faster retrieval.

Unordered multiset is also implemented as a chained hash table, which arranges the elements into buckets based on the hash of the element. Therefore, it also has the provision to make use of a custom hash function as a template parameter. It can also take an allocator object when default memory allocation needs to be modified.

Since they are implemented as a chained hash table, the time to insert, erase and find an element is O(1). It is worth mentioning that set can make use of hints to decrease the operation times.

Set allows the following operations:

  • Iterators: begin, end, cbegin, cend
  • Capacity: empty, size
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Operations: find, count, equal_range
  • Observers: hash_function, key_eq, get_allocator
  • Hash Policy: load_factor, max_load_factor, rehash, reserve
  • Buckets: bucket_count, max_bucket_count, bucket_size, bucket

6. std::map

Present in header file map.

Map container stores unique key elements. The elements in a set are accessed by their key instead of position and the value held by the key can be extracted (keys and mapped to values). The elements are sorted according to a particular order as they are inserted.

Maps is most commonly implemented as a binary search tree (red-black trees). Therefore, it also has the provision to make use of a custom comparison function as a template parameter (requiring signature bool comp(T lhs, T rhs)). It can also take an allocator object when default memory allocation needs to be modified. An entry in the map can be accessed with pair<const T1 K, T2 V> entry.

Since they are implemented in a binary search trees, the time to insert, erase and find an element is O(log n) (the size of the tree). It is worth mentioning that set can make use of hints to decrease the operation times.

Set allows the following operations:

  • Iterators: begin, end, rbegin, rend, cbegin, cend, crbegin, crend
  • Capacity: empty, size, max_size
  • Element Access: operator[], at
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Observers: key_comp, value_comp
  • Operations: find, lower_bound, upper_bound, equal_range
  • Allocator: get_allocator

7. std::multimap

Present in header file map.

Multi map container stores multiple equal elements. The elements in a multi map are accessed by their key instead of position. The elements are sorted according to a particular order as they are inserted.

Set is most commonly implemented as a binary search tree (red-black trees). Therefore, it also has the provision to make use of a custom comparison function as a template parameter (requiring signature bool comp(T lhs, T rhs)). It can also take an allocator object when default memory allocation needs to be modified. An entry in the map can be accessed with pair<const T1 K, T2 V> entry.

Since they are implemented in a binary search trees, the time to insert, erase and find an element is O(log n) (the size of the tree). It is worth mentioning that set can make use of hints to decrease the operation times. Count takes O(log n) to find the matching element and linear time to count the number of elements.

Set allows the following operations: Set allows the following operations:

  • Iterators: begin, end, rbegin, rend, cbegin, cend, crbegin, crend
  • Capacity: empty, size, max_size
  • Modification: insert, erase, clear, emplace, emplace_hint
  • Observers: key_comp, value_comp
  • Operations: find, count, lower_bound, upper_bound, equal_range
  • Allocator: get_allocator

8. std::unordered_map


9. std::unordered_multimap


10. std::deque

Present in header file deque.

Deque expands to double ended queue. This container should be preferred when elements are inserted or removed from either the front or back of the container frequently, while also allowing random access to the elements (by index).

Deque is commonly implemented as dynamic arrays in which multiple arrays are attached with linked with each other. Generally these multiple arrays are managed from a lookup table. The working of this data structure is shown in the image below.It should be noted that the size of the chucks should be constant throughout an instance.

Double Ended Queue Implementation as shown by Konrad Rudolph

The internal implementation around maps make it constant time access O(1) by making use of the house-keeping data in the lookup table. Pushing and poping an element from the front and back are both constant time access O(1). However, insert is a linear time operation, since it has to move all elements by one position.

Set allows the following operations:

  • Iterators: begin, end, rbegin, rend, cbegin, cend, crbegin, crend
  • Capacity: empty, size, resize, shrink_to_fit
  • Element Access: operator[], at, front, back
  • Modification: assign, push_back, push_front, pop_back, pop_front, insert, erase, clear, emplace, emplace_back, emplace_front

Some detailed analysis of the operations on deque are found here and here.

Reference found here.


11. std::forward_list

Present in header file forward_list.

Deque expands to double ended queue. This container should be preferred when elements are inserted or removed from either the front or back of the container frequently, while also allowing random access to the elements (by index).

Forward lists are implemention of singly-linked lists. In which every element along with it’s data also hold the pointer to the next element.

Singly-Linked List


12. std::list


13. std::queue


14. std::stack

Present in the header stack.

Similar to priority_queues, they are container adaptors, which can take form of a vector, deque (default) and list.

Stack is a container used to perform Last-in First-out operations.

stack<int, vector<int>> stackVector();
stack<int, list<int>> stackList();
stack<int, deque<int>> stackDeque();

Stack allows the following operations:

  • General functions: empty, size, top, push, emplace, pop, swap.

15. std::array


16. std::priority_queue

Present in header file queue.

Although they are not containers, the are container adaptors which can take form of a vector (default) or deque. Priority queues are container that store elements with some interesting property (either min or max value) at the top of the tree, decreasing the access time. It allows only to retrieve the max (for max-heap) and min (for min-heap).

They are implementation of binary heaps. In which a node has a value greater than equal to the children (for a max heap). Heaps are linear data structures, in which relation between children and parent can be obtain with the following equations:

parent(n) = (n-1)/2

left-child(n) = (n+1)*2 - 1 = 2n + 1 right-child(n) = (n+1)*2 = 2n + 2

Therefore, it also has the provision to make use of a custom comparison function as a template parameter (requiring signature bool comp(T lhs, T rhs)). It can also take an allocator object when default memory allocation needs to be modified.

Heap Representation of an array

Since they are implemented as heaps, the time to access the value of interest (max - for max heap and min - for min heap) is O(1). However, inserting an element into the heap takes O(log n) time. Since, insertions initially happen at the end of the array and keep climbing until they reach the desired position. (inserting an element actually calls the function push_heap on the underlying container). Pop operations also have a similar runtime.

Priority queue allows the following operations:

  • Iterators: None
  • Capacity: empty, size
  • Modification: push, emplace, pop
  • Operations: top

Heap operations are also possible on standalone vector and deque with heap functions. These function are available in header algorithm.

A small demo program to show how to use a priority queue and standalone heap here


Summary

Q. Why to choose set against unordered_set when there is very little benefit gained by using the former?

If you need to iterator through the container or need the elements to be sorted and using a small container with 100 elements, choose set. For smaller sets prefer the usage of set, otherwise use an unordered_set. However, always benchmark when counting for performance.

Reference:

Written on January 23, 2020