Skip to content
nucularmoo edited this page Mar 17, 2018 · 22 revisions

Item Containers

Explanation of the contents of a topic page @ Week 1 Topic 1

Back to Week 1

Objectives: Adding, iterating, manipulating data in item containers

Comment: Attendees should realise it's totally ok to use standard containers. There are conversion functions between the standard and Qt item containers. The added-value of Qt containers is implicit sharing. It's efficient to take copies of huge containers, as long as the items are not changed, which will result to deep copy. Important learning objective is to do the iteration in the right way: non-mutable with simple range loop for (const &noteConstRefToMyItem : container) and QMutableXXIterator for mutable iterating. These are typically the fastest ones.

Beginner

Intermediate

  • What are Containers?
  • What are the Container classes?
  • Is it OK to use standard containers?
  • What is the added-value of Qt containers?
  • How do I iterate the right way?
  • What are assignable types?
  • How to iterate containers efficiently (for-range loop + Java-like iterators)?
  • How to use algorithms in container manipulation (because Qt algorithms are mostly depicted)?

Expert

!NB: The doc.qt.io Container page is pretty exhaustive. We may not need much more than a shorter introductory presentation here.


Course material content

Qt Containers (vs standard Containers?)

http://doc.qt.io/qt-5/containers.html

The Qt library provides a set of general purpose template-based container classes. These classes can be used to store items of a specified type. For example, if you need a resizable array of QStrings, use QVector<QString>.

These container classes are designed to be lighter, safer, and easier to use than the STL containers. If you are unfamiliar with the STL, or prefer to do things the "Qt way", you can use these classes instead of the STL classes.

The container classes are implicitly shared, they are reentrant, and they are optimized for speed, low memory consumption, and minimal inline code expansion, resulting in smaller executables. In addition, they are thread-safe in situations where they are used as read-only containers by all threads used to access them.

For traversing the items stored in a container, you can use one of two types of iterators: Java-style iterators and STL-style iterators. The Java-style iterators are easier to use and provide high-level functionality, whereas the STL-style iterators are slightly more efficient and can be used together with Qt's and STL's generic algorithms.

Qt also offers a foreach keyword that make it very easy to iterate over all the items stored in a container.

Container classes

http://doc.qt.io/qt-5/containers.html#the-container-classes

Qt provides the following sequential containers: QList, QLinkedList, QVector, QStack, and QQueue. For most applications, QList is the best type to use. Although it is implemented as an array-list, it provides very fast prepends and appends. If you really need a linked-list, use QLinkedList; if you want your items to occupy consecutive memory locations, use QVector. QStack and QQueue are convenience classes that provide LIFO and FIFO semantics.

Qt also provides these associative containers: QMap, QMultiMap, QHash, QMultiHash, and QSet. The "Multi" containers conveniently support multiple values associated with a single key. The "Hash" containers provide faster lookup by using a hash function instead of a binary search on a sorted set.

As special cases, the QCache and QContiguousCache classes provide efficient hash-lookup of objects in a limited cache storage.

Containers can be nested. For example, it is perfectly possible to use a QMap<QString, QList<int>>, where the key type is QString and the value type QList<int>.

The containers are defined in individual header files with the same name as the container (e.g., <QLinkedList>). For convenience, the containers are forward declared in <QtContainerFwd>.

The values stored in the various containers can be of any assignable data type. To qualify, a type must provide a default constructor, a copy constructor, and an assignment operator. This covers most data types you are likely to want to store in a container, including basic types such as int and double, pointer types, and Qt data types such as QString, QDate, and QTime, but it doesn't cover QObject or any QObject subclass (QWidget, QDialog, QTimer, etc.). If you attempt to instantiate a QList<\QWidget>, the compiler will complain that QWidget's copy constructor and assignment operators are disabled. If you want to store these kinds of objects in a container, store them as pointers, for example as QList<QWidget *>.

Here's an example custom data type that meets the requirement of an assignable data type:

 class Employee
 {
 public:
      Employee() {}
      Employee(const Employee &other);

      Employee &operator=(const Employee &other);

  private:
       QString myName;
       QDate myDateOfBirth;
 };

If we don't provide a copy constructor or an assignment operator, C++ provides a default implementation that performs a member-by-member copy. In the example above, that would have been sufficient. Also, if you don't provide any constructors, C++ provides a default constructor that initializes its member using default constructors. Although it doesn't provide any explicit constructors or assignment operator, the following data type can be stored in a container:

 struct Movie
 {
      int id;
      QString title;
      QDate releaseDate;
  };

Some containers have additional requirements for the data types they can store. For example, the Key type of a QMap<Key, T> must provide operator<(). Such special requirements are documented in a class's detailed description. In some cases, specific functions have special requirements; these are described on a per-function basis. The compiler will always emit an error if a requirement isn't met.

Qt's containers provide operator<<() and operator>>() so that they can easily be read and written using a QDataStream. This means that the data types stored in the container must also support operator<<() and operator>>(). Providing such support is straightforward; here's how we could do it for the Movie struct above:

 QDataStream &operator<<(QDataStream &out, const Movie &movie)
 {
      out << (quint32)movie.id << movie.title
          << movie.releaseDate;
      return out;
 }

 QDataStream &operator>>(QDataStream &in, Movie &movie)
 {
      quint32 id;
      QDate date;

      in >> id >> movie.title >> date;
      movie.id = (int)id;
      movie.releaseDate = date;
      return in;
 }

The documentation of certain container class functions refer to default-constructed values; for example, QVector automatically initializes its items with default-constructed values, and QMap::value() returns a default-constructed value if the specified key isn't in the map. For most value types, this simply means that a value is created using the default constructor (e.g. an empty string for QString). But for primitive types like int and double, as well as for pointer types, the C++ language doesn't specify any initialization; in those cases, Qt's containers automatically initialize the value to 0.

Using standard containers

It's totally ok to use standard containers \:D/

Iterating correctly and efficiently

http://doc.qt.io/qt-5/containers.html#the-iterator-classes
http://doc.qt.io/qt-5/qtalgorithms.html#details

Iterators provide a uniform means to access items in a container. Qt's container classes provide two types of iterators: Java-style iterators and STL-style iterators. Iterators of both types are invalidated when the data in the container is modified or detached from implicitly shared copies due to a call to a non-const member function.

Java-Style Iterators

The Java-style iterators are new in Qt 4 and are the standard ones used in Qt applications. They are more convenient to use than the STL-style iterators, at the price of being slightly less efficient. Their API is modelled on Java's iterator classes.

For each container class, there are two Java-style iterator data types: one that provides read-only access and one that provides read-write access.

In this discussion, we will concentrate on QList and QMap. The iterator types for QLinkedList, QVector, and QSet have exactly the same interface as QList's iterators; similarly, the iterator types for QHash have the same interface as QMap's iterators.

Unlike STL-style iterators (covered below), Java-style iterators point between items rather than directly at items. For this reason, they are either pointing to the very beginning of the container (before the first item), at the very end of the container (after the last item), or between two items. The diagram below shows the valid iterator positions as red arrows for a list containing four items:

Here's a typical loop for iterating through all the elements of a QList<QString> in order and printing them to the console:

 QList<QString> list;
 list << "A" << "B" << "C" << "D";

 QListIterator<QString> i(list);
 while (i.hasNext())
      qDebug() << i.next();

It works as follows: The QList to iterate over is passed to the QListIterator constructor. At that point, the iterator is located just in front of the first item in the list (before item "A"). Then we call hasNext() to check whether there is an item after the iterator. If there is, we call next() to jump over that item. The next() function returns the item that it jumps over. For a QList<QString>, that item is of type QString.

Here's how to iterate backward in a QList:

 QListIterator<QString> i(list);
 i.toBack();
 while (i.hasPrevious())
      qDebug() << i.previous();

The code is symmetric with iterating forward, except that we start by calling toBack() to move the iterator after the last item in the list.

The diagram below illustrates the effect of calling next() and previous() on an iterator:

The following table summarizes the QListIterator API:

QListIterator provides no functions to insert or remove items from the list as we iterate. To accomplish this, you must use QMutableListIterator. Here's an example where we remove all odd numbers from a QList<int> using QMutableListIterator:

 QMutableListIterator<int> i(list);
 while (i.hasNext()) {
      if (i.next() % 2 != 0)
           i.remove();
 }

The next() call in the loop is made every time. It jumps over the next item in the list. The remove() function removes the last item that we jumped over from the list. The call to remove() does not invalidate the iterator, so it is safe to continue using it. This works just as well when iterating backward:

 QMutableListIterator<int> i(list);
 i.toBack();
 while (i.hasPrevious()) {
      if (i.previous() % 2 != 0)
           i.remove();
 } 

If we just want to modify the value of an existing item, we can use setValue(). In the code below, we replace any value larger than 128 with 128:

 QMutableListIterator<int> i(list);
 while (i.hasNext()) {
      if (i.next() > 128)
           i.setValue(128);
 }

Just like remove(), setValue() operates on the last item that we jumped over. If we iterate forward, this is the item just before the iterator; if we iterate backward, this is the item just after the iterator.

The next() function returns a non-const reference to the item in the list. For simple operations, we don't even need setValue():

 QMutableListIterator<int> i(list);
 while (i.hasNext())
      i.next() *= 2;

As mentioned above, QLinkedList's, QVector's, and QSet's iterator classes have exactly the same API as QList's. We will now turn to QMapIterator, which is somewhat different because it iterates on (key, value) pairs.

Like QListIterator, QMapIterator provides toFront(), toBack(), hasNext(), next(), peekNext(), hasPrevious(), previous(), and peekPrevious(). The key and value components are extracted by calling key() and value() on the object returned by next(), peekNext(), previous(), or peekPrevious().

The following example removes all (capital, country) pairs where the capital's name ends with "City":

 QMap<QString, QString> map;
 map.insert("Paris", "France");
 map.insert("Guatemala City", "Guatemala");
 map.insert("Mexico City", "Mexico");
 map.insert("Moscow", "Russia");
 ...

 QMutableMapIterator<QString, QString> i(map);
 while (i.hasNext()) {
      if (i.next().key().endsWith("City"))
           i.remove();
 }

QMapIterator also provides a key() and a value() function that operate directly on the iterator and that return the key and value of the last item that the iterator jumped above. For example, the following code copies the contents of a QMap into a QHash:

 QMap<int, QWidget *> map;
 QHash<int, QWidget *> hash;

 QMapIterator<int, QWidget *> i(map);
 while (i.hasNext()) {
      i.next();
      hash.insert(i.key(), i.value());
 }

If we want to iterate through all the items with the same value, we can use findNext() or findPrevious(). Here's an example where we remove all the items with a particular value:

 QMutableMapIterator<int, QWidget *> i(map);
 while (i.findNext(widget))
      i.remove();

Assignable types

Algorithms in container manipulation

http://doc.qt.io/qt-5/containers.html#algorithmic-complexity

Algorithmic complexity is concerned about how fast (or slow) each function is as the number of items in the container grow. For example, inserting an item in the middle of a QLinkedList is an extremely fast operation, irrespective of the number of items stored in the QLinkedList. On the other hand, inserting an item in the middle of a QVector is potentially very expensive if the QVector contains many items, since half of the items must be moved one position in memory.

To describe algorithmic complexity, we use the following terminology, based on the "big Oh" notation:

The following table summarizes the algorithmic complexity of Qt's sequential container classes:

In the table, "Amort." stands for "amortized behavior". For example, "Amort. O(1)" means that if you call the function only once, you might get O(n) behavior, but if you call it multiple times (e.g., n times), the average behavior will be O(1).

The following table summarizes the algorithmic complexity of Qt's associative containers and sets:

With QVector, QHash, and QSet, the performance of appending items is amortized O(log n). It can be brought down to O(1) by calling QVector::reserve(), QHash::reserve(), or QSet::reserve() with the expected number of items before you insert the items. The next section discusses this topic in more depth.

Growth Strategies

QVector<T>, QString, and QByteArray store their items contiguously in memory; QList<T> maintains an array of pointers to the items it stores to provide fast index-based access (unless T is a pointer type or a basic type of the size of a pointer, in which case the value itself is stored in the array); QHash<Key, T> keeps a hash table whose size is proportional to the number of items in the hash. To avoid reallocating the data every single time an item is added at the end of the container, these classes typically allocate more memory than necessary.

Consider the following code, which builds a QString from another QString:

 QString onlyLetters(const QString &in)
 {
      QString out;
      for (int j = 0; j < in.size(); ++j) {
           if (in[j].isLetter())
                out += in[j];
      }
      return out;
 }

We build the string out dynamically by appending one character to it at a time. Let's assume that we append 15000 characters to the QString string. Then the following 18 reallocations (out of a possible 15000) occur when QString runs out of space: 4, 8, 12, 16, 20, 52, 116, 244, 500, 1012, 2036, 4084, 6132, 8180, 10228, 12276, 14324, 16372. At the end, the QString has 16372 Unicode characters allocated, 15000 of which are occupied.

The values above may seem a bit strange, but here are the guiding principles:

  • QString allocates 4 characters at a time until it reaches size 20.
  • From 20 to 4084, it advances by doubling the size each time. More precisely, it advances to the next power of two, minus 12. (Some memory allocators perform worst when requested exact powers of two, because they use a few bytes per block for book-keeping.)
  • From 4084 on, it advances by blocks of 2048 characters (4096 bytes). This makes sense because modern operating systems don't copy the entire data when reallocating a buffer; the physical memory pages are simply reordered, and only the data on the first and last pages actually needs to be copied.

QByteArray and QList use more or less the same algorithm as QString.

QVector<T> also uses that algorithm for data types that can be moved around in memory using memcpy() (including the basic C++ types, the pointer types, and Qt's shared classes) but uses a different algorithm for data types that can only be moved by calling the copy constructor and a destructor. Since the cost of reallocating is higher in that case, QVector<T> reduces the number of reallocations by always doubling the memory when running out of space.

QHash<Key, T> is a totally different case. QHash's internal hash table grows by powers of two, and each time it grows, the items are relocated in a new bucket, computed as qHash(key) % QHash::capacity() (the number of buckets). This remark applies to QSet<T> and QCache<Key, \T> as well.

For most applications, the default growing algorithm provided by Qt does the trick. If you need more control, QVector<T>, QHash<Key, T>, QSet<T>, QString, and QByteArray provide a trio of functions that allow you to check and specify how much memory to use to store the items:

  • capacity() returns the number of items for which memory is allocated (for QHash and QSet, the number of buckets in the hash table).
  • reserve(size) explicitly preallocates memory for size items.
  • squeeze() frees any memory not required to store the items.

If you know approximately how many items you will store in a container, you can start by calling reserve(), and when you are done populating the container, you can call squeeze() to release the extra preallocated memory.


Exhaustive reference material mentioned in this topic

Further reading topics/links:

Clone this wiki locally