About Boost Multi-index Containers

I really like Qt Template Library (QTL), but still have lack of some important functionality which is usually called multi-index. Anyway there is very powerful implementation existing in Boost. I really need these templates to establish relations between different data structures. I often use it as internal “engine” for different Proxy models that I use to manipulate models.

So, I would like to remind and give quick recepies about such powerful module in Boost as multi_index_container. We know and intensively use standard STL templates for storing and access to data as lists, vectors, maps, hashes. There are a lot of information about them and very well explained.

Things get complicated when there is a need to store and access to objects by more then one set of keys or their combinations. Usually developers start from creation of two maps or hashes plus a log of synchronization of then, but later with increase of keys codes became much more complicated due to synchronization of hashes, insertions, key renames, removes etc. Also it is getting very hard to understand the computation complexity of every operation taken with such storage. And of course creation of such “bicycles” leads to high amount of bugs, needs of optimizations, etc.

Of course everything is invented before us and in Boost libraries you will find a module for solving these problems – boost::multi_index. Great advantage – speed, multi_index is very fast. But documentation for this module is, let’s say, complicated for understanding and especially for quick start, so newbies try to avoid the use of the module. And of course compiler messages about error due to use of multi_index templates give high amount of confusion.

I would like to remove confusion and give simple cases that will be good recipes of use. I will use it together with Qt classes for demo:

For example we have small structure:

struct Rec { QString name, phone, addr; };

multi_index provide convenience by using tags for keys. Let’s define then directly in Rec to have it all in one place:

struct Rec {
	QString name, phone, addr;
	struct ByName {}; struct ByPhone {}; struct ByAddr {};
};

Now we can create next array:

typedef boost::multi_index_container<Rec,
	indexed_by<
		ordered_unique<
			tag<Rec::ByName>, member<Rec,QString,&Rec::name>
		>,
		ordered_non_unique<
			tag<Rec::ByPhone>, member<Rec,QString,&Rec::phone>
		>,
		ordered_non_unique<
			tag<Rec::ByAddr>, member<Rec,QString,&Rec::addr>
		>
	>
> Store;

We defines that records are unique by name, but not unique by address and phone. The “name” uniqueness also means that we can not add two records with same name.

{
	Store store;
	Rec r1 = { "Basilio Pupkinio", "022", "Neron st" };
	qDebug() << "ok1" << store.insert(r1).second; // ok1 true
	qDebug() << "ok2" << store.insert(r1).second; // ok2 false
}

For me it is very useful property that array does not allow the addition of duplicates - then I can rely on validity of array "a priori".

Let's find the record by name:

{
	QString find_id = "Basilio Pupkinio";
	typedef Store::index<Rec::ByName>::type List;
	const List & ls = store.get<Rec::ByName>();
	List::const_iterator it = ls.find(find_id);
	if ( it != ls.end() ) {
		qDebug() << (*it).addr; // "Neron st"
	}
}

If we have few records with phone 022

{
	Store store;
	Rec r1 = { "Basilio Pupkinio", "022", "Pushkina st" };
	Rec r2 = { "Vasya Pupkin",     "022", "Around st" };
	store.insert(r1)
	store.insert(r2)
}

Then to find all records with the phone 022:

{
	QString find_phone = "022";
	Store::index<Rec::ByPhone>::type::iterator it0, it1;
	tie(it0,it1) = store.get<Rec::ByPhone>().equal_range(find_phone);
	while(it0 != it1) { qDebug() << (*it0).name; ++it0; }
}

What if we are interested in search by phone+address combination?
Then we just add composition index to our array by such code block:

		ordered_non_unique<
			tag<Rec::ByKey>, composite_key<
				Rec,
				member<Rec,QString,&Rec::phone>,
				member<Rec,QString,&Rec::addr>
			>
		>,

(do not forget to add tag: struct ByKey {}; )

{
	Rec r1  = { "Basilio Pupkinio", "022", "Pushkina st" };
	Rec r2  = { "Vasya Pupkin",     "022", "Around st" };
	Rec r3  = { "Vasilisa Pupkina", "022", "Around st" };
	store.insert(r1);
	store.insert(r2);
	store.insert(r3);
	{
		QString find_phone = "022";
		QString find_addr = "Around st";
		Store::index<Rec::ByKey>::type::iterator it0, it1;
		tie(it0,it1) = store.get<Rec::ByKey>().equal_range(make_tuple(find_phone, find_addr));
		while(it0 != it1) { qDebug() << (*it0).name; ++it0; }
	}
}

Of course for composite key we can also use both ordered_non_unique and ordered_unique.
In this way it is very convenient to create additional requirements to content of keys and their combinations - the array itself will not allow addition of "wrong" objects.
If we would like to have same time "vector"-like access then we can easily add random_access definition:

typedef boost::multi_index_container<Rec,
	indexed_by<
		random_access< tag<Rec::ByRnd> >,		
		ordered_unique<
			tag<Rec::ByName>, member<Rec,QString,&Rec::name>
		>,
		ordered_non_unique<
			tag<Rec::ByPhone>, member<Rec,QString,&Rec::phone>
		>
	>
> Store;

As result we have access to records by store[0], store[1]..., and push_back().
If we would like to use array as hash and have fast access by key as О(1), but slower O(log(n)) on insert/remove then we can use hashed_unique/hashed_non_unique instead ordered_unique/ordered_non_uniue:

typedef boost::multi_index_container<Rec,
	indexed_by<
		hashed_non_unique<
			tag<Rec::ByName>, member<Rec,QString,&Rec::name>
		>,
		ordered_non_unique<
			tag<Rec::ByPhone>, member<Rec,QString,&Rec::phone>
		>
	>
> Store;
std::size_t hash_value(QString x) { return qHash(x); }

Of course can do same hashed_uniqie/hashed_non_unique for composite keys.

Records modifications:
Because members of objects should be in sync with array keys you can not change only field of object. Imagine that we need to change a phone of one record. To have our keys in sync the change should be done through functor:

struct Rec {
	QString name, phone, addr;
	struct ByName {};
	struct ByPhone {};
	struct ByAddr {};
	struct PhoneChange : public std::unary_function<Rec,void> {
		QString p; PhoneChange(const QString &_p) : p(_p) {}
		void operator()(Rec & r) { r.phone = p; }
	};
}; 

As example let's find some record by name, get the appropriate "name" iterator, then with project() switch to iterator by "phone" and finally modify the record by functor:

{
	Store store;
	Rec r1  = { "Basilio Pupkinio", "021", "Around st" };
	Rec r2  = { "Vasya Pupkin",     "022", "Around st" };
	Rec r3  = { "Vasilisa Pupkina", "022", "Around st" };
	store.insert(r1);
	store.insert(r2);
	store.insert(r3);

	QString find_id = "Basilio Pupkinio";
	typedef Store::index<Rec::ByName>::type NList;
	typedef Store::index<Rec::ByPhone>::type PList;
	NList & ns = store.get<Rec::ByName>();
	PList & ps = store.get<Rec::ByPhone>();
	NList::const_iterator nit = ns.find(find_id);
	if ( nit != ns.end() ) {
		PList::const_iterator pit = store.project<Rec::ByPhone>(nit);
		ps.modify(pit, Rec::PhoneChange("022"));
	}
}

Keys are not limited by fields - you can use and member-functions, example:

struct Rec {
	QString n, phone, addr;
	QString name() const { return n; }
	struct ByName {};
	struct ByPhone {};
};

typedef boost::multi_index_container<Rec,
	indexed_by<
		hashed_unique<
			tag<Rec::ByName>, const_mem_fun<Rec,QString,&Rec::name>
		>,
		ordered_non_unique<
			tag<Rec::ByPhone>, member<Rec,QString,&Rec::phone>
		>
	>
> Store;

Worth to mention usage of pointers in the array:

struct Rec {
	QString n, phone, addr;
	QString name() const { return n; }
	struct ByName {};
	struct ByPhone {};
};
typedef boost::shared_ptr<Rec> Rec_ptr;

typedef boost::multi_index_container<Rec_ptr,
	indexed_by<
		hashed_non_unique<
			tag<Rec::ByName>, const_mem_fun<Rec,QString,&Rec::name>
		>,
		ordered_non_unique<
			tag<Rec::ByPhone>, member<Rec,QString,&Rec::phone>
		>,
		hashed_non_unique<
			tag<Rec::ByKey>, composite_key<
				Rec,
				member<Rec,QString,&Rec::phone>,
				member<Rec,QString,&Rec::addr>
			>
		>,
		ordered_non_unique<
			tag<Rec::ByAddr>, member<Rec,QString,&Rec::addr>
		>
	>
> Store;

All other codes are same (only used "->" instead "." in few places):

{
	QString find_phone = "022";
	Store::index<Rec::ByPhone>::type::iterator it0, it1;
	tie(it0,it1) = store.get<Rec::ByPhone>().equal_range(find_phone);
	while(it0 != it1) { qDebug() << (*it0)->name(); ++it0; }
}

Plus worth to add a block

		ordered_unique<
			tag<Rec::ByPtr>, const_mem_fun<Rec_ptr,Rec *,&Rec_ptr::get>
		>,

To prevent duplicates of pointers in array.

As result it is possible to create very efficient (by access) data structures cross-linked by smart pointers and have simple predictions regarding cost of every operation of access, insert or remove. From some point of view multi_index_container can be considered as extremely fast database table in memory with predefined(during coding) keys.

I guess these examples are great collection of "hot recipes" for use.

This entry was posted in Blog, Boost, C++, Models, Qt, Research. Bookmark the permalink. Post a comment or leave a trackback: Trackback URL.

3 Comments

  1. Shailendra Kumar
    Posted July 8, 2015 at 13:38 | Permalink

    Hi Admin,

    Your examples here are great and easy to understand, still as a novice user of Boost i have one question. Is there a possibility to update “phone” ,”addr” and one new field “socialid” without calling “ps.modify(pit, Rec::” thrice. Please guide.

  2. yshurik
    Posted July 8, 2015 at 15:44 | Permalink

    Hi,

    I am not sure about appropriate api, because you need to modify different lists of keys (modify() is method of ps). If it is a question about performance then instead of changing multiple keylists it may worth to remove, modify and re-insert element.

  3. Jim
    Posted September 16, 2016 at 08:15 | Permalink

    Is there an advantage of adding an index as a “member” rather than a “identity”. I can instead of doing “member” do “identity” and insert. Is that a problem?

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>