Calculate all combinations of K items selected from a set of size N

Given a set of N items, we need to calculate all subsets of K items. In each calculated subset, the selected items must appear only one time (no repetitions). In addition, the order of selected items should follow the criteria chosen for the entire set of N items.

In the following sections, it is always assumed that the elements in the original set are all unique and sorted according to some criteria.

An example

For easy of understanding, let’s consider a set of 5 characters: { ‘a’, ‘b’, ‘c’, ‘d’, ‘e’ }.

The goal of this post is to provide a basic algorithm to calculate all combinations of size k <= 5.

For example, if k = 4, then the combinations will be:

  1. { ‘a’, ‘b’, ‘c’, ‘d’ }
  2. { ‘a’, ‘b’, ‘c’, ‘e’ }
  3. { ‘a’, ‘b’, ‘d’, ‘e’ }
  4. { ‘a’, ‘c’, ‘d’, ‘e’ }
  5. { ‘b’, ‘c’, ‘d’, ‘e’ }

Each of the selected items occurs only once in the calculated set (combinations with no repetitions).

How to calculate the expected number of combinations?

The number of combinations of size k from a set of n elements is calculated as:

nCk = n! / ( k! * (n-k)! )

where ! denotes the factorial operation.

A first approach: the brute force (iterative method)

The following code provides all possible combinations of size K from a vector of N items.

This code assumes that the input values are sorted according to some criteria and are unique inside the input vector.

The proposed code is basically a brute force algorithm that checks all possible combinations of size k and stores them inside an output vector.

// Author : Pietro L. Carotenuto
// Website: www.pietrolc.com

#include <iostream>
#include <vector>
#include <deque>

using Tin = std::vector<int>;
using Tout = std::vector<std::deque<int>>;

void calculateCombinationsRec(const Tin&amp; vec, const int k, Tout&amp; sol, std::deque<int>&amp; tmp, const int nextIndex )
{
	tmp.emplace_back(vec[nextIndex]);

	// Save and return
	if (k == tmp.size())
	{
		// Add to solutions bucket
		sol.emplace_back(tmp);

		// Remove last item
		tmp.pop_back();

		// Check remaining combinations
		return;
	}

	// Otherwise, continue checking remaining combinations
	for (size_t i = nextIndex + 1; i < vec.size(); i++)
	{
		calculateCombinationsRec(vec, k, sol, tmp, i);
	}

	// Remove last item before checking new combinations
	tmp.pop_back();
}

// This returns a vector of all calculated combinations
// Assumption: The input vector does not contain any duplicate values.
//             Ideally, the input values are sorted according some criteria.
void calculateCombinations(const Tin&amp; vec, const int k, Tout&amp; sol)
{
	// Iterate on all possible root indexes
	for (size_t rootIdx = 0; rootIdx < vec.size(); rootIdx++)
	{
		// For each root, this temporary queue stores all combinations having
		// the same starting point
		std::deque<int> tmp;
		calculateCombinationsRec(vec, k, sol, tmp, rootIdx);
	}
}

int main()
{
	Tin vec;
	Tout sol;
	const int k = 5;

	for (int i = 0; i < 10; i++)
	{
	    vec.emplace_back(i);
	}

	// Main function
	calculateCombinations(vec, k, sol);

	std::cout << "Combinations: " << std::endl;
	for (const auto &amp; s1 : sol)
	{
		for (const auto&amp; it2 : s1)
		{
			std::cout << it2 << " ";
		}
		std::cout << std::endl;
	}

	std::cout << "N. combinations: " << sol.size() << std::endl;
}

This code produces the following output:

Combinations:
0 1 2 3 4
0 1 2 3 5
0 1 2 3 6
0 1 2 3 7
0 1 2 3 8
0 1 2 3 9
0 1 2 4 5
0 1 2 4 6
0 1 2 4 7
0 1 2 4 8
0 1 2 4 9
0 1 2 5 6
0 1 2 5 7
0 1 2 5 8
0 1 2 5 9
0 1 2 6 7
0 1 2 6 8
0 1 2 6 9
0 1 2 7 8
0 1 2 7 9
0 1 2 8 9
0 1 3 4 5
0 1 3 4 6
0 1 3 4 7
0 1 3 4 8
0 1 3 4 9
0 1 3 5 6
0 1 3 5 7
0 1 3 5 8
0 1 3 5 9
0 1 3 6 7
0 1 3 6 8
0 1 3 6 9
0 1 3 7 8
0 1 3 7 9
0 1 3 8 9
0 1 4 5 6
0 1 4 5 7
0 1 4 5 8
0 1 4 5 9
0 1 4 6 7
0 1 4 6 8
0 1 4 6 9
0 1 4 7 8
0 1 4 7 9
0 1 4 8 9
0 1 5 6 7
0 1 5 6 8
0 1 5 6 9
0 1 5 7 8
0 1 5 7 9
0 1 5 8 9
0 1 6 7 8
0 1 6 7 9
0 1 6 8 9
0 1 7 8 9
0 2 3 4 5
0 2 3 4 6
0 2 3 4 7
0 2 3 4 8
0 2 3 4 9
0 2 3 5 6
0 2 3 5 7
0 2 3 5 8
0 2 3 5 9
0 2 3 6 7
0 2 3 6 8
0 2 3 6 9
0 2 3 7 8
0 2 3 7 9
0 2 3 8 9
0 2 4 5 6
0 2 4 5 7
0 2 4 5 8
0 2 4 5 9
0 2 4 6 7
0 2 4 6 8
0 2 4 6 9
0 2 4 7 8
0 2 4 7 9
0 2 4 8 9
0 2 5 6 7
0 2 5 6 8
0 2 5 6 9
0 2 5 7 8
0 2 5 7 9
0 2 5 8 9
0 2 6 7 8
0 2 6 7 9
0 2 6 8 9
0 2 7 8 9
0 3 4 5 6
0 3 4 5 7
0 3 4 5 8
0 3 4 5 9
0 3 4 6 7
0 3 4 6 8
0 3 4 6 9
0 3 4 7 8
0 3 4 7 9
0 3 4 8 9
0 3 5 6 7
0 3 5 6 8
0 3 5 6 9
0 3 5 7 8
0 3 5 7 9
0 3 5 8 9
0 3 6 7 8
0 3 6 7 9
0 3 6 8 9
0 3 7 8 9
0 4 5 6 7
0 4 5 6 8
0 4 5 6 9
0 4 5 7 8
0 4 5 7 9
0 4 5 8 9
0 4 6 7 8
0 4 6 7 9
0 4 6 8 9
0 4 7 8 9
0 5 6 7 8
0 5 6 7 9
0 5 6 8 9
0 5 7 8 9
0 6 7 8 9
1 2 3 4 5
1 2 3 4 6
1 2 3 4 7
1 2 3 4 8
1 2 3 4 9
1 2 3 5 6
1 2 3 5 7
1 2 3 5 8
1 2 3 5 9
1 2 3 6 7
1 2 3 6 8
1 2 3 6 9
1 2 3 7 8
1 2 3 7 9
1 2 3 8 9
1 2 4 5 6
1 2 4 5 7
1 2 4 5 8
1 2 4 5 9
1 2 4 6 7
1 2 4 6 8
1 2 4 6 9
1 2 4 7 8
1 2 4 7 9
1 2 4 8 9
1 2 5 6 7
1 2 5 6 8
1 2 5 6 9
1 2 5 7 8
1 2 5 7 9
1 2 5 8 9
1 2 6 7 8
1 2 6 7 9
1 2 6 8 9
1 2 7 8 9
1 3 4 5 6
1 3 4 5 7
1 3 4 5 8
1 3 4 5 9
1 3 4 6 7
1 3 4 6 8
1 3 4 6 9
1 3 4 7 8
1 3 4 7 9
1 3 4 8 9
1 3 5 6 7
1 3 5 6 8
1 3 5 6 9
1 3 5 7 8
1 3 5 7 9
1 3 5 8 9
1 3 6 7 8
1 3 6 7 9
1 3 6 8 9
1 3 7 8 9
1 4 5 6 7
1 4 5 6 8
1 4 5 6 9
1 4 5 7 8
1 4 5 7 9
1 4 5 8 9
1 4 6 7 8
1 4 6 7 9
1 4 6 8 9
1 4 7 8 9
1 5 6 7 8
1 5 6 7 9
1 5 6 8 9
1 5 7 8 9
1 6 7 8 9
2 3 4 5 6
2 3 4 5 7
2 3 4 5 8
2 3 4 5 9
2 3 4 6 7
2 3 4 6 8
2 3 4 6 9
2 3 4 7 8
2 3 4 7 9
2 3 4 8 9
2 3 5 6 7
2 3 5 6 8
2 3 5 6 9
2 3 5 7 8
2 3 5 7 9
2 3 5 8 9
2 3 6 7 8
2 3 6 7 9
2 3 6 8 9
2 3 7 8 9
2 4 5 6 7
2 4 5 6 8
2 4 5 6 9
2 4 5 7 8
2 4 5 7 9
2 4 5 8 9
2 4 6 7 8
2 4 6 7 9
2 4 6 8 9
2 4 7 8 9
2 5 6 7 8
2 5 6 7 9
2 5 6 8 9
2 5 7 8 9
2 6 7 8 9
3 4 5 6 7
3 4 5 6 8
3 4 5 6 9
3 4 5 7 8
3 4 5 7 9
3 4 5 8 9
3 4 6 7 8
3 4 6 7 9
3 4 6 8 9
3 4 7 8 9
3 5 6 7 8
3 5 6 7 9
3 5 6 8 9
3 5 7 8 9
3 6 7 8 9
4 5 6 7 8
4 5 6 7 9
4 5 6 8 9
4 5 7 8 9
4 6 7 8 9
5 6 7 8 9
N. combinations: 252

Each element inside the input vector can be the starting element of a set of combinations of k elements.

READ ALSO  Pizza problem - Solved Google Hashcode practice test

The most expert in this field might realize that the elements having indexes larger than n – (k-1) cannot be the starting element of a new set of combinations. In order to improve the performance, we could ignore those items.

Given a specific starting element, then the function calculateCombinationsRec() is recursively called to check the remaining elements. The function calculateCombinationsRec() only stores the combinations having size k.

The brute force (recursive optimized method)

The code above works pretty fine but it has a big disadvantage: it cannot handle a large data set of items. Why? Simply because the proposed function is recursive.

For a large set of items or, similarly, for large values of K, this recursive function might lead to a stack overflow.

The solution is to convert the proposed code to its non-recursive version.

Converting a code from a recursive to its iterative version is easy.

Please, have a look into the following code, where an iterative function is being used to calculate all combinations of a given set of characters:

Unlock now all the premium content in this page. Sign-in or register to buy access to this content using your virtual coins.

0 persons like you have already done it. Do not miss it!

Which is the expected console output for the input vector of characters {‘a’, ‘b’, ‘c’, ‘d’, ‘e’} for K = 3?

Please, check the console output here below:

Combinations:
a b c
a b d
a b e
a c d
a c e
a d e
b c d
b c e
b d e
c d e
N. combinations: 10

Second approach: a C++ specific method

The method proposed in this section has been adapted from Rosetta Code website.

I like this approach since it is not recursive and it takes advantage of the function std::prev_permutation() already available in the STL library “algorithm”.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

using T = char;
using Tcomb = std::vector<T>;
using Tsol = std::vector<Tcomb>;

void comb(const int K, const Tcomb&amp; v, Tsol&amp; sol)
{
	const int N = v.size();

	if (K > N)
	{
		// log an error message here!
		return;
	}

	std::string bitmask(K, 1); // K leading 1's
	bitmask.resize(N, 0); // N-K trailing 0's

	// store values and permute bitmask
	do {

		Tcomb tmp{};
		for (int i = 0; i < N; ++i) // [0..N-1] integers
		{
			if (bitmask[i] != 0)
			{
				tmp.emplace_back(v[i]);
			}
		}

		if (!tmp.empty())
		{
			sol.emplace_back(tmp);
		}

	} while (std::prev_permutation(bitmask.begin(), bitmask.end()));
}

int main()
{
	Tsol sol;
	Tcomb values{ 'a','b','c','d','e' };

	comb(3, values, sol);

	for ( auto&amp; t : sol)
	{
		for (auto&amp; e: t)
		{
			std::cout << e << " ";
		}
		std::cout << std::endl;
	}
	std::cout << "N. combinations: " << sol.size() << std::endl;
}

The proposed iterative method can be used with any pair of K and N (assuming K <= N).

READ ALSO  "More Pizza" practice problem: solved with maximum score

Conclusions

In this article, two different approaches have been proposed to calculate all combinations of size k from a set of size N.

The main assumptions for the proposed methods are that all items in the set are unique and – eventually – sorted according to some criteria. The calculated combinations are sorted according to the same criteria followed in the set of items used as input.

Each proposed method has been properly discussed to point out pros and cons.

vote
Article Rating
Sharing is Caring
  •   
  •   
  •   
  •   
  •  
  •  
  •  
  •  
    4
    Shares
  • 4
  •  
  •  
  •  
  •  
  •  
Subscribe
Notify of
0 Comments
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x