Permutations and combinations of pairs with AWK

In this post I demonstrate how to use AWK to permute and combine pairs from a list of items. For the basic AWK command I'm very grateful to Sundeep Agarwal, the author of the excellent "learn by example" online tutorials and e-books.

The list with which I'll demonstrate (I've given it the filename "items") is:

a
b
c
x
y
z

Note that there are no duplicated items.


Permutation with repetition.

awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) print a[i]a[j]}' items

[Screenshot: perm1]

This command builds the 36 possible permutations of 6 items taken 2 at a time, with repetition allowed. In other words, both the first letter in the pair and the second letter can be any one of the 6 letters. For this screenshot and the ones below I've passed the AWK output to pr for clarity.
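
The post doesn't give the pr options used for the screenshots. As a rough sketch, assuming a pr that accepts the common -COLUMN and -t options (GNU coreutils pr does), the 36 pairs could be laid out in 6 columns like this:

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items

# Assumed options, not taken from the post:
#   -6  ask pr for six columns
#   -t  leave out the page header and trailer
awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) print a[i]a[j]}' items | pr -6 -t
```

The column count is only a guess at a tidy layout; any number that divides the output evenly would do.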

The command builds an array "a" whose index is a count beginning at 1 and whose value is the whole line. In other words, AWK's first step is to number the items from 1 to 6. In the END block, two nested "for" loops each run from index 1 through index 6 and print the item pairs by index (1 1, 1 2, 1 3 ... 6 1, 6 2, 6 3 ... 6 6).
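
To see the numbering at work, the two index values can be printed next to each pair. This is a small sketch based on the command above; the only change is the extra "i, j" in the print statement, and head is added just to keep the listing short.

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items

# Same loops as the permutation command, but each line also shows
# the two indices; head keeps the first 7 of the 36 lines
awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) print i, j, a[i]a[j]}' items | head -n 7
# 1 1 aa
# 1 2 ab
# 1 3 ac
# 1 4 ax
# 1 5 ay
# 1 6 az
# 2 1 ba
```

The inner index "j" runs through all six values before the outer index "i" moves on to 2.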


Permutation without repetition.

awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) if (a[i] != a[j]) print a[i]a[j]}' items

[Screenshot: perm2]

Here I've built the 30 possible permutations of 6 items taken 2 at a time, with no repetition: the second letter in the pair cannot be the same as the first.

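This command is almost exactly the same as the previous one, but this time the repeated pairs (index 1 1, 2 2, 3 3, 4 4, 5 5, 6 6) are excluded by the condition if (a[i] != a[j]). The condition tests the items themselves rather than their index numbers, which is why it matters that the list has no duplicated items.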
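
A variant that is not from the post tests the index numbers instead, with if (i != j). On a list without duplicates it should give the same result, as this sketch checks. The output filenames "by_index" and "by_value" are just placeholders for the comparison.

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items

# Variant (not the post's command): skip a pair when the two indices are equal
awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) if (i != j) print a[i]a[j]}' items > by_index

# The post's command: skip a pair when the two values are equal
awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) if (a[i] != a[j]) print a[i]a[j]}' items > by_value

# cmp prints nothing and succeeds when the files are identical;
# wc then reports the number of pairs
cmp by_index by_value && wc -l < by_index
```

The two forms only differ when a value occurs more than once in the list: the index test still pairs the two copies with each other, while the value test drops that pair.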


Combination with repetition.

awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i;j<c;j++) print a[i]a[j]}' items

[Screenshot: comb1]

This command builds the 21 possible combinations of 6 items taken 2 at a time, with repetition allowed. While a permutation distinguishes between "ab" and "ba", for example, a combination sees the two strings as the same.

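In this command the indexing begins with 0, not 1, so the items are numbered 0 to 5. Notice that the inner loop (with "j") starts at the current value of "i", so the second item in a pair is never earlier in the list than the first and each combination is built only once. This is easier to see if the "for" loop steps are printed: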

[Screenshot: comb2]
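
The command behind that screenshot isn't shown in the post. One way to print the loop steps, as a sketch, is to add the "i" and "j" values to the print statement:

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items

# Print the loop counters next to each pair; head keeps the first 8 of the 21 lines
awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i;j<c;j++) print "i=" i, "j=" j, a[i]a[j]}' items | head -n 8
# i=0 j=0 aa
# i=0 j=1 ab
# i=0 j=2 ac
# i=0 j=3 ax
# i=0 j=4 ay
# i=0 j=5 az
# i=1 j=1 bb
# i=1 j=2 bc
```

When "i" moves on to 1, "j" restarts at 1 rather than 0, so "ba" is never printed.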

Combination without repetition.

awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i+1;j<c;j++) print a[i]a[j]}' items

[Screenshot: comb3]

This command builds the 15 possible combinations of 6 items taken 2 at a time, with no repetition.
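
The four totals quoted in this post (36, 30, 21 and 15) match the usual counting formulas for n items taken 2 at a time: n*n, n*(n-1), n*(n+1)/2 and n*(n-1)/2. As a quick check, this sketch counts the lines printed by each command and sets them beside the formula values. The variable names are placeholders of my own.

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items
n=$(wc -l < items)    # number of items, 6 here

# Count the lines printed by each of the four commands
p_rep=$(awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) print a[i]a[j]}' items | wc -l)
p_norep=$(awk '{a[++c]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=c;j++) if (a[i] != a[j]) print a[i]a[j]}' items | wc -l)
c_rep=$(awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i;j<c;j++) print a[i]a[j]}' items | wc -l)
c_norep=$(awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i+1;j<c;j++) print a[i]a[j]}' items | wc -l)

# Each line shows the counted total followed by the formula value
echo "permutation with repetition:    $((p_rep)) $((n * n))"
echo "permutation without repetition: $((p_norep)) $((n * (n - 1)))"
echo "combination with repetition:    $((c_rep)) $((n * (n + 1) / 2))"
echo "combination without repetition: $((c_norep)) $((n * (n - 1) / 2))"
```

On each line the two numbers should agree.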

Here I avoid repetition by "off-setting" the inner "for" loop by 1: "j" starts at i+1 instead of i, so an item is never paired with itself, as seen below:

[Screenshot: comb4]
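
The end of the loop is worth a look as well. In this sketch (again with the loop counters added to the print statement, which is my guess at how the steps were printed) tail shows the last 3 of the 15 lines:

```shell
# Recreate the example list (filename "items", as in the post)
printf '%s\n' a b c x y z > items

# Print the loop counters next to each pair and keep only the last 3 lines
awk '{a[c++]=$0} END {for (i=0;i<c;i++) for (j=i+1;j<c;j++) print "i=" i, "j=" j, a[i]a[j]}' items | tail -n 3
# i=3 j=4 xy
# i=3 j=5 xz
# i=4 j=5 yz
```

There is no line for i=5: the inner loop would start at j=6, which already fails the test j<c, so the last item is never the first member of a pair.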

Last update: 2025-03-07
The blog posts on this website are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.