RethinkDB: Pivot Table Example

Making a pivot table is a relatively common reporting task. I thought it’d be interesting to demonstrate how to do this with RethinkDB, as it shows how to build a complex query.

In my sample data, we’ll use a table that lists salary information for a list of people. We’ll generate a pivot table that shows average salary by department, role and level.

The first step is to compute all the averages, so that we can apply the pivot:

r.db('test')
 .table('salaries')
 .group('Department', 'role', 'level')
 .avg('salary')
 .ungroup()

In SQL, this would have been equivalent to:

SELECT Department, role, level, avg(salary)
GROUP BY 1, 2, 3

Notice how in SQL this would only return rows for values that exist. Values for each combination of role and level are in different rows, rather than columns.

To start the pivot, we first turn each result into an array:

 ...
 .map( (doc) =>
    [doc('group')(0), 
     doc('group')(1), 
     doc('group')(2), 
     doc('reduction')]
 )
value
1
[“Maintenance”, “Engineer”, 2, 72000]
2

[“Maintenance”, “Manager”, 1, 120000]

3

[“New Product Development”, “Designer”, 1, 40500]

4

[“New Product Development”, “Designer”, 2, 64500]

6

[“New Product Development”, “Engineer”, 1, 40000]

7

[“New Product Development”, “Engineer”, 2, 65000]

8

[“New Product Development”, “Engineer”, 3, 112500]

9

[“New Product Development”, “Manager”, 1, 103500]

This allows us to apply transformations to all the row values, so we can use some as column names.

Once we do this, we can combine the role and level fields into column names – i.e. to produce names like “Engineer_1”, “Engineer_2”, “Manager_1”, “Manager_2”, and so on.

...
  .map( (doc) =>
    r.object(
     'Department', 
     doc(0), 
     doc(1).add('_').add(doc(2).coerceTo('string')), 
     doc(3))
  )

The above expression is requires you to coerce numeric values to strings – “add”expects to operate on two values of the same type.

If you get null reference errors, you can start this with “expr(”).add(…)” to ensure the absence of null values.

Department
Engineer_2
Manager_1
Designer_1
Designer_2
Engineer_1
Engineer_3
1
Maintenance
72000
undefined
undefined
undefined
undefined
undefined
2
Maintenance
undefined
120000
undefined
undefined
undefined
undefined
3
New Product Development
undefined
undefined
40500
undefined
undefined
undefined
4
New Product Development
undefined
undefined
undefined
64500
undefined
undefined
5
New Product Development
undefined
undefined
undefined
undefined
undefined
undefined
6
New Product Development
undefined
undefined
undefined
undefined
40000
undefined
7
New Product Development
65000
undefined
undefined
undefined
undefined
undefined
8
New Product Development
undefined
undefined
undefined
undefined
undefined
112500
9
New Product Development
undefined
103500
undefined
undefined
undefined
undefined
...
 .group('Department')
 .reduce(
   (a, b) => a.merge(b)
 )

This produces a nice looking result. Note that we still have ‘reduction’ bit.

group
reduction.Department
reduction.Engineer_2
reduction.Manager_1
reduction.Designer_1
reduction.Designer_2
reduction.Engineer_1
reduction.Engineer_3
1
Maintenance
Maintenance
72000
120000
undefined
undefined
undefined
undefined
2
New Product Development
New Product Development
65000
103500
40500
64500
40000
112500
We can fix this with one more transformation, to remove the effects of ‘group’ and just get the child objects:

.ungroup()
.map( (doc) => doc(‘reduction’) )
Department
Engineer_2
Manager_1
Designer_1
Designer_2
Engineer_1
Engineer_3
1
Maintenance
72000
120000
undefined
undefined
undefined
undefined
2
New Product Development
65000
103500
40500
64500
40000
112500
And here is the final query, in all it’s glory:
r.db('test')
 .table('salaries')
 .group('Department', 'role', 'level')
 .avg('salary')
 .ungroup()
 .map( (doc) => [
   doc('group')(0), 
   doc('group')(1), 
   doc('group')(2), 
   doc('reduction')
  ])
 .map( (doc) =>
   r.object(
     'Department', 
     doc(0), 
     doc(1).add('_').add(doc(2).coerceTo('string')), 
     doc(3))
 )
 .group('Department')
 .reduce(
   (a, b) => a.merge(b)
 )
 .ungroup()
 .map( 
   (doc) => doc('reduction') 
 )