Postgres doesn’t have a built-in pivot table feature, but they are relatively easy to construct.
Let’s say you have a table of users, and a table of events, and want to make a pivot table that shows how often each event occurred.
create table users (name varchar);
insert into users ('gary');
insert into users ('melissa');
insert into users ('ella');
insert into users ('chloe');
create table events (user_name varchar, event varchar);
insert
into events (user_name, event)
values ('gary', 'event_1');
insert
into events (user_name, event)
values ('gary', 'event_1');
insert
into events (user_name, event)
values ('gary', 'event_2');
insert
into events (user_name, event)
values ('melissa', 'event_2');
insert
into events (user_name, event)
values ('ella', 'event_1');
The pivot table output should look like this:
user event_1 event 2 gary 2 1 melissa 0 2 ella 1 0 ziti 0 0
You can get counts easily enough, but the “0” records will drop out, which is a bit of a pain:
select
user_name,
event,
count(*) event_count
from events
group by 1, 2
What we really want is a list of all combinations of users and events, combined with their counts. Normally you don’t do things like this, but this can be obtained with a cartesian join.
You’ll want to make sure to itemize the names (don’t use select *) – this will help with the next step.
select users.name, event_list.event
from users
join (
select distinct event
from events
) as event_list on 1=1
The above query gives us the following data:
gary event_2 gary event_1 melissa event_2 melissa event_1 ella event_2 ella event_1 chloe event_2 chloe event_1
If you have a static list of events, you can also join to it directly:
select users.name, event_list.event
from users
join (
select unnest(array['event_1', 'event_2']) event
) as event_list on 1=1
Now that we have this, we can join it back to the count data:
select
users.name,
event_list.event,
coalesce(event_count, 0) event_count
from users
join (
select distinct event
from events
) as event_list on 1=1
left join (
select
user_name,
event,
count(*) event_count
from events
group by 1, 2
) counts on counts.user_name = users.name
and counts.event = event_list.event
This new dataset is much better, as it includes every possible cell that we would want in the pivot table:
name event event_count gary event_1 2 gary event_2 1 melissa event_1 0 melissa event_2 1 ella event_1 1 ella event_2 0 chloe event_1 0 chloe event_2 0
To get these all into one row, we can wrap this in array aggregation. Note that the values in the arrays need to be sorted by the event type, or else each record will get counts in a different order.
select
name,
array_agg(event order by event) events,
array_agg(event_count order by event) counts
from (
...
) aggregated_date
group by name
This returns:
name events counts chloe {"event 1","event 2"} {0,0} ella {"event 1","event 2"} {1,0} gary {"event 1","event 2"} {2,1} melissa {"event 1","event 2"} {0,1}
Now getting pivot table data is easy through array indexing. Note that Postgres arrays start at 1 (!), and obviously this solution won’t help you if you have an indeterminite number of events.
select
name,
counts[1] as event_1,
counts[2] as event_2
from (
...
) pivot_data
name event_1 event_2 chloe 0 0 ella 1 0 gary 2 1 melissa 0 1
If you’re following along at home, here is the final result:
-- Create pivot table
select
name,
counts[1] as event_1,
counts[2] as event_2
from (
-- Combine all the counts into one row per user
select
name,
array_agg(event order by event) events,
array_agg(event_count order by event) counts
from (
-- Create counts of "0" for events that haven't occurred
select
users.name,
event_list.event,
coalesce(event_count, 0) event_count
from users
join (
-- Compute the full list of events
select distinct event
from events
) as event_list on 1 = 1
left join (
-- Compute how often each event happened per user
select
user_name,
event,
count(*) as event_count
from events
group by 1, 2
) counts
on counts.user_name = users.name
and counts.event = event_list.event
) aggregated_date
group by name
) pivot_data
order by name