Published at Mar 22, 2024

Understanding the rank() and dense_rank() Functions in PostgreSQL

In PostgreSQL, rank() and dense_rank() are general-purpose window functions that are used to order values and assign them rank in the form of numbers on where they fall in relation to each other.

Both of these functions are used with an OVER clause along with either PARTITION BY, ORDER BY, or ROWS BETWEEN. When used with rank() and dense_rank(), they affect the windows in these ways:

ORDER BY: specifies the column whose values you wish to rank
PARTITION BY: groups the rankings

The difference between rank() and dense_rank() is in how they handle identical values. Let’s say that you are assigning rank based on a grade value, and you end up with two results that are both equal to 85. If you use rank(), both of these values will be given the same rank, and the next rank will be skipped. So if they both tied for 3, then they will both be given a 3, 4 will be skipped, and the next highest rank will be 5. If you use dense_rank() on the same example, 4 won’t be skipped. It will be the next rank value.

rank() syntax:

rank () → bigint

dense_rank() syntax:

dense_rank () → bigint

rank() syntax with window functions:

rank() OVER ([PARTITION BY <columns>] [ORDER BY <columns>])

dense_rank() syntax with window functions:

dense_rank() OVER ([PARTITION BY <columns>] [ORDER BY <columns>])

rank() and dense_rank() in PostgreSQL: Examples

The data set that we will be using for examples contains the grades of three students in four subjects.

student	subject	grade
Jim	Science	84
Jim	Math	93
Jim	History	79
Jim	English	75
Mary	Science	81
Mary	Math	81
Mary	History	80
Mary	English	88
Sam	Science	84
Sam	Math	90
Sam	History	79
Sam	English	92

Find the rank of every row

If we wanted to know which students needed the most help in specific subjects, we could run a query like this:

SELECT student, subject, grade, rank() OVER(ORDER BY grade DESC) FROM grades;

We want the rank of 1 to go to the highest grade. In order to do that, we order the results in the window frame in descending order so the highest grades are at the top.

Here are the results:

student	subject	grade	rank
Jim	Math	93	1
Sam	English	92	2
Sam	Math	90	3
Mary	English	88	4
Sam	Science	84	5
Jim	Science	84	5
Mary	Math	81	7
Mary	Science	81	7
Mary	History	80	9
Jim	History	79	10
Sam	History	79	10
Jim	English	75	12

From this, we can tell that Jim needs help in both History and English even though he is at the top of the class in Math. Sam also needs help with History. Now, you will notice that both 7 and 10 are repeated, and there is no 8 or 11. This is because we are using rank(). When we use dense_rank() instead, there will be no gaps.

Here is that result:

student	subject	grade	dense_rank
Jim	Math	93	1
Sam	English	92	2
Sam	Math	90	3
Mary	English	88	4
Sam	Science	84	5
Jim	Science	84	5
Mary	Math	81	6
Mary	Science	81	6
Mary	History	80	7
Jim	History	79	8
Sam	History	79	8
Jim	English	75	9

Using partitions with rank()

If we want to see how each student ranks per subject, we can add a PARTITION BY clause, which can group the records in the window. Here is the query:

SELECT student, subject, RANK() OVER(PARTITION BY subject ORDER BY grade DESC) FROM grades ORDER BY student, rank;

And here are the results:

student	subject	rank
Jim	Math	1
Jim	Science	1
Jim	History	2
Jim	English	3
Mary	History	1
Mary	English	2
Mary	Science	3
Mary	Math	3
Sam	English	1
Sam	Science	1
Sam	History	2
Sam	Math	2

Finding the rank of an average

To see how each student’s average grade across all the subjects rank, we can use a subquery to do that. Here is that query:

SELECT student, average, RANK() OVER(ORDER BY average DESC) FROM ( SELECT student, avg(grade) average FROM grades GROUP BY student) AS subquery;

And here are the results:

student	average	rank
Sam	86.25	1
Jim	82.75	2
Mary	82.50	3

Finding the rank of a value in grouped rows

For this example, let’s use another data set. This one contains the temperature and precipitation data for a couple of cities.

id	day	city	temperature	precipitation
17	2021-09-04	Miami	68.36	0.00
19	2021-09-05	Miami	72.50	0.00
11	2021-09-01	Miami	65.30	0.28
13	2021-09-02	Miami	64.40	0.79
18	2021-09-04	Atlanta	67.28	0.00
12	2021-09-01	Atlanta	63.14	0.20
14	2021-09-02	Atlanta	62.60	0.59
16	2021-09-03	Atlanta	62.60	0.39
15	2021-09-03	Miami	71.60	0.47
20	2021-09-05	Atlanta	70.80	0.00

Let’s say we have a temperature, and we want to see where it ranks in relation to the temperatures in our table. For example, we want to see how 68 degrees ranks in both Miami and Atlanta for this range of dates.

Here is the query:

SELECT city, rank(68) WITHIN GROUP ( ORDER BY temperature DESC) FROM city_data GROUP BY city;

We grouped the results by the city so we get the rank of 68 degrees in each. Then we use the WITHIN GROUP clause, which specifies how to sort the rows that are grouped by the aggregate function. Here, we order the rows by the descending temperature because we want the highest temperature to get a rank of 1.

Here are the results:

city	rank
Atlanta	2
Miami	4

This tells us that 68 degrees would be the second-highest temperature for Atlanta and the fourth-highest temperature in Miami for the date range in the table.

Next Steps

To learn more about rank() and dense_rank() and how to use them in PostgreSQL, check out PostgreSQL’s documentation on window functions. For more examples on how to use them in your own TimescaleDB SQL queries, see these Timescale documentation sections: