Greatest N per Group SQL Problem

Few days back, I encountered a problem where I need to find the last update date for a pricing and I seemed to stuck at the SQL query. Initially I was thinking about using subquery, an old habit of using PostgreSQL for many years. But as you know, this going to be tricky as MySQL is always kind of dog slow at subquery.

Some googling lead me to this simple answer that I realized I had over-analyzed the problem. The question was so common that the Stack Overflow community created a special greatest-n-per-group tag for it.

Let's illustrate this with an example. Imagine you have a employee table with a schema as follows:
employee
- emp_id
- dept
- salary

How do you find the highest paid employee in each department ? The answer was so simple, just use MAX and GROUP BY. The answer in SQL as shown which is so generic and should works with any database system without any optimization.
SELECT emp_id, MAX(salary)
FROM employee
GROUP BY dept

Yes, that all. No complex subquery or joins. Duh ! What was I thinking ?

No comments:

Post a Comment