SQL Learning Hub
SQL ROW_NUMBER() Window Function
SQL ROW_NUMBER() Window Function
Assign sequential numbers to rows and solve common SQL challenges
Understanding ROW_NUMBER()
The ROW_NUMBER() function is one of the most versatile window functions in SQL. It assigns a unique, sequential integer to each row within a partition of a result set. Think of it as adding a counter to your query results, giving each row its own number based on a specified order.
Basic Syntax
ROW_NUMBER() OVER (
[PARTITION BY column1, column2, ...]
ORDER BY column3 [ASC|DESC], column4 [ASC|DESC], ...
)
Common ROW_NUMBER() Interview Questions
- How would you find the most recent interaction for each user?
- How can you paginate results with 10 rows per page?
- What's the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()?
- How would you identify and remove duplicate rows in a table?
Key Components of ROW_NUMBER()
The OVER Clause
The OVER clause defines the window of rows that the ROW_NUMBER() function operates on. Think of it as defining the scope and organization of your numbered list.
PARTITION BY (Optional)
The PARTITION BY clause divides the result set into partitions or groups. Row numbering starts at 1 for each new partition. This is like creating separate numbered lists for different categories.
-- Restart numbering for each user_id
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY upload_date DESC)
ORDER BY (Required)
The ORDER BY clause determines the sequence in which row numbers are assigned. This is mandatory for ROW_NUMBER() since it needs to know in what order to number the rows.
-- Number rows based on view count (highest first)
ROW_NUMBER() OVER (ORDER BY views DESC)
Practical ROW_NUMBER() Examples
Basic Row Numbering
The simplest use of ROW_NUMBER() is to add sequential numbers to your query results:
-- Add row numbers to videos ordered by views
SELECT
ROW_NUMBER() OVER (ORDER BY views DESC) AS row_num,
title,
views
FROM
Videos;
This query assigns numbers to videos starting with 1 for the most-viewed video, 2 for the second most-viewed, and so on. It's like creating a "Top Videos" chart where each video has its ranking position.
Partitioned Row Numbering
Use PARTITION BY to restart numbering for each group:
-- Number videos by view count within each user's content
SELECT
u.username,
v.title,
v.views,
ROW_NUMBER() OVER (PARTITION BY v.user_id ORDER BY v.views DESC) AS video_rank
FROM
Videos v
JOIN
Users u ON v.user_id = u.user_id;
This query numbers each creator's videos separately. Every user will have their own video ranked #1, #2, etc., based on views. This is perfect for finding each creator's most popular content.
Understanding the Results
If alice has 3 videos and bob has 2 videos, the results would look something like:
username | title | views | video_rank |
---|---|---|---|
alice | Funny Cat | 1000 | 1 |
alice | Cooking Tips | 500 | 2 |
bob | Dance Challenge | 2000 | 1 |
Notice how the video_rank restarts at 1 for each new username because of the PARTITION BY clause.
Common Use Cases for ROW_NUMBER()
1. Implementing Pagination
ROW_NUMBER() is perfect for paginating large result sets efficiently:
-- Get the second page of videos (rows 11-20)
WITH NumberedVideos AS (
SELECT
ROW_NUMBER() OVER (ORDER BY upload_date DESC) AS row_num,
video_id,
title,
views,
upload_date
FROM
Videos
)
SELECT
video_id,
title,
views,
upload_date
FROM
NumberedVideos
WHERE
row_num BETWEEN 11 AND 20;
This pattern is widely used in web applications to implement "Page 1, 2, 3..." navigation. It's more efficient than using OFFSET/LIMIT for large datasets, especially when deep pagination is required.
2. Finding the Most Recent Activity
Use ROW_NUMBER() to find the most recent interaction for each user:
-- Get the most recent interaction for each user
WITH RankedInteractions AS (
SELECT
user_id,
video_id,
interaction_type,
timestamp,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS interaction_rank
FROM
Interactions
)
SELECT
ri.user_id,
u.username,
v.title,
ri.interaction_type,
ri.timestamp
FROM
RankedInteractions ri
JOIN
Users u ON ri.user_id = u.user_id
JOIN
Videos v ON ri.video_id = v.video_id
WHERE
interaction_rank = 1;
This query assigns rank 1 to each user's most recent interaction, then filters to show only those top-ranked interactions. It's like finding the "last seen" activity for each user on the platform.
3. Identifying and Removing Duplicates
ROW_NUMBER() excels at finding duplicate records:
-- Find duplicate video titles
WITH DuplicateTitles AS (
SELECT
title,
user_id,
upload_date,
ROW_NUMBER() OVER (PARTITION BY title ORDER BY upload_date) AS title_instance
FROM
Videos
)
SELECT
dt.title,
dt.upload_date,
u.username
FROM
DuplicateTitles dt
JOIN
Users u ON dt.user_id = u.user_id
WHERE
title_instance > 1;
This query identifies videos with duplicate titles, keeping the original (first uploaded) title as instance 1, and marking all subsequent uploads with higher instance numbers. Filtering for title_instance > 1 shows only the duplicates.
4. Creating Top-N Reports
ROW_NUMBER() makes it easy to create "Top 3" or "Top 5" reports:
-- Get the top 3 most-viewed videos for each user
WITH RankedVideos AS (
SELECT
v.user_id,
v.title,
v.views,
ROW_NUMBER() OVER (PARTITION BY v.user_id ORDER BY v.views DESC) AS view_rank
FROM
Videos v
)
SELECT
u.username,
rv.title,
rv.views,
rv.view_rank AS "Rank"
FROM
RankedVideos rv
JOIN
Users u ON rv.user_id = u.user_id
WHERE
view_rank <= 3
ORDER BY
u.username, rv.views DESC;
This query creates a report showing each user's top 3 most-viewed videos. The PARTITION BY groups by user, and the WHERE clause limits it to ranks 1-3. The final ORDER BY ensures a clean, sorted output.
Performance Tips for ROW_NUMBER()
While ROW_NUMBER() is powerful, here are some tips to keep your queries efficient:
- Use Indexes: Ensure the columns in ORDER BY and PARTITION BY clauses are indexed, as these determine the sorting and grouping overhead.
- Limit Rows Early: Apply WHERE clauses before using ROW_NUMBER() to reduce the number of rows processed.
- Avoid Overuse: For simple ranking, consider if a basic ORDER BY would suffice instead of a window function.
- Test Alternatives: For pagination or deduplication, compare with LIMIT/OFFSET or DISTINCT to see which performs better on your data.
Ready for hands-on SQL practice?
We have 200+ questions on real companies and real products.
Ask Dawn AI