LogoInterview Master

SQL Learning Hub

SQL ROW_NUMBER() Window Function

Understanding ROW_NUMBER()

The ROW_NUMBER() function is one of the most versatile window functions in SQL. It assigns a unique, sequential integer to each row within a partition of a result set. Think of it as adding a counter to your query results, giving each row its own number based on a specified order.

Basic Syntax

ROW_NUMBER() OVER (
    [PARTITION BY column1, column2, ...]
    ORDER BY column3 [ASC|DESC], column4 [ASC|DESC], ...
)

Common ROW_NUMBER() Interview Questions

  • How would you find the most recent interaction for each user?
  • How can you paginate results with 10 rows per page?
  • What's the difference between ROW_NUMBER(), RANK(), and DENSE_RANK()?
  • How would you identify and remove duplicate rows in a table?

Key Components of ROW_NUMBER()

The OVER Clause

The OVER clause defines the window of rows that the ROW_NUMBER() function operates on. Think of it as defining the scope and organization of your numbered list.

PARTITION BY (Optional)

The PARTITION BY clause divides the result set into partitions or groups. Row numbering starts at 1 for each new partition. This is like creating separate numbered lists for different categories.

-- Restart numbering for each user_id
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY upload_date DESC)

ORDER BY (Required)

The ORDER BY clause determines the sequence in which row numbers are assigned. This is mandatory for ROW_NUMBER() since it needs to know in what order to number the rows.

-- Number rows based on view count (highest first)
ROW_NUMBER() OVER (ORDER BY views DESC)

Practical ROW_NUMBER() Examples

Basic Row Numbering

The simplest use of ROW_NUMBER() is to add sequential numbers to your query results:

-- Add row numbers to videos ordered by views
SELECT 
  ROW_NUMBER() OVER (ORDER BY views DESC) AS row_num,
  title,
  views
FROM 
  Videos;

This query assigns numbers to videos starting with 1 for the most-viewed video, 2 for the second most-viewed, and so on. It's like creating a "Top Videos" chart where each video has its ranking position.

Partitioned Row Numbering

Use PARTITION BY to restart numbering for each group:

-- Number videos by view count within each user's content
SELECT 
  u.username,
  v.title,
  v.views,
  ROW_NUMBER() OVER (PARTITION BY v.user_id ORDER BY v.views DESC) AS video_rank
FROM 
  Videos v
JOIN 
  Users u ON v.user_id = u.user_id;

This query numbers each creator's videos separately. Every user will have their own video ranked #1, #2, etc., based on views. This is perfect for finding each creator's most popular content.

Understanding the Results

If alice has 3 videos and bob has 2 videos, the results would look something like:

usernametitleviewsvideo_rank
aliceFunny Cat10001
aliceCooking Tips5002
bobDance Challenge20001

Notice how the video_rank restarts at 1 for each new username because of the PARTITION BY clause.

Common Use Cases for ROW_NUMBER()

1. Implementing Pagination

ROW_NUMBER() is perfect for paginating large result sets efficiently:

-- Get the second page of videos (rows 11-20)
WITH NumberedVideos AS (
    SELECT 
        ROW_NUMBER() OVER (ORDER BY upload_date DESC) AS row_num,
        video_id,
        title,
        views,
        upload_date
    FROM 
        Videos
)
SELECT 
    video_id,
    title,
    views,
    upload_date
FROM 
    NumberedVideos
WHERE 
    row_num BETWEEN 11 AND 20;

This pattern is widely used in web applications to implement "Page 1, 2, 3..." navigation. It's more efficient than using OFFSET/LIMIT for large datasets, especially when deep pagination is required.

2. Finding the Most Recent Activity

Use ROW_NUMBER() to find the most recent interaction for each user:

-- Get the most recent interaction for each user
WITH RankedInteractions AS (
    SELECT 
        user_id,
        video_id,
        interaction_type,
        timestamp,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS interaction_rank
    FROM 
        Interactions
)
SELECT 
    ri.user_id,
    u.username,
    v.title,
    ri.interaction_type,
    ri.timestamp
FROM 
    RankedInteractions ri
JOIN 
    Users u ON ri.user_id = u.user_id
JOIN 
    Videos v ON ri.video_id = v.video_id
WHERE 
    interaction_rank = 1;

This query assigns rank 1 to each user's most recent interaction, then filters to show only those top-ranked interactions. It's like finding the "last seen" activity for each user on the platform.

3. Identifying and Removing Duplicates

ROW_NUMBER() excels at finding duplicate records:

-- Find duplicate video titles
WITH DuplicateTitles AS (
    SELECT 
        title,
        user_id,
        upload_date,
        ROW_NUMBER() OVER (PARTITION BY title ORDER BY upload_date) AS title_instance
    FROM 
        Videos
)
SELECT 
    dt.title,
    dt.upload_date,
    u.username
FROM 
    DuplicateTitles dt
JOIN 
    Users u ON dt.user_id = u.user_id
WHERE 
    title_instance > 1;

This query identifies videos with duplicate titles, keeping the original (first uploaded) title as instance 1, and marking all subsequent uploads with higher instance numbers. Filtering for title_instance > 1 shows only the duplicates.

4. Creating Top-N Reports

ROW_NUMBER() makes it easy to create "Top 3" or "Top 5" reports:

-- Get the top 3 most-viewed videos for each user
WITH RankedVideos AS (
    SELECT 
        v.user_id,
        v.title,
        v.views,
        ROW_NUMBER() OVER (PARTITION BY v.user_id ORDER BY v.views DESC) AS view_rank
    FROM 
        Videos v
)
SELECT 
    u.username,
    rv.title,
    rv.views,
    rv.view_rank AS "Rank"
FROM 
    RankedVideos rv
JOIN 
    Users u ON rv.user_id = u.user_id
WHERE 
    view_rank <= 3
ORDER BY 
    u.username, rv.views DESC;

This query creates a report showing each user's top 3 most-viewed videos. The PARTITION BY groups by user, and the WHERE clause limits it to ranks 1-3. The final ORDER BY ensures a clean, sorted output.

Performance Tips for ROW_NUMBER()

While ROW_NUMBER() is powerful, here are some tips to keep your queries efficient:

  • Use Indexes: Ensure the columns in ORDER BY and PARTITION BY clauses are indexed, as these determine the sorting and grouping overhead.
  • Limit Rows Early: Apply WHERE clauses before using ROW_NUMBER() to reduce the number of rows processed.
  • Avoid Overuse: For simple ranking, consider if a basic ORDER BY would suffice instead of a window function.
  • Test Alternatives: For pagination or deduplication, compare with LIMIT/OFFSET or DISTINCT to see which performs better on your data.

Ready for hands-on SQL practice?

We have 200+ questions on real companies and real products.

Find a question to practice
Dawn AI
Have questions?
Ask Dawn AI