LogoInterview Master

SQL Learning Hub

SQL AVG Function

Understanding the SQL AVG Function

The AVG function is a core SQL aggregate function used to calculate the average (arithmetic mean) value of a set of numbers. It sums the values in a specified numeric column and divides by the count of non-NULL values, providing a crucial measure of central tendency for your data.

Basic Syntax

Like SUM, AVG operates exclusively on columns with numeric data types (e.g., INTEGER, FLOAT, DECIMAL).

Common SQL AVG Interview Questions

  • Explain precisely how the AVG function handles NULL values.
  • What is the difference between the mean (calculated by AVG) and the median? When might median be preferred?
  • How can you calculate a weighted average in SQL (since AVG provides an unweighted mean)?
  • Provide a real-world example where using AVG with GROUP BY would be insightful.
  • How would you find rows with values above or below the overall average?

AVG Function Basics

Simple Average Calculation

AVG calculates the mean by summing the non-NULL values in the column and dividing by the count of those non-NULL values.

Result: A single row with the average value of the views column. If views were [100, 200, NULL, 300], the average would be (100 + 200 + 300) / 3 = 200.

AVG and NULL Value Handling

This is a critical point: AVG ignores NULL values entirely. It does not treat them as zero or include them in the count when calculating the average.

If you need to treat NULLs as 0 for the average calculation, you must explicitly use AVG(COALESCE(column_name, 0)). This would change the average in the example above to (10 + 20 + 0) / 3 = 10.

Formatting AVG Results (ROUND)

Since AVG often produces results with decimal places (even when averaging integers, depending on the database), using ROUND, CAST, or specific formatting functions is common for presentation.

Mean (AVG) vs. Median

  • Mean (AVG): Sum of values divided by the count of values. Sensitive to outliers (extreme values).
  • Median: The middle value when data is sorted. Less sensitive to outliers.

SQL databases typically provide AVG directly. Calculating the median usually requires window functions like PERCENTILE_CONT or NTILE, or more complex self-joins, depending on the specific SQL dialect. For skewed distributions (like income or view counts), the median often gives a better sense of the "typical" value than the mean.

Using AVG with GROUP BY

AVG becomes particularly insightful when used with GROUP BY to compute the average value for different segments or categories within your data.

Calculating Averages per Category

Compute the average value for each distinct group.

This identifies users whose videos, on average, perform better or worse than others.

Combining AVG with Other Aggregates

Provide richer context by including counts or sums alongside the average.

Seeing the video count alongside the average helps interpret the result (e.g., a high average based on only one video is less significant than a high average across many videos).

Filtering Groups Based on AVG (HAVING Clause)

Use the HAVING clause to select only those groups whose average meets a certain criterion.

HAVING allows filtering based on the result of the AVG() aggregate function after the groups have been formed.

Practical AVG Examples

Example 1: Calculate Average Views

Use AVG to find out the average number of views across all videos:

Example 2: Average with GROUP BY

Calculate average metrics by user. This query groups videos by creator and calculates the average views per video for each user, helping identify the most successful content creators.

Example 3: Filtering with HAVING

Use HAVING to filter groups based on their average values. This query identifies users whose content consistently performs well by filtering for those with an average view count above a threshold.

Common AVG Use Cases

The AVG function is particularly useful for:

  • Performance analysis: AVG(views) to assess typical content performance
  • User behavior: AVG(interactions_per_user) to understand engagement
  • Trend detection: AVG(views) by time period to spot changes over time
  • Benchmarking: Comparing individual performance to average performance

Summary

The AVG function provides the arithmetic mean, a key measure of central tendency in SQL:

  • It calculates the sum of non-NULL numeric values and divides by the count of non-NULL values.
  • Crucially, it ignores NULLs. Use COALESCE(column, 0) if you need to include NULLs as zeros in the average calculation.
  • Use with WHERE to calculate averages for specific data subsets.
  • Combine with GROUP BY to find average values for different categories.
  • Use with HAVING to filter groups based on their calculated average.
  • Often used with ROUND or formatting functions for clearer presentation.
  • Remember it calculates the mean, which can be skewed by outliers; the median might sometimes be more appropriate but requires different techniques.

AVG is indispensable for understanding typical values, comparing group performance, and identifying trends within your numeric data.

Test Your Knowledge

Take a quick quiz to reinforce what you've learned about SQL AVG Function

Ready for hands-on SQL practice?

We have 200+ questions on real companies and real products.

Find a question to practice
Dawn AI