SQL Learning Hub
SQL AVG Function
SQL AVG Function
Learn how to calculate average values (mean) using SQL's AVG aggregate function to discover central tendencies and insights in your numeric data.
Understanding the SQL AVG Function
The AVG
function is a core SQL aggregate function used to calculate the average (arithmetic mean) value of a set of numbers. It sums the values in a specified numeric column and divides by the count of non-NULL values, providing a crucial measure of central tendency for your data.
Basic Syntax
Like SUM
, AVG
operates exclusively on columns with numeric data types (e.g., INTEGER, FLOAT, DECIMAL).
Common SQL AVG Interview Questions
- Explain precisely how the
AVG
function handles NULL values. - What is the difference between the mean (calculated by
AVG
) and the median? When might median be preferred? - How can you calculate a weighted average in SQL (since
AVG
provides an unweighted mean)? - Provide a real-world example where using
AVG
withGROUP BY
would be insightful. - How would you find rows with values above or below the overall average?
AVG Function Basics
Simple Average Calculation
AVG
calculates the mean by summing the non-NULL values in the column and dividing by the count of those non-NULL values.
Result: A single row with the average value of the views
column. If views were [100, 200, NULL, 300], the average would be (100 + 200 + 300) / 3 = 200.
AVG and NULL Value Handling
This is a critical point: AVG
ignores NULL values entirely. It does not treat them as zero or include them in the count when calculating the average.
If you need to treat NULLs as 0 for the average calculation, you must explicitly use AVG(COALESCE(column_name, 0))
. This would change the average in the example above to (10 + 20 + 0) / 3 = 10.
Formatting AVG Results (ROUND)
Since AVG
often produces results with decimal places (even when averaging integers, depending on the database), using ROUND
, CAST
, or specific formatting functions is common for presentation.
Mean (AVG) vs. Median
- Mean (
AVG
): Sum of values divided by the count of values. Sensitive to outliers (extreme values). - Median: The middle value when data is sorted. Less sensitive to outliers.
SQL databases typically provide AVG
directly. Calculating the median usually requires window functions like PERCENTILE_CONT
or NTILE
, or more complex self-joins, depending on the specific SQL dialect. For skewed distributions (like income or view counts), the median often gives a better sense of the "typical" value than the mean.
Using AVG with GROUP BY
AVG
becomes particularly insightful when used with GROUP BY
to compute the average value for different segments or categories within your data.
Calculating Averages per Category
Compute the average value for each distinct group.
This identifies users whose videos, on average, perform better or worse than others.
Combining AVG with Other Aggregates
Provide richer context by including counts or sums alongside the average.
Seeing the video count alongside the average helps interpret the result (e.g., a high average based on only one video is less significant than a high average across many videos).
Filtering Groups Based on AVG (HAVING Clause)
Use the HAVING
clause to select only those groups whose average meets a certain criterion.
HAVING
allows filtering based on the result of the AVG()
aggregate function after the groups have been formed.
Practical AVG Examples
Example 1: Calculate Average Views
Use AVG to find out the average number of views across all videos:
Example 2: Average with GROUP BY
Calculate average metrics by user. This query groups videos by creator and calculates the average views per video for each user, helping identify the most successful content creators.
Example 3: Filtering with HAVING
Use HAVING to filter groups based on their average values. This query identifies users whose content consistently performs well by filtering for those with an average view count above a threshold.
Common AVG Use Cases
The AVG function is particularly useful for:
- Performance analysis:
AVG(views)
to assess typical content performance - User behavior:
AVG(interactions_per_user)
to understand engagement - Trend detection:
AVG(views)
by time period to spot changes over time - Benchmarking: Comparing individual performance to average performance
Summary
The AVG
function provides the arithmetic mean, a key measure of central tendency in SQL:
- It calculates the sum of non-NULL numeric values and divides by the count of non-NULL values.
- Crucially, it ignores NULLs. Use
COALESCE(column, 0)
if you need to include NULLs as zeros in the average calculation. - Use with
WHERE
to calculate averages for specific data subsets. - Combine with
GROUP BY
to find average values for different categories. - Use with
HAVING
to filter groups based on their calculated average. - Often used with
ROUND
or formatting functions for clearer presentation. - Remember it calculates the mean, which can be skewed by outliers; the median might sometimes be more appropriate but requires different techniques.
AVG
is indispensable for understanding typical values, comparing group performance, and identifying trends within your numeric data.
Test Your Knowledge
Take a quick quiz to reinforce what you've learned about SQL AVG Function
Ready for hands-on SQL practice?
We have 200+ questions on real companies and real products.