SQL Learning Hub
SQL AVG Function
SQL AVG Function
Learn how to calculate the average of numeric values in a column
Understanding the SQL AVG Function
The AVG function is a core SQL aggregate function used to calculate the average (arithmetic mean) value of a set of numbers. It sums the values in a specified numeric column and divides by the count of non-NULL values, providing a crucial measure of central tendency for your data.
Basic Syntax
Like SUM, AVG operates exclusively on columns with numeric data types (e.g., INTEGER, FLOAT, DECIMAL).
Common SQL AVG Interview Questions
- Explain precisely how the
AVGfunction handles NULL values. - What is the difference between the mean (calculated by
AVG) and the median? When might median be preferred? - How can you calculate a weighted average in SQL (since
AVGprovides an unweighted mean)? - Provide a real-world example where using
AVGwithGROUP BYwould be insightful. - How would you find rows with values above or below the overall average?
AVG Function Basics
Simple Average Calculation
AVG calculates the mean by summing the non-NULL values in the column and dividing by the count of those non-NULL values.
Result: A single row with the average value of the views column. If views were [100, 200, NULL, 300], the average would be (100 + 200 + 300) / 3 = 200.
AVG and NULL Value Handling
This is a critical point: AVG ignores NULL values entirely. It does not treat them as zero or include them in the count when calculating the average.
If you need to treat NULLs as 0 for the average calculation, you must explicitly use AVG(COALESCE(column_name, 0)). This would change the average in the example above to (10 + 20 + 0) / 3 = 10.
Formatting AVG Results (ROUND)
Since AVG often produces results with decimal places (even when averaging integers, depending on the database), using ROUND, CAST, or specific formatting functions is common for presentation.
Mean (AVG) vs. Median
- Mean (
AVG): Sum of values divided by the count of values. Sensitive to outliers (extreme values). - Median: The middle value when data is sorted. Less sensitive to outliers.
SQL databases typically provide AVG directly. Calculating the median usually requires window functions like PERCENTILE_CONT or NTILE, or more complex self-joins, depending on the specific SQL dialect. For skewed distributions (like income or view counts), the median often gives a better sense of the "typical" value than the mean.
Using AVG with GROUP BY
AVG becomes particularly insightful when used with GROUP BY to compute the average value for different segments or categories within your data.
Calculating Averages per Category
Compute the average value for each distinct group.
This identifies users whose videos, on average, perform better or worse than others.
Combining AVG with Other Aggregates
Provide richer context by including counts or sums alongside the average.
Seeing the video count alongside the average helps interpret the result (e.g., a high average based on only one video is less significant than a high average across many videos).
Filtering Groups Based on AVG (HAVING Clause)
Use the HAVING clause to select only those groups whose average meets a certain criterion.
HAVING allows filtering based on the result of the AVG() aggregate function after the groups have been formed.
Practical AVG Examples
Example 1: Calculate Average Views
Use AVG to find out the average number of views across all videos:
Example 2: Average with GROUP BY
Calculate average metrics by user. This query groups videos by creator and calculates the average views per video for each user, helping identify the most successful content creators.
Example 3: Filtering with HAVING
Use HAVING to filter groups based on their average values. This query identifies users whose content consistently performs well by filtering for those with an average view count above a threshold.
Common AVG Use Cases
The AVG function is particularly useful for:
- Performance analysis:
AVG(views)to assess typical content performance - User behavior:
AVG(interactions_per_user)to understand engagement - Trend detection:
AVG(views)by time period to spot changes over time - Benchmarking: Comparing individual performance to average performance
Summary
The AVG function provides the arithmetic mean, a key measure of central tendency in SQL:
- It calculates the sum of non-NULL numeric values and divides by the count of non-NULL values.
- Crucially, it ignores NULLs. Use
COALESCE(column, 0)if you need to include NULLs as zeros in the average calculation. - Use with
WHEREto calculate averages for specific data subsets. - Combine with
GROUP BYto find average values for different categories. - Use with
HAVINGto filter groups based on their calculated average. - Often used with
ROUNDor formatting functions for clearer presentation. - Remember it calculates the mean, which can be skewed by outliers; the median might sometimes be more appropriate but requires different techniques.
AVG is indispensable for understanding typical values, comparing group performance, and identifying trends within your numeric data.
Ready for hands-on SQL practice?
We have 200+ questions on real companies and real products.