Descriptive Statistics

Descriptive statistics answer the following questions:
  • What is the value that best describes the data set?
  • How much does the data set spread from its average value?
  • What are the smallest and largest numbers in the data set?

Descriptive statistics are reported as summary statistics, which include the Mean, Standard Error, Median, Mode, Standard Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, and Count.
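As a rough sketch of how most of these summary statistics could be computed, the snippet below uses Python's standard `statistics` module. The function name `describe` and the dictionary keys are illustrative choices, not part of any tool mentioned in this article. It assumes the data is a sample, so the standard deviation and variance use the n - 1 denominator. Skewness and kurtosis are left out here.

```python
import math
import statistics


def describe(values):
    """Return common summary statistics for a list of numbers.

    `describe` is a hypothetical helper written for this illustration.
    It treats `values` as a sample (n - 1 denominator).
    """
    n = len(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return {
        "count": n,
        "sum": sum(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance
        "standard_deviation": sd,
        # Standard error of the mean: sd divided by the square root of n.
        "standard_error": sd / math.sqrt(n),
    }


summary = describe([5, 6, 6, 8, 9, 9, 9, 9, 10, 10])
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample data is the same set of ten values used in the mean example in this article.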

Measure of Central Tendency
A measure of central tendency describes a whole set of data with a single value that represents the centre of its distribution.
There are three main measures of central tendency: the mode, the median and the mean.

Mean, Median and Mode
The mean is the sum of the observations divided by the sample size.

The mean of the values 5,6,6,8,9,9,9,9,10,10 is (5+6+6+8+9+9+9+9+10+10)/10 = 8.1
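The same calculation can be checked in a few lines of Python. This is only a minimal sketch: it computes the mean by hand and compares it with `statistics.mean` from the standard library.

```python
import statistics

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]

# Sum of the observations divided by the sample size.
manual_mean = sum(values) / len(values)

# The standard library gives the same answer.
library_mean = statistics.mean(values)

print(manual_mean)   # 8.1
print(library_mean)  # 8.1
```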

Limitation :
It is affected by extreme values. Very large or very small numbers can distort the answer.
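
The median is the middle value of the sorted data. It splits the data in half: half of the values are above the median and half are below it. With an even number of observations, the median is the average of the two middle values.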

Advantage :
It is NOT affected by extreme values. Very large or very small numbers do not affect it.
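To see the difference between the mean and the median, the sketch below appends one extreme value to the data from the mean example. The value 100 is made up purely for illustration. The mean roughly doubles, while the median does not move.

```python
import statistics

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]

# Hypothetical extreme value, added only to illustrate the effect.
with_outlier = values + [100]

mean_before = statistics.mean(values)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(values)
median_after = statistics.median(with_outlier)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")  # 8.10 -> 16.45
print(f"median: {median_before} -> {median_after}")
```

Both medians equal 9, because the middle of the sorted data is unchanged by a single value at the far end.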

The mode is the value that occurs most frequently in a dataset.

Advantage :  
It can be used when the data is not numerical.

Disadvantage :
1. There may be no mode at all if no value is repeated.
2. There may be more than one mode.
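The points above can be sketched with a small helper built on `collections.Counter`. The function name `modes` and the sample lists are hypothetical, chosen only to show the three cases: non-numerical data, more than one mode, and no mode at all. Returning an empty list when nothing repeats is a design choice made for this illustration.

```python
from collections import Counter


def modes(values):
    """Return every most-frequent value, or [] if no value repeats.

    `modes` is an illustrative helper, not a standard library function.
    """
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so there is no mode.
        return []
    return [value for value, count in counts.items() if count == highest]


print(modes(["small", "large", "large", "medium"]))  # ['large']
print(modes([1, 1, 2, 2, 3]))                        # [1, 2]
print(modes([1, 2, 3]))                              # []
```

Python 3.8 and later also provide `statistics.multimode`, which returns all tied values. It differs from this sketch in that it returns every value when none repeats.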

When to use mean, median and mode?
Mean – When your data is not skewed, i.e. roughly symmetric (as in a normal distribution), and there are no extreme values (outliers) in the data set.

Median – When your data is skewed, contains outliers, or is ordinal (ordered categories), e.g. a Likert scale: 1. Strongly dislike 2. Dislike 3. Neutral 4. Like 5. Strongly like.

Mode – When dealing with nominal (unordered categories) data.

As a real-life example, suppose a company is considering expanding into an area and is studying the size of containers that competitors are offering. It would be more interested in the mode because it wants to know which size tends to sell most often.

Measure of Dispersion
Dispersion refers to how spread out the values in a data set are. There are four main measures of variability: Range, Interquartile Range, Standard Deviation and Variance.

The range is simply the largest observation minus the smallest observation.

Advantage :
It is easy to calculate.

Disadvantage :
It is very sensitive to outliers and does not use all the observations in a data set.
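A short sketch of the range, with the interquartile range shown next to it for contrast. The helper names are hypothetical, and the extreme value 100 is made up for illustration. The quartiles come from `statistics.quantiles` (Python 3.8+) with its default method. Other tools use slightly different quartile conventions, so their interquartile range may differ a little.

```python
import statistics


def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


def interquartile_range(values):
    """Third quartile minus first quartile (default 'exclusive' method)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
with_outlier = values + [100]  # hypothetical extreme value

print(value_range(values), value_range(with_outlier))  # 5 95
print(interquartile_range(values), interquartile_range(with_outlier))
```

One extreme value moves the range from 5 to 95, while the interquartile range changes by less than one unit.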
Standard Deviation
It is a measure of the spread of the data about the mean.

Advantage :  
It gives a better picture of your data than just the mean alone.

Disadvantage :  
1. It doesn't give a clear picture about the whole range of the data.
2. It can give a skewed picture if data contain outliers.
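As a minimal sketch, the standard deviation and the variance can be computed by hand and checked against Python's `statistics` module. The variance is the average squared deviation from the mean, and the standard deviation is its square root. Which denominator applies, n for a whole population or n - 1 for a sample, is an assumption you have to make about your own data.

```python
import math
import statistics

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
n = len(values)
mean = sum(values) / n

# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in values)

population_variance = squared_deviations / n    # treats data as the whole population
sample_variance = squared_deviations / (n - 1)  # treats data as a sample

population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(f"population: variance={population_variance:.2f}, sd={population_sd:.2f}")
print(f"sample:     variance={sample_variance:.2f}, sd={sample_sd:.2f}")

# The standard library agrees with the manual calculation.
print(statistics.pstdev(values), statistics.stdev(values))
```

The standard deviation is in the same units as the data, which is why it is usually the figure reported alongside the mean. The variance is in squared units.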

Skewness is a measure of symmetry, or more precisely, of the lack of symmetry. A distribution is symmetric if it looks the same to the left and right of the center point.

Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. Higher values indicate a higher, sharper peak; lower values indicate a lower, less distinct peak.
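A rough sketch of both measures, using the simple moment-based (population) formulas: skewness is the third central moment divided by the 1.5 power of the second, and excess kurtosis is the fourth central moment divided by the square of the second, minus 3 so that a normal distribution scores 0. The function names are illustrative. Spreadsheet and statistics packages often apply sample-size adjustments, so their figures will not match these exactly.

```python
def central_moment(values, order):
    """Average of (x - mean) raised to the given power."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, negative for a left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
print(f"skewness: {skewness(values):.3f}")  # negative: longer tail on the left
print(f"excess kurtosis: {excess_kurtosis(values):.3f}")

symmetric = [1, 2, 3, 4, 5]  # hypothetical symmetric data
print(f"skewness of symmetric data: {skewness(symmetric):.3f}")
```

For the ten values above the skewness is negative, which matches the data: most values sit at 9 and 10, with a few smaller values stretching out to the left.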

Example 1: Suppose you are asked to calculate the average asset value of top stock funds and check whether there is any variability in the assets of these stock funds. You would answer this question with a measure of central tendency and variability.

Example 2: Suppose you are asked to provide a figure that best describes the annual salary offered to students in ABC College. You would answer this question with a measure of central tendency and variability.


About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has close to 7 years of experience in data science and predictive modeling. During his tenure, he has worked with global clients in various domains like retail and commercial banking, Telecom, HR and Automotive.


2 Responses to "Descriptive Statistics"

  1. Thanks for the info, just need some more examples for better understanding, though u have explained really good.

  2. Thanks for the easy and no nonsense content, you could have also included the description of variance and it's significance. After all, it has different applications than standard deviation.. I guess.

