Descriptive statistics answer the following questions:

- What is the value that best describes the data set?
- How much does the data set spread from its average value?
- What are the smallest and largest values in the data set?

These questions are answered with summary statistics, which include

*Mean, Standard Error, Median, Mode, Standard Deviation, Variance, Kurtosis, Skewness, Range, Minimum, Maximum, Sum, and Count*.
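As a rough illustration, most of these statistics can be computed with Python's standard `statistics` module. This is only a sketch under stated assumptions: `summarize` is a hypothetical helper name, the variance and standard deviation use the sample (n - 1) formulas, and the data are the ten values from the mean example in this article.

```python
from math import sqrt
from statistics import mean, median, mode, stdev, variance


def summarize(data):
    """Return common summary statistics for a list of numbers.

    Hypothetical helper; assumes sample (n - 1) variance and standard deviation.
    """
    n = len(data)
    sd = stdev(data)
    return {
        "count": n,
        "sum": sum(data),
        "minimum": min(data),
        "maximum": max(data),
        "range": max(data) - min(data),
        "mean": mean(data),
        "median": median(data),
        "mode": mode(data),
        "variance": variance(data),
        "standard_deviation": sd,
        "standard_error": sd / sqrt(n),  # standard error of the mean
    }


values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
for name, value in summarize(values).items():
    print(f"{name}: {value}")
```

Skewness and kurtosis are left out here because the standard library has no built-in function for them.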

**Measure of Central Tendency**

It describes a whole set of data with a single value that represents the centre of its distribution.

*There are three main measures of central tendency: the mode, the median and the mean.*

**Mean, Median and Mode**

Mean

It is the sum of the observations divided by the sample size.

The mean of the values 5,6,6,8,9,9,9,9,10,10 is (5+6+6+8+9+9+9+9+10+10)/10 = 8.1

Limitation :

It is affected by extreme values. Very large or very small numbers can distort the result.
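The calculation above, and the effect of an extreme value, can be sketched in Python. The extra value of 100 is a made-up outlier added only to show the limitation.

```python
from statistics import mean

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]

# Sum of the observations divided by the sample size.
manual_mean = sum(values) / len(values)
print(manual_mean)   # 8.1
print(mean(values))  # 8.1

# Hypothetical extreme value: a single outlier pulls the mean upwards.
with_outlier = values + [100]
print(round(mean(with_outlier), 2))  # 16.45
```

One outlier roughly doubles the mean here, even though ten of the eleven values are 10 or less.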

Median

It is the middle value of the sorted data. It splits the data in half: half of the data are above the median and half are below it. If the number of observations is even, the median is the average of the two middle values.

Advantage : It is NOT affected by extreme values. Very large or very small numbers do not affect it.
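A short Python sketch of the same idea, reusing the ten values from the mean example. The value 100 is again a hypothetical outlier, added only to show that the median does not move.

```python
from statistics import median

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]

# Ten values: the median is the average of the 5th and 6th sorted values (9 and 9).
print(median(values))  # 9.0

# Hypothetical extreme value: the middle of the sorted data is still 9.
with_outlier = values + [100]
print(median(with_outlier))  # 9
```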

Mode

It is the value that occurs most frequently in a data set.

Advantage : It can be used when the data is not numerical.

Disadvantage :

1. There may be no mode at all if no value occurs more than once.

2. There may be more than one mode.
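These points can be checked with Python's `statistics` module (version 3.8 or later for `multimode`). The colour list and the small integer lists are made-up examples, not data from this article.

```python
from statistics import mode, multimode

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
print(mode(values))  # 9

# The mode also works for non-numerical data (hypothetical example).
colours = ["red", "blue", "red", "green"]
print(mode(colours))  # red

# No repeated value: every value is equally frequent, so there is no useful mode.
print(multimode([1, 2, 3]))  # [1, 2, 3]

# Two values tie for most frequent: more than one mode.
print(multimode([1, 1, 2, 2, 3]))  # [1, 2]
```

`multimode` returns every value that ties for the highest frequency, which makes both disadvantages visible in the output.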

*When to use mean, median and mode?*

**Mean** – When your data is not skewed, i.e. it is roughly symmetric (for example, normally distributed). In other words, there are no extreme values (outliers) present in the data set.

**Median** – When your data is skewed or you are dealing with ordinal (ordered categories) data (e.g. a Likert scale: 1. Strongly dislike, 2. Dislike, 3. Neutral, 4. Like, 5. Strongly like).

**Mode** – When dealing with nominal (unordered categories) data.

**Example**

*Suppose a company is considering expanding into an area and is studying the sizes of containers that competitors are offering. It would be most interested in the mode, because it wants to know which size tends to sell most often.*

**Measure of Dispersion**

It describes how spread out the values in a data set are. There are four main measures of variability:

*Range, Interquartile range, Standard deviation and Variance.*

Range

It is simply the largest observation minus the smallest observation.

Advantage:

It is easy to calculate.

Disadvantage:

It is very sensitive to outliers and does not use all the observations in a data set.
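A minimal Python sketch of the range and its sensitivity to outliers. For contrast it also computes the interquartile range (third quartile minus first quartile) with `statistics.quantiles`. The value 100 is a hypothetical outlier, and quartile conventions differ between tools, so other software may report slightly different quartiles.

```python
from statistics import quantiles

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]

# Range: largest observation minus smallest observation.
data_range = max(values) - min(values)
print(data_range)  # 5

# Hypothetical extreme value: the range jumps from 5 to 95.
with_outlier = values + [100]
range_with_outlier = max(with_outlier) - min(with_outlier)
print(range_with_outlier)  # 95

# Interquartile range, using Python's default ("exclusive") quartile method.
q1, _, q3 = quantiles(values, n=4)
iqr = q3 - q1
q1_out, _, q3_out = quantiles(with_outlier, n=4)
iqr_with_outlier = q3_out - q1_out
print(iqr, iqr_with_outlier)  # 3.25 4.0
```

The range depends only on the two most extreme observations, while the interquartile range barely changes when the outlier is added.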

Standard Deviation

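It is a measure of the spread of the data about the mean. It is the square root of the variance, which is the average squared deviation from the mean.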

Advantage : It gives a better picture of your data than just the mean alone.

Disadvantage :

1. It doesn't give a clear picture of the whole range of the data.

2. It can give a skewed picture if the data contain outliers.
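As a sketch, the variance and standard deviation of the ten values from the mean example can be computed step by step. The article does not say whether the sample (n - 1) or population (n) formula is intended, so both are shown; treating the data as a sample is an assumption.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]
m = mean(values)  # 8.1

# Sum of squared deviations from the mean.
squared_deviations = sum((x - m) ** 2 for x in values)

# Sample variance divides by n - 1; population variance divides by n.
sample_variance = squared_deviations / (len(values) - 1)
population_variance = squared_deviations / len(values)

# Standard deviation is the square root of the variance.
sample_sd = sample_variance ** 0.5
population_sd = population_variance ** 0.5

print(round(sample_variance, 2), round(sample_sd, 2))          # 3.21 1.79
print(round(population_variance, 2), round(population_sd, 2))  # 2.89 1.7

# The library functions give the same results.
print(round(variance(values), 2), round(stdev(values), 2))     # 3.21 1.79
print(round(pvariance(values), 2), round(pstdev(values), 2))   # 2.89 1.7
```

Because the standard deviation is in the same units as the data, it is usually easier to interpret than the variance.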

Skewness

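It is a measure of symmetry or, more precisely, the lack of symmetry. A distribution is symmetric if it looks the same to the left and right of the center point. A negative value indicates a longer tail on the left; a positive value indicates a longer tail on the right.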
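One common way to put a number on skewness is the moment-based coefficient: the third central moment divided by the second central moment raised to the power 1.5. The sketch below assumes that definition; `skewness` is a hypothetical helper, and spreadsheet or statistics packages may apply a sample-size correction and report slightly different values.

```python
def skewness(data):
    """Moment-based skewness: m3 / m2 ** 1.5 (no sample-size correction)."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - m) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
values = [5, 6, 6, 8, 9, 9, 9, 9, 10, 10]  # longer tail on the left
right_tailed = [1, 1, 1, 2, 10]            # hypothetical data, long right tail

print(skewness(symmetric))         # 0.0
print(round(skewness(values), 2))  # -0.64
print(skewness(right_tailed) > 0)  # True
```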

Kurtosis

It is a measure of whether the data are peaked or flat relative to a normal distribution. Higher values indicate a higher, sharper peak; lower values indicate a lower, less distinct peak.
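A minimal sketch of one common definition, excess kurtosis: the fourth central moment divided by the squared second central moment, minus 3, so that a normal distribution scores 0. The helper name `excess_kurtosis` and both data sets are hypothetical, and other tools may apply a sample-size correction.

```python
def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (no sample-size correction)."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - m) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


flat = [1, 2, 3, 4, 5]                     # evenly spread, no distinct peak
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]    # sharp peak at 0 with two extreme values

print(round(excess_kurtosis(flat), 2))    # -1.3
print(round(excess_kurtosis(peaked), 2))  # 1.5
```

The flat data set scores below 0 and the peaked one above 0, matching the description above.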

**Examples:**

**Example 1:** Suppose you are asked to calculate the average asset value of top stock funds and check whether there is any variability in the assets of these stock funds. You would answer this question with a measure of central tendency and variability.

**Example 2:** Suppose you are asked to provide a figure that best describes the annual salary offered to students in ABC College. You would answer this question with a measure of central tendency and variability.

Thanks for the info, just need some more examples for better understanding, though u have explained really good.

Thanks for the easy and no nonsense content, you could have also included the description of variance and its significance. After all, it has different applications than standard deviation.. I guess.
