A Beginner's Guide to Basic Data Mining Concepts with Simple Examples
The mean is the sum of all values in a dataset divided by the number of values.
For example, for the dataset 3, 2, 4, 3, 4, 9, 7:
Mean \(= \frac{3+2+4+3+4+9+7}{7} = \frac{32}{7} \approx 4.57\)
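The same calculation can be sketched in Python. This is a minimal illustration rather than a required approach; the variable names `values`, `manual_mean`, and `library_mean` are chosen here for the example.

```python
from statistics import mean

# The example dataset from the text.
values = [3, 2, 4, 3, 4, 9, 7]

# Mean straight from the definition: sum of the values divided by their count.
manual_mean = sum(values) / len(values)

# Python's standard library offers the same calculation.
library_mean = mean(values)

print(round(manual_mean, 2))   # 4.57
print(round(library_mean, 2))  # 4.57
```

Rounding is applied only when printing, so the stored result keeps its full precision.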
The median is the middle value of a dataset when it is ordered from least to greatest. If the dataset has an even number of observations, the median is the average of the two middle numbers.
For example, for the dataset 3, 2, 4, 3, 4, 9, 7, the sorted order is 2, 3, 3, 4, 4, 7, 9:
Median \(= 4\)
(With seven values, the fourth number in the sorted list is the middle one.)
For the dataset {2, 3, 3, 4, 5, 7, 7, 7, 9, 10}, which is already sorted:
Since there are 10 values (an even number), the median is the average of the 5th and 6th values, which are 5 and 7:
Median \(= \frac{5 + 7}{2} = 6\)
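Both cases, odd and even counts, can be handled by one small function. The sketch below follows the definition given above; the function name `median_of` is hypothetical and chosen only for this example.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")

    ordered = sorted(values)   # order from least to greatest
    n = len(ordered)
    mid = n // 2               # index of the upper-middle element

    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([3, 2, 4, 3, 4, 9, 7]))            # 4
print(median_of([2, 3, 3, 4, 5, 7, 7, 7, 9, 10]))  # 6.0
```

Python's built-in `statistics.median` gives the same results; the hand-written version is shown only to make the two cases explicit.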
The mode is the value that appears most frequently in a dataset. A dataset can have more than one mode if several values share the highest frequency.
For example, for the dataset 3, 2, 4, 3, 4, 9, 7:
Mode \(= 3\) and \(4\)
(Both 3 and 4 appear twice, more often than any other value.)
For the dataset {2, 3, 3, 4, 5, 7, 7, 7, 9, 10}:
Mode \(= 7\)
(In this dataset, the value 7 appears three times, more than any other number.)
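Because a dataset can have several modes, a function that finds them is easiest to write so that it returns a list. The sketch below counts occurrences with `collections.Counter`; the function name `modes_of` is hypothetical and chosen only for this example.

```python
from collections import Counter


def modes_of(values):
    """Return every value that occurs with the highest frequency, sorted."""
    counts = Counter(values)          # maps each value to how often it occurs
    if not counts:
        return []                     # an empty dataset has no mode
    highest = max(counts.values())    # the largest frequency in the dataset
    return sorted(v for v, c in counts.items() if c == highest)


print(modes_of([3, 2, 4, 3, 4, 9, 7]))            # [3, 4]
print(modes_of([2, 3, 3, 4, 5, 7, 7, 7, 9, 10]))  # [7]
```

In Python 3.8 and later, `statistics.multimode` also returns all modes, though in order of first appearance rather than sorted.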