A Beginner's Guide to Basic Data Mining Concepts with Simple Examples
Mean (Average)
The mean is the sum of all values in a dataset divided by the number of values.
Example: For the numbers 3, 2, 4, 3, 4, 9, 7:
Mean \(= \frac{3+2+4+3+4+9+7}{7} = \frac{32}{7} \approx 4.57\)
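The calculation above can be sketched in Python as a quick check. This is a minimal illustration, not part of the original guide; the variable names (`values`, `manual_mean`, `library_mean`) are chosen here for the example only.

```python
from statistics import mean

# The dataset from the example above.
values = [3, 2, 4, 3, 4, 9, 7]

# Mean by definition: sum of the values divided by how many there are.
manual_mean = sum(values) / len(values)

# The standard library's statistics.mean computes the same quantity.
library_mean = mean(values)

print(sum(values))            # 32
print(round(manual_mean, 2))  # 4.57
```

Both approaches should agree; writing the division out by hand simply makes the definition visible.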
Median
The median is the middle value of a dataset when ordered from least to greatest. If the dataset has an even number of observations, the median is the average of the two middle numbers.
Example 1: For 3, 2, 4, 3, 4, 9, 7, the sorted list is 2, 3, 3, 4, 4, 7, 9:
Median \(= 4\) (the 4th number in the sorted list)
Example 2: Sorted dataset: 2, 3, 3, 4, 5, 7, 7, 7, 9, 10
Since there are 10 values, the median is the average of the 5th and 6th values:
Median \(= \frac{5 + 7}{2} = 6\)
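The odd and even cases can be sketched together in Python. The helper name `manual_median` and the variable names are assumptions made for this illustration; it assumes a non-empty list of numbers.

```python
from statistics import median

def manual_median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)  # order from least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: the average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [3, 2, 4, 3, 4, 9, 7]
even_data = [2, 3, 3, 4, 5, 7, 7, 7, 9, 10]

print(manual_median(odd_data))   # 4
print(manual_median(even_data))  # 6.0
```

The standard library's `statistics.median` applies the same rule, so it can be used to cross-check the hand-written version.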
Mode
The mode is the value that appears most frequently in a dataset. A dataset can have more than one mode if multiple values share the same highest frequency.
Example 1: For 3, 2, 4, 3, 4, 9, 7:
Mode \(= 3\) and \(4\) (both appear twice, more than any other number)
Example 2: For 2, 3, 3, 4, 5, 7, 7, 7, 9, 10:
Mode \(= 7\) (appears three times, more than any other value)
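One way to find every mode, including ties, is to count each value and keep those with the highest count. The sketch below is illustrative only; the function name `modes` is an assumption made here, and it expects a non-empty list.

```python
from collections import Counter

def modes(values):
    """Return all values that share the highest frequency, sorted."""
    counts = Counter(values)        # value -> number of occurrences
    highest = max(counts.values())  # the largest frequency present
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([3, 2, 4, 3, 4, 9, 7]))            # [3, 4]
print(modes([2, 3, 3, 4, 5, 7, 7, 7, 9, 10]))  # [7]
```

In Python 3.8 and later, `statistics.multimode` offers similar behavior, returning the tied values in the order they first appear rather than sorted.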