Statistics For Machine Learning and Data Science (1)

In this tutorial, we are going to learn statistics for data science and machine learning. You don’t need to know anything about statistics or programming, but you do need to know basic math. This tutorial covers simple operations such as the modulo, the median, the arithmetic mean, the mode, regression, variance, and standard deviation.

Get Modulo

Let’s start with the modulo operation. The modulo is the remainder left over after dividing one number by another. For example, 2 % 2 is 0 because 2 divides evenly by 2 and nothing remains. 3 % 2 is 1 because the remainder is 1.

12 % 4 = 0
11 % 4 = 3
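
The same remainders can be checked in code. This is a minimal sketch in Python, where the % operator returns the remainder; the variable names are only for illustration.

```python
# Pairs of (dividend, divisor) taken from the examples above.
examples = [(2, 2), (3, 2), (12, 4), (11, 4)]

# The % operator gives the remainder of the division.
remainders = [a % b for a, b in examples]

for (a, b), r in zip(examples, remainders):
    print(f"{a} % {b} = {r}")
```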

What is the Median?

The median is the value found in the middle of a data set once the values are sorted in order.

x = {1 , 2 , 3 , 4 , 5 , 6 , 7}
- The median of x is 4
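
A minimal sketch of finding the median in Python. The helper name `median` is our own, and averaging the two middle values for an even number of elements is the usual convention rather than something stated above. Python's built-in `statistics.median` is used as a cross-check.

```python
import statistics

def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: there is exactly one middle value.
        return ordered[mid]
    # Even count (assumed convention): average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

x = [1, 2, 3, 4, 5, 6, 7]
print(median(x))             # 4
print(statistics.median(x))  # 4
```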

What is the Arithmetic Mean?

The arithmetic mean is the value obtained by dividing the sum of the numbers in a series by the number of elements in the series. The arithmetic mean is one of the most commonly used operations.
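
As a small illustration, the definition translates almost word for word into Python. The function name `arithmetic_mean` and the sample series are our own choices.

```python
def arithmetic_mean(values):
    """Sum of the numbers divided by how many numbers there are."""
    return sum(values) / len(values)

x = [10, 20, 30]
print(arithmetic_mean(x))  # 20.0
```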

Get Mode

The mode is the value that occurs most often in a data series. The number of times it occurs is called its frequency. For example, if the value 2 occurs 3 times in a data set, more often than any other value, then the mode is 2 and its frequency is 3.

x = {1 , 2 , 3 , 3 , 4 , 3 , 4}
Mode = 3
Frequency = 3
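
One possible way to get the mode and its frequency in Python is with `collections.Counter` from the standard library. This is a sketch, not the only approach. Note that if several values are tied for the highest count, `most_common(1)` returns just one of them.

```python
from collections import Counter

x = [1, 2, 3, 3, 4, 3, 4]

# Count how many times each value occurs.
counts = Counter(x)

# most_common(1) returns a list with the single most frequent (value, count) pair.
mode, frequency = counts.most_common(1)[0]

print("Mode =", mode)
print("Frequency =", frequency)
```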
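
What is Standard Deviation?

To find the standard deviation, take the sum of the squared differences between each number in a series and the arithmetic mean of the series, divide it by the number of elements minus one, and take the square root of the result. Dividing by the number of elements minus one gives what is called the sample standard deviation.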
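
Written as a formula, the definition reads as follows. The symbols are our own notation, not part of the original text: n is the number of elements, x_i is the i-th number, and x̄ is the arithmetic mean of the series.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( x_i - \bar{x} \right)^2}
```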

Step by Step Standard Deviation
  • The arithmetic mean of the numbers is calculated.
  • The arithmetic mean is subtracted from each number.
  • The square of each of the differences found is calculated.
  • The squares of the differences are added together.
  • The resulting sum is divided by the number of elements of the series minus one.
  • The square root of the number found is taken.

The standard deviation tells us how far the data are spread around the mean. If the standard deviation is small, the values lie close to the mean. Conversely, if the standard deviation is large, the values are scattered far from the mean. If all values are the same, the standard deviation is zero.

x = [10 , 20 , 30]

1 - Arithmetic mean = (10 + 20 + 30) / 3 = 20
2 - 10 - 20 , 20 - 20 , 30 - 20 = [-10 , 0 , 10]
3 - Squares = [100 , 0 , 100]
4 - Sum of squares = 100 + 0 + 100 = 200
5 - 200 / (3 - 1) = 100
6 - Standard deviation = 10 (square root of 100)
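
The six steps listed earlier can be sketched in Python like this. The function name `sample_std` is hypothetical, and the result is compared with `statistics.stdev` from the standard library, which also divides by the number of elements minus one.

```python
import math
import statistics

def sample_std(values):
    """Standard deviation with an n - 1 divisor, one line per step."""
    n = len(values)
    mean = sum(values) / n                    # 1. arithmetic mean
    deviations = [v - mean for v in values]   # 2. subtract the mean from each number
    squares = [d ** 2 for d in deviations]    # 3. square each difference
    total = sum(squares)                      # 4. add the squares together
    variance = total / (n - 1)                # 5. divide by number of elements minus one
    return math.sqrt(variance)                # 6. take the square root

x = [10, 20, 30]
print(sample_std(x))         # 10.0
print(statistics.stdev(x))   # 10.0
```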
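
What is Variance?

Variance is the sum of the squared deviations of the data from the arithmetic mean, divided by the number of elements minus one. In other words, it is the value we get just before taking the square root, so the variance is the square of the standard deviation. In the example above, the variance is 100.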
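
A short sketch of the relationship between the two values, assuming the same n - 1 divisor used in the standard deviation steps. `statistics.variance` and `statistics.stdev` are standard-library functions that both use that divisor.

```python
import math
import statistics

x = [10, 20, 30]

variance = statistics.variance(x)  # sum of squared deviations / (n - 1)
std = statistics.stdev(x)          # square root of the variance

print(variance)
print(std)

# Squaring the standard deviation gives back the variance.
print(math.isclose(std ** 2, variance))
```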

Why Do We Use Variance and Standard Deviation?

The average alone is not enough to understand the data. We use the variance or the standard deviation to see how the data are distributed around the mean. Both values tell us how close to or far from the mean the data lie.
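
To illustrate, here is a small sketch with two made-up data sets (the names `tight` and `wide` and their values are invented for this example). They have the same mean, but very different standard deviations.

```python
import statistics

# Two invented data sets with the same arithmetic mean.
tight = [19, 20, 21]
wide = [0, 20, 40]

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)

# The mean cannot tell the two sets apart, the standard deviation can.
std_tight = statistics.stdev(tight)
std_wide = statistics.stdev(wide)

print(mean_tight, std_tight)
print(mean_wide, std_wide)
```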
