In this second machine learning tutorial, we are going to create a simple machine learning model with TensorFlow, sklearn, and a few other statistics libraries. If you haven't read the first machine learning article, you should read it here.
You Should Know These Subjects
- Data Visualization Like a Professional
- NumPy For Beginner
- Things to Know on Pandas for Data Scientists
- Machine Learning For Beginner (1)
Setting up the dataset to use
The dataset we will use today is a car dataset that contains sales figures and properties of cars. Let's load the file with the pandas library, as we learned in the previous lessons.
import pandas as pd

dataset = pd.read_excel("dataset.xlsx")
If you want the dataset, you can download it from the link below. Now that the dataset is loaded, let's analyze it before training.
dataset.info()
dataset.describe()
dataset.head()
You can examine your dataset with the functions above. If you do not know what these functions do, it is recommended that you read the article here. Our analysis step is now finished.
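As a rough illustration of what these three calls report, here is a minimal sketch on a tiny made-up data frame. The column names and values are hypothetical and are not taken from the car dataset.

```python
import pandas as pd

# Hypothetical stand-in for the car dataset (made-up values).
sample = pd.DataFrame({
    "year": [2015, 2018, 2012, 2020],
    "price": [9500, 17200, 6100, 24900],
    "mileage": [61000, 23000, 98000, 8000],
})

sample.info()                 # prints column names, dtypes and non-null counts
summary = sample.describe()   # count, mean, std, min, quartiles, max per numeric column
first_rows = sample.head(2)   # the first two rows

print(summary.loc["mean", "price"])
print(first_rows)
```

On the real dataset, describe() is a quick way to spot columns whose maximum is far above the mean, which is a hint that some rows may need trimming.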
Cleaning Data
We need to prepare the data before splitting it. It is not appropriate to feed all of the raw data to the model: organized data gives better performance and fewer errors than messy data.
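import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sbr

# Sort by price (highest first), then skip the 119 most expensive rows
Cleardataset = dataset.sort_values("price", ascending=False).iloc[119:]

plt.figure()
# distplot is deprecated in recent seaborn releases; histplot(..., kde=True) is the suggested replacement
sbr.distplot(Cleardataset["price"])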
Extremely large values can cause your model to calculate incorrectly, so if we sort the price data and trim the highest values, we will get a better result. To see the difference, plot the original data frame in the same way and compare.
With this operation, the table was sorted in descending order by price, and the 119 most expensive rows were dropped. That takes the dataset from 13,119 rows down to 13,000 and gets rid of the extreme values.
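The sort-then-slice pattern can be tried on a toy example first. This is only a sketch: the prices are made up, and it drops 2 rows instead of 119 because the frame is tiny.

```python
import pandas as pd

# Hypothetical prices; two of them are extreme compared to the rest.
cars = pd.DataFrame({"price": [12000, 9000, 250000, 15000, 180000, 11000]})

n_drop = 2  # the tutorial drops 119 rows; 2 is enough for this small example

# Highest prices come first after sorting, so iloc[n_drop:] skips them.
trimmed = cars.sort_values("price", ascending=False).iloc[n_drop:]

print(len(cars), "->", len(trimmed))
print(trimmed["price"].max())
```

Note that iloc works on positions, not on index labels, which is why it has to come after the sort.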
Cleardataset = Cleardataset[Cleardataset.year != 1970]
We can simplify the data further by deleting the rows from 1970, since scattered data is problematic and 1970 lies far away from the other year values. The cleaning can be taken further if needed. Next, we drop the unused transmission column from the dataset.
Cleardataset = Cleardataset.drop("transmission", axis = 1)
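To see what these two steps leave behind, here is a small sketch on a made-up frame. The values are hypothetical; only the year and transmission column names come from the tutorial.

```python
import pandas as pd

# Hypothetical rows; one of them has the outlying year 1970.
cars = pd.DataFrame({
    "year": [2016, 1970, 2019, 2014],
    "price": [11000, 24500, 18900, 8700],
    "transmission": ["Manual", "Automatic", "Automatic", "Manual"],
})

cleaned = cars[cars.year != 1970]                # keep every row except year 1970
cleaned = cleaned.drop("transmission", axis=1)   # axis=1 means "drop a column"

print(cleaned)
```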
Splitting Test and Training Data
The other stage before creating the model is splitting the data. We separate it into test and training sets so that, after training the model, we have a test set to experiment with. We will use sklearn, a library we haven't seen before, to do the split, but don't worry, it will be very simple.
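A minimal sketch of such a split is shown here, using sklearn's train_test_split on synthetic data so it runs on its own. The column names other than price, the 70/30 ratio, and the random_state value are assumptions. The scaling step at the end (MinMaxScaler) is also an assumption: it is a common way to give the model simpler inputs, not necessarily the author's choice.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the cleaned car dataset (hypothetical columns and values).
rng = np.random.default_rng(0)
cars = pd.DataFrame({
    "year": rng.integers(2000, 2021, size=100),
    "mileage": rng.integers(1000, 150000, size=100),
    "price": rng.integers(3000, 40000, size=100),
})

y = cars["price"].values                 # target: what the model should predict
x = cars.drop("price", axis=1).values    # features: every other column

# Hold out 30% of the rows for testing (assumed ratio).
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=10
)

# Scale features to the 0..1 range; fit on the training set only,
# then apply the same transformation to the test set.
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

print(x_train.shape, x_test.shape)
```

Fitting the scaler on the training rows only keeps information from the test set out of the training process.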
We have split the data, so now we can create our model. Did you notice the last operation? Thanks to it, the machine will work with simpler data. Let's make our first model with TensorFlow.
Making Model
Models are first trained with the training data and then tested with the test data. Now we will build the model with the help of TensorFlow. TensorFlow is not preinstalled, so install it with the pip command first (pip install tensorflow).
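A small regression model for this kind of task might look like the sketch below. The layer sizes, the relu activation, the adam optimizer, the mse loss, and the number of input features are all assumptions chosen for illustration, not the author's exact configuration.

```python
import tensorflow as tf

n_features = 2  # hypothetical: number of columns in x_train

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    tf.keras.layers.Dense(12, activation="relu"),  # hidden layer with 12 neurons
    tf.keras.layers.Dense(12, activation="relu"),  # second hidden layer
    tf.keras.layers.Dense(1),                      # single output: the predicted price
])

# Mean squared error is a common loss for predicting a continuous value.
model.compile(optimizer="adam", loss="mse")

model.summary()
```

The last layer has one neuron and no activation because the model outputs a single number, the price.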
We have created our model. We do not need to deal with activation functions and optimizers for now; we will talk about them in other lessons. Next, let's train the model.
Training Model
We use the fit function to train the model. Before we begin, let's look at the epochs argument, which specifies how many times the whole training set passes through the model. If the model is trained on the same data too many times, it will overfit and will not produce accurate results on new data, so don't raise epochs too much.
model.fit(x_train ,y_train , epochs = 250 , batch_size = 250)
Run the program to train the model, and when the wait is over, your model will be ready to run.
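To get a feel for how long the wait will be, it helps to work out how many weight updates the call performs. In Keras, each epoch walks through the training data in batches of batch_size rows, and each batch triggers one update. The training set size in this sketch is a hypothetical number.

```python
import math

n_train = 9100     # hypothetical number of training rows
epochs = 250       # as in the fit call
batch_size = 250   # as in the fit call

# The last batch of an epoch may be smaller, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)   # 37
print(total_updates)     # 9250
```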
Testing the Model
Finally, we will test the model and evaluate the results. First, let the model estimate the prices for the x_test data, then compare these estimates with the real data.
predicts = model.predict(x_test)
predicts[0]  # first predicted price
y_test[0]    # first real price (assumes y_test is a NumPy array)
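Comparing prices one at a time gets tedious, so one option is to summarize the difference with a single number such as the mean absolute error from sklearn. This sketch uses made-up prices in place of y_test and the model's output, and the choice of metric is a suggestion rather than part of the original tutorial.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Hypothetical real prices and predictions.
y_true = np.array([10000, 15000, 20000, 25000])
# Keras predict returns shape (n, 1), so the predictions are a column here.
y_pred = np.array([[10500.0], [14000.0], [20500.0], [25000.0]])

# ravel() flattens the (n, 1) column into a 1-D array to match y_true.
mae = mean_absolute_error(y_true, y_pred.ravel())

print(mae)  # average absolute difference, in the same unit as the prices
```

With the real model, the same call would be mean_absolute_error(y_test, predicts.ravel()).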
Even if we do not get the exact result, we can see that the predictions are very close. Better results may be obtained by increasing the number of neurons and by raising the epochs, for example by half. This tutorial shows you the basics, and we will cover each of these subjects in more depth in future lessons.