Creating a Data Science Project

In this second tutorial, we are going to create a data science project with R to reinforce the concepts we saw in the previous tutorial in this guide. Remember to read about the loop functions before starting.

We will build small projects in this tutorial, so you may want to read a few articles first. The functions article will be useful. We will also use R's built-in airquality dataset; if you don't know this dataset, you should read this article.

1 – Finding Temperature Values Higher than the Average

First, we are going to calculate the average temperature in the airquality dataset. After that, we will compare each daily value with this average. Let's start with the first of two solutions.

r$> Average <- mean(airquality$Temp)
r$> airquality$Temp[airquality$Temp > Average]
[1] 81 79 78 84 85 79 82 87 90 87 93 92 82 80 79 78 80 83 84 85 81 84 83 83 88 92 92 89 82 81 91 80 81 82 84 87 85 81
[39] 82 86 85 82 86 88 86 83 81 81 81 82 86 85 87 89 90 90 92 86 86 82 80 79 79 78 78 79 81 86 88 97 94 96 94 91 92 93
[77] 93 87 84 80 78 81 78 82 81

There are 85 values above the average temperature, as the printed index shows. If these values had row names, we could match them to their rows by name. Now let's write a few functions.
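As a minimal sketch of matching values back to their rows, which() can return the row numbers of the readings above the mean. The variable names avg_temp, above_rows, n_above and above_values are illustrative and do not come from the original tutorial.

```r
# Mean temperature over all daily readings in the built-in dataset
avg_temp <- mean(airquality$Temp)

# which() returns the row numbers where the condition is TRUE
above_rows <- which(airquality$Temp > avg_temp)

# Number of readings above the mean
n_above <- length(above_rows)

# Attach the row numbers as names so each value can be traced back to its row
above_values <- airquality$Temp[above_rows]
names(above_values) <- above_rows

cat("Values above average:", n_above, "\n")
print(head(above_values))
```

With the names attached, each printed temperature appears under the row number it came from.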

BestOzoneValue <- function(Column)
{
    past <- 0    # largest value seen so far
    row <- 0     # row number of that value
    count <- 0   # row number of the current value
    for(value in Column)
    {
        count <- count + 1
        # Skip missing values, but still count their rows
        if(!is.na(value) && value > past)
        {
            past <- value
            row <- count
        }
    }
    cat("Row No:" , row , "Value:" , past)
}

In the example above, we go through the values one by one, keep the largest ozone value seen so far, and report it to the user. A counter tracks the current row, so we can also print the row number of that value and look it up in the dataset.
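The largest ozone reading and its row can also be found without an explicit loop. The following is a sketch using base R's max() and which.max(); the names ozone, best_value and best_row are illustrative.

```r
ozone <- airquality$Ozone

# max() needs na.rm = TRUE because the Ozone column contains missing values
best_value <- max(ozone, na.rm = TRUE)

# which.max() discards NA values and returns the position of the first maximum
best_row <- which.max(ozone)

cat("Row No:", best_row, "Value:", best_value, "\n")
```

Because which.max() returns a position in the original vector, best_row can be used directly as a row number in airquality.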

BestTempValue <- function(Columns)
{
    # Sort in ascending order so the largest values come last; sort() drops NA values
    Sorted <- sort(Columns)
    count <- 0
    for(value in Sorted)
    {
        count <- count + 1
    }
    cat("Best Temp Value:" , Sorted[count] , "Second Temp:" , Sorted[count - 1])
    # which.max() gives the row of the best value in the original column
    cat("\nRow Index:" , which.max(Columns))
}

We have now finished two functions. This training will consist of two subsections. These examples have been kept simple for practice; later examples will be more complex and will use more complex datasets.

In BestTempValue, we took a different approach: we sorted the column in ascending order and counted its values, so the largest value is at index count and the second largest at count - 1. Because we reach the values by indexing, we do not need the past variable.
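As an alternative sketch for finding the two highest temperatures together with their rows, order() can rank the rows by value. The names temp and top_rows are illustrative.

```r
temp <- airquality$Temp

# order() returns row numbers ranked by value; decreasing = TRUE puts the largest first
top_rows <- order(temp, decreasing = TRUE)[1:2]

cat("Best Temp Value:", temp[top_rows[1]], "Row Index:", top_rows[1], "\n")
cat("Second Temp:", temp[top_rows[2]], "Row Index:", top_rows[2], "\n")
```

Unlike sorting the values alone, this keeps the link between each value and its original row.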

CONGRATULATIONS, YOU FINISHED DATA SCIENCE PROJECT – 1!
