Creating Data Science Project

In this second tutorial, we are going to create a data science project with R to reinforce the concepts we saw in the previous tutorial in this guide. Remember to read the article on loop functions before starting.

We will build a few small projects in this tutorial. Before starting, you may want to read a few related articles; the one on functions will be especially useful. We will also use the airquality dataset, so if you don't know this dataset, you should read the article about it first.

1 – Finding Higher-than-Average Values in the Air Quality Data

First, we are going to calculate the average temperature in the airquality dataset. After that, we'll compare each recorded value with this average. Let's start with the first solution.

r$> Average <- mean(airquality$Temp)
r$> airquality$Temp[airquality$Temp > Average]
[1] 81 79 78 84 85 79 82 87 90 87 93 92 82 80 79 78 80 83 84 85 81 84 83 83 88 92 92 89 82 81 91 80 81 82 84 87 85 81
[39] 82 86 85 82 86 88 86 83 81 81 81 82 86 85 87 89 90 90 92 86 86 82 80 79 79 78 78 79 81 86 88 97 94 96 94 91 92 93
[77] 93 87 84 80 78 81 78 82 81

There are 85 values above the average temperature, as the index of the last output line shows. If these values had row names, we could match them to their rows by name. Now let's write a few functions.
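The count and the row positions of these values can also be obtained without printing the whole vector. The following is a minimal sketch on the built-in airquality dataset; the names avg_temp, is_above, n_above, and above_rows are illustrative and do not come from the original code.

```r
# Illustrative names (avg_temp, is_above, n_above, above_rows) are assumptions for this sketch
avg_temp <- mean(airquality$Temp)

# Logical vector: TRUE where the recorded temperature exceeds the average
is_above <- airquality$Temp > avg_temp

# sum() counts the TRUE values
n_above <- sum(is_above)

# which() returns the row numbers of those values
above_rows <- which(is_above)

# Show the date columns next to the temperature for the first few matching rows
head(airquality[above_rows, c("Month", "Day", "Temp")])
```

Because which() returns row numbers, the result can be used to look up any other column for the same days.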

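BestOzoneValue <- function(Column)
{
    past <- 0     # largest value seen so far
    rowNo <- 0    # row number of that value
    count <- 0    # row number of the current value
    for(value in Column)
    {
        count <- count + 1
        # Skip missing values, but keep counting so row numbers match the data
        if(!is.na(value) && value > past)
        {
            past <- value
            rowNo <- count
        }
    }
    cat("Row No:" , rowNo , "Value:" , past)
}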

In the example above, we went through the values one by one, kept the largest value seen so far in the past variable, and reported the highest ozone value to the user. In addition, we used a counter to print the row number so that the value can be located in the column.
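As a cross-check for a loop like this, base R can find the highest value and its position directly. This is a minimal sketch on the built-in airquality dataset, not a replacement for the exercise; best_row and best_value are illustrative names.

```r
# which.max() discards NA values and returns the position of the first maximum
best_row <- which.max(airquality$Ozone)

# max() needs na.rm = TRUE because the Ozone column contains missing values
best_value <- max(airquality$Ozone, na.rm = TRUE)

cat("Row No:", best_row, "Value:", best_value)
```

If the loop and the built-in functions disagree, the row counter is usually the first thing to check.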

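BestTempValue <- function(Columns)
{
    # Sort ascending so the largest value ends up last (sort() drops NA values)
    Sorted <- sort(Columns)
    count <- 0
    for(value in Sorted)
    {
        count <- count + 1
    }
    cat("Best Temp Value:" , Sorted[count] , "Second Temp:" , Sorted[count - 1])
    # which.max() gives the row of the first maximum in the original column
    cat("\nRow Index:" , which.max(Columns))
}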

We have now finished two functions. This training will consist of two subsections. These examples have been kept simple for practice; later examples will be more complex and will use more complex datasets.

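In the example above, we made the function more compact: instead of tracking a running maximum in the past variable, we count the values and use that count as an index. Note that the last position holds the largest value only when the values are sorted in ascending order.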
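To get the two highest values together with their row numbers, one option is order(). The following is a minimal sketch on the built-in airquality dataset; top_rows is an illustrative name.

```r
# order() returns row positions sorted by value; decreasing = TRUE puts the largest first
top_rows <- order(airquality$Temp, decreasing = TRUE)[1:2]

# Print the two highest temperatures with their dates;
# the row names in the output are the original row numbers
airquality[top_rows, c("Month", "Day", "Temp")]
```

Unlike sort(), order() keeps track of where each value came from, so no separate counter is needed.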

CONGRATULATIONS, YOU FINISHED DATA SCIENCE PROJECT – 1!
