Things to Know on Pandas for Data Scientists


In pandas, there are patterns and functions that every data scientist uses at least once in a project. In this article, we will examine these important functions and concepts. You can access the code and a summary note via the GitHub link at the end of the article.

You Should Know These Topics
Work with Empty Values in Pandas

Null values are found in almost all datasets, and pandas gives us the tools to handle them. We will use the NumPy library to insert null values (np.nan) into our example data. NumPy also offers many other useful, easy-to-use functions.

pip install pandas
pip install numpy

After installing the libraries, you can import them in the project and create a dataset. If you don’t know how to create one, you can find it in the content section at the top.

import pandas as pd
import numpy as np

data = {"Column 1": [20, 30, np.nan], "Column 2": [10, np.nan, np.nan]}
datasets = pd.DataFrame(data)

Let’s start with deleting the empty data. With the dropna function, you can delete the rows or the columns that contain empty data. You can call this function with or without arguments; the one you will use most often is axis.

# If axis is 0, rows containing NaN are deleted; if 1, columns are deleted
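The call itself is not shown above, so here is a minimal sketch of dropna on the example dataset. The variable names rows_dropped, cols_dropped and all_nan_rows_dropped are only illustrative, and the how argument is an extra option that goes beyond the axis argument discussed in the text.

```python
import numpy as np
import pandas as pd

data = {"Column 1": [20, 30, np.nan], "Column 2": [10, np.nan, np.nan]}
datasets = pd.DataFrame(data)

# axis=0 (the default): drop every row that contains at least one NaN
rows_dropped = datasets.dropna(axis=0)

# axis=1: drop every column that contains at least one NaN
cols_dropped = datasets.dropna(axis=1)

# how="all": drop a row only when all of its values are NaN
all_nan_rows_dropped = datasets.dropna(how="all")

print(rows_dropped)
print(cols_dropped)
print(all_nan_rows_dropped)
```

Note that dropna returns a new data frame and leaves datasets unchanged, so the result has to be assigned to a variable if you want to keep it.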

Instead of deleting empty data, you can assign a value to it with the fillna function, which writes the value you pass as the argument into the NaN cells. It accepts values such as int, float, bool, or string.

# The value argument is written to every NaN.
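As a rough illustration of fillna on the same example dataset, the sketch below fills the gaps first with a constant and then with each column's mean. Filling with the mean is a common choice that I am adding here for illustration; it is not something the article prescribes.

```python
import numpy as np
import pandas as pd

data = {"Column 1": [20, 30, np.nan], "Column 2": [10, np.nan, np.nan]}
datasets = pd.DataFrame(data)

# Replace every NaN with the same constant value
filled_zero = datasets.fillna(0)

# Replace the NaN values of each column with that column's mean;
# datasets.mean() ignores NaN when it computes the averages
filled_mean = datasets.fillna(datasets.mean())

print(filled_zero)
print(filled_mean)
```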
Group Data in Dataset

Grouping data is an important part of analysis. The pandas groupby function splits your data into groups according to the values in one or more columns, and you can then compute a summary for each group. Let’s examine this function.

group = dataframe.groupby("Column Name")

group.count()  # number of non-null values in each group
group.mean()   # average of each numeric column within each group
group.max()    # highest value in each group
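To see these calls on concrete data, here is a small sketch. The sales data frame and its city and amount columns are made up for this example.

```python
import pandas as pd

# Hypothetical data: one row per sale
sales = pd.DataFrame({
    "city": ["Ankara", "Ankara", "Izmir", "Izmir", "Izmir"],
    "amount": [10, 30, 5, 15, 25],
})

group = sales.groupby("city")

counts = group.count()  # non-null values per city
means = group.mean()    # average amount per city
maxima = group.max()    # largest amount per city

print(counts)
print(means)
print(maxima)
```

Each result is a data frame indexed by the grouping column, so a single value can be looked up with, for example, means.loc["Izmir", "amount"].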
Concatenate Dataframe with Concat and Merge

To combine two or more datasets, pandas offers us two functions: concat and merge. Let’s start with concat and combine 2 datasets. You can apply the methods above to these 2 datasets as well.

We create two separate data frames. Note that the second one is given different index numbers, so that the index does not start again from 0 after the two are combined. In the concat process, the second data frame is added to the end of the first.
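The original code for this step is not shown, so the following is only a sketch of what it might look like. The data frames first and second and their values are assumptions. The ignore_index option is an alternative I am adding: it renumbers the rows for you instead of requiring an explicit index on the second frame.

```python
import pandas as pd

first = pd.DataFrame({"Column 1": [1, 2, 3]})

# The second frame gets an explicit index that continues after the first
second = pd.DataFrame({"Column 1": [4, 5, 6]}, index=[3, 4, 5])

# The rows of second are appended below the rows of first
stacked = pd.concat([first, second])

# Alternative: discard the existing indexes and renumber from 0
renumbered = pd.concat([first, second], ignore_index=True)

print(stacked)
print(renumbered)
```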

The merge function joins two data frames on a column they have in common and brings the columns that differ between them into the result. The on argument specifies which column to join on. This column must be present in every data frame being merged.
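A minimal sketch of merge, assuming two made-up data frames (left and right) that share an id column:

```python
import pandas as pd

# Hypothetical frames sharing the "id" column
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [80, 90, 70]})

# Join on the shared column; by default only ids present in both frames are kept
merged = pd.merge(left, right, on="id")

print(merged)
```

By default merge performs an inner join, so the rows with id 1 and id 4 are dropped because they appear in only one frame. Passing how="outer" would keep them and fill the missing cells with NaN.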
