Data Analysis Tutorial In Python

In this series I will try to gather all the necessary information about data analysis. We have already covered the R programming language, and Python is next.

This series will consist of approximately 3 parts. The first part covers simple analysis functions, indexing, and the pandas library. Let’s start by examining series, data frames, and arrays.

Series, Data Frames, and Arrays

Before learning these structures, let’s import the necessary modules. This tutorial uses two of them: pandas and NumPy.

import pandas as pd
import numpy as np

A Series is a data structure that stores data in one dimension. Its values can be numbers, strings, or other Python objects. Each value has an index label, and you use that label to retrieve the value you want.

Series1 = pd.Series([1 , 2 , 3 , 4 , 5])

Here we keep the numbers from 1 to 5 in the Series1 variable. By default the index runs from 0 to 4. If we want different index labels, we can pass a second list with one label per value.

Series1 = pd.Series([1 , 2 , 3 ,4 , 5] , ["a" , "b" , "c" , "d" , "e"])
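As a small sketch of what the custom index gives us, the example below builds the same series and reads values back by label. The variable name `labeled` and the explicit `index=` keyword are only illustrative choices, not part of the original code.

```python
import pandas as pd

# Same data as above; index= names the second argument explicitly.
labeled = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

first_value = labeled["a"]          # look up a single value by its label
subset = labeled[["b", "d"]]        # a list of labels returns a smaller Series
index_labels = list(labeled.index)  # the labels themselves
```

Passing a list of labels is a convenient way to pick out several values at once without knowing their positions.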

Next we will create an array. Python's built-in list is not designed for numerical work, so we use a NumPy array, which is typically faster than a list for numerical operations. It is created much like a Series.

Arr1 = np.array([1 , 2 , 3 , 4 , 5])

Series are one-dimensional data structures, but arrays can also have 2 dimensions. To create a 2-dimensional array, you pass a list that contains 2 lists of equal length.

Arr2 = np.array([[1 , 2 , 3 , 4 , 5] , [10 , 20 , 30 , 40 , 50]])
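To check that a list of lists really produced a 2-dimensional array, we can inspect its `ndim` and `shape` attributes. The following is a minimal sketch; the name `grid` is chosen here for illustration.

```python
import numpy as np

# Two inner lists of equal length become the two rows of the array.
grid = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]])

n_dims = grid.ndim   # number of dimensions
shape = grid.shape   # (rows, columns)
doubled = grid * 2   # arithmetic is applied element by element
```

Here `shape` is `(2, 5)`: two rows with five columns each. Element-wise arithmetic like `grid * 2` is one of the things a plain list cannot do directly.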

We will see the advantages of working in 2 dimensions again in the indexing section. For now, it is enough to know how to create a 2-dimensional array. Next is the data frame, which is the most used data structure.

Data frames consist of rows and columns, just like an SQL table. They make data easy to process, which is why they are the most commonly preferred data structure.

data1 = dict(a = 1 , b = 10 , c = 12 , d = 13)
data2 = dict(a = 3 , b = 22 , c = 12 , d = 145)

data3 = dict(First = data1 , Second = data2)
df = pd.DataFrame(data3)

This may look a little complicated at first. We create two dictionaries, one per column, in which the keys are the row labels and the values are the cell contents. We then combine them in a single dictionary, whose keys become the column names, and pass that dictionary to pd.DataFrame.
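To see how the nested dictionaries map onto rows and columns, the sketch below rebuilds the same data frame and inspects its labels. The helper names `column_names` and `row_labels` are illustrative.

```python
import pandas as pd

data1 = dict(a=1, b=10, c=12, d=13)
data2 = dict(a=3, b=22, c=12, d=145)
df = pd.DataFrame(dict(First=data1, Second=data2))

column_names = list(df.columns)  # outer dictionary keys
row_labels = list(df.index)      # inner dictionary keys
n_rows, n_cols = df.shape        # 4 rows, 2 columns
value = df.loc["d", "Second"]    # one cell, addressed by row and column label
```

The outer keys (`First`, `Second`) end up as columns and the inner keys (`a` to `d`) end up as the row index.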

Subsetting Data Types

After storing the data, we need to access it, which is done by indexing. Let’s try it on a series first.

Serie1 = pd.Series([1 , 2 , 3] , ["a" , "b" , "c"])

Serie1.iloc[0] # Accessing the first element by position
Serie1[0:2] # Accessing positions 0 and 1 (the end is excluded)
Serie1[0:] # Accessing from position 0 to the end
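When a series has string labels, it helps to say explicitly whether you mean a position or a label. A minimal sketch using the standard pandas accessors `.iloc` (position) and `.loc` (label); the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

by_position = s.iloc[0]       # first element, regardless of its label
by_label = s.loc["a"]         # element whose label is "a"
pos_slice = s.iloc[0:2]       # positions 0 and 1; the end is excluded
label_slice = s.loc["a":"b"]  # labels "a" through "b"; the end is included
```

Note the difference between the two slices: positional slices exclude the end point, while label slices include it.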

Indexing a series is similar to indexing a NumPy array. Let’s try it on arrays.

Arr1 = np.array([[1 , 2 , 3] , ["a" , "b" , "c"]]) # Mixing numbers and strings stores every element as a string
Arr1[0] # Accessing the first row
Arr1[0 , 1] # Accessing the second element of the first row
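The same row and column notation also selects whole columns. The sketch below uses a purely numeric array named `grid` (an illustrative name) and also shows what happens to the element type when numbers and strings are mixed.

```python
import numpy as np

grid = np.array([[1, 2, 3], [10, 20, 30]])

first_row = grid[0]      # the whole first row
one_value = grid[1, 2]   # second row, third column
second_col = grid[:, 1]  # every row, second column

# Mixing numbers and strings makes NumPy store everything as strings.
mixed = np.array([[1, 2, 3], ["a", "b", "c"]])
mixed_kind = mixed.dtype.kind  # "U" means a Unicode string type
mixed_value = mixed[0, 1]      # the string "2", not the integer 2
```

Because a NumPy array holds a single element type, keeping numbers and text in separate arrays (or in a data frame) avoids this silent conversion.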

One-dimensional arrays are indexed the same way as series. For 2-dimensional arrays, the indexing method above is used. Data frame indexing is a little different from the others, so let’s examine it now.

data = {"First": [1 , 2 , 3] , "Second": ["a" ,"b" ,"c"]}
Df1 = pd.DataFrame(data)
Df1["First"] # Accessing first column
Df1[0:1] # Accessing first row
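Beyond plain brackets, pandas provides `.iloc` and `.loc` for selecting rows and cells, and a boolean condition for filtering rows. The following is a minimal sketch on the same data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"First": [1, 2, 3], "Second": ["a", "b", "c"]})

first_row = df.iloc[0]              # first row, returned as a Series
cell = df.loc[0, "Second"]          # row label 0, column "Second"
two_cols = df[["First", "Second"]]  # a list of names returns a DataFrame
filtered = df[df["First"] > 1]      # keep rows where the condition is True
```

In short, `df["name"]` selects a column, a slice such as `df[0:1]` selects rows, and `.loc` or `.iloc` address rows and columns together.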