In this series I will try to gather the essentials of data analysis. Having covered the R programming language, we now move on to Python.
This series will consist of approximately 3 parts. The first part covers simple analysis functions, indexing, and the pandas library. Let’s start by examining series, data frames, and arrays.
Series, Data Frames, and Arrays
Before learning these structures, let’s import the necessary modules. Two modules will be used in this series: pandas and NumPy.
import pandas as pd
import numpy as np
A Series is a data structure that stores data in one dimension. It can hold numbers, strings, or other objects. Each element can be retrieved through its index.
Series1 = pd.Series([1 , 2 , 3 , 4 , 5])
Here we keep the numbers from 1 to 5 in the Series1 variable. By default the index runs from 0 upward. If we want different index labels, we can pass a second list containing them.
Series1 = pd.Series([1 , 2 , 3 ,4 , 5] , ["a" , "b" , "c" , "d" , "e"])
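As a quick sketch of what the custom index gives us, the example below looks up the same value once by its label and once by its position. The variable names (`s`, `by_label`, and so on) are only illustrative.

```python
import pandas as pd

# A Series whose index consists of string labels instead of 0..4
s = pd.Series([1, 2, 3, 4, 5], index=["a", "b", "c", "d", "e"])

by_label = s["c"]        # look up by index label
by_position = s.iloc[2]  # look up by position (0-based)
labels = list(s.index)   # the index labels themselves

print(by_label, by_position, labels)
```

Both lookups return 3 here, because the label "c" sits at position 2.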
Next we will create an array. Python's built-in lists are not designed for numerical work, so we use a NumPy array instead, which is generally faster than a list for numeric operations. Its usage is similar to Series.
Arr1 = np.array([1 , 2 , 3 , 4 , 5])
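One practical difference from lists is that arithmetic on a NumPy array is applied to every element at once. The following is a minimal sketch of that behaviour, with illustrative variable names.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
lst = [1, 2, 3, 4, 5]

doubled_arr = arr * 2                # elementwise multiplication
doubled_lst = [x * 2 for x in lst]   # a list needs an explicit loop
repeated = lst * 2                   # on a list, * repeats the list instead

print(doubled_arr)
print(doubled_lst)
print(repeated)
```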
Series are one-dimensional data structures, but arrays can also have 2 dimensions. To work in 2 dimensions, you pass a list that contains 2 lists to np.array.
Arr2 = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]])
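To check that an array really has 2 dimensions, we can inspect its `ndim` and `shape` attributes. This is a small sketch; the name `arr2` is illustrative.

```python
import numpy as np

# Two inner lists of equal length become the two rows of the array
arr2 = np.array([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]])

n_dims = arr2.ndim    # number of dimensions
shape = arr2.shape    # (rows, columns)

print(n_dims)  # 2
print(shape)   # (2, 5)
```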
We will return to the advantages of working in 2 dimensions in the indexing section. For now, it is enough to know how to create a 2-dimensional array. Next is the data frame, the most commonly used data structure.
Data frames consist of rows and columns, just like an SQL table. They make data easy to process, which is why they are so widely preferred.
data1 = dict(a=1, b=10, c=12, d=13)
data2 = dict(a=3, b=22, c=12, d=145)
data3 = dict(First=data1, Second=data2)
df = pd.DataFrame(data3)
This may look a little complicated at first. We create two dictionaries that serve as our columns: their keys become the row labels and their values become the cell values. We then combine them in a single dictionary, whose keys become the column names, and pass it to pd.DataFrame.
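To see how the nested dictionaries map onto rows and columns, the sketch below rebuilds the same data frame and inspects its structure. The names `columns`, `rows`, and `shape` are illustrative.

```python
import pandas as pd

data1 = dict(a=1, b=10, c=12, d=13)
data2 = dict(a=3, b=22, c=12, d=145)
df = pd.DataFrame(dict(First=data1, Second=data2))

columns = list(df.columns)  # outer keys become column names
rows = list(df.index)       # inner keys become row labels
shape = df.shape            # (number of rows, number of columns)

print(df)
print(columns, rows, shape)
```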
Subsetting Data Structures
After storing the data, we need to access it, which is done by indexing. Let’s try it on a Series first.
Series1 = pd.Series([1, 2, 3], ["a", "b", "c"])
Series1.iloc[0]    # Accessing the first element (.iloc selects by position)
Series1.iloc[0:2]  # Accessing positions 0 and 1 (the stop, 2, is excluded)
Series1.iloc[0:]   # Accessing from position 0 to the end
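Because this Series has string labels, it can be subset either by position or by label. The sketch below contrasts the two. It uses pandas' `.iloc` and `.loc` accessors, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

first = s.iloc[0]             # by position
first_two = s.iloc[0:2]       # positional slice: stop is excluded
label_slice = s.loc["a":"b"]  # label slice: both ends are included

print(first)
print(first_two.tolist())
print(label_slice.tolist())
```

Both slices return the first two elements here, but for different reasons: the positional slice stops before position 2, while the label slice includes its end label "b".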
Indexing a Series works much like indexing a NumPy array. Let’s try it on arrays.
Arr1 = np.array([[1, 2, 3], ["a", "b", "c"]])
Arr1[0]     # Accessing the first row (the first inner array)
Arr1[0, 1]  # Accessing the second element of the first row
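One detail worth knowing about the array above: a NumPy array holds a single data type, so mixing numbers and strings converts everything to strings. The sketch below shows this, and also shows column selection on a purely numeric array. The names `mixed` and `numeric` are illustrative.

```python
import numpy as np

# Mixed numbers and strings: every element is stored as a string
mixed = np.array([[1, 2, 3], ["a", "b", "c"]])
kind = mixed.dtype.kind    # 'U' means unicode string
first_elem = mixed[0, 0]   # the string '1', not the integer 1

# A purely numeric 2-dimensional array keeps its integer type
numeric = np.array([[1, 2, 3], [10, 20, 30]])
second_col = numeric[:, 1]  # every row, column at position 1

print(kind, repr(first_elem))
print(second_col)
```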
One-dimensional arrays are indexed the same way as Series, and 2-dimensional arrays use the row and column form shown above. Data frame indexing is a little different from the others. Let’s examine it now.
data = {"First": [1, 2, 3], "Second": ["a", "b", "c"]}
Df1 = pd.DataFrame(data)
Df1["First"]  # Accessing first column
Df1[0:1]      # Accessing first row
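Beyond square brackets, pandas also provides the `.iloc` (position-based) and `.loc` (label-based) accessors, which are often clearer for selecting rows and single cells. The following is a minimal sketch on the same data, with illustrative variable names.

```python
import pandas as pd

df = pd.DataFrame({"First": [1, 2, 3], "Second": ["a", "b", "c"]})

row0 = df.iloc[0]                           # first row, returned as a Series
cell = df.loc[1, "Second"]                  # row label 1, column "Second"
subset = df.loc[df["First"] > 1, "Second"]  # rows where First > 1

print(row0["First"], row0["Second"])
print(cell)
print(subset.tolist())
```

The last line uses a boolean condition to pick rows, a common pattern when filtering a data frame by its values.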

