For programming and computer science, data structures are the main subject. It is utilized in nearly all fields of computer science.
In this post, I will describe the data structures that are crucial for data science and programming, as well as what data structures mean in general.
What is Data Structures?
In computer science, a data structure is a data organization, management, and storage format that enables efficient access and modification according to Wikipedia.
If this term seems complicated, let me clarify. These are structures that can store, organize and help us modify multiple data.
Lists, dictionaries, and arrays are simple data structures present in all programming languages. Although their syntax varies, their logic stays the same, so you can adapt the examples we will conduct in Python to languages such as Java and C++.
Dictionaries , Maps , Hash Tables
When you start working with these structures, which I will give you important information instead of explaining at length, you will start to discover some features yourself.
In Python, dictionaries are for storing an arbitrary number of data, each with its own keywords. Maps are also called hash maps and lookup tables or associative arrays.
It makes it easier to organize data associated with a particular keyword and presents it in a more organized form.
For example, a dictionary may store attributes (data) such as age, date of birth, horoscope based on the name keyword.
data = {
"Mark": 12,
"Alice": 23,
"David": 8,
}
data["Mark"]
Output: 12
You can also type curly braces to fit on a single line, but dictionaries with large data will mess up, so it’s better to use indentation.
OrderedDict & DefualtDict & ChainMap
We have dictionaries with different features. Although each of them has a different purpose, I collected them under the same title because they are in the same library.
You need the collections library to access these data structures so don’t forget to install it on your computer.
As the name suggests, OrderedDict is a dictionary that does not forget the order of your keys and acts accordingly.
import collections as cs
dict1 = cs.OrderedDict(one=1,two=2)
print(dict1)
# Adding New Data
dict1["four" = 4]
print(dict1)
([('one', 1), ('two', 2)])
([('one', 1), ('two', 2), ('three', 3)])
You can use defaultdict to prepare messages that will inform you when you try to pull data that is not in the dictionaries.
Preparing the error messages in a way that is easy for you will help you in the debug phase.
from collections import defaultdict
# Error message
def error():
return "Index Not Found"
x = defaultdict(error)
# Create 10 index and get 10 value
for i in range(0,10):
x[i] = i
# Try give 11'th index
print(x[11])
Index Not Found
Do you have multiple dictionaries, let’s turn them into a single dictionary with Chainmap. ChainMap is most used with dictionaries.
from collections import ChainMap
dict1 = {"1": 1, "2": 2}
dict2 = {"3": 3, "4": 4}
main = ChainMap(dict1, dict2)
print(main["3"] , main["1"])
Output: 3 , 1
Array Data Structures
Arrays are one of the most widely used and well-known data structures. In this section, we’ll talk about how arrays work.
Arrays are called structures that hold the same data together, each array is a whole, and the data in it is of the same data type.
Although lists and arrays are separated in other programming languages, in Python they both have the same syntax. If you use different data, it becomes a list, if you use the same data, it becomes an array.
# from 0 to 10 value
arr = [1,2,3,4,5,6,7,8,9,10]
# String Array
arr1 = ["a" , "b" , "c"]
# Get First Indexing
arr1[0]
# Get from 0 to 4
arr[0:4]
# Deleting Element
del arr[0]
# Adding Element
arr.append(11)
Output 1: a
Output 2: [1, 2, 3, 4]
Tuple Data Structures
Tuples are another data structure capable of holding any data type components. With this flexibility, however, that the data is less tight than in a typed array.
In this data type, the data cannot be dynamically deleted, edited or any action can be taken, although it is often seen as a disadvantage, you can turn it in your favor for security.
tuple = (1 , 2 , 3)
tuple[0]
tuple1 = ("x" , 1 , 1.25)
tuple1[2]
# you'll get error
del tuple[0]
tuple[1] = "y"
Output: 1
Output: 1.25
Using Array with Array Library
Python’s array module stores fundamental C-style data types like bytes, 32-bit integers, floating-point numbers, and so on in a space-efficient manner.
Unlike lists with one data type, it only accepts data with the same data type, so it is impossible to enter different data types.
Because of this constraint, array.array objects with many elements are more space-efficient than lists and tuples.
# Accessing array
from array import array
# use type code
arr = array("f" , [1.0 , 1.2])
If you are using the IDE, it will give you all the type codes. Choose the type code according to the data you will use. Any type other than this data type of code is not accepted.
String – Arrays of Unicode Characters
String objects save space since they are densely packed and specialize in a particular data type. If you want to save Unicode text, you should use a string.
In this section, we will save not only texts but also emojis, as I said, every Unicode must be stored in a string object.
str = "55555"
emoji = "😀"
print(str , emoji)
Output: 55555 😀
Bytes & ByteArray
Bytes objects are structures that can hold numbers between 0 and 255. If you exceed 255, you will get an error, the same is true for numbers below 0.
They are high-performance and stable structures like strings, if the numbers you are dealing with are in this value range, it will be useful to use bytes.
x = bytes([1 , 2 , 3])
y = bytes([-1 , 2 , 3])
z = bytes([100 , 200 , 300])
Output: b'\x01\x02\x03'
Output: error
Output: error
With the indexing method, you can pull your data, but you cannot edit, add or delete like tuple structures.
However, you can add, update and delete data with the byte array class. bytes rules apply, they can only be manipulated.
x = bytearray((10 , 20 , 30))
# Update
x[1] = 11
# Remove
del x[0]
# Adding
x.append(40)
# Show all data
for i in x:
print(i)
11
30
40
Writing Custom Class For Productivity
For more control, more power, all you need is yourself. Don’t be afraid to create and use your own classes. Creating complex classes can be tiring at times, but it will increase your productivity.
When you want to add business logic and activity to your record objects via methods, creating a custom class is a fantastic solution. However, this means that these things are no longer just data objects.
class Student:
def __init__(self, name, note):
self.name = name
self.note = note
x = Student("David" , 55)
y = Student("Mark" , 35)
# Access Data
print(x.name , x.note)
print(Student)
Output: David 55
Output: <main.Student object at 0x7f53925c2400>
If you’ve noticed, it doesn’t exactly look like a data class. The easiest way out of this situation is to use @dataclass, but Python 3.7+ is required.
from dataclasses import dataclass
@dataclass
class Student:
name: str
note: int
x = Student("David" , 55)
print(x)
Output: Student(name='David', note=55)
Sets & Multisets Data Structures
The set type is the built-in set implementation in Python. It’s mutable and allows for the dynamic insertion and deletion of elements.
Python’s sets are backed by the dict data type and share the same performance characteristics. Any hashable object can be stored in a set.
set = {1 , 2 , 3}
set.add(4)
set.remove(3)
If it doesn’t seem like it’s structurally different from an array, don’t worry, I’ll summarize the important differences.
While renewed items can be found in lists and arrays, sets cannot. every data should be unique, like passwords, usernames, nicknames…
Frozen Sets
There is almost no difference between the set data structure and the only difference is that no value (data) can be changed.
frozen = frozenset({"x" , "y" , "z"})
frozen.add("k")
Output: Erorr
Using Multiplesets With Counter
The counter class in the Collections library allows us to combine more than one set. if you have more than one set but some have the same values it will not add the same values to the new set instead, it shows how much the key is used by increasing the value.
from collections import Counter
merge = Counter()
fruits = {"apple" , "banana" , "orange"}
merge.update(fruits)
print(merge)
fruits1 = {"apple" , "banana" , "watermelon"}
merge.update(fruits1)
print(merge)
{'orange': 1, 'apple': 1, 'banana': 1})
{'apple': 2, 'banana': 2, 'orange': 1, 'watermelon': 1})
Although it carries each value separately in the set, it only stores the main key in length, you can briefly view it with the code below.
# Unique Value
len(merge)
# Total Elements
sum(merge.values())
Output: 4
Output: 6
Stacks (LIFOs) in Python
A stack is a collection of items that fast in/out semantics (LIFO) for insert and delete are supported.
Unlike Arrays and lists, you can’t do random access, you need to use functions for insertion and deletion.
Using Lists For Stacks
You can add the data to the end with append and remove it from the LIFO queue with the help of pop, but let’s not say that the performance is very bad as the list grows.
stack = []
stack.append(1)
stack.append(2)
stack.append(3)
print(stack)
stack.pop()
stack.pop()
print(stack)
Output: [1,2,3]
Output: [1]
Let’s examine it step by step, every time you use append, it is appended in the last index, and every time the pow function runs, it deletes in the last index.
Using Deque For Stack
The difference of deque from the list also supports adding and removing elements at the beginning of the array in fixed time.
Therefore, it works more efficiently and faster than a list. It also supports random access.
You may lose performance if you try to delete the data in between with the deque, the main reason is that all the elements up to the two ends move to make room or fill the gap.
from collections import deque
stack = deque()
stack.append("a")
stack.append("b")
stack.append("c")
print(stack)
print(stack.pop())
print(stack.pop())
print(stack)
Output: deque(['a', 'b', 'c'])
Output: deque(['a'])
Queues (FIFO’s) in Python
The logic on the stack is slightly different here, where first-in, first-out (FIFO) is applied, while in the stack, first-in, last-out is applied.
We can use the list and deque data structures that we use in the stack here also use the Queue class in the queue.
queue = []
queue.append("x")
queue.append("y")
queue.append("z")
print(queue)
queue.pop(0)
queue.pop(0)
print(queue)
Output: ['x', 'y', 'z']
Output: ['z']
Using Deque For Queue
Everything in the stack is valid, the only difference is that we use popleft instead of the pop function, for use first-in-first-out logic.
from collections import deque
queue = deque()
queue.append("x")
queue.append("y")
queue.append("z")
print(queue)
queue.popleft()
queue.popleft()
print(queue)
Output: deque(['x', 'y', 'z'])
Output: deque(['z'])
Using Queue Class
A queue is a structure by which we can determine how much data a queue can hold and store. It is mostly used to implement a queue.
You can create an infinite queue by setting the max size parameter to 0 and this conforms to the FIFO rule.
from queue import Queue
queue = Queue(maxsize = 0)
# Adding element
queue.put(10)
queue.put(20)
queue.put(30)
print(queue.queue)
# Removing element
queue.get()
queue.get()
print(queue.queue)
Output: deque([10, 20, 30])
Output: deque([30])
Thanks For Reading and Happy Coding!

