Data Structures the Fun Way - 1. Memory Information
Basic methods for storing data in memory
š 1.1. Variables
Variables are Labels
Variables are like paper labels attached to folders containing documents. After attaching a label, there's no need to remember the order or position of the folder.
Therefore, it's important to use names that contain sufficient information.
Variables are also related to the type of stored data.
š 1.2 Composite Data Structures
When Things Get Complex, We Need to Bundle Them
Composite data structures refer to structures or objects that tie multiple individual variables into one group.
A business card is a data package containing multiple pieces of information such as name, email, etc. (Great analogy!)
š 1.3 Arrays
Characteristics Unique to Arrays
Since the boxes within an array are adjacent to each other in computer memory, each box can be easily accessed by calculating the offset from the first element and reading the memory at the corresponding location.
This means that regardless of the position of the box you want to access, only one addition and memory access is needed.
That's why arrays are particularly convenient for storing items with order.
Insertion sort sorts a portion of the array and expands this sorted range until the entire array is sorted. The algorithm iterates through each element of the unsorted array, moving them to the correct position in the sorted portion.
Connecting with Book Content
It's also possible to start indices from 1, and some programming languages follow this rule. This is a newly learned fact! Notably, R is said to start indices of vectors or lists from 1. However, languages like C, Python, and JavaScript use 0-based indexing as default. This is said to originate from the computer engineering tradition where arrays were designed based on memory addresses.
The book says that arrays are not like bookshelves where you can insert books, but rather like stores lined up in a row. So that's why arrays are sometimes called inefficient data structures?
Insertion Sort
Insertion sort sorts a portion of the array and expands this sorted range until the entire array is sorted. In other words, insertion sort initially creates a small sorted portion and gradually expands this sorted range by adding new elements one by one. This way, the entire array eventually becomes sorted.
Therefore, at the point when iteration i starts, all elements at positions i-1 and below are already sorted.
Insertion Sort vs. First-In-First-Out
The book used the example of sorting coffee by freshness. So what's the difference from First-In-First-Out (FIFO)?
Insertion sort is not simply sorting in the order they came in, but rather a strategy closer to maintaining "expiration date order" overall by finding the appropriate place within the already sorted section and inserting it each time a new product comes in.
To summarize, First-In-First-Out follows the incoming order as is, while insertion sort is a sorting method that repositions according to criteria each time a new element is inserted.
Implementing Insertion Sort with Nested Loops
function insertionSort(A: number[]) {
const N = A.length;
let i = 1; // Assume 0 is already sorted
while (i < N) {
let current = A[i]; // Store the element to insert temporarily
let j = i - 1; // Start comparison from the element right before current
while (j >= 0 && A[j] > current) {
A[j + 1] = A[j]; // Move j-th element one position back
j = j - 1; // Move j one position forward (to compare with earlier elements)
}
A[j + 1] = current; // Place current at the position after movement ends
i = i + 1; // Increment i for next position
}
return A;
}
Time Complexity
The time complexity of insertion sort is O(N²) on average.
If it's already almost sorted, the best case can be faster at around O(N), but in the worst case and average case, it performs approximately N² comparisons and movement operations.
Although not very efficient, as the book says, it's a sorting algorithm that helps understand how to handle arrays.
The characteristics of arrays learned through insertion sort can be summarized in three points: accessing elements by index, exchanging values when inserting new elements, and being able to iterate through all elements.
š 1.4 Strings
Relationship Between Strings and Arrays
Strings are often thought of as a special type of array, an ordered list of characters.
Since strings internally have a structure where characters are arranged in order, they can be viewed as "arrays of characters." (Of course, actual implementation or characteristics differ)
To summarize, both
str[0]
andarr[0]
are similar in that they "access ordered data using indices," but to say they are identical, there are implementation differences depending on data types and languages.
const arr = [1, 2, 3, 4];
const str = 1234;
console.log(arr[1]); // 2
console.log(str[1]); // undefined
Algorithm for Checking Equality of Two Strings
function stringEqual(str1: string, str2: string): boolean {
// If the lengths of the two strings are different, return false immediately
if (str1.length !== str2.length) {
return false;
}
// If the lengths are the same, compare each character
for (let i = 0; i < str1.length; i++) {
// If any character is different, return false
if (str1[i] !== str2[i]) {
return false;
}
}
// If all characters are the same, return true
return true;
}
Time Complexity
In the best case, if the first character is different, it ends with one comparison, so it's O(1).
However, on average or in the worst case, all characters must be compared, so it's O(N).
Array vs. String
So when solving algorithm problems, which is more efficient between arrays and strings? It seems appropriate to choose the right data structure according to the problem situation.
Category
Array
String
Advantages
⢠Random Access (O(1)): Can immediately access elements at desired positions using indices ⢠Flexibility: Can store various data types ⢠Various Operations Support: Can efficiently implement various algorithms like insertion, deletion, sorting
⢠Special Operations Support: Provides string-specific functions like substring, split, join, etc. ⢠Immutability: Can be safely shared ⢠Text Processing Optimization: Advantageous for text processing like pattern matching, regular expressions
Disadvantages
⢠Fixed Size: In some languages, size is fixed, requiring new array creation when changing ⢠Memory Usage: Requires contiguous memory, large arrays may waste memory
⢠Difficulty in Modification: If immutable, requires creating new string when modifying ⢠Limited Data Types: Mainly stores characters only, cannot store various types
Time Complexity
⢠Access and Modification: Random access and modification are fast (O(1)) ⢠Insertion/Deletion: Takes O(N) when inserting/deleting in the middle
⢠Access: Index access is O(1), but modification requires copying entire string (O(N)) ⢠Search/Pattern Matching: Efficient when using optimized algorithms (generally O(N))
Efficiency
⢠Data Manipulation: Advantageous for general data manipulation like sorting, insertion, deletion ⢠Various Types: Suitable when dealing with multiple data types
⢠Text Processing: Optimized for text-related tasks like search, pattern matching ⢠String-Specific Function Utilization: Easy to implement string-related features
Memory
⢠Flexible Type Storage: Can store various types, memory usage is dynamic ⢠Contiguous Memory: Large arrays may waste memory
⢠Fixed Character Type: Stores only characters, memory usage is constant ⢠Immutability: Additional memory usage when modifying strings
š Related Problems and Solutions
Last updated