My Immutability #206
Tell us about something you learned this week.
— Immutability is a core principle in functional programming, and it has a lot to offer to object-oriented programs as well.
What are the pros and cons of immutability?
Advantages:
— Each function is easier to understand because it uses only other public functions and its own private variables.
Disadvantages:
— It takes more memory. Each function uses its own accumulator (acc), so the functional version requires roughly twice as much memory as the imperative version.
— The application state is stored in a single object, the store, which cannot be edited directly.
— To change the state of the application, you use a function called a reducer, which takes as input a state and an action (an object that declaratively describes a change) and returns the next state.
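As a small sketch of that store/reducer idea (the state shape and the 'INCREMENT' action type here are made up for illustration):

// A minimal reducer sketch: it never mutates the old state object,
// it returns a brand-new object describing the next state.
const initialState = { count: 0 };

function counterReducer(state = initialState, action) {
  switch (action.type) {
    case 'INCREMENT':
      // Copy the old state and override only the field that changes.
      return { ...state, count: state.count + 1 };
    default:
      // Unknown actions leave the state untouched.
      return state;
  }
}

const next = counterReducer({ count: 1 }, { type: 'INCREMENT' });
// next is { count: 2 }; the original { count: 1 } object is unchanged.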
How can you achieve immutability in your own code?
— Mutable objects are those whose state is allowed to change over time. An immutable value is the exact opposite: after it has been created, it can never change. Strings and numbers are inherently immutable in JavaScript.
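A few common ways to keep your own values immutable in JavaScript (a sketch; note that Object.freeze is shallow, so nested objects need their own handling):

const point = Object.freeze({ x: 1, y: 2 });
// point.x = 5;                               // ignored (throws in strict mode)

// Instead of mutating, create new values:
const moved = { ...point, x: point.x + 1 };   // new object, point is untouched
const list = [1, 2, 3];
const grown = [...list, 4];                   // new array, list is untouched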
What are Divide and Conquer algorithms? Describe how they work. Can you give any common examples of the types of problems where this approach might be used?
— Divide and conquer algorithms break a problem down into smaller subproblems until those subproblems are small enough to solve directly, and then merge those solutions back together to eventually produce the solution to the original problem. Typical use cases are searching and sorting, where numbers are compared against each other, as in binary search, merge sort, and quicksort.
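Binary search is one small illustration: to find a number in a sorted array, it compares against the middle element and keeps searching only one half (a sketch):

function binarySearch(sorted, target) {
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) low = mid + 1;   // keep only the right half
    else high = mid - 1;                       // keep only the left half
  }
  return -1;                                   // not found
}

binarySearch([5, 11, 12, 22, 25, 64], 22);     // => 3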
How do insertion sort, heap sort, quick sort, and merge sort work?
— Insertion Sort is a comparison sort in which the sorted array (or list) is built one entry at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. However, insertion sort provides several advantages:
- Simple implementation
- Efficient for small data sets
- Adaptive, i.e., efficient for data sets that are already substantially sorted: the time complexity is O(n+d), where d is the number of inversions
- More efficient in practice than most other simple quadratic algorithms such as selection sort or bubble sort: the average running time is n²/4, and the running time is linear in the best case
- Stable, i.e., does not change the relative order of elements with equal keys
- In-place, i.e., only requires a constant amount O(1) of additional memory space
- Online, i.e., can sort a list as it receives it
Algorithm
Every iteration of insertion sort removes an element from the input data, inserting it into the correct position in the already-sorted list, until no input elements remain. The choice of which element to remove from the input is arbitrary, and can be made using almost any choice algorithm.
Sorting is typically done in-place. The resulting array after k iterations has the property that the first k+1 entries are sorted. In each iteration the first remaining entry of the input is removed and inserted into the result at the correct position, thus extending the result.
Example: consider sorting “64 25 12 22 11”. Each pass takes the next unsorted entry and inserts it into the sorted prefix: 25 64 12 22 11 → 12 25 64 22 11 → 12 22 25 64 11 → 11 12 22 25 64.
Pseudocode Implementation:
insertionSort(array A)
begin
    for i := 1 to length[A] - 1 do
    begin
        value := A[i];
        j := i - 1;
        while j >= 0 and A[j] > value do
        begin
            A[j + 1] := A[j];
            j := j - 1;
        end;
        A[j + 1] := value;
    end;
end;
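For reference, a runnable JavaScript translation of the pseudocode above (a sketch; it sorts the array in place and returns it):

function insertionSort(a) {
  for (let i = 1; i < a.length; i++) {
    const value = a[i];                  // next element to insert
    let j = i - 1;
    // Shift larger sorted elements one slot to the right.
    while (j >= 0 && a[j] > value) {
      a[j + 1] = a[j];
      j--;
    }
    a[j + 1] = value;                    // drop the element into its slot
  }
  return a;
}

insertionSort([64, 25, 12, 22, 11]); // => [11, 12, 22, 25, 64]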
Performance
- Worst case performance: O(n²)
- Best case performance: O(n)
- Average case performance: O(n²)
- Worst case space complexity: O(n) total, O(1) auxiliary
A Comparison of the O(n²) Sorting Algorithms
Among simple average-case O(n2) algorithms, selection sort almost always outperforms bubble sort, but is generally outperformed by insertion sort.
Insertion sort’s advantage is that it only scans as many elements as it needs in order to place the (k+1)st element, while selection sort must scan all remaining elements to find the (k+1)st element. Experiments show that insertion sort usually performs about half as many comparisons as selection sort. Selection sort will perform identically regardless of the order of the array, while insertion sort’s running time can vary considerably. Insertion sort runs much more efficiently if the array is already sorted or “close to sorted.”
Selection sort always performs O(n) swaps, while insertion sort performs O(n²) swaps in the average and worst case. Selection sort is preferable if writing to memory is significantly more expensive than reading.
Insertion sort and selection sort are both typically faster than the O(n log n) algorithms for small arrays (i.e., fewer than 10–20 elements). A useful optimization in practice for the recursive algorithms is to switch to insertion sort or selection sort for “small enough” subarrays.
Merge Sort
Merge sort is an O(n log n) comparison-based sorting algorithm. It is an example of the divide and conquer algorithmic paradigm.
Algorithm
Conceptually, a merge sort works as follows:
- If the list is of length 0 or 1, then it is already sorted. Otherwise:
- Divide the unsorted list into two sublists of about half the size.
- Sort each sublist recursively by re-applying merge sort.
- Merge the two sublists back into one sorted list.
Merge sort incorporates two main ideas to improve its runtime:
- A small list will take fewer steps to sort than a large list.
- Fewer steps are required to construct a sorted list from two sorted lists than two unsorted lists. For example, you only have to traverse each list once if they’re already sorted.
Algorithm MergeSort(A, 0, n-1)
    MergeSort(A, 0, n/2)          // sort the first half
    MergeSort(A, n/2 + 1, n-1)    // sort the second half
    MergeTogether(the two sorted halves above)
Example: sort the following numbers 35, 62, 33, 20, 5, 72, 48, 50.
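A runnable JavaScript sketch of the same idea, applied to those numbers (this version returns a new array rather than sorting in place):

function mergeSort(a) {
  if (a.length <= 1) return a;                  // base case: already sorted
  const mid = Math.floor(a.length / 2);
  const left = mergeSort(a.slice(0, mid));      // sort each half recursively
  const right = mergeSort(a.slice(mid));
  return merge(left, right);                    // combine the two sorted halves
}

function merge(left, right) {
  const out = [];
  let i = 0, j = 0;
  // Because both inputs are already sorted, one pass over each is enough.
  while (i < left.length && j < right.length) {
    out.push(left[i] <= right[j] ? left[i++] : right[j++]);
  }
  return out.concat(left.slice(i), right.slice(j));
}

mergeSort([35, 62, 33, 20, 5, 72, 48, 50]);
// => [5, 20, 33, 35, 48, 50, 62, 72]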
Performance
- Worst case performance: O(n log n)
- Best case performance: O(n log n) typical
- Average case performance: O(n log n)
- Worst case space complexity: O(n) total, O(n) auxiliary
Analysis
Let’s analyze the big-Oh of the above MergeSort.
We can solve the recurrence relation given above. We’ll write n instead of O(n) in the first line below because it makes the algebra much simpler.
T(n) = 2 T(n/2) + n
     = 2 [2 T(n/4) + n/2] + n
     = 4 T(n/4) + 2n
     = 4 [2 T(n/8) + n/4] + 2n
     = 8 T(n/8) + 3n
     = (fill in this line)
     = ……
     = 2^k T(n/2^k) + kn
We know that T(1) = 1 and this is a way to end the derivation above. In particular we want T(1) to appear on the right hand side of the = sign. This means we want:
n/2^k = 1   OR   n = 2^k   OR   log₂ n = k
Continuing with the previous derivation we get the following, since k = log₂ n:
= 2^k T(n/2^k) + kn
= 2^(log₂ n) T(1) + (log₂ n) n
= n + n log₂ n   [remember that T(1) = 1 and 2^(log₂ n) = n]
= O(n log n)
So we’ve solved the recurrence relation and its solution is what we “knew” it would be. To make this a formal proof you would need to use induction to show that O(n log n) is the solution to the given recurrence relation, but the “plug and chug” method shown above shows how to derive the solution; the subsequent verification that this is the solution can be left to a more advanced algorithms class.
Quicksort
Quicksort is a well-known sorting algorithm that, on average, makes O(n log n) comparisons to sort n items. However, in the worst case, it makes O(n²) comparisons. Typically, quicksort is significantly faster than other O(n log n) algorithms, because its inner loop can be efficiently implemented on most architectures, and in most real-world data, it is possible to make design choices which minimize the probability of requiring quadratic time.
Quicksort is a comparison sort and, in efficient implementations, is not a stable sort.
Algorithm
Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists.
The steps are:
- Pick an element, called the pivot, from the list.
- Reorder the list so that all elements which are less than the pivot come before the pivot and so that all elements greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
- Recursively sort the sub-list of lesser elements and the sub-list of greater elements. The base cases of the recursion are lists of size zero or one, which are always sorted.
function quicksort(array)
    var list less, greater
    if length(array) <= 1
        return array
    select and remove a pivot from array
    for each x in array
        if x <= pivot then append x to less
        else append x to greater
    return concatenate(quicksort(less), pivot, quicksort(greater))
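The same simple version as runnable JavaScript (a sketch that always takes the first element as the pivot, so it still needs the extra less/greater arrays):

function quicksort(array) {
  if (array.length <= 1) return array;
  const [pivot, ...rest] = array;               // select and remove a pivot
  const less = rest.filter(x => x <= pivot);
  const greater = rest.filter(x => x > pivot);
  return [...quicksort(less), pivot, ...quicksort(greater)];
}

quicksort([40, 20, 10, 80, 60, 50, 7, 30, 100, 90, 70]);
// => [7, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]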
Quicksort is similar to merge sort in many ways. It divides the elements to be sorted into two groups, sorts the two groups by recursive calls, and combines the two sorted groups into a single array of sorted values. However, the method for dividing the array in half is much more sophisticated than the simple method we used for merge sort. On the other hand, the method for combining these two groups of sorted elements is trivial compared to the method used in mergesort.
The correctness of the partition algorithm is based on the following two arguments:
- At each iteration, all the elements processed so far are in the desired position: before the pivot if less than or equal to the pivot’s value, after the pivot otherwise.
- Each iteration leaves one fewer element to be processed.
The disadvantage of the simple version above is that it requires O(n) extra storage space, which is as bad as merge sort. The additional memory allocations required can also drastically impact speed and cache performance in practical implementations. There is a more complex version which uses an in-place partition algorithm and uses much less space.
The partition pseudocode:
partition(int[] data, int first, int n):
1. Initialize values:
       pivot = data[first];
       tooBigIndex = first + 1;          // index of the element after the pivot
       tooSmallIndex = first + n - 1;    // index of the last element
2. Repeat the following until the two indices cross each other:
   2a. While tooBigIndex is not yet beyond the final index of the part of the array we are
       partitioning, and data[tooBigIndex] is less than or equal to the pivot, move tooBigIndex
       to the next index.
   2b. While data[tooSmallIndex] is greater than the pivot, move tooSmallIndex down to the previous index.
   2c. If tooBigIndex < tooSmallIndex, swap the values of data[tooBigIndex] and data[tooSmallIndex].
3. Move the pivot element to its correct position at data[tooSmallIndex] and return tooSmallIndex.
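A JavaScript sketch of that in-place partition, plus a small wrapper that drives the recursion (the wrapper name quicksortInPlace is just for illustration):

function partition(data, first, n) {
  const pivot = data[first];
  let tooBigIndex = first + 1;                  // index of the element after the pivot
  let tooSmallIndex = first + n - 1;            // index of the last element

  while (tooBigIndex <= tooSmallIndex) {
    // 2a. skip over elements that already belong on the left side
    while (tooBigIndex < first + n && data[tooBigIndex] <= pivot) tooBigIndex++;
    // 2b. skip over elements that already belong on the right side
    while (data[tooSmallIndex] > pivot) tooSmallIndex--;
    // 2c. swap a misplaced pair
    if (tooBigIndex < tooSmallIndex) {
      [data[tooBigIndex], data[tooSmallIndex]] = [data[tooSmallIndex], data[tooBigIndex]];
    }
  }
  // 3. move the pivot into its final position and report where it landed
  [data[first], data[tooSmallIndex]] = [data[tooSmallIndex], data[first]];
  return tooSmallIndex;
}

function quicksortInPlace(data, first = 0, n = data.length) {
  if (n > 1) {
    const pivotIndex = partition(data, first, n);
    quicksortInPlace(data, first, pivotIndex - first);                    // elements before the pivot
    quicksortInPlace(data, pivotIndex + 1, first + n - pivotIndex - 1);   // elements after the pivot
  }
  return data;
}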
Choosing a Good Pivot Element
The choice of a good pivot element is critical to the efficiency of the quicksort algorithm. If we can ensure that the pivot element is near the median of the array values, then quicksort is very efficient. One technique that is often used to increase the likelihood of choosing a good pivot element is to randomly choose three values from the array and then use the middle of these three values as the pivot element.
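A tiny sketch of that idea (the helper name is made up; it samples three values at random and returns the middle one):

function chooseRandomPivot(data) {
  const pick = () => data[Math.floor(Math.random() * data.length)];
  const samples = [pick(), pick(), pick()];
  return samples.sort((a, b) => a - b)[1];      // the median of the three samples
}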
Let’s try the quicksort algorithm with the following array: 40, 20, 10, 80, 60, 50, 7, 30, 100, 90, and 70.
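Using the partition sketch from above, the first call would pick 40 as the pivot and leave it at index 4:

const a = [40, 20, 10, 80, 60, 50, 7, 30, 100, 90, 70];
partition(a, 0, a.length);                      // => 4
// a is now [7, 20, 10, 30, 40, 50, 60, 80, 100, 90, 70]:
// everything left of 40 is <= 40 and everything to its right is greater.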
Heapsort
Heapsort is a comparison-based sorting algorithm, and is part of the selection sort family. Although somewhat slower in practice on most machines than a good implementation of quicksort, it has the advantage of a worst-case O(n log n) runtime. Heapsort combines the time efficiency of merge sort and the storage efficiency of quicksort.
Heapsort is similar to selection sort in that it locates the largest value and places it in the final array position. Then it locates the next largest value and places it in the next-to-last array position and so forth. However, heapsort uses a much more efficient algorithm to locate the array values to be moved.
How does it work?
Heapsort begins by building a heap out of the data set, then removing the largest item and placing it at the end of the sorted array. After removing the largest item, it reconstructs the heap, removes the largest remaining item, and places it in the next open position from the end of the sorted array.
This is repeated until there are no items left in the heap and the sorted array is full. Elementary implementations require two arrays: one to hold the heap and the other to hold the sorted elements. In more advanced implementations, however, there is an efficient method for representing a heap (a complete binary tree) in the array itself, so no extra data structure is needed to hold the heap.
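A compact JavaScript sketch of that in-place approach (the heap lives in the array itself; siftDown restores the max-heap property below a given root):

function heapsort(a) {
  const siftDown = (start, end) => {
    let root = start;
    while (2 * root + 1 <= end) {
      let child = 2 * root + 1;                                  // left child
      if (child + 1 <= end && a[child] < a[child + 1]) child++;  // prefer the larger child
      if (a[root] >= a[child]) return;                           // heap property restored
      [a[root], a[child]] = [a[child], a[root]];
      root = child;
    }
  };

  // Build a max-heap out of the data set.
  for (let i = Math.floor(a.length / 2) - 1; i >= 0; i--) siftDown(i, a.length - 1);

  // Repeatedly move the largest remaining item to the end of the sorted region.
  for (let end = a.length - 1; end > 0; end--) {
    [a[0], a[end]] = [a[end], a[0]];
    siftDown(0, end - 1);
  }
  return a;
}

heapsort([35, 62, 33, 20, 5]); // => [5, 20, 33, 35, 62]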
What are the key advantages of insertion sort, quick sort, heap sort and merge sort? Discuss best, average, and worst case time and memory complexity.
— From the fastest algorithms to the slowest:
— Quick Sort: O(n log n) on average and typically the fastest in practice, though its worst case is O(n²); the simple version uses extra arrays, but in-place versions need very little additional memory.
— Merge Sort: O(n log n) in the best, average, and worst cases, but it needs O(n) auxiliary space for merging, so it is usually slightly slower than quicksort in practice.
— Heap Sort: O(n log n) in the worst case and sorts in place without extra arrays, but it is still usually a bit slower than a good quicksort.
— Insertion Sort: O(n²) in the average and worst cases, so it is ineffective on large inputs, but it is O(n) in the best case and works well for small or nearly sorted arrays.
Explain the difference between mutable and immutable objects.
— A mutable object is an object whose state can be modified after it is created. An immutable object is an object whose state cannot be modified after it is created. Examples of native JavaScript values that are immutable are numbers and strings. Examples of native JavaScript values that are mutable include objects, arrays, functions, classes, sets, and maps.
What are the three laws of algorithm recursion? Describe them in your own words and what they mean to you.
— A recursive algorithm must call itself, recursively.
— A recursive algorithm must have a base case.
— A recursive algorithm must change its state and move toward the base case.
A base case is the condition that allows the algorithm to stop recursing.
- A base case is typically a problem that is small enough to solve directly.
- In the factorial algorithm, the base case is n = 1.
We must arrange for a change of state that moves the algorithm toward the base case.
- A change of state means that some data that the algorithm is using is modified.
- Usually the data that represents our problem gets smaller in some way.
- In the factorial algorithm, n decreases.
Recursion is a technique for iterating over an operation by having a function call itself repeatedly until it arrives at a result. Most loops can be rewritten in a recursive style, and in some functional languages this approach to looping is the default.
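The factorial algorithm mentioned above shows all three laws in a few lines (a minimal sketch):

function factorial(n) {
  if (n === 1) return 1;         // law 2: the base case stops the recursion
  return n * factorial(n - 1);   // law 1: it calls itself; law 3: n - 1 moves toward the base case
}

factorial(5); // => 120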