;; CS101 Lecture Notes
;; Spring 2013
;; Lecture 22

; Case Study: The sorting problem
;
; Suppose we have a list of numbers and we want to produce the same
; list in ascending (increasing) order. We need a function that
; consumes a list of numbers and produces a list of numbers, such
; that the output list of numbers is sorted in ascending order.
;
;
; Since sorting is such an important problem (estimates say up to
; 80% of the world's computational power is used in sorting large
; data sets), there are LOTS of sorting algorithms. Different
; sorting algorithms vary in running time. Running time is the
; number of steps taken by the algorithm for an input of size n.
; Steps can usually be viewed as lines of code, except when
; recursion causes repetition.
;
; Running time is always expressed in terms of n, the input size.
;

;; Example list-of-numbers:
(define unordered '(5 3 6 2))

;; For the list called unordered,
;; 1. (first unordered) --> 5
;; 2. (rest unordered) --> '(3 6 2)
;; 3. (sorted (rest unordered)) --> '(2 3 6)
;; 4. sorting the whole list involves inserting 5 into '(2 3 6) --> '(2 3 5 6)

;
; INSERTION SORT
;
; One way to produce the desired answer for step 4 is to insert
; 5 into the sorted numbers in the rest of the list.
;
; When inserting an element in the general case, we must
; potentially look through the entire rest of the list. That
; suggests a function that consumes a list of numbers!
;
; Wish List
;
; 1. function to insert given number in proper location,
;    given that the rest of the list is already in sorted
;    order.
;

;; Contract: (insert number list-of-numbers) -> list-of-numbers
;; Header:   (define insert (lambda (n lon) ... ))
;; Purpose:  to produce a sorted list containing n and all the numbers
;;           in lon, given that lon is sorted in increasing order.
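;;
;; A hand trace of step 4, given as a sketch of how INSERT (defined
;; further down) is expected to behave on the example. Each line is
;; the next step of the evaluation:
;;   (insert 5 '(2 3 6))
;;   --> (cons 2 (insert 5 '(3 6)))          ; 5 > 2, so 2 stays in front
;;   --> (cons 2 (cons 3 (insert 5 '(6))))   ; 5 > 3, so 3 stays in front
;;   --> (cons 2 (cons 3 (cons 5 '(6))))     ; 5 <= 6, so 5 goes before 6
;;   --> '(2 3 5 6)
;; The same expectation written as a test:
(check-expect (insert 5 '(2 3 6)) '(2 3 5 6))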
;;; Examples/Tests for insert:
(check-expect (insert 1 empty) '(1))
(check-expect (insert 4 '(3 5 6)) '(3 4 5 6))
(check-expect (insert 8 '(1 3)) '(1 3 8))
(check-expect (insert 2 '(3 4 5)) '(2 3 4 5))

;; The insert function will take 2 inputs, a number and a LON.
;; This suggests that we use the LON template to start writing
;; the insert helper function:

; (define insert
;   (lambda (n lon)
;     (cond
;       [(empty? lon) ...]
;       [else ... (first lon) ... (insert n (rest lon))])))

;; There are two cases to consider in the else: the first is the
;; case in which (first lon) >= n, and the second is the case
;; in which (first lon) < n. So we will have:
;;   [else
;;     (cond
;;       [(<= n (first lon)) ...]
;;       [(> n (first lon)) ...])]

;; Function definition
(define insert
  (lambda (n lon)
    (cond
      ;; if lon is empty, then create a list containing only n
      [(empty? lon) (cons n empty)]
      ;; if n is less than or equal to the first element of lon,
      ;; cons n onto the front of lon
      [(<= n (first lon)) (cons n lon)]
      ;; otherwise, cons the first element of lon onto the
      ;; front of a recursive call to insert n on the rest
      ;; of lon
      [else (cons (first lon) (insert n (rest lon)))])))

;; Insertion sort will use insert as a helper function to put the
;; list back together in sorted order.

;; Contract: (insertion-sort list-of-numbers) -> list-of-numbers
;; Header:   (define insertion-sort (lambda (lon) ... ))
;; Purpose:  to return lon in sorted order, from lowest to highest

;; Pre-function Tests:
(check-expect (insertion-sort empty) empty)
(check-expect (insertion-sort unordered)
              (cons 2 (cons 3 (cons 5 (cons 6 empty)))))
(check-expect (insertion-sort '(8 3 1))
              (cons 1 (cons 3 (cons 8 empty))))
(check-expect (insertion-sort '(47 27 5 9 12 5)) '(5 5 9 12 27 47))

;; Function definition
(define insertion-sort
  (lambda (lon)
    (cond
      ;; base case: list empty, return empty
      [(empty? 
               lon) empty]
      ;; recursive case: list not empty, insert the first element
      ;; into the sorted rest of the list
      [else (insert (first lon)
                    (insertion-sort (rest lon)))])))

; A more efficient sorting algorithm: Merge Sort
;
; Merge sort uses a "divide and conquer" technique to sort elements
; in a list. It starts by using a function called SPLIT-IN-HALF to
; divide a given list in half, recursively dividing the resulting
; sub-lists in half until the base case, when there is at most one
; element in the list.
;
; A list with at most one element is already sorted. On the way back
; out of the recursion, the algorithm combines each pair of sorted
; sub-lists using the MERGE function below. You can think of the
; sorted lists MERGE takes as input as piles of playing cards sorted
; from top to bottom such that only the lowest-valued card is face
; up. MERGE takes the smaller of the two face-up cards at each step
; until one pile is empty, at which point it puts the rest of the
; other sorted pile at the end of the result.
;

;; MERGE
;; ------------------------------------------------------------
;; Contract: (merge LON LON) -> LON
;; Header:   (define merge (lambda (lon1 lon2) ... ))
;; Purpose:  Combine the two sorted input lists, lon1 and lon2, into
;;           a single list sorted in ascending order.
;
(check-expect (merge '(1 3 5 7 9 11) '(2 4 6 8 10 12))
              '(1 2 3 4 5 6 7 8 9 10 11 12))

(define merge
  (lambda (lon1 lon2)
    (cond
      ;; Base Case 1: lon1 is empty
      [(empty? lon1) lon2]
      ;; Base Case 2: lon2 is empty
      [(empty? lon2) lon1]
      ;; Recursive Cases:
      ;; lon1 and lon2 both have at least one element
      ;; Rec. Case 1: First item of lon1 is smaller ...
      [(< (first lon1) (first lon2))
       (cons (first lon1) (merge (rest lon1) lon2))]
      ;; Rec. Case 2: First item of lon2 is smaller (or equal) ...
      [else
       (cons (first lon2) (merge lon1 (rest lon2)))])))

;; Contract: (split-in-half LON) -> (listof LON LON)
;; Header:   (define split-in-half (lambda (listy) ...
;;             ))
;; Purpose:  split listy into a list of two lists, each
;;           containing roughly half the elements of listy

;; Pre-function tests:
(check-expect (split-in-half '(9 8 7 5 4 3 2 1)) '((9 8 7 5) (4 3 2 1)))
(check-expect (split-in-half '()) '(() ()))
(check-expect (split-in-half '(13 2 1)) '((13 2) (1)))

;; Function definition:
(define split-in-half
  (lambda (listy)
    (local
      [;; find index of midpoint of listy (f-half)
       ;; use ceiling function to round up in case of odd length
       (define f-half (ceiling (/ (length listy) 2)))
       ;; helper function creates sublist of listy from pos to index - 1
       (define half
         (lambda (index pos acc)
           (cond
             [(= pos index) acc]
             [else (half index
                         (add1 pos)
                         (append acc (list (list-ref listy pos))))])))]
      (list (half f-half 0 empty)                  ;; sublist from 0 to f-half - 1
            (half (length listy) f-half empty))))) ;; sublist from f-half to end

;(display "\n(split-in-half '(7 4 3 5 6 9 1 0))\n")
;(split-in-half '(7 4 3 5 6 9 1 0))
;(display "\n(split-in-half '(48 23 1 97 8 30 2 72))\n")
;(split-in-half '(48 23 1 97 8 30 2 72)) "=> '((48 23 1 97) (8 30 2 72))"

;; MERGE-SORT
;; -----------------------------------------------------
;; Contract: (merge-sort LON) -> LON
;; Header:   (define merge-sort (lambda (listy) ... ))
;; Purpose:  sort a list of numbers in increasing order

(define merge-sort
  (lambda (listy)
    (begin
      ;(printf "~%MERGE-SORT: ~A" listy)
      (cond
        ;; Base Case 0: LISTY is empty
        [(empty? listy)
         ;; Thus, LISTY is already sorted
         empty]
        ;; Base Case 1: LISTY has exactly one element
        [(empty? (rest listy))
         ;; Thus, LISTY is already sorted
         listy]
        ;; Recursive Case: LISTY has at least two elements
        [else
         (local
           [;; SPLIT-IN-HALF returns a PAIR of LISTS
            (define pair-of-lists (split-in-half listy))
            ;; LISTA and LISTB are the two halves of LISTY
            (define listA (first pair-of-lists))
            (define listB (second pair-of-lists))
            ;; Apply MERGE-SORT to each half. The recursion bottoms
            ;; out when a list is empty or contains only one element.
            (define sorted-listA (merge-sort listA))
            (define sorted-listB (merge-sort listB))
            ;; MERGE the two sorted sub-lists into one sorted list,
            ;; starting with the smallest lists first.
            (define merged-lists (merge sorted-listA sorted-listB))]
           (begin
             ;(printf "~% MERGING ~a and ~a into ~a"
             ;        sorted-listA sorted-listB merged-lists)
             merged-lists))]))))

;(merge-sort '(48 23 1 97 8 30 2 72))
;(newline)
;(merge-sort '(7 4 3 5 6 9 1 0))
;(newline)
(display "\n\n")

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; TIMING FUNCTIONS
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; Racket provides a form called TIME that evaluates an expression,
; prints how long the evaluation took (in milliseconds), and returns
; the expression's result. We can use TIME to compare the time used
; by algorithms that have the same outcome, such as insertion-sort
; and merge-sort.
;
; Comparing the running time is not meaningful for small input sizes.
; We need to create a fairly large list in order to accurately test
; the running time of two functions. The function create-random-list
; is written below. This function will produce a list of numbers with
; values between 0 and 999.
;
; So, just for more exercise, write a function called
; CREATE-RANDOM-LIST that takes a number, N, as input and that
; returns a list of N random numbers between 0 and 999 as output.

;; CREATE-RANDOM-LIST
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Contract: (create-random-list num) -> LON
;;           num: number
;; Purpose:  To create a list of N random numbers between 0 and 999

;; The output is random, so we cannot check-expect its exact values,
;; but we can check-expect the length of the returned list:
(check-expect (length (create-random-list 500)) 500)

;; Function:
(define (create-random-list n)
  (cond
    [(= n 0) empty]
    [else (cons (random 1000) (create-random-list (sub1 n)))]))

(define LARGE-LIST (create-random-list 500))

"Timing Insertion-Sort"
(time (insertion-sort LARGE-LIST))
"Timing Merge-Sort"
(time (merge-sort LARGE-LIST))
"Timing built-in sort"
(time (sort LARGE-LIST <))

;
; When you run this code, you should see that the running time of
; insertion-sort on LARGE-LIST is much greater than the running
; time of merge-sort (make sure the printf statements in merge-sort
; stay commented out, and run the code several times to get an
; average running time). As the length of the list is increased,
; this difference should become more apparent. The running time of
; the built-in sort function is fastest of all because it is a
; highly tuned library implementation. (Racket documents its sort
; as stable, which points to a merge-sort variant rather than
; quick-sort; take CMPU241 to find out more about quick-sort.)
;
; Without getting too technical, the running time of insertion-sort
; grows in proportion to at most n^2, and the running time of merge
; sort grows in proportion to at most n*log_2(n). More about this
; later.
;
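
;; To get a rough feel for the gap between n^2 and n*log_2(n), here
;; is a small sketch that is not part of the original lecture code.
;; The helper names STEPS-QUADRATIC and STEPS-N-LOG-N are made up for
;; this illustration, and the values are growth estimates only, not
;; measured running times.

;; Contract: (steps-quadratic number) -> number
;; Purpose:  estimate of the step count for an n^2 algorithm
(define steps-quadratic
  (lambda (n) (* n n)))

;; Contract: (steps-n-log-n number) -> number
;; Purpose:  estimate of the step count for an n*log_2(n) algorithm
;; (log n) is the natural logarithm, so dividing by (log 2)
;; converts it to a base-2 logarithm.
(define steps-n-log-n
  (lambda (n) (* n (/ (log n) (log 2)))))

;; For a list the size of LARGE-LIST (n = 500), n^2 is 250000, while
;; n*log_2(n) is only roughly 4500.
(check-expect (steps-quadratic 500) 250000)
(check-within (steps-n-log-n 1024) 10240 0.001)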