;; CMPU101 Spring 2013
;; Lab 11 - Printing matching DNA strands

(display "\n CS101 Lab 11, Spring 2013")
(display "\n PLEASE WRITE YOUR NAME HERE\n\n")

;
; In lab 10, you wrote the following functions:
;     string->uppercase
;     all-valid-bases?
;     convert-to-complement
;     find-match-pos
;
; You will use these functions in this lab assignment to create an
; interactive DNA sequence matcher, so start by copying them into the space
; at the bottom of the file. Make sure the functions still work after copying
; them.
;
; The five functions you will write for this lab start at Problem 1 below.
; The main function in this program will be called start-sequence and it should
; do the following:
;
;   1. Call a function to prompt for and read a "long" DNA sequence. That
;      function should check that the sequence entered contains only valid
;      characters and return it converted to all uppercase. If a string
;      with invalid characters is entered, the function should tell the user
;      that the sequence contains invalid bases and prompt them for another.
;
;   2. Call a function to prompt for and read a "short" DNA sequence. That
;      function should check that the sequence entered contains only valid
;      characters and return it converted to all uppercase. If a string
;      with invalid characters is entered, the function should tell the user
;      that the sequence contains invalid bases and prompt them for another.
;
;      NOTE: Steps one and two can be combined into a single function.
;
;   3. Once both valid sequences have been entered and stored in local
;      variables, print the two sequences, e.g.: "Long is AGCT, short is CG"
;
;   4. Convert the short sequence to its complement and use the find-match-pos
;      function to return the position of the substring in the long sequence
;      that matches the complement of the short sequence.
;
;   5. If a match is found, you should print the two sequences as if they
;      were attached to one another by rungs of a ladder.
;
;      If no match is found, you should print "No match found."
;
; A few sample runs of the program execution are shown below. Pay close
; attention to them and make sure your output looks exactly the same.
;
;; SAMPLE RUN #1
;
; > (start-sequence)
;
; Please enter a long DNA sequence
; caggttatt
;
; Please enter a short DNA sequence
; aata
;
; Long is CAGGTTATT, short is AATA
;
; AATA matches CAGGTTAT at position 4
;
;               ------------
;                |  |  |  |
;                A  A  T  A
;    C  A  G  G  T  T  A  T
;    |  |  |  |  |  |  |  |
;   ------------------------
;
;; SAMPLE RUN #2
;
; > (start-sequence)
;
; Please enter a long DNA sequence
; rrattks
;
; The sequence RRATTKS has some invalid base notation.
;
;
; Please enter a long DNA sequence
; gggggggaaattc
;
; Please enter a short DNA sequence
; lkdrsgat
;
; The sequence LKDRSGAT has some invalid base notation.
;
;
; Please enter a short DNA sequence
; tttt
;
; Long is GGGGGGGAAATTC, short is TTTT
;
; No match found.
;
;; SAMPLE RUN #3
;
; > (start-sequence)
;
; Please enter a long DNA sequence
; GATTCTG
;
; Please enter a short DNA sequence
; CTA
;
; Long is GATTCTG, short is CTA
;
; CTA matches GATTCTG at position 0
;
;   ---------
;    |  |  |
;    C  T  A
;    G  A  T  T  C  T  G
;    |  |  |  |  |  |  |
;   ---------------------
;
; The most challenging part of this assignment is building the strings to
; print so that the matching DNA sequences appear in the ladder format shown.
; Notice that each character occupies exactly three columns when printed.
;
; You will probably want to write helper functions to build each string and
; then use printf to report the result of calling the functions.
;
; Producing the strings to use in the print statements may require more than
; one function.
;
; For example, to print the following:
;
;            ------------
;             |  |  |  |
;             C  G  A  T
;    A  T  G  G  C  T  A  T
;    |  |  |  |  |  |  |  |
;   ------------------------
;
; you use the position on the longer sequence at which the sequences match.
; In the example above, the position is 3.
; Since the short sequence is always printed above the longer one, the first
; line is built as a string consisting of 3 (the position at which the match
; occurs) sets of three blank spaces, followed by 4 (the length of the short
; sequence) sets of "---", followed by an end of line. The strings that make
; up the second and third lines should similarly start with 3 sets of three
; blank spaces.
;
; IMPORTANT: YOUR FONT PREFERENCE IN DRRACKET MUST BE SET TO A MONO-SPACED FONT
; LIKE MONACO, OTHERWISE THE CHARACTERS IN THE DNA LADDER MAY NOT BE ALIGNED
; CORRECTLY.
;

(display "\n\n----------------------------\n")
(display "Problem 1: START-SEQUENCE")
(display "\n----------------------------\n")

;
; Write a zero-parameter function to start the program: start-sequence.
;
; This function should return no value; it only prints as a side effect.
;
; Write this function AFTER you write the functions for Problems 2 through 5.
;
; The start-sequence function should obtain both the long and short sequences
; from the same function, prompt-and-read. The prompt-and-read function,
; described below, should consume a string, either "long" or "short". After
; converting the short string of valid, uppercase bases to its complement,
; start-sequence should call find-match-pos and, depending on the result of
; that function, either call functions to print the ladder structure shown
; above or print that no match was found.
;

(display "\n\n----------------------------\n")
(display "Problem 2: PROMPT-AND-READ")
(display "\n----------------------------\n")

;
; This function should consume a string str, either "long" or "short",
; and produce a string, a sequence of valid bases.
;
; This function should prompt the user to "Please enter a long DNA
; sequence" or "Please enter a short DNA sequence", depending on the
; value of the input, str.
; After reading the sequence and converting it to all uppercase, it should:
;   1) call itself recursively if the sequence has invalid characters, or
;   2) return the string of valid bases.
;

(display "\n\n----------------------------\n")
(display "Problem 3: REPEAT-STRING")
(display "\n----------------------------\n")

;
; This function should consume a number, num, and a string, str.
; It should return the string formed by appending num copies of str.
;
; Check-expects produced from running this function are shown below.
; Copy them above your function when you are ready to test it.
;
; (check-expect (repeat-string 6 " | ") " |  |  |  |  |  | ")
; (check-expect (repeat-string 4 "---") "------------")
; (check-expect (repeat-string 2 "   ") "      ")
;

(display "\n\n----------------------------\n")
(display "Problem 4: PRINT-STRAND")
(display "\n----------------------------\n")

;
; This function should consume a string, str. It should return the string
; formed by appending each character of str, in order, with a space on each
; side of it.
;
; Check-expects produced from running this function are shown below.
; Copy them above your function when you are ready to test it.
;
; (check-expect (print-strand "ACGT") " A  C  G  T ")
; (check-expect (print-strand "CCT") " C  C  T ")
;

(display "\n\n----------------------------\n")
(display "Problem 5: PRINT-LADDER")
(display "\n----------------------------\n")

;
; This function should consume 2 strings (short long) and a number, num,
; the position where short matches long. It should use printf statements
; to print the entire DNA ladder, returning void.
;
; This function should use the repeat-string and print-strand functions as
; helpers.
;

;; PASTE FUNCTIONS FROM LAB 10 BELOW.
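
;; ---------------------------------------------------------------------
;; The string-building steps described for Problems 3 through 5 can be
;; sketched as follows. This is a minimal sketch under stated assumptions,
;; not the required solution: the Problem 3 helper is named repeat-string
;; (as in the Problem 3 heading), and ladder-string is a hypothetical extra
;; helper, not named in the lab, that returns the whole ladder as a single
;; string so it can be checked before anything is printed. None of these
;; definitions depend on the Lab 10 functions.
;; ---------------------------------------------------------------------

; repeat-string : number string -> string
; Appends num copies of str; a num of zero or less yields "".
(define (repeat-string num str)
  (cond [(<= num 0) ""]
        [else (string-append str (repeat-string (sub1 num) str))]))

; print-strand : string -> string
; Surrounds each character of str with one space on each side,
; so every base occupies exactly three columns.
(define (print-strand str)
  (cond [(string=? str "") ""]
        [else (string-append
               " " (substring str 0 1) " "
               (print-strand (substring str 1 (string-length str))))]))

; ladder-string : string string number -> string   (hypothetical helper)
; Builds the six ladder lines. The three lines for the short strand are
; indented by num sets of three spaces so they sit above the match.
(define (ladder-string short long num)
  (let ([indent (repeat-string num "   ")]
        [short-len (string-length short)]
        [long-len (string-length long)])
    (string-append
     indent (repeat-string short-len "---") "\n"
     indent (repeat-string short-len " | ") "\n"
     indent (print-strand short) "\n"
     (print-strand long) "\n"
     (repeat-string long-len " | ") "\n"
     (repeat-string long-len "---") "\n")))

; print-ladder : string string number -> void
; Prints the ladder as a side effect.
(define (print-ladder short long num)
  (printf "~a" (ladder-string short long num)))

; Usage: (print-ladder "CGAT" "ATGGCTAT" 3) prints the six-line ladder from
; the worked example in the instructions, with the short strand shifted
; right by nine columns (3 positions times 3 columns each).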