logo
 
Assignment 4 - Fun with Python
CptS 355 - Programming Language Design
Washington State University
Home
Calendar
Syllabus
Resources
People
Project turn-in

Overview

Weight: This assignment will count for approximately 12% of your final grade.
Due Date Monday, April 25 (by 11:59PM)
In this assignment you will explore some of the unique, and not so unique, features of Python.

Turning in your assignment

Note the following directions carefully. Points will be deducted for incorrectly named files and functions. All code should be developed in the three directory. Each problem solution should be placed in a separate file within that directory with a .py extension, e.g., rot13.py. When you are done and certain that everything is working correctly, create a tarred, gzipped copy of your directory by executing
  tar cf three.tar three; gzip three.tar
or zip it on a Windows system. Turn in the tarred, gzipped or zipped directory using the turn-in page. The file you upload should be named three.tar.gz or three.zip. On the upload page, write a comment indicating whether your code is intended for Unix/Linux or Windows -- programs' behavior may differ slightly between the two systems; your code only has to work on one or the other. You may turn in your assignment as many times as you like. Only the last one submitted before the deadline will be graded.

Grading

The assignment will be marked for good programming style (indentation and appropriate comments), as well as clean compilation and correction execution.

Warmup -- rot13.py -- 25%

The text of a message can be hidden by applying a "rot13" translation to each alphabetic character, that is to characters between (inclusive) 'a' and 'z' or 'A' and 'Z'. The rot13 translation shifts each alphabetic character in the message by thirteen places to the right in the alphabet, but leaves all other characters unchanged. By "thirteen places to the right" I mean that each 'a' is made into an 'n', each 'b' is transformed to a 'o', etc. Characters greater than 'm' are "rotated" or wrapped around to the beginning, so 'n' is transformed to 'a', 'o' is changed into 'b', etc. The same transformation is applied to characters between 'A' and 'Z'. That is, each 'A' is made into an 'N', each 'B' is transformed to a 'O', etc. Characters greater than 'M' are wrapped around to the beginning, so 'N' is transformed to 'A', 'O' is changed into 'B', etc.

Your task is to write a rot13.py program that reads a message on sys.stdin (standard input), rotates the text in a message, and prints the rotated message. Output only the rotated message. Do not output any additional characters or text. As an example, assume that the file testmessage contains the following message.

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
How are you?
Groovy!
Then the command (in Unix)
  python rot13.py < testmessage 
would produce the following output.
nopqrstuvwxyzabcdefghijklm
NOPQRSTUVWXYZABCDEFGHIJKLM
Ubj ner lbh?
Tebbil!
And the command (in Unix)
  python rot13.py < testmessage | python rot13.py
would produce the original message.

Hint: Look at string.translate and string.maketrans.

digraphs.py - 30%

A digraph is a pair of characters that occur adjacent to one another in a text. By convention we write each digraph between a pair of '/' characters to make it easier to see where the blanks are. For example the digraphs at the beginning of the previous sentence are /A /, / d/, /di/, /ig/, etc. Digraph frequency counts are helpful in cryptological analysis of some ciphers.

The digraphs.py program lists the number of times each digraph occurs in text read from standard input. Digraphs must be listed from most frequently occurring to least frequently occurring. Digraphs that do not occur in the input (0 occurrences) should not be listed in the output. Before counting the digraphs replace each newline character in the input with a space character. The replacement of newline characters by spaces is the only change you should make in the input text.

Here is what the output of the program might look like on some hypothetical sample input:

  python digraphs.py < hypotheticalInput
produces
/e / 83
/ t/ 66
/ a/ 56
/in/ 53
/th/ 52
/ </ 48
/co/ 47
/re/ 44
/de/ 44
/s / 44
/to/ 42
/he/ 41
/n / 40
/d / 38
/ o/ 37
/t / 36
/or/ 36
/ i/ 34
/an/ 33
/r / 32
/. / 31
/ d/ 30
and so on. 
Notice that there are no newline characters in the output digraphs. The input is to be treated as one continuous stream of characters in which newline characters are replaced by blanks. Make sure that you test your program on input for which you know the correct output. If you are unsure what the correct output should be in a particular situation ASK how that situation should be handled. But don't ask me to tell you the full output for a particular input. Hint: Use an implementation of sorting from the Python library. You don't have to write your own sort.

sizes.py - 45%

Write a program, sizes.py, that lists all directories beginning at a certain point in a file directory hierarchy. For each directory compute two numbers: 1) the total size in bytes of all ordinary files in that directory; call this S and 2) S plus the total size in bytes of all ordinary files contained in directories recursively below the original; call this T. Compute these values for the original directory, as well as all directories recursively within the original directory.

Ignore any file that is not an ordinary file or a directory.

Output from your program should look like:

T directoryname S
  T subdirectory1 S
  T subdirectory2 S
    T subdirectory21 S
    T subdirectory22 S
  T subdirectory3 S
Notice how subdirectories of each directory are indented below it. It is also required that the T-values of subdirectories of a given directory be listed in decreasing order. Also notice that when your program is correct the T value listed to the left of a particular directory will be the directory's S value plus the sum of the T values of its subdirectories.

This is a very useful program for finding out where all the space on your disk drive has gone, something that is admittedly less of a problem in these days of 100GB disks.

sizes.py should accept one optional command-line argument, the name of a directory. (This may be either an absolute path names, i.e. starting with a '/', or a relative path name, i.e. not starting with a '/'. If the argument is not specified, the current directory should be used. (In this case you may use . as the name of the current directory when printing the directory names in your ouptut.)

Hints:

  • command-line arguments come into a python program in the sys.argv list.
  • http://docs.python.org/lib/module-stat.html contains example code for recursively walking a directory tree as well as documentation on how to interpret the result of an os.stat call to determine each file's type and size.

Example: to explore the space used by ~/testdir you would say

   python sizes.py testdir
On a small directory hierarchy it might produce something like this:
896 testdir 176
  376 testdir/b 376
  344 testdir/a 210
    114 testdir/a/c 114
    20 testdir/a/b 20

In order to test your code you will need to build a file hierarchy for which you can compute the correct output by hand. Be sure to include important "corner" cases such as directories that are empty, directories that contain only ordinary files, and directories that contain only other directories.
                                                                                                                                                                                                                                                                                                                                             
  (c) 2003 Curtis Dyreson, (c) 2004 Carl H. Hauser           E-mail questions or comments to Prof. Carl Hauser