CSE 2320 Section 501/571 Fall 1999
Due: October 12, 1999, 5:30pm (October 13, 1999, 5:00pm for -10%)
Implement a program that produces a histogram of unique tokens from a text
file. Your program will read in tokens from a given text file and
search for them in a hash table using collision resolution by chaining. If
not found, you will insert the token into the table with a count of one.
If found, you will increment the count of this token by one. After reading
the entire file, you will print out the tokens in decreasing order of
count, with tokens having the same count listed in lexicographic order.
Lastly, your program will print out the minimum, maximum, mean and standard
deviation of the length of the hash table chains, and the hash table load
factor (tokens in table / size of table). Specifically,
- You may write your code in C or C++, but you may not use global variables.
Those of you using C++ classes may use state variables for the major data
structures, but avoid using state variables as de facto global
variables. Follow the Coding Standards referenced in Program 1 and be sure
to write modular, well-documented code.
- A token is any string of characters (ASCII codes 33-126)
separated by whitespace (spaces, tabs or newlines). You may assume the
input file consists of only these characters.
- Your program must process each token as it is read in from the file.
You should not first read in all tokens prior to processing them.
- The hash table using collision resolution by chaining should be of
size m=256 (define this as a constant). Use the division method hash
h(k) = k mod m, where k is the sum of the ASCII values of the
first ten characters of the token. If the token has fewer than ten
characters, sum all of its characters.
- You may use an auxiliary array for sorting the histogram. You must
implement the sorting algorithm(s) yourself from one of those discussed in
class; you may not use the C/C++ library sorting procedures.
- Follow the instructions from Program 1 for handing in the file
containing all your source code using the program /public/cse/2320-501/handin2.