CSCI 4500/6500 Programming Languages
Project 4: Python Clarify
First Program in Python [or Ruby if you prefer]
Assigned: Friday March 24, 2006
Due: Friday April 07, 2006
Collaboration Policy - Read Carefully
You must work on this project individually, but you may discuss this assignment with other students in the class and ask for and provide help in useful ways, preferably over our email list so we can all benefit from your great ideas. You may consult any outside resources you wish, including books, papers, web sites and people (but no penguins or sea urchins).
If you use resources other than the class materials, indicate what you used along with your answer.
Objective:
The main objective of this assignment is to familiarize yourself with a scripting language, in particular Python. You may use the Python that is already installed on atlas, or you may install your own; I suggest that you install your own. I also suggest that you use IDLE, the interactive development environment shown in class, to develop your Python programs.
Runs on Microsoft Windows, MacOS X (yay!), UNIX.
Tutorials:
Some tutorials on Python:
"Dive into Python" book (available in pdf and html)
"Python 101" (nice quick introduction)
Official Python tutorial
The quick Python reference
Description:
Your assignment is to create a program, written in Python, called clarify. Clarify filters out successive identical lines from a file (or from standard input if no file is given) and writes the results to standard output. Here is the synopsis of the command that you need to implement:
clarify [ -c ] [ -d | -u ] [ -i ] [ -f fields ] [ -s chars ] [ input_file ]
The order of the switches does not matter, except that the optional input_file comes last.
You need to implement the following options:
-c | prefix each line with its number of occurrences
-d | print only lines that are repeated (two or more successive occurrences)
-u | print only unique lines; suppress lines that are repeated successively in the input
-f fields | number of fields to skip over before checking for uniqueness
-s chars | number of characters to skip over before checking for uniqueness. If you use both field skipping (-f) and character skipping (-s), fields are skipped first.
-i | ignore case when comparing lines
** A field is a series of non-whitespace characters.
-d and -u together should give you an error message.
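One possible way to read the switches is Python's standard getopt module, which accepts the options in any order ahead of the positional argument. The sketch below is only an illustration, written for a current Python 3 interpreter (an older interpreter such as 2.4 would need older syntax in a few places). The names parse_args, flag_names and the keys of the options dictionary are hypothetical, and reporting bad usage by raising ValueError is an assumption of this sketch, not a requirement of the assignment.

```python
import getopt


def parse_args(argv):
    """Parse clarify's switches; return (options, input_file).

    Raises ValueError on bad usage (an assumed convention for this sketch).
    """
    options = {"count": False, "dups_only": False, "uniq_only": False,
               "ignore_case": False, "fields": 0, "chars": 0}
    flag_names = {"-c": "count", "-d": "dups_only", "-u": "uniq_only",
                  "-i": "ignore_case"}
    try:
        # "f:" and "s:" mean that -f and -s each take a value.
        pairs, rest = getopt.getopt(argv, "cduif:s:")
    except getopt.GetoptError as err:
        raise ValueError(str(err))
    for flag, value in pairs:
        if flag in flag_names:
            options[flag_names[flag]] = True
        else:
            # -f or -s: int() itself raises ValueError on a non-number.
            number = int(value)
            if number < 0:
                raise ValueError("%s needs a non-negative number" % flag)
            options["fields" if flag == "-f" else "chars"] = number
    if options["dups_only"] and options["uniq_only"]:
        raise ValueError("-d and -u cannot be used together")
    if len(rest) > 1:
        raise ValueError("at most one input file may be given")
    input_file = rest[0] if rest else None
    return options, input_file
```

For example, parse_args(["-c", "-f", "3", "test.txt"]) returns an options dictionary with count set and fields equal to 3, together with "test.txt" as the input file, while parse_args(["-d", "-u"]) raises ValueError.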
Example session:
{saffron:ingrid:221} cat test.txt
hello this is a test.
hello this is a test.
hello this IS a test.
hello this is a test.
hello this is a test.
hello this is a test.
hello this not a test.
hi over there a test.
{saffron:ingrid:222} clarify test.txt
hello this is a test.
hello this IS a test.
hello this is a test.
hello this not a test.
hi over there a test.
{saffron:ingrid:223} clarify -c test.txt
2 hello this is a test.
1 hello this IS a test.
3 hello this is a test.
1 hello this not a test.
1 hi over there a test.
{saffron:ingrid:224} clarify -d test.txt
hello this is a test.
hello this is a test.
{saffron:ingrid:225} clarify -u test.txt
hello this IS a test.
hello this not a test.
hi over there a test.
{saffron:ingrid:226} clarify -f 3 test.txt
hello this is a test.
{saffron:ingrid:227} clarify -c -f 3 test.txt
8 hello this is a test.
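The behaviour in the session above can be thought of as grouping runs of successive equal lines and then printing the first line of each run. As a rough sketch (not the required implementation), itertools.groupby from the standard library does this grouping. The function name clarify_lines and its parameters are hypothetical, and the "count, space, line" layout for -c is inferred from the session output.

```python
from itertools import groupby


def clarify_lines(lines, count=False, dups_only=False, uniq_only=False, key=None):
    """Collapse runs of successive equal lines; return the output lines.

    key, if given, maps a line to the text actually compared
    (for example, the line with fields skipped or lower-cased).
    """
    output = []
    for _, run in groupby(lines, key=key):
        run = list(run)
        occurrences = len(run)
        if dups_only and occurrences < 2:
            continue  # -d: keep only lines that were repeated
        if uniq_only and occurrences > 1:
            continue  # -u: keep only lines that were not repeated
        first = run[0]  # the first line of the run represents the group
        output.append("%d %s" % (occurrences, first) if count else first)
    return output
```

Fed the eight lines of test.txt, this sketch yields the five lines shown for plain clarify, the counts 2, 1, 3, 1, 1 with count=True, two lines with dups_only=True, and three lines with uniq_only=True.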
For option -f, consider a file called fruits.txt that consists of two lines:
orange banana apple orange.
grape pear apple orange.
{saffron:ingrid:225} clarify -f 2 fruits.txt
First line: clarify skips the first 2 fields, "orange" and "banana", so it considers only "apple orange." to determine uniqueness.
Second line: clarify skips the first 2 fields, "grape" and "pear", so it also considers only "apple orange." to determine uniqueness.
So clarify deems the two lines above to be duplicates. For the same file with -f 1,
clarify -f 1 fruits.txt
deems that the lines are different.
For -s, consider the file maria.txt:
123maria
234maria
clarify -c -s 3 maria.txt
2 123maria
Above, the lines are considered duplicates because clarify skips over the first 3 characters of each line when determining uniqueness. The prefix 2 is a consequence of the -c option and indicates the number of times the line occurs.
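The -f, -s and -i options all change what is compared, not what is printed. One way to picture this is a function that reduces each line to its comparison text. The sketch below is illustrative only: comparison_key and its parameter names are hypothetical, and leaving the whitespace that follows the last skipped field in the compared text is an assumption (it does not affect the fruits.txt or maria.txt examples).

```python
def comparison_key(line, fields=0, chars=0, ignore_case=False):
    """Return the part of line that is compared for uniqueness."""
    rest = line
    for _ in range(fields):
        rest = rest.lstrip()  # skip whitespace in front of the field
        end = 0
        while end < len(rest) and not rest[end].isspace():
            end += 1          # walk to the end of the field
        rest = rest[end:]
    rest = rest[chars:]       # characters are skipped after fields
    if ignore_case:
        rest = rest.lower()
    return rest


# The two fruits.txt lines compare equal with -f 2 but not with -f 1.
same = (comparison_key("orange banana apple orange.", fields=2)
        == comparison_key("grape pear apple orange.", fields=2))
different = (comparison_key("orange banana apple orange.", fields=1)
             != comparison_key("grape pear apple orange.", fields=1))
```

With chars=3, both "123maria" and "234maria" reduce to "maria", which is why the two lines of maria.txt count as duplicates.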
Requirements:
clarify should be an executable "she banged!" (#!) script. It should run on atlas. Develop it in your own environment and, as a last step, make sure it runs on atlas.
Since the version of Python on atlas is somewhat old, we will also use Python 2.4.2 on Windows XP to grade your project. Please indicate in your README.txt file whether you prefer that your program be tested on atlas or on Windows XP.
You may use Python's built-in modules, but you may not invoke UNIX commands (such as cat) from Python.
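To illustrate these requirements, here is a minimal sketch of how such a script might begin: a "#!" first line so the file can be executed directly, and input read with Python's own file handling, not an external command such as cat. The interpreter path in the first line is an assumption and may need adjusting for atlas; open_input and read_lines are hypothetical helper names.

```python
#!/usr/bin/env python
# The interpreter path above is an assumption; adjust it for atlas if needed.
import sys


def open_input(path=None):
    """Return a text stream: the named file, or standard input if no file was given."""
    if path is None:
        return sys.stdin
    return open(path)


def read_lines(stream):
    """Return the stream's lines without their trailing newline characters."""
    return [line.rstrip("\n") for line in stream]
```

On UNIX the file would also need execute permission (for example via chmod +x clarify) before it can be run as clarify test.txt.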
Submitting:
{atlas:maria} submit x500_program5