COSC 1306 - Computer Science and Programming: Python Functions. Jehan-François Pâris, jfparis@uh.edu

Post on 27-Dec-2015

TRANSCRIPT

  • Slide 1
  • COSC 1306 - COMPUTER SCIENCE AND PROGRAMMING: PYTHON FUNCTIONS. Jehan-François Pâris, jfparis@uh.edu
  • Slide 2
  • Module Overview: We will learn how to read, create and modify files. Pay special attention to pickled files: they are very easy to use!
  • Slide 3
  • The file system: Provides long-term storage of information. Stores data in stable storage (disk). Cannot be RAM because: dynamic RAM loses its contents when powered off; static RAM is too expensive; system crashes can corrupt the contents of main memory.
  • Slide 4
  • Overall organization: Data managed by the file system are grouped in user-defined data sets called files. The file system must provide a mechanism for naming these data. Each file system has its own set of conventions. All modern operating systems use a hierarchical directory structure.
  • Slide 5
  • Windows solution: Each device and each disk partition is identified by a letter. A: and B: were used by the floppy drives. C: is the first disk partition of the hard drive. If the hard drive has no other disk partition, D: denotes the DVD drive. Each device and each disk partition has its own hierarchy of folders.
  • Slide 6
  • Windows solution [Diagram: separate folder trees for C:, the second disk D: and the flash drive F:; the folders shown include Windows, Users and Program Files]
  • Slide 7
  • UNIX/LINUX organization: Each device and disk partition has its own directory tree. Disk partitions are glued together through the mount operation to form a single tree. A typical user does not know where her files are stored.
  • Slide 8
  • UNIX/LINUX organization [Diagram: the root partition / with directories bin and usr, and another partition attached to it] The magic mount: the second partition can be accessed as /usr.
  • Slide 9
  • Mac OS organization: Similar to Windows. Disk partitions are not merged. They are represented by separate icons on the desktop.
  • Slide 10
  • Accessing a file (I): Your Python programs are stored in a folder, AKA a directory. On my home PC it is C:\Users\Jehan-Francois Paris\Documents\Courses\1306\Python. All files in that directory can be directly accessed through their names, e.g. "myfile.txt".
  • Slide 11
  • Accessing a file (II): Files in subdirectories can be accessed by specifying the subdirectory first. Windows style: "test\\sample.txt" (note the double backslash). Linux/Unix/Mac OS X style: "test/sample.txt" (generally works on Windows too).
  • Slide 12
  • Why the double backslash? The backslash is an escape character in Python. It combines with its successor to represent non-printable characters: \n represents a newline, \t represents a tab. We must use \\ to represent a plain backslash.
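
The escape rule can be checked directly in the interpreter. The sketch below is only an illustration: the raw-string form r"..." is not mentioned on the slide, but it is a standard Python way to write the same path with single backslashes.

```python
# Escape sequences versus a literal backslash.
windows_path = "test\\sample.txt"   # two backslashes typed, one stored
raw_path = r"test\sample.txt"       # raw string: backslashes are taken literally

print(windows_path)                 # displays a single backslash
print(len("\n"), len("\t"), len("\\"))   # each escape is one character
print(windows_path == raw_path)     # both spellings give the same string
```
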
  • Slide 13
  • Accessing a file (III): For other files, we must use the full pathname. Windows style: "C:\\Users\\Jehan-Francois Paris\\Documents\\Courses\\1306\\Python\\myfile.txt"
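
As a possible alternative to typing separators by hand, the standard os.path module can build pathnames. This is a minimal sketch, not something the slides use; the folder and file names are the ones from the earlier examples.

```python
import os

# os.path.join inserts the separator that the current operating system expects.
relative = os.path.join("test", "sample.txt")
print(relative)

# os.path.abspath turns a relative name into a full pathname,
# starting from the current working directory.
full = os.path.abspath(relative)
print(full)
```
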
  • Slide 14
  • Accessing file contents: Two-step process: first we open the file, then we access its contents (read or write). When we are done, we close the file.
  • Slide 15
  • What happens at open() time? The system verifies that you are an authorized user and that you have the right permission (read permission or write permission; execute permission exists but doesn't apply), and returns a file handle / file descriptor.
  • Slide 16
  • The file handle: Gives the user direct access to the file (no directory lookups) and the authority to execute the file operations whose permissions have been requested.
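
When one of these checks fails, Python's open() raises an exception instead of returning a file handle. The sketch below shows one way a program might react; the helper name try_open and the file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile

def try_open(path):
    """Return a short status string instead of letting open() raise."""
    try:
        fh = open(path, "r")
    except FileNotFoundError:
        return "no such file"
    except PermissionError:
        return "permission denied"
    fh.close()
    return "opened"

# A fresh temporary folder is empty, so this file cannot exist.
with tempfile.TemporaryDirectory() as folder:
    missing = os.path.join(folder, "no_such_file.txt")
    status = try_open(missing)

print(status)
```
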
  • Slide 17
  • Python open(): open(name, mode = 'r', buffering = -1), where name is the name of the file; mode is the permission requested (default is 'r' for read-only); buffering specifies the buffer size (-1 means use the system default value).
  • Slide 18
  • The modes: Can request 'r' for read-only; 'w' for write-only (always overwrites the file); 'a' for append (writes at the end); 'r+' or 'a+' for updating (read + write/append).
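
The difference between 'w' and 'a' is easiest to see by writing to the same file several times. This is a minimal sketch that works in a temporary folder; the file name demo.txt is made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")   # hypothetical file name

    fh = open(path, "w")      # creates the file, or empties an existing one
    fh.write("first\n")
    fh.close()

    fh = open(path, "w")      # "w" again: the earlier contents are lost
    fh.write("second\n")
    fh.close()

    fh = open(path, "a")      # "a": new data goes after the existing contents
    fh.write("third\n")
    fh.close()

    fh = open(path, "r")      # read everything back
    contents = fh.read()
    fh.close()

print(contents)
```

Only "second" and "third" survive, because the second open in 'w' mode discarded "first".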
  • Slide 19
  • Examples
        f1 = open("myfile.txt")    # same as f1 = open("myfile.txt", "r")
        f2 = open("test\\sample.txt", "r")
        f3 = open("test/sample.txt", "r")
        f4 = open("C:\\Users\\Jehan-Francois Paris\\Documents\\Courses\\1306\\Python\\myfile.txt")
  • Slide 20
  • Reading a file: Three ways: global reads, line by line, pickled files.
  • Slide 21
  • Global reads: fh.read() returns the whole contents of the file specified by file handle fh. File contents are stored in a single string that might be very large.
  • Slide 22
  • Example
        f2 = open("test\\sample.txt", "r")
        bigstring = f2.read()
        print(bigstring)
        f2.close()    # not required
  • Slide 23
  • Output of example To be or not to be that is the question Now is the winter of our discontent Exact contents of file test\sample.txt
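
A global read can also be written with Python's with statement, which closes the file automatically when the block ends. The slides do not use this form; the sketch below is an alternative under that assumption, and it creates its own sample file (with made-up line breaks) in a temporary folder so that it runs anywhere.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.txt")

    # Create a small file to read; the line breaks are an assumption.
    with open(path, "w") as fh:
        fh.write("To be or not to be\nthat is the question\n")

    # Global read: the whole file ends up in one string.
    with open(path, "r") as fh:
        bigstring = fh.read()

    closed_after = fh.closed   # True: the with block closed the file

print(bigstring)
print(closed_after)
```
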
  • Slide 24
  • Line-by-line reads
        for line in fh :    # do not forget the colon
            # anything you want
        fh.close()    # not required
  • Slide 25
  • Example
        f3 = open("test/sample.txt", "r")
        for line in f3 :    # do not forget the colon
            print(line)
        f3.close()    # not required
  • Slide 26
  • Output To be or not to be that is the question Now is the winter of our discontent With one or more extra blank lines
  • Slide 27
  • Why? Each line ends with an end-of-line marker, and print() adds an extra end-of-line.
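
The end-of-line marker can be made visible with repr(). The sketch below uses io.StringIO as a stand-in for an open text file (an assumption made so the example needs no file on disk), and it shows one possible fix the slides do not use: telling print() not to add its own newline.

```python
import io

# io.StringIO behaves like a text file opened for reading.
fh = io.StringIO("To be or not to be\nthat is the question\n")

seen = []
for line in fh:
    seen.append(line)
    print(repr(line))     # repr() shows the trailing \n explicitly
    print(line, end="")   # end="" stops print() from adding a second newline
```
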
  • Slide 28
  • Trying to remove blank lines
        print('----------------------------------------------------')
        f5 = open("test/sample.txt", "r")
        for line in f5 :    # do not forget the colon
            print(line[:-1])    # remove last char
        f5.close()    # not required
        print('-----------------------------------------------------')
  • Slide 29
  • The output ---------------------------------------------------- To be or not to be that is the question Now is the winter of our disconten ----------------------------------------------------- The last line did not end with an EOL!
  • Slide 30
  • A smarter solution (I): Only remove the last character if it is an EOL
        if line[-1] == '\n' :
            print(line[:-1])
        else :
            print(line)
  • Slide 31
  • A smarter solution (II)
        print('----------------------------------------------------')
        fh = open("test/sample.txt", "r")
        for line in fh :    # do not forget the colon
            if line[-1] == '\n' :
                print(line[:-1])    # remove last char
            else :
                print(line)
        print('-----------------------------------------------------')
        fh.close()    # not required
  • Slide 32
  • It works! ---------------------------------------------------- To be or not to be that is the question Now is the winter of our discontent -----------------------------------------------------
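
The same test-then-slice logic is available as a string method: rstrip('\n') removes trailing newline characters only when they are present. This is a sketch of that alternative, not the approach taken on the slides; io.StringIO again stands in for a file whose last line has no EOL, and the sample text is made up.

```python
import io

# The last line deliberately lacks an end-of-line marker.
fh = io.StringIO("Now is the winter\nof our discontent")

cleaned = []
for line in fh:
    cleaned.append(line.rstrip("\n"))   # no effect on a line without an EOL

print(cleaned)
```
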
  • Slide 33
  • Making sense of file contents: Most files contain more than one data item per line, e.g. "COSC 713-743-3350" and "UHPD 713-743-3333". We must split lines: mystring.split(sepchar), where sepchar is a separation character, returns a list of items.
  • Slide 34
  • Splitting strings
        >>> text = "Four score and seven years ago"
        >>> text.split()
        ['Four', 'score', 'and', 'seven', 'years', 'ago']
        >>> record = "1,'Baker, Andy', 83, 89, 85"
        >>> record.split(',')
        ['1', "'Baker", " Andy'", ' 83', ' 89', ' 85']
        Not what we wanted!
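
One way to keep the quoted name together is Python's standard csv module, which understands quoted fields. The slides do not use it here, so treat this as a sketch under that assumption; quotechar and skipinitialspace are real csv options, set to match the single quotes and the spaces after commas in this record.

```python
import csv

record = "1,'Baker, Andy', 83, 89, 85"

# csv.reader accepts any iterable of lines, so a one-item list works.
reader = csv.reader([record], quotechar="'", skipinitialspace=True)
fields = next(reader)

print(fields)   # the name stays in one piece
```

All fields come back as strings; the grades would still need int() before any arithmetic.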
  • Slide 35
  • Example
        # how2split.py
        print('----------------------------------------------------')
        f5 = open("test/sample.txt", "r")
        for line in f5 :
            words = line.split()
            for xxx in words :
                print(xxx)
        f5.close()    # not required
        print('-----------------------------------------------------')
  • Slide 36
  • Output ---------------------------------------------------- To be of our discontent -----------------------------------------------------
  • Slide 37
  • Other separators (I): Commas. CSV (Excel format): values are separated by commas. Strings are stored without quotes unless they contain a comma: "Doe, Jane", freshman, 90, 90. Quotes within strings are doubled.
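
These quoting rules are what the standard csv module applies by default when writing. The sketch below writes two made-up rows into an in-memory buffer (io.StringIO, used here instead of a real file) to show a field quoted because of its comma and an embedded quote being doubled.

```python
import csv
import io

buffer = io.StringIO()          # in-memory stand-in for an output file
writer = csv.writer(buffer)

writer.writerow(["Doe, Jane", "freshman", 90, 90])       # comma forces quotes
writer.writerow(['Jane "JD" Doe', "freshman", 90, 90])   # inner quotes are doubled

csv_text = buffer.getvalue()
print(csv_text)
```
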
  • Slide 38
  • Other separators (II): Tabs (\t). Advantages: your fields will appear nicely aligned; spaces and commas are not an issue. Disadvantage: you do not see them; they look like spaces.
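
With tabs as separators, a plain split('\t') is enough even when a field contains a comma or a space. A minimal sketch with a made-up record:

```python
# Hypothetical tab-separated record; the name contains a comma and a space.
line = "Doe, Jane\tfreshman\t90\t90\n"

fields = line.rstrip("\n").split("\t")   # drop the EOL, then split on tabs
print(fields)
```
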
  • Slide 39
  • Why it is important: When you must pick your file format, you should decide how the data inside the file will be used: people will read them, other programs will use them, or they will be used by both people and machines.
  • Slide 40
  • An exercise: Converting our output to CSV format. Replacing tabs by commas is easy: we will use the string replace function.
  • Slide 41
  • First attempt
        fh_in = open('grades.txt', 'r')    # the 'r' is optional
        buffer = fh_in.read()
        newbuffer = buffer.replace('\t', ',')
        fh_out = open('grades0.csv', 'w')
        fh_out.write(newbuffer)
        fh_in.close()
        fh_out.close()
        print('Done!')
  • Slide 42
  • The output Al