CSE 501N Fall ‘09 18: Files and Streams 06 November 2009 Nick Leidenfrost

CSE 501N, Fall ‘09
18: Files and Streams

06 November 2009

Nick Leidenfrost

Lecture Outline

- Storing data to the hard disk
- Files
- Streams
- Storage Decisions
- Serialization

File System: File Hierarchy

- In general, files in computers are organized in a directory tree
- A directory is a virtual container that holds files and other directories
  - A.k.a. folder

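Files: Naming Conventions

- Filenames are unique within a directory; whether they are case-sensitive depends on the file system (typically yes on Linux / Unix, typically no on Windows)
- Extensions serve as a “hint” or “shortcut” for the type of data contained in the file
  - For us
  - For the Operating System (OS)

Windows: H:/workspace/Lab6/Ship.java
Mac / Linux / Unix: /home/username/workspace/Lab6/Ship.java

The “full path” is made up of the drive (H:), the path (/workspace/Lab6/), the filename (Ship) and the extension (.java)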

Referring to Files in Programs: Absolute vs. Relative Paths

- Absolute Path
  - The “full path” to the file
  - Example: H:/workspace/Lab6/images/mothership.gif
- Relative Path
  - A path that specifies a file’s location relative to another location
  - “another location” = the location of our program
  - Example: images/Ship.java

Relative Paths: ./ and ../

- Some special notation specific to relative paths lets us refer to our own directory, as well as our parent directory
- ./ and ../
  - ./ = The current directory
  - ../ = The parent directory of the current directory (or the directory above the current directory)
- This notation is fairly standard in computing [ cmd example ]

Relative Paths: Examples

- Inside our directory: mothership.gif or ./mothership.gif
- Inside a subdirectory: images/mothership.gif or images/gif/small/mothership.gif
- In our “parent” directory: ../mothership.gif
- In a “sibling directory”: ../images/mothership.gif
- In a directory above our parent directory: ../../../mothership.gif
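To see how ./ and ../ are resolved against a starting directory, here is a minimal sketch. It uses java.nio.file.Paths, which is not part of this lecture, and the base directory and class name are made up for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativePathSketch {
    // Resolves a relative path against a base directory and collapses "./" and "../".
    static Path resolve(String baseDir, String relative) {
        return Paths.get(baseDir).resolve(relative).normalize();
    }

    public static void main(String[] args) {
        String base = "/home/username/workspace/Lab6"; // hypothetical program directory
        System.out.println(resolve(base, "./mothership.gif"));         // stays in Lab6
        System.out.println(resolve(base, "images/mothership.gif"));    // subdirectory
        System.out.println(resolve(base, "../mothership.gif"));        // parent directory
        System.out.println(resolve(base, "../images/mothership.gif")); // sibling directory
    }
}
```

Nothing here touches the disk: the paths are only combined and simplified, so the files do not need to exist.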

Absolute vs. Relative

- Absolute
  - Path will rely on exactly the same directory structure being in place on every computer
  - Application can be moved independently of resources, and resources will still be found
  - Generally not portable (system-specific)
  - Usually set on installation
- Relative
  - Will be correct as long as resources stay in the same place relative to the application
  - Application and resources can be moved and still function
    - From computer to computer
    - From directory to directory on the same computer

File Formats: Text and Binary

Two ways to store data:
- Text format (a.k.a. plain-text)
  - Data stored as characters
  - Human readable
  - Less efficient with respect to storage
  - A.k.a. ASCII (ask · ee) (American Standard Code for Information Interchange)
- Binary format
  - Data stored as bytes
  - Looks like gibberish to humans
  - Relies on a defined structure
  - More compact / efficient than plain-text

// Let’s look at some examples!

ASCII Format

- Let’s look at storing an integer in a plain-text file
- An int in Java is 4 bytes
- We want to store the number 12,345
- Our file actually holds the character ‘1’, followed by ‘2’, then ‘3’, ‘4’ and finally, ‘5’
- This takes at least 5 bytes

Binary Format

- Data items are represented in bytes
- Integer 12,345 stored as a sequence of four bytes: 0 0 48 57
  - 12,345 = 48*256¹ + 57*256⁰
- Why the zeros? Why not just use 2 bytes to store it?
- More compact and more efficient
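The two encodings of 12,345 can be compared directly in code. This is a small sketch, not from the lecture; the class and method names are made up, and ByteBuffer is used here only as a convenient way to get the four bytes of an int.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TextVsBinarySketch {
    // Text format: one byte per decimal digit character.
    static byte[] asText(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    // Binary format: always four bytes, most significant byte first.
    static byte[] asBinary(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asText(12345)));   // five character codes
        System.out.println(Arrays.toString(asBinary(12345))); // [0, 0, 48, 57]
    }
}
```

The binary form is always four bytes no matter how large the int is, which is why the two leading zeros are stored.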

Files in Java: Library Support

- Support for file interaction (and more) can be found in Java’s java.io library
  - “io”: Input / Output
- A file is represented in the library by the File class (java.io.File)
- We can create File objects with either absolute or relative paths

// Let’s have a look at java.io.File
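A short sketch of creating File objects, assuming a relative path that may or may not exist on your machine. The class name, helper method and path are illustrative only.

```java
import java.io.File;

class FileObjectSketch {
    // Converts any path, relative or absolute, into its absolute form.
    static File absoluteOf(String path) {
        return new File(path).getAbsoluteFile();
    }

    public static void main(String[] args) {
        File relative = new File("images/mothership.gif"); // hypothetical relative path
        System.out.println(relative.getName());    // last component of the path
        System.out.println(relative.isAbsolute()); // false: no root or drive given
        System.out.println(relative.exists());     // depends on what is on disk
        System.out.println(absoluteOf("images/mothership.gif").getPath());
    }
}
```

Creating a File object does not create or open anything on disk; it only represents a path, which is why exists() can return false.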

Reading Text Files

- Simplest way to read text: use the Scanner class
- To read from a file on disk, construct a FileReader
- Then, use the FileReader to construct a Scanner object

    FileReader reader = new FileReader("input.txt");
    Scanner in = new Scanner(reader);

- Use the Scanner methods to read data from the file
  - next, nextLine, nextInt, and nextDouble

// Let’s look at the Scanner API

Writing Text Files

- To write to a file, construct a PrintWriter object

    PrintWriter out = new PrintWriter("output.txt");

- If the file already exists, it is emptied before the new data are written into it
- If the file doesn't exist, an empty file is created

// Let’s look at the PrintWriter API

Writing Text Files

- Use print and println to write into a PrintWriter:

    out.println(29.95);
    out.println(new Rectangle(5, 10, 15, 25));
    out.println("Hello, World!");

- You must close a file when you are done processing it:

    out.close();

- Otherwise, not all of the output may be written to the disk file

A Sample Program

- Reads all lines of a file and sends them to the output file, preceded by line numbers
- Sample input file:

    Mary had a little lamb
    Whose fleece was white as snow.
    And everywhere that Mary went,
    The lamb was sure to go!

A Sample Program

- Program produces the output file:

    /* 1 */ Mary had a little lamb
    /* 2 */ Whose fleece was white as snow.
    /* 3 */ And everywhere that Mary went,
    /* 4 */ The lamb was sure to go!

- (Program could be used for numbering Java source files, etc.)

Write the code for this program

// Code Example: FileNumberer.java
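The FileNumberer.java shown in class is not part of this transcript. The sketch below is one possible way to write the numbering logic with Scanner and PrintWriter; the class and method names are made up, and it works on any Reader / Writer so it can be tried without files on disk.

```java
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Scanner;

class LineNumbererSketch {
    // Copies every line from source to target, prefixed with /* n */.
    static void numberLines(Reader source, Writer target) {
        Scanner in = new Scanner(source);
        PrintWriter out = new PrintWriter(target);
        int lineNumber = 1;
        while (in.hasNextLine()) {
            out.println("/* " + lineNumber + " */ " + in.nextLine());
            lineNumber++;
        }
        out.flush(); // push buffered output through to the target
    }

    public static void main(String[] args) {
        StringWriter result = new StringWriter();
        numberLines(new StringReader("Mary had a little lamb\nWhose fleece was white as snow."), result);
        System.out.print(result);
    }
}
```

To work with real files, pass new FileReader("input.txt") and new FileWriter("output.txt") instead, and close both when done.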

File Dialog Boxes

- A more user-friendly way of selecting files

// Let’s integrate this into our FileNumberer

Text Format

- Human-readable form
- Data stored as a sequence of characters
  - Integer 12345 stored as characters '1' '2' '3' '4' '5'
- Use Reader and Writer and their subclasses to process input and output
- To read:

    FileReader reader = new FileReader("input.txt");

- To write:

    FileWriter writer = new FileWriter("output.txt");

Binary Format

- Reading and writing binary files
- Use subclasses of InputStream and OutputStream
- To read:

    FileInputStream inputStream = new FileInputStream("input.bin");

- To write:

    FileOutputStream outputStream = new FileOutputStream("output.bin");

Streams: Input and Output

- We read from an InputStream
- We write to an OutputStream
  - As with System.out
- Imagine the “Stream” as a hose connecting the data source and the data destination

[Diagram: the stream, connecting output (writing) and input (reading)]

Reading a Single Byte from a File

- Use the various read methods in the InputStream class to read a single byte / array of bytes
- read() returns the next byte as an int
  - Returned value 0 <= x <= 255
  - or the integer -1 at end of file

    InputStream in = . . .;
    int next = in.read();
    byte b;
    if (next != -1)
        b = (byte) next;
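The usual pattern is to call read() in a loop until it returns -1. A minimal sketch, assuming an in-memory stream in place of a file so it runs anywhere; the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ByteReadSketch {
    // Reads the stream one byte at a time and returns the sum of the byte values.
    static long sumBytes(InputStream in) throws IOException {
        long sum = 0;
        int next;
        while ((next = in.read()) != -1) { // -1 signals end of stream
            sum += next;                   // next is in the range 0..255
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {0, 0, 48, 57});
        System.out.println(sumBytes(in)); // 105
    }
}
```

Because read() returns an int, a stored byte of 0xFF comes back as 255 and cannot be confused with the end-of-file value -1.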

Text and Binary Format

- Use variations of the write method to write a single byte / array of bytes
- read and write are the only input and output methods provided by the file input and output classes
- Java stream package principle: each class should have a very focused responsibility
  - Use the library of subclasses for more high-level behavior

Text and Binary Format

- Job of InputStream / OutputStream: interact with data sources and get bytes
- To read numbers, strings, or other objects, combine a Stream with other classes
  - E.g. java.util.Scanner

File Example: A Simple Encryption Program

- File encryption
  - To scramble a file so that it is readable only to those who know the encryption method and secret keyword
- To use the Caesar cipher
  - Choose an encryption key: a number between 1 and 25
  - Example: If the key is 3, replace A with D, B with E, . . .

An Encryption Program

- Example text:
- To decrypt, use the negative of the encryption key

Code for this?

// Code Example: FileEncryptor.java
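The FileEncryptor.java from class is not included in this transcript. Below is a sketch of a Caesar cipher over streams under one assumption: only the letters A-Z and a-z are shifted (wrapping around the alphabet) and every other byte is copied unchanged. The class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CaesarSketch {
    // Shifts a single letter by key positions, wrapping within its alphabet.
    static int shift(int b, int key) {
        if (b >= 'A' && b <= 'Z') return 'A' + Math.floorMod(b - 'A' + key, 26);
        if (b >= 'a' && b <= 'z') return 'a' + Math.floorMod(b - 'a' + key, 26);
        return b; // assumption: non-letters are left as they are
    }

    // Reads bytes until end of stream and writes the shifted bytes.
    static void encryptStream(InputStream in, OutputStream out, int key) throws IOException {
        int next;
        while ((next = in.read()) != -1) {
            out.write(shift(next, key));
        }
    }

    // Convenience wrapper so the cipher can be tried on a String.
    static String encrypt(String text, int key) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        encryptStream(new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)), out, key);
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        String secret = encrypt("Meet me at the towers", 3);
        System.out.println(secret);
        System.out.println(encrypt(secret, -3)); // negative key decrypts
    }
}
```

For files, encryptStream can be called with a FileInputStream and a FileOutputStream; remember to close both afterwards.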

Storage Options: Random Access vs. Sequential Access

- Sequential access
  - A file is processed a byte at a time
  - It can be inefficient
- Random access
  - Allows access at arbitrary locations in the file
  - Only files on disk support random access
    - System.in and System.out (normal input and output streams) do not
  - Each disk file has a special file pointer position
    - You can read or write at the position where the file pointer is

RandomAccessFile

- You can open a file either for
  - Reading only ("r")
  - Reading and writing ("rw")

    RandomAccessFile f = new RandomAccessFile("bank.dat", "rw");

- To move the file pointer to a specific byte:

    f.seek(n); // moves file pointer to nth byte

RandomAccessFile

- To get the current position of the file pointer:

    long n = f.getFilePointer(); // of type "long" because files can be very large

- To find the number of bytes in a file:

    long fileLength = f.length();
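Here is a small sketch that puts seek, getFilePointer and length together. It writes to a temporary scratch file rather than bank.dat; the class name, method name and file prefix are made up.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekSketch {
    // Writes count ints (0, 10, 20, ...) and reads back the one at the given index.
    static int writeThenReadAt(File file, int count, int index) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(file, "rw")) {
            for (int i = 0; i < count; i++) {
                f.writeInt(i * 10);  // each int occupies 4 bytes
            }
            f.seek(4L * index);      // jump straight to the wanted int
            return f.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("seek-sketch", ".dat"); // scratch file
        tmp.deleteOnExit();
        System.out.println(writeThenReadAt(tmp, 5, 3)); // 30
        try (RandomAccessFile f = new RandomAccessFile(tmp, "r")) {
            f.seek(8);
            System.out.println(f.getFilePointer()); // 8
            System.out.println(f.length());         // 20
        }
    }
}
```

The try-with-resources form closes the file automatically, even if an exception is thrown.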

Random Access: A Sample Program

- Use a random access file to store a set of bank accounts
- Program lets you pick an account and deposit money into it
- To manipulate a data set in a file, pay special attention to data formatting
  - Suppose we store the data as text
  - Say account 1001 has a balance of $900, and account 1015 has a balance of 0

Random Access: A Sample Program

What if we want to deposit $100 into account 1001?

If we now simply write out the new value, the result is

Random Access: A Sample Program

- What if money becomes too big?
  - This is caused by one of the downsides of human-readable file formats
- Better way to manipulate a data set in a file:
  - Give each value a fixed size that is sufficiently large
  - Every record has the same size
  - Easy to skip quickly to a given record
  - To store numbers, use binary format for scalability

Random Access: A Sample Program

- RandomAccessFile class stores binary data
- readInt and writeInt read/write integers as four-byte quantities
- readDouble and writeDouble use 8 bytes

    double x = f.readDouble();
    f.writeDouble(x);

Random Access: A Sample Program

- To find out how many bank accounts are in the file:

    public int numAccounts() throws IOException {
        return (int) (file.length() / RECORD_SIZE);
        // RECORD_SIZE is 12 bytes:
        // 4 bytes for the account number and
        // 8 bytes for the balance
    }

Random Access: A Sample Program

- To read the nth account in the file:

    public BankAccount read(int n) throws IOException {
        file.seek(n * RECORD_SIZE);
        int accountNumber = file.readInt();
        double balance = file.readDouble();
        return new BankAccount(accountNumber, balance);
    }

Random Access: A Sample Program

- To write the nth account in the file:

    public void writeNth(int n, BankAccount account) throws IOException {
        file.seek(n * RECORD_SIZE);
        file.writeInt(account.getAccountNumber());
        file.writeDouble(account.getBalance());
    }
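The methods on the last few slides belong to a class that is not shown in full in this transcript. The sketch below is a self-contained stand-in, assuming the same 12-byte record layout (4-byte account number, 8-byte balance); the class name and the deposit and readBalance helpers are made up, and plain int / double values replace the BankAccount class.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class BankRecordsSketch implements AutoCloseable {
    static final int RECORD_SIZE = 4 + 8; // int account number + double balance
    private final RandomAccessFile file;

    BankRecordsSketch(File f) throws IOException {
        file = new RandomAccessFile(f, "rw");
    }

    int numAccounts() throws IOException {
        return (int) (file.length() / RECORD_SIZE);
    }

    // Writes record n; every record starts at a multiple of RECORD_SIZE.
    void write(int n, int accountNumber, double balance) throws IOException {
        file.seek((long) n * RECORD_SIZE);
        file.writeInt(accountNumber);
        file.writeDouble(balance);
    }

    double readBalance(int n) throws IOException {
        file.seek((long) n * RECORD_SIZE + 4); // skip the 4-byte account number
        return file.readDouble();
    }

    // Overwrites the balance in place; the record size never changes.
    void deposit(int n, double amount) throws IOException {
        double balance = readBalance(n);
        file.seek((long) n * RECORD_SIZE + 4);
        file.writeDouble(balance + amount);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("bank-sketch", ".dat"); // scratch file
        tmp.deleteOnExit();
        try (BankRecordsSketch bank = new BankRecordsSketch(tmp)) {
            bank.write(0, 1001, 900.0);
            bank.write(1, 1015, 0.0);
            bank.deposit(0, 100.0);
            System.out.println(bank.readBalance(0)); // 1000.0
            System.out.println(bank.numAccounts());  // 2
        }
    }
}
```

Depositing $100 into account 1001 no longer spills into the next record, because a double always takes 8 bytes however many digits the balance has.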

Object Streams: Reading and Writing Objects? WTF?

- Writing objects directly to streams
  - ObjectOutputStream class can write entire objects to disk
  - ObjectInputStream class can read objects back in from disk
- Objects are saved in binary format; hence, you use streams

// Let’s look at the APIs

Writing a BankAccount Object to a File

- The object output stream saves all instance variables

    BankAccount b = . . .;
    OutputStream os = new FileOutputStream("bank.dat");
    ObjectOutputStream out = new ObjectOutputStream(os);
    out.writeObject(b);

Reading a BankAccount Object From a File

- readObject returns an Object reference
- Hence, you must remember the types of the objects that you saved and use a cast

    InputStream is = new FileInputStream("bank.dat");
    ObjectInputStream in = new ObjectInputStream(is);
    BankAccount b = (BankAccount) in.readObject();

Reading a BankAccount Object From a File

- readObject method can throw a ClassNotFoundException
  - Why is this?
- It is a checked exception
  - You must catch or declare it

Writing Complex Objects: Write and Read an ArrayList to a File

- Write

    ArrayList<BankAccount> bl = new ArrayList<BankAccount>();
    // Now add many BankAccount objects into bl
    out.writeObject(bl);

- Read

    ArrayList<BankAccount> bl = (ArrayList<BankAccount>) in.readObject();
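A complete round trip might look like the sketch below. It assumes a stand-in Account class (the lecture's BankAccount is not shown in this transcript) and writes to an in-memory byte array rather than bank.dat so it runs without touching the disk; all names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

class ObjectStreamSketch {
    // Stand-in for BankAccount; must implement Serializable to be written.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        final int number;
        final double balance;

        Account(int number, double balance) {
            this.number = number;
            this.balance = balance;
        }
    }

    // Serializes the whole list, including every Account inside it.
    static byte[] save(ArrayList<Account> accounts) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(accounts);
        }
        return bytes.toByteArray();
    }

    // readObject returns Object, so a cast back to the saved type is needed.
    @SuppressWarnings("unchecked")
    static ArrayList<Account> load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<Account>) in.readObject();
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        ArrayList<Account> accounts = new ArrayList<Account>();
        accounts.add(new Account(1001, 900.0));
        accounts.add(new Account(1015, 0.0));
        ArrayList<Account> restored = load(save(accounts));
        System.out.println(restored.size() + " accounts, first balance " + restored.get(0).balance);
    }
}
```

Swapping the byte-array streams for FileOutputStream and FileInputStream gives the file-based version from the slides.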

Serializable

- Objects that are written to an object stream must belong to a class that implements the Serializable interface.

    class BankAccount implements Serializable { . . . }

- Serializable interface has no methods.
  - What is it good for then!?

Serializable

- Implementing Serializable tells Java that a class can be serialized
- Most issues of serialization cannot be:
  - Entirely identified with an interface
  - Detected by the compiler
- Therefore Java is forced to trust us
  - The interface has only semantic meaning
  - Exceptions will be thrown if problems arise
- If you want more control over serialization, implement java.io.Externalizable

Serializable

- Serialization: process of saving objects to a stream
  - Each object is assigned a serial number on the stream
  - If the same object is saved twice, only the serial number is written out the second time
  - When reading, duplicate serial numbers are restored as references to the same object
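The "saved twice" behavior can be observed with a short experiment. This sketch writes the same object to one stream twice and checks that both reads give back one shared object; the class and method names are made up, and StringBuilder is used only because it already implements Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceSketch {
    // Writes obj twice to one stream, then reads both copies back.
    static Object[] writeTwiceAndRead(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj); // full object data
            out.writeObject(obj); // only a reference back to the first copy
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            Object first = in.readObject();
            Object second = in.readObject();
            return new Object[] {first, second};
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException {
        Object[] restored = writeTwiceAndRead(new StringBuilder("same object"));
        System.out.println(restored[0] == restored[1]); // true
    }
}
```

The restored object is a new copy, not the original, but both reads refer to that one copy.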

Conclusion

- Questions?
- I will be in Lab now until 6:30