Comp 335 File Structures
Fundamental File Structure Concepts
File Organization
File Organization is how the data is organized in the file.
How data is written to a file must be considered carefully, because this dictates how the data has to be read back in.
Example of Data saved to File
Assume a programmer writes all data to file by using strings.
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Example of Data saved to File
When saved on file:
Searcy15000Bald Knob3500Romance950
Considerations when Writing Data to File
Must keep the “integrity” of the individual units of data (fields) which we wrote.
Group logical units of data together in records.
Within each record, organize the data on file in a way that will maintain “field separation”. In other words, write it in a way where the data can be recaptured.
Common Field Structures
Force fields to have a predictable length.
Begin each field with a length indicator.
Place a delimiter at the end of each field to separate it from the next.
Use a “keyword = value” expression to identify each field and its contents.
Fields with a predictable length
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
When written to file (each field padded to its full width):
Searcy      15000  Bald Knob   3500   Romance     950
Fields with a predictable length
A good method if all of the data to be stored is fixed in length.
What if the data to be stored is variable in length?
A lot of space is wasted on padding.
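A minimal C++ sketch of fixed-length fields, assuming space padding and the widths from the slide (12 bytes for the town, 7 for the population). The function names `writeFixedField` and `readFixedField` are hypothetical. Truncating values that are too long is a choice made for this sketch, not something the slides specify.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Field widths taken from the slide: char[12] and char[7].
constexpr std::size_t kTownWidth = 12;
constexpr std::size_t kPopWidth = 7;

// Write `value` into exactly `width` bytes, padding with spaces on the right.
// Values longer than `width` are truncated (an assumption of this sketch).
void writeFixedField(std::ostream& out, const std::string& value, std::size_t width) {
    std::string padded = value.substr(0, width);
    padded.resize(width, ' ');
    out.write(padded.data(), static_cast<std::streamsize>(padded.size()));
}

// Read exactly `width` bytes and strip the trailing padding.
std::string readFixedField(std::istream& in, std::size_t width) {
    std::string field(width, ' ');
    in.read(&field[0], static_cast<std::streamsize>(width));
    const std::size_t last = field.find_last_not_of(' ');
    return last == std::string::npos ? std::string() : field.substr(0, last + 1);
}
```

Every town costs 12 bytes on file regardless of its real length, which is where the wasted space comes from.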
Fields with a length indicator
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
When written to file:
6Searcy5150009Bald Knob435007Romance3950
Fields with a length indicator
The length indicator tells how many bytes to read.
How many bytes should you use for the length indicator?
1 byte (field size max = 255)
2 bytes (field size max = 65535)
This method should save space if the data is quite variable in length.
In this case, the file mixes binary data (the length indicators) with text.
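The length-indicator approach can be sketched in C++ as follows, assuming a 1-byte binary length (so fields are limited to 255 bytes). The names `writeLengthField` and `readLengthField` are hypothetical. The slide prints the lengths as digits for readability, whereas this sketch stores them as raw byte values.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Write a field as a 1-byte binary length followed by the field's bytes.
void writeLengthField(std::ostream& out, const std::string& value) {
    if (value.size() > 255) {
        throw std::length_error("field too long for a 1-byte length indicator");
    }
    out.put(static_cast<char>(static_cast<std::uint8_t>(value.size())));
    out.write(value.data(), static_cast<std::streamsize>(value.size()));
}

// Read the length byte, then read exactly that many bytes.
std::string readLengthField(std::istream& in) {
    const int length = in.get();
    if (length == std::char_traits<char>::eof()) {
        throw std::runtime_error("no more fields");
    }
    std::string value(static_cast<std::size_t>(length), '\0');
    in.read(&value[0], static_cast<std::streamsize>(length));
    return value;
}
```

Moving to a 2-byte indicator raises the limit to 65535 bytes at the cost of one extra byte per field.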
Fields separated by delimiters
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
When written to file:
Searcy|15000|Bald Knob|3500|Romance|950
Fields separated by delimiters
Could possibly save more space.
Delimiter choice must not be part of valid data.
Language must provide instructions to read data based on a sentinel value.
In C++, getline is overloaded to be able to handle this.
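As a small illustration of the getline overload mentioned above, the sketch below splits a stream into fields using the three-argument form `std::getline(stream, string, delimiter)`. The function name `readDelimitedFields` and the default `'|'` delimiter are choices made for this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a stream into fields. std::getline stops at `delimiter`
// and discards it, so each call yields exactly one field.
std::vector<std::string> readDelimitedFields(std::istream& in, char delimiter = '|') {
    std::vector<std::string> fields;
    std::string field;
    while (std::getline(in, field, delimiter)) {
        fields.push_back(field);
    }
    return fields;
}
```

If a town name could itself contain `'|'`, the fields would be split in the wrong place, which is why the delimiter must never occur in valid data.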
Fields separated by “keyword = value”
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
When written to file:
TOWN=Searcy|POP=15000|TOWN=Bald Knob|POP=3500|TOWN=Romance|POP=950
Fields separated by “keyword = value”
This can waste a lot of space in the file, because the keyword is repeated with every field.
It is a good technique if some fields are not used at times within records.
It also is good if you just want to save a lot of information on file and not organize the data within records.
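A minimal sketch of reading “keyword = value” data in C++. The grouping rule is an assumption made for this example, not something the slides define: a new record starts whenever a keyword that the current record already holds appears again. The names `Record` and `parseKeywordFields` are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using Record = std::map<std::string, std::string>;  // keyword -> value

// Parse "KEY=value|KEY=value|..." into records.
// Assumption: a repeated keyword marks the start of the next record.
std::vector<Record> parseKeywordFields(const std::string& data, char delimiter = '|') {
    std::vector<Record> records;
    Record current;
    std::istringstream in(data);
    std::string field;
    while (std::getline(in, field, delimiter)) {
        const std::size_t eq = field.find('=');
        if (eq == std::string::npos) {
            continue;  // skip fields without a keyword
        }
        const std::string key = field.substr(0, eq);
        if (current.count(key) != 0) {
            records.push_back(current);  // keyword repeats: previous record is complete
            current.clear();
        }
        current[key] = field.substr(eq + 1);
    }
    if (!current.empty()) {
        records.push_back(current);
    }
    return records;
}
```

A record that omits a field simply has no entry for that keyword, which is what makes this layout convenient when some fields are not always used.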
Record Organization
Fields can be combined to form a record.
An entire record can be read into a buffer at one time, and then the fields can be parsed out.
This is common because most of the time we want to read and write records, not individual fields.
Fixed-Length Records
A frequently utilized method for file organization.
This can imply that each field is fixed length, but it does not have to.
The record could be just a fixed-size “container” that stores a variable number of variable length fields.
Fixed-Length Records
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
These fields are combined in a 19 byte record.
When written to file (each field padded to its full width):
Searcy      15000  Bald Knob   3500   Romance     950
Fixed-Length Records
Makes DIRECT ACCESS to records feasible; this will help reduce seeks!
Record n starts at byte offset n × (record size), so it can be reached with a single seek.
Space could be wasted if the fields within the record are highly variable in length.
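Direct access to fixed-length records can be sketched in C++ as below, assuming the 19 byte record from the earlier slide (12 bytes of town plus 7 bytes of population, space padded). `TownRecord`, `appendRecord`, and `readRecordAt` are hypothetical names. A `std::stringstream` or a `std::fstream` opened in binary mode can both be passed in.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

constexpr std::size_t kTownWidth = 12;
constexpr std::size_t kPopWidth = 7;
constexpr std::size_t kRecordSize = kTownWidth + kPopWidth;  // 19 bytes

struct TownRecord {
    std::string town;
    std::string population;
};

// Pad (or truncate) a value to exactly `width` bytes.
std::string padTo(const std::string& value, std::size_t width) {
    std::string padded = value.substr(0, width);
    padded.resize(width, ' ');
    return padded;
}

// Remove the space padding added by padTo.
std::string trimRight(const std::string& value) {
    const std::size_t last = value.find_last_not_of(' ');
    return last == std::string::npos ? std::string() : value.substr(0, last + 1);
}

// Append one 19-byte record to the stream.
void appendRecord(std::ostream& out, const TownRecord& record) {
    const std::string bytes =
        padTo(record.town, kTownWidth) + padTo(record.population, kPopWidth);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
}

// Jump straight to record `recordNumber` (0-based) and read it.
// Returns false if the record does not exist.
bool readRecordAt(std::istream& in, std::size_t recordNumber, TownRecord& record) {
    char buffer[kRecordSize];
    in.clear();  // clear any earlier end-of-file state before seeking
    in.seekg(static_cast<std::streamoff>(recordNumber * kRecordSize));
    if (!in.read(buffer, static_cast<std::streamsize>(kRecordSize))) {
        return false;
    }
    record.town = trimRight(std::string(buffer, kTownWidth));
    record.population = trimRight(std::string(buffer + kTownWidth, kPopWidth));
    return true;
}
```

Reading record 2 costs one seek to byte 38 and one read of 19 bytes; none of the earlier records are touched.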
Variable Length Records
Store just the data within the records, no wasted space.
Sequential access to get to each record.
Typically a length indicator is given at the beginning of the record.
It can be combined with “field integrity” techniques.
Variable Length Records
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
Assume that: Towns (char [12]) and Population (char [7])
In memory these fields are combined in a 19 byte record; on file each record takes only the space its data needs.
When written to file:
13Searcy|15000|15Bald Knob|3500|12Romance|950|
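The layout above (a record length followed by delimited fields) can be sketched in C++ as follows. This sketch assumes the length is stored as a single binary byte, which limits a record to 255 bytes; the slide prints the lengths as digits. `packRecord`, `writeRecord`, and `readRecord` are hypothetical names.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Pack fields into one record body, ending each field with the delimiter.
std::string packRecord(const std::vector<std::string>& fields, char delimiter = '|') {
    std::string body;
    for (const std::string& field : fields) {
        body += field;
        body += delimiter;
    }
    return body;
}

// Write the record length (1 binary byte, an assumption) and then the body.
void writeRecord(std::ostream& out, const std::string& body) {
    if (body.size() > 255) {
        throw std::length_error("record too long for a 1-byte length indicator");
    }
    out.put(static_cast<char>(static_cast<unsigned char>(body.size())));
    out.write(body.data(), static_cast<std::streamsize>(body.size()));
}

// Read one whole record into a buffer, then parse the fields out of it.
// Returns false when there are no more records.
bool readRecord(std::istream& in, std::vector<std::string>& fields, char delimiter = '|') {
    const int length = in.get();
    if (length == std::char_traits<char>::eof()) {
        return false;
    }
    std::string body(static_cast<std::size_t>(length), '\0');
    if (!in.read(&body[0], static_cast<std::streamsize>(length))) {
        return false;
    }
    fields.clear();
    std::istringstream parser(body);
    std::string field;
    while (std::getline(parser, field, delimiter)) {
        fields.push_back(field);
    }
    return true;
}
```

Because each length is only known after reading the previous record, records have to be visited in order unless an index is kept.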
Variable Length Records
To improve access to records (which will minimize seeks), an index can be used to store the offset of each variable length record in the file.
Variable Length Records
Data to be saved on file (Towns and Populations):
Searcy       15000
Bald Knob    3500
Romance      950
When written to file:
Searcy|15000|Bald Knob|3500|Romance|950|
Index of Offsets
0 13 28 40
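A small C++ sketch of building and using such an offset index. It reads the final entry (40) as the end-of-file offset, which is an interpretation of the slide rather than something it states. With that entry, the length of record n is the difference between two neighbouring offsets. `writeWithIndex` and `readRecordByNumber` are hypothetical names.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Write the records back to back and return the offset at which each one
// starts, plus a final entry for the end of the data.
// Assumes the stream is empty when writing begins.
std::vector<std::size_t> writeWithIndex(std::ostream& out,
                                        const std::vector<std::string>& records) {
    std::vector<std::size_t> offsets;
    std::size_t offset = 0;
    for (const std::string& record : records) {
        offsets.push_back(offset);
        out.write(record.data(), static_cast<std::streamsize>(record.size()));
        offset += record.size();
    }
    offsets.push_back(offset);  // end-of-file offset
    return offsets;
}

// Seek directly to record `n` (0-based) using the index and read it whole.
// Throws std::out_of_range if `n` is not in the index.
std::string readRecordByNumber(std::istream& in,
                               const std::vector<std::size_t>& offsets,
                               std::size_t n) {
    const std::size_t start = offsets.at(n);
    const std::size_t length = offsets.at(n + 1) - start;
    std::string record(length, '\0');
    in.clear();  // clear any earlier end-of-file state before seeking
    in.seekg(static_cast<std::streamoff>(start));
    in.read(&record[0], static_cast<std::streamsize>(length));
    return record;
}
```

For the three town records this produces the offsets 0, 13, 28, 40, and record 1 is fetched with one seek to byte 13.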
Variable Length Records
To obtain direct access to variable length records, each offset address can be associated with a key which uniquely identifies each record.
The index can be searched for the key; once the address is found, the record can be accessed directly.
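A keyed index can be sketched in C++ with a `std::map` from key to byte offset. The sketch assumes the town name is the unique key and that records are stored as `town|population|`. `KeyIndex`, `appendKeyed`, and `findPopulation` are hypothetical names, and the caller tracks the current end of the file in `endOffset`.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

using KeyIndex = std::map<std::string, std::size_t>;  // key -> byte offset

// Append "town|population|" and record where it starts, keyed by town.
// `endOffset` is the current size of the data and is advanced by the call.
void appendKeyed(std::ostream& out, KeyIndex& index, std::size_t& endOffset,
                 const std::string& town, const std::string& population) {
    const std::string record = town + '|' + population + '|';
    index[town] = endOffset;
    out.write(record.data(), static_cast<std::streamsize>(record.size()));
    endOffset += record.size();
}

// Search the index for the key, seek to the stored address, and read the record.
// Returns false if the key is not in the index.
bool findPopulation(std::istream& in, const KeyIndex& index,
                    const std::string& town, std::string& population) {
    const KeyIndex::const_iterator it = index.find(town);
    if (it == index.end()) {
        return false;
    }
    in.clear();  // clear any earlier end-of-file state before seeking
    in.seekg(static_cast<std::streamoff>(it->second));
    std::string storedTown;
    return std::getline(in, storedTown, '|') && std::getline(in, population, '|');
}
```

The lookup touches the data file only once, at the offset stored for the key, however many records come before it.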