Int 2/ Higher - Data Representation - 1
Why use Binary?
It is a two state system (on/off) which makes it simple to operate
Even if the signal degrades (i.e. there is a slight drop in voltage) it will still be detected as a 1
There are only four rules for addition in binary compared to 100 in decimal
[0+0=0 ; 0+1=1 ; 1+0=1; 1+1=10]
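The four rules can be checked with a few lines of Python. This is only a sketch; the function name add_bits is an illustrative assumption and is not part of the slides.

```python
def add_bits(a: int, b: int) -> str:
    """Add two single bits and return the sum written in binary."""
    return format(a + b, "b")


# Print the four rules in the same style as the slide.
for a in (0, 1):
    for b in (0, 1):
        print(f"{a}+{b}={add_bits(a, b)}")
```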
Int 2/ Higher - Data Representation - 2
Number Systems - Decimal
The decimal system is a base-10 system. There are 10 distinct digits (0 to 9) to represent any quantity.
For an n-digit number, the value that each digit represents depends on its weight or position. The weights are based on powers of 10.
POSITION  4TH.         3RD.        2ND.       1ST.
WEIGHT    10³ = 1000   10² = 100   10¹ = 10   10⁰ = 1
For example, 4916₁₀ = 4*1000 + 9*100 + 1*10 + 6*1
Int 2/ Higher - Data Representation - 3
Number Systems - Binary
The binary system is a base-2 system. There are 2 distinct digits (0 and 1) to represent any quantity.
To express any number in base 2 we use powers much like our own decimal system.
POSITION  8TH      7TH     6TH     5TH     4TH    3RD    2ND    1ST
WEIGHT    2⁷=128   2⁶=64   2⁵=32   2⁴=16   2³=8   2²=4   2¹=2   2⁰=1
For example: 11010010₂ = 1*128 + 1*64 + 0*32 + 1*16 + 0*8 + 0*4 + 1*2 + 0*1 = 210₁₀
Int 2/ Higher - Data Representation - 4
Number Systems - Binary to Decimal
Converting binary to decimal
WEIGHT  2⁷=128   2⁶=64   2⁵=32   2⁴=16   2³=8   2²=4   2¹=2   2⁰=1
BIT     1        0       1       1       1      0      0      1
1x128 + 0x64 + 1x32 + 1x16 + 1x8 + 0x4 + 0x2 + 1x1 = 128 + 32 + 16 + 8 + 1 = 185
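The column method above can be sketched in Python as follows. The function name binary_to_decimal is an assumption made for illustration; it simply adds up the weight of every column that holds a 1.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to decimal by summing column weights."""
    total = 0
    weight = 1  # weight of the rightmost column, 2^0
    for digit in reversed(bits):
        if digit == "1":
            total += weight
        weight *= 2  # each column to the left is worth twice as much
    return total


print(binary_to_decimal("10111001"))
```

Python's built-in int("10111001", 2) gives the same answer and can be used to check the working.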
Int 2/ Higher - Data Representation - 5
Number Systems - Decimal to Binary
We use the same table as before
• To convert the decimal number 115
2⁷=128   2⁶=64   2⁵=32   2⁴=16   2³=8   2²=4   2¹=2   2⁰=1
115 is less than 128, so we put a '0' in the 128 column (running total 0)
We need a 64 to 'build' up to 115, so place a '1' in the 64 column (running total 64)
64+32 is 96, so we can use a 32: place a '1' in the 32 column (running total 96)
96+16 is 112, just short, so we place a '1' in the 16 column (running total 112)
We just need a 3 to give 115, so a '0' goes in the 8s and 4s columns, and a '1' in the 2s column and a '1' in the units column gives 115
0 1 1 1 0 0 1 1
So 115 using 8 bit binary is 01110011
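The same column-by-column working can be written as a short Python sketch. The function name decimal_to_binary and the default of 8 bits are assumptions chosen to match the example, not something the slides prescribe.

```python
def decimal_to_binary(value: int, bits: int = 8) -> str:
    """Build the binary pattern by testing each column weight, largest first."""
    if not 0 <= value < 2 ** bits:
        raise ValueError("value does not fit in the given number of bits")
    digits = []
    for position in range(bits - 1, -1, -1):
        weight = 2 ** position
        if value >= weight:
            digits.append("1")  # this weight is needed
            value -= weight     # what is still left to build
        else:
            digits.append("0")  # this weight is too big
    return "".join(digits)


print(decimal_to_binary(115))
```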
Int 2/ Higher - Data Representation - 6
Storage of data
1 byte = 8 bits
1 Kilobyte = 1024 bytes
1 Megabyte = 1024 Kilobytes
1 Gigabyte = 1024 Megabytes
1 Terabyte = 1024 Gigabytes
Hierarchy of storage
Int 2/ Higher - Data Representation - 7
Numbers
Numbers may be classified as real or integer.
Real numbers include ALL numbers, whole and fractional, positive or negative, from the smallest negative fraction to the largest number imaginable.
Integers include only whole numbers but can be both positive and negative
The method to represent integer numbers is different from the method used to represent real numbers.
Int 2/ Higher - Data Representation - 8
Integers
The size of the number that can be represented depends on the number of bytes which are available in the computer's memory to store it.
If one byte is used to store the number then the numbers 0000 0000 to 1111 1111 can be stored, that is, 0 to 255 in decimal. A total of 256 numbers.
If two bytes are assigned to store numbers, then numbers from 0000 0000 0000 0000 to 1111 1111 1111 1111 can be stored, that is, 0 to 65 535 in decimal. A total of 65 536 numbers.
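The ranges quoted above follow from the rule that n bits give 2 to the power n patterns. A minimal sketch, assuming unsigned (positive only) integers and a hypothetical function name unsigned_range:

```python
def unsigned_range(num_bytes: int):
    """Return (smallest, largest, how many) for positive integers in num_bytes."""
    bits = 8 * num_bytes
    count = 2 ** bits  # number of different bit patterns
    return 0, count - 1, count


print(unsigned_range(1))
print(unsigned_range(2))
```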
Int 2/ Higher - Data Representation - 9
Numbers/addresses
IPv4 allows 32 bits for an Internet Protocol address, and can therefore support 2³² addresses (4,294,967,296)
IPv6 uses a 128-bit address, giving 2¹²⁸ possible addresses. That is roughly 340 undecillion, or about 3.4 × 10³⁸ addresses.
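Python integers can be as large as needed, so both address counts can be worked out exactly. The function name address_count is an illustrative assumption.

```python
def address_count(bits: int) -> int:
    """Number of different addresses available with the given number of bits."""
    return 2 ** bits


ipv4 = address_count(32)
ipv6 = address_count(128)
print(f"{ipv4:,}")
print(f"{ipv6:,}")
```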
Int 2/ Higher - Data Representation - 10
Real Numbers -Floating Point Representation
FPR is used to store real numbers, that is, fractional numbers, very large and very small numbers.
In decimal any number can be represented with a decimal point in a fixed position and a multiplier which is a power of 10.
398 = .398 * 1000 = .398 * 10³
Int 2/ Higher - Data Representation - 11
Real Numbers -Floating Point Representation
.398 * 10³ can be described as m * base^e, where m is called the mantissa, 10 is the base and e is the exponent. The exponent is the number of places the point moves.
.398 is the mantissa, the base is 10 and the exponent is 3
FPR works exactly the same way when used to represent real numbers in binary. As the base is always 2 it does not need to be stored. Only the mantissa and the exponent need be stored.
Int 2/ Higher - Data Representation - 12
Real Numbers -Floating Point Representation
398 = 110001110 in binary. Using FPR, only the mantissa and the exponent need be stored.
110001110 = .110001110 * 2^1001 (the point moves 9 places). The mantissa is .110001110 and the exponent is 1001 (which is binary for 9).
If the binary number was 1101.0111, using FPR this would become .11010111 * 2^100, as the point moves four places and 100 is binary for 4.
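The point-moving step can be sketched in Python. This follows the simplified convention used on these slides (point moved to the left of the first digit, exponent written in binary); it assumes a positive number of 1 or more, and the function name to_mantissa_exponent is made up for illustration.

```python
def to_mantissa_exponent(binary: str):
    """Split a positive binary number into (mantissa, exponent in binary)."""
    whole, _, fraction = binary.partition(".")
    whole = whole.lstrip("0")            # leading zeros do not count
    mantissa = "." + whole + fraction    # point now sits before the first digit
    exponent = format(len(whole), "b")   # places moved, written in binary
    return mantissa, exponent


print(to_mantissa_exponent("110001110"))
print(to_mantissa_exponent("1101.0111"))
```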
Int 2/ Higher - Data Representation - 13
Negative numbers
In decimal negative numbers are represented by using the negative sign (-) for example, (positive) 53 becomes negative 53 by using the negative sign, -53.
In a computer system if a sign was used it would have to be stored and that would take up one of the bits used for storing the number and therefore reduce the size of the numbers.
Int 2/ Higher - Data Representation - 14
Negative numbers
If one bit is used for the sign then only 7 bits would be available for the number 000 0000 to 111 1111, that is 0 to 127 in decimal. A total of 128 numbers.
If 1 is used for negative and 0 for positive then the range of numbers would be 1111 1111 (-127) to 0111 1111 (+127). The problem with this is that you get two values for zero: 1000 0000 (negative zero) and 0000 0000 (positive zero).
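The two-zeros problem can be seen by decoding sign bit patterns in Python. This is a sketch only; the function name sign_magnitude_value is an assumption.

```python
def sign_magnitude_value(bits: str) -> int:
    """Decode a pattern whose first bit is the sign (1 = negative)."""
    bits = bits.replace(" ", "")
    magnitude = int(bits[1:], 2)  # the remaining bits hold the size
    return -magnitude if bits[0] == "1" else magnitude


# Two different patterns both decode to zero.
print(sign_magnitude_value("1000 0000"), sign_magnitude_value("0000 0000"))
print(sign_magnitude_value("1111 1111"), sign_magnitude_value("0111 1111"))
```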
Int 2/ Higher - Data Representation - 15
Two’s Complement
To avoid the two values for zero problem a system called two’s complement is used. To represent the negative of a number you change all the 0’s to 1’s and all the 1’s to 0’s and then add 1.
Remember that in binary adding 1 and 1 gives 0 with a carry of 1 (that is, 10).
Int 2/ Higher - Data Representation - 16
Two’s Complement
Int 2/ Higher - Data Representation - 17
Floating Point Representation
A range of very large and very small numbers can be represented with only a few digits by using scientific notation. For example:
• 976,000,000,000,000 = 9.76 * 10¹⁴
• 0.0000000000000976 = 9.76 * 10⁻¹⁴
This same approach can be used for binary numbers. A number represented by M * B^(±E) can be stored in a binary word with three fields:
• Mantissa M
• Exponent E
• The base B is implicit and need not be stored
Int 2/ Higher - Data Representation - 18
Typical 32-bit Floating Point Format
First 8 bits contain the exponent
The remaining 24 bits contain the mantissa
The more bits we use for the exponent, the larger the range of numbers available, but at the expense of precision. We still only have a total of 2³² numbers that can be represented.
Exponent   Mantissa
8 bits     24 bits
Int 2/ Higher - Data Representation - 19
Floating point representation
How to represent the binary number
11010.11011011101
This has to be converted to the form M * B^(±E)
.1101011011011101 Mantissa
The point has been moved 5 places so the exponent is +5 (101 in binary)
.1101011011011101 x 2^101
Only the mantissa and the exponent need to be stored to represent this number
Note: this assumes that all numbers are positive
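As a check, the stored mantissa and exponent can be packed into the 8 bit + 24 bit layout and turned back into a decimal value. This is a sketch of the simplified slide format only (positive numbers, no sign bits), not the format real hardware uses; the function names pack_32bit and decode are assumptions.

```python
def pack_32bit(mantissa: str, exponent: str) -> str:
    """Lay out an 8-bit exponent followed by a 24-bit mantissa."""
    exponent_field = exponent.zfill(8)                  # pad on the left
    mantissa_field = mantissa.lstrip(".").ljust(24, "0")  # pad on the right
    return exponent_field + mantissa_field


def decode(mantissa: str, exponent: str) -> float:
    """Rebuild the value: mantissa (as a fraction) times 2 to the exponent."""
    digits = mantissa.lstrip(".")
    fraction = int(digits, 2) / 2 ** len(digits)
    return fraction * 2 ** int(exponent, 2)


word = pack_32bit(".1101011011011101", "101")
print(word, len(word))
print(decode(".1101011011011101", "101"))
```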
Int 2/ Higher - Data Representation - 20
Representing negative numbers
Two’s complement
Positive 9   0000 1001
Negative 9   1111 0111
To represent the negative:
Find the positive                             0000 1001
Change all the ones to zeros and vice versa   1111 0110
Add 1                                                +1
Negative number                               1111 0111
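The invert-and-add-1 steps can be sketched in Python. The function name negate_twos_complement is an assumption; any carry out of the leftmost bit is thrown away, which is why zero has only one pattern.

```python
def negate_twos_complement(positive_bits: str) -> str:
    """Return the two's complement negative of a bit pattern."""
    bits = positive_bits.replace(" ", "")
    width = len(bits)
    # Step 1: change all the ones to zeros and vice versa.
    inverted = "".join("1" if b == "0" else "0" for b in bits)
    # Step 2: add 1, discarding any carry beyond the available bits.
    result = (int(inverted, 2) + 1) % (2 ** width)
    return format(result, f"0{width}b")


print(negate_twos_complement("0000 1001"))
print(negate_twos_complement("0000 0000"))
```

Applying the function twice gives back the original pattern, so the same steps also turn a negative number into its positive.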
Int 2/ Higher - Data Representation - 21
Representing Text
Alphanumeric data such as names and addresses are represented as strings of characters containing letters, numbers and symbols.
Each character has a unique code or sequence of bits to represent it. As each character is entered from a keyboard it must be converted into its binary code.
Int 2/ Higher - Data Representation - 22
Representing Text
Character Set
The CS is the group of letters and numbers and characters that the computer can represent and manipulate
Each letter, number or character has its own unique binary value
Int 2/ Higher - Data Representation - 23
Coding Methods ASCII
ASCII
American Standard Code for Information Interchange
• strictly speaking a 7-bit code (128 characters)
• has an extended 8-bit version used on PCs and non-IBM mainframes
• widely used to transfer data from one computer to another
• codes 0 to 31 are control codes
Int 2/ Higher - Data Representation - 24
Coding Methods ASCII continued
ASCII
Character code sets contain two types of characters:
Printable (normal characters)
• Letters, numbers and symbols
• Upper and lower case letters have their own codes
• Numbers 0 to 9
• Punctuation and other symbols, for example, % £ “ !
Int 2/ Higher - Data Representation - 25
Coding Methods ASCII continued
ASCII
Character code sets contain two types of characters:
Non printable
• Control codes (characters)
• The first 32 codes are set aside for control characters
• They all have their own unique code
• Examples are: <return>, <tab>, <escape>
• They are often used when transmitting data; there are codes for <start of text>, <end of text> and <end of transmission>
Int 2/ Higher - Data Representation - 26
ASCII Coding Examples
An ASCII subset
Symbol   Code (hex)
A        41
B        42
C        43
D        44
E        45
F        46
0        30
1        31
2        32
3        33
4        34
5        35
6        36
7        37
“BAD” = 0100 0010 0100 0001 0100 0100₂
“F1” = 0100 0110 0011 0001₂
“3415” = 0011 0011 0011 0100 0011 0001 0011 0101₂
Note that this is a text string and no arithmetic may be done on it. A postcode is a good example of the need to store numbers as text.
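The coding examples can be reproduced with Python's built-in ord function, which returns a character's code. The function name ascii_bits is an assumption made for this sketch.

```python
def ascii_bits(text: str) -> str:
    """Return the 8-bit ASCII pattern of each character, in groups of four bits."""
    groups = []
    for character in text:
        code = ord(character)
        if code > 127:
            raise ValueError("not a 7-bit ASCII character")
        byte = format(code, "08b")
        groups.append(byte[:4] + " " + byte[4:])
    return " ".join(groups)


print(ascii_bits("BAD"))
print(ascii_bits("3415"))
print(format(ord("A"), "X"))  # the hexadecimal code from the table
```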
Int 2/ Higher - Data Representation - 27
Other coding methods
ASCII
• is a 7 bit code giving 128 code values
• 96 characters and 32 control codes
• Extended ASCII, using an 8 bit code, gives 256 characters but is still not enough to represent the major writing schemes of the world
Unicode
• A 16 bit code
• Can represent 65,536 different characters
• First 256 values are used to represent 8 bit ASCII; this makes conversion between the two easy
Int 2/ Higher - Data Representation - 28
Unicode
Advantages over ASCII
• Can support 65 280 more characters than 8 bit ASCII
• Every character-based alphabet in the world can be coded, such as French, German and Finnish
• And others such as Arabic
• The large ideographic languages can be coded, such as Chinese, Japanese and Korean
Int 2/ Higher - Data Representation - 29
Unicode
Originally
• 49 000 of the codes were predefined
• 6400 can be used by software developers
• 10 000 codes set aside for future developments
Now
• Mobile phones use Unicode for SMS text messaging
• More and more character sets have been added
• Unicode now represents 109,000 characters
• Even 16 bit Unicode is no longer enough
Int 2/ Higher - Data Representation - 30
Unicode
Unicode takes up much more storage space than ASCII and it takes longer to transmit Unicode than ASCII (because there are more bits to transmit)
Both of these factors are less of a disadvantage now because storage space has increased significantly, as have transmission bandwidths
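The difference in storage can be measured in Python. This sketch uses the UTF-16 encoding as a stand-in for the 16 bit Unicode described on these slides, which is an assumption; the function name storage_bytes is also made up for illustration.

```python
def storage_bytes(text: str, encoding: str) -> int:
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))


word = "BAD"
ascii_size = storage_bytes(word, "ascii")        # one byte per character
unicode_size = storage_bytes(word, "utf-16-be")  # two bytes per character
print(ascii_size, unicode_size)

# The letter A keeps the same code value; it is just stored in 16 bits.
print("A".encode("utf-16-be"))
```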
Int 2/ Higher - Data Representation - 31
Representing Graphics
There are two ways of representing graphics
• Bit Mapped Graphics
• Vector Graphics
Int 2/ Higher - Data Representation - 32
Bit Mapped Graphics
Any graphic is made up from a series of pixels (Picture Elements).
Each pixel is an individual point on the screen
Int 2/ Higher - Data Representation - 33
Bit Map
Assuming only black and white (1 or 0) for each pixel the image below would be stored as shown
[Figure: pixel pattern using 8x8 grid]
The BIT MAP of the image
0 0 1 1 1 1 0 0
0 1 0 0 0 0 1 0
0 1 0 0 0 0 1 0
1 0 1 0 0 1 0 1
0 1 0 0 0 0 1 0
0 1 0 1 1 0 1 0
0 0 1 0 0 1 0 0
0 0 0 1 1 0 0 0
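A bit map like this can be held in Python as one string per row. The sketch below draws the pattern and packs each row of 8 pixels into one byte; the names BITMAP, render and pack_rows are assumptions for illustration.

```python
BITMAP = [
    "00111100",
    "01000010",
    "01000010",
    "10100101",
    "01000010",
    "01011010",
    "00100100",
    "00011000",
]


def render(rows) -> str:
    """Draw the bit map using # for a 1 and . for a 0."""
    return "\n".join(
        "".join("#" if bit == "1" else "." for bit in row) for row in rows
    )


def pack_rows(rows) -> bytes:
    """Store each row of 8 pixels as a single byte (1 bit per pixel)."""
    return bytes(int(row, 2) for row in rows)


print(render(BITMAP))
print(len(pack_rows(BITMAP)), "bytes")
```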
Int 2/ Higher - Data Representation - 34
Resolution
The quality of the image depends on the number of pixels
More pixels means higher resolution and clearer image
[Figure: pixel pattern using 8x8 grid; pixel pattern using 16x16 grid]
There is a one to one correspondence between pixels and bits
Int 2/ Higher - Data Representation - 35
Memory Storage
The image below is 4 inches x 6 inches. The resolution is 300 d.p.i. (dots per linear inch) and the image is black and white. Calculate the memory requirements
Length: 6x300 = 1800 pixels
Breadth: 4x300 = 1200 pixels
Total no. pixels = 1800x1200
= 2160000
1 bit per pixel
Storage = 2160000 bits
= 2160000 / 8 = 270000 bytes
= 270000 / 1024 = 263.67 KB
≈ 264 KB
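The calculation above can be written as a small Python function so that other sizes and resolutions can be tried. The function name bitmap_storage_bytes and its parameter names are assumptions.

```python
def bitmap_storage_bytes(width_inches, height_inches, dpi, bits_per_pixel=1):
    """Storage needed for a bit-mapped image, in bytes."""
    pixels = (width_inches * dpi) * (height_inches * dpi)
    return pixels * bits_per_pixel / 8  # 8 bits in a byte


size_bytes = bitmap_storage_bytes(6, 4, 300)
size_kb = size_bytes / 1024
print(size_bytes, round(size_kb, 2), round(size_kb))
```

Changing bits_per_pixel to 24 gives the storage for the same image in 24 bit colour.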
Int 2/ Higher - Data Representation - 36
Vector Graphics
Each image is made from objects (line, rectangle, circle)
• Every object has ATTRIBUTES which define it
To draw the rectangle below we need to know:
Start X and Y coordinates
The length
The breadth
The thickness and colour of the lines
The type of line (dashed)
The fill colour
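The list of attributes can be pictured as a record. The sketch below is one possible way to hold them in Python; the class name Rectangle, the field names and all the values are hypothetical. It also shows why enlarging a vector image loses no detail: only the attribute values change.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rectangle:
    x: float               # start X coordinate
    y: float               # start Y coordinate
    length: float
    breadth: float
    line_thickness: float
    line_colour: str
    line_type: str
    fill_colour: str

    def scaled(self, factor: float) -> "Rectangle":
        """Return a bigger or smaller copy by changing two attributes."""
        return replace(
            self,
            length=self.length * factor,
            breadth=self.breadth * factor,
        )


rect = Rectangle(x=10, y=20, length=120, breadth=60, line_thickness=2,
                 line_colour="black", line_type="dashed", fill_colour="yellow")
bigger = rect.scaled(2)
print(bigger)
```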
Int 2/ Higher - Data Representation - 37
A Vector Image
Int 2/ Higher - Data Representation - 38
Vector Vs Bit-Mapped
Advantages of vector graphics (draw packages)
• Images can be enlarged without losing resolution
• Objects can be edited by changing their attributes
• Objects can be layered on top/behind
• Images take up less disc space
• Ideal for drawing plans; use library of objects
Disadvantages of vector graphics
• Individual pixels cannot be altered
• Not realistic
Int 2/ Higher - Data Representation - 39
Vector Vs Bit-Mapped
Advantages of bit-mapped graphics (paint packages)
• Each pixel can be altered
• More realistic when used for photos/real life
Disadvantages of bit mapped representation
• requires large amounts of storage space;
• image becomes coarse (jagged) when scaled;
• does not take advantage of resolutions that are higher than the resolution of the image.
Int 2/ Higher - Data Representation - 40
Compression
A colour bit-mapped image with a high resolution and 24 bit colour needs a lot of storage (50MB for a smallish photo)
File compression is used to reduce storage requirements.
Different techniques: coding using an index of the colours actually used, and not coding differences that are indistinguishable to the human eye.
Int 2/ Higher - Data Representation - 41
Compression types
Lossless means that none of the original data is lost
Lossy compression involves sacrificing some of the data in order to reduce the file size
JPEG is a file format commonly used for data compression of bit mapped files
JPEG uses lossy compression
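The difference between the two types can be shown with a small Python sketch. It uses the standard zlib module purely as an example of lossless compression, and a simple "keep the top 4 bits" step as a stand-in for a lossy method. Neither is named in these slides and neither is how JPEG itself works; the data and the function name quantise are made up for illustration.

```python
import zlib

# Hypothetical image data: 1000 black bytes followed by 1000 white bytes.
original = bytes([0] * 1000 + [255] * 1000)

# Lossless: decompressing gives back exactly the original bytes.
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)


def quantise(data: bytes) -> bytes:
    """Lossy step: keep only the top 4 bits of every byte."""
    return bytes(value & 0xF0 for value in data)


samples = bytes([200, 201, 202, 203])
print(len(original), len(compressed), restored == original)
print(list(quantise(samples)))  # the small differences are gone for good
```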
Int 2/ Higher - Data Representation - 42
Advantages of compression
Saves backing storage
Smaller size of file, less time to transmit over a network
Smaller files faster to load on a web page
Int 2/ Higher - Data Representation - 43
Disadvantages of compression
Detail will be lost using lossy compression
Changes can be made to the original due to compression. These changes are known as artefacts
It takes time to compress. Larger files take longer.
Repeated compressing and decompressing of a file can lead to a reduction in image quality
Int 2/ Higher - Data Representation - 44
Images
Bit-Mapped Vector graphic