
STUDY ON INTERNET DELAYS


Master Thesis, 2005


To all my professors, all my friends, and all who supported me…


Abstract: The Internet has had a great impact on society since its introduction. Usage of, and dependency on, the web has increased dramatically, and the time between a user issuing a request and receiving a response has become a critical issue. The web suffers from the inherent problem of congestion, caused by too many users sharing too little bandwidth, and speeding up the delivery of dynamic web content is therefore important, because delays seriously affect the user's experience. The main aim of this thesis is to analyze the download times associated with a web request, to identify where the delay actually occurs, and to suggest reasons for its occurrence. The thesis also provides effective guidelines that developers can follow to offer faster and more effective downloads and service to their users. The collected data is analyzed, and suggestions are made on how congestion can be avoided and the user's experience improved.


Acknowledgments: First of all, I would like to thank my supervisor, Prof. Anders Broberg, for his valuable advice, support, patience, and encouragement. It was an honor to work under his supervision. Secondly, my great thanks go to Prof. Per Lindström for his care, support, and encouragement, without which I could not have continued. Finally, I would like to thank all my professors and all friends who helped me to finish this work.


Table of Contents

1. Introduction
   1.1 Aim of the thesis
   1.2 Methodology
   1.3 Outline of the thesis
2. Theoretical view of web delays
   2.1 The basics of the Internet
   2.2 Layered view of networks
       2.2.1 TCP/IP stack
   2.3 Internet congestion
   2.4 Client/server communication
   2.5 Domain Name System
   2.6 Hypertext Transfer Protocol
   2.7 Reliable transmission of data (bulk)
   2.8 Summary
3. Techniques for decreasing congestion
   3.1 Moving content close to the user
   3.2 Images and compression techniques
   3.3 Summary
4. Website migration and load balancing of web servers
   4.1 Migration of websites
   4.2 Summary
5. Tools for studying congestion
   5.1 Visual Route
   5.2 Netscape
   5.3 GIMP
   5.4 NeoTrace
   5.5 Trace Route
   5.6 FilterGate
   5.7 Ethereal
   5.8 Summary
6. Empirical study
   6.1 Calculation
   6.2 Ordinary downloads versus web cache downloads
   6.3 Ordinary downloads versus popup-included downloads
   6.4 Initial picture compared to compressed picture
   6.5 Summary
7. Evaluation of results
8. Guidelines
9. Conclusion
References
Appendix A
Appendix B
Appendix C


List of Figures

Figure 1: Client/server communication
Figure 2: Domain Name System
Figure 3: Compressed picture
Figure 4: Report for 194.242.61.52 using the Visual Route tool
Figure 5: Netscape browser
Figure 6: Compressing using GIMP
Figure 7: Trace of a site using the NeoTrace tool
Figure 8: Information flow path of data
Figure 9: Trace of "ardent-india.com"
Figure 10: FilterGate
Figure 11: Ethereal tool
Figure 12: Results of www.streetonline.co.uk
Figure 13: Ordinary download vs. web cache download
Figure 14: Normal download vs. popup downloads
Figure 15: Initial picture compared to compressed picture


Chapter 1: Introduction

The Internet has had a great impact on society since its introduction. World internet usage and population statistics show that there are a great many users throughout the world [1]. Although the web continues to grow in size and popularity, only a small number of websites or pages is actually used by end users; very few websites attract large numbers of visitors. Understanding and reducing web traffic therefore plays a major role in web design and management (for example, managing web proxies). Usage of, and dependency on, the web has increased dramatically, and the time between a user issuing a request and receiving a response is a critical issue [2]. The Internet has become a major part of the quest for information: almost everyone uses it as a medium for retrieving information, because it provides fast access to huge amounts of information from various sources. While the Internet is becoming the common mode of retrieving information, people do not want to wait for that information to arrive. One important problem among many is that large documents are too time-consuming to retrieve. There are several causes of delay, including web servers, web clients, and network latency [3]. In general, web delays affect smaller Internet firms most, as they cannot easily overcome this situation, whereas major companies have already established a name for themselves, and delays will most likely not put off their customers. This scenario makes it hard to reduce user-perceived latency and to reduce the transmission of redundant traffic on the network. The web has the inherent problem of congestion, caused by too many users sharing too little bandwidth; speeding up the delivery of dynamic web content is therefore important, because delays seriously affect the user's experience. This thesis therefore presents a detailed study of Internet delays, the basic reasons for them, and how developers can overcome them to a certain extent.

1.1 Aim of the thesis

The main aim of this thesis is to analyze the reasons for web delays, to identify where the delay actually occurs, and to suggest reasons for its occurrence. The thesis also provides effective guidelines that developers can follow to offer faster and more effective downloads and service to their users. Based on the results obtained, suggestions will be made on ways of optimizing files or data for transfer over the Internet and on how to make full use of technologies that can reduce congestion. The main issue is to understand the reasons for web delays and to come up with suggestions so that developers can overcome, to some extent, the effects of Internet delays.


1.2 Methodology

This thesis includes three phases. The investigations consist of measuring the effects of congestion by analyzing website download times, the types of content the sites carry, and some of the technologies they use.

In the first phase, a theoretical study of Internet delays and networking concepts is carried out to understand the basics of networking, its applications, and how delays occur. This gives a clear understanding of the working principles of the Internet and of its effects and defects.

In the second phase, the effects of Internet congestion are investigated by analyzing website download times, site content, and the way data is represented on the web. For this task, some popular websites are taken into consideration and the results tabulated: each web page is downloaded, and the latency from the moment the user makes the initial request until the transfer is complete is recorded. The sites are examined in three ways: normal downloads, cached downloads, and downloads using popup filters. This empirical phase also includes analyzing the content of the websites. The sites under consideration are measured for their content; although some characteristics, such as the number of embedded images in a web page, vary from one measurement study to another, the images are examined for size, and possible ways to compress them are tried. All images are compressed wherever possible and compared to the originals to see whether appropriate compression had been used. The application of web caching in reducing web delays, i.e. how caches are applied, is also studied.

In the third phase, a detailed and descriptive account of the above three phases is recorded, helping developers to identify the possible causes of web delays. The analysis reveals the substantial diversity of the web in virtually every respect. The location of the server on which each site resides is also gathered, since this is another main underlying factor in web delays; it helps with further analysis and yields as many reasons as possible to help developers build well-designed sites.


1.3 Outline of the thesis

Chapter 2 gives a theoretical view of web delays: the basics of the Internet and related topics such as the OSI model, which provide important background on congestion, together with the reliable transmission of bulk data over networks.
Chapter 3 covers techniques for decreasing congestion: web caching, and images and their compression techniques.
Chapter 4 discusses the migration of websites and the underlying reasons for doing it, which can help developers build effective websites.
Chapter 5 looks at tools for measuring download times, such as Visual Route, Netscape, GIMP, NeoTrace, Trace Route, and FilterGate.
Chapter 6 takes some websites in popular use and applies the tools to them to find out where the delay actually occurs.
Chapter 7 evaluates the results based on the study done in the previous chapters.
Chapter 8 gives guidelines for developers.
Chapter 9 concludes the thesis.


Chapter 2: Theoretical view of web delays

The Internet, one of the most powerful communication tools available to end users, has had a great impact on society since its introduction. It was originally developed mainly for defence applications. In 1973, the Defense Advanced Research Projects Agency (DARPA), a US organization, set out to build a technology (a protocol) that could connect computers across multiple networks. The project, originally called 'Internetting', gave rise to what came to be known as the 'Internet', and the protocols developed as part of the research became known as the TCP/IP suite (Transmission Control Protocol and Internet Protocol) [4]. Although the concept was successful from the start, there were early problems, such as the system crash that Charley Kline at UCLA experienced in 1969 when he tried to communicate with a machine at the Stanford Research Institute [5]. After its invention, the Internet was used only by professional bodies, computer experts, scientists, and the like; there was nothing in it for the ordinary person, and excellent knowledge was needed to use a computer system at all.

This chapter describes the topics needed to understand web delays: the basics of the Internet, how congestion occurs, data communication, and the layers of the OSI model most relevant to web delays, namely the network, transport, and application layers. The aim is to establish the theory needed to investigate the reasons behind web delays and to understand the working principles of the Internet. How the Internet works and how delay arises is the tricky question here, so these topics are studied first. A further issue is to understand how large (bulk) data is transferred reliably.

2.1 The basics of the Internet

The Internet is a combination of computer networks that cooperate to exchange information. Through media such as telephone wires, satellite links, and wireless technologies, Internet users can share information. The issue here is to understand how data travels from source to destination. It is the duty of the TCP/IP (Transmission Control Protocol/Internet Protocol) software to send, receive, and check the correctness of data packets.

The procedure by which data is transferred over the Internet is roughly as follows:

• The data is broken up into small units called packets.
• A header is added to each packet, indicating where it comes from and where it has to go (its destination).
• The packet is then transferred from the source and halted once its destination is reached.


• The packets travel to the destination at a rate that depends on network speed and traffic; if traffic is heavy, the data arrives slowly, and some of it may be lost, in which case it has to be sent again.
• After the packets are received, they are checked to see whether the data has been corrupted.

IP address [6]: Every computer has a unique address used to send and receive packets. A computer's Internet Protocol (IP) address gives the protocols the route they need for delivery. Before sending data, the machine looks up the destination's IP address, and information about both the source and destination IP addresses is attached to each packet sent over the network. An IP address is made up of four bytes (1 byte = 8 bits), expressed as four numbers between 0 and 255 separated by periods. For example, a computer's IP address might be 238.17.159.4, shown below in human-readable decimal form and in the binary form used on the Internet.

Example IP address
Decimal: 238.17.159.4
Binary: 11101110 00010001 10011111 00000100

Each of the four numbers uses eight bits of storage and can therefore represent any of the 256 values between zero (binary 00000000) and 255 (binary 11111111).
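As a minimal illustration (not part of the original thesis), the conversion between the dotted-decimal form and the binary form can be sketched in Python:

```python
def dotted_to_binary(address: str) -> str:
    """Convert a dotted-decimal IPv4 address to its four-byte binary form."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError("not a valid IPv4 address: " + address)
    return " ".join(f"{o:08b}" for o in octets)

print(dotted_to_binary("238.17.159.4"))
# -> 11101110 00010001 10011111 00000100
```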

Packet switching: Packet switching is the division of data into individual packets. The packets are sent to their destination over the network and rejoined when they have all arrived. If the data is not received in the correct order, some loss of data can be inferred; the user's machine is then automatically notified, and the data is retransmitted. A packet is sized according to requirements, and a typical packet length is about one kilobyte. The beginning of a packet is called the header. The header holds the following information about the packet:

• the source IP address, for reference by the destination
• the destination IP address, so the data is delivered to the exact location
• the length of the packet being sent
• the sequence number of the packet within the transmission

This information allows a packet to be delivered reliably. Packet headers also contain an error-detection code used to check whether the content is correct. If an error occurs during transmission and the data sent and received do not match, the data is retransmitted. Packet switching is performed by routers, which sit at the junction points of interconnected network links; they use routing protocols to forward packets toward the appropriate destinations and to reduce traffic. A good property of packet-switched networks is that they share bandwidth effectively and efficiently.
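The thesis does not define a concrete packet format, but as a purely hypothetical sketch, the four header fields listed above could be packed and unpacked with Python's struct module:

```python
import socket
import struct

# Hypothetical header layout (not a real protocol): source IP and
# destination IP as 4 raw bytes each, then packet length and sequence
# number as 16-bit unsigned integers, all in network (big-endian) order.
HEADER_FORMAT = "!4s4sHH"

def make_header(src: str, dst: str, length: int, seq: int) -> bytes:
    return struct.pack(HEADER_FORMAT,
                       socket.inet_aton(src), socket.inet_aton(dst),
                       length, seq)

def parse_header(raw: bytes):
    src, dst, length, seq = struct.unpack(HEADER_FORMAT, raw)
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), length, seq

header = make_header("238.17.159.4", "194.242.61.52", 1024, 7)
print(parse_header(header))  # ('238.17.159.4', '194.242.61.52', 1024, 7)
```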


Routers [7]: Routers are special computers dedicated to receiving and forwarding packets across networks. When data needs to be transmitted, it is sent with the necessary information, such as the source and destination IP addresses, so that it arrives accurately. It is the router's job to use routing algorithms to send the data to the correct destination. A router is connected to several different networks.

Characteristics of routers:

• Conforms to specific Internet protocols, such as the Internet Control Message Protocol (ICMP).
• Interfaces to two or more packet networks: routers encapsulate and decapsulate datagrams, receiving and sending them up to their capacity, and map IP addresses onto the appropriate network paths.
• Receives and forwards datagrams and generates ICMP error conditions.
• Provides network management and support facilities.

2.2 Layered view of networks [8]

Open Systems Interconnection (OSI) is a reference model published by the International Organization for Standardization (ISO) as a framework of standards for network communication. It is the primary architectural model for the Internet and intranets, and most communication protocols in use today relate to it. The model divides the communication process into seven layers, in order: physical, data link, network, transport, session, presentation, and application. Each layer has its own functionality. Broadly, the transport, session, presentation, and application layers are responsible for end-to-end communication between source and destination, while the remaining layers are responsible for communication between network devices. The functions of each layer are as follows:

Layer 1: Physical layer
• Handles the physical means of transmitting data between network devices.
• Defines the cable or other physical medium itself.
• Defines optical and mechanical characteristics.

Layer 2: Data link layer
• Sends data as frames.
• Performs error detection and correction.
• Handles the physical and logical connections to the packet's destination.
• Defines the format of data on the network.

Layer 3: Network layer
• Breaks larger data into small packets and transmits them to the receiver.
• Routes packets to the exact destination.
• Provides flow and congestion control.

Layer 4: Transport layer
• Subdivides the user buffer into network-buffer-sized datagrams and enforces the desired transmission control.
• Maintains end-to-end communication of data.
• Provides reliable transmission of data by acknowledging errors.
• Provides connection-oriented or connectionless packet delivery.

Layer 5: Session layer
• Defines the format of the data to be sent over the network.
• Reports errors occurring in the upper layers.
• Controls the establishment and termination of logical links between users.

Layer 6: Presentation layer
• Converts the local representation of data to its canonical form and vice versa.
• Encodes and decodes data.
• Compresses and decompresses data.
• Specifies an architecture-independent data transfer format.

Layer 7: Application layer
• Provides network services to the end user, e.g. mail, FTP, telnet.
• Provides standardized services such as virtual terminals and file transfer.

The network layer: The network layer is the third level of the OSI model. It responds to requests issued by the transport layer and in turn issues requests to the next layer down, the data link layer. The network layer converts logical addresses into physical addresses and determines the route from the source entity to the destination entity. Its main responsibility is to manage traffic problems such as switching and routing. It also manages the functional and procedural means of carrying data from sources to destinations in a network, performing routing, flow control, segmentation and reassembly, and error control. The main function of the network layer is to transfer data all the way from its source to its destination: if a host cannot be located at this layer, it cannot be contacted at all. The network layer does all this in a very basic way, without error detection or flow control. It is among the most challenging layers in the protocol stack; one of its biggest challenges is routing datagrams through a network of millions of hosts and routers. Problems such as the scaling problem are overcome by partitioning large networks into small, independent domains called autonomous systems (AS); each autonomous system routes its datagrams independently. To sum up the network layer, its basic functionality can be divided into three parts:

a) Routing
b) Congestion control
c) Internetworking


Routing: Routing is the selection of paths for data packets. Path selection is usually performed by the network, though the source can also specify the path if it has the necessary information, such as the exact IP address of the destination. The main desirable characteristics of routing are:

• Stability
• Optimality
• Fairness
• Robustness

Congestion control: Congestion control is a main function of the network layer. The overall performance of the system degrades if the work the system has to do is not performed within the correct span of time; that is, if there are too many packets waiting to be delivered, the network is said to be congested. How exactly congestion occurs is the main concern here. It relates to the number of router (IMP) buffers actively taking part in the system: irrespective of network capacity, if the buffers are serviced too slowly to keep up with the traffic, congestion results. There must therefore be ways to reduce its effects and minimize the chance of congestion occurring; that is what congestion control is. Some ways to control congestion are packet discarding, preallocation of buffers, and flow control.

Internetworking: Internetworking is the ability of networks to send a message from one place to another. Although congestion control and routing are important, internetworking is also an important part of this layer.

The transport layer: This is the fourth, middle layer of the OSI model and is concerned mainly with the transport of data. Layers 1, 2, and 3 are concerned with addressing, routing, and transmitting data, and the transport layer depends on these lower layers to handle the process of moving data between source and destination. Just as modern computers have the powerful feature of multitasking, software applications can send and receive data at the same time; the transport layer is therefore regarded as responsible for end-to-end transport. Its key function is to provide connection services for the protocols and applications that run above it. The famous example is TCP/IP, where TCP is connection-oriented and IP is connectionless; both are important in practice because of their complementary features. Whereas the network layer cannot guarantee delivery of a message, the transport layer can be designed with algorithms that acknowledge delivery to the user, ensuring reliability. However, not all protocols at this layer ensure reliable data transfer: within TCP/IP, TCP includes reliability and flow control, whereas UDP has no such features.

Functions of the transport layer:

• Flow control
• Segmentation
• Connection establishment
• Multiplexing and demultiplexing

Flow control: This feature allows the source to specify the rate at which it transmits data to the destination, so that the destination entity can receive it correctly and no mismatch arises between the source's transmission rate and the destination's receiving rate.

Segmentation: The transport layer splits data into small chunks on the source side and reassembles them after they arrive at the destination terminal. Doing this supports reliable transmission of data.

Connection establishment: Transport-layer protocols are responsible for establishing the connection between source and destination. The transport layer has to keep this connection open while the data is being sent over the network and then terminate it after the data has been received in full.

Multiplexing and demultiplexing: These are the two methods by which this layer transports data. On the source side, data received from various sources is multiplexed, combined into a single stream; at the destination it is demultiplexed, with each piece of data directed to the appropriate recipient.

The application layer: The application layer is the topmost layer of the OSI model and is used by network applications; it serves the user performing tasks over the network. It is important to understand what this layer really is, because its name, 'application', has a special meaning here. Consider a general example: when we use a web browser, the browser in front of our eyes is an application running on the computer, but it does not itself reside in the application layer; rather, it makes use of the services of the HTTP protocol. Not all user applications use the application layer directly, as the browser case shows; it is the operating system's responsibility to take care of what different programs do over the network. In general, when you are involved in activities such as sending mail or using chat programs, you are dealing directly with the application layer. Some popular application-layer protocols are HTTP, FTP, SMTP, DNS, Telnet, and POP3. Being the top layer, it is the only layer that does not provide services to a layer above it; it serves the programs that communicate directly over the network. This layer can therefore be regarded as responsible for issuing the appropriate commands in response to requests received from the lower layers.

2.2.1 TCP/IP Stack


The Internet protocol stack is the set of protocols that implement the Internet's protocol suite. It can be described in terms of the OSI model, which defines the seven basic layers of a protocol stack. In a protocol stack, each layer contributes to solving the problems involved in transmitting data and provides solutions to the next level; the higher layers are closer to the end user and handle the data, relying on the lower layers beneath them. The Internet model was designed to deal with practical engineering problems, whereas the OSI model is more theoretical and was derived at an earlier stage of the Internet's evolution. Hence the TCP/IP model is used more than OSI, as it is easier to understand and implement.

A simplified TCP/IP interpretation of the stack is shown below:

5 Application   e.g. HTTP, FTP, DNS
4 Transport     e.g. TCP, UDP, RTP, SCTP
3 Network       for TCP/IP this is the Internet Protocol (IP)
2 Data link     e.g. Ethernet, Token Ring
1 Physical      e.g. physical media and encoding techniques, T1, E1

Transmission Control Protocol (TCP): TCP is the most widely encountered term in networking because of its importance relative to other technologies; in general it is the most commonly used protocol. It has a checksum property that detects errors occurring during the transfer of data. The simple idea behind the protocol is that it breaks the data into a set of packets on the sender's side, and the receiver joins the packets it receives back together, ensuring the correctness of the data. Some of the applications that run over TCP are [9]:

Simple Mail Transfer Protocol (SMTP):
• Mail transfer between hosts
• Mailing lists, mail forwarding, return receipts
• Does not specify how to create messages

File Transfer Protocol (FTP):
• Transfers files between hosts
• Provides access control (user name and password)
• Binary and text files are supported

Remote login (Telnet):
• Initially designed for simple scroll-mode terminals


User Datagram Protocol (UDP) [10]: The User Datagram Protocol is similar to TCP except that it is a connectionless protocol. Unlike TCP, it provides almost no error-recovery services, offering instead a direct way to send and receive datagrams over an IP network. It is used primarily for broadcasting messages over a network. It has drawbacks compared with TCP: with TCP, the receiver receives packets in the order the client sent them, so error detection is easy; with UDP, the receiver may receive packets in a different order, so the scope for finding errors is much smaller. This makes UDP less suitable than TCP for many tasks, and it is not used as commonly. If data sent over UDP must nevertheless be checked, the application itself has to spend time searching for errors, and some delay is generated.
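The connectionless nature of UDP is visible directly in the socket API. A minimal sketch (assuming some listener exists on 127.0.0.1 port 9999; the address is a placeholder):

```python
import socket

# UDP: no connection is established; the datagram is simply sent,
# with no guarantee of delivery or ordering.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello", ("127.0.0.1", 9999))
udp.close()

# TCP: a connection must be established first; the protocol then
# guarantees ordered, checked delivery of the byte stream.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("127.0.0.1", 9999))
tcp.sendall(b"hello")
tcp.close()
```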

2.3 Internet Congestion [11]

Congestion is a condition that arises when the network is flooded with too many packets. Heavy use of the Internet, although it has benefited people in many ways, has led to long web delays (congestion) for Internet users: the more network traffic, the more delay. The increase in traffic is a negative effect that may outweigh the positive ones, resulting in congestion. Only the protocol software can detect congestion and reduce the rate at which packets are sent. In general, data transmission on the Internet uses packets: each piece of information is a collection of packets sent over the network. When sending data, care must be taken that each packet contains both the sender's IP address and the receiver's; the information has to be broken down into appropriately formed packets carrying all the relevant information. The next important issue is routing, i.e. in which direction a packet has to be sent and to which destination. The routing algorithm plays a vital role in congestion control. Suppose a person X in India wants to send data to a person Y in Sweden: the shortest appropriate route has to be selected, and if there are already a lot of packets heading in that direction, another path has to be chosen so that no traffic jam is created. The destination is not responsible for any problems that occur along the way and can respond only once it has received all the packets from the source. Bandwidth and routing must therefore both be taken into consideration to escape the congestion problem.


2.4 Client/Server Communication [12]

Client/server communication involves certain steps to establish proper communication between two hosts. The user entering a URL into a browser is considered the client, and whatever serves the request is the server. Requests made by the user can be served from physical storage called cache memory, located either locally or remotely; an example of local space is a university server. When the user issues a request, the server accepts the URL, checks it for correctness, and responds to the client. For this to work, the whole system has to follow a set of rules, a protocol; in situations where the protocols are not followed, there are long delays in the transmission of the requested data. The general client/server communication environment is pictured below.

Figure 1: Client/server communication
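To make the request/response cycle concrete, here is a minimal sketch (not from the thesis) of a client and server exchanging one message over TCP on the local machine:

```python
import socket
import threading

# A toy server socket, bound and listening before the client connects.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 9999))
server.listen(1)

def serve_once() -> None:
    conn, _addr = server.accept()
    with conn:
        request = conn.recv(1024)        # receive the client's request
        conn.sendall(b"OK: " + request)  # respond to the client

threading.Thread(target=serve_once, daemon=True).start()

# The client issues a request and waits for the server's response.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
    client.connect(("127.0.0.1", 9999))
    client.sendall(b"GET /index.html")
    print(client.recv(1024))             # b'OK: GET /index.html'
server.close()
```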

2.5 DNS (Domain Name System) [13]

The Transmission Control Protocol (TCP) implements packet delivery across the various layers of the network topology; each layer performs its function irrespective of the implementation of the other layers, and delivery of packets between two hosts is taken care of by TCP. In this context, translating names in a scalable way is the job of the Domain Name System (DNS), which translates hostnames to the appropriate IP addresses and IP addresses back to hostnames.


In general it is impractical to rely on IP addresses, because an address may change if a company moves its website to a new hosting service; requests sent to the old address would then fail even though the user wants to reach the same website. The main advantage of using names is the flexibility of always contacting the appropriate server. Every web browser, be it Internet Explorer, Mozilla, or Mosaic, has a software library called a resolver that is used to query the DNS. The resolver performs two main functions: (i) gethostbyname() and (ii) gethostbyaddr(). gethostbyname() converts a hostname to an IP address, and gethostbyaddr() is its inverse.

Figure 2: Domain Name System (resolution of www.umu.se via the local name server dns.umu.se, the root name server, the intermediate server dns.kth.se, and the authoritative name server dns.cs.kth.se)

Advantages:

1. DNS incorporates caching, which reduces latency and delay on the Internet.
2. It is easier for end users to remember the name of a website than its IP address.

DNS uses the User Datagram Protocol (UDP) for sending queries and receiving responses; UDP is used because it allows communication without a connection having to be established first. [14]
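Python's socket module exposes these same two resolver calls, so the translation can be tried directly (the hostname is just an example):

```python
import socket

# Hostname -> IP address (forward lookup).
address = socket.gethostbyname("www.umu.se")
print(address)

# IP address -> hostname (reverse lookup); fails if no reverse record exists.
try:
    hostname, _aliases, _addresses = socket.gethostbyaddr(address)
    print(hostname)
except socket.herror:
    print("no reverse mapping for", address)
```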


2.6 Hypertext Transfer Protocol (HTTP)

The Hypertext Transfer Protocol is used to transfer web documents over the Internet. It has two versions: HTTP/1.0, defined in RFC 1945, and HTTP/1.1, defined in RFC 2068. When a connection has been successfully made, the client transmits a request to the server, which in turn answers the client's request. HTTP messages are human-readable, and an HTTP server can even be operated manually.
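Because HTTP messages are human-readable text, a request can even be composed by hand over a TCP socket. A minimal sketch (the host is a placeholder):

```python
import socket

host = "example.com"  # placeholder host
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect((host, 80))
    # An HTTP request is just readable lines terminated by a blank line.
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    s.sendall(request.encode("ascii"))
    response = b""
    while chunk := s.recv(4096):
        response += chunk

print(response.split(b"\r\n", 1)[0])  # status line, e.g. b'HTTP/1.1 200 OK'
```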

2.7 Reliable Transmission of Data (Bulk)

Sending huge amounts of data over a network is always hazardous for users, simply because of network delay and the congestion that occurs while the data is being sent. If, when developing a website, the developer has not written proper code for the arrangement of the packets to be sent, i.e. the order in which the data has to go out, the receiver may lose some packets, and the loss cannot be detected instantaneously. When a user requests data, the transmitter first checks whether the requested item is available on the server; if so, the transmitter sends the packets. The route they travel is taken care of by the routers, and as the packets pass through the different layers of the network, each layer runs its own algorithms on them; the router's duty is only to set the route the packets travel. The management protocols, specified in RFCs (Requests for Comments, www.rfc-editor.org), determine how long to keep sending data, and if a problem arises they stop the transmission.

Some means of recovering from this kind of problem are provided by FEC (Forward Error Correction) and FED (Forward Error Detection) algorithms. The scenario runs as follows: once the data to be sent has been set apart, the packets are grouped into an object and sent through an encoder to the receiver. The encoder's redundancy lets the receiver reconstruct the data as the original, and this has been found to be well suited to sending huge data over networks while overcoming loss of data in transmission. To describe it with an example: if there are n units of data, divide them into m packets; the transmitter groups these messages into one object, and when the receiver gets it, the decoder checks whether n == m and delivers the data to the user. If an error is detected, an acknowledgement is sent stating that the data has not yet been received in full. The use of FEC algorithms has several advantages:

1) High redundancy: lost data can be recovered repeatedly.
2) No scalability problems: there are no interdependent messages.
3) No need for acknowledgements, since the full data is transmitted.

The demerit of using FEC and FED algorithms is the cost of encoding and decoding the data. Various results have shown that encoders and decoders require O(k) operations per packet produced, and the complexity of decoding the data on reception can be reduced to O(log k) [15].
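As a toy illustration of the FEC idea (far simpler than the coding schemes analysed in [15]), one extra XOR parity packet lets the receiver reconstruct any single lost packet without retransmission:

```python
from functools import reduce

def xor_parity(packets: list) -> bytes:
    """Byte-wise XOR of equal-length packets, used as a parity packet."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

def recover(received: list, parity: bytes) -> list:
    """Reconstruct a single missing packet (marked None) from the parity."""
    missing = received.index(None)
    present = [p for p in received if p is not None]
    received[missing] = xor_parity(present + [parity])
    return received

packets = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(packets)
damaged = [b"AAAA", None, b"CCCC"]   # packet 1 was lost in transit
print(recover(damaged, parity))      # [b'AAAA', b'BBBB', b'CCCC']
```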


2.8 Summary

The aim of this chapter was to understand the basics of how congestion occurs and what each layer of the OSI model does. The working procedure of communication gives us an understanding of how the Internet works and of the basic concepts of data communication: how data is sent over the network is the key issue, and the basic underlying concepts of how delay arises have now been covered. Specific topics such as the occurrence of congestion, data communication, and the OSI layers most relevant to web delays, namely the network, transport, and application layers, have been discussed. The theory needed to investigate the reasons behind web delays, and the working principles of the Internet, can be understood to some extent from these topics. The discussion of how bulk data can be transmitted shows that the main intention there is to reduce average latency: by using FEC and FED, the problem of delay can be overcome to some extent. What we must study next is the more detailed material that can yield solutions to web delays.

The coming chapter looks at key techniques that can overcome the problem of delay to some extent. Web caching is discussed to show how the properties of a cache help the computer use cache memory to access the most recently visited websites. We also look at the main causes of slow page downloads, for which we first have to examine issues such as images and their compression techniques.


Chapter 3: Techniques for decreasing congestion

The earlier chapter covered the key concepts needed to understand what congestion means and how it occurs. We studied both the conceptual and the implementation aspects of network applications; topics such as TCP, UDP, and DNS gave us the basics of what the Internet is and how information flows. To overcome the problem of congestion, we now need to understand caching and the reliable transmission of bulk data. Equipped with knowledge of the Internet's structure and protocols, we are ready to head further into the key areas. The main aim of this chapter is to explain what a web cache is, how caching is done, and what the goals behind these technologies are. Techniques for decreasing congestion are analyzed by studying how content is brought close to the user and how capacity can be increased. We also have to understand image techniques, their evolution, and their technologies; how much the size of an image can vary is the main issue there.

3.1 Moving content close to the user

The name 'cache' itself literally indicates storage. A cache stores data recently requested by the user, so that the next time someone requests the same thing, the data can be served from the cache; this improves the efficiency of data access. When requested data is found in the cache, we speak of a 'hit'; if the data cannot be found there within a short span of time, we speak of a 'miss'.

In general, a cache (pronounced 'cash') is temporary storage space inside the computer holding the most recently accessed data. There are several kinds of cache; the most common to discuss is the browser cache, and every browser has a built-in cache facility. For example, a web browser automatically stores the most recently visited web pages on the computer's hard disk, so when the user requests the same data again, the browser looks in the cache directory instead of searching the actual server location again. When the user presses the back button, the browser compares the cached page with the page being requested and presents the user with the necessary data. Popular web browsers such as Mozilla, Netscape, and Internet Explorer (Microsoft) all do this. Netscape lets the user control how much space is allocated for caching and also allows the cache to be refreshed altogether in a flexible way; Internet Explorer likewise allows control over how much space is allocated and when the cache is refreshed, but less flexibly. On the whole, people use 10 to 100 MB of space for these browser caches.


There are other types of cache used in computers as well: the memory cache and the disk cache. A memory cache is a portion of the computer's storage specially maintained for instant retrieval of data by the computer. A disk cache uses special files on the computer's disk, typically for personal use. The main intention of both is to improve overall speed.

Caching strategies are being continuously developed, and more and more research is being done by developers and designers to increase the overall performance of computers and remain competitive in the present market. In short, the main goals of caching can be viewed as follows:

1) The data requested and retrieved has to match the origin server's copy, i.e. equivalence with the main server must be maintained, and responses must be revalidated.

2) The data has to be refreshed each time the user visits the page, i.e. synchronization of the data has to be achieved by fetching the actual response and giving it to the user who requested it. [16]

Shared caches [17]: The cache maintained by a web browser is a private web cache; a caching proxy server is a shared web cache. It is a program running on a dedicated server that stores documents recently used by end users and serves them again when they are requested. The proxy returns a requested document if it holds it; if the document is not found, the proxy contacts the origin server, fetches the document, and stores it temporarily for upcoming requests, keyed by URL. By storing and serving documents this way, a shared cache has the following advantages:

• A caching proxy can improve system response time.
• It reduces traffic on the network it takes part in.
• It increases the effective bandwidth available to end users.
• It frees up space for new objects by deleting older files that are no longer requested.

Shared proxies generally use an LRU (least recently used) algorithm to determine which documents to remove and which to keep.
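A minimal sketch of the LRU policy (illustrative only, not the code of any real proxy), keyed by URL:

```python
from collections import OrderedDict

class LRUCache:
    """A tiny URL -> document cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, url: str):
        if url not in self.store:
            return None                     # cache miss
        self.store.move_to_end(url)         # mark as most recently used
        return self.store[url]              # cache hit

    def put(self, url: str, document: bytes) -> None:
        self.store[url] = document
        self.store.move_to_end(url)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("http://a.example/", b"A")
cache.put("http://b.example/", b"B")
cache.get("http://a.example/")              # touching A keeps it fresh
cache.put("http://c.example/", b"C")        # evicts B, the LRU entry
print(list(cache.store))  # ['http://a.example/', 'http://c.example/']
```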

3.2 Images and compression techniques

In the previous section we went through what a web cache is, how caching is done, and what the goals behind those technologies are, and we have already seen how bulk data transmission works. Now we have to discuss another possible reason for web delay: images. This section looks at images and their compression techniques to understand the effect images have on website download times and how much they contribute to congestion.


With the introduction of picture archiving and communication systems (PACS), the representation of data has developed a long way from paper to digital form. The way images are shown in digital form has a great effect on Internet-based systems: the larger the picture, the more slowly the page tends to be displayed. Take a broad example: today companies market their goods more easily by putting photos of them on their websites. If the images are too large, users have to wait a long time for them to download, and the website costs the company more to maintain; if this happens frequently it leads to loss of customers and degradation of sales. This is a general kind of problem. There are many image file formats, each with merits and demerits. The most common are GIF (Graphics Interchange Format), JPEG (Joint Photographic Experts Group), TIFF (Tag Image File Format), and, the most common now, PNG (Portable Network Graphics).

GIF has been used almost since the Internet was invented. An image in GIF format loses no data when compressed; for this reason it was a success, but it has a disadvantage too: it is limited to 256 colours or shades of gray, which makes it difficult for developers and users to see an image with the colours it ought to have. The JPEG format came after GIF, though not as a wholesale improvement: it has advantages over GIF but also problems of its own. The main reason for using it was to store images and transmit them pixel by pixel; its main advantage is the capacity to compress larger images, achieving compression to almost a quarter of the original size, which takes less space and makes images easy to transfer and reliable to display. The only disadvantage of JPEG compression is that compressing an image this way loses some data, which degrades the image. Then comes the TIFF format for compressing images, somewhat better than the previous approaches. It was developed by Aldus (with Microsoft's involvement) and is now looked after by Adobe; for example, a sample picture whose resolution has been increased without loss of data using Adobe software can be viewed here.

Figure 3: Compressed picture

TIFF was used when a complicated image needed to be compressed. The header of the file is followed by packets of data grouped together, called tags, which make it possible to display the image. It lets users see the image more clearly, with good resolution. The main advantage of the TIFF compression technique is that it can support the full range of images without any loss of data during compression. The one problem that caused the format to fall out of demand is its result: when compressed, the image still takes a lot of space to store. Then comes the famous PNG format, now the most widely used; the main reason for building that compression technique was to have one that gives lossless image compression results. [18]
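As an illustration of the compression step discussed here (a sketch assuming the third-party Pillow library is installed; the file names are placeholders):

```python
import os
from PIL import Image  # third-party: pip install Pillow

original = "product.tiff"   # placeholder input file
lossy = "product.jpg"       # JPEG: small, but discards some image data
lossless = "product.png"    # PNG: larger, but loses nothing

img = Image.open(original)
img.convert("RGB").save(lossy, "JPEG", quality=75)
img.save(lossless, "PNG", optimize=True)

for path in (original, lossy, lossless):
    print(path, os.path.getsize(path), "bytes")
```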

3.3 Summary

From the above discussion it can be understood that a cache is both server and client at the same time: when it receives requests from a browser and sends responses to it, it acts as a server, and when it sends requests to the origin server and receives responses from it, it acts as a client. The goals make it clear that the data sent and received should match without any loss.

From the discussion we have also understood the evolution of the available image technologies and how images can be a reason for web delays. Images have a serious effect to some extent, and compression is necessary to overcome the problem of congestion; it helps developers build efficient websites. We must also discuss migrating load between web servers to reduce latency: it is not only images that contribute to congestion, but also companies that do not balance the load of their web servers. To understand migration, we have to discuss what it is and how it is done.


Chapter 4: Website migration and load balancing of web servers

In many software development companies, people treat migrating and load balancing web servers as a huge cost and neglect them. But just as uncompressed images lead to congestion problems, this too has a serious effect on web delays. The main aim of this chapter is to discuss the migration of websites and the underlying reasons for doing it, which can help developers build effective websites so that congestion is reduced at least to some extent.

4.1 Migration of websites

Latency, the term used on the Internet for the average turnaround time, has a huge effect on Internet access. To keep it down, web hosting companies balance their loads, but many fail to follow this technique because of its heavy cost. The main objective of the load balancing technique is to reduce latency: given the heavy load on the network, balancing the load is essential for reliable access to the Internet. But cost makes it hard for mid-sized IT companies to make effective use of load balancing techniques. The main advantage of migrating websites between different servers is that the technique does not require high-capacity load balancing switches between users and web servers. The load balancing techniques used by web hosting companies include:

1) IP sprayer
2) Domain Name System
3) Load balancing of low-end web hosting clusters

IP sprayer: Also called the one-IP approach, this is an important load balancing technique. It reduces load by distributing requests to front-end web servers using switches, thereby removing some load, i.e. sharing it equally among the front-end and back-end web servers. Caches take all the data at once and let the front-end web server work efficiently. Because of its heavy cost, it is not commonly used by web hosting companies.

Domain Name System: DNS-based balancing is a small variation on the IP sprayer described above. It does not require switches as the IP sprayer does; here the network load is balanced by sharing it between the DNS server and the front-end servers. The technique has one main demerit: the IP numbers are stored in the DNS server, which leaves less control over the server.


A well-known algorithm for controlling the load balancing of websites is the Agarwal greedy algorithm. Its main objective is to reduce latency and the variance in it. The technique it follows is simple: it takes into consideration the cost associated with migrating content between web servers. Let m be the number of web servers and n the number of websites. First, all websites are sorted in decreasing order of load, and the web servers are sorted according to their loads. In each of k iterations, the website with the greatest load is shifted to the web server with the least load; for another k iterations, the process moves in the complementary order. Overall, sorting the websites takes O(n log n + m log m) time [19] (a sketch of the greedy assignment follows the summary below).

4.2 Summary

In the above discussion we have come to understand migration of load and its use in the present Internet world. The main objective of the load balancing technique is to reduce latency; with heavy load on the network, balancing it is essential for reliable access to the Internet. The underlying principle is straightforward and is implemented by almost every firm to offer the best service to its customers. Having made an extensive theoretical study of topics such as network layers, TCP/IP and UDP, web caching, reliable transmission of data, and images and compression techniques, we must now look at some of the techniques and tools with which web delays can be measured, in order to draw conclusions about the reasons behind congestion.
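A minimal sketch of the greedy assignment described above (our own simplification under stated assumptions, not the published algorithm): sites are sorted by decreasing load and each is placed on the currently least loaded server.

```python
import heapq

def greedy_assign(site_loads: list, m: int) -> list:
    """Assign n sites to m servers, heaviest site first onto the
    currently least loaded server; returns site indices per server."""
    servers = [(0.0, s) for s in range(m)]       # min-heap of (load, server)
    heapq.heapify(servers)
    assignment = [[] for _ in range(m)]

    # Sites in decreasing order of load: the O(n log n) sorting step.
    for i in sorted(range(len(site_loads)), key=lambda i: -site_loads[i]):
        load, s = heapq.heappop(servers)         # least loaded server
        assignment[s].append(i)
        heapq.heappush(servers, (load + site_loads[i], s))
    return assignment

print(greedy_assign([9.0, 5.0, 4.0, 3.0, 2.0], m=2))
# -> [[0, 3], [1, 2, 4]]  (server loads 12.0 and 11.0)
```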


Chapter 5: Tools for studying congestion

After discussing the various technical terms relating to web delays, we now have to do some research work through which the web delays can be measured to some extent.

The main aim of this chapter is to discuss which tools can provide the best results and point to further reasons for web delays. The tools used to obtain results are Visual Route, Netscape, GIMP, NeoTrace, Trace Route, and FilterGate.

5.1 Visual Route

The Visual Route server provides a graphical traceroute from the server to any other network device you choose, useful for pinpointing network connectivity problems and identifying IP addresses. Visual Route helps determine whether a connectivity problem is due to your ISP, the Internet, or the host you are trying to reach, and pinpoints the network where the problem occurs.

In general, the Visual Route tool [20]:

• identifies the source of Internet delays;
• shows the physical location of Internet servers and routers;
• gives instant lookups of Whois and network provider information.


Figure 4: Report for www.visualroute.it (194.242.61.52) using the Visual Route tool

In general, when a computer (host) wants to communicate with another computer, the message is sent as packets to the machine with the desired IP address, so every computer connected to the Internet must have an IP address. It is the duty of the routers to deliver the packets to the appropriate destination; if no acknowledgement is received, the packets are re-sent until a reply comes from the destination.

To understand the trace above, one has to know the meaning of the following terms:

TTL (Time to Live): The time to live is an integer value between 0 and 255 carried by each packet. Every router that forwards the packet decrements the TTL by one; when it reaches zero, the packet is discarded and an ICMP "TTL expired" message is sent back to the source. Traceroute tools exploit this to discover the routers along a path.

Hop: Each passage of a packet from one router to the next is a hop; if a packet crosses 12 routers on its way, 12 hops have occurred.

ICMP (Internet Control Message Protocol): ICMP takes care of error reporting and control for packets. When a receiver does not receive the complete set of packets, ICMP alerts the sender, and when all packets have arrived it acknowledges the success.


In the trace above we can see the route to the website www.visualroute.it and the number of hops that occur while tracing it. Here the tracert command sends ICMP echo packets toward the destination machine step by step, with increasing TTL values (eliciting "TTL expired in transit" responses), until the destination answers. Even on a fast Internet connection there can be some packet loss, and on a very slow connection the transfer can take a long time. Sometimes routers along the way do not send the ICMP "TTL expired in transit" messages, which looks like high packet loss at a particular hop; all it really means is that that particular router is not responding to ICMP echo.
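The same measurement can be scripted around the operating system's own utility; a small sketch (the command is tracert on Windows and traceroute on Unix-like systems, and the target host is just an example):

```python
import platform
import subprocess

host = "www.visualroute.it"
tool = "tracert" if platform.system() == "Windows" else "traceroute"

# Run the system utility and print one line per hop.
result = subprocess.run([tool, host], capture_output=True, text=True,
                        timeout=120)
for line in result.stdout.splitlines():
    print(line)
```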

5.2 Netscape

Netscape 6.2 is the browser used to download all the websites. It has a built-in option to display the download time of any website requested by the user; the download time is displayed in milliseconds. The reason for using it is to improve the accuracy of the results. [21]

Figure 5: Netscape (download time display)
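The thesis relied on Netscape's built-in timer for all measurements. Purely for reproducibility, a comparable measurement can be scripted; the sketch below uses only the Python standard library, the URL is just an example, and, unlike a browser, it times only the page itself and not its embedded images.

import time
import urllib.request

def time_download(url):
    """Fetch a URL once and return (bytes received, elapsed milliseconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        body = response.read()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(body), elapsed_ms

size, ms = time_download("http://www.bbc.co.uk/")
print(f"{size} bytes in {ms:.0f} ms")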

5.3 GIMP

GIMP is an image editor that was used to reduce the size and contrast of the images on the websites studied. The user can choose the compression scheme to apply and compare the compressed image with the original before committing the change. This made it an excellent piece of software for this thesis, since a comparison could be made to verify that compression did not noticeably affect image quality. [22]


Figure 6: compression using GIMP Software

Here the new size of the compressed image is displayed in kilobytes.
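The compression itself was performed interactively in GIMP. Purely as an illustration, an equivalent batch step could be written with the Pillow imaging library; this is our own assumption rather than part of the thesis toolchain, and the file names are invented examples.

import os
from PIL import Image  # Pillow; an assumed stand-in for interactive GIMP work

def compress_image(src_path, dst_path, quality=60):
    """Re-encode an image as JPEG at the given quality; return sizes in KB."""
    Image.open(src_path).convert("RGB").save(dst_path, "JPEG",
                                             quality=quality, optimize=True)
    before = os.path.getsize(src_path) / 1024.0
    after = os.path.getsize(dst_path) / 1024.0
    return before, after

before, after = compress_image("logo.png", "logo.jpg", quality=60)
print(f"{before:.2f} KB -> {after:.2f} KB")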

5.4 NeoTrace [23]

The NeoTrace tool was used in this thesis to measure the distance to a server's location. The main aim of using this tool was to identify whether delay on the Internet is caused by distance. It is a simple tool: the user just enters the URL of the desired website and in return gets the distance to its server.


Figure 7: NeoTrace

In the diagram above we can see that the user gets a map showing exactly where the desired server is. It clearly shows how the server locations are interconnected from the starting point to the end point. On the whole, it is a tool that can show what effect distance has on Internet downloads.
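NeoTrace reports the distance itself. For reference, the great-circle distance between two geolocated hosts follows from the standard haversine formula, sketched below; the coordinates in the example are rough, made-up values.

import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Example: roughly Umeå (63.83 N, 20.26 E) to London (51.51 N, 0.13 W)
print(f"{haversine_km(63.83, 20.26, 51.51, -0.13):.0f} km")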

5.5 Trace Route [24]:

Trace Route is a measurement tool that was also used in this thesis to trace routes and observe their performance. An interesting thing to know is the location of a web server and how it is connected to the Internet. Traceroute is a network utility often used to troubleshoot network connections; in a UNIX or Windows environment it can determine the specific network route taken from your workstation to a given remote host, using the TTL/ICMP mechanism described in section 5.1. Fortunately, there are many UNIX systems on the Internet that allow you to originate a trace route from their location to any other location you specify.

The paths along which information can flow are shown in the diagram below, which illustrates in how many different ways data can be transmitted through a network.


Figure 8: Information flow paths.

If we take a look at the following site, we get a clear picture of tracing a website from the calling site to its destination; see the picture below.


Figure 9: trace of www.ardent-india.com

In this example, bostream gets its connectivity from "breadband.com", which gets its connectivity from the IP address "64.215.185.81", which is in turn connected to gblx.net. It then appears that the ardent-india.com website is connected through Asia Netcom Corporation, San Jose. Recognize that an organization's website may not be physically located at the organization; it may be hosted somewhere else. A more accurate approach to determining the location of the organization might be to do a trace route to the organization's mail host or proxy server.


5.6 FilterGate

This thesis used the FilterGate tool, which is popup-filter software, to see the effect of the popup windows that are opened when a page is requested in a browser. We come across many kinds of popup windows when we visit popular websites.

The tool recognises the user's request and blocks the unwanted windows, thereby increasing the effective speed of the browser. As a result the page is downloaded faster, which improves the user's satisfaction. [25] A naive sketch of the underlying idea follows.
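FilterGate's actual filtering rules are proprietary, so the sketch below is only a naive illustration of the general idea: match each requested URL against a blocklist and suppress matching windows. The patterns are invented examples.

# Invented example patterns; real popup filters use far richer rules.
BLOCKED_PATTERNS = ("popup", "/ads/", "doubleclick")

def should_block(url: str) -> bool:
    """Return True if the requested URL matches any blocked pattern."""
    u = url.lower()
    return any(p in u for p in BLOCKED_PATTERNS)

for url in ("http://example.com/index.html",
            "http://ads.example.com/popup.html"):
    print(url, "->", "blocked" if should_block(url) else "allowed")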

Figure 10: Filter Gate

5.7 Additional Tool: "Ethereal"

Let us look at an additional tool called Ethereal, which is used to analyse network packets. Although it was not used in this thesis, it is discussed briefly in this section for completeness.

Ethereal captures network packets and displays the packet data in as much detail as possible for the end user. It is used as a device to inspect the data and all the information flowing through a network transmission. It will not warn you when something strange happens on your network; Ethereal only shows what is really going on inside the network.


The advantages of using Ethereal are as follows:

• low cost and easy to maintain
• troubleshooting network problems
• examining security problems
• debugging protocol implementations
• learning network protocols

Ethereal is perhaps one of the best open-source packet analysers available today, and it runs on almost any operating system without significant limitations. A sample screenshot below shows how packets are captured and their contents made available for viewing.

Figure 11: Ethereal captures packets and allows you to examine their content
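To give a flavour of what packet capture involves at the lowest level, here is a minimal Linux-only sketch using a raw socket. It is our own illustration rather than anything Ethereal does internally: it needs root privileges, and it decodes only the Ethernet header of each frame.

import socket
import struct

# Linux-only sketch: an AF_PACKET raw socket sees every Ethernet frame
# on the host. 0x0003 is ETH_P_ALL ("all protocols"); root is required.
sniffer = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                        socket.ntohs(0x0003))

for _ in range(10):                       # capture ten frames, then stop
    frame, _ = sniffer.recvfrom(65535)
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    print(f"{src.hex(':')} -> {dst.hex(':')}  ethertype 0x{ethertype:04x}  "
          f"{len(frame)} bytes")

sniffer.close()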


5.8 Summary

In this chapter we discussed the tools that can best reveal the reasons for web delays. After looking at Visual Route, Netscape, GIMP, NeoTrace, Trace Route and Filter Gate, we understand their technologies and how they can be used to measure average download times, and hence the amount of delay in the network. In the coming chapter we take a set of websites, apply these tools to them, and tabulate the results to view the effect of congestion.


Chapter 6

Empirical Study

In the earlier chapter we discussed various tools and their technologies. We now use those tools to identify the reasons behind the delays.

The main aim of this chapter is to take some widely used websites and apply to them the tools discussed in the earlier chapter, so that we can see where the delay actually occurs. The values obtained from the tools were tabulated and compared to assess the behaviour and performance of the websites. The following addresses were taken:

• www.martins-seafresh.co.uk
• www.countrybookshop.co.uk
• www.kelkoo.co.uk
• www.bbc.co.uk
• www.streetsonline.co.uk
• www.sevengatesdesigns.com
• www.8over8.com
• www.bet365.com
• www.dabs.com
• www.buyagift.co.uk

These websites were taken into consideration, and observations were made with the measurement tools under three conditions: ordinary (no-popup) downloads, downloads with popups, and cached requests.

6.1 Calculations

All the above websites were taken into consideration; the sample method used to assess the performance of a website looks like this.

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   15.287   16.33    16.26    15.758231
Popup download      12.43    13.113   12.13    12.12324
Cache download      16.22    11.23    12.11    13.76573


Number   Picture Size (KB)   After Compression (KB)
1        1.35                1.06
2        1.36                1.22
3        1.26                1.12
4        1.34                1.22
5        4.55                2.29
6        2.44                1.25
7        1.33                1.01
8        1.22                1.12
TOTAL    14.85               10.29 (37.42%)

Figure 12: Image compression of www.streetsonline.co.uk
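The compression percentage in these tables follows the obvious definition: the size saved, expressed as a share of the original size. A one-line helper (ours, added for clarity) reproduces, for example, the www.kelkoo.co.uk figure in Appendix B.

def compression_pct(original_kb: float, compressed_kb: float) -> float:
    """Size reduction as a percentage of the original size."""
    return 100.0 * (original_kb - compressed_kb) / original_kb

# e.g. www.kelkoo.co.uk in Appendix B: 33.5 KB -> 29.343 KB
print(f"{compression_pct(33.5, 29.343):.2f}%")   # prints 12.41%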

To give a clear understanding of the performance, the results are compared between ordinary downloads, popup downloads and web-cache downloads, and between the original and the compressed images.

The websites are denoted by the following short names for ease of use.

• www.martins-seafresh.co.uk = "mar"
• www.countrybookshop.co.uk = "cou"
• www.kelkoo.co.uk = "kel"
• www.bbc.co.uk = "bbc"
• www.streetsonline.co.uk = "str"
• www.sevengatesdesigns.com = "sev"
• www.8over8.com = "8ov"
• www.bet365.com = "bet"
• www.dabs.com = "dab"
• www.buyagift.co.uk = "buy"

6.2 Ordinary Download versus Webcache Downloads

The performance of ordinary downloads was compared against web-cached downloads to see what effect, if any, a web cache has on the download time. Tabulating the results in this way suggests good starting points when building web applications.


Figure 13: Ordinary Download versus Web Cache Download. (Bar chart; x-axis: site short name, 8ov through uls; y-axis: download time in seconds, scale 0 to 25.)

The graph above displays the average of the normal downloads against the average of the web-cache downloads. Most of the websites reduced their download time; however, two of the sites, "8ov" and "dab", did not. On closer observation, the overall gain for those sites is somewhat smaller; the only apparent reason is the small images and small attachments on those sites, which leave the cache little to save.

6.3 Ordinary Downloads versus Popup included Downloads

Figure 14: Ordinary (no popup) Download versus Popup Download. (Bar chart over the same sites; y-axis: download time in seconds, scale 0 to 25.)

The graph above shows that the popup filter has no effect in decreasing the download time; the popups themselves only add to it. The format in which a page is designed also affects the download time: the more complicated the developer makes the page, the longer it usually takes to download. So even if a page uses a lot of popup windows, they should be attached in such a way that they do not lengthen the download time and thereby decrease performance.

6.4 Initial Picture compared to Compressed Picture

Here we compare the pictures in a webpage with the same pictures after compression, using the software GIMP. GIMP is broadly comparable to Adobe Photoshop as a compression tool, but because of Photoshop's licensing requirements GIMP was used instead. The tool gave good insight into how compression techniques can speed up the download of web pages requested by end users. After the images were compressed, the values were tabulated so that the importance of compression could be understood.

Figure 15: Initial Picture compared to Compressed Picture. (Bar chart comparing per-site picture sizes before and after compression; x-axis: site short name; y-axis scale 0 to 180.)

It can be seen that the larger the image, the longer it takes to download when requested. Even on popular sites, images are often placed diligently but without being compressed. Looking at the graph, the website "cou" has the greatest number of uncompressed pictures, which is the reason for its delay in downloading: because the pictures are included uncompressed, the page takes extra time to fetch the objects from the server, making the user wait to see them. Developers therefore have to be very careful about inserting pictures that are not compressed.


6.5 Summary

From the above discussion we have seen the effect of congestion on some popular websites, how the results can be tabulated using tools such as Visual Route and Trace Route, and how the images were compressed. When comparing ordinary downloads with web-cache downloads, most of the websites reduced their download time, except for the two sites "8ov" and "dab", whose overall improvement was somewhat smaller. Similarly, comparing ordinary downloads with popup downloads shows that the popup filter has little effect in decreasing the download time. The format of the page design also affects the download time, so even if a page uses a lot of popup windows, they should be attached in a way that does not lengthen the download and so decrease performance.

After compressing the images we learn that the larger the image, the longer it takes to download when requested. Looking at the graph, the website "cou" suffers the greatest delay in downloading, so developers should compress images first and only then include them in the web page. In the upcoming chapter the tabulated results will be evaluated.


Chapter 7

Evaluation of Results.

The evaluation was carried out on the results of the previous chapter. Importance was given to the target file sizes, the download times, whether the target files were compressed, and similar factors. The following are some of the evaluations made in this thesis:

• Based on download time, two sites, "8ov" and "bbc", were found to be the best among the sites chosen for this thesis.

• When a user requests a page repeatedly, the page has to be downloaded quickly from the cache. Among the sites chosen, the cacheability of "cla" was good compared to the others.

• Images are an important issue in this evaluation: as image size grows, the download time increases. The websites "bbc", "8ov" and "bet" had compressed images compared to the others considered.

• The sites "cou", "kel" and "str" contain more frames, and more popups were present, which hinders their performance.

• It was found that some companies keep their servers at locations far away from their customers. For example, the site www.bbc.co.uk considered here has multiple servers throughout the world; if a user requests www.bbc.co.in, the site will be downloaded from a server located in India rather than from the UK. Although the distance to the server does not have a great impact on Internet delay, it can contribute a little to it.

• The speed of the Internet connection also has a major impact. The impact is somewhat tricky: a site may send out the information requested by the end user at a speed lower than the user's connection speed.

• The web browser also plays a role. For example, a file downloaded from the website "kel" with Mozilla took 5.4 s, while the same file downloaded with the Microsoft browser took 6.8 s.


Chapter 8

Guidelines for Developers.

Based on the study done during this thesis with the tools described, some good guidelines can be suggested so that web developers can build efficient sites in which the download time, or delay, is reduced as far as possible. The guidelines are not intended to be definitive or necessarily applicable to every scenario, but they cover a lot of ground and will help in creating effective, professional web pages.

• The server should be located close to where it will be used, that is, close to the users who are intended to use it.

• The number of frames should be small; ideally, no frames are used at all. Frames increase the time needed to download the page. Although popup windows have only a small effect on delay, it should be noted that frames, too, have some effect.

• When designing pages, care should be taken to include only compressed pictures. This is very important for developers to follow, since this thesis found that including uncompressed pictures results in more delay.

• Using too many colours contributes some delay. A key to good design is complementary colours, which make a site look much more integrated; using fewer colours also leads to higher compression than a picture with a large range of colours.

• Check the page in every browser you can find to see how it looks. This has to be done to ensure compatibility; otherwise a user with, say, the Mozilla browser may not be able to open a page, and the browser will take extra time before reporting the problem to the user.

• When designing pages, the number of lines of code should be kept small and effective, in accordance with the norms specified by the World Wide Web Consortium (W3C).

• After the website is developed, the developer should check that the site is not annoying, that is, blinking, loud or eye-hurting.

• The website content should be as static as possible. Dynamic data delays the page, including when it is retrieved from a cache. (A minimal example of advertising cacheability is sketched after these guidelines.)

• Keep the elements of your site in good balance. The main objective of load-balancing techniques is to reduce latency; given the heavy load on the network, balancing the load is essential for reliable access to the Internet. The cost involved, however, is the main obstacle preventing mid-sized IT companies from making effective use of load-balancing techniques.

• The website builder used to develop a site has a consistent effect. It was found in this study that a Flash site builder delivers the same data in a smaller size than other page builders, so developers should use an effective builder in order to reduce the page download.

• All graphical data should be compressed, more or less, to a suitable resolution so that it fits the webpage.


• Developers should also consider the capabilities of users' machines; many machines do not support the same technologies as the one the site was developed on.

• Automated website-building tools insert unwanted code, so the developer should remove as much of it as possible.

• Developers should make content understandable and navigable. This includes not only making the language clear and simple, but also providing understandable mechanisms for navigating within and between pages. Providing navigation tools in pages will maximise accessibility and fast downloads.
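As a concrete illustration of the static-content guideline above, a web server can explicitly advertise how long a response may be cached. The sketch below is our own assumption of how this could be done, using Python's standard http.server purely for illustration; a real site would configure its production server instead.

from http.server import HTTPServer, SimpleHTTPRequestHandler

class CachingHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, marking them cacheable."""
    def end_headers(self):
        # Allow browsers and shared caches to reuse the response for a day.
        self.send_header("Cache-Control", "public, max-age=86400")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("", 8000), CachingHandler).serve_forever()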


Chapter 9

Conclusion:

The overall aim of the thesis was to study Internet delays: to analyse the reasons for web delays, identify where the delay actually occurs, and come up with the main suggestions for, and causes of, its occurrence.

To perform this, tools such as Visual Route, NeoTrace and GIMP were used. Congestion has become a serious problem for many users and site developers; the amount of traffic flowing around the Internet is increasing all the time, and this has led to higher download times for all Internet users.

In this thesis the average download times of some ten websites were considered. Different methods were applied: ordinary downloads, popup downloads and cached downloads. The content of each site and its pictures were tested to see whether they were a significant cause of the delay, and lastly all the images were compressed to see to what extent this could be done.

The images contained in each site were examined to see whether image compression could change the overall size of the site. It was observed that in most cases the delay was due to inefficiently developed websites, the use of poor, uncompressed pictures, and similar causes. In the end, the option of reducing the size of the data without losing the semantic content proves to be a good idea: beyond a certain threshold of network distance, the response time is reduced if the image is lossily compressed.

Finally, the conclusion of this thesis is that delay on the Internet is mainly due to inefficient webpage design, in which images are not compressed properly and pages are not designed according to standards. This study should give a deeper insight into the issues related to web performance and help in building more efficient, reliable and scalable web-based systems.


References.

[1] http://www.internetworldstats.com/stats.htm

[2] Jakob Nielsen, Designing Web Usability, www.useit.com

[3] Internet Society, http://www.isoc.org/internet/history/cerf.shtml

[4] http://www.walthowe.com/navnet/history.html

[5] http://www.internetworldstats.com/stats.htm

[6] RFC 791, www.rfc-editor.org

[7] RFC 1812, www.rfc-editor.org

[8] Computer Networking, James F. Kurose and Keith W. Ross

[9] http://www.sailor.lib.md.us; Computer Networks, Andrew S. Tanenbaum

[11] Web Protocols and Practice, Balachander Krishnamurthy and Jennifer Rexford

[12] Computer Network with Internet Applications, Douglas E. Comer, Prentice Hall, 4th Edition

[13] MCSE Training Material, Microsoft Press, 2003 Edition

[14] Postel, J., "Internet Protocol", RFC 760, USC/Information Sciences Institute, January 1980

[15] Lorenzo Vicisano, Luigi Rizzo and Jon Crowcroft, "TCP-Like Congestion Control for Layered Multicast Data Transfer", INFOCOM '98; http://www.digitalfountain.com

[16] Duane Wessels, Web Caching, O'Reilly

[17] http://Scholar.lib.vt.edu/digilib/

[18] http://www.w3.org/Graphics/

[19] http://www.stanford.edu/class

[20] www.visualroute.com

[21] Netscape 6.2, Netscape, http://www.netscape.com

[22] Gimp tool to compress images: http://developer.gimp.org/screenshots.html


[23] NeoTrace, Available at http://www.tucows.com

[24] http://navigators.com/traceroute.html

[25] FilterGate, www.filtergate.com


Appendix A


Appendix B

Pictures and their Compression Information

www.kelkoo.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        0.42                0.42
2        1.01                1.01
3        0.89                0.89
4        1.46                1.46
5        13.3                11.03
6        3.21                3.21
7        1.9                 1.62
8        0.59                0.59
9        0.99                0.99
10       1.07                1.07
11       2.11                1.642
12       2.24                1.793
13       1.63                1.305
14       1.93                1.563
15       0.75                0.75
Total    33.5                29.343
Compression %: 12.40895522

www.martins-seafresh.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        10.7                8.227
2        5.54                1.658
3        38.9                11.47
4        4.55                2.545
5        4.03                1.59
6        0.24                0.24
7        8.28                4.129
8        2.87                1.136
9        0.69                0.69
Total    75.8                31.685
Compression %: 58.19920844


www.sevengatesdesigns.com

Number   Picture Size (KB)   After Compression (KB)
1        1.75                1.026
2        11.8                5.507
3        16.8                15.31
4        3.92                1.932
5        4.57                1.696
Total    38.84               25.471
Compression %: 34.42070031

www.streetsonline.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        1.35                1.24
2        1.26                1.12
3        1.32                1.03
4        1.23                1.23
5        1.35                1.35
6        1.37                1.27
7        0.89                0.89
8        1.27                1.27
9        1.17                1.17
10       1.25                1.25
11       1.34                1.34
12       1.38                1.38
13       1.38                1.38
14       1.06                1.06
15       1.37                1.37
16       1.1                 1.1
17       2.79                1.37
18       0.44                0.44
19       0.32                0.32
20       0.38                0.38
21       0.3                 0.3
22       0.65                0.65
23       0.19                0.19
24       0.21                0.21
25       0.18                0.18
26       1.71                1.18
27       1.48                1.21
28       3.28                1.65
29       2.6                 1.89
30       3.82                2.43
31       0.2                 0.2
32       0.42                0.42
33       0.31                0.31
34       2.79                1.63
35       4.49                2.716
36       0.37                0.37
37       0.38                0.38
Total    47.4                37.876
Compression %: 20.092827

www.8over8.com

Number   Picture Size (KB)   After Compression (KB)
1        6.1                 2.99
2        0.26                0.26
3        0.26                0.26
4        0.346               0.346
5        0.241               0.241
6        0.291               0.291
7        0.232               0.232
8        4.81                1.973
9        12.1                10.71
10       5.32                3.429
11       0.66                0.66
12       2.59                2.6
13       7.46                4.052
14       8.44                5.424
15       0.29                0.29
Total    49.4                33.758
Compression %: 31.66396761

www.bbc.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        0.37                0.37
2        0.151               0.151
3        1.12                1.12
4        0.4                 0.4
5        0.34                0.34
6        0.33                0.33
7        0.16                0.16
8        0.57                0.57
9        0.267               0.267
10       0.15                0.15
11       5.56                5.136
12       0.349               0.349
13       0.712               0.712
14       0.139               0.139
15       1.83                1.702
16       2.23                1.17
17       9.35                7.249
Total    24.028              20.315
Compression %: 15.45280506

www.bet365.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        3.77                3.031
2        3.51                2.147
3        4.35                3.582
4        3.17                3.17
5        3.59                1.506
6        1.92                1.187
7        1.91                1.513
8        0.87                0.87
9        0.49                0.49
10       0.053               0.053
11       1.2                 1.148
12       0.503               0.503
13       0.507               0.507
Total    25.843              19.707
Compression %: 23.74337345

www.buyagift.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        0.43                0.43
2        0.74                0.74
3        1.12                0.93
4        4.41                2.16
5        0.74                0.74
6        0.43                0.43
7        0.82                0.82
8        0.74                0.74
9        2.52                2.123
10       2.69                2.231
11       0.27                0.27
12       0.69                0.69
13       0.55                0.55
14       0.44                0.44
15       6.08                2.189
16       7.54                5.908
17       0.88                0.88
18       0.77                0.77
19       0.5                 0.5
20       0.84                0.84
21       0.19                0.19
22       3.1                 1.977
23       0.48                0.48
24       0.96                0.96
25       5.14                2.579
26       0.94                0.94
27       1.11                0.89
28       7.55                5.332
29       0.43                0.43
Total    53.1                38.159
Compression %: 28.13747646

www.dabs.com

Number   Picture Size (KB)   After Compression (KB)
1        0.66                0.66
2        3.62                3.46
3        2.78                2.61
4        2.96                2.96
5        2.79                2.79
6        0.49                0.49
7        0.41                0.41
8        0.36                0.36
9        0.66                0.66
10       0.65                0.65
11       0.52                0.52
12       0.76                0.76
13       0.76                0.76
14       0.53                0.53
15       0.61                0.61
16       1.22                0.96
17       0.51                0.51
18       0.31                0.31
19       0.36                0.36
20       0.53                0.53
21       0.29                0.29
22       0.36                0.36
23       0.5                 0.5
24       0.41                0.41
25       0.44                0.44
26       0.47                0.47
27       0.45                0.45
28       1.86                1.23
29       0.4                 0.4
30       0.39                0.39
31       1.43                1.545
32       4.94                3.718
33       0.48                0.48
34       1.08                1.08
35       0.45                0.45
36       0.56                0.56
37       0.42                0.42
38       0.55                0.55
39       0.92                0.92
40       1.45                1.05
41       0.4                 0.4
42       0.74                0.74
43       0.89                0.89
44       0.51                0.51
45       0.52                0.52
46       0.91                0.91
47       0.77                0.77
48       0.99                0.99
49       0.99                0.99
50       1.11                0.93
51       0.38                0.38
52       0.95                0.95
53       0.28                0.28
54       0.56                0.56
55       0.31                0.31
56       0.65                0.65
Total    50.3                47.393
Compression %: 5.779324056

www.kainos.co.uk

Number   Picture Size (KB)   After Compression (KB)
1        7.28                7.187
2        4.68                4.424
3        0.61                0.61
4        8.02                7.6
5        0.38                0.38
6        0.29                0.29
7        0.27                0.27
8        0.24                0.24
9        0.69                0.69
10       0.47                0.47
11       0.66                0.66
12       0.21                0.21
13       0.7                 0.7
14       0.49                0.49
15       0.44                0.44
16       0.22                0.22
17       0.25                0.25
18       0.49                0.49
19       0.491               0.491
20       0.4                 0.4
21       0.41                0.41
22       0.34                0.34
23       2.37                1.021
24       1.44                1.15
25       0.68                0.68
26       0.049               0.049
27       0.43                0.43
Total    33                  30.592
Compression %: 7.296969697


Appendix C

Download Times for the sites taken (seconds)

www.8over8.com

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   4.345    3.891    3.76     3.6521
Popup download      3.175    3.632    3.985    3.4927
Cache download      3.211    3.931    4.12     3.5845

www.bbc.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   2.54     3.61     3.15     4.3545
Popup download      1.773    1.592    2.134    2.9175
Cache download      3.012    4.12     4.236    4.1142

www.bet365.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   5.781    6.154    6.213    5.89759
Popup download      2.804    4.957    3.158    3.7332
Cache download      6.132    5.986    7.143    6.5215

www.buyagift.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   6.131    10.705   8.126    7.72957
Popup download      4.571    5.647    4.962    5.55414
Cache download      10.672   12.808   9.764    10.5037

www.countrybookshop.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   5.991    6.22     8.731    7.86742
Popup download      4.803    4.122    4.723    4.41928
Cache download      7.259    7.01     8.268    7.16214

www.dabs.com

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   7.451    6.789    7.569    7.34942
Popup download      5.15     5.893    6.714    6.883
Cache download      7.201    9.674    6.839    7.41083

www.kainos.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   5.366    6.974    7.132    5.9615
Popup download      4.947    4.587    3.893    3.778
Cache download      5.879    5.238    7.561    5.9387

www.kelkoo.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   6.689    6.755    6.704    7.0605
Popup download      2.383    2.784    4.978    2.969
Cache download      6.129    5.158    7.019    7.1568

www.martins-seafresh.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   8.921    10.3     9.32     10.175
Popup download      7.53     7.381    7.523    6.5334
Cache download      9.063    10.896   10.518   10.229

www.sevengatesdesigns.com

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   4.71     5.342    4.507    4.487
Popup download      1.242    1.302    1.482    1.353
Cache download      5.768    4.987    4.392    4.8212

www.streetsonline.co.uk

Download Method     Day 1    Day 2    Day 3    Average
Ordinary download   10.354   8.032    9.159    9.1381
Popup download      4.446    3.425    2.734    3.0591
Cache download      13.369   9.474    10.365   10.612