web servers - how they work


Upload: brian-gallagher

Post on 14-Dec-2014


Category:

Technology



DESCRIPTION

A visual description of how web servers and related technologies work

TRANSCRIPT

Page 1: Web Servers -  How They Work

How Web Servers Work

Technologies Involved

The primary technologies involved in a working web server include:

• Web Browser – Program used for accessing web pages (and possibly other content)
• TCP/IP – The Internet’s network protocol suite
   o IP Address – The address of a network interface on the Internet or local network. Analogous to a phone number.
   o Port – The particular connection point at an IP address to connect to. Analogous to a phone number’s extension number.
• DNS – Domain Name System
• HTTP – HyperText Transfer Protocol
• SSL – Secure Sockets Layer
• CGI – Common Gateway Interface
• MIME – Multipurpose Internet Mail Extensions
• URL – Uniform Resource Locator
• Programming / Scripting Languages
   o ASP / ASP.NET – Active Server Pages
   o JSP – Java Server Pages
   o PHP – PHP: Hypertext Preprocessor
   o Perl – Practical Extraction and Report Language (sometimes)
   o Python – Newer language whose syntax encourages readable code
   o Ruby on Rails – For rapid application development and

Each of these topics could be a number of classes in itself. Search online for more information on any of these topics. They are listed here to give a sense of the digital ecosystem involved in a web server environment.

Uploads, Downloads & Bandwidth

When Downloading data, information is taken from somewhere else and brought to your computer. When Uploading data, information (either stored or generated) is sent from your computer to another computer somewhere. If you’re getting it, you are downloading. If you are sending it, you are uploading. The difference is important.

Bandwidth is the speed at which information can be downloaded or uploaded. A typical Cable Internet connection may have bandwidth of 3.5 Mbps / 768 Kbps. This means that you can download files at a maximum speed of 3.5 million bits per second (roughly 437 thousand bytes per second) and can upload data at a much slower 768 thousand bits per second (96 thousand bytes per second). This is called an Asymmetric connection, because the upload and download speeds are different.

Most office networks (LANs) are Symmetric connections, with a speed of 100 Mbps for both uploading and downloading. The local network (LAN) will almost always be significantly faster than the Internet connection that it uses.
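As a rough illustration of why the direction of a transfer matters, the sketch below estimates how long it takes to move a file at the download and upload speeds quoted above. It assumes the speeds are in bits per second (as ISP speeds normally are), ignores protocol overhead, and uses a hypothetical 5-megabyte file; the function name is illustrative only.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring protocol overhead."""
    return size_bytes * 8 / bits_per_second


FILE_SIZE = 5_000_000      # hypothetical 5 MB file (an assumption for the example)
DOWNLOAD_BPS = 3_500_000   # 3.5 Mbps downstream
UPLOAD_BPS = 768_000       # 768 Kbps upstream

download_time = transfer_seconds(FILE_SIZE, DOWNLOAD_BPS)
upload_time = transfer_seconds(FILE_SIZE, UPLOAD_BPS)

print(f"download: {download_time:.1f} s, upload: {upload_time:.1f} s")
```

On these assumptions the same file takes several times longer to send than to receive, which is the practical meaning of an asymmetric connection.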


Market Share for Top Servers Across All Domains, August 1995 - November 2007 (Netcraft)

Server | Share
Apache | 47.73%
Microsoft | 37.13%
Google | 5.44%
Other | 9.70%

[Chart: Network Use Comparison. Percent of network use spent on Download and Upload for a Personal Computer and a Web Server.]

Web Server Introduction

A web server is a program that answers requests coming in from the Internet or local network and serves (returns) pages that can be displayed in a web browser or other application. A web server can be set up to be used for:

• Public use on the Internet (typical web site)
• Public use on a local network (some kiosks and private customer assistance systems)
• Private use on the Internet (protected by passwords or other security)
• Private use on a local network (an Intranet)

The most popular web server software is the Apache Web Server. It is an open-source system that is free to download and use, has excellent performance and security, and has versions that will run on almost any computer system.

The second-most popular web server is Microsoft Internet Information Server, or IIS. It runs on most Windows platforms and can be tightly integrated with Microsoft’s other database and business products. It will not run on Linux or Unix systems. It can be freely downloaded but requires paid licenses from Microsoft to use in a production environment if you will be hosting sites for others or have more than 5 users logged into the site at any given time.
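To make the idea of "a program that answers requests and returns pages" concrete, here is a minimal sketch using Python's standard http.server module. It is a teaching toy, not a substitute for Apache or IIS; the page content, the handler name and port 8080 are arbitrary choices for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello from a tiny web server</h1></body></html>"


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET request with the same small HTML page.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real server would log each request


def make_server(port=8080):
    # Binding to 127.0.0.1 keeps the server private to this machine.
    return HTTPServer(("127.0.0.1", port), HelloHandler)


def demo_round_trip():
    """Start the server on a free port, request one page, and shut down."""
    server = make_server(port=0)  # port 0 lets the operating system pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling make_server().serve_forever() would keep the server running so that a browser on the same machine could load http://127.0.0.1:8080/ .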

The Web Server Difference

A typical personal computer on the Internet only downloads data a small portion of the time; the majority of the time the network is idle while the user is reading web pages, doing other work on the computer, or not on the computer at all. Of the time a personal computer is actually using the network, almost all is spent downloading information from various web sites across the Internet. As a result, the typical Cable modem or DSL connection has a high download speed, but a low upload speed. This gives the average user the best user experience, while putting limits on the ability of people to upload data or run their own servers (because providers charge higher rates for those services).

A typical web server has the opposite behavior of a personal computer user. It serves pages to users all over the world, so it must be on and running 24 hours a day, 365 days a year. Where the personal computer spends most of its time downloading information, a web server spends most of its time uploading content to all the browsers and users that are connecting to it from around the world. The result is that a web server needs a reliable, very high speed Internet connection. Ideally, the web server’s Internet connection should have more bandwidth than the combined bandwidth of all users who will connect to it at the same time.

For testing purposes or low traffic web sites, almost any computer running Windows, Mac OS X or any version of Linux or Unix can act as a web server by running Apache or IIS on it and hooking it up to the network or Internet.

Web Server Operation Here is the most basic description of what happens when you view a web page:

As with most things, there is a good bit more going on behind the scenes that makes it appear so simple to the user. On the next page is a diagram with more detail (not totally complete) on what is really happening behind the scenes when you load a web page in your browser:


If you prefer words to pictures, here is another example:

Event (what happens) | Actor (what does it) | Target (what it does to) | Technologies Involved
User types http://www.desu.edu/cal into browser address bar | User | Web Browser | Web Browser, Typing, URL
Web browser looks up IP address for www.desu.edu | Web Browser | DNS Server | DNS, TCP/IP
Remote DNS server gives IP address of 167.21.180.40 | DNS Server | Web Browser | DNS, TCP/IP
Browser connects to 167.21.180.40 on Port 80 | Web Browser | Web Server | TCP/IP
Server creates connection as requested | Web Server | Web Browser | TCP/IP
Web browser requests the page /cal on the site www.desu.edu | Web Browser | Web Server | HTTP, TCP/IP
Web Server sends the requested page to web browser | Web Server | Web Browser | HTTP, MIME, TCP/IP
Repeat process for all other page elements (graphics, stylesheets, etc.) | | |
Web browser closes connection to server | Web Browser | Web Server | HTTP, TCP/IP
Web browser displays page | Web Browser | Screen | Local Code
User views page | User | Screen | Eyeballs, Brain
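The first several rows of that sequence (DNS lookup, TCP connection to port 80, HTTP request, response, close) can be sketched with Python's standard socket module. This is a simplified illustration rather than how any particular browser is implemented: the function names are made up for the example, only plain http:// URLs are handled, and query strings are ignored.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # ask the server to close the connection when done
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url, timeout=5.0):
    """Fetch a page the long way: resolve, connect, request, read, close."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80          # HTTP defaults to port 80
    path = parts.path or "/"

    ip_address = socket.gethostbyname(host)  # DNS lookup: name -> IP address

    # Open the TCP connection, send the request, and read until the server closes.
    with socket.create_connection((ip_address, port), timeout=timeout) as conn:
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)  # raw response: status line, headers, blank line, body
```

Calling fetch("http://www.desu.edu/cal") would perform the steps from the table against the real site, so it needs a working network connection.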

Static vs. Dynamic Content

There are two types of content, static and dynamic.

Static content consists of files that do not change, or change infrequently. Static files are saved as files on the web server and are simply read in from disk and written out to the network connection to get the data from the web server to the browser. Static content is the best from the server’s perspective: it is fast to access, puts almost no load on the processor and uses minimal memory. The following file types are usually static content: html, htm, txt, csv, pdf, gif, jpg, mp3, mov, swf, avi and wmv.

Dynamic content is content that changes. It may change based on user input, business activity, time of day, date, database contents or any other factor. Dynamic content is more work for the server than static content because a program must be run before its results are sent to the user’s web browser for display. This requires additional processing power, disk space and memory. How much more is needed depends on the program’s function, what language it is written in and how efficiently it is written. The following file types are usually dynamic content: cgi, shtml, php, pl, cf, asp, aspx, jsp, py, nsf, and cfm.
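As a small illustration, the two extension lists above can be turned into a lookup. This is only a sketch: real web servers decide how to handle a file from their configuration rather than from a fixed list, and the function name is hypothetical.

```python
import os

# Extension lists taken from the text above.
STATIC_EXTENSIONS = {"html", "htm", "txt", "csv", "pdf", "gif",
                     "jpg", "mp3", "mov", "swf", "avi", "wmv"}
DYNAMIC_EXTENSIONS = {"cgi", "shtml", "php", "pl", "cf", "asp",
                      "aspx", "jsp", "py", "nsf", "cfm"}


def content_kind(path):
    """Classify a requested path as 'static', 'dynamic' or 'unknown'."""
    extension = os.path.splitext(path)[1].lstrip(".").lower()
    if extension in STATIC_EXTENSIONS:
        return "static"    # read from disk and send as-is
    if extension in DYNAMIC_EXTENSIONS:
        return "dynamic"   # a program must run to produce the output
    return "unknown"


print(content_kind("/images/logo.GIF"), content_kind("/demo/phpjs.php"))
```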


Dynamic Content: In-Process vs. Out-of-Process

For web servers generating dynamic content, there are two ways of executing the programs that create the content: In-Process and Out-of-Process.

Out-of-Process means that the program that executes is completely separate from the web server process, except for some means of exchanging information. The most common method of executing an out-of-process program is through the Common Gateway Interface, or CGI. A CGI program may be written in any language. The program gets its information from the web server on how to run and what to do through a number of channels:

• Parameters submitted by a <form> tag using the GET method
• Parameters submitted by a <form> tag using the POST method
• Environment variables set by the operating system
• Server variables set by the web server

The program runs and writes the output to be sent to the web browser to its Standard Output (STDOUT) file descriptor, usually by using PRINT or ECHO statements. Errors may be sent to the browser and/or the web server’s error log file.

In-Process means that the program executes as part of the web server itself, usually through the addition of a module to the web server to support that language. An in-process program can be written in any language that the web server has a module for. A web server can execute both in-process and out-of-process programs on the same server depending on the programs being run. This diagram shows the difference in how the processes are run:


The advantages of in-process program execution include:

• No need to start up a new process
   o Faster time to start up
   o Generally needs less memory
   o More efficient communication to web server

The disadvantages of in-process program execution include:

• Possibility of crashing web server
   o Programs execute in web server’s memory
• Possibility of program accessing unauthorized web server information
   o Other users’ session information
   o Web server operational statistics
• Reduced functionality
   o Not all language functions may be available as a module

The advantages of out-of-process program execution include:

• Can use any language
   o No need to have web server modules available
• Can use full functionality of languages
   o No reduced feature sets of module implementations

The disadvantages of out-of-process program execution include:

• Slower to execute
   o Must start up a new process for each execution (can be significant)
• More memory needed
• Less efficient communication with web server
• Exposes more operating system functionality
   o Requires even more attention to secure programming rules
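To make the out-of-process model concrete, here is a minimal sketch of a CGI-style program in Python. Under CGI the web server passes GET parameters in the QUERY_STRING environment variable, and the program writes its headers, a blank line, and then the page to standard output. The "name" parameter and the greeting are invented for this example.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the full CGI output (header, blank line, body) as one string."""
    # GET parameters arrive URL-encoded in the QUERY_STRING variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]   # "name" is a made-up parameter
    # Escape user input before putting it into HTML.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>\n"
    return "Content-Type: text/html\r\n\r\n" + body


def main():
    # The web server captures whatever is written to standard output
    # and relays it to the browser.
    sys.stdout.write(cgi_response(os.environ))
```

A web server configured for CGI would start a new copy of this program for every request, which is exactly the startup cost listed among the disadvantages above.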

Dynamic Content: Scripts, Executables & Caches

Scripts are (generally) programs that are stored as plain text files and read, compiled (or interpreted) and executed on demand each time the program is to run. Compiled programs are compiled from source code ahead of time and stored in machine-executable files. Caches are pre-compiled images of scripts that are kept within the web server to provide faster launch time of the scripts when they are called.

On most web servers, out-of-process programs can be written either in scripting languages (PHP, Perl, Shell Scripts, ASP, Cold Fusion, etc.) or as compiled programs (C, C++, Java, etc.). Most in-process programs are written in scripting languages that are often pre-compiled or cached to improve performance.


Server-Side vs. Client-Side Execution

On web sites, programs may run either Server-Side or Client-Side, or both. A program that executes on the server and sends the results of its operations to the web browser is called a Server-Side program. Server-side languages include PHP, Perl, Java, ASP, ASP.NET, Cold Fusion and Java Server Pages.

A program that executes within the browser, with none of its code having been executed on the server, is called a Client-Side program (the browser is the web server’s client). The most common client-side programs are written in JavaScript (not related to Java at all), Flash (.swf files) and Java Applets (Java that runs on the web browser’s PC, not on the web server).

AJAX: Asynchronous JavaScript And XML

One of the newest developments on the Internet is the popularity of AJAX for building dynamic sites that run in the browser faster and with better user interfaces than previous web-based applications. AJAX runs JavaScript in the web browser and responds to user interaction with JavaScript calls that change the page in the browser’s memory, without having to make a round trip to the web server and reload the whole page each time. AJAX can make small, quick calls to the web server to request data and submit changes without reloading the page, and can update any element of the HTML page in response to server actions. This has the advantage of making more engaging sites that respond faster to the user.

AJAX has the disadvantage that it breaks the typical user experience of the browser. Pressing the BACK button on a typical site takes you back to the previous thing you were looking at. Pressing BACK on an AJAX page may take you completely out of the flow of information you were in, because AJAX was managing everything internally.

PHP Sample Code (Server-Side)

PHP uses inline code in an HTML document. This means that PHP code is intermixed in a regular HTML document. The PHP code’s output is standard HTML, which the web server inserts into the document before sending it to the user’s web browser. PHP uses the tags <?php and ?> (or the short form <? and ?> where short tags are enabled) to separate PHP code from HTML. Here is some sample code in PHP:

count.php:

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <?php
    for ($num = 1; $num <= 5; $num++) {
        echo "<p>$num</p>\n";
    }
    ?>
    <p>Wasn't that easy!</p>
    </body></html>


When this runs, PHP will execute the code between the PHP tags, and the output will replace everything from the opening tag to the closing ?> in the script. This will generate the following HTML, which is what is sent to the user’s web browser:

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
    <p>Wasn't that easy!</p>
    </body></html>

No PHP source code will ever be seen by the user. It is executed on the server and replaced by whatever output, if any, the block of PHP code generated. The user’s web browser will display this:

    [Browser window titled "PHP 1 to 5"]

    Counting from 1 to 5 in PHP
    1
    2
    3
    4
    5
    Wasn’t that easy!

That’s all there is to it for writing a simple web server application.
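The "replace the code block with its output" step can be imitated with a toy template processor. The sketch below is written in Python and is in no way how PHP itself is implemented: it borrows the <? ... ?> markers, treats the text between them as Python, and supplies a hypothetical echo() function that collects output. Running exec() on templates from untrusted sources would be unsafe.

```python
import re

# Matches one inline code block: <? ...code... ?>
BLOCK = re.compile(r"<\?(.*?)\?>", re.DOTALL)


def render(template):
    """Replace every <? ... ?> block with the text its code passes to echo()."""
    def run_block(match):
        output = []
        # The block sees a single helper, echo(), which collects output.
        exec(match.group(1).strip(), {"echo": output.append})
        return "".join(output)

    return BLOCK.sub(run_block, template)


TEMPLATE = (
    "<h1>Counting from 1 to 5</h1>\n"
    "<? for num in range(1, 6): echo(f'<p>{num}</p>\\n') ?>"
    "<p>Done</p>"
)

print(render(TEMPLATE))
```

Everything outside the markers passes through untouched, which mirrors how the HTML around the PHP block reaches the browser unchanged.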


ASP Sample Code (Server-Side)

Active Server Pages (ASP) was Microsoft’s use of Visual Basic Script running under Internet Information Server (IIS). It works much like PHP, but uses the Visual Basic syntax and uses the <% and %> tags to separate its code from the surrounding HTML.

count.asp:

    <html><head><title>ASP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in ASP</h1>
    <%
    Dim Num
    For Num = 1 To 5
        Response.Write("<p>" & Num & "</p>" & vbCrLf)
    Next
    %>
    <p>Wasn't that easy!</p>
    </body></html>

This code does the same as the PHP example, but in Microsoft flavor. ASP uses the Response object to send data back to the web server to be passed on to the user’s web browser. The Write method of the Response object tells the web server to add the given string to the output being sent to the browser.

No ASP source code will ever be seen by the user. It is executed on the server and replaced by whatever output, if any, the block of ASP code generated. Apart from the title and heading text, the output is identical to the PHP version.

JavaScript Sample Code (Client-Side)

JavaScript can be used to execute code on the client’s machine within their web browser. This can sometimes be handy when you want to generate content but don’t want to take up the server’s resources to do it. Here is an HTML file that includes JavaScript to duplicate the server-side functionality of the last two examples:

count.html:

    <html><head><title>JavaScript 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in JavaScript</h1>
    <script>
    for (var num = 1; num <= 5; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    </script>
    <p>Wasn't that easy!</p>
    </body></html>

In this case, the HTML file will be downloaded to the browser with the source code included. Any user can select the View->View Page Source menu option and see the source code of the program. However, when the page is rendered, the web browser will execute the code between the <script> and </script> tags. The document.write() function (a method of the document object) causes JavaScript to add the string passed to it to the HTML being rendered as it comes from the web server.


To allow for browsers that do not have JavaScript turned on or lack the functionality to run JavaScript, we should modify our HTML file to make it degrade gracefully. We have wrapped the actual JavaScript code in the HTML comment tags <!-- and --> so that the source code will not be displayed if the user’s browser (such as a mobile phone) does not understand the <script> tag. We also added a <noscript> element, which is displayed in browsers that know about JavaScript but do not have it enabled. It shows a message telling users that they are missing some content because client-side scripting is not available.

NOTE: JavaScript should never be relied on when performing any operations on a web site unless you are willing to accept that the site will not work properly for some visitors. Any form validation or other function that your server expects to be done in JavaScript should be verified again using Server-Side scripting.

The updated version of count.html that degrades more gracefully is listed below:

count.html:

    <html><head><title>JavaScript 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in JavaScript</h1>
    <noscript>You don't have JavaScript enabled. No counting for you.</noscript>
    <script>
    <!--
    for (var num = 1; num <= 5; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    // -->
    </script>
    <p>Wasn't that easy!</p>
    </body></html>

PHP (Server-Side) and JavaScript (Client-Side) Sample Code

Here is an example of a program that uses both Server-Side and Client-Side scripting to count to 10:

phpjs.php:

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <?php
    for ($num = 1; $num <= 5; $num++) {
        echo "<p>$num</p>\n";
    }
    ?>
    <noscript>You don't have JavaScript enabled. No counting from 6 to 10 for you.</noscript>
    <script>
    <!--
    document.write("<h1>Counting from 6 to 10 in JavaScript</h1>\n");
    for (num=6; num <= 10; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    -->
    </script>
    <p>Wasn't that easy!</p>
    </body></html>


When the web server processes the request, it will execute the server-side PHP code and pass everything else (including the JavaScript) through to be sent to the user’s web browser. This is the HTML that will be sent to the user’s web browser:

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
    <noscript>You don't have JavaScript enabled. No counting from 6 to 10 for you.</noscript>
    <script>
    <!--
    document.write("<h1>Counting from 6 to 10 in JavaScript</h1>\n");
    for (num=6; num <= 10; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    -->
    </script>
    <p>Wasn't that easy!</p>
    </body></html>

The JavaScript code will then be executed in the user’s web browser, which will display:

    [Browser window titled "PHP 1 to 5"]

    Counting from 1 to 5 in PHP
    1
    2
    3
    4
    5

    Counting from 6 to 10 in JavaScript
    6
    7
    8
    9
    10
    Wasn’t that easy!


This shows how easily you can mix code for different types of programming languages to generate a single HTML page. Depending on how your server is configured, you can even mix different types of server-side code in the same script (though a good reason to do so is rare).

HTTP – Behind the Scenes

So, what is actually sent over the Internet when you ask for a web page? Below is the actual network traffic captured directly from the network when connecting to the page at: http://diamondsea.com/demo/phpjs.php

What the Browser sends to the Server:

    GET /demo/phpjs.php HTTP/1.1
    Host: diamondsea.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive
    Cache-Control: max-age=0

Above is the actual request that the browser sends to the server.

The first line is the request to GET the page /demo/phpjs.php using Version 1.1 of the HTTP protocol.

The second line says that it is requesting the page from the site diamondsea.com . The Host field was not in the original HTTP 1.0 protocol, which therefore required every web site to have its own IP address. The addition of the Host field allowed many web sites to share a single IP address, which greatly reduced the number of IP addresses needed and helped slow the exhaustion of the Internet’s IP address space.

Line 3 gives the User-Agent field, which is how the browser identifies itself to the web server. Some web sites return different results based on the User-Agent field to allow for differences in browser features.

Line 4 gives the Accept field, which lists the types of files that the browser can receive. The q values show the optional quality value, which ranges from 0 to 1 (and defaults to 1 if not present), to indicate the browser’s preference for different media types. A higher q value means that the browser prefers that type to others.

Lines 5, 6 and 7 tell the server what Languages, Encodings and Character sets the browser can understand.

Lines 8 and 9 ask the server to use a keep-alive connection, which means that the connection to the server is kept open for 300 seconds (5 minutes) or until it is closed explicitly. This connection can then be used for transferring other data to and from the server, which saves a considerable amount of time compared to opening a new connection each time a file is needed. This request is a holdover from HTTP 1.0 and is unnecessary in HTTP 1.1, where connections are persistent by default.

Line 10 uses the Cache-Control field to tell the server (and any caches along the way) that the browser will not accept a cached copy that is more than 0 seconds old, so any cached copy must be checked against the original before it is returned.
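The structure of such a request (one request line, then "Name: value" header lines, then a blank line) is simple enough to take apart by hand. The following sketch parses an abbreviated copy of the request above and ranks the Accept entries by their q values. The function names are made up for this example, and real HTTP parsers handle many cases this one ignores.

```python
def parse_request(raw):
    """Split a raw HTTP request into method, path, version and headers."""
    head = raw.split("\r\n\r\n", 1)[0]          # everything before the blank line
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


def parse_accept(value):
    """Return (media_type, q) pairs from an Accept header, most preferred first."""
    preferences = []
    for item in value.split(","):
        media_type, _, params = item.strip().partition(";")
        q = 1.0                                  # q defaults to 1 when absent
        for param in params.split(";"):
            key, _, val = param.strip().partition("=")
            if key == "q" and val:
                q = float(val)
        preferences.append((media_type.strip(), q))
    return sorted(preferences, key=lambda pair: pair[1], reverse=True)


# Abbreviated version of the captured request (some headers omitted).
REQUEST = (
    "GET /demo/phpjs.php HTTP/1.1\r\n"
    "Host: diamondsea.com\r\n"
    "Accept: text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

method, path, version, headers = parse_request(REQUEST)
preferences = parse_accept(headers["accept"])
print(method, path, headers["host"], preferences[0])
```

Note that a q value applies only to the entry it is attached to, so image/png (with no q value) ranks above text/html;q=0.9 here.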


The server then responds with the requested information, as shown below.

Response from Server to Browser:

    HTTP/1.1 200 OK
    Date: Tue, 18 Dec 2007 15:53:12 GMT
    Server: Apache/2.0.52 (Red Hat)
    X-Powered-By: PHP/4.3.9
    Content-Length: 459
    Connection: close
    Content-Type: text/html

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
    <noscript>You don't have JavaScript enabled. No counting from 6 to 10 for you.</noscript>
    <script>
    <!--
    document.write("<h1>Counting from 6 to 10 in JavaScript</h1>\n");
    for (num=6; num <= 10; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    -->
    </script>
    <p>Wasn't that easy!</p>
    </body></html>

The lines before the first blank line are the HTTP protocol information describing the page being sent back, and the lines after the first blank line are the actual contents of the file being returned.

The first line is the response and status code from the server about the request. A status of 200 OK indicates that the page had no problems. A code of 404 NOT FOUND indicates that the requested page was not found on the server. A code of 401 Unauthorized means that a username and password are required for HTTP Authentication. A code of 403 Forbidden means that the server understood the request but refuses to serve the page to the current user. A complete list of status codes is available at: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

The next lines are informative and describe the date and time when the page was sent, what server provided the response, and any optional information provided by the server (any header starting with X- is optional and not part of the official protocol specification).

The Content-Length field says how many bytes of data (not including the headers) are being sent, which programs can use to pre-allocate storage space to receive the file and to verify that the proper amount of data was received.

The Connection value says to close the connection after the data has been received.

The Content-Type field says that the MIME type of this data is text/html, meaning it contains text in HTML format.

A blank line separates the header from the data. The rest of the response is the data that will be received by the browser.
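A receiving program can use the blank line and the Content-Length field exactly as described. The sketch below splits a response at the first blank line and checks that the body has the announced length. The sample response is a shortened, made-up stand-in for the capture above, and the function name is hypothetical.

```python
def parse_response(raw):
    """Split a raw HTTP response into status code, reason, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")   # the first blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), reason, headers, body


# Shortened, made-up response used only for this example.
BODY = b"<html><body><p>1</p></body></html>"
RESPONSE = b"".join([
    b"HTTP/1.1 200 OK\r\n",
    b"Content-Type: text/html\r\n",
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n",
    b"Connection: close\r\n",
    b"\r\n",
    BODY,
])

status, reason, headers, body = parse_response(RESPONSE)
# Verify that the amount of data received matches what the server announced.
complete = int(headers["content-length"]) == len(body)
print(status, reason, headers["content-type"], complete)
```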


Web Server “Security” – HTTP Authentication

A common way of securing all or part of a web site is to use HTTP Authentication, which is a user authentication method built into the HTTP protocol that requires a username and password to be sent before the pages can be accessed. While this seems a handy security function, it is insecure when used without an SSL connection. For an example of why, here is the session transcript after logging into the page:

    GET /demo/secure/phpjs.php HTTP/1.1
    Host: www.diamondsea.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
    Keep-Alive: 300
    Connection: keep-alive
    Cache-Control: max-age=0, max-age=0
    Authorization: Basic YWRtaW5pc3RyYXRvcjpzZWNyZXRwYXNzd29yZDk5OQ==

    HTTP/1.1 200 OK
    Date: Tue, 18 Dec 2007 16:33:29 GMT
    Server: Apache/2.0.52 (Red Hat)
    X-Powered-By: PHP/4.3.9
    Content-Length: 459
    Connection: close
    Content-Type: text/html

    <html><head><title>PHP 1 to 5</title></head>
    <body>
    <h1>Counting from 1 to 5 in PHP</h1>
    <p>1</p>
    <p>2</p>
    <p>3</p>
    <p>4</p>
    <p>5</p>
    <noscript>You don't have JavaScript enabled. No counting from 6 to 10 for you.</noscript>
    <script>
    <!--
    document.write("<h1>Counting from 6 to 10 in JavaScript</h1>\n");
    for (num=6; num <= 10; num++) {
        document.write("<p>" + num + "</p>\n");
    }
    -->
    </script>
    <p>Wasn't that easy!</p>
    </body></html>

The first block (through the Authorization line) is what was sent by the browser to the server after logging on, and the second block (starting with HTTP/1.1 200 OK) is what the server sent back to the browser.


Note the last line the browser sent to the server:

    Authorization: Basic YWRtaW5pc3RyYXRvcjpzZWNyZXRwYXNzd29yZDk5OQ==

This is how the browser sends the HTTP “Basic” Authentication data to the server. It looks as if it contains an encrypted password, which would probably be quite secure. However, it doesn’t. What it contains is an encoded, not encrypted, password. This means that you can easily decode it if you know what the encoding method is. In this case, we know that the encoding method is Base64 because the HTTP protocol specification says that is what is used here. If we decode the string (just Google for “base64 decoder”), the line changes into:

    Authorization: Basic administrator:secretpassword999

We can now see that the user name is administrator and the password is secretpassword999. This information can be (and is) easily “sniffed” (listened for and copied) off the network by any machine between you and the web server you are connecting to. This means that an ISP, network provider, hosting company or anyone who has compromised (virus, worm, Trojan, hack) any machine along the way can simply collect usernames and passwords “off the wire” without anyone ever knowing it has been done.

The only way to securely use HTTP Basic Authentication is to use it over a secure SSL connection.
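Decoding the header takes only a few lines in most languages, which shows how little protection Base64 provides. Here is a short sketch in Python using the standard base64 module; the variable names are illustrative.

```python
import base64

header = "Authorization: Basic YWRtaW5pc3RyYXRvcjpzZWNyZXRwYXNzd29yZDk5OQ=="

# The credentials are the last space-separated token on the line.
encoded = header.split(" ")[-1]

# Base64 is an encoding, not encryption: no key is needed to reverse it.
decoded = base64.b64decode(encoded).decode("ascii")

# Basic Authentication joins the two values as "username:password".
username, _, password = decoded.partition(":")
print(username, password)
```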

Web Server Security – Secure Sockets Layer (SSL) The Secure Socket Layer (SSL) is a “Holy Grail” of Internet security. It is a protocol that fully encrypts, using robust and reliable encryption, all the communication between the web server and the web browser. This means that anything that is viewed will be invisible (except for the web server and hostname being connected to) to anyone else on the Internet. To access a web site using SSL, instead of using the http://domain.com format of URL, use HTTPS for the protocol instead, like this: https://domain.com . This will ensure that all communication between the server is fully encrypted (again, except for the web server’s IP address being contacted and the hostname of the web site being accessed). For example, here is the same session over SSL: ..................{...{.55.....9f..&mb8..jn ...'I.P...<.|.J..>mM.............8. ...9.8.....5.........3.2............./......... ..... ...-.........www.diamondsea.com. ....................J...F..Gg.....aX..x..Sd.}$$m4c...r.t..F ...'I.P...<.|.J..>mM.............9...........0...E.......k.D....A.k.Yu.hp.^.../."..u...W...o............0...........<O.t.....t:..a.xf....`..8..IK..EB........p......p.x..86R+.j.....w.U^...{^."C ..>.3........^...DHy...KH.......*.\..:...g.c...hV.X.J.W...z....e..........f%n.N.....(,..,.i...(.^..8.h..dB...n....==o...{..G..=( ..(B....n..w&..m.].t/i..............)...- .Ju.....Dq.[.Qa......,.....p..}.....I..u...{...S .m'Z..m2.B...

Vp"6..._..[.....2..Ha....-.Y.kNUH.9..w.S..`...Z#..f.uS.:...X...u...h..=....[r.n#@...._.k,0.Tm..9;...... ....e.DTf..........E.:........|+;T'^...l)...........'[email protected]\>o.%e.R.....7..+]...y|.b.e3[..{...]....U7^.-....}.x9.A.L........Q.d.1...MP2vz..;[email protected]..=..$....1..-j.m.I?...,....*.a.%.5oZ....|.].{\Z..9.+... .....M.2 ..v-W+Q9.qV.".xF...|.7......v.....s.*.q.... .0...j..6.....*.P.....W..9..wQrNs.-......."bK.B.E..,V....q.M..V..j.~..Y...R;.V..,v.q..kbI..|..x...Gq...VmAR.X.. ......).9fpw..8....(d..k.....r.<0.......D...}s.c....M~.L...{..l..Fg.p..Q!..^...VewLUb.C.c....q...B.d....9s2...5R..[k ........p......t...C.....K....-\[.....YG..b. !....O..#f.....Es0.......[v=.z[.7..3.....Z...>..h........_.Y..s._.>|.Y0I....T.}%g.....%M..8.x.^...=.+..'.eq3E.k.`[email protected]"/.0.>..%B..x.....57.s<...r...i.6....N.....FN.....k.w..........Q.T.T..+..P/m...np......J.#&..t._....j>.t.h9....r... ..Z$^...Z^..T..I......../...e....ZjJK.Oi.8.........7IT.W....<?^1.2ej.l..PS&G.......L.S..}....Zv.........^.Q..o......"..G..(..e.E._cA..5h......x...WyFhl...W.....Z..^. ......= .)...U.7..._...A:.v...|.An.HH..}.[K>..C...%H}.n..D(.V.....h..%..n f.....f...,tzvq{I.>yg..,.^|.4|Qae<.....,....i..Qi.M...o.... !..o.....Q.0..{....[...~...!Q.N........."8[.A.g..=......G{F._..LNw.;.........J.,.(..e..[....c...Q]/E.9....e.2|..).....| ..Y2.5q..t.._ZFi:..)*,.........q"..~..i.....!..E....K+.SK....>D{6..X.B.0..U....Knn......HFS.z.V..\.n.L....I..h.u.W.Yv...l.+ .0CD........9...V..Q@& .....Fv.T.q[........gDl.....!H..H.\..T.S.%.U'D+t.@ Z...2\...{1!.Ue..y.|.#...x..A.*....{.3.dB.#..*...'...U/.?../.C...OJ.K..lo.4.j.yx.r2LU.....P.^Y..8Na.u...(`. ...m....9..b.}C... Z{GoU...e..&A..7.RN....../..u3W?....3.1..e..........-=..<.L....I[..].S9.m..J..b.7~..{.a.B.%j.M...:M..b'...;.....o.J....$i.0wo..._e.........{...dq..D..7^G...gA..B.....19...'/.c*u.T.b..i... ....{...t \i.6d....l........3...m.u..m... 
.....x.....})...v..I.M.^.#t.....0qD.....N.....o./D.....N..$0..rSHsc.{v>y...].~"...X.t.&.z|cm.:m..D.~..Q.........b.....xm+.C+e/DP.8O....Y.*S.....LS...2...%.F.....T..tK0.t...A#.z.....?.#.J.X....jdQ..[.........%S(.f.R...k(g....e.eI........|T....y.....|..........H..d).N....>...E.C)...].z.lK....\w...t. .;..*....,...\..V.u.iCQ...o.#fK..;..o+....`...........c...........f...C.....`u=.C....MQ...L.C{.R..r6.E.jJ....FmH...N).[A.......~K.!...T,.@....`...V..1._......\... .d.mD.r.J..o..m}.R2.<.9......u.*Y....`j?.7.i.%.-.q...(......0l.}.......h......e.`P1.:.i......../.0.O1....g..M-..6.`.#T6...'.....20kD......./.#....o.s.`..k8...=m....uZ.J .......*y.K.......yO......yq~..)..0N4wFO..r&q-p..:......3._.........V..r.lph...~.B..}..M._....4....W.!..X.7......h.#f..2.....X.L_..Y0&..D........G............*O..3..i..y%.a?.,&F.v:....G.c..........u....:...D..g........#..\c..oxe....?......*.W,.....S.........oX.....@......!......3.Z...............>t.7.....6>s......ei4g}.......j~*=.DnPq.;hX/......'...V'.....xS....k..1[....HS>+.=03...x..M...x.fxF.d/.....F....,.I...K ....qy>.....'........O+.....K.U.K.$...9{...-.... x......).]&...5E.:.>(..FM.H... H.... cDe..Y7..U..b[.-,.kj4.Fg....r..f

That insecure HTTP Basic Authentication string is still in there, but nobody is going to decode it, because they would first have to decrypt the SSL-encrypted connection, and that would take a huge amount of computing power. An eavesdropper will not be able to tell anything other than the name of the site you connected to (which has to be sent unencrypted due to the way the HTTPS protocol works) and the IP address of its server. The basic rule is: for anything that you do not want to be readable by someone between you and the server, use an SSL (https) connection.
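The basic rule can also be enforced in code. The following is a minimal sketch, assuming Python's standard urllib and ssl modules, of a helper that refuses to attach Basic credentials to anything other than an https URL. The function names are hypothetical and exist only for this illustration; the domain.com URL is the placeholder used earlier in this section.

```python
import base64
import ssl
import urllib.request
from urllib.parse import urlsplit


def build_authenticated_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Create a request carrying Basic credentials, refusing plain http URLs."""
    if urlsplit(url).scheme.lower() != "https":
        raise ValueError("refusing to send Basic credentials over an unencrypted connection")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request


def fetch(request: urllib.request.Request) -> bytes:
    """Send the request over an encrypted connection and return the response body."""
    # The default context verifies the server's certificate and hostname.
    context = ssl.create_default_context()
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read()


if __name__ == "__main__":
    # Only builds the request; calling fetch() would need a real server to contact.
    req = build_authenticated_request("https://domain.com/", "administrator", "secretpassword999")
    print(req.full_url)
```

Checking the scheme before the header is ever built means a mistyped http URL fails loudly instead of quietly sending the password in readable form.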

Resources

W3 Schools – Free Web Building Tutorials: http://www.w3schools.com/
Lynda.com – Online video training on HTML and lots of other topics ($25/mo): http://www.lynda.com

Author

By Brian Gallagher - www.BrianGallagher.com