meljun cortes jedi course notes web programming-lesson1-introduction to the course


Upload: meljun-cortes

Post on 16-Apr-2017


Introduction to Web Programming

Why the Web?

Welcome to this course on web programming. To start things off, let's begin with a better appreciation of why it is worthwhile for companies and programmers alike to focus on web programming.

Technology-Neutral Environment

First of all, one of the great things about applications on the Internet is that the Internet is a technology-neutral environment. Communication with any application on the web is done through popular standards (HTML/HTTP) that do not require the user to have a particular operating system or a client written in a particular programming language or framework. All that users need is a web browser, an application that now comes bundled with virtually every operating system. This translates into a wider possible audience for any web-based application.

Ease of Distribution/Updates

Since the only program that the user needs is a web browser, there is no need to distribute programs on CDs. Nor does the user need to go through a possibly lengthy installation sequence; all they need is the location of the application on the Internet, and they are ready to go.

Another benefit of having the actual binaries of the program reside on an accessible server instead of on the user's computer is that the usual problems related to program updates, such as the need to periodically check for newer versions and the question of how to actually obtain the updates, are eliminated altogether. The user need not even be informed of an update; all that is needed is to update the codebase on the web server, and all users who make use of the application afterwards automatically enjoy the benefits of the update.

Client-Server Architecture

Thick and Thin Clients

A web application is a kind of application that makes use of what is called a client-server architecture. In this kind of architecture, a client program connects to a server to retrieve the information that it needs to complete the tasks that the user has set it to do. Clients come in two kinds: thin clients and thick clients.

Thin clients are clients containing only the minimum of what is required for the user experience, mostly only an interface. All business logic and all data, aside from what the user provides, reside on the server. Thick clients are clients that, aside from an interface, also contain some, if not much, of the processing logic required for user-specified tasks.

Client-Server Architecture from a Web Perspective

From the definition above, we can tell that the clients used for web applications are thin clients. The client program, a browser in this case, is only an interface that the user makes use of to perform tasks. Everything else, from the data that the user needs to operate on to the logic that determines program flow and execution, resides on the server.

From a more web-based perspective, here are the duties of the server and the client:

Web server

Figure 1: Responsibility of Server. A machine running a web browser sends a client request (containing the name and address of the item the client is looking for) to a machine running a web server. The server processes the client request by looking for the requested resource, then sends back a server response (containing the document requested by the user, or an error code if the item does not exist).

Basically, the server takes in requests from web browser clients and returns a response. Any request coming in from the client includes the name and address of the item the client is looking for, as well as any user-provided data. The server takes in that request, processes it, and returns as a response either the data requested by the client or an error code indicating that the item does not exist on the server.

Web client

It is the browser's responsibility to provide the user with an interface with which to issue requests to the server and to view the server's response.

When the user issues a request to the server (for example, to retrieve a document, or maybe to submit a form), it is the browser that formats that request into something that the server can understand. Once the server has finished processing the request and has sent a response, it is the browser that retrieves the required data from the server response and then renders that for display to the user.

HTML

How does the browser know what to display to the user? Most web sites do not have just simple text content, but instead employ graphics or have forms that retrieve data. How does each browser know what to display?

The answer lies with HTML, short for Hypertext Markup Language. HTML can be thought of as a set of instructions for the web browser on how to present content to the user. It is an open standard updated by the W3C, the World Wide Web Consortium. Since it is an open standard, everybody has access to it. It also means that browsers are developed with that standard in mind, so all browsers know what to do when they encounter HTML. Some older browsers, however, might have problems rendering pages written in versions of HTML that were released after those browsers were developed.

HTTP

Definition

HTTP stands for Hypertext Transfer Protocol. It is a network protocol with Web-specific features that runs on top of two other protocol layers, TCP and IP. TCP is a protocol responsible for making sure that a file sent from one end of a network is delivered completely and successfully at its destination. IP is a protocol that routes the pieces of the file from one host to another on their way to the destination. HTTP uses these two protocols to make sure that requests and responses are delivered completely between each end of the communication.

HTTP uses a request/response sequence: an HTTP client opens a connection and sends a request message to an HTTP server; the server then returns a response message, usually containing the resource that was requested. In the simplest case, the server closes the connection after delivering the response. HTTP is a stateless protocol, i.e. the server does not maintain any information about a client between transactions.

The formats of the request and response messages are similar and English-oriented. Both kinds of messages consist of:

an initial line,

zero or more header lines,

a blank line (i.e. a CRLF by itself), and

an optional message body (e.g. a file, or query data, or query output).
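
The four parts listed above can be sketched in Java as follows. This is only an illustration of the message layout, not a real HTTP client; the class name, the buildMessage helper, and the www.example.com host are assumptions made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HttpMessageSketch {

    private static final String CRLF = "\r\n";

    // Assembles a message from an initial line, header lines, and an optional body.
    static String buildMessage(String initialLine, Map<String, String> headers, String body) {
        StringBuilder message = new StringBuilder();
        message.append(initialLine).append(CRLF);            // 1. initial line
        for (Map.Entry<String, String> header : headers.entrySet()) {
            message.append(header.getKey()).append(": ")
                   .append(header.getValue()).append(CRLF);   // 2. zero or more header lines
        }
        message.append(CRLF);                                 // 3. blank line (a CRLF by itself)
        if (body != null) {
            message.append(body);                             // 4. optional message body
        }
        return message.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.example.com");               // placeholder host, not from the notes
        String request = buildMessage("GET /index.html HTTP/1.1", headers, null);
        System.out.print(request);
    }
}
```

A response has the same shape; only the initial line differs (for example, a status line such as HTTP/1.1 200 OK instead of a request line).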

HTTP Requests

Requests from the client to the server contain the information about the kind of data the user is requesting. One of the items of information encapsulated in the HTTP request is a method name. This tells the server the kind of request being made, as well as how the rest of the message from the client is formatted. There are two methods that you'll likely encounter and use: GET and POST.

GET

GET is the simplest HTTP method. It is used mainly to request a particular resource from the server, whether it be a web page, a graphic image file, a document, etc.

GET can also be used to send data to the server, though doing this has its limitations. For one, the total number of characters that can be encapsulated in a GET request is limited, so in situations where a lot of data needs to be sent to the server, not all of the message may come through.

Another limitation of the GET request method when it comes to sending data is that the data you send using this method is simply appended to the URL you send to the server. (For now, think of a URL as the unique address you send to the server denoting the location of whatever it is you are requesting.) One of the problems with this is that the URL of any request you make to the server is displayed in the browser's address bar. This means that any sensitive data such as passwords or contact information can be exposed to anybody who can see it.

The advantage of using GET to send data to the server is that the URL resulting from a GET request can be bookmarked by the browser. This means that the user can simply bookmark the request and return to it whenever needed instead of having to go through the whole process every time. Take note, though, that this can also be dangerous; if bookmark functionality is not something that you want your users to have, use another method instead.

Here is what a URL generated with a GET request may look like:

http://jedi-master.dev.java.net/servlets/NewsItemView?newsItemID=2359&filter=true

Everything before the question mark (?) is the original URL of the request (in this case, it is http://jedi-master.dev.java.net/servlets/NewsItemView). Everything after it is the set of parameters, or data, that you send along to the server.

Let's take a closer look at that part. Here are the parameters added to that request:

newsItemID=2359&filter=true

In GET requests, parameters are encoded as name and value pairs. A value is never sent on its own; each one is paired with a name so that the server knows what the value is for. The name and value pairs are encoded as:

name=value

Also, if there is more than one parameter, the pairs are separated using the ampersand symbol (&). So, in this case, the parameter names we are specifying for the server are newsItemID and filter, with the values of 2359 and true, respectively.
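
To make the encoding concrete, here is a minimal sketch of how a program might split the example URL into its name and value pairs. The class and method names are hypothetical, and the sketch deliberately ignores details a real server handles, such as decoding escaped characters or repeated parameter names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringSketch {

    // Splits the part after '?' into an ordered map of name/value pairs.
    static Map<String, String> parseQuery(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        int questionMark = url.indexOf('?');
        if (questionMark < 0) {
            return params;                         // no parameters were appended to the URL
        }
        String query = url.substring(questionMark + 1);
        for (String pair : query.split("&")) {     // pairs are separated by '&'
            if (pair.isEmpty()) {
                continue;
            }
            int equalsSign = pair.indexOf('=');    // each pair is encoded as name=value
            String name = equalsSign < 0 ? pair : pair.substring(0, equalsSign);
            String value = equalsSign < 0 ? "" : pair.substring(equalsSign + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        String url = "http://jedi-master.dev.java.net/servlets/NewsItemView"
                   + "?newsItemID=2359&filter=true";
        // prints {newsItemID=2359, filter=true}
        System.out.println(parseQuery(url));
    }
}
```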

POST

The other kind of request method that you are most likely to use would be the POST request. These kinds of requests are designed such that the browser can make complex requests of the server. That is, they are designed so that the user, through the browser, can send a lot of data to the server. Complex forms are generally accomplished using POST requests, as well as simple forms that require the uploading of files to the server.

One apparent difference between the GET and POST methods is the way they send data to the server. As stated before, GET simply appends the data to the URL it sends over. POST, on the other hand, encapsulates or hides the data inside of the message body it sends. When the server receives the request and determines that it is a POST request, it looks in the message body for this data.
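
As a rough illustration of the difference, the sketch below builds the text of a POST request carrying the same two parameters used in the GET example. The class name and buildPost helper are made up for this example, and the headers shown are a typical choice for form data rather than anything prescribed by these notes. Note that the data is only moved out of the URL; it is not encrypted.

```java
import java.nio.charset.StandardCharsets;

public class PostRequestSketch {

    private static final String CRLF = "\r\n";

    // Builds a POST request whose parameters travel in the message body.
    static String buildPost(String path, String host, String formData) {
        int length = formData.getBytes(StandardCharsets.UTF_8).length;
        return "POST " + path + " HTTP/1.1" + CRLF
             + "Host: " + host + CRLF
             + "Content-Type: application/x-www-form-urlencoded" + CRLF
             + "Content-Length: " + length + CRLF
             + CRLF           // blank line separates the headers from the body
             + formData;      // same name=value pairs as in GET, but not in the URL
    }

    public static void main(String[] args) {
        System.out.println(buildPost("/servlets/NewsItemView",
                "jedi-master.dev.java.net",
                "newsItemID=2359&filter=true"));
    }
}
```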

HTTP Response

HTTP responses from the server contain both headers and a message body, just as HTTP requests do. They use a different set of headers, but we won't go into too much detail about those here. It is sufficient to say that the headers contain information about the version of the HTTP protocol that the server is using, as well as the type of content that is encapsulated within the message body. The value for the content type is called the MIME type. This tells the browser whether the message contains HTML, a picture, or some other type of content.
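
The sketch below shows, under simplifying assumptions, how a client could read the status line and the content type out of a raw response. The response text is hand-written for the example, and the class and helper names are hypothetical; a real browser handles many more cases.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class HttpResponseSketch {

    // The first line of a response carries the protocol version and status.
    static String statusLine(String rawResponse) {
        return rawResponse.split("\r\n", 2)[0];
    }

    // Returns the response headers as a map, with header names lower-cased.
    static Map<String, String> parseHeaders(String rawResponse) {
        Map<String, String> headers = new LinkedHashMap<>();
        String head = rawResponse.split("\r\n\r\n", 2)[0];   // everything before the blank line
        String[] lines = head.split("\r\n");
        for (int i = 1; i < lines.length; i++) {              // line 0 is the status line
            int colon = lines[i].indexOf(':');
            if (colon > 0) {
                String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
                headers.put(name, lines[i].substring(colon + 1).trim());
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        String raw = "HTTP/1.1 200 OK\r\n"
                   + "Content-Type: text/html\r\n"
                   + "Content-Length: 13\r\n"
                   + "\r\n"
                   + "<p>Hello</p>\n";
        System.out.println(statusLine(raw));                        // protocol version and status
        System.out.println(parseHeaders(raw).get("content-type"));  // the MIME type of the body
    }
}
```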

Dynamic Over Static Pages

The content served up by the web server can be either static or dynamic. Static content is content that does not change. This kind of content usually just sits in storage where the server can access it and is brought up on request. When it is sent as a response from the server, it is sent exactly as it is stored on the server. Examples of static content include archived newspaper articles, family pictures from an online photo gallery, or even possibly an online copy of this document!

Dynamic content, on the other hand, changes according to user input. For this type of content, the applications on the server have access to a kind of template that tells them what the document to be sent will look like in general. This template is then filled in according to the parameters sent in by the user and returned to the client.

Suffice it to say, dynamic pages have a lot more flexibility and utility than static pages. Here are a few scenarios where dynamic content is the only thing that will fit the bill:

The Web page is based on data submitted by the user. For example, the results pages from search engines are generated this way. Programs that process orders for e-commerce sites do this as well.

The data changes frequently. A weather-report or news headlines page might build the page dynamically, perhaps returning a previously built page if it is still up to date.

The Web page uses information from corporate databases or other such sources.
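
The idea of a template that is filled in from user parameters can be sketched as follows. The ${name} placeholder syntax and the render helper are assumptions invented for this illustration; they are not the mechanism used by any particular server, and the sketch does no escaping of user input.

```java
import java.util.HashMap;
import java.util.Map;

public class TemplateSketch {

    // Replaces each ${name} placeholder with the matching parameter value.
    static String render(String template, Map<String, String> params) {
        String page = template;
        for (Map.Entry<String, String> param : params.entrySet()) {
            page = page.replace("${" + param.getKey() + "}", param.getValue());
        }
        return page;
    }

    public static void main(String[] args) {
        String template = "<h1>Results for ${query}</h1>";   // the fixed part of the page
        Map<String, String> params = new HashMap<>();
        params.put("query", "servlets");                      // data submitted by the user
        // prints <h1>Results for servlets</h1>
        System.out.println(render(template, params));
    }
}
```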

It is important to realize though, that web servers by themselves do not have the capability to serve dynamic content. Web servers need to have access to applications that can build dynamic content. Also, aside from needing separate applications for creating dynamic content, web servers also need separate applications that will store pertinent user information (such as data collected from forms) into storage. You can't expect to create a form, have the user input data into it, submit it to the server, and have the server automatically know what to do with that data.

We are now at the part of our discussion where we can explicitly point out that it is the creation of these web applications that forms the basis of our course. So, how do we go about creating these applications?

In this course, we will be turning primarily to Java-based technologies to create our web applications. More specifically, we will be making extensive use of the APIs provided in the web tier of the J2EE (Java 2 Enterprise Edition) specification.

J2EE Web Tier Overview

The Java 2 Enterprise Edition (J2EE) platform is a platform introduced for the development of enterprise applications in a component-based manner. The application model used by this platform is called a distributed multi-tier application model. The distributed aspect of this model simply means that most applications designed and developed with this platform in mind can have their different components installed on different machines. The multi-tier part means that the applications are designed with multiple degrees of separation with regard to the various major components of the application. An example of a multi-tiered application is a web application: the presentation layer (the client browser), the business logic layer (the program that resides on the web server), and the storage layer (the database which will handle the application data) are distinctly separated, but are all needed as a whole to create one application for the user.

One of the tiers in the J2EE platform, as previously mentioned, is the Web tier. This tier is described as the layer that interacts with browsers in order to create dynamic content. There are two technologies within this layer: servlets and JavaServer Pages.

Figure 2: The Web Tier in the J2EE Platform (Image from J2EE Tutorial)

Since these will be tackled more intensively later, only a brief description will be given here.

Servlets

Servlet technology is Java's primary answer for adding functionality to servers that use a request-response model. Servlets have the ability to read data contained in the requests passed to the server and generate a dynamic response based on that data. Servlets are not necessarily limited to HTTP-based situations; as stated before, they are applicable to any scenario requiring the request-response model. HTTP-based situations are currently their primary use, so Java provides an HTTP-specific version that implements HTTP-specific features.

JavaServer Pages

One of the disadvantages of using servlets to generate a response to the client is formatting the HTML to be sent back. Since servlets are simply Java language classes, they produce output the way other Java programs would: by printing characters as Strings into the output stream, in this case the HTTP response. However, HTML can be quite complex, and it can be very hard to encode HTML through the use of String literals. Also, engaging the services of a dedicated graphics and web page designer to help with the static parts of the pages is hard, if not impossible, since the designer would need at least a minimum knowledge of Java.

This is where JavaServer Pages (JSP) technology comes in. JSP looks just like HTML, only it has access to all the dynamic capabilities of servlets through the use of scripts and expression languages. Since it looks just like HTML, designers can concentrate on simple HTML design and simply leave placeholders for developers to fill with dynamic content.
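
The following sketch gives a feel for what printing HTML through String literals looks like. It uses only a plain PrintWriter so that it runs without a container; in an actual servlet the writer would come from the response object. The class and method names are made up for this example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class HtmlPrintingSketch {

    // Writes a small HTML page one print call at a time, the way a servlet would.
    static void writePage(PrintWriter out, String userName) {
        out.println("<html>");
        out.println("  <head><title>Greeting</title></head>");
        out.println("  <body>");
        out.println("    <h1>Hello, " + userName + "!</h1>");   // the only dynamic part
        out.println("  </body>");
        out.println("</html>");
    }

    // Captures the page in memory so that it can be inspected or printed.
    static String renderToString(String userName) {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        writePage(out, userName);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(renderToString("Ana"));
    }
}
```

Even for a page this small, the markup is buried inside Java code, which is the difficulty described above.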

Containers

Central to the concept of any J2EE application is the Container. All J2EE components, including web components (servlets, JSPs) rely on the existence of a container; without the appropriate container, they would not run.

Perhaps another way to explain this would be to think of the normal mode of execution of Java programs. Java programs, in order to be run, must have a main method defined; this marks the start of program execution and is the method performed when the program is executed from the command line.

Figure 3: Containers in the J2EE Platform (Image from J2EE Tutorial)

But, as we will see later, servlets do not have a main method defined. And if there is one defined (bad programming design), it does not mark the start of program execution. When a user makes an HTTP request for a servlet, its methods are not called directly. Instead, the server hands the request not to the servlet, but to the container in which the servlet is deployed. The container is then the one responsible for calling the appropriate method in the servlet, depending on the type of user request.
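
The relationship described above can be imitated with a toy model. Everything in the sketch below (MiniServlet, MiniContainer, HelloServlet, and their methods) is a hypothetical stand-in written for this illustration; it is not the real servlet API, which is covered later in the course.

```java
import java.util.HashMap;
import java.util.Map;

public class ContainerSketch {

    // Hypothetical stand-in for a servlet: note that it has no main method.
    interface MiniServlet {
        String doGet(Map<String, String> params);
        String doPost(Map<String, String> params);
    }

    // Hypothetical stand-in for a container: it holds the servlets and calls them.
    static class MiniContainer {
        private final Map<String, MiniServlet> deployed = new HashMap<>();

        void deploy(String path, MiniServlet servlet) {
            deployed.put(path, servlet);
        }

        // Picks the servlet method to call based on the type of request.
        String handle(String method, String path, Map<String, String> params) {
            MiniServlet servlet = deployed.get(path);
            if (servlet == null) {
                return "404 Not Found";
            }
            switch (method) {
                case "GET":  return servlet.doGet(params);
                case "POST": return servlet.doPost(params);
                default:     return "405 Method Not Allowed";
            }
        }
    }

    static class HelloServlet implements MiniServlet {
        public String doGet(Map<String, String> params) {
            return "Hello, " + params.getOrDefault("name", "world");
        }
        public String doPost(Map<String, String> params) {
            return "Received " + params.size() + " parameter(s)";
        }
    }

    public static void main(String[] args) {
        MiniContainer container = new MiniContainer();
        container.deploy("/hello", new HelloServlet());
        Map<String, String> params = new HashMap<>();
        params.put("name", "Ana");
        System.out.println(container.handle("GET", "/hello", params));   // container calls doGet
        System.out.println(container.handle("POST", "/hello", params));  // container calls doPost
    }
}
```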

Features provided by the container:

Communications support. The container handles all of the code necessary for your servlet to communicate with the web server. Without the container, developers may need to write code that will create a socket connection from the server to the servlet (and vice-versa) and manage how they talk to each other every single time.

Lifecycle management. The container handles everything in the life of your servlet, from class loading, instantiation, and initialization to garbage collection.

Multi-threading support. The container manages the duty of creating a new thread each time a call to a servlet is made. NOTE: The container is NOT responsible for the thread safety of your servlet.

Declarative security. A container supports the use of an XML configuration file that can handle security for your web application without needing to hard-code any of it into your servlets.

JSP Support. JSP pages, in order to work, must be compiled into Java code. The container manages the task of translating your JSP pages into Java code, compiling it, and calling the appropriate methods in that code.
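
Regarding the note on thread safety above: the sketch below simulates several threads calling into one shared object, which is how a servlet is typically exercised when requests arrive at the same time. HitCounter and simulate are hypothetical names for this example, and using AtomicInteger is just one possible way to keep shared state consistent.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadSafetySketch {

    // Hypothetical handler shared by all request threads.
    static class HitCounter {
        private final AtomicInteger hits = new AtomicInteger();  // safe for concurrent updates

        int handleRequest() {
            return hits.incrementAndGet();
        }

        int total() {
            return hits.get();
        }
    }

    // Simulates a container calling the same handler from several threads.
    static int simulate(int threads, int requestsPerThread) {
        HitCounter counter = new HitCounter();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    counter.handleRequest();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("simulation did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(4, 1000));  // every request is counted: 4000
    }
}
```

Had the counter been a plain int field incremented with ++, some updates could be lost when threads overlap, and the container would do nothing to prevent that.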

Basic Structure of a Java Web App

For a container to recognize your application as a valid web application, it must conform to a specific directory structure:

Document Root: contains HTML, images, other static content, plus JSPs
    WEB-INF: contains meta-information about your application (optional); all contents of this folder cannot be seen from the web browser
        classes: contains class files of Java classes created for this application (optional)
        lib: contains JAR files of any third-party libraries used by your app (optional)
        web.xml: XML file storing the configuration entries for your application

Figure 4: Directory Structure of Java Web Application

The illustration above shows the directory structure required by the container to recognize your application.

Some points regarding this structure:

One: The top-level folder (the one containing your application) does NOT need to be named Document Root. It can be, in fact, named any way that you like, though it is highly recommended that the top-level folder name be the same name as your application. It is only named Document Root in the figure to indicate that it serves as the root folder of the files or documents in your application.

Two: Any other folder can be created within this directory structure. For example, for developers wishing to organize their content, they can create an images folder from within the document root to hold all their graphics files, or maybe a config directory inside the WEB-INF folder to hold additional configuration information. As long as the prescribed structure as shown above is followed, the container will allow additions.

Three: The capitalization of the WEB-INF folder is intentional. The lowercase on classes and lib is intentional as well. Not following the capitalization on any of these folders will result in your application not being able to see the contents of these folders.

Four: None of the contents of the WEB-INF folder can be seen from the browser. The container automatically manages things such that, for the browser, this folder does not exist. This mechanism protects your sensitive resources such as Java class files, application configuration, etc. The contents of this folder can only be accessed by your application.

Five: There MUST be a file named web.xml inside the WEB-INF folder. Even if, for example, your web application contains only static content and does not make use of Java classes or library files, the container will still require your application to have these two items.
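
As a small illustration of points Three and Five, the sketch below checks whether a folder contains the two required items. The class, the helper methods, and the temporary "myapp" folder are assumptions made for this example; a real container performs its own, more thorough checks. On file systems that ignore case, the capitalization check shown here would not catch a misnamed folder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class WebAppLayoutSketch {

    // Returns true if the folder has the two required items: WEB-INF and WEB-INF/web.xml.
    static boolean hasRequiredLayout(Path documentRoot) {
        Path webInf = documentRoot.resolve("WEB-INF");     // capitalization matters
        return Files.isDirectory(webInf)
            && Files.isRegularFile(webInf.resolve("web.xml"));
    }

    // Creates a throwaway application folder with the minimum structure.
    static Path createSampleApp() {
        try {
            Path root = Files.createTempDirectory("myapp"); // any top-level name is allowed
            Path webInf = Files.createDirectories(root.resolve("WEB-INF"));
            Files.createFile(webInf.resolve("web.xml"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path app = createSampleApp();
        System.out.println(hasRequiredLayout(app));                      // true
        System.out.println(hasRequiredLayout(app.resolve("WEB-INF")));   // false: no nested WEB-INF
    }
}
```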

Exercise

Answer the following questions:

What kind of architecture does a web application make use of? Who are the participants of such an architecture, and what are their roles?

What markup language is used to instruct the browser on how to present content to the user?

HTTP is a (stateful | stateless) connection protocol. (Underline the best answer).

The two most used HTTP request methods are GET and POST. How are they different? When is it better to use one over the other?

How are request parameters sent to the server using the GET method?

What component is absolutely necessary to be able to run web applications?

What are the non-optional elements of a web application's directory structure?

What is the name of the XML file used for configuring the web application? In what directory can it be found?

Which folder contains the JAR files of the software libraries used by your application?

What folder will contain the class files of the Java code used by the application?