Broken Link Checker

URL LINK CHECKING WEB APPLICATION

Uploaded by: samuel-boboo-bassah

Posted on 21-Jul-2016


DESCRIPTION

A Web Application to Check for Broken Links.

TRANSCRIPT

Page 1: Broken Link Checker

URL LINK CHECKING WEB APPLICATION

Page 2: Broken Link Checker

INTRODUCTION Since the first websites in the early 1990s, designers have been experimenting with the way websites look. Early websites were entirely text-based, with minimal pages and no real layout to speak of beyond headings and paragraphs.

Page 3: Broken Link Checker

INTRODUCTION CONT’D However, as the industry progressed, new technologies made it possible to easily develop websites with an unlimited number of web pages. Today's web is the result of the ongoing efforts of an open web community that helps define these web technologies, such as HTML5, CSS, PHP, JavaScript and WebGL, and ensures that they are supported in all web browsers.

Page 4: Broken Link Checker

INTRODUCTION CONT’D One of the most persistent problems faced by website designers and webmasters is that links within pages become broken. With the growth of website content management systems, it is common for websites to have multiple editors and frequent updates, so mistakes can easily creep in. While an individual link is easy to check manually, sites are often simply too big to go through page by page.

Page 5: Broken Link Checker

INTRODUCTION CONT’D A broken link is a link that does not work, often resulting in an error page (a 404 error). A broken link happens when the link points to a web page that has been deleted or moved.
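In practice, a checker decides whether a link is broken from the HTTP response it gets back. The project's implementation language is not stated in these slides, so the following is only an illustrative sketch in Python using the standard urllib module:

```python
import urllib.request
import urllib.error

def link_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None when the
    request fails outright (unknown host, refused connection, etc.)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code        # e.g. 404 for a deleted page
    except urllib.error.URLError:
        return None            # the whole site is unreachable

def is_broken(status):
    """A link counts as broken if the request failed entirely or the
    server answered with a client or server error such as 404."""
    return status is None or status >= 400
```

A HEAD request avoids downloading the page body; since some servers reject HEAD, a production checker would typically fall back to a GET request.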

Page 6: Broken Link Checker

INTRODUCTION CONT’D If the site is visited by the general public, a broken link not only disrupts user journeys through the website, which can leave users frustrated with the brand, but also makes search engine spiders' jobs more difficult, reducing the opportunity for the website to rank well. If search engines are unable to crawl and index a site easily, they will be unable to rank it appropriately. Webmasters are therefore required to perform routine maintenance on their websites to check for broken links, in order to maintain and increase traffic. Checking for broken links can become a cumbersome task if the website has many web pages. This manual process of checking links to identify broken ones is better accomplished with a web service that automates it.

Page 7: Broken Link Checker

OBJECTIVES

By the end of this project, we aim to achieve the following objectives:

To develop a web application that will crawl registered websites to search for broken links on the web pages.

To develop a web application that will notify registered users via email of the status of the links on their websites.

To make the Web application user-friendly and interactive so that the users find it easy to use and navigate it.

To provide detailed documentation of the project being undertaken.
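To make the email-notification objective concrete, here is a hedged sketch of how a scan report message could be assembled. Python is assumed for illustration only, the sender address is a placeholder, and actually delivering the message (e.g. with smtplib) is omitted because the mail server would depend on the deployment:

```python
from email.message import EmailMessage

def build_report(recipient, site, broken_links):
    """Compose the notification email for one registered website."""
    msg = EmailMessage()
    msg["From"] = "linkchecker@example.com"   # placeholder sender address
    msg["To"] = recipient
    msg["Subject"] = f"Link report for {site}: {len(broken_links)} broken"
    lines = [f"Scan results for {site}:", ""]
    if broken_links:
        lines += [f"  BROKEN  {url}" for url in broken_links]
    else:
        lines.append("  All links are working.")
    msg.set_content("\n".join(lines))
    return msg
```

The returned message could then be handed to an SMTP client after each scan job completes.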

Page 8: Broken Link Checker

LIMITATIONS OF THE PROPOSED SYSTEM The more web pages a website has, the more time the web crawler will take to scan through the site, and this consumes a large amount of technical resources.

Page 9: Broken Link Checker

PROJECT METHODOLOGY Taking into consideration the small number of team members (2) for this project, the Extreme Programming Methodology (XP) of the agile development approach will be used. Extreme Programming (XP) is a software development methodology which is intended to improve software quality and responsiveness to changing requirements.

The unit testing and pair programming features of Extreme Programming were used to ensure code accuracy and collaboration between the two team members, respectively. The software was developed iteratively using a test-driven approach: a new feature is added only once its code passes the tests.
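As a small illustration of the test-first style described above (again in Python, assumed here for illustration only), the rule for classifying an HTTP status code could be pinned down by unit tests written before the feature:

```python
import unittest

def classify(status):
    """Label an HTTP status code the way a scan report would."""
    if status is None or status >= 400:
        return "broken"
    if 300 <= status < 400:
        return "redirect"
    return "ok"

class ClassifyTest(unittest.TestCase):
    # Under XP, tests like these exist before the feature they describe.
    def test_missing_page_is_broken(self):
        self.assertEqual(classify(404), "broken")

    def test_unreachable_host_is_broken(self):
        self.assertEqual(classify(None), "broken")

    def test_successful_page_is_ok(self):
        self.assertEqual(classify(200), "ok")

    def test_moved_page_is_a_redirect(self):
        self.assertEqual(classify(301), "redirect")
```

Running `python -m unittest` executes the suite; in XP, `classify` would be written or refined only after these tests exist and fail.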

Page 10: Broken Link Checker

REVIEW OF A WEB CRAWLER A web crawler (also known as a robot or a spider) is a system for the bulk downloading of web pages. Web crawlers are used for a variety of purposes. Most prominently, they are one of the main components of web search engines, systems that assemble a large number of web pages, index them, and allow users to issue queries against the index and find the web pages that match the queries.

Page 11: Broken Link Checker

REVIEW OF A WEB CRAWLER The basic web crawling algorithm is simple: Given a set of seed Uniform Resource Locators (URLs), a crawler downloads all the web pages addressed by the URLs, extracts the hyperlinks contained in the pages, and iteratively downloads the web pages addressed by these hyperlinks.
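The basic algorithm above can be sketched directly. In this illustrative Python version (not the project's actual code), the page-downloading step is passed in as a `fetch` function, so the breadth-first traversal stays independent of how pages are actually retrieved:

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed, fetch):
    """Breadth-first crawl starting from `seed`.  `fetch(url)` must
    return the page's HTML.  Returns the set of every URL seen."""
    seen = {seed}
    frontier = deque([seed])
    while frontier:
        url = frontier.popleft()
        parser = LinkExtractor()
        parser.feed(fetch(url))
        for href in parser.links:
            absolute = urljoin(url, href)   # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return seen
```

Because `fetch` is injected, the same traversal logic can be unit-tested against a dictionary of canned pages, which fits the test-driven approach described earlier; a real deployment would also need politeness rules and a limit on crawl depth.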

Page 12: Broken Link Checker

USER REQUIREMENTS User requirements consist of both functional and non-functional requirements.

Functional requirements are statements of services the system should provide, how the system should react to particular inputs and how the system should behave in particular situations.

Non-functional requirements are constraints on the services or functions offered by the system such as timing constraints, constraints on the development process, standards etc. Non-functional requirements may be more critical than functional requirements. If these are not met, the system is useless.

Page 13: Broken Link Checker

FUNCTIONAL REQUIREMENTS Functional requirements explain what has to be done by identifying the necessary tasks, actions or activities that must be accomplished.

Register with the system

Log into the system with a valid account

Allow users to view their scan report history

Change password, if forgotten

Allow Supervisors or Administrators to approve user registrations

Allow Supervisors or Administrators to run scan jobs

Allow Supervisors or Administrators to delete users

Allow Administrators to view the scan report summary

Page 14: Broken Link Checker

NON-FUNCTIONAL REQUIREMENTS These requirements specify the criteria that can be used to judge the operation of a system, rather than specific behaviours. They are requirements of the system that are not directly related to the functions the system performs.

 EASE OF USE

PORTABILITY

MAINTAINABILITY

SECURITY

Page 15: Broken Link Checker

SOFTWARE REQUIREMENTS The system runs best on a machine with the following capabilities and provisions:

Microsoft Windows Operating System, Mac OS or Linux

A web browser such as Mozilla Firefox, Google Chrome, Internet Explorer, etc.

HARDWARE REQUIREMENTS

A computer with at least 512 MB of RAM

Primary input and pointing devices, such as a keyboard and mouse

The most important requirement is internet access.

Page 16: Broken Link Checker

ARCHITECTURE OF THE LINK CHECKER

Page 17: Broken Link Checker

USE CASE DIAGRAM

Page 18: Broken Link Checker

E-R DIAGRAM

Page 19: Broken Link Checker

FUTURE WORK There are certain features that could be added to the software to increase its functionality, and these are improvements we would like to make in the future. The current process of running a scan job is started manually by a Supervisor, and we hope to automate this task. The current system could also be enhanced to report the exact line on which a broken link was found on a page, making it easy for the user to locate and correct the broken link.
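The line-reporting enhancement is feasible with a standard HTML parser. As a sketch only, not the project's actual design: Python's html.parser exposes the current source position while parsing, which a future version could record for each link it finds:

```python
from html.parser import HTMLParser

class LinkLocator(HTMLParser):
    """Record the source line on which each <a href> appears, so a
    report could say exactly where a broken link lives in the page."""
    def __init__(self):
        super().__init__()
        self.locations = []   # list of (line_number, href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    line, _column = self.getpos()   # 1-based line number
                    self.locations.append((line, value))
```

Feeding a page's HTML to this parser pairs every href with its line number; cross-referencing those pairs with the scan results would give the user the exact location of each broken link.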

Page 20: Broken Link Checker

CONCLUSION

The URL Link Checking Web Application will go a long way toward simplifying webmasters' maintenance activity of checking for broken links on their websites.
