
    2012, IJARCSSE All Rights Reserved Page | 141

    Volume 2, Issue 7, July 2012 ISSN: 2277 128X

    International Journal of Advanced Research in Computer Science and Software Engineering

    Research Paper. Available online at: www.ijarcsse.com

    Web Mining: Concepts and Decision Making Aid

    Ganesh Dhar, Govind Murari Upadhyay

    IITM, Janakpuri, New Delhi, India

    Abstract- This paper highlights the value of Web Mining. It gives an insight into its techniques, processes and applications in the current cut-throat business environment. It further explains how web mining plays an integral role in corporate decision making and is used by professionals from every walk of life to take rational decisions for the future of the organization. The paper also discusses how the web mining process affects society as a whole. Web mining has grown tremendously over the years, and the concept is now used in application domains such as Finance, Marketing, HR and Economics. Web Mining has played a significant role in the way business is done on the web. It has made the work of professionals smoother than conventional techniques did, where the user faced many hindrances while extracting hidden patterns from data. Even so, much research remains to be done in this domain.

    Keywords- E-Learning, E-Government, Intelligent Search Agents, CRM

    I. INTRODUCTION

    In today's scenario, information is growing at a rapid rate. Web Mining plays a pivotal role in extracting patterns of related and unrelated information and knowledge. Web Mining is a broad concept in the field of Data Mining. It is primarily an interdisciplinary field used for extracting information in the fields of Management (Finance, Marketing and HR), Mathematics, Social Sciences, Economics, Information Technology, and Production and Operations. There are numerous sources of information, which are classified into two categories:

    Internal Sources
    External Sources

    The internal sources include personnel from the organization, whereas external sources include the clients, vendors, suppliers, competitors, Internet, Intranet and the Extranet. In this scenario, the concept of WEB MINING plays an important role in the corporate structure.

    The traditional techniques of extracting information are no longer sufficient. Online libraries, search engines and other repositories are growing so rapidly that taking stock of each and every document manually is a Herculean task. This has given rise to working with web documents, which can be organized more easily. This technique of information retrieval is better known as WEB MINING.

    In an organization, web mining plays a significant role in the efficiency and effectiveness of decision making across the three levels of Management. Accessing information from a Data Warehouse in this way not only saves time but is also cheap, and it has helped organizations make huge profits.

    Web content mining is related to information extraction and knowledge discovery from analyzing a collection of web documents. As stated earlier, Web Mining is a wider concept. It also incorporates Multimedia Web Mining, that is, content indexing of multimedia such as images, videos, audio and animations. Web Mining is divided into three main categories. These are:

    A. Web-content Mining:

    It emphasizes knowledge discovery by going through the contents of web pages, such as images, videos and animations. The intelligent agents in web content mining address the indexing problem of search engines, which otherwise deliver imprecise results due to information overload. Web content mining is much more than the selection of relevant documents on the web. It is related to information mining and knowledge discovery through analysis of a collection of web documents. Web content mining is the effort required to organize semi-structured web data into a structured collection of resources, leading to fruitful and efficient query results. It consists of two approaches:

    Agent Based Approach
    Database Approach

    Agent Based Approach: There are three categories of Agent Based Approach, as follows:

    Intelligent Search Agents: These agents search for relevant information using domain characteristics and user profiles to structure and interpret the discovered knowledge.


    Information Filtering: These agents automatically retrieve, open, filter and categorize information in hypertext and web documents.

    Personalized Web Agents: These agents work on the principle of user preferences.

    Database Approach: It models the data on the web in a structured format so that database querying operations can be performed for data analysis. [2] Georgios Lappas, (2008).
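The database approach can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the sample HTML fragment, the `name` and `price` class names, the `products` table, and the helper names `extract_rows` and `load_into_db`. The sketch parses semi-structured markup into rows and loads them into an in-memory SQLite table so that ordinary database queries can be run on the result.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical semi-structured page fragment (illustrative only).
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="name">Laptop</span><span class="price">900</span></li>
  <li class="item"><span class="name">Phone</span><span class="price">500</span></li>
  <li class="item"><span class="name">Tablet</span><span class="price">300</span></li>
</ul>
"""


class ItemParser(HTMLParser):
    """Collects (name, price) pairs from <span class="name|price"> tags."""

    def __init__(self):
        super().__init__()
        self.rows = []       # finished (name, price) tuples
        self._field = None   # field that the current text belongs to
        self._current = {}   # record being built for the current <li>

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Text outside a recognised <span> (e.g. whitespace) is ignored.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and "name" in self._current and "price" in self._current:
            self.rows.append((self._current["name"], float(self._current["price"])))
            self._current = {}


def extract_rows(html):
    """Turn the semi-structured markup into a list of structured rows."""
    parser = ItemParser()
    parser.feed(html)
    return parser.rows


def load_into_db(rows):
    """Load the rows into an in-memory SQLite table named 'products'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    return conn
```

Once the rows are loaded, a query such as `SELECT name FROM products WHERE price < 600 ORDER BY price` can be issued through `conn.execute(...)`, which is the kind of database querying operation the approach is meant to enable.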

    B. Web-structure Mining:

    The structure of the web consists of web pages as nodes and hyperlinks as edges connecting related pages. Web structure mining depicts a structural layout of the web. It emphasizes the connectivity among websites, that is, hyperlinks. Further, web structure mining can be classified into two categories, as follows:

    Hyperlink: It is an element in an electronic document that links to another place in the same document or to an entirely different document.

    Hyperlinks are divided into two types:

    Internal hyperlinks, which lead to pages within the same website.

    External hyperlinks, which lead to pages on other websites.

    Document Structure: It is a schema language for XML, that is, a language for describing valid XML documents.
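The node-and-edge view of the web and the internal/external distinction can be sketched as follows. This is a minimal illustration under stated assumptions: the `PAGES` crawl result and the helper names `classify_link`, `build_graph` and `in_degree` are hypothetical, and "same website" is approximated by comparing host names.

```python
from collections import defaultdict
from urllib.parse import urljoin, urlparse

# Hypothetical crawl result: page URL -> hrefs found on that page.
PAGES = {
    "http://example.com/": ["/about", "http://example.com/products", "http://other.org/"],
    "http://example.com/about": ["/", "#team"],
    "http://example.com/products": ["/about", "http://partner.net/catalog"],
}


def classify_link(page_url, href):
    """Return (absolute_url, 'internal' or 'external') for one hyperlink."""
    target = urljoin(page_url, href)  # resolve relative links against the page
    same_site = urlparse(target).netloc == urlparse(page_url).netloc
    return target, "internal" if same_site else "external"


def build_graph(pages):
    """Build adjacency lists: pages are nodes, hyperlinks are directed edges."""
    graph = defaultdict(list)
    for page_url, hrefs in pages.items():
        for href in hrefs:
            graph[page_url].append(classify_link(page_url, href))
    return dict(graph)


def in_degree(graph):
    """Count incoming links per target, a simple measure of connectivity."""
    counts = defaultdict(int)
    for edges in graph.values():
        for target, _kind in edges:
            counts[target] += 1
    return dict(counts)
```

Calling `in_degree(build_graph(PAGES))` gives the number of links pointing at each page, which is the raw material that link-analysis methods typically start from.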

    Fig. 1 Web mining element

    C. Web-usage Mining:

    It emphasizes the knowledge discovered while the user is navigating through websites. This means capturing all kinds of user requests and maintaining a repository of these requests in log files. Web usage mining is classified into two types.

    Web Server Data: Logs are made by the web server and include fields such as IP addresses, the web pages accessed and the access times.

    Application Server Data: Such applications are prepared for carrying out business transactions and keep their repository in application server logs.

    Fig. 1 is a pictographic representation of the Web Mining elements. [1] Padre Toms, S.J., Taipa 2005
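As a rough illustration of working with web server data, the sketch below extracts the IP address, requested page and access time from log lines. The paper does not name a log format, so the NCSA Common Log Format is assumed here; the sample lines and the helper names `parse_line`, `page_hits` and `pages_by_visitor` are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: NCSA Common Log Format. The sample lines are made up.
LOG_LINES = [
    '10.0.0.1 - - [05/Jul/2012:10:00:01 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [05/Jul/2012:10:00:05 +0000] "GET /products.html HTTP/1.1" 200 2048',
    '10.0.0.1 - - [05/Jul/2012:10:01:10 +0000] "GET /products.html HTTP/1.1" 200 2048',
    'malformed line',
]

LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    return {
        "ip": m.group("ip"),
        "path": m.group("path"),
        # %b expects English month abbreviations under the default C locale.
        "time": datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "status": int(m.group("status")),
    }


def page_hits(lines):
    """Count requests per page, skipping lines that cannot be parsed."""
    records = (parse_line(line) for line in lines)
    return Counter(r["path"] for r in records if r is not None)


def pages_by_visitor(lines):
    """Group the requested pages by client IP, in the order they appear."""
    visits = {}
    for record in filter(None, map(parse_line, lines)):
        visits.setdefault(record["ip"], []).append(record["path"])
    return visits
```

`page_hits` gives a simple popularity measure per page, while `pages_by_visitor` reconstructs a crude per-visitor navigation history; both start from the same log repository described above.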

    II. RELATIONSHIP OF WEB MINING WITH SOCIETY:

    Web Mining has played a significant role in meeting the daily challenges of society. In the present context, spheres of life such as Health, Politics and Entertainment would be hard to imagine without Web Mining. The social benefits of Web Mining are divided into the following areas:

    A. E-Learning: The learning process has undergone a remarkable change over the years. Gone are the days of traditional learning processes. The advent of Web Mining has given rise to E-Learning, making life much more comfortable and tech-savvy.

    B. E-Government: Web Mining and E-Government go hand in hand. The induction of this concept in E-Government has helped the Government take decisions regarding its policies. It has been a decision making tool for smoothing the implementation of Government policies and maintaining transparency at both national and international levels.

    C. Digital Libraries: Digital Libraries have made the life of the net-savvy person quite comfortable. He or she need not go to traditional libraries and search for books, journals, magazines, articles and research papers; a single click gives access to all these resources. Web Mining search leads to better services as far as Digital Libraries are concerned. Web mining applications are web-structure driven applications. Nowadays, it is a challenge to web mine dynamic Web Portals. Digital libraries will play a significant role in the Web Mining process in future. The web is a documented encyclopedia containing a vast amount of semi-structured information, which needs tools to structure and organize it, whereas digital libraries are experts in organizing, structuring and indexing digital content.


    D. Public Security and Crime Investigation: Faulty e-commerce websites, child pornography, online gambling, hacking and virus spreading are some of the areas in which web mining practices are used. Web information is classified into a white list, which contains all secure information, and a blacklist, which contains blocked information.
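The white list and blacklist classification can be sketched as a simple lookup. This is only an illustration under assumptions: the host names in both lists are invented, the helper `classify_url` is hypothetical, and a third label, "unknown", is added here for hosts that appear in neither list.

```python
from urllib.parse import urlparse

# Hypothetical lists; the paper does not say how such lists are built.
WHITELIST = {"example.edu", "library.example.org"}
BLACKLIST = {"bad-casino.example", "malware.example"}


def classify_url(url, whitelist=WHITELIST, blacklist=BLACKLIST):
    """Label a URL as 'blocked', 'secure' or 'unknown' by its host name."""
    host = (urlparse(url).hostname or "").lower()
    if host in blacklist:   # the blacklist takes precedence
        return "blocked"
    if host in whitelist:
        return "secure"
    return "unknown"
```

Checking the blacklist first is a deliberate choice in this sketch, so that a host mistakenly present in both lists is still blocked.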

    III. WEB MINING IMPACT ON BUSINESS PROFITABILITY:

    A. Product Preference:

    Nowadays, the customer is king. It is a customer-centric market. It is very important for any business to understand the customer's mind, tastes and preferences. Marketing plays a key role in keeping the business and the entrepreneur abreast of market trends. In order to sail through, the top management should be well aware of the target audience. If the business has no idea about the target audience, it is a Herculean job to survive in this cut-throat competition.

    The above site, Netflix.com, is a product recommendation site which works on the Web Mining concept. The site gives recommendations for movies based on rentals and user profiles, on movies previously recommended by the user, and on information shared with friends, as can be seen in the picture. Various web mining techniques such as Association, Classification and Clustering are used by such sites to improve customer experience and provide recommendations.
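To make the association idea concrete, here is a minimal sketch of recommendations based on how often titles are rented together. It is not a description of how Netflix or any real site works: the rental data, the co-occurrence scoring and the helper names `co_occurrence` and `recommend` are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical rental histories: user -> set of movie titles.
RENTALS = {
    "user_a": {"Movie 1", "Movie 2", "Movie 3"},
    "user_b": {"Movie 1", "Movie 2"},
    "user_c": {"Movie 2", "Movie 3"},
    "user_d": {"Movie 1", "Movie 4"},
}


def co_occurrence(rentals):
    """Count how often each pair of titles was rented by the same user."""
    pairs = defaultdict(Counter)
    for titles in rentals.values():
        for a, b in combinations(sorted(titles), 2):
            pairs[a][b] += 1
            pairs[b][a] += 1
    return pairs


def recommend(user, rentals, top_n=2):
    """Rank titles the user has not rented by co-occurrence with their rentals."""
    pairs = co_occurrence(rentals)
    seen = rentals[user]
    scores = Counter()
    for title in seen:
        for other, count in pairs[title].items():
            if other not in seen:
                scores[other] += count
    # Sort by score, then by title, so ties are resolved deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:top_n]]
```

For example, `recommend("user_b", RENTALS)` ranks "Movie 3" ahead of "Movie 4", because "Movie 3" was rented more often together with the two titles that user has already seen.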

    B. Trend Analysis:

    This is another method of Web Mining which is used in industry to facilitate decision making in organizations. In this technique, query words play an important role in establishing trends on products among the users issuing them. Search Engines have been focusing on analyzing trends in these query words to improve their results.

    The above figure depicts the trends in the keywords Web Mining and Business Computing. The news articles on the right side represent randomly selected news articles on these topics.
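A trend in query words can be approximated by counting, per period, how many queries contain a keyword. The sketch below assumes a hypothetical search log of (month, query) pairs; the data and the helper names `keyword_trend` and `is_rising` are illustrative and do not come from the paper or from any search engine.

```python
from collections import defaultdict

# Hypothetical search log: (month, query string).
QUERIES = [
    ("2012-05", "web mining tutorial"),
    ("2012-05", "business computing"),
    ("2012-06", "web mining tools"),
    ("2012-06", "web mining business"),
    ("2012-07", "web mining"),
    ("2012-07", "web mining crm"),
    ("2012-07", "web mining trends"),
]


def keyword_trend(queries, keyword):
    """Return {month: number of queries containing the keyword phrase}."""
    counts = defaultdict(int)
    needle = keyword.lower()
    for month, query in queries:
        if needle in query.lower():
            counts[month] += 1
        else:
            counts[month] += 0  # keep months with no matches in the result
    return dict(sorted(counts.items()))


def is_rising(trend):
    """True if the count strictly increases from each month to the next."""
    values = list(trend.values())
    return len(values) > 1 and all(a < b for a, b in zip(values, values[1:]))
```

With the sample log, the phrase "web mining" appears in one, two and then three queries over the three months, so `is_rising(keyword_trend(QUERIES, "web mining"))` holds, while the same check for "business" does not.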

    C. E-Business:

    It enables a business to target the right customers and to understand and analyze their requirements, tastes and preferences instantly. It uses these opportunities to promote the right products to the right customer and, most importantly, at the right time. As a result, the website is well promoted, leading to high sales and consequently higher profitability in the future. In such a case, the website becomes the channel for marketing.

    D. CRM:

    CRM stands for Customer Relationship Management. It is a widely implemented model for managing a company's interactions with customers, clients and sales prospects. It uses technology to organize, automate and synchronize business processes. The goal of CRM is to attract new clients and retain existing ones, leading to increased profitability of the organization. Measuring and emphasizing customer relationships are the two major concerns while implementing a CRM strategy. CRM solutions use intelligent systems to analyze data, identify demographic profiles, measure buying capacity and uncover other unknown behavioral patterns in customer data, and on that basis take decisions on behalf of organizations.

    E. HR Call Centres:

    As information is growing at an exponential rate, it becomes a Herculean task for the HR employees of a company to answer customer queries. Although all the information can be retrieved from the company website, it is difficult to find it among a huge number of documents.


    This problem gave rise to the creation of Internal HR Websites. These internal websites cater to the requirements of the organization's web users in every form so that they feel satisfied. This technique of gathering information involves both structural and conceptual characteristics of extracting information. The conceptual aspect emphasizes the logic used for creating the website, whereas the structural aspect emphasizes the navigational path followed for extracting information. Both characteristics play a significant role in measuring the efficiency of recommendation engines too. [5] Prasanna Desikan, 2009

    Fig. 4 HR recommendation: Recommendations provided to a user of the Employee Benefit Website.
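The structural aspect, that is, the navigational path users follow, can feed a very simple recommendation step. The sketch below is an assumption-laden illustration and not the method of [5]: the session data, the page names and the helpers `transition_counts` and `recommend_next` are hypothetical, and the technique shown is a plain count of which page most often follows the current one.

```python
from collections import Counter, defaultdict

# Hypothetical sessions on an internal HR site: ordered lists of visited pages.
SESSIONS = [
    ["home", "benefits", "health-plan"],
    ["home", "benefits", "retirement"],
    ["home", "benefits", "health-plan", "claims"],
    ["home", "payroll"],
]


def transition_counts(sessions):
    """Count how often each page is followed directly by each other page."""
    counts = defaultdict(Counter)
    for path in sessions:
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    return counts


def recommend_next(page, sessions, top_n=1):
    """Suggest the pages that most often follow the given page."""
    followers = transition_counts(sessions)[page]
    # Sort by count, then by page name, so ties are resolved deterministically.
    ranked = sorted(followers.items(), key=lambda kv: (-kv[1], kv[0]))
    return [p for p, _ in ranked[:top_n]]
```

A user currently on the "benefits" page would be pointed to "health-plan", since two of the three recorded paths through "benefits" continue there.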

    IV. CONCLUSIONS

    This paper stresses the significance of Web Mining and its impact on society. It explores the various applications of Web Mining processes, their techniques and their future scope. The paper argues that the concept of Web Mining is interdisciplinary and that Web Mining can be used by every professional irrespective of the domain he or she works in. It also underlines how web mining plays a vital role in the decision making of the organization. This is just the beginning. Web Mining has a long way to go.

    References

    [1] Padre Toms, S.J., Taipa: Web Structure Mining: An Introduction, Department of Computer and Information Science, Faculty of Science and Technology, University of Macau. Proceedings of the 2005 IEEE International Conference on Information Acquisition, June 27 - July 3, 2005, Hong Kong and Macau, China.

    [2] Georgios Lappas (2008), "An overview of web mining in societal benefit areas", Online Information Review, Vol. 32 Iss: 2, pp. 179-195.

    [3] Rajni Pamnani, Pramila Chawan: Web Usage Mining: A Research Area in Web Mining, Department of Computer Technology, VJTI University, Mumbai.

    [4] Pradnya Purandare: Web Mining: A Key to Improve Business on Web, Conference Proceeding: 01/2008; In proceeding of: IADIS European Conference on Data Mining 2008.

    [5] Prasanna Desikan, Colin DeLong, Sandeep Mane, Kalyan Beemanapalli, Kuo-Wei Hsu, Prasad Sriram, Jaideep Srivastava, Vamsee Venuturumilli: Web Mining for Business Computing, University of Minnesota, Emerald Group Publishing Limited, Howard House, Wagon Lane, Bingley BD16 1WA, UK, First Edition, 2009.
