cse509 lecture 1

Muhammad Atif Qureshi Web Science Research Group Institute of Business Administration (IBA) CSE509: Introduction to Web Science and Technology Lecture 1: Introduction

DESCRIPTION

Lecture 1 of CSE509: Web Science and Technology Summer Course

TRANSCRIPT

Page 1: CSE509 Lecture 1

Muhammad Atif Qureshi

Web Science Research Group

Institute of Business Administration (IBA)

CSE509: Introduction to Web Science and Technology

Lecture 1: Introduction

Page 2: CSE509 Lecture 1

Outline

What is Web Science?

Why Do We Need Web Science?

Implications of Web Science

CSE509 Administrivia

Course Contents

July 09, 2011

Page 3: CSE509 Lecture 1

Science of the Web

Why do we need Web Science as a research field? Because we need a systems-level understanding of the Web.

– Prof. Nigel Shadbolt, one of the pioneers of the Web Science program, University of Southampton

Introduction

Page 4: CSE509 Lecture 1

Web Science

Social and engineering dimensions (New York Times at launch of Web Science Program at Univ. of Southampton and MIT in 2006)

Extends well beyond traditional Computer Science

The Web isn’t about what you can do with computers. It’s people and, yes, they are connected by computers. But computer science, as the study of what happens in a computer, doesn’t tell you about what happens on the Web.

– Tim Berners-Lee, one of the founders of the WWW

Page 5: CSE509 Lecture 1

What is the Web?

A distributed document delivery system implemented through application-level protocols on the Internet

A tool for collaborative writing and community building

A framework of protocols that support e-commerce

A network of co-operating computers

A large, cyclic, directed graph made up of Web pages and links
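The graph view in the last item can be made concrete with a small sketch. The page names and the `toy_web` data below are hypothetical, chosen only for illustration; the code stores links as an adjacency list and uses a depth-first search to find one cycle, which shows why the Web graph is directed (links point one way) and not acyclic (pages can link back to each other).

```python
from typing import Dict, List, Optional

# Hypothetical toy Web: each key is a page, each value lists the pages it links to.
WebGraph = Dict[str, List[str]]

toy_web: WebGraph = {
    "home.html": ["about.html", "blog.html"],
    "about.html": ["home.html"],   # links back to home, forming a cycle
    "blog.html": ["post1.html"],
    "post1.html": [],
}


def find_cycle(graph: WebGraph) -> Optional[List[str]]:
    """Return one cycle as a list of pages (first page repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    color = {page: WHITE for page in graph}
    path: List[str] = []

    def visit(page: str) -> Optional[List[str]]:
        color[page] = GRAY
        path.append(page)
        for target in graph.get(page, []):
            state = color.get(target, WHITE)
            if state == GRAY:
                # Back edge: target is already on the current path, so we found a cycle.
                return path[path.index(target):] + [target]
            if state == WHITE:
                found = visit(target)
                if found:
                    return found
        path.pop()
        color[page] = BLACK
        return None

    for page in list(graph):
        if color[page] == WHITE:
            found = visit(page)
            if found:
                return found
    return None


print(find_cycle(toy_web))
```

On a real crawl the recursion would need to be replaced by an explicit stack, since link chains can be far deeper than Python's recursion limit.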

Page 6: CSE509 Lecture 1

Science (in a nutshell)

Existence: Does X exist?

Description and Classification: What is X like? What are its properties? How can it be categorized? How can we measure it? What are its components?

Descriptive Process: How does X work? What is the process by which X happens? What are the steps as X evolves? How does X achieve its purpose?

Descriptive-Comparative: How does X differ from Y?

Relationship: Are X and Y related? Do occurrences of X correlate with occurrences of Y?

Causality: Does X cause Y? Does X prevent Y? What causes X? What effect does X have on Y?

Design: What is an effective way to achieve X? How can we improve X?

Page 7: CSE509 Lecture 1

Perspectives of “Science”

Physical/biological science perspectives: Analytic disciplines that aim to find laws/processes that generate or explain observed phenomena

Social science perspective: Scholarly or scientific disciplines that deal with the study of human society and of individual relationships in and to society

Computer science perspective: Synthetic discipline that creates mechanisms (e.g., formalisms, algorithms, etc.) in order to support particular desired behavior

Page 8: CSE509 Lecture 1

Which Science Explains the Web?

Given: Neither the Web nor the world is static

The Web evolves in response to various pressures from science, commerce, the public, politics, etc.

Page 9: CSE509 Lecture 1

Web Science

The Web is a new technical and social phenomenon and a growing organism

The Web needs to be studied and understood as an entity in its own right

Web Science is a new field of science that involves a multi-disciplinary study and inquiry for the understanding of the Web and its relationships to us

Page 10: CSE509 Lecture 1

Why Web Science?

Dynamics and evolution

The “deep (or dark) Web”

Sampling, lack of complete enumeration

Scale (e.g., What is the percentage of Web pages updated daily?)

Search (e.g., What percentage of Web pages are indexed by search engines?)

Web topology

Artifacts of social interactions (blogs, etc.), Web sociology
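Because the Web cannot be enumerated completely, questions such as "what percentage of pages is updated daily?" have to be answered from samples. The sketch below is one possible way to do that, assuming a simple random sample of pages (which is itself hard to obtain on the Web) and using the standard normal-approximation interval for a proportion. The function name and the sample data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def estimate_proportion(sample: Sequence[bool], z: float = 1.96) -> Tuple[float, float, float]:
    """Estimate a proportion from a sample, with a normal-approximation interval.

    Each element of `sample` says whether one sampled page has the property
    (for example, "was updated in the last 24 hours"). z = 1.96 corresponds
    to an approximate 95% confidence level.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    p = sum(sample) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    # Clamp the interval to the valid range for a proportion.
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical sample: 20 of 100 sampled pages were updated today.
sampled_pages = [True] * 20 + [False] * 80
p, low, high = estimate_proportion(sampled_pages)
print(f"estimate={p:.3f}, interval=({low:.3f}, {high:.3f})")
```

The interval narrows only with the square root of the sample size, so halving the uncertainty takes roughly four times as many sampled pages.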

Importance

Page 11: CSE509 Lecture 1

Web Science vs. Computer Science

Metrics

Computer Science: Moore’s Law, O(n) algorithm analysis, Gigabytes

Web Science: Page views, Unique visitors/month, No. of songs/videos

Topics

Computer Science: Computer networks, Programming languages, Database systems, Operating systems, Compilers, Graphics

Web Science: Social networks, Relationships (users, web pages, etc.), Web 2.0 applications, E-*, Creating/sharing multimedia

Focus

Computer Science: Technology, Computers, HPC, Proficient programmers

Web Science: Applications, Users, Mobile interactivity, Universal accessibility

Page 12: CSE509 Lecture 1

What Could Scientific Theories for the Web Look Like?

Every page on the Web can be reached by following fewer than 10 links

The average number of words per search query is greater than 3

A Wikipedia page on average contains 0.03 false facts

The Web is a “scale-free” graph
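Statements like these are testable against data. As a rough sketch of how the first one could be checked, the code below runs a breadth-first search from every page of a small, hypothetical link graph and reports the largest number of links needed to get from one page to another. The graph, page names, and function names are illustrative assumptions; a real study would work on a crawl sample.

```python
from collections import deque
from typing import Dict, List


def link_distances(graph: Dict[str, List[str]], start: str) -> Dict[str, int]:
    """Breadth-first search: minimum number of links from `start` to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def max_link_distance(graph: Dict[str, List[str]]) -> int:
    """Largest shortest-path length over all pairs where the target is reachable."""
    return max(max(link_distances(graph, page).values()) for page in graph)


# Hypothetical four-page Web in which every page can reach every other page.
toy_web = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a", "d"],
    "d": ["a"],
}

all_reachable = all(len(link_distances(toy_web, p)) == len(toy_web) for p in toy_web)
print(all_reachable, max_link_distance(toy_web))
```

The claim holds for a graph only if every page is reachable from every other page and the reported maximum is below 10; the reachability check matters because an unreachable page simply never appears in the distance table.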

Page 13: CSE509 Lecture 1

Intersection of Disciplines

Page 14: CSE509 Lecture 1

The proper discipline of interest is not only Web Science, but “Web Science and Technology”

Page 15: CSE509 Lecture 1

Web’s Relation with Entrepreneurship

Web Science represents a pretty big next step in the evolution of information.  This kind of research is likely to have a lot of influence on the next generation of researchers, scientists and most importantly, the next generation of entrepreneurs who will build new companies from this.

– Eric Schmidt, ex-CEO, Google Inc.

Implication

Page 16: CSE509 Lecture 1

Web Science and Technology for Pakistan

The job market is heavily driven by Web solution technologies

Overseas companies such as Google, Yahoo, and Microsoft are investing heavily in it

Businesses are getting a substantial share of their business from the Web

Social media reaches far more people than traditional media

Page 17: CSE509 Lecture 1

Course Objectives

Gain insight into the future direction of the Web

Understand how technological changes affect the Web as a system

Learn design principles for complex Web applications and systems

Prepare for the new era of Web science and technology

Page 18: CSE509 Lecture 1

Course Information

Instructors: Muhammad Atif Qureshi, Arjumand Younus

Class Hours: Saturdays, 6:00 pm to 8:15 pm

Office Hours: Mondays, 1:00 pm to 3:00 pm

Evaluation: Assignments (50%), Mid-Term Exam (30%), Research Project (20%)

Page 19: CSE509 Lecture 1

Course Organization

Session One: Information Retrieval

Session Two: Large-Scale Web Mining

Session Three: Social Web Mining

Page 20: CSE509 Lecture 1

Information Retrieval

Principles and Theories behind Web Search Engines

Basic IR models, data structures and algorithms

Topic-based models

Link-based ranking

Search engine architecture
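One of the basic IR data structures this session is likely to cover is the inverted index, which maps each term to the documents containing it. The sketch below is a minimal illustration under simplifying assumptions (whitespace tokenization, lowercase matching, no ranking); the document IDs, texts, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_inverted_index(docs: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase term to the set of document IDs that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search_all_terms(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Boolean AND query: return the documents that contain every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


# Hypothetical three-document collection.
docs = {
    "d1": "web science studies the web",
    "d2": "computer science studies algorithms",
    "d3": "the web is a graph",
}
index = build_inverted_index(docs)
print(search_all_terms(index, "web science"))
```

Looking up a term costs one dictionary access regardless of collection size, which is why search engines index by term instead of scanning documents at query time.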

Page 21: CSE509 Lecture 1

Large-Scale Web Mining

MapReduce Design Patterns

Big data: larger amounts of data enable more useful applications

Algorithms using MapReduce

Distributed File Systems (GFS)
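The MapReduce programming model can be illustrated with the classic word-count example. The sketch below simulates the three phases (map, shuffle, reduce) in a single Python process; it is not a distributed implementation and does not use any real MapReduce framework, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(doc: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in doc.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(docs: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over a collection of documents."""
    pairs = (pair for doc in docs for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["the web is big", "the web grows"]))
```

In a real cluster the map calls run in parallel on different machines and the shuffle moves data over the network, but the mapper and reducer keep this same shape.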

There is substantial promise in this new paradigm of computing, but unwarranted hype by the media and popular sources threatens its credibility in the long run. In some ways, cloud computing is simply brilliant marketing

– Jimmy Lin, Twitter scientist and University of Maryland professor

Page 22: CSE509 Lecture 1

Social Web Mining

Social Web Crawling

Mining for Information in Social Networks: trend analysis, dynamics and evolution patterns, temporal analysis, community detection and analysis

Social Search

Page 23: CSE509 Lecture 1

EXAMPLE OF WEB SCIENCE PROJECT: Diff-IE

(courtesy Jaime Teevan, Microsoft Research)

Page 24: CSE509 Lecture 1

DISCUSSION
