
Post on 18-Dec-2015


ANALYSIS OF POLITICS AND INDUSTRY NEXUS: INDIA

Project Supervisor: Prof. Aaditeshwar Seth

Himanshu Sharma (2010CS50284)

Mayank Srivastava (2010CS10224)

OBJECTIVES

Extract information about the political-industry and intra-political nexus from newspapers and from available structured sources on the web.

Represent it in the form of a graph with nodes representing entities and edges representing the relation between entities.

Analyze the graph obtained, rank the entities, and find correlation between news in different newspapers.
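As a rough illustration of the graph described above, the sketch below builds an undirected weighted graph from per-article entity lists. It assumes that two entities are "related" when they appear in the same article; the slides do not say how relations are actually derived, so the co-occurrence rule, the function name `build_cooccurrence_graph`, and the sample entity names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(articles):
    """Build an undirected weighted graph from per-article entity lists.

    Nodes are entity names. The weight of an edge counts the articles in
    which both entities appear (co-occurrence is an assumed stand-in for
    a "relation"; the original system may define relations differently).
    """
    graph = defaultdict(lambda: defaultdict(int))
    for entities in articles:
        # Deduplicate so one article adds at most 1 to any edge weight.
        for a, b in combinations(sorted(set(entities)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    # Convert to plain dicts so missing keys are not created on lookup.
    return {node: dict(neighbours) for node, neighbours in graph.items()}


# Hypothetical entity lists, one per news article.
articles = [
    ["Politician A", "Company X"],
    ["Politician A", "Politician B", "Company X"],
    ["Politician B", "Company Y"],
]
graph = build_cooccurrence_graph(articles)
```

An adjacency dictionary of this shape, `{node: {neighbour: weight}}`, is enough for ranking algorithms that walk the graph edge by edge.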

IMPLEMENTATION

Structured information collected from netapedia.in, myneta.info, PPPIndia.com and capitaline.info.

Continuous RSS feed collection from different newspapers.

Processing of the news through an NLP tool, OpenCalais.

Storing the information in database tables, filtering it, and ranking the entities.
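The collection and storage steps above could look roughly like the following sketch, which parses RSS items from an XML string and stores them in a database table. It uses only the Python standard library. The table name `news`, its columns, the sample feed, and the choice of SQLite are assumptions for illustration; the slides do not name the database or schema, and the OpenCalais processing step is omitted here.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical feed content; a real collector would download this text
# from each newspaper's RSS URL at regular intervals.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <title>Example Paper</title>
  <item><title>Headline one</title><link>http://example.com/1</link></item>
  <item><title>Headline two</title><link>http://example.com/2</link></item>
</channel></rss>"""


def parse_rss_items(xml_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def store_items(conn, source, items):
    """Insert items into an assumed `news` table, skipping links already seen."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news "
        "(source TEXT, title TEXT, link TEXT UNIQUE)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO news (source, title, link) VALUES (?, ?, ?)",
        [(source, title, link) for title, link in items],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
items = parse_rss_items(SAMPLE_RSS)
store_items(conn, "Example Paper", items)
store_items(conn, "Example Paper", items)  # a repeated poll adds no rows
row_count = conn.execute("SELECT COUNT(*) FROM news").fetchone()[0]
```

The `UNIQUE` constraint on `link` is one simple way to make continuous polling idempotent, since the same item usually stays in a feed across several polls.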

SYSTEM IN DETAIL

RANKING OF ENTITIES

Ranking entities using an exponential moving average of their occurrences (called Fame from here on), updated on a per-occurrence basis: high sensitivity to changing news; entities prominent in current news rise while less prominent ones fall.

Ranking using the PageRank algorithm with Fame (the exponential moving average) as the personalization vector: low sensitivity to changing news; shows the overall influence of an entity in the network.
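The two rankings above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the update rule `fame = alpha * count + (1 - alpha) * fame`, the smoothing factor `alpha=0.3`, the damping factor `0.85`, and the function names are all assumed, since the slides give no formula or parameter values. PageRank is written as a plain power iteration over a `{node: {neighbour: weight}}` graph.

```python
def update_fame(fame, counts, alpha=0.3):
    """One exponential-moving-average step over a batch of occurrence counts.

    Entities absent from `counts` decay toward zero; entities that occur
    are pulled toward their current count.
    """
    entities = set(fame) | set(counts)
    return {
        e: alpha * counts.get(e, 0) + (1 - alpha) * fame.get(e, 0.0)
        for e in entities
    }


def personalized_pagerank(graph, personalization, damping=0.85, iterations=100):
    """Power-iteration PageRank with a personalization (teleport) vector."""
    nodes = sorted(set(graph) | {m for nbrs in graph.values() for m in nbrs})
    total = sum(personalization.get(n, 0.0) for n in nodes)
    if total > 0:
        p = {n: personalization.get(n, 0.0) / total for n in nodes}
    else:
        # No usable personalization: fall back to uniform teleportation.
        p = {n: 1.0 / len(nodes) for n in nodes}
    rank = dict(p)
    for _ in range(iterations):
        new_rank = {n: (1 - damping) * p[n] for n in nodes}
        for n in nodes:
            nbrs = graph.get(n, {})
            out_weight = sum(nbrs.values())
            if out_weight == 0:
                # Dangling node: spread its rank along the teleport vector.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * p[m]
            else:
                for m, w in nbrs.items():
                    new_rank[m] += damping * rank[n] * w / out_weight
        rank = new_rank
    return rank


# Two batches of hypothetical occurrence counts.
fame = update_fame({}, {"A": 3, "B": 1})
fame = update_fame(fame, {"B": 4})

# Hypothetical entity graph: B and C are each linked only to A.
graph = {"A": {"B": 1, "C": 1}, "B": {"A": 1}, "C": {"A": 1}}
ranks = personalized_pagerank(graph, fame)
```

In this toy graph B and C have identical links, so any difference between their PageRank values comes only from the Fame-based teleport vector, which shows how the personalization carries news prominence into the network ranking.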

CORRELATION BETWEEN NEWSPAPERS

Used Spearman’s rank correlation coefficient.

High correlation when entities are ranked using PageRank values.

Correlation coefficients as on 1st March (with respect to the overall data):

DNA (Business Section): 0.99118
Hindustan Times: 0.99147
DNA (Political Section): 0.99290
The Times of India: 0.99305
The Hindu: 0.99336
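For reference, Spearman's coefficient is the Pearson correlation of the two rank lists. The sketch below computes it for two score dictionaries (for example, one newspaper's entity scores against the overall scores). Restricting the comparison to entities present in both dictionaries is an assumption, as the slides do not say how entities missing from one newspaper are handled; the sample scores are made up.

```python
def average_ranks(values):
    """Rank values from 1 (highest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(scores_a, scores_b):
    """Spearman's rho over the entities present in both score dicts."""
    common = sorted(set(scores_a) & set(scores_b))
    ranks_a = average_ranks([scores_a[e] for e in common])
    ranks_b = average_ranks([scores_b[e] for e in common])
    return pearson(ranks_a, ranks_b)


# Hypothetical scores: same ordering, then a fully reversed ordering.
overall = {"A": 0.40, "B": 0.35, "C": 0.25}
same_order = {"A": 0.50, "B": 0.30, "C": 0.20}
reversed_order = {"A": 0.10, "B": 0.20, "C": 0.70}
rho_same = spearman(same_order, overall)
rho_reversed = spearman(reversed_order, overall)
```

Because only the ordering matters, `same_order` correlates almost perfectly with `overall` even though the raw scores differ, while `reversed_order` gives a coefficient close to -1.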

CORRELATION BETWEEN NEWSPAPERS

Low correlation when entities are ranked by Fame values.

Correlation coefficients as on 1st March (with respect to the overall data):

DNA (Business Section): 0.33939
Hindustan Times: 0.41778
DNA (Political Section): 0.52837
The Times of India: 0.54673
The Hindu: 0.57951

Low correlation suggests that newspapers are biased in which entities they emphasize in day-to-day coverage.

MORE ON CORRELATION

Plotted week-to-week correlation between newspapers:

Higher correlation between DNA (Business Section) and DNA (Political Section).

The Hindu shows a slightly lower correlation with Hindustan Times and The Times of India, showing some “different news from Times”.

Plotted inter-week correlation coefficients for each newspaper: these mostly vary between 0.2 and 0.4.

Increased the time duration to see the longevity of news: correlation values reach an asymptotic value of around 0.15 for political newspapers.

MORE ON CORRELATION

For DNA (Business Section), the correlation touches 0.05.

DNA (Business Section) has the lowest maximum longevity: it frequently switches news.

Longevity is lower in general for The Hindu and Hindustan Times than for DNA (Political Section) and The Times of India.

DNA (Political Section) and TOI cling to the same news and repeat it over a prolonged duration, while HT and The Hindu prefer to switch news.
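One way to read the longevity measurements above is as a lagged correlation: compare a newspaper's entity ranking in week t with its own ranking in week t + lag, and average over all t. The sketch below follows that reading, which is an assumption about how the plots were produced. It also assumes no tied scores, so that the shortcut formula rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) applies; the function names and weekly scores are hypothetical.

```python
def spearman_no_ties(scores_a, scores_b):
    """Spearman's rho via the rank-difference formula (assumes no ties).

    Returns None when fewer than two entities are shared.
    """
    common = sorted(set(scores_a) & set(scores_b))
    n = len(common)
    if n < 2:
        return None

    def rank_of(scores):
        ordered = sorted(common, key=lambda e: scores[e], reverse=True)
        return {e: i + 1 for i, e in enumerate(ordered)}

    ranks_a, ranks_b = rank_of(scores_a), rank_of(scores_b)
    d2 = sum((ranks_a[e] - ranks_b[e]) ** 2 for e in common)
    return 1 - 6 * d2 / (n * (n * n - 1))


def lagged_correlation(weekly_scores, max_lag):
    """Mean correlation between week t and week t + lag, for each lag."""
    result = {}
    for lag in range(1, max_lag + 1):
        rhos = []
        for t in range(len(weekly_scores) - lag):
            rho = spearman_no_ties(weekly_scores[t], weekly_scores[t + lag])
            if rho is not None:
                rhos.append(rho)
        if rhos:
            result[lag] = sum(rhos) / len(rhos)
    return result


# Hypothetical per-week Fame scores for one newspaper.
weekly = [
    {"A": 3, "B": 2, "C": 1},
    {"A": 3, "B": 1, "C": 2},
    {"A": 1, "B": 2, "C": 3},
]
curve = lagged_correlation(weekly, max_lag=2)
```

A curve that falls quickly as the lag grows would correspond to a newspaper that switches news frequently, while a curve that stays high would correspond to one that repeats the same news over a longer period.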

BIAS BY NEWSPAPERS: EXAMPLES

In August 2012, TOI gives a lot of emphasis to Nitish Kumar, while The Hindu chooses to neglect him.

In mid-March 2013, The Hindu, Hindustan Times and DNA (Political Section) give a lot of emphasis to Manmohan Singh, but The Times of India gives him less importance. Instead, it carries a number of news items pertaining to Xi Jinping, while the rest ignore him.

TIMELINES SHOWING SOME IMPORTANT ENTITIES

Hindustan Times

TIMELINES SHOWING SOME IMPORTANT ENTITIES

The Hindu

TIMELINES SHOWING SOME IMPORTANT ENTITIES

The Times of India

TIMELINES SHOWING SOME IMPORTANT ENTITIES

DNA (Political Section)

TIMELINES SHOWING BIAS WITH POLITICAL PARTIES

Hindustan Times

TIMELINES SHOWING BIAS WITH POLITICAL PARTIES

The Times of India

TIMELINES SHOWING BIAS WITH POLITICAL PARTIES

DNA (Political Section)

TIMELINES SHOWING BIAS WITH POLITICAL PARTIES

The Hindu

CONCLUSIONS

The most important parts of news are shown almost equally by all newspapers.

Newspapers are generally biased in showing the less important components of the news.

Newspapers are generally biased in their coverage of regional parties:

Janata Dal (United) is given preference by TOI and DNA, while it is ignored by The Hindu.

Both Samajwadi Party and Akhilesh Yadav are very clearly avoided by Hindustan Times.

CPI is closely followed by The Hindu, while Shiv Sena is avoided by it.

REFERENCES

www.visualdataweb.org/relfinder.php
www.mpi-inf.mpg.de/yago-naga/yago
www.dbpedia.org
www.opencalais.com
www.wikipedia.org
www.myneta.info
www.netapedia.in
www.semanticproxy.com
“Identifying Influencers in Social Networks” by Kushal Dave, Rushi Bhatt, Vasudeva Varma.

Thank You