INTEGRATION OF CULTURAL IDENTITY IN SOCIAL NETWORKS
Author: Ionel‐Bujorel PĂVĂLOIU
Scientific advisor: Acad. Paul CRISTEA
Work carried out within the project “Valorization of cultural identities within global processes”, co‐financed by the European Social Fund through the Sectoral Operational Programme Human Resources Development 2007 – 2013, financing contract no. POSDRU/89/1.5/S/59758. The titles and the intellectual and industrial property rights over the results obtained during the postdoctoral research fellowship belong to the Romanian Academy.
The views expressed in this work are those of the author and do not commit the European Commission or the Romanian Academy, the beneficiary of the project.
Free copy. Sale in Romania or abroad is prohibited.
Reproduction, even partial and on any medium, is permitted only with the prior consent of the Romanian Academy.
ISBN 978‐973‐167‐192‐5 Legal deposit: 2nd quarter 2013
Editura Muzeului Național al Literaturii Române
AULA MAGNA Collection
Contents
1. CULTURAL IDENTITY AND SOCIAL NETWORKS .......... 11
1.1. Scope of the work .......... 11
1.2. National identity .......... 11
1.3. Social networks .......... 12
1.4. Integration of national identity in social networks .......... 14
2. MODELING AND ANALYZING SOCIAL NETWORKING SITES .......... 19
2.1. Social networking sites .......... 19
2.1.1. Social structures .......... 19
2.1.2. Social networks .......... 19
2.1.3. Sociometry .......... 19
2.1.4. Social networking sites .......... 20
2.1.5. Social networking sites in Romania .......... 21
2.1.6. Romanian social networking sites .......... 22
2.2. The modeling of social networks .......... 23
2.2.1. Matrix representation .......... 23
2.2.2. Representation with graphs .......... 26
2.3. Social network analysis .......... 30
2.3.1. Network size .......... 31
2.3.2. Network density and degree .......... 31
2.3.3. The degree of an actor .......... 31
2.3.4. Minimum path .......... 32
2.3.5. Distance between actors .......... 32
2.3.6. Cliques .......... 33
2.3.7. Centrality .......... 33
2.4. Graphical representation of social networks .......... 34
2.4.1. Parameters .......... 34
2.5. Particularities of social networking sites .......... 34
2.5.1. Structural analysis .......... 35
2.5.2. Content‐based analysis .......... 35
2.5.3. Dynamic analysis .......... 36
2.6. Features of an application dedicated to the analysis of social networking sites .......... 36
2.7. Common applications for the study of social networks .......... 36
3. SOCIAL MEDIA .......... 38
3.1. Components .......... 38
3.1.1. Social networking platforms .......... 39
3.1.2. Blogs .......... 39
3.1.3. Microblogs .......... 40
3.1.4. Collaborative projects .......... 40
3.1.5. Content communities .......... 42
3.1.6. Virtual worlds .......... 44
3.2. Technologies .......... 44
3.2.1. Online discussions .......... 44
3.2.2. Social networking sites .......... 45
3.3. Social media features .......... 46
3.3.1. Interactivity .......... 46
3.3.2. Emphasis on the user .......... 46
3.3.3. Emphasis on the group .......... 46
3.3.4. Links .......... 47
3.3.5. Accessibility .......... 48
3.3.6. Audience .......... 48
3.3.7. Emotional factor .......... 48
3.3.8. Promptitude .......... 48
3.3.9. Autonomy .......... 49
3.3.10. Flexibility .......... 49
3.3.11. Practicability .......... 50
3.3.12. Quality .......... 50
3.3.13. Quantity .......... 50
3.3.14. Persistence .......... 50
3.4. Social media in Romania .......... 51
4. BLOGOSPHERE .......... 52
4.1. Importance of blogs .......... 53
4.1.1. Socialization .......... 53
4.1.2. Freedom of expression .......... 54
4.1.3. Notoriety .......... 54
4.1.4. Information .......... 54
4.1.5. Financial gain .......... 55
4.1.6. Help .......... 56
4.2. Structure of a blog .......... 56
4.2.1. Header and logo .......... 56
4.2.2. Blog footer .......... 56
4.2.3. The central area .......... 57
4.2.4. The blog post .......... 57
4.2.5. Side bars .......... 59
4.2.6. Navigation components .......... 59
4.3. HTML elements .......... 60
4.3.1. Document tagging .......... 60
4.3.2. Language code and character set .......... 61
4.3.3. Title and chapter tags .......... 62
4.3.4. Text tags .......... 63
4.3.5. List tags .......... 63
4.3.6. Blog list tags .......... 63
4.3.7. Link tags .......... 64
4.4. Particularities of blogs as web pages .......... 65
4.4.1. Backlinks .......... 65
4.5. Romanian blogosphere .......... 66
4.5.1. Personal blogs .......... 67
4.5.2. Domain blogs .......... 69
4.5.3. Company blogs .......... 69
4.5.4. Collections of blogs .......... 70
4.5.5. Forums .......... 70
5. DATA MINING IN SOCIAL MEDIA .......... 71
5.1. Extracting information from blogs .......... 71
5.2. Crawlers .......... 71
5.2.1. How crawlers work .......... 72
5.2.2. Extraction of blogosphere networks .......... 77
5.3. Unsupervised clustering methods .......... 77
5.4. Text classification .......... 77
5.4.1. Supervised classification methods .......... 77
5.4.2. Bayesian classification .......... 78
5.4.3. Classification using the K‐nearest neighbors .......... 80
5.4.4. Support vector machines .......... 81
5.4.5. Term graph model .......... 82
5.4.6. Classification based on centroids .......... 82
5.4.7. Decision trees .......... 83
5.4.8. Classification using neural networks .......... 85
5.5. Sentiment analysis .......... 86
5.6. Temporal evolution .......... 86
6. NEURAL NETWORK CLASSIFIERS .......... 87
6.1. Advantages and disadvantages .......... 87
6.2. Classical neural networks .......... 88
6.3. Connectionist systems ‐ neural networks .......... 89
6.3.1. First generation of neural networks .......... 94
6.3.2. Second generation of neural networks .......... 95
6.3.3. Third generation of neural networks .......... 98
6.4. Complex‐valued neural networks .......... 104
6.4.1. Universal binary neuron .......... 104
6.4.2. Phase‐based neuron .......... 106
6.4.3. Implementing the XOR function .......... 106
6.4.4. Training the PBN .......... 108
6.4.5. Error minimization for the PBN .......... 109
6.4.6. Phase‐based spiking neuron .......... 118
7. SOCIAL MEDIA ANALYSIS .......... 122
7.1. Getting information from social media .......... 122
7.2. Data collection .......... 122
7.2.1. General features .......... 122
7.2.2. Getting text information from blogs .......... 124
7.3. Text processing .......... 130
7.3.1. Removing stop words .......... 130
7.3.2. Extracting word stems .......... 132
7.4. Social media analysis .......... 134
7.4.1. Characteristics required of applications dedicated to social media analysis .......... 134
7.4.2. Database use .......... 134
7.5. Integration of cultural identity in social networks .......... 140
7.5.1. Blogosphere in Romania .......... 140
7.5.2. Language and character set .......... 140
7.5.3. Blog classification .......... 141
7.5.4. Integration of cultural identity in the blogosphere .......... 155
8. CONCLUSION .......... 157
8.1. Description of the results .......... 157
8.2. Original contributions to the research .......... 157
8.2.1. Conceptual development .......... 157
8.2.2. Methodologies and tools for analysis and applied research .......... 158
8.3. Possible impact of the research in the field studied .......... 159
9. GLOSSARY .......... 161
10. BIBLIOGRAPHY .......... 167
ADDENDA .......... 182
ABSTRACT .......... 182
SUMMARY .......... 185
ADDENDA
Abstract
Integration of Cultural Identity in Social Networks
This work was carried out under the project entitled “Valorization of cultural identities within global processes”, contract identification number POSDRU/89/1.5/S/59758, whose main beneficiary and promoter is the Romanian Academy.
I would like to thank the expert coordinator, Prof. Paul Dan Cristea, for his continuous support and guidance in research, and Professors Eugen Simion and Valeriu Franc for the opportunity to explore a new and promising field. I also owe thanks to Professors Asoke Nandi of Liverpool University and Adrian Munteanu of Vrije Universiteit of Brussels for the warmth with which I was received in their research groups and for the advice given during the documentation and research stages carried out abroad.
The research addresses a topical issue in Romanian society: the transition to a knowledge society, which is a fundamental component of the information society. One powerful manifestation of this transition is the virtual social network (hereafter simply “social network”): an Internet site that lets users create their own profiles and share information or socialize on the basis of common preferences. Social networks directly amplify relationships between people and contribute to globalization, and fears have been expressed that they are an important factor in the dissipation and degradation of national identity.
The project aims to study how cultural identity is integrated in the social networks. The three main project components are:
1. Classification of information (from messages, blogs, posts, hits) characteristic of social networks and the Internet into categories related to the representation and visibility of culture and cultural identity.
2. Representation, analysis and modeling of virtual social networks. The main actors (network members with a large number of strong connections, i.e. audience and authority) and the main issues (those broadcast over a large number of connections) are identified.
3. Tracking the evolution of the topics that reflect the cultural identity in social networks. We use techniques of web mining (using crawlers to retrieve the data into a database) and text classification to classify and to track the dynamics of these topics.
Social networks are an important factor of globalization, and fears have been expressed that they will lead to the dissipation and degradation of national identity. The thesis of this work is that cultural identity has an immanent and immutable character in social networks, which are regarded as one of the various manifestations of the Romanian people. The work seeks the features of social networking that allow greater visibility and impact for topics that support cultural identity.
The research validates the assumption of cultural identity integration in the social networks created around blogs. Here are some arguments:
‐ 450 of the approximately 12,000 blogs in the database, or about 4%, have the name of the national poet, “Eminescu”, on the first page.
‐ The stem “roman” (the root of the word for “Romanian” in the native language) is the fourth most frequent word stem.
‐ Only about 1% of the blogs use the English language, although many paragraphs are in this language, mainly for convenience.
‐ 4,500 blogs, or about 38%, use the Romanian language with diacritics, though not fully.
‐ Of the 40 categories of blogs, “Religion and Spirituality” accounts for about 4.8% (second, after “Personal journal”), “Literature” is third with 4.2%, and “Society and Culture” is in 12th place with 2.7%.
‐ Classifying the blogs by their position toward the national culture shows that 8.5% of the blogs actively campaign in favor of the national culture, 23.7% support it, 55.6% are neutral and 12.2% are detrimental to it.
‐ Well‐represented categories such as “Activism”, “Art”, “Literature”, “Music”, “Politics”, “Religion and Spirituality” and “Society and Culture” clearly favor the national culture, with ratios ranging from 4:1 to more than 45:1 between blogs labeled “Militant” or “In favor of” the national culture and those labeled “Against”.
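The percentages quoted in the list above can be cross‐checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded figures from this abstract (a total of about 12,000 blogs is an approximation), and the variable and function names are assumptions introduced here, not part of the original study.

```python
# Figures as quoted in the abstract; the total is approximate ("approximately 12,000").
TOTAL_BLOGS = 12_000
eminescu_blogs = 450
diacritics_blogs = 4_500


def share(count, total=TOTAL_BLOGS):
    """Return count as a percentage of total."""
    return 100.0 * count / total


eminescu_pct = share(eminescu_blogs)      # quoted in the text as "about 4%"
diacritics_pct = share(diacritics_blogs)  # quoted in the text as "about 38%"

# Position toward the national culture, in percent of classified blogs.
stance = {"militant": 8.5, "in favor": 23.7, "neutral": 55.6, "against": 12.2}
stance_total = sum(stance.values())

# Overall ratio of supportive ("militant" + "in favor") to "against" blogs.
support_ratio = (stance["militant"] + stance["in favor"]) / stance["against"]

print(f"Eminescu: {eminescu_pct:.2f}%  diacritics: {diacritics_pct:.1f}%")
print(f"stance total: {stance_total:.1f}%  support:against = {support_ratio:.2f}:1")
```

On these figures the four position shares add up to 100%, and the overall ratio of supportive to opposing blogs is roughly 2.6:1, so the 4:1 to 45:1 ratios cited for individual categories sit well above the blogosphere‐wide value, assuming the category ratios are computed the same way.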
Summary
1. CULTURAL IDENTITY AND SOCIAL NETWORKS................................ 11
1.1. Scope of the work....................................................................................11
1.2. National identity .....................................................................................11
1.3. Social networks........................................................................................12
1.4. Integration of national identity in social networks ............................14
2. MODELING AND ANALYZING SOCIAL NETWORKING SITES .......... 19
2.1. Social networking sites .......... 19
2.1.1. Social structures .......... 19
2.1.2. Social networks .......... 19
2.1.3. Sociometry .......... 19
2.1.4. Social networking sites .......... 20
2.1.5. Social networking sites in Romania .......... 21
2.1.6. Romanian social networking sites .......... 22
2.2. The modeling of social networks..........................................................23
2.2.1. Matrix representation ....................................................................23
2.2.2. Representation with graphs .........................................................26
2.3. Social network analysis ..........................................................................30
2.3.1. Network size...................................................................................31
2.3.2. Network density and degree........................................................31
2.3.3. The degree of an actor ...................................................................31
2.3.4. Minimum path ...............................................................................32
2.3.5. Distance between actors................................................................32
2.3.6. Cliques .............................................................................................33
2.3.7. Centrality.........................................................................................33
2.4. Graphical representation of social networks.......................................34
2.4.1. Parameters ......................................................................................34
2.5. Particularities of social networking sites .......... 34
2.5.1. Structural analysis..........................................................................35
2.5.2. Content based analysis..................................................................35
2.5.3. Dynamic analysis ...........................................................................36
2.6. Features of an application dedicated to the analysis of social networking sites .......... 36
2.7. Common applications for the study of social networks....................36
3. SOCIAL MEDIA............................................................................................... 38
3.1. Components.............................................................................................38
3.1.1. Social networking platforms ........................................................39
3.1.2. Blogs.................................................................................................39
3.1.3. Microblogs ......................................................................................40
3.1.4. Collaborative projects....................................................................40
3.1.5. Content communities ....................................................................42
3.1.6. Virtual Worlds................................................................................44
3.2. Technologies ............................................................................................44
3.2.1. Online discussions .........................................................................44
3.2.2. Social networking sites .......... 45
3.3. Social media features ..............................................................................46
3.3.1. Interactivity.....................................................................................46
3.3.2. Emphasis on user ...........................................................................46
3.3.3. Emphasis on group........................................................................46
3.3.4. Links.................................................................................................47
3.3.5. Accessibility ....................................................................................48
3.3.6. Audience .........................................................................................48
3.3.7. Emotional factor .............................................................................48
3.3.8. Promptitude....................................................................................48
3.3.9. Autonomy .......................................................................................49
3.3.10. Flexibility.......................................................................................49
3.3.11. Practicability .................................................................................50
3.3.12. Quality ...........................................................................................50
3.3.13. Quantity.........................................................................................50
3.3.14. Persistence.....................................................................................50
3.4. Social media in Romania........................................................................51
4. BLOGOSPHERE............................................................................................... 52
4.1. Importance of blogs ................................................................................53
4.1.1. Socialization....................................................................................53
4.1.2. Freedom of expression ..................................................................54
4.1.3. Notoriety .......... 54
4.1.4. Information .....................................................................................54
4.1.5. Financial gain..................................................................................55
4.1.6. Help..................................................................................................56
4.2. Structure of a blog...................................................................................56
4.2.1. Header and logo.............................................................................56
4.2.2. Blog footer.......................................................................................56
4.2.3. The central area .......... 57
4.2.4. The blog post .......... 57
4.2.5. Side bars ..........................................................................................59
4.2.6. Navigation components................................................................59
4.3. HTML elements.......................................................................................60
4.3.1. Document tagging .......... 60
4.3.2. Language code and character set .......... 61
4.3.3. Title and chapter tags .......... 62
4.3.4. Text tags .......... 63
4.3.5. List tags .......... 63
4.3.6. Blog list tags .......... 63
4.3.7. Link tags .......... 64
4.4. Particularities of blogs as web pages....................................................65
4.4.1. Backlinks .........................................................................................65
4.5. Romanian blogosphere...........................................................................66
4.5.1. Personal blogs.................................................................................67
4.5.2. Domain Blogs .................................................................................69
4.5.3. Company blogs ..............................................................................69
4.5.4. Collections of blogs........................................................................70
4.5.5. Forums.............................................................................................70
5. DATA MINING IN SOCIAL MEDIA ........................................................... 71
5.1. Extracting information from blogs .......... 71
5.2. Crawlers ...................................................................................................71
5.2.1. How crawlers work .......... 72
5.2.2. Extraction of blogosphere networks............................................77
5.3. Unsupervised clustering methods........................................................77
5.4. Text Classification ...................................................................................77
5.4.1. Supervised classification methods...............................................77
5.4.2. Bayesian Classification..................................................................78
5.4.3. Classification using the K‐nearest neighbors .......... 80
5.4.4. Support vector machines ..............................................................81
5.4.5. Term graph model .......... 82
5.4.6. Classification based on centroids.................................................82
5.4.7. Decision trees..................................................................................83
5.4.8. Classification using neural networks ..........................................85
5.5. Sentiment Analysis .................................................................................86
5.6. Temporal evolution ................................................................................86
6. NEURAL NETWORK CLASSIFIERS............................................................ 87
6.1. Advantages and disadvantages ............................................................87
6.2. Classical neural networks ......................................................................88
6.3. Connectionist systems ‐ neural networks............................................89
6.3.1. First Generation of neural networks ...........................................94
6.3.2. Second generation of neural networks .......................................95
6.3.3. Third generation of neural networks .......... 98
6.4. Complex‐valued neural networks ......................................................104
6.4.1. Universal Binary Neuron............................................................104
6.4.2. Phase‐Based Neuron....................................................................106
6.4.3. Implementing the XOR function .......... 106
6.4.4. Training the PBN .......... 108
6.4.5. Error minimization for the PBN .......... 109
6.4.6. Phase Based Spiking Neuron .....................................................118
7. SOCIAL MEDIA ANALYSIS........................................................................ 122
7.1. Getting information from social media..............................................122
7.2. Data Collection ......................................................................................122
7.2.1. Features .........................................................................................122
7.2.2. Getting text information from blogs .........................................124
7.3. Text Processing......................................................................................130
7.3.1. Removing stop words .................................................................130
7.3.2. Extracting stem words.................................................................132
7.4. Social Media Analysis ..........................................................................134
7.4.1. Characteristics required of applications dedicated to social media analysis .......... 134
7.4.2. Database use .................................................................................134
7.5. Integration of cultural identity in social networks...........................140
7.5.1. Blogosphere in Romania .............................................................140
7.5.2. Language and character set ........................................................140
7.5.3. Blogs Classification .....................................................................141
7.5.4. Integration of cultural identity in blogosphere .......................155
8. CONCLUSION............................................................................................... 157
8.1. Description of the results .....................................................................157
8.2. Original contributions to the research .......... 157
8.2.1. Conceptual development............................................................157
8.2.2. Methodologies and tools for analysis and applied research.........................................................................................158
8.3. Possible impact of the research in the field .......................................159
9. GLOSSARY ..................................................................................................... 161
10. BIBLIOGRAPHY .......................................................................................... 167