BayesiaLab Satisfaction Poll Analysis

Data analysis – satisfaction poll

Upload: avdesh-kothari

Post on 27-Oct-2015


Page 1: BayesiaLab Satisfaction Poll Analysis

Data analysis – satisfaction poll

Page 2: BayesiaLab Satisfaction Poll Analysis

In this part we present how to characterize global satisfaction and how to view all interactions between variables.

Page 3: BayesiaLab Satisfaction Poll Analysis

The data is contained in a text file (CSV).

Page 4: BayesiaLab Satisfaction Poll Analysis

There is a title line.

The separator is a semicolon.

The import wizard automatically detects the file separator and the title line.
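Outside BayesiaLab, the same file layout (a title line followed by semicolon-separated values) can be read with a few lines of Python. This is only a minimal sketch: the function name `read_poll_csv` and the `ID` column are hypothetical, and the sample rows are invented for illustration.

```python
import csv
import io


def read_poll_csv(text, delimiter=";"):
    """Parse delimited text whose first line holds the column titles."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Invented sample: a title line, then two responses (one with a missing mark).
sample = "ID;Satisfaction;SE1\n1;8;7\n2;;5\n"
rows = read_poll_csv(sample)
print(rows[0]["Satisfaction"])
```

Here the separator is passed explicitly rather than detected, which keeps the behaviour predictable on small files.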

Page 5: BayesiaLab Satisfaction Poll Analysis

The first column is an identifier. Since this information is not useful for the analysis, the column is greyed out: it is unused.

Page 6: BayesiaLab Satisfaction Poll Analysis

The file contains missing data. Each missing value is replaced by the average of the values present in the same column.

Information about the data is displayed here. This dataset gathers 711 poll responses.
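The replacement rule described above (a missing value takes the average of the values present in its column) can be sketched as follows. The function name `impute_with_mean` and the sample marks are hypothetical; missing entries are represented by `None`.

```python
def impute_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("column has no observed values")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


# Invented column of marks with two missing answers.
marks = [8, None, 6, 10, None]
filled = impute_with_mean(marks)
print(filled)
```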

Page 7: BayesiaLab Satisfaction Poll Analysis

Discretizing continuous values

The variables represent evaluation marks from 1 to 10. Manual discretization displays the cumulative distribution function of the selected continuous variable.

Page 8: BayesiaLab Satisfaction Poll Analysis

Generating an equal-distance discretization with three intervals leads to this graph.
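As an illustration of equal-distance discretization, the sketch below splits the 1 to 10 mark range into three intervals of equal width. It assumes intervals closed on the right, which gives the cut points 4 and 7 and agrees with the modality names `<=4`, `<=7` and `>7` that appear later in the exported dictionary. The function names are hypothetical.

```python
def equal_width_edges(low, high, bins):
    """Return the inner cut points of `bins` equal-width intervals."""
    width = (high - low) / bins
    return [low + i * width for i in range(1, bins)]


def discretize(value, edges):
    """Return the index of the interval containing value (right-closed)."""
    for i, edge in enumerate(edges):
        if value <= edge:
            return i
    return len(edges)


edges = equal_width_edges(1, 10, 3)
labels = ["<=4", "<=7", ">7"]
print([labels[discretize(mark, edges)] for mark in (2, 4, 5, 7, 8)])
```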

Page 9: BayesiaLab Satisfaction Poll Analysis

Since the discretization is adequate, it can be applied to all variables.

For transferring the discretization mode to other variables.

Press Ctrl + A to apply the discretization to all variables.

Page 10: BayesiaLab Satisfaction Poll Analysis

The Bayesian network is created with one node per column.

Page 11: BayesiaLab Satisfaction Poll Analysis

To characterize global satisfaction, the first step is to use the search function to find the “Satisfaction” node.

The search function.

The characters * and % can be used to simplify the search.

Page 12: BayesiaLab Satisfaction Poll Analysis

Clicking on the line causes the node to blink.

Page 13: BayesiaLab Satisfaction Poll Analysis

This node is the target variable of the analysis. We are interested in the >7 satisfaction value.

Page 14: BayesiaLab Satisfaction Poll Analysis

The augmented Markov blanket is used to characterize the target variable. It finds the minimal set of variables that characterize global satisfaction.
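For readers unfamiliar with the term, the Markov blanket of a node in a directed graph consists of its parents, its children, and the other parents of its children. The sketch below computes that set on an already-known graph; it illustrates only the textbook definition and does not reproduce BayesiaLab's learning algorithm. The arcs and the node names `A` and `B` are invented.

```python
def markov_blanket(arcs, target):
    """Return parents, children and co-parents of target in a DAG.

    arcs is a list of (parent, child) pairs.
    """
    parents = {p for p, c in arcs if c == target}
    children = {c for p, c in arcs if p == target}
    spouses = {p for p, c in arcs if c in children and p != target}
    return parents | children | spouses


# Invented graph: B -> A -> SK5 <- Satisfaction <- SE1
arcs = [("SE1", "Satisfaction"), ("Satisfaction", "SK5"),
        ("A", "SK5"), ("B", "A")]
blanket = markov_blanket(arcs, "Satisfaction")
print(sorted(blanket))
```

Node `B` is left out because it is connected to the target only through `A`.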

Page 15: BayesiaLab Satisfaction Poll Analysis

Zoom-in and zoom-out tools are available for better graph visualization.

Page 16: BayesiaLab Satisfaction Poll Analysis

The force-directed layout algorithm organizes the nodes on the workspace.

Page 17: BayesiaLab Satisfaction Poll Analysis

When switching to validation mode, note that only 15 of the 215 nodes are selected as relevant by the network.

Page 18: BayesiaLab Satisfaction Poll Analysis

To highlight important relationships between variables, the arc force tool is used.

Page 19: BayesiaLab Satisfaction Poll Analysis

An arc's thickness is proportional to its relevance with regard to the target variable. The SE1 variable is the most important for global satisfaction.

Unconnected nodes become transparent.

Page 20: BayesiaLab Satisfaction Poll Analysis

BayesiaLab can generate reports.

Page 21: BayesiaLab Satisfaction Poll Analysis

The SE1 node is in first position: it is the most important variable of this analysis.

Page 22: BayesiaLab Satisfaction Poll Analysis

The probabilistic profile of polls with a global satisfaction mark >7 is also reported.

Page 23: BayesiaLab Satisfaction Poll Analysis

After closing the report, note that it is possible to monitor all correlations between variables by right-clicking in the right side of the screen.

Page 24: BayesiaLab Satisfaction Poll Analysis

The monitors display the probability distributions and allow the values of the variables to be changed.

The target variable has a red background.

As the most important variable, SE1 appears in first position.

Page 25: BayesiaLab Satisfaction Poll Analysis

Monitors can be used to find the probabilistic profile of polls with a high satisfaction mark.

When this modality is clicked, the probabilities are propagated throughout the network. The probabilistic profile becomes readable.
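The idea of a probabilistic profile can be illustrated without a network: fix the target to one modality and look at how another variable is distributed among the matching responses. The sketch below does this by simple counting on invented records, so it is an approximation of what the monitors show, not the network propagation BayesiaLab performs. The function name is hypothetical.

```python
from collections import Counter


def conditional_distribution(rows, variable, given, state):
    """Estimate P(variable | given == state) by counting matching rows."""
    selected = [r[variable] for r in rows if r[given] == state]
    if not selected:
        return {}
    counts = Counter(selected)
    return {value: n / len(selected) for value, n in counts.items()}


# Invented, already discretized responses.
rows = [
    {"Satisfaction": ">7", "SE1": ">7"},
    {"Satisfaction": ">7", "SE1": ">7"},
    {"Satisfaction": ">7", "SE1": "<=7"},
    {"Satisfaction": "<=7", "SE1": "<=4"},
]
profile = conditional_distribution(rows, "SE1", "Satisfaction", ">7")
print(profile)
```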

Page 26: BayesiaLab Satisfaction Poll Analysis

The same technique can be applied to other modalities and variables. The results are automatically propagated to the remaining variables.

A poor SE1 mark is propagated to all monitors.

Page 27: BayesiaLab Satisfaction Poll Analysis

After the characterization of the target variable, the second part of this tutorial explores the relationships between all variables of the poll.

In modeling mode, delete all arcs.

Page 28: BayesiaLab Satisfaction Poll Analysis

The SopLEQ algorithm is appropriate for discovering associations between variables.

Page 29: BayesiaLab Satisfaction Poll Analysis

After some computation time, SopLEQ learning finds a complex network.

Page 30: BayesiaLab Satisfaction Poll Analysis

With the positioning and zoom tools, the graph becomes more reader-friendly.

In this case, where the graph is large but has average connectivity, symmetric positioning is adequate.

Page 31: BayesiaLab Satisfaction Poll Analysis

To increase network readability, a comments dictionary can be linked to the graph. In this file, the name of each node is supplemented with a comment.

Page 32: BayesiaLab Satisfaction Poll Analysis

When done, hints indicate that the node has comments.

Clicking this button shows or hides the comments for the selected nodes.

Page 33: BayesiaLab Satisfaction Poll Analysis

A modality dictionary can also be designed interactively. This is done by double-clicking a node and opening the “modality name” sheet.

Page 34: BayesiaLab Satisfaction Poll Analysis

Give a name to each modality.

Page 35: BayesiaLab Satisfaction Poll Analysis

Once the modality labels are validated, the dictionary can be exported as a text file.

Page 36: BayesiaLab Satisfaction Poll Analysis

The file is defined only for the SK5 node.

#Wed Oct 11 14:28:27 CEST 2006
SK5.<\=7=Average
SK5.<\=4=Poor
SK5.>7=Very good

Page 37: BayesiaLab Satisfaction Poll Analysis

With a simple modification, it becomes valid for all nodes of the graph.

#Wed Oct 11 14:28:27 CEST 2006
<\=7=Average
<\=4=Poor
>7=Very good

Page 38: BayesiaLab Satisfaction Poll Analysis

The dictionary can now be associated back with all nodes of the graph.
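The modification shown on the two previous pages amounts to removing the node prefix (`SK5.`) from every entry of the exported file. When a file has many entries, this can be scripted rather than done by hand. The sketch below is one possible way, with a hypothetical function name; it leaves comment lines and lines without the prefix untouched.

```python
def generalize_dictionary(lines, node):
    """Strip the 'node.' prefix so each entry applies to every node."""
    prefix = node + "."
    result = []
    for line in lines:
        if line.startswith(prefix):
            line = line[len(prefix):]
        result.append(line)
    return result


# The exported file for the SK5 node, as shown in the tutorial.
exported = [
    "#Wed Oct 11 14:28:27 CEST 2006",
    r"SK5.<\=7=Average",
    r"SK5.<\=4=Poor",
    "SK5.>7=Very good",
]
for line in generalize_dictionary(exported, "SK5"):
    print(line)
```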

Page 39: BayesiaLab Satisfaction Poll Analysis

The monitors in validation mode become easier to read.

Page 40: BayesiaLab Satisfaction Poll Analysis

The same process can be applied to assign values to modalities and to generate a modality values dictionary.

This is done in modeling mode, by double-clicking a node and opening the “values” sheet.

Page 41: BayesiaLab Satisfaction Poll Analysis

The poor modality scores 0 points, the average modality scores 10 points and the very good modality scores 20 points.

Page 42: BayesiaLab Satisfaction Poll Analysis

The same process of exporting the dictionary, modifying the text file and importing it back can be applied to assign values to the modalities of all nodes.

The total and average values of the graph modalities are calculated.

The values are also computed according to the probability distribution.
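A value computed according to the probability distribution is an expected value: each modality's points are weighted by its probability. The sketch below uses the 0, 10 and 20 point values from the tutorial; the probabilities are invented and the function name is hypothetical.

```python
def expected_value(distribution, values):
    """Weight each modality's value by its probability and sum."""
    return sum(p * values[modality] for modality, p in distribution.items())


values = {"Poor": 0, "Average": 10, "Very good": 20}
# Invented probability distribution over the three modalities.
distribution = {"Poor": 0.2, "Average": 0.5, "Very good": 0.3}
score = expected_value(distribution, values)
print(score)
```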

Page 43: BayesiaLab Satisfaction Poll Analysis

Every question is related to a theme. For instance, this poll has 36 themes. The class concept in BayesiaLab is useful for associating themes with nodes.

The themes dictionary is contained in a text file.

Page 44: BayesiaLab Satisfaction Poll Analysis

Clicking the newly appeared icon at the bottom right of the window opens the class editor. It becomes possible to apply modifications to classes instead of to individual nodes.

Opens the class editor.

Page 45: BayesiaLab Satisfaction Poll Analysis

Readability can be increased by applying automatic class colours. This is done by selecting all the classes with <ctrl + a> and clicking the “color” button.

Page 46: BayesiaLab Satisfaction Poll Analysis

Note that nodes are broadly grouped by colour. This provides useful information about inter-theme and intra-theme links. In this case, it also indicates a well-designed poll.

When the “Edit classes” window is closed, the nodes are coloured according to their class.
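The visual impression of inter-theme and intra-theme links can be backed by a simple count: an arc is intra-theme when both of its nodes belong to the same class. The sketch below assumes the arcs and the node-to-theme mapping are available as plain Python data; the theme names, the node `SE2` and the arcs are invented.

```python
def count_theme_links(arcs, theme_of):
    """Return (intra, inter): arcs inside one theme vs. across themes."""
    intra = sum(1 for a, b in arcs if theme_of[a] == theme_of[b])
    return intra, len(arcs) - intra


# Invented themes and arcs.
theme_of = {"SE1": "Service", "SE2": "Service", "SK5": "Skills"}
arcs = [("SE1", "SE2"), ("SE1", "SK5")]
intra, inter = count_theme_links(arcs, theme_of)
print(intra, inter)
```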

Page 47: BayesiaLab Satisfaction Poll Analysis

The comments are also coloured according to the class.

Page 48: BayesiaLab Satisfaction Poll Analysis

A “colours dictionary” can also be saved as a text file.

Page 49: BayesiaLab Satisfaction Poll Analysis

In this example, themes have been created based on expert knowledge. Nevertheless, BayesiaLab provides tools for automatic theme design by grouping semantically close variables.

In validation mode, variable clustering is based on the association rules discovered in the network.

Page 50: BayesiaLab Satisfaction Poll Analysis

Once the clustering is applied, new colours are assigned to the nodes.

BayesiaLab identified 48 node groups.

Moving this cursor forces the number of groups.

The node colours are also changed.

Page 51: BayesiaLab Satisfaction Poll Analysis

There are two other new icons in the clustering toolbar.

Exiting the clustering mode.

This is for validating the current clustering.

Page 52: BayesiaLab Satisfaction Poll Analysis

BayesiaLab is able to build latent variables from the clustering just performed.

When validating, a confirmation is requested.

Page 53: BayesiaLab Satisfaction Poll Analysis

In modeling mode, multiple clustering allows individuals to be clustered from each single variable group.

Page 54: BayesiaLab Satisfaction Poll Analysis

This wizard tunes the multiple clusterings to be performed (one per identified cluster).

Data is saved in this directory.

Specifying the number of classes for each new latent variable.

Page 55: BayesiaLab Satisfaction Poll Analysis

As with data clustering, an HTML report is created for each clustering. These reports are useful for renaming the new variables and their modalities.

Page 56: BayesiaLab Satisfaction Poll Analysis

Once the clusterings are performed, a new network is created with one node per latent variable (keeping the initial colour).

An internal database is created. It contains the most probable cluster value for each line of the initial file.

This database can be saved in a separate file with the “data” menu.
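Keeping the most probable cluster value for each line means taking, for every response, the cluster with the highest probability. A minimal sketch, assuming the per-line cluster probabilities are available as dictionaries; the cluster names `C1` to `C3` and the probabilities are invented.

```python
def most_probable_cluster(posterior):
    """Return the cluster name with the highest probability."""
    return max(posterior, key=posterior.get)


# Invented cluster probabilities for two lines of the initial file.
posteriors = [
    {"C1": 0.7, "C2": 0.2, "C3": 0.1},
    {"C1": 0.1, "C2": 0.3, "C3": 0.6},
]
assigned = [most_probable_cluster(p) for p in posteriors]
print(assigned)
```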

Page 57: BayesiaLab Satisfaction Poll Analysis

Probabilistic relationships between the nodes of this new network can be discovered with the SopLEQ algorithm.

After computation and automatic node positioning, the resulting network presents 51 nodes representing the latent variables of the initial dataset.