SharePoint Saturday Belgium 2014 - Yammer data mining

Post on 09-May-2015

Yammer social data mining

#SPSBE27

Joris Poelmans

April 26th, 2014

Thanks to our sponsors!

About me

Work at RealDolmen

Principal consultant HPW and domain manager Big Data (Predictive analytics, data mining and machine learning)

SharePoint Server MVP since 2005

Blog: http://jopx.blogspot.com

Twitter: @jopxtwits

Agenda

Inspired by SPC3391: Yammer mining – dig in and listen to what your social data is saying (Richard diZerega @richdizz)

Export and Models: Understand the basics of exporting Yammer data, what the export includes, how to augment the export, and how best to model it for Microsoft BI tools

Delivering Insights: Explore approaches for delivering rich insights using standard Microsoft BI tools such as Excel, SharePoint, and Power BI

Custom Mining: Discuss advanced patterns and practices for extending Yammer based on social mining trends such as real-time insights and advanced modeling

Why social mining?

Export and models

1. Log in to Yammer as a network administrator (the export is only available to network admins)

2. Navigate to the Network Admin portal within Yammer

3. Select “Export Data” from the “Content and security” section of the side navigation

4. Select a start date for the export (note: any additional filtering must be done after the export)

5. Optionally include attachments and external networks

Yammer exports (Enterprise only)

What is exported?

What’s included (.csv format): Admins, Files, Groups, Messages, Networks, Pages, Topics, Users

What’s missing: Mentions*, Following, Sharing, Detailed Date and User Demographics

Demo 1: Standard export

Built in export in Yammer Enterprise

Building your own – Yammer APIs

REST

JavaScript

API

Embed

Developer registers application and implements authorization

Simple authorization

App registration

Accessible to all users of a network.

Apps are registered with a home network and then published to the directory.

Redirect URI is the most important setting, but it is set elsewhere.

Authorization

Redirect user to the OAuth dialog URL.

Process the response at your Redirect URI when user allows the app.
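The two authorization steps above can be sketched in a few lines of Python. This is a minimal illustration, not Yammer's reference implementation: the dialog URL, the `client_id` and `redirect_uri` parameter names, and the `code` / `error` query parameters on the redirect are assumptions based on a typical OAuth 2 flow and should be checked against the Yammer developer documentation. The helper names are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumption: base URL of the OAuth dialog; verify against the Yammer docs.
OAUTH_DIALOG_URL = "https://www.yammer.com/dialog/oauth"


def build_authorize_url(client_id, redirect_uri):
    """Build the URL the user is redirected to in order to allow the app."""
    query = urlencode({"client_id": client_id, "redirect_uri": redirect_uri})
    return f"{OAUTH_DIALOG_URL}?{query}"


def extract_code(callback_url):
    """Read the authorization code from the request made to the Redirect URI.

    Returns None when the user denied the app or no code is present.
    """
    params = parse_qs(urlparse(callback_url).query)
    if "error" in params:
        return None
    codes = params.get("code")
    return codes[0] if codes else None
```

The code returned by `extract_code` would then be exchanged for an access token in a server-side call, which is left out here.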

Demo 2: App registration

Register an app in Yammer

REST API Endpoints

Data export: Messages, Users, Groups

• https://www.yammer.com/api/v1/messages.json

• https://www.yammer.com/api/v1/users.json

• https://www.yammer.com/api/v1/groups.json

Full list at http://developer.yammer.com/restapi/

Yammer API throttling

Throttled by paging

Many Yammer APIs return results in batches of 50, especially calls that could return large amounts of data, such as a popular user’s followers or the members of a group

Paged APIs return a value indicating whether more rows are available (total counts are typically available too)
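Paging through messages could look roughly like the sketch below. It assumes that messages.json accepts an `older_than` message id and that the response carries a `meta.older_available` flag; both are assumptions to verify against the REST API documentation. `fetch_all_messages` and `http_fetch_page` are hypothetical helper names, and the page fetcher is passed in so the loop can be exercised without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

MESSAGES_URL = "https://www.yammer.com/api/v1/messages.json"


def http_fetch_page(token, older_than=None):
    """Fetch one page of messages (assumes a Bearer token and `older_than`)."""
    params = {"older_than": older_than} if older_than else {}
    url = MESSAGES_URL + ("?" + urlencode(params) if params else "")
    request = Request(url, headers={"Authorization": f"Bearer {token}"})
    with urlopen(request) as response:
        return json.load(response)


def fetch_all_messages(fetch_page, max_pages=100):
    """Collect messages page by page until no older messages are reported.

    `fetch_page(older_than)` must return the decoded JSON of one call.
    """
    messages = []
    older_than = None
    for _ in range(max_pages):
        page = fetch_page(older_than)
        batch = page.get("messages", [])
        if not batch:
            break
        messages.extend(batch)
        # Assumed flag telling us whether another (older) page exists.
        if not page.get("meta", {}).get("older_available"):
            break
        older_than = batch[-1]["id"]
    return messages
```

A real run would pass `lambda older: http_fetch_page(token, older)` as `fetch_page`; the `max_pages` cap is only a safety net against endless loops.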

Throttled by call frequency

Yammer throttles by call frequency based on the type of query and returns 429 errors when a limit is exceeded:

• Autocomplete: 10 requests in 10 seconds

• Messages: 10 requests in 30 seconds

• Notifications: 10 requests in 30 seconds

• All Other Resources: 10 requests in 10 seconds
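One way to stay under these limits on the client side is a small sliding-window throttle that is called before every request. The sketch below is one possible approach, not something prescribed by Yammer; the `Throttle` class and the `RATE_LIMITS` keys are hypothetical names, and the numbers are simply the ones listed above.

```python
import time
from collections import deque

# (max requests, window in seconds), taken from the limits listed above.
RATE_LIMITS = {
    "autocomplete": (10, 10),
    "messages": (10, 30),
    "notifications": (10, 30),
    "other": (10, 10),
}


class Throttle:
    """Allow at most `max_calls` calls in any window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self.clock()
            # Forget calls that have left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
```

For the messages endpoint one would create `Throttle(*RATE_LIMITS["messages"])` and call `wait()` before each request. A 429 response should still be handled, since other clients using the same token count against the same limit.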

Yammer REST API ... what to watch out for

• Users REST API only returns active users

• Messages.json returns all public messages in the “All conversations” view, unfortunately with no reference to a specific Yammer Group. Use https://www.yammer.com/api/v1/messages/in_group/[123456].json for messages in a specific group

• Different message formats available in export

Building SimpleYammerClient: a custom Yammer exporter

• Dynamic languages are easier to use than .NET
  - Generate classes to map to the API

• Essential tooling
  - RestSharp / JSON C# Class Generator / JsonPack

• Example code to start with
  - Yammer.SimpleAPI (http://yammersimpleapi.codeplex.com)
  - Code samples on Steve Peschka’s blog - http://blogs.technet.com/b/speschka/archive/2013/10/05/using-the-yammer-api-in-a-net-client-application.aspx

• Yammer .NET SDK to be expected

Demo 3: SimpleYammerClient

Export tool for Yammer messages, users and groups

Pre-import cleanup

• Date dimensions
  - Generate date dimensions using Excel App

• Export group-specific messages and merge with existing data

• Mention extraction
  - Yammer export: user and topic mentions are embedded in message data
  - Careful: there is a difference between the REST API (messages.json) and the Yammer export
  - Example: “I am preparing for my [Tag:3422:SPC14] with my co-presenter [User:77383:Richard Zerega]”

• Following data

• Sentiment analysis
  - Calculate against a text analysis service (https://www.mashape.com/japerk/text-processing) or Pattern 2.6 (Python)

• Other profile/demographic data
  - Typically available in other systems (SAP, Oracle, PeopleSoft, ...)
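Mention extraction from the embedded format can be sketched with a regular expression. The pattern below is inferred only from the single sample message shown above (`[Tag:id:name]` and `[User:id:name]`), so it is an assumption about the format rather than a specification; `extract_mentions` and `strip_markup` are hypothetical helper names.

```python
import re

# Assumed format, inferred from the sample: [Tag:3422:SPC14], [User:77383:Name]
MENTION_RE = re.compile(r"\[(Tag|User):(\d+):([^\]]+)\]")


def extract_mentions(body):
    """Return one dict per embedded tag or user mention, in message order."""
    return [
        {"kind": kind.lower(), "id": int(ident), "name": name}
        for kind, ident, name in MENTION_RE.findall(body)
    ]


def strip_markup(body):
    """Replace each embedded mention with its display name only."""
    return MENTION_RE.sub(lambda match: match.group(3), body)
```

The extracted rows can be written to a separate mentions table keyed by message id, while the stripped text is better suited as input for sentiment analysis.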

Pattern (1/2)

Free toolkit, written in Python, for data mining (Google, Twitter, Facebook, Wikipedia, ...), text analysis (modality, sentiment analysis, ...) and machine learning (SVM, Bayesian stats, Infogain, ...)

http://www.clips.ua.ac.be/pattern

Pattern (2/2)

“my new phone is awesome”

Sentiment analysis in Pattern is based on adjectives

75-80% accurate for Dutch, English and French

Polarity score between -1 and +1. A score > 0.1 is considered positive. Converted to a score between 0 and 5 for nicer graphs

Not so good at sarcasm: “trains are late again ... Good job #NMBS”
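The scoring rules above can be written down as two small functions. The slides do not say how the polarity is converted to the 0 to 5 range, so the linear rescaling below is an assumption, and both function names are hypothetical. With Pattern installed, the polarity itself would come from `pattern.en.sentiment(text)`, which returns a (polarity, subjectivity) pair; that call is left out so the sketch runs without the library.

```python
def is_positive(polarity, threshold=0.1):
    """Apply the rule from the slide: a score above 0.1 counts as positive."""
    return polarity > threshold


def to_chart_score(polarity):
    """Map a polarity in [-1, +1] to [0, 5] (assumed linear rescaling)."""
    clipped = max(-1.0, min(1.0, polarity))
    return (clipped + 1.0) * 2.5
```

With this mapping a neutral message lands at 2.5, and the positive threshold of 0.1 corresponds to a chart score of 2.75.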

Demo 4

Pre-import cleanup

Powerful Self-Service BI with Excel 2013

• Search and find internal & external data

• Clean, transform, and shape data

• Merge and combine data from multiple sources

• Lightning fast analytics with xVelocity in-memory technology

• Model relationships, custom measures, hierarchies, and KPIs

• Bring your data to life with interactive visualization

• Explore data in new ways to discover hidden insights

• Ask questions with natural language

Tips and tricks before import

• Remove irrelevant columns

• Remove duplicate rows

• Filter out null, empty or irrelevant rows

• Append/Merge where appropriate
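The talk does this cleanup in Power Query. For a custom exporter the same first three tips can be applied before the data ever reaches Excel, as in the sketch below. The function names and the column names used in any call are hypothetical and not taken from the real export schema.

```python
import csv
import io


def read_export(text):
    """Parse CSV text (for example the contents of an export file) into dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows, keep_columns):
    """Keep only `keep_columns`, drop empty rows, and drop duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Remove irrelevant columns and normalise missing values to "".
        slim = {col: (row.get(col) or "").strip() for col in keep_columns}
        # Filter out rows in which every kept column is empty.
        if not any(slim.values()):
            continue
        # Remove duplicates, judged on the kept columns only.
        key = tuple(slim[col] for col in keep_columns)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(slim)
    return cleaned
```

Duplicates are judged after the irrelevant columns are removed, so two rows that differ only in a dropped column collapse into one.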

Post-import augmentation

• Establish relationships

• Date mappings: convert Yammer date/time fields to date keys that you can reference against a date dimension

Example: =YEAR([created_at]) & IF (LEN(MONTH([created_at])) > 1, "" & MONTH([created_at]), "0" & MONTH([created_at])) & IF (LEN(DAY([created_at])) > 1, "" & DAY([created_at]), "0" & DAY([created_at]))

• De-normalize lookup tables where possible: PowerPivot will store a referenced column from another table in memory, which gives better performance

• Use rankings: Yammer dimensional data can be very large, so it is helpful to filter by top n

• Hide confusing columns from end users: end users don’t care about Ids or other columns, so hide these columns
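The date-key formula above builds a YYYYMMDD text key from `created_at`. When the data comes from a custom exporter, the same key and a matching date dimension can be produced before import, as sketched below. The default timestamp format is an assumption about how `created_at` is serialized and may need adjusting for the export files; both function names are hypothetical.

```python
from datetime import date, datetime, timedelta


def date_key(created_at, fmt="%Y/%m/%d %H:%M:%S %z"):
    """Turn a created_at string into a YYYYMMDD key (assumed input format)."""
    return datetime.strptime(created_at, fmt).strftime("%Y%m%d")


def date_dimension(start, end):
    """Yield one row per calendar day from `start` to `end`, inclusive."""
    day = start
    while day <= end:
        yield {
            "date_key": day.strftime("%Y%m%d"),
            "year": day.year,
            "month": day.month,
            "day": day.day,
        }
        day += timedelta(days=1)
```

The key uses the date exactly as written in the timestamp, so any time-zone conversion has to happen before calling `date_key`.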

Demo 5

Import and modeling with Power Query and Power Pivot

Delivering insights

Demo 6

Power BI in Office 365

Other scenarios

Real-time insights with Yammer

• Based on the Bayeux protocol of long-poll requests

• Make a request to Yammer with a long timeout (e.g. 30 seconds)

• At any time, Yammer could respond with a real-time response

• At the least, Yammer will respond before the timeout, indicating no messages

• When a response is received (messages or not), immediately make a new request with a long timeout
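The request/response cycle described above boils down to a simple loop. The sketch below shows only that loop and leaves out the Bayeux handshake and channel subscription. `poll`, `handle` and `should_stop` are hypothetical callables supplied by the caller; `poll(timeout)` is assumed to block for at most `timeout` seconds and return a (possibly empty) list of messages.

```python
def long_poll(poll, handle, should_stop, timeout=30):
    """Keep one long-timeout request open at a time and re-issue it at once.

    `poll(timeout)` returns a list of new messages, empty when the server
    answered only to signal that nothing arrived before the timeout.
    """
    while not should_stop():
        messages = poll(timeout)
        if messages:
            handle(messages)
        # Messages or not, the loop immediately issues the next request.
```

In a real client `poll` would wrap the HTTP request to the real-time endpoint and `handle` would feed the messages into the ETL or reporting pipeline.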

Use cases for real-time social data

• Real-time ETL for up-to-date reporting and analytics

• Eliminate bulk processing for “augmenting activities”

• Identify trend data

Use cases for enterprise social mining

Yammer APIs combined with analytics and visualizations for obtaining insights:

• Which topics are trending?

• Who are key influencers within your company?

• Which “hidden” communities exist within your company?

Specific analytics to drive your Yammer business scenarios:

• Track adoption rate and user activity

• Track sentiment of employees in real time

• Combine with other data (gender, age, location, department, manager...) to build analytical models

Visualization examples

https://yamosphere.azurewebsites.net

https://nucleus.azurewebsites.net

Demo 7 - Yamosphere

Summary

• As ESNs mature, mining them will provide more value

• Available 3rd-party tools:
  - ViewPoint Enterprise - http://www.viewdolabs.com/
  - Cardiolog Analytics - http://www.intlock.com (see http://bit.ly/1msBG4j)
  - Gooddata Yammer Analytics (requires Yammer Enterprise)

• Other ESNs also provide endpoints for data manipulation; the same principles apply

References and links

• SPC3991: Yammer Mining (Video) - http://channel9.msdn.com/Events/SharePoint-Conference/2014/SPC3991

• Yammer Analytics with Excel and Power BI - http://bit.ly/1qlImCY

• Power BI, Power Query Links - http://jopx.blogspot.be/2014/04/power-bi-power-view-power-query-and.html

• Pattern - http://www.clips.ua.ac.be/pattern
