A Summer Internship: IBM in Germany ANNA WILSON 12 SEPTEMBER 2014 CLINGDING, IU BLOOMINGTON

Roadmap

1. What (general)

2. How

3. Who

4. What (precisely)

5. Lessons learned

The internship

Summer intern with IBM Deutschland in Social Media Analytics

4 permanent employees

◦ A host of temporary interns

◦ Housed in the Department of Analytics

How I found the position

First visit to Bloomington:

◦ Stayed with a grad (not in a hotel)

◦ Before and after perspective shift

Recommendations

The team

Location: IBM Germany, Böblingen (just outside of Stuttgart)

The team

The team

Social Media Analytics

◦ Primarily a cloud-based service

◦ “Customer Insight”

◦ Pulls massive amounts of data from all sorts of social media

◦ e.g. reviews and posts (such as on Amazon), blogs, tweets, etc.

◦ Searchable according to client-specified criteria

◦ For example (hypothetical situation): Apple & the iPhone 5s

Social Media Analytics

Purpose:

◦ What to avoid, what to include, trends in sentiment, future directions

◦ Essentially: keeping a finger on the customer's pulse

Support provided for many languages:

◦ English, German, Spanish, Italian, French, Russian, Arabic, etc.

My role

[Diagram labels: Data Gathering (Automated); Programming Skeleton (Semi-automatic); Capturing Language Specifics; Client; Me!]

The interns

Each intern works on a different language**

◦ Only one intern on a given language at a time

◦ Typically, trained linguists

**Can have downsides!

The interns

The content

Each language had its own project, made up of 5 sections:

1. Churn

2. Ownership

3. Potential owner

4. Recommenders/detractors

5. Sentiment

The content

AQL: Annotation Query Language

◦ Daughter to SQL

◦ Designed by IBM specifically for Social Media Analytics

◦ Basically: an exercise in labeling and sorting/searching text data

Example:

◦ User: “I’m returning my Sprint phone ASAP!”

◦ Labels: Churn, Ownership, Future

The structure

Primarily:

◦ External dictionaries (i.e. word lists)

◦ Regular expressions

◦ Case-by-case basis for false positives
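A minimal Python sketch of how the three ingredients above could fit together. This is not AQL and not IBM's implementation: the word lists, the "returning customer" exclusion, and the function name `label_churn` are all hypothetical, chosen only to illustrate dictionaries, regular expressions, and case-by-case false-positive handling.

```python
import re

# Hypothetical word lists standing in for the external dictionaries.
BRANDS = {"sprint", "verizon"}
CHURN_PHRASES = {"returning", "switching from", "the last straw"}

# Hypothetical case-by-case exclusion: "returning customer" is not churn.
FALSE_POSITIVES = [re.compile(r"\breturning\s+customer\b", re.IGNORECASE)]


def _phrase_pattern(phrases):
    """Compile a case-insensitive, word-bounded alternation over the phrases."""
    ordered = sorted(phrases, key=len, reverse=True)  # prefer longer phrases
    alternation = "|".join(re.escape(p) for p in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


BRAND_RE = _phrase_pattern(BRANDS)
CHURN_RE = _phrase_pattern(CHURN_PHRASES)


def label_churn(text):
    """Return (brand, churn phrase) pairs found in text.

    Returns an empty list when a known false-positive pattern matches.
    """
    if any(fp.search(text) for fp in FALSE_POSITIVES):
        return []
    brands = [m.group(0) for m in BRAND_RE.finditer(text)]
    phrases = [m.group(0) for m in CHURN_RE.finditer(text)]
    return [(b, p) for b in brands for p in phrases]


print(label_churn("I'm returning my Sprint phone ASAP!"))
```

Under these assumptions the Sprint example is labeled as churn, while a post that mentions a "returning customer" is filtered out by the exclusion pattern.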

Tuple theory (database structure):

◦ <a,b> combined with <x,y> gives:

◦ <a,x>, <a,y>

◦ <b,x>, <b,y>

◦ But if one is empty, nothing gets returned
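The pairing behaviour above is a Cartesian product, which can be sketched in Python with `itertools.product`. The helper name `pair_up` is hypothetical, and this only mirrors the slide's example rather than how AQL evaluates tuples internally.

```python
from itertools import product


def pair_up(left, right):
    """Return every (left item, right item) pairing as a list of tuples."""
    return list(product(left, right))


# Both sides populated: every item on the left pairs with every item on the right.
print(pair_up(["a", "b"], ["x", "y"]))

# One side empty: there is nothing to pair with, so nothing is returned.
print(pair_up(["a", "b"], []))
```

The practical consequence is that a rule combining two kinds of matches produces no output at all when either kind finds nothing.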

The task

Crucial:

◦ Precision & Recall

Filtering false positives

◦ Maximally inclusive first, then reduce
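To make the trade-off concrete, here is a small Python sketch of precision and recall computed over sets of post IDs. The IDs and the function name `precision_recall` are made up for illustration; they are not data from the internship.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for predicted labels against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical post IDs that really express churn.
gold = {1, 2, 3, 4}

# Step 1: a maximally inclusive rule catches everything, plus noise.
broad = {1, 2, 3, 4, 5, 6, 7, 8}

# Step 2: filtering false positives removes most of the noise.
filtered = {1, 2, 3, 4, 5}

print(precision_recall(broad, gold))
print(precision_recall(filtered, gold))
```

In this made-up case, filtering raises precision while recall stays the same, which is the goal of starting inclusive and then reducing.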

My work:

◦ English: Churn, phrasal work

◦ “the last straw”

◦ Spanish: Ownership, possession issues

◦ “mi hermano tiene un celular Sprint” (my brother has a Sprint cell)

Lessons to share

Make an effort to get to know your peers and keep up with them

◦ “Spy network”

◦ Recommendations

Make sure you’re proactive about paperwork

◦ Learn from others > be responsible for yourself

Sounds cliché:

◦ Get outside of your comfort zone

◦ Most common question: Why Stuttgart?

Thank you! Questions?