Research Group Data and Web Science

Mannheim, 11 February 2019




Data and Web Science Group

• 7 Professors, 5 Post-docs, 20 PhD students


Research Areas

− Artificial Intelligence (Prof. Heiner Stuckenschmidt)

• Knowledge representation formalisms and reasoning techniques for information extraction and integration

− Data Analysis (Prof. Rainer Gemulla)

• Methods for analyzing and mining large datasets as well as their practical realizations and applications

− Natural Language Processing (Prof. Simone Ponzetto)

• Knowledge acquisition, knowledge-rich language understanding, Computational Social Science and Digital Humanities


Research Areas

− Statistical Natural Language Processing (Prof. Goran Glavaš)

• Modeling meaning of language, understanding text, and structuring knowledge from text

− Image Processing (Prof. Dr.-Ing. Margret Keuper)

• Image Segmentation, Motion Segmentation, Efficient Video Segmentation, Semantic Segmentation, Multiple Object Tracking

− Web-based Systems (Prof. Chris Bizer)

• Large-scale data integration, evolution of the World Wide Web from a medium for the publication of documents into a global dataspace

− Data Science (Prof. Dr. Heiko Paulheim)

• Web data as background knowledge in data mining, and data mining methods to create and improve large-scale knowledge bases


Research Goals

DWS Overall Research Goals:

1. Methods for understanding large and heterogeneous data

2. Application of these methods in different contexts

[Figure: DWS research overview. Labels: Information Extraction, Data Integration, Data Mining & Reasoning, Applications; Schema.org Data, IsA Database, Web Tables; Social Sciences, Web Search, Data Analytics, Business Applications]


Teaching Overview: Courses

[Figure: course overview by semester, FSS (spring) and HWS (fall). Courses shown: Decision Support, Data Mining II, Web Mining, Web Data Integration, Semantic Web Technologies, Information Retrieval, Text Analytics, Data Mining I, Large-Scale Data Management, Data Mining and Matrices, Image Processing, Database Technology (MMDS), Computer Vision, Relational Learning, Hot Topics in Machine Learning]


IE 500: Data Mining 1

• Content: the basics of “torturing data”:

1. Cluster Analysis: How to automatically organize your MP3 collection?

2. Classification: Will your bank grant you a loan?

3. Regression: How to determine the price of a house?

4. Association Analysis: Which products to place together in a supermarket to maximize customer purchases?

5. Text Mining: Do students on Twitter like or dislike this lecture?

• Exercises

• Experiment with RapidMiner or Python

• Student project:

• Mine some data of your choice

• Teaching staff:

• Prof. Dr. Christian Bizer (Lectures)

• Anna Primpeli, Oliver Lehmberg (Exercises)
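
To make the association analysis question from the content list more concrete, here is a minimal sketch that computes support and confidence for item pairs. The shopping baskets and the function name `pair_rules` are assumptions made up for illustration; they are not taken from the course material, and real exercises would likely use a library or RapidMiner operator instead.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets):
    """Return {(a, b): (support, confidence)} for every rule a -> b over item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of baskets containing both items
        rules[(a, b)] = (support, count / item_counts[a])  # confidence of a -> b
        rules[(b, a)] = (support, count / item_counts[b])  # confidence of b -> a
    return rules


# Hypothetical baskets, for illustration only.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["beer", "chips"],
]
rules = pair_rules(baskets)
print(rules[("butter", "bread")])
```

In this toy data every basket with butter also contains bread, so the rule butter -> bread has high confidence, which is the kind of signal a placement decision could build on.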


• Data integration is the process of consolidating data from heterogeneous data sources into a single uniform representation.

• Data integration is critical within many application domains

• Business: CRM, Business Intelligence

• Science: Exploitation of existing research data

• The Web: Comparison shopping, job search

• Topics of the Course

1. The Data Integration Process

2. Web Data Formats

3. Schema Mapping and Data Translation

4. Identity Resolution

5. Data Fusion
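
As a rough illustration of the last two topics, the sketch below matches records from two sources by name similarity (identity resolution) and then merges each matched pair into one record (data fusion). The records, the token-based Jaccard similarity, the threshold of 0.5, and the "prefer the first source" fusion rule are all assumptions chosen for brevity; the course itself may use different matchers and conflict resolution strategies.

```python
def jaccard(a, b):
    """Token-set Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def resolve_and_fuse(source_a, source_b, threshold=0.5):
    """Match records by name similarity, then fuse each matched pair."""
    fused = []
    if not source_b:
        return fused
    for rec_a in source_a:
        # Identity resolution: pick the most similar record from the other source.
        best = max(source_b, key=lambda rec_b: jaccard(rec_a["name"], rec_b["name"]))
        if jaccard(rec_a["name"], best["name"]) >= threshold:
            # Data fusion: prefer values from source A, fill gaps from source B.
            known_a = {k: v for k, v in rec_a.items() if v is not None}
            fused.append({**best, **known_a})
    return fused


# Hypothetical product records from two sources.
db1 = [{"name": "Acme Phone X", "price": 299.0, "color": None}]
db2 = [
    {"name": "acme phone x 64GB", "price": None, "color": "black"},
    {"name": "Other Tablet", "price": 150.0, "color": "white"},
]
fused = resolve_and_fuse(db1, db2)
print(fused)
```

The fused record keeps the name and price from the first source and takes the color from the second, since the first source had no value for it.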

[Figure labels: DB1, DB3, DB4, Integrated Data]

IE 670 + IE 683: Web Data Integration


⚫ Lecture (IE670)

• Introduces the principal methods of data integration

• Discusses how to evaluate data integration results

• Instructor: Prof. Dr. Christian Bizer

• Grading: Written Exam

⚫ Student Projects (IE683)

• Teams (five students) realize a data integration project

1. data gathering

2. schema matching and data translation

3. identity resolution

4. data quality assessment and data fusion

• Teams will use commercial data integration tools as well as the Java data integration framework WInte.r

• Instructors: Anna Primpeli, Oliver Lehmberg

• Grading: Project report and presentation

IE 670 + IE 683: Web Data Integration


• Advanced Data Mining methods

• Dimensionality Reduction

• Anomaly Detection

• Time Series Analysis

• Parameter Tuning

• Ensemble Learning

• Neural Networks & Deep Learning

• Organization:

• Lectures and Exercises

• Participation in Data Mining Cup

• Teaching Staff

• Prof. Dr. Heiko Paulheim (Lectures)

• Nicolas Heist (Exercises)

IE 672: Data Mining II

Starts next week!!!


• MMDS fundamental course

− Foundations of Relational Databases

− Relational Modeling

− Normal Forms

− Query Processing and Optimization

− Transactions, Concurrency, and Recovery

• Teaching Staff:

• Prof. Dr. Heiko Paulheim (Lectures)

• Sven Hertling (Exercises)

CS 460: Database Technology


IE 650: Semantic Web Technologies

• Prerequisites:

• Basic programming skills (e.g. Java, Python)

• Topics:

• Understanding the vision of the Semantic Web

• Acquaintance with foundations of W3C standards for building semantic web applications

• Data Integration and Access: XML, RDF and SPARQL

• Knowledge Representation: RDFS and OWL

• Ontology Management: Engineering, Learning and Alignment

• Programming skills and IT competence: practical usage of technologies for building semantic web applications

• Teaching staff:

• Prof. Dr. Heiko Paulheim (Lecture)

• Sven Hertling (Exercises)


IE 560: Decision Support

$$\text{action} = \arg\max_{a} EU(a \mid e)$$

• Decision-making is an important part of all science-based professions

• Specialists apply their knowledge in a given area to make informed decisions.

• Models that help to formulate and algorithmically solve decision-making problems

− Find a solution that maximizes the expected benefit of the outcome.

• Topics include:

• Probabilistic Graphical Models

• Decision Theory and Decision Networks

• Game Theory and Mechanism Design
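
The idea of choosing the action that maximizes expected utility given evidence can be sketched in a few lines. The actions, outcome probabilities, and utility values below are hypothetical numbers invented for illustration, and `expected_utility` and `best_action` are illustrative names rather than anything from the course.

```python
def expected_utility(action, outcome_probs, utility):
    """EU(a | e): sum over outcomes s of P(s | a, e) * U(s)."""
    return sum(p * utility[s] for s, p in outcome_probs[action].items())


def best_action(outcome_probs, utility):
    """Return the action with the highest expected utility."""
    return max(outcome_probs, key=lambda a: expected_utility(a, outcome_probs, utility))


# Hypothetical P(outcome | action, evidence) for a "rain is forecast" scenario.
outcome_probs = {
    "take_umbrella": {"dry_but_burdened": 1.0},
    "leave_umbrella": {"dry_and_free": 0.7, "wet": 0.3},
}
# Hypothetical utilities of each outcome.
utility = {"dry_but_burdened": 8, "dry_and_free": 10, "wet": -50}

choice = best_action(outcome_probs, utility)
print(choice)
```

With these numbers, leaving the umbrella has a negative expected utility because the small chance of getting wet carries a large penalty, so the maximizing action is to take it.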


IE 560: Decision Support

• Lectures & Exercises

• Teaching staff:

• Prof. Dr. Heiner Stuckenschmidt (Lecture)

• Dr. Melisachew Wudage Chekol (Exercises)

• Literature:

• Stuart Russell and Peter Norvig: Artificial Intelligence – A Modern Approach. Pearson, 2013.

• Chapters 2, 7, 10, 11, and 13-17

• A Reader will be available


IE 689: Relational Learning

Example Problem: learn a relational definition of active components

Solution:

active(M) ← ring_size_5(M, R), element(Y, R), bond(M, Y, Z, 2).

Lecture: Prof. Dr. Heiner Stuckenschmidt, every second Monday 12:00-13:30

Exercises/Tutorial: Manuel Fink / Dr. Christian Meilicke, every second Monday 12:00-13:30
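
To show what the rule on this slide expresses, the sketch below evaluates it against a handful of facts: a molecule counts as active if one of its five-membered rings contains an atom that takes part in a bond of type 2. This only checks a given rule and does not learn one, which is the actual subject of the course. The facts and the reading of the predicate arguments are assumptions made for illustration.

```python
def is_active(molecule, ring_size_5, element, bond):
    """Check active(M) <- ring_size_5(M, R), element(Y, R), bond(M, Y, Z, 2)."""
    for (m, ring) in ring_size_5:
        if m != molecule:
            continue
        for (atom, r) in element:
            if r != ring:
                continue
            # Look for a type-2 bond in this molecule that starts at the ring atom.
            if any(bm == molecule and y == atom and btype == 2
                   for (bm, y, _z, btype) in bond):
                return True
    return False


# Hypothetical facts: (molecule, ring), (atom, ring), (molecule, atom, atom, bond type).
ring_size_5 = {("m1", "r1"), ("m2", "r2")}
element = {("a1", "r1"), ("a2", "r1"), ("b1", "r2")}
bond = {("m1", "a1", "a2", 2), ("m2", "b1", "b2", 1)}

print(is_active("m1", ring_size_5, element, bond))
print(is_active("m2", ring_size_5, element, bond))
```

A relational learner would search for a rule body like this one that covers the molecules labeled active and excludes the others.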


IE 660: Text Analytics

• Methods to automatically process natural language from a computational / algorithmic perspective

• An introduction to NLP in three main blocks:

• Computational linguistics

• Machine Learning and NLP

• Applications


IE 660: Text Analytics

• Lectures + Exercises (2+2 SWS)

• Course personnel:

• Prof. Dr. Simone Ponzetto

• Prof. Dr. Goran Glavaš

• Topics:

• Finite state methods

• Language models (N-Gram models)

• Semantics in a sparse/dense vector space

• Sequence labeling

• Neural networks and deep learning for NLP

• Applications: machine translation, sentiment analysis, etc.
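
As a small illustration of the N-gram language models in the topic list, here is a bigram model with maximum-likelihood estimates and no smoothing. The three-sentence corpus and the function names are assumptions for illustration; a realistic model would need smoothing to handle unseen word pairs.

```python
from collections import Counter


def train_bigram(sentences):
    """Count context words and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])  # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams


def bigram_prob(prev, word, contexts, bigrams):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / contexts[prev] if contexts[prev] else 0.0


# Hypothetical toy corpus.
corpus = ["I like data mining", "I like text mining", "students like data"]
contexts, bigrams = train_bigram(corpus)
print(bigram_prob("like", "data", contexts, bigrams))
```

Here "like" is followed by "data" in two of its three occurrences, so the estimate for P(data | like) is 2/3, while any pair that never occurs gets probability zero.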


IE 671: Web Mining

• Approaches to mine knowledge from the Web

• Web Usage Mining

• Web Structure Mining

• Web Content Mining

• Course Structure:

• Lectures and exercises

• Projects (during the second half)

• Teaching staff:

• Prof. Dr. Simone Ponzetto

• Prof. Dr. Goran Glavaš

• Dr. Dmitry Ustalov


IE 663 + IE 681: Information Retrieval

• Lectures (IE 663)

• Boolean and vector space retrieval models

• Probabilistic and language modeling retrieval

• Semantic and Latent Retrieval

• Web search: Link-based algorithms

• Teaching staff:

• Prof. Goran Glavaš (Lectures)

• Robert Litschko (Exercises)

• Team Project (IE 681)

• Build your own search engine!
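
A very small search engine in the spirit of the vector space retrieval model from the lecture topics might look like the sketch below: documents and the query become TF-IDF vectors, and documents are ranked by cosine similarity to the query. The three documents, the whitespace tokenizer, and the plain `log(N / df)` weighting are assumptions for illustration; the team project would presumably involve far more than this.

```python
import math
from collections import Counter


def build_index(docs):
    """Return per-document TF-IDF vectors and the IDF table."""
    tokenized = [doc.lower().split() for doc in docs]
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(len(docs) / count) for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for tokens in tokenized]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


def search(query, docs, vectors, idf):
    """Rank documents by cosine similarity to the query vector."""
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query.lower().split()).items()}
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vectors[i]), reverse=True)
    return [docs[i] for i in ranked]


# Hypothetical document collection.
docs = [
    "web data integration",
    "text mining and sentiment",
    "web search and link analysis",
]
vectors, idf = build_index(docs)
results = search("web search", docs, vectors, idf)
print(results[0])
```

The query shares the rare term "search" only with the third document, so that document ranks first even though "web" also occurs in the first one.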


• What you need to know to work with Big Data

• Fundamental concepts and computational paradigms for large-scale data management and Big Data

CS 560: Large-Scale Data Management


CS 560: Large-Scale Data Management

• Teaching staff:

• Prof. Rainer Gemulla (lectures)

• Daniel Ruffinelli (exercises/tutorials)

• 2 SWS lecture, 2 SWS tutorial, 6 ECTS

• Lecture: concepts, methods, systems

• Tutorial: In-depth discussion, exercises, hands-on assignments

• Prerequisites

• Database Systems I or equivalent

• Programming experience

• Passing requirements

• Written exam


IE 673: Data Mining and Matrices

• Matrices & tensors are powerful data representations

• Data points, sets, graphs, relational data, knowledge bases, ...

• Course goal: Learn how to analyze such data

• Course covers theory and applications of dimensionality reduction, embeddings, denoising, discovery of latent structure, visualization, prediction, clustering, pattern mining, topic modelling, …

• Focus is on unsupervised and semi-supervised learning & matrix decompositions
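
One simple instance of a matrix decomposition is a rank-1 approximation, which can be found with power iteration. The sketch below is written in plain Python for self-containment, although the course itself uses R; the example matrix and the function name `top_singular_triple` are assumptions for illustration, and the code assumes a non-zero matrix.

```python
import math


def top_singular_triple(A, iterations=100):
    """Power iteration on A^T A: returns (sigma, u, v) with A ≈ sigma * u * v^T."""
    rows, cols = len(A), len(A[0])
    v = [1.0 / math.sqrt(cols)] * cols  # uniform start vector
    for _ in range(iterations):
        u = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = A v
        w = [sum(A[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))  # largest singular value
    u = [x / sigma for x in u]
    return sigma, u, v


# Hypothetical data matrix of exact rank 1 (every row is a multiple of [1, 2]).
A = [[2.0, 4.0], [1.0, 2.0], [3.0, 6.0]]
sigma, u, v = top_singular_triple(A)
approx = [[sigma * u[i] * v[j] for j in range(2)] for i in range(3)]
print(round(sigma, 4))
```

Because this matrix has rank 1, the single factor pair reproduces it exactly; for real data the same factors would capture only the dominant latent structure.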


IE 673: Data Mining and Matrices

• Instructor: Rainer Gemulla

• Tutor: Daniel Ruffinelli

• 2 SWS lecture, 2 SWS tutorial, 6 ECTS

• IE 500 Data Mining I recommended

• Gain hands-on experience

• Smaller exercises to deepen lecture material

• Homework assignments to analyze real data

• Learn R

• Passing requirements

• Regular assignments

• Final exam or oral examination


IE 674: Hot Topics in Machine Learning

Machine learning

How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?

Goal: in-depth understanding of underlying algorithms and concepts

Focus: basics + selected “hot topics” and their applications


• Instructor: Rainer Gemulla

• Tutor: TBA

• 2 SWS lecture, 2 SWS tutorial, 6 ECTS

• Recommended prerequisites

• IE 500 Data Mining I, IE 560 Decision Support

• Basic knowledge of probability and statistics

• Gain hands-on experience

• Smaller exercises to deepen lecture material

• Homework assignments to analyze real data

• Learn Python, NumPy, scikit-learn, PyTorch, Stan, ...

• Passing requirements

• Written exam or oral examination

• Assignments

IE 674: Hot Topics in Machine Learning


CS 647: Image Processing

• Lecture contents

• Basics of Imaging

• Noise and basic operations

• Variational Methods

• Image Feature Extraction

• Segmentation

• Image Sequences and Motion

• Organization

• Lectures and Exercises

• Gain practical Python and C++ coding experience in the exercises

• Teaching Staff

• Prof. Margret Keuper (Lectures and Exercises)


CS 646: Higher Level Computer Vision

• Lecture contents

• Object Detection

• Semantic Image Segmentation

• Optical Flow

• Video and Motion Segmentation

• Deep Learning for Computer Vision

• Organization

• Lectures and Exercises

• Gain practical Python and MATLAB coding experience in the exercises

• Teaching Staff

• Margret Keuper (Lectures and Exercises)


CS 707: Data and Web Science Seminar

• Learn about recent advancements in data and web science

• Read, understand, explore, present, and peer-review scientific literature

• This term: Graph Mining and Learning from Graphs

• Topics: graph mining, graph representation learning, graph analysis frameworks, applications

Instructors: Kiril Gashteovski, Rainer Gemulla


CS 709: Text Analytics Seminar

• Goals:

• Examine and explore cutting-edge research in the areas of natural language processing, computational linguistics, and information retrieval

• Learn how to read and interpret scientific work in this area of research

• Learn how to write a survey/overview paper on the assigned topic, covering a specific task

• Instructors:

• Simone Ponzetto, Goran Glavaš

• Not offered this term


CS 710: Knowledge Graphs Seminar

• Gain insights into…

• Construction of Knowledge Graphs

• Contents of Open Knowledge Graphs

• Application Areas

• Instructor: Heiko Paulheim


CS 715: Large Scale Data Integration Seminar

− Covers current topics in the area of large-scale schema matching, identity resolution, data fusion, set completion, data search, data exploration, and data profiling

− Concrete topics change from semester to semester

− You summarize a current research topic in a concise report

− You systematically compare different state-of-the-art methods

− You give a presentation about your topic

− Good start for writing your master's thesis at the chair


• To work on:

• Data and Web Mining projects

• Information Extraction and Integration projects

• Knowledge Representation and Reasoning projects

• Natural Language Processing projects

• Implement open source tools

• 30-60 h/month contracts are possible

• Contact the postdoc or professor responsible for the project/area that you are interested in.

• Include your CV and transcript of records.

• Good start for writing your master's thesis within the group.

DWS hires good students!