

ICML/MLOSS 2010-06-25 Holger Arndt: The Universal Java Matrix Package 1

The Universal Java Matrix Package (UJMP)

Holger Arndt
Technical University of Munich
Department of Computer Science
Garching, Germany

[email protected]
http://www.holger-arndt.com

Find out more at: http://www.ujmp.org

ICML/MLOSS, Haifa, 2010-06-25

Everything is a Matrix!

Outline
  Introduction
  Comparison of Matrix Libraries for Java
  Concepts for a Next-Generation Matrix Library
  Integration of Other Matrix Libraries
  Calculation Methods
  Matrix Annotation
  Automatic Entry Type Conversion
  Demo
  Summary and Discussion


Introduction

Matrix computations are essential in various fields of computer science (machine learning, data mining, etc.)

Collaborative networks require large sparse adjacency matrices

The amount of data keeps increasing

But:
  No direct support for matrix algebra in the JDK
  Matlab or Octave cannot always be used
  Other libraries have limitations: JAMA, Colt, MTJ, commons-math

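Application areas: Online Marketing, Bio-Medical Data Analysis, Collaborative Networks

Example: a collaborative network with five nodes (1 to 5) and its adjacency matrix:

  0 1 1 1 0
  1 0 1 0 0
  1 1 0 1 0
  1 0 1 0 1
  0 0 0 1 0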
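The five-node adjacency matrix above is small, but the same idea scales to large networks if only the nonzero cells are stored. The following is a minimal sketch in plain Java, not UJMP code: the class `SparseAdjacency` and all of its methods are hypothetical names chosen for illustration, and the nodes are renumbered 0 to 4.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sparse storage for a symmetric 0/1 adjacency matrix.
// Only the nonzero cells are kept; this is an illustration, not UJMP code.
class SparseAdjacency {
    private final long size;                         // number of nodes (rows == columns)
    private final Set<Long> ones = new HashSet<>();  // encoded positions of the 1-entries

    SparseAdjacency(long size) {
        this.size = size;
    }

    // Encode (row, col) as one long; assumes size * size fits into a long.
    private long key(long row, long col) {
        if (row < 0 || row >= size || col < 0 || col >= size) {
            throw new IndexOutOfBoundsException("cell (" + row + "," + col + ")");
        }
        return row * size + col;
    }

    // Store an undirected edge as two mirrored 1-entries.
    void connect(long a, long b) {
        ones.add(key(a, b));
        ones.add(key(b, a));
    }

    int get(long row, long col) {
        return ones.contains(key(row, col)) ? 1 : 0;
    }

    int storedEntries() {
        return ones.size();
    }

    // Render one row as space-separated digits (only sensible for small matrices).
    String row(long r) {
        StringBuilder sb = new StringBuilder();
        for (long c = 0; c < size; c++) {
            if (c > 0) sb.append(' ');
            sb.append(get(r, c));
        }
        return sb.toString();
    }

    // The five-node network from the slide, with nodes renumbered 0..4.
    static SparseAdjacency slideExample() {
        SparseAdjacency m = new SparseAdjacency(5);
        m.connect(0, 1);
        m.connect(0, 2);
        m.connect(0, 3);
        m.connect(1, 2);
        m.connect(2, 3);
        m.connect(3, 4);
        return m;
    }

    public static void main(String[] args) {
        SparseAdjacency m = slideExample();
        for (long r = 0; r < 5; r++) {
            System.out.println(m.row(r));
        }
        System.out.println(m.storedEntries() + " of 25 cells stored");
    }
}
```

For a network with millions of nodes and only a few edges per node, the stored set grows with the number of edges rather than with the square of the number of nodes.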

Why do we need yet another Java matrix library?


Comparison of Matrix Libraries for Java

Libraries compared: JAMA, Colt, MTJ, commons-math, UJMP

Features compared (the per-library marks of the table are not preserved in this transcript):
  extendable
  dense matrices
  sparse matrices
  2D matrices
  3D matrices
  4D matrices
  matrices > 4D
  > 2^31 rows/columns
  object entries
  generic entries
  matrices > RAM
  advanced operators
  import/export filters
  Matlab/Octave/R interface
  visualization methods

No single Java matrix library can fulfill all needs! We need a "universal" matrix package...


Concepts for a Next-Generation Matrix Library

Matrix Interface: multi-dimensional, dense/sparse, 2^63 rows/columns, various cell types

  Function Declarations:
    size
    get/set cell
    plus, minus
    multiply, divide
    transpose
    min, max, mean
    variance, std
    sin, cos, tan
    select rows/cols
    get submatrix
    import/export
    visualization

Abstract Matrix

  Default Function Implementations

Matrix Implementations (list not complete)

  Custom Function Implementations

  Data in Memory: double[][], int[][], String[][]
  Data on Disk: CSV, TXT
  Database Tables: JDBC
  Matrix Libraries: JAMA, ojAlgo
  Java Libraries

The actual implementation of a matrix becomes secondary!
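The three layers of this slide (function declarations in an interface, default implementations in an abstract class, storage-specific implementations below) can be sketched in plain Java as follows. Every name in the sketch (`SimpleMatrix`, `AbstractSimpleMatrix`, `ArrayMatrix`, `IdentityMatrix`) is hypothetical and only two functions are shown; it does not reproduce UJMP's actual classes.

```java
// All names here are hypothetical and do not reproduce UJMP's API.

// Layer 1: function declarations.
interface SimpleMatrix {
    int rows();
    int cols();
    double get(int row, int col);
    void set(double value, int row, int col);
    SimpleMatrix transpose();
    double mean();
}

// Layer 2: default implementations written only against rows/cols/get/set.
abstract class AbstractSimpleMatrix implements SimpleMatrix {
    @Override
    public SimpleMatrix transpose() {
        SimpleMatrix t = new ArrayMatrix(cols(), rows());
        for (int r = 0; r < rows(); r++) {
            for (int c = 0; c < cols(); c++) {
                t.set(get(r, c), c, r);
            }
        }
        return t;
    }

    @Override
    public double mean() {
        double sum = 0.0;
        for (int r = 0; r < rows(); r++) {
            for (int c = 0; c < cols(); c++) {
                sum += get(r, c);
            }
        }
        return sum / ((double) rows() * cols());
    }
}

// Layer 3a: data held in memory as double[][].
class ArrayMatrix extends AbstractSimpleMatrix {
    private final double[][] data;

    ArrayMatrix(int rows, int cols) {
        data = new double[rows][cols];
    }

    public int rows() { return data.length; }
    public int cols() { return data.length == 0 ? 0 : data[0].length; }
    public double get(int row, int col) { return data[row][col]; }
    public void set(double value, int row, int col) { data[row][col] = value; }
}

// Layer 3b: no stored data at all; cells are computed on access.
class IdentityMatrix extends AbstractSimpleMatrix {
    private final int n;

    IdentityMatrix(int n) {
        this.n = n;
    }

    public int rows() { return n; }
    public int cols() { return n; }
    public double get(int row, int col) { return row == col ? 1.0 : 0.0; }
    public void set(double value, int row, int col) {
        throw new UnsupportedOperationException("read-only");
    }
}

class ArchitectureDemo {
    public static void main(String[] args) {
        SimpleMatrix m = new ArrayMatrix(2, 3);
        m.set(7.0, 0, 2);
        System.out.println(m.transpose().get(2, 0));
        System.out.println(new IdentityMatrix(4).mean());
    }
}
```

Because `transpose` and `mean` live in the abstract class, a new storage back end only has to supply the four access methods. A back end that can do better may override a default with its own version, which is one way to read "Custom Function Implementations" on the slide.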


Integration of Other Matrix Libraries

[Figure labels: "using JAMA", "switching to ojAlgo"]

Switching to faster libraries for better performance. Example: SVD
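One way to picture switching libraries behind a fixed interface is a replaceable back end. The sketch below is an assumption about the general pattern, not UJMP's mechanism, and it deliberately swaps the slide's SVD example for plain matrix multiplication to stay short. The names `MultiplyBackend`, `RowColumnBackend`, `RowSweepBackend` and `Calculator` are hypothetical, and both back ends are written inline instead of calling JAMA or ojAlgo.

```java
// Hypothetical back-end switch; stands in for delegating to external libraries.
interface MultiplyBackend {
    // Assumes a is n x k and b is k x m with k > 0.
    double[][] multiply(double[][] a, double[][] b);
}

// Textbook loop order: one dot product per result cell.
class RowColumnBackend implements MultiplyBackend {
    public double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                for (int x = 0; x < k; x++) {
                    c[i][j] += a[i][x] * b[x][j];
                }
            }
        }
        return c;
    }
}

// Alternative loop order that walks rows of b sequentially.
class RowSweepBackend implements MultiplyBackend {
    public double[][] multiply(double[][] a, double[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int x = 0; x < k; x++) {
                double aix = a[i][x];
                for (int j = 0; j < m; j++) {
                    c[i][j] += aix * b[x][j];
                }
            }
        }
        return c;
    }
}

// Callers use Calculator only; the back end can be replaced at runtime.
class Calculator {
    private MultiplyBackend backend = new RowColumnBackend();

    void useBackend(MultiplyBackend newBackend) {
        backend = newBackend;
    }

    double[][] multiply(double[][] a, double[][] b) {
        return backend.multiply(a, b);
    }
}

class BackendDemo {
    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        Calculator calc = new Calculator();
        System.out.println(java.util.Arrays.deepToString(calc.multiply(a, b)));
        calc.useBackend(new RowSweepBackend());
        System.out.println(java.util.Arrays.deepToString(calc.multiply(a, b)));
    }
}
```

The calling code stays the same before and after `useBackend`, which is the property the slide relies on when it moves a computation from one library to a faster one.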


Calculation Methods

[Diagram: a Matrix passed through a Calculation, drawn once for each mode]
  copy
  link
  original

There are three different "modes" to perform a calculation.
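The slide only names the three modes, so the sketch below rests on a literal reading of those names, which is an assumption: "copy" writes the result into a new container, "link" returns a view that computes values on access from the source data, and "original" overwrites the source in place. The class `CalcModes` and its methods are hypothetical and operate on a plain `double[]` to keep the example small.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical illustration of three calculation modes on a double[].
class CalcModes {

    // Read-only view whose values are computed when they are requested.
    interface View {
        int length();
        double get(int index);
    }

    // copy: the result goes into a new array, the source stays untouched.
    static double[] copy(double[] source, DoubleUnaryOperator f) {
        double[] result = new double[source.length];
        for (int i = 0; i < source.length; i++) {
            result[i] = f.applyAsDouble(source[i]);
        }
        return result;
    }

    // link: nothing is computed up front; later changes to the source show through.
    static View link(double[] source, DoubleUnaryOperator f) {
        return new View() {
            public int length() {
                return source.length;
            }

            public double get(int index) {
                return f.applyAsDouble(source[index]);
            }
        };
    }

    // original: the source array itself is overwritten with the result.
    static double[] original(double[] source, DoubleUnaryOperator f) {
        for (int i = 0; i < source.length; i++) {
            source[i] = f.applyAsDouble(source[i]);
        }
        return source;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0};
        DoubleUnaryOperator twice = v -> 2.0 * v;

        double[] copied = copy(data, twice);
        View linked = link(data, twice);
        data[0] = 10.0;                        // visible through the view, not in the copy
        System.out.println(copied[0]);
        System.out.println(linked.get(0));
        original(data, twice);                 // data itself changes
        System.out.println(data[0]);
    }
}
```

Under this reading, the copy costs memory for a second container, the link costs recomputation on every access, and the in-place mode costs neither but destroys the input.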


Matrix Annotation

Report June 2009
product data
label for row axis

         product id   # of sales   in stock   margin   price [US$]
row1     6757         5            yes        15.4%    230.87
row2     6876         1            yes        20.3%    330.53
row3     9976         4            yes        12.3%    321.45
row4     9975         2            no         7.4%     732.42
row5     980          1            yes        2.4%     643.32
row6     8657         1            yes        33.2%    313.53
row7     7677         5            no         23.4%    832.95
row8     7657         13           yes        11.5%    542.32
row9     6678         9            yes        6.5%     232.54
row10    8865         2            yes        45.6%    335.21

Callouts in the figure: matrix label, axis label, column labels

Data requires annotation to be valuable.
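A minimal sketch of what annotation buys: cells can be addressed by row and column label instead of by index. The class `AnnotatedTable` is hypothetical and is not UJMP's annotation API. The two data rows are taken from the slide, and the assignment of values to columns (product id, # of sales, in stock, margin, price) is an assumption based on the order in which the values appear.

```java
// Hypothetical label-aware table; not UJMP's annotation API.
class AnnotatedTable {
    private final String label;          // label of the whole matrix
    private final String[] rowLabels;
    private final String[] columnLabels;
    private final Object[][] cells;

    AnnotatedTable(String label, String[] rowLabels, String[] columnLabels, Object[][] cells) {
        this.label = label;
        this.rowLabels = rowLabels;
        this.columnLabels = columnLabels;
        this.cells = cells;
    }

    String label() {
        return label;
    }

    // Look up a cell by its labels rather than by numeric position.
    Object get(String rowLabel, String columnLabel) {
        return cells[indexOf(rowLabels, rowLabel)][indexOf(columnLabels, columnLabel)];
    }

    private static int indexOf(String[] labels, String wanted) {
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].equals(wanted)) {
                return i;
            }
        }
        throw new IllegalArgumentException("unknown label: " + wanted);
    }

    // First two rows of the slide's table; the column assignment is assumed.
    static AnnotatedTable slideExcerpt() {
        String[] rows = {"row1", "row2"};
        String[] columns = {"product id", "# of sales", "in stock", "margin", "price [US$]"};
        Object[][] data = {
            {6757, 5, true, 15.4, 230.87},
            {6876, 1, true, 20.3, 330.53},
        };
        return new AnnotatedTable("Report June 2009", rows, columns, data);
    }

    public static void main(String[] args) {
        AnnotatedTable t = slideExcerpt();
        System.out.println(t.label());
        System.out.println(t.get("row2", "price [US$]"));
    }
}
```

Without the labels, the same lookup would be `cells[1][4]`, which says nothing about what the number means.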


Automatic Entry Type Conversion

Not all matrices contain numerical data.

Matrix imported from CSV (entries in transcript order):
  "9.1" "4.0" "1.9" "0.5" "7.7" "5.7" "3.8" "This" "matrix" "contains" "Strings," "data" "must" "be" "converted" "1.2"

Reading the entry "1.2" at position (3,1):
  getAsString(3,1)   ->  "1.2"
  getAsDouble(3,1)   ->  1.2
  getAsInt(3,1)      ->  1
  getAsLong(3,1)     ->  1l
  getAsBoolean(3,1)  ->  true

Supported value types: float, double, byte, char, short, int, long, boolean, Date, BigDecimal, BigInteger, String, Object, <Generic>
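The conversions shown for the entry "1.2" can be reproduced with a few lines of plain Java. The class `CellConverter` below is a hypothetical sketch, not UJMP's implementation. Its rules for the numeric string match the values on the slide; the handling of non-numeric strings (NaN as a double, 0 as an integer, false as a boolean) is an assumption made for this example.

```java
// Hypothetical conversion helper; the fallback rules are assumptions.
class CellConverter {

    static String asString(Object value) {
        return String.valueOf(value);
    }

    // Numbers pass through, booleans map to 1/0, everything else is parsed as text.
    static double asDouble(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? 1.0 : 0.0;
        }
        try {
            return Double.parseDouble(String.valueOf(value).trim());
        } catch (NumberFormatException e) {
            return Double.NaN;   // assumed fallback for non-numeric text
        }
    }

    // Truncates toward zero; a NaN source becomes 0 under Java's cast rules.
    static int asInt(Object value) {
        return (int) asDouble(value);
    }

    static long asLong(Object value) {
        return (long) asDouble(value);
    }

    // Any nonzero number counts as true; non-numeric text counts as false.
    static boolean asBoolean(Object value) {
        if (value instanceof Boolean) {
            return (Boolean) value;
        }
        double d = asDouble(value);
        return !Double.isNaN(d) && d != 0.0;
    }

    public static void main(String[] args) {
        Object cell = "1.2";   // a String cell, as in a matrix imported from CSV
        System.out.println(asString(cell));
        System.out.println(asDouble(cell));
        System.out.println(asInt(cell));
        System.out.println(asLong(cell));
        System.out.println(asBoolean(cell));
    }
}
```

Keeping the conversion in one place means calling code can ask for the type it needs without first checking how the cell happens to be stored.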


Demo

It is important to visualize data.

[Screenshots: editor, 2D overview, additional visualization modules]

20GB of data!

Example Code:


Summary and Discussion

Summary:
  Extendable architecture
  Ready for large amounts of data
  Integration of other libraries
  Flexible calculation methods
  Open Source LGPL
  Online forum for Q&A

Future Work:
  Documentation
  Developers wanted!

The Universal Java Matrix Package: A novel and innovative matrix library for Java.

Homepage: http://www.ujmp.org