apache madlib (incubating)madlib.incubator.apache.org/community-artifacts/apache...apache madlib...
TRANSCRIPT
3
Summary (1) • ~50% of respondents have 1 year or less of
MADlib use• Fraud detection is the most common use case• Regression (various), clustering and random
forest are the most commonly used MADlib algorithms
• Gradient boosting is the most commonly requested new algorithm
4
Summary (2) • Users prefer new algorithms more than
improvements to existing algorithms by a 2:1 margin
• Improved documentation/examples and better performance are the biggest concerns
• The most common other tools used by respondents are R, Spark and Python (and associated libraries)
12
Q6 - Top Requested Features
*Note that there is an R interface called PivotalRhttps://cran.r-project.org/web/packages/PivotalR/
*