Give Me My Damn Report: Making NoSQL Data Accessible to the Business
![Page 1: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/1.jpg)
@slamdata @jdegoes
John A. De Goes — CTO SlamData Inc.
Give Me My Damn Report: Making NoSQL Data Accessible to the Business
![Page 2: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/2.jpg)
Agenda
1. The Rise of NoSQL
2. The Dark Side of NoSQL
3. Options for Reporting
   a. Extract-Transform-Load
   b. Fat Drivers
   c. Code to NoSQL APIs
   d. Native NoSQL Analytics
4. Why NoSQL Analytics is Hard
5. NoSQL Databases: Not Equal
6. Question & Answer
![Page 3: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/3.jpg)
The Rise of NoSQL
![Page 4: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/4.jpg)
The Rise of NoSQL
![Page 5: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/5.jpg)
The Rise of NoSQL
● Massively scalable
● Operational Ease-of-Use
● Native support for rich data structures
● Native support for heterogeneity
● Rapid Time-to-Deployment
![Page 6: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/6.jpg)
The Rise of NoSQL
![Page 7: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/7.jpg)
The Dark Side of NoSQL
![Page 8: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/8.jpg)
The Dark Side of NoSQL: Overview
![Page 9: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/9.jpg)
The Dark Side of NoSQL: Need for Analytics
Give Me My Damn Report!
● Ad hoc analytics
● Exploratory analytics
● Operational analytics
● Analytics dashboards
● Batch reporting
● IoT / Event analytics
![Page 10: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/10.jpg)
The Dark Side of NoSQL: SQL Analytics
![Page 11: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/11.jpg)
The Dark Side of NoSQL: Choices
1. ETL
2. Fat Drivers
3. Code to NoSQL API
4. Native NoSQL Analytics
![Page 12: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/12.jpg)
Options for Reporting
![Page 13: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/13.jpg)
Extract-Transform-Load: Overview
{"user_id": "[email protected]",
 "profile": {
   "name": "Mary Jane",
   "addresses": [{
     "city": "London",
     "country": "UK"
   }],
   "band_plays": {
     "Squirrel Nut Zippers": 56,
     "Red Hot Tomatoes": 19,
     "Big Bad Voodoo Daddy": 102
   }
 }
}
SQL / Hadoop
![Page 14: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/14.jpg)
Extract-Transform-Load: 1. Flattening

users

| user_id |
| --- |
| ... |
| ... |

band_plays

| user_id | band_name | play_count |
| --- | --- | --- |
| [email protected] | Squirrel Nut Zippers | 56 |
| [email protected] | Red Hot Tomatoes | 19 |
| [email protected] | Big Bad Voodoo Daddy | 102 |

profiles

| profile_id | user_id | name |
| --- | --- | --- |
| 1 | [email protected] | Mary Jane |

addresses

| profile_id | city | country |
| --- | --- | --- |
| 1 | London | UK |
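The flattening step can be sketched in a few lines of Python. This is only an illustration of the idea, not any ETL tool's actual behavior: the function name `flatten_user`, the tuple-based rows, and the placeholder email address are all assumptions made for the example.

```python
from typing import Any


def flatten_user(doc: dict[str, Any], profile_id: int) -> dict[str, list[tuple]]:
    """Split one nested user document into rows for four flat tables."""
    user_id = doc["user_id"]
    profile = doc["profile"]
    return {
        "users": [(user_id,)],
        "profiles": [(profile_id, user_id, profile["name"])],
        # One row per element of the nested array.
        "addresses": [
            (profile_id, addr["city"], addr["country"])
            for addr in profile["addresses"]
        ],
        # Map keys (band names) become values in a column.
        "band_plays": [
            (user_id, band, count)
            for band, count in profile["band_plays"].items()
        ],
    }


# Hypothetical document modelled on the slide; the email is a placeholder.
doc = {
    "user_id": "mary@example.com",
    "profile": {
        "name": "Mary Jane",
        "addresses": [{"city": "London", "country": "UK"}],
        "band_plays": {
            "Squirrel Nut Zippers": 56,
            "Red Hot Tomatoes": 19,
            "Big Bad Voodoo Daddy": 102,
        },
    },
}

tables = flatten_user(doc, profile_id=1)
for name, rows in tables.items():
    print(name, rows)
```

One document becomes rows in four tables, which is why every later report needs joins to reassemble it.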
![Page 15: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/15.jpg)
Extract-Transform-Load: 2. Homogenization

events

| type | user_id | genre_name | artist_name | band_name | play_count |
| --- | --- | --- | --- | --- | --- |
| “band_play” | ... | NULL | NULL | “Squirrel Nut Zippers” | 56 |
| “artist_play” | ... | NULL | “Frank Sinatra” | NULL | 19 |
| “genre_play” | ... | “New Age” | NULL | NULL | 102 |
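Homogenization can be sketched as projecting every event, whatever its shape, onto one fixed column list and filling the gaps with NULL. The sketch below assumes Python dictionaries as input; the name `homogenize` and the `u1` user id are hypothetical.

```python
COLUMNS = ["type", "user_id", "genre_name", "artist_name", "band_name", "play_count"]


def homogenize(event: dict) -> tuple:
    """Project an event onto the fixed columns; missing fields become None (NULL)."""
    return tuple(event.get(col) for col in COLUMNS)


# Hypothetical events of three different shapes.
events = [
    {"type": "band_play", "user_id": "u1", "band_name": "Squirrel Nut Zippers", "play_count": 56},
    {"type": "artist_play", "user_id": "u1", "artist_name": "Frank Sinatra", "play_count": 19},
    {"type": "genre_play", "user_id": "u1", "genre_name": "New Age", "play_count": 102},
]

rows = [homogenize(e) for e in events]
for row in rows:
    print(row)
```

The table widens with every new event type, and most cells in any given row stay NULL.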
![Page 16: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/16.jpg)
Extract-Transform-Load: 3. Incremental ETL
1. Last_modified Field
2. Import changed data*
* Less relevant for Hadoop
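The two steps above amount to keeping a watermark and importing only documents modified after it. A minimal sketch, assuming each document carries a numeric `last_modified` field; the function names and sample data are hypothetical.

```python
def changed_since(docs, watermark):
    """Return documents whose last_modified is strictly newer than the watermark."""
    return [d for d in docs if d["last_modified"] > watermark]


def run_increment(docs, watermark):
    """Import one batch and return it with the advanced watermark."""
    batch = changed_since(docs, watermark)
    # If nothing changed, the watermark stays where it was.
    new_watermark = max((d["last_modified"] for d in batch), default=watermark)
    return batch, new_watermark


# Hypothetical source collection.
source = [
    {"_id": 1, "last_modified": 100},
    {"_id": 2, "last_modified": 250},
    {"_id": 3, "last_modified": 300},
]

batch, watermark = run_increment(source, watermark=200)
print([d["_id"] for d in batch], watermark)
```

Note that this approach only works if every write updates `last_modified`, and it does not see deletions.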
![Page 17: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/17.jpg)
Extract-Transform-Load: Tools
![Page 18: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/18.jpg)
Extract-Transform-Load: Report Card
✗ Slow
✗ Painful
✗ Brittle
✓ Tunable Performance
✓ Unlimited Flexibility in Reporting / Analytics
![Page 19: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/19.jpg)
Fat Drivers: Overview
Driver
Embedded SQL Engine
Real-Time ETL (Filtered Table Scan)
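The fat-driver idea can be imitated with Python's built-in `sqlite3` module standing in for the embedded SQL engine. This is a toy sketch and not how any vendor's driver is implemented: the function name, the `addresses` table, and the sample documents are assumptions.

```python
import sqlite3


def query_via_fat_driver(docs, keep, sql):
    """Scan the documents, load matching rows into an in-memory SQL engine, run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE addresses (user_id TEXT, city TEXT, country TEXT)")
    for doc in docs:  # full scan of the source collection
        for addr in doc["addresses"]:
            if keep(addr):  # filter applied while scanning
                conn.execute(
                    "INSERT INTO addresses VALUES (?, ?, ?)",
                    (doc["user_id"], addr["city"], addr["country"]),
                )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


# Hypothetical documents.
docs = [
    {"user_id": "u1", "addresses": [{"city": "London", "country": "UK"}]},
    {"user_id": "u2", "addresses": [{"city": "Leeds", "country": "UK"},
                                    {"city": "Austin", "country": "US"}]},
    {"user_id": "u3", "addresses": [{"city": "London", "country": "UK"}]},
]

rows = query_via_fat_driver(
    docs,
    keep=lambda addr: addr["country"] == "UK",
    sql="SELECT city, COUNT(*) FROM addresses GROUP BY city ORDER BY city",
)
print(rows)
```

Every query repeats the scan and the load, so the cost grows with the size of the collection rather than the size of the answer.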
![Page 20: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/20.jpg)
Fat Drivers: Approaches
Magic Config
![Page 21: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/21.jpg)
Fat Drivers: Vendors
![Page 22: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/22.jpg)
Fat Drivers: Report Card
✗ Slow
✗ Limited to Small Data
✗ Limited to Simple Analytics
✗ Limited to Simple Data
✓ Low Friction
✓ Flexibility in Analytics / Reporting
![Page 23: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/23.jpg)
Code to NoSQL API: Overview
Code
CSV
HTML5/JavaScript
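Coding a report by hand usually means iterating over documents, aggregating in application code, and writing the result out as CSV. A minimal sketch, assuming the documents have already been fetched into Python dictionaries; the function name and sample data are hypothetical.

```python
import csv
import io


def total_plays_report(docs) -> str:
    """Aggregate total plays per user in application code and render the result as CSV."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user_id", "total_plays"])
    for doc in docs:
        total = sum(doc.get("band_plays", {}).values())
        writer.writerow([doc["user_id"], total])
    return out.getvalue()


# Hypothetical documents.
docs = [
    {"user_id": "u1", "band_plays": {"Squirrel Nut Zippers": 56, "Red Hot Tomatoes": 19}},
    {"user_id": "u2", "band_plays": {"Big Bad Voodoo Daddy": 102}},
    {"user_id": "u3"},
]

report = total_plays_report(docs)
print(report)
```

Each new question from the business needs a new function like this one.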
![Page 24: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/24.jpg)
Code to NoSQL API: Report Card
✗ Slow
✗ Painful
✗ Brittle
✗ Performance
✓ No ETL
![Page 25: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/25.jpg)
Native NoSQL Analytics: Overview
Native NoSQL Analytics
![Page 26: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/26.jpg)
Native NoSQL Analytics: Tools
SQL (+/-)
Visual Analytics
ETL (+/-) Native
ZoomData
Cloud 9 Charts
JSON Studio
Apache Drill
Quasar
SlamData
Impala
![Page 27: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/27.jpg)
Native NoSQL Analytics: Report Card
✗ Immature
✗ Learning Curve
✗ Limited Choices
✓ No ETL
✓ Flexible & Fast
✓ Any data, Anywhere
✓ Tunable Performance
![Page 28: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/28.jpg)
Why NoSQL Analytics Is Hard
![Page 29: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/29.jpg)
The Eight Deadly Obstacles to NoSQL Analytics
![Page 30: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/30.jpg)
Characteristics: 1. Generic Data Model
![Page 31: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/31.jpg)
Characteristics: 2. Isomorphic Data Model
Data
{
  "userId": 8927524,
  "profile": {
    "name": "Mary Jane",
    "age": 29,
    "gender": "female"
  },
  "comments": [{
    "id": "F2372BAC",
    "text": "I concur.",
    "replyTo": [9817361, "F8ACD164F"],
    "time": "2015-02-03"
  }, {
    "id": "GH732AFC",
    "replyTo": [9654726, "A44124F"],
    "time": "2015-03-01"
  }]
}
SQL²
SELECT comments[*].replyTo[*] FROM data
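One plausible reading of the query above is "every element of every `replyTo` array of every comment". The Python sketch below emulates that reading on the slide's document; it is not a statement of SQL²'s exact semantics, and the function name is hypothetical.

```python
def reply_targets(doc):
    """Flatten comments[*].replyTo[*]: one output value per nested array element."""
    return [
        target
        for comment in doc.get("comments", [])
        for target in comment.get("replyTo", [])
    ]


# The slide's document, trimmed to the fields the query touches.
doc = {
    "userId": 8927524,
    "comments": [
        {"id": "F2372BAC", "replyTo": [9817361, "F8ACD164F"]},
        {"id": "GH732AFC", "replyTo": [9654726, "A44124F"]},
    ],
}

targets = reply_targets(doc)
print(targets)
```

The result mixes integers and strings, which is the kind of heterogeneity a relational column cannot hold without conversion.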
![Page 32: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/32.jpg)
Characteristics: 3. Multidimensionality
Data
{"user_id": 928347234,
 "email": null,
 "events": [
   {"impression": {
      "ts": 912348934,
      "page": "index.html"}}]}
SQL²
SELECT user_id, [events[_] WHERE events[_].ts < 9347234 ...] AS events FROM visitors
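The point of the query above is that the filtered events stay nested inside their visitor instead of being flattened into rows. A rough Python equivalent, assuming (as the query does) that `ts` sits directly on each event; the function name and the sample values are hypothetical.

```python
def events_before(visitor, cutoff):
    """Keep the visitor as one record, but retain only events older than the cutoff."""
    return {
        "user_id": visitor["user_id"],
        "events": [e for e in visitor["events"] if e["ts"] < cutoff],
    }


# Hypothetical visitor with simplified events.
visitor = {
    "user_id": 928347234,
    "events": [
        {"ts": 100, "page": "index.html"},
        {"ts": 900, "page": "about.html"},
    ],
}

result = events_before(visitor, cutoff=500)
print(result)
```

The output has one record per visitor with an array-valued field, a shape a flat result set cannot express directly.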
![Page 33: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/33.jpg)
Characteristics: 4. Unified Schema/Data
Data
{"user_id": "[email protected]",
 "band_plays": {
   "Squirrel Nut Zippers": 56,
   "Red Hot Tomatoes": 19,
   "Big Bad Voodoo Daddy": 102}}
SQL²
SELECT band_plays{*:} AS artistName, SUM(band_plays{*}) AS votes FROM music GROUP BY band_plays{*:}
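Here the map keys (band names) are data, so the query has to treat what looks like schema as values. A Python sketch of one reading of the query, grouping by key and summing the values; the function name and the second document are hypothetical.

```python
from collections import defaultdict


def votes_by_artist(docs):
    """Group by the keys of band_plays and sum the corresponding values."""
    totals = defaultdict(int)
    for doc in docs:
        for artist_name, plays in doc.get("band_plays", {}).items():
            totals[artist_name] += plays
    return dict(totals)


# First document follows the slide; the second is invented for the example.
docs = [
    {"band_plays": {"Squirrel Nut Zippers": 56, "Red Hot Tomatoes": 19,
                    "Big Bad Voodoo Daddy": 102}},
    {"band_plays": {"Squirrel Nut Zippers": 4}},
]

votes = votes_by_artist(docs)
print(votes)
```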
![Page 34: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/34.jpg)
Characteristics: 5. Polymorphic Queries
Data
{"type": "click",
 "link": "http://foo.com",
 "timestamp": 123987172}
{"type": "impression",
 "page": "index.html",
 "timestamp": 92372}
SQL²
SELECT COUNT(*) AS count, timestamp FROM data GROUP BY timestamp
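A polymorphic query runs over documents of different shapes as long as they share the fields it references. A Python sketch of the grouping above; skipping documents without a `timestamp` is an assumption of this example, and the sample events are hypothetical.

```python
from collections import Counter


def count_by_timestamp(docs):
    """Count documents per timestamp, regardless of each document's type."""
    return Counter(d["timestamp"] for d in docs if "timestamp" in d)


# Hypothetical mixed-shape events.
docs = [
    {"type": "click", "link": "http://foo.com", "timestamp": 100},
    {"type": "impression", "page": "index.html", "timestamp": 100},
    {"type": "impression", "page": "about.html", "timestamp": 200},
    {"type": "note"},  # no timestamp: ignored in this sketch
]

counts = count_by_timestamp(docs)
print(counts)
```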
![Page 35: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/35.jpg)
Characteristics: 6. Post-Relational
Data
{"name": "John Doe",
 "blog_posts": [
   {"post_id": "89934"},
   {"post_id": "92371"}
 ]}
SQL²
SELECT authors.name, posts.title FROM authors JOIN posts ON authors.blog_posts[*].post_id = posts._id
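The join condition above reaches inside an array: each element of `blog_posts` is matched against `posts._id`. A Python sketch of that join using a lookup table; the function name and the post titles are hypothetical.

```python
def author_post_titles(authors, posts):
    """Join each author's nested post references to the posts collection."""
    title_by_id = {p["_id"]: p["title"] for p in posts}
    return [
        (author["name"], title_by_id[ref["post_id"]])
        for author in authors
        for ref in author.get("blog_posts", [])
        if ref["post_id"] in title_by_id  # inner join: drop dangling references
    ]


authors = [{"name": "John Doe",
            "blog_posts": [{"post_id": "89934"}, {"post_id": "92371"}]}]
# Hypothetical posts collection.
posts = [
    {"_id": "89934", "title": "First Post"},
    {"_id": "92371", "title": "Second Post"},
    {"_id": "11111", "title": "Someone Else's Post"},
]

joined = author_post_titles(authors, posts)
print(joined)
```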
![Page 36: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/36.jpg)
Characteristics: 7. Runtime Type ID & Conversion
Data
{"email": ["[email protected]",
{"email": {
   "home": "[email protected]",
   "work": "[email protected]"}}
SQL²
SELECT
  CASE TYPEOF email
    -- old: email stored in 2nd el:
    WHEN 'array' THEN email[1]
    -- new format:
    WHEN 'map' THEN email.work
    ELSE email
  END AS email
FROM users
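The CASE expression above dispatches on the runtime type of a field whose format changed over time. The same logic in Python, as a sketch; the function name and the placeholder addresses are hypothetical, and returning `None` for a too-short array is an assumption of this example.

```python
def normalize_email(value):
    """Pick one email address out of the old array, new map, or plain string format."""
    if isinstance(value, list):
        # Old format: the address of interest is the second element.
        return value[1] if len(value) > 1 else None
    if isinstance(value, dict):
        # New format: prefer the work address.
        return value.get("work")
    return value


# Hypothetical users in three generations of the format.
users = [
    {"email": ["old-home@example.com", "old-work@example.com"]},
    {"email": {"home": "home@example.com", "work": "work@example.com"}},
    {"email": "plain@example.com"},
]

emails = [normalize_email(u["email"]) for u in users]
print(emails)
```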
![Page 37: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/37.jpg)
Characteristics: 8. Structural Pattern Matching
Data
{"user_id": "[email protected]",
 "events": [{"type": "purchase",
             "timestamp": 12392342,
             "order_id": "2ffa34aa"},
            {"type": "click",
             "timestamp": 92327123,
             "link": "http://foo.com"}]}
SQL²
SELECT
  CASE user_events
    WHEN […, e1, e2, …] THEN
      e1.timestamp - e2.timestamp
  END AS delta
FROM users
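The pattern `[…, e1, e2, …]` can be read as "any two adjacent events in the array". Under that reading, which is an assumption about the intended semantics, the query yields one delta per adjacent pair. A Python sketch with a hypothetical function name and sample timestamps:

```python
def adjacent_deltas(events):
    """For every adjacent pair (e1, e2), compute e1.timestamp - e2.timestamp."""
    return [
        e1["timestamp"] - e2["timestamp"]
        for e1, e2 in zip(events, events[1:])
    ]


# Hypothetical event stream.
events = [
    {"type": "purchase", "timestamp": 10},
    {"type": "click", "timestamp": 25},
    {"type": "click", "timestamp": 40},
]

deltas = adjacent_deltas(events)
print(deltas)
```

An array with fewer than two events matches nothing and produces an empty list.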
![Page 38: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/38.jpg)
NoSQL Databases: Not Equal
![Page 39: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/39.jpg)
NoSQL Databases: Not Equal
Desired Characteristics
1. Dual Operations & Analytics
2. In-Database Analytics
3. General-Purpose Analytics
4. Native Report Tooling
![Page 40: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/40.jpg)
NoSQL Databases: Not Equal
Couchbase
✓ Dual Operations / Analytics
✓ In-Database Analytics
✓ General-Purpose Analytics
✗ Native Report Tooling
Best Reporting Option: Fat Drivers
Runner-Up: Code to NoSQL APIs
![Page 41: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/41.jpg)
NoSQL Databases: Not Equal
MarkLogic
✓ Dual Operations / Analytics
✓ In-Database Analytics
✓ General-Purpose Analytics
✗ Native Report Tooling
Best Reporting Option: ETL
Runner-Up: Code to NoSQL APIs
![Page 42: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/42.jpg)
NoSQL Databases: Not Equal
MongoDB
✓ Dual Operations / Analytics*
✓ In-Database Analytics
✗ General-Purpose Analytics
✓ Native Report Tooling
Best Reporting Option: Native NoSQL Analytics
Runner-Up: Code to NoSQL APIs
* Further maturation needed
![Page 43: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/43.jpg)
NoSQL Databases: Not Equal
ElasticSearch
✓ Dual Operations / Analytics
✓ In-Database Analytics
✗ General-Purpose Analytics
✗ Native Report Tooling
Best Reporting Option: Code to NoSQL APIs
Runner-Up: ETL to Hadoop
![Page 44: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/44.jpg)
NoSQL Databases: Not Equal
Cassandra
✗ Dual Operations / Analytics*
✗ In-Database Analytics*
✗ General-Purpose Analytics
✗ Native Report Tooling
Best Reporting Option: ETL
Runner-Up: Code to NoSQL APIs*
* Real-time analytics
![Page 45: Give Me My Damn Report: Making NoSQL Data Accessible to the Business](https://reader031.vdocuments.mx/reader031/viewer/2022030316/5872bcee1a28ab0c718b4793/html5/thumbnails/45.jpg)
The End
Questions?