Dredge Overview
TRANSCRIPT
![Page 1: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/1.jpg)
Hemal Gandhi, Director of Data Engineering
![Page 2: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/2.jpg)
DATA ENGINEERING AT ONE KINGS LANE
Powering business decisions through understanding
customer behavior.
![Page 3: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/3.jpg)
Observations on Data Platforms
![Page 4: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/4.jpg)
DREAM
You start with a simple design...
![Page 5: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/5.jpg)
REALITY
...but you end up with a complex design.
![Page 6: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/6.jpg)
DREAM
You start with full speed...
![Page 7: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/7.jpg)
REALITY
...but you end up being slow.
![Page 8: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/8.jpg)
DREAM
You start with the latest technology...
![Page 9: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/9.jpg)
REALITY
...but you end up with an old stack before going live.
![Page 10: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/10.jpg)
DREAM
You dream of a low cost platform...
![Page 11: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/11.jpg)
REALITY
...but you end up shelling out a lot of $$.
![Page 12: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/12.jpg)
WHAT IS OUR GOAL
To build a scalable, loosely coupled big data platform.
![Page 13: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/13.jpg)
DESIGN
Some design questions we need to answer:
![Page 14: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/14.jpg)
Which technologies to choose? How to keep the stack current?
![Page 15: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/15.jpg)
How to keep up with evolving business needs?
![Page 16: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/16.jpg)
How to make your investment count?
![Page 17: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/17.jpg)
It’s like building a city.
![Page 18: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/18.jpg)
Technology
Process
People
![Page 19: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/19.jpg)
![Page 20: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/20.jpg)
High Level Architecture
![Page 21: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/21.jpg)
DATA PLATFORM
COLLECTION
- Apache Flume
- Sqoop
FLOW
- Kafka
- Spark
STORAGE
- HBase
- Hive
PROCESSING
- Pig
- Spark
DELIVERY
- Visualization
- Email / FTP
![Page 22: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/22.jpg)
SCHEDULING & CLUSTER MONITORING
SECURITY
![Page 23: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/23.jpg)
APPLICATIONS & VISUALIZATION TOOLS
![Page 24: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/24.jpg)
DATA ACCESS ABSTRACTION API
DATA QUALITY SERVICE
![Page 25: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/25.jpg)
DREDGE
![Page 26: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/26.jpg)
WHAT IS DREDGE
A declarative abstraction layer for integrating big data
tools, enabling a loosely coupled big data platform.
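To make "declarative" concrete, here is a minimal Python sketch of what declaring a flow, rather than coding it, might look like. The slides do not show Dredge's real configuration format, so every name below (`FlowSpec`, `validate_flow`, the dictionary keys, the paths and table names) is a hypothetical stand-in.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: not Dredge's actual configuration format.
@dataclass(frozen=True)
class FlowSpec:
    name: str
    source: dict                                 # which reader to use and where to read from
    tasks: list = field(default_factory=list)    # ordered names of transformations
    target: dict = field(default_factory=dict)   # which writer to use and where to write

# Assumed set of reader/writer kinds, loosely based on the tools named in the slides.
KNOWN_READERS = {"logs", "rdbms"}
KNOWN_WRITERS = {"hive", "hbase", "rdbms"}

def validate_flow(spec: FlowSpec) -> list:
    """Return a list of problems; an empty list means the declaration is usable."""
    problems = []
    if spec.source.get("reader") not in KNOWN_READERS:
        problems.append(f"unknown reader: {spec.source.get('reader')!r}")
    if spec.target.get("writer") not in KNOWN_WRITERS:
        problems.append(f"unknown writer: {spec.target.get('writer')!r}")
    return problems

# The flow is described as data; nothing here names a specific engine.
clickstream = FlowSpec(
    name="clickstream_daily",
    source={"reader": "logs", "path": "/var/log/web/access.log"},
    tasks=["filter_bots", "aggregate_by_customer"],
    target={"writer": "hive", "table": "analytics.clicks_daily"},
)
```

Because the flow is plain data, the tool behind a reader or writer could in principle be swapped without touching the declaration, which is one way to read "loosely coupled" here.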
![Page 27: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/27.jpg)
DREDGE LOGICAL VIEW
SOURCE ENDPOINTS
![Page 28: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/28.jpg)
SOURCE READERS
![Page 29: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/29.jpg)
TASKS
HADOOP CLUSTER
![Page 30: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/30.jpg)
TARGET WRITERS
STREAM/DIRECT
![Page 31: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/31.jpg)
TARGET ENDPOINTS
![Page 32: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/32.jpg)
LOG STREAMING
![Page 33: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/33.jpg)
EVENTS MANAGEMENT
![Page 34: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/34.jpg)
CONFIGURATION
ABSTRACTION
![Page 35: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/35.jpg)
DREDGE REPOSITORY – HBASE
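The logical view (source readers feeding tasks, tasks feeding target writers) can be sketched in a few lines of Python. This is only an illustration of the shape of that pipeline, not Dredge's implementation: `run_flow`, `list_reader`, `keep_purchases`, and `list_writer` are hypothetical names, and in-memory lists stand in for real endpoints and the Hadoop cluster.

```python
from typing import Callable, Iterable, Iterator, List

Record = dict
Reader = Callable[[], Iterator[Record]]                # pulls records from a source endpoint
Task = Callable[[Iterable[Record]], Iterator[Record]]  # transforms a stream of records
Writer = Callable[[Iterable[Record]], int]             # pushes records to a target, returns count

def run_flow(reader: Reader, tasks: List[Task], writer: Writer) -> int:
    """Chain reader -> tasks -> writer lazily and return the number of records written."""
    stream = reader()
    for task in tasks:
        stream = task(stream)
    return writer(stream)

def list_reader(rows: List[Record]) -> Reader:
    """Hypothetical reader: an in-memory list stands in for logs or an RDBMS."""
    def read() -> Iterator[Record]:
        yield from rows
    return read

def keep_purchases(records: Iterable[Record]) -> Iterator[Record]:
    """Hypothetical task: keep only purchase events."""
    return (r for r in records if r["event"] == "purchase")

def list_writer(sink: List[Record]) -> Writer:
    """Hypothetical writer: an in-memory list stands in for Hive or HBase."""
    def write(records: Iterable[Record]) -> int:
        count = 0
        for record in records:
            sink.append(record)
            count += 1
        return count
    return write

rows = [
    {"event": "view", "customer": 1},
    {"event": "purchase", "customer": 2},
]
sink: List[Record] = []
written = run_flow(list_reader(rows), [keep_purchases], list_writer(sink))
```

The point of the shape is that `run_flow` knows nothing about where records come from or go to, so a different reader or writer can be plugged in without changing the tasks.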
![Page 36: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/36.jpg)
DREDGE ARCHITECTURE
LAMBDA ARCHITECTURE: HDFS, HIVE, HBASE, PIG, FLUME, KAFKA, OOZIE
![Page 37: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/37.jpg)
DREDGE DATA SERVICES
- ABSTRACTION BUILDER (KAFKA, FLUME, PIG, CUSTOM)
- PLUGIN (JAVA/SHELL, PIG, SQL)
- RANK, SORTER
- AGGREGATOR
- UDFs
- SET OPERATIONS
- COMBINERS, ROUTERS...
- FILTERS/PATTERNS ANALYSIS
- SOURCE READERS (LOGS, RDBMS, UNSTRUCTURED DATA, CUSTOM), DIRECT/STREAM
- TARGET WRITERS (HIVE, HBASE, RDBMS, CUSTOM), DIRECT/STREAM
![Page 38: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/38.jpg)
DREDGE RUNTIME
- TEMP STORE - HDFS
- EVENT MANAGEMENT
- LOGGER
- STREAM
![Page 39: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/39.jpg)
DREDGE UI
Declarative configuration
Logical Flows
Data Lineage
Runtime Logs
Admin
![Page 40: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/40.jpg)
DREDGE REPOSITORY – HBASE
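One way to picture the data services layer (sorter, aggregator, filters, and so on) is as a registry of named, reusable components that a flow refers to by name. The Python sketch below is an assumption about how such a layer could be organized, not a description of Dredge's code; `SERVICES`, `service`, `apply_steps`, and the three example components are all hypothetical.

```python
from collections import defaultdict

# Hypothetical registry mapping a service name to the function that implements it.
SERVICES = {}

def service(name):
    """Decorator that registers a reusable component under a name."""
    def register(fn):
        if name in SERVICES:
            raise ValueError(f"service already registered: {name}")
        SERVICES[name] = fn
        return fn
    return register

@service("filter")
def filter_records(records, field, value):
    # Keep records whose field equals the given value.
    return [r for r in records if r.get(field) == value]

@service("sorter")
def sort_records(records, key, descending=False):
    # Order records by one field.
    return sorted(records, key=lambda r: r[key], reverse=descending)

@service("aggregator")
def sum_by(records, group_key, value_key):
    # Sum value_key per distinct group_key, preserving first-seen group order.
    totals = defaultdict(int)
    for r in records:
        totals[r[group_key]] += r[value_key]
    return [{group_key: k, value_key: v} for k, v in totals.items()]

def apply_steps(records, steps):
    """Run a list of (service name, keyword arguments) steps in order."""
    for name, kwargs in steps:
        records = SERVICES[name](records, **kwargs)
    return records

orders = [
    {"customer": "a", "total": 10},
    {"customer": "b", "total": 5},
    {"customer": "a", "total": 7},
]
top_customers = apply_steps(orders, [
    ("aggregator", {"group_key": "customer", "value_key": "total"}),
    ("sorter", {"key": "total", "descending": True}),
])
```

Since steps are just names plus arguments, the same components can be reused across flows and a new one added without changing `apply_steps`.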
![Page 41: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/41.jpg)
Closing the Loop
![Page 42: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/42.jpg)
Abstraction layer
![Page 43: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/43.jpg)
Reusable data components
![Page 44: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/44.jpg)
Event Driven dependencies
![Page 45: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/45.jpg)
Plug n Play integration, loosely coupled (Cluster Resources, Data)
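"Event-driven dependencies" can be read as: a flow starts when the events it depends on have all occurred, instead of at a fixed time. The following Python sketch illustrates that idea under stated assumptions; `EventBus`, `when_all`, `publish`, and the event names are hypothetical and do not come from Dredge.

```python
class EventBus:
    """Hypothetical in-memory event manager: runs a callback once all its events have fired."""

    def __init__(self):
        self._seen = set()     # events published so far
        self._waiting = []     # (set of events still missing, callback) pairs

    def when_all(self, events, callback):
        """Register a callback that depends on every event in `events`."""
        remaining = set(events) - self._seen
        if remaining:
            self._waiting.append((remaining, callback))
        else:
            callback()

    def publish(self, event):
        """Record an event and run every callback whose dependencies are now complete."""
        self._seen.add(event)
        ready, still_waiting = [], []
        for remaining, callback in self._waiting:
            remaining.discard(event)
            if remaining:
                still_waiting.append((remaining, callback))
            else:
                ready.append(callback)
        # Update the waiting list first so callbacks may publish further events safely.
        self._waiting = still_waiting
        for callback in ready:
            callback()

bus = EventBus()
ran = []

def build_customer_report():
    ran.append("customer_report")
    bus.publish("report_ready")    # downstream flows depend on this event

def send_report_email():
    ran.append("email_delivery")

bus.when_all({"orders_loaded", "clicks_loaded"}, build_customer_report)
bus.when_all({"report_ready"}, send_report_email)

bus.publish("orders_loaded")       # report still waits for clicks_loaded
bus.publish("clicks_loaded")       # report runs, then the email flow runs
```

Neither flow knows about the other; they are coupled only through event names, which matches the loosely coupled goal stated in the slides.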
![Page 46: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/46.jpg)
Summarizing
![Page 47: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/47.jpg)
Big data requires a different mindset: Innovate, iterate often and
keep it simple.
![Page 48: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/48.jpg)
engineering.onekingslane.com
Thank you.
![Page 49: Dredge Overview](https://reader034.vdocuments.mx/reader034/viewer/2022042615/55cefc0fbb61ebaf438b459e/html5/thumbnails/49.jpg)
CONTRIBUTORS:
Maria Latushkin (CTO, One Kings Lane)
Joana Koiller (Senior Product Designer, One Kings Lane)