TRANSCRIPT
#SeizeTheData
Hewlett Packard Enterprise confidential information. This is a rolling (up to three year) roadmap and is subject to change without notice.
This Roadmap contains Hewlett Packard Enterprise Confidential Information. If you have a valid Confidential Disclosure Agreement with Hewlett Packard Enterprise, disclosure of the Roadmap is subject to that CDA. If not, it is subject to the following terms: for a period of three years after the date of disclosure, you may use the Roadmap solely for the purpose of evaluating purchase decisions from HPE and use a reasonable standard of care to prevent disclosures. You will not disclose the contents of the Roadmap to any third party unless it becomes publicly known, is rightfully received by you from a third party without duty of confidentiality, or is disclosed with Hewlett Packard Enterprise's prior written approval.
Please give me your feedback
– Use the mobile app to complete a session survey:
  1. Access “My schedule”
  2. Click on the session detail page
  3. Scroll down to “Rate & review”
– If the session is not on your schedule, just find it via the Discover app’s “Session Schedule” menu, click on this session, and scroll down to “Rate & Review”
– If you have not downloaded our event app, please go to your phone’s app store and search on “Discover 2016 Las Vegas”
– Thank you for providing your feedback, which helps us enhance content for future events.
Session ID: Bxxxxx. Speakers: Mark Fay, Natalia Stavisky
Effectively managing & monitoring streaming data loads using Kafka and the Vertica Management Console
Mark Fay, Natalia Stavisky
Vertica & Kafka Integration
In a world of just-in-time inventory and on-demand services, the ability to quickly load and analyze tremendous amounts of data is more important than ever before. Last year, HPE Vertica addressed this growing need by integrating with Apache Kafka to offer scalable, real-time loading from Kafka sources. Today Vertica continues to leverage these strengths by adding flexibility, monitoring, and the ability to relay data back to Kafka. With the upcoming Frontloader release, Vertica has created a data ecosystem capable of supporting even the most demanding needs.
Agenda
1. Kafka Background
2. Vertica & Kafka Integration
3. Filtering & Parsing Enhancements
4. Closing the Loop: Vertica to Kafka Production
5. Scheduler CLI & Schema Enhancements
6. Monitoring Data Load with MC
Kafka Background
Apache Kafka Overview
A scalable, distributed message bus
─ Apache project originating from LinkedIn
─ Rich ecosystem of libraries and tools
─ Highly optimized for low-latency streaming

Solves the data integration problem
─ Producers decoupled from consumers
─ O(N) instead of O(N²) data pipelines
─ Throughput scalable independently of source & destination
[Diagram: without Kafka, producers A–C each connect directly to consumers X–Z; with Kafka as the bus, every producer and consumer connects only to Kafka]
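The O(N) vs. O(N²) claim above can be made concrete with a small sketch (illustrative Python, not part of the talk's material): without a bus, every producer needs its own pipeline to every consumer; with Kafka in the middle, each endpoint needs only one connection.

```python
# Illustrative sketch: number of data pipelines with and without a
# central message bus, for P producers and C consumers.

def pipelines_without_bus(producers, consumers):
    # Every producer connects directly to every consumer: O(N^2).
    return producers * consumers

def pipelines_with_bus(producers, consumers):
    # Every endpoint connects only to the bus: O(N).
    return producers + consumers

print(pipelines_without_bus(3, 3))  # 9 point-to-point pipelines
print(pipelines_with_bus(3, 3))     # 6 connections via Kafka
```

For the 3-producer, 3-consumer picture on this slide the bus already wins, and the gap widens quadratically as endpoints are added.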
Apache Kafka Architecture
[Diagram: a topic is spread across brokers A–N, one partition per broker; a producer writes to a topic, and a consumer reads offsets (0, 1, 2, …) from each partition. Key terms: topic, brokers, partitions, offsets]
Recap: Vertica & Kafka in 7.2 Excavator
Streaming Load Architecture
− Vertica schedules loads to continuously consume from any source via Kafka
− JSON, Avro, or custom data formats
− CLI driven
− In-database monitoring
Breaking Things Down
Load Scheduler
– Implements continuous, exactly-once streaming
– Dynamically prioritizes resources to load from many topics

Microbatch Commands
– Loads a finite chunk of data
– Updates stream progress

Kafka UDx Plugin
– Pulls data from Kafka
– Converts Kafka messages to Vertica tuples
Kafka UDx Plugin: extending Vertica’s parallel load operators to load from Kafka
[Diagram: load pipeline Source → Filter → Parse → Store, transforming raw bytes → transformed bytes → Vertica tuples → transformed tuples]
─ Vertica’s execution is modeled as a series of data transformations pipelined through operators for processing
─ The user-defined extension (UDx) framework enables custom logic during this pipeline
─ The UDx writer worries about domain logic; Vertica worries about resource management, parallelism, node communication, fault tolerance...

─ Source: acquire bytes (files, HDFS, Kafka)
─ Filter: transform bytes (decryption, decompression)
─ Parse: convert bytes to tuples (JSON, Avro)
─ Store: write tuples to projections (WOS, ROS)
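The four operators above can be pictured as plain functions chained over a stream (an illustrative Python sketch only; real UDx operators are written against Vertica's C++/Java SDK):

```python
# Sketch of the Source -> Filter -> Parse -> Store pipeline, with each
# operator as a generator over the byte/tuple stream.

import json

def source():
    # Source: acquire raw bytes (hardcoded here; really files/HDFS/Kafka).
    yield b'{"a": 1}$'
    yield b'{"b": 2}$'

def filter_op(chunks):
    # Filter: transform bytes (e.g. decompression); pass-through here.
    for chunk in chunks:
        yield chunk

def parse(chunks):
    # Parse: convert bytes to tuples, splitting on a '$' record terminator.
    for chunk in chunks:
        for record in chunk.rstrip(b"$").split(b"$"):
            yield json.loads(record)

def store(tuples):
    # Store: write tuples to projections; here, just collect them.
    return list(tuples)

rows = store(parse(filter_op(source())))
print(rows)  # [{'a': 1}, {'b': 2}]
```

The point of the real framework is that a UDx author only writes one of these stages; Vertica runs many copies of the pipeline in parallel across nodes.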
Microbatch Commands: support ‘exactly once’ through Vertica transactions
[Diagram: a microbatch (µB) wraps three steps in a single transaction – read Kafka data at offset X, insert the data into Vertica, store the new offsets in Vertica – followed by a commit]
Scheduler SQL Statements
SELECT source, target_table, partition, start_offset
FROM stream_microbatch_history;
-- run a microbatch for each item returned:

COPY target_table
SOURCE KafkaSource(
    stream='topic|0|0,topic|1|0',
    brokers='broker:port',
    duration=interval '10000 milliseconds',
    stop_on_eof=true)
PARSER KafkaJSONParser()
REJECTED DATA AS TABLE rejections_table
DIRECT NO COMMIT;

INSERT INTO stream_microbatch_history (…)
SELECT * FROM (SELECT KafkaOffsets() OVER ()) AS microbatch_results;

COMMIT;
– The stream_microbatch_history table stores state about what to do next. Bootstrap with a SELECT query.
– KafkaSource instructs Vertica nodes to load in parallel from Kafka for a period of time, starting at the specified <topic|partition|offset>s.
– KafkaJSONParser converts the Kafka JSON messages emitted by the source into Vertica tuples for storage.
– KafkaOffsets returns the ending offset for each <topic|partition|offset> read by the source. The next frame will start here.
– COMMIT atomically persists the data and ending offsets. It’s all or nothing!
[Diagram labels: the SELECT bootstraps the frame; each COPY … COMMIT sequence is one microbatch (µB)]
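The exactly-once guarantee comes from committing data and offsets in the same transaction. A toy sketch of that pattern (illustrative Python with stand-in objects, not Vertica or Kafka APIs):

```python
# Sketch of the exactly-once microbatch pattern: new rows and the new
# offsets become durable together, or not at all.

class FakeDB:
    def __init__(self):
        self.rows = []
        self.offsets = {"topic|0": 0}

    def transaction(self, new_rows, new_offsets):
        # Stands in for COPY ... NO COMMIT + INSERT offsets + COMMIT:
        # both updates are applied atomically.
        self.rows = self.rows + new_rows
        self.offsets = new_offsets

def run_microbatch(db, kafka_log):
    start = db.offsets["topic|0"]                # where the last µB left off
    batch = kafka_log[start:]                    # read new messages from Kafka
    new_offsets = dict(db.offsets)
    new_offsets["topic|0"] = start + len(batch)  # the ending offset
    db.transaction(batch, new_offsets)           # data + offsets, all or nothing

db = FakeDB()
run_microbatch(db, ["m0", "m1", "m2"])
print(db.rows, db.offsets)  # ['m0', 'm1', 'm2'] {'topic|0': 3}
```

If the process crashes before the commit, neither rows nor offsets advance, so re-running the microbatch re-reads the same messages; after the commit, the next batch starts cleanly at the stored offset. No message is loaded twice.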
Scheduling
Static Scheduling Algorithm
Simple example:
– 5 topics
– Concurrency of 1
– Frame split into 5 equal parts
– 10 seconds total: 2 seconds each

[Diagram: the frame divided into five fixed 2-second slots, one per topic (1–5)]

Hot topics become starved. Lots of wasted time!
Dynamic Scheduling Algorithm
[Diagram: batches 1–5 across a frame; as each batch finishes, the batches still to run share its leftover time]

To start, every batch gets an even portion of the frame.
If a batch ends early, split the leftover time evenly amongst the remaining batches. Corollary: batches that run later in the frame tend to have more time to run.
…but there’s still some wasted time at the end of the frame.
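The split-the-leftovers rule can be sketched directly (illustrative Python; the actual scheduler implementation is not shown in this talk):

```python
# Sketch of the dynamic split: each batch gets an even share of the
# remaining frame, and any unused time rolls over to the batches
# still waiting to run.

def schedule_frame(frame_seconds, runtimes_needed):
    """runtimes_needed: time each batch would actually use, in run order."""
    remaining = frame_seconds
    allotted = []
    for i, need in enumerate(runtimes_needed):
        share = remaining / (len(runtimes_needed) - i)  # even split of what's left
        used = min(share, need)
        allotted.append(share)
        remaining -= used        # leftover rolls into the remaining batches
    return allotted

# 10-second frame, 5 batches: the later batches inherit the time the
# quick early batches didn't use.
print(schedule_frame(10, [0.5, 0.5, 0.5, 4, 4]))
```

With three quick batches up front, the fourth and fifth batches end up with 4.25 s and 4.5 s instead of a fixed 2 s each — exactly the behavior the staircase diagram shows.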
Dynamic Scheduling Algorithm
[Diagram: the next frame runs the batches in order 5, 4, 3, 1, 2]

Next time, sort by the runtime of the previous frame so that batches that ended early go first.
µB2 gets lots of time now!
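The ordering rule itself is a one-line sort (illustrative Python; the runtime numbers below are made up to reproduce the 5, 4, 3, 1, 2 order shown on the slide):

```python
# Sketch of the reordering rule: before the next frame, sort the
# microbatches so the ones that finished fastest last frame run first,
# leaving the bulk of the frame for the slow (hot) batches.

def next_frame_order(batches, last_runtimes):
    # batches: batch ids; last_runtimes: id -> seconds used last frame
    return sorted(batches, key=lambda b: last_runtimes[b])

prev = {1: 1.8, 2: 2.0, 3: 1.5, 4: 0.9, 5: 0.2}  # hypothetical runtimes
print(next_frame_order([1, 2, 3, 4, 5], prev))  # [5, 4, 3, 1, 2]
```

Combined with the leftover-splitting rule, this pushes the hottest topic to the end of the frame, where it inherits all the unused time.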
Dynamic Scheduling in Action
– Scheduler configured to load two topics with a frame duration of 5 seconds
– Two producers continuously producing at dynamic rates (dotted lines)
– Vertica’s load rate for the topics keeps up with the produce rate (solid lines), roughly 5 seconds behind
– Net throughput rate remains constant as load resources shift from one topic to the other
Since 7.2 Excavator
Added Since 7.2 Excavator
– Multiple Kafka cluster support
  – Added capability within a scheduler configuration to set up multiple Kafka clusters
  – Kafka topics can be associated with clusters, allowing users to stream data into Vertica from anywhere
  – Single resource pool; single configuration

– Kafka version support
  – Added support for Kafka 0.9.x
  – Working with Confluent to keep up to date on Kafka’s fast release cycles
  – 0.10 in the works
[Diagram: one scheduler configuration connecting multiple Kafka clusters to Vertica]
User-Defined Filters and Parsers
Why only JSON & Avro in 7.2 Excavator?
– Kafka messages arrive with structure & metadata in the source
– Traditional parsers assume no structure; instead they discover that structure in the data stream
– The Kafka JSON & Avro parsers are specially designed to preserve & leverage that information without modifying the data stream

How can I use other formats? Inject a filter!
– KafkaInsertDelimiters(delimiter=E'$')
– KafkaInsertLengths()

Once filtered, data can be parsed using built-in parsers or your own custom UDParser.
[Diagram: Source → Filter → Parse]
User-Defined Filters and Parsers Example
KafkaInsertDelimiters(delimiter=E'$')
– Appends a delimiter character after each message
– Most built-in parsers look for a record boundary

COPY t SOURCE KafkaSource(stream='some_topic|0|-2', stop_on_eof=true, brokers='localhost:9092')
FILTER KafkaInsertDelimiters(delimiter=E'$')
RECORD TERMINATOR E'$' DIRECT;

KafkaInsertLengths()
– Prepends a uint32 length before each message
– Custom parsers can inspect lengths for efficient parsing

COPY t SOURCE KafkaSource(stream='some_topic|0|-2', stop_on_eof=true, brokers='localhost:9092')
FILTER KafkaInsertLengths()
PARSER MyCustomParser() DIRECT;
Data in Kafka (both examples):
Offset 0: {a:“foo”}
Offset 1: {b:“bar”}
Offset 2: {c:“baz”}

Data emitted by SOURCE: {a:“foo”}{b:“bar”}{c:“baz”}
Data emitted by KafkaInsertDelimiters FILTER: {a:“foo”}${b:“bar”}${c:“baz”}$
Data emitted by KafkaInsertLengths FILTER: 9{a:“foo”}9{b:“bar”}10{c:“baz”}
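The byte-level effect of the two filters can be sketched as follows (illustrative Python, not the actual UDFilter implementations; the real length prefix is binary, shown on the slide as decimal digits for readability):

```python
# Sketch of what the two filters do to the concatenated message stream.

import struct

def insert_delimiters(messages, delim=b"$"):
    # Like KafkaInsertDelimiters: append a delimiter after each message,
    # giving downstream parsers a record boundary to split on.
    return b"".join(m + delim for m in messages)

def insert_lengths(messages):
    # Like KafkaInsertLengths: prepend each message's length as a
    # big-endian uint32, so a parser can skip ahead without scanning.
    return b"".join(struct.pack(">I", len(m)) + m for m in messages)

msgs = [b'{a:"foo"}', b'{b:"bar"}', b'{c:"baz"}']
print(insert_delimiters(msgs))  # b'{a:"foo"}${b:"bar"}${c:"baz"}$'
```

Delimiters suit text parsers that scan for a terminator; length prefixes suit binary parsers that want to jump message to message without inspecting the payload.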
KafkaAVROParser External Schema Support
– Avro documents have three parts:
  – Schema: a JSON blob describing the message(s) in the document
  – Object metadata: metadata for parsing the object (SpecificData vs GenericData)
  – The data (i.e. a Vertica row)

– Kafka Avro serializers typically do one document per Kafka message – lots of bloat!

– Remove bloat with two settings:
  – external_schema – specify the JSON header up front and omit it from your messages
  – with_metadata=false (default) to omit metadata (parse using Avro GenericData)

– Kafka 0.10 schema registry not supported yet
[Diagram: one schema (JSON) shared by many metadata + data records]
Vertica → Kafka: KafkaExport UDT
Send query results to a Kafka topic in parallel!
– Input:
  – Partition (optional, NULL for round-robin)
  – Key (optional, NULL for unkeyed)
  – Message
– Output is the messages that failed to send & the reasons why (at-least-once semantics)
– Typical Kafka producer settings available to control performance & reliability
– INSERT … (SELECT …) for error management
CREATE TEMP TABLE kafka_rejections(
    partition integer,
    key varchar(128),
    message varchar(2000),
    reason varchar(1000));

INSERT INTO kafka_rejections
SELECT KafkaExport(
    partition, key, message
    USING PARAMETERS
        brokers='host1:9092,host2:9092',
        topic='foo',
        message_timeout_ms=5000,
        queue_buffering_max_ms=2000,
        queue_buffering_max_messages='10000')
OVER(PARTITION BEST) FROM export_src;
Vertica → Kafka: Notifiers
– Notifiers emit messages to external systems, starting with Kafka
– Data Collector hooks can trigger notifiers when a record is written
– Enables external monitoring of Vertica, with persistence!
CREATE NOTIFIER dc_to_kafka
ACTION 'kafka://localhost:9092'
MAXMEMORYSIZE '1GB';
Schema and CLI Enhancements: a more flexible, more [re-]usable Scheduler
– CLI reworked for more flexibility, maintainability & extensibility
– Separated configuration from state: no longer worry about configuring topics and having the entire offsets history updated
– Better projection design to optimize scheduler operations
– More consistent CLI config schema mappings to make it easier to do SQL-based monitoring
[Diagram: scheduler configuration schema – MicroBatch, Source, Cluster, Target, Load Spec]
From Old to New
Old CLI utilities:
– scheduler
– kafka-cluster
– topic
New CLI utilities:
– scheduler
– cluster
– source
– target
– load-spec
– microbatch
The old topic utility managed several different components; now each component is separated into its own logical utility.
More Flexibility
– Configure clusters that reference Kafka brokers:
  vkconfig cluster --create --cluster kafka1 --hosts some-kafka-broker:9092

– Separation of Topic (now: Source) and Target:
  vkconfig source --create --source topic1 --cluster kafka1 --partitions 3
  vkconfig source --create --source topic2 --cluster kafka1 --partitions 5
  vkconfig target --create --target-schema public --target-table tgt1

– Configure microbatches with N:1 source(s) → target:
  – Reuse sources and targets as desired
  – Full N:M multiplexing capabilities with M microbatches
  vkconfig microbatch --create --microbatch mb1 --target-schema public --target-table tgt1 --add-source-cluster kafka1 --add-source topic1
  vkconfig microbatch --update --microbatch mb1 --add-source-cluster kafka1 --add-source topic2

Note: BOLD refers to the unique keys for referencing the specific part of the configuration.
More [re-]usability
– COPY statements have many parameters, which are great for differing workloads.
– Sometimes, however, we want to reuse the same “load specification”.
– Introducing a new CLI and configuration table: load spec
vkconfig load-spec --create --load-spec SPEC-1 --load-method DIRECT --parser KafkaJSONParser --parser-parameters flatten_tables=true
vkconfig microbatch --update --microbatch mb1 --load-spec SPEC-1
SPEC-1:
– Load DIRECT
– JSON format
– Flatten JSON
– No FILTERs

SPEC-2:
– Load TRICKLE
– Pipe-delimited CSV format
– Insert Delimiter FILTER
– Specific Kafka configs
The New CLI
vkconfig cluster --create --cluster kafka1 --hosts some-kafka-broker:9092
vkconfig source --create --source topic1 --cluster kafka1 --partitions 3
vkconfig source --create --source topic2 --cluster kafka1 --partitions 5
vkconfig target --create --target-schema public --target-table tgt1
vkconfig load-spec --create --load-spec SPEC-1 --load-method DIRECT --parser KafkaJSONParser --parser-parameters flatten_tables=true
vkconfig microbatch --create --microbatch mb1 --target-schema public --target-table tgt1 --add-source-cluster kafka1 --add-source topic1
vkconfig microbatch --update --microbatch mb1 --add-source-cluster kafka1 --add-source topic2
– Each component has its own CLI
– Each instance of a component is uniquely identifiable
– All components are reusable
– Each component is independently editable
– CRUD keywords for consistency
Upgrade Process
– Upgrade will convert current scheduler settings & state to the new format
– Old config state is left intact for historical purposes, but is no longer used
– vkconfig scheduler --upgrade [--upgrade-to-schema <desired-schema>]
  – By default, upgrade converts your 7.2.x configuration within the same schema
  – --upgrade-to-schema allows users to move the upgraded schema to a new location
  – All objects have human-readable identifiers; upgrade auto-generates names, which can be edited afterwards
Monitoring Kafka Loading with Vertica Management Console
Monitoring data load activities in MC – available in Frontloader 8.0
– Displays history of data loading jobs including COPY command
– Shows outcome of individual COPY commands
Kafka loading – many, many COPY commands executed repeatedly over time…
After configuring Kafka streams:
How is Kafka loading different from other types of data loading? We need to track and display many different pieces of data!
– Is the data flowing?
– What microbatches are defined in the database?
– Is the data getting processed by my microbatches?
– Is the Scheduler running?
– How many messages have been processed in the last hour? In the last frame?
– Are there any errors?
– Are there any rejections?
The MC now presents a separate view of Instance and Continuous types of loading
The MC user can easily focus on the type of loading tasks that they want to track
Continuous (Kafka) loading – your data flow at a glance
Monitoring Kafka loading: MC data collector streams
Explore the details…
[Screenshot labels: Scheduler, Microbatch]
Explore the details…
[Screenshot labels: Microbatch errors, Microbatch rejections]
Suspend or Resume Topic Processing
Filtering Continuous Loads
Filtering out MC data collector monitoring streams
Filtering on the source
Benefits of Using MC to Monitor Kafka
Monitor the Scheduler: – Is it running?
Monitor microbatches: – Are they enabled?
Monitor microbatch message processing: – Is the data flowing as expected?
Easier to triage errors and rejected data
Wrap Up
– Enhancements to integration
– Closed the loop:
  – Export Vertica records to Kafka
  – Write DC table data to Kafka
– Enhanced filtering & parsing capabilities
  – Any UDFilter & UDParser can be used, not just Kafka-specific ones
  – Native Vertica parsing
– Scheduler: extensible, relational schema design
  – Flexible
  – Easily SQL-monitored
– Scheduler CLI enhancements
– Monitoring
  – Browser-based access to the status of the scheduler and microbatches
  – Easy assessment of issues such as data not loading, errors, and rejections
Thank you
Contact information