leanxcale features in a nutshellv5docx · 2020-03-23 · leanxcale features in a nutshell deploy...

LeanXcale features in a nutshell

Index

Dual API: low-latency NoSQL API and easy-to-use SQL interface
  A developer friendly database
KiVi, the ultra-efficient store engine
  A unique relational key-value store
  Near costless MVCC
  New data structure
  Ultra-Efficient
  Bidimensional partitioning
  Online aggregations
  Vectorial processing
Highly efficient storage
  Take advantage of a highly efficient storage
  Full durability
  Optimal cache management
  Compression
  Distributed storage
  Big data storage
  Active-active replication
Linear scalability from MVP to global
  Run on a single pc or in 100s of servers, and get always a linear performance
  Scalable
  Elastic
  Distributed architecture
Distributed query engine with analytical and geographical capabilities
  Run complex multi-join and gis queries to get real-time analytics
  Derived Tables
  Covering indexes
  Multithread batch procedures
  Table functions
  Efficient Secondary Indexes
  Predicate push-down
  Compiled Queries
  Parallel Scans
  Workload Management
  Scalable GIS
  Polyglot queries
  Integration with data lakes
  Geohash filters
  BI integration
  ML integration
  SQL ANSI
  HTAP
Deploy LeanXcale Clusters Anywhere
  Use commodity hardware
Security and compliance
  Meet any compliance requirement
  Access control
  Communication encryption
  Data storage encryption
LeanXcale migration tools and free professional services
  Make an effortless transition
  Model replicator
  Transparent adaptor
  Free professional services
World-class management and monitoring
  Powerful tools that boost the performance of any devops team
  Monitoring
  Node management
  Backup

LeanXcale is a fast and scalable database that combines the characteristics of SQL and NoSQL. It is built to ingest massive batch and real-time data pipelines and to make the data available through SQL or GIS queries for any use, such as operational applications, analytics, dashboarding, or machine learning processing.

Figure 1: LeanXcale features, connections and deployments

Features

Dual API: low-latency NoSQL API and easy-to-use SQL interface

A developer friendly database

Whatever stack you use, LeanXcale gives you both SQL and NoSQL interfaces. The KiVi storage engine is a relational key-value data store. Users can access the data not only through the standard SQL API but also through a direct ACID key-value interface. This key-value interface lets users ingest data at very high rates and very efficiently by avoiding SQL processing overhead. The direct API provides all the operations available through SQL except joins, i.e., insertions, predicate filtering, aggregation, grouping, and sorting. At the same time, the traditional SQL interface gives users an easy way to migrate from other databases and access to the whole SQL ecosystem.

Deploy LeanXcale Clusters Anywhere

Use commodity hardware

LeanXcale has a shared-nothing architecture that enables it to run anywhere: on a commodity bare-metal cluster, on a container, as OEM, as Database as a Service or on a private cloud.

KiVi, the ultra-efficient store engine

A unique relational key-value store

KiVi is the LeanXcale store engine. It has been designed from scratch with a brand-new architecture to deliver optimal performance.

Near costless MVCC

Modern databases use multi-version concurrency control (MVCC) to avoid conflicts between reads and writes. However, MVCC requires removing obsolete versions. LeanXcale's MVCC uses a new approach that is nearly costless and causes no problems regardless of the update rate.
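
As a rough illustration of the general MVCC idea mentioned above, the following Python sketch keeps several timestamped versions per key, lets a reader work against a fixed snapshot, and purges versions that no snapshot can still see. It is a toy model, not LeanXcale's algorithm (which this document does not describe); the class `ToyMVCCStore` and all its method names are hypothetical.

```python
import bisect
from collections import defaultdict


class ToyMVCCStore:
    """Hypothetical in-memory multi-version store (illustration only)."""

    def __init__(self):
        self._clock = 0
        # key -> parallel lists: ascending commit timestamps and their values
        self._stamps = defaultdict(list)
        self._values = defaultdict(list)

    def write(self, key, value):
        """Commit a new version of `key` and return its commit timestamp."""
        self._clock += 1
        self._stamps[key].append(self._clock)
        self._values[key].append(value)
        return self._clock

    def snapshot(self):
        """Return a timestamp that fixes what a reader is allowed to see."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before `snapshot_ts`."""
        stamps = self._stamps.get(key, [])
        pos = bisect.bisect_right(stamps, snapshot_ts)
        return self._values[key][pos - 1] if pos else None

    def purge(self, oldest_active_snapshot):
        """Drop obsolete versions that no active snapshot can still read."""
        for key, stamps in self._stamps.items():
            # Keep the newest version visible to the oldest snapshot and all newer ones.
            keep_from = max(bisect.bisect_right(stamps, oldest_active_snapshot) - 1, 0)
            del stamps[:keep_from]
            del self._values[key][:keep_from]


if __name__ == "__main__":
    store = ToyMVCCStore()
    store.write("balance", 100)
    snap = store.snapshot()          # a reader starts here
    store.write("balance", 80)       # a later writer does not block the reader
    print(store.read("balance", snap), store.read("balance", store.snapshot()))
```

The `purge` step is the cost the paragraph refers to: in this naive version it walks every key, which is exactly the kind of overhead a production engine has to avoid.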

New data structure

Prior to LeanXcale, there was a duality between SQL databases and key-value data stores: SQL databases, using B+ trees, performed better for range queries, while key-value data stores, using LSM trees, were more efficient for data ingestion. LeanXcale uses a novel data structure that is as efficient as B+ trees for range queries and as efficient as LSM trees for random updates and inserts.

Ultra-Efficient

KiVi’s single-thread design avoids expensive context switches, thread synchronization, and NUMA remote memory accesses. It takes advantage of 20+ years of operating systems research.

Bidimensional partitioning

As soon as memory cannot cope with the workload, I/O increases and throughput goes down. LeanXcale is optimized to handle time series: it partitions the data along several dimensions to make smart use of the cache and to increase data locality in searches.
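
To make the idea of partitioning along two dimensions concrete, here is a minimal Python sketch that places each time-series point in a partition chosen by both its key and its time bucket, so a range query only touches the partitions that can contain matching rows. The constants `KEY_RANGES` and `BUCKET_SECONDS`, the class `PartitionedSeries`, and the modulo-based key split are assumptions made for this illustration, not LeanXcale's actual scheme.

```python
from collections import defaultdict

KEY_RANGES = 4          # assumed number of key partitions
BUCKET_SECONDS = 3600   # assumed width of one time bucket (one hour)


def partition_of(series_id, timestamp):
    """Map a time-series point to a (key partition, time bucket) pair."""
    return series_id % KEY_RANGES, timestamp // BUCKET_SECONDS


class PartitionedSeries:
    """Hypothetical table split by key and by time."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, series_id, timestamp, value):
        self.partitions[partition_of(series_id, timestamp)].append(
            (series_id, timestamp, value)
        )

    def scan(self, series_id, start_ts, end_ts):
        """Read only the partitions that can contain the requested time range."""
        key_part = series_id % KEY_RANGES
        first_bucket = start_ts // BUCKET_SECONDS
        last_bucket = end_ts // BUCKET_SECONDS
        rows = []
        for bucket in range(first_bucket, last_bucket + 1):
            for sid, ts, value in self.partitions.get((key_part, bucket), []):
                if sid == series_id and start_ts <= ts <= end_ts:
                    rows.append((ts, value))
        return sorted(rows)
```

Recent buckets are the ones most likely to be written and queried, so keeping them as small, separate partitions is what helps them stay in cache in a design of this kind.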

Online aggregations

LeanXcale can add the values of several inserts into a single row in real time without any contention or conflict. Aggregates are computed online at insertion time, so they are already pre-calculated, and getting the desired result requires reading just one row.
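
The following Python sketch shows the general shape of an online aggregate: every insert is folded into a single per-key row, so a read never has to scan the raw inserts. It is single-threaded and therefore says nothing about how LeanXcale avoids contention; the class `OnlineAggregateTable` and its methods are hypothetical names used only for this example.

```python
class OnlineAggregateTable:
    """Hypothetical table that folds each insert into one aggregate row per key."""

    def __init__(self):
        # key -> [count, sum, min, max]
        self._rows = {}

    def insert(self, key, value):
        """Update the aggregate row at insertion time instead of storing the raw value."""
        row = self._rows.get(key)
        if row is None:
            self._rows[key] = [1, value, value, value]
        else:
            row[0] += 1
            row[1] += value
            row[2] = min(row[2], value)
            row[3] = max(row[3], value)

    def read(self, key):
        """Return the pre-calculated aggregates by reading a single row."""
        count, total, low, high = self._rows[key]
        return {"count": count, "sum": total, "min": low, "max": high,
                "avg": total / count}
```

For example, after inserting several sensor readings under the same key, `read(key)` returns the count, sum, minimum, maximum, and average without touching the individual readings.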

Vectorial processing

Vectorization is the process of converting an algorithm from operating on a single value at a time to operating on a set of values at a time. Modern CPUs provide direct support for vector operations, in which a single instruction is applied to multiple data (SIMD). LeanXcale’s store engine takes advantage of SIMD instructions to increase performance.

Highly efficient storage

Take advantage of a highly efficient storage

A highly scalable, efficient, and distributed storage engine distributes data across the cluster to improve performance and increase reliability.

Full durability

Commit durable transactions with full data persistence and archiving.

Optimal cache management

Use memory whenever possible to speed up performance, and use disk storage when the cache is not large enough.

Compression

Use ZFS file system compression to optimize resource usage when storing high volumes of data.

Distributed storage

Distribute your tables across the different storage engines to get fast parallel scans.

Big data storage

Run queries over petabytes of stored data with extraordinary performance.

Active-active replication

Active-active replication is a typical bottleneck for databases. LeanXcale has developed a new replication algorithm that is bottleneck-free and adds no overhead.

Linear scalability from MVP to global

Run on a single PC or on hundreds of servers, and always get linear performance

Start with a cluster size that covers your current needs and grow it as your needs increase, always getting linear performance, even with hundreds of nodes. Process up to millions of transactions per second.

Scalable

Traditional ACID databases do not scale out linearly or do not scale out at all. LeanXcale has developed the patented Iguazu technology to scale out linearly with no bottleneck from a single server to hundreds. It uses a distributed algorithm that processes transactions massively in parallel, always staying ACID-compliant.

Elastic

LeanXcale has a novel nonintrusive data migration algorithm that allows for moving data from one server to another without disrupting operations, even while it is being updated, and keeping full ACID consistency. LeanXcale can grow or shrink according to the current needs with zero downtime.

Distributed architecture

LeanXcale, like any other relational database, has three layers: store, transactional, and query engine. The three layers are completely distributed and can scale independently.

Distributed query engine with analytical and geographical capabilities

Run complex multi-join and GIS queries to get real-time analytics

LeanXcale delivers fast query responses across live and historical data. It runs complex multi-join queries in familiar ANSI SQL to provide real-time analytics and to avoid complex architectures such as lambda. LeanXcale is also a scalable GIS database.

Derived Tables

Unlike materialized views, derived tables keep an up-to-date complex transformation of another table. Like materialized views, derived tables can provide a fast representation of the data for specific purposes.

Covering indexes

LeanXcale provides covering indexes that improve response time. A covering index avoids the round trip to the table to satisfy the request, since all the requested columns exist in the non-clustered index itself.
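
The effect of a covering index can be observed in any SQL engine that exposes its query plan. The sketch below uses SQLite from the Python standard library as a stand-in (it is not LeanXcale, and the table, column, and index names are invented for the example): the first query asks only for columns stored in the index, the second also needs a column that lives only in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL, notes TEXT)"
)
# The index stores both columns used by the first query below.
conn.execute("CREATE INDEX idx_customer_total ON orders (customer, total)")
conn.executemany(
    "INSERT INTO orders (customer, total, notes) VALUES (?, ?, ?)",
    [("alice", 10.0, "x"), ("bob", 25.5, "y"), ("alice", 7.25, "z")],
)


def plan(sql, params=()):
    """Return SQLite's query plan for `sql` as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Only indexed columns are requested: the index alone can answer the query.
covered = plan("SELECT customer, total FROM orders WHERE customer = ?", ("alice",))
# `notes` is not in the index: each match needs an extra lookup in the table.
not_covered = plan("SELECT customer, notes FROM orders WHERE customer = ?", ("alice",))

if __name__ == "__main__":
    print(covered)
    print(not_covered)
```

In SQLite the first plan mentions a covering index while the second reports an ordinary index search, which is the round trip to the table that the paragraph above describes.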

Multithread batch procedures

Batch procedures in Java or Python take advantage of the fully distributed architecture of LeanXcale, since they are multithreaded and use the different resources in parallel.

Table functions

Integrate SQL queries with custom Java functions in a transparent way. Develop your own Java code to generate table-like data through an easy-to-use framework.

Efficient Secondary Indexes

LeanXcale supports efficient secondary indexes that are distributed and connected to primary data. Queries over secondary indexes can retrieve the primary data locally and provide the full result with the minimum number of round trips.

Predicate push-down

LeanXcale is a fully distributed database with a store engine that can execute any type of query except joins. It can push predicates down to one or several store engines, improving performance and avoiding costly data movements.
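
A small simulation helps to show why pushing a predicate down avoids data movement. In the Python sketch below, each hypothetical `StoreEngine` holds one partition and counts the rows it ships to the query layer; the two query functions return the same result but move very different amounts of data. All names are illustrative and do not correspond to a LeanXcale API.

```python
class StoreEngine:
    """Hypothetical store engine holding one partition of a table."""

    def __init__(self, rows):
        self.rows = rows
        self.rows_shipped = 0   # rows sent over the (simulated) network

    def scan(self, predicate=None):
        """Return the rows of this partition, filtered locally if a predicate is given."""
        selected = [r for r in self.rows if predicate is None or predicate(r)]
        self.rows_shipped += len(selected)
        return selected


def query_without_pushdown(engines, predicate):
    # Every row travels to the query engine, which filters afterwards.
    return [r for engine in engines for r in engine.scan() if predicate(r)]


def query_with_pushdown(engines, predicate):
    # Each store engine filters locally and ships only the matching rows.
    return [r for engine in engines for r in engine.scan(predicate)]
```

With three partitions of ten rows each and a predicate that matches six rows in total, the first function ships thirty rows and the second ships six.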

Compiled Queries

Repeated requests are compiled into machine code by a JIT compiler to optimize the response time of common queries.

Parallel Scans

LeanXcale speeds up queries over massive data. It reduces latency and makes full use of the user's infrastructure by parallelizing scan operations over large tables that are split across several store engines.
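
The structure of a parallel scan can be sketched with Python's standard `concurrent.futures` module: each partition is scanned by its own task and the partial results are concatenated. Here the partitions are plain lists standing in for table fragments held by different store engines, and the function names are hypothetical. Note that threads in CPython do not speed up CPU-bound work, so this shows the shape of the technique rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_partition(rows, predicate):
    """Scan one partition; stands in for a scan served by one store engine."""
    return [row for row in rows if predicate(row)]


def parallel_scan(partitions, predicate, max_workers=4):
    """Scan all partitions concurrently and return the results in partition order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        chunks = pool.map(lambda part: scan_partition(part, predicate), partitions)
        return [row for chunk in chunks for row in chunk]
```

Because `pool.map` yields results in the order of its inputs, the combined output is deterministic even though the partitions are scanned concurrently.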

Workload Management

LeanXcale routes end-user requests to balance the workload among the different SQL engines and optimize performance.

Scalable GIS

Execute complex geographical queries at any scale while under high-rate data ingestion. LeanXcale can efficiently handle large volumes of geographical data while supporting numerous parallel GIS queries.

Polyglot queries

LeanXcale supports queries across MongoDB, HBase, Neo4J, and any SQL RDBMS. The queries combine the ease of SQL with the power of the native APIs/query languages of the underlying data stores.

Integration with data lakes

By defining metadata and parsing rules for data lake files (e.g., in HDFS), those files become read-only SQL tables. SQL queries can then correlate operational data with historical data stored in data lakes.

Geohash filters

Thanks to a new geohash-based filtering algorithm specially designed for LeanXcale, queries that contain geographical functions run with high performance.
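
For readers unfamiliar with geohashes, the Python sketch below implements the standard public geohash encoding and a simple prefix filter: points whose geohash shares a prefix with the query cell are kept as candidates before any exact geometric test. This is the textbook technique, not LeanXcale's proprietary filtering algorithm, and the function `prefix_filter` is a hypothetical helper. A plain prefix match can miss points just across a cell boundary, so real systems typically also check neighbouring cells.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet


def geohash(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def prefix_filter(points, center, precision):
    """Keep the points that fall in the same geohash cell as `center`."""
    cell = geohash(center[0], center[1], precision)
    return [p for p in points if geohash(p[0], p[1], precision) == cell]
```

Because a geohash is an ordinary string, such a filter can be served by a regular ordered index on the geohash column, which is what makes this kind of pre-filtering cheap.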

BI integration

LeanXcale can integrate with the most popular BI tools, such as Qlik, SAS, Tableau, or Power BI, through OData, ODBC, or JDBC interfaces. Additionally, LeanXcale is released with Apache Superset integrated out of the box.

ML integration

Integrate with your favorite machine learning toolkit: pandas, TensorFlow, Spark, or Apache Arrow.

SQL ANSI

LeanXcale provides a SQL interface that is fully compliant with ANSI SQL 2003, which eases migration from other databases and lets users take advantage of existing SQL tools.

HTAP

LeanXcale has a distributed data warehouse engine designed to run analytical queries on operational data and deliver analytical results in real time. This removes the need for ETLs, which can save up to 80% of the average business analytics cost, and it allows decisions to be made in real time.

Security and compliance

Meet any compliance requirement

Security is an important aspect of any application, but some applications handle critical data that is subject to special security restrictions because of business or legal requirements (e.g., banking, insurance, or health). LeanXcale provides the highest standard to be compliant with any requirement.

Access control

LeanXcale provides role-based access control, with permissions at the per-user and individual levels. It can also integrate authorization with an enterprise-level LDAP directory.

Communication encryption

Activate SSL/TLS encryption for any external connection. Depending on the deployment and security level of your application, you can also enable SSL/TLS for connections between internal database components.

Data storage encryption

LeanXcale can run on top of the ZFS file system, encrypting all the information with the highest standards.

LeanXcale migration tools and free professional services

Make an effortless transition

LeanXcale provides a set of tools and services to migrate transparently from your current database to LeanXcale.

Model replicator

Thanks to the Model Replicator, it is possible to duplicate a schema in LeanXcale with a few clicks and commands.

Transparent adaptor

LeanXcale is a 2003 ANSI SQL-compliant database. However, some proprietary extensions of other databases may need translation. The transparent adaptor performs this translation dynamically, with no code changes.

Free professional services

A free migration service helps you migrate the application, identify possible problems, export and import the dataset, and optimize the final design.

World-class management and monitoring

Powerful tools that boost the performance of any DevOps team

LeanXcale is a database engine that works in your critical systems. Set up your system through a console. Keep track of its performance to ensure the best quality.

Monitoring

LeanXcale provides an integrated monitoring dashboard based on Prometheus and Grafana out of the box. Additionally, LeanXcale exposes a series of metrics to third-party systems through JMX and as Prometheus custom exporters.

Node management

Companies must over-provision for the highest peaks they expect, but this over-provisioning is expensive because they pay for it 24/7. With a single command, LeanXcale can add nodes or decommission them with no impact on the end user.

Backup

Continuous backup and consistent snapshots of distributed clusters allow seamless data recovery in the event of system failures or application errors. LeanXcale, even when distributed, has a point-in-time hot backup capability.

Resources

• Visit www.leanxcale.com for more information, or contact us at [email protected].
• Free Trial (https://www.leanxcale.com/trial)
• Documentation and drivers (https://www.leanxcale.com/company-resources)
• Whitepapers and videos (https://www.leanxcale.com/company-resources)
• Get a demo (https://www.leanxcale.com/get-a-demo)
• Talks (https://www.leanxcale.com/talks)
• Blog (https://www.leanxcale.com/blog)