![Page 1: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/1.jpg)
Dipti Borkar, Co-Founder & CPO | Ahana
An introduction to Presto, an open source distributed
SQL engine
![Page 2: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/2.jpg)
Founder
Mom
Immigrant
Girl data geek (DB)
Engineer always
Product techie
Team builder
Open source believer
Mixologist
![Page 3: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/3.jpg)
Agenda
• What is Presto?
• History of federation
• Introduction to Presto
• What made Presto different?
• Scalable architecture
• Flexible Connectors
• Performance
• The life of a query
![Page 4: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/4.jpg)
Technology Cycles Rhyme: Data Federation
80’s: FDBMS Challenges RDBMS
• FDBMS paper by McLeod / Heimbigner (1985)
• FDBMS paper by Sheth / Larson (1990)
90’s: OLTP to DW Wins
• Data warehouse becomes the source of truth
• Star schema becomes sacred
2000’s: Cloud & Big Data
• Composite Software (founded 2001)
• Garlic paper by Laura Haas (2002) → DB2 Federated
• Google File System paper (2003)
• MapReduce paper (2006)
• Spark paper (2010)
• Too many data sources, no one uber schema
2010’s: New Cloud DW w/ Data Lakes
• Based on SQL
• Self-service platforms which enable self-service analytics
2020’s: SQL Federation Makes Comeback
• Dremel paper (2010) → Drill paper (2012)
• SQL++ paper (2014) → Couchbase SQL++ engine (2018)
• Presto paper (2019), PartiQL (2019)
![Page 5: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/5.jpg)
Presto: One of the Fastest Growing Open Source Projects in Data Analytics
Business Needs
Data-driven decision making
Businesses need more data to iterate over
Technology Trends
Disaggregation of Storage and Compute
The rise of data lakes
![Page 6: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/6.jpg)
What is Presto?
• Distributed SQL query engine
• ANSI SQL on databases, data lakes
• Designed to be interactive
• Access to petabytes of data
• Open source, hosted on GitHub
• https://github.com/prestodb
![Page 7: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/7.jpg)
Presto Overview
![Page 8: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/8.jpg)
Common Questions?
• Is Presto a database?
• How is it related to Hadoop?
• How is it different from a data warehouse?
![Page 9: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/9.jpg)
Sample Presto deployment stack & use cases
• Ad hoc
• BI tools
• Dashboard
• A/B testing
• ETL/scheduled job
• Online service
![Page 10: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/10.jpg)
What made Presto different?
• Scalable architecture
• Pluggable Connectors
• Performance
![Page 11: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/11.jpg)
Scalable Architecture
• Two roles - coordinator and worker
• Easy scale up and scale down
• Scale up to 1000 workers
• Validated at web-scale companies
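As a rough mental model of the two roles, the toy sketch below has a coordinator function cut a table into splits, hand them to workers, and merge the partial results. Threads stand in for worker nodes, and every name here (`make_splits`, `worker_sum`, `coordinator_sum`) is a hypothetical illustration, not Presto's actual scheduler or API.

```python
from concurrent.futures import ThreadPoolExecutor


def make_splits(rows, split_size):
    """Cut the input into fixed-size chunks ("splits")."""
    return [rows[i:i + split_size] for i in range(0, len(rows), split_size)]


def worker_sum(split):
    """A worker processes one split and returns a partial result."""
    return sum(split)


def coordinator_sum(rows, num_workers, split_size=3):
    """The coordinator plans the work, farms splits out, and merges results."""
    splits = make_splits(rows, split_size)
    # Threads are an assumption standing in for separate worker nodes.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(worker_sum, splits))
    return sum(partials)


total = coordinator_sum(list(range(1, 11)), num_workers=4)
```

Changing `num_workers` changes how much of the work can run at once but not the answer, which is the sense in which adding or removing workers is easy in this simplified picture.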
![Page 12: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/12.jpg)
Scalable Architecture
![Page 13: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/13.jpg)
Pluggable Presto Connectors
![Page 14: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/14.jpg)
Presto Connector Data Model
• Connector: Driver for a data source.
• Example: HDFS, AWS S3, Cassandra, MySQL, SQL Server, Kafka
• Catalog: Contains schemas from a data source specified by the connector
• Schemas: Namespace to organize tables.
• Tables: Set of unordered rows organized into columns with types.
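One way to picture this hierarchy is as nested namespaces addressed by a fully qualified `catalog.schema.table` name, which is how tables are referenced in Presto queries. The sketch below is a toy model in Python: the catalog, schema, and table names are invented for illustration, and `resolve` is a hypothetical helper, not part of Presto.

```python
# Toy model: catalog -> schema -> table -> column types.
# All names below are invented for illustration.
CATALOGS = {
    "hive": {                      # catalog backed by a Hive connector
        "web": {                   # schema
            "page_views": {"user_id": "bigint", "url": "varchar"},
        },
    },
    "mysql": {                     # catalog backed by a MySQL connector
        "crm": {
            "customers": {"id": "bigint", "name": "varchar"},
        },
    },
}


def resolve(qualified_name):
    """Look up a table's columns from a 'catalog.schema.table' string."""
    parts = qualified_name.split(".")
    if len(parts) != 3:
        raise ValueError("expected catalog.schema.table")
    catalog, schema, table = parts
    try:
        return CATALOGS[catalog][schema][table]
    except KeyError as missing:
        raise LookupError(f"unknown name part: {missing}") from None


columns = resolve("hive.web.page_views")
```

Because each catalog is just another top-level key, a single query can in principle name tables from two different sources side by side, for example `hive.web.page_views` and `mysql.crm.customers`.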
![Page 15: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/15.jpg)
Presto Hive Connector for Object stores & File systems
![Page 16: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/16.jpg)
Presto Hive Connector – Access Control
![Page 17: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/17.jpg)
Presto Hive Connector – Data File Types
• Supported File Types: ORC, Parquet, Avro, RCFile, SequenceFile, JSON, Text
• No data ingestion needed
![Page 18: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/18.jpg)
Presto Druid Connector for real-time analytics
![Page 19: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/19.jpg)
Why Presto is Fast
• In-memory processing
• Pull model
• Columnar storage and execution
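To make the columnar point concrete, the sketch below stores the same toy data row-wise and column-wise and sums one field. With the columnar layout the aggregation reads only the one column it needs. This is a plain-Python illustration of the idea on invented data, not a description of how Presto's engine is implemented.

```python
# Same toy data in two layouts (values are invented).
rows = [
    {"orderkey": 1, "tax": 0.5, "discount": 0.0},
    {"orderkey": 2, "tax": 0.25, "discount": 0.1},
    {"orderkey": 3, "tax": 1.0, "discount": 0.0},
]

# Columnar layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row-wise: every row (with all of its fields) is visited.
row_total = sum(row["tax"] for row in rows)

# Column-wise: only the "tax" column is read.
col_total = sum(columns["tax"])
```

Both totals are the same; the difference is how much data has to be touched to compute them, which matters more as tables get wider.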
![Page 20: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/20.jpg)
The Life of a Query – Simple Scan
![Page 21: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/21.jpg)
The Life of a Query – Join and Aggregation
SELECT
  orders.orderkey, SUM(tax)
FROM orders
LEFT JOIN lineitem
  ON orders.orderkey = lineitem.orderkey
WHERE discount = 0
GROUP BY orders.orderkey
This example is from Presto: SQL on Everything
https://research.fb.com/publications/presto-sql-on-everything/
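To try the shape of this query without a Presto cluster, the sketch below runs it against Python's built-in `sqlite3` module on a few invented rows. SQLite is only a stand-in here: the table contents, and the assumption that `tax` and `discount` are `lineitem` columns, are made up for illustration, and the distributed plan Presto would produce is not modeled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderkey INTEGER PRIMARY KEY);
    CREATE TABLE lineitem (orderkey INTEGER, tax REAL, discount REAL);
    INSERT INTO orders VALUES (1), (2), (3);
    INSERT INTO lineitem VALUES
        (1, 0.5, 0), (1, 0.25, 0), (2, 1.0, 0.1), (3, 2.0, 0);
""")

# Same query as on the slide; ORDER BY is added only so the
# result order is deterministic in this example.
query = """
    SELECT orders.orderkey, SUM(tax)
    FROM orders
    LEFT JOIN lineitem
        ON orders.orderkey = lineitem.orderkey
    WHERE discount = 0
    GROUP BY orders.orderkey
    ORDER BY orders.orderkey
"""
result = conn.execute(query).fetchall()
conn.close()
```

On this data, order 2 does not appear in `result` because its only line item has a non-zero discount: the `WHERE` filter on a `lineitem` column removes rows after the join, so orders without a qualifying line item are dropped even though the join is a `LEFT JOIN`.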
![Page 22: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/22.jpg)
Logical Plan - Do NOT Join Two Big Tables
![Page 23: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/23.jpg)
Limitations
• Memory Limitation
• Fault Tolerance
• Single Coordinator
![Page 24: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/24.jpg)
Get started
Docker Sandbox for Presto
https://hub.docker.com/r/ahanaio/prestodb-sandbox
AWS Sandbox AMI for Presto
https://ahana.io/tutorials/aws-sandbox/
![Page 25: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/25.jpg)
Ahana
• SQL analytics company based on Presto
• Team of experts in cloud, database, and Presto
• Investment from Google Ventures
• Named CRN Top 10 Big Data Startup of 2020
• Premier member of the Presto Foundation
“[Ahana founders] have been strong supporters of the Presto Foundation since its launch in September 2019”
“We are excited to welcome Ahana, as the first and only company focused on supporting Presto of the Presto Foundation”
![Page 26: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/26.jpg)
https://events.linuxfoundation.org/prestocon/
PRESTO20WIBD
Free for WiBD Members
![Page 27: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/27.jpg)
Join the Presto Community
• Request a new feature or file a bug: github.com/prestodb/presto
• Slack: prestodb.slack.com
• Twitter: @prestodb
Stay Up-to-Date with Ahana
• URL: ahana.io
• Twitter: @ahanaio
![Page 28: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/28.jpg)
Q & A
And yes! We are hiring!
![Page 29: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/29.jpg)
8/27/20
![Page 30: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/30.jpg)
Presto Foundation: Community Driven
![Page 31: An introduction to Presto, an open source distributed Dipti ......25 Ahana • SQL analytics company based on Presto • Team of experts in cloud, database, and Presto • Investment](https://reader033.vdocuments.mx/reader033/viewer/2022060900/609dc11159236a34654f388c/html5/thumbnails/31.jpg)
Data-Driven Companies need Low Data Latency
Analysts and Scientists need to answer questions:
“Data Latency” = The time it takes from a user having a question to the time they can actually answer it
1. User wants to track or explore some new data
2. User meets with Data Eng team to make a plan
3. Data team acquires data and checks access permissions
4. Build and test the ETLs and make tables available to user
5. Notify the user so they can ask their questions
! Can be days or weeks of time