Using Elasticsearch and Couchbase Together to Build Large Scale Applications

Post on 26-Jun-2015

Category: Technology

DESCRIPTION

Couchbase Server 2.0 allows for full-text search integration. In this webinar we examine how you can integrate your Couchbase Server 2.0 cluster with an Elasticsearch Cluster to provide enhanced querying capabilities and build large scale applications.

TRANSCRIPT

  • 1. Using Elasticsearch and Couchbase Together to Build Large Scale Apps. Uri Boness, Founder, Elasticsearch; Dipti Borkar, Director, Products, Couchbase

  • 2. Introduction to Elasticsearch
  • 3. What is Elasticsearch?
      - Open source, Apache 2 license
      - Multi-tenant, realtime and distributed search & analytics engine
      - Backed by Elasticsearch (the company)
      - Proven technology in production
      - Over 2 million downloads
  • 4. What can Elasticsearch do?
      - Unstructured search: find all companies in the search market
      - Structured search: find all companies founded since 2000
      - Analytics: find the average annual revenue of all companies
      - Combine all: find the average annual revenue of all companies founded since 2000 within the search market
  • 5. (near) real-time!
  • 6. Distributed & multi-tenant
      - A node is a single Elasticsearch instance
      - Multiple nodes can form a cluster
      - A cluster can manage multiple indices
      - A cluster is agile & self managing: it continuously ensures that the distributed characteristics of all indices are maintained and that all nodes in the cluster are efficiently & effectively utilized
  • 7. The Index
  • 8. What's in an index?
      - An identified collection of documents
      - Built & designed for small & large scales
      - Data volumes: data can be split and distributed between shards
      - Loads & HA: each shard can have zero or more replicas
  • 9. starting a node (diagram: node_1)
  • 10. creating our first index (diagram: node_1)
      curl -XPUT localhost:9200/companies -d '{"settings" : {"index" : {"number_of_shards" : 2, "number_of_replicas" : 1}}}'
  • 11. the two shards are allocated (diagram: node_1 holding shards 0 and 1)
  • 12. starting a second node (diagram: node_1, node_2; shards 0 and 1)
  • 13. shards are relocating (diagram: node_1, node_2; shards 0 and 1)
  • 14. replicas are allocated (diagram: node_1, node_2; shards 0 and 1 plus replicas 1 and 0)
  • 15. Indexing Data
  • 16. the data: Documents are typically JSON formatted
      curl -XPUT localhost:9200/companies/company/1 -d '{"id" : "elasticsearch", "name" : "elasticsearch", "website" : "http://www.elasticsearch.com", "category" : "software", "overview" : "distributed search & analytics engine", "founded_year" : 2012, "location" : {"city" : "Amsterdam", "country_code" : "NL", "geo" : {"lat" : 52.370176, "lon" : 4.895008}}}'
  • 17-18. sending req. to one of the nodes (diagram: client; node_1, node_2, node_3; copies of shards 0 and 1 spread across the nodes). Callout: resolve the target shard
  • 19. resolve shard & index to primary (same diagram)
  • 20. replicate to replicas (same diagram)
  • 21. Searching
  • 22-23. unstructured search: Using an extensive & powerful Query DSL
      curl -XGET localhost:9200/companies/_search -d '{"query" : {"match" : {"overview" : "search"}}}'
      Callout: search for the term "search" in the overview field
  • 24-25. structured search: narrows the searchable document space
      curl -XGET localhost:9200/companies/company/_search -d '{"query" : {"filtered" : {"query" : {"match" : {"overview" : "search"}}, "filter" : {"range" : {"founded_year" : { "gte" : 2000 }}}}}}'
      Callout: only search companies that were founded since year 2000
  • 26-29. returned hits
      {..."hits": [{"_index": "companies", "_type": "company", "_id": "1", "_score": 0.13424811, "_source": {"id": "elasticsearch", "name": "elasticsearch", "website": "http://www.elasticsearch.com", "category": "software", "founded_year": 2012, "overview": "distributed search & analytics engine", "location": {"city": "Amsterdam", "country_code": "NL", "geo": {"lat": 52.370176, "lon": 4.895008}}}}]}}
  • 30. Query DSL
      - Queries (unstructured): term queries, boolean queries, phrase (proximity) queries, fuzzy/prefix/regexp/wildcards, more...
      - Filters (structured): term (exact match), range, boolean, geo_* (e.g. geo_distance)
  • 31. Analytics (a.k.a. facets)
  • 32. Analytics (facets)
      - Slice & dice your data
      - Compute aggregations over field values
      - Across any index field/s
      - All in (near) realtime
  • 33. used as navigation aid
  • 34. or analytics dashboards
  • 35. Elasticsearch is often used purely for analytics (without incorporating free text search)
  • 36-37. Example: Find the average revenue of all companies since 2000
      curl -XGET localhost:9200/companies/revenues/_search -d '{"query" : {"match_all" : {}}, "facets" : {"revenue_stats" : {"date_histogram" : {"key_field" : "year", "value_field" : "value", "interval" : "month"}}}}'
      Callout: return a yearly breakdown of stats over companies' revenues
  • 38-39. response
      "facets": {"revenue_stats": {"_type": "date_histogram", "entries": [{"time": 956448895664, "mean": 23.0}, {"time": 987984922557, "mean": 267.1034482758621}, {"time": 1019520942098, "mean": 195.51724137931035}...]}}
      Callouts: year 2000; avg revenue
  • 40. Types of analytics
      - terms: unique value counts
      - range: statistics of a specific field for a set of range groups of another field
      - statistical: stats over a specific field
      - terms_stats: stats over a specific field for every unique field value
      - date_/histogram: a breakdown of statistics of a specific field over a
  • 41. There's much more
      - Fine control of how documents are treated: indexed, stored, text analysis, relations
      - Additional features: highlighting; suggest API (type ahead, auto-completion); percolator (reverse search); support of document relations (parent/child); extensive geo-location search & analytics; more...
  • 42. Introduction to Couchbase
  • 43. Couchbase Server: NoSQL Document Database
  • 44. Couchbase Open Source Project
      - Leading NoSQL database project focused on distributed database technology and surrounding ecosystem
      - Supports both key-value and document-oriented use cases
      - All components are available under the Apache 2.0 Public License
      - Obtained as packaged software in both enterprise and community editions
  • 45. Couchbase Server
      - Easy Scalability: grow cluster without application changes, without downtime, with a single click
      - Consistent High Performance: consistent sub-millisecond read and write response times with consistent high throughput
      - Always On 24x365: no downtime for software upgrades, hardware maintenance, etc.
      - Flexible Data Model: JSON document model with no fixed schema
  • 46. Features in Couchbase Server 2.0
      - JSON support
      - Indexing and Querying
      - Cross data center replication
      - Incremental Map Reduce
  • 47. Additional Features
      - Built-in clustering (all nodes equal)
      - Data replication with auto-failover
      - Zero-downtime maintenance
      - Built-in managed cache
      - Append-only storage layer
      - Online compaction
      - Monitoring and admin API & UI
      - SDK for a variety of languages
  • 48. Couchbase Server 2.0 Architecture (diagram labels)
      - Cluster Manager (Erlang/OTP): Heartbeat; Process monitor; Global singleton supervisor; Configuration manager; Rebalance orchestrator; Node health monitor; vBucket state and replication manager; "on each node"; "one per cluster"
      - http REST management API/Web UI: HTTP 8091; Erlang port mapper 4369; Distributed Erlang 21100-21199
      - Data Manager: storage interface; Couchbase EP Engine 11210 (Memcapable 2.0); Moxi 11211 (Memcapable 1.0); Memcached; New Persistence Layer; Query Engine; 8092 Query API
  • 49. Cross data center replication: Data flow (diagram labels: App Server; Couchbase Server Node; Managed Cache; Disk Queue; Disk; Replication Queue, to other node; XDCR Queue, to other cluster; Doc 1; steps 2 and 3)
  • 50. Cross Datacenter Replication (XDCR)
  • 51. Couchbase plug-in for Elasticsearch
  • 52. How does it work? ElasticSearch: Unidirectional Cross Data Center Replication
  • 53. Elasticsearch Integration (via XDCR). Diagram labels: SERVER 1 with RAM CACHE and DISK (Doc 1, Doc 2, Doc 6, further Docs); SERVER 2 with RAM CACHE and DISK (Doc 1, Doc 2, Doc 6, further Docs); RAM CACH