DataStax: An Introduction to DataStax Enterprise Search
![Page 1: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/1.jpg)
An Introduction to DSE Search
Caleb Rackliffe
Software [email protected]
@calebrackliffe
![Page 2: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/2.jpg)
What problem were we trying to solve?
![Page 3: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/3.jpg)
Application
DataStax Driver
![Page 4: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/4.jpg)
SELECT * FROM customers WHERE country LIKE '%land%';
![Page 5: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/5.jpg)
What about secondary indexes?
![Page 6: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/6.jpg)
Why not just create your own secondary index implementation that supports wildcard queries?
![Page 7: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/7.jpg)
I need full-text search!
![Page 8: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/8.jpg)
![Page 9: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/9.jpg)
Why did we build something new?
![Page 10: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/10.jpg)
Application
DataStax Driver Solr Client
![Page 11: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/11.jpg)
Polyglot Persistence!
![Page 12: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/12.jpg)
Application
DataStax Driver Solr Client
Consistency
Cost
Complexity
![Page 13: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/13.jpg)
![Page 14: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/14.jpg)
partitioning
multi-DC
replication
geospatial
wildcards
monitoring
C* field type support (UDT, Tuple, collections)
security
live indexing
sorting
faceting
fault-tolerant distributed search
caching
text analysis
grouping
automatic index updates
JVM
CQL
repair
![Page 15: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/15.jpg)
Application
DataStax Driver Solr Client
Consistency
Complexity
Cost
![Page 16: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/16.jpg)
How about some examples?
![Page 17: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/17.jpg)
Creating a Solr Core
Start a node…
bash$ dse cassandra -s
Create a table…
cqlsh> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'Solr': 1};
cqlsh:test> CREATE TABLE test.user(username text PRIMARY KEY, fullname text, address_ map<text, text>);
Create the core…
bash$ dsetool create_core test.user generateResources=true
![Page 18: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/18.jpg)
The Schema
bash$ dsetool get_core_schema test.user
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<schema name="autoSolrSchema" version="1.5">
  <types>
    <fieldType class="org.apache.solr.schema.TextField" name="text">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType class="org.apache.solr.schema.StrField" name="string"/>
  </types>
  <fields>
    <field indexed="true" name="username" stored="true" type="string"/>
    <field indexed="true" name="fullname" stored="true" type="text"/>
    <dynamicField indexed="true" name="address_*" stored="true" type="string"/>
  </fields>
  <uniqueKey>username</uniqueKey>
</schema>
![Page 19: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/19.jpg)
Insert Rows (…and Index Documents)
cqlsh:test> INSERT INTO user(username, fullname, address_) VALUES('sbtourist', 'Sergio Bossa', {'address_home': 'UK', 'address_work': 'UK'});
cqlsh:test> INSERT INTO user(username, fullname, address_) VALUES('bereng', 'Berenguer Blasi', {'address_home': 'ES', 'address_work': 'ES'});
cqlsh:test> INSERT INTO user(username, fullname, address_) VALUES('thegrinch', 'Sven Delmas', {'address_home': 'US', 'address_work': 'HQ'});
…and that’s it. No ETL. No writing to a second datastore.
![Page 20: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/20.jpg)
Wildcards
cqlsh:test> SELECT username, address_ FROM user WHERE solr_query='{"q":"address_home:U*"}';
 username  | address_
-----------+----------------------------------------------
 sbtourist | {'address_home': 'UK', 'address_work': 'UK'}
 thegrinch | {'address_home': 'US', 'address_work': 'HQ'}
(2 rows)
![Page 21: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/21.jpg)
Sorting and Limits
cqlsh:test> SELECT username, address_ FROM user WHERE solr_query='{"q":"*:*", "sort":"address_home desc"}';
 username  | address_
-----------+----------------------------------------------
 thegrinch | {'address_home': 'US', 'address_work': 'HQ'}
 sbtourist | {'address_home': 'UK', 'address_work': 'UK'}
 bereng    | {'address_home': 'ES', 'address_work': 'ES'}
(3 rows)
cqlsh:test> SELECT username, address_ FROM user WHERE solr_query='{"q":"*:*", "sort":"address_home desc"}' LIMIT 1;
 username  | address_
-----------+----------------------------------------------
 thegrinch | {'address_home': 'US', 'address_work': 'HQ'}
(1 rows)
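Hand-writing JSON inside a quoted CQL string is easy to get wrong, as the mixed quote characters on these slides suggest. The sketch below is not part of the original deck: it uses only the Python standard library, and `build_solr_query` and `to_cql` are hypothetical helper names. It assembles the same `solr_query` strings programmatically. In a real application the JSON would more likely be passed as a bound parameter through the DataStax driver than spliced into the statement text.

```python
import json

def build_solr_query(q, sort=None, facet_field=None):
    """Return a JSON string in the shape used by the solr_query examples."""
    query = {"q": q}
    if sort is not None:
        query["sort"] = sort
    if facet_field is not None:
        query["facet"] = {"field": facet_field}
    # Compact separators keep the output close to the hand-written form.
    return json.dumps(query, separators=(",", ":"))

def to_cql(columns, table, solr_query, limit=None):
    """Embed the JSON in a CQL SELECT; single quotes are doubled as CQL requires."""
    escaped = solr_query.replace("'", "''")
    cql = f"SELECT {', '.join(columns)} FROM {table} WHERE solr_query='{escaped}'"
    if limit is not None:
        cql += f" LIMIT {int(limit)}"
    return cql + ";"

wildcard = build_solr_query("address_home:U*")
sorted_all = build_solr_query("*:*", sort="address_home desc")
print(to_cql(["username"], "user", wildcard))
print(to_cql(["username"], "user", sorted_all, limit=1))
```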
![Page 22: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/22.jpg)
Faceting
cqlsh:test> SELECT * FROM user
WHERE solr_query='{"q":"*:*", "facet":{"field" : "address_work"}}';
 facet_fields
-----------------------------------------------------
 {"address_work" : {"ES" : 1 , "HQ" : 1 , "UK" : 1}}
(1 rows)
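The facet result comes back as a single `facet_fields` value containing JSON, so the client has to decode it. A minimal sketch, assuming the value arrives as the string printed on the slide (`facet_counts` is a hypothetical helper, not a driver API):

```python
import json

# The facet_fields value exactly as printed on the slide.
facet_fields = '{"address_work" : {"ES" : 1 , "HQ" : 1 , "UK" : 1}}'

def facet_counts(raw, field):
    """Return (value, count) pairs for one faceted field, largest count first."""
    counts = json.loads(raw)[field]
    # Sort by descending count, then by value so ties come out in a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

for value, count in facet_counts(facet_fields, "address_work"):
    print(f"{value}: {count}")
```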
![Page 23: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/23.jpg)
Partition Restrictions
cqlsh:test> CREATE TABLE event(sensor_id bigint, recording_time timestamp, description text, PRIMARY KEY(sensor_id, recording_time));
…
cqlsh:test> SELECT recording_time, description FROM test.event WHERE sensor_id = 2314234432 AND
solr_query='description:unremarkable';
![Page 24: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/24.jpg)
What do the internals look like?
![Page 25: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/25.jpg)
Indexing
![Page 26: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/26.jpg)
[Diagram: three index states (Buffered, Searchable, Durable) laid out across Memory and Disk]
![Page 27: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/27.jpg)
![Page 28: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/28.jpg)
[Diagram: the RAMBuffer and some segments sit in memory, other segments sit on disk. Documents are buffered in the RAMBuffer, become searchable after a soft commit, and become durable after a hard commit.]
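The three states can be illustrated with a toy model. This is a deliberately simplified sketch, not DSE or Lucene code: `ToyIndex` and its methods are hypothetical, and it assumes a hard commit also makes pending documents searchable, which is a configuration choice in real Solr.

```python
class ToyIndex:
    """Toy model of the buffered -> searchable -> durable lifecycle."""

    def __init__(self):
        self.ram_buffer = []       # buffered: accepted but not yet searchable
        self.memory_segments = []  # searchable: visible to queries, lost on crash
        self.disk_segments = []    # durable: searchable and survives a crash

    def add(self, doc):
        self.ram_buffer.append(doc)

    def soft_commit(self):
        # Turn the buffer into a new in-memory segment that queries can see.
        if self.ram_buffer:
            self.memory_segments.append(list(self.ram_buffer))
            self.ram_buffer.clear()

    def hard_commit(self):
        # Make everything accepted so far durable by moving it to disk.
        self.soft_commit()
        self.disk_segments.extend(self.memory_segments)
        self.memory_segments.clear()

    def search(self, term):
        segments = self.memory_segments + self.disk_segments
        return [doc for seg in segments for doc in seg if term in doc]

    def crash(self):
        # Only disk segments survive a restart in this model.
        self.ram_buffer.clear()
        self.memory_segments.clear()

index = ToyIndex()
index.add("sergio bossa")
print(index.search("sergio"))  # still buffered, so nothing is found
index.soft_commit()
print(index.search("sergio"))  # now searchable
```

In this model, more frequent soft commits shorten the delay before a write is searchable, while hard commits bound how much would have to be re-indexed after a crash.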
![Page 29: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/29.jpg)
Querying
![Page 30: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/30.jpg)
Replica Selection
RF=2, shards: A-E
[Diagram: a coordinator and nodes 1-5; each of the shards A-E is held by two nodes; legend: Healthy / Unhealthy]
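The slide text does not spell out the rule the coordinator uses, so the following is only one plausible sketch: choose a healthy replica for every shard, preferring nodes that are already part of the plan so the query fans out to fewer nodes. The shard placement and the `select_replicas` helper are assumptions made for illustration, not values read off the diagram.

```python
def select_replicas(shard_replicas, healthy):
    """Pick one healthy node per shard, reusing already-chosen nodes when possible."""
    plan = {}
    chosen = set()
    for shard in sorted(shard_replicas):
        candidates = [n for n in shard_replicas[shard] if n in healthy]
        if not candidates:
            raise RuntimeError(f"no healthy replica for shard {shard}")
        # Prefer a node that is already part of the plan to limit fan-out.
        reused = [n for n in candidates if n in chosen]
        node = min(reused) if reused else min(candidates)
        plan[shard] = node
        chosen.add(node)
    return plan

# Hypothetical placement: 5 nodes, RF=2, shards A-E.
shard_replicas = {
    "A": [1, 2], "B": [2, 3], "C": [3, 4], "D": [4, 5], "E": [5, 1],
}
healthy = {1, 2, 4, 5}  # node 3 is treated as unhealthy in this example
print(select_replicas(shard_replicas, healthy))
```

With node 3 excluded, the plan covers all five shards using nodes 1, 2, and 4. If both replicas of a shard were unhealthy, the sketch raises an error instead of returning a partial plan.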
![Page 31: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/31.jpg)
![Page 32: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/32.jpg)
What happens if a shard query fails?
![Page 33: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/33.jpg)
Failover: Phase 1
4 nodes, RF = 2, shards: A-D, no vnodes
[Diagram: nodes 1, 2, 3, 4]
![Page 34: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/34.jpg)
Failover: Phase 2
4 nodes, RF = 2, shards: A-D, no vnodes
[Diagram: nodes 1, 2, 3, 4]
![Page 35: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/35.jpg)
![Page 36: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/36.jpg)
Failover: Phase 3
4 nodes, RF = 2, shards: A-D, no vnodes
[Diagram: nodes 1, 2, 3, 4]
![Page 37: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/37.jpg)
Platform Integrations
![Page 38: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/38.jpg)
Search + Analytics: Explicit Predicate Pushdown
bash$ dse spark
scala> val table = sc.cassandraTable("wiki","solr")
scala> val result = table.select("id","title").where("solr_query='body:dog'").collect
![Page 39: DataStax: An Introduction to DataStax Enterprise Search](https://reader034.vdocuments.mx/reader034/viewer/2022051404/58edc83e1a28ab9d1c8b4577/html5/thumbnails/39.jpg)
http://docs.datastax.com