Advanced Queries on the Infinispan Data Grid
Navin Surtani
13th May 2015
GeeCon, Krakow
Who is Navin?
• Worked on Red Hat projects since 2008
• Infinispan
• Hibernate Search
• WildFly/JBoss EAP
Tweet your questions
@navssurtani
#advancedqueries
What are we talking about?
• What is Infinispan?
• The Query module
• Backend tech: Hibernate Search & Apache Lucene
• Setup and configuration
• Demo and code walkthrough
What is Infinispan?
• Distributed in-memory key/value data store
• Extension of java.util.Map
• Modes
• Library: embed into an EE/SE application
• Server: connect remotely
Some features
• Fully transactional (JTA, XA)
• Hibernate 2nd level caching
• Full-text querying
• Non-JVM clients for server mode
How do I use it?
• Cache: sits in front of your NoSQL data store
• In-memory DB: primary data store is in memory
• Clusterability: manage state that is distributed
… but we have a problem here
• How do I find my data?
• I don’t want to give out keys
• I might not know what I need to find
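To make the problem concrete, here is a minimal sketch that uses a plain java.util.Map as a stand-in for a cache. The Person class and the findBySurname helper are hypothetical names introduced for illustration; they are not part of Infinispan. The point is only that, without an index, any lookup by something other than the key has to scan every value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class KeyOnlyLookup {

    // Hypothetical value type, for illustration only.
    static final class Person {
        final String name;
        final String surname;

        Person(String name, String surname) {
            this.name = name;
            this.surname = surname;
        }
    }

    // Without an index, finding values by content means visiting every entry.
    static List<Person> findBySurname(Map<Integer, Person> store, String surname) {
        List<Person> matches = new ArrayList<>();
        for (Person p : store.values()) {
            if (p.surname.equals(surname)) {
                matches.add(p);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<Integer, Person> store = new HashMap<>();
        store.put(1, new Person("Navin", "Surtani"));
        store.put(2, new Person("Ada", "Lovelace"));

        // Cheap: the key is known.
        System.out.println(store.get(1).surname);
        // Expensive on large data sets: only something about the value is known.
        System.out.println(findBySurname(store, "Lovelace").size());
    }
}
```

A query module avoids the scan by maintaining an index over the values as they are written.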
Query module to the rescue
• Allows searching of values in the cache
• Original project: JBoss Cache Searchable in 2008
• Integration between Infinispan and Hibernate Search
• Became Query module in 2009
Full-text search
• Library example:
• Is author name: Surname, Name?
• Name, Surname?
• How do I deal with …
• Special characters?
• Typos?
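The name-order and special-character questions can be illustrated with a small normalisation step. This is a rough sketch using only the JDK, assuming that reducing a name to lower-case, accent-free tokens is good enough for the example; NameNormalizer, tokens and sameName are hypothetical names. Lucene handles this kind of work through its analyzers, which are far more thorough than this.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

final class NameNormalizer {

    // Reduce a name to a set of lower-case, accent-free tokens so that
    // word order, punctuation and diacritics no longer matter.
    static Set<String> tokens(String raw) {
        // Decompose accented characters, then drop the combining marks.
        String stripped = Normalizer.normalize(raw, Normalizer.Form.NFD)
                .replaceAll("\\p{M}", "");
        // Split on anything that is not a letter or a digit.
        String[] parts = stripped.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        Set<String> result = new TreeSet<>();
        for (String part : parts) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    // Two names are treated as equal when they normalise to the same tokens.
    static boolean sameName(String a, String b) {
        return tokens(a).equals(tokens(b));
    }

    public static void main(String[] args) {
        System.out.println(sameName("Surtani, Navin", "Navin Surtani"));
        System.out.println(sameName("Jos\u00e9 P\u00e9rez", "perez jose"));
    }
}
```

Typos are not covered by this step; they need a similarity measure rather than exact token equality.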
Lucene
• Scalable high-performance indexing
• Small RAM requirement ~ 1MB heap
• Index size ~ 20-30% size of data
• 100% open source and written in Java
• Apache Licensing
• Ports to other languages exist
Lucene
• Optimised for searching and querying
• Rich feature-set for query types
• Typo-tolerant searches
• Similar keywords
• Document structure
• Unstructured data
• Documents stored in-memory or on disk
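Typo-tolerant matching is usually built on edit distance: the number of single-character insertions, deletions and substitutions needed to turn one term into another. The following is a textbook Levenshtein sketch, not Lucene's implementation, which is considerably more sophisticated; EditDistance, levenshtein and fuzzyMatch are hypothetical names, and the maxEdits threshold is an assumption of the example.

```java
import java.util.Locale;

final class EditDistance {

    // Classic dynamic-programming Levenshtein distance, keeping only two rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // A term matches when it is within maxEdits edits of the candidate, ignoring case.
    static boolean fuzzyMatch(String term, String candidate, int maxEdits) {
        return levenshtein(term.toLowerCase(Locale.ROOT),
                candidate.toLowerCase(Locale.ROOT)) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));
        System.out.println(fuzzyMatch("Surtani", "Surtanni", 1));
    }
}
```

Note that plain Levenshtein counts a swap of two adjacent letters as two edits.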
Two features we will look at
Facets
• Obtain counts, or frequencies, of results per category
• O(1) to obtain counts
• Example: eBay-style category counts
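Conceptually, a facet is a count of results per distinct value of a field. The sketch below shows that idea with plain Java streams over an in-memory list; FacetCounts, Vote and countByCandidate are hypothetical names, and the data is made up for the example. Unlike a faceting engine, this version recomputes the counts by scanning all votes each time.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class FacetCounts {

    // Hypothetical vote record, for illustration only.
    static final class Vote {
        final String office;
        final String candidate;

        Vote(String office, String candidate) {
            this.office = office;
            this.candidate = candidate;
        }
    }

    // One bucket per distinct candidate, each holding the number of votes.
    static Map<String, Long> countByCandidate(List<Vote> votes, String office) {
        return votes.stream()
                .filter(v -> v.office.equals(office))
                .collect(Collectors.groupingBy(v -> v.candidate, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Vote> votes = Arrays.asList(
                new Vote("governor", "Alice"),
                new Vote("governor", "Bob"),
                new Vote("governor", "Alice"),
                new Vote("senator", "Carol"));
        System.out.println(countByCandidate(votes, "governor"));
    }
}
```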
Filters
• Filters are:
• Declarative
• Stackable
• Reusable
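The three properties can be modelled with java.util.function.Predicate. This is only an analogy, not the Hibernate Search filter API: StackedFilters, Ballot, forOffice, inRegion and apply are hypothetical names. Each filter is declared once, can be reused across queries, and can be stacked with and().

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class StackedFilters {

    // Hypothetical ballot record, for illustration only.
    static final class Ballot {
        final String candidate;
        final String office;
        final String region;

        Ballot(String candidate, String office, String region) {
            this.candidate = candidate;
            this.office = office;
            this.region = region;
        }
    }

    // Declarative and reusable: each filter states what it keeps, once.
    static Predicate<Ballot> forOffice(String office) {
        return b -> b.office.equals(office);
    }

    static Predicate<Ballot> inRegion(String region) {
        return b -> b.region.equals(region);
    }

    // The query code stays the same; only the filter passed in changes.
    static List<Ballot> apply(List<Ballot> ballots, Predicate<Ballot> filter) {
        return ballots.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Ballot> ballots = Arrays.asList(
                new Ballot("Alice", "governor", "North"),
                new Ballot("Bob", "governor", "South"),
                new Ballot("Carol", "senator", "North"));
        // Stacking: both filters must accept a ballot for it to be kept.
        Predicate<Ballot> northernGovernor = forOffice("governor").and(inRegion("North"));
        System.out.println(apply(ballots, northernGovernor).size());
    }
}
```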
How it all fits together
XML Configuration
<local-cache name="Votes">
    <transaction mode="NONE"/>
    <indexing index="ALL">
        <property name="default.directory_provider">ram</property>
    </indexing>
</local-cache>
Programmatic Configuration
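import java.util.Properties;
import org.infinispan.Cache;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;
// Same setting as the XML configuration: keep the index in memory
Properties properties = new Properties();
properties.put("default.directory_provider", "ram");
ConfigurationBuilder cb = new ConfigurationBuilder();
cb.indexing()
    .enable()
    .indexLocalOnly(true) // Will only index local node
    .withProperties(properties);
EmbeddedCacheManager cm = new DefaultCacheManager(cb.build());
// My key is an Integer (primitives cannot be type arguments) and value is of type Person
Cache<Integer, Person> cache = cm.getCache();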
Annotations required
• @Indexed
• @Field
• @IndexedEmbedded
Running queries
// I have a cache instance which is not empty
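import org.apache.lucene.search.Query;
import org.hibernate.search.query.dsl.QueryBuilder;
import org.infinispan.query.CacheQuery;
import org.infinispan.query.Search;
import org.infinispan.query.SearchManager;
SearchManager sm = Search.getSearchManager(cache);
QueryBuilder qb = sm.buildQueryBuilderForClass(Person.class)
    .get();
// Lucene query: match "Surtani" on the indexed "name" field
Query q = qb.keyword().onField("name").matching("Surtani")
    .createQuery();
CacheQuery cq = sm.getQuery(q, Person.class);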
How it all ties together …
• Web-application using Infinispan running on WildFly 9 CR
• App-server ships with Query module
• Use a web-form to vote in an ‘election’
• One vote for governor
• One vote for senator
Flow I: Query ‘warm-up’
• Story: ‘We don’t know who is running in the election’
• WebSocket endpoint to delegate to Worker object
• Worker object executes on CandidateCacheDao
• Returns results through WebSocket endpoint
Flow II: Voting form
• Story: ‘This is our ballot paper’
• Front-end creates JSON to go to WebSocket endpoint
• JSON gets parsed by BallotWorker object
• BallotWorker puts parsed JSON into Cache through VotingCacheDao
Flow III: Faceted search
• Story: ‘We want to know who has won the election’
• Front-end asks for the result of an election (governor or senator)
• ElectionResultWorker object runs a query through the VotingCacheDao
• Result passed back to web-page as JSON
Flow III: Faceted search with Filter
• Story: ‘We would like to know who has received the most votes in a particular region’
• Essentially the same workflow as III but we also pass a Filter to our query
• We use the same query code, except that we also apply the Filter to the results
Demo time
Roadmap
• API:
• JDK 8 integration
• FunctionalCache interface
• Query:
• Query on Non-Indexed fields
• Continuous querying
Summary
• Query module 101
• Configuration
• Demo
• Basic query on multiple fields
• Faceted search with and without filter
Get in touch
Twitter:
• @navssurtani
• @infinispan
• @c2b2consulting
IRC:
• #infinispan on FreeNode
Blogs:
• navssurtani.blogspot.com
• blog.infinispan.org
• blog.c2b2.co.uk
Demo:
• github.com/navssurtani/query-demo
Q&A
#thankyougeecon