Israel JBoss User Group, Session 10 / 11.12.2008
Hibernate Search in Action
By: Yanai Franchi, Chief Architect, Tikal
Hosted by Tikal. www.tikalk.com Cost-Benefit Open Source
Agenda
The mismatch problems
Hibernate Search in a nutshell
Mapping – Solving the structural mismatch
Indexing – Solving the synchronization mismatch
Querying – Solving the retrieval mismatch
Demo
Scale Hibernate Search
The Mismatch Problems
Impedance Mismatch Between Object And Index Models
[Diagram: Class != Document. Classes form the object model; Documents form the Index.]
Mismatch With Types
Mismatch With Associations
Synchronization Mismatch
Retrieval Mismatch
No conversion – You don't want to go there...
  » Lose the domain-driven and OO paradigm
  » No type safety or strong typing
Conversion
  » “Rehydrate” objects from the Document field values stored in the index.
    • No lazy loading or transparent access
    • No automatic synchronization with the DB (and index)
  » Retrieve Hibernate managed objects.
    • Loading one-by-one is NOT efficient...
Hibernate Search in a Nutshell
Hibernate Search Goal
Leverage Hibernate (ORM) and Apache Lucene (full-text search engine) while addressing the mismatch problems.
Hibernate Search Features
Under the Hibernate platform
  » LGPL
Built on top of Hibernate Core
Uses Apache Lucene(tm) under the hood
  » Hides the low-level and complex Lucene API usage
Solves the mismatches
Project Set-up
Set your classpath
  » hibernate-search.jar: the core API and engine of Hibernate Search
  » lucene-core.jar: Apache Lucene engine
  » hibernate-commons-annotations.jar: some common utilities for the Hibernate project
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-search</artifactId>
    <version>3.1.0.GA</version>
</dependency>
<dependency>
    <groupId>org.hibernate</groupId>
    <artifactId>hibernate-annotations</artifactId>
    <version>3.4.0.GA</version>
</dependency>
(Optional) Project Configuration
Configure Hibernate Search in hibernate.cfg.xml or META-INF/persistence.xml
  » No need to register event listeners
    • When using JPA/Hibernate Annotations
<?xml version="1.0" encoding="UTF-8"?>
<!-- META-INF/persistence.xml -->
<persistence>
    <persistence-unit name="dvdstore-catalog">
        <jta-data-source>jdbc/test</jta-data-source>
        <properties>
            ...
            <property name="hibernate.search.default.indexBase" value="/users/application/indexes"/>
            ...
        </properties>
    </persistence-unit>
</persistence>
Map Your Domain Model
@Entity
@Indexed
public class Book {
    @Id // → Automatically mapped to @DocumentId
    private Integer id;

    @Field
    private String title;
    ...
}
What Does The Index Look Like?
Hibernate Search Managers
// Hibernate API
Session session = sessionFactory.getCurrentSession();
FullTextSession fts = org.hibernate.search.Search.getFullTextSession(session);

// JPA API
@PersistenceContext EntityManager em;
...
FullTextEntityManager ftem = org.hibernate.search.jpa.Search.getFullTextEntityManager(em);
Query in Action
String searchStr = "title:hypernate~ OR description:persistence";
org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(searchStr);

// Can accept more than one class
javax.persistence.Query jpaQuery = ftEm.createFullTextQuery(luceneQuery, Book.class, Course.class);
List booksAndCourses = jpaQuery.getResultList();

applySomeChanges(booksAndCourses);
Books and Courses enter the JPA “persistence context”, so changes are automatically applied to the DB (and to the Lucene index).
Mapping – Solve The Structural Mismatch
Mapping Entity & Primary Key
@Entity
@Indexed
public class Item {
    @Id // → Automatically mapped to @DocumentId
    @GeneratedValue
    private Integer id;
    ...
}
Marking Properties As Indexed
Mapping Properties
@Entity
@Indexed
public class Item {
    @Id @GeneratedValue
    private Integer id;

    @Field(index=Index.UN_TOKENIZED)
    private String ean;

    @Field(store=Store.YES)
    private String title;

    // Will not be indexed while still being stored in the DB
    private String imageURL;

    private String description;
    ...
    @Field // Annotation on the getter
    public String getDescription() {
        return this.description;
    }
}
Multiple Indexed Property
Properties used to sort query results (rather than sorting by relevance) must be indexed but must not be tokenized.
  » Use the UN_TOKENIZED indexing strategy
@Entity
@Indexed
public class Item {
    ...
    @Fields({
        @Field(index=Index.TOKENIZED),
        @Field(name="title_sort", index=Index.UN_TOKENIZED)
    })
    private String title;
    ...
}
Mapping Inheritance
@Entity // Superclasses do not have to be marked @Indexed
public abstract class Item {
    @Id // used as @DocumentId
    @GeneratedValue
    private Integer id;

    @Field // Superclasses can contain indexed properties
    private String title;
    ...
}

@Entity
@Indexed // Concrete subclasses are marked @Indexed
public class Dvd extends Item {
    @Field(index=Index.UN_TOKENIZED)
    private String ean;
    ...
}
Built-In Bridges
Bridges convert a Java object type into a string.
Some field bridges also convert the string back into the original object structure
  » Identity and projected fields
Hibernate Search comes with many out-of-the-box field bridges, but you can write (or reuse) your own...
  » PDF, Microsoft Word and other document types
  » Index year, month and day in separate fields
  » Make numbers comparable
Bridge Issues
Dates
  » [20080112 TO 20080201] - field is between 12 January 2008 and 1 February 2008.
  » Hibernate Search lets you pick the date precision you wish, from year to millisecond:
@DateBridge( resolution = Resolution.DAY )
private Date birthdate;

@DateBridge( resolution = Resolution.MINUTE )
private Date flightArrival;
Numbers
  » Strings are compared lexicographically, so “2” > “12”
  » [6 TO 9] => 6 OR 7 OR 8 OR 9
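The date range above works because a date written most-significant-part first, at a fixed width, sorts as a string in the same order as it sorts in time. The following is a minimal plain-Java sketch of that idea, not the Hibernate Search bridge itself: the class `DateResolutionSketch`, its methods and the exact patterns are assumptions made for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative only: mimics storing a date as a sortable string at a chosen resolution.
final class DateResolutionSketch {
    // Assumed string layouts for DAY and MINUTE resolution.
    static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");
    static final DateTimeFormatter MINUTE = DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // Everything finer than the day is dropped.
    static String toDay(LocalDateTime t) {
        return t.format(DAY);
    }

    // Everything finer than the minute is dropped.
    static String toMinute(LocalDateTime t) {
        return t.format(MINUTE);
    }

    // Inclusive range check on the string form, like [lower TO upper].
    static boolean inRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }
}
```

A coarser resolution produces fewer distinct terms in the index, which is one reason to pick DAY for a birth date and MINUTE for a flight arrival.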
Custom Bridge in Action
Mapping a property to split the information into multiple fields in the index.
@Entity
@Indexed
public class Item {
    @Field
    @FieldBridge(
        impl=PaddedRoundedPriceBridge.class, // So 2 becomes "002"
        params= { @Parameter(name="pad", value="3") }
    )
    private double price;
    ...
}
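The padding logic that such a bridge needs can be sketched in plain Java. This is only an illustration of why "2" becomes "002"; it is not the `PaddedRoundedPriceBridge` from the slide, and the class `PaddingSketch` and its `pad` method are hypothetical names.

```java
// Illustrative only: rounds a price and left-pads it so string order matches numeric order.
final class PaddingSketch {
    static String pad(double price, int width) {
        long rounded = Math.round(price);
        if (rounded < 0) {
            throw new IllegalArgumentException("negative values are not handled in this sketch");
        }
        String digits = Long.toString(rounded);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " digits");
        }
        StringBuilder sb = new StringBuilder();
        // Prepend zeros until the requested width is reached.
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }
}
```

With a pad of 3, 2 is stored as "002" and 12 as "012", so the lexicographic comparison used by the index agrees with the numeric one.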
ClassBridge
@Entity
@Indexed
@ClassBridge(name="childrenOnly", impl=ChildrenFlagBridge.class, index=Index.UN_TOKENIZED)
public class Item {...}

public class ChildrenFlagBridge implements StringBridge {
    public String objectToString(Object object) {
        Item item = (Item) object;
        Category childrenCategory = new Category("Children");
        boolean hasChildrenCategory = item.getCategories().contains(childrenCategory);
        return hasChildrenCategory ? "yes" : "no";
    }
}
How to Map Associations?
De-Normalize Associations
De-Normalization Implication
Can return items where:
  » One of the actors is “Cruise” and another one is “McGillis”
  » One of the actors is either “Cruise” or “McGillis”
  » “Cruise” plays but “McGillis” does not
Can **NOT** do:
  » Return items where one of the actors is “Tom” and his home town is “Atlanta”.
    • Turn the query upside down by targeting actor as the root entity and then collect the matching items
    • Use a query filter to refine an initial query
Sometimes you may end up in a dead end...
  » Apply part of the query (the discriminant part) in Lucene
  » Collect the matching identifiers
  » Run an HQL query restricting by these identifiers.
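The reason the correlated query fails can be shown with a small plain-Java sketch. It assumes that de-normalizing an association flattens all associated values into multi-valued fields of one document; the class `DenormalizationSketch`, the field names and the sample actors are made up for illustration and are not Hibernate Search code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: shows how flattening an association loses the link between values.
final class DenormalizationSketch {
    static final class Actor {
        final String name;
        final String homeTown;
        Actor(String name, String homeTown) {
            this.name = name;
            this.homeTown = homeTown;
        }
    }

    // Flattens the actors of one item into multi-valued fields of a single "document".
    static Map<String, Set<String>> toDocument(List<Actor> actors) {
        Map<String, Set<String>> doc = new HashMap<>();
        for (Actor a : actors) {
            doc.computeIfAbsent("actors.name", k -> new HashSet<>()).add(a.name);
            doc.computeIfAbsent("actors.homeTown", k -> new HashSet<>()).add(a.homeTown);
        }
        return doc;
    }

    // A term match only knows that the value occurs somewhere in the field.
    static boolean matches(Map<String, Set<String>> doc, String field, String value) {
        return doc.getOrDefault(field, Collections.<String>emptySet()).contains(value);
    }
}
```

For an item played by a hypothetical "Tom" from "Boston" and "Kelly" from "Atlanta", both `actors.name:Tom` and `actors.homeTown:Atlanta` match the same document, although no single actor satisfies both conditions. That false positive is why the query has to be turned around or refined in the database.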
Indexing Embeddables
@Embeddable
public class Rating {
    @Field(index=Index.UN_TOKENIZED) private Integer overall;
    @Field(index=Index.UN_TOKENIZED) private Integer scenario;
    @Field(index=Index.UN_TOKENIZED) private Integer soundtrack;
    @Field(index=Index.UN_TOKENIZED) private Integer picture;
    ...
}

@Entity
@Indexed
public class Item {
    @IndexedEmbedded private Rating rating;
    ...
}
“Find items with an overall rating equal to 9”: rating.overall:9
...And Embeddables Collection
@Embeddable
public class Country {
    @Field private String name;
    ...
}

@Entity
@Indexed
public class Item {
    @CollectionOfElements
    @IndexedEmbedded
    private Collection<Country> distributedIn;
    ...
}
Don't abuse @IndexedEmbedded. Be careful with collection indexing...
Indexing Associated Entities
When a change is made to an associated entity, Hibernate Search must update all the documents in which that entity is embedded.
@Entity
@Indexed
public class Item {
    @ManyToMany
    @IndexedEmbedded // embed actors when indexing; @IndexedEmbedded(depth=4) would limit the depth
    private Set<Actor> actors;
    ...
}

@Entity
@Indexed
public class Actor {
    @Field private String name;

    @ManyToMany(mappedBy="actors")
    @ContainedIn
    private Set<Item> items;
    ...
}
If Actor is not immutable, the relation between the entities must become bi-directional (@ContainedIn); otherwise the index has to be updated manually.
Indexing Your Data - Solve The Synchronization Mismatch
Defining a DirectoryProvider
# Production configuration
hibernate.search.default.directory_provider org.hibernate.search.store.FSDirectoryProvider
hibernate.search.default.indexBase /Users/production/indexes

# File directory structure
/Users
  /production
    /indexes
      /com.manning.hsia.dvdstore.model.Item
      /com.manning.hsia.dvdstore.model.Actor

# Test configuration
hibernate.search.default.directory_provider org.hibernate.search.store.RAMDirectoryProvider
Analyzers - Lucene Brain
The key feature of full-text search
An analyzer takes text as input, chunks it into individual words (with a tokenizer) and optionally applies some operations (with filters) to the tokens.
Applied globally, per entity, or per property
Tokenizers & Filters
StandardTokenizer - Splits words at punctuation characters and removes punctuation signs, with a couple of exception rules.
Filters alter the stream of tokens (remove/change/add)
  » StandardFilter - Removes apostrophes and acronym dots
  » LowerCaseFilter
  » StopFilter - Eliminates “noise” words.
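The tokenizer-then-filters chain can be sketched in a few lines of plain Java. This is a simplified stand-in, not Lucene's StandardAnalyzer: the class `AnalyzerSketch`, the splitting rule and the stop-word list are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: a tokenizer followed by a lower-case filter and a stop filter.
final class AnalyzerSketch {
    // Assumed, very short list of "noise" words.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "and"));

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Tokenizer step: split on anything that is not a letter or a digit.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            // Lower-case filter.
            String lower = raw.toLowerCase(Locale.ROOT);
            // Stop filter: drop noise words.
            if (STOP_WORDS.contains(lower)) {
                continue;
            }
            tokens.add(lower);
        }
        return tokens;
    }
}
```

The same chain has to be applied to the indexed text and to the query text, otherwise a query for "Rings" would never meet the indexed token "rings".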
StandardAnalyzer in Action
@AnalyzerDef( // This is the default → no need to write it
    name="applicationAnalyzer",
    tokenizer = @TokenizerDef(factory=StandardTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = StandardFilterFactory.class),
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = StopFilterFactory.class)
    }
)
@Entity
@Indexed
@Analyzer(definition="applicationAnalyzer")
public class Item {...}
More Available Filters
Synonym
Stem
Phonetic
N-Gram
Les Misérable => LS MSRP
N-Gram Analyzer Example
@AnalyzerDef(
    name="ngramAnalyzer",
    tokenizer = @TokenizerDef(factory=StandardTokenizerFactory.class),
    filters = {
        // Standard, LowerCase and Stop filters go here
        @TokenFilterDef(factory = NGramTokenFilterFactory.class,
            params = {
                @Parameter(name="minGramSize", value="3"),
                @Parameter(name="maxGramSize", value="3")
            })
    }
)
@Entity
@Indexed // The default StandardAnalyzer will be used
public class Item {
    @Fields({
        @Field(index=Index.TOKENIZED),
        // Must be TOKENIZED for the n-gram analyzer to be applied
        @Field(name="title_ngram", index=Index.TOKENIZED,
            analyzer=@Analyzer(definition="ngramAnalyzer"))
    })
    private String title;
}
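What an n-gram filter with a gram size of 3 produces can be sketched in plain Java. The class `NGramSketch` and its methods are hypothetical; the real filter works on a token stream inside Lucene.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: splits a token into overlapping character n-grams.
final class NGramSketch {
    static List<String> ngrams(String token, int size) {
        List<String> grams = new ArrayList<>();
        // Slide a window of the given size over the token.
        for (int i = 0; i + size <= token.length(); i++) {
            grams.add(token.substring(i, i + size));
        }
        return grams;
    }

    // Counts the distinct n-grams two tokens have in common.
    static int shared(String a, String b, int size) {
        Set<String> common = new HashSet<>(ngrams(a, size));
        common.retainAll(new HashSet<>(ngrams(b, size)));
        return common.size();
    }
}
```

The misspelled "hypernate" and the indexed "hibernate" share the trigrams ern, rna, nat and ate, so a query on the n-gram field can still find the title.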
Which Technique to Choose?
Use approximation analyzers on dedicated fields.
Search in layers - Expand the approximation level.
  » The search engine can execute the strict query first
  » If more data is required, a second query using approximation techniques can be used, and so on.
  » Once the search engine has retrieved enough information, it bypasses the next layers.
Remember that a Lucene query is quite cheap. Running several Lucene queries per user query is perfectly acceptable.
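The layered strategy can be sketched as a loop over query strategies ordered from strict to approximate. The class `LayeredSearchSketch` and the representation of a layer as a `Function` are assumptions for illustration; in a real application each layer would run a Lucene query against a dedicated field.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Function;

// Illustrative only: runs query layers from strict to approximate until enough hits are found.
final class LayeredSearchSketch {
    static List<String> search(String userQuery, int enough,
                               List<Function<String, List<String>>> layers) {
        // LinkedHashSet keeps stricter hits first and drops duplicates from later layers.
        LinkedHashSet<String> results = new LinkedHashSet<>();
        for (Function<String, List<String>> layer : layers) {
            results.addAll(layer.apply(userQuery));
            if (results.size() >= enough) {
                break; // enough information: bypass the remaining layers
            }
        }
        return new ArrayList<>(results);
    }
}
```

Putting the strict layer first keeps exact matches at the top of the result list, and the more expensive approximation layers only run when they are needed.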
Indexing Flow Diagram
Synchronous Flow
Asynchronous Flow
JMS Flow
Manual Index - Naïve Approach
Transaction tx = ftSession.beginTransaction();

// read the data from the database
Criteria criteria = ftSession.createCriteria(Item.class);
List<Item> items = criteria.list(); // OutOfMemoryError: loads every item at once

// index the data
for (Item item : items) {
    ftSession.index(item); // loads “distributor” for each item
}

tx.commit();
Manual Index – The Right Way
Transaction tx = ftSession.beginTransaction();

ftSession.setFlushMode(FlushMode.MANUAL); // disable flush
ftSession.setCacheMode(CacheMode.IGNORE); // disable 2nd level cache

ScrollableResults results = ftSession.createCriteria( Item.class )
    .setFetchMode("distributor", FetchMode.JOIN)
    .setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY)
    .setFetchSize(BATCH_SIZE)
    .scroll( ScrollMode.FORWARD_ONLY );

for (int i = 1; results.next(); i++) {
    ftSession.index( results.get(0) );
    if (i % BATCH_SIZE == 0) {
        ftSession.flushToIndexes(); // apply changes to the index
        ftSession.clear();          // clear the session, releasing memory
    }
}
tx.commit(); // apply the remaining index changes
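The batching pattern in the loop above can be isolated from Hibernate in a small plain-Java sketch. The `IndexWriterStub` interface and `CountingWriter` class are hypothetical stand-ins for the full-text session, introduced only to show how often flush and clear happen.

```java
import java.util.Iterator;

// Illustrative only: flush and clear every batchSize entities so memory use stays bounded.
final class BatchIndexingSketch {
    // Hypothetical stand-in for the operations used on the full-text session.
    interface IndexWriterStub {
        void index(Object entity);
        void flush();
        void clear();
    }

    // Simple implementation that only counts calls.
    static final class CountingWriter implements IndexWriterStub {
        int indexed;
        int flushes;
        int clears;
        public void index(Object entity) { indexed++; }
        public void flush() { flushes++; }
        public void clear() { clears++; }
    }

    static int indexAll(Iterator<?> entities, int batchSize, IndexWriterStub writer) {
        int count = 0;
        while (entities.hasNext()) {
            writer.index(entities.next());
            count++;
            if (count % batchSize == 0) {
                writer.flush(); // apply the batch to the index
                writer.clear(); // release the entities held in memory
            }
        }
        if (count % batchSize != 0) {
            writer.flush(); // apply the last, incomplete batch
        }
        return count;
    }
}
```

With 25 entities and a batch size of 10, at most 10 entities are held at once, compared with all 25 in the naïve approach.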
Index With Batch Approach
hibernate.search.indexing_strategy = manual
Mix Batch And Event Approach
Third Party Updates Your DB
What Influences Indexing Time
Number of properties indexed
Type of analyzer used
Properties stored
Properties embedded
On mass indexing
  » Index asynchronously
  » Index on a different machine
  » Use our previous manual sample as a template
  » session.getSearchFactory().optimize();
Query – Solving The Retrieval Mismatch
Full-Text Search Query
Running a Hibernate Search query:
  » Build a Lucene query to express the full-text search (either through the query parser or the programmatic API)
  » Build a Hibernate Search query wrapping the Lucene query
  » Execute the Hibernate Search query.
But why do we need this wrapper around Lucene?
  » Building the Lucene query is easy:
    • title:Always description:some desc actors.name:Tom Cruise
Executing Lucene Query Is Low Level API
Open the Lucene directory(ies)
Build one or several IndexReaders, and an IndexSearcher on top of them
Call the appropriate execution method from IndexSearcher.
Resource management for the Lucene API
Convert Documents into objects of your domain model.
  » “Rehydrate” values from the Lucene index
    • No lazy loading, no transparent access, no change propagation
  » Load entities using ORM
    • Loading one by one is inefficient
Hibernate Search Query
Returns managed Hibernate entities.
The query API is similar: use the same Query API as JPA or the Hibernate Query API.
The query semantics are also similar.
  » Lazy loading mechanism.
  » Transparent propagation to DB and index
Build Lucene Query With QueryParser
private org.apache.lucene.search.Query buildLuceneQuery(String words, Class<?> searchedEntity)
        throws ParseException { // QueryParser.parse() declares a checked ParseException
    Analyzer analyzer = getFTEntityManager().getSearchFactory().getAnalyzer(searchedEntity);

    QueryParser parser = new QueryParser( "title", analyzer );
    org.apache.lucene.search.Query luceneQuery = parser.parse(words);
    return luceneQuery;
}
Build Lucene Query With MultiFieldQueryParser
private org.apache.lucene.search.Query buildLuceneQuery(String words, Class<?> searchedEntity)
        throws ParseException { // QueryParser.parse() declares a checked ParseException
    Analyzer analyzer = getFTEntityManager().getSearchFactory().getAnalyzer(searchedEntity);

    String[] productFields = {"title", "description"};
    Map<String,Float> boostPerField = new HashMap<String,Float>();
    boostPerField.put( "title", 4f);
    boostPerField.put( "description", 1f);

    QueryParser parser = new MultiFieldQueryParser(productFields, analyzer, boostPerField);

    org.apache.lucene.search.Query luceneQuery = parser.parse(words);
    return luceneQuery;
}
Build & Execute The FullTextQuery
public List<Item> findByTitle(String words) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(words, Item.class);
    javax.persistence.Query query =
        getFTEntityManager().createFullTextQuery(luceneQuery, Item.class);
    return query.getResultList();
}

@PersistenceContext private EntityManager em;

private FullTextEntityManager getFTEntityManager() {
    return Search.getFullTextEntityManager(em);
}
...
Execute FullTextQuery
Pagination & Result Size
public Page<Item> search(String words, int pageNumber, int window) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(words, Item.class);
    FullTextQuery query =
        getFTEntityManager().createFullTextQuery(luceneQuery, Item.class);
    List<Item> results = query
        .setFirstResult( (pageNumber - 1) * window )
        .setMaxResults(window)
        .getResultList();

    int resultSize = query.getResultSize();
    Page<Item> page = new Page<Item>(resultSize, results);
    return page;
}
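The page arithmetic used above is easy to get wrong by one, so here it is on its own as a plain-Java sketch. It assumes page numbers start at 1, as the `(pageNumber - 1) * window` expression implies; the class `PaginationSketch` and the `pageCount` helper are hypothetical additions.

```java
// Illustrative only: offset and page-count arithmetic for 1-based page numbers.
final class PaginationSketch {
    // Index of the first hit shown on the given page.
    static int firstResult(int pageNumber, int window) {
        if (pageNumber < 1 || window < 1) {
            throw new IllegalArgumentException("pageNumber and window must be >= 1");
        }
        return (pageNumber - 1) * window;
    }

    // Number of pages needed for resultSize hits (ceiling division).
    static int pageCount(int resultSize, int window) {
        if (resultSize < 0 || window < 1) {
            throw new IllegalArgumentException("resultSize must be >= 0 and window >= 1");
        }
        return (resultSize + window - 1) / window;
    }
}
```

The total comes from the full-text query's result size rather than from the size of the returned list, which never exceeds one window.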
Override Fetch Strategy
For JPA use getDelegate()
Don't use Criteria restrictions
  » They will hurt pagination and produce a wrong resultSize
public List<Item> findByTitle(String words) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(words, Item.class);
    FullTextQuery query =
        getFTSession().createFullTextQuery(luceneQuery, Item.class);

    Criteria fetchingStrategy = getFTSession().createCriteria(Item.class)
        .setFetchMode("actors", FetchMode.JOIN);
    query.setCriteriaQuery(fetchingStrategy);

    return query.list();
}
Demo
Product Domain Model
Service & DAO Layers
Simple Search Sequence Diagram
Projection
public List<ItemView> search(String words) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(words, Item.class);
    FullTextQuery query =
        getFTSession().createFullTextQuery(luceneQuery, Item.class);

    query.setProjection("ean", "title"); // No hit on the DB

    List<ItemView> results = query.setResultTransformer(
        new AliasToBeanResultTransformer(ItemView.class)).list();
    return results;
}

public class ItemView { // A view object, NOT necessarily an entity...
    private String ean;
    private String title;
    public String getEan() { return ean; }
    public String getTitle() { return title; }
}
Store Properties In The Index For Projection
@Entity
@Indexed
public class Item {
    @Id @GeneratedValue
    private Integer id;

    @Field(store=Store.YES)
    private String title;

    @Field
    private String description;

    @Field(index=Index.UN_TOKENIZED, store=Store.YES)
    private String ean;
    ...
}
Sorting By Field
public List<Item> findByTitle(String words) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(words, Item.class);
    FullTextQuery query =
        getFTSession().createFullTextQuery(luceneQuery, Item.class);

    Sort sort = new Sort(new SortField("title_sort", SortField.STRING));
    query.setSort(sort);

    return query.list();
}

@Entity
@Indexed
public class Item {
    ...
    @Fields({
        @Field(index=Index.TOKENIZED),
        @Field(name="title_sort", index=Index.UN_TOKENIZED)
    })
    private String title;
}
Dynamic Data Filtering
Restrict results of a query after the Lucene query has been executed
  » Rules that are not directly related to the query.
  » Cross-cutting restrictions
    • category, availability, security.
The ordering defined by the original query is respected.
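The claim that filtering keeps the original ordering can be illustrated with a plain-Java sketch. It models a filter as a predicate applied to an already ranked list; the class `FilterSketch` and the sample identifiers are assumptions, and a real Lucene filter works on document sets rather than on a Java list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Illustrative only: a filter removes hits but never reorders the ones it keeps.
final class FilterSketch {
    static <T> List<T> applyFilter(List<T> rankedHits, Predicate<T> rule) {
        List<T> kept = new ArrayList<>();
        // Walk the hits in ranking order so the relative order is preserved.
        for (T hit : rankedHits) {
            if (rule.test(hit)) {
                kept.add(hit);
            }
        }
        return kept;
    }
}
```

A cross-cutting rule such as "only items this user may see" can therefore be switched on or off without touching the query that defines relevance.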
Filters
Filter Example
// service implementation
public List<Item> searchItems(String search, boolean isChild) {
    org.apache.lucene.search.Query luceneQuery = buildLuceneQuery(search);
    FullTextQuery query =
        getFTSession().createFullTextQuery(luceneQuery, Item.class);

    if (isChild) query.enableFullTextFilter("childFilter");
    List<Item> results = query.list();
    return results;
}
Filter Example Cont.
public class ChildFilterFactory {
    @Factory
    public Filter getChildrenFilter() {
        Query query = new TermQuery( new Term("childrenOnly", "yes") );
        return new QueryWrapperFilter( query );
    }
}

@Entity
@Indexed
@FullTextFilterDef(name="childFilter", impl=ChildFilterFactory.class)
@ClassBridge(name="childrenOnly", impl=ChildrenFlagBridge.class, index=Index.UN_TOKENIZED)
public class Item {...}
Optimizing Search
Limit targeted classes (one class is the best)
  » ftSession.createFullTextQuery(luceneQuery, Item.class);
Use pagination
Avoid the n+1 problem by using setCriteriaQuery()
Use projection carefully
Scale Hibernate Search
Synchronous Clustering
Who can use it?
  » Applications with medium-size indexes
    • Network traffic will be needed to retrieve the index.
  » Applications with low to moderate write intensity.
Synchronous Clustering Problems
Some NFS implementations cache the directory contents
  » No immediate visibility of the directory content
    • Lucene relies (partially) on an accurate listing of files.
  » “Delete on last close” semantics are NOT always implemented.
Database Directory issues
  » Segments are represented as blobs
  » A pessimistic lock hurts concurrency on massive updates.
In-memory distributed Directory
  » GigaSpace, JBoss Cache and Terracotta
Asynchronous Clustering
[Diagram callout: change event not propagated to the index]
Slave Configuration
<persistence-unit name="dvdstore-catalog">
    <jta-data-source>java:/DefaultDS</jta-data-source>
    <properties>
        <!-- regular Hibernate Core configuration -->
        <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect"/>

        <!-- JMS backend -->
        <property name="hibernate.search.worker.backend" value="jms"/>
        <property name="hibernate.search.worker.jms.connection_factory" value="/ConnectionFactory"/>
        <property name="hibernate.search.worker.jndi.url" value="jnp://master:1099"/>
        <property name="hibernate.search.worker.jms.queue" value="queue/hibernatesearch"/>
        ...

        <!-- DirectoryProvider configuration -->
        <property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSSlaveDirectoryProvider"/>
        <property name="hibernate.search.default.refresh" value="1800"/>
        <property name="hibernate.search.default.indexBase" value="/Users/prod/lucenedirs"/>
        <property name="hibernate.search.default.sourceBase" value="/mnt/share"/>
    </properties>
</persistence-unit>
Master Configuration
<persistence-unit name="dvdstore-catalog">
    <jta-data-source>java:/DefaultDS</jta-data-source>
    <properties>
        <!-- regular Hibernate Core configuration -->
        <property name="hibernate.dialect" value="org.hibernate.dialect.H2Dialect"/>

        <!-- Hibernate Search configuration -->
        <!-- no backend configuration necessary -->
        <property name="hibernate.search.default.directory_provider" value="org.hibernate.search.store.FSMasterDirectoryProvider"/>
        <property name="hibernate.search.default.refresh" value="1800"/>
        <property name="hibernate.search.default.indexBase" value="/Users/prod/lucenedirs"/>
        <property name="hibernate.search.default.sourceBase" value="/mnt/share"/>
    </properties>
</persistence-unit>
Building The Master MDB
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName="destinationType",
        propertyValue="javax.jms.Queue"),
    @ActivationConfigProperty(propertyName="destination",
        propertyValue="queue/hibernatesearch")
})
public class MDBSearchController extends AbstractJMSHibernateSearchController
        implements MessageListener {

    @PersistenceContext private EntityManager em;

    @Override
    protected void cleanSessionIfNeeded(Session session) {
        // clean the session if needed; nothing to do here, it is container managed
    }

    @Override
    protected Session getSession() {
        return (Session) em.getDelegate();
    }
}
What Happens On Master Failure?
Slave
  » Continues to serve full-text queries
  » Continues to push changes that need indexing.
Master
  » Messages on the master are rolled back to the queue.
  » Optional - Prepare a standby for the master
On a corrupted index...
  » Re-index manually from the DB
  » Optional – Use a Storage Area Network (SAN)
Summary
Full-Text Search Without The Hassle
Solves the 3 mismatch problems
  » Automatic structural conversion through mapping
  » Transparent index synchronization
  » Data retrieved from the index becomes “persistent” entities.
Easier, transparently optimized Lucene use
Scalability capabilities out of the box
Q & A
Appendixes
Use SAN For Lucene Directory
JBoss Cache Searchable
Integration package between JBoss Cache and Hibernate Search.
Provides full text search capabilities to the cache.
[Diagram: User, Searchable-cache, Core Cache, Apache Lucene]
1 - Create Query
2 - Documents retrieved via Hibernate Search
3 - cache.get() called
4 - Objects returned to user
Annotated Pojo
@Indexed
@ProvidedId
public class Person { // Not necessarily a Hibernate entity
    @Field private String name;
    @Field private Date dateOfBirth;
    // Not indexed
    private String massiveString;
    // Standard getters, setters etc. follow.
}
FullText Search on Cache
public void putStuffIn(Person p) {
    searchableCache.put(Fqn.fromString("/a/b/c"), p.getName(), p);
}

public List findStuff(String searchStr) {
    Query luceneQuery = buildLuceneQuery(searchStr);
    CacheQuery cacheQuery = searchableCache.createQuery(luceneQuery, Person.class);
    return cacheQuery.list();
}

Cache<String, Person> cache = new DefaultCacheFactory<String, Person>().createCache();
SearchableCache searchableCache =
    new SearchableCacheFactory().createSearchableCache(cache, Person.class);
...