epl 660 lab 3 - ucy · 2012-02-10 · apache solr (2) •solr -> java, runs as a standalone...

EPL 660 Lab 3 Apache Solr Tutorial Paris Iona Email: parisiona+epl660 [at] gmail dot com


Page 1: EPL 660 Lab 3 - UCY · 2012-02-10 · Apache Solr (2) •Solr -> Java, runs as a standalone full-text search server within a servlet(e.g. Tomcat) •Solr uses the Lucene Java search

EPL 660 Lab 3

Apache Solr Tutorial

Paris Iona

Email: parisiona+epl660 [at] gmail dot com


This Lab

• Apache Solr

• Getting Started

• Example of Solr


Apache Solr

• What is Apache Solr?

– “Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project.” (from: http://lucene.apache.org/solr/index.html)


Apache Solr (2)

• Solr is written in Java and runs as a standalone full-text search server within a servlet container (e.g. Tomcat).

• Solr uses the Lucene Java search library at its core for full-text indexing and search.

• It has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language.
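To illustrate the "any programming language" point, here is a minimal Python sketch that reads a JSON search response. It assumes the usual Solr JSON layout, with a top-level `response` object holding `numFound` and `docs`. The function name `summarize_response` and the sample body are hypothetical and made up for illustration; they are not output from a real server.

```python
import json


def summarize_response(body):
    """Return (total hit count, list of document ids) from a Solr JSON response."""
    data = json.loads(body)
    result = data["response"]
    # Not every document is guaranteed to carry an "id" field in the
    # returned field list, so skip the ones that do not.
    ids = [doc["id"] for doc in result["docs"] if "id" in doc]
    return result["numFound"], ids


# Hypothetical response body, for illustration only.
SAMPLE_BODY = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 2, "start": 0,'
    ' "docs": [{"id": "doc-1"}, {"id": "doc-2"}]}}'
)

if __name__ == "__main__":
    total, ids = summarize_response(SAMPLE_BODY)
    print(total, ids)
```

Any language with an HTTP client and a JSON parser could do the same, which is the practical meaning of the REST-like API.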


Getting Started

• Download the latest Solr release (3.5.0 at the time of writing) from: http://www.bizdirusa.com/mirrors/apache//lucene/solr/

• Unzip the file and navigate to the example directory

• Open a terminal in this folder


Getting Started (2)

• For this example we use a small installation of Jetty (any Java servlet container would also work)

• To start Jetty with the Solr WAR, run in the terminal:

    java -jar start.jar

• Well done. Solr is now running at http://localhost:8983/solr/admin/


Indexing data

• Open a new terminal and navigate to the exampledocs directory (located inside the example directory of Solr)

• We can index data with post.jar. In the terminal, run:

    java -jar post.jar solr.xml monitor.xml

• Now go back to the admin page and search for “solr”
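The files posted above are XML update messages of the form `<add><doc><field name="...">...</field></doc></add>`. As a rough sketch of what such a message looks like when built by hand, the Python below assembles one and, optionally, posts it. The helper names `build_add_message` and `post_xml` are hypothetical, and the update URL and content type are assumptions based on the default example setup; adjust them to your installation.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update endpoint of the example server started earlier.
UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(docs):
    """Build an <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


def post_xml(message, url=UPDATE_URL):
    """POST an XML update message; needs a running Solr server."""
    request = urllib.request.Request(
        url,
        data=message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical document, for illustration only.
    print(build_add_message([{"id": "demo-1", "name": "A & B"}]))
```

As with post.jar, added documents only become searchable after a commit is sent.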


Indexing data (2)

• Now import all the provided data with:

    java -jar post.jar *.xml

• We can search using the Solr query syntax, like:

– video

– name:video

– …

– …


Indexing data (3)

• We can also index data in other ways:

– Import database records with the Data Import Handler (DIH)

– Load a CSV file

– POST JSON documents

– Index binary documents such as Word and PDF with Solr Cell

– Use SolrJ for Java or other Solr clients to programmatically create documents to send to Solr
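For the CSV and JSON options, the payloads are simple to produce. The sketch below only builds the request bodies; it does not contact a server. It assumes the JSON update format `{"add": {"doc": {...}}}` and a CSV body whose first line lists the field names. The helper names and the sample document are hypothetical, and the handler paths (commonly `/update/json` and `/update/csv` in the example configuration) should be checked against your own solrconfig.xml.

```python
import csv
import io
import json


def build_json_add(fields):
    """Serialize one document as a JSON add command."""
    return json.dumps({"add": {"doc": fields}})


def build_csv(docs, fieldnames):
    """Serialize documents as CSV with a header row of field names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(docs)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical document, for illustration only.
    doc = {"id": "demo-1", "name": "x, y"}
    print(build_json_add(doc))
    print(build_csv([doc], ["id", "name"]))
```

Values containing commas are quoted automatically by the `csv` module, which is one reason to use it rather than joining strings by hand.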


Updating data

• For statistics about the data go to

– http://localhost:8983/solr/admin/stats.jsp

• Edit any of the provided XML files

• Rerun the post.jar command on the edited files to update the data:

    java -jar post.jar


Deleting data

• Delete the document with id SP2514N, again using the post.jar command:

    java -Ddata=args -Dcommit=no -jar post.jar "<delete><id>SP2514N</id></delete>"

• Because of -Dcommit=no, the deletion does not take effect yet. Commit the changes with:

    java -jar post.jar

• Delete the entries with name DDR:

    java -Ddata=args -jar post.jar "<delete><query>name:DDR</query></delete>"

• Now reload all the data with:

    java -jar post.jar *.xml
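The delete commands above pass small XML messages to Solr. When the id or query comes from user input, building the message with an XML library avoids escaping mistakes. The following is a minimal sketch; the helper names `delete_by_id` and `delete_by_query` are hypothetical, and the messages would still have to be posted to the update handler and followed by a commit.

```python
import xml.etree.ElementTree as ET

# Message that makes pending adds and deletes visible to searches.
COMMIT_MESSAGE = "<commit/>"


def delete_by_id(doc_id):
    """Build a delete message for a single document id."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = doc_id
    return ET.tostring(delete, encoding="unicode")


def delete_by_query(query):
    """Build a delete message for every document matching a query."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


if __name__ == "__main__":
    print(delete_by_id("SP2514N"))
    print(delete_by_query("name:DDR"))
    print(COMMIT_MESSAGE)
```

The two printed delete messages match the strings passed to post.jar in the commands above.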


Searching data

• Web (append one of the following to http://localhost:8983/solr/select/?indent=on&)

– q=video&fl=name,id (return only the name and id fields)

– q=video&fl=name,id,score (return the relevancy score as well)

– q=video&fl=*,score (return all stored fields, as well as the relevancy score)

– q=video&sort=price desc&fl=name,id,price (add a sort specification: sort by price descending)

– q=video&wt=json (return the response in JSON format)

• Web UI

– http://localhost:8983/solr/browse
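When these queries are issued from a program rather than typed into a browser, the parameters need URL encoding (for example, the space in `sort=price desc`). A minimal Python sketch of building such URLs follows; the function name `select_url` and its keyword arguments are hypothetical, while the base URL and parameter names are the ones listed above.

```python
from urllib.parse import urlencode

# Select endpoint from the slide above.
BASE_URL = "http://localhost:8983/solr/select/"


def select_url(q, fl=None, sort=None, wt=None, indent=True):
    """Build a search URL from a query and optional fl, sort and wt parameters."""
    params = []
    if indent:
        params.append(("indent", "on"))
    params.append(("q", q))
    if fl:
        params.append(("fl", ",".join(fl)))  # comma-separated field list
    if sort:
        params.append(("sort", sort))
    if wt:
        params.append(("wt", wt))
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(select_url("video", fl=["name", "id"]))
    print(select_url("video", fl=["name", "id", "price"], sort="price desc"))
    print(select_url("video", wt="json"))
```

The resulting URL can be fetched with any HTTP client, such as `urllib.request.urlopen`, once the server is running.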


Final Words on Solr

• Example of a Solr app

• How to import data

• How to update and delete data

• How to search data

• Learn much more at

– http://lucene.apache.org/solr/tutorial.html (used for this presentation)

– http://www.solrtutorial.com/

– http://www.ibm.com/developerworks/java/library/j-solr1/