
Upload: will-flores-soto

Post on 24-Nov-2015


Pentaho Business Intelligence Suite 3.7
A guide to getting started with Microsoft SQL Server 2005+ and Windows

Table of Contents

Introduction
  License
  About
  The Community
  Thanks
Getting Started
  Installing and Configuring Java
  Deploying the Platform
    Packaged Apache-Tomcat Server
    Existing Apache-Tomcat Server
      webapps
  Microsoft SQL Server JDBC Driver
  SQL Script Pack
Configuring the Databases
  Extract the Microsoft SQL Server Script Pack
  Modify SQL Script Pack
  Load the SQL scripts
Configuring JDBC Security
  applicationContext-spring-security-jdbc.xml
  applicationContext-spring-security-hibernate.properties
  hibernate-settings.xml
  mssql.hibernate.cfg.xml
Configuring Hibernate and Quartz
  context.xml
Configuring Apache-Tomcat Server
  solution-path
  fully-qualified-server-url
  TrustedIpAddrs
  Other Parameters
Configuring SMTP (mail server)
Configuring Publishing
Configuring the Administration Console
Starting the Business Intelligence Platform
Starting the Administration Console

Introduction

License

This work is licensed under a Creative Commons Attribution 3.0 Australia License.

Donate

This tutorial is accessed by thousands of readers every month, and most of the feedback says it has been extremely helpful. Contrary to popular belief, I do not work for Pentaho and all of this work is voluntary, so even $1 helps me produce bigger and better tutorials. You can donate to my PayPal account by clicking here.

About

This guide assumes that readers have intermediate to advanced knowledge of their setup of choice. Basic knowledge of Pentaho is helpful but not required. The following operating systems and databases are supported:

Windows *
  MySQL 5.x
  PostgreSQL 8.x.x
  Oracle 10g & 11g
  Microsoft SQL Server 2005+ *

Linux
  MySQL 5.x
  PostgreSQL 8.x.x
  Oracle 10g & 11g

* This tutorial is for Windows and SQL Server 2005+ setup.

The Community

Don't forget about the other hardworking projects which are part of the Pentaho community and also deserve a donation:

PAT (Pentaho Analysis Tool)
An alternative to Pentaho's current OLAP analyser tool, JPivot.

CDF (Community Dashboard Framework)
A framework for building dashboards within Pentaho's Business Intelligence Server User Console.

CBF (Community Build Framework)
An Ant build.xml script that provides an alternative way to set up and deploy Pentaho-based applications.

CDA (Community Data Access)
A data access layer for CDF (Community Dashboard Framework).

Thanks

Thanks to the following blogs, individuals, companies and groups:

Roland Bouman (co-author of Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL)
Provided the necessary configuration and scripts for this guide.

##pentaho & ##pentaho.pat
IRC channels found on Freenode (Pentaho and Pentaho PAT).

Pentaho Wiki & Pentaho Forums
The first place any new user to Pentaho should look.

Open Source Business Intelligence
Provided a working copy of the sample database for MySQL.

Bizcubed
Provided a working copy of the sample database for PostgreSQL - they are also Australian!

Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL
A book by Roland Bouman and Jos van Dongen.

Getting Started

Installing and Configuring Java

The Pentaho BI Platform requires a JVM (Java Virtual Machine) to be installed on your PC or server. To check whether Java is already installed, issue the following command at the command prompt:

C:\>java -version
java version "1.6.0_13"
Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
Java HotSpot(TM) Client VM (build 11.3-b02, mixed mode, sharing)

If output similar to the above is displayed, Java is already installed. If not, download the Java installation file for Windows from the Sun Developer Network downloads page and install it.

The next step is to check whether the JAVA_HOME environment variable is set up correctly. Issue the following command at the command prompt:

C:\>echo %JAVA_HOME%
C:\Program Files\Java\jdk1.6.0_13

If output similar to the above is displayed, the JAVA_HOME environment variable is already set up. To set it up, right-click My Computer, click the Properties option, open the Advanced tab and click the Environment Variables button.

Depending on your setup (User variables or System variables), click the New button to create a new environment variable (in this guide I add them for the user). For the variable name enter JAVA_HOME, and for the variable value enter the location of your Java installation. In this example it is c:\Program Files\Java\jdk1.6.0_13.

The CATALINA_OPTS environment variable should also be set so that the Apache-Tomcat server uses more than the default amount of memory. Follow the same steps as above, but this time set the variable name to CATALINA_OPTS and the variable value to:

-Xms256m -Xmx768m -XX:MaxPermSize=256m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000

From now on every time the PC or server is started/restarted the JAVA_HOME and CATALINA_OPTS environment variables will be set automatically.
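The two environment variables above can be sanity-checked with a short script before moving on. The following is only a minimal sketch in Python (not part of the original guide); the function name check_environment and the choice to check just the three memory flags are illustrative assumptions.

```python
import os

# Memory flags recommended in this guide; adjust them for your own installation.
REQUIRED_OPTS = ("-Xms256m", "-Xmx768m", "-XX:MaxPermSize=256m")


def check_environment(env):
    """Return a list of human-readable problems found in an environment mapping."""
    problems = []
    # JAVA_HOME must be present and non-empty.
    if not env.get("JAVA_HOME", ""):
        problems.append("JAVA_HOME is not set")
    # CATALINA_OPTS is a space-separated list of JVM flags.
    catalina_opts = env.get("CATALINA_OPTS", "").split()
    for opt in REQUIRED_OPTS:
        if opt not in catalina_opts:
            problems.append("CATALINA_OPTS is missing " + opt)
    return problems
```

Calling check_environment(os.environ) in a new command prompt returns an empty list when both variables are set as described above.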

Deploying the Platform

You can deploy the platform in many different ways. This guide explains how to deploy it with the packaged Apache-Tomcat server (which comes with the Pentaho BI Server installation file) or with an existing Apache-Tomcat server.

Packaged Apache-Tomcat Server

First download the biserver-ce-3.7.x.stable.zip file from the Pentaho Sourceforge projects page. This file contains all the files and packages needed to set up the platform. After downloading, extract its contents into the folder where you would like to store the Pentaho BI Server. In this example I have chosen c:\pentaho\.

Use 7-Zip to extract the file contents to C:\pentaho\ folder.

The following folders should be visible after you have extracted the ZIP file:

C:\
|-- pentaho
|   |-- administration-console
|   |-- biserver-ce

Existing Apache-Tomcat Server

If you would like to deploy the Pentaho BI Platform on an existing Apache-Tomcat server, first download the biserver-ce-3.7.x.stable.zip file from the Pentaho Sourceforge projects page. After downloading, extract its contents into the folder where you would like to store the Pentaho BI Server. In this example I have chosen c:\pentaho\.

Use 7-Zip to extract the file contents to C:\pentaho\ folder.

The following folders should be visible after you have extracted the ZIP file:

C:\
|-- pentaho
|   |-- administration-console
|   |-- biserver-ce
|   |   |-- pentaho-solutions
|   |   |-- tomcat
|   |   |   |-- common
|   |   |   |   `-- lib
|   |   |   |       `-- jtds-1.2.5.jar (optional)
|   |   |   |           or
|   |   |   |       `-- sqljdbc4.jar (optional)
|   |   |   |-- webapps
|   |   |   |   `-- pentaho
|   |   |   |   `-- pentaho-styles
|   |   |   |   `-- sw-styles

The folders in bold (seen above) will need to be moved to your existing Apache-Tomcat installation.

webapps

You will need to copy all the folders under the C:\pentaho\biserver-ce\tomcat\webapps\ folder to the webapps\ folder under your existing Apache-Tomcat installation (the sw-styles webapp is optional).

The last step is to move the pentaho-solutions folder into the C:\pentaho\ folder, or any other location where you would like to store all your Pentaho solutions and configuration files.

One more step is needed to make sure Pentaho knows the new location of the pentaho-solutions folder. This is covered in the "Configuring Apache-Tomcat Server" section.

The new structure of the pentaho-solutions and existing Apache-Tomcat folders looks like this:

C:\
|-- pentaho
|   `-- pentaho-solutions
|-- tomcat
|   `-- webapps

You can now safely remove any other files that came with the original Pentaho BI Platform (only under the biserver-ce\ folder).

Microsoft SQL Server JDBC Driver

You can use one of two JDBC drivers: the proprietary driver from Microsoft, or jTDS, the open source alternative. Throughout this guide I refer to the open source JDBC driver, but if you would like to use the Microsoft one you must download it yourself from the following address:

http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=a737000d-68d0-4531-b65d-da0f2a735707

You will need to download the jTDS JDBC client JAR file for Microsoft SQL Server. Several versions are available from the jTDS JDBC Driver website depending on your setup. This guide uses jtds-1.2.5.jar; if you have problems with 1.2.5, I suggest using the jtds-1.2.4.jar file.

Once you have downloaded the JDBC driver, copy the file to the \tomcat\common\lib and \administration-console\jdbc folders.

SQL Script Pack

A SQL Script Pack is a set of SQL scripts which will configure all the necessary databases. To download the SQL Script Pack for Microsoft SQL Server 2005+ click here.

Configuring the Databases

Extract the Microsoft SQL Server Script Pack

After downloading the SQL Script Pack for Microsoft SQL Server, extract the files into a temporary location. The following three SQL scripts and one BAT file should be visible after the pack has been extracted:

setup_pentaho_data.bat
Executes all .sql files

prepare.sql
Creates databases and usernames

create_repository.sql
Creates the Hibernate and Quartz databases

sampledata_mssql.sql
Creates the sampledata

It is recommended that you load the above scripts using the setup_pentaho_data.bat file, which uses sqlcmd. Otherwise you can load the SQL scripts separately using SQL Server Management Studio.

Modify SQL Script Pack

By default the scripts in this guide create specific databases, schemas and usernames for Pentaho to use. If you would like to configure your own, you can manually edit the .sql files.

Load the SQL scripts

It is recommended to run the setup_pentaho_data.bat file. If you do, you should receive output similar to the following in your command prompt:

C:\tmp>setup_pentaho_data.bat

C:\tmp>sqlcmd -S localhost -U sa -P password -i prepare.sql
Changed database context to 'pentaho'.
Msg 15025, Level 16, State 1, Server D119959\SQLEXPRESS, Line 3
The server principal 'pentaho' already exists.
Changed database context to 'pentaho_sample_data'.
Msg 15025, Level 16, State 1, Server D119959\SQLEXPRESS, Line 3
The server principal 'pentaho_sample_data' already exists.

C:\tmp>sqlcmd -S localhost -U pentaho -P 5tah0 -i create_repository.sql
Changed database context to 'pentaho'.

(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)

C:\tmp>sqlcmd -S localhost -U pentaho_sample_data -P 5tah0_sample_data -i sampledata_mssql.sql
Changed database context to 'pentaho_sample_data'.

(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)
(1 rows affected)

After the BAT file has been executed successfully, two new databases (pentaho and pentaho_sample_data) should have been created:

Along with the following tables:

* Hibernate will create new tables into the pentaho database after Pentaho BI Platform has started for the first time.
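If you prefer to drive the three sqlcmd calls from a script rather than from the BAT file, the sequence shown in the output above can be sketched in Python. This is only an illustration, not the contents of setup_pentaho_data.bat: the function names are hypothetical, and the logins and passwords are the ones from the sample output, which you would replace with your own.

```python
import subprocess

# Login, password and script for each step, as shown in the sample output above.
STEPS = [
    ("sa", "password", "prepare.sql"),
    ("pentaho", "5tah0", "create_repository.sql"),
    ("pentaho_sample_data", "5tah0_sample_data", "sampledata_mssql.sql"),
]


def build_commands(server="localhost", steps=STEPS):
    """Build one sqlcmd argument list per script, in the order shown in the output."""
    return [
        ["sqlcmd", "-S", server, "-U", user, "-P", password, "-i", script]
        for user, password, script in steps
    ]


def run_all(server="localhost", cwd="."):
    """Run each script in turn from the folder that holds the extracted pack."""
    for command in build_commands(server):
        # check=True raises CalledProcessError if sqlcmd exits with a non-zero status.
        subprocess.run(command, cwd=cwd, check=True)
```

For example, run_all(cwd=r"C:\tmp") would execute the scripts from the temporary folder used in this guide, assuming sqlcmd is on the PATH.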

Configuring JDBC Security

This section describes how to configure the Pentaho BI Platform JDBC security to use a Microsoft SQL server. After these changes the Pentaho BI Platform will point to the hibernate database on the Microsoft SQL server instead of the packaged HSQL database.

NOTE

This configuration does not use Pentaho's default username and password - you can make changes to the username and password within the SQL Script Pack.

applicationContext-spring-security-jdbc.xml

This file is located under the pentaho-solutions\system\ folder.

Once the file has opened locate this snippet of code:

Make changes to the highlighted sections so that the section of code looks similar to this:

NOTE

This configuration does not use Pentaho's default username and password - you can make changes to the username and password within the SQL Script Pack.

applicationContext-spring-security-hibernate.properties

This file is located under the pentaho-solutions\system\ folder.

Once the file has opened locate this snippet of code:

jdbc.driver=org.hsqldb.jdbcDriver
jdbc.url=jdbc:hsqldb:hsql://localhost:9001/hibernate
jdbc.username=hibuser
jdbc.password=password
hibernate.dialect=org.hibernate.dialect.HSQLDialect

Make changes to the highlighted sections so that the section of code looks similar to this:

jdbc.driver=net.sourceforge.jtds.jdbc.Driver
jdbc.url=jdbc:jtds:sqlserver://localhost:1433/pentaho
jdbc.username=pentaho
jdbc.password=5tah0
hibernate.dialect=org.hibernate.dialect.SQLServerDialect

hibernate-settings.xml

This file is located under the pentaho-solutions\system\hibernate\ folder.

Once the file has opened locate this snippet of code:

system/hibernate/hsql.hibernate.cfg.xml

Make changes to the highlighted section so that the section of code looks similar to this:

system/hibernate/mssql.hibernate.cfg.xml

mssql.hibernate.cfg.xml

This file is not created by default, so you will need to create it yourself. Open Notepad or any other text editor and paste in the following snippet of code:

org.hibernate.cache.EhCacheProvider true true net.sourceforge.jtds.jdbc.Driver jdbc:jtds:sqlserver://localhost:1433/pentaho org.hibernate.dialect.SQLServerDialect pentaho 5tah0 10 false true update

NOTE

This configuration does not use Pentaho's default username and password - you can make changes to the username and password within the SQL Script Pack.

After pasting in the above snippet of code save the file as mssql.hibernate.cfg.xml under the pentaho-solutions\system\hibernate\ folder.
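The changes to applicationContext-spring-security-hibernate.properties described earlier in this section are plain key=value replacements, so they can also be applied with a small script. The sketch below is an assumption-laden illustration in Python: the function name rewrite_properties is hypothetical, and the target values are simply the jTDS settings used in this guide.

```python
# Target values copied from this guide (jTDS driver, database "pentaho").
MSSQL_SETTINGS = {
    "jdbc.driver": "net.sourceforge.jtds.jdbc.Driver",
    "jdbc.url": "jdbc:jtds:sqlserver://localhost:1433/pentaho",
    "jdbc.username": "pentaho",
    "jdbc.password": "5tah0",
    "hibernate.dialect": "org.hibernate.dialect.SQLServerDialect",
}


def rewrite_properties(text, settings=MSSQL_SETTINGS):
    """Replace the value of every key found in settings; leave other lines untouched."""
    out = []
    for line in text.splitlines():
        # Split only on the first "=" so values containing "=" stay intact.
        key, sep, _ = line.partition("=")
        if sep and key.strip() in settings:
            out.append(key.strip() + "=" + settings[key.strip()])
        else:
            out.append(line)
    return "\n".join(out)
```

Read the file, pass its text through rewrite_properties and write the result back. Keeping a backup copy of the original file first is a sensible precaution.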

Configuring Hibernate and Quartz

Hibernate and Quartz need to use the hibernate and quartz databases which were created on the Microsoft SQL server. To do so, modifications need to be made to the context.xml file, which is located in the \tomcat\webapps\pentaho\META-INF\ folder.

context.xml

Once the file has opened the following piece of code should be visible:

Make changes to the highlighted sections so that the section of code looks similar to this:

NOTE

This configuration does not use Pentaho's default username and password - you can make changes to the username and password within the SQL Script Pack.

Configuring Apache-Tomcat Server

Most of the Apache-Tomcat settings for your Pentaho BI Platform are configured in the web.xml file, which is located under the \tomcat\webapps\pentaho\WEB-INF\ folder. You are able to configure the following items (and more) for the Pentaho BI Platform:

pentaho-solutions location
URL
Disable HSQL database startup
TrustedIpAddrs (optional - for the administration console and if you are accessing the server remotely)

If you are happy with the following settings for your Pentaho BI Platform server you will not need to make any changes to this file:

pentaho-solutions\ folder located under the biserver-ce\ folder
Visit http://localhost:8080/pentaho URL to launch the Pentaho BI Platform

solution-path

The solution-path parameter lets the Pentaho BI Platform know where to locate the pentaho-solutions folder. By default this is set to the biserver-ce\ folder.

If you have decided to use an existing Apache-Tomcat server (or have moved your pentaho-solutions folder) you will need to point this to where you have placed your pentaho-solutions folder. In this example my pentaho-solutions folder is under the C:\pentaho\ folder, now my solution-path code snippet looks like this:

<param-name>solution-path</param-name>
<param-value>C:\pentaho\pentaho-solutions</param-value>

fully-qualified-server-url

If you are happy with visiting the URL http://localhost:8080/pentaho to access Pentaho's BI Platform you will not need to change this parameter, however if you would like others to access the site (remotely or on a network) you will need to make changes to this parameter.

Open up the file and locate this line of code:

http://localhost:8080/pentaho/

Make changes to the highlighted section to your PC or server's domain or IP address so it looks similar to this:

http://www.prashantraju.com:8080/pentaho/ or http://192.168.1.10:8080/pentaho/

Disable HSQL Database Startup

By default in 3.7 the HSQL database starts up automatically. To prevent this from happening, locate the following snippets of code:

<param-name>hsqldb-databases</param-name>
<param-value>sampledata@../../data/hsqldb/sampledata,hibernate@../../data/hsqldb/hibernate,quartz@../../data/hsqldb/quartz</param-value>

<listener-class>org.pentaho.platform.web.http.context.HsqldbStartupListener</listener-class>

You can either remove the above snippets or comment them out. If you comment them out, they will look similar to this:

TrustedIpAddrs

If you want to access your Apache-Tomcat server remotely (that is, in the above step you did not specify localhost or 127.0.0.1 for the fully-qualified-server-url parameter), you will need to add your Apache-Tomcat server's IP address to this list.

Open up the file and locate this line of code:

<param-name>TrustedIpAddrs</param-name>
<param-value>127.0.0.1</param-value>

Make changes to the highlighted section to add your PC or server's domain or IP address so it looks similar to this:

<param-name>TrustedIpAddrs</param-name>
<param-value>127.0.0.1,[your_ip_address]</param-value>

This will allow the Pentaho Administration Console to 'ping' the server to see if it is up or down. You do not need to do this if you are hosting your server locally.

Other Parameters

You can also change the locale language and country in the web.xml file. The changes to these parameters are self-explanatory.

Configuring SMTP (mail server)

To configure the Pentaho BI Platform to use an SMTP server (mail server) for emailing reports and so on, you will need to modify the \pentaho-solutions\system\smtp-email\email_config.xml file.

Here are the available parameters that can be configured for SMTP support:

mail.smtp.host
The address of your SMTP email server for sending email, e.g. smtp.gmail.com

mail.smtp.port
The port of your SMTP email server, e.g. for GMail this is 587

mail.transport.protocol
The transport for accessing the email server. Usually this is smtp; for GMail this is smtps

mail.smtp.starttls.enable
If your SMTP server uses TLS (STARTTLS) set this to true, e.g. for GMail this is true

mail.smtp.auth
Set to true if the email server requires the sender to authenticate

mail.smtp.ssl
Set to true if the email server requires an SSL connection, e.g. for GMail this is true

mail.debug
Output debug information from the JavaMail API

mail.pop3
Not being used.

mail.from.default
The from address used for emails sent by the Pentaho BI Platform, e.g. [email protected]

mail.userid
The userid that is used when authenticating with the SMTP server; mail.smtp.auth must be set to true.

mail.password
The password that is used when authenticating with the SMTP server; mail.smtp.auth must be set to true.

Here is an example of an email_config.xml file configured for GMail:

smtp.gmail.com 587 smtps true true true false [email protected] [email protected] password
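The parameter list above notes that mail.userid and mail.password only apply when mail.smtp.auth is true. A quick consistency check of the values you plan to enter can be sketched in Python as follows. This is a hypothetical helper, not part of Pentaho: the function name, the dictionary representation and the decision to treat the host and port as mandatory are all assumptions made for illustration.

```python
def check_smtp_settings(settings):
    """Return a list of problems for a dict keyed by the parameter names listed above."""
    problems = []
    # Assumption: a host and a port are always needed to reach the mail server.
    for key in ("mail.smtp.host", "mail.smtp.port"):
        if not settings.get(key):
            problems.append(key + " is empty")
    port = settings.get("mail.smtp.port", "")
    if port and not str(port).isdigit():
        problems.append("mail.smtp.port is not a number")
    # Credentials are only required when authentication is switched on.
    if str(settings.get("mail.smtp.auth", "false")).lower() == "true":
        for key in ("mail.userid", "mail.password"):
            if not settings.get(key):
                problems.append(key + " is required when mail.smtp.auth is true")
    return problems
```

An empty list means the combination of values is at least internally consistent. It does not prove that the mail server accepts them.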

Configuring Publishing

By default publishing is not enabled. To enable it you need to specify a password which will be required when publishing. To get started, edit the publisher_config.xml file located under the \pentaho-solutions\system\ folder. Once it is open, locate the following snippet of code:

Enter a password between the publisher-password tags (this password will be the same for all users) so the snippet of code looks similar to the example below (in this example the publisher password is publishthis):

<publisher-password>publishthis</publisher-password>

From now on when any user tries to publish content to Pentaho BI Platform they will need to specify this password.

Configuring the Administration Console

After completing this step no further configuration is needed when setting up the Administration Console with Microsoft SQL Server.

Starting the Business Intelligence Platform

The Pentaho BI Platform is a webapp on the Apache-Tomcat server. To start Apache-Tomcat, first set it up as a Windows service, which makes it a lot easier to start and stop (skip this step if you are using an existing installation of Apache-Tomcat). At the command prompt issue the following command:

C:\pentaho\biserver-ce\tomcat\bin> service.bat install tomcat5
Installing the service 'tomcat5' ...
Using CATALINA_HOME: D:\pentaho\biserver-ce\tomcat
Using CATALINA_BASE: D:\pentaho\biserver-ce\tomcat
Using JAVA_HOME: C:\Program Files\Java\jdk1.6.0_13
Using JVM: C:\Program Files\Java\jdk1.6.0_13\jre\bin\server\jvm.dll
The service 'tomcat5' has been installed.

Once you have received the above output, the next step is to start the Tomcat service. Click the Start button, then Run, type in services.msc and click OK. A Services window should appear listing all available services. Locate the Apache Tomcat tomcat5 service and double-click it to open the Properties dialog box:

To start Tomcat click the Start button (to stop Tomcat simply click the Stop button).

Now you should be able to visit http://localhost:8080/pentaho or http://[your_domain_or_ip]:8080/pentaho. If the Pentaho BI Platform has started successfully you should see the following welcome screen:

After logging in try and run a sample report from the Steel Wheels solution folder:

Starting the Administration Console

To start the Administration Console, run the start-pac.bat file which is located under the c:\pentaho\administration-console\ folder. After you double-click the file, a new command prompt window should open and display something similar to the output below:

DEBUG: Using JAVA_HOME
DEBUG: _PENTAHO_JAVA_HOME=C:\Program Files\Java\jdk1.6.0_13
DEBUG: _PENTAHO_JAVA=C:\Program Files\Java\jdk1.6.0_13\bin\java.exe
2010-01-05 16:27:17.824::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
05/01/2010 4:27:18 PM org.pentaho.pac.server.JettyServer startServer
INFO: Console is starting
2010-01-05 16:27:18.118::INFO: jetty-6.1.2
2010-01-05 16:27:38.672::INFO: Started SocketConnector @ 0.0.0.0:8099
05/01/2010 4:27:38 PM org.pentaho.pac.server.JettyServer startServer
INFO: Console is now started. It can be accessed using http://D119940:8099 or http://161.117.117.40:8099

Now you should be able to visit http://localhost:8099/ or the other two addresses specified in your output (highlighted above). You will be prompted for a Username and Password, which by default are "admin" and "password". If you have successfully started and logged into the administration console you should see the following welcome screen:
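If either welcome screen fails to load, checking whether anything is listening on ports 8080 (BI Platform) and 8099 (Administration Console) can help narrow down the problem. The following is a minimal sketch in Python using only the standard library; the function name is_port_open is illustrative, and a successful connection only shows that the port accepts TCP connections, not that Pentaho itself started correctly.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, is_port_open("localhost", 8080) should return True once the Tomcat service is running, and is_port_open("localhost", 8099) once start-pac.bat has reported that the console is started.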
