Post on 22-Jul-2020

  • Oracle® Machine Learning for R User's Guide

    Release 1.5.1 E97851-04 March 2020

    Copyright © 2012, 2020, Oracle and/or its affiliates.

    Primary Author: David McDermid

    Contributors: Mark Hornick, Sherry Lamonica, Qin Wang, Lei Zhang

    This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

    The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

    If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable:

    U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are “commercial computer software” or “commercial computer software documentation” pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract. The terms governing the U.S. Government’s use of Oracle cloud services are defined by the applicable contract for such services. No other rights are granted to the U.S. Government.

    This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in dangerous applications.

    Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

    Intel and Intel Inside are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Epyc, and the AMD logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group.

    This software or hardware and documentation may provide access to or information about content, products, and services from third parties. Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party content, products, and services unless otherwise set forth in an applicable agreement between you and Oracle. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to your access to or use of third-party content, products, or services, except as set forth in an applicable agreement between you and Oracle.

  • Contents

    Preface

    Technology Rebrand

    Audience

    Documentation Accessibility

    Related Documents

    Oracle Machine Learning for R Online Resources

    Conventions

    1 Introduction to Oracle Machine Learning for R

    1.1 About Oracle Machine Learning for R

    1.2 Advantages of Oracle Machine Learning for R

    1.3 Get Online Help for Oracle Machine Learning for R Classes, Functions, and Methods

    1.4 About Transparently Using R on Oracle Database Data

    1.4.1 About the Transparency Layer

    1.4.2 Transparency Layer Support for R Data Types and Classes

    1.4.2.1 About Oracle Machine Learning for R Data Types and Classes

    1.4.2.2 About the ore.frame Class

    1.4.2.3 Support for R Naming Conventions

    1.4.2.4 About Coercing R and Oracle Machine Learning for R Class Types

    1.5 Typical Operations in Using Oracle Machine Learning for R

    1.6 Oracle Machine Learning for R Global Options

    2 Get Started with Oracle Machine Learning for R

    2.1 Connect to an Oracle Database Instance

    2.1.1 About Connecting to the Database

    2.1.1.1 About Using the ore.connect Function

    2.1.1.2 About Using the ore.disconnect Function

    2.1.2 Use the ore.connect and ore.disconnect Functions

    2.2 Create and Manage R Objects in Oracle Database

    2.2.1 Create R Objects for In-Database Data

    2.2.1.1 About Creating R Objects for Database Objects

    2.2.1.2 Synchronize Data with the ore.sync Function

    2.2.1.3 Get Objects with the ore.get Function

    2.2.1.4 Add a Schema with the ore.attach Function

    2.2.2 Create Ordered and Unordered ore.frame Objects

    2.2.2.1 About Ordering in ore.frame Objects

    2.2.2.2 Global Options Related to Ordering

    2.2.2.3 Ordering Using Keys

    2.2.2.4 Ordering Using Row Names

    2.2.2.5 Using Ordered Frames

    2.2.3 Move Data to and from the Database

    2.2.4 Create and Delete Database Tables

    2.2.5 Save and Manage R Objects in the Database

    2.2.5.1 About Persisting Oracle Machine Learning for R Objects

    2.2.5.2 About OML4R Datastores

    2.2.5.3 Save Objects to a Datastore

    2.2.5.4 Control Access to Datastores

    2.2.5.5 Get Information about Datastore Contents

    2.2.5.6 Restore Objects from a Datastore

    2.2.5.7 Delete a Datastore

    2.2.5.8 About Using a Datastore in Embedded R Execution

    3 Prepare and Explore Data in the Database

    3.1 Prepare Data in the Database Using Oracle Machine Learning for R

    3.1.1 About Preparing Data in the Database

    3.1.2 Select Data

    3.1.2.1 Select Data by Column

    3.1.2.2 Select Data by Row

    3.1.2.3 Select Data by Value

    3.1.3 Index Data

    3.1.4 Combine Data

    3.1.5 Summarize Data

    3.1.6 Transform Data

    3.1.7 Sample Data

    3.1.8 Partition Data

    3.1.9 Prepare Time Series Data

    3.2 Explore Data

    3.2.1 About the Exploratory Data Analysis Functions

    3.2.2 About the NARROW Data Set for Examples

    3.2.3 Correlate Data

    3.2.4 Cross-Tabulate Data

    3.2.5 Analyze the Frequency of Cross-Tabulations

    3.2.6 Build Exponential Smoothing Models on Time Series Data

    3.2.7 Rank Data

    3.2.8 Sort Data

    3.2.9 Summarize Data with ore.summary

    3.2.10 Analyze the Distribution of Numeric Variables

    3.2.11 Principal Component Analysis

    3.2.12 Singular Value Decomposition

    3.3 Data Manipulation Using OREdplyr

    3.3.1 Select and Order Data

    3.3.1.1 Examples of Selecting Columns

    3.3.1.2 Examples of Programming with select_

    3.3.1.3 Examples of Selecting Distinct Columns

    3.3.1.4 Examples of Selecting Rows by Position

    3.3.1.5 Examples of Arranging Columns

    3.3.1.6 Examples of Filtering Columns

    3.3.1.7 Examples of Mutating Columns

    3.3.2 Join Rows

    3.3.3 Group Columns and Rows

    3.3.4 Aggregate Columns and Rows

    3.3.5 Sample Rows

    3.3.6 Rank Rows

    3.4 About Using Third-Party Packages on the Client

    4 Build Models in Oracle Machine Learning for R

    4.1 Build Oracle Machine Learning for R Models

    4.1.1 About OREmodels Functions

    4.1.2 About the longley Data Set for Examples

    4.1.3 Build Linear Regression Models

    4.1.4 Build a Generalized Linear Model

    4.1.5 Build a Neural Network Model

    4.1.6 Build a Random Forest Model

    4.2 Build Oracle Machine Learning for SQL Models

    4.2.1 About Building OML4SQL Models using OML4R

    4.2.1.1 OML4SQL Models Supported by OML4R

    4.2.1.2 About OML4SQL Models Built by OML4R Functions

    4.2.1.3 Specify Model Settings

    4.2.2 Build an Association Rules Model

    4.2.3 Build an Attribute Importance Model

    4.2.4 Build a Decision Tree Model

    4.2.5 Build an Expectation Maximization Model

    4.2.6 Build an Explicit Semantic Analysis Model

    4.2.7 Build an Extensible R Algorithm Model

    4.2.8 Build