Creating R Packages
DESCRIPTION
Presentation on creating R packages (including native code integration). Presented at the Melbourne R User Group in 2011.
TRANSCRIPT
Creating R Packages
Rory Winston
Outline
Basics
Creating a Simple Package
Interfacing With Native Code
Creating R Packages
Rory Winston
February 17, 2011
Rory Winston Melbourne R User Group
Creating R Packages
1 Outline
2 Basics
3 Creating a Simple Package
4 Interfacing With Native Code
R Packages
R’s “jewel in the crown”
Almost 3,000 packages on CRAN
Preferred extension mechanism for R
Why create a Package?
Keep frequently-used code and data together
Save repetitive typing and analysis
Extend base R functionality
Share analysis with others
Package reproducible research
Package Conventions
R follows “convention over configuration”
Flexible packaging structure
Sensible defaults
Some pedantry: note that ‘package’ and ‘library’ are not strictly equivalent (a library is a directory in which packages are installed)
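To make the package/library distinction concrete, the following sketch lists the library directories R searches and then locates one installed package inside a library. It uses only base R functions; the stats package is chosen purely as an illustration.

```r
# A library is a directory that holds installed packages
libs <- .libPaths()
print(libs)

# A package lives inside one of those libraries
pkg_dir <- find.package("stats")
print(pkg_dir)

# library() attaches a package that was found in a library
library(stats)
print("package:stats" %in% search())
```

So `library(stats)` does not load a library; it attaches the package stats from whichever library it was installed in.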
Package Structure
Basic package structure:
mypackage/
  DESCRIPTION   # Mandatory package metadata
  R/            # R source files
  data/         # Data directory
  demo/         # Demo code
  man/          # Package docs (.Rd)
  po/           # i18n
  src/          # Native (compiled) code
  tests/        # Unit tests
Default Loaded Packages
Not all packages are loaded by default
A basic subset only
Loading many packages can adversely affect performance
To see packages loaded by default:
> getOption("defaultPackages")
[1] "datasets" "utils" "grDevices"
[4] "graphics" "stats" "methods"
Installed Packages
> pkginfo <- installed.packages()
> class(pkginfo)
[1] "matrix"
> dimnames(pkginfo)[1]
[[1]]
[1] "aplpack" "base"
[3] "boot" "caret"
[5] "codetools" "datasets"
[7] "distr" "e1071"
[9] "fortunes" "graphics"
[11] "grDevices" "grid"
[13] "highlight" "inline"
[15] "IPSUR" "iterators"
[17] "itertools" "lotto"
[19] "methods" "neuralnet"
[21] "parser" "plyr"
[23] "qcc" "Rcpp"
[25] "reshape" "RUnit"
[27] "scatterplot3d" "sfsmisc"
[29] "splines" "startupmsg"
[31] "stats" "stats4"
[33] "tcltk" "tools"
[35] "utils" "zoo"
Installed Packages
> dimnames(pkginfo)[2]
[[1]]
[1] "Package" "LibPath" "Version"
[4] "Priority" "Depends" "Imports"
[7] "LinkingTo" "Suggests" "Enhances"
[10] "OS_type" "License" "Archs"
[13] "Built"
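Since installed.packages() returns a character matrix, it can be indexed by the column names listed above. A small sketch, assuming nothing beyond a standard R installation (the variable names are illustrative):

```r
pkginfo <- installed.packages()

# Keep only a few of the columns listed above
versions <- pkginfo[, c("Package", "Version", "Priority"), drop = FALSE]
print(head(versions, 3))

# Names of the packages that ship as part of base R
is_base <- which(pkginfo[, "Priority"] == "base")
base_pkgs <- unname(pkginfo[is_base, "Package"])
print(base_pkgs)
```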
Currently Loaded Packages
To see currently loaded packages:
> (.packages())
[1] "stats" "graphics" "grDevices"
[4] "utils" "datasets" "methods"
[7] "base"
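As a rough illustration of how this list changes, the sketch below attaches one more package and compares the result. The tools package is used only because it ships with R and is not in the default set shown earlier.

```r
# .packages() returns its value invisibly, so assign it (or wrap it in parentheses)
before <- .packages()

# Attach a package that is not attached by default
library(tools)
after <- .packages()

# Packages attached by the call above (empty if tools was already attached)
print(setdiff(after, before))

# The full search path, including attached packages
print(search())
```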
Simple Package Example
Australian Lotto package
Some sample data (historical results)
Simple functions
Help files
Building and checking the package
Creating the Package
Simplest way to create a package in R:
Create a basic set of functions and data
Use package.skeleton()
Modify and add as required
The ozlotto Package
Let’s download some sample data for our package:
$ curl https://www.tattersalls.com.au/FullResults/TattslottoResults.zip > lotto.zip
$ unzip lotto.zip
Load The Data
Load the data into R:
> lotto<-read.table("Tattslotto.txt", sep = ",",
+ fill = TRUE, header = TRUE,
+ col.names = c("number", "date",
+ c(1:6), "supp1", "supp2"),
+ na.strings=c("-"))
> lotto$date <- as.POSIXct(strptime(lotto$date,
+ "%Y%m%d"))
> head(lotto,4)
number date X1 X2 X3 X4 X5 X6 supp1 supp2
1 101 1981-03-07 33 8 15 20 25 5 11 NA
2 102 1981-03-14 1 32 18 19 37 38 4 NA
3 103 1981-03-21 20 12 17 1 19 39 2 NA
4 104 1981-03-28 34 14 2 18 26 15 4 NA
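The loading step can be tried without the downloaded file. The sketch below feeds read.table() a few inline lines; the raw layout (a header row, dates written as YYYYMMDD, "-" for missing values) and the header names are assumptions inferred from the call above, not taken from the real file.

```r
# Hypothetical sample in the assumed raw layout: header line, then one draw per line
raw_lines <- c(
  "Draw,Date,N1,N2,N3,N4,N5,N6,S1,S2",
  "101,19810307,33,8,15,20,25,5,11,-",
  "102,19810314,1,32,18,19,37,38,4,-"
)

lotto_sample <- read.table(text = raw_lines, sep = ",", fill = TRUE,
                           header = TRUE,
                           col.names = c("number", "date", 1:6, "supp1", "supp2"),
                           na.strings = "-")

# Numeric column names are made syntactic by read.table (1 becomes X1, and so on)
print(colnames(lotto_sample))

# Dates arrive as numbers such as 19810307; parse them as text
# (tz = "UTC" is chosen here only to make the result reproducible)
lotto_sample$date <- as.POSIXct(strptime(lotto_sample$date, "%Y%m%d", tz = "UTC"))
print(lotto_sample)
```

This also explains the X1 to X6 column names in the output above: col.names supplies "1" to "6", and read.table() converts them to valid R names.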
Some Data
> draws <- as.data.frame(lotto[,3:8])
> colnames(draws) <- paste("draw", c(1:6))
> head(draws, 5)
draw 1 draw 2 draw 3 draw 4 draw 5 draw 6
1 33 8 15 20 25 5
2 1 32 18 19 37 38
3 20 12 17 1 19 39
4 34 14 2 18 26 15
5 14 29 7 18 2 16
Some Functions
> plot.freqs <- function(x) barplot(cex.names=.6,
+ table(unlist(x)), col="lightblue",
+ las=2, main="Total Draw Frequency")
> plot.freqs(draws)
[Figure: bar plot titled "Total Draw Frequency"; x-axis shows the drawn numbers 1 to 45, y-axis shows counts from 0 to 200]
What’s In The Environment?
> sapply(objects(), function(x) (class(get(x))))
draws lotto plot.freqs
"data.frame" "data.frame" "function"
Creating The Package Skeleton
> package.skeleton(list=ls(), name="lotto")
Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './lotto/Read-and-delete-me'.
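The call above picks up everything in the global workspace. A variation that may be easier to experiment with is sketched below: the objects are collected in a separate environment and the skeleton is written to a temporary directory. The package name "lottodemo" and the toy data are made up for illustration.

```r
# Collect the objects for the package in their own environment
pkg_env <- new.env()
pkg_env$draws <- data.frame(draw1 = c(33, 1), draw2 = c(8, 32))
pkg_env$plot.freqs <- function(x) barplot(table(unlist(x)))

# Generate the skeleton under a temporary directory
# ("lottodemo" is a hypothetical package name)
out_dir <- tempfile("skeleton")
dir.create(out_dir)
package.skeleton(name = "lottodemo", list = c("draws", "plot.freqs"),
                 environment = pkg_env, path = out_dir)

# Inspect what was generated
pkg_root <- file.path(out_dir, "lottodemo")
print(list.files(pkg_root, recursive = TRUE))
```

Recent versions of R may generate additional files (for example a NAMESPACE file) beyond those shown on the slides.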
What’s In The Package?
lotto/
|~data/
| |-draws.rda
| |-lotto.rda
|~man/
| |-draws.Rd
| |-lotto-package.Rd
| |-lotto.Rd
| |-plot.freqs.Rd
|~R/
| |-plot.freqs.R
|-DESCRIPTION
|-Read-and-delete-me
Editing the DESCRIPTION
Mandatory file (very important!)
“Debian Control File” format
Many different fields, see docs for reference
Dependencies (and licenses) can use version ranges
Example DESCRIPTION
Package: lotto
Type: Package
Title: OzLotto Example Package
Version: 1.0
Date: 2011-02-14
Author: Rory Winston
Maintainer: Rory Winston <[email protected]>
Description: Simple toy package
Depends: R (>= 2.12.0)
License: GPL (>=2) | BSD
LazyLoad: yes
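Because DESCRIPTION uses the Debian control file format, base R can parse it with read.dcf(). The sketch below writes an abbreviated copy of the example to a temporary file and reads individual fields back; the temporary file and the reduced field list are for illustration only.

```r
# Write a small DESCRIPTION file to a temporary location
desc_lines <- c(
  "Package: lotto",
  "Type: Package",
  "Title: OzLotto Example Package",
  "Version: 1.0",
  "Depends: R (>= 2.12.0)",
  "License: GPL (>=2) | BSD"
)
desc_file <- tempfile("DESCRIPTION")
writeLines(desc_lines, desc_file)

# read.dcf() parses "Field: value" records into a character matrix,
# one row per record and one column per field
desc <- read.dcf(desc_file)
print(colnames(desc))
print(desc[1, "Version"])
print(desc[1, "Depends"])
```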
Package Dependencies
If your package depends on functionality defined in other packages, these can be added to the Depends field
Package versions can also be specified
Example from the highlight package:
Depends: R (>= 2.11.0), tools, codetools, utils, parser (>= 0.0-10)
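Version constraints such as (>= 0.0-10) are compared numerically, component by component, rather than as text. A brief sketch of that comparison using base R's version classes (the candidate version is invented for the example):

```r
# Version strings are compared component by component, not as text
required <- package_version("0.0-10")
candidate <- package_version("0.0-9")   # hypothetical installed version
print(candidate >= required)

# The same idea applied to the running R version
print(getRversion() >= "2.11.0")

# Installed version of a package that ships with R
print(packageVersion("tools"))
```

Here 0.0-9 does not satisfy the constraint, even though "9" sorts after "10" as a character string.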
.RData Files
Data files are stored in .rda format
This is a portable, (optionally) compressed representation
Same as save(lotto, file="lotto.rda")
$ file lotto.rda
lotto.rda: gzip compressed data
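A round trip through an .rda file can be sketched as follows. The small data frame stands in for the real lotto data, and loading into a separate environment is just one way to avoid overwriting objects in the workspace.

```r
lotto_demo <- data.frame(number = c(101, 102), supp1 = c(11, 4))

# Save to .rda (gzip-compressed by default)
rda_file <- tempfile(fileext = ".rda")
save(lotto_demo, file = rda_file)

# load() restores objects under their original names; a fresh
# environment keeps the original object untouched
restored <- new.env()
loaded_names <- load(rda_file, envir = restored)
print(loaded_names)
print(identical(restored$lotto_demo, lotto_demo))
```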
Help Files - The .Rd Format
Rd is the “R documentation format”
Can be compiled into LaTeX, PDF, HTML, ASCII text, HTML Help, etc.
Functions and data can be documented
Uses a TeX-like markup
Many, many options
Sample Documentation for a Function
\name{plot.freqs}
\alias{plot.freqs}
\title{Plotting Number Frequencies Across Draws}
\description{
This function produces a bar plot of number
frequencies across all six-number draws.
}
\usage{plot.freqs(x)}
\arguments{\item{x}{
A \code{data.frame} where each row corresponds
to a separate lottery draw and the columns
represent the numbers drawn in that event, in order.}}
\author{Rory Winston}
\seealso{
See \code{\link{draws}}
Also see \code{\link[graphics]{hist}}
}
\examples{
# Simulate 1000 draws of six distinct numbers from 1 to 45
random.draw <- function() sample(1:45, 6)
draws <- t(replicate(1000, random.draw()))
plot.freqs(draws)
}
Documenting Data
Note that R will generate doc skeletons for package data
The data will be inspected and sample docs created
For example:
\format{
A data frame with 1620 observations on the following 10 variables.
\describe{
\item{\code{number}}{a numeric vector}
\item{\code{date}}{a POSIXct}
\item{\code{X1}}{a numeric vector}
\item{\code{X2}}{a numeric vector}
\item{\code{X3}}{a numeric vector}
\item{\code{X4}}{a numeric vector}
\item{\code{X5}}{a numeric vector}
\item{\code{X6}}{a numeric vector}
\item{\code{supp1}}{a numeric vector}
\item{\code{supp2}}{a numeric vector}
}
}
Sample Generated Manual
Package ‘lotto’
February 15, 2011
Type Package
Title OzLotto Example Package
Version 1.0
Date 2011-02-14
Author Rory Winston
Maintainer Rory Winston <[email protected]>
Description Simple toy package
Depends R (>= 2.12.0)
License GPL (>=2) | BSD
LazyLoad yes
R topics documented:
lotto-package
draws
lotto
plot.freqs
Index
lotto-package Simple Oz Lotto Package
Description
This package contains some historical result data and some simple functions.
Details
Package: lotto
Type: Package
Version: 1.0
Date: 2011-02-14
License: Unlimited
LazyLoad: yes
Math in Rd Docs
Note that Rd supports TeX-like math markup
The math markup will be downgraded to ASCII where appropriate
The text \deqn{p(x) = \frac{1}{b-a}} becomes (in PDF and console):
PDF (excerpt from the generated manual):
lotto Historical Oz Lotto Results
Usage
data(lotto)
Format
A data frame with 1620 observations on the following 10 variables.
number a numeric vector
date a POSIXct
X1 a numeric vector
X2 a numeric vector
X3 a numeric vector
X4 a numeric vector
X5 a numeric vector
X6 a numeric vector
supp1 a numeric vector
supp2 a numeric vector
Examples
data(lotto)
## maybe str(lotto) ; plot(lotto) ...
plot.freqs Plotting Number Frequencies Across Draws
Description
This function produces a bar plot of number frequencies across all six-number draws. The uniform distribution is commonly notated as
$$p(x) = \frac{1}{b - a}$$
Usage
plot.freqs(x)
Arguments
x A data.frame where each row corresponds to a separate lottery draw and the columns represent the numbers drawn in that event, in order.
Console:
This function produces a bar plot of number frequencies across all
six-number draws. The uniform distribution is commonly notated as
p(x) = \frac{1}{b-a}
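The console output above shows the raw markup because \deqn was given only a LaTeX form. Rd also accepts a second argument with a plain-text alternative, \deqn{latex}{ascii}. The sketch below renders a tiny Rd file to text with tools::Rd2txt() to show the effect; the topic name "uniformdemo" and its contents are invented for the example.

```r
# A minimal Rd file with a display equation that has an ASCII fallback
rd_source <- c(
  "\\name{uniformdemo}",
  "\\alias{uniformdemo}",
  "\\title{Uniform Density Demo}",
  "\\description{",
  "  The uniform distribution is commonly notated as",
  "  \\deqn{p(x) = \\frac{1}{b-a}}{p(x) = 1/(b-a)}",
  "}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_source, rd_file)

# Render to plain text, as the console help viewer would
txt_file <- tempfile(fileext = ".txt")
tools::Rd2txt(rd_file, out = txt_file)
rendered <- readLines(txt_file)
cat(rendered, sep = "\n")
```

In the text rendering the second argument is used, so the reader sees 1/(b-a) rather than the \frac markup.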
R CMD check
R CMD check is the first port of call
Checks documentation, package structure, runs examples
Produces compiled documentation (e.g. PDF) if appropriate
Basic procedure:
Run R CMD check <packagename>
Check errors in generated <packagename>.Rcheck dir
If any errors, fix up, rinse and repeat
Testing The Package
The package can be loaded from a working directory
For instance, if we are in the generated lotto.Rcheck dir:
> library(lib.loc=".", package="lotto")
This works because R CMD check installs a loadable copy of the package there
Building The Package
A source package (a .tar.gz tarball) can be built using R CMD build <packagename>
This can be installed to a local library
> install.packages(c("lotto_1.0.tar.gz"),
+ repos=NULL)
Things To Be Aware Of
Namespaces
Lazy Loading
What does the following mean:
> suppressWarnings(dump("AirPassengers",
+ "", evaluate=FALSE))
AirPassengers <-
<promise: lazyLoadDBfetch(c(0L, 367L), datafile, compressed,
envhook)>
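The <promise: ...> in that output indicates that AirPassengers has not actually been read yet: with lazy loading, each object is bound to a promise that fetches it from the package's lazy-load database on first use. The same mechanism can be sketched in plain R with delayedAssign(); the names big_data and evaluated are illustrative.

```r
evaluated <- FALSE

# Bind 'big_data' to a promise; the expression is not run yet
delayedAssign("big_data", {
  evaluated <- TRUE      # side effect so we can see when evaluation happens
  seq_len(5)
})

print(evaluated)         # nothing has touched big_data so far
total <- sum(big_data)   # first use forces the promise
print(evaluated)
print(total)
```

This is why attaching a package with many large datasets can stay cheap: objects that are never used are never loaded.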
Interfacing With Native Code
Why go native?
Speed
Functionality otherwise unavailable
Some examples:
Algorithms in C/C++/Fortran code
Speeding up slow R routines
Workarounds for R limitations (e.g. shared memory)
Considerations When Using R and C
R uses many LISP idioms in the C code
e.g. PROTECT(ans = FirstArg(CAR(sub),CADR(sub)));
R itself has many LISP-like features
> (`+`(`sum`(`^`((`:`(1,10)),2)),
+ (`^`((`sum`(`:`(1,10))),2))))
[1] 3410
> sum((1:10)^2) + (sum(1:10))^2
[1] 3410
Garbage collection is also an issue
Frequent source of error (even for the R team)
Simple Example
R does not have a ‘matrix exponentiation’ operator
Scalar (element-wise) exponentiation only
$$\begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix}^{n}$$
> X <- matrix(1:4, 2, 2)
> X^2
[,1] [,2]
[1,] 1 9
[2,] 4 16
> X %*% X
[,1] [,2]
[1,] 7 15
[2,] 10 22
Creating a New Operator
All operators in R are just functions
Binary operators take two arguments
We will create a new operator %^%
The Exponentiation Operator
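The code for this slide did not survive in the transcript. As a stand-in, here is a minimal pure-R sketch of what a %^% operator could look like, using exponentiation by squaring. It is an assumption made for illustration, not the native implementation discussed in the talk.

```r
# Hypothetical pure-R version of %^%: raises a square matrix to a
# non-negative integer power by repeated squaring
`%^%` <- function(x, n) {
  stopifnot(is.matrix(x), nrow(x) == ncol(x),
            length(n) == 1, n >= 0, n == round(n))
  result <- diag(nrow(x))   # identity matrix, i.e. x to the power 0
  base <- x
  while (n > 0) {
    if (n %% 2 == 1) result <- result %*% base  # fold in the current power
    base <- base %*% base                       # square for the next binary digit
    n <- n %/% 2
  }
  result
}

X <- matrix(1:4, 2, 2)
print(X %^% 2)   # should agree with X %*% X from the earlier slide
print(X %^% 3)
```

Any function whose name has the form %name% can be used as a binary operator, which is why the backtick-quoted definition above is enough to enable the X %^% 2 syntax.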
A Better Way (At Least For C++)
Use the Rcpp package
Lots of examples
Clean, “modern” C++
Manages memory allocation/protection
Provides nice syntactic sugar for C++ operations
Building Windows Packages
R is reasonably Unix-centric
Perl no longer required for most tasks
Some tools support (e.g. LaTeX) also assumed
Packages can be compiled with Visual Studio
The mingw compiler and other supporting tools can be downloaded from:
http://www.murdoch-sutherland.com/Rtools/
Further Reading
R CMD <command> --help
The R documentation
Mailing Lists