Implementing SUSHI/COUNTER at Your Institution
TRANSCRIPT
As part of their Training Thursday’s initiative, on May 14th I had the honor of presenting a session entitled "Implementing SUSHI/COUNTER at Your Institution". This session was geared towards library staff… in particular those who may be responsible for retrieving and working with COUNTER statistics and who hope to take advantage of SUSHI to make their lives easier.
I suspect most librarians have heard about SUSHI and know that, in concept, it is supposed to make working with COUNTER statistics much easier; however, I also suspect many are still confused as to how they could actually use SUSHI in their day‐to‐day activities.
1
The goal of this session is to remove at least some of the mystery surrounding SUSHI and by the end of the session the participants should be able to successfully use the SUSHI feature of their ERM or usage product ‐‐ or use a free client like MISO to retrieve COUNTER reports. They will understand:
• What information is needed for SUSHI requests to be successful
• Where to get that information
• The necessity to interact with the content provider or their web site before they can start using SUSHI
• As well as some of the reasons why SUSHI requests may fail so that they can trouble‐shoot the situation and know where to ask for help
During the session we will demonstrate some different SUSHI clients and look at some of the different approaches taken by various content providers to enable SUSHI for a given customer.
Armed with this knowledge you should be able to reliably harvest your COUNTER statistics. But of course retrieving the usage is only the first step in any analysis. Usage needs to be processed, understood, and combined with other information. While normally your ERM will handle these details, we will show how to use Excel to:
• Work with XML versions of COUNTER reports
• And, to merge in cost data and create a quick cost‐per‐use analysis.
The Excel templates we use during the session will be made available on the Usus web site. And for those of you not familiar with Usus ‐‐ it is a community website focused on all things usage and it is an excellent resource to become familiar with ‐‐ you can find it at USUS.ORG.UK.
2
While we assume you have a basic understanding of COUNTER and SUSHI, it doesn’t hurt to spend a couple of minutes reviewing these two topics.
3
Project COUNTER released its first Code of Practice in 2002 – initiated as a collaborative effort by librarians, publishers and full text aggregators to ensure librarians had access to usage statistics that are “consistent”, “comparable” and “credible”. The Code of Practice, now in release 4, covers usage statistics for journals, books, databases and multimedia. Adhering to the Code of Practice requires content providers to process usage logs correctly, format reports properly and present those reports both via a web site and automatically using the SUSHI protocol (more on that in a minute). The “credibility” of COUNTER statistics comes as a result of the requirement for content providers to have their statistics audited by an accredited (by COUNTER) auditor.
4
SUSHI, which stands for the “Standardized Usage Statistics Harvesting Initiative”, is about automating the retrieval, or harvesting, of COUNTER reports. SUSHI, which became a NISO standard in 2007 and was updated in 2014, describes a protocol that enables automated request and retrieval of COUNTER reports. The reason for SUSHI? With literally 100s of platforms and multiple reports per platform, the effort of retrieving and organizing these usage reports can be immense. With SUSHI, it is possible for an E‐Resource Management system or dedicated usage consolidation application to automatically harvest and process the usage.
5
This is an old diagram from the early days of SUSHI, but it provides a good review of what is going on.
6
Before we dig into SUSHI clients and configuration, I want to take a minute to talk about “Platforms”. They are integral to COUNTER statistics and it is important to understand what they are.
7
The simple definition is that a Platform is where content is accessed. Some platforms are obvious, like EBSCOhost, MIT Press, ScienceDirect and Taylor & Francis – these are easily recognizable as the platform name is synonymous with the publisher or content provider’s host where content is accessed. Some Platforms are not so obvious; HighWire and SilverChair, for example, are platforms that host content from many publishers. In such cases the platform name doesn’t reference the publisher. When you think of Sage, for example, you may think of the host as “Sage Journals Online (SJO)”; however, the actual “Platform” found in the COUNTER report is HighWire.
AND, from the perspective of harvesting usage, we should further refine the definition of the Platform as also “representing a unique usage reporting site login”. We mentioned HighWire and SilverChair as two platforms that host content for many publishers. When it comes to COUNTER reports, HighWire provides a single report that returns usage for all of a customer’s subscriptions across all hosted publishers – this makes HighWire a single Platform serving all of its publishers. SilverChair, on the other hand, requires a separate login for each publisher with the COUNTER reports only providing usage for that publisher ‐‐ even though the Platform name in all of these reports is listed as “SilverChair”, from a reporting perspective each SilverChair publisher is a separate Platform.
8
When it comes to COUNTER usage statistics, the usage being reported is totaled up for a given “Platform”. A Journal Report 1 lists the total usage for a given journal on a given Platform.
9
While we are on the topic, one common misconception is that there is a COUNTER report to show usage of titles in a “package”. At least at this point, there is no such report – Journal, Book and Title usage reports will include all subscribed titles across the entire Platform and there is no breakdown by package offered.
10
Of course, if you want to track usage for an e‐journal package, it can be done by pulling the JR1 report from the content provider and filtering the results to just those journals you know are in a particular package. We have an example later of how to do this in Excel.
11
If we want to automate the harvesting of COUNTER reports we need to talk about SUSHI clients.
12
SUSHI, as a standard, is about a communications or message‐exchange protocol, and by itself it does nothing.
13
In order to take advantage of SUSHI you need a computer program that is (or includes) a SUSHI client so that client can talk to the SUSHI server where the COUNTER reports are.
14
The content providers are the ones that provide the SUSHI server.
15
So, you need to provide the client.
16
So where do you find one?
17
One option is to use commercial applications that offer usage management features. These would include E‐Resource Management (ERM) systems or dedicated usage consolidation products – most of the commercial offerings have a SUSHI client built in.
18
Some examples are:
• EBSCO Usage Consolidation
• Ex Libris UStat
• Innovative ERM
• ProQuest 360 COUNTER
• Etc.
19
Another option is to use an Open Source SUSHI client. Typically these are provided as‐is and are often limited to helping with the harvesting of COUNTER reports or simply to serve as a proof‐of‐concept or a starting point for an individual or organization that wants to create their own usage management system. Typically Open Source clients are about harvesting (retrieving) the reports and not about analysis of the usage.
20
Open Source SUSHI clients that we are aware of include:
• MISO (ProQuest)
• Pycounter (The Health Sciences Library System of the University of Pittsburgh)
• SoapUI (a web service test tool by SmartBear)
SoapUI is a generic tool for testing web services, but it can be used as a SUSHI client.
21
The NISO SUSHI website includes a web page called “SUSHI Tools & Other Aids” where links to these Open Source clients can be found.
22
Once you have found your SUSHI client, understanding what is needed to make it work is important. Let’s talk about some basics for configuration.
23
The SUSHI protocol is pretty simple. When configuring your client to access usage from a given provider (Platform) here is what you need:
1. The URL where that Platform’s usage can be found
2. Identity of who is making the request
3. Identity of the institution that you want usage for
4. The name of the report you want
5. The date range for the usage
24
In the SUSHI protocol...
‐ The URL of the SUSHI Server may be referred to as the SUSHI Server URL or the Service End Point.
‐ The “Requestor ID” is what identifies who is making the request. This is a value assigned by the content provider and usually varies from Platform to Platform.
‐ The “Customer ID” identifies the institution usage is being requested for. This is the identifier that the content provider uses for the institution. Note that the Requestor and the Customer may be two different organizations – consider the example of EBSCO or ProQuest retrieving usage for a customer of their service.
‐ The report name will be the official abbreviated name such as “JR1” or “BR1”, etc.
‐ And the date range will be start and end dates that represent a range of full months. Most Platforms allow you to retrieve anywhere from 1 to 12 months of usage.
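To make the five pieces of information concrete, here is a minimal Python sketch that simply holds them together and turns two yyyymm values into a range of full months. The class name, field names, URL and ID values are all made up for illustration; they are not taken from any particular SUSHI client or content provider.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass
class SushiRequest:
    """The five things a SUSHI client needs (field names are illustrative)."""
    server_url: str    # SUSHI Server URL / service endpoint
    requestor_id: str  # who is making the request
    customer_id: str   # institution the usage is for
    report_name: str   # official abbreviated name, e.g. "JR1" or "BR1"
    begin: date        # first day of the first month requested
    end: date          # last day of the last month requested


def full_month_range(first_yyyymm: str, last_yyyymm: str) -> tuple[date, date]:
    """Turn two yyyymm strings into begin/end dates covering whole months."""
    begin_year, begin_month = int(first_yyyymm[:4]), int(first_yyyymm[4:])
    end_year, end_month = int(last_yyyymm[:4]), int(last_yyyymm[4:])
    begin = date(begin_year, begin_month, 1)
    # monthrange() returns (weekday of the 1st, number of days in the month).
    end = date(end_year, end_month, calendar.monthrange(end_year, end_month)[1])
    if end < begin:
        raise ValueError("end month is before begin month")
    return begin, end


begin, end = full_month_range("201501", "201503")
request = SushiRequest(
    server_url="https://sushi.example.org/service",  # placeholder, not a real endpoint
    requestor_id="REQ-0000",                         # placeholder value
    customer_id="CUST-0000",                         # placeholder value
    report_name="JR1",
    begin=begin,
    end=end,
)
print(request.begin, request.end)
```

A real client would go on to wrap these values in a SUSHI request message; the point here is only that nothing more than these five values is needed to describe the request.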
25
In this section we will take a walk through a commercial application and how to add the details for a Platform. In our example we will use “ACM Digital Library”. Note that the values provided in this example are not real so do not try to use them to test your own configuration since you will get an error if you do.
26
This is EBSCO’s new Usage Consolidation product. The Usage features are integrated with the holdings management functionality.
27
Since we are configuring a platform, we start by selecting “Platforms” from the menu
28
Here we are presented with a list of Platforms available for configuration. Currently the system already has listings for nearly 300 Platforms.
29
We will go ahead and pick “ACM Digital Library”
30
And we are presented with the Platform Details. On this first page you can see where the URL and username and password for the usage reporting site can be added.
31
As we scroll down, we see a place for instructions... (these show on the “Load Usage” page to provide the library staff member with step‐by‐step instructions for manually requesting reports from this Platform.)
Just below the instructions is the “Day of Month to Harvest Usage”. This is what tells the SUSHI Client when to attempt to pull the report. COUNTER allows content providers up to 4 weeks AFTER the end of the month to have COUNTER usage prepared. Some organizations like EBSCO and Elsevier will have their usage ready within 2‐3 days of the beginning of the month. Others take 10‐15 days. And many require 28 days. If you try to harvest usage too early, the server will respond with an exception indicating “no usage available for the requested months.”
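The idea behind a "day of month to harvest" setting can be sketched in a few lines of Python. This is only an illustration, assuming the delay is counted in days from the first day of the following month; the function names and the delay values are hypothetical and not part of any product.

```python
from datetime import date, timedelta


def earliest_harvest_date(usage_year: int, usage_month: int, delay_days: int) -> date:
    """First date on which usage for the given month is expected to be ready."""
    # Work out the first day of the month after the usage month.
    if usage_month == 12:
        first_of_next = date(usage_year + 1, 1, 1)
    else:
        first_of_next = date(usage_year, usage_month + 1, 1)
    return first_of_next + timedelta(days=delay_days)


def ready_to_harvest(today: date, usage_year: int, usage_month: int, delay_days: int) -> bool:
    """True if today is on or after the expected availability date."""
    return today >= earliest_harvest_date(usage_year, usage_month, delay_days)


# March 2015 usage, checked on 15 April 2015 with two different provider delays.
print(ready_to_harvest(date(2015, 4, 15), 2015, 3, delay_days=3))   # fast provider
print(ready_to_harvest(date(2015, 4, 15), 2015, 3, delay_days=28))  # slow provider
```

In practice you would record one delay per Platform, since, as noted above, the delay varies from provider to provider.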
32
But our goal is to configure the SUSHI information ‐‐ I have zoomed in a bit so you can see the fields. Just like we mentioned earlier, the information is very basic – the SUSHI Server URL, the Requestor ID and the SUSHI Customer ID. All of this information would come from the content provider – they are typically not values that the librarian will just “know” and definitely are not values you can guess.
33
OK, so let’s scroll back to the top where we see the “Reports to Load” tab.
34
Click this to see the reports offered on this Platform, then select the reports you want.
35
Most commercial applications will provide a similar fill‐in‐the‐blanks approach for entering the configuration data for the various platforms. So let’s take a look at one of the more popular Open Source SUSHI clients – MISO.
36
You can download the MISO client from https://code.google.com/p/sushicounterclient/ and the zip file includes the executable – so even though you get the source code, you have something you can just run. For my examples, I have downloaded the MISO client and it is installed on my hard drive in the C:\MISO directory.
37
Here is what the contents of that directory look like. Notice the highlighted file “SUSHIConfig.csv” – this is where we enter our configuration data.
38
SUSHIConfig.csv, as its “CSV” extension suggests, is a comma‐separated file. It has the capability of holding configurations for multiple Platforms and multiple reports for a given platform.
39
One important note is that the MISO client was built for COUNTER Release 3; HOWEVER, it still works very well for COUNTER Release 4 reports Journal Report 1 (JR1) and Database Report 1.
40
Here is what the file looks like when opened in Excel... With some sample data included.
41
The “Library Code” identifies the library usage is for and will be part of the name of the file MISO creates. This value is NOT sent to the server.
42
The “Provider name” identifies the platform where usage will be harvested. It is also part of the name of the file MISO creates. This value is NOT sent to the server.
43
Specifies the COUNTER “Release” number for the report being requested. At this point it should always be “4”. This is included in the SUSHI Request.
44
The “URL” is the address of the SUSHI Server for this provider. This MUST be the URL of the server (service endpoint) not the WSDL. The WSDL is the “Web Service Description Language” that describes the service.
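Because mixing up the service endpoint and the WSDL address is an easy mistake, a quick check before saving a configuration can help. The Python sketch below is a rough heuristic of my own, not something MISO does: it assumes a WSDL address either ends in ".wsdl" or carries a "?wsdl" query string, which is a common convention but not a rule. The URLs are placeholders.

```python
from urllib.parse import urlparse


def looks_like_wsdl(url: str) -> bool:
    """Heuristic: True if the URL appears to point at a WSDL document."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    query = parsed.query.lower()
    # Two common conventions: ".../Service.wsdl" and ".../Service?wsdl".
    return path.endswith(".wsdl") or query == "wsdl" or query.startswith("wsdl&")


for candidate in (
    "https://sushi.example.org/SushiService",       # placeholder endpoint
    "https://sushi.example.org/SushiService?wsdl",  # placeholder WSDL address
):
    label = "WSDL (do not use)" if looks_like_wsdl(candidate) else "looks like an endpoint"
    print(candidate, "->", label)
```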
45
The “Requestor ID” identifies the organization making the request. This is assigned by the usage provider and is their way of identifying who is asking for the usage.
46
The name of the organization making the request. This could be the library or the usage consolidation service provider. This is included in the request.
47
The email for the requestor. This is sent as part of the request to give the usage provider a way to contact the requester if something goes wrong. Note that some usage providers use this field to send the library administrator password for their administration module (although this technique is not really considered compliant).
48
This is the identifier for the library that the usage is being requested for. It is the identifier assigned by the content provider. All usage providers will require this field to contain the correct value and most will also require the Requestor to have been authorized to retrieve usage for the identified institution.
49
The name of the institution that usage is being requested for. This is sent as part of the request but is informational.
50
In this section, a “y” or “n” indicates if the corresponding reports should be retrieved. Note that only JR1 and DB1 apply for COUNTER R4.
51
In the previous section we showed you where to enter the configuration information for both a commercial client and a popular Open Source client. Somehow we knew what to enter for the Platforms in question – so where did we find this information? That is what we will talk about next.
52
The information we need to correctly configure SUSHI for a platform comes from the content provider.
53
Plus, many content providers require the institution to “activate” SUSHI harvesting for their account – this is done for security reasons to prevent just anyone from retrieving your usage.
54
All this to say that to configure a Platform for SUSHI means you have to engage with the content provider’s usage web site or their customer service team – you can’t figure it out on your own.
55
Plus what you need to do will vary by content provider.
56
If you are starting to feel discouraged, there is some assistance available in the form of the SUSHI Server Registry that is currently hosted on the NISO SUSHI web site.
57
Here is what it looks like... And here we see the entry for ACM Digital Library.
58
There are detailed instructions
59
The SUSHI Server URL is provided
60
As well as information about the Requester ID and the Customer ID (what values are expected, where to get them, etc.)
61
Now the good news is that many publishers work with 3rd party organizations for either their content hosting or their COUNTER statistics processing. Here is a list of the five most popular such services; together they represent over 75 publishers. Once you know how to activate SUSHI and find the necessary credentials for one participating publisher, the others operate in much the same manner.
62
Let’s walk through some examples from each of these usage providers, starting with MIT Press – a publisher that uses Atypon for their hosting and usage.
63
We start by accessing MIT Press Journals web site and logging in with our administrative username and password.
64
You will see a multi‐tabbed interface.
65
We want the “Institutional administration” tab. If you don’t see this, you don’t have administrative rights on your login, so contact MIT Press customer service for assistance (or find someone else within your organization who has administrative rights).
66
See the section “Retrieval via SUSHI”?
67
This is where the needed information is...
68
Zooming in a bit we see the URL for the SUSHI server, the Requestor ID and the Customer ID. If you are observant you will notice that the Requester ID is the username from the administrative login and the Customer ID is the customer number for our institution.
69
Now let’s take a look at how you activate SUSHI for publishers that use MPS Insight for their COUNTER statistics. Here is a partial list of such publishers.
• ASTM Digital Library
• astm.org
• Emerald Group Publishing Limited
• http://www.computer.org/
• IEEE Xplore
• IOPscience
• nature.com
• palgrave‐journals.com
• rsc.org
70
In this example we are using IEEE – and we have already logged in to our IEEE “Library Portal” for managing usage statistics. Select “Manage Account”.
71
Then “Manage SUSHI”
72
And now we are presented with a list of “Available SUSHI Partners”.... Here we click the edit icon to the right of the “Ex Libris” one.
73
This gives us the “SUSHI Permissions” (already checked) and we can click “Update” and the system will provide us with the credentials we need for our SUSHI client (I do not have a screen shot for that).
74
Another popular provider of COUNTER statistics services for publishers is “ScholarlyIQ”.
75
We will walk through the configuration of ingentaconnect. We have already logged in to the reporting module for ingentaconnect where we will click the “SUSHI” link on the upper‐right.
76
We are prompted to enter contact information and select the “ERM Type” – if you are using a commercial ERM or usage consolidation application it will be listed here. Then we “Click Here to Enable SUSHI Access for the Above”
77
The system then displays the credentials (as well as sends this same information in an email.)
78
Here is a closer look where you can see the SUSHI Server URL, the Requestor ID and Customer ID.
79
So we have seen how to configure a SUSHI client and know a little more about how to get the information. Now let’s harvest some COUNTER reports.
80
First we will “manually” request a COUNTER report via a commercial usage product.
81
We are back to the main screen of EBSCO Usage Consolidation.
82
We select “Upload” from the menu.
83
Where we are presented with the upload “landing page”. Under the “Usage Consolidation” section we select “Load COUNTER File”
84
And we are taken to the Load Counter Files page
85
Where we select a platform from the list
86
We have chosen ACM Digital Library and you will see on the right, instructions and other information that were entered when we configured the platform.
87
But our interest is in using SUSHI to harvest a report, so we click the orange “Harvest via SUSHI” button. This button will only show if the platform has been configured for SUSHI.
Notice we have already selected Journal Report 1 as our report.
88
We are given a simple screen where we enter the “from” and “to” months, then click the blue “Harvest File via SUSHI” button to start the process. Our report will be added to the list of loaded files...
89
We can click the “Back” button or the “Upload” button
90
To be taken back to the Upload Landing Page where we select “View All Loads”
91
To get the list of our loaded files.
92
And here is the one we just requested. Since this is an application that provides reporting and more, the requested COUNTER report is automatically loaded and processed and will be made available for reporting.
93
Now let’s follow the steps to harvest some reports via the MISO client.
94
In this section we will not only harvest a set of platforms using MISO (the ones that were in the SUSHIConfig file), we will also look at one of the XML files.
95
So MISO is a command‐line program, meaning that it doesn’t run like other Windows applications (it doesn’t have a graphical user interface). So we need to get to a command prompt to run it.
96
Start by clicking the “Start” icon on Windows
97
Next choose “All Programs” then “Accessories” then “Command Prompt”
98
And you will see something like this.
99
We need to change to the “MISO” directory so I do this with the “Change Directory” command – I type:
CD \miso
100
Now I can type my command line to ask for usage for March 2015.
Miso -d 201503 201503 -x
The “-d” specifies the date range in yyyymm format (if you leave it off it retrieves the prior month). The “-x” tells MISO to save the XML version of the file.
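If you want to schedule this rather than type it each month, the date arithmetic is the only fiddly part. Here is a small Python sketch that works out the prior month in yyyymm format and assembles the same arguments shown above. It uses only the “-d” and “-x” options described in this session; the function names are my own, and actually running the command (for example with Python's subprocess module from the MISO directory) is left out.

```python
from datetime import date


def prior_month_yyyymm(today: date) -> str:
    """Return the month before `today` as a yyyymm string."""
    year, month = today.year, today.month - 1
    if month == 0:  # January rolls back to December of the previous year
        year, month = year - 1, 12
    return f"{year:04d}{month:02d}"


def miso_arguments(begin_yyyymm: str, end_yyyymm: str, save_xml: bool = True) -> list[str]:
    """Build the command line shown above as a list of arguments."""
    args = ["miso", "-d", begin_yyyymm, end_yyyymm]
    if save_xml:
        args.append("-x")  # keep the XML version of the report
    return args


month = prior_month_yyyymm(date(2015, 4, 10))
print(" ".join(miso_arguments(month, month)))
```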
101
We wait a minute or so for the cursor to return
102
Now we can open Windows Explorer to see the new files (I have sorted the contents of the MISO directory by Date Modified so the new files appear on top.)
103
Here are our files....
104
When we went through the section on how to configure the MISO client, I pointed out how some of the elements became part of the file name. Here we can see this in action –looking at a given file we know the provider, the library, the dates requested and the report.
105
So, we have downloaded a COUNTER report via SUSHI – what does it look like?
106
It looks like “XML” that is what it looks like...
107
As we browse down a bit, there are certainly recognizable elements. But this is not really the file you want to use for your usage analysis.
108
Commercial applications will know how to handle XML and will extract the information out of the file and move it to their reporting systems. Basically you are shielded from the XML view of the file.
However, if you are planning to work with an Open Source client, like MISO, you may need some help.
109
In this next section, we will use Excel to open an XML file and turn it into something you could work with for your usage evaluation.
110
We start by opening Excel – by default we get a new workbook.
111
Then choose to “Open” a file, then select our XML file.
112
Now you get something like this. The data elements have been exploded out so each element is in its own column... And you can see that some elements are repeated.
113
Let’s scroll to the right...
114
115
116
Until finally we see ItemName (Title), Data Type (Journal), begin and end dates, metric types and counts.
117
So let’s start cleaning this file up by filtering to only those rows with a metric type of “ft_total”.
118
It is starting to look much better...
119
Now we are going to clean this up even more by selecting columns A through W and hiding them (select the columns, then right‐click on the column header and click “Hide”).
120
Now this is even better. But notice that our columns X and Y, where we would hope to see ISSNs, are empty. This is one of the issues with this simple approach to transposing XML (which allows multiple identifiers per item) into a simple table.
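For those comfortable with a little scripting, the same clean-up (keep only the “ft_total” rows, and keep the identifiers with their title) can be sketched in Python. The sample below is a heavily simplified, made-up stand-in for a JR1 file: the element names follow what we saw in the Excel columns (ItemName, metric type, begin and end dates, counts), but the exact layout and the namespace are assumptions, so compare it against one of your own files before relying on it.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up sample; the namespace is a placeholder, not the real one.
SAMPLE_XML = """
<Report xmlns="http://example.org/counter-sample">
  <ReportItems>
    <ItemIdentifier><Type>Print_ISSN</Type><Value>1234-5678</Value></ItemIdentifier>
    <ItemIdentifier><Type>Online_ISSN</Type><Value>8765-4321</Value></ItemIdentifier>
    <ItemName>Journal of Examples</ItemName>
    <ItemDataType>Journal</ItemDataType>
    <ItemPerformance>
      <Period><Begin>2015-03-01</Begin><End>2015-03-31</End></Period>
      <Instance><MetricType>ft_total</MetricType><Count>42</Count></Instance>
      <Instance><MetricType>ft_pdf</MetricType><Count>30</Count></Instance>
    </ItemPerformance>
  </ReportItems>
</Report>
""".strip()


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def children(element, name):
    return [child for child in element if local_name(child.tag) == name]


def child_text(element, name):
    matches = children(element, name)
    return matches[0].text if matches else None


def ft_total_rows(xml_text):
    """One flat row per item and period, keeping only the ft_total metric."""
    rows = []
    for item in ET.fromstring(xml_text).iter():
        if local_name(item.tag) != "ReportItems":
            continue
        # An item may carry several identifiers; index them by type.
        identifiers = {child_text(i, "Type"): child_text(i, "Value")
                       for i in children(item, "ItemIdentifier")}
        for performance in children(item, "ItemPerformance"):
            periods = children(performance, "Period")
            for instance in children(performance, "Instance"):
                if child_text(instance, "MetricType") != "ft_total":
                    continue
                rows.append({
                    "title": child_text(item, "ItemName"),
                    "print_issn": identifiers.get("Print_ISSN"),
                    "online_issn": identifiers.get("Online_ISSN"),
                    "begin": child_text(periods[0], "Begin") if periods else None,
                    "end": child_text(periods[0], "End") if periods else None,
                    "count": int(child_text(instance, "Count")),
                })
    return rows


for row in ft_total_rows(SAMPLE_XML):
    print(row)
```

Indexing the identifiers by type is what lets the ISSNs survive the flattening, which is exactly what the plain Excel import loses.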
121
However, for many publishers, the title may be enough.
122
In preparing for this presentation, I decided to see if it was possible to create a macro‐enabled Excel file that would convert a MISO‐loaded JR1 file into a proper Excel version of a JR1 report. It worked and this template is now available on the Usus web site.
123
Through the capabilities of Excel, using this tool is very easy – that is once you have pulled a JR1 report via MISO.
124
Click the big button.
125
You will be prompted to select the file to convert. Then you may have to wait a few seconds or minutes while the macro works on the file.... Larger files will take longer.
126
And the result is a NEW Excel file with the same name as the XML file but an extension of XLSX.
127
Let’s scroll through... Notice we have the ISSNs.
128
And notice that we pulled 12 months worth of data and it appears in columns. (For those of you who are paying attention, you will also notice the “Total for all Journals” row is missing ‐‐ this is easy to add if you need it, but we don’t really want that for what we are doing next.)
129
So you have your usage – what next? The most common “what next” is cost‐per‐use analysis. If you use a commercial ERM or usage product the chances are that the cost‐per‐use analysis is a built‐in feature. In this section, though, we will use the freshly loaded JR1 report we created from the MISO XML and add cost‐per‐use to it.
130
Again we are using Excel.
131
Now, I have already added some other worksheets to this Excel file – you can see the “Cost” and “Packages” tabs here.
132
In the “Cost” tab, I have provided a simple title‐listing with Title Name, ISSNs and cost. This is fake data for cost so I set all of the values to $999.
133
And I have added the “Packages” tab to show another trick we can use to provide a simple package analysis. In this case, I have listed the package a title is found in.
134
OK... Let’s go back to the COUNTER worksheet and insert some columns and do some calculations.
135
The first thing I do is scroll the data so that row 8 is on the top of the screen. I then select the body of the report and choose the option to “Format as Table”, making sure the “My table has headers” box is checked. I format as a table because it makes other parts of this exercise easier.
136
Next I select cell B9 and choose to “Freeze Panes” (you will find this option under “View” menu). This is more of a personal preference, but I find it easier if the header row and the journal name stay on the screen as I scroll around the report.
137
Next I insert a new Column after Reporting Period Total.
138
I will name this column “Cost”
139
Then I add a formula that uses “VLOOKUP” to use the journal name and look up the corresponding cost from the cost worksheet.
140
Here is that formula. Notice the “IFERROR()” function that surrounds the VLOOKUP – this simply tells Excel to put in an empty value (“”) if the VLOOKUP fails to find a match.
An alternative method is using the “SUMIF()” function... In the following we are checking for an ISSN and, if found, adding up all the cost for that ISSN; otherwise we add up all the cost for the title.
=IFERROR(IF(F9 <> "",SUMIF(Cost!C:C,F9,Cost!D:D),SUMIF(Cost!A:A,A9,Cost!D:D)),"")
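If it helps to see the logic of that formula outside of Excel, here is a rough Python equivalent. It is a sketch only: the tuples stand in for rows of the “Cost” worksheet (title, ISSN, cost), the data is made up, and the comparison here is an exact string match.

```python
def lookup_cost(title, issn, cost_rows):
    """Sum the cost by ISSN when one is present, otherwise by title.

    cost_rows is a list of (title, issn, cost) tuples standing in for
    the Cost worksheet.
    """
    if issn:
        # Same idea as SUMIF over the ISSN column.
        return sum(cost for _, row_issn, cost in cost_rows if row_issn == issn)
    # No ISSN on the usage row: fall back to matching on the title.
    return sum(cost for row_title, _, cost in cost_rows if row_title == title)


cost_rows = [
    ("Journal A", "1111-1111", 999.0),  # made-up cost data
    ("Journal B", "", 500.0),
]
print(lookup_cost("Journal A", "1111-1111", cost_rows))
print(lookup_cost("Journal B", "", cost_rows))
```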
141
Of course we really want Cost‐per‐Use so we will insert another column to the right of “Cost”.
142
And name it “Cost‐per‐Use”
143
Then enter the formula to calculate cost‐per‐use.
144
Here is the close‐up of the formula. Again, you can see the “IFERROR()” function surrounding the calculation. With this, if the division of cost by usage generates an error (as would happen if usage was zero), the full “Cost” value will be substituted instead.
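The same rule, written as a small Python function for anyone scripting this instead of using Excel. The function name is my own; the behaviour mirrors what is described above, where a title with zero usage shows its full cost.

```python
def cost_per_use(cost: float, uses: int) -> float:
    """Cost divided by usage; with zero usage, fall back to the full cost."""
    try:
        return cost / uses
    except ZeroDivisionError:
        # Mirrors the IFERROR() fallback: show the whole cost for unused titles.
        return cost


print(cost_per_use(999.0, 3))
print(cost_per_use(999.0, 0))
```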
145
Next scroll to the bottom of the table where we have added Totals for usage and cost and an average for cost‐per‐use.
146
OK... Now let’s see how we can add package analysis to this file. We will add a new column to the right of the Cost Per Use column.
147
And name it “Package”
148
Again, we use the VLOOKUP() function to use the journal name to look up the package name from the “Packages” worksheet.
149
Here again we use IFERROR() to make the cell blank if there is no matching package.
150
Now we are able to filter our file by Package – here we have selected “Pkg A”
151
And now when we scroll to the bottom, we see totals for the package.
152
And, if you want to get fancy, Excel offers the ability to create a Pivot Table (Select “Insert” then “Pivot Table”). Here you can see the package breakdown with title counts, total usage, total cost and cost‐per‐use.
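The package breakdown a Pivot Table gives you can also be sketched in Python. The rows below are made-up data, and one assumption is worth flagging: the package cost-per-use here is total cost divided by total usage for the package, which is not the same thing as averaging the per-title cost-per-use values.

```python
from collections import defaultdict

# (title, package, uses, cost) - made-up rows standing in for the worksheet
rows = [
    ("Journal A", "Pkg A", 100, 999.0),
    ("Journal B", "Pkg A", 0, 999.0),
    ("Journal C", "Pkg B", 50, 999.0),
]


def package_summary(rows):
    """Title count, total usage, total cost and cost-per-use for each package."""
    totals = defaultdict(lambda: {"titles": 0, "uses": 0, "cost": 0.0})
    for _title, package, uses, cost in rows:
        entry = totals[package]
        entry["titles"] += 1
        entry["uses"] += uses
        entry["cost"] += cost
    summary = {}
    for package, entry in totals.items():
        # With no usage at all, report the full cost rather than dividing by zero.
        per_use = entry["cost"] / entry["uses"] if entry["uses"] else entry["cost"]
        summary[package] = {**entry, "cost_per_use": per_use}
    return summary


for package, figures in package_summary(rows).items():
    print(package, figures)
```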
153
OK... So let’s wrap up with a few random items.
154
We mentioned this earlier, but it is important to remember that some platforms may take up to a month to make their prior month’s usage available.
COUNTER allows up to a month...
155
The delay will vary by Platform so you will need to get to know your platforms.
156
And if you try to harvest too early, your SUSHI client will get an exception.
157
Here is an example of such an exception....
158
Zooming in on the number (3030) and the message “No usage available”
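If you script your harvesting, it is worth treating this particular exception differently from the others, since it usually just means "try again later". The Python sketch below assumes your client has already pulled the exception numbers and messages out of the response as simple pairs; how that is done depends on the client, and the function name and return labels are made up for illustration.

```python
NO_USAGE_AVAILABLE = 3030  # the exception number shown on the slide


def classify_response(exceptions):
    """Decide what to do next from a list of (number, message) pairs."""
    if not exceptions:
        return "ok"
    if any(number == NO_USAGE_AVAILABLE for number, _message in exceptions):
        # Usage is not ready yet: wait a few days and request it again.
        return "retry_later"
    # Anything else needs a person to look at it (credentials, URL, and so on).
    return "investigate"


print(classify_response([(3030, "No usage available")]))
print(classify_response([]))
```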
159
So what are the key points we want you to take away from this session?
• SUSHI is a protocol and not an application.
• You need a “client” application to harvest reports.
• Configuring the client for a given platform requires interaction with the publisher/content provider – either their customer service staff or their web site.
• SUSHI will return the COUNTER report as XML.
• If you don’t have a commercial application, you can use Excel to produce analysis reports – with a bit of work.
160