CLAS data dictionary

Upload: thomas-dang

Post on 17-Feb-2017

Page 1: Clas data dictionary

CLAS Report Interpretation Guide Introduction

Each row in the comprehensive report represents all the activities of one user either in a course (all videos aggregated together) or in each video in the course.

The subsequent slides are a list of all the data types (columns) provided in the comprehensive reports.

Some of these columns are only available when the data is aggregated per course, and some are only available when aggregated per video.

Request for comment: if you believe that a certain data type not yet presented here would be valuable to collect, feel free to contact [email protected]!

Data dictionary: identification columns

Originating Host: Only available in anonymized reports, for aggregating CLAS data from multiple installations.

Instance Name: Only available in anonymized reports. As each CLAS installation has a multi-tenant architecture, the instance_name column can be used to distinguish between these virtual instances (tenants).

Course ID: a unique number that denotes a particular course, in a particular section (taught by a particular instructor), in a particular year and season. For example, MATH 100 201 2014W may be course ID 23.

Course Name: alpha-numeric course name without the academic year and season suffix. This column contains a randomized code when the report is anonymized.

Academic year: the academic year and season (W for normal term, S for summer term) in which this course was offered. Using the course name (or anonymized course name) and year, you can view the progression of the same course over time.

User ID: a numeric user ID. This number is positive for users with an account in CLAS, but can be negative. Negative, random user IDs are attached to posts made by anonymous visitors if the video is shared publicly. Each specific negative user ID denotes a single session by a particular IP, making it possible to distinguish between multiple anonymous users.
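As an illustration of how the sign of the user ID can be used, the Python sketch below separates registered users from anonymous sessions. The key name `user_id` and the sample rows are assumptions made for illustration; the actual column header in an exported report may differ.

```python
def split_by_account(rows):
    """Split report rows into registered users and anonymous sessions."""
    registered, anonymous = [], []
    for row in rows:
        # Assumed key "user_id"; a negative value marks an anonymous session.
        if int(row["user_id"]) < 0:
            anonymous.append(row)
        else:
            registered.append(row)
    return registered, anonymous


# Made-up sample rows, not real report data.
rows = [
    {"user_id": "42", "total_posts": "3"},
    {"user_id": "-1873", "total_posts": "1"},
    {"user_id": "-22", "total_posts": "2"},
]
registered, anonymous = split_by_account(rows)
print(len(registered), len(anonymous))  # 1 2
```

Because each negative ID stands for one session rather than one person, counts of anonymous rows are best read as session counts.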

Name: user’s full name, alpha-numeric. This column is removed when the report is anonymized.

Video ID: alpha-numeric video ID, only available if the report is aggregated by video. If the report is anonymized, this will be a randomized code that cannot be used to search for the video inside CLAS.

Video title: alpha-numeric video title or video file name, only available if the report is aggregated by video. This column is removed in anonymized reports.

Data dictionary: Columns describing the inherent characteristics of the course, video, or user

Subject: 3- or 4-letter subject code extracted from the course name, such as MATH or POLI (political science). “N/A” if it cannot be extracted.

Year Lv: an estimate of the year level of the course, based on the course number. This is usually but not always accurate, and it also depends on how each university names its courses. It will be “N/A” if the year level cannot be extracted from the course code or if there is no course code (e.g. a video collection with a bespoke name).
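The guide does not spell out the exact extraction rules for Subject and Year Lv. The Python sketch below shows one plausible version, under the assumption that a course name starts with 3 or 4 capital letters followed by a course number whose first digit is the year level. The pattern and the function name `subject_and_year_level` are hypothetical, not taken from CLAS.

```python
import re

# Hypothetical pattern: 3-4 capital letters, optional space, then a course number.
COURSE_CODE = re.compile(r"^([A-Z]{3,4})\s*(\d)\d*")


def subject_and_year_level(course_name):
    """Return (subject, year level), using "N/A" when nothing can be extracted."""
    match = COURSE_CODE.match(course_name.strip())
    if match is None:
        return "N/A", "N/A"
    # Assumption: the first digit of the course number is the year level.
    return match.group(1), match.group(2)


print(subject_and_year_level("MATH 100"))                  # ('MATH', '1')
print(subject_and_year_level("POLI 321"))                  # ('POLI', '3')
print(subject_and_year_level("Guest Lecture Collection"))  # ('N/A', 'N/A')
```

A rule like this breaks down for institutions that number courses differently, which is why the column is described as an estimate.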

Is annotating available: is the annotation feature available for this course, “YES” or “NO”?

Is commenting available: is general commenting available for this course, “YES” or “NO”? If both annotating and commenting are disabled, then this course utilizes CLAS simply as a video library and player.

Role: role of this user in this course; can be “student” or “instructor/TA.”

User’s own video: Is this video about this user, “YES” or “NO”? This is meaningful for courses where the videos are students’ assignments. Some instructors have students film and upload the videos themselves, which makes statistics collection simple. However, it can be logistically more convenient to film at a central location and have the videos uploaded by instructors/TAs, so a heuristic is used to identify whether a video is “about” a student. A video is deemed “about” a student if either that video is viewable only in a private group (handin-box) belonging to this student, or the full name of this student appears in the title or description fields.
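The two rules of this heuristic can be sketched in Python as follows. This is only an illustration of the description above, not the CLAS implementation: the function name, its parameters, and the case-insensitive name matching are all assumptions.

```python
def is_users_own_video(full_name, title, description, private_group_owner=None):
    """Return "YES" if the video is deemed "about" the user, else "NO"."""
    if not full_name.strip():
        return "NO"  # without a name, neither rule can be applied
    # Rule 1: the video is viewable only in a private group owned by this user.
    # private_group_owner is a hypothetical field holding that owner's full name.
    if private_group_owner == full_name:
        return "YES"
    # Rule 2: the user's full name appears in the title or description.
    # Case-insensitive matching is an assumption of this sketch.
    needle = full_name.casefold()
    if needle in title.casefold() or needle in description.casefold():
        return "YES"
    return "NO"


print(is_users_own_video("Jane Doe", "Presentation 3 - jane doe", ""))    # YES
print(is_users_own_video("Jane Doe", "Lecture 5", "Week 5 recording"))    # NO
print(is_users_own_video("Jane Doe", "Assignment 2", "", "Jane Doe"))     # YES
```

Name matching of this kind can misfire when two students share a name or when a title mentions a student in passing, so the column is best treated as a hint.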

Shared to: who the video was shared with, i.e. who can view this video. Can be “everyone (in this course)”, “multi-user group”, “a single student”, or “the general public”.

Data dictionary: usage metrics columns (1/3)

Num logins: number of times this student has logged in during the entire duration of this course. Note that if a student uses CLAS for more than one course in the same time period, then the login count is shared among those courses.

Total posts authored: total number of public posts by this user in this particular video (posts can be annotations or comments). This value is “N/A” if BOTH annotations and comments are turned off for this course and CLAS is used only as a video player/library.

Annotations authored: total annotations. Annotations are text notes, links, or video clips attached to a specific point on the video timeline. Annotations can be replied to, creating a thread at a specific point on the video. This value is “N/A” if annotations are turned off for this course, so that students can only make general comments.

Aggregated annot content: text content of all annotations by this user in this video. Note the timestamps in [ ] in front of each annotation. This value is “N/A” if annotations are turned off.

Comments authored: total comments. Comments are posts that are not anchored to the timeline but refer to the video as a whole. This value is “N/A” if comments are turned off for this course.

Aggregated comment content: all comment content by this user in this video, concatenated. This value is “N/A” if comments are turned off.
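When processing these columns, it helps to keep “N/A” distinct from zero: “N/A” means the feature was turned off, whereas 0 means the feature was available but unused. A minimal Python sketch, assuming the cells have been read as strings and using the hypothetical helper name `parse_metric`:

```python
def parse_metric(value):
    """Return an int for a numeric cell, or None when the cell is "N/A"."""
    text = value.strip()
    if text.upper() == "N/A":
        return None  # feature disabled for this course, not a zero count
    return int(text)


# Made-up cell values for illustration.
cells = ["3", "N/A", "0"]
parsed = [parse_metric(cell) for cell in cells]
print(parsed)  # [3, None, 0]
```

Keeping `None` separate from 0 prevents courses that use CLAS only as a video player from dragging down averages of posting activity.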

Data dictionary: usage metrics columns (2/3)

Total public posts length: word count of all public posts (annotations & comments) that this user makes in this video. Note that the word count does not include the timestamps in the brackets. “N/A” if both annotations and comments are turned off.

Avg public post length: average word count of all public posts. “N/A” if both annotations and comments are turned off.
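To make the two length columns concrete, the Python sketch below counts words while skipping a leading bracketed timestamp, then derives the average. The timestamp format shown and the whitespace-based definition of a word are assumptions; the guide only states that the bracketed timestamps are excluded.

```python
import re

# Assumption: a timestamp is a single [...] group at the start of a post.
LEADING_TIMESTAMP = re.compile(r"^\s*\[[^\]]*\]")


def word_count(post):
    """Count whitespace-separated words, ignoring a leading [timestamp]."""
    return len(LEADING_TIMESTAMP.sub("", post).split())


# Made-up posts: two annotations with timestamps and one general comment.
posts = [
    "[00:01:23] Nice explanation of limits",
    "[00:04:10] I disagree",
    "Great video overall",
]
total_length = sum(word_count(post) for post in posts)
average_length = total_length / len(posts)
print(total_length, average_length)  # 9 3.0
```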

Peer referrals: number of times this user, in this video, refers to another user, either by replying to a post or by writing @user. “N/A” if both annotations and comments are turned off.

Private notes authored: number of private notes (visible ONLY to the author or to another user whom the author referred to). The full text of private notes is not provided in the dataset. “N/A” if both annotations and comments are turned off.

Total private notes length: word count of all private notes by this user, in this particular video in this course. “N/A” if both annotations and comments are turned off.

Avg private note length: average word count of private notes. “N/A” if both annotations and comments are turned off.

Num posts read: Number of posts, both annotations and comments, in this video that this user has read. “N/A” if both annotations and comments are turned off.

An annotation is considered read when its hover preview or editing panel is displayed for at least 3 seconds. A comment is considered read when the user scrolls over it slowly. Scrolling fast or moving the mouse past a comment will not mark that comment as read.

Data dictionary: usage metrics columns (3/3)

Video views from start: view events triggered by pressing PLAY at 0

Video seeks from middle: view events triggered by seeking back and forth in the middle of a video

Total video views: total number of view events: a sum of “video views from start” and “video seeks from middle”
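Since the total is defined as a sum of the two preceding columns, a report row can be sanity-checked after import. The Python sketch below assumes hypothetical key names derived from the column titles; the headers in an actual export may differ.

```python
def views_are_consistent(row):
    """Check that total views equal views from start plus seeks from middle."""
    start = int(row["video_views_from_start"])
    seeks = int(row["video_seeks_from_middle"])
    return int(row["total_video_views"]) == start + seeks


# Made-up row for illustration.
row = {
    "video_views_from_start": "4",
    "video_seeks_from_middle": "11",
    "total_video_views": "15",
}
print(views_are_consistent(row))  # True
```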

Average duration per view: how many seconds on average does this user let this video play whenever he/she presses PLAY or SEEK

Total video duration: the total length of this particular video, in seconds

Percentage coverage: the percentage of this particular video that the student has viewed (in the “aggregate by video” report only)

Average coverage: percentage view coverage on average across all the videos in this course (for the “aggregate by course” report only)
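The guide does not state how CLAS computes coverage internally. One plausible reading, sketched below in Python, is the share of the video timeline covered by the union of all watched intervals, so that rewatching the same segment does not count twice. The interval representation and the function name `percentage_coverage` are assumptions made for illustration.

```python
def percentage_coverage(intervals, duration):
    """Percent of a video covered by the union of (start, end) intervals, in seconds."""
    covered = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        # Clamp each interval to the video's timeline.
        start, end = max(0.0, start), min(duration, end)
        if end <= start:
            continue
        if current_end is None or start > current_end:
            # Gap before this interval: close the previous merged run.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the current merged run.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / duration


# Made-up viewing data: overlapping views of a 120-second video.
video_a = percentage_coverage([(0, 30), (20, 60), (90, 120)], 120)
video_b = percentage_coverage([(0, 50)], 200)
average_coverage = (video_a + video_b) / 2
print(video_a, video_b, average_coverage)  # 75.0 25.0 50.0
```

Under this reading, “Average coverage” in the per-course report is the mean of the per-video percentages.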

Num videos viewed: number of videos this user viewed

Num videos can view: number of videos in this course that this user CAN view. For courses where the videos are lectures and other resources shared to the whole class, this number will be the same as “Total videos in course”, but some courses use videos for group or personal activities, and it is fairer to compare the number of videos a student viewed with what they actually could view
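The comparison described above can be sketched in Python as a simple ratio. The function name and the sample numbers are hypothetical and only illustrate why the denominator matters.

```python
def viewed_fraction(num_videos_viewed, num_videos_can_view):
    """Fraction of viewable videos that the user actually viewed."""
    if num_videos_can_view == 0:
        return None  # nothing was viewable, so the ratio is undefined
    return num_videos_viewed / num_videos_can_view


# Made-up example: a student who viewed 3 of the 4 videos shared with them,
# in a course that holds 12 videos in total.
print(viewed_fraction(3, 4))   # 0.75
print(viewed_fraction(3, 12))  # 0.25
```

Dividing by “Total videos in course” instead would make this student look far less engaged than they were.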

Total videos in course: number of videos in this course in total

Caveats

• Unless necessary for confidentiality, do not delete all the videos from a course at the end of each term, or clean out all the annotations and move the same videos to a new term. This will make retroactive report generation impossible.

• Any videos that were uploaded to CLAS can be duplicated with the duplication tool in the video management page. Only videos imported from external URLs (YouTube, Mediasite, Dropbox, etc.) cannot be duplicated (yet).

• Even better, contact CLAS support to “archive old course and recreate for new term”. This will automatically duplicate all non-private videos into the new term, preserving retroactive usage data and also allowing students of the old course to retain access to the resources. This is the recommended method of life cycle management for CLAS courses.