Teacher-moderated and student-moderated e-learning
J. EDUCATIONAL COMPUTING RESEARCH, Vol. 40(2) 183-210, 2009
EXAMINING ONLINE LEARNING PATTERNS WITH
DATA MINING TECHNIQUES IN PEER-MODERATED
AND TEACHER-MODERATED COURSES
JUI-LONG HUNG
Boise State University
STEVEN M. CROOKS
Texas Tech University
ABSTRACT
The student learning process is important in online learning environments.
If instructors can “observe” online learning behaviors, they can provide
adaptive feedback, adjust instructional strategies, and assist students in
establishing patterns of successful learning activities. This study used
data mining techniques to examine and compare learning patterns
between peer-moderated and teacher-moderated groups from a recently
completed experimental study (Zhang, Peng, & Hung, 2009). The online
behaviors of the students from the Zhang et al. study were analyzed to
determine why teacher-moderated groups performed significantly better than
peer-moderated groups. Three data mining techniques—clustering analysis,
association rule analysis, and decision tree analysis—were used for data
analysis. The results showed that most students in the peer-moderated
condition had low participation levels and relied on student-content
interaction only. On the other hand, teacher presence promoted student
interaction with multiple sources (content, student, and teacher). The findings
demonstrate the potential of data mining techniques to support teaching and
learning.
PROBLEM STATEMENT
One of the advantages of WWW-based instruction is that student learning
behaviors can be tracked and recorded as they occur. This capability enables
© 2009, Baywood Publishing Co., Inc.
doi: 10.2190/EC.40.2.c
http://baywood.com
instructors to provide adaptive feedback, adjust instructional strategies, and
assist students in establishing a pattern(s) of successful learning activities.
Many studies have examined teaching presence and community
(Anderson, Rourke, Garrison, & Archer, 2001; Shea & Bidjerano, 2009;
Shea, Li, & Pickett, 2006; Shea, Pickett, & Pelz, 2003), online learning styles
(Akdemir & Koszalka, 2008; Garland & Martin, 2005; Liu, Magjuka, & Lee,
2008; Neuhauser, 2002), and online learning processes (Fernandez, Marin, &
Wirz, 2007; Macdonald, 2003; Thomas & Macgregor, 2005). These studies relied
on self-report data as their major data source. Because self-report measures are
subject to response distortion and interpretive coding is subject to possible bias
(Donaldson & Grant-Vallone, 2002), this approach cannot reliably provide instructors
with accurate information for decision-making in response to the dynamics of
online learning.
Instructors and researchers need a way to track and study online learning
activities as they are occurring. Server logs stored in a learning management
system (LMS, hereafter) are the most feasible data sources for studying online
learning behaviors and patterns. Currently, almost all learning management
systems provide basic statistics of student usage data such as frequency and
duration of access. These basic statistics derived from server logs can provide
an overview of an individual student or an online course. However, these
statistics cannot provide sequential learning behaviors to construct a learning
pattern for each student. Recent developments in data mining techniques have
provided new and better methods for researchers to extract knowledge from
raw data in the server logs. Such techniques have been widely used in
various business fields as tools to study consumer behaviors (Yada, 2007), to
predict profit/loss scenarios (Kirkos, Spathis, & Manolopoulos, 2007), and to
provide decision-making support (Michalewicz, Schmidt, Michalewicz, &
Chiriac, 2007).
Zhang, Peng, and Hung (2009) investigated performance and attitude differ-
ences between students in online peer-moderated and online teacher-moderated
discussions. Their results revealed that students in teacher-moderated groups
reported higher satisfaction and achieved higher final grades than students in
peer-moderated groups. However, the Zhang et al. study provided no
information about the en route learning behaviors that differed between these two
conditions. This study re-examined the Zhang et al. study with data mining
techniques to better understand the types of online behavior that affect satisfaction
and performance. By using data mining techniques, instructors and researchers
can identify online learning behavior patterns and obtain information for
decision making. Researchers can apply these techniques to identify important
behavioral indicators for predicting learning performance. LMS vendors can
plan to integrate data mining functions with learning management systems to
complement online learning management, facilitation, and design.
184 / HUNG AND CROOKS
PURPOSE OF STUDY
The purpose of this study was to re-examine the course used in the Zhang et al.
(2009) study. Data mining techniques were applied to extracted LMS server
logs to analyze and compare student online learning behaviors between peer-
moderated and teacher-moderated groups in an undergraduate online collabor-
ative project-based learning course in Taiwan.
RESEARCH QUESTIONS
This study aims to answer the following questions that grew from the Zhang
et al. (2009) study:
1. What overall similarities and differences in learning behaviors exist
between the peer-moderated and the teacher-moderated classes?
2. What successful and unsuccessful student characteristics are exhibited in
the peer-moderated and the teacher-moderated classes?
3. What unique daily learning patterns are exhibited within the peer-
moderated and the teacher-moderated classes?
4. What are the most important indicators for predicting student learning
performance?
DATA MINING TECHNIQUES
Roiger and Geatz (2003) regarded Data Mining (DM) as the process of
employing one or more computer-learning techniques to analyze and extract
knowledge from data contained within a database. DM techniques fall into two
categories (Chen, Sakaguchi, & Frolick, 2000): descriptive techniques (e.g.,
summarization, clustering, and association rules), which are used to evaluate
hypotheses, and predictive techniques, often referred to as machine learning
with artificial intelligence technologies (e.g., decision trees and neural
networks), which are used to build predictive models (Tseng, Tsai, Su, Tseng, &
Wang, 2005).
A brief introduction to these techniques follows.
Descriptive Techniques
1. Summarizing (Roiger & Geatz, 2003): A population of numerical data is
uniquely defined by a mean, a standard deviation, and a frequency or probability
distribution of values occurring in the data. For example, to compare student
performance between two classes, summarizing provides an overview of the data
before further knowledge exploration.
2. Clustering (Maimon & Rokach, 2005): Clustering is a technique used to
classify instances based on their similar characteristics. For example, to classify
student characteristics one might include gender, age, learning styles, and test
scores. The purpose of clustering analysis is to describe the student-shared
characteristics and not to build a relationship model between dependent and
independent variables.
3. Association rules (Roiger & Geatz, 2003): Association rule techniques, also
called market basket analysis (Maimon & Rokach, 2005), are used to discover
interesting associations among attributes contained in a database. The application
of association rule mining is to find related products by analyzing the content of
the customer’s market basket in order to find product associations. For example,
70% of female undergraduate students who own a desktop at home will also take
at least one online course. The purpose is to find associated products within the
set of offered products, as a support for marketing decisions.
Predictive Techniques
1. Decision trees (Breiman, Friedman, Olshen, & Stone, 1984): In data mining,
decision tree analysis is used as a predictive model to match observations with
outcomes. This predictive model is presented as a tree structure in which leaves
represent classifications and branches represent conjunctions of features that lead
to those classifications. For example, a teacher could use a decision tree to predict
student retention rate in an online class. After some literature reviews that identify
key factors influencing student retention, the teacher creates a decision tree using
learning styles, occupation, and age to predict the student retention rate.
2. Neural networks (Roiger & Geatz, 2003): A neural network is a mathe-
matical model that mimics biological neural networks. It consists of an inter-
connected group of artificial neurons and processes information using a con-
nectionist approach. A neural network is an adaptive system that changes its
structure based on external or internal information that flows through the network
during the learning phase. For example, to help an online program identify potential
students based on their demographic information, a neural network can “learn”
from existing students’ demographic data. The trained network can then be used
to identify potential students.
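The decision tree item above can be made concrete with a minimal Python sketch that finds the single best threshold split (a one-level tree, or "stump") by Gini impurity, the splitting criterion commonly associated with CART (Breiman et al., 1984). The toy data (weekly logins and retention labels) and the function names are invented for illustration and do not come from any study; a full tree would apply this kind of split recursively.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best 'value <= threshold' split."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: weekly logins and whether the student stayed enrolled.
logins = [1, 2, 3, 8, 9, 10]
retained = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(logins, retained)
print(threshold, impurity)
```

In this toy case the two classes separate cleanly, so the chosen split leaves both branches pure.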
ONLINE LEARNING BEHAVIORS
Web Usage Mining (WUM, hereafter) is a set of data mining techniques used
to describe or predict online navigation patterns through weblog data. The extrac-
tion of students’ navigation patterns can be a valuable tool to understand how
students learn and to evaluate the design and effectiveness of the online learning
environment. WUM is composed of three core phases: data preprocessing, data
mining, and pattern analysis (Becker, Vanzin, Marquardt, & Ruiz, 2006). The
LMS provides an instructional platform consisting of content management, com-
munication tools, assignment submission systems, and various accessories such as
a white board, an online quiz, and a calendar. Unlike online consumer platforms,
LMS platform components make it inherently more difficult to obtain information
about user behavior. Typically, WUM can be categorized into two classes of user
behavior: usage and navigation (Becker et al., 2006). Usage behavior focuses
on how resources in the site are used to perform learning activities. It allows the
researcher to characterize student learning progress based on the contents they
read, the tools they used, and how these resources are combined to achieve
goals or acquire competencies. The navigation behavior focuses on investigating
frequent or infrequent usage paths and groups of students with similar access
characteristics. WUM has been applied widely in the e-commerce domain
(Riecken, 2000) and has been extended to other domains such as distance education
(Becker & Vanzin, 2003; Machado & Becker, 2003; Zaiane, 2001). However,
because an LMS builds up a distributed online learning environment with
many functions, analyzing usage and navigation behaviors in an e-learning
environment is more complicated than that in e-commerce.
Becker et al. (2006) investigated an intensive extracurricular course that
lasted 11 days and involved 15 students. The study focused on understanding
the potential for using two WUM tools (O3R and LogPrep) to analyze the
effectiveness of the web-based learning environment. The study also evaluated the
usability of these two tools within the educational data-mining domain. Three
phases of data mining procedures were conducted for data analysis: the pre-
processing phase, the data mining phase, and the pattern analysis phase.
The purpose of the preprocessing phase was to create datasets for data mining
analysis. The data mining phase applied association rules, clustering, and clas-
sification techniques. Association rules were used to find which content areas
students tended to access together, or which combination of tools students usually
explored during their learning processes. Clustering techniques grouped instances
with similar characteristics.
The final phase, pattern analysis, involved pattern retrieval and pattern inter-
pretation. A large number of interesting patterns were identified and interpreted
by the researchers who filtered the data using domain knowledge and the per-
ceived value of particular data points. The results of their study revealed that
WUM can be used to help monitor, understand, and evaluate student learning
behaviors. Becker et al. (2006) also concluded that the O3R is a suitable tool
for association and sequential association analysis, whereas LogPrep is more
suitable for performing preprocessing and clustering tasks.
METHODOLOGY
Participants
Ninety-eight college freshmen in 34 learning groups at a 4-year, vocational-
track university in Taiwan participated in a series of online learning activities for
6 weeks in a project-based learning environment. The teacher-moderated and
peer-moderated conditions each consisted of 17 groups. Within each condition,
students were randomly assigned to 15 groups of three and two groups of two. Each
group was asked to investigate and report on a different, ill-structured problem.
The ill-structured problems required knowledge and skills in survey design,
statistical analysis, advanced Excel applications, and report writing.
A demographic survey investigated participants’ prior online learning experience
and software skill knowledge of Microsoft Office and the Internet. The results
showed that 79.56% of the participants had not taken an online course, 12.24%
had taken one online course, and 8.2% had taken more than one online course. The
participants’ prior software skill knowledge scores ranged from 11 to 54 (1-20
novice; 21-40 skilled; 41-60 expert). The results revealed an average score
of 26.61, indicating a basic level of software skill knowledge. There were no
significant differences in prior online learning experience and software knowl-
edge between the teacher-moderated and the peer-moderated conditions. Partici-
pants in the peer-moderated condition had taken an average of 2.51 blended
or fully online courses and had an average score of 28.02 (skilled) on prior
software skill knowledge. Participants in the teacher-moderated condition had taken
an average of 2.53 blended or fully online courses and had an average score of 25.20
(skilled) on prior software skill knowledge.
Materials
The participants collaborated within an LMS with their group members to finish
an assigned project in 6 weeks. Each project consisted of a different ill-defined
problem (e.g., investigate students’ Internet use behaviors) that needed to be
answered through survey investigation. Groups were engaged in learning activ-
ities such as determining and defining the proper scope of a research topic,
designing and developing appropriate information gathering instruments, col-
lecting data with self-created instruments, analyzing data, and writing and pre-
senting a comprehensive report on the investigated phenomenon. Each group
had 2 weeks for questionnaire design and investigation, 2 weeks for statistical
analysis with Microsoft Excel, and 2 weeks for visual presentation and report
writing. All learning materials, such as computer-based software training videos,
instructional files, and task descriptions, could be downloaded through the LMS.
Students were assessed on a report task and a skill test (Zhang et al., 2009).
Two instructors followed a common evaluation rubric to evaluate the group
reports. In order to evaluate group critical thinking in the final report of their
collaborative learning activity, the rubric followed the principles of “Holistic
Critical Thinking” (Facione & Facione, 1994; Schamber & Mahoney, 2006;
Stevens & Levi, 2005). The final grade on the group reports came from the
average score given by two instructors and a peer evaluation score. The skill
test was evaluated by a standardized test called Techficiency Quotient Certification
(TQC). The TQC test was developed by the Computer Skill Foundation, an
organization that tests technical skills and issues certificates
(CSF, 2005). The TQC certification is very similar to the Microsoft Office User
Specialist (MOUS) certification.
Apparatus or Measures
The course was delivered online through Wisdom Master, the most widely
applied LMS in Taiwan’s higher education institutions. All learning materials,
communication tools, assignment submissions, and course announcements were
available in the LMS. All students’ online activities were recorded in the LMS
database system. All server logs, in the LMS database system, could be extracted
by Structured Query Language (SQL). The extracted data were cleaned and
processed, then stored in a compatible format for further analysis. Oracle 10G
Express and Microsoft Excel 2003 were used to perform these tasks. Analytical
data processes included descriptive and predictive techniques, which were performed
with SPSS¹, Knime², and Weka 3³. SPSS was used to execute descriptive
analysis and data exploration. Knime and Weka 3 were used to perform descriptive
and predictive analysis.
Procedure
The mining procedure consisted of three phases: data preprocessing, data
mining, and pattern analysis. These followed the Becker et al. (2006) WUM
processes.
Data Preprocessing Phase
The first step in the data preprocessing phase was to create reduced log files
by removing information irrelevant to the analysis (e.g., IP addresses and browser types)
present in the common log files of the LMS. The second step in the preprocessing
phase used a session filter that was applied to the reduced log files for feature
extraction. The aim of the session filter was to aggregate all user requests within
a session into a single set of features (variables). A session is a central notion
in network-mediated learning. A session is technically a sequence of web log
entries that reflects the interaction behavior of a learner within a period of active
study. In regular conditions, a session starts when a student begins interacting
with the LMS and ends when the student presses the exit button. However, under
some conditions, a student might abort the session by closing the web browser,
or might leave the browser open without doing anything. In such cases, the LMS would
end the session automatically after 20 minutes of inactivity.
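The session filter described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the log layout (user, timestamp, action), the action name "logout", the sample date, and the function name `sessionize` are all assumptions. Only the 20-minute inactivity limit comes from the text. A gap longer than the limit between two requests is treated as evidence that the previous session had already ended.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=20)  # inactivity limit stated in the text

def sessionize(entries, timeout=TIMEOUT):
    """Group (user, timestamp, action) log entries into per-user sessions.

    A new session starts at a user's first request, after an explicit
    'logout', or when the gap since the previous request exceeds `timeout`.
    """
    sessions = []
    open_session = {}  # user -> index of that user's open session, or None
    for user, ts, action in sorted(entries, key=lambda e: (e[0], e[1])):
        idx = open_session.get(user)
        if idx is None or ts - sessions[idx]["end"] > timeout:
            sessions.append({"user": user, "start": ts, "end": ts, "hits": 0})
            idx = len(sessions) - 1
            open_session[user] = idx
        session = sessions[idx]
        session["end"] = ts      # last observed request in the session
        session["hits"] += 1     # user hit count
        if action == "logout":
            open_session[user] = None
    return sessions

def at(hhmm):
    """Helper for the hypothetical sample log below (the date is arbitrary)."""
    return datetime.strptime("2009-03-02 " + hhmm, "%Y-%m-%d %H:%M")

log = [
    ("s01", at("09:00"), "login"),
    ("s01", at("09:05"), "access course materials"),
    ("s01", at("09:40"), "post on"),   # 35-minute gap: new session
    ("s02", at("10:00"), "login"),
    ("s02", at("10:10"), "logout"),
    ("s02", at("10:12"), "login"),     # after logout: new session
]
sessions = sessionize(log)
for s in sessions:
    print(s["user"], s["hits"], s["end"] - s["start"])
```

Each resulting record carries the kinds of variables the session filter is said to produce: a user identifier, start and end times, a hit count, and (as end minus start) a session duration.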
¹ Further information can be found at http://spss.com/
² Further information can be found at http://www.knime.org/
³ Further information can be found at http://www.cs.waikato.ac.nz/ml/weka/
The session filter was used to extract the following primary and derived
variables: a user identifier, a session identifier, start date and times, end date and
times, user hit counts, and session duration times (Becker et al., 2006). These are
general variables that were extracted from reduced log files. In order to gain
insight into the learning processes that occurred, statistical variables such as
frequency of login, frequency of accessing course materials, number of messages
posted, number of messages read, frequency of synchronous discussion, and
reading duration were recorded. Identifying these statistical variables required
accumulating duration and frequency data of each student on a daily and weekly
basis. In addition, each student’s demographic information, such as technology
competency and prior online learning experience, was collected. Apart from the
extracted variables, the cleaned server logs themselves were used to conduct association analysis.
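Accumulating frequencies per student on a daily and weekly basis, as described above, might look like the following Python sketch. The event layout (user, day, action), the action labels, the sample dates, and the function names are hypothetical; the study's actual extraction was done with SQL and spreadsheet tools.

```python
from collections import Counter, defaultdict
from datetime import date

def daily_counts(events):
    """Map (user, day) -> Counter of how often each action occurred that day."""
    table = defaultdict(Counter)
    for user, day, action in events:
        table[(user, day)][action] += 1
    return table

def weekly_counts(daily, start):
    """Roll daily counters up into (user, week) counters; week 1 begins on `start`."""
    table = defaultdict(Counter)
    for (user, day), counts in daily.items():
        week = (day - start).days // 7 + 1
        table[(user, week)].update(counts)
    return table

# Hypothetical events for illustration only.
events = [
    ("s01", date(2009, 3, 2), "login"),
    ("s01", date(2009, 3, 2), "access course materials"),
    ("s01", date(2009, 3, 2), "login"),
    ("s01", date(2009, 3, 10), "post on"),
    ("s02", date(2009, 3, 3), "login"),
]
daily = daily_counts(events)
weekly = weekly_counts(daily, start=date(2009, 3, 2))
print(daily[("s01", date(2009, 3, 2))]["login"], weekly[("s01", 2)]["post on"])
```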
Data Mining Phase
The whole data mining phase can be divided into two stages: the data
exploration stage and the knowledge exploration stage. The data exploration stage
in the current study overviewed populations of numerical data through descriptive
statistics necessary for gaining a better understanding of how to apply further data
mining techniques. Knowledge exploration involved applying both descriptive
and predictive techniques (clustering, association rules, and decision tree
analysis) in order to answer the research questions.
Pattern Analysis Phase
The pattern analysis phase involves interpreting and evaluating the results
of the previous phase to identify valuable outcomes; it requires the most
domain knowledge. Variables and factors were reviewed after the initial data
exploration. Mining results were compared with the literature on online col-
laborative learning for validation and evaluation. Conclusions were drawn from
analysis of the data and recommendations were made for future research on
online learning behaviors and educational data mining.
RESULTS
Two stages of the data mining phase (the data exploration stage and the
knowledge exploration stage) are discussed below. The data exploration stage
presents descriptive and inferential statistics about each of the variables used in
the current study. More sophisticated descriptive and predictive techniques are
combined in the knowledge exploration stage to obtain knowledge related to
the research questions in this study.
Data Exploration Stage
Based on previous research in online learning environments (Becker et al.,
2006), five frequency variables (course logins, instructional content accessed,
asynchronous messages posted, asynchronous messages read, and synchronous
discussions attended) were applied to compare differences between the peer-
moderated and teacher-moderated conditions.
Table 1 shows descriptive statistics for these variables across both conditions.
Participants in the teacher-moderated condition logged into the course, accessed
course content, and posted asynchronous messages more frequently than par-
ticipants in the peer-moderated condition (t(96) = –2.21, p = .03; t(96) = –2.78,
p = .006; and t(96) = –3.00, p = .003, respectively). While the number of messages
read was higher in the teacher-moderated condition, it was not significantly
different from the peer-moderated condition.
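As a worked check, the reported t statistic for course logins can be recomputed from the summary statistics in Table 1. The Python sketch below assumes a pooled-variance (Student's) independent-samples t test and 49 students per condition, which is consistent with the reported 96 degrees of freedom; the function name `pooled_t` is illustrative.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups, using the pooled variance."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Course logins from Table 1: peer-moderated vs. teacher-moderated.
t_logins, df = pooled_t(29.35, 19.81, 49, 37.78, 17.96, 49)
# Asynchronous messages posted from Table 1.
t_posted, _ = pooled_t(21.41, 22.14, 49, 38.86, 34.18, 49)
print(round(t_logins, 2), round(t_posted, 2), df)
```

Under these assumptions the recomputed values agree with the reported t(96) = -2.21 for logins and t(96) = -3.00 for messages posted.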
There was also no significant difference between the two conditions in syn-
chronous discussion frequency. In fact, students in both conditions seldom
utilized the synchronous communication tool (M = 2.31 and M = 2.39 over 6
weeks of activities), choosing instead to use the bulletin board (asynchronous
communication) for their class discussions.
The results also show that the course login and content access distributions
were positively skewed in the peer-moderated group, but relatively normal in the
teacher-moderated group. Furthermore, the distributions for messages posted,
messages read, and discussions attended were positively skewed in both conditions.

Table 1. Descriptive Statistics for Frequency Variables for
Peer- and Teacher-Moderated Conditions

                                    Peer-moderated             Teacher-moderated
Variables                           M      SD     Skewness     M      SD     Skewness
Course logins                       29.35  19.81  1.09         37.78  17.96  0.24
Instructional content accessed      29.57  17.50  0.66         39.94  19.31  0.21
Asynchronous messages posted        21.41  22.14  1.72         38.86  34.18  1.64
Asynchronous messages read          41.98  39.80  1.23         46.45  36.55  0.92
Synchronous discussions attended     2.31   2.89  1.73          2.39   3.98  3.27

Note: All statistics represent frequencies, or the number of times variables were recorded
in the server logs.
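For readers unfamiliar with skewness, the following Python sketch computes the moment coefficient of skewness on two invented samples. It uses the simple population form; statistical packages often report a bias-adjusted variant, so values computed this way would not match the Table 1 figures exactly. The sample data are hypothetical.

```python
def skewness(xs):
    """Moment coefficient of skewness g1 = m3 / m2**1.5 (population form)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical login counts: most students log in rarely, one very often.
right_tailed = [5, 6, 7, 8, 9, 10, 60]
# Hypothetical symmetric sample for contrast.
symmetric = [10, 20, 30, 40, 50]
print(skewness(right_tailed), skewness(symmetric))
```

A positive value, as in the first sample, indicates a long right tail: a few very active students pull the mean above what is typical for the group.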
All students were randomly assigned to either the peer-moderated or the
teacher-moderated condition. The expected distribution patterns should be similar
in both conditions. However, students in the peer-moderated condition logged
in and accessed content less frequently than those in the teacher-moderated
condition. This suggests that students in the peer-moderated condition may have had
lower learning motivation than those in the teacher-moderated condition.
In discussion participation, students in both conditions had similar distribution
patterns. However, students in the teacher-moderated condition participated
in asynchronous discussion more actively than those in the peer-moderated
condition.
Knowledge Exploration Stage
The knowledge exploration stage applied three data mining methods:
cluster analysis, association analysis, and decision tree analysis. Cluster
analysis was used to classify participants by behavioral and performance charac-
teristics, association analysis was used to discover participant learning patterns,
and decision tree analysis was used to build a performance prediction model
for each condition.
Cluster Analysis
Cluster analysis (K-means) extends findings from the data exploration stage
by discovering data structures that exist across several variables within each
condition. Eleven variables were applied in the cluster analysis: Course logins,
instructional content accessed, asynchronous messages posted, asynchronous
messages read, asynchronous message reading time, synchronous discussions
attended, online learning experience, software experience, project grade, skill
grade, and final grade.
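The K-means procedure can be illustrated with a small pure-Python sketch of Lloyd's algorithm. The study itself used Knime and Weka; the two-variable data below (course logins, final grade), the starting centroids, and the function name `kmeans` are invented for illustration. In practice, variables on different scales are usually standardized before clustering; the sketch skips that step.

```python
import math

def kmeans(points, centroids, iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids

# Hypothetical students described by (course logins, final grade).
students = [(50, 88), (55, 86), (30, 72), (32, 74), (15, 48), (18, 50)]
labels, centroids = kmeans(students, centroids=[students[0], students[2], students[4]])
print(labels, centroids)
```

The resulting centroids play the same role as the cluster means reported for each condition: each one summarizes the typical profile of the students assigned to that cluster.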
Table 2 shows the clustering results for the peer-moderated and the teacher-
moderated conditions. For simplicity, the number of clusters was limited
to three. In both peer- and teacher-moderated conditions, cluster 1 represents
superior performers, cluster 2 represents average performers, and cluster 3 repre-
sents poor performers (performing worse than cluster 1 and cluster 2 partici-
pants on almost all variables). While both conditions had low-performing par-
ticipants (i.e., cluster 3), the low performers in the teacher-moderated condition
had significantly higher final grades (M = 64.55) than the low performers in
the peer-moderated condition (M = 49.47), t(35) = –2.92, p = .006.

Table 2. Means for Cluster Results in the Peer-Moderated and Teacher-Moderated Conditions

                                     Peer-moderated condition                    Teacher-moderated condition
                                     Cluster 1  Cluster 2  Cluster 3  All        Cluster 1  Cluster 2  Cluster 3  All
Input variables                      (n = 7)    (n = 25)   (n = 17)   (n = 49)   (n = 6)    (n = 23)   (n = 20)   (n = 49)
Course logins                         52.43      30.96      17.47     29.35       47.17      49.87      21.05     37.78
Instructional content accessed        56.14      33.28      13.18     29.57       47.50      53.57      22.00     39.94
Asynchronous messages posted          55.86      21.24       7.47     21.41      109.33      37.22      19.60     38.86
Asynchronous messages read           113.43      45.08       8.00     41.98       20.83      77.22      18.75     46.45
Asynchronous message reading timeᵃ    13.01       5.10       0.76      4.72        1.58       9.89       1.81      5.58
Synchronous discussions attended       3.29       2.28       1.94      2.31        0.67       3.48       1.65      2.39
Online learning experience             2.14       2.60       2.53      2.51        2.33       2.78       2.30      2.53
Software experience                   32.00      27.64      26.94     28.02       24.50      26.26      24.20     25.20
Project grade                         95.57      69.96      28.47     59.22       87.83      86.48      54.05     73.41
Skill grade                           78.43      75.28      70.06     73.92       83.83      82.78      74.50     79.53
Final grade                           87.29      72.84      49.47     66.80       86.17      84.87      64.55     76.74

ᵃ Reading time was measured in hours.

To understand these performance differences, the skill grade and the
project grade of low performers in both conditions were compared. Results
revealed that the major differences occurred in the project grade. A separate cluster
analysis conducted on the cluster 3 participants in the peer-moderated condition
suggests some reasons why this performance discrepancy may have occurred
(see Table 3). Table 3 reveals that the poor performers in the peer-moderated
condition were further categorized into two groups (i.e., clusters 3-1 and 3-2).
Comparing the results of the two cluster analyses provides further information
about the poor performers in the peer-moderated condition. In summary, the cluster
3-1 pattern suggests that these students may have had login problems, whereas the
cluster 3-2 pattern suggests that these students either did not know how to solve
practical problems or did not want to engage in learning activities.
For example, cluster 3-1 participants logged in, accessed instructional content,
and posted messages significantly less often
than cluster 3-2 participants. Moreover, most of these learning behaviors
occurred within the first 2 weeks. Table 3 shows that cluster 3-2 students
performed well on the software skill test (M = 83.91) relative to superior and
average performers in the peer-moderated condition (Ms = 78.43 and 75.28,
respectively; see clusters 1 and 2, Table 2). Interestingly, though, cluster 3-2
students received much lower project grades (M = 32) than these superior and
average performers (Ms = 95.57 and 69.96, respectively). There are at least
two possible reasons for these findings. One is that cluster 3-2 participants,
notwithstanding their skill with the software, had difficulty solving practical
problems as evidenced by their poor project grades. Another possibility is that
they lacked the motivation to engage in instructional activities that may have
improved their project grades. Evidence for this second conclusion is found in the
fact that cluster 3-2 participants were far less involved with accessing course
content and participating in discussions than their superior and average performing
counterparts.

Table 3. Means for Cluster Analysis of Poor-Performing Students
in the Peer-Moderated Condition

                                     Cluster 3-1  Cluster 3-2  Cluster 3 Total
Input variables                      (n = 6)      (n = 11)     (n = 17)
Course logins                        12.80        20.02        17.47
Instructional content accessed       10.80        14.09        13.18
Asynchronous messages posted          5.00         9.00         7.47
Synchronous discussions attended      0.80         2.55         1.94
Asynchronous message reading time     0.09         1.13         0.76
Asynchronous messages read            4.40         8.36         8.00
Project grade                        18.60        32.00        28.47
Skill grade                          53.60        83.91        70.06
Final grade                          34.00        58.09        49.47
Online learning experience            2.00         2.82         2.53
Software experience                  23.00        29.00        26.94

Association Analysis
Association rule techniques are used to discover interesting associations among
events contained in a database. In this study, association analysis was used to
compare the differences in learner behavior patterns between participants in
the peer-moderated and teacher-moderated conditions. The association analysis
used 7,926 server logs from the peer-moderated condition and 10,006 server
logs from the teacher-moderated condition. In this analysis a server log was
defined as a sequence of recorded participant learning behaviors determined
by beginning (login) and ending (logout) time stamps.
Support and confidence are two terms used to describe the association rules
discovered in an association analysis (see Table 4). Support refers to the propor-
tion of all server logs containing the sequence of learning behaviors defining the
rule. Confidence refers to the probability that the entire rule will be observed given
the occurrence of the first behavior in the sequence. For example, the association
rule: “go to student environment => access course materials” has a support score
of 34.71% and a confidence score of 88.18% (see Table 5, rule 2). In this example,
support means that 34.71% of the total server logs (i.e., 3,473 out of 10,006)
contained the sequence “go to student environment => access course materials.”

Table 4. Daily Association Rules in Peer-Moderated Groups

     Support (%)  Confidence (%)  Counts  Rule
1    82.28        90.52           850     login => access course materials
2    82.19        91.09           849     goto student environmentᵃ => access course materials
3    61.18        67.31           632     login => goto student environment
4    20.23        58.87           209     post onᵇ => post on
5    14.71        49.67           152     login => login

ᵃ The student environment contains course announcements, personal messages, and
other personal information.
ᵇ “Post on” refers to posting a message on the asynchronous discussion board.

The confidence score means that the probability is .8818 that the learning behavior
“access course materials” will occur if the behavior “go to student environment”
has occurred on that day.
Tables 4 and 5 list daily association rules (with accompanying support and
confidence measures) for peer-moderated and teacher-moderated conditions
respectively. These rules represent students’ daily learning behavior patterns.
For example, Table 4, rule 4: “post on => post on” indicates that students in the
peer-moderated condition posted messages twice on the same day with support
and confidence scores of 20.23% and 58.87%, respectively. These postings may
have occurred in succession or at different times during the day.
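The support and confidence measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names and the four toy logs are assumptions, and a rule is treated as matched when its behaviors appear in the log in order, whether in succession or with other behaviors in between.

```python
def contains_sequence(log, rule):
    """Return True if the behaviors in `rule` occur in `log` in the same order."""
    remaining = iter(log)
    # `step in remaining` consumes the iterator up to the first match, so a
    # repeated behavior (e.g. "post on" twice) needs two separate occurrences.
    return all(step in remaining for step in rule)


def support_confidence(logs, rule):
    """Support: share of all logs containing the rule.
    Confidence: share of logs containing the first behavior that also
    contain the whole rule."""
    with_rule = sum(contains_sequence(log, rule) for log in logs)
    with_first = sum(rule[0] in log for log in logs)
    support = with_rule / len(logs) if logs else 0.0
    confidence = with_rule / with_first if with_first else 0.0
    return support, confidence


# Toy logs for illustration only (not data from the study).
toy_logs = [
    ["login", "goto student environment", "access course materials"],
    ["login", "access course materials"],
    ["login", "goto student environment"],
    ["login", "post on", "post on"],
]
rule = ("goto student environment", "access course materials")
support, confidence = support_confidence(toy_logs, rule)
```

In the toy data the rule appears in one of four logs (support .25) and in one of the two logs that contain “goto student environment” (confidence .50).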
In Table 4, Rules 1 and 3 reveal that a very high percentage (support = 82.28%
and 61.18%, respectively) of server logs from the peer-moderated condition
included either the behavior sequence “login => access course materials,” or
“Login => go to student environment.” These rules also show high probabilities
(confidence = 90.52% and 67.31%, respectively) that, on a given day, students in
the peer-moderated condition accessed either the course materials or the student
environment after login. In addition, Rule 2 shows strong support (82.19%) and
confidence (92.09%) for accessing course materials after going to the student
environment section. In summary, Rules 1 through 3 reveal that participants in
the peer-moderated condition spent the most time in course materials and student
environment (to check announcements or read personal messages and statistics)
sections of the course.

Table 5. Daily Association Rules in Teacher-Moderated Groups

No.  Support (%)  Confidence (%)  Counts  Rule
1     2.67        10.06            31     login => access course materials
2    34.71        88.18           403     goto student environment => access course materials
3     2.93        11.04            34     login => goto student environment
4    21.10        93.16           245     post on => post on
5    18.35        86.94           213     post on => post on => post on
6    19.64        82.61           228     login => login => login
7    32.39        84.88           376     access course materials => access course materials
8    32.13        81.62           373     goto student environment => goto student environment

While the same association rules found in the peer-moderated condition can be
found in the teacher-moderated condition (see Rules 1 through 3 in Table 5),
the support and confidence measures for these rules are far lower in
the teacher-moderated condition. Participants in the teacher-moderated condition
experienced a much broader array of activity patterns than peer-moderated
participants. This is illustrated in Rules 4 through 8 in Table 5, where it can
be observed that teacher-moderated participants often worked on the same
activity more than once a day. While participants in the peer-moderated condition
exhibited some of these same learning patterns (Table 4, Rules 4 and 5), the
support and confidence measures are much lower. In summary, it appears that
teacher-moderated participants were more motivated to experience a broader
array of learning activities than peer-moderated participants. Peer-moderated
participants tended to focus primarily on accessing course materials and
checking personal records. The implications of the results are treated in the
discussion section.
Decision Tree Analysis
In this study, the CHAID decision tree analysis (Kass, 1980) was used to build
a performance prediction model in both conditions. Ten independent variables
(course logins, instructional content accessed, asynchronous messages posted,
asynchronous messages read, asynchronous message reading time, synchronous
discussions attended, project grade, skill grade, online learning experience,
and software experience) and one dependent variable (final grade) were applied
to the analysis.
Figure 1 shows the results of the decision tree analysis for the peer-moderated
condition. Number of asynchronous messages read (MessagesRead) was the
most important variable for predicting the final grades of participants in this
condition. Those participants reading more than 33 messages throughout the
entire course on the course discussion board received 44% higher final grades
(M = 79.17) than those reading 33 messages or fewer (M = 54.92). Among those
receiving higher final grades, frequency of accessing instructional content
(FreqContent) was an important variable in differentiating this group. As shown
in Figure 1, participants who accessed the instructional content section of
the course more than 48 times received 21% higher final grades (M = 87.57)
than those who accessed the instructional content section fewer than 39 times
(M = 72.27). Among those receiving lower final grades, number of asynchronous
messages posted (MessagesPosted) and prior software experience (SoftExp)
were important differentiating variables. Participants posting more than three
messages and receiving software experience ratings greater than 28 received 79%
higher final grades (M = 66.3) than those posting three messages or fewer (M = 37).
These results suggest that students in peer-moderated learning situations
may benefit from spending considerable time reading the messages posted by
other students. This benefit appears to be enhanced when these students invest
considerable time studying the course instructional content. Interestingly, the
number of messages posted does not appear to be as important as the number of
messages read for students working in peer-moderated groups.
Figure 2 shows the results of the decision tree analysis for the teacher-
moderated condition. Frequency of accessing instructional content (FreqContent)
was the most important variable for predicting the final grades of participants in
this condition. Those participants who accessed the instructional content section
of the course 37 or more times received 30% higher final grades (M = 86.15)
than those who accessed the instructional content section fewer than 37 times
(M = 66.09). Among those with higher final grades, the participants who had
significant software experience and who were active in posting on the course
discussion board (more than 36 posts) received 24% higher final grades (M = 94.11)
than those with less software experience (M = 76.13). For those participants
receiving lower final grades, the number of messages posted was an important
differentiating variable. Participants posting more than 14 messages received
37% higher final grades (M = 72.87) than those posting 14 messages or less
(M = 53.38).
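The percentage differences quoted for the two decision trees are relative differences between group means. The short Python check below recomputes them from the means reported in the text; the function name and the dictionary labels are choices made for this sketch.

```python
def percent_higher(higher_mean, lower_mean):
    """Relative difference of higher_mean over lower_mean, in percent."""
    return (higher_mean - lower_mean) / lower_mean * 100


# (higher mean, lower mean) pairs as reported for Figures 1 and 2.
reported_means = {
    "peer: messages read > 33": (79.17, 54.92),
    "peer: content accessed > 48 vs. < 39": (87.57, 72.27),
    "peer: posted > 3 and SoftExp > 28 vs. posted <= 3": (66.3, 37.0),
    "teacher: content accessed >= 37": (86.15, 66.09),
    "teacher: high vs. lower software experience": (94.11, 76.13),
    "teacher: posted > 14": (72.87, 53.38),
}

for label, (high, low) in reported_means.items():
    print(f"{label}: {percent_higher(high, low):.0f}% higher")
```

For example, (79.17 - 54.92) / 54.92 is approximately 0.44, which matches the 44% figure given for participants who read more than 33 messages.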
Figure 1. Decision tree for peer-moderated groups.
These results suggest that students in teacher-moderated conditions may benefit
most from accessing instructional content and posting messages to course dis-
cussion boards. Prior knowledge in the content area also appears to be an impor-
tant predictor of success.
In summary, the results of the decision tree analysis suggest that frequency
of accessing instructional content (FreqContent), number of asynchronous
messages read (MessagesRead), frequency of posting asynchronous messages
(FreqPosting), and software experience (SoftExp) were four predictors of student
performance in this study.
DISCUSSION
This section is organized around the four research questions pertaining to this
study. Each of the research questions is addressed with findings from this study
along with relevant literature from the field of online collaborative learning.
What Differences in Learning Behaviors Exist Between the
Peer-Moderated and the Teacher-Moderated Conditions?
The results from the data exploration stage showed that participants in the
teacher-moderated condition logged in, accessed course content, and posted
asynchronous messages more frequently than participants in the peer-moderated
condition (see Table 1). These results suggest that teacher-moderated participants
were more actively involved, and may have experienced more motivation and
satisfaction, in the course than peer-moderated participants.

Figure 2. Decision tree for teacher-moderated groups.
These findings are consistent with several survey research studies that have
examined the relationship between teacher presence and student motivation (see
Christensen & Menzel, 1998; Christophel, 1990; Christophel & Gorham, 1995;
Frymier, 1993) and between teacher presence and course satisfaction (Dziuban
& Moskal, 2001; Kreijns, Kirschner, & Jochems, 2003; Richardson & Swan,
2003; Wise, Chang, Duffy, & Valle, 2004). All but one of the studies reviewed
found that teacher presence was positively related to either student motivation or
course satisfaction. For example, Christophel (1990) investigated the relationship
between immediacy (a form of teacher presence emphasizing behaviors such as
smiles, head nods, use of inclusive language, and eye contact) and student state
motivation in college classes. The study found significant relationships between
learning and both immediacy and motivation. Immediacy was found to modify
motivation, which led to increased learning. Dziuban and Moskal (2001) analyzed
the responses from 52,218 questionnaires, which focused on investigating
students’ course satisfaction and perception of level of learning in courses offered
in web-based (totally online), mixed-mode (part online and part face-to-face),
web-supported (website as supplement), and face-to-face formats. They found
high correlations between the quantity and quality of teacher-student interaction
and students’ perception of their level of learning and satisfaction in all four
types of courses.
The one study with contrary findings (Wise et al., 2004) investigated the
relationship between teacher presence and student satisfaction, engagement, and
learning performance. The authors define engagement as active involvement in
knowledge construction, peer interaction, and peer collaboration. The results
indicated that teacher presence was related to peer interaction and students’
perception of the instructor but was not related to perceived learning, satisfaction,
engagement, or the quality of their final course product. The experimental findings
from the initial analysis of data from the current study (Zhang et al., 2009) support
the majority of correlational research in this area. Zhang’s study was the only
experimental study identified that compared student perceptions and perform-
ance differences between peer-moderated and teacher-moderated conditions.
They found that teacher presence influenced students’ performance. In addition,
a posttest attitude survey indicated that students in peer-moderated groups
were unsatisfied with their online collaborative learning experience after 6 weeks.
In general, these students felt that online collaboration and discussion were a
waste of time and that collaborative learning does not help students develop
quality projects.
This study reanalyzes Zhang’s data beyond students’ perceptions and posttest
performance. Its results, obtained through data mining, support previous findings
that teacher presence motivated students. However, this study also discovered that
relying only on perceptions may lead to misguided conclusions. For example,
based on the posttest attitude survey results, students perceived that peer support
was useless during collaboration in the peer-moderated condition. However,
the cluster analysis showed that the superior performers (cluster 1 students) in the
peer-moderated condition read a high number of messages, spent a lot of time
reading messages, and posted a high number of messages to the discussion board
(see Table 2). Interestingly, the decision tree analysis showed that the number
of messages read is the most important variable for predicting performance in
the peer-moderated condition. Students who read over 33.5 messages throughout
the 6 weeks of collaborative activities received a much higher grade
(M = 79.17) than those who read fewer messages (M = 54.92; see Figure 1).
These results from data mining show that students in the peer-moderated condition
can actually benefit from peer collaboration. However, there were only
seven students who benefited from peer support. Later sections will discuss
detailed learning behaviors in both conditions that can provide more evidence
of teacher presence effects.
What Successful and Unsuccessful Student
Characteristics are Exhibited in the Peer-Moderated
and the Teacher-Moderated Conditions?
Successful and unsuccessful students were defined by their final grade
performance. High or low participation level was defined by relatively higher or
lower value on learning behavior variables. The following section discusses the
characteristics of successful, average, and unsuccessful students within the
peer-moderated and the teacher-moderated conditions.
Characteristics of Successful Students
In both conditions, cluster 1 represents the most successful students. Table 2
shows that the cluster 1 students in both conditions have relatively higher values
on almost all variables, evidencing active participation in online activities. This
finding is consistent with other studies that have found that students with high
participation levels have the highest achievement scores (Beaudoin, 2002;
Picciano, 2002). The current study used six behavioral variables to depict student
participation level (see Table 2). However, Beaudoin’s and Picciano’s studies
each used just one variable to define students’ participation level. Beaudoin (2002)
divided an online class into three groups (high interaction, moderate interaction,
and low interaction) based on time spent in course-related activity. Picciano
(2002) defined interaction level by the number of postings. Therefore, the current
study provides a more comprehensive measure of interaction level. Moreover,
Picciano (2002) reported that student perceptions are not consistent with
actual postings. He found that the low-interaction group perceived themselves as
having made a higher number of postings than they actually did, and the high-
interaction group perceived themselves as having made fewer postings than
they actually did. Again, this finding supports the previous discussion, in
which perceptions of participants may provide incorrect information and lead
to misguided conclusions.
Although cluster 1 students in both conditions represented students who per-
formed well, there were important differences between the conditions. While
the six cluster 1 students in the teacher-moderated condition had the highest
final grade (M = 86.17), they read fewer messages, spent less time reading
messages, and frequented synchronous discussions less than their cluster 2 and 3
counterparts in the same condition. On the other hand, except for prior online
learning experience, cluster 1 participants in the peer-moderated condition
possessed higher values on all variables. These results indicate that cluster 1
participants in the teacher-moderated condition may have learned more efficiently
than all other participants. However, the data mining results do not provide
enough information to draw a firm conclusion. Similar results were also found
by Beaudoin (2002), who reported that the low-interaction group outperformed
the moderate-interaction group, with no apparent explanation. These cluster 1
participants in the teacher-moderated condition merit further investigation
of their learning strategies.
Characteristics of Unsuccessful Students
Cluster 3 in both conditions represents low performers who did not actively
participate in online activities. To understand them further, cluster 3 students in the
peer-moderated condition were reclassified. The results (see Table 3) revealed
six students in cluster 3-1 who may have had log-in problems. These six students
had few records on the LMS. In addition, we found several login failure messages
in the beginning of the online course on these students’ web logs. Cluster 3-2
consists of 11 students. These students had relatively high prior experience
(M = 29) and high scores on the software skills test (M = 83.91). However,
their low project grades (M = 32) indicate that they may have had difficulty in
applying their software skills to practical problems. These findings can help
instructors to develop different instructional strategies for participants in cluster
3-1 and cluster 3-2. For example, cluster 3-1 students can be given more technical
support to help them with log-in problems (Ashton, Roberts, & Teles, 1999). For
cluster 3-2 participants, the instructor can provide pedagogical support for skill
application (Ashton et al., 1999; Waeytens, Lens, & Vandenberghe, 2002).
Cluster 1 participants in the peer-moderated condition had relatively higher
values on all variables except for prior online learning experience. On the other
hand, cluster 3 students in the peer-moderated condition had higher than average
prior online experience. These results indicate that prior online learning experi-
ence may not influence performance. Bernard et al. (2004) obtained similar
findings. In their survey investigation they found that student confidence about
basic prerequisite skills did not influence final performance.
What Unique Daily Learning Patterns are Exhibited
Within the Peer-Moderated and the Teacher-Moderated
Conditions?
Daily Learning Patterns in the Peer-Moderated Condition
Association rules revealed daily learning patterns in both conditions. Table 4
shows that the major learning activities in the peer-moderated condition were:
1. “login => access course materials”;
2. “goto student environment => access course materials”; and
3. “login => goto student environment.”
These results indicate that more than 80% of the learning activities were reading
the course materials and checking course announcements and personal records.
Content-student interaction, as opposed to student-student interaction, was the
primary source of content knowledge for students in the peer-moderated con-
dition. The posttest survey also supported this inference. Students indicated a
lack of trust for information on the bulletin board. These students regarded
the information from peers as useless (Zhang et al., 2009). However, relying on the
course materials was insufficient to achieve a high-quality project.
Peer support has been given a fair amount of attention in the literature (e.g.,
Arvaja, Häkkinen, Eteläpelto, & Rasku-Puttonen, 2000; Wu, Farrell, & Singley,
2002). Salomon and Perkins (1998) noticed that a teacher’s objective is to
facilitate learning, but peers working together aim for task accomplishment, so the
goal of human support may vary with the person doing the support. This may
explain why students in the peer-moderated condition held negative attitudes
toward peer support. The quality of peer support varies. In the current study most
of the messages posted to the discussion board consisted of peer encouragement
(Zhang et al., 2009). While there were some useful messages scattered over
different threads in the bulletin board, participants in the peer-moderated
condition needed to read several messages to identify useful information.
Daily Learning Patterns in the Teacher-Moderated Condition
Table 5 lists common daily learning patterns in the teacher-moderated condi-
tion. Learning patterns in the teacher-moderated condition showed very different
results from those of the peer-moderated condition. While teacher-moderated
participants had the same top three rules as in the peer-moderated condition
(“login => access course materials,” “goto student environment => access course
materials,” and “login => goto student environment”), these rules had far lower
support and confidence in the teacher-moderated condition. The highest support
rating for teacher-moderated participants was 34.71%, while the highest support
rating for peer-moderated participants was 82.28%. The results show that
teacher presence promoted student participation in a wide variety of learning
activities. Therefore, while the support ratings for each learning activity were
lower, teacher presence tends to facilitate student interaction with multiple sources
(content, student, and teacher).
Ashton et al. (1999) defined four categories of support in online collaborative
environments: pedagogical, social, managerial, and technical. Pedagogical sup-
port includes all attempts to assist in reaching a particular learning objective,
such as providing feedback, instructions, information, opinions, preferences,
advice, questions, summaries, comments, or referring to outside sources. Social
support includes all attempts to make students comfortable and promote inclu-
sion, such as using empathy, meta-communication, humor, or performing inter-
personal outreach. Managerial support includes all attempts to coordinate assign-
ments, discussions, and course activities. Technical support includes assistance
to students in using the course delivery software. It focuses on solving user and
technical issues.
Lund (2004), based on Ashton’s categories, listed possible types of support
given by and to different participants in a computer supported collaborative
environment. Table 6 shows Lund’s conclusions.

Table 6. Possible Types of Support Given by and to Different CSCL Participants

                   Support given by
Support given to   Student                  Tutor/teacher            Technical expert
Student            Pedagogical, social,     Pedagogical, social,     Technical
                   managerial, technical    managerial, technical
Tutor/teacher                               Meta-pedagogical,        Technical
                                            meta-social,
                                            meta-managerial,
                                            meta-technical
Technical expert                            Meta-technical           Technical

Table 2 provides insight into student performance differences. There are no
significant differences in the skill grades between the peer-moderated and the
teacher-moderated conditions. The major performance difference occurs in their
project grades. Zhang et al. (2009) also found that students in a peer-moderated
condition tended to imitate each other. On the other hand, students in a
teacher-moderated condition were more creative in their final reports. According
to Table 6 and Ashton’s categories, if students have multiple sources of inter-
action, they can obtain different types of support for achieving higher per-
formance. Students in a peer-moderated condition tend not to rely on peer support
because they need to spend considerable effort to find useful information. There-
fore, students in the peer-moderated condition mainly relied on learner-content
interaction, which can cover only part of the pedagogical, managerial, and tech-
nical support. This may be why they did not perform well, even if they possessed
the required software skills.
Moreover, students in the teacher-moderated condition tended to log in, to
read course materials, or to post messages several times on the same day (Table 5,
rule 4 to rule 8). Students in the peer-moderated condition had fewer same-day
logins (Table 4, rule 4 and rule 5 show their lower frequencies). The support and
confidence of rule 4 and rule 5 in the peer-moderated condition are also far
lower than rule 4 to rule 8 in the teacher-moderated condition (Tables 4 and 5).
Association rules revealed that teacher presence promoted learning variety and
active participation.
What are the Most Important Indicators for
Predicting Student Learning Performance?
Performance Predictors for Peer-Moderated Learning
A decision tree analysis produced a predictive model for each condition.
Figure 1 indicates that the most important variable for performance prediction in
the peer-moderated condition is the number of bulletin board messages read. The
right branch of Figure 1 shows that students who read more than 33.5 messages
during 6 weeks of activities received an average final grade of 79.17. Therefore,
reading discussions on the bulletin board was the most valuable activity for
improving performance among participants in the peer-moderated condition.
Again, the results did not correspond to the posttest survey results (Zhang et al.,
2009). The posttest survey results showed that students in the peer-moderated
conditions did not feel that messages on the bulletin board were useful. Since
surveys do not provide sufficient information for online learning research, more
tools such as data mining are needed to support online teaching and research.
The other important variable in the right branch of Figure 1 is the frequency
of accessing course materials. The peer-moderated participants who read more
than 33.5 messages and accessed course materials more than 38.5 times received
an average final grade of 85. Those whose frequency of accessing course materials
increased to 48.5 received an average final grade of 87.57. These results indicate
that student-content interaction is the most important part of the peer-moderated
condition. This may be because peers can provide metacognitive support.
However, students need to deal with a large amount of information and to extract
useful information from many discussions. Perhaps that is why only 13 students
performed well (final grade average = 85) in the peer-moderated condition.
The left branch of Figure 1 shows how critical participating in online discussion
is to success in the course. Six participants in the peer-moderated condition who
read and posted the fewest messages (fewer than 33.5 and 3.5 respectively) had
the lowest final grades (M = 37). Students who read fewer than 33.5 messages
but posted more than 3.5 messages earned a better final average of 60.58. Prior
software knowledge is another important variable, as shown in the left branch.
For those who posted more than 3.5 messages in 6 weeks, the participants with
high prior software knowledge (>= 28.5) performed better than those with low
prior software knowledge. These results imply that active participation with peers
(reading and posting messages) is important for better performance. For the
students with low participation levels, the only thing they could rely on was
their prior knowledge.
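The splits described above for the peer-moderated condition can be written as nested threshold checks. The Python sketch below is a reading of the text, not a reproduction of Figure 1: it encodes only the cut points and group means stated in this section, the function and argument names are assumptions, and where the text gives no mean for a final subgroup the mean of the enclosing group is returned instead.

```python
def peer_moderated_group_mean(messages_read, messages_posted, freq_content, soft_exp):
    """Mean final grade reported for the group a participant would fall into."""
    if messages_read > 33.5:             # right branch of Figure 1 (M = 79.17 overall)
        if freq_content > 48.5:
            return 87.57
        if freq_content > 38.5:
            return 85.0                  # mean of everyone above 38.5 accesses
        return 72.27
    # left branch of Figure 1 (M = 54.92 overall)
    if messages_posted > 3.5:
        if soft_exp >= 28.5:
            return 66.3
        return 60.58                     # mean of everyone who posted more than 3.5 messages
    return 37.0


# A participant who read 40 messages and accessed course materials 50 times:
example_mean = peer_moderated_group_mean(40, 0, 50, 0)
```

Written this way, the ordering of the splits makes the section's point visible: posting and prior software experience only enter the model for participants who read few messages.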
Performance Predictors for Teacher-Moderated Learning
Figure 2 shows a predictive model for the teacher-moderated students. The most
important variable influencing their final grades was the frequency of accessing
course materials. Twenty-six participants who accessed the course materials
more than 37 times received an average final grade of 86.15. Participants in this
group who had high prior software knowledge received average final grades of
90.61. Because the standardized software test covered the content of the whole
semester, the 6 weeks of activities did not cover the full scope of the test.
Therefore, students’ prior software
knowledge still influenced their performance.
Comparing the right branch of Figures 1 and 2 (high performing students)
shows that participants in the peer-moderated condition needed to access course
materials (>= 48.5) more than those in the teacher-moderated condition (>= 37)
to achieve higher performance. In addition, even peer-moderated participants
with low prior software knowledge (< 21.5) performed better than those having
higher prior software knowledge (>= 28.5) when they accessed course materials
more frequently. These results show that various interactions are needed in
online collaborative environments. Relying on only a single interaction makes it
hard to improve learning. Students need to participate actively in order to obtain
problem-solving skills from various interactions.
CONCLUSION
This study revealed that teacher presence, as opposed to just peer presence,
makes a significant difference in influencing students’ learning behaviors.
Students participated more actively and in more varied ways when a teacher was
present. However, the results also show that an online peer-moderated course can
produce successful learning outcomes when multiple student interactions are
facilitated. For courses using asynchronous communication, this study provides an
important performance-predictive model. It may seem arbitrary to ask students to
post and to read a specific number of messages weekly. However, these simple
principles can be embedded and tested within a learning management system.
Once they have been validated, these principles can help to simplify instructional
design and online teaching.
In terms of professional development, more and more people use commer-
cially produced online training modules and conduct self-paced learning. These
products feature training materials and discussion forums for their users. The
situation is very similar to a peer-moderated condition. Unless users consistently
maintain a high quality of discussion and strong self-motivation, they will find
it hard to achieve high performance. The results also provide information for
improving online university coursework which has online instructors guiding
the learning process.
In this study, quantified behavioral data has demonstrated the potential of
data mining for supporting online teaching and research. Artificial intelligence
methods such as clustering, association rule analysis, and decision tree analysis
are useful analytical tools for online teaching and research. This study has discussed
the possibility of applying different instructional strategies after clustering
analysis, especially for lower-achieving students. Association rules clearly can
help instructors to understand students’ learning patterns. Decision trees can
be used to build a predictive model and predict student performance. Educational
data mining supports and simplifies online teaching and learning. This study
also shows the possibility of building a predictive model for online learning. Once
the model has been verified and integrated into the learning management system
(LMS), the LMS can pop up reminders, messages, and warnings for students
and instructors based on the model built. This can save instructors time and
energy and enable them to put their efforts toward facilitation of learning.
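As a rough illustration of how such reminders might be generated, the Python sketch below compares a student's activity counts with the cut points reported for the peer-moderated decision tree (33.5 messages read, 3.5 messages posted, 38.5 accesses to course materials over 6 weeks). Everything else is hypothetical: the names, the message wording, and the choice to apply the cut points directly. As noted above, the model would first need to be verified.

```python
# Cut points taken from the peer-moderated decision tree described in this study.
PEER_MODERATED_CUT_POINTS = {
    "messages_read": 33.5,
    "messages_posted": 3.5,
    "content_accesses": 38.5,
}

# Hypothetical reminder wording for illustration.
REMINDER_TEXT = {
    "messages_read": "Read more of the discussion board messages.",
    "messages_posted": "Post to the discussion board.",
    "content_accesses": "Revisit the course materials.",
}


def reminders_for(activity, cut_points=PEER_MODERATED_CUT_POINTS):
    """Return reminders for every activity count at or below its cut point.

    `activity` maps the keys above to a student's cumulative counts;
    missing keys are treated as zero.
    """
    return [
        REMINDER_TEXT[name]
        for name, cut_point in cut_points.items()
        if activity.get(name, 0) <= cut_point
    ]


example_reminders = reminders_for(
    {"messages_read": 40, "messages_posted": 2, "content_accesses": 50}
)
```

A real deployment would likely scale the cut points to the elapsed portion of the course and route some warnings to the instructor; both are left out of this sketch.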
REFERENCES
Akdemir, O., & Koszalka, T. A. (2008). Investigating the relationships among instructional
strategies and learning styles in online environments. Computers & Education, 50(4),
1451-1461.
Anderson, T., Rourke, L., Garrison, D. R., & Archer, W. (2001). Assessing teaching
presence in a computer conference context. Journal of Asynchronous Learning
Networks, 5(2), 1-17.
Arvaja, M., Häkkinen, P., Eteläpelto, A., & Rasku-Puttonen, H. (2000). Collaborative
processes during report writing of a science learning project: The nature of discourse as
a function of task requirements. European Journal of Psychology of Education, 15,
455-466.
Ashton, S., Roberts, T., & Teles, L. (1999). Investigating the role of the instructor in
collaborative online environments. Poster session presented at the CSCL ’99
Conference, Stanford University, CA.
Beaudoin, M. F. (2002). Learning or lurking: Tracking the “invisible” online student.
The Internet and Higher Education, 5(2), 147-155.
Becker, K., & Vanzin, M. (2003). Discovering interesting usage patterns in web-based
learning environments. Proceeding of the International Workshop on Utility, Usability
and Complexity of e-Information Systems, 57-72.
Becker, K., Vanzin, M., Marquardt, C., & Ruiz, D. (2006). Applying web usage mining
for the analysis of behavior in web-based learning environments. In C. Romero &
S. Ventura, (Eds.), Data mining in e-learning (pp. 117-137). Billerica, MA: WitPress.
Bernard, R. M., Brauer, A., Abrami, P. C., & Surkes, M. (2004). The development of a
questionnaire for predicting online learning achievement. Distance Education, 25(1),
31-47.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and
regression trees. Monterey, CA: Wadsworth International Group.
Chen, L. D., Sakaguchi, T., & Frolick, M. N. (2000). Data mining methods, applications,
and tools. Information Systems Management, 17(1), 65-70.
Christensen, L. J., & Menzel, K. E. (1998). The linear relationship between student reports
of teacher immediacy behaviors and perceptions. Communication Education, 47(1),
82-90.
Christophel, D. M. (1990). The relationships among teacher immediacy behaviors, student
motivation, and learning. Communication Education, 39(4), 323-340.
Christophel, D. M., & Gorham, J. (1995). A test-retest analysis of student motivation,
teacher immediacy, and perceived sources of motivation and demotivation in college
classes. Communication Education, 44(4), 292-306.
CSF. (2005). Techficiency Quotient Certification (TQC). Retrieved January 27, 2009
from http://www.tqc.org.tw/TQC/index.asp
Donaldson, S. I., & Grant-Vallone, E. J. (2002). Understanding self-report bias in
organizational behavior research. Journal of Business and Psychology, 17(2), 245-260.
Dziuban, C., & Moskal, P. (2001). Evaluating distributed learning at metropolitan
universities. Educause Quarterly, 24(4), 60-61.
Facione, P. A., & Facione, N. C. (1994). The holistic critical thinking scoring rubric.
Millbrae, CA: The California Academic Press.
Fernandez, J., Marin, R., & Wirz, R. (2007). Online competitions: An open space to
improve the learning process. IEEE Transactions on Industrial Electronics, 54(6),
3086-3093.
Frymier, A. B. (1993). The relationships among communication apprehension, immediacy
and motivation to study. Communication Reports, 6(1), 8-17.
Garland, D., & Martin, B. N. (2005). Do gender and learning style play a role in how
online courses should be designed? Journal of Interactive Online Learning, 4(2),
67-81.
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical
data. Applied Statistics, 29, 119-127.
Kirkos, E., Spathis, C., & Manolopoulos, Y. (2007). Data mining techniques for
the detection of fraudulent financial statements. Expert Systems with Applications,
32(4), 995-1003.
Kreijns, K., Kirschner, P. A., & Jochems, W. (2003). Identifying the pitfalls for social
interaction in computer-supported collaborative learning environments: A review of
the research. Computers in Human Behavior, 19, 335-353.
Liu, X. J., Magjuka, R. J., & Lee, S. H. (2008). The effects of cognitive thinking styles,
trust, conflict management on online students’ learning and virtual team performance.
British Journal of Educational Technology, 39(5), 829-846.
Lund, K. (2004). Human support in CSCL: What, for whom, and by whom? In J. W.
Strijbos, P. A. Kirschner, & R. L. Martens (Eds.), What we know about CSCL
(pp. 167-198). Norwell, MA: Kluwer Academic Publishers.
Macdonald, J. (2003). Assessing online collaborative learning: Process and product.
Computers & Education, 40(4), 377-391.
Machado, L., & Becker, K. (2003). Distance education: A web usage mining case study
for the evaluation of learning sites. ICALT: Proceedings of the International Conference
on Advanced Learning Technologies. IEEE Society, 360-361.
Maimon, O., & Rokach, L. (Eds.). (2005). Data mining and knowledge discovery handbook.
New York: Springer Science+Business Media.
Michalewicz, Z., Schmidt, M., Michalewicz, M., & Chiriac, C. (2007). Adaptive business
intelligence. Berlin; New York: Springer.
Neuhauser, C. (2002). Learning style and effectiveness of online and face-to-face
instruction. American Journal of Distance Education, 16(2), 99-113.
Picciano, A. G. (2002). Beyond student perceptions: Issues of interaction, presence, and
performance in an online course. Journal of Asynchronous Learning Networks, 6(1),
21-40.
Richardson, J. C., & Swan, K. (2003). Examining social presence in online courses in
relation to students’ perceived learning and satisfaction. The Journal of Asynchronous
Learning Networks, 7(1), 68-88.
Riecken, D. (2000). Personalized views of personalization. Communications of the
Association for Computing Machinery, 43(8), 27-28.
Roiger, R. J., & Geatz, M. W. (2003). Data mining: A tutorial-based primer. Boston,
MA: Addison Wesley.
Salomon, G., & Perkins, D. N. (1998). Individual and social aspects of learning. Review
of Research in Education, 23, 1-24.
Schamber, J. F., & Mahoney, S. L. (2006). Assessing and improving the quality of
group critical thinking exhibited in the final projects of collaborative learning groups.
The Journal of General Education, 55(2), 103-137.
Shea, P. J., & Bidjerano, T. (2009). Community of inquiry as a theoretical framework
to foster “epistemic engagement” and “cognitive presence” in online education.
Computers & Education, 52, 543-553.
Shea, P. J., Li, C. S., & Pickett, A. M. (2006). A study of teaching presence and student
sense of learning community in fully online and web-enhanced college courses. The
Internet and Higher Education, 9(3), 175-190.
Shea, P. J., Pickett, A. M., & Pelz, W. E. (2003). A follow-up investigation of “teaching
presence” in the SUNY learning network. Journal of Asynchronous Learning
Networks, 7(2), 61-80.
Stevens, D. D., & Levi, A. (2005). Introduction to rubrics: An assessment tool to save
grading time, convey effective feedback, and promote student learning. Sterling, VA:
Stylus Publishing.
Thomas, W. R., & Macgregor, S. K. (2005). Online project-based learning: How
collaborative strategies and problem solving processes impact performance. Journal of
Interactive Learning Research, 16(1), 83-107.
Tseng, S. T., Tsai, S. M., Su, T. H., Tseng, CH. L., & Wang, CH. I. (2005). Data mining.
Taipei, Taiwan: Flag Publishing.
Waeytens, K., Lens, W., & Vandenberghe, R. (2002). Learning to learn: Teachers’
conceptions of their supporting role. Learning and Instruction, 12, 305-322.
Wise, A., Chang, J., Duffy, T., & Valle, R. D. (2004). The effects of teacher social presence
on student satisfaction, engagement, and learning. Journal of Educational Computing
Research, 31(3), 247-271.
Wu, A. S., Farrell, R., & Singley, M. K. (2002). Scaffolding group learning in a
collaborative networked environment. In G. Stahl (Ed.), Computer support for collaborative
learning: Foundations for a CSCL community (pp. 435-444). Hillsdale, NJ: Lawrence
Erlbaum Associates.
Yada, K. (2007). CODIRO: A new system for obtaining data concerning consumer
behavior based on data factors of high interest determined by the analyst. Soft
Computing, 11(8), 811-817.
Zaiane, O. R. (2001). Web usage mining for a better web-based learning environment.
CATE: Proceedings of the International Conference on Advanced Technology for
Education. Banff, Alberta, 60-64.
Zhang, K., Peng, S. W., & Hung, J. L. (forthcoming). Online collaborative learning in a
project-based learning environment in Taiwan. Educational Media International.
Direct reprint requests to:
Jui-Long Hung
1910 University Drive
Dept. of Educational Technology
College of Education, Boise State University
Boise, ID 83725
e-mail: [email protected]