Teacher-moderated and student-moderated e-learning

J. EDUCATIONAL COMPUTING RESEARCH, Vol. 40(2) 183-210, 2009

EXAMINING ONLINE LEARNING PATTERNS WITH

DATA MINING TECHNIQUES IN PEER-MODERATED

AND TEACHER-MODERATED COURSES

JUI-LONG HUNG

Boise State University

STEVEN M. CROOKS

Texas Tech University

ABSTRACT

The student learning process is important in online learning environments.

If instructors can “observe” online learning behaviors, they can provide

adaptive feedback, adjust instructional strategies, and assist students in

establishing patterns of successful learning activities. This study used

data mining techniques to examine and compare learning patterns

between peer-moderated and teacher-moderated groups from a recently

completed experimental study (Zhang, Peng, & Hung, 2009). The online

behaviors of the students from the Zhang et al. study were analyzed to

determine why teacher-moderated groups performed significantly better than

peer-moderated groups. Three data mining techniques—clustering analysis,

association rule analysis, and decision tree analysis—were used for data

analysis. The results showed that most students in the peer-moderated

condition had low participation levels and relied on student-content

interaction only. On the other hand, teacher presence promoted student

interaction with multiple sources (content, student, and teacher). The findings

demonstrate the potential of data mining techniques to support teaching and

learning.

PROBLEM STATEMENT

One of the advantages of WWW-based instruction is that student learning

behaviors can be tracked and recorded as they occur. This capability enables


© 2009, Baywood Publishing Co., Inc.

doi: 10.2190/EC.40.2.c

http://baywood.com

instructors to provide adaptive feedback, adjust instructional strategies, and

assist students in establishing a pattern(s) of successful learning activities.

Many studies have examined teaching presence and community

(Anderson, Rourke, Garrison, & Archer, 2001; Shea & Bidjerano, 2009;

Shea, Li, & Pickett, 2006; Shea, Pickett, & Pelz, 2003), online learning styles

(Akdemir & Koszalka, 2008; Garland & Martin, 2005; Liu, Magjuka, & Lee,

2008; Neuhauser, 2002), and online learning processes (Fernandez, Marin, &

Wirz, 2007; Macdonald, 2003; Thomas & Macgregor, 2005). These studies relied

on self-report data as major data sources. Due to the concerns of response

distortions in self-report measurement and possible bias in interpretive coding

(Donaldson & Grant-Vallone, 2002), this approach cannot provide instructors

with correct information for decision-making in response to the dynamics of

online learning.

Instructors and researchers need a way to track and study online learning

activities as they are occurring. Server logs stored in a learning management

system (LMS, hereafter) are the most feasible data sources for studying online

learning behaviors and patterns. Currently, almost all learning management

systems provide basic statistics of student usage data such as frequency and

duration of access. These basic statistics derived from server logs can provide

an overview of an individual student or an online course. However, these

statistics cannot provide sequential learning behaviors to construct a learning

pattern for each student. Recent developments in data mining techniques have

provided new and better methods for researchers to extract knowledge from

raw data in the server logs. Such techniques have been widely used in

various business fields as tools to study consumer behaviors (Yada, 2007), to

predict profit/loss scenarios (Kirkos, Spathis, & Manolopoulos, 2007), and to

provide decision-making support (Michalewicz, Schmidt, Michalewicz, &

Chiriac, 2007).

Zhang, Peng, and Hung (2009) investigated performance and attitude differ-

ences between students in online peer-moderated and online teacher-moderated

discussions. Their results revealed that students in teacher-moderated groups

reported higher satisfaction and achieved higher final grades than students in

peer-moderated groups. However, the Zhang et al. study provided no

information about the en route learning behaviors that occurred in these two

conditions. This study re-examined the Zhang et al. study with data mining

techniques to better understand the types of online behavior that affect satisfaction

and performance. By using data mining techniques, instructors and researchers

can identify online learning behavior patterns and obtain information for

decision making. Researchers can apply these techniques to identify important

behavioral indicators for predicting learning performance. LMS vendors can

plan to integrate data mining functions with learning management systems to

complement online learning management, facilitation, and design.


PURPOSE OF STUDY

The purpose of this study was to re-examine the course used in the Zhang et al.

(2009) study. Data mining techniques were applied to extracted LMS server

logs to analyze and compare student online learning behaviors between peer-

moderated and teacher-moderated groups in an undergraduate online collabor-

ative project-based learning course in Taiwan.

RESEARCH QUESTIONS

This study aims to answer the following questions that grew from the Zhang

et al. (2009) study:

1. What overall similarities and differences in learning behaviors exist

between the peer-moderated and the teacher-moderated classes?

2. What successful and unsuccessful student characteristics are exhibited in

the peer-moderated and the teacher-moderated classes?

3. What unique daily learning patterns are exhibited within the peer-

moderated and the teacher-moderated classes?

4. What are the most important indicators for predicting student learning

performance?

DATA MINING TECHNIQUES

Roiger and Geatz (2003) regarded Data Mining (DM) as the process of

employing one or more computer-learning techniques to analyze and extract

knowledge from data contained within a database. There are two categories of

DM techniques (Chen, Sakaguchi, & Frolick, 2000): one is the use of descriptive

techniques (e.g., summarization, clustering, and association rules) to evaluate

hypotheses, and the other is often referred to as predictive techniques or

machine learning with artificial intelligence technologies (e.g., decision trees and

neural networks) used to build predictive models (Tseng, Tsai, Su, Tseng, &

Wang, 2005).

A brief introduction to these techniques follows.

Descriptive Techniques

1. Summarizing (Roiger & Geatz, 2003): A population of numerical data is

uniquely defined by a mean, a standard deviation, and a frequency or probability

distribution of values occurring in the data. For example, to compare student

performance between two classes, summarizing provides an overview of the data

before further knowledge exploration.

2. Clustering (Maimon & Rokach, 2005): Clustering is a technique used to

classify instances based on their similar characteristics. For example, to classify


student characteristics one might include gender, age, learning styles, and test

scores. The purpose of clustering analysis is to describe the student-shared

characteristics and not to build a relationship model between dependent and

independent variables.

3. Association rules (Roiger & Geatz, 2003): Association rule techniques, also

called market basket analysis (Maimon & Rokach, 2005), are used to discover

interesting associations among attributes contained in a database. The application

of association rule mining is to find related products by analyzing the content of

the customer’s market basket in order to find product associations. For example,

70% of female undergraduate students who own a desktop at home will also take

at least one online course. The purpose is to find associated products within the

set of offered products, as a support for marketing decisions.

Predictive Techniques

1. Decision trees (Breiman, Friedman, Olshen, & Stone, 1984): In data mining,

decision tree analysis is used as a predictive model to match observations with

outcomes. This predictive model is presented as a tree structure in which leaves

represent classifications and branches represent conjunctions of features that lead

to those classifications. For example, a teacher could use a decision tree to predict

student retention rate in an online class. After some literature reviews that identify

key factors influencing student retention, the teacher creates a decision tree using

learning styles, occupation, and age to predict the student retention rate.

2. Neural networks (Roiger & Geatz, 2003): A neural network is a mathe-

matical model that mimics biological neural networks. It consists of an inter-

connected group of artificial neurons and processes information using a con-

nectionist approach. A neural network is an adaptive system that changes its

structure based on external or internal information that flows through the network

during the learning phase. For example, if an online program wants to identify potential

students based on their demographic information, a neural network can “learn”

from existing students’ demographic data. Then the trained network can be used

to identify potential students.
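The core step of a decision tree, choosing the split that best separates the outcomes, can be sketched in a few lines. The example below is only an illustration: it assumes the Gini impurity criterion associated with Breiman et al. (1984), a single numeric feature, and hypothetical student records; the function names are not taken from any tool used in this study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature):
    """Find the threshold on one numeric feature that minimizes the
    weighted Gini impurity of the two resulting branches."""
    best = (None, gini(labels))  # (threshold, impurity) before splitting
    values = sorted({row[feature] for row in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: student age and whether the student stayed enrolled.
students = [{"age": 19}, {"age": 20}, {"age": 21},
            {"age": 34}, {"age": 38}, {"age": 45}]
retained = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(students, retained, "age")
print(threshold, impurity)
```

A full tree repeats this search over every feature and then recursively within each branch; the leaves hold the predicted classification.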

ONLINE LEARNING BEHAVIORS

Web Usage Mining (WUM, hereafter) is a set of data mining techniques used

to describe or predict online navigation patterns through weblog data. The extrac-

tion of students’ navigation patterns can be a valuable tool to understand how

students learn and to evaluate the design and effectiveness of the online learning

environment. WUM is composed of three core phases: data preprocessing, data

mining, and pattern analysis (Becker, Vanzin, Marquardt, & Ruiz, 2006). The

LMS provides an instructional platform consisting of content management, com-

munication tools, assignment submission systems, and various accessories such as


a white board, an online quiz, and a calendar. Unlike online consumer platforms,

LMS platforms make it inherently more difficult to obtain information

about user behavior. Typically, WUM can be categorized into two classes of user

behavior: usage and navigation (Becker et al., 2006). Usage behavior focuses

on how resources in the site are used to perform learning activities. It allows the

researcher to characterize student learning progress based on the contents they

read, the tools they used, and how these resources are combined to achieve

goals or acquire competencies. The navigation behavior focuses on investigating

frequent or infrequent usage paths and groups of students with similar access

characteristics. WUM has been applied widely in the e-commerce domain

(Riecken, 2000) and has been extended to other domains such as distance education

(Becker & Vanzin, 2003; Machado & Becker, 2003; Zaiane, 2001). However,

because an LMS builds up a distributed online learning environment with

many functions, analyzing usage and navigation behaviors in an e-learning

environment is more complicated than that in e-commerce.

Becker et al. (2006) investigated an intensive extracurricular course that

lasted 11 days and involved 15 students. The study focused on understanding

the potential for using two WUM tools (O3R and LogPrep) to analyze the

effectiveness of the web-based learning environment. The study also evaluated the

usability of these two tools within the educational data-mining domain. Three

phases of data mining procedures were conducted for data analysis: the pre-

processing phase, the data mining phase, and the pattern analysis phase.

The purpose of the preprocessing phase was to create datasets for data mining

analysis. The data mining phase applied association rules, clustering, and clas-

sification techniques. Association rules were used to find which content areas

students tended to access together, or which combination of tools students usually

explored during their learning processes. Clustering techniques grouped instances

with similar characteristics.

The final phase, pattern analysis, involved pattern retrieval and pattern inter-

pretation. A large number of interesting patterns were identified and interpreted

by the researchers who filtered the data using domain knowledge and the per-

ceived value of particular data points. The results of their study revealed that

WUM can be used to help monitor, understand, and evaluate student learning

behaviors. Becker et al. (2006) also concluded that the O3R is a suitable tool

for association and sequential association analysis, whereas LogPrep is more

suitable for performing preprocessing and clustering tasks.

METHODOLOGY

Participants

Ninety-eight college freshmen in 34 learning groups at a 4-year, vocational-track

university in Taiwan participated in a series of online learning activities for

6 weeks in a project-based learning environment. The teacher-moderated and

peer-moderated conditions each consisted of 17 groups. Within each condition, students

were randomly assigned to 15 groups of three and two groups of two. Each

group was asked to investigate and report on a different, ill-structured problem.

The ill-structured problems required knowledge and skills in survey design,

statistical analysis, advanced Excel applications, and report writing.

A demographic survey investigated participants’ prior online learning experience

and software skill knowledge of Microsoft Office and the Internet. The results

showed that 79.56% of the participants had not taken an online course, 12.24%

had taken one online course, and 8.2% had taken more than one online course. The

participants’ prior software skill knowledge scores ranged from 11 to 54 (1-20

novice; 21-40 skilled; 41-60 expert). The results revealed an average score

of 26.61, indicating a basic level of software skill knowledge. There were no

significant differences in prior online learning experience and software knowl-

edge between the teacher-moderated and the peer-moderated conditions. Partici-

pants in the peer-moderated condition had taken an average of 2.51 blended

or fully online courses and had an average score of 28.02 (skilled) on prior

software skill knowledge. Participants in the teacher-moderated condition had taken

an average of 2.53 blended or fully online courses and had an average score of 25.20

(skilled) on prior software skill knowledge.

Materials

The participants collaborated within an LMS with their group members to finish

an assigned project in 6 weeks. Each project consisted of a different ill-defined

problem (e.g., investigate students’ Internet use behaviors) that needed to be

answered through survey investigation. Groups were engaged in learning activ-

ities such as determining and defining the proper scope of a research topic,

designing and developing appropriate information gathering instruments, col-

lecting data with self-created instruments, analyzing data, and writing and pre-

senting a comprehensive report on the investigated phenomenon. Each group

had 2 weeks for questionnaire design and investigation, 2 weeks for statistical

analysis with Microsoft Excel, and 2 weeks for visual presentation and report

writing. All learning materials, such as computer-based software training videos,

instructional files, and task descriptions, could be downloaded through the LMS.

Students were assessed on a report task and a skill test (Zhang et al., 2009).

Two instructors followed a common evaluation rubric to evaluate the group

reports. In order to evaluate group critical thinking in the final report of their

collaborative learning activity, the rubric followed the principles of “Holistic

Critical Thinking” (Facione & Facione, 1994; Schamber & Mahoney, 2006;

Stevens & Levi, 2005). The final grade on the group reports came from the

average score given by two instructors and a peer evaluation score. The skill

test was evaluated with a standardized test called Techficiency Quotient Certification

(TQC). The TQC test was developed by the Computer Skill Foundation, an

organization that tests technical skills and issues certificates

(CSF, 2005). The TQC certification is very similar to the Microsoft Office User

Specialist (MOUS) certification.

Apparatus or Measures

The course was delivered online through Wisdom Master, the most widely

applied LMS in Taiwan’s higher education institutions. All learning materials,

communication tools, assignment submissions, and course announcements were

available in the LMS. All students’ online activities were recorded in the LMS

database system. All server logs, in the LMS database system, could be extracted

by Structured Query Language (SQL). The extracted data were cleaned and

processed, then stored in a compatible format for further analysis. Oracle 10G

Express and Microsoft Excel 2003 were used to perform these tasks. Analytical

data processes included descriptive and predictive techniques, which were performed

with SPSS¹, Knime², and Weka 3³. SPSS was used to execute descriptive

analysis and data exploration. Knime and Weka 3 were used to perform descriptive

and predictive analysis.

Procedure

The mining procedure consisted of three phases: data preprocessing, data

mining, and pattern analysis. These followed the Becker et al. (2006) WUM

processes.

Data Preprocessing Phase

The first step in the data preprocessing phase was to create reduced log files

by removing all the useless information (e.g., IP addresses and browser types)

present in the common log files of the LMS. The second step in the preprocessing

phase used a session filter that was applied to the reduced log files for feature

extraction. The aim of the session filter was to aggregate all user requests within

a session into a single set of features (variables). A session is a central notion

in network-mediated learning. A session is technically a sequence of web log

entries that reflects the interaction behavior of a learner within a period of active

study. In regular conditions, a session starts when a student begins interacting

with the LMS and ends when the student presses the exit button. However, under

some conditions, students might abort the session by closing the web browser,

or might leave the web browser open without doing anything. In such cases, the LMS

ended the session automatically after 20 minutes of inactivity.
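The session rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the filter used in the study: it treats any gap longer than 20 minutes between consecutive requests as a session boundary, does not model explicit logouts, and uses hypothetical timestamps and function names.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=20)  # inactivity limit described in the text


def split_sessions(timestamps, timeout=TIMEOUT):
    """Group one student's request times into sessions.

    A new session starts whenever the gap since the previous request
    exceeds the timeout (assumption: explicit logouts are not modeled).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)   # still inside the active session
        else:
            sessions.append([ts])     # first request, or the gap was too long
    return sessions


def session_features(session):
    """Aggregate one session into a single set of features."""
    return {
        "start": session[0],
        "end": session[-1],
        "hits": len(session),
        "duration": session[-1] - session[0],
    }


# Hypothetical request times for a single student on one day.
requests = [
    datetime(2008, 3, 3, 9, 0),
    datetime(2008, 3, 3, 9, 5),
    datetime(2008, 3, 3, 9, 12),
    datetime(2008, 3, 3, 10, 0),   # 48 minutes after the previous request
    datetime(2008, 3, 3, 10, 10),
]

sessions = split_sessions(requests)
features = [session_features(s) for s in sessions]
for f in features:
    print(f["start"], f["hits"], f["duration"])
```

Measuring duration from the first to the last request is a simplification; a session that ends by timeout has no recorded exit time, so its true length is unknown.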


¹ Further information can be found at http://spss.com/
² Further information can be found at http://www.knime.org/
³ Further information can be found at http://www.cs.waikato.ac.nz/ml/weka/

The session filter was used to extract the following primary and derived

variables: a user identifier, a session identifier, start date and times, end date and

times, user hit counts, and session duration times (Becker et al., 2006). These are

general variables that were extracted from reduced log files. In order to gain

insight into the learning processes that occurred, statistical variables such as

frequency of login, frequency of accessing course materials, number of messages

posted, number of messages read, frequency of synchronous discussion, and

reading duration were recorded. Identifying these statistical variables required

accumulating duration and frequency data of each student on a daily and weekly

basis. In addition, each student’s demographic information, such as technology

competency and prior online learning experience, was also collected. In addition to the

extracted variables, the cleaned server logs were used to conduct association analysis.
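Accumulating frequency data per student on a daily and weekly basis might look like the sketch below. The record layout, action names, and dates are hypothetical; the study's actual log schema is not described here.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical cleaned log records: (student_id, day, action).
records = [
    ("s01", date(2008, 3, 3), "login"),
    ("s01", date(2008, 3, 3), "access course materials"),
    ("s01", date(2008, 3, 3), "post on"),
    ("s01", date(2008, 3, 4), "login"),
    ("s01", date(2008, 3, 4), "post on"),
    ("s02", date(2008, 3, 3), "login"),
    ("s02", date(2008, 3, 10), "login"),
]


def count_actions(records, period):
    """Count each action per (student, period); `period` maps a date
    to the time bucket it belongs to."""
    counts = defaultdict(Counter)
    for student, day, action in records:
        counts[(student, period(day))][action] += 1
    return counts


# Daily buckets use the date itself; weekly buckets use (ISO year, ISO week).
daily = count_actions(records, period=lambda day: day)
weekly = count_actions(records, period=lambda day: tuple(day.isocalendar()[:2]))

for key, actions in sorted(weekly.items()):
    print(key, dict(actions))
```

Each resulting row (one student, one period) can then serve as an instance for clustering or decision tree analysis.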

Data Mining Phase

The whole data mining phase can be divided into two stages: the data

exploration stage and the knowledge exploration stage. The data exploration stage

in the current study overviewed populations of numerical data through descriptive

statistics necessary for gaining a better understanding of how to apply further data

mining techniques. Knowledge exploration involved applying both descriptive

and predictive techniques (clustering, association rules, and decision tree

analysis) to answer the research questions.

Pattern Analysis Phase

The pattern analysis phase includes interpretation and evaluation of the

results to identify valuable outcomes of the previous phase and needs the most

domain knowledge. Variables and factors were overviewed after the initial data

exploration. Mining results were compared with the literature on online col-

laborative learning for validation and evaluation. Conclusions were drawn from

analysis of the data and recommendations were made for future research on

online learning behaviors and educational data mining.

RESULTS

Two stages of the data mining phase (the data exploration stage and the

knowledge exploration stage) are discussed below. The data exploration stage

presents descriptive and inferential statistics about each of the variables used in

the current study. More sophisticated descriptive and predictive techniques are

combined in the knowledge exploration stage to obtain knowledge related to

the research questions in this study.


Data Exploration Stage

Based on previous research in online learning environments (Becker et al.,

2006), five frequency variables (course logins, instructional content accessed,

asynchronous messages posted, asynchronous messages read, and synchronous

discussions attended) were applied to compare differences between the peer-

moderated and teacher-moderated conditions.

Table 1 shows descriptive statistics for these variables across both conditions.

Participants in the teacher-moderated condition logged into the course, accessed

course content, and posted asynchronous messages more frequently than par-

ticipants in the peer-moderated condition (t(96) = –2.21, p = .03; t(96) = –2.78,

p = .006; and t(96) = –3.00, p = .003, respectively). While the number of messages

read was higher in the teacher-moderated condition, it was not significantly

different from the peer-moderated condition.
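The reported t statistics can be checked against the means and standard deviations in Table 1. The sketch below assumes a pooled-variance independent-samples t test (consistent with the reported 96 degrees of freedom) and 49 participants per condition; the function name is illustrative only.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic with a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


N = 49  # assumed participants per condition (98 in total)

# Peer-moderated values first, teacher-moderated values second (Table 1).
t_logins, df = pooled_t(29.35, 19.81, N, 37.78, 17.96, N)
t_posted, _ = pooled_t(21.41, 22.14, N, 38.86, 34.18, N)

print(f"course logins:   t({df}) = {t_logins:.2f}")
print(f"messages posted: t({df}) = {t_posted:.2f}")
```

Under these assumptions the computed values for course logins and messages posted round to the reported –2.21 and –3.00. The negative sign simply reflects that the peer-moderated mean is entered first.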

There was also no significant difference between the two conditions in syn-

chronous discussion frequency. In fact, students in both conditions seldom

utilized the synchronous communication tool (M = 2.31 and M = 2.39 over 6

weeks of activities), choosing instead to use the bulletin board (asynchronous

communication) for their class discussions.

Table 1. Descriptive Statistics for Frequency Variables for
Peer- and Teacher-Moderated Conditions

                                    Peer-moderated              Teacher-moderated
Variables                           M       SD      Skewness    M       SD      Skewness
Course logins                       29.35   19.81   1.09        37.78   17.96   0.24
Instructional content accessed      29.57   17.50   0.66        39.94   19.31   0.21
Asynchronous messages posted        21.41   22.14   1.72        38.86   34.18   1.64
Asynchronous messages read          41.98   39.80   1.23        46.45   36.55   0.92
Synchronous discussions attended     2.31    2.89   1.73         2.39    3.98   3.27

Note: All statistics represent frequencies, or the number of times variables were recorded
in the server logs.

The results also show that the course login and content access distributions
were positively skewed in the peer-moderated group, but relatively normal in the
teacher-moderated group. Furthermore, the distributions for messages posted,
messages read, and discussions attended were positively skewed in both conditions.
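A skewness statistic of the kind reported in Table 1 can be computed as sketched below. The paper does not state which formula was used; this example assumes the adjusted Fisher-Pearson coefficient that many statistics packages report, and the login counts are hypothetical.

```python
import math


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient (requires n >= 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5                            # unadjusted skewness
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)   # small-sample adjustment


# Hypothetical login counts: most students log in rarely, one logs in often.
right_tailed = [5, 8, 10, 12, 15, 18, 60]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive: long tail to the right
print(sample_skewness(symmetric))     # zero: no skew
```

A positive value indicates that most students cluster at the low end of the scale while a few very active students stretch the distribution to the right.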

All students were randomly assigned to either the peer-moderated or the

teacher-moderated condition. The expected distribution patterns should be similar

in both conditions. However, students in the peer-moderated condition logged

in and accessed content less frequently than those in the teacher-moderated

condition. This implies students in the peer-moderated condition had lower

learning motivation as compared with those in the teacher-moderated condition.

In discussion participation, students in both conditions had similar distribution

patterns. However, students in the teacher-moderated condition participated

in asynchronous discussion more actively than those in the peer-moderated

condition.

Knowledge Exploration Stage

The knowledge exploration stage applied three artificial intelligence methods:

cluster analysis, association analysis, and decision tree analysis. Cluster

analysis was used to classify participants by behavioral and performance charac-

teristics, association analysis was used to discover participant learning patterns,

and decision tree analysis was used to build a performance prediction model

for each condition.

Cluster Analysis

Cluster analysis (K-means) extends findings from the data exploration stage

by discovering data structures that exist across several variables within each

condition. Eleven variables were applied in the cluster analysis: Course logins,

instructional content accessed, asynchronous messages posted, asynchronous

messages read, asynchronous message reading time, synchronous discussions

attended, online learning experience, software experience, project grade, skill

grade, and final grade.
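The K-means procedure can be sketched as below. This is a bare-bones illustration rather than the implementation in Knime or Weka: it uses only two hypothetical variables (course logins and final grade), seeds the centroids from hand-picked points instead of a random start, and skips the standardization that variables on different scales would normally need.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its cluster, until stable."""
    centroids = list(initial_centroids)
    k = len(centroids)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]      # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:        # assignments no longer change
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical students described by (course logins, final grade).
points = [
    (50, 88), (54, 85), (48, 90),   # frequent logins, high grades
    (30, 72), (33, 75), (28, 70),   # moderate logins, average grades
    (15, 50), (18, 47), (12, 52),   # few logins, low grades
]

# Assumption: one seed per apparent group; real tools usually seed randomly.
centroids, clusters = kmeans(points, [points[0], points[3], points[6]])
for centroid, cluster in zip(centroids, clusters):
    print([round(c, 2) for c in centroid], len(cluster))
```

The cluster means printed at the end play the same role as the per-cluster means that a clustering report tabulates for each input variable.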

Table 2 shows the clustering results for the peer-moderated and the teacher-

moderated conditions. For simplification, the number of clusters has been limited

to three. In both peer- and teacher-moderated conditions, cluster 1 represents

superior performers, cluster 2 represents average performers, and cluster 3 repre-

sents poor performers (performing worse than cluster 1 and cluster 2 partici-

pants on almost all variables). While both conditions had low-performing par-

ticipants (i.e., cluster 3), the low performers in the teacher-moderated condition

had significantly higher final grades (M = 64.55) than the low performers in

the peer-moderated condition (M = 49.47), t(35) = –2.92, p = .006.

Table 2. Means for Cluster Results in the Peer-Moderated and Teacher-Moderated Conditions

                                    Peer-moderated condition                    Teacher-moderated condition
Input variables                     Cluster 1  Cluster 2  Cluster 3  All        Cluster 1  Cluster 2  Cluster 3  All
                                    (n = 7)    (n = 25)   (n = 17)   (n = 49)   (n = 6)    (n = 23)   (n = 20)   (n = 49)
Course logins                       52.43      30.96      17.47      29.35      47.17      49.87      21.05      37.78
Instructional content accessed      56.14      33.28      13.18      29.57      47.50      53.57      22.00      39.94
Asynchronous messages posted        55.86      21.24      7.47       21.41      109.33     37.22      19.60      38.86
Asynchronous messages read          113.43     45.08      8.00       41.98      20.83      77.22      18.75      46.45
Asynchronous message reading timeᵃ  13.01      5.10       0.76       4.72       1.58       9.89       1.81       5.58
Synchronous discussions attended    3.29       2.28       1.94       2.31       0.67       3.48       1.65       2.39
Online learning experience          2.14       2.60       2.53       2.51       2.33       2.78       2.30       2.53
Software experience                 32.00      27.64      26.94      28.02      24.50      26.26      24.20      25.20
Project grade                       95.57      69.96      28.47      59.22      87.83      86.48      54.05      73.41
Skill grade                         78.43      75.28      70.06      73.92      83.83      82.78      74.50      79.53
Final grade                         87.29      72.84      49.47      66.80      86.17      84.87      64.55      76.74

ᵃ Reading time was measured in hours.

To understand these performance differences, the skill grade and the
project grade of low performers in both conditions were compared. Results
revealed the major differences occurred in the project grade. A separate cluster
analysis conducted on the cluster 3 participants in the peer-moderated condition
suggests some reasons why this performance discrepancy may have occurred
(see Table 3). Table 3 reveals that the poor performers in the peer-moderated
condition were further categorized into two groups (i.e., clusters 3-1 and 3-2).
Comparing the results of the two cluster analyses provides further information
about the poor performers in the peer-moderated condition. In brief, cluster
3-1 suggests that the students may have had login problems and cluster 3-2
suggests that the students did not know how to solve practical problems or they
did not want to engage in learning activities.

For example, the learning behaviors of cluster 3-1 participants such as course

logins, instructional content accessed, and messages posted were significantly lower

than those of the cluster 3-2 participants. Moreover, most of these learning behaviors

occurred within the first 2 weeks. Table 3 shows that cluster 3-2 students

performed well on the software skill test (M = 83.91) relative to superior and

average performers in the peer-moderated condition (Ms = 78.43 and 75.28,

respectively; see clusters 1 and 2, Table 2). Interestingly, though, cluster 3-2

students received much lower project grades (M = 32) than these superior and

average performers (Ms = 95.57 and 69.96, respectively). There are at least

two possible reasons for these findings. One is that cluster 3-2 participants,

notwithstanding their skill with the software, had difficulty solving practical

problems as evidenced by their poor project grades. Another possibility is that

they lacked the motivation to engage in instructional activities that may have
improved their project grades. Evidence for this second conclusion is found in the
fact that cluster 3-2 participants were far less involved with accessing course
content and participating in discussions than their superior and average performing
counterparts.

Table 3. Means for Cluster Analysis of Poor-Performing Students
in the Peer-Moderated Condition

Input variables                     Cluster 3-1  Cluster 3-2  Cluster 3 Total
                                    (n = 6)      (n = 11)     (n = 17)
Course logins                       12.80        20.02        17.47
Instructional content accessed      10.80        14.09        13.18
Asynchronous messages posted        5.00         9.00         7.47
Synchronous discussions attended    0.80         2.55         1.94
Asynchronous message reading time   0.09         1.13         0.76
Asynchronous messages read          4.40         8.36         8.00
Project grade                       18.60        32.00        28.47
Skill grade                         53.60        83.91        70.06
Final grade                         34.00        58.09        49.47
Online learning experience          2.00         2.82         2.53
Software experience                 23.00        29.00        26.94

Association Analysis

Association rule techniques are used to discover interesting associations among

events contained in a database. In this study, association analysis was used to

compare the differences in learner behavior patterns between participants in

the peer-moderated and teacher-moderated conditions. The association analysis

used 7,926 server logs from the peer-moderated condition and 10,006 server

logs from the teacher-moderated condition. In this analysis a server log was

defined as a sequence of recorded participant learning behaviors determined

by beginning (login) and ending (logout) time stamps.

Support and confidence are two terms used to describe the association rules

discovered in an association analysis (see Table 4). Support refers to the propor-

tion of all server logs containing the sequence of learning behaviors defining the

rule. Confidence refers to the probability that the entire rule will be observed given

the occurrence of the first behavior in the sequence. For example, the association

rule: “go to student environment => access course materials” has a support score

of 34.71% and a confidence score of 88.18% (see Table 5, rule 2). In this example,

support means that 34.71% of the total server logs (i.e., 3,473 out of 10,006)

contained the sequence “go to student environment => access course materials.”


Table 4. Daily Association Rules in Peer-Moderated Groups

     Support  Confidence
     (%)      (%)         Counts  Rule
1    82.28    90.52       850     login => access course materials
2    82.19    91.09       849     goto student environmentᵃ => access course materials
3    61.18    67.31       632     login => goto student environment
4    20.23    58.87       209     post onᵇ => post on
5    14.71    49.67       152     login => login

ᵃ The student environment contains course announcements, personal messages, and
other personal information.
ᵇ “Post on” refers to posting a message on the asynchronous discussion board.

The confidence score means that the probability is .8818 that the learning behavior

“access course materials” will occur if the behavior “go to student environment”

has occurred on that day.
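The two measures can be illustrated with a minimal Python sketch. The logs below are invented for illustration and are not the study's data. The function names and the matching rule (the second behavior must appear somewhere after the first within the same log) are assumptions, since the text does not specify how sequences were matched.

```python
def contains_sequence(log, first, second):
    """True if `first` occurs in the log and `second` occurs at a later position."""
    if first not in log:
        return False
    start = log.index(first) + 1
    return second in log[start:]


def rule_metrics(logs, first, second):
    """Return (support, confidence) for the rule first => second."""
    with_first = sum(1 for log in logs if first in log)
    with_rule = sum(1 for log in logs if contains_sequence(log, first, second))
    support = with_rule / len(logs)            # share of all logs containing the rule
    confidence = with_rule / with_first if with_first else 0.0
    return support, confidence


# Hypothetical logs: each list holds one participant's recorded behaviors.
logs = [
    ["login", "goto student environment", "access course materials"],
    ["login", "access course materials"],
    ["login", "goto student environment", "post on"],
    ["login", "goto student environment", "access course materials", "post on"],
]

support, confidence = rule_metrics(
    logs, "goto student environment", "access course materials"
)
print(f"support = {support:.2%}, confidence = {confidence:.2%}")
# support = 50.00%, confidence = 66.67%
```

Here two of the four logs contain the full sequence (support = 50%), and two of the three logs containing “goto student environment” go on to “access course materials” (confidence = 66.67%).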

Tables 4 and 5 list daily association rules (with accompanying support and

confidence measures) for peer-moderated and teacher-moderated conditions

respectively. These rules represent students’ daily learning behavior patterns.

For example, Table 4, rule 4: “post on => post on” indicates that students in the

peer-moderated condition posted messages twice on the same day with support

and confidence scores of 20.23% and 58.87%, respectively. These postings may

have occurred in succession or at different times during the day.

In Table 4, Rules 1 and 3 reveal that a very high percentage (support = 82.28%

and 61.18%, respectively) of server logs from the peer-moderated condition

included either the behavior sequence “login => access course materials,” or

“login => go to student environment.” These rules also show high probabilities

(confidence = 90.52% and 67.31%, respectively) that, on a given day, students in

the peer-moderated condition accessed either the course materials or the student

environment after login. In addition, Rule 2 shows strong support (82.19%) and

confidence (92.09%) for accessing course materials after going to the student

environment section. In summary, Rules 1 through 3 reveal that participants in


Table 5. Daily Association Rules in Teacher-Moderated Groups

No.   Support (%)   Confidence (%)   Counts   Rule
1     2.67          10.06            31       login => access course materials
2     34.71         88.18            403      goto student environment => access course materials
3     2.93          11.04            34       login => goto student environment
4     21.10         93.16            245      post onᵇ => post on
5     18.35         86.94            213      post on => post on => post on
6     19.64         82.61            228      login => login => login
7     32.39         84.88            376      access course materials => access course materials
8     32.13         81.62            373      goto student environment => goto student environment

the peer-moderated condition spent the most time in course materials and student

environment (to check announcements or read personal messages and statistics)

sections of the course.

While the same association rules found in the peer-moderated condition can be

found in the teacher-moderated condition (see Rules 1 through 3 in Table 5),

the support and confidence measures for these rules are significantly lower in

the teacher-moderated condition. Participants in the teacher-moderated condi-

tion experienced a much broader array of activity patterns than peer-moderated

participants. This is illustrated in Rules 4 through 8 in Table 5, where it can

be observed that teacher-moderated participants often worked on the same

activity more than once a day. While participants in the peer-moderated condition

exhibited some of these same learning patterns, the support and confidence

measures are much lower. In summary, it appears that teacher-moderated partici-

pants were more motivated to experience a broader array of learning activities

than peer-moderated participants. Peer-moderated participants tended to focus

primarily on accessing course materials and checking personal records. The

implications of the results will be treated in the discussion section.

Decision Tree Analysis

In this study, the CHAID decision tree analysis (Kass, 1980) was used to build

a performance prediction model in both conditions. Ten independent variables

(course logins, instructional content accessed, asynchronous messages posted,

asynchronous messages read, asynchronous message reading time, synchronous

discussions attended, project grade, skill grade, online learning experience,

and software experience) and one dependent variable (final grade) were applied

to the analysis.

Figure 1 shows the results of the decision tree analysis for the peer-moderated

condition. Number of asynchronous messages read (MessagesRead) was the

most important variable for predicting the final grades of participants in this

condition. Those participants reading more than 33 messages throughout the

entire course on the course discussion board received 44% higher final grades

(M = 79.17) than those reading 33 messages or less (M = 54.92). Among those

receiving higher final grades, frequency of accessing instructional content

(FreqContent) was an important variable in differentiating this group. As shown

in Figure 1, participants who accessed the instructional content section of

the course more than 48 times received 21% higher final grades (M = 87.57)

than those who accessed the instructional content section less than 39 times

(M = 72.27). Among those receiving lower final grades, number of asynchronous

messages posted (MessagesPosted) and prior software experience (SoftExp)

were important differentiating variables. Participants posting more than three

messages and receiving software experience ratings greater than 28 received 79%

higher final grades (M = 66.3) than those posting three messages or less (M = 37).
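How a tree arrives at a cut point such as “more than 33 messages” can be illustrated with a simplified sketch. This is not CHAID: it searches a single predictor for the cut point that minimizes the within-group squared error of the grades, a criterion chosen here only for brevity. Candidate cut points are midpoints between adjacent observed values, which is one way half-point thresholds can arise. The data and the function name `best_split` are hypothetical.

```python
def best_split(values, grades):
    """Find the cut point on one predictor that minimizes within-group squared error."""
    def sse(ys):
        mean = sum(ys) / len(ys)
        return sum((y - mean) ** 2 for y in ys)

    distinct = sorted(set(values))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2  # midpoint between adjacent observed values
        left = [g for v, g in zip(values, grades) if v <= cut]
        right = [g for v, g in zip(values, grades) if v > cut]
        error = sse(left) + sse(right)
        if best is None or error < best[1]:
            best = (cut, error, sum(left) / len(left), sum(right) / len(right))
    return best  # (cut point, total squared error, mean below, mean above)


# Hypothetical data: messages read and final grade for eight participants.
messages_read = [5, 12, 20, 33, 34, 41, 55, 60]
final_grade = [40, 52, 55, 58, 75, 80, 78, 84]

cut, error, mean_low, mean_high = best_split(messages_read, final_grade)
print(f"split at {cut}: mean grade {mean_low:.2f} below, {mean_high:.2f} above")
# split at 33.5: mean grade 51.25 below, 79.25 above
```

A full tree would repeat this search within each resulting group and across all predictors, and CHAID would use significance tests rather than squared error to decide whether and where to split.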


These results suggest that students in peer-moderated learning situations

may benefit from spending considerable time reading the messages posted by

other students. This benefit appears to be enhanced when these students invest

considerable time studying the course instructional content. Interestingly, the

number of messages posted does not appear to be as important as the number of

messages read for students working in peer-moderated groups.

Figure 2 shows the results of the decision tree analysis for the teacher-

moderated condition. Frequency of accessing instructional content (FreqContent)

was the most important variable for predicting the final grades of participants in

this condition. Those participants who accessed the instructional content section

of the course 37 or more times received 30% higher final grades (M = 86.15)

than those who accessed the instructional content section less than 37 times

(M = 66.09). Among those with higher final grades, the participants who had

significant software experience and who were active in posting on the course

discussion board (more than 36 posts) received 24% higher final grades (M = 94.11)

than those with less software experience (M = 76.13). For those participants

receiving lower final grades, the number of messages posted was an important

differentiating variable. Participants posting more than 14 messages received

37% higher final grades (M = 72.87) than those posting 14 messages or less

(M = 53.38).
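The percentage comparisons reported for both trees are consistent with a relative difference between group means, computed as (higher mean - lower mean) / lower mean. That reading is inferred from the reported numbers rather than stated in the text. A short check, using only the means given above:

```python
def percent_higher(higher_mean, lower_mean):
    """Relative difference between two group means, as a percentage of the lower mean."""
    return (higher_mean - lower_mean) / lower_mean * 100


# Group means as reported in the text for Figures 1 and 2.
comparisons = {
    "peer: messages read > 33 vs. <= 33": (79.17, 54.92),
    "peer: content accessed > 48 vs. < 39": (87.57, 72.27),
    "peer: posted > 3 with SoftExp > 28 vs. posted <= 3": (66.3, 37),
    "teacher: content accessed >= 37 vs. < 37": (86.15, 66.09),
    "teacher: high vs. lower software experience": (94.11, 76.13),
    "teacher: posted > 14 vs. <= 14": (72.87, 53.38),
}

for label, (high, low) in comparisons.items():
    print(f"{label}: {percent_higher(high, low):.0f}% higher")
```

Rounded to whole percentages, these reproduce the reported figures of 44%, 21%, 79%, 30%, 24%, and 37%.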


Figure 1. Decision tree for peer-moderated groups.

These results suggest that students in teacher-moderated conditions may benefit

most from accessing instructional content and posting messages to course dis-

cussion boards. Prior knowledge in the content area also appears to be an impor-

tant predictor of success.

In summary, the results of the decision tree analysis suggest that frequency

of accessing instructional content (FreqContent), number of asynchronous

messages read (MessagesRead), frequency of posting asynchronous messages

(FreqPosting), and software experience (SoftExp) were four predictors of student

performance in this study.

DISCUSSION

This section is organized around the four research questions pertaining to this

study. Each of the research questions is addressed with findings from this study

along with relevant literature from the field of online collaborative learning.

What Differences in Learning Behaviors Exist Between the

Peer-Moderated and the Teacher-Moderated Conditions?

The results from the data exploration stage showed that participants in the

teacher-moderated condition logged in, accessed course content, and posted


Figure 2. Decision tree for teacher-moderated groups.

asynchronous messages more frequently than participants in the peer-moderated

condition (see Table 1). These results suggest that teacher-moderated participants

were more actively involved, and may have experienced more motivation and

satisfaction, in the course than peer-moderated participants.

These findings are consistent with several survey research studies that have

examined the relationship between teacher presence and student motivation (see

Christensen & Menzel, 1998; Christophel, 1990; Christophel & Gorham, 1995;

Frymier, 1993) and between teacher presence and course satisfaction (Dziuban

& Moskal, 2001; Kreijns, Kirschner, & Jochems, 2003; Richardson & Swan,

2003; Wise, Chang, Duffy, & Valle, 2004). All but one of the studies reviewed

found that teacher presence was positively related to either student motivation or

course satisfaction. For example, Christophel (1990) investigated the relationship

between immediacy (a form of teacher presence emphasizing behaviors such as

smiles, head nods, use of inclusive language, and eye contact) and student state

motivation in college classes. The study found significant relationships between

learning and both immediacy and motivation. Immediacy was found to modify

motivation, which led to increased learning. Dziuban and Moskal (2001) analyzed

the responses from 52,218 questionnaires, which focused on investigating

students’ course satisfaction and perception of level of learning in courses offered

in web-based (totally online), mixed-mode (part online and part face-to-face),

web-supported (website as supplement), and face-to-face formats. They found

high correlations between the quantity and quality of teacher-student interaction

and students’ perception of their level of learning and satisfaction in all four

types of courses.

The one study with contrary findings (Wise et al., 2004) investigated the

relationship between teacher presence and student satisfaction, engagement, and

learning performance. The authors define engagement as active involvement in

knowledge construction, peer interaction, and peer collaboration. The results

indicated that teacher presence was related to peer interaction and students’

perception of the instructor but was not related to perceived learning, satisfaction,

engagement, or the quality of their final course product. The experimental findings

from the initial analysis of data from the current study (Zhang et al., 2009) support

the majority of correlational research in this area. Zhang’s study was the only

experimental study identified that compared student perceptions and perform-

ance differences between peer-moderated and teacher-moderated conditions.

They found that teacher presence influenced students’ performance. In addition,

a posttest attitude survey indicated that students in peer-moderated groups

were unsatisfied with their online collaborative learning experience after 6 weeks.

In general, these students felt that online collaboration and discussion were a

waste of time and that collaborative learning does not help students develop

quality projects.

This study reanalyzes Zhang’s data beyond students’ perceptions and posttest

performance. Its results, obtained through data mining, support previous findings


that teacher presence motivated students. However, this study also discovered that

relying only on perceptions may lead to misguided conclusions. For example,

based on the posttest attitude survey results, students perceived that peer support

was useless during collaboration in the peer-moderated condition. However,

the cluster analysis showed that the superior performers (cluster 1 students) in the

peer-moderated condition read a high number of messages, spent a lot of time

reading messages, and posted a high number of messages to the discussion board

(see Table 2). Interestingly, the decision tree analysis showed that the number

of messages read is the most important variable for predicting performance in

the peer-moderated condition. Students who read over 33.5 messages throughout

the 6 weeks of collaborative activities received a much higher grade

(M = 79.17) than those who read fewer messages (M = 54.92; see Figure 1).

These results from data-mining show that students in the peer-moderated con-

dition can actually benefit from peer collaboration. However, there were only

seven students who benefited from peer support. Later sections will discuss

detailed learning behaviors in both conditions that can provide more evidence

of teacher presence effects.

What Successful and Unsuccessful Student

Characteristics are Exhibited in the Peer-Moderated

and the Teacher-Moderated Conditions?

Successful and unsuccessful students were defined by their final grade

performance. High or low participation level was defined by relatively higher or

lower value on learning behavior variables. The following section discusses the

characteristics of successful, average, and unsuccessful students within the

peer-moderated and the teacher-moderated conditions.

Characteristics of Successful Students

In both conditions, cluster 1 represents the most successful students. Table 2

shows that the cluster 1 students in both conditions have relatively higher values

on almost all variables, evidencing active participation in online activities. This

finding is consistent with other studies that have found that students with high

participation levels have the highest achievement scores (Beaudoin, 2002;

Picciano, 2002). The current study used six behavioral variables to depict student

participation level (see Table 2). However, Beaudoin’s and Picciano’s studies

each used just one variable to define students’ participation level. Beaudoin (2002)

divided an online class into three groups (high interaction, moderate interaction,

and low interaction) based on time spent in course-related activity. Picciano

(2002) defined interaction level by the number of postings. Therefore, the current

study provides a more comprehensive measure of interaction level. Moreover,

Picciano (2002) reported that student perceptions are not consistent with

actual postings. He found that the low-interaction group perceived themselves as


having made a higher number of postings than they actually did, and the high-

interaction group perceived themselves as having made fewer postings than

they actually did. Again, this finding supports the previous discussion, in

which perceptions of participants may provide incorrect information and lead

to misguided conclusions.

Although cluster 1 students in both conditions represented students who per-

formed well, there were important differences between the conditions. While

the six cluster 1 students in the teacher-moderated condition had the highest

final grade (M = 86.17), they read fewer messages, spent less time reading

messages, and frequented synchronous discussions less than their cluster 2 and 3

counterparts in the same condition. On the other hand, except for prior online

learning experience, cluster 1 participants in the peer-moderated condition

possessed higher values on all variables. These results indicate that cluster 1

participants in the teacher-moderated condition may have learned more efficiently

than all other participants. However, the data mining results do not provide

enough information to draw a firm conclusion. Similar results were also found

by Beaudoin (2002), who reported that the low-interaction group outperformed

the moderate-interaction group with no apparent explanation. These cluster 1

participants in the teacher-moderated condition are worthy of further investigation

on their learning strategies.

Characteristics of Unsuccessful Students

Cluster 3 in both conditions represents low performers who did not actively

participate in online activities. To understand them further, cluster 3 students in the

peer-moderated condition were reclassified. The results (see Table 3) revealed

six students in cluster 3-1 who may have had log-in problems. These six students

had few records on the LMS. In addition, we found several login failure messages

in the beginning of the online course on these students’ web logs. Cluster 3-2

consists of 11 students. These students had relatively high prior experience

(M = 29) and high scores on the software skills test (M = 83.91). However,

their low project grades (M = 32) indicate that they may have had difficulty in

applying their software skills to practical problems. These findings can help

instructors to develop different instructional strategies for participants in cluster

3-1 and cluster 3-2. For example, cluster 3-1 students can be given more technical

support to help them with log-in problems (Ashton, Roberts, & Teles, 1999). For

cluster 3-2 participants, the instructor can provide pedagogical support for skill

application (Ashton et al., 1999; Waeytens, Lens, & Vandenberghe, 2002).

Cluster 1 participants in the peer-moderated condition had relatively higher

values on all variables except for prior online learning experience. On the other

hand, cluster 3 students in the peer-moderated condition had higher than average

prior online experience. These results indicate that prior online learning experi-

ence may not influence performance. Bernard et al. (2004) obtained similar


findings. In their survey investigation they found that student confidence about

basic prerequisite skills did not influence final performance.

What Unique Daily Learning Patterns are Exhibited

Within the Peer-Moderated and the Teacher-Moderated

Conditions?

Daily Learning Patterns in the Peer-Moderated Condition

Association rules revealed daily learning patterns in both conditions. Table 4

shows that the major learning activities in the peer-moderated condition were:

1. “login => access course materials”;

2. “goto student environment => access course materials”; and

3. “login => goto student environment.”

These results indicate that more than 80% of the learning activities were reading

the course materials and checking course announcements and personal records.

Content-student interaction, as opposed to student-student interaction, was the

primary source of content knowledge for students in the peer-moderated con-

dition. The posttest survey also supported this inference. Students indicated a

lack of trust for information on the bulletin board. These students regarded

the information from peers as useless (Zhang et al., 2009). However, relying on the

course materials was insufficient to achieve a high-quality project.

Peer support has been given a fair amount of attention in the literature (e.g.,

Arvaja, Häkkinen, Eteläpelto, & Rasku-Puttonen, 2000; Wu, Farrell, & Singley,

2002). Salomon and Perkins (1998) noticed that a teacher’s objective is to

facilitate learning, but peers working together aim for task accomplishment, so the

goal of human support may vary with the person doing the support. This may

explain why students in the peer-moderated condition held negative attitudes

toward peer support. The quality of peer support varies. In the current study most

of the messages posted to the discussion board consisted of peer encouragement

(Zhang et al., 2009). While there were some useful messages scattered over

different threads in the bulletin board, participants in the peer-moderated

condition needed to read several messages to identify useful information.

Daily Learning Patterns in the Teacher-Moderated Condition

Table 5 lists common daily learning patterns in the teacher-moderated condi-

tion. Learning patterns in the teacher-moderated condition showed very different

results from those of the peer-moderated condition. While teacher-moderated

participants had the same top three rules as in the peer-moderated condition

(“login => access course materials,” “goto student environment => access course

materials,” and “login => goto student environment”), these rules had far lower

support and confidence in the teacher-moderated condition. The highest support


rating for teacher-moderated participants was 34.71%, while the highest support

rating for peer-moderated participants was 82.28%. The results show that

teacher presence promoted student participation in a wide variety of learning

activities. Therefore, while the support ratings for each learning activity were

lower, teacher presence tends to facilitate student interaction with multiple sources

(content, student, and teacher).

Ashton et al. (1999) defined four categories of support in online collaborative

environments: pedagogical, social, managerial, and technical. Pedagogical sup-

port includes all attempts to assist in reaching a particular learning objective,

such as providing feedback, instructions, information, opinions, preferences,

advice, questions, summaries, comments, or referring to outside sources. Social

support includes all attempts to make students comfortable and promote inclu-

sion, such as using empathy, meta-communication, humor, or performing inter-

personal outreach. Managerial support includes all attempts to coordinate assign-

ments, discussions, and course activities. Technical support includes assistance

to students in using the course delivery software. It focuses on solving user and

technical issues.

Lund (2004), based on Ashton’s categories, listed possible types of support

given by and to different participants in a computer supported collaborative

environment. Table 6 shows Lund’s conclusions.

Table 2 provides insight into student performance differences. There are no

significant differences in the skill grades between the peer-moderated and the

teacher-moderated conditions. The major performance difference occurs in their

project grades. Zhang et al. (2009) also found that students in a peer-moderated

condition tended to imitate each other. On the other hand, students in a


Table 6. Possible Types of Support Given by and to Different CSCL Participants

                     Support given by
Support given to     Student                  Tutor/teacher            Technical expert
Student              Pedagogical, social,     Pedagogical, social,     Technical
                     managerial, technical    managerial, technical
Tutor/teacher                                 Meta-pedagogical,        Technical
                                              Meta-social,
                                              Meta-managerial,
                                              Meta-technical
Technical expert                              Meta-technical           Technical

teacher-moderated condition were more creative in their final reports. According

to Table 6 and Ashton’s categories, if students have multiple sources of inter-

action, they can obtain different types of support for achieving higher per-

formance. Students in a peer-moderated condition tend not to rely on peer support

because they need to spend considerable effort to find useful information. There-

fore, students in the peer-moderated condition mainly relied on learner-content

interaction, which can cover only part of the pedagogical, managerial, and tech-

nical support. This may be why they did not perform well, even if they possessed

the required software skills.

Moreover, students in the teacher-moderated condition tended to log in, to

read course materials, or to post messages several times on the same day (Table 5,

rule 4 to rule 8). Students in the peer-moderated condition had fewer same-day

logins (Table 4, rule 4 and rule 5 show their lower frequencies). The support and

confidence of rule 4 and rule 5 in the peer-moderated condition are also far

lower than rule 4 to rule 8 in the teacher-moderated condition (Tables 4 and 5).

Association rules revealed that teacher presence promoted learning variety and

active participation.

What are the Most Important Indicators for

Predicting Student Learning Performance?

Performance Predictors for Peer-Moderated Learning

A Decision Tree analysis produced a predictive model for each condition.

Figure 1 indicates that the most important variable for performance prediction in

the peer-moderated condition is the number of bulletin board messages read. The

right branch of Figure 1 shows that students who read more than 33.5 messages

during 6 weeks of activities received an average final grade of 79.17. Therefore,

reading discussions on the bulletin board was the most valuable activity for

improving performance among participants in the peer-moderated condition.

Again, the results did not correspond to the posttest survey results (Zhang et al.,

2009). The posttest survey results showed that students in the peer-moderated

conditions did not feel that messages on the bulletin board were useful. Since

surveys do not provide sufficient information for online learning research, more

tools such as data mining are needed to support online teaching and research.

The other important variable in the right branch of Figure 1 is the frequency

of accessing course materials. The peer-moderated participants who read more

than 33.5 messages and accessed course materials more than 38.5 times received

an average final grade of 85. Those whose frequency of accessing course materials

increased to 48.5 received an average final grade of 87.57. These results indicate

that student-content interaction is the most important part of the peer-moderated

condition. This may be because peers can provide metacognitive support.

However, students need to deal with a large amount of information and to extract


useful information from many discussions. Perhaps that is why only 13 students

performed well (final grade average = 85) in the peer-moderated condition.

The left branch of Figure 1 shows how critical participating in online discussion

is to success in the course. Six participants in the peer-moderated condition who

read and posted the fewest messages (fewer than 33.5 and 3.5 respectively) had

the lowest final grades (M = 37). Students who read fewer than 33.5 messages

but posted more than 3.5 messages earned a better final average of 60.58. Prior

software knowledge is another important variable, as shown in the left branch.

For those who posted more than 3.5 messages in 6 weeks, the participants with

high prior software knowledge (>= 28.5) performed better than those with low

prior software knowledge. These results imply that active participation with peers

(reading and posting messages) is important for better performance. For the

students with low participation levels, the only thing they could rely on was

their prior knowledge.

Performance Predictors for Teacher-Moderated Learning

Figure 2 shows a predictive model for the teacher-moderated students. The most

important variable influencing their final grades was the frequency of accessing

course materials. Twenty-six participants who accessed the course materials

more than 37 times received an average final grade of 86.15. Participants in this

group that had high prior software knowledge received average final grades of

90.61. Because the standardized software test covered the content of the whole

semester, the results reflected that the 6 weeks of activities did not cover all

the scope of the software standardized test. Therefore, students’ prior software

knowledge still influenced their performance.

Comparing the right branch of Figures 1 and 2 (high performing students)

shows that participants in the peer-moderated condition needed to access course

materials (>= 48.5) more than those in the teacher-moderated condition (>= 37)

to achieve higher performance. In addition, even peer-moderated participants

with low prior software knowledge (< 21.5) performed better than those having

higher prior software knowledge (>= 28.5) when they access course materials

more frequently. These results show that various interactions are needed in

online collaborative environments. Relying on only a single interaction makes it

hard to improve learning. Students need to participate actively in order to obtain

problem-solving skills from various interactions.

CONCLUSION

This study revealed that teacher presence, as opposed to just peer presence,

makes a significant difference in influencing students’ learning behaviors.

Students participated more actively and more variously when a teacher was

present. However, the results also show that an online peer-moderated course can

produce successful learning outcomes when multiple student interactions are


facilitated. For courses using asynchronous communication, this study provides an

important performance-predictive model. It may seem unreasonable to ask students to

post and to read a specific number of messages weekly. However, these simple

principles can be embedded and tested within a learning management system.

Once they have been validated, these principles can help to simplify instructional

design and online teaching.

In terms of professional development, more and more people use commer-

cially produced online training modules and conduct self-paced learning. These

products feature training materials and discussion forums for their users. The

situation is very similar to a peer-moderated condition. Unless users consistently

maintain a high quality of discussion and strong self-motivation, they will find

it hard to achieve high performance. The results also provide information for

improving online university coursework which has online instructors guiding

the learning process.

In this study, quantified behavioral data has demonstrated the potential of

data mining for supporting online teaching and research. Artificial intelligence

methods such as clustering, association rule analysis, and decision tree analysis are

unique and useful tools for online teaching and research. This study has discussed

the possibility of applying different instructional strategies after clustering

analysis, especially for lower-achieving students. Association rules clearly can

help instructors to understand students’ learning patterns. Decision trees can

be used to build a predictive model and predict student performance. Educational

data mining supports and simplifies online teaching and learning. This study

also shows the possibility of building a predictive model for online learning. Once

the model has been verified and integrated into the learning management system

(LMS), the LMS can pop up reminders, messages, and warnings for students

and instructors based on the model built. This can save instructors time and

energy and enable them to put their efforts toward facilitation of learning.
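As a minimal sketch of how such reminders might be generated, the cut points reported for the two trees (33.5 messages read and 3.5 messages posted in the peer-moderated condition, 37 content accesses in the teacher-moderated condition) can be written as simple rules. The `Activity` record, the `reminders` function, and the message wording are hypothetical, and no particular LMS interface is implied. The cut points would first need the validation described above.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Hypothetical per-student activity counts drawn from LMS logs."""
    messages_read: int
    messages_posted: int
    content_accesses: int


def reminders(activity, teacher_moderated):
    """Return reminder messages based on cut points reported in Figures 1 and 2."""
    notes = []
    if teacher_moderated:
        # Figure 2: fewer than 37 content accesses was tied to lower grades.
        if activity.content_accesses < 37:
            notes.append("Review the course materials more often.")
    else:
        # Figure 1: messages read was the first split, messages posted the next.
        if activity.messages_read < 33.5:
            notes.append("Read more discussion board messages.")
            if activity.messages_posted < 3.5:
                notes.append("Post to the discussion board.")
    return notes


student = Activity(messages_read=20, messages_posted=2, content_accesses=15)
print(reminders(student, teacher_moderated=False))
# ['Read more discussion board messages.', 'Post to the discussion board.']
```

The cut points describe totals over the 6 weeks of activities, so a reminder issued during a course would need to scale them to the time elapsed. That step is omitted here.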

REFERENCES

Akdemir, O., & Koszalka, T. A. (2008). Investigating the relationships among instructional

strategies and learning styles in online environments. Computers & Education, 50(4),

1451-1461.

Anderson, T., Rourke, L., Garrison, D. R., & Archer, W. (2001). Assessing teaching

presence in a computer conference context. Journal of Asynchronous Learning

Networks, 5(2), 1-17.

Arvaja, M., Häkkinen, P., Eteläpelto, A., & Rasku-Puttonen, H. (2000). Collaborative

processes during report writing of a science learning project: The nature of discourse as

a function of task requirements. European Journal of Psychology of Education, 15,

455-466.

Ashton, S., Roberts, T., & Teles, L. (1999). Investigating the role of the instructor in

collaborative online environments. Poster session presented at the CSCL ‘99

Conference, Stanford University, CA.


Beaudoin, M. F. (2002). Learning or lurking: Tracking the “invisible” online student.

The Internet and Higher Education, 5(2), 147-155.

Becker, K., & Vanzin, M. (2003). Discovering interesting usage patterns in web-based

learning environments. Proceeding of the International Workshop on Utility, Usability

and Complexity of e-Information Systems, 57-72.

Becker, K., Vanzin, M., Marquardt, C., & Ruiz, D. (2006). Applying web usage mining

for the analysis of behavior in web-based learning environments. In C. Romero &

S. Ventura (Eds.), Data mining in e-learning (pp. 117-137). Billerica, MA: WitPress.

Bernard, R. M., Brauer, A., Abrami, P. C., & Surkes, M. (2004). The development of a

questionnaire for predicting online learning achievement. Distance Education, 25(1),

31-47.

Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and

regression trees. Monterey, CA: Wadsworth International Group.

Chen, L. D., Sakaguchi, T., & Frolick, M. N. (2000). Data mining methods, applications,

and tools. Information Systems Management, 17(1), 65-70.

Christensen, L. J., & Menzel, K. E. (1998). The linear relationship between student reports

of teacher immediacy behaviors and perceptions. Communication Education, 47(1),

82-90.

Christophel, D. M. (1990). The relationships among teacher immediacy behaviors, student

motivation, and learning. Communication Education, 39(4), 323-340.

Christophel, D. M., & Gorham, J. (1995). A test-retest analysis of student motivation,

teacher immediacy, and perceived sources of motivation and demotivation in college

classes. Communication Education, 44(4), 292-306.

CSF. (2005). Techficiency Quotient Certification (TQC). Retrieved January 27, 2009

from http://www.tqc.org.tw/TQC/index.asp

Donaldson, S. I., & Grant-Vallone, E. J. (2002). Understanding self-report bias in organi-

zational behavior research. Journal of Business and Psychology, 17(2), 245-260.

Dziuban, C., & Moskal, P. (2001). Evaluating distributed learning at metropolitan univer-

sities. Educause Quarterly, 24(4), 60-61.

Facione, P. A., & Facione, N. C. (1994). The holistic critical thinking scoring rubric.

Millbrae, CA: The California Academic Press.

Fernandez, J., Marin, R., & Wirz, R. (2007). Online competitions: An open space to

improve the learning process. IEEE Transactions on Industrial Electronics, 54(6),

3086-3093.

Frymier, A. B. (1993). The relationships among communication apprehension, immediacy

and motivation to study. Communication Reports, 6(1), 8-17.

Garland, D., & Martin, B. N. (2005). Do gender and learning style play a role in how

online courses should be designed? Journal of Interactive Online Learning, 4(2),

67-81.

Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical

data. Applied Statistics, 29, 119-127.

Kirkos, E., Spathis, C., & Manolopoulos, Y. (2007). Data mining techniques for

the detection of fraudulent financial statements. Expert Systems with Applications,

32(4), 995-1003.

Kreijns, K., Kirschner, P. A., & Jochems, W. (2003). Identifying the pitfalls for social

interaction in computer-supported collaborative learning environments: A review of

the research. Computers in Human Behavior, 19, 335-353.


Liu, X. J., Magjuka, R. J., & Lee, S. H. (2008). The effects of cognitive thinking styles,

trust, conflict management on online students’ learning and virtual team performance.

British Journal of Educational Technology, 39(5), 829-846.

Lund, K. (2004). Human support in CSCL: What, for whom, and by whom? In J. W.

Strijbos, P. A. Kirschner, & R. L. Martens (Eds.), What we know about CSCL

(pp. 167-198). Norwell, MA: Kluwer Academic Publisher.

Macdonald, J. (2003). Assessing online collaborative learning: Process and product.

Computers & Education, 40(4), 377-391.

Machado, L., & Becker, K. (2003). Distance education: A web usage mining case study

for the evaluation of learning sites. ICALT: Proceeding of the International Conference

on Advanced Learning Technologies. IEEE Society, 360-361.

Maimon, O., & Rokach, L. (2005). Data mining and knowledge discovery handbook.

O. Maimon & L. Rokach (Eds.). New York: Springer Science+Business Media.

Michalewicz, Z., Schmidt, M., Michalewicz, M., & Chiriac, C. (2007). Adaptive business

intelligence. Berlin, NY: Springer.

Neuhauser, C. (2002). Learning style and effectiveness of online and face-to-face instruc-

tion. American Journal of Distance Education, 16(2), 99-113.

Picciano, A. G. (2002). Beyond student perceptions: Issues of interaction, presence, and

performance in an online course. Journal of Asynchronous Learning Networks, 6(1),

21-40.

Richardson, J. C., & Swan, K. (2003). Examining social presence in online courses in

relation to students’ perceived learning and satisfaction. The Journal of Asynchronous

Learning Networks, 7(1), 68-88.

Riecken, D. (2000). Personalized views of personalization. Communications of the

Association for Computing Machinery, 43(8), 27-28.

Roiger, R. J., & Geatz, M. W. (2003). Data mining: A tutorial-based primer. Boston,

MA: Addison Wesley.

Salomon, G., & Perkins, D. N. (1998). Individual and social aspects of learning. Review

of Research in Education, 23, 1-24.

Schamber, J. F., & Mahoney, S. L. (2006). Assessing and improving the quality of

group critical thinking exhibited in the final projects of collaborative learning groups.

The Journal of General Education, 55(2), 103-137.

Shea, P. J., & Bidjerano, T. (2009). Community of inquiry as a theoretical framework

to foster “epistemic engagement” and “cognitive presence” in online education.

Computers & Education, 52, 543-553.

Shea, P. J., Li, C. S., & Pickett, A. M. (2006). A study of teaching presence and student

sense of learning community in fully online and web-enhanced college courses. The

Internet and Higher Education, 9(3), 175-190.

Shea, P. J., Pickett, A. M., & Pelz, W. E. (2003). A follow-up investigation of “teaching

presence” in the SUNY learning network. Journal of Asynchronous Learning

Networks, 7(2), 61-80.

Stevens, D. D., & Levi, A. (2005). Introduction to rubrics: An assessment tool to save

grading time, convey effective feedback, and promote student learning. Sterling, VA:

Stylus Publisher.

Thomas, W. R., & Macgregor, S. K. (2005). Online project-based learning: How collab-

orative strategies and problem solving processes impact performance. Journal of

Interactive Learning Research, 16(1), 83-107.


Tseng, S. T., Tsai, S. M., Su, T. H., Tseng, CH. L., & Wang, CH. I. (2005). Data mining.

Taipei, Taiwan: Flag Publishing.

Waeytens, K., Lens, W., & Vandenberghe, R. (2002). Learning to learn: Teachers’

conceptions of their supporting role. Learning and Instruction, 12, 305-322.

Wise, A., Chang, J., Duffy, T., & Valle, R. D. (2004). The effects of teacher social presence

on student satisfaction, engagement, and learning. Journal of Educational Computing

Research, 31(3), 247-271.

Wu, A. S., Farrell, R., & Singley, M. K. (2002). Scaffolding group learning in a collabor-

ative networked environment. In G. Stahl (Ed.), Computer support for collaborative

learning: Foundations for a CSCL community (pp. 435-444). Hillsdale, NJ: Lawrence

Erlbaum Associates.

Yada, K. (2007). CODIRO: A new system for obtaining data concerning consumer

behavior based on data factors of high interest determined by the analyst. Soft

Computing, 11(8), 811-817.

Zaiane, O. R. (2001). Web usage mining for a better web-based learning environment.

CATE: Proceeding of the International Conference on Advanced Technology for

Education. Banff, Alberta, 60-64.

Zhang, K., Peng, S. W., & Hung, J. L. (forthcoming). Online collaborative learning in a

project-based learning environment in Taiwan. Educational Media International.

Direct reprint requests to:

Jui-Long Hung

1910 University Drive

Dept. of Educational Technology

College of Education, Boise State University

Boise, ID 83725

e-mail: [email protected]

