
Snejana Shegheva Georgia Institute of Technology September 26, 2015

Draft Proposal

Awakefulness State Learning Evidenced by Pattern of Typing

There was no "undelete" key to press, no other pages saved, no other "untitled" entries ...

just that hollow feeling of defeat, the helplessness… Michael Maldonado, 2015. From the “Living on the edge of reality”

The following diagram serves as a visual guide through the motivation, methodology and solution proposed for researching the narcolepsy phenomenon from the angle of improving individuals' learning strategies.

Figure 1. Visual Guide through the Project Sections



1. Research Motivation

1.1. Narcolepsy Phenomenon

The motivation for the research of the narcolepsy phenomenon gradually emerged from nearly a

decade of observing the individual closest to me who struggles on a daily basis with the

symptoms imposed by this neurological disorder.

Narcolepsy is characterized by abnormalities of the sleep-wake cycle caused by hypocretin deficiency in the brain. The main symptoms include excessive daytime sleepiness, cataplexy, hypnagogic hallucinations, sleep paralysis, automatic behavior and disrupted nighttime sleep.

Figure 2 demonstrates the contrast in sleep patterns between normal and narcoleptic brains. Besides capturing the frequent daytime sleep attacks, the image also portrays the sudden nature (seconds to minutes) of the transition from wakefulness into REM, whereas a normal brain usually takes more than an hour.

Figure 2. The Brain from Top to Bottom.

Image Source: http://thebrain.mcgill.ca/flash/a/a_11/a_11_p/a_11_p_cyc/a_11_p_cyc.html


http://www.webmd.com/sleep-disorders/guide/narcolepsy

https://en.wikipedia.org/wiki/Orexin



1.2. Present Behavioral Intervention Methodologies

Heterogeneity of the symptoms makes the process of narcolepsy diagnosis difficult enough that

it might take 10 years or more before an individual’s impairment is properly assessed [1]. The

complexity of this disorder has gained more visibility in recent years in neuroscience

communities which helped drive the research towards better understanding of the nature of the

brain’s abnormalities.

There exist very few options to treat the symptoms of narcolepsy on the behavioral level, such as scheduled daily naps and routine exercise. Study [3] suggests a beneficial outcome following these changes in lifestyle, especially in individuals with profound symptoms. An interview with Dr. Kirch [4] indicates that behavioral treatment is complementary to treatment with medication, and by itself might not lead to subdued symptoms.

Any lifestyle changes need to be personalized and adapted for each individual, and currently no

technologies exist that can help with this adaptation, such as knowing when is the best time to

take a nap or what amount of exercise is necessary taking into account additional individual

physical conditions.

1.3. Adaptive Nature of the New Strategies

With this research, I hope to take the first steps in creating a personalized technology with an initial focus on improving the experience of essay writing: learning about changes in the individual's typing patterns, educating the individual about them, and helping to alter or adjust specific habits if necessary.

The new strategy gives new hope largely due to its adaptive feature — learning how to learn

with narcolepsy versus looking for ways to treat the symptoms. In particular, the Metacognition

framework appears to be relevant to the problem at hand; it has the potential to be a solution to

at least a few important activities in learning — such as writing and reading. This particular

study is concentrated on finding and adapting innovative techniques to the writing activity as it

has been identified as a significant obstacle toward successful learning for people with

narcolepsy.



2. Methodology and Solution

2.1. ASLEPT — Stochastic Modeling of the Awakefulness

2.1.1. Hybrid System Overview

The purpose of this proposal is to introduce a hybrid metacognitive-probabilistic system: Awakefulness State Learning Evidenced by Pattern of Typing (ASLEPT). The probabilistic component of the system reasons in an uncertain environment modeled by a Hidden Markov Process, which is a go-to solution for sequential and noisy data. The metacognitive component of the proposed system analyzes the results of inference tasks to generate an overall picture of a narcoleptic state of awareness. The insights are used to intervene during the writing task in the presence of document corruption risks, or to engage the individual in self-reflection, helping to break the sleepiness cycle. The actions are determined by which of five roles is automatically assigned to the Digital Assistant for Writing Activities (DAWA).

2.1.2. Stochastic Component — Goals & Design

Narcolepsy is a complex phenomenon with an almost unique manifestation of symptoms per individual. Let us take the most common symptom, frequent daily sleep attacks (EDS, excessive daytime sleepiness), and see if we can accurately describe the process, starting with the assumption of its deterministic nature.

If we know all the variables which cause EDS, then we should be able to exactly predict the next wave. However, here lies the problem: there are too many variables, such as the individual's age, the profoundness of the disorder, the type and amount of medication taken (if any), the time of day, the general physical state of the individual, and so on. This makes the problem intractable very quickly due to the complex interweaving of the intrinsic properties of each variable. Their causal relationship, however, can be captured with stochastic models which infer the state of wakefulness by making observations of the entered text.



The graphical representation below (see Figure 3) demonstrates the temporal characteristic of

the learning portion of the ASLEPT system. The process is initiated by the user entering the

text — evidence — which is measured at discrete intervals of time.

The overall learning process can be viewed as a threefold modeling:

- Sensor Model (the outer, visible layer): describes the observation process by collecting and analyzing the evidence from what and how a user is typing. The real-time language model (addressed in a later section) is critical for making the sensor as accurate as possible.

- Emission Model (the middle layer): describes how the quality of the entered text is affected by the individual's state, e.g. being drowsy negatively correlates with speed of typing.

- Transition Model (the inner, hidden layer): describes the dynamics of the actual states, e.g. what is the likelihood of being drowsy now if one has been drowsy for the last N observations; how often does the state change and how fast is it detectable?

2.2. Real-Time Language Model

A primary angle that we need to consider in the sensor model's design is addressing the most common causes of document corruption while writing an essay. We conjecture that document corruption is preceded by subtle signals: increased error rate and decreased typing speed. Soukoreff and MacKenzie [8] discussed important techniques for measuring error rates in user-entered text: keystrokes per character (KSPC) and characters per second (CPS). Some adaptation of these techniques could be applied to the problem at hand, in particular to detecting the quality of entered text in real time while writing an essay.

2.2.1. KSPC as an Accuracy Metric

MacKenzie [5] describes Keystrokes Per Character as an important characteristic, especially in the area of mobile computing, and suggests computing KSPC as follows (here, we adapt the word version of the metric definition):

$$KSPC = \frac{\sum (K_w \times F_w)}{\sum (C_w \times F_w)}$$

where $K_w$ is the number of keystrokes made for entering a word $w$, $C_w$ is the number of characters in the word, and $F_w$ is the frequency of the word in the corpus.

For this study, the importance of the KSPC metric is slightly reduced within the domain of essay writing, largely due to the presence of automatic spelling corrections in most modern word processing editors. This can skew the results of the analysis in favor of false negatives (an undetected high error rate obfuscated by autocorrections). Additionally, there is no availability of or need for a corpus, so we can reduce the scope of the metric by removing the weighting factor $F_w$:

$$KSPC_{unweighted} = \frac{\sum K_w}{\sum C_w}$$

The goal is to learn the acceptable thresholds below which the entered sequences will be

considered affected by sleepiness.
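As a rough illustration of the unweighted metric, the sketch below computes KSPC from per-word keystroke and character counts. The input format, the function names and the threshold value are assumptions made for this example only. In particular, it assumes that a key held down during a sleep attack is logged as one keystroke producing many characters, which is why low values are the ones flagged here.

```python
def unweighted_kspc(words):
    """Return sum(K_w) / sum(C_w) for a list of (keystrokes, characters) pairs."""
    total_keystrokes = sum(k for k, _ in words)
    total_characters = sum(c for _, c in words)
    if total_characters == 0:
        raise ValueError("no characters were entered")
    return total_keystrokes / total_characters


def is_affected(words, threshold=0.9):
    """Flag a sequence whose KSPC falls below a threshold.

    The default threshold is a placeholder; the proposal intends to learn it.
    """
    return unweighted_kspc(words) < threshold


# Hypothetical logs of (keystrokes, characters) per word.
attentive = [(5, 5), (7, 5)]   # second word needed one fix: wrong key + backspace
held_key = [(5, 5), (1, 40)]   # one key held down, emitting 40 characters
```

With these made-up logs, `is_affected(attentive)` is false and `is_affected(held_key)` is true; real thresholds would have to come from observed data.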

2.2.2. CPS as a Speed Metric

Typically, the task of measuring Characters per Second (CPS) [6] is trivial, as it involves keeping count of only two variables: elapsed time and number of entered characters.

$$CPS = \frac{\sum_{k=1}^{N} k}{\Delta T}$$

where $\Delta T = T_2 - T_1$ is the elapsed time and $k$ is an entered character, so the sum counts the $N$ characters entered during $\Delta T$.

The metric presented above works very well if a user is measured based on predefined

(presented) short snippets of text where inactivity is not anticipated. However, when writing an

essay it is commonly expected that the user will periodically suspend typing (for thought

gathering, rereading material or for general mental or physical breaks).

To distinguish between a valid time interval during which a user has not entered any text and

unintentional stream interruptions, it becomes necessary to extend the definition of the CPS

metric to take into account word boundaries.

$$\widehat{CPS} = \frac{1}{N_{words}} \sum_{w=1}^{N_{words}} CPS_w$$

where the CPS metric is measured for each word $w$ and is averaged across all words in the measured time lapse.



While the new definition solves the problem of false positives (intentional typing suspension

classified as an example of falling asleep), it presents a different challenge — correctly

identifying word boundaries. This might require techniques from natural language processing to

determine whether entered sequences of characters constitute a valid word.
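The difference between the plain and the word-based CPS can be sketched as follows. This is only an illustration: the event format (timestamp in seconds, character), the function names, and the use of whitespace as a word boundary are assumptions of the sketch, and a word's duration is taken from its first to its last character.

```python
def plain_cps(events):
    """Characters per second over the whole interval, pauses included."""
    elapsed = events[-1][0] - events[0][0]
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return len(events) / elapsed


def per_word_cps(events):
    """Average of per-word CPS values; pauses between words are ignored."""
    words, current = [], []
    for timestamp, char in events:
        if char.isspace():          # naive word boundary (assumption)
            if current:
                words.append(current)
                current = []
        else:
            current.append(timestamp)
    if current:
        words.append(current)

    rates = []
    for stamps in words:
        duration = stamps[-1] - stamps[0]
        if duration > 0:            # single-character words have no duration
            rates.append(len(stamps) / duration)
    if not rates:
        raise ValueError("no word with a measurable duration")
    return sum(rates) / len(rates)


# Hypothetical log: "hi", a 10-second pause for thought, then "there".
events = [(0.0, "h"), (0.5, "i"), (0.75, " "),
          (10.75, "t"), (11.25, "h"), (11.75, "e"), (12.25, "r"), (12.75, "e")]
```

For this log the plain metric is dragged below one character per second by the pause, while the per-word average is 3.25, which is closer to the actual typing speed.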

2.3. Inferring the Narcolepsy State

2.3.1. Model with Latent Variables

An immediate task in overcoming the obstacles that individuals with narcolepsy face in writing essays is identifying signals observable through typing and responsible for document corruption.

Let us model the narcolepsy process as a discrete random variable $X$,

$$X \in \{\text{Asleep}, \text{Drowsy}, \text{Awake}\}$$

which takes one of the three values at each point in time $T$.

By observing the dynamics of the variable in the past and present we would like to be able to

predict its value at a future time. This assumption that past states can correlate to future states

forms the basis of the Markov Process.



Graphically, this process can be represented as a chain of states (see Figure 4). In this particular example it is a first-order Markov chain, where the future state is a function of the present state only, a phenomenon also known as the Markov property.

Using mathematical notation, this property can be expressed as a conditional independence:

$$P(X_t \mid X_{0:t-1}) = P(X_t \mid X_{t-1})$$

In addition to the simplifying assumption of memorylessness of the process, time will also be considered a discrete variable. In other words, we will measure the wakefulness state of the individual at discrete points in time: 5 s, 10 s, 15 s, etc.
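To make the prediction idea concrete, the following sketch propagates a distribution over the three states a few time slices into the future using only the Markov property. The transition probabilities are invented placeholders, and `TRANSITION` and `predict` are names introduced for this example rather than part of the proposed system.

```python
STATES = ("awake", "drowsy", "asleep")

# Hypothetical transition probabilities P(X_t | X_{t-1}); each row sums to 1.
TRANSITION = {
    "awake":  {"awake": 0.90, "drowsy": 0.09, "asleep": 0.01},
    "drowsy": {"awake": 0.20, "drowsy": 0.60, "asleep": 0.20},
    "asleep": {"awake": 0.05, "drowsy": 0.15, "asleep": 0.80},
}


def predict(belief, n_steps):
    """Propagate a distribution over states n_steps time slices ahead."""
    for _ in range(n_steps):
        belief = {
            nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
            for nxt in STATES
        }
    return belief


# Starting from "certainly awake", look three 5-second slices (15 s) ahead.
now = {"awake": 1.0, "drowsy": 0.0, "asleep": 0.0}
in_15_seconds = predict(now, 3)
```

Because each step depends only on the previous distribution, the cost grows linearly with the number of slices.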

The model described above, which assumes direct observability of the variables (the state which the narcoleptic individual is currently in), has a strong limitation: observing the individual directly may not be practical, as it requires advanced Computer Vision techniques to detect minute facial muscle tension (expressions).

Can we still infer the narcoleptics’ state by observing their writing activities instead?

What properties of those activities have a direct relationship with levels of alertness?

These questions, in which the actual physical process is unobserved (hidden), can be answered by modeling the individual's behavior with the Hidden Markov Process (see Figure 5).



Here we introduce another random variable $E$, which is the evidence directly observed from the typing activity and can be modeled with two states:

$$E \in \{\text{valid sequence}, \text{invalid sequence}\}$$

The assumption made here is that we can infer whether the individual is alert by analyzing what and how they are typing at each time slice $T$.

The direct relationship between the evidence $E$ and the actual process $X$ is modeled by the conditional probability $P(E_t \mid X_t)$, which expresses the causality: the quality of the typing depends on the current physical state of the individual.
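One way to see how the hidden state and the evidence interact is to sample from a toy version of the model, which is also roughly what generating synthetic observations would involve. All probabilities below are illustrative placeholders, and the names `TRANSITION`, `EMISSION` and `sample_hmm` are assumptions of this sketch.

```python
import random

# Hypothetical transition model P(X_t | X_{t-1}); each row sums to 1.
TRANSITION = {
    "awake":  {"awake": 0.90, "drowsy": 0.09, "asleep": 0.01},
    "drowsy": {"awake": 0.20, "drowsy": 0.60, "asleep": 0.20},
    "asleep": {"awake": 0.05, "drowsy": 0.15, "asleep": 0.80},
}

# Hypothetical emission model P(E_t | X_t); each row sums to 1.
EMISSION = {
    "awake":  {"valid": 0.95, "invalid": 0.05},
    "drowsy": {"valid": 0.60, "invalid": 0.40},
    "asleep": {"valid": 0.10, "invalid": 0.90},
}


def draw(rng, distribution):
    """Draw one key of a {value: probability} dict."""
    values = list(distribution)
    weights = list(distribution.values())
    return rng.choices(values, weights=weights)[0]


def sample_hmm(n_steps, start="awake", seed=0):
    """Return (hidden_states, observations) sampled from the toy HMM."""
    rng = random.Random(seed)
    state, hidden, observed = start, [], []
    for _ in range(n_steps):
        hidden.append(state)
        observed.append(draw(rng, EMISSION[state]))   # evidence depends on state
        state = draw(rng, TRANSITION[state])          # next state depends on state
    return hidden, observed


hidden, observed = sample_hmm(20)
```

Only `observed` would be available to the system in practice; `hidden` is what the inference tasks try to recover.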

2.3.2. Inference tasks

The topology of the HMM, also known as a Trellis diagram, allows making certain types of inferences about an underlying process (in this case, narcolepsy):

- What is the probability of being in one of three states (awake, drowsy or asleep) given the sequential observations of the entered text?

- How can we detect changes in the wakefulness state given the observations?

- What are the descriptive attributes of the individual's narcolepsy: how often is the individual in any one of the states, does any state prevail, and at what times of the day?

- How can we visualize narcolepsy transitions in order to educate the individual about their specific and unique behavior?

In the quest of answering these questions we will look at the Forward/Backward algorithm [13]

for estimating the belief state of the narcolepsy. This would mean computing the probability of a

state right now (at this moment), given the analysis of the text entered thus far. The task of

making this inference is commonly called filtering or monitoring and it involves learning the

posterior distribution over the narcolepsy states.



2.3.3. Recursive Estimation

The task of filtering is simple and is built upon the notion of recursive estimation (Russell and Norvig, 1995 [10]), which can be applied to any type of sequential model (it satisfies the definitions of the narcolepsy problem statement):

$$P(X_{t+1} \mid E_{1:t+1}) = \sigma(E_{t+1}, P(X_t \mid E_{1:t}))$$

What this implies is that the current state of narcolepsy (given the observed language model) is a function of the evidence of entered text and the previous conditional estimate. Using the image from the previous reflection, we circle the dependency $P(X_{t+1} \mid E_{1:t+1})$ which the algorithm is trying to compute (see Figure 6) in two steps, forward and backward (thus the name of the algorithm).

The filtering task uses factorization techniques on the joint distribution of narcolepsy hidden

states and observable language to express the task in terms of transition, emission and initial

parameters described in the general architecture of the ASLEPT system (see sections above).
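For reference, the standard textbook form of this factorization [10] can be written with the symbols used in this proposal as follows. Here $\alpha$ is a normalizing constant and $x_t$ ranges over the three narcolepsy states; both are notation introduced for this derivation only.

```latex
\begin{aligned}
P(X_{t+1} \mid E_{1:t+1})
  &= \alpha \, P(E_{t+1} \mid X_{t+1}, E_{1:t}) \, P(X_{t+1} \mid E_{1:t})
     && \text{(Bayes' rule)} \\
  &= \alpha \, P(E_{t+1} \mid X_{t+1}) \, P(X_{t+1} \mid E_{1:t})
     && \text{(evidence depends only on the current state)} \\
  &= \alpha \, \underbrace{P(E_{t+1} \mid X_{t+1})}_{\text{emission}}
     \sum_{x_t} \underbrace{P(X_{t+1} \mid x_t)}_{\text{transition}} \,
     \underbrace{P(x_t \mid E_{1:t})}_{\text{previous estimate}}
     && \text{(Markov property)}
\end{aligned}
```

The last line has the recursive shape of $\sigma$: the new estimate is computed from the new evidence and the previous estimate alone.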

Visually the two steps — forward and backward — can be understood as a projection and

update in the opposite directions (see Figure 7)
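A minimal sketch of the recursive estimate is given below. It covers only the forward recursion (project the previous estimate through the transition model, then update it with the new evidence), which is the part that the textbook formulation [10] uses for filtering; the backward pass is omitted. All probabilities are invented placeholders and the function names are assumptions of this sketch.

```python
STATES = ("awake", "drowsy", "asleep")

# Hypothetical model parameters; every number is a placeholder.
INITIAL = {"awake": 0.80, "drowsy": 0.15, "asleep": 0.05}
TRANSITION = {
    "awake":  {"awake": 0.90, "drowsy": 0.09, "asleep": 0.01},
    "drowsy": {"awake": 0.20, "drowsy": 0.60, "asleep": 0.20},
    "asleep": {"awake": 0.05, "drowsy": 0.15, "asleep": 0.80},
}
EMISSION = {
    "awake":  {"valid": 0.95, "invalid": 0.05},
    "drowsy": {"valid": 0.60, "invalid": 0.40},
    "asleep": {"valid": 0.10, "invalid": 0.90},
}


def filter_step(belief, observation):
    """One recursive update of P(X_t | E_1:t) given a new observation."""
    # Projection: push the previous estimate through the transition model.
    predicted = {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in STATES)
        for nxt in STATES
    }
    # Update: weight each state by how well it explains the observation.
    unnormalized = {s: EMISSION[s][observation] * predicted[s] for s in STATES}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def filter_sequence(observations, prior=None):
    """Return the filtered belief after each observation."""
    belief = dict(INITIAL if prior is None else prior)
    history = []
    for observation in observations:
        belief = filter_step(belief, observation)
        history.append(belief)
    return history
```

With these placeholder numbers, a run of `"invalid"` observations shifts most of the probability mass to `"asleep"`, while a run of `"valid"` observations keeps it on `"awake"`.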



In essence, what seemed to be an intractable model at first can be computed very efficiently. This computational model provides a black box for answering all kinds of questions about the modeled process, some of which were highlighted in previous sections.

2.4. Narcolepsy State Visualization

2.4.1. Metacognosis

The important question that the reader of this proposal should ask is what is the learning goal of

the research and how does modeling the narcolepsy help achieve that goal. Modeling is just

one aspect of the problem which in itself is by no means a complete solution. The necessary

step following computations is extracting the insights which can be visualised in order to

diagnose the individual's state and provide recommendations.

A Digital Assistant for Writing Activities (DAWA) will serve the purpose of monitoring the current state and playing an active role as the metacognitive mentor. DAWA may initiate a short dialog with the individual by either asking a few questions that further probe the state, or making a joke which can potentially lead the individual out of the hypnotic trance.



2.4.2. DAWA Goals

The graphical representation below (see Figure 8) demonstrates the roles assigned to DAWA depending on the scenario:

- insights: generate text based on information extracted from the trained HMM to improve the individual's understanding of the symptoms' behavior

- questions: when DAWA is unsure about the insight, additional questions may be generated based on predefined rules

- suggestions: often the individual can take action if given advance warning about the next wave of sleepiness; additionally, in accordance with behavioral intervention treatments such as daily naps, DAWA may recommend a specific time

- actions: on occasions when an individual falls asleep while still pressing a certain key, DAWA may generate a sound notification informing of the possibility of document corruption

- miscellaneous: another way of engaging the individual at critical times is to generate jokes or retrieve inspirational quotes with the intention of keeping the individual awake
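How a role might be picked from the inferred state can be sketched with a handful of hand-written rules. Everything here is hypothetical: the proposal does not fix the rules, so the thresholds, the `key_held` flag and the mapping from states to roles are assumptions chosen only to show the shape of such a rules engine.

```python
def choose_role(belief, key_held=False, confidence=0.7):
    """Pick one of the five DAWA roles from a belief over the three states.

    `belief` maps "awake", "drowsy" and "asleep" to probabilities.
    The thresholds are placeholders, not tuned values.
    """
    state, probability = max(belief.items(), key=lambda item: item[1])

    if key_held and belief["asleep"] >= 0.5:
        return "actions"        # sound notification: corruption risk
    if probability < confidence:
        return "questions"      # unsure about the state, probe further
    if state == "awake":
        return "insights"       # good moment to explain observed patterns
    if state == "drowsy":
        return "suggestions"    # advance warning, e.g. recommend a nap
    return "miscellaneous"      # try to re-engage with a joke or a quote


role = choose_role({"awake": 0.1, "drowsy": 0.8, "asleep": 0.1})
```

Ordering the rules from most to least urgent keeps the corruption warning from being masked by the softer interventions.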



3. Evaluation Methodology

3.1. Model Goodness of Fit

Eventually any modeling task will culminate in a question: how good is the model? Generally speaking, the goodness of the model is measured by how well it describes the observed data. There are a few methods which can accomplish this task (sorted from less interesting to more curious and elegant¹):

- Pearson Correlation: allows measuring the correlation between an observable variable and its predicted values (generated by the model), and is typically described by a value $r$ such that $-1 \le r \le 1$, where $-1$ implies a strong negative correlation, $+1$ suggests a strong positive correlation and $0$ stands for a lack of linear correlation.

- Kolmogorov-Smirnov test: measures the distance between two distributions of random variables. The metric can be used to assess how plausible it is that two samples were drawn from the same distribution.

- Monte Carlo Methods: this class of algorithms allows sampling from the posterior distribution of each model, which can be used to estimate the likelihood of the model given the observed data. Roughly speaking, we can let the sampling process jump from one model to another until convergence, and subsequently compare the parameters of the model and data.
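The first two measures are simple enough to sketch without any library. The code below is a plain-Python illustration with function names chosen for this example; it computes the Pearson coefficient and the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical distribution functions) but not the associated p-values, and it does not cover the Monte Carlo approach.

```python
import bisect
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_sample, x):
        # Fraction of the sample that is less than or equal to x.
        return bisect.bisect_right(sorted_sample, x) / len(sorted_sample)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))


# Hypothetical observed values and the values predicted by a model.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
r = pearson_r(observed, predicted)
d = ks_statistic(observed, predicted)
```

A value of `r` close to 1 and a small `d` would both point to a model that tracks the data; deciding what counts as close enough still requires a proper test.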

3.2. Visual Evaluation via Separation Plots

In addition to quantifying model accuracy and likelihood, we can follow an original graphical approach called separation plots [14]. The beauty of this method is that it provides a straightforward visual interpretation of the model's predictive power without the loss of information typical of other statistical tests such as the Pearson correlation or the Kolmogorov-Smirnov test.

1 This expresses strictly my personal opinion


https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient

https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test

https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo


Among other advantages, an interesting side effect of applying a separation plot is its ability to summarize event behaviors such as sparsity, variability and relative concentration of events vs. non-events.

The process of estimating how well the model fits the data involves a few simple steps:

1. Create a table with the actual observations and their fitted values

2. Rearrange the table so that the fitted values are sorted in ascending order

3. Map the data to a plot with dark stripes representing events and light areas otherwise

This will create a very compact summary of the model’s fitness such as represented in Figure 9:

Figure 9. Separation Plot. Image source Greenhill, Brian, Michael D. Ward, and Audrey Sacks. "The

separation plot: A new visual method for evaluating the fit of binary models."

Taking this one step further, one can imagine lining up multiple models to visually select the best fit: those models that show a very clear separation of events from non-events by clustering the former on the right side of the plot.
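The steps of building a separation plot can be mimicked in text form, which is enough to see the idea before any plotting library is involved. In this sketch the function name and the stripe characters are arbitrary choices, and observations are sorted so that the highest fitted values end up on the right, matching the right-side clustering described above.

```python
def separation_plot(actual, fitted, event_char="#", nonevent_char="."):
    """Return a one-line text rendering of a separation plot.

    `actual` holds 0/1 outcomes and `fitted` the model's probabilities.
    """
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    # Steps 1 and 2: pair observations with fitted values, sort by fitted value.
    rows = sorted(zip(fitted, actual), key=lambda row: row[0])
    # Step 3: one dark stripe per event, one light stripe per non-event.
    return "".join(event_char if outcome else nonevent_char for _, outcome in rows)


# Hypothetical outcomes and fitted probabilities for six time slices.
actual = [0, 1, 0, 1, 1, 0]
fitted = [0.1, 0.8, 0.3, 0.9, 0.4, 0.2]
print(separation_plot(actual, fitted))  # prints "...###"
```

A model with less predictive power would scatter the `#` stripes across the whole line instead of packing them at the right edge.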

4. Timeline & Calendar

4.1. Weekly milestones

The timeline image below (see Figure 10) represents ten weekly milestones spanning from the end of September to the beginning of December. There are roughly four themes (mega milestones) describing the project progression:

- September: Final project proposal and creation of synthetic data

- October: Modeling the narcolepsy phenomenon (ASLEPT system)

- November: Mapping the knowledge extracted from the model to actions (DAWA system)

- December: Project wrap-up, presentation and paper writing



4.2. Detailed weekly schedule

Besides milestones and their approximate timeline, we present a table with more detailed task

descriptions and their level of effort (LOE).

Real-Time Language Model: due to time constraints, a decision was made to skip the implementation of the language parsing portion and instead use synthesized data which would simulate already processed observations.

| # Week | Milestone | Tasks | LOE (hours) |
|---|---|---|---|
| 1 | Synthetic Data Creation | I. Finalize the proposal; II. Create synthetic data for observation (use PYMC library for modeling; create a writeup in the IPython Notebook) | I: 2; II: 8 |
| 2 | Hierarchical Model Implementation | I. Apply Hidden Markov Model (define distributions for all random variables and create the necessary topology; integrate simulated data to make predictions on the cognitive state) | I: 10 |
| 3 | State Detection Implementation | I. Implement the model to predict changes in the cognitive state; II. Create basic visualization (plotting of the state); III. Create a writeup in the IPython Notebook | I: 5; II: 3; III: 2 |
| 4 | Inference Tasks Implementation | I. Progress Report; II. Apply recursive estimation algorithm (not implementing from scratch) | I: 5; II: 5 |
| 5 | Behavior Visualization | I. Brainstorm the best way to visualize cognitive state change (most likely to use plotly); II. Augment visualization with prototype representation of the DAWA | I: 4; II: 2 |
| 6 | DAWA Rules Engine | I. Investigate available tools for rules engines; II. Implement facts and rules | I: 2; II: 8 |
| 7 | DAWA Role Implementation | I. Create interface for different DAWA roles, from insights to questions; II. Visualize roles depending on their goals | I: 5; II: 5 |
| 8 | Evaluation Metrics Design and Implementation | I. Apply Pearson correlation on distribution level (define posterior of the Pearson correlations); II. Apply Kolmogorov-Smirnov Test; III. Apply Monte Carlo Methods (visualize the results with PYMC library); IV. Use separation plots | I: 1; II: 2; III: 5; IV: 5 |
| 9 | Component Integration Testing | I. Trailer; II. Integrate prototype pieces; III. Put up a simple web interface (possibly with the flask library) | I: 2; II: 4; III: 8 |
| 10 | Final Paper | I. Wrapping up the project; II. Paper writing; III. Presentation | I: 2; II: 5; III: 3 |

Total estimated hours: 103



References

[1] Berro, Laís F., Sergio B. Tufik, and Sergio Tufik. "A journey through narcolepsy diagnosis: From ICSD 1 to ICSD 3." Sleep Science 7.1 (2014): 3-4.

[2] Rovere, Heloísa, Sueli Rossini, and Rubens Reimão. "Quality of life in patients with narcolepsy: a WHOQOL-bref study." Arquivos de neuro-psiquiatria 66.2A (2008): 163-167.

[3] Rogers, Ann E., and Michael S. Aldrich. "The effect of regularly scheduled naps on sleep attacks and excessive daytime sleepiness associated with narcolepsy." Nursing research 42.2 (1993): 111-117.

[4] "Narcolepsy Treatment." Web blog post. WakeUpNarcolepsy. 2015 <http://www.wakeupnarcolepsy.org/about-narcolepsy/treatment-options/>

[5] MacKenzie, I. Scott. "KSPC (keystrokes per character) as a characteristic of text entry techniques." Human Computer Interaction with Mobile Devices. Springer Berlin Heidelberg, 2002. 195-210.

[6] MacKenzie, I. Scott. "A note on calculating text entry speed." Unpublished work, available online at http://www.yorku.ca/mack/RN-TextEntrySpeed.html (2002).

[7] Soukoreff, R. William, and I. Scott MacKenzie. "Measuring errors in text entry tasks: an application of the Levenshtein string distance statistic." CHI '01 extended abstracts on Human factors in computing systems. ACM, 2001.

[8] Soukoreff, R. William, and I. Scott MacKenzie. "Metrics for text entry research: an evaluation of MSD and KSPC, and a new unified error metric." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2003.

[9] Murphy, Kevin P. Machine learning: a probabilistic perspective. MIT press, 2012.

[10] Russell, Stuart, and Peter Norvig. "Artificial intelligence: a modern approach." (1995).

[11] Luger, George F. Artificial intelligence: structures and strategies for complex problem solving. Pearson education, 2005.

[12] Lou, Hui-Ling. "Implementing the Viterbi algorithm." Signal Processing Magazine, IEEE 12.5 (1995): 42-52.

[13] Yu, Shun-Zheng, and Hisashi Kobayashi. "An efficient forward-backward algorithm for an explicit-duration hidden Markov model." Signal Processing Letters, IEEE 10.1 (2003): 11-14.

[14] Greenhill, Brian, Michael D. Ward, and Audrey Sacks. "The separation plot: A new visual method for evaluating the fit of binary models." American Journal of Political Science 55.4 (2011): 991-1002.

[15] Gelman, Andrew, and Cosma Rohilla Shalizi. "Philosophy and the practice of Bayesian statistics." British Journal of Mathematical and Statistical Psychology 66.1 (2013): 8-38.
