
PROVIDING DATA SECURITY IN PATIENT CLINICAL DATA USING SUPPORT VECTOR MACHINE

ANNAI COLLEGE OF ENGINEERING AND TECHNOLOGY

DEPARTMENT OF INFORMATION TECHNOLOGY

GUIDED BY

Mrs. D. SHYAMALADEVI, M.Tech., Asst. Prof./IT

P.GEETHA ([email protected]) R.KALPANA ([email protected]) P.VINOTHINI ([email protected])

ABSTRACT: The goal of this work is to speed up diagnosis time and improve diagnosis accuracy in today's healthcare systems. The advantages of a clinical decision support system include not only improved diagnosis accuracy but also reduced diagnosis time. We use the SVM classifier, a data mining technique, together with an encryption technique, to make the Clinical Decision Support System more helpful in providing information about important diseases accurately and efficiently. The underlying retrieval models were proposed nearly three decades ago but have had little practical impact, despite their significant advantages over either ranked keyword retrieval or a pure SVM. We also propose term-independent bounds that further reduce the number of score calculations for short, simple queries under the extended SVM. Together, these methods save 50 to 80 percent of the evaluation cost on test queries drawn from biomedical search.

INTRODUCTION: Nowadays the healthcare industry, extensively distributed across the globe to provide health services for patients, has never faced such massive amounts of electronic data or experienced such a sharp growth rate of data. However, if no appropriate technique is developed to extract the great potential economic value from big healthcare data, these data may not only become meaningless but also require a large amount of space to store and manage. Over the past two decades, the evolution of data mining techniques, which can convert stored data into meaningful information, has had a major impact on human lifestyles by predicting behaviors and future trends.

These techniques are well suited to providing decision support in the healthcare setting. To speed up diagnosis and improve its accuracy, a new healthcare system should provide a much cheaper and faster way to diagnose. The Clinical Decision Support System (CDSS), in which various data mining techniques are applied to assist physicians in diagnosing patients whose diseases have similar symptoms, has received great attention recently. A clinical decision support system has been defined as an "active knowledge system" that uses two or more items of patient data to generate case-specific advice. This implies that a CDSS is simply a decision support system focused on using knowledge management to deliver clinical advice for patient care based on multiple items of patient data. The main purpose of a modern CDSS is to assist clinicians at the point of care: clinicians interact with the CDSS to help analyze patient data and reach a diagnosis. The naive Bayesian classifier, one of the popular machine learning tools of data mining, has been widely used to predict various diseases in CDSSs. Despite its simplicity, it is often more appropriate for medical diagnosis in healthcare than some sophisticated techniques. A CDSS with a naive Bayesian classifier offers many advantages over traditional healthcare systems and opens a new way for clinicians to predict patients' diseases.

However, its success still hinges on understanding and managing information security and privacy challenges, especially during the patient disease decision phase. One of the main challenges is keeping patients' medical data away from unauthorized disclosure. The usage of medical data can be of interest to a large variety of healthcare stakeholders; for example, an online direct-to-consumer service provider offers individual disease risk prediction for patients. Without good protection of medical data, a patient may fear that his data will be leaked and abused, and refuse to provide it to the CDSS for diagnosis. Therefore, it is crucial to protect patients' medical data. However, keeping medical data private is not sufficient for the whole CDSS to flourish. The service provider's classifier, which is used to predict patients' diseases, cannot be exposed to third parties, since the classifier is the service provider's own asset; otherwise, third parties could abuse the classifier to predict diseases themselves, damaging the service provider's profit. Therefore, besides preserving the privacy of patients' medical data, protecting the service provider's privacy is also crucial for the CDSS.

SSRG International Journal of Computer Science and Engineering (ICET'17), Special Issue, March 2017. ISSN: 2348-8387, www.internationaljournalssrg.org

The privacy-preserving Patient-Centric Clinical Decision Support System, called PPCD, helps physicians predict patients' disease risks in a privacy-preserving way. It is a secure, patient-centric clinical decision support system that allows the service provider to diagnose a patient's disease without leaking any of the patient's medical data.

A CDSS provides knowledge and specific information about diseases to clinicians, enhancing diagnostic efficiency and improving healthcare quality; it can greatly improve patient safety. With increasing amounts of data being generated by healthcare industries and researchers, there is a need for fast, accurate, and robust algorithms for data analysis. Improvements in database technology, computing performance, and artificial intelligence have contributed to the development of intelligent data analysis. The primary aim of data mining is to discover patterns in the data that lead to a better understanding of the data-generating process and to useful predictions. One technique that has been developed to address these issues is the support vector machine, a robust tool for classification and regression in noisy, complex domains. In this paper, we use the Support Vector Machine (SVM), one of the most powerful classification techniques, which has been successfully applied to many real-world problems; it is particularly advantageous for improving diagnosis accuracy in a clinical decision support system. Along with this, we use a homomorphic encryption technique to secure the sensitive data in patient health information. The remainder of the paper is organized as follows: Section II gives a literature survey of prior work in the field of clinical decision support systems; Section III briefly discusses the data mining and privacy-preserving techniques used; Section IV presents the workflow of the proposed system; and Section V concludes the paper.

LITERATURE SURVEY

Application of clinical decision support systems in the management of chronic diseases: With the continuous deepening of hospital informatization and improvement of quantitative self-management, and to reduce the incidence of medical malpractice, the clinical decision support system (CDSS), which provides decision support for medical staff, has emerged. Faced with China's aging population and the high incidence of chronic diseases, establishing a clinical decision support system with chronic disease management functions is particularly important. This paper first introduces the concept, development process, and architecture of the CDSS. Then the application progress and problems of clinical decision support systems in chronic disease management are reviewed, in order to provide new ideas for the development of relevant clinical decision support systems.

Architectural Approaches for Implementing Clinical Decision Support Systems in Cloud: A Systematic Review: Clinical Decision Support Systems (CDSS) were explicitly introduced in the 1990s with the aim of providing knowledge to clinicians in order to influence their decisions and, therefore, improve patients' health care. There are different architectural approaches for implementing a CDSS. Some of these approaches are based on cloud computing, which provides on-demand computing resources over the internet. The goal of this paper is to determine and discuss key issues and approaches involving architectural designs for implementing a CDSS using cloud computing. To this end, we performed a standard Systematic Literature Review (SLR) of primary studies showing the intervention of cloud computing in CDSS implementations. Twenty-one primary studies were reviewed. We found that CDSS architectural components are similar in most of the studies. Cloud-based CDSSs are most used in home healthcare and emergency medical systems; alerts/reminders and knowledge services are the most common implementations. Major challenges center on security, performance, and compatibility. We concluded on the benefits of implementing a cloud-based CDSS, since it allows cost-efficient, ubiquitous, and elastic computing resources. We highlight that some studies show weaknesses regarding the conceptualization of a cloud-based computing approach and lack a formal methodology in the architectural design process.

A Machine Learning Approach to Improve Contactless

Heart Rate Monitoring Using a Webcam :Unobtrusive,

contactless recordings of physiological signals are very

important for many health and human–computer interaction

applications. Most current systems require sensors which

intrusively touch the user's skin. Recent advances in contact-free

physiological signals open the door to many new types of

applications. This technology promises to measure heart rate

(HR) and respiration using video only. The effectiveness of this

technology, its limitations, and ways of overcoming them

deserves particular attention. In this paper, we evaluate this

technique for measuring HR in a controlled situation, in a

naturalistic computer interaction session, and in an exercise

situation. For comparison, HR was measured simultaneously

using an electrocardiography device during all sessions. The

results replicated the published results in controlled situations,

but show that they cannot yet be considered as a valid measure

of HR in naturalistic human–computer interaction. We propose a


machine learning approach to improve the accuracy of HR

detection in naturalistic measurements. The results demonstrate

that the root mean squared error is reduced from 43.76 to 3.64

beats/min using the proposed method.

Prediction of extubation failure for neonates with

respiratory distress syndrome using the MIMIC-II clinical

database :Extubation failure (EF) is an ongoing problem in the

neonatal intensive care unit (NICU). Nearly 25% of neonates

fail their first extubation attempt, requiring re-intubations that

are associated with risk factors and financial costs. We identified

179 mechanically ventilated neonatal patients who were

intubated within 24 hours of birth in the MIMIC-II intensive

care database. We analyzed data from the patients 2 hours prior

to their first extubation attempt, and developed a prediction

algorithm to distinguish patients whose extubation attempt was

successful from those that had EF. From an initial list of 57

candidate features, our machine learning approach narrowed

down to six features useful for building an EF prediction model:

monocyte cell count, rapid shallow breathing index, fraction of

inspired oxygen (FiO2), heart rate, PaO2/FiO2 ratio where

PaO2 is the partial pressure of oxygen in arterial blood, and work

of breathing index. Algorithm performance had an area under

the receiver operating characteristic curve (AUC) of 0.871 and

sensitivity of 70.1% at 90% specificity.

Public Key Cryptosystems from the Multiplicative Learning

with Errors :We construct two public key cryptosystems based

on the multiplicative learning with errors (MLWE) problem.

Cryptosystem 1 (resp. 2) is semantically secure assuming the worst-case hardness of the decisional composite residuosity problem and the search (resp. decisional) LWE problem, which implies approximating the shortest vector problem in a lattice to within small polynomial factors. Namely, our cryptosystems are secure as long as one of these problems is hard. Moreover, cryptosystem 1 can be adapted into a chosen-ciphertext secure cryptosystem. In addition, cryptosystem 2 is additively homomorphic.

k-Nearest Neighbor Classification over Semantically Secure

Encrypted Relational Data :Data Mining has wide applications

in many areas such as banking, medicine, scientific research, and government agencies. Classification is one of the

commonly used tasks in data mining applications. For the past

decade, due to the rise of various privacy issues, many

theoretical and practical solutions to the classification problem

have been proposed under different security models. However,

with the recent popularity of cloud computing, users now have

the opportunity to outsource their data, in encrypted form, as

well as the data mining tasks to the cloud. Since the data on the

cloud is in encrypted form, existing privacy-preserving

classification techniques are not applicable. In this paper, we

focus on solving the classification problem over encrypted data.

In particular, we propose a secure k-NN classifier over

encrypted data in the cloud. The proposed protocol protects the

confidentiality of data, privacy of user's input query, and hides

the data access patterns. To the best of our knowledge, our work

is the first to develop a secure k-NN classifier over encrypted

data under the semi-honest model. Also, we empirically analyze

the efficiency of our proposed protocol using a real-world

dataset under different parameter settings.

Secure two-party distance computation protocol based on

privacy homomorphism and scalar product in wireless

sensor networks Numerous privacy-preserving issues have

emerged along with the fast development of the Internet of

Things. In addressing privacy protection problems in Wireless

Sensor Networks (WSN), secure multi-party computation is

considered vital, where obtaining the Euclidean distance between

two nodes with no disclosure of either side's secrets has become

the focus of location-privacy-related applications. This paper

proposes a novel Privacy-Preserving Scalar Product Protocol

(PPSPP) for wireless sensor networks. Based on PPSPP, we then

propose a Homomorphic-Encryption-based Euclidean Distance

Protocol (HEEDP) without third parties. This protocol can

achieve secure distance computation between two sensor nodes.

Correctness proofs of PPSPP and HEEDP are provided,

followed by security validation and analysis. Performance

evaluations via comparisons among similar protocols

demonstrate that HEEDP is superior; it is most efficient in terms

of both communication and computation on a wide range of data

types, especially in wireless sensor networks.

Toward efficient and privacy-preserving computing in big

data era :Big data, because it can mine new knowledge for

economic growth and technical innovation, has recently received

considerable attention, and many research efforts have been

directed to big data processing due to its high volume, velocity,

and variety (referred to as "3V") challenges. However, in

addition to the 3V challenges, the flourishing of big data also

hinges on fully understanding and managing newly arising

security and privacy challenges. If data are not authentic, new

mined knowledge will be unconvincing; while if privacy is not

well addressed, people may be reluctant to share their data.

Because security has been investigated as a new dimension,

"veracity," in big data, in this article, we aim to exploit new

challenges of big data in terms of privacy, and devote our

attention toward efficient and privacy-preserving computing in

the big data era. Specifically, we first formalize the general

architecture of big data analytics, identify the corresponding


privacy requirements, and introduce an efficient and privacy-

preserving cosine similarity computing protocol as an example

in response to data mining's efficiency and privacy requirements

in the big data era.

Efficient privacy preservation technique for healthcare records using big data: This work follows the same motivation as the previous article: beyond the 3V challenges, the flourishing of big data hinges on fully understanding and managing newly arising security and privacy challenges. It formalizes the general architecture of big data analytics, identifies the corresponding privacy requirements, and introduces an efficient privacy-preserving scheme based on Triple DES as an example, in response to data mining's efficiency and privacy requirements in the big data era.

A Review on the State-of-the-Art Privacy-Preserving

Approaches in the e-Health Clouds :Cloud computing is

emerging as a new computing paradigm in the healthcare sector

besides other business domains. Large numbers of health

organizations have started shifting the electronic health

information to the cloud environment. Introducing the cloud

services in the health sector not only facilitates the exchange of

electronic medical records among the hospitals and clinics, but

also enables the cloud to act as a medical record storage center.

Moreover, shifting to the cloud environment relieves the

healthcare organizations of the tedious tasks of infrastructure

management and also minimizes development and maintenance

costs. Nonetheless, storing the patient health data in the third-

party servers also entails serious threats to data privacy. Because

of probable disclosure of medical records stored and exchanged

in the cloud, the patients' privacy concerns should essentially be

considered when designing the security and privacy

mechanisms. Various approaches have been used to preserve the

privacy of the health information in the cloud environment. This

survey aims to encompass the state-of-the-art privacy-preserving

approaches employed in the e-Health clouds. Moreover, the

privacy-preserving approaches are classified into cryptographic

and noncryptographic approaches, and a taxonomy of the

approaches is also presented. Furthermore, the strengths and

weaknesses of the presented approaches are reported and some

open issues are highlighted.

OBJECTIVE: We propose a privacy-preserving patient-centric clinical decision support system, called PPCD, based on Support Vector Machine classification, to help physicians predict patients' disease risks in a privacy-preserving way.
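As an illustration of the SVM-based prediction the objective refers to, the following is a minimal sketch of a linear SVM decision function in Java. The class name, toy weights, and feature encoding are illustrative assumptions, not taken from the paper; in practice the weight vector would be learned offline from the case records.

```java
import java.util.Arrays;

/** Minimal linear SVM decision function: sign(w.x + b). */
public class LinearSvm {
    private final double[] w;
    private final double b;

    public LinearSvm(double[] w, double b) {
        this.w = Arrays.copyOf(w, w.length);
        this.b = b;
    }

    /** Raw decision value w.x + b; its sign gives the predicted class. */
    public double score(double[] x) {
        double s = b;
        for (int i = 0; i < w.length; i++) s += w[i] * x[i];
        return s;
    }

    /** +1 = predicted at-risk class, -1 = not at risk (labels illustrative). */
    public int predict(double[] x) {
        return score(x) >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        // Toy model with three symptom features, weights assumed learned offline.
        LinearSvm svm = new LinearSvm(new double[] {0.8, -0.5, 1.2}, -0.3);
        System.out.println(svm.predict(new double[] {1.0, 0.0, 1.0})); // prints 1
        System.out.println(svm.predict(new double[] {0.0, 1.0, 0.0})); // prints -1
    }
}
```

In the privacy-preserving setting, this same dot product would be evaluated over encrypted feature values rather than plaintext ones.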

EXISTING SYSTEM

A privacy-preserving patient-centric clinical decision support system using the naive Bayesian classifier.

The existing system used different data mining techniques, such as neural networks and association rule mining, for anomaly detection and classification.

The existing system described several computer models (such as Bayesian networks) that may be used in clinical practice in the near future.

PROPOSED SYSTEM

We improve on the existing system with a Clinical Decision Support System based on the Support Vector Machine (SVM), a data mining classification technique.

The system works faster and more efficiently using SVM, which is widely used in real-life applications.

We also use the RSA and Paillier algorithms for homomorphic encryption, together with a proxy re-encryption algorithm that protects ciphertexts from chosen-ciphertext attacks (CCA). This makes the system more secure than the existing one.
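To illustrate the additively homomorphic property that makes the Paillier algorithm useful here, below is a textbook Paillier sketch in Java using BigInteger. The key size, class structure, and random-number handling are our own simplifying choices; it omits the RSA and proxy re-encryption layers and is not production-grade cryptography.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

/** Textbook Paillier: Enc(m1) * Enc(m2) mod n^2 decrypts to m1 + m2 mod n. */
public class Paillier {
    final BigInteger n, nSq, g, lambda, mu;
    final SecureRandom rnd = new SecureRandom();

    public Paillier(int bits) {
        BigInteger p = BigInteger.probablePrime(bits / 2, rnd);
        BigInteger q = BigInteger.probablePrime(bits / 2, rnd);
        n = p.multiply(q);
        nSq = n.multiply(n);
        g = n.add(BigInteger.ONE);                    // standard choice g = n + 1
        BigInteger p1 = p.subtract(BigInteger.ONE);
        BigInteger q1 = q.subtract(BigInteger.ONE);
        lambda = p1.multiply(q1).divide(p1.gcd(q1));  // lcm(p-1, q-1)
        mu = lambda.modInverse(n);                    // valid because g = n + 1
    }

    /** c = g^m * r^n mod n^2, with random r in [1, n-1]. */
    public BigInteger encrypt(BigInteger m) {
        BigInteger r = new BigInteger(n.bitLength() - 1, rnd)
                .mod(n.subtract(BigInteger.ONE)).add(BigInteger.ONE);
        return g.modPow(m, nSq).multiply(r.modPow(n, nSq)).mod(nSq);
    }

    /** m = L(c^lambda mod n^2) * mu mod n, where L(u) = (u - 1) / n. */
    public BigInteger decrypt(BigInteger c) {
        BigInteger u = c.modPow(lambda, nSq).subtract(BigInteger.ONE).divide(n);
        return u.multiply(mu).mod(n);
    }

    public static void main(String[] args) {
        Paillier ph = new Paillier(512);
        BigInteger c1 = ph.encrypt(BigInteger.valueOf(30));
        BigInteger c2 = ph.encrypt(BigInteger.valueOf(12));
        // Multiplying ciphertexts adds the plaintexts: decrypts to 42.
        System.out.println(ph.decrypt(c1.multiply(c2).mod(ph.nSq)));
    }
}
```

This additive property is what lets a server aggregate encrypted patient features without ever seeing them in the clear.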

Modules

1. Similar Diseases Document Extraction

2. Score Computation

3. Term Independent Score Bounds

4. Get SVM Top k

Module Description


Similar Diseases Extraction: In this module we extract similar historical medical data containing confirmed case records of patients' symptoms and confirmed diseases. These data contain sensitive information that is highly relevant to the data set underlying the SVM-based clinical decision support system. Similar documents are extracted using TF-IDF values: we compute the similarity score between the given query and each document in the data set, and retrieve the document with the highest similarity score.
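The TF-IDF extraction step above can be sketched as follows. The toy corpus, tokenization, and weighting variant (raw term frequency with a natural-log IDF, cosine similarity) are simplifying assumptions; the paper does not specify its exact weighting scheme.

```java
import java.util.*;

/** TF-IDF cosine scoring: rank case records by similarity to a symptom query. */
public class TfIdfRanker {
    /** tf-idf weight per term: tf * ln(N / df), using raw term counts. */
    static Map<String, Double> vector(List<String> doc, Map<String, Integer> df, int nDocs) {
        Map<String, Double> v = new HashMap<>();
        for (String t : doc) v.merge(t, 1.0, Double::sum);
        v.replaceAll((t, tf) -> tf * Math.log((double) nDocs / df.get(t)));
        return v;
    }

    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Double> e : a.entrySet())
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        for (double x : a.values()) na += x * x;
        for (double x : b.values()) nb += x * x;
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /** Index of the case record most similar to the symptom query. */
    static int bestMatch(List<List<String>> records, List<String> query) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> r : records)
            for (String t : new HashSet<>(r)) df.merge(t, 1, Integer::sum);
        for (String t : query) df.putIfAbsent(t, 1); // unseen query terms: df = 1
        int best = -1;
        double bestScore = -1;
        for (int i = 0; i < records.size(); i++) {
            double s = cosine(vector(query, df, records.size()),
                              vector(records.get(i), df, records.size()));
            if (s > bestScore) { bestScore = s; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<List<String>> records = List.of(
            List.of("fever", "cough", "fatigue"),
            List.of("chest", "pain", "shortness", "breath"),
            List.of("fever", "rash", "headache"));
        System.out.println(bestMatch(records, List.of("cough", "fever"))); // prints 0
    }
}
```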

Score Computation: In this module we compute the score for each document in the data set. The recursive nature of SVM queries makes it necessary to calculate the scores on lower levels of the query first. One obvious possibility would be to add processing logic to each query node as it acts on its clauses, but optimizations such as max-score could then only be employed at the query root node, because a threshold is only available for the overall query score. Instead, we follow a holistic approach and prefer to be able to calculate the document score given a set of query terms S ⊆ T present in a document, no matter where they appear in the query tree.

Term Independent Score Bounds: To provide early termination of clinical decision support system scoring, we also propose the use of term-independent SVM score bounds that represent the maximum attainable score for a given number of terms. A lookup table of score bounds M_r, indexed by r, is created and consulted to check whether it is possible for a document containing r of the query terms to achieve a score greater than the current entry threshold. That is, for each r = 1, ..., n, we seek to determine M_r, the maximum score attainable by any document containing exactly r of the n query terms. The number of possible term combinations that could occur in documents is C(n, r) for each r, which is 2^n − 1 in total, a demanding computation. However, the scoring functions depend only on the clause scores, that is, on the overall score of each sub-tree, meaning that the problem can be broken down into subparts, solved for each sub-tree separately, and then aggregated. The simplest (sub-)tree consists of one term, for which the solution is trivial.
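Under the simplifying assumption that clause scores combine additively (the paper's query trees aggregate per sub-tree, which this sketch flattens), the lookup table M_r can be built by summing the r largest per-term maximum scores. Names and data here are illustrative.

```java
import java.util.Arrays;

/** Term-independent bounds M[r]: the best score any document containing
 *  r of the n query terms could reach, assuming clause scores add up. */
public class ScoreBounds {
    /** M[r] = sum of the r largest per-term maximum scores (M[0] = 0). */
    static double[] build(double[] maxTermScores) {
        double[] s = maxTermScores.clone();
        Arrays.sort(s); // ascending; we take from the high end
        double[] m = new double[s.length + 1];
        for (int r = 1; r <= s.length; r++) m[r] = m[r - 1] + s[s.length - r];
        return m;
    }

    /** Can a document matching r query terms beat the current entry threshold? */
    static boolean canBeat(double[] m, int r, double threshold) {
        return m[r] > threshold;
    }

    public static void main(String[] args) {
        double[] m = build(new double[] {0.2, 1.5, 0.7});
        // Best single term scores 1.5, so an r = 1 document cannot beat 1.6,
        // and its scoring can be terminated early; an r = 2 document might.
        System.out.println(canBeat(m, 1, 1.6)); // prints false
        System.out.println(canBeat(m, 2, 1.6)); // prints true
    }
}
```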

Get SVM Top-k: In this module we get the top-k results of the clinical decision support system for the given SVM query. Our objective is to construct a sequence q1, q2, ..., qv of SVM queries that can be submitted to the database, retrieve as few documents as possible, and still contain all the documents that would appear in the top-k results of the clinical decision support system over the data set.
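A common way to maintain the top-k results, and the entry threshold that the score bounds are checked against, is a min-heap whose root is the weakest result kept so far. This sketch (names illustrative) selects the k best-scoring document ids.

```java
import java.util.*;

/** Top-k selection with a min-heap: the heap root is the entry threshold
 *  that any new candidate document must beat to enter the result set. */
public class TopK {
    static List<Integer> topK(double[] scores, int k) {
        // Heap of doc ids ordered by score ascending: root = weakest kept doc.
        PriorityQueue<Integer> heap =
            new PriorityQueue<>(Comparator.comparingDouble((Integer i) -> scores[i]));
        for (int doc = 0; doc < scores.length; doc++) {
            if (heap.size() < k) {
                heap.add(doc);
            } else if (scores[doc] > scores[heap.peek()]) {
                heap.poll();   // evict the weakest result
                heap.add(doc);
            }
            // else: doc cannot beat the entry threshold, so it is skipped
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort((a, b) -> Double.compare(scores[b], scores[a])); // best first
        return result;
    }

    public static void main(String[] args) {
        double[] scores = {0.3, 0.9, 0.1, 0.7, 0.5};
        System.out.println(topK(scores, 3)); // prints [1, 3, 4]
    }
}
```

The term-independent bounds from the previous module plug in at the "skipped" branch: a document whose bound M[r] cannot beat `scores[heap.peek()]` need never be fully scored.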

ARCHITECTURE DIAGRAM

DATA FLOW DIAGRAM

SOFTWARE REQUIREMENTS

Introduction to JAVA: Java was conceived by James Gosling, Patrick Naughton, Chris Warth, Ed Frank, and Mike Sheridan at Sun Microsystems, Inc. in 1991. It took 18 months to develop the first working version. The language was initially called "Oak" but was renamed "Java" in 1995. Between the initial implementation of Oak in the fall of 1992 and the public announcement of Java in the spring of 1995, many more people contributed to the design and evolution of the language. Bill Joy, Arthur van Hoff, Jonathan Payne, Frank Yellin, and Tim Lindholm were key contributors to the maturing of the original prototype. Somewhat surprisingly, the original impetus for Java was not the Internet; instead, the primary motivation was the need for a platform-independent language that could run on a variety of consumer electronic devices.

Why JAVA?: Java is not just a language. Java is an all-purpose language that can be successfully used for all kinds of software, including data processing software, scientific software, system software, and real-time software, as well as portable web software. The Internet helped catapult Java to the forefront of programming, and Java, in turn, has had a profound effect on the Internet. The reason for this is quite simple: Java expands the universe of objects that can move about freely in cyberspace. In a network, two very broad categories of objects are transmitted between the server and your personal computer:

1) Passive information and dynamic, active programs. For example, when you read your e-mail, you are viewing passive data. Even when you download a program, the program's code is still only passive data until you execute it. However, a second type of object can be transmitted to your computer: a dynamic, self-executing program.

2) An application is a program that runs on your computer, under the operating system of that computer. That is, an application created with Java is more or less like one created using C or C++. When used to create applications, Java is not much different from any other computer language. Rather, it is Java's ability to create applets that makes it important. An applet is an application designed to be transmitted over the Internet and executed by a Java-compatible Web browser. An applet is actually a tiny Java program, dynamically downloaded across the network, just like an image, sound file, or video clip. The important difference is that an applet is an intelligent program, not just an animation or media file. In other words, an applet is a program that can react to user input and dynamically change, not just run the same animation or sound over and over.

re likely aware, every time that you download a

“normal” program, you are risking a viral infection. Prior to

Java, most users did not download executable programs

frequently, and those who did scanned them for viruses prior to

execution. Even so, most users still worried about the possibility

of infecting their systems with a virus. In addition to viruses,

another type of malicious program exists that must be guarded

against. This type of program can gather private information,

such as credit card numbers, bank account balances, and

passwords, by searching the contents of your computer's local

file system. Java answers both of these concerns by providing a

"firewall" between a networked application and your computer.

When you use a Java-compatible Web browser, you can safely

download Java applets without fear of viral infection or

malicious intent. Java achieves this protection by confining a

Java program to the Java execution environment and not

allowing it access to other parts of the computer. (You will see

how this is accomplished shortly.) The ability to download

applets with confidence that no harm will be done and that no

security will be breached is considered by many to be the single

most important aspect of Java.

Portability :As discussed earlier, many types of computers and

operating systems are in use throughout the world—and many

are connected to the Internet. For programs to be dynamically

downloaded to all the various types of platforms connected to

the Internet, some means of generating portable executable code

is needed. As you will soon see, the same mechanism that helps

ensure security also helps create portability. Indeed, Java’s

solution to these two problems is both elegant and efficient.

JDBC (Java Database Connectivity) :Java provides various interfaces for connecting to a database and executing SQL statements. Using the JDBC interfaces you can execute normal statements, dynamic SQL statements, and stored procedures that take both IN and OUT parameters. JDBC enables an application to connect to any database using the various drivers and a set of Java API (Application Programming Interface) objects and methods. A database is a collection of related information, and a database management system (DBMS) is software that provides a mechanism to retrieve, modify, and insert data in the database.

For the application to communicate with the database it needs

the following information:

1. The RDBMS/DBMS product with which the database was created.

2. The location of the database.

3. The name of the database.
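As an illustrative sketch, the three pieces of information above are typically combined into a JDBC connection URL; the product, host, port, and database name below are hypothetical examples:

```java
// Builds a JDBC URL from the three pieces of information listed above.
// The product, location, and database name are hypothetical examples.
public class JdbcUrlDemo {
    public static void main(String[] args) {
        String product  = "mysql";          // 1. RDBMS/DBMS product
        String location = "localhost:3306"; // 2. location of the database
        String name     = "hospital_db";    // 3. name of the database

        String url = "jdbc:" + product + "://" + location + "/" + name;
        System.out.println(url);
        // A real application would then obtain a connection with:
        //   Connection con = DriverManager.getConnection(url, user, password);
    }
}
```

With a live MySQL server, passing this URL to DriverManager.getConnection would yield a Connection object for executing statements.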

JDBC APPLICATION ARCHITECTURE:

ISSN: 2348 - 8387 www.internationaljournalssrg.org Page 110
SSRG International Journal of Computer Science and Engineering- (ICET'17) - Special Issue - March 2017

Figure 9 JDBC Application Architecture

JDBC interfaces:

CallableStatement: Enables you to execute stored procedures that have OUT parameters.

Connection: The interface that connects the Java application to the database.

DatabaseMetaData: Provides Java with information concerning the database.

Driver: The interface that every JDBC driver class implements; the DriverManager uses it to establish connections to the database.

PreparedStatement: An interface used to execute dynamic SQL statements and stored procedures.

ResultSet: An interface that provides information on a result set returned from the database.

Statement: An interface for executing normal SQL statements.

Table 1 JDBC Interface
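As a hedged sketch of how two of these interfaces fit together (the patient table and its columns are hypothetical, and a live Connection would be needed to run the query itself):

```java
import java.sql.*;

// Sketch: executing a dynamic (parameterized) SQL statement through
// PreparedStatement and reading the ResultSet. The table and column
// names are hypothetical.
public class PreparedStatementSketch {
    static final String QUERY = "SELECT name FROM patient WHERE id = ?";

    // Would run against a real database when given a live Connection.
    static String findName(Connection con, int id) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(QUERY)) {
            ps.setInt(1, id); // bind the '?' placeholder
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }

    public static void main(String[] args) {
        // No database is assumed here; just show the statement being prepared.
        System.out.println(QUERY);
    }
}
```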

Java objects:

Java also provides a handful of objects that you can use in your Java application. Most of these objects are used to provide Java with the database-specific data types available in most databases.

JDBC API Objects:

Some of the important JDBC API objects are as

follows:

Date: Provides an object that can accept database date values.

DriverManager: Provides another way to make a connection to a database.

DriverPropertyInfo: Used to manage the driver.

Time: Provides an object that can accept database time values.

Table 2 JDBC API Objects

Advantages of JDBC:

One of the main reasons to use JDBC is that it is considerably faster than the CGI (Common Gateway Interface) approach. Using CGI usually requires calling another program that must be executed by the computer. This separate program then accesses the database, processes the data, and returns it to the calling program in a stream. This requires multiple levels of processing, which in turn increases wait time and allows more bugs to appear. Executing a JDBC statement against the database requires only a service that processes the SQL commands through the database, which dramatically speeds up the execution of SQL statements.

MYSQL:

DBMS AND RDBMS:

Data: Collection of facts in raw form that becomes information after proper organization or processing.

Database: Collection of data files arranged into an integrated file system that helps in minimizing duplication of data and provides convenient access to information within the system.


Benefits of database system:

Minimal redundancy: Redundant data are expensive in terms of space and require more than one updating operation. Further, as different copies of the data may be in different stages of updating, the system may not give consistent information. A database system can therefore be used to eliminate redundancy whenever possible.

Data consistency: When a change occurs in a data item, every file that contains that field should be updated to reflect the change. A database system performs this function efficiently.

Data sharing: The central database can be located on a server and shared by several users. This sharing enables central storage of the database.

Data integrity: When many users put data into the database, it is very important that the data items and the associations between data items are not destroyed. Integrity checks can be performed at the data level itself, by checking that data values conform to certain specified rules.

Privacy and security: The information stored should be kept in such a way that it does not get lost or corrupted. Privacy means that individuals, and corporations as a whole, should have the right to determine for themselves when, how, and to what extent information about them is transmitted to others. Security of data means protecting the data against accidental or intentional disclosure to unauthorized persons and against unauthorized modification or destruction.

Database management system: A collection of programs for storing and retrieving data from a database is called a database management system. It is software used for the management, maintenance, and retrieval of data stored in a database. It is an integrated set of computer programs that collectively provide an organization with all the capabilities of centralized data management; the details of the physical database structure are handled by the DBMS.

Advantages of DBMS:

Create a database, define its contents.

Back up the data.

Enforce data security.

Organize and maintain the physical storage.

Disadvantages of DBMS:

It allows only a single user to access the database.

Information cannot be retrieved from more than one database at the same time.

Though it uses powerful hardware, if the database fails, all operations based on the database will be suspended.

Relational database management system: A relational database is based on the concept that related data is organized and stored in a series of two-dimensional tables known as relations. In a relation, a row is called a tuple and a column is called an attribute. An RDBMS is a suite of software programs that can be used for creating, maintaining, modifying, and manipulating a relational database, and is one of the basic types of DBMS.

Difference: The difference of two relations is a third relation containing the tuples that occur in the first relation but not in the second.

Selection: The select operation on a relation results in another relation with the same attributes, with a body containing the tuples for which the condition holds true.

Projection: Projection is an operation that selects specified attributes from a relation. The result of a projection is a new relation containing only the selected attributes; it picks columns out of relations.

Division: For a division operation, the attributes of the second relation must be a subset of those of the first relation. The first relation divided by the second is a relation whose attributes consist of all attributes of the first relation not included in the second relation.

Join: A join combines the data spread across tables by combining the specified rows of the tables. A common column provides the join condition. To avoid ambiguity, the reference to the table must be made: if the tables in a join have a column with the same name, the columns are distinguished by prefixing the table name.
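The operations above can be illustrated in plain Java, treating a relation as a list of maps; the patient relation and its attribute names are invented for the example:

```java
import java.util.*;
import java.util.stream.*;

// Toy illustration of selection, projection, and difference on an
// in-memory "relation". Table and column names are hypothetical.
public class RelAlgDemo {
    public static void main(String[] args) {
        List<Map<String, Object>> patients = List.of(
            Map.of("id", 1, "name", "Asha",  "age", 34),
            Map.of("id", 2, "name", "Ravi",  "age", 61),
            Map.of("id", 3, "name", "Meena", "age", 45));

        // Selection: keep tuples where the condition holds true.
        List<Map<String, Object>> over40 = patients.stream()
            .filter(t -> (int) t.get("age") > 40)
            .collect(Collectors.toList());
        System.out.println(over40.size());

        // Projection: pick a column out of the relation.
        Set<Object> names = patients.stream()
            .map(t -> t.get("name"))
            .collect(Collectors.toSet());
        System.out.println(names.contains("Ravi"));

        // Difference: tuples in the first relation but not in the second.
        List<Map<String, Object>> diff = patients.stream()
            .filter(t -> !over40.contains(t))
            .collect(Collectors.toList());
        System.out.println(diff.get(0).get("name"));
    }
}
```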

Integrity constraint:

An integrity constraint is a mechanism to prevent invalid

data entry into the table. Some types of integrity constraints are:

1. Domain integrity constraints.

2. Entity integrity constraints.

3. Referential integrity constraint.


Domain integrity constraint: This verifies whether the data entered is in the proper form and also sets a range for the input data.

Entity integrity constraint: These constraints are used to enforce consistency in the database. Anything recorded in the database is called an entity. Each entity type corresponds to a table, and each record represents an instance of that entity. An entity integrity constraint is used to identify each row in a table uniquely.

Types of entity integrity constraints:

a. Unique constraint.

b. Primary key constraint.

Referential integrity constraints: A referential integrity constraint enforces a relationship between tables. A foreign key constraint establishes a column or a combination of columns as a foreign key and creates a relationship with a specified primary or unique key in another table, called the referenced key. The table containing the referenced key is called the parent table, and the table containing the foreign key is called the child table.
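All three constraint types can appear in one table definition; the following sketch simply prints a hypothetical DDL statement that uses them (the table and column names are invented):

```java
// Prints a hypothetical CREATE TABLE statement illustrating domain,
// entity, and referential integrity constraints.
public class ConstraintDdlDemo {
    public static void main(String[] args) {
        String ddl =
            "CREATE TABLE visit (" +
            "visit_id INT PRIMARY KEY, " +            // entity integrity
            "duration INT CHECK (duration > 0), " +   // domain integrity
            "patient_id INT REFERENCES patient(id)" + // referential integrity
            ")";
        System.out.println(ddl);
    }
}
```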

Advantages of RDBMS over DBMS:

An RDBMS provides a flexible mechanism for designing a database in such a way as to meet the changing needs of an application.

It allows multiple users to simultaneously access the

database.

It allows the retrieval of information from more than

one table by linking the tables.

It provides an efficient security mechanism to the

database.

Disadvantages:

The performance of an RDBMS on a smaller system, such as a PC, is considerably reduced.

It requires enormous storage space.

INPUT DESIGN AND OUTPUT DESIGN

INPUT DESIGN :The input design is the link between the information system and the user. It comprises developing specifications and procedures for data preparation, that is, the steps necessary to put transaction data into a usable form for processing. This can be achieved by instructing the computer to read data from a written or printed document, or by having people key the data directly into the system. The design of input focuses on controlling the amount of input required, controlling errors, avoiding delay, avoiding extra steps, and keeping the process simple. The input is designed to provide security and ease of use while retaining privacy. Input design considers the following things:

1. What data should be given as input?

2. How should the data be arranged or coded?

3. The dialog to guide the operating personnel in providing input.

4. Methods for preparing input validations and steps to follow when errors occur.
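Point 4 (input validation with appropriate error messages) can be sketched as follows; the field name and accepted range are hypothetical:

```java
// Minimal sketch of input validation with descriptive error messages.
// The "age" field and its 0-150 range are hypothetical examples.
public class InputValidationDemo {
    static String validateAge(String raw) {
        try {
            int age = Integer.parseInt(raw.trim());
            if (age < 0 || age > 150) return "Error: age out of range (0-150)";
            return "OK";
        } catch (NumberFormatException e) {
            return "Error: age must be a number";
        }
    }

    public static void main(String[] args) {
        System.out.println(validateAge("42"));
        System.out.println(validateAge("abc"));
    }
}
```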

OBJECTIVES :1. Input design is the process of converting a user-oriented description of the input into a computer-based system. This design is important to avoid errors in the data input process and to show the correct direction to the management for getting correct information from the computerized system.

2. It is achieved by creating user-friendly screens for data entry that can handle large volumes of data. The goal of designing input is to make data entry easier and free from errors. The data entry screen is designed in such a way that all data manipulations can be performed. It also provides record-viewing facilities.

3. When data is entered, it is checked for validity. Data can be entered with the help of screens, and appropriate messages are provided as needed, so that the user is never left confused. Thus the objective of input design is to create an input layout that is easy to follow.

OUTPUT DESIGN :A quality output is one which meets the requirements of the end user and presents the information clearly. In any system, the results of processing are communicated to the users and to other systems through outputs. In output design it is determined how the information is to be displayed for immediate need, as well as the hard-copy output. Output is the most important and direct source of information for the user. Efficient and intelligent output design improves the system's relationship with the user and helps in decision-making.


1. Designing computer output should proceed in an organized, well-thought-out manner; the right output must be developed while ensuring that each output element is designed so that people will find the system easy to use and effective. When analysts design computer output, they should identify the specific output that is needed to meet the requirements.

2. Select methods for presenting information.

3. Create document, report, or other formats that

contain information produced by the system.

The output form of an information system should

accomplish one or more of the following objectives.

Convey information about past activities, current status, or projections of the future.

Signal important events, opportunities, problems, or

warnings.

Trigger an action.

Confirm an action.

METHODOLOGY :Software methodologies are concerned

with the process of creating software - not so much the technical

side but the organizational aspects. In this, the first of two

articles, I will introduce the different types of methodologies. It

will describe them from a historical perspective, since one way to understand where a project is and where it is going is to know where it has been. There will also be a

subsequent companion article looking at the current state of the

art and what I believe the future holds. This second article is

given the rather controversial title of "Why Creating Software is

not engineering", which I will, of course, explain. In the 2nd

article I will discuss several popular agile methodologies

particularly their most important aspects (such as unit testing),

which are generally neglected or glossed over.

Waterfall -Conventional wisdom is to ask an expert for help

and there are many willing helpers ready to provide their

services, for a fee, of course. You find that you need an

"analyst" to work out where you really want to go and a

"designer" to provide detailed unambiguous instructions on how

to get there. The analyst works out by deduction and/or educated

guesswork exactly which "Station Hotel" you want. Perhaps

they even manage to contact Jim to confirm the location. They

also find out the exact address and the opening hours. Now you

know exactly where you want to go but how to get there? The

designer provides a "specification" or instructions for getting to

the hotel - eg proceed 2 miles to the roundabout, take the 3rd

exit, etc. To ensure that the driver understands the instructions

the essential parts are even translated into his native language. A

good designer might also try to anticipate problems and have

contingency plans - eg, if the freeway s alternative.

The essential point of the specification is to have the

trip completely and thoroughly planned before starting out.

Everybody involved reads the specification and agrees that this

should get the customer to the pub on time. Can you see a

problem with this approach? While the analyst/designer is busy

at work you (the customer) are getting a bit nervous. It's been

some time and you still haven't gone anywhere. You also want

feedback that once you start the trip everything will stay on

track, since the experience of taxi journeys is that they can be

very unpredictable and the driver never gives any indication of

whether he is lost or on course. You need a "plan" so that you

can check that everyone is doing their job and that if something

is amiss it will be immediately apparent. The plan will also

require the driver to regularly report his position so you know if

he is going to be late or not get there at all. For a large project

you will need a "project manager" to formulate the plan.

Prototype :The prototype methodology addresses the major

problem of the waterfall methodology's "specification", which is

that you are never sure it will work until you arrive at the

destination. It is difficult or impossible to prove that the

specification is correct so just create a simple working example

of the product, much like an architect would create a scale

model of a proposed new building. To continue the taxi analogy

the designer, or a delegate, grabs a motorbike to first check that

you can actually get to the Station Hotel and even explore a few

alternative routes. When the motorbike driver has found a good

way the actual taxi trip can begin.

Iterative :The precursor to today's agile methodologies can be

loosely grouped as "iterative". They are also often described as

"cascade" methods as they are really just a series of small

waterfalls. I am not going to discuss these in detail as they were

not very widely used or very successful. They were more an

attempt to fix the problems of the waterfall methodology rather

than to bypass them altogether. As such they had many of the

same problems, and even exacerbated some.

Agile :The main problem with the waterfall approach is that it takes a lot of up-front effort in planning, analysis and design. When it comes to the actual implementation there are so many variables and unknowns that it is very unlikely to go to plan. The prototype approach meant the project could eliminate some of this uncertainty by demonstrating that there is a reasonable chance of success. However, there is still a lot of up-front effort to design and build the prototype before the real development even starts. What if the project could be divided into a sequence of steps, so that at each stage we can demonstrate that we are closer to the


final product? After each step we produce a working (but not final) product, so that the customer can see that the project is on the right track.

Research Project Development Processes

The key research project management processes, which run through all of these phases, are:

Phase management

Planning

Control

Procurement

Integration

CONCLUSION :The clinical decision support system (CDSS), which uses advanced data mining techniques to help clinicians make proper decisions, has received considerable attention recently. The advantages of a clinical decision support system include not only improving diagnosis accuracy but also reducing diagnosis time. We have proposed a clinical decision support system using a Support Vector Machine. Using the SVM, the computational time and diagnosis rate will improve. The patient can get proper diagnosis results more efficiently, according to their own preferences. The data transferred over the network is encrypted, so there is no loss of privacy of patients' data while training the SVM classifier and providing data on the network.
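For illustration only, the decision rule of an already-trained linear SVM is sign(w . x + b); the weights, bias, feature values, and class labels below are invented, not taken from the paper's experiments:

```java
// Illustrative decision rule of a trained linear SVM: sign(w . x + b).
// Weights, bias, features, and the +1/-1 label meanings are hypothetical.
public class LinearSvmDemo {
    static int classify(double[] w, double b, double[] x) {
        double s = b;
        for (int i = 0; i < w.length; i++) s += w[i] * x[i];
        return s >= 0 ? 1 : -1; // +1 = positive class, -1 = negative class
    }

    public static void main(String[] args) {
        double[] w = {0.8, -0.4}; // hypothetical trained weights
        double b = -0.1;          // hypothetical bias
        System.out.println(classify(w, b, new double[]{1.0, 0.5}));
        System.out.println(classify(w, b, new double[]{0.0, 1.0}));
    }
}
```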

