dmdw-mid 2


Upload: nikhil-menda

Post on 19-Nov-2014


Page 1: dmdw-mid 2

JNTU ONLINE EXAMINATIONS [Mid 2 - DMDW]

1. percent(A,"70,71 _ _ _ 80") => placement(A,"Infosys")
percent(A,"70,71 _ _ _ 80") => placement(A,"Microsoft")
percent(A,"70,71 _ _ _ _ 80") => placement(A,"Dell")
percent(A,"70,71 _ _ _ _ 80") => placement(A,"IBM")
This set of rules clearly refers to a _ _ _ _ _ _ _ _ _ _ _ _ _ _ rule.

a. Boolean association
b. Quantitative association
c. Single dimensional association
d. Multi dimensional association

2. The following rule
Age(A,"20,21 _ _ _ _ 27") Λ percent(A,"60,61 _ _ _ 80") Λ test(A,"B,B+ _ _ _ _ A+") => placement(A,"MNCs")
is an example of a _ _ _ _ _ _ _ _ rule.

a. Boolean association
b. Quantitative association
c. Single dimensional association
d. Multi dimensional association

3. A set of items is referred to as a(n) _ _ _ _ _ _ _ _ _ _

a. itemset
b. item set
c. set entity
d. entity set

4. car => financial from bank [loan=80 %, insurance=20 %]
Association rules are satisfied if they have _ _ _ _ _ _ _ _

a. maximum carshops, maximum bank branches
b. maximum banks, maximum loan threshold
c. maximum loan threshold, minimum insurance threshold
d. maximum bank threshold, maximum loan threshold, minimum insurance threshold

5. If X=>Y [a=50 %, b=2 %], then a(X=>Y) = _ _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ and b(X=>Y) = _ _ _ _ _ _ _ _ _ _

a. P(A∪B), P(A/B)
b. P(A∪B), P(B/A)
c. P(A∩B), P(A/B)
d. P(A∩B), P(B/A)
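Questions 4 to 7 revolve around the two usual measures attached to an association rule, support and confidence. The sketch below is only an illustration on hypothetical toy transactions (the data and the function names `support` and `confidence` are assumptions, not part of the exam): support is the fraction of transactions containing every item of the rule, and confidence is the conditional probability of the consequent given the antecedent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimate P(Y | X) as support(X and Y together) / support(X)."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


# Hypothetical toy data, invented for illustration only.
transactions = [
    {"car", "loan"},
    {"car", "loan", "insurance"},
    {"car"},
    {"loan"},
]

rule_support = support(transactions, {"car", "loan"})
rule_confidence = confidence(transactions, {"car"}, {"loan"})
```

On this toy data the rule car => loan holds in 2 of 4 transactions, and 2 of the 3 transactions containing car also contain loan.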

6. car => financial from bank [loan=80 %, insurance=20 %]
_ _ _ _ _ _ _ & _ _ _ _ _ are two measures of association rules.

a. car, bank
b. bank, loan
c. loan, insurance
d. bank, loan, insurance

7. The set X=>Y [a=50 %, b=2 %] is a _ _ _ _ _ _ _ _ _ _ _ itemset.

a. 1
b. 2
c. 3
d. 4

8. If a rule concerns associations between the presence or absence of items, it is a _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ rule.

a. Boolean association
b. Quantitative association
c. Frequent association
d. Transaction association

9. TID stands for _ _ _ _ _ _ _ _ _ _ _

a. Transaction is associated with an identifier
b. Transaction is Differentiable
c. Transaction identifier
d. Transaction is dissociated with an identifier


10. If a rule describes associations between quantitative attributes, it is a _ _ _ _ _ _ _ _ _ _ rule.

a. Boolean association
b. Quantitative association
c. Frequent association
d. Transaction association

11. If the transactional data is

 The minimum support count is _ _ _ _ _ _ _ _ _

a. 1
b. 2
c. 3
d. 4

12. The Apriori algorithm employs a level-wise search, where k-itemsets are used to explore ------ itemsets.

a. k
b. (k-1)
c. (k+1)
d.

13. Which step could involve huge computations?

a. join step
b. prune step
c. calculation step
d. logical step

14. If support count at  =  then support count at  = _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

15. The anti-monotone property states _ _ _ _ _ _ _ _

a. if a set cannot pass a test, all its supersets also cannot pass the same test
b. if a set cannot pass a test, all its supersets pass the test
c. if a set passes a test, all its supersets cannot pass the same test
d. if a set passes a test, all its subsets cannot pass the same test
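Questions 12, 13 and 15 refer to the level-wise candidate generation in Apriori and the property used to prune it. The following is a minimal sketch, not the exam's reference algorithm: the name `apriori_gen`, the toy itemsets, and the simplified join (any two frequent (k-1)-itemsets whose union has k items, rather than the textbook prefix-based join) are assumptions made for brevity. The prune step discards any candidate that has an infrequent (k-1)-subset.

```python
from itertools import combinations


def apriori_gen(frequent_prev, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets."""
    prev = {frozenset(s) for s in frequent_prev}
    # Join step (simplified): keep unions that contain exactly k items.
    joined = {a | b for a in prev for b in prev if len(a | b) == k}
    # Prune step: every (k-1)-subset of a candidate must itself be frequent.
    return {
        c for c in joined
        if all(frozenset(sub) in prev for sub in combinations(c, k - 1))
    }


# Hypothetical frequent 2-itemsets, invented for illustration only.
l2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
c3 = apriori_gen(l2, 3)
```

Here {A, B, D} and {B, C, D} are produced by the join but removed by the prune, because {A, D} and {C, D} are not frequent; only {A, B, C} survives.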

16. Transaction reduction implies _ _ _ _ _ _

a. reducing the number of transactions scanned till previous iterations
b. reducing the number of transactions in the current iteration
c. reducing the number of transactions in the future iteration
d. reducing the number of iterations in a transaction

17. _ _ _ _ _ _ _ _ _ is used to improve the efficiency of the Apriori algorithm.

a. Berg queries
b. Iceberg queries
c. Ice Burg queries
d. Ice Cube queries

18. Which threshold can be set up for passing down relatively frequent items to lower levels?

a. level-class threshold
b. level-shift threshold
c. level-passage threshold
d. level-jump threshold

19. When multi-level association rules are mined, some of the rules found will be redundant due to _ _ _ _ _ _ _ _ _ _ _ relationships between them.

a. hierarchical
b. multi-level
c. single-dimension
d. ancestor

20. The Manhattan distance between 2 tuples t1=  and t2=  is _ _ _ _ _ _ _ _

a.

b.

c.

d.

21. Consider the following rule:

Age(A,"18,19, _ _ _ _ _ 29") Λ placement(A,"Infosys,IBM, _ _ _ _ ") Λ purchases(A,"mobile") => purchases(A,"high memory card") Λ purchases(A,"card reader")

This rule is notable in that it has _ _ _ _ _ _ _ _

a. multiple predicates
b. single predicate
c. repetitive predicate
d. dependent predicate

22. The correlation between the occurrence of A and B can be measured by computing _ _ _ _ _ _ _ _ _ _ _ _ _ _

a. corr_{A,B} = P(A) / P(B)
b. corr_{A,B} = (P(A)P(B)) / P(A∩B)
c. corr_{A,B} = P(A∪B) / (P(A)P(B))
d. corr_{A,B} = P(A∩B) / (P(A)P(B))
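As an illustration of the correlation measure asked about in question 22, the sketch below computes the ratio of the joint probability (a transaction contains both A and B) to the product of the individual probabilities. Textbooks differ on whether that joint probability is written with ∪ or ∩, so the code avoids the notation entirely; the function name `correlation` and the toy baskets are assumptions. A value of 1 indicates independence, above 1 positive correlation, below 1 negative correlation.

```python
def correlation(transactions, a, b):
    """Joint probability of a and b divided by the product of their marginals."""
    n = len(transactions)
    p_a = sum(1 for t in transactions if a in t) / n
    p_b = sum(1 for t in transactions if b in t) / n
    p_both = sum(1 for t in transactions if a in t and b in t) / n
    return p_both / (p_a * p_b)


# Hypothetical baskets in which A and B occur independently.
baskets = [{"A", "B"}, {"A"}, {"B"}, set()]
independent = correlation(baskets, "A", "B")
```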

23. Which association rule has overcome the disadvantage of association rules?

a. quantitative association rules
b. Distance-based association rules
c. single dimensional association rules
d. Multi dimensional association rules

24. Let A[X] be the set of 'n' tuples t1, t2, _ _ _ _ _ _ _ _ tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?

a. radius
b. diameter
c. density
d. frequency

25. If a single distinct predicate exists in a single dimensional association rule, it is also called a(n) _ _ _ _ _ _ _ _ _ _

a. intra dimension association rule
b. inter dimension association rule
c. extra dimension association rule
d. quantitative association rule

26. If no repeated predicates exist in a multi dimensional association rule, it is also called a(n) _ _ _ _ _ _ _ _

a. intra dimension association rule
b. inter dimension association rule
c. extra dimension association rule
d. quantitative association rule

27. A multi dimensional association rule with repeated predicates, which contains multiple occurrences of some predicate, is called a _ _ _ _ _ _ _ _ _

a. repetitive association rule
b. recursive association rule
c. mixed association rule
d. hybrid association rule

28. The partitioning process is referred to as binning, and the intervals are considered as _ _ _ _ _ _ _ _ _ _

a. bins
b. modules
c. segments
d. time lapse

29. Which algorithm involves a series of walks through itemset space?

a. Apriori algorithm
b. Ancestor algorithm
c. random walk through algorithm
d. sequential walk through algorithm

30. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?

a. antimonotone, monotone, succinct, convertible
b. antimonotone, monotone, succinct, inconvertible
c. antimonotone, monotone, convertible, inconvertible
d. antimonotone, succinct, convertible, inconvertible

31. Consider the following rule: If an engineering student in Warangal bought "speech recognition CD" and "MS Office" and "jdk 1.7", it is likely (with a probability of 58 %) that the student also bought "SQL Server" and "My SQL Server", and 6.5 % of all the students bought all five. The meta rule can be generated in association rule form as _ _ _ _ _ _ _ _ _ _ _

a. lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) Λ sales(S,"SQL Server", _) => sales(S,"My SQL Server", _) [6.5 % 58 %]

b. lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) => sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [6.5 % 58 %]

c. lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) Λ sales(S,"SQL Server", _) => sales(S,"My SQL Server", _) [58 % 6.5 %]

d. lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) => sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [58 % 6.5 %]

32. A constraint such as "avg(I.marks) <= 70" is not a(n) _ _ _ _ _ _ _ _ _ _ _

a. anti-monotone
b. monotone
c. succinct
d. convertible

33. A constraint "max(I.marks) >= 600" is acceptable for _ _ _ _ _ _ & _ _ _ _ _ _ _ categories.

a. antimonotone, monotone
b. monotone, succinct
c. antimonotone, succinct
d. succinct, convertible

34. The constraint "max(I.marks) <= 600" is acceptable by _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ categories.

a. antimonotone, monotone
b. monotone, succinct
c. antimonotone, succinct
d. succinct, convertible

35. Which constraints specify the set of task-relevant data?

a. knowledge type constraints
b. data constraints
c. task-oriented constraints
d. rule constraints

36. Which constraints may be expressed as metarules?

a. knowledge type constraints
b. data constraints
c. interestingness constraints
d. rule constraints

37. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _ _ _ _ _ _ _ _ _ _ constraints.

a. knowledge type constraints
b. data constraints
c. interestingness constraints
d. rule constraints

38. Which constraints are applied before mining?

a. knowledge type and data constraints
b. data and dimension constraints
c. dimension and rule
d. knowledge and rule constraints

39. The constraint "support(s)   is acceptable by _ _ _ _ _ _ _ _ _ _ category.

a. Antimonotone
b. monotone
c. succinct
d. Anti-succinct

40. Preprocessing of data in preparation for classification and prediction can involve ------ for normalizing the data.

a. data cleaning
b. relevance analysis
c. data transformation
d. data redundancy

41. While _ _ _ _ _ _ _ predicts class, _ _ _ _ _ _ _ models continuous-valued functions.

a. prediction-classification
b. classification-prediction
c. speed-scalability
d. scalability-speed

42. _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ are the two major types of prediction problems.

a. classification-regression
b. classification-data cleaning
c. data cleaning-predictive accuracy
d. regressive-scalability

43. Which of the following is not one of the criteria used to compare classification and prediction methods?

a. speed
b. predictive accuracy
c. data cleaning
d. interpretability

44. Normalization scales values to fall within a range of _ _ _ _ _ _ _ _ _

a. -2.0 to +2.0
b. -1.0 to +1.0
c. +1.0 to +2.0
d. -2.0 to -1.0

45. _ _ _ _ _ _ _ _ _ _ _ is a two step process

a. data classification
b. data prediction
c. data hiding
d. data abstraction

46. The data tuples analyzed to build the model collectively form _ _ _ _ _ _ _ _ _ _

a. samples
b. training data set
c. untrained data set
d. supervised data set

47. In _ _ _ _ _ _ _ _ _ the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.

a. supervised learning
b. unsupervised learning
c. authorized learning
d. unauthorized learning

48. _ _ _ _ _ _ _ is a simple technique that uses a test set of class-labeled samples.

a. clustering method
b. holdout method
c. data classification
d. data learning

49. _ _ _ _ _ _ _ _ _ refers to the preprocessing of data in order to remove noise.

a. predictive accuracy
b. robustness
c. data cleaning
d. Interpretability

50. Early decision tree algorithms typically assume that the data is from _ _ _ _

a. memory
b. user input from keyboard
c. dynamic user input
d. mouse click

51. Decision trees can easily be converted to _ _ _ _ _ _ _ rules.

a. IF
b. Nested IF
c. IF-THEN
d. GROUP BY

52. During the construction of decision tree induction the tree starts as _ _ _ _ _ _ _ _ _

a. single node
b. dual child node
c. binary tree nodes
d. multi-valued nodes

53. Gain(A) = I(s1, s2, ..., sm) - E(A) where E(A) = _ _ _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

54. _ _ _ _ _ _ _ _ _ _ uses the concept to generalize the data by replacing lower-level data with high-level concepts.

a. analysis oriented induction
b. algorithm oriented induction
c. attribute oriented induction
d. approach oriented induction

55. In a decision tree, _ _ _ _ _ _ _ _ represents an outcome of the test.

a. internal node
b. branch
c. leaf nodes
d. root

56. The basic algorithm for decision tree induction is _ _ _ _ _ _ _ _

a. knapsack algorithm
b. greedy algorithm
c. traveling sales person algorithm
d. 0/1 knapsack algorithm

57. _ _ _ _ _ _ _ _ _ _ _ methods use statistical measures to remove the least reliable branches.

a. tree pruning
b. fragmentation
c. segmentation
d. classification

58. In how many approaches does tree pruning work?

a. 1
b. 2
c. 3
d. 4

59. Classification threshold is also called as _ _ _ _ _ _ _ _ _ _ _

a. exception threshold
b. SLIQ threshold
c. precision threshold
d. threading threshold

60. If an arc is drawn from node A to node B, then A is _ _ _ _ _ _ _ _ _ of B
i) parent
ii) immediate predecessor
iii) descendant
iv) immediate successor

a. only i
b. i & ii
c. only iii
d. iii & iv

61. Gaussian density function g(x, ) is defined by _ _ _ _ _ _ _ _ _ _ _

a.  

b.  

c.

d.

62. Bayes theorem provides a way of calculating which probability?

a. posterior
b. prior
c. stable
d. ideal

63. In Gaussian density function   stands for _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _ _

a. Standard deviation (SD) & mean
b. mean & SD
c. mode & SD
d. SD & median

64. P(H/X) is a _ _ _ _ _ _ _ _ probability.

a. posterior
b. prior
c. stable
d. ideal
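Questions 62 and 64 concern Bayes' theorem, which relates P(H/X) to P(X/H), P(H) and P(X). The sketch below is an illustration under assumed numbers: the function name `bayes_update` and the probabilities are hypothetical, and P(X) is obtained from the law of total probability over H and not-H.

```python
def bayes_update(p_h, p_x_given_h, p_x_given_not_h):
    """Return P(H|X) = P(X|H) * P(H) / P(X)."""
    # P(X) by total probability over the two cases H and not-H.
    p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)
    return p_x_given_h * p_h / p_x


# Hypothetical values: P(H) = 0.5, P(X|H) = 0.75, P(X|not H) = 0.25.
p_h_given_x = bayes_update(0.5, 0.75, 0.25)
```

With these numbers, observing X raises the probability of H from 0.5 to 0.75.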

65. Naive Bayesian classifier is also called as _ _ _ _ _ _ _ _ _ _ classifier.

a. computational Bayesian
b. simple Bayesian
c. non-computational Bayesian
d. Complexed Bayesian

66. CPT stands for _ _ _ _ _ _ _ _ _ _ _

a. Computer preparation test
b. Computational probability test
c. Conditional probability table
d. Computational probability table

67. In Bayesian networks, incomplete data is referred to as _ _ _ _ _ _ _ _ _ _

a. input data
b. Hidden data
c. output data
d. recursive data

68. The algorithm for "Training Bayesian Belief Networks" involves which sequence of steps?
i) compute the gradients
ii) renormalize the weights
iii) update the weights

a. i, ii, iii
b. ii, i, iii
c. i, iii, ii
d. ii, iii, i

69. If 'l' is the learning rate and 't' is the number of iterations, the relation between l and t is given by

a.

b.

c. l α t^2

d.

70. The error of a hidden layer unit i is given by _ _ _ _ _ _ _ _ _ where k is a node in the next hidden layer

a. O_i(1-O_i) Σ (Err_k W_{ik})^2

b.

c.

d.

71. A neural network containing N hidden layers is called a _ _ _ _ _ _ _ _ _ _ layered neural network

a. (N-1)
b. N
c. (N+1)
d. 2N


72. _ _ _ _ _ _ _ _ _ _ _ are modified so as to minimize the mean squared error between the network's prediction and the actual class

a. no of hidden layers
b. weights of the nodes
c. no of nodes
d. no of inputs to a node

73. Back propagation is a neural network _ _ _ _ _ _ _ _ _ _ _ algorithm

a. updation
b. learning
c. comparison
d. data mining

74. Neural network learning is also referred to as _ _ _ _ _ _ _ _ _ learning

a. sequential
b. connectionist
c. dependent
d. random

75. In a multilayer feed-forward NN the weighted outputs of a hidden layer are inputs to _ _ _ _ _ _ _ _ _ _ _ _

a. input layer
b. next hidden layer
c. output layer
d. input to another NN

76. Erri =   where   are _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _

a. desired o/p, true o/p
b. actual o/p, true o/p
c. desired o/p, error o/p
d. actual o/p, error o/p
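The error terms in questions 70 and 76 lost their formulas in this transcript. As a hedged illustration only, the sketch below uses the error terms commonly given for backpropagation with sigmoid units: for an output unit, the output times one minus the output times the difference between target and output; for a hidden unit, the same derivative factor times the weighted sum of the errors of the units it feeds. The function names and numbers are assumptions, and the exam's lost options may have used different symbols.

```python
def output_unit_error(output, target):
    """Error of an output unit with a sigmoid activation."""
    return output * (1 - output) * (target - output)


def hidden_unit_error(output, next_errors, weights_to_next):
    """Error of a hidden unit: derivative factor times the weighted
    sum of the errors of the units in the next layer."""
    weighted = sum(e * w for e, w in zip(next_errors, weights_to_next))
    return output * (1 - output) * weighted


# Hypothetical values chosen so the arithmetic is exact in floating point.
err_out = output_unit_error(output=0.5, target=1.0)
err_hidden = hidden_unit_error(output=0.5, next_errors=[err_out],
                               weights_to_next=[2.0])
```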

77. The technique of updating the weights and biases after the presentation of each sample is referred to as _ _ _ _ _ _ _ _ _

a. node updating
b. case updating
c. epoch updating
d. layer updating

78. For the itemset {percent <= "59", placement = "no"}, whose support increases from 0.7 % in c1 to 92.6 % in c2, the growth rate is _ _ _ _ _ _ _ _ _

a. 0.7 %/92.6 %
b. 92.6 %/0.7 %
c. (92.6 %-0.7 %)/0.7 %
d. 92.6 %/(92.6 %-0.7 %)
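Question 78 asks about the growth rate of an itemset between two classes. The sketch below assumes the definition usually given for emerging patterns: the ratio of the support in the second class to the support in the first, taken as infinite when the first support is zero and the second is not. The function name `growth_rate` is an assumption; the two support values are the ones quoted in the question.

```python
def growth_rate(support_c1, support_c2):
    """Ratio of an itemset's support in class c2 to its support in c1."""
    if support_c1 == 0:
        # Conventionally infinite if the itemset appears only in c2.
        return float("inf") if support_c2 > 0 else 0.0
    return support_c2 / support_c1


# Supports taken from the question: 0.7 % in c1 and 92.6 % in c2.
rate = growth_rate(0.007, 0.926)
```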

79. The statement "If any student's age is above 20 and their percentage is 70 or more, they are approved for placement" can be written as a rule in rough set theory as _ _ _ _ _ _ _ _ _ _ _ _

a. if(x, age > 20)Λ if(x, percentage >= 70) then placement ="approved"
b. if(x, age > 20) Λ if(percentage >= 70) then placement ="approved"
c. if(age > 20) Λ if(percentage >= 70) => placement ="approved"
d. if(x, age > 20) Λ if(x, percentage >= 70) then placement =“approved”

80. _ _ _ _ _ _ _ _ _ _ _ is defined in terms of Euclidean distance

a. Min distance between two points
b. max distance between 2 points
c. Closeness between 2 points
d. mean between 2 points

81. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as

a. 101
b. 010
c. 001
d. 110

82. If a set of rules has the same condset, then the rule with the highest confidence is selected as _ _ _ _ _ _ _ _ _ _ to represent the set

a. probability rule
b. possible rule
c. posterior rule
d. proportionality rule

83. EP stands for _ _ _ _ _ _ _ _ _ _ _

a. emerging point
b. evolving point
c. emerging patterns
d. evolving patterns

84. JEP is a special case of EP, where J stands for _ _ _ _ _ _ _ _ _ _ _ _ _ _

a. joint
b. jazzing
c. jagging
d. jumping

85. _ _ _ _ _ _ _ _ _ are used to incorporate ideas of natural evolution

a. case based reasoning
b. genetic algorithms
c. rough set theory
d. fuzzy set approach

86. Rough set theory is based on _ _ _ _ _ _ _ _ _ _ classes within the given training data

a. symmetric
b. transitive
c. equivalence
d. trichotomy

87. In which operation are substrings from pairs of rules swapped to form new pairs of rules?

a. equivalence
b. CBR
c. crossover
d. mutation

88. Assume the following salary details:

X (years experience)    Y (salary in Rs.)
2                       13000
9                       45000
15                      75000
4                       18000

The value of x = _ _ _ _ _ _ _ _ _

a. 30
b. 8.0
c. 8.5
d. 9.0

89. Assume the following data:

X (years experience)    Y (salary)
02                      13000
09                      45000
15                      75000
03                      18000

Its β value is _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

a. 4698.6
b. 3698.6
c. 2698.6
d. 5698.6
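Questions 88 to 90 deal with fitting a straight line Y = α + βX by least squares. The sketch below shows the usual closed-form estimates on hypothetical data (not the salary table above, whose extracted values may be incomplete); the function name `fit_line` and the sample points are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for y = alpha + beta * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: covariance of x and y over the variance of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    beta = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    alpha = y_mean - beta * x_mean
    return alpha, beta


# Hypothetical points lying exactly on y = 1 + 2x.
alpha, beta = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```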

90. From the equation Y = α + βX, the coefficient α = _ _ _ _ _ _ _ _ _ _ _ _

a. Y-βX
b. Y-βx̄
c. y'-βx
d. y'- 

91. Apart from prediction, the log-linear model is also useful for _ _ _ _ _ _ _ _ _ _ _

a. image patterning
b. data compression
c. voice recognition
d. speech recognition

92. In Y = α + βX, α and β are _ _ _ _ _ _ _ _ _ _

a. constants
b. regression coefficients
c. variable coefficients
d. averages of X and Y

93. The bivariate linear regression model is represented as _ _ _ _ _ _ _ _ _ _ _ _

a. y = α + β x + γ x^2

b.

c.

d.

94. _ _ _ _ _ _ _ _ can be modeled by adding polynomial terms to the basic linear model.

a. linear regression
b. multiple regression
c. polynomial regression
d. Poisson regression

95. _ _ _ _ _ _ _ _ regression can be modeled by adding polynomial terms to the basic linear model

a. Linear
b. multiple
c. polynomial
d. binomial

96. Which regression is commonly used for modeling count data?

a. Linear regression
b. logistic regression
c. Poisson regression
d. multiple regression

97. Accuracy is given as _ _ _ _ _ _ _ _ _ _

a. specificity * (pos/(pos+neg)) + sensitivity * (neg/(pos+neg))
b. specificity * (neg/(pos+neg)) + sensitivity * (pos/(pos+neg))
c. sensitivity * (pos/(pos+neg)) + specificity * (neg/(pos+neg))
d. sensitivity * (neg/(pos+neg)) + specificity * (pos/(pos+neg))
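Question 97 relates accuracy to sensitivity and specificity. As a quick check under assumed counts, the sketch below computes sensitivity as the fraction of positive samples correctly recognised and specificity as the fraction of negative samples correctly recognised, then weights each by its class proportion; the result coincides with the plain fraction of correctly classified samples. The function name and the confusion-matrix counts are hypothetical.

```python
def accuracy_from_rates(tp, fn, tn, fp):
    """Accuracy as a class-weighted mix of sensitivity and specificity."""
    pos = tp + fn                      # number of positive samples
    neg = tn + fp                      # number of negative samples
    sensitivity = tp / pos             # true positive rate
    specificity = tn / neg             # true negative rate
    return (sensitivity * pos / (pos + neg)
            + specificity * neg / (pos + neg))


# Hypothetical confusion-matrix counts.
acc = accuracy_from_rates(tp=40, fn=10, tn=30, fp=20)
direct = (40 + 30) / (40 + 10 + 30 + 20)
```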

98. Training set & test set are two sets of _ _ _ _ _ _ _ _ _ _ _

a. holdout method
b. classifier
c. sensitivity
d. k-fold cross-validation

99. Sensitivity & specificity can be used to measure _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _

a. +ve samples & -ve samples
b. -ve samples & +ve samples
c. speed & robustness
d. classification & prediction

100. In _ _ _ _ _ _ _ _ the class distribution of the samples in each fold is approximately the same as that in the initial data.

a. random subsampling
b. k-fold cross validation
c. stratified cross-validation
d. boot strapping

101. Accuracy estimates also help in _ _ _ _ _ _ _ _ _ _ _ _

a. known behavior of future classifiers
b. comparison of different classifiers
c. selection of classifier
d. increasing classifier accuracy

102. Into how many independent sets are the given data randomly partitioned in the holdout method?

a. 2
b. 3
c. 4
d. 5

103. Which method of estimating classifier accuracy samples the given training instances uniformly with replacement?

a. k-fold cross-validation
b. stratified cross validation
c. boot strapping
d. leave-one-out
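Question 103 mentions sampling the training instances uniformly with replacement. The sketch below illustrates one such resampling round: n draws are made from n instances, so some instances repeat and the ones never drawn can serve as a test set. The function name `resample_with_replacement`, the seed, and the use of the never-drawn instances as a test set are assumptions for illustration.

```python
import random


def resample_with_replacement(samples, seed=0):
    """Draw len(samples) instances uniformly with replacement."""
    rng = random.Random(seed)
    n = len(samples)
    picked = [rng.randrange(n) for _ in range(n)]
    train = [samples[i] for i in picked]
    chosen = set(picked)
    # Instances that were never drawn form the held-out set.
    held_out = [s for i, s in enumerate(samples) if i not in chosen]
    return train, held_out


train, held_out = resample_with_replacement(list(range(10)))
```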

104. _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ are the techniques for improving overall classifier accuracy by learning and combining a series of individual classifiers.

a. bogging & boosting
b. bagging & banging
c. bagging & boosting
d. bagging & bogging

105. _ _ _ _ _ _ _ _ _ is used to assess the percentage of samples.

a. sensitivity
b. specificity
c. precision
d. decision

106. Bootstrap is also known as _ _ _ _ _ _ _ _ _

a. bagging
b. bogging
c. boosting
d. banging

107.  is _ _ _ _ _ _ _ _ _ _ _

a. simple match coefficient
b. Jaccard coefficient
c. transition coefficient
d. regression coefficient

108.  where m and p are _ _ _ _ _ _ _ _ _ _ _ _

a. number of mappings and number of patterns
b. number of matches and number of variables
c. number of mappings and number of clusters
d. number of matches and number of clusters

109. Sp =  calculates _ _ _ _ _ _ _ _ _ _

a. Euclidean distance
b. Manhattan distance
c. z-score
d. mean absolute deviation

110.  is _ _ _ _ _ _ _ _

a. simple matching coefficient
b. Jaccard coefficient
c. transition coefficient
d. regression coefficient

111. A _ _ _ _ _ _ _ _ _ resembles a nominal variable.

a. normal variable
b. discrete ordinal variable
c. continuous ordinal variable
d. Poisson ordinal variable

112. The process of grouping a set of physical objects into classes of similar objects is called _ _ _ _ _ _

a. moduling
b. segmenting
c. clustering
d. machine learning

113. Clustering is a form of _ _ _ _ _ _ _ _ _ _ _

a. learning by practice
b. learning by examples
c. learning by observation
d. learning by testing

114. Interval-scaled variables are _ _ _ _ _ _ _ _ measurements of a linear scale.

a. discrete
b. continuous
c. differentiable
d. non-continuous

115. The data matrix is often called as _ _ _ _ _ _ _

a. one-mode matrix
b. two-mode matrix
c. zero-mode matrix
d. poly-node matrix

116. Object-by-object structure is also known as _ _ _ _ _ _ _ _ _ _ _ _

a. difference matrix
b. data matrix
c. Dissimilarity matrix
d. Identity matrix

117. Squared-error criterion is defined as _ _ _ _ _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

118. The computational complexity of CLARANS is _ _ _ _ _ _ _ _ _ _

a. O(n)
b. O(n^2)
c. O(log n)
d. O(n log n)

119. _ _ _ _ _ _ _ _ _ can be used to find the most "natural" number of clusters using a silhouette coefficient.

a. PAM
b. CLAPP
c. CLARA
d. CLARANS

120. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located near the center of the cluster.

a. k-means
b. k-medians
c. k-medoids
d. k-modes

121. The agglomerative approach is also called as _ _ _ _ _ approach.

a. top-down
b. bottom-up
c. sequential
d. random

122. Most partitioning methods cluster objects based on _ _ _ _ _ _ _ _

a. number of clusters
b. distance between objects
c. number of objects in each class
d. learning rate

123. _ _ _ _ _ _ _ _ _ _ methods quantize the object space into a finite number of cells that form a grid structure.

a. hierarchical methods
b. density-based methods
c. grid-based methods
d. model-based methods

124. _ _ _ _ _ _ _ is a density-based method that computes an augmented clustering.

a. DBSCAN
b. OPTICS
c. STING
d. CLIQUE

125. EM is expanded to _ _ _ _ _ _ _ _ _

a. entity maximization
b. exception maximization
c. expectation maximization
d. earning maximization

126. Clustering large applications can be shortened as _ _ _ _ _ _ _ _ _

a. CLA
b. CLAPP
c. CLARA
d. CLULA

127. The relative interconnectivity between 2 clusters   is given by _ _ _ _ _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

128. The absolute closeness between 2 clusters, normalized w.r.t the internal closeness of two clusters is _ _ _ _ _ _ _ _ _ _

a. relative distance
b. relative interconnectivity
c. relative density
d. relative closeness

129. -------- is a triplet summary information about sub clusters of objects

a. clustering features
b. clustering feature tree
c. Chameleon
d. density feature

130. Which method overcomes the problem of favoring clusters with spherical shape and similar sizes?

a. BIRCH
b. CURE
c. ROCK
d. STING

131. For given n objects the complexity of CURE is _ _ _ _ _ _ _ _ _ _

a. O(n)
b. O(n^2)
c. O(log n)
d. O(n log n)

132. AGNES is expanded to _ _ _ _ _ _ _ _

a. agglomerative nesting
b. aglomative nesting
c. agnomative nesting
d. aggomerative nesting

133. Which method represents each cluster by a certain fixed number of representative objects and shrinks them towards the center of the cluster?

a. BIRCH
b. CURE
c. ROCK
d. DBSCAN

134. CF tree has how many parameters?

a. 5
b. 4
c. 3
d. 2

135. Which algorithm overcomes the weaknesses of the CURE & ROCK algorithms?

a. DBSCAN
b. OPTICS
c. Chameleon
d. Wave Cluster

136. Which method doesn't handle categorical attributes?

a. BIRCH
b. CURE
c. ROCK
d. CLIQUE

137. Square wave influence function is set to 0 (zero) if _ _ _ _ _ _ _ _ _

a. d(x,y) = σ
b. d(x,y) > σ
c. d(x,y) < σ
d. d(x,y) ≠ σ
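The influence functions in this part of the paper come from density-based clustering, where the density at a point is the sum of the influences of all data points. The sketch below assumes the commonly cited definitions: a square wave influence that is 1 within distance σ and 0 beyond it, and a Gaussian influence that decays as exp(-d²/(2σ²)). The function names and sample points are assumptions, and the distance is taken to be Euclidean.

```python
import math


def square_wave_influence(distance, sigma):
    """1 inside the sigma-neighbourhood, 0 once the distance exceeds sigma."""
    return 0.0 if distance > sigma else 1.0


def gaussian_influence(distance, sigma):
    """Smoothly decaying influence, equal to 1 at distance 0."""
    return math.exp(-(distance ** 2) / (2 * sigma ** 2))


def density(point, data, sigma, influence):
    """Sum of the influences of every data point on `point`."""
    return sum(influence(math.dist(point, other), sigma) for other in data)


# Hypothetical 2-D points: one at the query point, one 5 units away.
points = [(0.0, 0.0), (3.0, 4.0)]
d_square = density((0.0, 0.0), points, 1.0, square_wave_influence)
```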

138. Gaussian influence function is given as _ _ _ _ _ _ _ _ _ _ _

a. e-

b. e-

c. e-

d. e-

139. Grid-based computation is _ _ _ _ _ _ _ _ _ _ _

a. cluster-independent
b. data independent
c. query independent
d. density independent

140. Which algorithm facilitates parallel processing?

a. STING
b. CLIQUE
c. OPTICS
d. ROCK

141. If a k-dimensional unit is dense, then its projections are in _ _ _ _ _ _ _ _ _ _ dimensional space.

a. k
b. (k-1)
c. (k+1)
d.

142. DBSCAN is _ _ _ _ _ _ _ _ _ clustering algorithm

a. partitioning methods
b. hierarchical methods
c. density-based methods
d. grid-based methods

143. Density connectivity is a _ _ _ _ _ _ _ _ relation.

a. closure
b. symmetric
c. transitive
d. trichotomy

144. The _ _ _ _ _ _ _ _ _ _ of an object is defined as the sum of influence functions of all data points

a. distance function
b. interconnectivity function
c. density function
d. closeness function

145. STING is _ _ _ _ _ _ _ _ clustering algorithm

a. partitioning methods
b. hierarchical methods
c. density-based methods
d. grid-based methods

146. Wave Cluster is _ _ _ _ _ _ _ _ algorithm from the following.
i) hierarchical
ii) density-based
iii) grid-based

a. i & ii
b. i & iii
c. ii & iii
d. only ii


147. _ _ _ _ _ _ _ can automatically result in the removal of outliers.

a. Wavelet transform
b. Wavelet outlier
c. data cube
d. DENCLUE

148. CLASSIT and Auto class are _ _ _ _ _ _ _ _ _ _

a. statistical approaches
b. neural network approaches
c. Wavelet transform
d. Outlier analysis

149. _ _ _ _ _ _ _ compete in a "winner-take-all" fashion for the object that is currently presented to the system.

a. supervised learning
b. unsupervised learning
c. competitive learning
d. cluster learning

150. _ _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are two operators of COBWEB.

a. merging, dividing
b. scratching, splitting
c. merging, splitting
d. adding, dividing

151. The larger the value, the greater the proportion of class members that share the attribute-value pair. This is applicable to _ _ _ _ _ _ _ _ _ _ _ _ _

a. interclass similarity
b. interclass dissimilarity
c. intraclass similarity
d. intraclass dissimilarity

152. The sequence of steps in conceptual clustering are _ _ _ _ _ _ _ _ _ _

a. characterization, clustering
b. classification, prediction
c. clustering, characterization
d. prediction, classification

153. _ _ _ _ _ _ _ has the ability to automatically adjust the number of classes in a partition.

a. CLIQUE
b. Wave cluster
c. COBWEB
d. Outlier analysis

154. SOM stands for _ _ _ _ _ _ _ _ _ _

a. self organizing maps
b. self originating maps
c. self outlier maps
d. self online maps

155. _ _ _ _ _ _ _ _ _ _ _ performs multidimensional clustering in 2 steps.

a. CLIQUE
b. Wave cluster
c. COBBWEB
d. Outlier analysis

156. _ _ _ _ _ _ _ _ _ _ _ _ is a signal processing technique that decomposes a signal into different frequency sub bands.

a. Wavelet transform
b. Outlier analysis
c. Outlier mining
d. SOMs

157. In index-based algorithm, if k represents dimensionality and n represents number of objects in the data set. The worst-case complexity is _ _ _ _ _ _ _ _ _ _ _

a. O(k*n)

b. O( *n)

c. O(k* )

d. O(k+n)

158. In cell-based algorithm, if k represents dimensionality and c is a constant. Its complexity is defined as _ _ _ _ _ _ _ _ _ _

a. O(  *n)

b. O(c* )

c. O( +n)

d. O(c+ )

159. Block and consecutive procedures are 2 basic types for _ _ _ _ _ _ _ _ _ _ _

a. calculating outliers
b. mining outliers
c. classification outliers
d. detecting outliers

160. If there are 'm' objects within the d-neighborhood of an outlier, it is later decided not to be an outlier because of its _ _ _ _ _ _ _ _ _ number of neighbors.

a. m
b. (m-1)
c. (m+1)
d.

161. _ _ _ _ _ _ _ _ _ doesn't require a metric distance between the objects.

a. Exception set
b. dissimilarity function
c. cardinality function
d. smoothing factor

162. Outlier detection and outlier analysis is a data mining task referred to as _ _ _ _ _

a. data mining
b. Outlier mining
c. task mining
d. Analyzer mining

163. Data which are inconsistent with the remaining set of data is called as _ _ _ _ _ _

a. metadata
b. Outliers
c. procedures
d. process

164. _ _ _ _ _ _ _ _ _ _ is used to identify outliers w.r.t the model.

a. outlier test
b. discordancy test
c. differentiation test
d. integration test

165. _ _ _ _ _ _ _ _ _ _ _ verifies whether an object is significantly large or small in relation to the distribution F.

a. outlier test
b. discordancy test
c. differentiation test
d. integration test


166. The _ _ _ _ _ _ _ _ states that discordant values are not outliers in distribution F, but contaminants from some other distribution G.

a. Inherent alternative distributionb. mixture alternative distributionc. slippage alternative distributiond. individual alternative distribution

167. It is difficult to construct an object cube containing _ _ _ _ _ _ _ _ _ _ dimension

a. generalizationb. keywordc. plan based. pattern

168. Which of the following are spatial operators? i)spatial-unionii)spatial-overlappingiii)spatial-intersectioniv)spatial-disjoint

a. i & iib. i & ivc. i,ii,iiid. i,ii,iv

169. Each attribute to simple-valued data for constructing a multi-dimensional data cube are called as _ _ _ _ _ _ _ _ _ _ _

a. meta cube

b. hyper cube

c. object cube

d. grid cube

170. _ _ _ _ _ _ _ _ _ is a task of mining significant patterns from a plan base.

a. plan classification

b. plan detection

c. plan mining

d. plan construction

171. _ _ _ _ _ _ _ _ _ _ notation can be used to represent a sequence of actions of the same type.

a. ( )

b.

c. +

d. *

172. Each object in a class is associated with _ _ _ _ _ _ _ _ _ _ _

a. object identifier & object name

b. object identifier & set of attributes

c. object name & set of attributes

d. object code & object type

173. A set-valued attribute may be _ _ _ _ _ _ _
i) homogeneous
ii) heterogeneous

a. only i

b. only ii

c. i or ii

d. i and ii

174. _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are important means of generalization.

a. attribution, aggregation

b. aggregation, approximation

c. approximation, analyzation

d. analyzation, attribution

175. A _ _ _ _ _ _ _ _ _ _ has complex tasks, graphics, images, videos, maps, voice, music etc.

a. meta data

b. multimedia database

c. MS-Access

d. SQL

176. _ _ _ _ _ _ _ _ is one where properties can be inherited from more than one super class.

a. multilevel inheritance

b. polymorphism

c. multiple inheritance

d. single inheritance

177. In _ _ _ _ _ _ _ _ _ signature, its image includes a composition of multiple features.

a. color histogram based

b. multi feature composed

c. wavelet based

d. region based granularity

178. The edge orientation of any image can be _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

179. _ _ _ _ _ _ _ _ is an optimization method for spatial association analysis.

a. progressive regression

b. progressive refinement

c. progressive coverage

d. refinement property

180. _ _ _ _ _ _ _ _ _ stores and manages a large collection of multimedia objects.

a. multimedia database system

b. database management system

c. relational database management system

d. graphical database system

181. A huge amount of space-related data are in _ _ _ _ _ _ _ _ _ forms.

a. images

b. ER diagrams

c. use case diagrams

d. audio

182. _ _ _ _ refers to the extraction of knowledge and spatial relationships not explicitly stored in spatial databases.

a. spatial data mining

b. spatial data warehouse

c. spatial data representation

d. spatial data knowledge representation

183. _ _ _ _ _ _ _ _ _ is a dimension whose primitive-level data are spatial but whose generalization, starting at a certain level, becomes non-spatial.

a. non-spatial dimensions

b. spatial to non-spatial dimensions

c. spatial to spatial dimensions

d. non-spatial to spatial dimensions

184. _ _ _ _ _ _ _ _ _ _ is a collection of pointers to spatial objects.

a. numerical measures

b. pointers measure

c. spatial measures

d. non-spatial measures

185. MBR stands for _ _ _ _ _ _ _ _ _ _ _, which is represented by 2 points and gives a rough estimation of a merged region.

a. memory buffer register

b. maximum bounding rectangle

c. minimum bounding rectangle

d. minimum bounding region
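
Note: the idea of describing a region by 2 points can be sketched in Python as follows. This is only an illustration under assumptions: the function names `bounding_rectangle` and `merge_rectangles` and the sample coordinates are hypothetical and do not come from the question paper.

```python
def bounding_rectangle(points):
    """Return the axis-aligned rectangle enclosing `points` as two corner points:
    (lower-left, upper-right)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


def merge_rectangles(a, b):
    """Rough estimate of a merged region: the smallest rectangle enclosing both inputs."""
    (ax1, ay1), (ax2, ay2) = a
    (bx1, by1), (bx2, by2) = b
    return (min(ax1, bx1), min(ay1, by1)), (max(ax2, bx2), max(ay2, by2))


# Hypothetical sample regions, each given as a list of (x, y) vertices.
region_a = bounding_rectangle([(1, 1), (3, 2), (2, 4)])
region_b = bounding_rectangle([(5, 0), (6, 3)])
merged = merge_rectangles(region_a, region_b)
print(region_a, region_b, merged)
```

Only the two corner points are stored per region, so merging two regions needs four comparisons regardless of how many vertices the original shapes had.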

186. Consider the spatial association rule Is-a(X,"office") Λ near-by(X,"house") => near-by(X,"university") [0.5 %, 80 %]. The rule states that _ _ _ _ _ _ _ _ percent of the offices that are close to houses are also close to universities.

a. 0.5 %

b. 80 %

c. 80.5 %

d. 79.5 %
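
Note: the two bracketed figures of such a rule are its support and its confidence. The sketch below shows how they could be computed. The toy data set, the field names and the function `support_confidence` are all assumptions made for illustration, and the toy data does not reproduce the 0.5 % support quoted in the question.

```python
def support_confidence(objects):
    """Support and confidence of the rule
    is-a(X, office) AND near-by(X, house) => near-by(X, university)."""
    antecedent = [o for o in objects if o["type"] == "office" and o["near_house"]]
    both = [o for o in antecedent if o["near_university"]]
    support = len(both) / len(objects)  # share of all objects matching the whole rule
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


# Hypothetical toy data: 10 spatial objects.
objects = (
    [{"type": "office", "near_house": True, "near_university": True}] * 4
    + [{"type": "office", "near_house": True, "near_university": False}] * 1
    + [{"type": "office", "near_house": False, "near_university": True}] * 2
    + [{"type": "shop", "near_house": True, "near_university": False}] * 3
)

support, confidence = support_confidence(objects)
print(f"support = {support:.0%}, confidence = {confidence:.0%}")
```

Here 5 offices are near houses and 4 of them are also near a university, so the confidence is 4/5, while the support is measured against all 10 objects.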

187. Precision= _ _ _ _ _ _ _ _ _ _

a.

b.

c.

d.

188. Consider the data
Original data: 3 7 2 0 7 2
Moving average of order 3: 4 3 2 6
Weighted (3,4,3):
The first weighted average value is _ _ _ _ _ _ _ _ _ _ _

a. 4.3

b. 5.5

c. 6.3

d. 7.7
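
Note: a weighted moving average of order 3 multiplies each value in the window by its weight and divides by the sum of the weights. A minimal Python sketch is given below. The function name `weighted_moving_average` is an assumption, and only the first window (3, 7, 2) of the data is used, since that is all the first value depends on.

```python
def weighted_moving_average(values, weights):
    """Slide a window of len(weights) over `values` and return the weighted means."""
    k = len(weights)
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i:i + k])) / total
        for i in range(len(values) - k + 1)
    ]


first_window = [3, 7, 2]
# Equal weights give the plain moving average of order 3: (3 + 7 + 2) / 3
print(weighted_moving_average(first_window, [1, 1, 1]))
# Weights (3, 4, 3): (3*3 + 4*7 + 3*2) / (3 + 4 + 3) = 43 / 10
print(weighted_moving_average(first_window, [3, 4, 3]))
```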

189. _ _ _ _ _ _ _ _ refer to the cycles which are long term oscillations about a trend line/curve which may/may not be periodic.

a. long term movements

b. cyclic movements

c. seasonal movements

d. random movements

190. The time interval "int=0" means _ _ _ _ _ _ _ _ _ _

a. no interval gap is allowed

b. gap is allowed after 0 seconds

c. time interval=0

d. sequence of time slots doesn't exist

191. Mining _ _ _ _ _ _ _ _ _ _ specifies the periodic behavior of the time series at some, but not all of the points in time.

a. full periodic pattern

b. partial periodic pattern

c. minimum periodic pattern

d. periodic association rules

192. Time series analysis is also referred to as _ _ _ _ _ _ _ _ _

a. decomposition

b. time-analyzer

c. series-analyzer

d. decentralized

193. _ _ _ _ _ _ _ _ is a database that consists of sequences of ordered events, with or without concrete notions of time.

a. multimedia database

b. sequence database

c. concrete database

d. relational database

194. DFT and DWT are two popular data-independent transformations where F stands for _ _ _ _ _ _ _ _ _ _ _ _

a. faster

b. freedom

c. Fourier

d. frequency

195. _ _ _ _ _ _ _ _ _ finds all pairs of gap-free windows of a small length that are similar.

a. atomic matching

b. window stitching

c. sub-sequence ordering

d. context switching

196. Web linkage structures, web contents etc., are included in _ _ _ _ _ _ _ _ _

a. data mining

b. text mining

c. web mining

d. multimedia mining