knowledge graph representation via similarity-based …

Research ArticleKnowledge Graph Representation viaSimilarity-Based Embedding

Zhen Tan Xiang Zhao Yang Fang Bin Ge andWeidong Xiao

Science and Technology on Information Systems Engineering Laboratory National University of Defense TechnologyChangsha Hunan 410073 China

Correspondence should be addressed to Zhen Tan tanzhen08anudteducn and Xiang Zhao xiangzhaonudteducn

Received 16 March 2018 Revised 27 April 2018 Accepted 13 May 2018 Published 15 July 2018

Academic Editor Juan A Gomez-Pulido

Copyright copy 2018 Zhen Tan et alThis is an open access article distributed under the Creative CommonsAttribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Knowledge graph a typical multi-relational structure includes large-scale facts of the world yet it is still far away fromcompleteness Knowledge graph embedding as a representation method constructs a low-dimensional and continuous space todescribe the latent semantic information and predict themissing facts Among various solutions almost all embeddingmodels havehigh time andmemory-space complexities and hence are difficult to apply to large-scale knowledge graphs Some other embeddingmodels such as TransE and DistMult although with lower complexity ignore inherent features and only use correlations betweendifferent entities to represent the features of each entity To overcome these shortcomings we present a novel low-complexityembedding model namely SimE-ER to calculate the similarity of entities in independent and associated spaces In SimE-EReach entity (relation) is described as two parts The entity (relation) features in independent space are represented by the featuresentity (relation) intrinsically owns and in associated space the entity (relation) features are expressed by the entity (relation)features they connect And the similarity between the embeddings of the same entities in different representation spaces is highIn experiments we evaluate our model with two typical tasks entity prediction and relation prediction Compared with the state-of-the-art models our experimental results demonstrate that SimE-ER outperforms existing competitors and has low time andmemory-space complexities

1 Introduction

Knowledge graph (KG) as an important part of the artificialintelligence is playing an increasingly more essential rolein different domains [1] question answer system [2 3]information retrieval [4] semantic parsing [5] named entitydisambiguation [6] biological data mining [7 8] and soon [9 10] In knowledge graphs facts can be denoted asinstances of binary relations (egPresidentOf (DonaldTrumpAmerican)) Nowadays a great number of knowledge graphssuch as WordNet [11] Freebase [12] DBpedia [13] YAGO[14] and NELL [15] usually do not appear simultaneouslyInstead they were constructed to describe the structuredinformation in various domains [16] and all of them are fairlysparse

Knowledge representation learning [17ndash19] is consideredas an important task to extract the latent features fromassociated space Recently knowledge embedding [20 21] aneffective method of feature extraction [22] was proposed to

compress a high-dimensional and sparse space into a low-dimensional and continuous space Knowledge embeddingcan be used to derive new unknown facts from knownknowledge bases (eg link prediction) and to determinewhether a triplet is correct or not (eg triplets classification)[23] Moreover embedding representation [24] has beenused to support question answer systems [25] and machinereading [26]However almost all embeddingmodels only usethe features and attributes in knowledge graph to represententities and relations which omits the fact that entities andrelations are projections of the facts in independent spaceBesides almost all of themhave high time andmemory-spacecomplexities and cannot be used in large-scale knowledgegraphs

In this research we propose a novel similarity-basedknowledge embedding model namely SimE-ER which cal-culates the entity and relation similarities between twospaces (independent and associated spaces) A sketch of themodel framework is provided in Figure 1 The basic idea

HindawiScientific ProgrammingVolume 2018 Article ID 6325635 12 pageshttpsdoiorg10115520186325635

2 Scientific Programming

Independent Space Associated Space

Similarity

Similarity

Similarity

SteveJobs America

LaurenePowell

AppleInc

e3

e2

e1

e3

e2

e1

Figure 1 Framework of our model

Entity Pairs

Jack Ma

Jack Ma

Alibaba

Alipay

Tim Cook

Apple Inc

Google

SundarPichai

America

Laurene Powell

Apple Inc

Cou

pleO

f

Nationality

FoundOfe3 r1

Figure 2 Motivation in associated space

of this paper is that independent and associated spaces areused to represent the irrelevant and interconnected entities(relations) features respectively In independent space thefeatures of entities (relations) are independent and irrelevantBy contrast the features of entities (relations) in associatedspace are interconnected and interacting and the entitiesand relations can be denoted by the entities and relationsconnected with them Plus the similarities of the sameentities (relations) with different spaces are high In Figure 1we can see that in independent space the features of 1198901 areonly constructed by themselves but in associated spacesthe entity 1198901 is denoted by other entities and relations whichcan be described as blue points (lines) We want the featuresof 1198901 in independent and associated spaces to be similarBesides vector embedding is used to represent knowledgegraphs

In associated space take as an example the entity whichSteve Jobs has multiple triplets such as (Steve Jobs AppleInc FoundOf ) (Steve Jobs America Nationality) and (SteveJobs Laurene Powell CoupleOf ) If we combine all corrupttriplets with the same missing entity such as ( AppleInc FoundOf ) ( America Nationality) and ( LaurenePowell Couple) it is easy to locate that the missing entity 1198903 isSteve Jobs Similarly if we combine all the corrupt triplets withthe same relation such as (Steve Jobs Apple Inc ) (JackMa Alibaba ) and (Sundar Pichai Google ) we canobtain that the missing relation 1199031 is FoundOfThe scenario isshown in Figure 2 Hence using correlation between different

entities to represent features is an effective method Howeverin practice it is unsuitable to only use the correction betweendifferent entities and omit the inherent features entities havesuch as the attributes of each entity which are hard torepresent with the correlations between different entitiesTherefore we construct the independent space which canpreserve the inherent features each entity has We combineboth independent and associated spaces to represent overallfeatures of entities and relations which can in turn representthe knowledge graph more comprehensively The motivationof employing both types of spaces is to model correlationwhile reserving individual specificity

Compared with other embedding models vector embed-ding has evident advantages on time and memory-spacecomplexities We evaluate SimE-E and SimE-ER on thepopular tasks of entity prediction and relation predictionTheexperiment results validate the competitive results achievedby the proposed method compared with previous models

Contributions To summarize the main contributions ofthis paper are as follows

(i) We propose a similarity-based embedding modelnamely SimE-ER In SimE-ER we consider theentity and relation similarities of different spacessimultaneously which can extract the features ofentities and relations comprehensively

(ii) Compared with other embedding models our modelhas lower time and space complexity which improves

Scientific Programming 3

the effectiveness of processing large-scale knowledgegraphs

(iii) Through thorough experiments on real-life datasetsour approach is demonstrated to outperform theexisting state-of-the-art models in entity predictionand relation prediction tasks

Organization We discuss related work in Section 2and then introduce our method along with the theoreticalanalysis in Section 3 Afterwards experimental studies arepresented in Section 4 followed by conclusion in Section 5

2 Related Work

In this section we introduce several related works [19] pub-lished in recent years which get the state-of-the-art resultsAccording to the relation features we divide embeddingmodels into two parts matrix-based embedding models [27]and vector-based embedding models [28]

21 Matrix-Based Embedding Models In this part matrices(tensors) are used to describe relation features

Structured Embedding Structured Embedding Model(SE) [29] considers that head and tail entities are overlappingin a specific-relation spaceR119899 where the triplet (ℎ 119903 119905) existsIt uses two mapping matricesM119903ℎ andM119903119905 to extract featurefrom ℎ and 119905

Single Layer Model Compared with SE Single LayerModel (SLM) [30] uses a nonlinear activation function totranslate the extracted features and considers the featuresafter activation to be orthogonal with relation features Theextracted features are comprised of the entitiesrsquo features aftermapping and a bias of their relation

Neural Tensor Network Neural Tensor Network (NTN)[30 31] is amore complexmodel and considers that the tensorcan be regarded as better feature extractor compared withmatrices

Semantic Matching Energy The basic idea of SemanticMatching Energy (SME) [32] is that if the triplet is correctthe feature of head entity and tail entity is orthogonal Similarto SLM the features of head (tail) entity are comprised of theentitiesrsquo features after mapping and a bias of their relationThere are two methods to extract features ie linear andnonlinear

Latent FactorModelLatent FactorModel (LFM) [33 34]assumes that features of head entity are orthogonal with thoseof tail entity when the head entity is mapped in specific-relation space Its score function can be defined as 119891119903(ℎ 119905) =hTMrt where h Mr t denote the features of head entityrelation and tail entity respectively

22 Vector-Based Embedding Models In this part relationsare described as vector rather than matrix to improve theeffectiveness of representation models

Translation-Based ModelThe basic idea of translation-based model TransE [23 35 36] is that the relation r isa translation vector between h and t The score functionis 119891119903(h t) = h + r minus t119871

1119871

2

where h r and t denote

the head entity relation and tail entity embeddings respec-tively Because TransE only processes simple relations othertranslation-based models [37ndash39] are proposed to improveTransE

Combination Embedding Model CombinE [40] de-scribes the relation features with the plus and minus combi-nation of each pair Compared with other translation-basedmodels CombinE can represent relation features in a morecomprehensive way

Bilinear-Diag Model DistMult [41] uses a formulationof bilinear model to represent entities and relations andutilizes the learned embedding to extract logical rules

Holographic Embedding Model HOLE [42] utilizes acompositional vector space based on the circular correlationof vectors which creates fixed-width representations Thecompositional representation has the same dimensionality asthe representation of its constituents

Complex Embedding Model ComplEx [43] dividesentities and relations into two parts ie real part andimaginary part Real part denotes the features of symmetricrelation and imaginary part denotes the features of asymmet-ric relations

Project EmbeddingModel ProjE [44] a shared variableneural network model uses two-diagonal matrix to extractthe entity and relation features and calculate the similaritybetween features and candidate entity In training the correcttriplets have high similarity

Convolutional Embedding Model ConvE [45] trans-fers the features into 2D space and uses convolutional neuralnetwork to extract the entity and relation features

Comparedwithmatrix-based embeddingmodels vector-based models have obviously advantages on time andmemo-ry-space complexities In these vector-basedmodels TransEis a classical baseline and has been applied on many applica-tions TransR is an improvedmethod of TransEwhich solvesthe complex relation types and DistMult and ComplEx

use probability-based method to represent knowledge andachieve state-of-the-art results

3 Similarity-Based Model

Given a training set 119878+ of triplets each triplet (ℎ 119903 119905) has twoentities ℎ 119905 isin E (the set of entities) and relationship 119903 isin R (theset of relationship)Ourmodel learns the entities embeddings(hi ti ha ta) and relationship embeddings (ri ra) to representthe feature of entities and relations where the subscripts 119894119886 denote the independent and associated space The entityembedding and relation embedding take value in R119889 where119889 is the dimension of entity and relation embedding spaces

31 Our Models The basic idea of our model is that for eachentity (relation) the features are divided into two parts Thefirst part describes inherent features of entities (relations) inindependent space The feature embedding vectors can bedenoted as hi ri ti The second part signs triplet features inassociated space and the feature embedding vectors can bedenoted as ha ra ta In independent space the feature vectorsare described as the inherent features entities (relations)have In associated space the features of ha are comprised


of other entities and relations which connect with entityha

The entities (relations) in associated space are projec-tions of entities (relations) in independent space Hence therepresentation features of the same entity in independentand associated space are similar while the representationfeatures of different entities are not similar The formula canbe described as follows

hi asymp ra ⊙ ta (1)

ri asymp ha ⊙ ta (2)

ti asymp ha ⊙ ra (3)

where ⊙ denotes element-wise product In detail in (1) if wecombine the features of ra and ta we can obtain part of the hifeaturesThat is to say the hi features are similar with ra⊙ta Inthis paper we use Cosine to calculate the similarity betweendifferent spaces Taking head entity as an example the Cosinesimilarity between different spaces can be denoted as

cos (hi ra ⊙ ta) =119863119900119905 (hi ra ⊙ ta)1003817100381710038171003817hi1003817100381710038171003817 1003817100381710038171003817ra ⊙ ta

1003817100381710038171003817= 119878119906119898 (hi ⊙ ra ⊙ ta)1003817100381710038171003817hi1003817100381710038171003817 1003817100381710038171003817ra ⊙ ta

1003817100381710038171003817(4)

where Dot denotes the dot-product and Sum denotes thesummation over the vector element 119878119906119898(hi⊙ra⊙ta) calculatesthe similarity and hi ra ⊙ ta constrain the length offeatures To reduce the training complexity we just considerthe numerator and use regularization items to replace thedenominator Hence the similarity of head entity features inindependent and graph spaces can be described as

119878119894119898 (ℎ) = 119878119906119898 (hi ⊙ ra ⊙ ta) (5)

We expect that the value of hi ⊙ ra ⊙ ta is larger when hi andra⊙ta denote the same head entity while the value of hi⊙ra⊙tais smaller otherwise

To represent entities in a more comprehensive way weconsider the similarity of head and tail entities simultane-ously The score function can be denoted as

119878119894119898 (ℎ 119905) = 119878119894119898 (ℎ) + 119878119894119898 (119905)= 119878119906119898 (hi ⊙ ra ⊙ ta) + 119878119906119898 (ha ⊙ ra ⊙ ti)

(6)

The embeddingmodel based on the similarity of head and tailentities is named as SimE-E

On the basis of entity similarity we need to considerrelation similarity which can enhance the representation ofrelation featuresThe comprehensive model which considersall the similarities of entity (relation) features in differentspaces can be described as

119878119894119898 (ℎ 119903 119905) = 119878119894119898 (ℎ) + 119878119894119898 (119903) + 119878119894119898 (119905)= 119878119906119898 (hi ⊙ ra ⊙ ta) + 119878119906119898 (ha ⊙ ri ⊙ ta)+ 119878119906119898 (ha ⊙ ra ⊙ ti)

(7)

The embedding model based on the similarity of entityand relation is named as SimE-ER

32 Training To learn the proposed embedding and encour-age the discrimination between golden triplets and incorrecttriplets we minimize the following logistic ranking lossfunction over the training set

119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ))) (8)

whereΘ corresponds to the embeddings hi ha ri ra ti ta isin R119889

and 119884ℎ119903119905 is a label of triplet 119884ℎ119903119905 = 1 denotes that (ℎ 119903 119905) ispositive and 119884ℎ119903119905 = minus1 denotes that (ℎ 119903 119905) is negative 119878 isa triplets set [28] which contains both positive triplets set 119878+and negative triplets set 119878minus

119878minus = (ℎ1015840 119903 119905) | ℎ1015840 isin 119864 cup (ℎ 119903 1199051015840) | 1199051015840 isin 119864

cup (ℎ 1199031015840 119905) | 1199031015840 isin 119877 (9)

The set of negative triplets constructed according to (9)is composed of training triplets with either head (tail) entityor relation replaced by a random entity or relation Only oneentity or relation is replaced for each corrupted triplet withthe same probability To prevent overfitting some constraintsare considered when minimizing the loss function 119871

forallhi ti isin |119864| ri isin |119877|

1003817100381710038171003817hi1003817100381710038171003817 = 11003817100381710038171003817ri1003817100381710038171003817 = 11003817100381710038171003817ti1003817100381710038171003817 = 1

forallha ta isin |119864| ra isin |119877|

1003817100381710038171003817ha ⊙ ra1003817100381710038171003817 = 1

1003817100381710038171003817ha ⊙ ta1003817100381710038171003817 = 1

1003817100381710038171003817ra ⊙ ta1003817100381710038171003817 = 1

(10)

Equation (10) is to constrain the length of entity (relation)features for SimE-E and SimE-ER We convert it to thefollowing loss function by means of soft constraints

119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ)))

+ 120582 Θ22 (11)

where 120582 is a hyperparameter to weigh the importance ofsoft constraints We utilize the improved stochastic gradientdescent (Adagrad) [46] to train the models Comparingwith SGD Adagrad shrinks learning rate effectively whenthe number of iterations increases which means that it isinsensitive to learning rate


Table 1 Complexities of representation models

Model Relation Parameters Memory-Space Complexity Time ComplexityNTN 119882119903 isin R119889times119889times11989611987211990311198721199032 isin R119889times119889 119874(119899119890119889 + 1198991199031198892119896 + 21198991199031198892 + 119899119903119889) 119874(1198892119896)RESCAL 119872119903 isin R119889times119889 119874(119899119890119889 + 1198991199031198892) 119874(1198892)SE 119872119903ℎ119872119903119905 isin R119889times119889 119874(119899119890119889 + 21198991199031198892) 119874(1198892)SLM 11987211990311198721199032 isin R119889times119889 119874(119899119890119889 + 21198991199031198892 + 2119899119903119889) 119874(1198892)LFM 119872119903 isin R119889times119889 119874(119899119890119889 + 1198991199031198892) 119874(1198892)TransR 119903 isin R119889119872119903 isin R119889times119889 119874(119899119890119889 + 1198991199031198892 + 119899119903119889) 119874(1198892)DistMult 119903 isin R119889 119874(119899119890119889 + 119899119903119889) 119874(119889)ComplEx 119903 isin R119889 119874(2119899119890119889 + 2119899119903119889) 119874(119889)TransE 119903 isin R119889 119874(119899119890119889 + 119899119903119889) 119874(119889)SimE-E 119903 isin R119889 119874(2119899119890119889 + 119899119903119889) 119874(119889)SimE-ER 119903 isin R119889 119874(2119899119890119889 + 2119899119903119889) 119874(119889)

Table 2 Dataset statistics

Dataset Entity Relation Train Valid TestWN18 40934 18 141442 5000 5000FB15K 14951 1345 483142 50000 59071FB40K 37591 1317 325350 5000 5000

33 Comparison with Existing Models To compare the timeand memory-space complexities between different modelswe show the results in Table 1 where 119889 represents thedimension of entity and relation embeddings 119896 is the numberof tensorrsquos slices and 119899119890 and 119899119903 are the numbers of entities andrelations respectively

The comparison results are showed as follows

(i) Except for DistMult and TransE the baselines userelation matrix to project entitiesrsquo features into rela-tion space which makes these models have highmemory-space and time complexities Comparedwith thesemodels SimE-E and SimE-ER have lowertime complexity SimE-E and SimE-ER can be usedon large-scale knowledge graphs more effectively

(ii) In comparison to TransE SimE-E and SimE-ERcan dynamically control the ratio of positive andnegative triplets It enhances the robustness of repre-sentation models

(iii) Compared with SimE-E and SimE-ER DistMult isa special case of them when we only consider singlesimilarity of entity or relation That is to say SimE-E and SimE-ER can extract the features of entities(relations) more comprehensively

4 Experiments and Analysis

In this section our models SimE-E and SimE-ER are evalu-ated and compared with several baselines which have beenshown to achieve state-of-the-art performance Firstly twoclassical tasks are adopted to evaluate our models entityprediction and relation prediction Then we use cases toverify the effectiveness of our models Finally according tothe practical experimental results we analyze the time andmemory-space costs

41 Datasets We use two real-life knowledge graphs toevaluate our method

(i) WordNet (httpswordnetprincetonedudownload)a classical dictionary is designed to describe corre-lation and semantic information between differentwords Entities are used to describe the conceptsof different words and relationships are defined todescribe the semantic relevance between differententities such as instance hypernym similar to andmember of domain topic The data version we useis the same as [23] where triplets are denoted as(sway 2 has instance brachiate 1) or (felis 1 mem-ber meronym catamount 1) A subset of WordNet isadopted named as WN18 [23]

(ii) Freebase (codegooglecompwiki-links) a huge andcontinually growing knowledge graph describes largeamount of facts in the world In Freebase entities aredescribed by labels and relations are denoted bya hierarchical structure such as ldquo119905V119905V119892119890119899119903119890119901119903119900119892119903119886119898119904rdquo and ldquo119898119890119889119894119888119894119899119890119889119903119906119892 119888119897119886119904119904119889119903119906119892119904rdquo Weemploy two subsets of Freebase named as FB15Kand FB40K [23]

We show the statistics information of datasets in Table 2From Table 2 we see that compared with WN18 FB15K andFB40K have more relationships and can be regarded as thetypical large-scale knowledge graphs

42 Experiment Setup

EvaluationProtocol For each triplet in the test set each itemof triplets (head entity or tail entity or relation) is removedand replaced by items in the dictionary in turn respectivelyUsing score function to calculate these corrupted tripletsand sorting the scores by ascending order the rank of the


correct entities or relations is stored For relation in eachtest triplet the whole procedure is repeated In fact we needto consider that some correct triplets are generated in theprocess of removing and replacement Hence we filter out thecorrect triplets from corrupted triplets which actually exist intraining and validation sets The evaluation measure beforefiltering is named as ldquoRawrdquo and the measure after filteringis named as ldquoFilterrdquo We used two evaluation measures toevaluate our approach which is similar to [42]

(i) MRR is an improved measure of MeanRank [23]which calculates the average rank of all the entities(relations) and calculates the average reciprocal rankof all the entities (relations) Compared with Mean-Rank MRR is less sensitive to outliers We report theresults using both Filter and Raw rules

(ii) Hits119899 reports the ratio of correct entities in Top-n ranked entities Because the number of entities ismuch larger than that of relations we take Hits1Hits3 Hits10 for entity prediction task and takeHits1 Hits2 Hits3 for relation prediction task

A state-of-the-art embedding model should have higherMRR and Hits119899

Baselines Firstly we compare the proposed methodswith CP which uses canonical polyadic decomposition toextract the entities and relation features then we com-pare the proposed methods with TransE which considersthat tail entity features are close to the combined fea-tures of head entity and relation Besides TransR [47]ER-MLP [48] DistMult [41] and ComplEx [43] are alsoused for comparison with our methods We train CP [49]DistMult ComplEx TransE and TransR using the codesprovided by authors We choose the length of dimension119889 among 20 50 100 200 the weight of regularization 120582among 0 0003 001 01 05 1 the learning rate among0001 001 01 02 05 and the ratio of negative and correctsamples 120574 among 1 5 10 50 100 The negative samples indifferent epochs are different

Implementation For experiments using SimE-E andSimE-ER we select the dimension of the entity and the rela-tion 119889 among 50 100 150 200 the weight of regularization120582 among 0 001 01 05 1 the ratio of negative and correctsamples 120574 among 1 5 10 50 100 and the mini-batch size119861 among 100 200 500 1000 We utilized the improvedstochastic gradient descent (Adagrad) [46] to train the lossfunction With the iteration epoch increasing the learningrate in Adagrad is decreases and Adagrad is insensitive tolearning rate The initial values of both SimE-E and SimE-ER are generated by Random function and the range is(minus6radic119889 6radic119889) where 119889 is the dimension of feature vectorTraining is stopped using early stopping on the validation setMRR (using the Filter measure) computed every 50 epochswith a maximum of 2000 epochs

In SimE-E model the optimal configurations on valida-tion set are

(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K

In SimE-ER model the optimal configurations on vali-dation set are

(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K

T-test In experiments for each model we run 15 times inde-pendently and calculate the mean and standard deviationThen we use Studentrsquos t-test with 119901minusV119886119897119906119890 = 095 to comparethe performance between different models and the t-test canbe shown as follows [50 51]

1205831 and 1199041 are mean and standard deviation on model1 with run 1198991 times 1205832 and 1199042 are mean and standarddeviation on model 2 with 1198992 times Then we can constructthe hypothesis

1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)

And the t-test can be described as

119905 = 1205831 minus 1205832radic11198991 + 11198992radic(119899111990412 + 119899211990422) (1198991 + 1198992 minus 2)

(13)

The degree of freedom (119889119891) in t-distribution can be shown asfollows

119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)

In entity and relation prediction tasks we calculate meanand standard deviation ofMRR andHit119899 and compare theirperformance with t-test

43 Link Prediction For link prediction [52ndash54] we testedtwo subtasksmdashentity prediction and relation predictionEntity prediction aims to predict the missing ℎ or 119905 entityfrom the fact triplet (ℎ 119903 119905) similarly relation prediction isto determine which relation is more suitable for a corruptedtriplet (ℎ lowast 119905)

Entity Prediction This set of experiments tests themodelsrsquo ability to predict entities Experimental results ofmean and plusminus standard deviation on bothWN18 andFB15K are shown in Tables 3 4 and 5 and we can observe thefollowing

(i) On WN18 a small-scale knowledge graph ComplEx

achieves state-of-the-art results on MRR and Hits119899However on FB15K and FB40K two large-scaleknowledge graphs SimE-E and SimE-ER achieveexcellent results on MRR and Hits119899 and the valuesofHits10 are up to 0868 and 0889 respectivelyTheoutstanding results prove that our models can repre-sent different kinds of knowledge graphs effectivelyespecially on large-scale knowledge graphs


Table 3 Experimental results of entity prediction on WN18

ModelWN18

MRR Hits119899Filter Raw 1 3 10

CP 0065 plusmn0002 0051 plusmn0001 0043 plusmn0002 0069 plusmn0001 0107 plusmn0002DistMult 0821 plusmn0003 0530 plusmn0002 0728 plusmn0002 0914 plusmn0002 0930 plusmn0001ER-MLP 0712 plusmn0002 0508 plusmn0003 0626 plusmn0002 0775 plusmn0002 0863 plusmn0003TransE 0445 plusmn0002 0318 plusmn0002 0081 plusmn0002 0801 plusmn0001 0937 plusmn0003TransR 0415 plusmn0002 0414 plusmn0003 0378 plusmn0002 0635 plusmn0003 0724 plusmn0001ComplEx 0936 plusmn0003 0575 plusmn0002 0933plusmn0001 0939 plusmn0001 0940plusmn0001SimE-E 0823 plusmn0003 0572 plusmn0002 0726 plusmn0001 0917 plusmn0002 0938plusmn0001SimE-ER 0821plusmn0002 0576 plusmn0002 0726 plusmn0002 0914 plusmn0002 0940 plusmn0002

Table 4 Experimental results of entity prediction on FB15K

ModelFB15K


CP 0333 plusmn0003 0153 plusmn0002 0229 plusmn0004 0381 plusmn0003 0531plusmn0004DistMult 0650 plusmn0003 0242 plusmn0003 0537 plusmn0004 0738 plusmn0003 0828 plusmn0003ER-MLP 0288 plusmn0002 0155 plusmn0002 0173 plusmn0005 0317 plusmn0005 0501plusmn0001TransE 0481 plusmn0004 0220 plusmn0002 0259 plusmn0005 0651 plusmn0002 0813plusmn0002TransR 0376 plusmn0003 0201 plusmn0004 0245 plusmn0002 0435 plusmn0002 0634 plusmn0003ComplEx 0691 plusmn0003 0241 plusmn0002 0596 plusmn0003 0752 plusmn0002 0838 plusmn0003SimE-E 0740 plusmn0002 0259plusmn0002 0666plusmn0002 0795 plusmn0003 0860 plusmn0003SimE-ER 0727 plusmn0003 0261 plusmn0002 0636 plusmn0003 0797plusmn0002 0868plusmn0003


ModelFB40K


CP 0448 plusmn0002 0274 plusmn0002 0392 plusmn0003 0479 plusmn0002 0549plusmn0002DistMult 0573 plusmn0003 0407 plusmn0003 0493 plusmn0002 0613 plusmn0002 0720plusmn0003ER-MLP 0296 plusmn0001 0167 plusmn0004 0181 plusmn0001 0332 plusmn0003 0498plusmn0003TransE 0574 plusmn0003 0383 plusmn0001 0422 plusmn0002 0687 plusmn0003 0808plusmn0001TransR 0355 plusmn0001 0198 plusmn0001 0224 plusmn0002 0441 plusmn0002 0612plusmn0001ComplEx 0680 plusmn0001 0408 plusmn0002 0586 plusmn0002 0753 plusmn0002 0837plusmn0002SimE-E 0816 plusmn0001 0439 plusmn0002 0781 plusmn0002 0848 plusmn0002 0874plusmn0002SimE-ER 0810 plusmn0001 0445 plusmn0002 0756 plusmn0002 0852 plusmn0002 0889plusmn0002

(ii) ComplEx is better than SimE-ER on WN18 andthe reason is that ComplEx can distinguish sym-metric and antisymmetric relationship containedin the relation structure of WN18 However onFB15K and FB40K SimE-E and SimE-ER are betterthan ComplEx The reason is that the number ofrelations is much larger than WN18 and the relationstructure is more complex and hard to representwhich has obvious influence on the representationability of ComplEx

(iii) The results of SimE-E and SimE-ER are similar toeach other The largest margin is filtered MRR onFB15K at 0013 The phenomenon demonstrates that

both SimE-E and SimE-ER can extract the entityfeatures in knowledge graph and predict the missingentities effectively

(iv) Compared with DistMult the special case of ourmodels SimE-E and SimE-ER achieve better resultsespecially on FB15K and the filter MRR is up to0740The results can prove that ourmodelswhich useirrelevant and interconnected features to constructindependent and associated spaces can represent theentities and relations features more comprehensively

We use t-test to evaluate the effectiveness of our mod-els and the evaluation results can prove that on FB15K


Table 6 Experimental results of relation prediction on WN18

ModelWN18


CP 0551 plusmn0003 0550 plusmn0002 0405 plusmn0002 0540 plusmn0002 0629plusmn0001DistMult 0731 plusmn0003 0730 plusmn0002 0535 plusmn0002 0922 plusmn0002 0938 plusmn0002ER-MLP 0707 plusmn0002 0513 plusmn0002 0614 plusmn0001 0815 plusmn0003 0877 plusmn0002TransE 0739 plusmn0002 0739 plusmn0001 0622 plusmn0002 0729 plusmn0002 0811 plusmn0002TransR 0415 plusmn0003 0414 plusmn0003 0378 plusmn0003 0635 plusmn0002 0724 plusmn0001ComplEx 0866 plusmn0003 0865 plusmn0003 0830plusmn0001 0953 plusmn0002 0961plusmn0002SimE-E 0812plusmn0002 0812 plusmn0001 0770plusmn0002 0954plusmn0002 0962plusmn0001SimE-ER 0814 plusmn0002 0814 plusmn0001 0775 plusmn0001 0955 plusmn0002 0965plusmn0001

Table 7 Experimental results of relation prediction on FB15K

ModelFB15K


CP 0361 plusmn0002 0308 plusmn0001 0240 plusmn0002 0347 plusmn0002 0411 plusmn0002DistMult 0309 plusmn0003 0285 plusmn0003 0116 plusmn0002 0289 plusmn0002 0412 plusmn0004ER-MLP 0412 plusmn0003 0268 plusmn0002 0236 plusmn0003 0573 plusmn0003 0631 plusmn0003TransE 0245 plusmn0002 0281 plusmn0002 0275 plusmn0003 0339 plusmn0002 0381 plusmn0003TransR 0416 plusmn0002 0343 plusmn0002 0270 plusmn0001 0448 plusmn0002 0573 plusmn0002ComplEx 0566 plusmn0002 0490 plusmn0001 0371 plusmn0002 0646 plusmn0001 0701 plusmn0002SimE-E 0579plusmn0002 0523 plusmn0001 0321plusmn0002 0708plusmn0002 0823 plusmn0002SimE-ER 0593 plusmn0002 0534 plusmn0001 0331plusmn0002 0737plusmn0001 0842plusmn0002


ModelFB40K



and FB40K compared with other baselines our resultsachieve significant improvements eg on theHits10 resultsof ComplEx and SimE-ER 119905 = 2645 which is larger than119905095(28) = 1701 The t-test results can prove that on FB15Kand FB40K our experimental results achieve significantimprovement compared with other baselines

Relation Prediction This set of experiments tests themodelsrsquo ability to predict relations Tables 6 7 and 8 showthe prediction performance on WN18 and FB15K From thetables we discover the following

(i) Similar to the results in the entity prediction onWN18 ComplEx achieves better results on MRRand Hits1 and SimE-ER obtains better results

on Hits2 and Hits3 On FB15K besides thevalue of Hits1 the results of SimE-ER are betterthan ComplEx and other baselines and the valueof Hits3 is up to 0842 which is much higher(improvement of 201) than the state-of-the-artbaselines ON FB40K SimE-ER achieves state-of-the-art results on all the measures in particular thefilter MRR is up to 0603

(ii) In entity prediction task the results of SimE-E andSimE-ER are similar However in relation predictiontasks SimE-ER achieves significant results on RawMRR Hits2 and Hits3 We use the t-test toverify the results and the t-values are larger than


Table 9 MRR for each relation on WN18

Relation Name Tri SimE-ER SimE-E ComplEx DistMult

hypernym 1251 0937 0927 0933 0701hyponym 1153 0788 0520 0910 0732derivationally related form 1074 0964 0963 0946 0959member holonym 278 0715 0603 0914 0701member meronym 253 0682 0767 0767 055has part 172 0675 0602 0933 0667part of 165 0685 0819 0931 0690instance hypernym 122 0703 0856 0799 0726synset domain topic of 114 0792 0847 0813 0584member of domain topic 111 0695 0523 0714 0799instance hyponym 108 0661 0561 0945 0651also see 56 0769 0680 0603 0727verb group 39 0977 0977 0936 0973synset domain region of 37 0736 0819 1000 0694member of domain region 26 0468 0799 0788 0504member of domain usage 24 0463 0578 0780 0507synset domain usage of 14 0928 0761 1000 0750similar to 3 1000 1000 1000 1000

119905095(28) = 1701 The difference between entity andrelation tasks can demonstrate that considering bothentity and relation similarity can extract relationfeatures more effectively on the basis of ensuring theentity-features extraction

(iii) On FB15K the gap is significant and SimE-E andSimE-ER outperform other models with a MRR (Fil-ter) of 0593 and 0842 of Hits3 On both datasetsCP and TransE perform the worst which illustratesthe feasibility of learning knowledge embedding inthe first case and the power of using two mutualrestraint parts to represent entities and relations in thesecond

We also use t-test to evaluate our model ie comparingSimE-ER with ComplEx on filter MRR 119905 = 3572 which islarger than 119905095(28) = 1701 The t-test results can prove thatthe performance of SimE-ER is better than other baselineson FB15K and FB40K

To analyze the relation features Table 9 shows the MRRwith Filter of each relation on WN18 where 119879119903119894 denotesthe number of triplets for each relation in the test set FromTable 9 we conclude the following

(i) For almost all relations on WN18 compared withother baselines SimE-E and SimE-ER achieve com-petitive results which demonstrates that ourmethodscan extract different types of latent relation features

(ii) Compared with SimE-E the relationMRRs of SimE-ER are much better on most relations such as hyper-nym hyponym and derivationally related form

(iii) On almost all results of relation MRR SimE-ER isbetter than DistMult a special case of SimE-ERThatis to say compared with single embedding space

using two different spaces to describe entity andrelation features can achieve better performance

Case Study Table 10 shows the detailed prediction results ontest set of FB15K It illustrates the performance of ourmodelsGiven head and tail entities the top-5 predicted relations andrelative scores of SimE-ER are depicted in Table 10 From thetable we observe the following

(i) In triplet 1 the relation prediction result is rankedon top-2 and in triplet 2 the result is top-1 Therelation prediction results can prove the performanceof SimE-ER However in triplet 1 the correct result(top-2) has similar score with other prediction results(top-1 top-3)That is to say it is difficult for SimE-ERto distinguish similar relationships

(ii) For any relation prediction results the top-5 relationprediction results are similar that is to say similarrelations have similar representation embeddingswhich is in line with common sense

44 Complexity Analysis To compare the time andmemory-space complexity of different models we show the analyticalresults of FB15K inTable 11 where119889 represents the dimensionof entity and relation space ldquoMini-batchrdquo represents themini-batch of each iteration ldquoParamsrdquo denotes the numberof parameters in each model on FB15K and ldquoTimerdquo denotesthe running time of each iteration Note that all models arerun on standard hardware of Inter(R) Core(TM) i7U 35GHz+ GeForce GTX TITANWe report the average running timeover one hundred iterations as the running time of eachiteration From Table 11 we observe the following

(i) Except for DistMult SimE-E and SimE-ER havelower time and memory complexities compared with


Table 10 Case study of SimE-ER

Triplet 1 m02rgz97 musicgroup memberartists supported m012d9h Score

Results

1 musicgroup membermembershipmusicgroup membershipgroup 09972 musicgroup memberartists supported 09753 musicgroup memberinstruments played 0953

4 musicgroup membermembershipmusicgroup membershiprole 09135 musicgenresubgenre 0891

Triplet 2 m02hrh1q organizationrolegovernorsorganizationmember m03mnk Score

Results

1 organizationrolegovernorsorganizationmember 09942 organizationroleleadersorganizationleadershiporganization 09833 organizationorganization sectororganizations in this sector 0946

4 organizationorganization membermember oforganizationorganization 09115 peopleappointed roleappointmentpeopleappointmentappointed by 0767

Table 11 Complexities comparison

Model 119889 Mini-batch Params Time (s)RESCAL 100 200 1425M 12136NTN 100 100 7840M 34765TransR 100 2145 1438M 9596TransE 200 2145 311M 753DistMult 200 100 311M 323SimE-E 200 200 595M 537SimE-ER 200 200 622M 663

the baselines because in SimE-E and SimE-ER weonly use element-wise products between entitiesrsquoand relationsrsquo vectors to generate the representationembedding

(ii) On FB15K the time costs of SimE-E and SimE-ER ineach iteration are 537s and 663s respectively whichare lower than 753s the time cost of TransE whichhas fewer parameters The reason is that the mini-batch of TransE is 2415 which is much larger thanthe mini-batches of SimE-E and SimE-ER Besidesfor SimE-E and SimE-ER the number of iterationsis 700 times with 3760 (s) and 4642 (s) respectively

(iii) Because SimE-E and SimE-ER have low complexityand high accuracy they can easily be applied to large-scale knowledge graph while using less computingresources and running time

5 Conclusion

In this paper we propose a novel similarity-based embed-ding model SimE-ER that extracts features from knowledgegraph SimE-ER considers that the similarity of the sameentities (relations) is high in independent and associatedspaces Compared with other representation models SimE-ER is more effective in extracting the entity (relation) featuresand represents entity and relation features more flexiblyand comprehensively Besides SimE-ER has lower time andmemory complexities which indicates that it is applicable on

large-scale knowledge graphs In experiments our approachis evaluated on entity prediction and relation predictiontasks The results prove that SimE-ER achieves state-of-the-art performances We will explore the following future work

(i) In addition to the facts in knowledge graph there alsoare large amount of logic and hierarchical correlationsbetween different facts How to translate these hier-archical and logic information into low-dimensionalvector space is an attractive and valuable problem

(ii) In real world extracting relations and entities fromlarge-scale text information is an important yet openproblem Combining latent features of knowledgegraph and text sets is a feasible method to constructthe connection between structured and unstructureddata It is supposed to enhance the accuracy andefficiency of entity (relation) extraction

Data Availability

All the datasets used in this paper are fully available withoutrestriction upon request

Conflicts of Interest

The authors declare that there are no conflicts of interestregarding the publication of this paper


Acknowledgments

Thisworkwas partially supported byNSFCunderGrants nos71690233 and 71331008

References

[1] Y Wang N Wang and L Zhou ldquoKeyword query expansionparadigm based on recommendation and interpretation inrelational databasesrdquo Scientific Programming vol 2017 12 pages2017

[2] A Bordes J Weston and N Usunier ldquoOpen question answer-ing with weakly supervised embedding modelsrdquo in MachineLearning and Knowledge Discovery in Databases pp 165ndash180Springer 2014

[3] A Bordes S Chopra and JWeston ldquoQuestion Answering withSubgraphEmbeddingsrdquo inProceedings of the 2014Conference onEmpiricalMethods inNatural Language Processing (EMNLP rsquo14)A meeting of SIGDAT a Special Interest Group of the ACL pp615ndash620 Doha Qatar October 2014

[4] B Han L Chen and X Tian ldquoKnowledge based collectionselection for distributed information retrievalrdquo InformationProcessing amp Management vol 54 no 1 pp 116ndash128 2018

[5] J Berant A Chou R Frostig and P Liang ldquoSemantic parsingon freebase from question-answer pairsrdquo in Proceedings of the2013 Conference on Empirical Methods in Natural LanguageProcessing EMNLP rsquo13 Ameeting of SIGDAT a Special InterestGroup of the ACL pp 1533ndash1544 Seattle Wash USA 2013

[6] SHakimov S AOto andEDogdu ldquoNamed entity recognitionand disambiguation using linked data and graph-based central-ity scoringrdquo in Proceedings of the 4th International Workshop onSemantic Web Information Management SWIMrsquo12 ScottsdaleAriz USA May 2012

[7] J Nikkila P Toronen S Kaski J Venna E Castren and GWong ldquoAnalysis and visualization of gene expression data usingself-organizingmapsrdquoNeural Networks vol 15 no 8-9 pp 953ndash966 2002

[8] L C Freeman ldquoCliques Galois lattices and the structure ofhuman social groupsrdquo Social Networks vol 18 no 3 pp 173ndash187 1996

[9] P P Ray ldquoA survey on visual programming languages in internetof thingsrdquo Scientific Programming vol 2017 6 pages 2017

[10] H Tian and P Liang ldquoPersonalized Service RecommendationBased on Trust Relationshiprdquo Scientific Programming vol 2017pp 1ndash8 2017

[11] G AMiller ldquoWordNet a lexical database for EnglishrdquoCommu-nications of the ACM vol 38 no 11 pp 39ndash41 1995

[12] K Bollacker C Evans P Paritosh T Sturge and J Taylor ldquoFree-base A collaboratively created graph database for structuringhuman knowledgerdquo in SIGMOD2008 pp 1247ndash1249 2008

[13] S Auer C Bizer G Kobilarov J Lehmann R Cyganiak andZ G Ives ldquoDbpedia A nucleus for a web of open datardquo inProceedings of the 6th International Semantic Web Conferencepp 722ndash735 2007

[14] F M Suchanek G Kasneci and G Weikum ldquoYago a core ofsemantic knowledgerdquo in Proceedings of the 16th InternationalWorldWideWeb Conference (WWW rsquo07) pp 697ndash706 AlbertaCanada May 2007

[15] A Carlson J Betteridge B Kisiel B Settles et al ldquoToward anarchitecture for never-ending language learningrdquo inProceedingsof the Twenty-Fourth AAAI Conference on Artificial IntelligenceAAAI rsquo10 Atlanta Ga USA 2010

[16] S A El-Sheikh M Hosny andM Raafat ldquoComment on lsquoroughmultisets and information multisystemsrsquordquo Advances in DecisionSciences vol 2017 3 pages 2017

[17] M Richardson and P Domingos ldquoMarkov logic networksrdquoMachine Learning vol 62 no 1-2 pp 107ndash136 2006

[18] C Kemp J B Tenenbaum T L Griffiths T Yamada and NUeda ldquoLearning systems of concepts with an infinite relationalmodelrdquo in AAAI2006 pp 381ndash388 2006

[19] Q Wang Z Mao B Wang and L Guo ldquoKnowledge graphembedding A survey of approaches and applicationsrdquo IEEETransactions on Knowledge and Data Engineering vol 29 no12 pp 2724ndash2743 2017

[20] Y Bengio R Ducharme P Vincent and C Jauvin ldquoA neuralprobabilistic language modelrdquo Journal of Machine LearningResearch vol 3 pp 1137ndash1155 2003

[21] M Nickel V Tresp and H-P Kriegel ldquoA three-way model forcollective learning on multi-relational datardquo in Proceedings ofthe 28th International Conference on Machine Learning ICMLrsquo11 pp 809ndash816 July 2011

[22] JWeston A Bordes O Yakhnenko and N Usunier ldquoConnect-ing language and knowledge bases with embedding models forrelation extractionrdquo in Proceedings of the 2013 Conference onEmpirical Methods in Natural Language Processing EMNLP rsquo13pp 1366ndash1371 October 2013

[23] A Bordes N Usunier A Garca-Duran J Weston and OYakhnenko ldquoTranslating embeddings for modeling multi-relational datardquo in NIPS2013 pp 2787ndash2795 2013

[24] L Wondie and S Kumar ldquoA joint representation of Renyirsquos andTsallirsquos entropy with application in coding theoryrdquo InternationalJournal of Mathematics and Mathematical Sciences vol 2017Article ID 2683293 5 pages 2017

[25] W Cui Y Xiao H Wang Y Song S-W Hwang and WWang ldquoKBQA Learning question answering over QA corporaand knowledge basesrdquo in Proceedings of the 43rd InternationalConference on Very Large Data Bases VLDB rsquo17 vol 10 pp 565ndash576 September 2017

[26] B Yang and T Mitchell ldquoLeveraging knowledge bases in lstmsfor improving machine readingrdquo in Proceedings of the 55thAnnualMeeting of the Association for Computational Linguisticspp 1436ndash1446 Vancouver Canada July 2017

[27] Q Liu H Jiang Z Ling S Wei and Y Hu ldquoProbabilisticreasoning via deep learning Neural associationmodelsrdquo CoRRabs160307704 2016

[28] S He K Liu G Ji and J Zhao ldquoLearning to represent knowl-edge graphs with gaussian embeddingrdquo in Proceedings of thethe 24th ACM International Conference on Information andKnowledge Management pp 623ndash632 Melbourne AustraliaOctober 2015

[29] A Bordes J Weston R Collobert and Y Bengio ldquoLearningstructured embeddings of knowledge basesrdquo in AAAI2011 pp301ndash306 2011

[30] R Socher D Chen C D Manning and A Y Ng ldquoReasoningwith neural tensor networks for knowledge base completionrdquo inNIPS2013 pp 926ndash934 2013

[31] G E Hinton S Osindero and Y Teh ldquoA fast learning algorithmfor deep belief netsrdquoNeural Computation vol 18 no 7 pp 1527ndash1554 2006

[32] A Bordes X Glorot J Weston and Y Bengio ldquoJoint learningof words and meaning representations for open-text semanticparsingrdquo Journal of Machine Learning Research vol 22 pp 127ndash135 2012


[33] R Jenatton N L Roux A Bordes and G Obozinski ldquoA latentfactor model for highly multi-relational datardquo in NIPS2012 pp3176ndash3184 2012

[34] I Sutskever R Salakhutdinov and J B Tenenbaum ldquoModellingrelational data using Bayesian clustered tensor factorizationrdquoin Proceedings of the 23rd Annual Conference on Neural Infor-mation Processing Systems (NIPS rsquo09) pp 1821ndash1828 BritishColumbia Canada December 2009

[35] R Xie Z Liu J Jia H Luan andM Sun ldquoRepresentation learn-ing of knowledge graphs with entity descriptionsrdquo in Proceed-ings of the 30th AAAI Conference on Artificial Intelligence AAAIrsquo16 pp 2659ndash2665 February 2016

[36] H Xiao M Huang and X Zhu ldquoTransg A Generative Modelfor Knowledge Graph Embeddingrdquo in Proceedings of the 54thAnnualMeeting of the Association for Computational Linguisticspp 2316ndash2325 Berlin Germany August 2016

[37] J Feng M Huang M Wang M Zhou Y Hao and X ZhuldquoKnowledge graph embedding by flexible translationrdquo inKR2016 pp 557ndash560 2016

[38] Y Jia Y Wang H Lin X Jin and X Cheng ldquoLocally adaptivetranslation for knowledge graph embeddingrdquo in AAAI2016 pp992ndash998 2016

[39] T Ebisu and R Ichise ldquoToruse Knowledge graph embeddingon a lie grouprdquo CoRR abs171105435 2017

[40] Z Tan X Zhao and W Wang ldquoRepresentation Learning ofLarge-Scale Knowledge Graphs via Entity Feature Combina-tionsrdquo in Proceedings of the 2017 ACM on Conference on Infor-mation and Knowledge Management pp 1777ndash1786 SingaporeNovember 2017

[41] B YangW Yih XHe J Gao and L Deng ldquoEmbedding entitiesand relations for learning and inference in knowledge basesrdquoCoRR abs14126575 2014

[42] M Nickel L Rosasco and T A Poggio ldquoHolographic embed-dings of knowledge graphsrdquo in AAAI2016 pp 1955ndash1961 2016

[43] T Trouillon J Welbl S Riedel E Ciaussier and G BouchardldquoComplex embeddings for simple link predictionrdquo in Proceed-ings of the 33rd International Conference on Machine LearningICML rsquo16 pp 3021ndash3032 June 2016

[44] B Shi and T Weninger ldquoProje embedding projection forknowledge graph completionrdquo in Proceedings of the Thirty-FirstAAAI Conference on Artificial Intelligence pp 1236ndash1242 SanFrancisco Calif USA 2017

[45] T Dettmers PMinervini P Stenetorp and S Riedel ldquoConvolu-tional 2d knowledge graph embeddingsrdquo CoRR abs1707014762017

[46] M D Zeiler ldquoADADELTA an adaptive learning rate methodrdquoCoRR abs12125701 2012

[47] Y Lin Z Liu M Sun Y Liu and X Zhu ldquoLearning entityand relation embeddings for knowledge graph completionrdquo inProceedings of the Twenty-Ninth AAAI Conference on ArtificialIntelligence pp 2181ndash2187 2015

[48] X Dong E Gabrilovich G Heitz et al ldquoKnowledge vaulta web-scale approach to probabilistic knowledge fusionrdquo inSIGKDD2014 pp 601ndash610 2014

[49] JWu ZWang YWu L Liu S Deng andHHuang ldquoA TensorCP decomposition method for clustering heterogeneous infor-mation networks via stochastic gradient descent algorithmsrdquoScientific Programming vol 2017 Article ID 2803091 13 pages2017

[50] R J Rossi A Webster H Brightman and H SchneiderldquoApplied statistics for business and economicsrdquo The AmericanStatistician vol 47 no 1 p 76 1993

[51] DAndersonD Sweeney TWilliams J Camm and J CochranStatistics for Business amp Economics Cengage Learning 2013

[52] D Liben-Nowell and J Kleinberg ldquoThe link prediction problemfor social networksrdquo in Proceedings of the 2003 ACM CIKMInternational Conference on Information and Knowledge Man-agement pp 556ndash559 New Orleans La USA November 2003

[53] M A Hasan and M J Zaki ldquoA survey of link prediction insocial networksrdquo in Social Network Data Analytics pp 243ndash275Springer New Yok NY USA 2011

[54] C Dai L Chen and B Li ldquoLink prediction based on samplingin complex networksrdquoApplied Intelligence vol 47 no 1 pp 1ndash122017

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems


Volume 2018


ReconfigurableComputing



Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018


Civil EngineeringAdvances in


Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications


Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia


Biomedical Imaging



Engineering Mathematics


RoboticsJournal of



Computational Intelligence and Neuroscience


Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018


Human-ComputerInteraction

Advances in


Scientic Programming

Submit your manuscripts atwwwhindawicom


Independent Space Associated Space

Similarity

Similarity

Similarity

SteveJobs America

LaurenePowell

AppleInc

e3

e2

e1

e3

e2

e1

Figure 1 Framework of our model

Entity Pairs

Jack Ma

Jack Ma

Alibaba

Alipay

Tim Cook

Apple Inc

Google

SundarPichai

America

Laurene Powell

Apple Inc

Cou

pleO

f

Nationality

FoundOfe3 r1

Figure 2 Motivation in associated space

of this paper is that independent and associated spaces areused to represent the irrelevant and interconnected entities(relations) features respectively In independent space thefeatures of entities (relations) are independent and irrelevantBy contrast the features of entities (relations) in associatedspace are interconnected and interacting and the entitiesand relations can be denoted by the entities and relationsconnected with them Plus the similarities of the sameentities (relations) with different spaces are high In Figure 1we can see that in independent space the features of 1198901 areonly constructed by themselves but in associated spacesthe entity 1198901 is denoted by other entities and relations whichcan be described as blue points (lines) We want the featuresof 1198901 in independent and associated spaces to be similarBesides vector embedding is used to represent knowledgegraphs

In associated space take as an example the entity whichSteve Jobs has multiple triplets such as (Steve Jobs AppleInc FoundOf ) (Steve Jobs America Nationality) and (SteveJobs Laurene Powell CoupleOf ) If we combine all corrupttriplets with the same missing entity such as ( AppleInc FoundOf ) ( America Nationality) and ( LaurenePowell Couple) it is easy to locate that the missing entity 1198903 isSteve Jobs Similarly if we combine all the corrupt triplets withthe same relation such as (Steve Jobs Apple Inc ) (JackMa Alibaba ) and (Sundar Pichai Google ) we canobtain that the missing relation 1199031 is FoundOfThe scenario isshown in Figure 2 Hence using correlation between different

entities to represent features is an effective method Howeverin practice it is unsuitable to only use the correction betweendifferent entities and omit the inherent features entities havesuch as the attributes of each entity which are hard torepresent with the correlations between different entitiesTherefore we construct the independent space which canpreserve the inherent features each entity has We combineboth independent and associated spaces to represent overallfeatures of entities and relations which can in turn representthe knowledge graph more comprehensively The motivationof employing both types of spaces is to model correlationwhile reserving individual specificity

Compared with other embedding models vector embed-ding has evident advantages on time and memory-spacecomplexities We evaluate SimE-E and SimE-ER on thepopular tasks of entity prediction and relation predictionTheexperiment results validate the competitive results achievedby the proposed method compared with previous models

Contributions To summarize the main contributions ofthis paper are as follows

(i) We propose a similarity-based embedding modelnamely SimE-ER In SimE-ER we consider theentity and relation similarities of different spacessimultaneously which can extract the features ofentities and relations comprehensively

(ii) Compared with other embedding models our modelhas lower time and space complexity which improves





2 Related Work










1119871

2






















1003817100381710038171003817= 119878119906119898 (hi ⊙ ra ⊙ ta)1003817100381710038171003817hi1003817100381710038171003817 1003817100381710038171003817ra ⊙ ta

1003817100381710038171003817(4)


119878119894119898 (ℎ) = 119878119906119898 (hi ⊙ ra ⊙ ta) (5)



119878119894119898 (ℎ 119905) = 119878119894119898 (ℎ) + 119878119894119898 (119905)= 119878119906119898 (hi ⊙ ra ⊙ ta) + 119878119906119898 (ha ⊙ ra ⊙ ti)

(6)




(7)



119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ))) (8)




cup (ℎ 1199031015840 119905) | 1199031015840 isin 119877 (9)



1003817100381710038171003817hi1003817100381710038171003817 = 11003817100381710038171003817ri1003817100381710038171003817 = 11003817100381710038171003817ti1003817100381710038171003817 = 1


1003817100381710038171003817ha ⊙ ra1003817100381710038171003817 = 1

1003817100381710038171003817ha ⊙ ta1003817100381710038171003817 = 1

1003817100381710038171003817ra ⊙ ta1003817100381710038171003817 = 1

(10)


119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ)))

+ 120582 Θ22 (11)


















42 Experiment Setup










(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K





1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)



(13)


119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)








ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in








2 Related Work










1119871

2






















1003817100381710038171003817= 119878119906119898 (hi ⊙ ra ⊙ ta)1003817100381710038171003817hi1003817100381710038171003817 1003817100381710038171003817ra ⊙ ta

1003817100381710038171003817(4)


119878119894119898 (ℎ) = 119878119906119898 (hi ⊙ ra ⊙ ta) (5)



119878119894119898 (ℎ 119905) = 119878119894119898 (ℎ) + 119878119894119898 (119905)= 119878119906119898 (hi ⊙ ra ⊙ ta) + 119878119906119898 (ha ⊙ ra ⊙ ti)

(6)




(7)



119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ))) (8)




cup (ℎ 1199031015840 119905) | 1199031015840 isin 119877 (9)



1003817100381710038171003817hi1003817100381710038171003817 = 11003817100381710038171003817ri1003817100381710038171003817 = 11003817100381710038171003817ti1003817100381710038171003817 = 1


1003817100381710038171003817ha ⊙ ra1003817100381710038171003817 = 1

1003817100381710038171003817ha ⊙ ta1003817100381710038171003817 = 1

1003817100381710038171003817ra ⊙ ta1003817100381710038171003817 = 1

(10)


119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ)))

+ 120582 Θ22 (11)


















42 Experiment Setup










(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K





1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)



(13)


119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)








ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in












1003817100381710038171003817= 119878119906119898 (hi ⊙ ra ⊙ ta)1003817100381710038171003817hi1003817100381710038171003817 1003817100381710038171003817ra ⊙ ta

1003817100381710038171003817(4)


119878119894119898 (ℎ) = 119878119906119898 (hi ⊙ ra ⊙ ta) (5)



119878119894119898 (ℎ 119905) = 119878119894119898 (ℎ) + 119878119894119898 (119905)= 119878119906119898 (hi ⊙ ra ⊙ ta) + 119878119906119898 (ha ⊙ ra ⊙ ti)

(6)




(7)



119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ))) (8)




cup (ℎ 1199031015840 119905) | 1199031015840 isin 119877 (9)



1003817100381710038171003817hi1003817100381710038171003817 = 11003817100381710038171003817ri1003817100381710038171003817 = 11003817100381710038171003817ti1003817100381710038171003817 = 1


1003817100381710038171003817ha ⊙ ra1003817100381710038171003817 = 1

1003817100381710038171003817ha ⊙ ta1003817100381710038171003817 = 1

1003817100381710038171003817ra ⊙ ta1003817100381710038171003817 = 1

(10)


119871 = sum(hrt)isin119878

log (1 + exp (minus119884ℎ119903119905119878119894119898 (ℎ 119903 119905 Θ)))

+ 120582 Θ22 (11)


















42 Experiment Setup










(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K





1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)



(13)


119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)








ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in




















42 Experiment Setup










(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K





1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)



(13)


119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)








ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in












(i) 120582 = 1 120574 = 10 119889 = 150 119861 = 100 on WN18(ii) 120582 = 1 120574 = 20 119889 = 200 119861 = 200 on FB15K

(iii) 120582 = 1 120574 = 20 119889 = 300 119861 = 100 on FB40K





1198670 1205831 minus 1205832 le 01198671 1205831 minus 1205832 gt 0

(12)



(13)


119889119891 =(119904121198991 + 119904221198992)

2

(1 (1198991 minus 1)) (119904121198991)2 + (1 (1198992 minus 1)) (119904221198992)2(14)








ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in






ModelWN18




ModelFB15K




ModelFB40K










ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in






ModelWN18




ModelFB15K




ModelFB40K




























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in
























Results




Results








5 Conclusion





Data Availability





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in





Acknowledgments


References





























































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in
































Advances in

FuzzySystems


Volume 2018













Journal of

Journal of



Hindawi


Advances in

Multimedia


Biomedical Imaging





RoboticsJournal of









Volume 2018



Advances in




knowledge graph representation via similarity-based …

Documents