practical reasoning with quali ed number restrictions: a...

1

Practical Reasoning with Qualified NumberRestrictions: A Hybrid Abox Calculus for theDescription Logic SHQ

Nasim Farsiniamarj a and Volker Haarslev a,∗

a Department of Computer Science and SoftwareEngineering, Concordia University, Montreal

This article presents a hybrid Abox tableau calculusfor SHQ which extends the basic description logicALC with role hierarchies, transitive roles, and quali-fied number restrictions. The prominent feature of ourhybrid calculus is that it reduces reasoning about qual-ified number restrictions to integer linear program-ming. The calculus decides SHQ Abox consistencyw.r.t. a Tbox containing general axioms. The presentedapproach ensures a more informed calculus which ade-quately handles the interaction between numerical andlogical restrictions in SHQ concept and individual de-scriptions. A prototype reasoner for deciding ALCHQconcept satisfiability has been implemented. An em-pirical evaluation of our hybrid reasoner and its in-tegrated optimization techniques for a set of synthe-sized benchmarks featuring qualified number restric-tions clearly demonstrates the effectiveness of our hy-brid calculus.

Keywords: description logics, qualified number restric-tions, integer linear programming

1. Introduction

Description Logics (DLs) [1] are formal lan-guages to represent the knowledge about concepts,individuals, and their relationships (called roles).In the domain of the semantic web, the Web On-tology Language (OWL) [35] is based on descrip-tion logics. The DL subset of OWL, called OWL-DL, is based on the DL SHOIN which is a sub-

*Corresponding author: Volker Haarslev, Department ofComputer Science and Software Engineering, Concordia

University, 1455 de Maisonneuve Blvd. W., Montreal, Que-bec H3G 1M8, Canada, Email: [email protected]

set of SHOIQ [21]. Developing techniques for op-timized reasoning with qualified number restric-tions has become an important goal because qual-ified number restrictions have been added to theforthcoming OWL 2 [31] which is based on the DLSROIQ [20]. In this investigation we focus our at-tention on the DL SHQ, which is a subset of OWL2 and complete enough to cover the characteristicsof qualified number restrictions.

Using SHQ one can express number restrictionson the role fillers1 of individuals. For instance,the expression ∀hasCredit .(SciencetEngineeringtBusiness) about engineering undergraduate stu-dents states that credits must be taken only fromscience, engineering, or business. Moreover, (≥140 hasCredit) states that every student must haveat least 140 credits, (≥ 120 hasCredit .(Science tEngineering)) states that at least 120 credits mustbe from science or engineering, (≤ 32 hasCredit .(Science t Business)) allows at most 32 creditsfrom science or business, and (≤ 91 hasCredit .Engineering) restricts the credits from engineeringto at most 91. A typical question for a DL reasonerwould be whether the conjunction of the conceptexpressions given above is satisfiable.

For example, deciding the satisfiability of sucha SHQ concept consists of finding a model for astudent with a set of at least 140 hasCredit-fillersthat are only instances of the concept (Science tEngineering t Business), at least 120 hasCredit-fillers are instances of (Science t Engineering), atmost 32 hasCredit-fillers are instances of (SciencetBusiness), and at most 91 hasCredit-fillers are in-stances of Engineering . Most DL tableau algo-rithms [16,3,22] test the satisfiability of such aconcept by first satisfying all at-least restrictions,

1Informally speaking, if an individual a is related to anindividual b via a role R, b is called a R role filler (or R-

filler) of a (see Section 2 for more details).

AI CommunicationsISSN 0921-7126, IOS Press. All rights reserved

2 Farsiniamarj and Haarslev / Practical Reasoning with Qualified Number Restrictions

e.g., by creating 260 hasCredit-fillers, of which 120are instances of (Science t Engineering). Even-tually, a nondeterministic choose-rule assigns toeach of these 260 individuals (Science tBusiness)or ¬(Science t Business), and Engineering or¬Engineering . In case an at-most restriction is vi-olated, e.g., a student has more than 91 hasCredit-fillers of Engineering , a nondeterministic merge-rule tries to reduce the number of these individu-als by merging a pair of individuals until the up-per bound specified in this at-most restriction issatisfied. Searching for a model in such an arith-metically uninformed or blind way is usually veryinefficient.

An obvious question might be raised whetherbigger numbers will be used in qualified numberrestrictions occurring in OWL 2 ontologies. First,this seems to be a chicken-and-egg problem be-cause in the presence of inefficient reasoning tech-niques for cases with bigger numbers in qualifiednumber restrictions ontology designers will mostlikely avoid the use of these constructs. So, an em-pirical analysis of existing ontologies might be mis-leading about the necessity of efficient reasoningtechniques for these cases. Second, we argue thatthe use of bigger numbers in qualified number re-strictions is very natural in many domains (e.g.,in the example above) and conceptual restrictionsfor role fillers are essential. This is also motivatedby examples from medical domains, for instancethe human anatomy distinguishes hundreds of dif-ferent kinds of bones as part of the human skele-ton [28]. Our third argument is about the diffi-culty of conjunctions of interdependent qualifiednumber restrictions, where the calculus proposedin this article is better informed and far superiordue to the use of linear integer programming be-cause such a conjunction is represented as a sys-tem of linear inequations capturing all numericalinterdependencies (see below for more details).

Our hybrid calculus (see also [11]) is based ona standard tableau algorithm for SH [3] modi-fied and extended to deal with qualified numberrestrictions and works with an inequation solverbased on integer linear programming. The algo-rithm encodes number restrictions into a set of in-equations using the so-called atomic decomposi-tion technique [30]. The set of inequations is pro-cessed by the inequation solver which finds, if pos-sible, a minimal non-negative integer solution (dis-tribution of role fillers constrained by number re-

strictions) satisfying the inequations. The algo-rithm ensures that such a distribution of role fillersalso satisfies the logical restrictions.

Since this hybrid algorithm collects all the in-formation about arithmetic expressions before cre-ating any role filler, it will not satisfy any at-least restriction by violating an at-most restrictionand there is no need for a mechanism of merg-ing role fillers. Moreover, it reasons about num-ber restrictions by means of an inequation solver,thus its performance is not affected by the val-ues of numbers occurring in number restrictions.Since the solution from the inequation solver sat-isfies all numerical restrictions imposed by at-leastand at-most restrictions, our calculus needs to cre-ate only one so-called proxy individual (inspiredby [13]) representing a set of role fillers. Consid-ering all these features the proposed hybrid algo-rithm is well suited to improve average case per-formance. Furthermore, in [7, Chapter 6] it hasbeen shown that a tableau procedure extended byglobal caching and an algebraic method similar tothe one presented in [30,15] and in this article isworst-case optimal for SHIQ. Although the cal-culus presented in our article is different from theone in [7] we conjecture that the result presentedin [7] can be transferred to our work (if extendedby global caching) because the use of integer linearprogramming was essential to proving worst-caseoptimality.

This calculus extends our work presented in [8]by (i) covering Abox consistency for SHQ withgeneral Tboxes, and (ii) introducing the use ofproxy individuals representing sets of role fillers.It complements the work in [9,10] (which extendsthe work in [8] to ALCOQ) by additionally deal-ing with role hierarchies and transitive roles andalso giving evidence that (i) the absence of nomi-nals can lead to a more efficient calculus becausethe atomic decomposition can be applied locallyon the fly, which significantly reduces the numberof partitions to be considered, and (ii) restrictedforms of nominals can be viewed as Abox individ-uals allowing a more efficient treatment. This arti-cle also discusses practical aspects of using the hy-brid method and presents evaluation results basedon an implemented prototype. It also significantlydiffers from the work presented in [29,30] whereno formal calculus was proposed and the coveredlogic did not support Abox consistency for SHQwith general Tboxes. Our early work presented in

Farsiniamarj and Haarslev / Practical Reasoning with Qualified Number Restrictions 3

[15] did not deal with Aboxes and was based ona recursive calculus without using proxy individu-als and did not provide any proof for soundness,completeness, or termination. This article also ex-tends the work on the signature calculus presentedin [13] because the signature calculus, although us-ing proxy individuals to represent sets of identicalindividuals, first satisfies all relevant at-least re-strictions and then tries to satisfy violated at-mostrestrictions by merging affected signatures.

The remainder of this article is structured as fol-lows. After briefly introducing the DL SHQ andsome preprocessing steps in the next section, wepresent the hybrid Abox calculus in Section 3, il-lustrate the rules with two examples, and concludethis section with formal proofs. Section 4 startsthe second major part of this article, practical rea-soning. We analyze the complexity of the standardand hybrid algorithm for dealing with number re-strictions and present optimization techniques forthe hybrid algorithm. In Section 5 the architec-ture of the implemented prototype reasoner is de-scribed. A detailed evaluation of the prototype ispresented in Section 6. We conclude this articlewith a discussion and outline future work.

2. Description Logic SHQ

In the following, three disjoint sets are defined;NC is the set of concepts names; NR = NRT ∪NRS

is the set of all role names which consists of transi-tive (NRT ) and non-transitive (NRS) roles; I is theset of all individuals, while IA ⊆ I is the set of indi-viduals occurring in an Abox A. We use a standardTarski-style semantics based on an interpretationI = (∆I , ·I), where ∆I is the non-empty domainand ·I the interpretation function. The interpre-tation I gives meaning to the atomic constructsof SHQ and is extended to complex constructsas shown in Table 1, where we assume that C,Dare arbitrary concept descriptions, R,S ∈ NR,a, b ∈ IA, n,m ∈ N, and R#(x,C) denotes thecardinality of x | (x, y) ∈ RI∧y ∈ CI. A conceptdescription C is said to be satisfiable by an inter-pretation I iff CI 6= ∅. We use > (⊥) as abbrevi-ations for A t ¬A (A u ¬A) for some A ∈ NC .

A role hierarchy R is a set of axioms of the formR v S where R,S ∈ NR. In the following, let v∗be the transitive, reflexive closure of v over NR.For R v∗ S, R is called a sub-role of S and S a

Table 1

Syntax and semantics of SHQ.

Syntax Semantics

Concepts

A AI ⊆ ∆I , A is a concept name

¬C ∆I\CIC uD CI ∩DIC tD CI ∪DI∀R.C

˘x | ∀y : (x, y) ∈ RI ⇒ y ∈ CI

¯≥nR.C

˘x |#RI(x,C) ≥ n

¯≤mR.C

˘x |#RI(x,C) ≤ m

¯Roles

R ∈ NR RI ⊆ ∆I ×∆I

R ∈ NRT RI = (RI)+

Axioms

R v S RI ⊆ SIC v D CI ⊆ DI

Assertions

a : C aI ∈ CI(a, b) : R (aI , bI) ∈ RIa 6 .= b aI 6= bI

super-role of R. A role is called simple if it is nei-ther transitive nor has any transitive sub-role. Inorder to remain decidable qualified number restric-tions are only allowed for simple roles with the ex-ception of ≥1R.C. However, recent investigationsin [26] show that this condition could be relaxedin the absence of inverse roles. Moreover, ≤0R.Cis obviously equivalent to ∀R.¬C.

Definition 1 (Tbox) A SHQ Tbox T w.r.t. a rolehierarchy R is a finite set of axioms of the formC v D or C ≡ D where C, D are conceptexpressions and C ≡ D is the placeholder forC v D, D v C. A Tbox T is satisfiable by aninterpretation I iff I satisfies all axioms in T andR.

Definition 2 (Abox) A SHQ AboxA w.r.t. a TboxT and a role hierarchyR is a finite set of assertionsof the form a : C, (a, b) : R, and a 6 .= b. An Abox Ais satisfiable by an interpretation I (or consistent)iff I satisfies T and R and all assertions in A.

Let a, b ∈ I be (possibly unnamed) individuals,given an assertion (a, b) :R, a is called a predeces-sor of b and b a successor or role filler of a. The R-fillers of a are defined as Fil(a,R) = b | (aI , bI) ∈RI.


Inspired by [30] we transform given qualifiednumber restrictions to unqualified number restric-tions with (≥nR)I = (≥nR.>)I , (≤nR)I = (≤nR.>)I , and add a new role-set difference opera-tor ∀(R\R′).C such that (∀(R\R′).C)I = x | ∀y :(x, y) ∈ (RI \R′I) ⇒ y ∈ CI. The resulting lan-guage is called SHN \. Let ¬C denote the standardnegation normal form (NNF)2 of ¬C. We define arecursive function unQ which rewrites SHQ asser-tions or concept descriptions into SHN \. It is im-portant to note that this rewriting process alwaysintroduces for each transformed qualified numberrestriction a unique new role.

Definition 3 (unQ) This function transforms theinput description into its NNF and replaces qual-ified number restrictions (if C 6= >). In the fol-lowing each R′ is a new role in NR with R :=R∪ R′ v R:unQ(C) := C if C ∈ NC

unQ(¬C) := ¬C if C ∈ NC , unQ(¬C) otherwiseunQ(∀R.C) := ∀R.unQ(C)unQ(C uD) := unQ(C) u unQ(D)unQ(C tD) := unQ(C) t unQ(D)unQ(≥ nR.C) := (≥ nR′) u ∀R′.unQ(C) (1)unQ(≤ nR.C) := (≤ nR′)u∀(R\R′).unQ(¬C) (2)unQ(a :C) := a :unQ(C)unQ((a, b) :R) := (a, b) :RunQ(a 6 .= b) := a 6 .= b

Remark According to [30] one can replace aqualified number restriction of the form (≥nR.C)by (∃R′ : (R′ v R) ∈ R∧ ≥ nR′ u ∀R′.C) and≤nR.C by ∃R′ such that (R′ vR)∈R, ≤nR′ u∀R′.Cu∀R\R′.(¬C) ((1) and (2)). Therefore, ¬(≥nR.C) (which is equal to ≤(n−1)R.C) is equiv-alent to (∀R′ v R : ≤(n−1)R′ t ∃R′.¬C) whichis not a formula expressible in SHQ. Hence, inorder to avoid negating converted forms of quali-fied number restrictions, unQ must be applied ini-tially to the negation normal form of the inputTbox/Abox.

Since (2) introduces a negation itself, the negateddescription needs to be converted to NNF beforefurther applying unQ. Our language is not closedunder negation w.r.t. the concept descriptions cre-ated by rule (1) or (2). However, our calculus en-sures that these concept descriptions will never benegated.

2The negation normal form of ¬(≥nR.C) (¬(≤nR.C))is defined as ≤(n−1)R.C (≥(n+1)R.C) respectively.

Since (2) is slightly different from what is pro-posed in [30], we prove this equivalence based onthe semantics of the interpretation function I.

Proposition 4 (≤nR.C) is equisatisfiable with theexpression (∀(R\R′).¬C u≤nR′) with ∃R′ : R′vR.

Proof. The hypothesis can be translated to:(≤ nR.C)I = a |#y| 〈a, y〉 ∈ RI ∧ y ∈ CI ≤n ⇔∃I ′ : a | ∃R′ : R′I

′ ⊆ RI′ ∧ #y | 〈a, y〉 ∈ R′I′ ≤

n ∧ ∀b : 〈a, b〉 ∈ RI′ ∧ 〈a, b〉 /∈ R′I

′ ⇒ b ∈∆I

′\CI′.(⇐): If a ∈ ∆I

′, 〈a, y〉 ∈ RI′

, y ∈ CI′, we can con-

clude that 〈a, y〉 ∈ R′I′

(because if 〈a, y〉 /∈ R′I′

then y ∈ ∆I′\CI′

). Since we have #y | 〈a, y〉 ∈R′I

′ ≤ n we can define I such that it satisfies(≤ nR.C)I .(⇒): We can simply define RI

′= R′I

′∪R′′I

′such

that for all 〈a, b〉 ∈ RI if b ∈ CI then 〈a, b〉 ∈ R′I′

and if b ∈ ∆I\CI then 〈a, b〉 ∈ R′′I′.

3. A Hybrid Abox Tableau Calculus for SHQ

Extending ALC with qualified number restric-tions provides the ability to express arithmetic re-strictions on role fillers. This expressiveness canincrease the practical complexity of the reasoningwhen employing arithmetically uninformed algo-rithms (e.g., see [16,3,22]). Therefore, a tableaucalculus which benefits from arithmetic methodscan improve the average case performance of rea-soning about qualified cardinality restrictions. Inthis section we propose a hybrid tableau algorithmwhich benefits from integer linear programmingto properly handle numerical features of the lan-guage. We present a tableau algorithm that ac-cepts a general Abox A w.r.t. a general Tbox Tand a role hierarchy R as input and either returns“inconsistent” if A is not consistent or otherwise“consistent” with a complete and clash-free com-pletion forest.

For reasons of simplicity we assume in this arti-cle that standard Tbox inference services w.r.t. aTbox T and a role hierarchy R, e.g., concept sat-isfiability and subsumption testing, are reduced toAbox satisfiability in the usual way. For instance, aconcept C is satisfiable iff the Abox a : C is sat-isfiable and a concept C is subsumed by a conceptD iff the Abox a : ¬D u C is unsatisfiable.


3.1. Preprocessing

As a preprocessing step, the input Abox andTbox are transformed into SHN \ as defined inSection 2. Similar to [24], in order to propagateTbox axioms to all individuals we define the con-cept CT :=

dCvD∈T unQ(¬CtD) and U as a new

transitive role in the role hierarchy R extendedby RvU |R ∈ NR. In order to examine the con-sistency of A, the algorithm first extends A byx0: (CT u ∀U.CT ) ∪ (x0, x) :U | x occurs in A,where x0 is new in A. By this transformation, weimpose the axioms in the Tbox on all individualsin A. Furthermore, we rewrite the role assertionsin A as follows.

Definition 5 If x, y ∈ IA then there exists a uniquerole name Rxy ∈ NR only used to represent thaty is an R-filler of x with Rxy v R ∈ R. In otherwords whenever (x, z) : Rxy we have yI = zI .

We replace role assertions of the form (b, c) :Rby b : (≥1Rbc u ≤1Rbc). The reason for translat-ing Abox role assertions into number restrictions isdue to the fact that they actually impose a numer-ical restriction on an individual. For example, anassertion (b, c) :R implies there exists one and onlyone Rbc-filler for b which is c. Since the hybrid al-gorithm needs to consider all number restrictionsbefore creating an arithmetic solution and gener-ating successors, it is necessary to consider theserestrictions as well.

3.2. Atomic Decomposition of Role Fillers

Atomic decomposition is a technique first pro-posed in [30] for reasoning about sets. Later itwas applied for concept languages such as in de-scription logics for reasoning about role fillers. Theidea behind the atomic decomposition is to con-sider all possible disjoint subsets of a role fillersuch that we have |A ∪ B| = |A| + |B| for twosubsets (partitions) A and B (| · | denotes set car-dinality). For example, assume we want to trans-late the following number restrictions occurring inan Abox A into arithmetic inequations: A = a :(≤3hasDaughter u≤4hasSon u≥5hasChild)(adapted from [30]).

Different partitions for this set of restrictions aredefined as:

c = children, not sons, not daughters.s = sons, not children, not daughters.d = daughters, not children, not sons.cs = children, sons, not daughters.cd = children, daughters, not sons.sd = sons, daughters, not children.csd = children, sons, daughters.

Since it is an atomic decomposition and subsetsare mutually disjoint, we can translate the threenumber restrictions for the individual a into thefollowing inequations:

|d|+ |sd|+ |cd|+ |csd| ≤ 3|s|+ |sd|+ |cs|+ |csd| ≤ 4|c|+ |cd|+ |cs|+ |csd| ≥ 5

Finding an integer solution for this system of in-equations will result in a first solution for the ini-tial set of number restrictions. As there may existmore than one solution for this system, there maybe a nondeterministic approach needed to handleit.

Due to the nature of SHN \, a quasi tree-modelproperty for concepts3 [36] ensures that differentindividuals in a model do not share common rolefillers and role fillers (as part of a concept model)cannot affect their predecessors due to the absenceof inverse roles. The only exception are individu-als from IA occurring in an Abox A. Thus, in thefollowing we introduce the necessary notation todefine the atomic decomposition of role fillers lo-cally4 for individuals, i.e., it is based for each in-dividual on the roles of its role fillers.

3.3. Completion Forest

In the following we introduce the notion of acompletion forest and some other notation to spec-ify the completion rules that form the major partof our tableau algorithm.

3The tree model states that a concept C has a model(an interpretation I with CI 6= ∅) iff C has a tree-shaped

model, i.e., one in which the interpretation of role filler rela-

tions defines a tree shaped directed graph. Transitive rolescompromise the tree model property to some extent because

they can cause “short-cuts” down the branches of a tree.

This relaxed version is called quasi tree-model property.4For instance, the DL ALCOQ, which supports so-called

nominals, requires a global decomposition (see [9,10] for

more details).


Definition 6 (Closure) The closure clos(E) of aconcept expression E is the smallest set of con-cepts such that: E ∈ clos(E), (¬D) ∈ clos(E) ⇒D ∈ clos(E), (C t D) ∈ clos(E) or (C u D) ∈clos(E)⇒ C∈clos(E) and D∈clos(E), (∀R.C)∈clos(E) ⇒ C ∈ clos(E), (≥ nR.C) ∈ clos(E) or(≤mR.C)∈clos(E)⇒ C∈clos(E).

For a Tbox T we define clos(T ) such that if(C v D) ∈ T or (C ≡ D) ∈ T then clos(C) ⊆clos(T ) and clos(D) ⊆ T . Similarly for an AboxA we define clos(A) such that if (a :C) ∈ A thenclos(C) ⊆ clos(A).

Definition 7 (Completion Forest) A completion for-est F = (V,E,L,LE) for a SHQ Abox A is com-posed of a set of arbitrarily connected nodes as theroots5 of completion trees (if one ignores the con-nections between root nodes). Every node x ∈ Vis labeled by L(x) ⊆ clos(A) and LE(x) as a setof inequations of the form (

∑i∈1..k vi) ./ n with

./∈ ≤, ≥, n ∈ N, and vi ∈ V where V is a setof variables (see Definition 10 below); each edge〈x, y〉 ∈ E is labeled by the set L(〈x, y〉) ⊆ NR,and x (y) is called a predecessor (successor) of y(x), and the transitively closed set of successors(predecessors) is called descendants (ancestors) re-spectively. We maintain the distinction betweennodes of a forest by the relation 6 .=. In the follow-ing we also refer to elements of L and LE as con-straints of a node. Notice that our version of com-pletion forests is slightly different from standardcompletion graphs (e.g., see [24]).

Definition 8 (Node-related Partitions) The setRx

of related roles for a node x is defined as Rx =S | ≤nS, ≥mS ∩ L(x) 6= ∅. We define a par-titioning RSx = (

⋃RS⊆Rx

RS) \ ∅ and forRSx ∈ RSx we define RSIx = (

⋂S∈RSx

FilI(x, S))\(⋃

S∈Rx\RSxFilI(x, S)) with FilI(x, S) = yI | y ∈

Fil(x, S). RSIx represents the interpretation offillers of x that are fillers for the roles in RS butnot fillers for the roles in Rx \ RS . Therefore, bydefinition the fillers of x associated with the par-titions in RSx are mutually disjoint w.r.t. the in-terpretation I.

Definition 9 (Partition Variables) We assume a setV of variables and by using a mapping α : V ↔RSx we associate a unique variable v ∈ V witheach partition RSx in RSx such that α(v) = RSx.

5In contrast to the standard notion of root nodes of atree we allow for root nodes incoming edges from other root

nodes.

Definition 10 (Inequations) Let VR be defined asthe set of all variables related to a role R such thatVR = v ∈ V |R ∈ α(v). A function ξ is used tomap number restrictions to inequations of the formξ(R, ./, n) := (

∑vi∈VR vi) ./ n where ./∈ ≤,≥

and n ∈ N.

Since all concept restrictions for node successorsthat are not root nodes are propagated throughuniversal restrictions for roles and a new role iscreated for each role filler occurring in a role asser-tion, we can conclude that all nodes in a certainnode partition share the same restrictions and canbe dealt with as a unit. We call this unit proxynode which is a representative of possibly morethan one node.

Definition 11 (Node Cardinality) The cardinalityassociated with proxy nodes is defined by the map-ping card : V → N.

Definition 12 (Blocked Node) A node x in a forestF is blocked by a node y iff x, y are not root nodesand y is a ancestor of x such that L(x) ⊆ L(y).

Definition 13 (Clash Triggers) A node x containsa clash iff there exists a concept name A ∈ NC suchthat A,¬A ⊆ L(x) (logical clash) or LE(x) doesnot have a non-negative integer solution (arith-metic clash).

Definition 14 (Arithmetic Solution) An arithmeticsolution is represented using the function σ :V → N which assigns a non-negative integervalue to each variable in V. Let Vx be the setof variables assigned to a node x, Vx = v ∈V | v occurs in LE(x). We define a set of solutionsΩ for x as Ω(x) := σ(v)=n | n ∈ N, v ∈ Vx.Notice that the goal (or objective) function of theinequation solver is to minimize the sum of thevariables occurring in the input inequations.

The algorithm starts with the forest FA withthe individuals mentioned in A as root nodes. Foreach a ∈ IA a root node xa will be created withL(xa) = C | (a : C) ∈A. Additionally, for everyroot node x we set card(x) = 1.

We illustrate these definitions with a simple ex-ample shown in Figure 1. Assume for a node xthat we have L(x) = ≥ 3R, ≤2T , ≥ 1S,≤ 1S.Therefore, we have Rx = R, T, S and RSx con-sists of 7 different partitions named p1, . . . , p7 (seeFigure 1).


p1 = R, p2 = T, p4 = S,p3 = R, T, p5 = R,S, p6 = S, T,p7 = R,S, T

p1 p4

p2

p3

p5

p6 p7

R S

T

RT

RS

ST

RST

Fig. 1. Atomic Decomposition Example.

Assuming a binary coding of the indices of vari-ables, where the first digit from the right repre-sents R, the second T , and the last S, we define thevariables such that each vi is associated with itscorresponding partition (see α(v) below and Fig-ure 1).

α(v001)= p1, α(v010)= p2, α(v100)= p4,α(v011)= p3, α(v101)= p5, α(v110)= p6,α(v111)= p7

Hence, the number restrictions in L(x) can betranslated to the following set of inequations inLE(x):

v001 + v011 + v101 + v111 ≥ 3v010 + v011 + v110 + v111 ≤ 2v100 + v101 + v110 + v111 ≤ 1v100 + v101 + v110 + v111 ≥ 1

3.4. Partitioning for Root Nodes

In arbitrary Aboxes, similar to the effect of in-verse roles, a root node in a corresponding com-pletion forest can be influenced by its successorsif they are also root nodes. When a new numberrestriction for a root node x is added to L(x), thealgorithm needs to refine the partitioning assignedto x. However, the current state of the forest is aresult of existing solutions based on the previouspartitioning. In fact, the newly added number re-striction has been added to L(x) after the applica-tion of a completion rule (in fact, by the fil -Rule

which is explained in the next section). Therefore,we can conclude that the newly added number re-striction is a consequence of the solution for theprevious set of inequations. Hence, if the algorithmdoes not maintain the existing solutions, in fact, itmay remove the cause of the current partitioningwhich would result in unsoundness.

We considered two approaches to handle cy-cles between Abox individuals (represented as rootnodes).

Global partitioning: One can treat an individualin an Abox similarly to a nominal. Because of theglobal effect of nominals, one has to consider aglobal partitioning for all roles occurring in theAbox (see [9,10] for more details). Hence, all possi-ble partitions and therefore variables will be com-puted before starting the application of the rules.Consequently, whenever a number restriction isadded to the label of a node, there is no needto recompute the partitioning to construct newinequations. Although global partitioning enablesthe algorithm to deal with cycles, according tothe possibly large number of partitions/variables,it might impose a high nondeterminism on thetableau algorithm. Moreover, the number of addedroles (and partitions) could become enormouslylarge for Aboxes containing many role assertionsbecause our algorithm introduces a new sub-rolefor each Abox assertion of the form (a, b) :R.

Incremental local partitioning: In contrast tonominals, which increase the expressiveness of thelanguage and can have a global effect on othernodes, individuals in Aboxes have a local effectand can be handled locally. Moreover, one can usu-ally assume that Aboxes contain a large numberof individuals but only a relatively small numberof nominals. Therefore, in the context of SHQ wedecided to adopt incremental partitioning and dealwith cycles between Abox individuals locally. Thismeans that whenever a previously unknown num-ber restriction for a new role is added to a rootnode, a new extended partitioning is computedand the corresponding arithmetic constraints areupdated. Please note that this is only necessaryfor root nodes.

3.5. Completion Rules

The completion rules are shown in Figure 2. Thecompletion rules are always applied with the fol-


reset-Rule if x is a root node, (≤ nR), (≥ nR) ∩ L(x) 6= ∅, and ∀v ∈ Vx : R /∈ α(v)then set LE(x) := ∅ and for every successor y of x set L(〈x, y〉) := ∅

merge-Rule if there exists root nodes zx, za, zb for x, a, b ∈ IA such that R′ v∗ Rxa, R′ ∈

L(〈zx, zb〉)then merge the nodes za, zb and their labels and replace every occurrence of za in thecompletion graph by zb

u-Rule if (C1 u C2) ∈ L(x) and C1, C2 6⊆ L(x)then set L(x) = L(x) ∪ C1, C2

t-Rule if (C1 t C2) ∈ L(x) and C1, C2 ∩ L(x) = ∅then set L(x) = L(x) ∪ X with X ∈ C1, C2

∀-Rule if ∀R.C ∈ L(x) and there exists a y and R′ with R′ ∈ L(〈x, y〉), C /∈ L(y), R′ v∗Rthen set L(y) = L(y) ∪ C

∀\-Rule if ∀R\S.C ∈ L(x) and there exists a y and R′ with R′ ∈ L(〈x, y〉), R′ v∗R, C /∈ L(y),and there exists no S′ such that S′ v∗S and S′ ∈ L(〈x, y〉)then set L(y) = L(y) ∪ C

∀+-Rule if ∀R.C ∈ L(x) and there exists a y and R′, S with R′∈L(〈x, y〉), R′ v∗ S, S v∗R,S ∈ NRT , and ∀S.C /∈ L(y)then set L(y) = L(y) ∪ ∀S.C

ch-Rule If there occurs v in LE(x) with v ≥ 1, v ≤ 0 ∩ LE(x) = ∅then set LE(x) = LE(x) ∪ X with X ∈ v ≥ 1, v ≤ 0

disjoint-Rule if there occurs v in LE(zx) with R′, S′ ⊆ α(v), R′ v∗Rxa, S′ v∗Rxb, x, a, b ∈ IA,a 6 .= b ∈ A, v ≤ 0 /∈ LE(zx)then set LE(zx) := LE(zx) ∪ v ≤ 0

equal-Rule if there occurs v in LE(zx) with R′ ∈ α(v), S′ /∈ α(v), R′ v∗ Rxa, S′ v∗ Sxa,x, a, b ∈ IA, v ≤ 0 /∈ LE(zx)then set LE(zx) := LE(zx) ∪ v ≤ 0

hierarchy-Rule if there occurs v in LE(x) with R ∈ α(v), S /∈ α(v), R v∗S, v ≤ 0 /∈ LE(x)then set LE(x) := LE(x) ∪ v ≤ 0

≥-Rule If (≥ nR) ∈ L(x) and ξ(R,≥, n) /∈ LE(x)then set LE(x) = LE(x) ∪ ξ(R,≥, n)

≤-Rule If (≤ nR) ∈ L(x) and ξ(R,≤, n) /∈ LE(x)then set LE(x) = LE(x) ∪ ξ(R,≤, n)

fil-Rule If there exists v occurring in LE(zx) such that σ(v) = n with n > 0:(a) if n = 1, zx, zb root nodes, and Rxb ∈ α(v) with x, b ∈ IA

then if L(〈zx, zb〉) = ∅ then set L(〈zx, zb〉) := α(v) end ifelse

(b) if zx is not blocked and ¬∃y : L(〈zx, y〉) = α(v)then create a new node y and set L(〈zx, y〉) := α(v) and card(y) = n

Fig. 2. Completion rules for SHQ Abox consistency (listed in decreasing priority).

lowing priorities (given in decreasing priority). Arule of lower priority can never be applied if an-other one with a higher priority is applicable.

1. reset-Rule and merge-Rule2. u-Rule, t-Rule, ∀-Rule, ∀\-Rule, ∀+-Rule,

ch-Rule, disjoint-Rule, equal -Rule, and hier-archy-Rule

3. ≤-Rule and ≥-Rule4. fil -Rule

There are two limitations on the application ofthe rules:

– priority of the rules, and– rules are only applicable to nodes that are not

blocked.

All rules which have the two highest prioritiesamong the completion rules, extend L(x) with newlogical expressions or further constrain LE(x). Af-ter the application of these rules the logical label


of the node x cannot be expanded anymore. Thisis a consequence of the characteristics of SHQ be-cause the labels of a non-root node can never beaffected by its successors in the graph.

Definition 15 (Complete/Clash-free Forest) A com-pletion forest F is called complete if no comple-tion rule is applicable to any node of F . A com-pletion forest F is called clash-free if none of theclash triggers defined in Definition 13 is applicableto any node of F .

The u-Rule, t-Rule, ∀-Rule, and the ∀+-Ruleare similar to the ones in standard tableau al-gorithms. The ∀+-Rule preserves the semanticsof transitive roles. The ∀\-Rule handles the newuniversal restriction expression introduced by thetransformation function unQ.

The t-Rule and ch-Rule are nondeterministicwhile all other rules are deterministic.

≤-Rule, ≥-Rule: Since all logical constraints of anode are collected by the rules with the two highestpriorities, after their application the algorithm hascollected all number restrictions for a node. There-fore, it is possible to compute the final partition-ing with respect to these restrictions. The ≤-Ruleand the ≥-Rule translate the number restrictions,based on the atomic decomposition technique, intoinequations. Consequently, they will add these in-equations to LE(x) for a node x. In case x is a rootnode with incoming edges from other root nodes,the partitioning might be revised later due to theinteraction of the reset-Rule and fil -Rule.

ch-Rule: The intuition behind the ch-Rule is dueto partitioning consequences. When we partitionall successors for a node, we actually consider allpossible cases for the role successors. If a parti-tion px for a node x is logically unsatisfiable, thecorresponding variable v with α(v) = px shouldbe zero. But if it is logically satisfiable, nothingbut the current set of inequations can restrict thenumber of nodes being fillers for the roles in thepartition px. On the other hand, the arithmeticreasoner is unaware of the satisfiability of a con-cept representing a partition. Therefore, in orderto organize the search space with respect to thissemantic branching and ensure completeness, thealgorithm needs to distinguish between these twocases: v ≥ 1 or v ≤ 0.

disjoint-Rule: In order to preserve Abox asser-tions of the form a 6 .= b, if there exists a node zx

where a variable v occurs in LE(zx) such thatR′, S′ ⊆ α(v) and R′ v∗ Rxa and S′ v∗ Rxb,then this variable needs to be equal to zero be-cause otherwise a solution might exist where a andb need to be merged.

equal-Rule: Since fillers of Rxy and Sxy have tobe equivalent, if there exists a node zx where avariable v occurs in LE(zx) such that R′ ∈ α(v)but S′ /∈ α(v) with R′ v∗ Rxy and S′ v∗ Sxy,then this variable needs to be equal to zero becauseotherwise a solution might exist where the fillersof Rxy and Sxy would not be equivalent.

hierarchy-Rule: In order to preserve role hierar-chies, if there exists a node x where a variable voccurs in LE(x) such that R ∈ α(v) and S /∈ α(v)but R v∗S, i.e., the partition α(v) violates the rolehierarchy, then this variable needs to be equal tozero in order to “disable” this partition.

reset-Rule: In case a number restriction with anew role R has been added to a root node x, thelabel LE and the edge label of all successors of xare reset to ∅.merge-Rule: This rule merges the successors (androot nodes) za, zb of zx into one node by replacingevery occurrence of za in the completion graph byzb. The labels L and LE of zb are unified with thecorresponding labels of za. All incoming (outgoing)edges of za become now incoming (outgoing) edgesof zb. We deliberately do not specify the details ofsuch a node merger but appeal to the intuition ofthe reader and refer to the following two sectiondescribing merging examples.

fil-Rule: The fil -Rule is the rule with the lowestpriority and it is the only generating rule. It possi-bly creates one successor (proxy node) for a nodezx that is not blocked and sets the cardinality ofthe proxy node to σ(v). If zx is a root node and hasan already existing successor xb, which is a also aroot node, and the role set for 〈zx, zb〉 is empty, i.e.,it has not yet been initialized or has been reset bythe reset-rule, the role set is set to α(v). This ruleonly creates successors based on the non-negativeinteger solution provided by the arithmetic rea-soner. Hence, it will never create successors for anode that might violate number restrictions of thisnode. Therefore, there is no need for a mechanismof merging nodes created by this rule.


Due to possible cycles between root nodes, onehas to consider incremental partitioning for rootnodes because a previously unknown number re-striction could be propagated to a root node.Whenever a concept expression of the form (≤nR)or (≥ mR) is added to L(x) (which means it didnot already exist in L(x)), the following three taskstake place:

1. The reset-Rule becomes applicable for x,which sets LE(x) := ∅ and clears the arith-metic label constraining the outgoing edgesof x.

2. Now that LE(x) is empty, the ≤-Rule and ≥-Rule will be invoked again to recompute thepartitions and variables. Afterwards, the setof inequations based on the new partitioningis added.

3. If (σ(vi) = n) ∈ Ω(x), where vi ∈ Vx corre-sponds to the previous partitioning, then set

LE(x) := LE(x)∪∑

v′j∈Vx,i

v′j ≥ n,∑

v′j∈Vx,i

v′j ≤ n

where Vx,i := v′j ∈ V ′x | α(vi) ⊆ α(v′j) andv′j ∈ V ′x are based on the new partitioning.

The third task in fact maintains the previous so-lutions in x and records it by means of inequali-ties in LE(x). Therefore, the solution based on thenew partitioning will be recreated by the arith-metic reasoner. To observe the functioning of thesetasks in more detail, we refer the reader to Section3.7.

Remark When the algorithm records the solu-tion σ(vi) = n by means of inequations, we canconsider two cases for the successors that had beengenerated by the fil -Rule, based on the previoussolution:

1. The successor is a root node y. Therefore,the corresponding solution must be of theform σ(vi) = 1 where Rxy ∈ α(vi). Thissolution will be translated to

∑v′j = 1 for

all j such that that α(vi) ⊆ α(v′j). The so-lution for this equality will be also of theform σ(v′s) = 1 for some vs ∈ Vx,i. SinceRxy ∈ α(vi) and α(vi) ⊆ α(v′s), we can con-clude that Rxy ∈ α(v′s). Hence, the new solu-tion will enhance the edge between x and yand possibly extend6 its label.

6Since α(vi) ⊆ α(v′s), the new solution will not removeany role name from the label of this edge.

2. The successor is not a root node and rep-resents an anonymous individual xi andcard(xi) = n. In this case n in the corre-sponding solution can be greater or equal to 1and later the algorithm can create a solutionfor which p new nodes will be created, where1 ≤ p ≤ n. In other words, the node xi willbe removed from the forest and will be re-placed by p new nodes. Since there exists noedge from the new nodes to root nodes, thenew nodes never propagate back any infor-mation in the forest to root nodes. Therefore,removing xi from the forest does not violaterestrictions on the root nodes.

3.6. Simple merging example

Let us assume an Abox A = x :≤1R, (x, y) :R,(x, z) :R. The preprocessing converts the role as-sertions and we get a newA′ = x :≤1R, x :≥1Rxy,x : ≤1Rxy, x :≥1Rxz, x :≤1Rxz. From this wederive a forest F and the root nodes x, y, z withthe following labels: R ∈ L(〈x, z〉), R ∈ L(〈x, y〉),L(x) = ≤1R,≤1Rxy, ≥1Rxy,≤1Rxz, ≥1Rxzwhere Rxy v R, Rxz v R ⊆ R. Consider thevariables such that α(v001) = R, α(v010) = Rxy,α(v100) = Rxz, . . . , α(v111) = R,Rxy, Rxz.

After applying the hierarchy-Rule, we will havev010 ≤ 0, v100 ≤ 0, and v110 ≤ 0. Hence, afterremoving all variables that must be zero, the fol-lowing system of inequations in LE(x) needs to besolved:

v001 + v011 + v101 + v111 ≤ 1,v011 + v111 = 1v101 + v111 = 1

(∗)

The only non-negative solution for (∗) is achievedwhen it is decided by the ch-Rule that v111 ≥ 1and all other variables are equal zero. This so-lution, which is σ(v111) = 1, will invoke the fil -Rule in Figure 2 which makes y and z the succes-sors of x such that L(〈x, y〉) = R,Rxy, Rxz andL(〈x, z〉) = R,Rxy, Rxz. Consequently, sinceRxy ∈ L(〈x, z〉), the merge-Rule in Figure 2 be-comes applicable for the nodes y, z and mergesthem.


a

b

L(a) =

!1Rab,"1Rab#

!1Rac,"1Rac,!1Rc

d

R, Rab, Rac

R, Rab, Rac

1

Fig. 3. Initial completion graph. Dashed edges do not exist

yet.

3.7. Complex merging example

Consider the Abox A with an empty Tboxand role hierarchy: A = b : ∀S.(∀R.(≥ 3S.C)),a : ≤1R, (a, b) : R, (a, c) : R, (b, d) : R, (c, d) : S,(d, c) : R. Assume the algorithm generates a for-est with root nodes a, b, c, and d.

Preprocessing:

1. Applying the function unQ: R = R ∪ S′ vS and (≥3S.C)→ ≥3S′ u ∀S′.C

2. Converting role assertions:R = R∪Rab v R,Rac v R, Rbd v R, Scd v S, Rdc v R andL(a) = L(a) ∪ ≤ 1Rab, ≥ 1Rab, ≤ 1Rac,≥ 1Rac, L(b) = L(b) ∪ ≤ 1Rbd, ≥ 1Rbd,L(c) = L(c) ∪ ≤ 1Scd, ≥ 1Scd, L(d) =L(d) ∪ ≤1Rdc, ≥1Rdc

Applying the rules: The ≤-Rule and ≥-Rule areapplicable for all of the nodes and translate thenumber restrictions into inequations (Figure 3).Node a is similar to node x in the example in theprevious section and invokes the merge-Rule for cand b.

Assume the merge-Rule replaces every occur-rence of c by b, we will have L(b) = ≤ 1Rbd, ≥1Rbd, ≤ 1Sbd, ≥ 1Sbd, ∀S.(∀R.(≥ 3S′ u ∀S′.C))and for d we will have L(d) = ≤ 1Rdb, ≥ 1Rdb(Figure 4). We have four unqualified number re-strictions in L(b) (equivalent to two equality re-strictions) which will be transformed into inequa-tions by the ≤-Rule and the ≥-Rule. Assumingα(v01) = Rbd and α(v10) = Sbd, we will havev01 + v11 = 1 and v10 + v11 = 1. After applyingthe hierarchy-Rule, we get v01 ≤ 0 and v10 ≤ 0.Thus, there is only one solution for LE(b) whichis σ(v11) = 1 and the fil -Rule sets L(〈b, d〉) =Sbd, Rbd.

Afterwards, the ∀-Rule becomes applicable forthe node b and adds ∀R.(≥3S′ u ∀S′.C) to L(d).There is only one equation for d which results in

a b(c)

L(b) =

!1Rbd,"1Rbd !1Sbd,"1Sbd#

$S.($R.(" 3S! % $S!.C))

d

L(d) = !1Rdb,"1Rdb

#$R.(" 3S! % $S!.C)

R, Rab Rbd, Sbd

1

Fig. 4. Completion graph after merging b and c.

a b(c)

b!

L(b!) = Ccard(b!) = 2

L(b) =

!1Rbd,"1Rbd !1Sbd,"1Sbd#$S.($R.(" 2S ! % $S !.C))#

" 3S ! % $S !.C

d

L(d) = !1Rdb,"1Rdb#$R.(" 3S ! % $S !.C)

#C

R, Rab

Rdb

Rbd, Sbd, S!

S !

1

Fig. 5. Final completion graph.

setting L(〈d, b〉) = Rdb. After this change, asRdb v R the ∀-Rule adds (≥ 3S′ u ∀S′.C) toL(b). Since ≥ 3S′ did not exist in L(b), the al-gorithm performs reset(b). The ≤-Rule and the≥-Rule will be fired again to recompute the par-titions, considering the new number restriction≥3S′. Let α(v′001) = Rbd, α(v′010) = Sbd, andα(v′100) = S′, the solution σ(v11) = 1 for b mustbe expanded according to the new partitioning.Considering α(v11) = Rbd, Sbd which is a subsetof α(v′011) and α(v′111), the equation v′011+v′111 = 1will be added to LE(b) as a placeholder of σ(v11) =1 and we will have:

v′001 + v′011 + v′101 + v′111 = 1v′010 + v′011 + v′110 + v′111 = 1v′100 + v′101 + v′110 + v′111 ≥ 3

v′011 + v′111 = 1

(∗∗)

After applying the hierarchy-Rule, the variablesv′001, v′010, v′101, and v′110 (underlined in (∗∗)) mustbe less than or equal to zero. One of the solu-tions for (∗∗) can be achieved, after the ch-Ruledecided v′111 ≥ 1, v′011 ≤ 0, and v′100 ≥ 1, whichis σ(v′111) = 1 and σ(v′100) = 2. Subsequently, thefil -Rule will be fired for these solutions which addsS′ to L(〈b, d〉) and creates a new non-root node b′

for which L(〈b, b′〉) := S′ and card(b′) = 2. Fi-nally, the ∀-Rule becomes applicable for ∀S′.C inb and adds C to L(d) and L(b′) (see Figure 5).


3.8. Proof of Termination, Soundness, andCompleteness

In this section we prove the termination, sound-ness, and completeness of the proposed hybrid cal-culus for the Abox consistency test. Weaker prob-lems such as the Tbox satisfiability or concept sat-isfiability test can be reduced to the Abox satisfi-ability test with respect to a given Tbox and rolehierarchy (see also at the beginning of Section 3).

3.8.1. TableauIn order to prove the soundness and complete-

ness of our calculus, a tableau is defined as an ab-straction of a model to facilitate relating the re-sult of the calculus with a model. In fact, it isproven that a model can be constructed based onthe information in a tableau and for every modelthere exists a tableau [21]. The similarity betweentableau and completion graphs, which are the re-sult of the completion rules, makes it easier toprove the soundness and completeness of the cal-culus.

Due to the fact that after preprocessing the in-put of the hybrid tableau calculus is in SHN \, wedefine a tableau for SHN \ Aboxes with respectto a role hierarchy R. The following definition issimilar to the one in [24].

Definition 16 (Tableau for SHN \) Let RA be theset of role names and IA the set of individuals inan Abox A, T = (S,LT , E ,J ) is a tableau for Awith respect to role hierarchy R, where:

– S is a non-empty set of elements (representingindividuals),

– LT : S→ 2clos(A) maps elements of S to a setof concepts,

– E : RA → 2S×S maps each role to a set ofpairs of elements in S,

– J : IA → S maps individuals occurring in Ato elements of S.

Moreover, for every s, t ∈ S,A ∈ NC , C1, C2, C ∈clos(A), R,S ∈ NR the following properties holdfor T , where RT (s) := t ∈ S | 〈s, t〉 ∈ E(R).

P1 If A ∈ LT (s), then ¬A /∈ LT (s).P2 If C1uC2 ∈ LT (s), then C1 ∈ LT (s) and C2 ∈LT (s).

P3 If C1 t C2 ∈ LT (s), then C1 ∈ LT (s) or C2 ∈LT (s).

P4 If ∀R.C ∈ LT (s) and 〈s, t〉 ∈ E(R), then C ∈LT (t).

P5 If ∀R.C ∈ LT (s), for some S v R we have Sis transitive, and 〈s, t〉 ∈ E(S), then ∀S.C ∈LT (t).

P6 If ∀R\S.C ∈ LT (s), 〈s, t〉 ∈ E(R), but 〈s, t〉 /∈E(S), then C ∈ LT (t).

P7 If 〈s, t〉 ∈ E(R), and R v S, then 〈s, t〉 ∈ E(S).P8 If ≥ nR ∈ LT (s), then #RT (s) ≥ nP9 If ≤ mR ∈ LT (s), then #RT (s) ≤ mP10 If (a : C) ∈ A then C ∈ LT (J (a))P11 If (a, b) : R ∈ A, then 〈J (a),J (b)〉 ∈ E(R)P12 If a 6 .= b ∈ A, then J (a) 6= J (b)

Lemma 17 A SHQ Abox A has a tableau iffunQ(A) has a SHN \ tableau, where unQ(A) de-notes A after applying unQ to every assertion inA.

Lemma 17 is a straightforward consequence ofthe equisatisfiability of C and unQ(C) for everyconcept expression C in SHQ (see Section 2 and[30]).

3.8.2. TerminationIn order to prove termination of the hybrid al-

gorithm, we prove that it constructs a finite for-est. Since the given Abox has always a finite num-ber of individuals (i.e., root nodes), it is sufficientto prove that the hybrid algorithm creates finitetrees in which the root nodes represent Abox in-dividuals. On the other hand, due to the fact thatwe include nondeterministic rules, the t-Rule andthe ch-Rule, we must also prove that the algorithmcreates finitely many forests due to nondetermin-ism.

Lemma 18 (Termination) The hybrid algorithm ter-minates for a given Abox A with respect to a rolehierarchy R7.

Proof. Let m = |clos(A)| and k be the numberof different number restrictions after the prepro-cessing step. Therefore, m is an upper bound onthe length of a concept expression in the label of anode and k is the maximum number of roles par-ticipating in the atomic decomposition of a node.The algorithm creates a forest that consists of ar-bitrarily connected root nodes and their non-rootnode successors which appear in trees. The ter-mination of the algorithm is a consequence of thefollowing facts:

7Since Tbox axioms are propagated through the univer-sal transitive role, we do not mention the Tbox as an input

of the algorithm (see Section 2).


1. There are only two nondeterministic rules:the t-Rule and the ch-Rule. The t-Rule canbe fired at most m times for a node x, whichis the maximum length of L(x). On the otherhand, the ch-Rule can be fired at most 2Vx

times and Vx is bounded by 2k. Accordingly,we can conclude that the nondeterministicrules can only be fired finitely often for a nodeand therefore the algorithm creates finitelymany forests.

2. The only rule that removes a node from theforest is the merge-Rule which removes aroot node each time. Since there are |IA| rootnodes in the forest, this rule can be fired atmost |IA| times for a node. Moreover, accord-ing to the fact that the algorithm never cre-ates a root node, it cannot fall in a loop ofremoving and creating the same node.

3. The only generating node is the fil -Rulewhich can create at most |Vx| successors fora node x. Therefore, the out degree of theforest is bounded by |Vx| ≤ 2k.

4. According to the blocking condition, thereexist no two nodes with the same logical labelin a path in the forest, starting from a rootnode. In other words, the length of a pathstarting from a root node is bounded by thenumber of different logical labels (i.e., m).

5. The arithmetic reasoner always terminatesfor a finite set of inequations as the input.

According to (3) the out-degree of the forests isfinite and due to (4) the depth of the trees is finite.Therefore the size of each forest created by thehybrid algorithm is bounded. On the other hand,according to (1), the algorithm can create onlyfinitely many forests. Considering (2) and (5), wecan conclude the termination of the hybrid algo-rithm.

3.8.3. SoundnessTo prove the soundness, we must prove that the

constructed model, based on a complete and clash-free completion forest, does not violate the seman-tics of the input language. According to Lemma17 and the fact that a model can be obtained froma SHQ tableau (proven in [24]), it is sufficient toprove that a SHN \ tableau can be obtained froma complete and clash-free forest.

Lemma 19 (Abox semantics) The hybrid algorithmpreserves the semantics of Abox assertions; i.e., as-sertions of the form (a, b) :R and a 6 .= b.

Proof. The algorithm replaces assertions of theform (a, b) : R with a : (≤1Rab u ≥1Rab). Con-sider xa is the node corresponding to the individ-ual a ∈ IA in the forest F and likewise xb forb ∈ IA. According to the definition of Rab, the car-dinality restrictions, and assuming the fact thatthe algorithm correctly handles unqualified num-ber restrictions, one can conclude that for somev ∈ Vxa

that σ(v) = 1 with Rab ∈ α(v). Therefore,according to the condition (a) of the fil -Rule inFigure 2, Rab will be added to L(〈xa, xb〉). Sincethe preprocessing and the hierarchy-Rule preservethe role hierarchy and Rab v R holds, the assertion(a, b) :R is satisfied.

On the other hand, due to the restriction ≤1Rab, for every v′ 6= v if Rab ∈ α(v′) thenσ(v′) = 0. Hence, a set of solutions L(〈xa, xb〉)cannot be modified more than once by the fil -Rulefor more than one variable and consequently, thelabel of 〈xa, xb〉 does not depend on the order ofvariables for which the fil -Rule applies.

Let us assume an assertion of the form a 6 .= bis violated, i.e., aI = bI . This could only hap-pen if the merge-Rule became applicable to thecorresponding nodes xa, xb, xy ∈ V such thatw.l.o.g. Rya, Ryb ⊆ L(〈xy, xb〉) and A con-tains (y, a) : R, (y, b) : R. Then, the disjoint-Rule would have fired for a v ∈ Vxy

and addedv ≤ 0 to LE(xy) such that Rxa, Rxb ⊆ α(v)because due to our preprocessing and its result-ing role hierarchy and Rxa, Rxb ⊆ L(〈xy, xb〉),LE(xy) must contain in at least one of its inequa-tions a variable v representing the possibility thata and b need to be merged. Due to v ≤ 0 ∈ LE(xy),it is impossible for the arithmetic reasoner to cre-ate a solution which would require the algorithmto merge a and b.

Lemma 20 (Soundness) If the completion rules canbe applied to a SHQ-Abox A and a role hierarchyR such that they yield a complete and clash-freecompletion forest, then A has a tableau w.r.t. R.

Proof. A SHN \ tableau T can be obtainedfrom a complete and clash-free completion for-est F = (V,E,L,LE) by mapping nodes in F toelements in T which can be defined from F byT := (S,LT , E ,J ) as shown in Figure 6.

In the following we prove the properties of aSHN \ tableau for T .

– Since F is clash-free, P1 holds for T .


S := x1, . . . , xm |x ∈ F , card(x ) = mLT (xi) := L(x) for 1 ≤ i ≤ m if card(x) = mE(R) := 〈xi, yj〉 |R′ ∈ L(〈x, y〉)∧R′ v∗ RJ (a) := xa if xa is a root node in F repre-

senting individual a ∈ IA. If xb ∈V is merged into xa such that ev-ery occurrence of xb is replaced byxa, then J (b) = xa.

Fig. 6. Converting forest F to tableau T .

– If C1uC2 ∈ LT (xi), it means that (C1uC2) ∈L(x) in the forest F . Therefore the u-Ruleis applicable to the node x which adds C1

and C2 to L(x). Hence C1 and C2 must bein LT (xi) and we can conclude P2 and like-wise P3 for T . Similarly, properties P4 and P5are respectively guaranteed by the ∀-Rule and∀+-Rule.

– If ∀R\S.C ∈ LT (xi) it means that ∀R\S.C ∈L(x). Also, 〈xi, yi〉 ∈ E(R) means that thereexists a R′ v∗ R where R′ ∈ L〈x, y〉. Simi-larly, 〈xi, yi〉 /∈ E(S) means that there existsno S′ v∗ S such that S′ ∈ L〈x, y〉. Therefore,the ∀\-Rule becomes applicable for x and addsC to L(y) which is equivalent with C ∈ LT (yi)and P6 holds for T .

– Assume R v S, if 〈xi, yj〉 ∈ E(R), we can con-clude that R ∈ L(〈x, y〉) in F . The hierarchy-Rule maintains the role hierarchy by settingv ≤ 0 if R ∈ α(v) but S /∈ α(v). Moreover,since every R-successor is an S-successor, therole hierarchy is considered and properly han-dled by the ∀-Rule, ∀+-Rule, and ∀\-Rule.Therefore, we will have S ∈ L(〈x, y〉) and ac-cordingly 〈xi, yj〉 ∈ E(S). Hence, the propertyP7 holds for T .

– Due to the priorities of rules, if the ≤-Ruleand the ≥-Rule are invoked, the logical label,L(x), cannot be extended anymore. In otherwords, a correct partitioning based on all thenumber restrictions for a node has been cre-ated. Therefore, the solution created by thearithmetic reasoner satisfies all inequationsin LE(x). If (≤ mR) ∈ LT (xi), we had (≤mR) ∈ L(x) for the corresponding node in F .According to the atomic decomposition for x,the≤-Rule will add Σvi ≤ m to LE(x). There-fore, the solution Ωj(x) for LE(x) will satisfythis inequation and if R ∈ α(vj

i ) ∧ σ(vji ) ≥ 1

for 1 ≤ i ≤ k then (σ(vj1) + σ(vj

2) + . . . +

σ(vjk)) ≤ m. For every σ(vj

i ) = mi the fil-Rule creates an R-successor yi with cardinal-ity mi for x. This node will be mapped tomi elements in the tableau T which are R-successors of xi ∈ S. Therefore, xi will haveat most m R-successors in T and we can con-clude that P9 hold for T and P8 is satisfiedsimilarly.

– The hybrid algorithm sets card(xa) = 1 andL(xa) := C | (a : C) ∈ A for every node xa

in F which represents an individual a ∈ IA.Therefore, an Abox individual will be repre-sented by one and only one node and P10 issatisfied. P11 and P12 are due to Lemma 19.

3.8.4. CompletenessIn order to be complete, an algorithm needs to

ensure that it explores all possible solutions. Inother words, if a tableau T exists for an inputAbox, the algorithm can apply its completion rulesin such a way that it yields a forest F from whichwe can obtain T as shown in Figure 6.

Lemma 21 In a complete and clash-free forest, fora node x ∈ V and its successors y, z ∈ V , ifL(〈x, y〉) = L(〈x, z〉) then L(y) = L(z).

Proof. The only way to extend the logical la-bel of a node is through the t-Rule, u-Rule, ∀-Rule, ∀+-Rule, and the ∀\-Rule. Since L(〈x, y〉) =L(〈x, z〉), the ∀-Rule, ∀+-Rule, and the ∀\-Rulewill have the same effect and extend L(y) and L(z)similarly. We can consider the following two cases.

(i) If y and z are non-root nodes, we haveL(y) = L(z) = ∅ before starting the applicationof the completion rules. Therefore, when extendedsimilarly, they will remain identical after the ap-plication of the tableau rules.

(ii) If w.l.o.g. y is a root node, then there existsa role name R ∈ NR such that Rxy ∈ L(〈x, y〉).Therefore, if L(〈x, y〉) = L(〈x, z〉) then Rxy ∈L(〈x, z〉) which results in merging y and z in asingle node by the merge-Rule. Therefore, we canstill conclude that L(y) = L(z).

Corollary 22 According to the mapping from a for-est F to a SHN \ tableau T (Figure 6), everyLT (s) in T is equal to L(x) in F if x is mappedto s. Moreover, every R ∈ L(〈x, y〉) is mapped to〈s, t〉 ∈ E(R). Therefore L(〈x, y〉) = L(〈x, z〉) isequivalent to R ∈ NR | 〈s, t〉 ∈ E(R) = R ∈


NR | 〈s, t′〉 ∈ E(R) where x is mapped to s, y tot, and z to t′. Furthermore, L(y) = L(z) is equiva-lent to LT (t) = LT (t′). Thus, according to Lemma21, we can conclude: R ∈ NR | 〈s, t〉 ∈ E(R) =R ∈ NR | 〈s, t′〉 ∈ E(R) ⇒ LT (t) = LT (t′).

Lemma 23 If for a node zx a set of non-negativeinteger solutions Ω(zx) based on the set of inequa-tions LE(zx) causes a logical clash, all other non-negative integer solutions for LE(zx) will also trig-ger the same logical clash.

Proof. Assume we have a solution set Ω(zx) =σ(v1) = m1, σ(v2) = m2, . . . , σ(vn) = mn,which can only occur if vi ≥ 1 for 1 ≤ i ≤ n is de-cided by the ch-Rule and all other variables in Vzx

are equal to zero. Let V≥1zx

= v ∈ Vzx|σ(v) ≥ 1

denote this subset of variables. Suppose we alsohave a different solution Ω′(zx) for LE(zx) andV≥1

zxsuch that Ω′(zx) = σ′(v1) = p1, σ

′(v2) =p2, . . . , σ

′(vn) = pn. Both solutions Ω(zx) andΩ′(zx) have the same the set V≥1

zxof non-zero vari-

ables. So, for both solutions the fil -Rule will firefor the variables in V≥1

zxbut for some of these vari-

ables their assignment given by σ and σ′ will bedifferent. We have to distinguish the following twocases for a given v ∈ V≥1

zx: (i) condition (a) of the

fil -Rule (see Figure 2) is true, then zx and its suc-cessor zb are root nodes representing a role asser-tion (x, b) : Rxb (see Definition 5) and zx musthave for all possible arithmetic solutions an Rxb-filler, so, since σ(v) = σ′(v) = 1 the possibly as-signed value α(v) of LE(〈zx, zb〉) will only dependon the partitioning RSzx

(see Definition 8) andnot on Ω(zx) or Ω′(zx); (ii) condition (b) is true,then the fil -Rule will create a new node y and,again, the assigned value α(v) of LE(〈zx, y〉) willonly depend on the partitioning RSzx and not onΩ(zx) or Ω′(zx).

This concludes that the edge labels LE(〈zx, ·〉),created by the fil -Rule, only depend on the com-position of V≥1

zxand not on the values assigned

to the variables in V≥1zx

. Therefore, by consideringLemma 21 the forest generated based on Ω′(x) willcontain the same nodes as Ω(x), however, with dif-ferent associated cardinalities. Since logical clashesdo not depend on the cardinality of the nodes, wecan conclude that the selection of Ω′(x) will resultin the same clash as for Ω(x).

Corollary 24 According to Lemma 23, all solutionsfor LE(x) will end up with the same result; eitherall of them yield a complete and clash-free forestor return a clash.

Lemma 25 (Completeness) LetA be a SHQ-Aboxand R a role hierarchy. If A has a tableau w.r.t.R, then the completion rules can be applied toA such that they yield a complete and clash-freecompletion forest.

Proof. We assume we have a SHN \ tableau T =(S,LT , E ,J ) for A and we claim that the hybridalgorithm can create a forest F = (V,E,L,LE)from which T can be obtained. The procedure ofobtaining T from F is shown in Figure 6. We provethis by induction on the set of nodes in V .

Consider a node x in F and the completion rulesin Figure 2 and let s ∈ S in T be the elementthat is mapped from x. We actually want to provethat with guiding the application of the completionrules on x we can extend F such that it can stillbe mapped to T .

– The t-Rule: If (C1 t C2) ∈ L(x) then (C1 tC2) ∈ LT (s). The t-Rule adds C1 or C2 toL(x) which is in accordance with the propertyP2 of the tableau where for some concept E ∈C1, C2 we have E ∈ LT (s). The u-Rule,∀-Rule, and ∀+-Rule, which are deterministicrules, are similar to the t-Rule. In fact theserules are built exactly based on their relevanttableau property.

– ∀\-Rule: If ∀R\S ∈ L(x) then ∀R\S.C ∈LT (s) which means if 〈s, t〉 ∈ L(R) but 〈s, t〉 /∈E(S) then C ∈ LT (t). If t is a mapping fromy it is equivalent to say if there exists R′ v Rand no S′ v S such that R′ ∈ L(〈x, y〉) andS′ ∈ L(〈x, y〉) then C ∈ L(y). This conditionis satisfied by means of the ∀\-Rule.

– The ch-Rule: Consider in T we have t1, t2, . . . , tnas the successors of s, i.e., ∃R ∈ NR, 〈s, ti〉 ∈E(R). Intuitively, we cluster these successorsin groups of elements with the same label LT .For example if tk, . . . , tl have the same la-bel, according to Corollary 22, Nkl := R ∈NR | 〈s, tj〉 ∈ E(R) will be identical for tj ,k ≤ j ≤ l. We define a variable vkl for such aset of role names with α(vkl) = Nkl. In orderto have T as the mapping of F , the ch-Rulemust impose vkl ≥ 1.According to properties P8 and P9 of thetableau, ≤nR and ≥mR are satisfied in T for


s. Therefore, the inequations based on thesevariables will have a non-negative integer so-lution. Notice that the created set of variableconstraints based on T may result in a differ-ent solution. For example in T , the elements may have t1 and t2 as successors with thelabel LT

1 which sets v ≥ 1 and t′1, t′2, andt′3 as successors with the label LT

2 which setsv′ ≥ 1. However, in the solution based onthese variable constraints we may have threesuccessors with the label LT

1 and two succes-sors with the label LT

2 . Nevertheless, accord-ing to the Lemma 23 this fact does not violatethe completeness of the algorithm.

– The reset-Rule is a deterministic rule which isonly applicable to root nodes. Clearing the la-bel of the outgoing edges and also LE(x) doesnot violate properties of the tableau mappedfrom F . This is due to the fact that the labelof the outgoing edges from x will later be setby the fil -Rule which has a lower priority.

– The merge-Rule is also only applicable to rootnodes. Assume individuals a, b, c ∈ IA aresuch that b and c are successors of a that mustbe merged according to an at-most restrictiona : (≤nR). Since T is a tableau the restriction≤nR ∈ LT (J (a)) imposes that J (b) = J (c).On the other hand, if xa, xb, and xc are rootnodes representing a, b, and c, themerge-Rulewill merge xb and xc according to the solutionfor LE(xa). In the mapping from F to T , xb

and xc will be mapped to the same elementthat implies J (b) = J (c) which follows thestructure of T .

– The ≤-Rule, ≥-Rule, equal -Rule, disjoint-Rule, and the hierarchy-Rule only modifyLE(x). Therefore, they will not affect themapping of T from F .

– The fil -Rule, with the lowest priority, gener-ates successors for x according to the solu-tion provided by the arithmetic reasoner forLE(x). Since LE(x) conforms to the at-mostand at-least restrictions in the label of x andaccording to the variable constraints decidedby the ch-Rule, the solution will be consis-tent with T . Notice that every node x in Ffor which card(x) = m and m > 1 will bemapped to m elements in S.

The resulting forest F is clash-free and completedue to the following properties:

1. F cannot contain a node x such that A,¬A ⊆L(x) since L(x) = LT (s) and property P1 ofthe definition of a tableau would be violated.

2. F cannot contain a node x such that LE(x) isunsolvable. If LE(x) is unsolvable, this meansthat there exists a restriction of the form(≥ nR) or (≤ mR) in L(x) and thereforeLT (s) that cannot be satisfied which violatesproperty P8 and/or P9 of a tableau.

4. Practical reasoning

There is always a conflict between the expres-siveness of a DL language and the difficulty ofreasoning. Increasing the expressiveness of a rea-soner with qualified number restrictions can be-come very expensive in terms of efficiency. Asmentioned above, a standard algorithm to dealwith qualified number restrictions must extend itstableau rules with at least two nondeterministicrules; i.e., the choose-Rule and the ≤-Rule (seeFigure 7). In order to achieve an acceptable perfor-mance, a tableau algorithm needs to employ effec-tive optimization techniques. As stated in [19], theperformance of tableau algorithms even for simplelogics is a problematic issue.

In this section we briefly analyze the complex-ity of both the standard and hybrid algorithms.Based on the complexity analysis, we address somesources of inefficiency in DL reasoning. More-over, we propose some new or adapted optimiza-tion techniques for the hybrid algorithm. In thelast part we give special attention to dependency-directed backtracking as a major optimizationtechnique and compare its effect on both the stan-dard and the hybrid algorithm.

4.1. Complexity Analysis

To analyze the complexity of the concept satis-fiability test with respect to qualified number re-strictions, we count the number of branches thatthe algorithm creates in the search space.8

In the following we assume a node x ∈ V inthe completion forest that contains p at-least re-strictions and q at-most restrictions in its label(Ri, R

′j ∈ NR and Ci, C

′j ∈ clos(T )):

8Notice that every nondeterministic rule that can have koutcomes opens k new branches in the search space.


≥-Rule

if (≥ nR.C) ∈ L(x) and there are noR-successors y1, y2, . . . , yn for x suchthat C ∈ L(yi) and yi 6 .= yj

then create n new individualsy1, y2, . . . , yn and set L(yi) := C,L(〈x, yi〉) := R, and yi 6 .= yj for1 ≤ i ≤ j ≤ n and i 6= j

choose-Rule

if (≤ mR.C) ∈ L(x) and there ex-ists an R-successor y of x such thatC,¬C ∩ L(y) = ∅,then set L(y) := L(y)∪C or L′(y) :=L(y) ∪ ¬C

≤-Rule

if (i) (≤ nR.C) ∈ L(x) and x has mR-successors such that m > n and,(ii) there exist R-successors y, z for xand y 6 .= z does not holdthen replace every occurrence of y by z

Fig. 7. Standard tableau rules handling Q.

≥n1R1.C1, . . . ,≥npRp.Cp ⊆ L(x)

≤m1R′1.C′1, . . . ,≤mq R

′q.C′q ⊆ L(x)

4.1.1. Standard TableauThe rules of a standard tableau algorithm deal-

ing with qualified number restrictions as shown inFigure 7 create n R-successors in C for each at-least restriction of the form ≥ nR.C. Moreover,in order to avoid that the successors are beingmerged, they are asserted as mutually distinct in-dividuals. Assuming that no Ci is subsumed by aCj , there will be N := n1 + · · ·+ np successors forx which are composed of p sets of successors, suchthat successors in each set are mutually distinct.

Moreover, according to every at-most restric-tion ≤miR

′i.C′i the choose-Rule will create two

branches in the search space for each successor.Therefore, based on the q at-most restrictions inL(x), there will be 2q cases for each successor ofx. Since x has N successors, there will be totally(2q)N cases to be examined by the algorithm. No-tice that the creation of these 2qN branches is in-dependent from any clash occurrence and the al-gorithm will always invoke the choose-Rule N× ptimes.

Suppose the algorithm triggers a clash accordingto the restriction ≤miR

′i.C′i. If there exist M R′i-

successors in C ′i such that M > mi, the algorithmopens f(M,mi) :=

(M2

)(M−1

2

). . .(mi+1

2

)/(M −

mi)! new branches in the search space which is

the number of possible ways to merge M indi-viduals into mi individuals. In the worst case, ifm := mini∈1..q mi there will be f(N,m) waysto merge all the successors of x. Therefore, inthe worst-case one must explore (2q)N ×f(N,m)branches.

4.1.2. Hybrid TableauDuring the preprocessing step, the hybrid al-

gorithm converts all qualified number restrictionsinto unqualified ones which introduces p+ q newrole names. According to the atomic decomposi-tion presented in Section 3.2, the hybrid algorithmdefines 2p+q − 1 partitions and consequently vari-ables for x; i.e. |Vx| = 2p+q−1. The ch-Rule openstwo branches for each variable in Vx. Therefore,there will be totally 2|Vx| cases to be examinedby the arithmetic reasoner, which considers onlya single solution out of many possible solutions.The ch-Rule will be invoked |Vx| = 2p+q − 1 timesand creates 22p+q

branches in the search space.Hence, the complexity of the algorithm seems to becharacterized by a double-exponential function ofp+q. In [27] a polynomial-time algorithm for inte-ger programming with a fixed number of variablesis given. However, our prototype implementationemployed the Simplex method which is known tobe NP in the worst case but usually behaves verywell on average.

4.1.3. Hybrid vs. StandardComparing the complexity of the standard algo-

rithm with the hybrid algorithm, we can conclude:

– The complexity of the standard algorithmis a function of N and therefore the num-bers occurring in the at-most restrictions canaffect the standard algorithm exponentially.Whereas in the hybrid algorithm, the com-plexity is independent from N due to its arith-metic approach to the problem.

– Let initial complexity refer to the complexityof the tasks that the algorithm needs to per-form independently from the occurrence of aclash. That is to say, the tasks that need tobe done in all cases (whether worst or best-case). Particularly, the initial complexity ofthe standard algorithm is due to the choose-Rule ((2q)N ) and the initial complexity of thehybrid algorithm is due to the ch-Rule (22p+q

).Therefore, whenever N × q < 2p+q, the timespent for initializing the algorithm is greaterfor the hybrid algorithm in comparison withthe standard algorithms.


– The major source of complexity in the stan-dard algorithm is due to the merge-Rule. Be-ing highly nondeterministic, this rule can be amajor source of inefficiency. Therefore, in thecase of hardly satisfiable concept expressions,the standard algorithm can become very in-efficient. In contrast, the hybrid algorithm,based on an arithmetically correct solution fora set of inequations, generates and merges thesuccessors of an individual deterministicallyand.9

– Whenever a clash occurs, the algorithm needsto explore an open choice point to choose anew branch. The sources of nondeterminismdue to number restrictions in the standard al-gorithm are more than one: the choose-Ruleand the merge-Rule, whereas in the hybrid al-gorithm we have only the ch-Rule. Therefore,in the hybrid algorithm it is easier to trackthe sources of a clash.

4.2. Partition Optimization Techniques

For various DL reasoning services different opti-mization techniques have been developed. For ex-ample, axiom absorption [25] or lazy unfolding [2]are optimization techniques for Tbox services suchas classification or subsumption testing. These op-timization techniques facilitate subsumption test-ing and by avoiding unnecessary steps in Tbox rea-soning improve the performance of the reasoner.The hybrid algorithm is meant to address the per-formance issues regarding reasoning with qualifiednumber restrictions independently from the rea-soning service. In other words, by means of thehybrid reasoning, we want to improve reasoningat the concept satisfiability level which definitelyaffects Tbox and Abox reasoning.

At the concept satisfiability level, the sourcesof inefficiency are due to high nondeterminism. Infact, nondeterministic rules such as the t-Rule inFigure 2 or the choose-Rule in Figure 7 create sev-eral branches in the search space. In order to becomplete, an algorithm needs to explore all of thesebranches in the search space. Optimization tech-niques mainly try to reduce the size of the searchspace by pruning some of these branches. More-over, some heuristics can help the algorithm to

9Note that the hybrid algorithm never merges anony-mous (non-root) nodes.

guess which branches to explore first. In fact, themore knowledge the algorithm uses to guide theexploration, the less probable it is that its decisionwill fail later.

Although it seems that the hybrid algorithm isdouble-exponential and the large number of vari-ables seems to be hopelessly inefficient, there aresome effective heuristics and optimization tech-niques which make it feasible to use. In the follow-ing we briefly explain three heuristics which cansignificantly improve the performance of the algo-rithm in the average case.

4.2.1. Default Value for VariablesIn the semantic branching based on the concept

choose-Rule, in one branch we have C and in theother branch we have ¬C in the label of the nodes.However, due to the ch-Rule (for variables) in onebranch we have v ≥ 1 whereas in the other branchv ≤ 0. In contrast to concept branching based onthe choose-Rule, in variable branching we can ig-nore the existence of variables that are less or equalto zero. In other words, the arithmetic reasoneronly considers variables that are greater or equalto one.

Therefore, by setting the default value to v ≤ 0for every variable, the algorithm does not needto invoke the ch-Rule |Vx| times before startingto find a solution for the inequations. More pre-cisely, the algorithm starts with the default valueof v ≤ 0 for all variables in |Vx|. Obviously,the solution for this set of inequations, which is∀ vi ∈ Vx : σ(vi) = 0, cannot satisfy any at-leastrestriction. Therefore, the algorithm must choosesome variables in Vx to make them greater or equalto one. Although in the worst case the algorithmstill needs to try 2|Vx| cases, by setting this de-fault value it does not need to invoke the ch-Rulewhen it is not necessary. In other words, by ben-efiting from this heuristics, the initial complexity(see Section 4.1.3 for a definition) of the hybridalgorithm is no longer 2p+q.

4.2.2. Strategy of the ch-RuleAs explained in the previous section, in a more

optimized manner, the algorithm starts with thedefault value of zero for all the variables. After-wards, it must decide to set some variables greaterthan zero in order to find an arithmetic solution.The order in which the algorithm chooses thesevariables can help the arithmetic reasoner find asolution faster, if one exists.


We define don’t care variables as the set of vari-ables that have appeared in an at-least restric-tion but in no at-most restriction. Therefore, thesevariables have no restrictions other than logicalrestrictions which later on will be processed bythe algorithm. Therefore, any non-negative integervalue observing the arithmetic limitations can beassigned to these variables and we can leave themunchanged in all inequations unless a logical clashis triggered.

Moreover, we define satisfying variables as theset of variables which occur in an at-least restric-tion and are not don’t care variables. Since theseare the variables that occur in an at-least restric-tion, by assigning them to be greater or equal toone, the algorithm can lead the arithmetic reasonerto a solution. Whenever a node that is createdbased on v causes a clash, by means of dependency-directed backtracking we will set v ≤ 0 and there-fore remove v from the satisfying variables set.When the satisfying variables set becomes emptythe algorithm can conclude that the set of qualifiednumber restrictions in L(x) is unsatisfiable.

Notice that the number of variables that can bedecided to be greater than zero in an inequation isbounded by the number occurring in their corre-sponding number restriction. For example, in theinequation v1 + v01 + · · ·+ v10000000 ≥ 5, althoughwe have 128 variables in the inequation, not morethan five of the vi can be greater or equal to oneat the same time.

4.2.3. Variable EncodingOne of the interesting characteristics of the vari-

ables is that we can encode their indices in binaryformat to easily retrieve the role names related tothem. On the other hand, we do not need to as-sign any memory space for them unless they havea value greater than zero based on an arithmeticsolution.

4.3. Dependency-Directed Backtracking

Backjumping [12] or conflict-directed-directedbackjumping [32] are improved backtracking meth-ods originally developed in areas such as con-straint reasoning. These techniques were adaptedto DL reasoning as dependency-directed back-tracking [23]. These techniques detect the sourcesof an encountered clash and try to bypass dur-ing backtracking branching points that are not re-

lated to the sources of the clash. By means of thismethod, an algorithm can prune branches that willend up with the same sort of clash. As demon-strated in [23,18], this method improved the per-formance of the FaCT system to deal much moreeffectively with qualified number restrictions. Itis nowadays an optimization technique that is in-evitable in any practically usable DL reasoner [1,Chapter 9]. It is beyond the scope of this articleto prove the completeness of this well-known opti-mization technique.

By analogy, dependency-directed backtrackingwas also adapted to the hybrid algorithm. When-ever a logical clash for a successor y of x is en-countered, one can conclude that the correspond-ing variable vy for the partition in which y residesmust be zero.

Assume, y is based on a solution σ(vy) = k bythe fil -Rule and the algorithm encounters a clashin the node y. This solution can only occur when-ever vy ≥ 1 is decided by the ch-Rule for thenode x. Any other completion forest in this branch,where vy ≥ 1, will end up with a solution in theform σ(vy) = k′ (k′ ≥ 1). Assume the successorof x, created based on this solution is y′. SinceL〈x, y〉 = α(vy) and L〈x, y′〉 = α(vy) are equaland all the concepts in L(y) and L(y′) are createdbased on the roles from x, we can conclude thatL(y) = L(y′). Therefore, y′ will contain the sameclash as y. Consequently, we can conclude that allthe branches in the search space where vy ≥ 1 willend up with the same clash.

Therefore, we can prune all branches for whichvy ≥ 1 ∈ LE(x). We call this method, simple back-tracking which can exponentially decrease the sizeof the search space by pruning half of the brancheseach time the algorithm detects a clash. For ex-ample, for an arbitrary L(x), by pruning all thebranches where vy ≥ 1, we will in fact prune2|Vx|−1 = 22p+q−1 branches w.r.t. the ch-Rule,which amounts to half of the branches.

We can improve this by a more complex versionof dependency-directed backtracking in which weprune all branches that have the same reason forthe clash caused by vy. We call this method com-plex backtracking. For instance, assume the node ythat is created based on σ(vy) = k where k ≥ 1ends up with a clash. Since we have only one typeof clash other than the arithmetic clash, assumethe clash is because of A,¬A ⊆ L(y) for someA ∈ NC . Moreover, w.l.o.g. assume we know that


A is caused by a ∀Ri.A restriction in its prede-cessor x and ¬A by ∀S\Tj .(¬A) ∈ L(x). It ispossible to conclude that all the variables v forwhich Ri ∈ α(v) ∧ Tj /∈ α(v) will end up with thesame clash.10 This is due to the fact that when-ever the arithmetic reasoner assigns to a variablev (for which Ri ∈ α(v) ∧ Tj /∈ α(v)) occurring inLE(x) a value k ≥ 1, the fil -Rule will eventuallyfire for x and create a corresponding successor ywith Ri ∈ L(〈x, y〉) and Tj /∈ L(〈x, y〉). This wouldmake the ∀-Rule and ∀\-Rule applicable to x andcause a clash for y.

Consider the binary coding for the indices of thevariables in which the ith digit represents Ri andthe jth digit represents Tj . Therefore, all the vari-ables, where the binary coding has 1 as its ith digitand 0 as its jth digit must be zero. Since the bi-nary coding of the variable indices requires a totalof p+ q digits, the number of variables that mustbe zero will be 2p+q−2. The 2p+q − 2p+q−2 othervariables are free to be constrained and will opentwo branches in the search space. Therefore, thenumber of branches will be reduced from 2|Vx| to23/4|Vx| which is a significant improvement. In fact,the atomic decomposition technique can be con-sidered as a method to organize the search spaceand at the same time by means of numerical rea-soning and proxy individuals remain unaffected bythe value of numbers.

For example, consider the case when there are7 number restrictions in L(x) and therefore 27

variables. Accordingly, the ch-Rule opens 2128

branches in the search space. If y is a succes-sor of x which is created based on the solutionσ(v0011101) = m and y ends up with a clash, thealgorithm can conclude that v0011101 ≤ 0 mustbe added to LE(x). Therefore, based on sim-ple backtracking, 2127 branches remain to be ex-plored. Moreover, assume the clash in y is dueto A,¬A ⊆ L(y) where A is created becauseof ∀R3.A ∈ L(x) and ¬A is created because of∀R\R2.¬A ∈ L(x). Hence, the algorithm can con-

10Notice that in the cases where we have a disjunc-tion in the sources of a clash, there may exist more than

two sources for a clash. For example, assume ∀R.(A t¬B), ∀S.(B t ¬C),∀T.(C t ¬A) ⊆ L(x) and we haveR,S, T ⊆ L(〈x, y〉) and y leads to a clash. In fact, all of

these three role names in L(〈x, y〉) together are the sourcesof this clash. Therefore, the algorithm concludes that allthe branches in the search space for which v ≥ 1 and

R ∈ α(v) ∧ S ∈ α(v) ∧ T ∈ α(v) will end up with the sameclash.

clude that for all the variables v where R3 ∈ α(v)and R2 /∈ α(v), the same clash will occur. Namely,variables of the form v01, where ∈ 1, 0,must be zero. Therefore, 2128−32 = 23×32 branchesremain to be explored.

4.3.1. Backtracking in the Arithmetic ReasonerNormally there could be more than one solution

for a set of inequations. According to Lemma 23in Section 3, when we have a solution with respectto a set of restrictions of the form vi ≥ 1, differentsolutions where the non-zero variables only differin their values do not make any logical differences.In fact, the algorithm will create successors withthe same logical labels but different cardinalitiesbased on these different solutions. Since all the so-lutions minimize the sum of variables and satisfyall the numerical restrictions, they do not makeany arithmetic differences (as long as the set ofzero-value variables is the same).

In addition, notice that backtracking withinarithmetic reasoning is not trivial due to the factthat the cause of an arithmetic clash cannot beeasily traced back. In other words, the whole set ofnumber restrictions together causes the clash. Inthe same sense as in a standard tableau algorithm,if all the possible merging arrangements end upwith a clash, one can only conclude that the cor-responding number restrictions are not satisfiabletogether.

4.3.2. Backjumping: Standard vs. HybridAlgorithm

By comparing the effect of dependency-directedbacktracking on the hybrid and the standard algo-rithm, we can conclude:

1. In fact, the atomic decomposition is a mech-anism of organizing role-fillers of an individ-ual in partitions that are disjoint and yetcover all possible cases. Therefore, it is moresuitable for dependency-directed backtrack-ing. In other words, the whole tracking andrecording that are performed in order to de-tect sources of a clash to prune the searchspace, are hard-coded in the hybrid algorithmby means of the atomic decomposition.

2. In the hybrid algorithm, the sources of non-determinism are only the ch-Rule and thet-Rule, whereas in standard algorithms wehave three sources of nondeterminism: thet-Rule, the choose-Rule, and the ≤-Rule.Therefore, in contrast to standard algorithms


Completion

Rules

Fig. 8. Overall reasoner architecture.

which have three nondeterministic rules, thehybrid algorithm can more easily backjumpto the source of the clash. In other words, thenondeterminism due to the concept choose-Rule and the ≤-Rule in standard algorithmsis integrated into one rule, the variable ch-Rule, in the hybrid algorithm.

5. Architecture of the Prototype Reasoner

As illustrated in Figure 8, the hybrid reasoner iscomposed of two major modules: the logical mod-ule and the arithmetic module. The input of thereasoner is an ALCHQ11 concept expression. Theoutput of the algorithm is “satisfiable” provideda complete and clash-free completion graph12 hasbeen constructed or otherwise “unsatisfiable”. Thecomplete and clash-free completion graph can beconsidered as a pre-model based on which wecan construct a tableau (see Figure 6). The sys-tem was implemented in Java using OWL-API

11The language ALCHQ is equivalent to SHQ withouttransitive roles. Since transitive roles are assumed to have

no interaction with qualified number restrictions, they were

not implemented in the prototype reasoner.12Notice that since the input of the reasoner is not an

Abox, the algorithm constructs a completion graph rather

than a completion forest.

Algorithm 1 main(state1 )root = state1

preprocess(state1 )if expand(state1 ) = TRUE then

return satisfiableelse

return unsatisfiable

Algorithm 2 expand(state)if state contains clash then

updateVariables(Backtracking Strategy)else

for all rule in Rule-Priority-List doexpandedList = apply(state, rule)if expandedList = null then

close(state)if state has no clash then

return TRUEelse

for all state ′ in expandedList doexpand(state ′)

2.1.1 which is a Java interface and implementa-tion to parse the W3C Web Ontology LanguageOWL [17]. Although choosing Java as the pro-gramming language gave us the opportunity to uti-lize the OWL-API, the performance of the rea-soner was significantly affected by the overheaddue to Java features such as garbage collection, nomajor (compiler-level) optimizations in numericalfunctions, and inefficiencies in representing frac-tional numbers. On the other hand, no other majoroptimization techniques were implemented.

The overall structure of the control flow for thewhole system is sketched in Algorithms 1 and 2.The hybrid algorithm, by expanding the nodes us-ing the implemented completion rules, tries to finda clash-free completion graph for the input con-cept. Also, whenever it encounters a clash, the hy-brid algorithm prunes the search space based onthe backtracking strategy it had adopted. The fol-lowing sections describe the two major modules inmore detail.

5.1. Logical Module

The logical module can be considered as themain module which performs the completion rulesand calls the arithmetic reasoner whenever needed.It is composed of a preprocessing component which


Completion Rules:

Completion Rule

Strategy

Fig. 9. Detailed architecture of the logical module.

modifies the input ontology (.owl file) based onthe unQ function (see Definition 3). Therefore, itreplaces qualified number restrictions with equi-satisfiable unqualified ones which are also trans-formed to negation normal form. Notice that theconverted language is not closed under negation.Accordingly, the reasoner never negates a conceptexpression that is a direct or indirect output ofthe preprocessing component. Moreover, the logi-cal reasoner as illustrated in Figure 9 is composedof a set of completion rules, clash strategy compo-nent, and some other auxiliary components.

The major data structure in the logical mod-ule is a state which records the state of the com-pletion graph. The logical reasoner builds a treeof states such that firing a deterministic rule cre-ates only one child for a state. On the other hand,the application of a nondeterministic rule (suchas t-Rule) can generate more than one child fora state. For example, if the reasoner fires the t-Rule for C1 t C2 t · · ·Cn for an individual x inthe current state, state1 , it will have n childreneach of which contains one of the disjuncts in L(x).In other words, each state contains a unique com-pletion graph and if we had no nondeterministicrule, the output would be a single path of states.Moreover, every state contains all the information

Algorithm 3 canApply(state, individual , rule)if state contains individual for which rule is ap-plicable then

return trueelse

return false

Algorithm 4 apply(state, rule)newState ← Copy(state)newState.parent ← statefor all individual in newState do

if canApply(newState, individual , rule) thennewInd ← apply rule on indvidualreplace individual with newInd in newState

return newState

about its individuals, including their label, theircardinality, and the label of their edges.

The set of implemented completion rules isbased on the completion rules presented in Figure2. However, since the logical module has no in-formation regarding the variables and inequations,the ch-Rule is implemented in the arithmetic mod-ule. All the rules follow the general templates inAlgorithms 3 and 4. Each rule has a preconditionto be applicable to a state. Moreover, after its ap-plication, a rule modifies a copy of the currentstate to create a new state which will be a childof the current state. Furthermore, each rule willbe fired for all individuals for which it is applica-ble. The logical module tries to apply the comple-tion rules in accordance to their priority to everystate that is not closed. A state is called closed(clashed) if its corresponding completion graph iscomplete (contains a clash). If no rule is applicableto a state, it will be closed. If all of the states haveclashed and are closed, the input concept expres-sion is unsatisfiable.

In the following, we assume that the currentstate on which a rule is applied is called state1 .There are two variations of the set of completionrules according to the use of backtracking.

Without backtracking: There are two rules whichtogether function as the fil -Rule, the ch-Rule, the≤-Rule, and the ≥-Rule:

– The Collect-And-Initiate-Rule collects all un-qualified number restrictions in the label ofeach individual in state1 and calls the arith-metic reasoner. The arithmetic reasoner com-


putes all cases for the variables based on thech-Rule and returns all possible non-negativeinteger solutions in a list. This rule stores thelist of solutions in state2 which is a child ofstate1 .

– The Build-Arithmetic-Results-Rule which is anondeterministic rule, creates successors of anindividual based on the solutions provided bythe Collect-And-Initiate-Rule (similar to thefunction of the fil -Rule). Therefore, it is ap-plied to state2 and creates a new state for eachsolution. For example, if there exist n differ-ent solutions for an individual x in state2 , thisrule creates n new states as children of state2

and in each of them expands one solution.

With backtracking: In this case the reasoner doesnot search for all solutions at once. In fact, it as-sumes the first solution will end up with a clash-free graph and if this assumption fails, it will mod-ify its knowledge about the variables and try tosearch for a new solution. There are two rules re-sponsible for this task:

– The Collect-And-Create-Rule, similar to theCollect-And-Initiate-Rule with the lowest pri-ority, collects all number restrictions for eachindividual in state1 . Furthermore, it calls thearithmetic reasoner which returns the first so-lution it finds and generates successors of in-dividuals in state2 based on this solution.

– For a detailed description of the Build-in-Sibling-Rule, assume an individual x in state1

that has a set of number restrictions accord-ing to which the Collect-And-Create-Rule hascreated a set of successors y1, y2, . . . , yn instate2 . We call state1 the generating state ofthe yi. All of the rules may modify labels ofthe yi in the succeeding states. For instance,if y1 ends up with a clash in all the paths ofstates starting from state2 , the reasoner canconclude that y1 caused a clashed state andtherefore the corresponding solution in state1

is not valid and cannot survive. The Build-in-Sibling-Rule is applicable to the clashed statesthat are closed (i.e., cannot be expanded inanother way according to the Or-Rule). Whenthis rule finds an individual such as y1 ina clashed state, it determines its generatingstate which is state1 in this case. Furthermore,it calls the arithmetic reasoner in state1 andsets the variable related to y1 to zero and gets

a new solution. Since all the paths from state2will end up with a clash and no rule is applica-ble to it, state2 will be automatically closed.Afterwards, this rule will generate new suc-cessors of x in a new child state of state1 (ifany solution exists).

The Completion Rule Strategy component im-poses an order on the rules and they are priori-tized by their order as in the following. In otherwords, the logical reasoner, before applying a rulein state1 , ensures that no rule with a higher prior-ity is applicable.

1. The For-All-Rule (∀-Rule).2. The For-All-Subtract-Rule (∀\-Rule).3. The And-Rule (u-Rule).4. The Or-Rule (t-Rule).5. The Build-Arithmetic-Results-Rule or the

Build-in-Sibling-Rule.6. The Collect-And-Initiate-Rule or the Collect-

And-Create-Rule.

For example, assume we have x, y as two indi-viduals and we have C1 t C2 t C3 ⊆ L(x) andD1 tD2 ⊆ L(y) in state1 . If none of the firstthree rules is applicable to any of the individualsin the state1 , the Or-Rule checks if C1, C2, andC3 are not in L(x) (or similarly if D1 and D2 arenot in L(y)). Therefore, the Or-Rule is applicableto state1 and creates 6 states as children of state1

such that in each of them one of the Cis and oneof the Djs is selected. In other words, in one ap-plication of the Or-Rule, 6 new states are created.

The structure of the For-All-Rule, the For-All-Subtract-Rule, the And-Rule, and the Or-Rule issimilar to their relevant tableau rule. However, thefunctioning of the last two rules is slightly differ-ent from their corresponding tableau rule. Con-sider the case when we have simple backtrackingenabled and the logical module uses the Build-in-Sibling-Rule and the Collect-And-Create-Rule.For example, if ≥ 2R, ≤ 4S, ≥ 3T ⊆ L(x) instate1 and x has no successors, the Collect-And-Create-Rule passes the set ≥2R, ≤4S, ≥3T tothe arithmetic reasoner and receives either “no so-lution” which means that state1 contains a clashor the first non-negative integer solution that thearithmetic reasoner finds. Assume the first solu-tion found by the arithmetic reasoner is of theform v1 = 2, v2 = 1 such that α(v1) = R,S, Tand α(v2) = T, S (see Figure 10).


Fig. 10. Illustration of the rules application when backtrack-

ing.

Afterwards, the Collect-And-Create-Rule cre-ates a new state state2 as a child of state1 . Instate2 , it generates two new individuals x1 and x2.It asserts R and S in the label of the role from xto x1 and similarly R, S, and T in the label of therole from x to x2. Also, it sets card(x1) = 2 andcard(x2) = 1. Notice that all information in state1

will be copied to state2 before generating any newindividual.

Later on, assume the individual x1 ends up witha clash in statei+1 and all other possible states(such as in statei). Therefore, the Build-in-Sibling-Rule will be invoked for statei+1 . This Rule setsv1 = 0 and calls the arithmetic reasoner for an-other solution which will be generated in anotherchild of state1 , state3 . Whenever the arithmeticreasoner cannot find another solution for the listof numerical restrictions for x, the state1 will clashand the logical reasoner must search in anotherbranch for a closed and clash-free state whichtherefore contains a complete and clash-free graph.Figure 10 illustrates the functioning of these tworules in this example.

Fig. 11. Detailed architecture of the arithmetic module.

Another component in the logical module is theClash Strategy Component which triggers a clashfor an individual x whenever (i) A,¬A ⊆ L(x)for a concept name A, or (ii) an arithmetic clash isdetected in the arithmetic component. The logicalmodule returns the first non-clashed and closedstate it finds as a complete and clash-free graph.Otherwise it will return “unsatisfiable”.

5.2. Arithmetic Module

The major function of the arithmetic module isto find a non-negative integer solution for a setof unqualified number restrictions. Notice that theimplemented arithmetic module is slightly differ-ent from the arithmetic reasoner proposed in Sec-tion 3. Firstly, in addition to an inequation solver,it implements the ch-Rule. Moreover, it containsa few heuristics to guide the search for a non-negative integer solution. In the following we de-scribe the architecture of the arithmetic modulewhich is illustrated in Figure 11. Furthermore, wedescribe the functionality of each component inmore detail using pseudo code snippets.

5.2.1. Integer Linear ProgrammingA Linear Programming (LP) problem is the

study of determining the maximum or minimumvalue of a linear function f(x1, x2, . . . , xn) sub-


ject to a set of constraints. This set of constraintsconsists of linear inequations involving variablesx1, x2, . . . , xn. We call f the objective (goal) func-tion which must be either minimized or maxi-mized. If all variables are required to have integervalues, the problem is called Integer Programming(IP) or Integer Linear Programming (ILP).

Definition 26 (Integer Programming) Integer Pro-gramming (IP) is the problem of optimizing anobjective (function) f(x1, x2, . . . , xn) = c1x1 +c2x2 + · · · + cnxn + d subject to a set of m lin-ear constraints which can be formulated as: max-imize (minimize) CTX subject to AX ≤ b wherethe xi can only have integer values and XT =[x1 x2 · · · xn], C is the matrix of coefficients in thegoal function, Am×n is the matrix of coefficientsin the constraints, and bT = [b1 b2 · · · bm] containsthe limit values in the inequations.

Simplex: It was proven by Leonid Khachiyan in1979 that LP can be solved in polynomial time.However, the algorithm he introduced for thisproof is impractical due to the high degree of thepolynomial in its running time. The most widelyused and shown to be practical algorithm is theSimplex method, proposed by George Dantzig in1947.13 The Simplex method constructs a polyhe-dron based on the constraints and objective func-tion and then walks along the edges of the polyhe-dron to vertices with successively higher (or lower)values of the objective function until the optimumis reached [5]. Although LP is known to be solv-able in polynomial time, the Simplex method canbehave exponentially for certain problems.

Integer Solution: Solving the linear programmingproblem may not yield an integer solution. There-fore, an additional method is required to guaran-tee that the variables take only integer values inthe solution. There exists two general methods toachieve an integer solution.

1. Branch-and-bound: Whenever a fractionalvalue appears in the solution set, this methodsplits the search into two branches. For exam-ple, if x3 = 2.4, the algorithm splits the cur-rent problem in two different problems suchthat in one of them the new constraint x3 ≤ 2is added to A and in the other one x3 ≥ 3 is

13In fact, Leonid Kantorovich, a Russian mathematicianused a similar technique in economics before Dantzig.

added to A. The optimized solution is there-fore, the maximum (minimum) value of thesetwo branches.Moreover, the algorithm prunes all fruitlessbranches. In other words, whenever a branchcannot obtain a value better than the opti-mum value yet found, the algorithm discardsit.

2. Branch-and-cut: Whenever the optimum so-lution is not integer, the algorithm finds alinear constraint which does not violate thecurrent set of constraints but eliminates thecurrent non-integer solution from the feasibleregion (search space). This linear inequationwhich discards fractional region of the searchspace is called cutting plane.By adding cutting planes to A, the algorithmtries to yield an integer solution. However,the algorithm may reach a point where it can-not find a cutting plane. Therefore, in orderto complete the search for an optimum inte-ger solution it starts branch-and-bound.

In the implementation of our research proto-type we decided to use branch-and-bound when-ever Simplex obtains a non-integer solution. Sincethe limits (matrix b) are integer values in our case,and the algorithm hardly ends up with a non-integer solution, we decided to avoid the high com-plexity of the branch-and-cut method.

Atomic Decomposition: Let UNR be the set ofinput unqualified number restrictions. After read-ing UNR, the arithmetic module determines thenumber of different number restrictions which willlater be the number of inequations. In the fol-lowing we assume that the size of UNR is equalto n. Therefore, the arithmetic module implicitlyconsiders 2n − 1 variables such that for UNR =〈R1, R2, . . . , Rn〉 we will have Ri ∈ α(vm) if inthe binary coding of m, the ith digit is equal to1. For example, if n = 4 we can conclude thatα(v101) = R1, R3 and α(v1110) = R2, R3, R4.To retrieve the role names related to a variable, thearithmetic module uses the getRelatedRoles func-tion which has the same output as α.

5.2.2. PreprocessingBefore starting the application of the ch-Rule to

search for an arithmetic solution, the arithmeticmodule classifies the variables according to the val-ues that they can take. We define the type of avariable (v type) such that:


Algorithm 5 find-don’t-care-variables(UNR)for i = 1 to n do

if UNR[i] is an at-least restriction thenfor all j = 1 to 2n−1 such that its ith digitin binary coding = 1 do

for k = 1 to n doif (kth digit of j) = 1 ∧ UNR[k] is notan at-most restriction thenv type(vj) = −1

if v type(vj) 6= −1 ∧ v type(vj) 6= 2 thenadd vj to satisfyingVariablesList

– v type(v) = 2 if v must be zero due to logicalreasons and cannot take any value other thanzero,

– v type(v) = 0 if v is decided to be zero bythe ch-Rule which can be changed later by thech-Rule,

– v type(v) = 1 if v is decided to be greater orequal to 1 by the ch-Rule, and

– v type(v) = −1 if v is a don’t care variable andcan be greater or equal zero. In other words,it can get any value except in the case thatlogical reasons impose a type of 2.

The following tasks are performed by Algorithm5 before starting the branching.

Find don’t care variables: We define don’t carevariables as the variables that occur in an at-leastrestriction but in no at-most restriction. Therefore,they are not bounded by any limitations due to theat-most restrictions and can take any value greateror equal zero. Although logical restrictions mayforce them to be zero, the arithmetic restrictionsdo not impose any restrictions on them.

Find satisfying variables: In order to find anarithmetic solution for the input UNR list, thearithmetic module constructs a set of variables,called the satisfying variables on which it will ap-ply the ch-Rule. In fact, these are the variablesoccurring in an at-least restriction which are notnecessarily zero according to the logical reasonsnor the don’t care variables. The find-don’t-care-variables function presented in Algorithm 5 re-trieves don’t care variables and sets their type to-1. Moreover, whenever a variable v is neither don’tcare nor v type(v) = 2, this function adds it to thesatisfying variable list.

Remark It is worth noticing that the orderin which the variables are added to the satisfy-

ing variables list can significantly affect the per-formance of the arithmetic reasoner. In fact, bychoosing the variables first that are least probableto fail (either arithmetically or logically), the rea-soner can speed up the procedure of searching fora complete and clash-free graph.

Therefore, it seems that the variables which leadto the individuals that are less restricted (by theuniversal restrictions created by the unQ func-tion), may be a better choice to be close to thehead of the list. However, by means of the followingexample we will demonstrate why it is not trivialto find an optimal ordering of the variables.

Assume we have 4 UNRs in the input, three ofwhich are at-least restrictions. Therefore, we willhave the following general inequations:v0001 +v0011 +v0101 +v0111 +v1001 +v1011 +v1101 +v1111 ≥ n1

v0010 +v0011 +v0110 +v0111 +v1010 +v1011 +v1110 +v1111 ≥ n2

v0100 +v0101 +v0110 +v0111 +v1100 +v1101 +v1110 +v1111 ≥ n1

v1000 +v1001 +v1010 +v1011 +v1100 +v1101 +v1110 +v1111 ≤ m

In the set of variables from v1 to v1111, the vari-ables with the 1st digit (from right) equal to 1 arerestricted by the universal restriction related toUNR[1] (similarly for the 2nd and the 3rd restric-tion). Likewise, the variables with the 4th digitequal to 0 are restricted by the universal restric-tion related to UNR[4]14.

In this example, we can conclude that the leastrestricted variable is v1000. Nevertheless, not oc-curring in any at-least restriction, this variable isnot even in the satisfying variables list. Anotherchoice could be the case when variables have onlyone at-least restriction such as v1001, v1010 andv1100 (see the underlined variables above). But thiscase is exactly similar to the standard tableau al-gorithms presented in Section 4.1.1. Although theyseem to be logically less restricted, by not shar-ing any individuals between the at-least restric-tions, they are highly probable to fail arithmeti-cally (simply when n1 + n2 + n3 > m).

14For every qualified at-least restriction (≥ nR.C), wewill have ≥ nR′ u ∀R′.C. Thus, the existence of R′ andtherefore, appearance of 1 in its related digit will invoke theuniversal restriction ∀R′.C. However, in case of at-most re-

strictions, we will replace ≤mR.C by ≤mR′ u∀R\R′.¬C.Therefore, the absence of R′ and accordingly, the appear-ance of 0 in its related digit will invoke the universal re-

striction ∀R\R′.C


Algorithm 6 fix-role-hierarchy(R)for all R v S ∈ R do

for all v such that R related to v and S notrelated to v do

set v type(v) = 2

Another strategy could be starting from vari-ables that occur in many at-least restrictions (inthis example v1111) which is the case for the imple-mented arithmetic module. Therefore, (i) we ob-tain a faster arithmetic solution, (ii) we can en-sure a minimum number of successors model prop-erty15, (iii) although the probability of a logicalclash may be high due to many restrictions, bychoosing a highly restricted variable and after hav-ing detected a clash, we can set many more vari-ables to zero to improve backtracking.

Fix role hierarchy: By means of the fix-role-hierarchy function (see Algorithm 6), the arith-metic module sets the type of the variables thatcannot be satisfied due to the role hierarchy to2. More precisely, if R v S and R ∈ α(v) butS /∈ α(v) Algorithm 6 sets v type(v) = 2. This al-gorithm implements the hierarchy-Rule.

Backtracking results: In the case of simple back-tracking (see Section 4.3 for both backtrackingtypes), the arithmetic module only needs to setthe type of the variable related to the clashed in-dividual to 2. In the case of complex backtracking,if the logical module discovers that the existence(absence) of two or more role names in a variablemay cause a clash, the arithmetic reasoner, beforesearching for an arithmetic solution, sets the typeof all similar variables to 2.

Heuristics: In the case where we have no at-mostrestrictions or the numbers occurring in the at-most restrictions are so high that they cannot beviolated by any at-least restriction, there exists atrivial solution. In this case, similar to standardtableau algorithms, we can generate successors ac-cording to the at-least restrictions. More precisely,for each at-least restriction ≥ nR we create n R-successors and we can be sure that this model willnot fail due to number restrictions for these suc-cessors.

15A model has the minimum number of successors prop-erty iff for all of its individuals one cannot reduce the num-

ber of successors without causing a clash.

Algorithm 7 Heuristics(UNR)Assume N UNRs such that UNR[1] to UNR[M]are at-least restrictionsfor i = 1 to M do

sumOfLimits ← sumOfLimits + UNR[i ].limitMinAtMost ←∞for i = M to N do

if UNR[i ].limit < MinAtMost thenMinAtMost ← UNR[i ].limit

if sumOfLimits ≤ MinAtMost thenfor i = 1 to M do

value(v2 i )← UNR[i ].limit

Assuming N UNRs, M of which are at-least re-strictions, the procedure presented in Algorithm 7in fact has the same effect as the standard algo-rithms. For every at-least restriction ≥ nR, thisalgorithm assigns n as the value of v for which wehave α(v) = R. It is worth noticing that in thiscase the algorithm violates the property of creat-ing a model with a minimal number of successors.

5.2.3. BranchingAfter finalizing the satisfying variables list, the

main function starts the application of the ch-Rule. As presented in Algorithm 8, the branchingfunction starts assigning the type of 1 (i.e., beinggreater or equal 1) to the satisfying variables. Ifthere exist k variables in the satisfying variableslist, in order to be complete, the algorithm musttry all the 2k cases regarding the type of the vari-ables. In the case of disabled backtracking, thebranching function tries all 2k cases and returns allnon-negative integer solutions found by the integerprogramming component.

However, when benefiting from backtracking,the algorithm returns to the logical module thefirst non-negative integer solution it finds. If thefound solution logically fails, at least for one vari-able v, v type(v) = 1 changes to v type(v) = 2which later will result in a totally different solutionand the algorithm cannot compute the same solu-tion again and falling in a cycle. If branching doesnot return any solution, the arithmetic module re-turns an arithmetic clash. The branching functionin Algorithm 8 assumes the use of backtracking.Note that different backtracking strategies will re-sult in different values for the satisfyingVL list asinput for the branching algorithm.


Algorithm 8 branching(satisfyingVL)Branch over the list of satisfying variables(satisfyingVL) based on the ch-Ruleif satisfyingVL = ∅ then

return nullelse

inequations ← build inequations(satisfyingVL)result ← IntegerProgramming(inequations)if result 6= null then

return resultelse

branchingVariable ← remove last elementof satisfyingVLv type(branchingVariable)← 1result ← branching(satisfyingVL)if result 6= null then

return resultelse

v type(branchingVariable)← 0return branching(satisfyingVL)

5.2.4. Integer ProgrammingThe integer programming or the equation-solver

component gets a set of linear inequations as input.The goal function is always to minimize the sumof all variables, while all variables must be greateror equal zero. The set of constraints imposed bythe type of the variables will also be part of theinput in form of inequations. In other words, ifv type(v) = 1 for a variable, we will have v ≥ 1 asa part of the input. Notice that in the cases wherev type(v) = 0 or v type(v) = 2 the variable v neverappears in the set of input inequations.

The integer programming component is com-posed of a linear programming algorithm ac-cording to the Simplex method presented in [5]and branch-and-bound to obtain integer solutionswhen the linear solution contains fractional values.

6. Evaluation

In this section we present the empirical resultsobtained from an implemented prototype as de-scribed in the previous section. Before presenting aset of test cases and the results, we briefly discussthe issue of benchmarking in OWL and descriptionlogics. Afterwards, we identify different parame-ters that may affect the complexity of reasoningwith number restrictions. Consequently, based onthese parameters we build a set of benchmarks forwhich we evaluate the hybrid reasoner.

6.1. Benchmarking

One major problem with benchmarking in OWLis due to the fact that there exist not many com-prehensive real-world OWL ontologies to utilizeas benchmarks. In fact, as stated in [37], the cur-rent well-known benchmarks are not well suitedto address typical real-world needs. On the otherhand, qualified number restrictions are expressiveconstructs added to the forthcoming OWL 2 [31].Therefore, the current well-known benchmarks donot contain qualified number restrictions. In fact,to the best of our knowledge there is no real-worldSHQ benchmark available which contains non-trivial qualified number restrictions. Furthermore,our prototype reasoner does not implement anyoptimization technique except the one introducedabove targeting reasoning with the Q-componentof SHQ. So, any SHQ knowledge base requiringother optimization techniques than the one imple-mented in our reasoner would not demonstrate anyimprovements, especially if the difficulty is due tothe missing optimizations techniques and not dueto the occurring number restrictions. Accordingly,we needed to build a set of synthetic test cases toempirically evaluate the hybrid reasoner. Althoughthey are synthetic and did not emerge (yet) fromreal-world ontologies, we claim that they reflectpatterns that are very likely to be encountered inforthcoming OWL 2 ontologies.

For these reasons we focus our evaluation onconcept expressions only containing qualified num-ber restrictions. We identify the following param-eters that may affect the complexity of reasoningwith number restrictions:

1. The size of numbers occurring in number re-strictions. Namely, n and m in the restric-tions of the form ≤nR.C and ≥mR.C.

2. The number of qualified number restrictions.3. The ratio of the number of at-least restric-

tions to the number of at-most restrictions.4. Satisfiability versus unsatisfiability of the in-

put concept expression.

6.2. Evaluation Results

In this section we briefly examine the perfor-mance of the hybrid reasoner with respect to theparameters identified above. Moreover, we presentan evaluation to examine the effectiveness of thetwo proposed backtracking techniques.


Tableau reasoning in expressive DLs is known tobe a very time and memory-consuming procedure.Therefore, in order to remain practical, most ofthe reasoners benefit from numerous optimizationtechniques. A list of more than 70 optimizationtechniques which are widely used in DL reasonersis given in [4].

Well-known reasoners that support qualifiednumber restrictions such as Racer [14], FaCT++[34] or Pellet [33] implement numerous optimiza-tion techniques. Therefore, their performance isnot fairly comparable to this prototype. Accord-ingly, we base our evaluations only partially ona comparison with one of the existing reasonersand focus on the study of the behavior of thehybrid reasoner. Moreover, it is well-known thatbenchmarks can only compare systems but not al-gorithms because benchmark results are often af-fected by factors (e.g., index structures, heuristics,etc) that do not reveal any insight in comparingthe efficiency of algorithms, e.g., the standard vs.hybrid algorithm for number restrictions.

The following experiments were performed un-der Windows XP Professional (32-bit) on a stan-dard PC with an Intel Core 2 Duo E6400 proces-sor and 3 GB of physical memory. To improve theprecision, every test was executed in five indepen-dent runs and the average of these runs is pre-sented. Furthermore, we set the timeout limit to1000 seconds.

6.2.1. Increasing Values of NumbersThe major advantage of benefiting from an

arithmetic method is the fact that reasoning isunaffected by the size of numbers. In fact, ittranslates the number restrictions to a set of in-equations. For example, for the concept expres-sion ≥ 3 hasChild .Female the size of the num-ber is three which is relatively small. However,when expressing the concept (≥141 hasCredit u ≤45 hasCredit .ComputerScience) to model a uni-versity undergraduate engineering program or (≥1200 hasSeat u ≤600 hasSeat .(ArenatGrandCircle))to model the structure of a theater, larger numberscome into play.

In order to observe this major advantage whichis the scalability of the hybrid algorithm with re-spect to the size of the numbers, we decided tocompare its performance with Pellet. The reasonsthat we chose Pellet16 as a representative imple-mentation of the standard algorithm are:

16We used Pellet 2.0 RC7 from June 11, 2009.

– it is a free open-source reasoner that handles(qualified) number restrictions,

– similar to our prototype it is a Java-based im-plementation, and

– in contrast to FaCT++ which sometimesturned out to be unsound when dealing withnumber restrictions, Pellet returned correctanswers in all our experiments.

Furthermore, to the best of our knowledge Pel-let as well as FaCT++ have no specific optimiza-tion technique for dealing with qualified numberrestrictions. Therefore, since the goal is to com-pare implementations of the hybrid and standardalgorithm, we considered Pellet’s implementationof the standard algorithm as a adequate represen-tative for a state-of-the-art DL reasoner.

Test case description: The concept expressionsfor which we executed the concept satisfiabilitytest are as follows (with respect to a role hierar-chy R v T, S v T,RS v R,RS v S where iis a number incremented for each benchmark). Weabbreviate the first concept expression with CSAT

and the second expression with CUNSAT .

(≥2i RS.(A tB)) u (≤ i S.A) u (≤ i R.B)u(≤(i−1)T.(¬A)) t (≤ i T.(¬B)) (CSAT )(≥2i RS.(A tB)) u (≤ i S.A) u (≤ i R.B)u(≤(i−1)T.(¬A)) t (≤(i−1)T.(¬B)) (CUNSAT )

The concept expression CSAT is a satisfiableconcept where for an assertion x :CSAT , the in-dividual x has (2 × i) RS-successors in (A t B),i of which must be in ¬A and i must be in ¬B(according to the at-most restrictions (≤ i S.A)and (≤ i R.B)). Therefore, it can be concludedthat i of them are in (¬A u B) and the otheri successors are in (¬B u A). The disjunction(≤ (i− 1)T.(¬A)) t (≤ i T.(¬B)) can be satisfiedwhen choosing (≤ i T.(¬B)) which is not violateddue to the fact that x has exactly i successors in¬B. According to a similar explanation, since noneof the disjuncts can be satisfied, CUNSAT is an un-satisfiable concept.

In fact, CSAT is not trivially satisfiable, neitherby the hybrid nor the standard algorithm. In thehybrid algorithm, the set of inequations is onlysatisfied in the case where all variables are zeroexcept v, v′ ≥ 1 and α(v) = RS′, S′, T ′ andα(v′) = RS′, R′.17 The standard algorithm in

17Assume R′, S′, RS′, and T ′ are new sub-roles ofR,S,RS, and T respectively.


5

102

2

5

103

2

5

104

2

5

105

2

5

106

Runtime(ms)

2 3 4 5 6 7 8 9 10

in concept expression

Effect of the Size of Numbers: Hybrid vs. Pellet

. . .... . . .

Hybrid-Sat

Hybrid-Unsat. Pellet-Sat

Pellet-Unsat

i C

Fig. 12. Comparing the hybrid with the standard algorithm:

Effect of increasing the value of numbers.

order to examine the satisfiability of CSAT creates(2 × i) RS-successors for x in (A t B) and ac-cording to the three at-most restrictions it opens8 new branches for each successor. However, since(2× i) is much larger than i or i− 1, the reasonermust start merging the extra successors when anat-most restriction is violated which happens in allbranches in this test case.

As illustrated in Figure 12, the linear growthof i from 2 to 10 has almost no effect on the hy-brid reasoner while it dramatically increases theruntime of the standard algorithm for numbers assmall as 6 and 7.18 Moreover, we can observe thatfor i = 6 the satisfiability of CSAT is decided inabout 40s while for CUNSAT , which is only slightlydifferent to CSAT , the time increases to more than1000s. This sudden jump reveals that by decreas-ing i to i− 1 in just one at-most restriction, whichleads to unsatisfiability, the practical complexityof the problem increases tremendously. In Figure13 we zoom into the hybrid part of the Figure 12to present the behavior of the hybrid reasoner inmore detail.

In contrast to the standard reasoner, the perfor-mance of the hybrid reasoner is unaffected by thevalue of the numbers. In Figure 14 we illustratethe linear behavior of the hybrid algorithm withrespect to a linear growth of the size of the num-bers in the qualified number restrictions. Further-more, to assure that this independence will be pre-served also with respect to an exponential growth

18Note that Figure 12 is a log-linear plot and time valuesare in logarithmic scale. In Figure 13 one can observe the

behavior of the hybrid reasoner in a linear plot.

0

50

100

150

200

250

Runtime(ms)

2 3 4 5 6 7 8 9 10


Effect of the Size of Numbers in the Hybrid Approach

Hybrid-Sat

Hybrid-Unsat

i C

Fig. 13. Hybrid reasoner: Effect of increasing the value ofnumbers.

0

20

40

60

80

100

120

140

160

Runtime(ms)

0 10 20 30 40 50 60 70 80 90 100


Effect of Linear Growth of

...... .............

Hybrid-Sat. Hybrid-Unsat

i C

i

Fig. 14. Hybrid reasoner: Effect of linear growth of i.

of i, in Figure 15 we present the performance ofthe hybrid reasoner for i = 10n, n ∈ 1..6.

6.2.2. BacktrackingOne of the major well-known optimization tech-

niques addressing complexity of the reasoning withnumber restrictions is dependency-directed back-tracking or backjumping. In this experiment weobserve the effect of backtracking on the perfor-mance of the hybrid reasoner. In three settings,we first turn off backtracking, second enable sim-ple backtracking, and finally enable complex back-tracking (see Section 4.3).

In order to observe the impact of backtrack-ing, we tested an unsatisfiable concept DUNSAT

which follows the pattern where Dj uDk = ⊥ for1 ≤ j, k ≤ i, j 6= k with respect to a role hierarchyR v T : (≥3R.D1)u. . .u(≥3R.Di)u(≤(3i−1)T ).


0

10

20

30

40

50

60

70

80

90

100

Runtime(ms)

1 2 3 4 5 6


Effect of Exponential Growth of

. . . . . .

Hybrid-Sat. Hybrid-Unsat

log10 i C

i

Fig. 15. Hybrid reasoner: Effect of exponential growth of i.

2

5

102

2

5

103

2

5

104

2

5

105

2

5

106

Runtime(ms)

2 3 4 5 6 7 8 9 10 11

Number of Restrictions

Effect of Backtracking

. . .. . .. .. .

No-Backtracking

Simple. Complex

Fig. 16. Effect of backtracking at different levels.

The assertion x :DUNSAT implies that x has 3R-successors in D1, . . . , and 3 R-successors in Di.Since these 3i successors are instances of mutuallydisjoint concepts we can conclude that x has 3idistinct (not mergeable) successors. Therefore, theat-most restriction in DUNSAT cannot be satisfied.

In this experiment in each step we increase iwhich results in more number restrictions andtherefore a larger number of variables. As thelog-linear plot in Figure 16 suggests, the double-exponential nature of the hybrid algorithm and ingeneral the high nondeterminism of the ch-Rulemakes it inevitable to utilize at least simple back-tracking. Moreover, we can conclude that by usingcomplex backtracking we can improve the perfor-mance of the reasoning significantly. For example,in Figure 16 we can observe that for i = 6 rea-soning without backtracking results in a timeoutwhile benefiting from simple backtracking the rea-

0

50

100

150

200

NumberofClashes

2 3 4 5 6 7 8 9

Number of Restrictions

Effect of Backtracking

. . . . . . .

.Simple. Complex

Fig. 17. Effect of backtracking at different levels: Number

of clashes.

soner concludes unsatisfiability in about 41s andfor complex backtracking the reasoning time is re-duced to 206ms.

In fact, a better method of backtracking canprune a larger number of branches in the searchspace. In other words, the unsatisfiability of a con-cept can be concluded earlier after facing a smallernumber of clashes. In Figure 17, by observing thenumber of logical clashes each method producesbefore returning the result, we can compare theirsuccess in pruning the search space.

6.2.3. Satisfiable vs. Unsatisfiable ConceptsIn this experiment the test cases are con-

cepts containing four qualified at-least restric-tions and one unqualified at-most restriction ac-cording to the following pattern. We abbreviatethese concepts with Ei where R v T for i =1, 20, 40, . . . , 220, 240:

≥30R.(B u C) u ≥30R.(B u ¬C)u≥30R.(¬B u C) u ≥30R.(¬B u ¬C) u ≤ i T

Since the concept fillers of the four at-least re-strictions are mutually disjoint, assuming the as-sertion x :Ei, we can conclude that x must have120 nonmergeable R-successors. According to therole hierarchy R v T , every R-successor is also aT -successor. Therefore, the concept Ei is satisfi-able for i ≥ 120 and unsatisfiable for i < 120.

As illustrated in Figure 18, the standard algo-rithm can easily infer for i ∈ 1..29 that Ei is un-satisfiable since x has at least 30 distinguishedsuccessors. However, from E30 to E120 it becomesvery time-consuming for the standard algorithm tomerge all 120 successors into i individuals. More-over, Figure 18 shows that no matter which value


0

2000

4000

6000

8000

10000

12000

14000

16000

Runtime(ms)

0 20 40 60 80 100 120 140 160 180 200 220 240

in the concept expression

Satisfiable vs. Unsatisfiable: Standard and Hybrid Algorithms

...

.........

.............

Hybrid. Pellet

i Ei

Fig. 18. Standard and Hybrid algorithm: Effect of(un)satisfiability for Ei.

i takes between 30 and 119, the standard algo-rithm performs similarly. In other words, we canconclude that it tries the same number of possi-ble ways of merging which is all the possibilitiesto merge 4 sets of mutually distinguished individ-uals. As soon as i becomes greater or equal 120,since the at-most restriction is not violated, thestandard algorithm simply ignores it and reason-ing becomes trivial for the standard algorithm.

Furthermore, we can conclude from Figure 18that for the hybrid algorithm i = 1 is a trivialcase since not more than one variable can have thetype of v ≥ 1 which is the case that easily leads tounsatisfiability for E1. However, it becomes moredifficult as i grows and reaches its maximum fori = 30 and starts to decrease gradually until i = 70and remains unchanged until i = 120. In fact, thisunexpected behavior did not correspond to the for-mal analysis of the hybrid algorithm and needed tobe analyzed more comprehensively and precisely.

Therefore, we extended our analysis by observ-ing the time spent on arithmetic reasoning andlogical reasoning as well as the number of differ-ent clashes. The reason that i = 30 is a breakingpoint is the fact that for i < 30 no arithmetic so-lution exists for the set of inequations. Therefore,it seems to be very difficult for the arithmetic rea-soner to realize that a set of inequations has nosolution. Moreover, as i grows from 30 to 120, thearithmetic reasoner finds more solutions for the setof inequations which will fail due to logical clashes.In other words, the backtracking in the logical rea-soner seems to be much stronger than in the arith-metic reasoner. Whenever more logical clashes ex-ist, the hybrid reasoner can accomplish the reason-ing faster.

0

50

100

150

200

250

300

350

400

450

500

Runtime(ms)

0 20 40 60 80 100 120 140 160 180 200 220 240

in

Satisfiable vs. Unsatisfiable: Hybrid Algorithm

Hybrid

i !i T.D

Fig. 19. Hybrid algorithm: Effect of (un)satisfiability for Fi.

In order to verify this hypothesis, we built an-other pattern which is slightly different from Ei

and we abbreviate it with Fi where R v T fori = 1, 20, 40, . . . , 220, 240:

≥30R.(B u C uD) u ≥30R.(B u ¬C uD)u≥30R.(¬B u C uD)u ≥30R.(¬B u ¬C uD)u≤ i T.D

The major difference between Ei and Fi is thefact that in Fi the at-most restriction is also aqualified restriction and concept D is added tothe fillers of at-least restrictions. Therefore, the setof inequations always has an arithmetic solution,however, for i < 120 it will logically fail. In otherwords, dependency-directed backtracking discov-ers the unsatisfiability of the concept. Since theclashes and therefore backtracking results are in-dependent from the arithmetic nature of the prob-lem, as presented in Figure 19, the performanceof the hybrid reasoner stays almost constant for1 ≤ i ≤ 240. It is worth noticing that the behaviorof the standard algorithm for Fi remains exactlysimilar to Ei (not shown here).

6.2.4. Number of Qualified Number RestrictionsAccording to the complexity analysis of the hy-

brid algorithm in Section 4.1.2 one can concludethat the number of qualified number restrictionssignificantly influences the complexity of reason-ing. More specifically, the complexity of the hybridalgorithm seems to be characterized by a double-exponential function of the number of number re-strictions in the worst case.

In this experiment we built a concept containingone at-least restriction and extend it gradually. Inorder to keep the ratio between the number of at-least and at-most restrictions fixed, in each step


0

100

200

300

400

500

600

700

800

Runtime(ms)

2 4 6 8 10 12 14 16 18Number of Qualified Number Restrictions

. . . . . . .

.

Fig. 20. Effect of increasing the number of qualified number

restrictions.

we added one qualified at-least restriction and onequalified at-most restriction. In step i the conceptwhich we abbreviate with Gi is of the followingform with respect to the role hierarchy RS v R:

≥20RS u ≥10R.C1 u · · · u ≥10R.Ci u≥5R.(¬C2 t ¬C3) u · · · u ≥5R.(¬Ci t ¬Ci+1) u≤5R.(¬C1 t ¬C2)

Therefore, in each concept Gi we have 2i + 1number restrictions. It is essential for this prob-lem that the roles participating in these numberrestrictions share the same role hierarchy. Other-wise, one could partition different role names fromdifferent role hierarchies and deal with each parti-tion separately. Note that the hybrid algorithm en-counters no clashes when deciding satisfiability ofGi. As presented in Figure 20, the maximum num-ber of qualified number restrictions that the hybridprototype currently can handle (in less than 1000s)is 17 although these concepts are easily satisfiable.The Pellet reasoner could decide the satisfiabilityof each concept Gi with a runtime well below onesecond. It is currently unclear whether this per-formance degradation of the hybrid reasoner is in-evitable or could be avoided by an improved imple-mentation. For instance, Algorithm 5, which findsdon’t-care-variables, still requires O(2n) steps forn at-least restrictions. This dependency might beimproved by better indexing techniques and mightbecome linear to n with a smarter implementation.

6.2.5. Number of At-least vs. At-mostRestrictions

In the following we denote the ratio between thenumber of at-least restrictions and at-most restric-tions by RMin/Max. In addition to the pure num-

0

2000

4000

6000

8000

10000

12000

14000

16000

18000

Runtime(ms)

2-12 3-11 4-10 5-9 6-8 7-7 8-6 9-5 10-4 11-3 12-2

# at-least - #at-most

Effect of the Ratio

Fig. 21. Effect of the ratio between the number of at-leastand at-most restrictions.

ber of number restrictions, the ratio RMin/Max

seems to also affect the complexity of reasoning.Therefore, in this experiment, for a fixed totalnumber of restrictions, we evaluate the perfor-mance of the hybrid prototype with respect to thisratio. The structure of the concept expression issimilar to Gi for which no clashes occur during thereasoning, i.e., the concept expressions are easilysatisfiable.

From Figure 21 we can conclude that the growthof RMin/Max increases the complexity of the rea-soning for the hybrid reasoner while the Pellet rea-soner could decide the satisfiability of these con-cept with a runtime well below one second. In fact,the hybrid reasoner tries to satisfy at-least restric-tions while not violating any at-most restriction.Therefore, the length of the satisfying variableslist, which is the list of variables for which the ch-Rule is applied, depends on the number of at-leastrestrictions.19 Therefore, the more at-least restric-tions exist in Gi, the harder it becomes for thearithmetic reasoner to find a solution for the set ofinequations.

Note that the at-most restrictions are not theonly source of complexity. The fact that the arith-metic reasoner always computes a minimal solu-tion, significantly affects the complexity of the rea-soning even when no at-most restriction exists. Forexample, in Table 2, when Gi has 14 at-least re-strictions and no at-most restrictions, the hybridprototype accomplished the reasoning process inabout 235 seconds. By ignoring the minimal num-ber of successors property, this problem is trivialin the absence of at-most restrictions (see Section5.2.2 where we presented a solution for this ineffi-

19In fact, it contains the variables participating in at leastone at-least restriction.


Table 2

Effect of number of at-least restrictions vs. number of at-most restrictions.

≥ − ≤ 0-14 1-13 2-12 3-11 4-10 5-9 6-8 7-7 8-6 9-5 10-4 11-3 12-2 13-1 14-0

Time (s) 0.17 0.59 1.3 2 2.9 3.7 4.4 5.1 5.9 6.5 7.1 9 16.5 53 235

ciency, which is not yet implemented in our proto-type).

7. Discussion and Future Work

Based on the evaluation results presented in Sec-tion 6 and the complexity analysis in Section 4we identify the following advantages of the hy-brid algorithm in comparison with the standardapproaches:

– Insensitivity to the value of numbers: Accord-ing to the nature of linear integer program-ming, the value of numbers do not affect thehybrid algorithm. More precisely, larger num-bers for the same variable only affect the car-dinality of its relevant proxy individual.

– Comprehensive reasoning: Since the hybrid al-gorithm collects all number restrictions be-fore expanding the completion graph, its so-lution is more comprehensive and thereforemore probable to survive. In other words, incontrast with standard algorithms it nevercreates extra successors which later need tobe merged.

– Structured search space: By means of theatomic decomposition and the variables asso-ciated with partitions, the hybrid approachsearches for a model in a very structured andwell-organized search space. As a result, whenencountering a clash, it can efficiently back-track to the source of the clash and optimallyprune the branches which lead to the sameclash (see Section 4.3).

– Minimal number of successors: According tothe fact that the goal function in the arith-metic reasoner is to minimize the sum of vari-ables, the number of successors generated foran individual is always minimized. In fact, asmentioned in Section 6.2.5, one major sourceof inefficiency in the hybrid reasoning is that itnot only searches for a model but also alwaysfor a model with a minimal number of suc-cessors. This feature of the hybrid algorithm

could be of interest for a set of problems wherethe number of successors affects the qualityof the solution. For example, in configurationproblems, not only a consistent and soundmodel is of interest, but also a model whichrequires less elements and therefore costs lessis of great importance.

The following disadvantages of the hybrid algo-rithm can be observed:

– Exponential number of variables: Accordingto the nature of the atomic decomposition,in order to have mutually disjoint sets, thehybrid algorithm introduces an exponentialnumber of variables. Considering the nonde-terministic ch-Rule, the search for a modelcan become expensive for the algorithm when-ever large numbers of cardinality restrictionsoccur in the label of an individual.

– Long initialization time: The hybrid algo-rithm needs to perform a preprocessing stepbefore starting the algorithm. Moreover, itcollects all number restrictions before gener-ating any successor for an individual. This de-lay is due to the fact that the hybrid algo-rithm spends some time on choosing an effi-cient branch to proceed. However, this initial-ization time is unnecessary for trivially satis-fiable or unsatisfiable concepts.

Considering the advantages and disadvantagesof the hybrid algorithm, one can conclude that thisapproach is inevitable whenever large numbers oc-cur in numbers restrictions. This is also stronglysupported by the results in [7] about a worst-caseoptimal tableau algorithm for SHIQ using alge-braic reasoning and global caching. Moreover, thehybrid algorithm builds a well-structured searchspace which makes it well-suited for non-trivialconcepts. However, in the case of trivially sat-isfiable concepts the current prototype performsslower than the standard algorithms. In otherwords, we can conclude that the overhead in thehybrid algorithm w.r.t. the current implementa-tion is too high for trivial situations. Moreover, the


fact that the number of the successors is minimizedtakes a possibly unnecessary extra effort.

As mentioned in the introduction, the signaturecalculus [13] was designed to address the problemof large numbers in number restrictions. However,it still has two highly nondeterministic rules (tosplit and merge proxy individuals) and does notprocess all at-most restrictions in one step. The al-gebraic method proposed in [30] cannot be consid-ered as a calculus. It neither handles TBoxes witharbitrary axioms or terminological cycles nor di-rectly deals with disjunctions and full negation. Itis unclear how this methodology could be extendedto handle more expressive description logics.

Although the early work presented in [15] andimplemented in Racer [14] deals with SHQ, it isbased on a recursive procedure not suited for for-mal proofs. Furthermore, it needs to examine thesatisfiability of all partitions before initializing theSimplex component and does not support Aboxes.Whenever the number of qualified numbers restric-tions grows and respectively the number of par-titions exponentially grows, Racer becomes veryinefficient.

We strongly believe that the current inefficiencyof our implemented prototype w.r.t. the numberof number restrictions should not be used to cometo the premature conclusion that the hybrid algo-rithm is inefficient or even unusable for problemswith many number restrictions. The inefficiencyis due to the exponential growth of the variablesin the inequations and the straight-forward imple-mentation of the algebraic module which containssome algorithmic parts that still exhibit a best-case exponential behavior. A more advanced im-plementation could use standard techniques fromlinear programming such as (delayed) column gen-eration (e.g., see [6]) that depends on the insightto consider only variables which have the poten-tial to improve the objective function (in our caseminimization of the sum of all variables occurringin the inequations of a given node in the comple-tion forest). We conjecture that such an optimiza-tion technique would greatly improve the efficiencyof the algebraic reasoner in the average case andwould make the algorithmic parts obsolete that arecurrently best-case exponential.

We plan to address the following other topics inour future work.

– Turning off minimality: Since the minimalnumber of successor property is unnecessary

in many cases, we could provide a switchto turn this property on and off. Therefore,whenever it is switched off, we can considerthe least restricted variables first in order tofind a solution faster. For example, if no at-most restriction is violated, similar to thestandard algorithm, we can create n succes-sors for every at-least restriction ≥nR.C.

– Optimizing the arithmetic reasoner: In Sec-tion 6.2.3 we learned that one major source ofinefficiency is due to the non-optimized arith-metic reasoner:

∗ To optimize the arithmetic reasoner wecould support incremental arithmetic rea-soning, i.e., whenever a solution fails due tological reasons and the arithmetic modulemodifies its knowledge about the variables,it does not restart the Simplex procedure.In fact, the Simplex module can continueits search for an integer solution, incremen-tally considering the newly discovered con-straints on the variables.

∗ One other possibility to improve the perfor-mance of the arithmetic reasoner is cachingarithmetic solutions or clashes to avoidsolving the same set of inequations morethan once.

∗ As explained in Section 5, one importantfactor which affects the performance of thearithmetic module significantly, is the or-der of variables in the satisfying-variableslist. Modifying this list according to theinput concept expression and also the re-sults gained during backtracking can helpthe arithmetic module find a surviving so-lution faster.

– Extending to more expressive languages: Thereare two well-known constructors which in-crease the expressiveness of the language:Nominals (O) and inverse roles (I). In thepresence of nominals, implied numerical re-strictions due to nominals affect all individ-uals. A hybrid calculus for ALCOQ is pre-sented in [9,10]. In the presence of inverse rolesthe labels of individuals may be modified atany time. Therefore, a hybrid algorithm han-dling SHIQ will need to deal with incremen-tal updates of labels of individuals due to thebackpropagation of knowledge.


In general, one cannot expect that the algebraictableau approach will work best for all possibleinput, especially since concept and Abox satis-fiability are known to be ExpTime-complete forSHQ. In Section 4.1 we illustrated that the worst-case complexity of the hybrid tableau is a func-tion of the number of variables and, thus, of thenumber of at-least and at-most restrictions, whilethe complexity of the standard tableau is a func-tion of the size of the numbers used in qualifiednumber restrictions and the number of at-most re-strictions. Given this analysis, in the future, afterhaving improved the performance of the algebraictableau approach, it seems to be promising to ex-plore a combination with an alternative calculusfor number restrictions such as the signature cal-culus [13], which pioneered the use of proxy indi-viduals and can be considered as an improvementover standard tableau for dealing with number re-strictions. A heuristic could be devised that de-cides whether the algebraic tableau or the signa-ture calculus should be applied to a given problem.

A calculus equipped with the ability of informedarithmetic reasoning could be a motivation for in-troducing more complex numerical DL construc-tors. As proposed in [30], one possible extensioncould be used to restrict the ratio of cardinalitiesof fillers of two different roles. For example, in ataxonomy describing the structure of a theater, theconcept expression 50 × |hasRoom.WashRoom| ≥|hasSeat .>| could express the restriction that theremust exist at least one washroom for every 50seats. Similarly, a percentage restriction such as≤ 20% hasCredit .Business could describe a con-cept where at most 20% of the hasCredit fillers areinstances of the concept Business.

Since qualified number restrictions can be trans-lated to inequations, such expressions can be trans-formed to linear inequations. However, the de-cidability of a DL extended by such constructorsneeds to be investigated.

Acknowledgements

We express our gratitude to our colleaguesJocelyne Faddoul and Ralf Moller for their valu-able advice during this investigation. We are alsovery thankful for the very detailed and thoughtfulreviews by the anonymous reviewers that greatlyhelped to improve this article.

References

[1] F. Baader, D. Calvanese, D. McGuinness, D. Nardi,

and P. Patel-Schneider. The Description Logic Hand-book, 2nd edition. Cambridge University Press, 2007.

[2] F. Baader, E. Franconi, B. Hollunder, B. Nebel, and

H. Profitlich. An empirical analysis of optimizationtechniques for terminological representation systems

or: Making KRIS get a move on. Applied Artificial In-

telligence. Special Issue on Knowledge Base Manage-ment, 4:109–132, 1994.

[3] F. Baader and U. Sattler. An overview of tableau al-

gorithms for description logics. Studia Logica, 69:5–40,2001.

[4] S. Bechhofer. DIG optimisation techniques, March

2006. http://dl.kr.org/dig/optimisations.html

(last visited July 2009).

[5] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and

C. Stein. Introduction to Algorithms, Second Edition.

The MIT Press, September 2001.

[6] G. Desaulniers, J. Desrosiers, and M. M. Solomon, ed-

itors. Column Generation. Mathematics of Decision

Making. Springer-Verlag, 2005.

[7] Y. Ding. Tableau-based Reasoning for Description

Logics with Inverse Roles and Number Restrictions.

PhD thesis, Department of Computer Science andSoftware Engineering, Concordia University, April

2008. Available at http://users.encs.concordia.ca/

~haarslev/students/Yu_Ding.pdf.

[8] J. Faddoul, N. Farsinia, V. Haarslev, and R. Moller.

A hybrid tableau algorithm for ALCQ. In Proceedings

of the 18th European Conference on Artificial Intelli-gence (ECAI 2008), Patras, Greece, July 21-25, pages

725–726, 2008. An extended version appeared in Pro-

ceedings of the 2008 International Workshop on De-scription Logics (DL-2008), Dresden, Germany, May

13-16, 2008.

[9] J. Faddoul, V. Haarslev, and R. Moller. Hybrid rea-soning for description logics with nominals and qual-ified number restrictions. Technical report, Institute

for Software Systems (STS), Hamburg University ofTechnology, 29 pages, 2008. See also http://www.sts.

tu-harburg.de/tech-reports/papers.html.

[10] J. Faddoul, V. Haarslev, and R. Moller. Algebraictableau algorithm for ALCOQ. In Proceedings of

the 2009 International Workshop on Description Log-ics (DL-2009), Oxford, United Kingdom, July 27–30,2009.

[11] N. Farsiniamarj. Combining integer programmingand tableau-based reasoning: A hybrid calculusfor the description logic SHQ. Master’s thesis,

Concordia University, Department of Computer

Science and Software Engineering, 2008. Availableat http://users.encs.concordia.ca/~haarslev/

students/Nasim_Farsinia.pdf.

[12] J. Gaschnig. Performance Measurement and Analysisof Certain Search Algorithms. PhD thesis, Carnegie-

Mellon University, Pittsburgh, PA, 1979.


[13] V. Haarslev and R. Moller. Optimizing reasoning in

description logics with qualified number restriction.

In Proceedings of the International Workshop on De-scription Logics (DL’2001), Aug. 1-3, 2001, Stanford,

CA, USA, pages 142–151, August 2001.

[14] V. Haarslev and R. Moller. RACER system descrip-

tion. In R. Gore, A. Leitsch, and T. Nipkow, edi-tors, Proceedings of the International Joint Conference

on Automated Reasoning, IJCAR’2001, June 18-23,2001, Siena, Italy, Lecture Notes in Computer Science,

pages 701–705. Springer-Verlag, June 2001.

[15] V. Haarslev, M. Timmann, and R. Moller. Com-

bining tableaux and algebraic methods for reasoningwith qualified number restrictions. In Proceedings

of the International Workshop on Description Logics

(DL’2001), Aug. 1-3, Stanford, USA, pages 152–161,2001.

[16] B. Hollunder and F. Baader. Qualifying number re-

strictions in concept languages. In J. Allen, R. Fikes,

and E. Sandewall, editors, Second International Con-ference on Principles of Knowledge Representation,

Cambridge, Mass., April 22-25, 1991, pages 335–346,

April 1991.

[17] M. Horridge, S. Bechhofer, and O. Noppens. Ignitingthe OWL 1.1 touch paper: The OWL API. In Proceed-

ings of the OWLED 2007 Workshop on OWL: Experi-

ences and Directions, volume 258, Innsbruck, Austria,June 2007.

[18] I. Horrocks. Backtracking and qualified number re-

strictions: Some preliminary results. In In Proc. of the

2002 Description Logic Workshop (DL 2002), pages99–106, 2002.

[19] I. Horrocks. Implementation and optimization tech-

niques. In Baader et al. [1], chapter 9.

[20] I. Horrocks, O. Kutz, and U. Sattler. The even more

irresistible SROIQ. In Proc. of the 10th Int. Conf.on Principles of Knowledge Representation and Rea-

soning (KR 2006), pages 57–67. AAAI Press, 2006.

[21] I. Horrocks and U. Sattler. A tableau decision proce-

dure for SHOIQ. JAR, 39(3):249–276, 2007.

[22] I. Horrocks, U. Sattler, and S. Tobies. Practical reason-ing for expressive description logics. In H. Ganzinger,D. McAllester, and A. Voronkov, editors, Proceed-

ings of the 6th International Conference on Logic forProgramming and Automated Reasoning (LPAR’99),

number 1705 in Lecture Notes in Artificial Intelligence,

pages 161–180. Springer-Verlag, September 1999.

[23] I. Horrocks, U. Sattler, and S. Tobies. Practical reason-ing for very expressive description logics. Logic Journalof the IGPL, 8(3):239–264, 2000.

[24] I. Horrocks, U. Sattler, and S. Tobies. Reason-

ing with individuals for the description logic SHIQ.In D. MacAllester, editor, Proceedings of the 17th

International Conference on Automated Deduction(CADE-17), Lecture Notes in Computer Science,pages 482–496, Germany, 2000. Springer Verlag.

[25] I. Horrocks and S. Tobies. Reasoning with axioms:

Theory and practice. In A. Cohn, F. Giunchiglia, and

B. Selman, editors, Proceedings of Seventh Interna-

tional Conference on Principles of Knowledge Rep-

resentation and Reasoning (KR’2000), Breckenridge,

Colorado, USA, April 11-15, 2000, pages 285–296,April 2000.

[26] Y. Kazakov, U. Sattler, and E. Zolin. How many legs

do I have? Non-simple roles in number restrictions re-

visited. In Proceedings of the 14th International Con-ference on Logic for Programming, Artificial Intelli-

gence, and Reasoning, LPAR 2007, Yerevan, Arme-

nia, Oct. 15-19, volume 4790 of LNAI, pages 303–317.Springer-Verlag, 2007.

[27] H. Lenstra. Integer programming with a fixed num-

ber of variables. Mathematics of Operations Research,

8(4):538–548, November 1983.

[28] E. N. Marieb, K. Hoehn, P. B. Wilhelm, and

N. Zanetti. Human Anatomy & Physiology: Interna-tional Edition with Human Anatomy and Physiology

Atlas, 7/E. Pearson Higher Education, 7 edition, 2006.

[29] H. Ohlbach and J. Koehler. Role hierarchies and num-

ber restrictions. In Proceedings of the InternationalWorkshop on Description Logics (DL-97), Sept. 27 -

29, Gif sur Yvette (Paris), France, 1997.

[30] H. Ohlbach and J. Kohler. Modal logics, description

logics and arithmetic reasoning. Artificial Intelligence,109(1-2):1–31, 1999.

[31] OWL 2 Web Ontology Language: Structural Speci-fication and Functional-Style Syntax, W3C Working

Draft, 02 December 2008, October 2008. http://www.

w3.org/TR/owl2-syntax/ (last visited July 2009).

[32] P. Prosser. Hybrid algorithms for the constraintsatisfaction problem. Computational Intelligence,

9(3):268–299, 1993.

[33] E. Sirin, B. Parsia, B. C. Grau, A. Kalyanpur, and

Y. Katz. Pellet: a practical OWL-DL reasoner. Journalof Web Semantics, 5(2):51–53, 2005.

[34] D. Tsarkov and I. Horrocks. FaCT++ description logicreasoner: System description. In Proc. of the Int. Joint

Conf. on Automated Reasoning (IJCAR 2006), vol-

ume 4130 of Lecture Notes in Artificial Intelligence,pages 292–297. Springer, 2006.

[35] F. van Harmelen, J. Hendler, I. Horrocks, D. L.

McGuinness, P. F. Patel-Schneider, and L. A. Stein.

OWL web ontology language reference, 2003. http:

//www.w3.org/TR/owl-guide/ (last visited July 2009).

[36] M. Y. Vardi. Why is modal logic so robustly decid-

able? In N. Immerman and P. G. Kolaitis, editors,

Descriptive Complexity and Finite Models, volume 31of DIMACS Series in Discrete Mathematics and The-

oretical Computer Science, pages 149–184. AmericanMathematical Society, 1996.

[37] T. Weithoner, T. Liebig, M. Luther, S. Bohm, F. vonHenke, and O. Noppens. Real-world reasoning withOWL. In Proceedings of 4th European Semantic Web

Conference, ESWC 2007, Innsbruck, Austria, June 3-7, 2007, LNCS 4519, pages 296–310. Springer-Verlag,2007.

practical reasoning with quali ed number restrictions: a...

Documents