264 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 25, NO. 1, FEBRUARY 2017

Multi-Category RFID Estimation

Xiulong Liu, Keqiu Li, Alex X. Liu, Song Guo, Senior Member, IEEE ACM, Muhammad Shahzad, Ann L. Wang, and Jie Wu, Fellow, IEEE

Abstract— This paper concerns the practically important problem of multi-category radio frequency identification (RFID) estimation: given a set of RFID tags, we want to quickly and accurately estimate the number of tags in each category. However, almost all the existing RFID estimation protocols are dedicated to the estimation problem on a single set, regardless of tag categories. A feasible solution is to separately execute the existing estimation protocols on each category. The execution time of such a serial solution is proportional to the number of categories, and cannot satisfy delay-stringent application scenarios. Simultaneous RFID estimation over multiple categories is desirable, and hence, this paper proposes an approach called simultaneous estimation for multi-category RFID systems (SEM). SEM exploits the Manchester-coding mechanism, which is supported by the ISO 18000-6 RFID standard, to decode the combined signals, thereby simultaneously obtaining the reply status of tags from each category. As a result, multiple bit vectors are decoded from just one physical slotted frame. Built on our SEM, many existing estimation protocols can be used to estimate the tag cardinality of each category in a simultaneous manner. To ensure the predefined accuracy, we

Manuscript received December 23, 2015; revised April 26, 2016; accepted May 23, 2016; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor S. Chen. Date of publication September 22, 2016; date of current version February 14, 2017. This work was supported in part by the National Science Foundation for Distinguished Young Scholars of China under Grant 61225010; the State Key Program of National Natural Science of China under Grant 61432002; the NSFC under Grants 61472184, 61321491, 61272417, 61300189, and 61370199; the Specialized Research Fund for the Doctoral Program of Higher Education under Grant 20130041110019; the Fundamental Research Funds for the Central Universities under Grant DUT15QY20; the Tianjin Key Laboratory of Advanced Networking, Tianjin, China; the Jiangsu Future Internet Program under Grant BY2013095-4-08; the Jiangsu High-Level Innovation and Entrepreneurship (Shuangchuang) Program; and the Japan Ministry of Information and Communication through the Strategic Information and Communications Research and Development Promotion Program titled “Living Activity and Health Tracking in Smart Home using RF Reflection and Big Data,” 2016-2018. The work of J. Wu was supported by the NSF under Grants CNS 1449860, CNS 1461932, CNS 1460971, CNS 1439672, CNS 1301774, and ECCS 1231461. (Corresponding authors: Keqiu Li and Alex X. Liu.)

X. Liu and K. Li are with the School of Computer Science and Technology, Dalian University of Technology, Dalian 116023, China (e-mail: [email protected]; [email protected]).

A. X. Liu is with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China, and also with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824-1226 USA (e-mail: [email protected]).

S. Guo is with the Department of Computing, The Hong Kong Polytechnic University, Hong Kong (e-mail: [email protected]).

M. Shahzad is with the Department of Computer Science, North Carolina State University, Raleigh, NC 27606 USA (e-mail: [email protected]).

A. L. Wang is with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824-1226 USA (e-mail: [email protected]).

J. Wu is with the Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TNET.2016.2594481

calculate the variance of the estimate in one round, as well as the variance of the average estimate in multiple rounds. To find the optimal frame size, we propose an efficient binary search-based algorithm. To address significant variance in category sizes, we propose an adaptive partitioning (AP) strategy to group categories of similar sizes together and execute the estimation protocol for each group separately. Compared with the existing protocols, our approach is much faster while still satisfying the predefined estimation accuracy. For example, with 20 categories, the proposed SEM+AP is about seven times faster than prior estimation schemes. Moreover, our approach is the only one whose normalized estimation time (i.e., time per category) decreases as the number of categories increases.

Index Terms— RFID, cardinality estimation, multi-category, Manchester coding, adaptive partitioning.

I. INTRODUCTION

A. Background and Problem Statement

Radio Frequency Identification (RFID) has been widely used in many applications such as inventory management [1]–[7], object tracking [8]–[10], and localization [11]–[13]. A typical RFID system consists of readers, tags, and a back-end server. The back-end server controls the reader to interrogate a set of tags, and the tags respond with their IDs over a shared wireless medium. A tag is a microchip with an antenna in a compact package that has limited computing power and communication range. In an RFID-enabled warehouse, there may be thousands of tagged items that belong to different categories, e.g., different places of origin or different brands [14]. Each tag attached to an item has a unique ID that consists of two fields: a category ID that specifies the category of the attached object, and a member ID that identifies this object within its category. A warehouse manager may want to monitor the product stock of each category in a timely manner. If the stock of a category is too high, it may indicate that this category of products is not popular, and the manager needs to adjust the marketing strategy (e.g., lowering prices to increase sales). Conversely, if the stock of a category is too low, the manager should replenish it as soon as possible. Manual checking is laborious and slow: it is difficult for a manager to manually count the number of items in each category when they may be stacked together or placed on high shelves. Hence, it is desirable to exploit the RFID technique to quickly obtain the number of tagged items in each category.

This paper formulates and addresses the practical problem of multi-category RFID estimation. Given a set of RFID tags with $\lambda$ categories denoted by $C_1, C_2, \cdots, C_\lambda$, whose cardinalities are denoted by $n_1, n_2, \cdots, n_\lambda$, respectively, a confidence interval $\alpha \in (0, 1]$, and a required reliability $\beta \in [0, 1)$, we want to estimate the number of tags in each category using one or more readers such that for each $1 \le i \le \lambda$, we have $P\{|\hat{n}_i - n_i| \le n_i\alpha\} \ge \beta$, where $\hat{n}_i$ is the estimate of $n_i$.

1063-6692 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

B. Limitations of Prior Art

The first piece of work that focuses on the tag cardinality estimation problem in multi-category RFID systems is Ensemble Sampling (ES) [14]. In ES, the reader needs to distinguish three types of slots: empty slots, singleton slots, and collision slots. ES exploits the number of singleton slots occupied by each category in a time frame to estimate the tag cardinality of each category. For a collision slot, ES only knows that two or more tags responded in this slot, and nothing else. In other words, ES does not make full use of the collision slots, which lowers its time-efficiency. Sheng et al. and Luo et al. proposed threshold-classification schemes that identify the categories whose sizes are above a predefined threshold value but do not estimate the size of each category [15], [16]. We could use RFID identification and estimation protocols to address our multi-category RFID estimation problem; however, they are not efficient for this purpose. RFID identification protocols can read the IDs of all tags and thus obtain the exact number of tags in each category. However, identification is much slower than estimation. In addition to its low time-efficiency, identifying tags is not permitted at all in some privacy-sensitive applications, because tag IDs transmitted over the air as plaintext could be easily eavesdropped by attackers. Existing RFID estimation protocols (e.g., [17]–[21]) can only estimate the total number of tags in a population, regardless of their categories. To use such protocols to address our multi-category RFID estimation problem, we need to separately execute them on each category. Specifically, the reader can send the SELECT command [22] to activate the tags of a specific category to let them participate in the estimation protocol, while keeping the tags of other categories inactive. For advanced RFID estimation protocols (e.g., [17], [21]), the estimation time is determined by the given confidence interval $\alpha \in (0, 1]$ and required reliability $\beta \in [0, 1)$, instead of the tag population size. Thus, if we use the existing RFID estimation protocols to address our problem, the estimation time grows linearly with the number of categories, which is also inefficient.

C. Proposed Approach

In this paper, we propose an approach called Simultaneous Estimation for Multi-category RFID systems (SEM). At the start of SEM, we inject a so-called single-one string (SO string for short) into each tag. Given $\lambda$ categories, the SO string injected into a tag belonging to the $i$-th category is a vector of $\lambda$ bits where exactly the $i$-th bit is 1 and all other bits are 0s. For example, given 3 categories, the SO strings are 100, 010, and 001, respectively: 100 is injected into the tags of the first category, 010 into the tags of the second category, and 001 into the tags of the third category. Such a string injecting operation can be easily implemented as follows. The reader uses the SELECT command [22] to activate the tags in a specific category while keeping the other tags inactive. Then, the reader broadcasts the corresponding SO string, and the active tags record the received string in their memories. The RFID tags respond to the reader's query with the SO strings, which are modulated by the Manchester coding mechanism. When querying two tags, which are in the $i$-th category and the $j$-th category, respectively, if $i = j$, then the reader obtains a vector of $\lambda$ bits where exactly the $i$-th bit is 1 and all other $\lambda - 1$ bits are 0s; if $i \ne j$, then the reader obtains a vector of $\lambda$ bits where exactly the $i$-th bit and the $j$-th bit are collisions and all other $\lambda - 2$ bits are 0s. Specifically, as illustrated in Fig. 1, 1 is encoded as a falling edge and 0 is encoded as a rising edge in the Manchester coding. If all tags transmit 0 (or 1) at the same time, the reader can successfully recover the bit as 0 (or 1); otherwise, the reader will detect a bit collision x. Thus, from the bit vector that the reader obtains, we know exactly which categories of tags responded in this slot. Note that Manchester coding is supported by the RFID standard ISO 18000-6 [25] for detecting bit-level collisions [26], [27]. Several prior works [9], [28] make use of bit-level synchronization to address RFID application problems.

Fig. 1. Single-one Manchester coding.
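The bit-level combination rule described above can be sketched in a few lines of Python. This is only an illustrative model, not the authors' implementation: the function names `so_string` and `combine` are hypothetical, and the sketch assumes perfectly synchronized replies so that a bit is recovered when all tags agree and reported as the collision symbol `x` otherwise.

```python
from typing import Iterable


def so_string(category: int, num_categories: int) -> str:
    """Return the single-one (SO) string for a 1-based category index."""
    if not 1 <= category <= num_categories:
        raise ValueError("category out of range")
    return "".join("1" if j == category else "0"
                   for j in range(1, num_categories + 1))


def combine(replies: Iterable[str]) -> str:
    """Combine concurrent replies bit by bit.

    A bit position decodes to 0 or 1 when every reply agrees on it,
    and to the collision symbol 'x' when the replies disagree.
    """
    replies = list(replies)
    if not replies:
        raise ValueError("at least one reply is required")
    return "".join(bits[0] if len(set(bits)) == 1 else "x"
                   for bits in zip(*replies))


# Two tags of category 1 give 100; tags of categories 1 and 2 give xx0.
print(combine([so_string(1, 3), so_string(1, 3)]))
print(combine([so_string(1, 3), so_string(2, 3)]))
```

With three categories, combining one tag from each category yields `xxx`, which matches the third-slot example discussed for Fig. 2.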

SEM is based on the standard Framed Slotted Aloha protocol [29] for MAC layer communication. First, the RFID reader initializes a slotted time frame by broadcasting a binary request $\langle\delta, f\rangle$, where $\delta$ is a random seed and $f$ is the frame size (i.e., the number of slots in the forthcoming frame). Each tag randomly chooses a slot in the frame in which to reply with its SO string. Specifically, each tag initializes its slot counter $sc = H(ID, \delta) \bmod f$, which follows a uniform distribution within $[0, f - 1]$. The reader broadcasts the QueryRep command at the end of each slot to inform every tag to decrement its slot counter $sc$ by 1. A tag responds to the reader once its slot counter $sc$ becomes 0. At the end of each frame, the reader obtains an array of $f$ ternary strings where each ternary string has $\lambda$ bits and each bit has a value of 0, 1, or x. We call this array a physical frame. For the $\lambda$-bit ternary string $t_i$ of the $i$-th slot, for each $1 \le j \le \lambda$: if $t_i[j] = 0$, then no tag in category $C_j$ responded in the $i$-th slot; if $t_i[j] = 1$, then only tags in category $C_j$ responded in the $i$-th slot; if $t_i[j] = x$, then more than one tag responded in the $i$-th slot, at least one in category $C_j$ and at least one in another category. Thus, from this physical frame, we can obtain $\lambda$ logical frames, one for each category, where the logical frame for category $C_i$ is the same as the physical frame that the reader would obtain if the tag population only contained the tags in category $C_i$. Fig. 2 shows an example of obtaining $\lambda$ logical frames from a physical frame. For example, in the third slot, the ternary string xxx is the collision result of three types of single-one strings: 100, 010, and 001.

TABLE I

EXISTING TAG ESTIMATION PROTOCOLS THAT CAN LEVERAGE SEM TO ENABLE SIMULTANEOUS ESTIMATION OVER MULTIPLE CATEGORIES

Fig. 2. From one physical frame to λ = 3 logical frames.
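As a rough illustration of how a physical frame arises, the following Python sketch simulates one frame. It is a sketch under stated assumptions rather than the protocol itself: SHA-256 stands in for the unspecified hash $H(ID, \delta)$, the names `slot_index` and `physical_frame` are hypothetical, and an empty slot is represented as an all-zero string.

```python
import hashlib
from typing import List, Tuple


def slot_index(tag_id: str, seed: int, frame_size: int) -> int:
    """Model sc = H(ID, seed) mod f, using SHA-256 as a stand-in hash."""
    digest = hashlib.sha256(f"{tag_id}:{seed}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % frame_size


def physical_frame(tags: List[Tuple[str, int]], num_categories: int,
                   seed: int, frame_size: int) -> List[str]:
    """Build the array of ternary strings observed by the reader.

    `tags` holds (tag_id, category) pairs with 1-based categories.
    """
    # counts[s][j] = number of tags of category j+1 replying in slot s
    counts = [[0] * num_categories for _ in range(frame_size)]
    for tag_id, category in tags:
        counts[slot_index(tag_id, seed, frame_size)][category - 1] += 1

    frame = []
    for slot in counts:
        total = sum(slot)
        symbols = []
        for c in slot:
            if c == 0:
                symbols.append("0")   # no tag of this category replied
            elif c == total:
                symbols.append("1")   # only this category replied
            else:
                symbols.append("x")   # this category collided with another
        frame.append("".join(symbols))
    return frame


tags = [(f"tag{i}", 1 + i % 3) for i in range(30)]
print(physical_frame(tags, num_categories=3, seed=7, frame_size=8))
```

The sketch executes all $f$ slots; terminating the frame after fewer slots would amount to keeping only a prefix of the returned list.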

We now zoom into the logical frame for category $C_i$. Each slot either contains the SO string or nothing. By denoting a slot containing the SO string with 1 and an empty slot with 0, we obtain a bit vector with $f$ bits. Fig. 2 shows the three bit vectors that we obtain. Based on the bit vectors obtained by SEM, many tag estimation protocols, as summarized in Table I, can be used to simultaneously estimate the tag cardinality of each category. For example, the Enhanced Zero-Based estimator (EZB) [19] relies on an important intuition: the fewer tags there are, the more empty slots appear in the frame. Thus, EZB can exploit the number of empty slots in a frame to conduct the tag estimation. Here, we could use the number of 0s in each bit vector as the input of EZB to estimate the number of tags in the corresponding category. Besides EZB, many existing protocols can be built on our SEM to achieve simultaneous estimation over multiple categories, such as FNEB [23], which leverages the index of the first non-empty slot in the frame; LoF [24], which makes use of the length of continuous non-empty slots; and ART [17], which exploits the average run length of non-empty slots.
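Extracting the per-category bit vectors from a physical frame can be sketched as follows. The frame used here is a made-up example (it is not the one in Fig. 2), and the helper names `logical_bit_vectors` and `zero_counts` are hypothetical.

```python
from typing import List


def logical_bit_vectors(frame: List[str], num_categories: int) -> List[List[int]]:
    """Turn a physical frame of ternary strings into one bit vector per category.

    Position j of a slot is 0 when no tag of category j+1 replied;
    both '1' and 'x' mean that at least one such tag replied.
    """
    return [[0 if slot[j] == "0" else 1 for slot in frame]
            for j in range(num_categories)]


def zero_counts(frame: List[str], num_categories: int) -> List[int]:
    """Count the empty slots in each logical frame (the input to EZB)."""
    return [vector.count(0)
            for vector in logical_bit_vectors(frame, num_categories)]


# Hypothetical 5-slot physical frame with 3 categories.
frame = ["100", "000", "xxx", "0xx", "010"]
print(logical_bit_vectors(frame, 3))
print(zero_counts(frame, 3))
```

Each zero count then feeds a zero-based estimator for its own category, so one physical frame serves all categories at once.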

Using our SEM approach, previous RFID estimation protocols can be significantly accelerated when facing the multi-category estimation problem. In the following, we use a numerical example to show this point. Let $t_\gamma$ represent the duration of a slot for transmitting $\gamma$-bit data, which is given by $\tau_w + \gamma \times \tau_b$, where $\tau_w = 302\,\mu s$ is the waiting time and $\tau_b = 18.8\,\mu s$ is the time for transmitting one bit [30], [31]. As there are $\lambda$ categories, in SEM each slot contains a $\lambda$-bit single-one string, i.e., $\gamma = \lambda$. Thus, the time cost of an SEM frame is $f(\tau_w + \lambda \times \tau_b)$. On the other hand, the time cost of executing a frame of an existing protocol once for each category is $\lambda f(\tau_w + \gamma\tau_b)$, where $\gamma = 1$ because the existing protocols only require each slot to carry a single bit to indicate empty or non-empty. The number of slots executed by SEM is therefore much smaller than the total number of slots executed by existing protocols. For example, when $\lambda = 30$, SEM is almost 11 times faster than the existing estimation protocols.
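The numerical example can be reproduced with a short Python sketch. It assumes, as the comparison above does, that both approaches use the same frame size $f$ and the same number of frames, so that $f$ cancels in the ratio; the function names are hypothetical.

```python
TAU_W_US = 302.0   # waiting time per slot, in microseconds
TAU_B_US = 18.8    # time to transmit one bit, in microseconds


def sem_frame_time(frame_size: int, num_categories: int) -> float:
    """Time of one SEM frame: each slot carries a lambda-bit SO string."""
    return frame_size * (TAU_W_US + num_categories * TAU_B_US)


def serial_frame_time(frame_size: int, num_categories: int) -> float:
    """Time of one single-bit frame executed once per category."""
    return num_categories * frame_size * (TAU_W_US + 1 * TAU_B_US)


def speedup(num_categories: int, frame_size: int = 1) -> float:
    """Ratio of serial time to SEM time; the frame size cancels out."""
    return (serial_frame_time(frame_size, num_categories)
            / sem_frame_time(frame_size, num_categories))


print(round(speedup(30), 2))  # about 11.11 for 30 categories
```

For $\lambda = 30$ the ratio is $30 \times 320.8 / 866 \approx 11.1$, in line with the figure quoted in the text.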

D. Challenges and Proposed Solutions

The first key challenge is to guarantee the required estimation accuracy specified by confidence interval $\alpha \in (0, 1]$ and required reliability $\beta \in [0, 1)$ for all categories. As the estimation based on one round of SEM has an inherent variance due to its probabilistic nature, we execute multiple rounds of SEM to reduce the variance of the estimate of each category. To ensure that SEM achieves the required accuracy, we first calculate the variance of the estimate for one round and the variance of the average estimate in multiple rounds. Then, we use statistical methods to find the minimum number of rounds that can achieve the required accuracy.

The second key challenge is to choose an optimal broadcast frame size $f$ that minimizes the estimation time. In SEM, the reader broadcasts a large frame size $f$, but terminates the frame after executing only the first $f'$ slots. Normally, $f' \le 512$. Optimal configurations of $f'$ and $f$ are crucial to the performance of SEM. Optimizing $f'$ is relatively easy, because it has a small value range and even simple enumeration is workable. However, $f$ may have a wide range of possible values, and thus enumeration is infeasible. We show that the execution time is a convex function with respect to $f$, based on which we propose an efficient binary search-based algorithm to find the optimal value of $f$.

The third key challenge is to deal with categories that vary significantly in size. To minimize the estimation time, categories with small sizes demand a small frame size, whereas categories with large sizes demand a large frame size. To address this issue, we propose an Adaptive Partitioning (AP) strategy to group categories of similar sizes together and execute SEM for each group separately. Although this requires executing SEM more times, the estimation time for each group is well optimized because the categories in each group have similar sizes. Such a hybrid strategy has a smaller estimation time than the two extreme strategies of estimating each category separately and estimating all categories together. As we do not know category sizes in advance, we adaptively partition the categories based on the execution of previous rounds.

E. Novelty and Advantage Over Prior Art

The key technical novelty of this paper lies in proposing a single-one Manchester coding-based approach called SEM, built on which traditional tag estimation protocols can be used to address the multi-category RFID estimation problem in a simultaneous manner. The key technical depth of this paper is in the mathematical development of SEM in addressing the three technical challenges of guaranteeing accuracy, choosing frame sizes, and partitioning categories. The key advantage of our approach over prior art is that SEM can decode multiple bit vectors from just one physical frame to simultaneously estimate the tag cardinality of each category. Compared with the prior separate estimation methods, our SEM approach significantly reduces the number of physical slots, and thus achieves much better time-efficiency. For example, for an RFID system with 20 categories, our SEM+AP uses 2 seconds whereas the state-of-the-art ART protocol takes 14 seconds [17]. That is, our SEM+AP is 7 times faster than ART. As the number of categories increases, the normalized estimation time of our approach decreases, whereas that of prior estimation protocols does not.

The rest of the paper is organized as follows. In Section II, we present our SEM approach in detail along with its analysis. In Section III, we describe how to calculate the optimal values for system parameters to minimize the estimation time of SEM while achieving the required reliability. In Section IV, we describe how SEM adaptively partitions the categories into comparable sizes to reduce the estimation time. In Section V, we review the related work. In Section VI, we present results from our extensive evaluation of the proposed approach and its comparison with the existing protocols. Finally, in Section VII, we conclude the paper.

II. SEM: ESTIMATOR AND VARIANCE

To estimate the number of tags in each category, SEM executes multiple Aloha frames. At the end of each frame, it obtains a bit vector for each category. Based on the obtained bit vectors, SEM can apply any estimator listed in Table I to simultaneously estimate the number of tags of each category. Here, for the purpose of clarity, we let SEM use the most classical estimator, that of EZB [19]. Note that, if more advanced estimators such as ART [17] or SRC [21] are used, the performance of SEM can be further improved. The insight behind EZB is that the fewer tags there are, the more empty slots appear in the frame. Hence, we make use of the number of 0s in each bit vector to perform the estimation. The estimate obtained from the number of 0s observed in a single bit vector is not accurate due to the variance associated with the estimation process. Thus, instead of executing a single round, SEM executes $k$ rounds and obtains $k$ estimates of the number of tags in that category. It then calculates the average of those $k$ estimates to obtain a fine-grained estimate. Next, we first formally derive the estimator that SEM uses to estimate the size of any given category, using the number of 0s in the bit vector corresponding to that category as input. Then, we derive the expression for the variance of the estimator, which we will use in Section III to determine the values of system parameters so that SEM achieves the required reliability in the minimum possible time. Table II summarizes the main notations used in this paper.

TABLE II

MAIN NOTATIONS USED IN THE PAPER

A. Estimator

Let $n_i$ represent the number of tags in category $C_i$. Let $f$ represent the number of slots that the reader broadcasts at the start of the frame. We call $f$ the broadcast frame size. Let $p_{i,0}$ represent the probability that any bit in the bit vector of category $C_i$ is 0. Formally, for large values of $f$, the probability $p_{i,0}$ is given by the following equation.

$$p_{i,0} = \left(1 - \frac{1}{f}\right)^{n_i} \approx e^{-\frac{n_i}{f}} \tag{1}$$

This approximation is commonly made in previous literature [17], [32].

Let the reader terminate the frame after executing $f'$ slots, where $f' \le f$. We call $f'$ the executed frame size. Let $N_{i,0}$ be the random variable for the number of 0s observed in the first $f'$ bits of the bit vector of category $C_i$. As the probability for any bit to be 0 is $p_{i,0}$, the random variable $N_{i,0}$ follows the binomial distribution $Binom(f', p_{i,0})$. Thus, the expected value of $N_{i,0}$ is given by the following equation.

$$E(N_{i,0}) = f' \cdot p_{i,0} = f' e^{-\frac{n_i}{f}} \tag{2}$$


Solving Eq. (2) for $n_i$, we get the following equation.

$$n_i = -f \ln\left\{\frac{E(N_{i,0})}{f'}\right\} \tag{3}$$

This equation shows that for fixed given values of $f$ and $f'$, $n_i$ is a monotonically decreasing function of $E(N_{i,0})$. Thus, we can estimate the value of $n_i$ by substituting $E(N_{i,0})$ in the equation above by the observed value of $N_{i,0}$ from the logical frame of category $C_i$. Substituting $N_{i,0}$ for $E(N_{i,0})$ in Eq. (3), we get the estimator $\hat{n}_i$ of $n_i$ as follows.

$$\hat{n}_i = -f \ln\left\{\frac{N_{i,0}}{f'}\right\} \tag{4}$$
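The estimator in Eq. (4) translates directly into code. The following is a minimal sketch; the function name `estimate_cardinality` and the input validation are assumptions added for illustration (Eq. (4) is undefined when no empty slot is observed).

```python
import math


def estimate_cardinality(zeros: float, f: int, f_exec: int) -> float:
    """Eq. (4): n_hat = -f * ln(N_0 / f').

    zeros  -- number of 0s observed in the first f' bits of the bit vector
    f      -- broadcast frame size
    f_exec -- executed frame size f'
    """
    if not 0 < zeros <= f_exec:
        raise ValueError("need 0 < zeros <= f_exec for the logarithm to apply")
    return -f * math.log(zeros / f_exec)


# Feeding the expected zero count of Eq. (2) recovers the true size.
n, f, f_exec = 100, 68, 68
expected_zeros = f_exec * math.exp(-n / f)
print(estimate_cardinality(expected_zeros, f, f_exec))
```

In a real frame the observed zero count is an integer that fluctuates around its expectation, which is why the variance analysis in the next subsection is needed.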

B. Variance

The following lemma calculates the variance of the estimator derived in Eq. (4).

Lemma 1: Let $f$ and $f'$ be the broadcast and executed frame sizes, respectively, and $n_i$ be the number of tags in category $C_i$. The variance of the estimate $\hat{n}_i$ of $n_i$ is given by the following equation.

$$Var(\hat{n}_i) = \frac{f^2}{f'}\left(e^{\frac{n_i}{f}} - 1\right) \tag{5}$$

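Proof: According to Eq. (4), $\hat{n}_i$ is a function of the random variable $N_{i,0}$. Thus, we express $\hat{n}_i$ as $\phi(N_{i,0})$. The first-order Taylor series expansion of $\phi(N_{i,0})$ around $\eta = E(N_{i,0})$ is given by the following equation.

$$\hat{n}_i = \phi(N_{i,0}) = \phi(\eta) + \frac{\partial\phi}{\partial N_{i,0}}(N_{i,0} - \eta),$$

where $\frac{\partial\phi}{\partial N_{i,0}}$ is the first-order derivative. Taking the expectation of both sides of the equation above, we have:

$$E[\hat{n}_i] = E[\phi(\eta)] + \frac{\partial\phi}{\partial N_{i,0}} E[(N_{i,0} - \eta)] = \phi(\eta)$$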

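The variance of $\hat{n}_i$ can now be calculated using the following expression.

$$Var(\hat{n}_i) = E\left[\left(\hat{n}_i - E(\hat{n}_i)\right)^2\right] = \left[\frac{\partial\phi}{\partial N_{i,0}}\right]^2 Var(N_{i,0}) \tag{6}$$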

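As required by the equation above, we next calculate the first-order derivative $\frac{\partial\phi}{\partial N_{i,0}}\big|_{N_{i,0}=\eta}$ and the variance $Var(N_{i,0})$.

$$\frac{\partial\phi}{\partial N_{i,0}} = -f \times \frac{f'}{N_{i,0}} \times \frac{1}{f'}$$

Replacing $N_{i,0}$ by $\eta = E(N_{i,0}) = f' e^{-\frac{n_i}{f}}$ in the equation above, we get:

$$\frac{\partial\phi}{\partial N_{i,0}}\bigg|_{N_{i,0}=\eta} = -\frac{f}{f'} e^{\frac{n_i}{f}}$$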

As $N_{i,0} \sim Binom(f', p_{i,0})$, the variance $Var(N_{i,0})$ is given by the following equation.

$$Var(N_{i,0}) = f' p_{i,0}(1 - p_{i,0}) = f' e^{-\frac{n_i}{f}}\left(1 - e^{-\frac{n_i}{f}}\right) \tag{7}$$

Substituting the expressions of $\frac{\partial\phi}{\partial N_{i,0}}\big|_{N_{i,0}=\eta}$ and $Var(N_{i,0})$ into Eq. (6), we get the variance of $\hat{n}_i$ as given in Eq. (5) in the lemma statement. □
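For readers who want to see the final substitution spelled out, the following worked step combines the derivative evaluated at $\eta$ with Eq. (7) inside Eq. (6). It only expands algebra already implied by the proof and introduces no new assumptions.

```latex
\begin{aligned}
Var(\hat{n}_i)
  &= \left[-\frac{f}{f'}\,e^{\frac{n_i}{f}}\right]^2
     \cdot f' e^{-\frac{n_i}{f}}\left(1 - e^{-\frac{n_i}{f}}\right) \\
  &= \frac{f^2}{f'^2}\,e^{\frac{2n_i}{f}}
     \cdot f' e^{-\frac{n_i}{f}}\left(1 - e^{-\frac{n_i}{f}}\right) \\
  &= \frac{f^2}{f'}\,e^{\frac{n_i}{f}}\left(1 - e^{-\frac{n_i}{f}}\right) \\
  &= \frac{f^2}{f'}\left(e^{\frac{n_i}{f}} - 1\right)
\end{aligned}
```

The last line is exactly Eq. (5).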

III. SEM: SYSTEM PARAMETERS

Recall from Section II that SEM executes $k$ frames to estimate the number of tags in each category. Next, we first derive the expression to calculate the value of $k$, which ensures that SEM achieves the required reliability. After that, we derive expressions to calculate the broadcast frame size $f$ and the executed frame size $f'$, which ensure that the execution time of SEM is minimized.

A. Number of Frames k

Let $\hat{n}_{i,j}$ represent the estimate of the number of tags in category $C_i$ obtained from the $j$-th frame. Let $A_k(\hat{n}_i)$ represent the average of the $k$ estimates obtained from the $k$ frames, i.e., $A_k(\hat{n}_i) = \frac{1}{k}\sum_{j=1}^{k}\hat{n}_{i,j}$. In what follows, Theorem 1 calculates the value of $k$ which ensures that the average estimate satisfies the required reliability.

Theorem 1: Given required confidence interval $\alpha$, required reliability $\beta$, broadcast frame size $f$, and executed frame size $f'$, the average estimate $A_{k_i}(\hat{n}_i)$ of the number of tags in category $C_i$ satisfies the requirement $P\{|A_{k_i}(\hat{n}_i) - n_i| \le n_i\alpha\} \ge \beta$ when the average is obtained from $k_i$ frames, where $k_i$ satisfies the following inequality.

$$k_i \ge \left(\frac{f Z_\beta}{\alpha n_i}\right)^2 \left(\frac{e^{\frac{n_i}{f}} - 1}{f'}\right) \tag{8}$$

Proof: As SEM uses different seeds for each frame, the $k_i$ frames are independent of each other. According to the central limit theorem, $\frac{A_{k_i}(\hat{n}_i) - E[A_{k_i}(\hat{n}_i)]}{\sqrt{Var[A_{k_i}(\hat{n}_i)]}}$ is a random variable that follows the standard normal distribution. Let us represent this random variable by $\aleph$. As $\aleph$ follows a standard normal distribution, for any required reliability $\beta$, there exists a number $Z_\beta$ such that

$$P(-Z_\beta \le \aleph \le Z_\beta) = \beta \tag{9}$$

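The requirement $P\{|A_{k_i}(\hat{n}_i) - n_i| \le n_i\alpha\} \ge \beta$ can be written as below.

$$P\left\{\frac{(1-\alpha)n_i - E[A_{k_i}(\hat{n}_i)]}{\sqrt{Var[A_{k_i}(\hat{n}_i)]}} \le \aleph \le \frac{(1+\alpha)n_i - E[A_{k_i}(\hat{n}_i)]}{\sqrt{Var[A_{k_i}(\hat{n}_i)]}}\right\} \ge \beta \tag{10}$$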

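Comparing Eqs. (9) and (10), SEM will achieve the required reliability when the following conditions hold.

$$\begin{cases}
\dfrac{(1-\alpha)n_i - E[A_{k_i}(\hat{n}_i)]}{\sqrt{Var[A_{k_i}(\hat{n}_i)]}} \le -Z_\beta \\[2ex]
\dfrac{(1+\alpha)n_i - E[A_{k_i}(\hat{n}_i)]}{\sqrt{Var[A_{k_i}(\hat{n}_i)]}} \ge Z_\beta
\end{cases} \tag{11}$$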

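Next we calculate the expectation and variance of $A_{k_i}(\hat{n}_i)$.

$$\begin{aligned}
E[A_{k_i}(\hat{n}_i)] &= \frac{1}{k_i}\sum_{j=1}^{k_i} E(\hat{n}_{i,j}) = n_i \\
Var[A_{k_i}(\hat{n}_i)] &= \frac{1}{k_i^2}\sum_{j=1}^{k_i} \frac{f^2}{f'}\left(e^{\frac{n_i}{f}} - 1\right) = \frac{f^2}{k_i f'}\left(e^{\frac{n_i}{f}} - 1\right)
\end{aligned} \tag{12}$$

Substituting the expressions for $E[A_{k_i}(\hat{n}_i)]$ and $Var[A_{k_i}(\hat{n}_i)]$ into either of the two inequalities in Eq. (11) and rearranging, we get the inequality in Eq. (8). □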
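To make Eq. (8) concrete, the sketch below computes the smallest integer number of frames that satisfies it. It assumes that $Z_\beta$ is the two-sided standard normal quantile defined by Eq. (9), obtained here from Python's `statistics.NormalDist`; the function names are hypothetical, and the true category size $n_i$ is passed in directly even though in practice only an estimate of it is available.

```python
import math
from statistics import NormalDist


def z_beta(beta: float) -> float:
    """Z such that P(-Z <= N(0,1) <= Z) = beta, as in Eq. (9)."""
    return NormalDist().inv_cdf((1.0 + beta) / 2.0)


def required_frames(n_i: float, f: int, f_exec: int,
                    alpha: float, beta: float) -> int:
    """Smallest integer k_i satisfying the bound in Eq. (8)."""
    z = z_beta(beta)
    bound = (f * z / (alpha * n_i)) ** 2 * (math.exp(n_i / f) - 1.0) / f_exec
    return math.ceil(bound)


# Example: 100 tags, f = f' = 68, (alpha, beta) = (5%, 95%).
print(z_beta(0.95))
print(required_frames(100, 68, 68, alpha=0.05, beta=0.95))
```

Loosening the confidence interval $\alpha$ reduces the number of frames quadratically, which the bound makes explicit through the $1/\alpha^2$ factor.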

B. Frame Sizes f and f ′

For the given values of $f'$ and $f$, Theorem 1 calculates the number of frames that SEM must execute to achieve the required reliability. Next we optimize the values of the executed and broadcast frame sizes to ensure that the estimation time of SEM is minimized.

Let $T_i$ represent the minimum execution time needed by category $C_i$, $t_\lambda$ represent the duration of each slot, and $t_\xi$ represent the time that the reader takes to transmit the $\xi$-bit parameters for frame initialization. Thus,

$$T_i = k_i \times (t_\xi + f' \times t_\lambda) = \frac{(f Z_\beta)^2}{f'(\alpha n_i)^2}\left(e^{\frac{n_i}{f}} - 1\right) \times (t_\xi + f' \times t_\lambda).$$

Let $T$ represent the execution time of SEM for all categories, which should be equal to the longest execution time among all minimum execution times for the $\lambda$ categories.

Next, we first show that $T_i$ is a convex function of $n_i$. To prove convexity, a sufficient condition is that the second-order derivative of $T_i$ with respect to $n_i$ is always larger than 0. The following equation calculates the second-order derivative of $T_i$ with respect to $n_i$.

$$\frac{\partial^2 T_i}{\partial n_i^2} = \frac{f^2 Z_\beta^2 (t_\xi + f' t_\lambda)}{f'\alpha^2}\left[e^{\frac{n_i}{f}}\left(\frac{1}{f^2 n_i^2} - \frac{4}{f n_i^3} + \frac{6}{n_i^4}\right) - \frac{6}{n_i^4}\right] \tag{13}$$

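For simplicity, we substitute $\left(\frac{1}{f^2 n_i^2} - \frac{4}{f n_i^3} + \frac{6}{n_i^4}\right)$ with $\Phi$. Note that

$$\Phi = \frac{1}{f^2 n_i^2} - \frac{4}{f n_i^3} + \frac{6}{n_i^4} \ge 2\sqrt{\frac{1}{f^2 n_i^2} \times \frac{6}{n_i^4}} - \frac{4}{f n_i^3} = \frac{2\sqrt{6} - 4}{f n_i^3} > 0.$$

Furthermore, using the fourth-order Taylor series expansion of $e^{\frac{n_i}{f}}$, we know that $e^{\frac{n_i}{f}} > 1 + \frac{n_i}{f} + \frac{n_i^2}{2f^2} + \frac{n_i^3}{6f^3} + \frac{n_i^4}{24f^4}$. Then, Eq. (13) can be written as the following inequality.

$$\frac{\partial^2 T_i}{\partial n_i^2} > \frac{f^2 Z_\beta^2 (t_\xi + f' t_\lambda)}{f'\alpha^2} \times \left[\left(1 + \frac{n_i}{f} + \frac{n_i^2}{2f^2} + \frac{n_i^3}{6f^3} + \frac{n_i^4}{24f^4}\right)\Phi - \frac{6}{n_i^4}\right]$$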

Substituting the value of $\Phi$ in the inequality above and simplifying, we get

$$\frac{\partial^2 T_i}{\partial n_i^2} > \frac{f^2 Z_\beta^2 (t_\xi + f' t_\lambda)}{f'\alpha^2}\left(\frac{2}{f n_i^3} + \frac{1}{12 f^4} + \frac{n_i^2}{24 f^6}\right) > 0.$$

As this second-order derivative is always greater than 0, $T_i$ is a convex function of $n_i$. Let $C_x$ and $C_y$ be the categories with the fewest and the most tags, respectively, among all $\lambda$ categories. Let $n_x$ and $n_y$ be the number of tags in the categories $C_x$ and $C_y$, respectively. By the property of convex functions, the maximum value of $T_i$ lies at one of the two boundary points, i.e., $(n_x, T_x)$ or $(n_y, T_y)$. Thus, $T = \max\{T_x, T_y\}$. Minimizing the overall time $T$ is equivalent to minimizing $\max\{T_x, T_y\}$. Formally, we need to solve the following optimization problem to find the optimal values of $f'$ and $f$ that minimize $\max\{T_x, T_y\}$.

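$$\begin{aligned}
\text{Minimize}\quad & \max\{T_x, T_y\} \\
\text{s.t.}\quad & f' \in [1, 512] \\
& f' \le f \\
& C_x \text{ is the smallest category under estimation} \\
& C_y \text{ is the largest category under estimation} \\
& T_i = \frac{(f Z_\beta)^2}{f'(\alpha n_i)^2}\left(e^{\frac{n_i}{f}} - 1\right) \times (t_\xi + f' \times t_\lambda), \quad i = x \text{ or } y
\end{aligned} \tag{14}$$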

In the optimization problem formulated in Eq. (14), the executed frame size $f'$ should be no more than 512 due to practical reasons [17]. It is easy to enumerate each possible value of $f'$ to find the optimal one because of its small value range. However, $f$ has a large value range, and the enumeration method is not suitable when optimizing its value. Therefore, we investigate how to quickly optimize the value of $f$ in the following. We first show that $\max\{T_x, T_y\}$ is a convex function of $f$, which means that there is a value of $f$ for which the execution time of SEM is the minimum. Then, we describe a simple binary search-based method to determine the optimal value of $f$. The second-order derivative of $T_i$ with respect to $f$ is given by the following equation.

$$\frac{\partial^2 T_i}{\partial f^2} = \frac{Z_\beta^2 (t_\xi + f' t_\lambda)}{\alpha^2 n_i^2 f'}\left[e^{\frac{n_i}{f}}\left(\frac{n_i^2}{f^2} - \frac{2n_i}{f} + 2\right) - 2\right] \tag{15}$$

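For simplicity, we substitute $\frac{n_i^2}{f^2} - \frac{2n_i}{f} + 2$ with $\Psi$. Note that $\Psi = \frac{n_i^2}{f^2} - \frac{2n_i}{f} + 2 = \left(\frac{n_i}{f} - 1\right)^2 + 1 > 0$. Substituting $e^{\frac{n_i}{f}}$ with its fourth-order Taylor series in Eq. (15) and simplifying, we have the following inequality.

$$\frac{\partial^2 T_i}{\partial f^2} > \frac{Z_\beta^2 (t_\xi + f' t_\lambda)}{\alpha^2 n_i^2 f'}\left(\frac{n_i^3}{3f^3} + \frac{n_i^4}{4f^4} + \frac{n_i^5}{12f^5} + \frac{n_i^6}{24f^6}\right) > 0 \tag{16}$$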

As the second-order derivative of $T_i$ with respect to $f$ is always greater than 0, $T_i$ is a convex function of $f$. Thus, $T_x$ and $T_y$ are both convex functions of $f$. Consequently, $\max\{T_x, T_y\}$ is also a convex function of $f$.

Leveraging this convexity of $\max\{T_x, T_y\}$ with respect to $f$, SEM uses a fast binary search algorithm to find the optimal value of $f$. Given an $f' \le 512$, SEM first initializes $f_{low}$ to $f'$ and $f_{high}$ to $3n_y$. We have observed through simulations that $3n_y$ is a good upper bound on the size of the broadcast frame. Second, SEM calculates the first-order derivative of $\max\{T_x, T_y\}$ at $\frac{f_{low} + f_{high}}{2}$. If this derivative is less than 0, it updates $f_{low}$ to $\frac{f_{low} + f_{high}}{2}$; otherwise, it updates $f_{high}$ to $\frac{f_{low} + f_{high}}{2}$. SEM repeats this search until $f_{low} = f_{high}$, at which point it stops and returns the value of $f$ as $f = f_{low} = f_{high}$.
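The search can be sketched in Python as follows. This is an integer variant of the binary search described above, not a transcription of it: instead of evaluating the derivative at the midpoint, it compares the cost at two adjacent frame sizes, which gives the sign of the slope and guarantees termination on integers. The function names, the example value of $t_\xi$, and the treatment of $k_i$ as a real number are assumptions made for illustration.

```python
import math


def exec_time(n: float, f: int, f_exec: int, alpha: float, z: float,
              t_xi: float, t_slot: float) -> float:
    """T_i from Eq. (14), with k_i treated as a real number."""
    k = (f * z / (alpha * n)) ** 2 * (math.exp(n / f) - 1.0) / f_exec
    return k * (t_xi + f_exec * t_slot)


def worst_time(f: int, n_x: float, n_y: float, f_exec: int, alpha: float,
               z: float, t_xi: float, t_slot: float) -> float:
    """max{T_x, T_y}: the time needed by the slower boundary category."""
    return max(exec_time(n, f, f_exec, alpha, z, t_xi, t_slot)
               for n in (n_x, n_y))


def optimal_broadcast_size(n_x: int, n_y: int, f_exec: int, alpha: float,
                           z: float, t_xi: float, t_slot: float) -> int:
    """Binary search for the integer f in [f', 3*n_y] minimizing max{T_x, T_y}."""
    lo, hi = f_exec, max(f_exec, 3 * n_y)
    while lo < hi:
        mid = (lo + hi) // 2
        here = worst_time(mid, n_x, n_y, f_exec, alpha, z, t_xi, t_slot)
        nxt = worst_time(mid + 1, n_x, n_y, f_exec, alpha, z, t_xi, t_slot)
        if nxt < here:      # negative slope: the minimum lies to the right
            lo = mid + 1
        else:               # non-negative slope: the minimum is at mid or left
            hi = mid
    return lo


# Hypothetical setting: two categories of 100 and 110 tags, f' = 68.
params = dict(n_x=100, n_y=110, f_exec=68, alpha=0.05, z=1.96,
              t_xi=0.002, t_slot=339.6e-6)
print(optimal_broadcast_size(**params))
```

Because the cost is convex in $f$, the loop needs only about $\log_2(3n_y - f')$ iterations; the outer optimization over $f'$ can then simply call this function for each candidate $f' \in [1, 512]$.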

C. Dynamic Parameter Adjusting

To calculate the optimal values of system parameters, our proposed methods assume that SEM already knows the size of each category a priori. However, the category sizes are unknown a priori and are actually the quantities we need to estimate. Next, we present how to obtain rough estimates of category sizes, which are then used to calculate the optimal values of system parameters.

Fig. 3. Separate estimation vs. simultaneous estimation in a balanced RFID system that contains two categories C1 and C2 with sizes of 100 and 110 tags, respectively. (α, β) = (5%, 95%). (a) SEM on C1. (b) SEM on C2. (c) SEM on C1 and C2.

Before executing the first frame, SEM sets the size of the smallest category to $n_{min}$ and the largest category to $n_{max}$, where $n_{min}$ and $n_{max}$ are the lower and upper bounds on category sizes, respectively, and are provided by the system administrator. Using $n_{min}$ and $n_{max}$ as inputs, we calculate the broadcast frame size $f$ using the binary search-based method proposed above. Note that our binary search-based method is not sensitive to the rough values of $n_{min}$ and $n_{max}$ because the system parameter values converge to their near-optimal values after only a few frames. After executing $\kappa > 1$ frames, we get the average estimate $A_\kappa(\hat{n}_i)$ for each category $C_i$. This $A_\kappa(\hat{n}_i)$ is then used in Eq. (8) to calculate the number of frames that should be executed.

D. Avoiding Premature Termination

As we calculate the number of times the frames are executed (i.e., k_i) using the estimated value A_κ(n̂_i), which is not very accurate when κ is small, the value of k_i may be smaller than what it should be. Consequently, SEM may stop after executing fewer frames than it should have executed, causing the estimated size of category C_i to fall short of the required reliability. In other words, the estimation process for category C_i is terminated too early, which we call premature termination. As k_i is a monotonically increasing function of n_i, instead of substituting n_i with A_κ(n̂_i), SEM substitutes n_i with A_κ(n̂_i) + ℓ · √Var[A_κ(n̂_i)] to calculate the value of k_i. The variance of A_κ(n̂_i) was calculated in Eq. (12). According to the well-known three-sigma rule [33], ℓ = 3 should be large enough. We name this method of calculating k_i the ℓ-sigma method. Through extensive simulations in Section VI, we show that our ℓ-sigma method is highly effective against premature termination.
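The ℓ-sigma padding can be illustrated with a short Python sketch. It is a sketch under assumptions, not the paper's procedure: the helper name `padded_estimate` is hypothetical, and the empirical variance of the mean is used as a stand-in for the closed-form variance of Eq. (12), which is not reproduced here.

```python
import math
from statistics import mean, variance
from typing import Sequence


def padded_estimate(per_frame_estimates: Sequence[float], ell: float = 3.0) -> float:
    """Return A_kappa + ell * sqrt(Var[A_kappa]) for one category."""
    kappa = len(per_frame_estimates)
    avg = mean(per_frame_estimates)
    # Assumption: the sample variance of the mean replaces Eq. (12).
    var_of_avg = variance(per_frame_estimates) / kappa if kappa > 1 else 0.0
    return avg + ell * math.sqrt(var_of_avg)


# Hypothetical per-frame estimates for one category after kappa = 4 frames.
estimates = [90.0, 110.0, 100.0, 100.0]
for ell in (0.0, 1.0, 3.0):
    print(ell, padded_estimate(estimates, ell))
```

With ℓ = 0 the plain average is used; larger ℓ inflates the value substituted for n_i, and because k_i increases with n_i, this yields a more conservative (larger) frame count.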

IV. SEM: ADAPTIVE PARTITIONING

Until now, we have described how SEM executes multiple frames for all categories simultaneously and estimates the sizes of the categories. This strategy works well only when all categories are balanced, i.e., the sizes of all categories are similar. When the categories are unbalanced, i.e., the sizes of the categories are very different, simultaneously estimating the sizes of all categories adversely affects the performance of SEM. Next, we discuss the two scenarios of balanced and unbalanced categories, respectively.

A. Category Types

1) Balanced Categories: We first consider an RFID system that consists of two categories C_1 and C_2 with similar sizes of 100 and 110 tags, respectively. Fig. 3(a) and (b) show the minimal execution time of SEM when it is separately executed on tags in categories C_1 and C_2, respectively. In the figure, the optimal triple, e.g., (68, 68, 0.8378s), means that the optimal values of both f′ and f for SEM are 68, and the corresponding minimum execution time is 0.8378s. The minimum time SEM takes to estimate the number of tags in category C_1 alone is 0.8378s, and the time for category C_2 is 0.8311s. The total time of SEM when executed separately for each category is therefore 0.8378s + 0.8311s = 1.6689s. In contrast, as shown in Fig. 3(c), the minimum time SEM takes to simultaneously estimate the number of tags in both categories C_1 and C_2 is just 0.8826s, which is much smaller than the time SEM takes to estimate the number of tags in the categories separately. Thus, simultaneous estimation performs much better than separate estimation in such a balanced RFID system.

2) Unbalanced Categories: Fig. 4(a) and (b) plot the minimum execution times of SEM for two categories C_1 and C_2 with quite different sizes of 100 and 2000 tags, respectively. The minimum times SEM takes to estimate the number of tags in categories C_1 and C_2 separately are 0.8378s and 1.0038s, resulting in a total time of 1.8416s. In contrast, as shown in Fig. 4(c), the minimum time SEM takes to simultaneously estimate the number of tags in both categories is 2.56s, which is much larger than the time SEM takes to separately estimate the number of tags in the categories. This happens because, for unbalanced categories, it is hard to find a pair of parameters 〈f, f′〉 that simultaneously fits categories with large and small sizes. Thus, separate estimation performs better in the scenario of unbalanced categories.

From the above case studies of balanced and unbalanced categories, we conclude that when the category sizes are unbalanced, SEM should first partition categories into groups such that the sizes of categories in the same group are comparable and then simultaneously estimate the sizes of categories in individual groups. This will reduce the overall estimation time of SEM. Next, we describe how SEM partitions categories into groups.

Fig. 4. Separate estimation vs. simultaneous estimation in an unbalanced RFID system that contains two categories C_1 and C_2 with sizes of 100 and 2000 tags, respectively. (α, β) = (5%, 95%). (a) SEM on C_1. (b) SEM on C_2. (c) SEM on C_1 and C_2.

B. Adaptive Partitioning

At the start, SEM assumes that all categories belong to the same group. Without loss of generality, it assumes that all categories are arranged in a list L_{1,λ} in ascending order, i.e., L_{1,λ} = 〈n_1, n_2, .., n_λ〉, and for any i, j ∈ [1, λ], if i < j, we have n_i ≤ n_j. As aforementioned, the sizes of the smallest and largest categories in a group, i.e., n_1 and n_λ in this case, determine the estimation time of SEM. We represent the minimum time of SEM on a group that has the smallest category size n_i and the largest category size n_j by T_{i,j}. Recall that the estimation time of SEM is minimum when the values of n, f′, and f are calculated as described in Section III.

SEM partitions the group represented by list L_{x,y} = 〈n_x, .., n_y〉 into two groups represented by lists L_{x,s} = 〈n_x, .., n_s〉 and L_{s+1,y} = 〈n_{s+1}, .., n_y〉, where the value of s should satisfy the following two conditions.

1) T_{x,s} + T_{s+1,y} ≤ T_{x,y}

2) ∀z ∈ [x, y − 1], T_{x,s} + T_{s+1,y} ≤ T_{x,z} + T_{z+1,y}

SEM recursively applies this partitioning method on groups, starting with x = 1 and y = λ, and continues until, for a given group represented by list L_{x,y}, there is no s ∈ [x, y − 1] that satisfies the first condition. Fig. 5 shows an example where SEM partitions a large unbalanced group represented by the list 〈n_1, n_2, n_3, n_4, n_5, n_6〉 into several small balanced groups represented by the lists 〈n_1, n_2〉, 〈n_3, n_4〉, and 〈n_5, n_6〉. Note that, in Fig. 5, T_{x,x} (e.g., T_{1,1}) means the minimum estimation time of SEM on a group that contains just one category C_x (e.g., C_1). After obtaining the small balanced groups, SEM takes one balanced group at a time and estimates the sizes of categories in that group simultaneously.
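The recursive splitting rule can be sketched as follows. This is a minimal illustration, assuming a caller-supplied cost function in place of the minimum estimation time T_{i,j}; the names `partition_categories` and `toy_cost` are hypothetical, and `toy_cost` is an invented penalty on the largest-to-smallest size ratio, not the paper's timing model.

```python
from typing import Callable, List


def partition_categories(
    sizes: List[int], cost: Callable[[int, int], float]
) -> List[List[int]]:
    """Greedily split sorted category sizes into groups using conditions 1) and 2)."""
    ordered = sorted(sizes)
    if not ordered:
        return []

    def split(x: int, y: int) -> List[List[int]]:
        whole = cost(ordered[x], ordered[y])          # T_{x,y}
        best_s, best_cost = None, float("inf")
        for s in range(x, y):                         # candidate split points in [x, y-1]
            c = cost(ordered[x], ordered[s]) + cost(ordered[s + 1], ordered[y])
            if c < best_cost:                         # condition 2: cheapest split
                best_s, best_cost = s, c
        if best_s is None or best_cost > whole:       # condition 1 fails: keep the group
            return [ordered[x:y + 1]]
        return split(x, best_s) + split(best_s + 1, y)

    return split(0, len(ordered) - 1)


def toy_cost(n_small: int, n_large: int) -> float:
    """Hypothetical stand-in for T_{i,j}: a fixed overhead plus a size-ratio penalty."""
    return 1.0 + 0.1 * n_large / n_small


groups = partition_categories([2000, 100, 2100, 110], toy_cost)
print(groups)
```

Each recursive call only inspects the smallest and largest size of a candidate group, which mirrors the observation that these two sizes determine the estimation time of a group.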

Just like the calculation of the optimal system parameters, adaptive partitioning also needs the size of each category a priori. At the very beginning, we do not know the number of tags in each category at all. Hence, for the first round of SEM, we let all the categories be in the same group. After the first round of estimation, SEM uses the method proposed in Section III-C to obtain rough estimates of category sizes to guide the group partitioning process and to find the optimum values of the broadcast frame size f and the executed frame size f′ for each group. If, for any category, the estimate A_κ(n̂_i) achieves the required reliability after κ frames, SEM removes category C_i from the list L_{1,λ}. Before executing each frame, SEM first updates the list L_{1,λ} by removing the categories for which the required reliability has been achieved, and then partitions the remaining categories into groups. The estimation process terminates when all categories achieve the required reliability. We should clarify that the proposed Adaptive Partitioning method is a heuristic algorithm and is not guaranteed to return the optimal grouping.

Fig. 5. Example of Adaptive Partitioning (AP): an initial unbalanced group is partitioned into 3 balanced groups.

C. Discussion

1) Multi-Reader Estimation: Due to the limited communication range, a single RFID reader cannot cover a large area. Thus, multiple RFID readers are frequently deployed. SEM uses one of the many existing reader-scheduling protocols [34] to schedule which reader transmits and receives at what time. All readers always send the same commands and relay the data they receive to a back-end server. Thus, these readers essentially work like one large logical reader. SEM works seamlessly in single-reader as well as multi-reader environments.


2) Bit Synchronization: Katabi et al. reported in [35] that the synchronization offset for commercial RFID tags is normally no more than 1 μs. Recall that transmitting each bit from a tag to a reader requires 18.8 μs. Hence, the 1 μs offset is only about 5.3% of a bit duration. In other words, the signal offset does not have much negative impact on SEM. Hence, like much of the leading RFID literature [9], [28], we also assume that the signals of the tags are well synchronized at the bit level.

V. RELATED WORK

In the early stage of RFID research, the academic community paid much attention to the exact tag identification problem [29], [36], which is to exactly identify the tag IDs within the interrogation range of an RFID reader. Generally, there are two types of tag identification protocols: Aloha-based protocols and tree-based protocols. Their basic principles are as follows. Fundamentally, an Aloha-based protocol is a kind of Time Division Multiple Access (TDMA) mechanism. A tag ID can be successfully identified in a slot when only one tag responds in this slot. As for tree-based protocols, the reader broadcasts a 0/1 string to query the tags. A tag responds with its ID once it finds that the queried string is a prefix of its ID. A reader identifies a tag ID when only one tag responds. Although RFID identification protocols can be used to obtain the exact tag IDs, it is a well-recognized fact that the tag identification protocols are slow because their execution time is proportional to the number of tags. For some purposes, such as stock monitoring, it is not efficient to execute the tag identification protocols because we only need to know the approximate number of tags instead of the exact tag IDs.

Another direction of research on RFID systems targets the cardinality estimation of RFID tag populations. Kodialam et al. proposed the first set of cardinality estimation schemes, USE and UPE, which use the number of empty or collision slots to estimate population sizes [20]. Similarly, Zheng et al. proposed the Probabilistic Estimation Tree (PET) to estimate cardinalities for tree-based RFID systems [37]. Shahzad et al. proposed ART, which uses the average run length of non-empty slots for cardinality estimation [17]. Li et al. proposed the Maximum Likelihood Estimator (MLE), which looks at the energy aspect of cardinality estimation [18]. Liu et al. studied the problem of key tag population tracking [38]. Gong et al. investigated INformative Counting (INC) to estimate the number of counterfeit tags whose IDs are not stored in a database [39]. For privacy reasons, RFID estimation in the presence of blocker tags was investigated in [32].

The above literature assumes that all tags within an RFID system belong to the same category. However, in practical scenarios, tags are usually classified into different categories according to brands. In recent years, researchers have shifted some attention to the interesting problems arising in multi-category RFID systems. Sheng et al. addressed the problem of identifying categories whose cardinalities are above a given threshold [16]. They proposed the Group Testing (GT) scheme, which rapidly eliminates the groups containing small-sized categories. Luo et al. claimed that the GT protocol is not suitable for RFID systems in which the sizes of a large number of categories are above a threshold, because each group has a high probability of containing a large-size category and thus is difficult to eliminate. To accommodate this situation, they proposed an efficient Threshold-Based Classification (TBC) protocol [15] that obtains multiple logical bitmaps from a single time frame. Each bitmap is used to approximate the tag cardinality of a category. The categories whose cardinalities are obviously above (or below) the given threshold can be rapidly eliminated. Unfortunately, the GT and TBC protocols can only identify the categories with sizes greater than a threshold, but cannot estimate the sizes of individual categories. The work closest to ours, focusing on multi-category RFID systems, is Ensemble Sampling (ES) [14], which exploits the number of singleton slots occupied by each category in a time frame to estimate the tag cardinality of each category. ES can only distinguish three types of slots: empty slots, singleton slots, and collision slots. For a collision slot, ES only knows that two or more tags responded in this slot, and nothing else. How to make full use of the information in each type of slot, especially that in the collision slots, is the key to achieving better time efficiency. The proposed SEM exploits the Single-one Manchester coding string and can determine which categories of tags responded in a collision slot. From a single physical frame, it can derive multiple logical frames, each of which serves the tag cardinality estimation for one category.

VI. PERFORMANCE EVALUATION

In this section, we conduct extensive simulations on a large-scale multi-category RFID system to evaluate the performance of the proposed protocols. We evaluate SEM in a variety of scenarios both with and without adaptive partitioning. In the rest of this section, we use SEM+AP to denote SEM with adaptive partitioning and simply SEM to denote it without partitioning. Besides SEM and SEM+AP, we also implemented the existing representative tag estimation/identification protocols, including Maximum Likelihood Estimator (MLE) [18], Enhanced Zero Based estimator (EZB) [19], Unified Probabilistic Estimator (UPE) [20], Average Run-based Tag estimation (ART) [17], Ensemble Sampling (ES) [14], Enhanced Dynamic Framed Slotted Aloha (EDFSA) [29], and Tree Hopping (TH) [36]. Following the simulation strategy used by these state-of-the-art schemes, we assume that the communication channel is error-free and that a single reader covers all tags.

A. Evaluation Metrics

We evaluate SEM on two important metrics: (1) actual reliability, which is the percentage of times the relative errors in the estimates calculated by SEM are less than α, and (2) execution time, which is the time SEM takes to estimate the tag cardinality of every category. We run each simulation 1000 times and use the results from these 1000 simulations to calculate the values of the performance metrics. Before evaluating these metrics, we first evaluate the effectiveness of our adaptive partitioning strategy.

B. Validating the Effectiveness of Adaptive Partitioning

To evaluate the improvement in execution time due to adaptive partitioning, we simulate an RFID system containing a tag population with 10 categories. We conduct simulations for two accuracy requirements, i.e., α = 5%, β = 95% and α = 3%, β = 97%, and two settings of ℓ, i.e., ℓ = 0 and ℓ = 1. We conduct simulations for both balanced and unbalanced categories.

Fig. 6. Comparing SEM+AP with SEM for balanced category sizes. Each category has the same size of 5000 tags. (a) Balanced. (b) α = 5%, β = 95%, ℓ = 0. (c) α = 5%, β = 95%, ℓ = 1. (d) α = 3%, β = 97%, ℓ = 0. (e) α = 3%, β = 97%, ℓ = 1.

Fig. 7. Comparing SEM+AP with SEM for unbalanced category sizes. The cardinalities of 10 categories are exponentially distributed. (a) Unbalanced (Exp.). (b) α = 5%, β = 95%, ℓ = 0. (c) α = 5%, β = 95%, ℓ = 1. (d) α = 3%, β = 97%, ℓ = 0. (e) α = 3%, β = 97%, ℓ = 1.

Fig. 8. Comparing SEM+AP with SEM for unbalanced category sizes. The cardinalities of 10 categories are linearly distributed. (a) Unbalanced (Linear). (b) α = 5%, β = 95%, ℓ = 0. (c) α = 5%, β = 95%, ℓ = 1. (d) α = 3%, β = 97%, ℓ = 0. (e) α = 3%, β = 97%, ℓ = 1.

1) Balanced Categories: In this case, each category has a cardinality of 5000 tags, as shown in Fig. 6(a). Fig. 6(b) through 6(e) show the execution times of both SEM+AP and SEM for the two accuracy requirements and the two settings of ℓ from 1000 independent simulation runs. We observe from these figures that the execution times of SEM+AP and SEM are almost the same. This is because the frame sizes f and f′ calculated by SEM are appropriate for all categories, which means that there is no need to partition the list L_{1,10} into multiple groups. In fact, when SEM applies the adaptive partitioning algorithm to these 10 categories, the categories are not divided into multiple groups; rather, they are returned in a single group.

2) Unbalanced Categories: In this case, the category sizes vary from 1000 tags to 50000 tags. We pick the category sizes from two different distributions: an exponential distribution as shown in Fig. 7(a) and a linear distribution as shown in Fig. 8(a). Fig. 7(b) through 7(e) and 8(b) through 8(e) show the execution times of both SEM+AP and SEM for the two accuracy requirements and the two settings of ℓ from 1000 independent simulation runs. We observe from these figures that the execution time of SEM+AP is about 50% smaller than the execution time of SEM. For example, the execution times of SEM+AP and SEM with ℓ = 0, α = 5%, and β = 95% are approximately 3.2s and 6.3s, respectively. We make similar observations about SEM+AP and SEM for other settings of ℓ, α, and β when the categories are unbalanced. The underlying reason is that SEM+AP first adaptively partitions an unbalanced group into multiple balanced groups and then finds proper frame sizes f and f′ for each group, which significantly reduces the execution time.

C. Actual Reliability

Recall that actual reliability is the percentage of times the estimates for any category C_i lie in the range [(1 − α)n_i, (1 + α)n_i], where n_i is the actual cardinality of category C_i. We independently repeat each simulation scenario 1000 times and calculate the actual reliability from those 1000 estimation results. Fig. 9 plots the actual reliability of SEM+AP in a balanced RFID system for the two accuracy requirements and the two settings of ℓ when the cardinalities of the categories are those shown in Fig. 6(a). We observe from these two figures that the actual reliability achieved by SEM+AP for each category is higher than the required reliability β.

Fig. 9. Actual reliability of SEM+AP for balanced categories. (a) α = 5%, β = 95%. (b) α = 3%, β = 97%.

Fig. 10. Actual reliability of SEM+AP for unbalanced (exp.) categories. (a) α = 5%, β = 95%. (b) α = 3%, β = 97%.

Fig. 11. Actual reliability of SEM+AP for unbalanced (linear) categories. (a) α = 5%, β = 95%. (b) α = 3%, β = 97%.
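As a concrete reading of the actual-reliability metric, the following Python sketch counts how often repeated estimates fall inside [(1 − α)n_i, (1 + α)n_i]. The function name `actual_reliability` and the sample estimates are hypothetical and serve only to illustrate the definition.

```python
from typing import Sequence


def actual_reliability(estimates: Sequence[float], n_true: float, alpha: float) -> float:
    """Fraction of estimates within the relative-error band of width alpha around n_true."""
    lower, upper = (1 - alpha) * n_true, (1 + alpha) * n_true
    hits = sum(1 for e in estimates if lower <= e <= upper)
    return hits / len(estimates)


# Hypothetical estimates of one category with 100 tags over five runs.
runs = [96, 100, 104, 110, 90]
reliability = actual_reliability(runs, n_true=100, alpha=0.05)
print(reliability)
```

A protocol meets the (α, β) requirement for a category when this fraction, computed over all simulation runs, is at least β.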

Fig. 10 and 11 plot the actual reliability of SEM+AP in the unbalanced RFID system for the two accuracy requirements and the two settings of ℓ when the cardinalities of the categories are those shown in Fig. 7(a) and Fig. 8(a), respectively. We observe from these figures that SEM+AP with ℓ = 0 sometimes does not satisfy the required reliability. This happens due to premature termination, discussed in Section III-D. However, with ℓ = 1, the actual reliability of SEM+AP is always higher than the required reliability β in all scenarios. This further shows that our ℓ-sigma method with ℓ = 1 is very effective in alleviating premature termination.

Fig. 12. Execution time of SEM+AP (ℓ = 1) and prior protocols for balanced categories. α = 5%, β = 95%. (a) Cardinality Picking Prob. (b) Avg. time vs. category number.

D. Execution Time

We evaluate the execution time of SEM+AP and present its side-by-side comparison with the execution times of five existing estimation protocols, namely MLE, EZB, UPE, ART, and ES. We use these estimation protocols to separately estimate the cardinality of each category one by one, except for ES, which simultaneously estimates the cardinalities of the top-k largest categories. We set k in ES equal to the total number of categories. We change the number of categories in tag populations from 1 to 20 and pick category sizes from two distributions: a non-uniform distribution to generate balanced categories and a uniform distribution to generate unbalanced categories. Next, we present the execution times of SEM+AP and the existing protocols for the balanced and unbalanced categories.

1) Balanced Categories: In this case, for each value of the number of categories, we pick the sizes of categories from the distribution shown in Fig. 12(a). For example, the probability corresponding to 10000 tags is 0.25, which means that an arbitrary category has a 25% likelihood of being assigned a cardinality of 10000 when simulating the RFID system. Since the cardinalities with non-zero probabilities are within a relatively small range ([8000, 12000]), all categories will have similar cardinalities, resulting in a balanced-categories scenario. Fig. 12(b) plots the normalized average execution times of SEM+AP and the existing protocols. Normalized execution time is calculated by dividing the execution time by the number of categories. We observe from this figure that SEM+AP is the only protocol whose average execution time per category decreases as the number of categories increases. Furthermore, SEM+AP is significantly faster than the prior estimation protocols. For example, with 20 categories, the average time per category of the fastest existing protocol, i.e., ART, is about 0.7 seconds, whereas that of our SEM+AP is just about 0.10 seconds, which is nearly 7 times faster than ART.

2) Unbalanced Categories: In this case, for each value of the number of categories, we pick the sizes of categories from the distribution shown in Fig. 13(a). Since the cardinalities with non-zero probabilities are in a relatively wide range ([1000, 20000]), different categories will have different cardinalities, resulting in an unbalanced-categories scenario. Fig. 13(b) plots the normalized average execution times of SEM+AP and the existing protocols. We make two important observations. First, the average execution time of the existing protocols is almost the same as that in the scenario of balanced categories. This is because the execution times of the existing protocols depend only on the required accuracy and are independent of tag population sizes [17]–[21]. Thus, as long as the number of categories does not change, the execution time of the existing protocols does not change. Second, our SEM+AP protocol is consistently several times faster than all prior protocols for unbalanced categories as long as the number of categories is greater than 2.

Fig. 13. Execution time of SEM+AP (ℓ = 1) and prior protocols for unbalanced categories. α = 5%, β = 95%. (a) Cardinality Picking Prob. (b) Avg. time vs. category number.

Fig. 14. Comparing SEM+AP with tag identification protocols. (a) Varying the expected size of each category. (b) Varying the number of categories.

E. Comparing With the Precise Tag Identification Protocols

Besides comparing with the existing RFID estimation protocols, we also compare SEM+AP with the representative tag identification protocols, i.e., Enhanced Dynamic Framed Slotted Aloha (EDFSA) [29] and Tree Hopping (TH) [36]. In the simulations, we set the number of tag categories to 20, and the tag cardinality in each category follows the normal distribution N(μ, σ²), σ = μ/3, where μ is small and varies from 10 to 100. The estimation accuracy of SEM+AP is set to (5%, 95%). The simulation results in Fig. 14(a) indicate that the tag identification protocols are faster than our SEM+AP only when the expected tag cardinality in each category is quite small (e.g., μ < 30 in this figure). As the tag cardinality in each category (i.e., μ) increases, the execution time of the tag identification protocols increases linearly. In contrast, the execution time of our SEM+AP is almost stable with varying μ. In Fig. 14(b), we vary the number of tag categories from 1 to 10 and keep the tag cardinality in each category following the normal distribution N(μ, σ²), where μ = 100 and σ = μ/3. The simulation results in Fig. 14(b) reveal that the execution time of the tag identification protocols is proportional to the number of tag categories. In contrast, the time cost of SEM+AP increases only slightly as the number of categories increases.

VII. CONCLUSION

In this paper, we make the following three key contributions. First, we formally define the practically important problem of multi-category RFID estimation and propose a Single-one Manchester coding-based approach called SEM. Our SEM approach can decode multiple bit vectors from a single physical frame, thereby achieving simultaneous estimation over multiple categories. Second, we propose the optimization technique of adaptive partitioning, called AP, to address the issue that category sizes may have large variances. The key idea is to group categories of similar sizes together and execute our SEM approach for each group separately. Third, we conduct extensive simulations to evaluate the proposed approaches. The simulation results show that our optimized SEM+AP approach can satisfy the predefined estimation accuracy while significantly outperforming all prior schemes in terms of execution time. Moreover, we find that our SEM is the only approach whose normalized estimation time decreases as the number of categories increases. Many excellent estimation protocols dedicated to single-set estimation can be built on our SEM+AP to achieve fast and simultaneous estimation in multi-category RFID systems.

REFERENCES

[1] T. Li, S. Chen, and Y. Ling, “Efficient protocols for identifying the missing tags in a large RFID system,” IEEE/ACM Trans. Netw., vol. 21, no. 6, pp. 1974–1987, Dec. 2013.

[2] X. Liu, S. Zhang, B. Xiao, and K. Bu, “Flexible and time-efficient tag scanning with handheld readers,” IEEE Trans. Mobile Comput., vol. 15, no. 4, pp. 840–852, Apr. 2015.

[3] J. Liu, B. Xiao, K. Bu, and L. Chen, “Efficient distributed query processing in large RFID-enabled supply chains,” in Proc. IEEE INFOCOM, Apr./May 2014, pp. 163–171.

[4] M. Chen, W. Luo, Z. Mo, S. Chen, and Y. Fang, “An efficient tag search protocol in large-scale RFID systems with noisy channel,” IEEE/ACM Trans. Netw., vol. 24, no. 2, pp. 703–716, Apr. 2016.

[5] X. Liu et al., “Efficient unknown tag identification protocols in large-scale RFID systems,” IEEE Trans. Parallel Distrib. Syst., vol. 25, no. 12, pp. 3145–3155, Dec. 2014.

[6] J. Liu, B. Xiao, S. Chen, F. Zhu, and L. Chen, “Fast RFID grouping protocols,” in Proc. IEEE INFOCOM, Apr./May 2015, pp. 1948–1956.

[7] M. Chen and S. Chen, “Identifying state-free networked tags,” in Proc. IEEE ICNP, Nov. 2015, pp. 302–312.

[8] L. Yang, Q. Lin, X. Li, T. Liu, and Y. Liu, “See through walls with COTS RFID system!” in Proc. ACM MobiCom, 2015, pp. 487–499.

[9] Y. Zheng and M. Li, “P-MTI: Physical-layer missing tag identification via compressive sensing,” IEEE/ACM Trans. Netw., vol. 23, no. 4, pp. 1356–1366, Aug. 2015.

[10] X. Liu et al., “A multiple hashing approach to complete identification of missing RFID tags,” IEEE Trans. Commun., vol. 62, no. 3, pp. 1046–1057, Mar. 2014.

[11] L. Yang et al., “Tagoram: Real-time tracking of mobile RFID tags to high precision using COTS devices,” in Proc. ACM MobiCom, 2014, pp. 237–248.

[12] L. Yang, Y. Guo, T. Liu, C. Wang, and Y. Liu, “Perceiving the slightest tag motion beyond localization,” IEEE Trans. Mobile Comput., vol. 14, no. 11, pp. 2363–2375, Nov. 2015.

[13] L. Shangguan et al., “ShopMiner: Mining customer shopping behavior in physical clothing stores with COTS RFID devices,” in Proc. ACM SenSys, 2015, pp. 113–125.


[14] L. Xie, H. Han, Q. Li, J. Wu, and S. Lu, “Efficient protocols for collecting histograms in large-scale RFID systems,” IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 9, pp. 2421–2433, Sep. 2015.

[15] W. Luo, Y. Qiao, S. Chen, and M. Chen, “An efficient protocol for RFID multigroup threshold-based classification based on sampling and logical bitmap,” IEEE/ACM Trans. Netw., vol. 24, no. 1, pp. 397–407, Feb. 2016.

[16] B. Sheng, C. C. Tan, Q. Li, and W. Mao, “Finding popular categories for RFID tags,” in Proc. ACM MobiHoc, 2008, pp. 159–168.

[17] M. Shahzad and A. X. Liu, “Fast and accurate estimation of RFID tags,” IEEE/ACM Trans. Netw., vol. 23, no. 1, pp. 241–254, Feb. 2015.

[18] T. Li, S. Wu, S. Chen, and M. Yang, “Energy efficient algorithms for the RFID estimation problem,” in Proc. IEEE INFOCOM, Mar. 2010, pp. 1–9.

[19] M. Kodialam, T. Nandagopal, and W. C. Lau, “Anonymous tracking using RFID tags,” in Proc. IEEE INFOCOM, May 2007, pp. 1217–1225.

[20] M. Kodialam and T. Nandagopal, “Fast and reliable estimation schemes in RFID systems,” in Proc. ACM MobiCom, 2006, pp. 322–333.

[21] B. Chen, Z. Zhou, and H. Yu, “Understanding RFID counting protocols,” in Proc. ACM MobiCom, 2013, pp. 291–302.

[22] C. Floerkemeier and S. Sarma, “RFIDSim—A physical and logical layer simulation engine for passive RFID,” IEEE Trans. Autom. Sci. Eng., vol. 6, no. 1, pp. 33–43, Jan. 2009.

[23] H. Han et al., “Counting RFID tags efficiently and anonymously,” in Proc. IEEE INFOCOM, Mar. 2010, pp. 1–9.

[24] C. Qian, H. Ngan, Y. Liu, and L. M. Ni, “Cardinality estimation for large-scale RFID systems,” IEEE Trans. Parallel Distrib. Syst., vol. 22, no. 9, pp. 1441–1454, Sep. 2011.

[25] Information Technology Automatic Identification and Data Capture Techniques-Radio Frequency Identification for Item Management Air Interface—Part 6: Parameters for Air Interface Communications at 860–960 MHz, ISO/IEC Standard FDIS 18000-6, 2003.

[26] Y.-H. Chen et al., “A novel anti-collision algorithm in RFID systems for identifying passive tags,” IEEE Trans. Ind. Informat., vol. 6, no. 1, pp. 105–121, Feb. 2010.

[27] Y.-C. Lai, L.-Y. Hsiao, H.-J. Chen, C.-N. Lai, and J.-W. Lin, “A novel query tree protocol with bit tracking in RFID tag identification,” IEEE Trans. Mobile Comput., vol. 12, no. 10, pp. 2063–2075, Oct. 2013.

[28] L. Kong, L. He, Y. Gu, M.-Y. Wu, and T. He, “A parallel identification protocol for RFID systems,” in Proc. IEEE INFOCOM, Apr./May 2014, pp. 154–162.

[29] S.-R. Lee, S.-D. Joo, and C.-W. Lee, “An enhanced dynamic framed slotted ALOHA algorithm for RFID tag identification,” in Proc. ACM MobiQuitous, Jul. 2005, pp. 166–172.

[30] Y. Yin, L. Xie, S. Lu, and D. Chen, “Check out the rules: Towards time-efficient rule checking over RFID tags,” Mobile Netw. Appl., vol. 19, no. 4, pp. 524–533, 2014.

[31] Y. Qiao, S. Chen, T. Li, and S. Chen, “Energy-efficient polling protocols in RFID systems,” in Proc. ACM MobiHoc, 2011, Art. no. 25.

[32] X. Liu et al., “RFID cardinality estimation with blocker tags,” in Proc. IEEE INFOCOM, Apr./May 2015, pp. 1679–1687.

[33] F. Pukelsheim, “The three sigma rule,” Amer. Statist., vol. 48, no. 2, pp. 88–91, 1994.

[34] L. Yang et al., “Season: Shelving interference and joint identification in large-scale RFID systems,” in Proc. IEEE INFOCOM, Apr. 2011, pp. 3092–3100.

[35] J. Wang, H. Hassanieh, D. Katabi, and P. Indyk, “Efficient and reliable low-power backscatter networks,” in Proc. ACM SIGCOMM, 2012, pp. 61–72.

[36] M. Shahzad and A. X. Liu, “Probabilistic optimal tree hopping for RFID identification,” IEEE/ACM Trans. Netw., vol. 23, no. 3, pp. 796–809, Jun. 2015.

[37] Y. Zheng and M. Li, “PET: Probabilistic estimating tree for large-scale RFID estimation,” IEEE Trans. Mobile Comput., vol. 11, no. 11, pp. 1763–1774, Nov. 2012.

[38] X. Liu et al., “Fast tracking the population of key tags in large-scale anonymous RFID systems,” IEEE/ACM Trans. Netw., to be published.

[39] W. Gong et al., “Informative counting: Fine-grained batch authentication for large-scale RFID systems,” in Proc. ACM MobiHoc, 2013, pp. 21–30.

Xiulong Liu received the B.E. degree from the School of Software Technology, Dalian University of Technology, China, in 2010, where he is currently pursuing the Ph.D. degree with the School of Computer Science and Technology. He served as a Research Assistant with Hong Kong Polytechnic University in 2014, and a Visiting Scholar with Temple University in 2015. His research interests include RFID systems and wireless sensor networks.

Keqiu Li received the bachelor’s and master’s degrees from the Department of Applied Mathematics, Dalian University of Technology, China, in 1994 and 1997, respectively, and the Ph.D. degree from the Graduate School of Information Science, Japan Advanced Institute of Science and Technology, in 2005. He also has two-year post-doctoral experience with the University of Tokyo, Japan. He is currently a Professor with the School of Computer Science and Technology, Dalian University of Technology. He has published more than 100 technical papers in venues such as the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, ACM Transactions on Internet Technology, and ACM Transactions on Multimedia Computing, Communications, and Applications. His research interests include data center networks, cloud computing, and wireless networks. He is an Associate Editor of the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS and the IEEE TRANSACTIONS ON COMPUTERS.

Alex X. Liu received the Ph.D. degree in computer science from the University of Texas at Austin in 2006. He is a Professor with the Department of Computer Science and Engineering, Michigan State University. He is also affiliated with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China. His research interests focus on networking and security. He is an Associate Editor of the IEEE/ACM TRANSACTIONS ON NETWORKING and an Area Editor of the Journal of Computer Communications. He received the IEEE & IFIP William C. Carter Award in 2004 and an NSF CAREER Award in 2009. He received the Withrow Distinguished Scholar Award in 2011 at Michigan State University. He received best paper awards from ICNP-2012, SRDS-2012, and LISA-2010.

Song Guo (S’02–M’05–SM’11) received the Ph.D. degree in computer science from the University of Ottawa, Canada. He is currently a Full Professor with The Hong Kong Polytechnic University (PolyU). Prior to joining PolyU, he was a Full Professor with The University of Aizu, Japan. His research interests are mainly in the areas of cloud and green computing, big data, wireless networks, and cyber-physical systems. His research has been sponsored by JSPS, JST, MIC, NSF, NSFC, and industrial companies. He has published over 300 conference and journal papers in these areas and received multiple best paper awards from IEEE/ACM conferences. Dr. Guo has served as an Editor of several journals, including the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, the IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, the IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING, the IEEE Communications Magazine, and Wireless Networks. He has been actively participating in conference organizations serving as General Chair and TPC Chair. He is a Senior Member of the IEEE and ACM, and an IEEE Communications Society Distinguished Lecturer.


Muhammad Shahzad received the Ph.D. degree in computer science from Michigan State University in 2015. He is currently an Assistant Professor with the Department of Computer Science, North Carolina State University, USA. His research interests include design, analysis, measurement, and modeling of networking and security systems. He received the 2015 Outstanding Graduate Student Award, the 2015 Fitch Beach Award, and the 2012 Outstanding Student Leader Award at Michigan State University.

Ann L. Wang received the B.S. degree in information engineering and the M.S. degree in computer science from the Beijing University of Posts and Telecommunications in 2009 and 2012, respectively. She is currently pursuing the Ph.D. degree with Michigan State University. Her research interests include networking, security, and privacy.

Jie Wu (M’90–SM’93–F’09) is the Associate Vice Provost for International Affairs with Temple University. He also serves as Director of the Center for Networked Computing and Laura H. Carnell Professor with the Department of Computer and Information Sciences. Prior to joining Temple University, he was a Program Director with the National Science Foundation and was a Distinguished Professor with Florida Atlantic University. His current research interests include mobile computing and wireless networks, routing protocols, cloud and green computing, network trust and security, and social network applications. He regularly publishes in scholarly journals, conference proceedings, and books. Dr. Wu serves on several editorial boards, including the IEEE TRANSACTIONS ON SERVICE COMPUTING and the Journal of Parallel and Distributed Computing. He was General Co-Chair/Chair of the IEEE MASS 2006, the IEEE IPDPS 2008, the IEEE ICDCS 2013, and ACM MobiHoc 2014, as well as Program Co-Chair of the IEEE INFOCOM 2011 and CCF CNCC 2013. He was an IEEE Computer Society Distinguished Visitor, an ACM Distinguished Speaker, and Chair of the IEEE Technical Committee on Distributed Processing. He is a China Computer Federation (CCF) Distinguished Speaker and a Fellow of the IEEE. He is the recipient of the 2011 CCF Overseas Outstanding Achievement Award.
