
HiAI DDK V200

Operator Specifications

Issue 02

Date 2018-11-21

HUAWEI TECHNOLOGIES CO., LTD.

Issue 02 (2018-11-21) Copyright © Huawei Technologies Co.,Ltd. i

Copyright © Huawei Technologies Co., Ltd. 2018. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior

written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.

All other trademarks and trade names mentioned in this document are the property of their respective

holders.

License Agreement

The contents described in this document may include but are not limited to introduction or reference to

non-Huawei or open source software. When using them, comply with the copyright requirements of the

other party.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and

the customer. All or part of the products, services and features described in this document may not be

within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,

information, and recommendations in this document are provided "AS IS" without warranties, guarantees or

representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the

preparation of this document to ensure accuracy of the contents, but all statements, information, and

recommendations in this document do not constitute a warranty of any kind, express or implied.

The method of applying for HiAI is described as follows:

Send an application email to [email protected].

The format of the email subject is HUAWEI HiAI+Company name+Product name.

The format of the email body is Cooperation company+Contact person+Contact information+Contact email

address.

We will send you feedback within five workdays after receiving your email.

Website: https://developer.huawei.com/consumer/en/hiai


About This Document

Purpose

This document describes the operator specifications supported by Huawei HiAI DDK V200.

This document is used with the following documents:

Huawei HiAI DDK V200 Quick Start

Huawei HiAI DDK V200 Integration Guide

Change History

Changes between document issues are cumulative. The latest document issue contains all the

changes made in earlier issues.

Issue 02 (2018-11-21)

This issue updates the TensorFlow operators tf.cast and tf.image.resize_images.

This issue updates the descriptions of items 8, 9, and 10 in section 2.2.

Issue 01 (2018-10-27)

This issue is used for first office application (FOA).


Contents

About This Document

1 Parameter Description

2 Operator Boundary

2.1 Caffe Operator Boundary

2.2 TensorFlow Operator Boundary

2.3 AndroidNN Operator Boundary


1 Parameter Description

Parameter                                 Description

ni                                        Batch size

ci/co                                     Channel count

hi/ho                                     Height

wi/wo                                     Width

ph/pw                                     Expansion or padding size. The default pad size is less than the kernel size.

sh/sw                                     Stride

kh/kw                                     Size of the convolution kernel

window_h(window_y)/window_w(window_x)     Window size

dh(dilation_h)/dw(dilation_w)             Convolution dilation coefficient


2 Operator Boundary

2.1 Caffe Operator Boundary

No. Operator Description Boundary

1 absval Calculates the absolute

value of the input.

The input and output shapes must be the same.

2 add Calculates the sum of two

inputs.

The shape of two inputs must be the same as

that of one output.

3 Argmax Returns the index number

corresponding to the

maximum input value.

The parameters out_max_val and top_k are

not supported.

4 batch_norm Normalizes the input: (x – mean)/sqrt(variance + eps).

1. use_global_stats must be set to true.

2. mean/variance shape = (1, ci, 1, 1)

5 concat Concatenates input data

by dimension.

For the four dimensions of output data, the

value of the dimension that is concatenated

must be equal to the sum of the dimension

values of all input data. The other three

dimensions are the same as the three

dimensions corresponding to the input data.

6 conv_depthwise Depthwise convolution

1. co % ci = 0

2. The requirements in the case of co/ci = 1 are as follows:

(kh * kw * 3 + 2) * co/ci * 16 < 8190

For 1H8, (kh * kw * 3 + 2) * co/ci * 16 < 4095

3. The requirements in the case of co/ci > 1 are as follows:

(align(x) indicates that x is aligned. For float16 mode, the value is rounded up to an integer multiple of 16.)

(kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 8190

For 1H8, (kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 4095

4. The group parameter is not supported.

5. num_output % bottom[0]->channel == 0

6. The input and weight values of the fix8 type are not supported.

7 convolution Convolution

1. no = ni;

2. hi >= dh * (kh – 1) + 1;

3. wi >= dw * (kw – 1) + 1;

4. ho = (hi + ph – ((kh – 1) * (dh – 1) + kh))/sh + 1;

5. wo = (wi + pw – ((kw – 1) * (dw – 1) + kw))/sw + 1;

8 crop Crop No restrictions

9 deconvolution Deconvolution The dilation parameter is not supported.

10 Detection_output Outputs the detection results. The num_classes parameter must be set.

11 dropout Multiplies the input data

by (1 –dropout_ratio).

0 < dropout_ratio < 1

12 eltwise Element-wise operations

(sum, product, and

maximum value)

The stable_prod_grad parameter is not

supported.

13 elu

Activation function No restrictions

14 exp Calculates e^x for each input element x.

The input and output shapes must be the same.

The output data must not exceed 65504, the maximum value of fp16. That is, the input data cannot be greater than 11.

15 flatten Converts an input n * c *

h * w into a vector n * (c

* h * w).

No restrictions

16 inner_product Fully connected The transpose parameter is not supported.

17 interp Interpolation layer 1. pad_beg <= 0

2. pad_end <= 0

3. ni = no, ci = co;

4. ho = interp_h;

5. wo = interp_w;

6. align_corners: not supported

18 log Performs log calculation

on the input. When base

is greater than 0, the

calculation formula is y =

log_base(shift + scale *

x). When base is equal to

–1, the calculation

formula is y = ln(shift +

scale * x) = log_e(shift +

scale * x).

The input and output shapes must be the same,

and the input data must be greater than 0.

19 lrn Local response

normalization layer

local_size is an odd number within the value

range of [3, 15].

20 lstm Long short-term memory (LSTM) network

There are three input configurations: two inputs, three inputs, and five inputs.

Two inputs: The values of n and c of bottom[0] must be the same as those of bottom[1].

Three inputs: In addition to the requirements of two inputs, the value of n in bottom[2] must be the same as the value of c in bottom[0].

Five inputs: In addition to the requirements of three inputs, the value of h in bottom[3] and bottom[4] must be the same as that of num_output. Add expose_hidden: true to recurrent_param.

21 lstm_reshape Input reshaping 1. When axis is less than 0: num_axes + axis

+ 1 <= num_axes

num_axes >= 0 or –1

2. When num_axes is greater than or equal to

0: axis + num_axes >= num_axes

22 mult Product of two inputs The two inputs must have the same shape.

23 nms Non-maximum

suppression

Input scale: (1, box_size 1, 5)

24 normalize Standardization layer At least two inputs are required.

25 permute Rearranges the input

dimensions according to

the given mode.

The value of order(index) is smaller than

num_axes

order.size()=num_axes

26 pooling Pooling layer 1. When global_pooling is set to true:

kernel_h = bottom->height kernel_w =

bottom->width

2. no = ni, co = ci

3. When mode is max: (max (min(kh, hi),

min(kw, wi)) + 1) * min(ci, 256) < 131040

For 1H8, (max (min(kh, hi), min(kw, wi)) +

1) * min(ci, 256) < 65520

4. When mode is set to avg: (max (min(kh,

hi), min(kw, wi)) + 1) * min(ci, 256) <

130976

For 1H8, (max (min(kh, hi), min(kw, wi)) +

1) * min(ci, 256) < 65440

27 power y = (scale * x + shift) ^

power

The input and output shapes must be the same,

and the input data must be greater than 0.

28 prelu Activation function No restrictions

29 prior_box Obtains the position of the

real target from the region

proposals.

min_size and max_size must be configured.

30 proposal Sorts the region proposals

by (proposal, score) and

obtains the top N

proposals by using nms.

Currently, the proposal of the CPU and MLU

supports only two bottoms. The third input

im_info is not supported.

31 psroi_pooling Position-sensitive

region-of-interest pooling

(PSROIPooling)

1. output_dim > 0

2. group_size > 0

32 relu Activation function,

including common ReLU

and Leaky ReLU, which

can be specified by

parameters.

No restrictions

33 reshape Input dimension

reshaping

ni * ci * hi * wi = no * co * ho * wo

34 rnn Recurrent neural network

(RNN)

There are three types of input: two inputs, three

inputs, and four inputs

Two inputs: The values of n and c of bottom[0]

must be the same as those of bottom[1].

Three inputs: In addition to the requirements of

two inputs, the value of n in bottom[2] must be

the same as the value of c in bottom[0].

Four inputs: In addition to the requirements of

three inputs, the value of h in bottom[3] must

be the same as that of num_output. Add expose_hidden: true to recurrent_param.

35 roi_align Aggregation of regional

features 1. pooled_h > 0

2. pooled_w > 0

36 roi_pooling Maps the ROI to the

feature map. 1. pooled_h > 0

2. pooled_w > 0

37 rpn Region proposal network

(RPN)

No restrictions

38 scale out = alpha * Input + beta In the case of one input, the values of axis and

num_axes are 1.

In the case of two inputs, the shape of input 0

is ni0, ci0, h, w. The shape of input 1 is ni1,

ci1, 1, 1. ni0 is equal to ni1 or ni1 is equal to 1,

and ci0 is equal to ci1.

39 shufflechannel Shuffles the feature channels across groups so that information is exchanged between them.

1. channels%group = 0

2. The input and output shapes are the same.

40 sigmoid Activation function No restrictions

41 silence Do not print unused blob

information in logs.

No restrictions

42 slice Slices the input into

multiple outputs.

The axis and slice_dim parameters are

mutually exclusive. Configure either of them.

43 softmax Normalized exponential function No restrictions

44 sqrt Calculates the square root of the input.

The input data must be greater than or equal to 0. The input and output shapes must be the same.

45 sub Performs subtraction on

two inputs.

The two inputs must have the same shape.

46 tanh Activation function No restrictions

47 unpooling Opposite to pooling. Lost

information is padded

based on specified

parameters.

kw <= wi

kh <= hi

sw <= kw

sh <= kh

48 upsample Reverse propagation

process of max pooling

upsample_h/upsample_w and

scale/scale_h/scale_w are two mutually

exclusive groups of parameters.

1. If upsample_h and upsample_w are

configured, then upsample_h > 1,

upsample_w > 1

2. If scale or scale_h/scale_w is configured,

then scale = 2, scale_h = 2, scale_w = 2
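
The conv_depthwise and convolution boundaries in the table above are plain integer arithmetic, so they can be checked before a model is converted. The following is a minimal sketch and not part of the DDK. All function names are hypothetical. It assumes that align() rounds up to an integer multiple of 16 (the float16 case described for conv_depthwise), that the padding value is the total padding along the dimension, and that the division in the output-size formula rounds down.

```python
def align_up(value, multiple=16):
    """Round value up to the next integer multiple of `multiple` (assumed float16 alignment)."""
    return -(-value // multiple) * multiple


def depthwise_conv_within_boundary(kh, kw, ci, co, is_1h8=False):
    """Check the co % ci rule and the kernel-size limits of conv_depthwise."""
    if ci <= 0 or co <= 0 or co % ci != 0:
        return False
    ratio = co // ci
    limit = 4095 if is_1h8 else 8190
    if ratio == 1:
        cost = (kh * kw * 3 + 2) * ratio * 16
    else:
        cost = (kh * kw * 2 + align_up(kh * kw) * 2 + 1) * ratio * 16
    return cost < limit


def conv_output_size(size_in, pad, kernel, dilation, stride):
    """Output height or width of the convolution operator, per the formula in the table."""
    if size_in < dilation * (kernel - 1) + 1:
        raise ValueError("input is smaller than the dilated kernel")
    effective_kernel = (kernel - 1) * (dilation - 1) + kernel
    # Assumption: integer division that rounds down.
    return (size_in + pad - effective_kernel) // stride + 1


if __name__ == "__main__":
    print(depthwise_conv_within_boundary(3, 3, 32, 32))   # 3 x 3 kernel, co/ci = 1
    print(depthwise_conv_within_boundary(14, 14, 8, 8))   # kernel too large
    print(conv_output_size(224, 6, 7, 1, 2))              # 7 x 7 kernel, stride 2
```

With these assumptions, a 3 x 3 depthwise kernel passes the check, a 14 x 14 kernel does not, and a 224-pixel input with a 7 x 7 kernel, total padding 6, and stride 2 produces 112 output pixels.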

2.2 TensorFlow Operator Boundary

No. Python API C++ API Boundary

1 tf.nn.avg_pool AvgPool max(kernel_shape.h, kernel_shape.w) *

min(input_shape.c, 256) +

min(output_shape.c, 256) < 130880, where

kernel_shape indicates the shape of the

kernel, input_shape indicates the shape of

the input, output_shape indicates the shape

of the output, min indicates the minimum

value, and max indicates the maximum

value.

2 tf.nn.max_pool MaxPool max(kernel_shape.h, kernel_shape.w) *

min(input_shape.c, 256) +

min(output_shape.c, 256) < 130880, where

kernel_shape indicates the shape of the

kernel, input_shape indicates the shape of

the input, output_shape indicates the shape

of the output, min indicates the minimum

value, and max indicates the maximum

value.

3 tf.nn.conv2d Conv2D input_shape.h >= kernel_shape.h &&

input_shape.w >= kernel_shape.w

That is, the input height must be greater than or equal to the kernel height, and the input width must be greater than or equal to the kernel width.

4 tf.concat Concat input_dims() <= 4;

For the four dimensions of output data, the

value of the dimension that is concatenated

must be equal to the sum of the dimension

values of all input data. The other three

dimensions are the same as the three

dimensions corresponding to the input data.

5 tf.matmul MatMul transpose_a == false &&

transpose_b == false

hi < 15; wi < 15

6 tf.nn.fused_batch_norm FusedBatchNorm mean/variance shape = (1, ci, 1, 1)

7 tf.abs Abs The input and output shapes must be the

same.

8 tf.image.resize_images(ResizeMethod.NEAREST_NEIGHBOR) ResizeNearestNeighbor ni = no, ci = co

9 tf.image.resize_images(ResizeMethod.BILINEAR) ResizeBilinear ni = no, ci = co

align_corners is not supported.

10 tf.cast Cast The restrictions on tf.cast are the same as those of the data types involved. For example, if the input value is fp16, the maximum value is 65504.

The following data type conversions are supported:

CAST_FLOAT32_TO_UINT8

CAST_UINT8_TO_FLOAT32

CAST_INT8_TO_FLOAT16

CAST_FIX8_TO_FLOAT16

CAST_FLOAT16_TO_FIX8

CAST_FLOAT16_TO_FLOAT32

CAST_FLOAT32_TO_FLOAT16

CAST_INT16_TO_FLOAT16

CAST_FLOAT16_TO_INT16

11 tf.nn.depthwise_conv2d DepthwiseConv2dNative

The value of co must be a positive integer multiple of ci.

The requirements in the case of co/ci = 1 are as follows:

(kh * kw * 3 + 2) * co/ci * 16 < 8190

For 1H8, (kh * kw * 3 + 2) * co/ci * 16 < 4095

The requirements in the case of co/ci > 1 are as follows:

(kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 8190

For 1H8, (kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 4095

The input and weight values of the fix8 type are not supported.

12 tf.sparse_matmul SparseMatMul transpose_a == false &&

transpose_b == false &&

a_is_sparse == false

hi < 15; wi < 15

13 tf.reshape Reshape input.dims() <= 4 && output.dims() <= 4

int ni, ci, hi, wi, no, co, ho, wo;


14 tf.squeeze Squeeze input.dims() <= 4

15 tf.expand_dims ExpandDims input.dims() <= 3 && output.dims() <= 4

16 tf.nn.mlu_conv2d MluConv2D input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

17 tf.nn.fused_conv2d_bias FusedConv2DBias input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

18 tf.nn.fix8_conv2d Fix8Conv2D input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

19 tf.nn.fix8_conv2d_bias Fix8Conv2DBias input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

20 tf.nn.first_conv2d FirstConv2D input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

21 tf.nn.first_conv2d_bias FirstConv2DBias input_shape.h >= kernel_shape.h && input_shape.w >= kernel_shape.w

22 tf.nn.mlu_depthwise_conv2d_native MluDepthwiseConv2dNative

The value of co must be a positive integer multiple of ci.

The requirements in the case of co/ci = 1 are as follows:

(kh * kw * 3 + 2) * co/ci * 16 < 8190

For 1H8, (kh * kw * 3 + 2) * co/ci * 16 < 4095

The requirements in the case of co/ci > 1 are as follows:

(kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 8190

For 1H8, (kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci * 16 < 4095

The input and weight values of the fix8 type are not supported.

23 tf.greater Greater No restrictions

24 tf.less_equal / No restrictions

25 tf.nn.relu Relu The input and output shapes must be the

same.

26 tf.nn.relu6 Relu6 The input and output shapes must be the

same.

27 tf.contrib.keras.layers.LeakyRelu / The input and output shapes must be the same.

28 tf.exp exp The input and output shapes must be the

same. For the input data, ensure that the

output data does not exceed 65504 specified

by fp16. That is, the input data cannot be

greater than 11.

29 tf.nn.conv2d_transpose Conv2DBackpropInput

no = ni;

ho = (hi – 1) * sh + kh – hu – hd;

wo = (wi – 1) * sw + kw – wl – wr;

hu, hd, wl, and wr indicate the pad sizes in

the upper, lower, left, and right positions,

respectively.

30 tf.sigmoid Sigmoid The input and output shapes must be the

same.

31 tf.add Add The shapes of input 1 and input 2 are n1, c1,

h1, w1 and n2, c2, h2, w2, respectively. The

shape of input 2 supports the following three

cases:

(1) n2 = n1, c2 = c1, h2 = h1, w2 = w1;

(2) n2 = 1, c2 = c1, h2 = 1, w2 = 1;

(3) n2 = 1, c2 = 1, h2 = 1, w2 = 1;

The positions of input 1 and input 2 can be

exchanged.

32 tf.add_n AddN No restrictions

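33 tf.multiply Multiply The shapes of input 1 and input 2 are n1, c1, h1, w1 and n2, c2, h2, w2, respectively. The shape of input 2 supports the following three cases:

(1) n2 = n1, c2 = c1, h2 = h1, w2 = w1;

(2) n2 = 1, c2 = c1, h2 = 1, w2 = 1;

(3) n2 = 1, c2 = 1, h2 = 1, w2 = 1;

The positions of input 1 and input 2 can be exchanged.

34 tf.subtract Subtract The shapes of input 1 and input 2 are n1, c1, h1, w1 and n2, c2, h2, w2, respectively. The shape of input 2 supports the following three cases:

(1) n2 = n1, c2 = c1, h2 = h1, w2 = w1;

(2) n2 = 1, c2 = c1, h2 = 1, w2 = 1;

(3) n2 = 1, c2 = 1, h2 = 1, w2 = 1;

The positions of input 1 and input 2 can be exchanged.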

35 tf.nn.bias_add BiasAdd No restrictions

36 tf.nn.lrn LRN depth_radius is an integer within the range

of [1, 7].

37 tf.where Select The shapes of the three inputs and the output must be the same. Input A and input B are the two data sets to select from, and input C is the selector. Each value of input C can only be 1 or 0 (true or false).

38 tf.summary.merge Merge No restrictions

39 tf.nn.elu Elu The input and output shapes must be the

same.

40 tf.rsqrt Rsqrt The input data must be greater than 0. The

input and output shapes must be the same.

41 tf.log Log The input and output shapes must be the

same.

42 tf.tanh Tanh The input and output shapes must be the

same.

43 tf.slice Slice No restrictions

44 tf.split Split/SplitV No restrictions

45 tf.floor Floor The input and output shapes must be the

same.

46 tf.contrib.keras.backend.switch Switch No restrictions

47 tf.identity Identity No restrictions

48 tf.nn.softplus Softplus The input and output shapes must be the

same.

49 tf.nn.softsign Softsign The input and output shapes must be the

same.

50 tf.pad Pad, PadV2 Only mode = 'CONSTANT' is supported.

51 tf.contrib.rnn.LSTMCell / See the restrictions in the TensorFlow

document.

52 tf.contrib.rnn.GRUCell / See the restrictions in the TensorFlow document.

53 tf.contrib.rnn.GRUBlockCell GRUBlockCell See the restrictions in the TensorFlow document.

54 tf.contrib.rnn.LSTMBlockCell LSTMBlockCell See the restrictions in the TensorFlow document.

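55 tf.fake_quant_with_min_max_args FakeQuantWithMinMaxArgs The input and output shapes must be the same.

56 tf.fake_quant_with_min_max_vars FakeQuantWithMinMaxVars The input and output shapes must be the same.

57 tf.fake_quant_with_min_max_vars_per_channel FakeQuantWithMinMaxVarsPerChannel The input and output shapes must be the same.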

58 tf.nn.fractional_avg_pool FractionalAvgPool kh = (int)(ratio_row + 0.5);

kw = (int)(ratio_col + 0.5);

ratio_row < hi;

ratio_col < wi;

c * (kh + 2) * (kw + 1) * 16 < 8191

For 1H8, the size is halved to c * (kh + 2) *

(kw + 1) * 16 < 4095.

59 tf.nn.fractional_max_pool FractionalMaxPool kh = (int)(ratio_row + 0.5);

kw = (int)(ratio_col + 0.5);

ratio_row < hi;

ratio_col < wi;

c * (kh + 2) * (kw + 1) * 16 < 8191

For 1H8, the size is halved to c * (kh + 2) *

(kw + 1) * 16 < 4095.

60 tf.nn.log_softmax LogSoftmax The input and output shapes must be the

same.

61 tf.contrib.layers.flatten Reshape Input data dimension ≤ 4

62 tf.reduce_max Max No restrictions

63 tf.strided_slice StridedSlice No restrictions

64 tf.contrib.rnn.BasicRNNCell / See the restrictions in the official TensorFlow

document.

65 tf.contrib.rnn.BidirectionalGridLSTMCell / See the restrictions in the TensorFlow document.

66 tf.contrib.rnn.MultiRNNCell / See the restrictions in the official TensorFlow

document.

67 tf.nn.static_rnn / See the restrictions in the official TensorFlow

document.

68 tf.reverse_sequence ReverseSequence No restrictions

69 tf.realdiv RealDiv No restrictions

70 tf.stack Pack No restrictions

71 tf.unstack Unpack No restrictions

72 tf.contrib.rnn.BasicLSTMCell / No restrictions

73 tf.transpose Transpose No restrictions

74 tf.space_to_batch_nd SpaceToBatchND hi % h_block_size == 0;

wi % w_block_size == 0;

75 tf.batch_to_space_nd BatchToSpaceND ni % (h_block_size * w_block_size) == 0;

76 tf.xw_plus_b MLP transpose_a == false &&

transpose_b == false

hi < 15; wi < 15

77 tf.fix8_mlp Fix8MLP hi < 15; wi < 15

78 tf.fix8_matmul Fix8MatMul hi < 15; wi < 15
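
The shape rule shared by tf.add, tf.multiply, and tf.subtract (items 31, 33, and 34) can be expressed as a small check. This is an illustrative sketch only: the helper names are hypothetical, and shapes are assumed to be given as (n, c, h, w) tuples.

```python
def second_input_case(shape1, shape2):
    """Return 1, 2, or 3 for the supported case that shape2 matches, or None."""
    n1, c1, h1, w1 = shape1
    if tuple(shape2) == (n1, c1, h1, w1):
        return 1  # identical shapes
    if tuple(shape2) == (1, c1, 1, 1):
        return 2  # one value per channel
    if tuple(shape2) == (1, 1, 1, 1):
        return 3  # a single scalar
    return None


def elementwise_inputs_supported(shape_a, shape_b):
    """The positions of input 1 and input 2 can be exchanged, so try both orders."""
    return (second_input_case(shape_a, shape_b) is not None
            or second_input_case(shape_b, shape_a) is not None)


if __name__ == "__main__":
    print(second_input_case((2, 16, 8, 8), (1, 16, 1, 1)))
    print(elementwise_inputs_supported((2, 16, 8, 8), (2, 16, 1, 1)))
```

Note that a second input of shape (2, 16, 1, 1) is rejected here: the per-channel case requires n2 = 1, which is stricter than general TensorFlow broadcasting.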

2.3 AndroidNN Operator Boundary

The values of n, c, h, and w of all operators must be greater than or equal to 1, and n * c * h * w must be

less than or equal to 256M (indicating the number of data records instead of bytes).
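
Because the limit above counts elements rather than bytes, it does not depend on the data type. A minimal sketch of the check follows. The function name is hypothetical, and it assumes that 256M means 256 * 2^20 elements; the document does not state whether M denotes 2^20 or 10^6.

```python
MAX_ELEMENTS = 256 * 1024 * 1024  # assumption: "256M" = 256 * 2**20 elements


def tensor_within_limit(n, c, h, w, max_elements=MAX_ELEMENTS):
    """Each dimension must be >= 1 and the element count must not exceed the limit."""
    if min(n, c, h, w) < 1:
        return False
    return n * c * h * w <= max_elements


if __name__ == "__main__":
    print(tensor_within_limit(1, 3, 224, 224))
    print(tensor_within_limit(2, 256, 1024, 1024))
```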

No. Operation NPU_interface Parameter Limit Boundary

1 add NPU_ADD int n1, c1, h1, w1, n2, c2, h2, w2, no, co, ho, wo;

n1, c1, h1, and w1 are the first input dimensions; n2, c2, h2, and w2 are the second input dimensions; no, co, ho, and wo are output dimensions.

The shape of input 1 is the same as that

of output.

The shape of input 2 supports the

following three cases:

(1) n2 = n1, c2 = c1, h2 = h1, w2 = w1;

(2) n2 = 1, c2 = c1, h2 = 1, w2 = 1;

(3) n2 = 1, c2 = 1, h2 = 1, w2 = 1;

2 mul NPU_MUL int n1, c1, h1,

w1, n2, c2, h2,

w2, no, co, ho,

wo;

n1, c1, h1, and w1 are the first input

dimensions; n2, c2, h2, and w2 are the

second input dimensions; no, co, ho, and

wo are output dimensions.

The shape of input 1 is the same as that

of output.

The shape of input 2 supports the

following three cases:

(1) n2 = n1, c2 = c1, h2 = h1, w2 = w1;

(2) n2 = 1, c2 = c1, h2 = 1, w2 = 1;

(3) n2 = 1, c2 = 1, h2 = 1, w2 = 1;

3 relu NPU_RELU int ni, ci, hi,wi; ni, ci, hi, and wi are the input

dimensions. The shape of input is the

same as that of output.

4 relu1 NPU_RELU1 int ni, ci, hi,wi; ni, ci, hi, and wi are the input

dimensions.

The shape of input is the same as that of

output.

5 relu6 NPU_RELU6 int ni, ci, hi, wi; ni, ci, hi, and wi are the input dimensions.

The shape of input is the same as that of output.

6 tanh NPU_TANH int ni, ci, hi, wi; ni, ci, hi, and wi are the input dimensions.

The shape of input is the same as that of output.

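7 concat NPU_CONCATENATION Input data dimensions and output data dimensions; int axis;

The input data has multiple dimensions.

(1) One-dimensional (c): Indicates that concatenation is performed based on the c dimension.

(2) Two-dimensional (n, c):

When axis is 0, it indicates that concatenation is performed based on the n dimension.

When axis is 1, it indicates that concatenation is performed based on the c dimension.

(3) Three-dimensional (n, h, c):

When axis is 0, it indicates that concatenation is performed based on the n dimension.

When axis is 1, it indicates that concatenation is performed based on the h dimension.

When axis is 2, it indicates that concatenation is performed based on the c dimension.

(4) Four-dimensional (n, h, w, c):

When axis is 0, it indicates that concatenation is performed based on the n dimension.

When axis is 1, it indicates that concatenation is performed based on the h dimension.

When axis is 2, it indicates that concatenation is performed based on the w dimension.

When axis is 3, it indicates that concatenation is performed based on the c dimension.

For the output data dimensions, the value of the dimension that is concatenated must be equal to the sum of the input dimension values. For other dimensions, the input data dimension is equal to the output data dimension.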

8 conv2d NPU_CONV_2D int ni, ci, hi,

wi, no, co, ho,

wo, pl, pr, pt,

pb, sh, sw;

ni, ci, hi, and wi are input dimensions;

no, co, ho, and wo are output

dimensions; pl, pr, pt, and pb are the

sizes of the left, right, upper, and lower

pads to be supplemented; sh and sw are

the sliding dimensions of the sliding

window. no = ni;

9 conv_depthwise NPU_DEPTHWISE_CONV_2D int ni, ci, hi, wi, no, co, ho, wo, pl, pr, pt, pb, sh, sw, multiplier;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions; pl, pr, pt, and pb are the sizes of the left, right, upper, and lower pads to be supplemented; sh and sw are the sliding dimensions of the sliding window.

The value of co must be a positive integer multiple of ci. The value of multiplier must be able to exactly divide ci.

kh and kw are the calculated size of the convolution kernel.

The requirements in the case of co/ci = 1 are as follows:

(kh * kw * 3 + 2) * co/ci < 131040

The requirements in the case of co/ci > 1 are as follows:

(kh * kw * 2 + align(kh * kw) * 2 + 1) * co/ci < 131040




10 mlp NPU_FULLY_CONNECTED int ni, ci, hi, wi, no, co, ho, wo;

ni, ci, hi, and wi are the first input dimensions. no, co, ho, and wo are output dimensions.

hi < 15; wi < 15

11 lrn NPU_LOCAL_RESPONSE_NORMALIZATION int ni, ci, hi, wi, radius; float alpha, beta, bias;

ni, ci, hi, and wi are input dimensions. The value of radius is within the range of [1, 7].

12 avg_pool NPU_AVERAGE_POOL_2D int ni, ci, hi, wi, no, co, ho, wo, kh, kw, sh, sw, pl, pr, pt, pb;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions; kh and kw are the height and width of the pooling kernel; sh and sw are the sliding dimensions of the sliding window; pl, pr, pt, and pb are the sizes of the pads to be supplemented.

no = ni; co = ci;

(max (min(kh, hi), min(kw, wi)) + 1) * min(ci, 256) < 130976

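13 max_pooling NPU_MAX_POOL_2D int ni, ci, hi, wi, no, co, ho, wo, kh, kw, sh, sw, pl, pr, pt, pb;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions; kh and kw are the height and width of the pooling kernel; sh and sw are the sliding dimensions of the sliding window; pl, pr, pt, and pb are the sizes of the pads to be supplemented.

no = ni; co = ci;

(max (min(kh, hi), min(kw, wi)) + 1) * min(ci, 256) < 131040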

14 reshape NPU_RESHAPE int ni, ci, hi, wi, no, co, ho, wo;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions.

15 softmax NPU_SOFTMAX int ni, ci, hi, wi; ni, ci, hi, and wi are input dimensions.

The input and output shapes must be the same.

16 svdf NPU_SVDF int batch_size,

input_size,

num_units,

memory_size,

rank;

The value of rank is less than or equal

to 2.

17 basic_rnn NPU_RNN int ni, ci, no, co; ni and ci are input dimensions; no and

co are output dimensions.

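18 l2_pool NPU_L2_POOL_2D int ni, ci, hi, wi, no, co, ho, wo, kh, kw, sh, sw, pl, pr, pt, pb;

ni, ci, hi, and wi are the first input dimensions; no, co, ho, and wo are output dimensions; kh and kw are the height and width of the pooling kernel; sh and sw are the sliding dimensions of the sliding window; pl, pr, pt, and pb are the sizes of the pads to be supplemented.

ni = no, ci = co;

(max (min(kh, hi), min(kw, wi)) + 1) * min(ci, 256) < 130976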

19 resize_bilinear NPU_RESIZE_BILINEAR int ni, ci, hi, wi, no, co, ho, wo; int interp_h, interp_w;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions.

ni = no, ci = co;

interp_h > 0 && interp_w > 0

ho = interp_h,

wo = interp_w;

20 l2_normal NPU_L2_NORMALIZATION int ni, ci, hi, wi; ni, ci, hi, and wi are input dimensions.

The value of ci is within the range of [1, 7].

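21 Floor NPU_FLOOR int ni, ci, hi, wi, no, co, ho, wo;

ni, ci, hi, and wi are input dimensions; no, co, ho, and wo are output dimensions.

The input and output shapes must be the same.

22 dequantize NPU_DEQUANTIZE int ni, ci, hi, wi;

The input and output shapes must be the same.

23 logistic NPU_LOGISTIC int ni, ci, hi, wi;

ni, ci, hi, and wi are input dimensions.

The input and output shapes must be the same.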

24 lstm NPU_LSTM int ni, ci, nc,

cc, nr, cr;

int no, co, nco,

cco;

float cell_clip,

project_clip;

ni and ci are two input dimensions; nc

and cc are the dimensions of cell_input;

nr and cr are dimensions of recurrent;

no and co are output dimensions; nco

and cco are the dimensions of

cell_output.

cell_clip performs the clip operation on

cell_out, and project_clip performs the

clip operation on the state_out.

ni = no, co = cr; nco = nc, cco = cc;

The dimensions of i2i/i2f/i2c/i2o_filter

are (cc, ci).

The dimensions of r2i/r2f/r2c/r2o_filter

are (cc, co).

The dimensions of c2i/c2f/c2o_filter are

(cc, cc).

The dimensions of

input/output/forget/cell_bias are (1,

cc).

The dimensions of project_filter are

(co, cs).

The dimensions of project_bias are (1,

co).

In the parameters, filter and bias may be

null.

use_cifg, use_peephole,

use_projection_weight, and

use_projection_bias are four bool

variables. The options are as follows:

use_cifg = IsNullInput(i2i_filter);

use_peephole = !IsNullInput(c2o_filter);

use_projection_weight

= !IsNullInput(projection_filter);

use_projection_bias

= !IsNullInput(projection_bias);
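
The four flags above depend only on which optional LSTM inputs are absent. The following sketch restates that logic in Python. It is not the DDK implementation: a null input is modelled as None, is_null_input is a hypothetical stand-in for IsNullInput, and the dictionary keys simply reuse the input names from the expressions above.

```python
def is_null_input(tensor):
    """Hypothetical stand-in for IsNullInput: an absent optional input is None."""
    return tensor is None


def lstm_flags(inputs):
    """Derive the four bool variables from a dict of optional LSTM inputs."""
    return {
        "use_cifg": is_null_input(inputs.get("i2i_filter")),
        "use_peephole": not is_null_input(inputs.get("c2o_filter")),
        "use_projection_weight": not is_null_input(inputs.get("projection_filter")),
        "use_projection_bias": not is_null_input(inputs.get("projection_bias")),
    }


if __name__ == "__main__":
    # Input-to-input weights and peephole weights present, no projection layer.
    print(lstm_flags({"i2i_filter": [0.1], "c2o_filter": [0.2]}))
```

Note the asymmetry: use_cifg is true when i2i_filter is missing, while the other three flags are true when their input is present.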
