Execution Strategy


When evaluating a model, you can compute the results either in parallel (as if intra-model parallelism were achievable) or serially. The --strategy={parallel|serial} option controls this. For example, to find the execution flow for parallel execution, you would use:

benanza benchinfo --model_path //BVLC_GoogleNet --benchmark_database db.json --short=false --batch_size=1 --human=true --strategy=parallel --show --highlight_fast_path

This outputs a graph along with per-layer information, with the red path being the path of minimal cost (i.e. the critical path).



[Figure: execution graph of BVLC_GoogleNet under the parallel strategy. Each node shows the layer name, the operator name and type (CONV, RELU, MXPL, LRN, CONC, AVGPL, DRP, RSHP, GEMM, SFT), and either its benchmarked latency or its output shape. Edges follow the data dependencies from conv1/7x7_s2_1 through the modules inception_3a to inception_5b and end at prob_1.]