
ICCV 2019 Submission #1585. CONFIDENTIAL REVIEW COPY. DO NOT DISTRIBUTE.

A. Supplementary Materials

A.1. Pseudocode for Line Vectorization

Algorithm 1 gives a more detailed description of line vectorization. The algorithm takes the C-junction set VC and the T-junction set VT as input and outputs a vectorized wireframe (V, E). In the first stage (Lines 2-3), we find the lines among C-junctions according to the line confidence function. The procedure Prune-Lines greedily removes the lines with the lowest confidence that either intersect other lines (Line 32) or are too close to other lines in terms of the polar angle (Line 35). In the second stage (Lines 4-25), we add the T-junctions into the wireframe. In Lines 6-14, we find the T-junctions that lie on the existing wireframe. We first adjust the positions of those T-junctions by projecting them onto the line (Line 9) and then add them to the candidate T-junction set V′ (Line 10). Because the degree of a T-junction is always one, we try to find the connection with the highest confidence for those candidate T-junctions (Lines 15-25). We repeat this process until V, V′, and E remain the same in the last iteration.
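The projection step (Line 9 of Algorithm 1) snaps a candidate T-junction onto the segment it lies near. An illustrative sketch of such a helper (not the paper's implementation), treating junctions as 2D numpy points:

```python
import numpy as np

def project_onto_segment(w, u, v):
    """Orthogonally project point w onto segment (u, v), clamped to it."""
    w, u, v = (np.asarray(p, float) for p in (w, u, v))
    d = v - u
    # Parameter t of the foot of the perpendicular along u -> v.
    t = np.dot(w - u, d) / np.dot(d, d)
    # Clamp so the result stays on the segment rather than its extension.
    return u + np.clip(t, 0.0, 1.0) * d
```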

A.2. Line Assignments for Vanishing Points

In Equation (4), we need to find the set of lines Ai ⊆ E corresponding to the vanishing point i. Mathematically, we define the objective function

min_A Σ_{i=1}^{3} Σ_{(u,v)∈A_i} ‖(u − V_i) × (u − v)‖²,

where ‖(·) × (·)‖ can be understood as the area of the parallelogram formed by the two vectors. Since each line contributes an independent term to this equation, we can solve the optimization problem by greedily assigning each line to the vanishing point i that minimizes its term of the objective function.
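Because the terms are independent, the greedy assignment can be sketched in a few lines. This is an illustrative sketch (numpy, 2D coordinates; the function and argument names are our own, not from the paper):

```python
import numpy as np

def assign_lines_to_vps(lines, vps):
    """Assign each line (u, v) to the vanishing point V_i minimizing
    the squared parallelogram area ||(u - V_i) x (u - v)||^2."""
    A = [[] for _ in vps]
    for u, v in lines:
        u, v = np.asarray(u, float), np.asarray(v, float)
        e = u - v
        costs = []
        for V in vps:
            d = u - np.asarray(V, float)
            # 2D cross product: signed area of the parallelogram on d, e.
            costs.append((d[0] * e[1] - d[1] * e[0]) ** 2)
        A[int(np.argmin(costs))].append((tuple(u), tuple(v)))
    return A
```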

A.3. Sampled Failure Cases and Discussions

Figure 7 demonstrates some failure cases on our SceneCity dataset. We found that our pipeline might not work well on scenes in which many lines and junctions are close to each other. This is because the resolution of the output heat map is 128 × 128, so any detail whose size is below two or three pixels might get lost during the vectorization stage. Therefore, one direction for future work is to explore the possibility of using higher-resolution input and output images. There are also issues in the 3D depth refinement stage. When the scene is complex, finding the assignment Ai for each line can be hard due to errors in the junction positions and line directions. In addition, the terms contributed by erroneous lines in Equation (4) can make the depth of some junctions inaccurate. Such problems might potentially be alleviated by increasing the resolution of the input and output images, using a more data-driven

Algorithm 1 Edge Vectorization Algorithm
Require: Candidate C-junction set VC, T-junction set VT.
Require: Hyper-parameters ηc and η◦.
Ensure: Wireframe (V, E).
 1: procedure Vectorize(VC, VT)
 2:   V ← VC
 3:   E ← Prune-Lines({(u, v) | u, v ∈ V, c(u, v) > ηc})
 4:   V′ ← ∅
 5:   while V, V′, or E changed in the last iteration do
 6:     for all w ∈ VT do
 7:       for all e = (u, v) ∈ E do
 8:         if w is near e then
 9:           project w onto the line e
10:           V′ ← V′ ∪ {w}
11:           break
12:         end if
13:       end for
14:     end for
15:     for all u ∈ V′ do
16:       v ← argmax_{v∈V∪V′} c(u, v)
17:       if c(u, v) ≥ ηc then
18:         V′ ← V′ \ {u}
19:         V ← V ∪ {u}
20:         E ← E ∪ {(u, v)}
21:       end if
22:     end for
23:     VT ← VT \ (V ∪ V′)
24:   end while
25:   E ← Prune-Lines(E)
26:   return (V, E)
27: end procedure
28: procedure Prune-Lines(E)
29:   sort E w.r.t. confidence values in descending order
30:   E′ ← ∅
31:   for all e ∈ E do
32:     if ∃e′ ∈ E′ : e intersects with e′ then
33:       continue
34:     end if
35:     if ∃e′ ∈ E′ : e′ ∩ e ≠ ∅ and ∠(e, e′) < η◦ then
36:       continue
37:     end if
38:     E′ ← E′ ∪ {e}
39:   end for
40:   return E′
41: end procedure
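As a concrete counterpart to the Prune-Lines pseudocode, the procedure can be sketched in Python. The confidence function, intersection test, and angle threshold here are caller-supplied stand-ins; this is a sketch of the greedy scheme, not the authors' code:

```python
import math

def seg_angle(e):
    """Polar angle of segment e = ((x1, y1), (x2, y2)), folded into [0, pi)."""
    (x1, y1), (x2, y2) = e
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def angle_gap(e1, e2):
    """Smallest angle between the directions of two segments."""
    d = abs(seg_angle(e1) - seg_angle(e2))
    return min(d, math.pi - d)

def prune_lines(edges, confidence, intersects, angle_thresh):
    """Greedily keep segments from highest to lowest confidence, skipping
    any that cross a kept segment (Line 32) or share an endpoint with one
    at an angle below angle_thresh (Line 35)."""
    kept = []
    for e in sorted(edges, key=confidence, reverse=True):
        if any(intersects(e, k) for k in kept):
            continue
        if any(set(e) & set(k) and angle_gap(e, k) < angle_thresh
               for k in kept):
            continue
        kept.append(e)
    return kept
```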



Figure 7: Failure cases on the SceneCity dataset. (Columns: ground truth, inferred 3D, novel views.)

method, designing a better objective function, or employing a RANSAC approach in those two stages.

A.4. Network Comparison

In this section, we discuss the differences between our network and the one in [12]. Our network is similar to their line detection network, with differences in the following respects:

1. In each hourglass module, they use two consecutive residual modules (RM) at each spatial resolution, while we use only one RM, resulting in fewer parameters per hourglass module. Note that our design is the same as in the original hourglass paper [23], which enables us to use more RM per hourglass module and reduce the computational complexity, since more computation is allocated to the lower-resolution stages. We adopt this design because we find that using two RM gives negligible performance gains compared with one.

2. We apply intermediate supervision to the stacked hourglass network. For each hourglass module, the loss term associated with its predicted heat maps is added to the final loss. [12] does not use such intermediate supervision in their method. We find this intermediate supervision vital on both our synthetic dataset and their 2D dataset, in terms of both accuracy and robustness.

3. We observe that using 2 stacked hourglass modules gives performance similar to that of the 5 stacked hourglass modules in [12]. On the other hand, using 2 stacks consumes less memory, which gives us more design flexibility. For example, we are able to use a larger batch size to make the gradients more stable.
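The intermediate supervision in item 2 amounts to summing a heat-map loss over every stack's prediction instead of only the last one. A minimal sketch (numpy, with a placeholder L2 loss standing in for the actual training loss, which the paper does not spell out here):

```python
import numpy as np

def l2_heatmap_loss(pred, target):
    """Placeholder per-map loss; the real training loss may differ."""
    return float(np.mean((pred - target) ** 2))

def supervised_loss(stack_preds, target, per_map_loss=l2_heatmap_loss):
    """Intermediate supervision: every hourglass stack's predicted heat
    map contributes a loss term, so earlier stacks receive a direct
    gradient signal instead of only the final stack being penalized."""
    return sum(per_map_loss(p, target) for p in stack_preds)
```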

