kccv 2014 - hanyangcvlab.hanyang.ac.kr/kccv/2014/kccv2014_program_booklet.pdf · 2014-08-22 ·...

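Honorary Chair: Sang Uk Lee (이상욱, Seoul National University)

Organizing Chair: Kyoung Mu Lee (이경무, Seoul National University)

Organizing Committee: Il Dong Yun (윤일동, Hankuk University of Foreign Studies), Bohyung Han (POSTECH), Seon Joo Kim (Yonsei University), Jongwoo Lim (Hanyang University), Kuk-Jin Yoon (GIST)

Organized by: College of Engineering, Seoul National University

Hosted by: Automation and Systems Research Institute (자동화시스템공동연구소); BK21 Plus Creative Information Technology Talent Development Program (BK21플러스 창의정보기술 인재양성사업단); Exovision Research Group (엑소비전 연구단)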

KCCV 2014

Korean Conference on Computer Vision 2014 (한국컴퓨터비전학술대회 2014)

http://kccv.or.kr/

Date: Monday, August 25, 2014, 09:50 – 17:30

Venue: Room 118, Building 301 (New Engineering Building), College of Engineering, Seoul National University

Program (Monday, August 25, 2014)

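9:50-10:00 Opening Remarks

Prof. Kyoung Mu Lee (Seoul National University)

10:00-10:10 Congratulatory Address

Prof. Sang Uk Lee (Seoul National University)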

Session 1: Segmentation and Recognition

10:10-10:40 Sequential Convex Relaxation for Mutual Information-Based Unsupervised Figure-Ground Segmentation

Prof. Junmo Kim (KAIST)

10:40-11:10 Tracking Using Motion Estimation With Physically Motivated Inter-Region Constraints

Prof. Byung-Woo Hong (Chung-Ang University)

11:10-11:40 Generalized Background Subtraction using Superpixels with Label Integrated Motion Estimation

Prof. Jongwoo Lim (Hanyang University)

11:40-12:10 A Unified Semantic Model with Discriminative / Generative Tradeoff

Prof. Sung Ju Hwang (UNIST)

12:10-14:00 Lunch

Session 2: Computational Photography and Geometry

14:00-14:30 Color Transfer Using Probabilistic Moving Least Squares

Prof. Seon Joo Kim (Yonsei University)

14:30-15:00 Photometric methods for Geometry Refinement

Prof. Yu-Wing Tai (KAIST)

15:00-15:30 Global Search for Rotation and Focal Length

Prof. Yongduek Seo (Sogang University)

15:30-15:50 Coffee Break

Session 3: Visual Tracking

15:50-16:20 Beyond Chain Models for Visual Tracking: A Trilogy

Prof. Bohyung Han (POSTECH)

16:20-16:50 Visual Tracking Using Patch based Appearance Models

Prof. Jae-Young Sim (UNIST)

16:50-17:20 Robust Online Multi-Object Tracking with Track Confidence and Online Appearance Learning

Prof. Kuk-Jin Yoon (GIST)

17:20-17:30 Closing Remarks

Talk 1

Sequential Convex Relaxation for Mutual Information-Based

Unsupervised Figure-Ground Segmentation

Youngwook Kee and Junmo Kim

Dept. of EE, KAIST

E-mail: [email protected], [email protected]

The unsupervised segmentation of figure and ground is inherently a chicken-and-egg problem: where is the object, and what distinguishes it from the background? A common assumption in image segmentation is that the object and background have different color distributions. Yet if these distributions are completely unknown and overlap considerably, the joint estimation of color distributions and segmentation becomes a major algorithmic challenge. In this work, we propose an optimization algorithm [1] that jointly estimates the color distributions of the foreground and background and separates them based on their mutual information [2] with geometric regularity. The algorithm constructs a sequence of convex upper bounds for the mutual information-based nonconvex energy and efficiently minimizes it on its relaxed domain. Accordingly, we attain high-quality solutions for challenging unsupervised figure-ground segmentation problems. Figure 1 shows two compelling examples in which the proposed method separates a foreground and background that humans cannot tell apart.

Figure 1. Synthetic image experiments for visualization of the intensity distribution separation. Case 1 (top row and the four leftmost PDFs): the object (white) and the background (black) of the "CVPR'14" image are associated with a unimodal and a bimodal distribution, respectively, with the same mean and variance. Case 2 (second row and the remaining PDFs): the object and the background are associated with two Rayleigh distributions whose even central moments are all the same.
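As a rough illustration of the quantity behind Case 1, the sketch below estimates the mutual information between a binary label and quantized pixel intensities from empirical frequencies. The function name `mutual_information` and the toy intensities are assumptions made for illustration; this is not the authors' implementation, which minimizes a convex relaxation of an MI-based energy rather than scoring fixed labelings.

```python
import math
from collections import Counter

def mutual_information(labels, values):
    """Estimate I(L; V) in bits from paired samples using empirical frequencies."""
    n = len(labels)
    joint = Counter(zip(labels, values))
    label_counts = Counter(labels)
    value_counts = Counter(values)
    mi = 0.0
    for (label, value), count in joint.items():
        p_joint = count / n
        # p(l, v) / (p(l) p(v)) written with raw counts
        ratio = p_joint * n * n / (label_counts[label] * value_counts[value])
        mi += p_joint * math.log2(ratio)
    return mi

# Toy intensities (hypothetical): both regions have mean 2 and variance 1,
# but the object is unimodal and the background is bimodal.
object_vals = [0, 2, 2, 2, 2, 2, 2, 4]
background_vals = [1, 3, 1, 3, 1, 3, 1, 3]
values = object_vals + background_vals

true_labels = [1] * 8 + [0] * 8            # object vs. background
wrong_labels = [1, 1, 1, 1, 0, 0, 0, 0] * 2  # a labeling that mixes both regions

mi_true = mutual_information(true_labels, values)
mi_wrong = mutual_information(wrong_labels, values)
print(mi_true, mi_wrong)
```

In this toy case the correct labeling yields a higher mutual information than the mixed one, even though the two regions share the same mean and variance.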

References:

[1] Y. Kee, M. Souiai, D. Cremers, and J. Kim. Sequential Convex Relaxation for Mutual Information-Based Unsupervised Figure-Ground Segmentation. In CVPR, 2014.

[2] J. Kim, J. W. Fisher III, A. Yezzi, M. Cetin, and A. S. Willsky. A Nonparametric Statistical Method for Image Segmentation Using Information Theory and Curve Evolution. IEEE Trans. Image Process., 14(10): 1486-1502, 2005.

Talk 2

Tracking Using Motion Estimation

with Physically Motivated Inter-Region Constraints

Omar Arif(1), Ganesh Sundaramoorthi(1), Byung-Woo Hong(2), Anthony Yezzi(3)

(1)KAUST, Saudi Arabia (2)Chung-Ang University, Korea

(3)Georgia Institute of Technology, U.S.A.

We propose a method for tracking structures (e.g., ventricles and myocardium) in cardiac images (e.g., magnetic resonance) by propagating a previous estimate of the structures forward in time using a new, physically motivated motion estimation scheme. Our method estimates motion by regularizing only within structures, so that differing motions among different structures are not mixed. It simultaneously satisfies two physical constraints at the interface between a fluid and a medium: the normal component of the fluid's motion must match the normal component of the medium's motion, and the no-slip condition, which states that the tangential velocity approaches zero near the interface, must hold. We show that these conditions lead to PDEs with Robin boundary conditions at the interface, which couple the motion between structures. We show that propagating a segmentation across frames using our motion estimation scheme leads to more accurate segmentation than traditional motion estimation that does not use physical constraints. Our method is suited to interactive segmentation, prominently used in commercial applications for cardiac analysis, where segmentation propagation is used to predict a segmentation in the next frame. We show that our method leads to more accurate predictions than a popular and recent interactive method used in cardiac segmentation.

Figure 1. Tracking the Left Ventricle on cardiac MRI images

To appear in IEEE Transactions on Medical Imaging

Talk 3

Generalized Background Subtraction using Superpixels with Label

Integrated Motion Estimation

Jongwoo Lim

Div. of Computer Science & Engineering, Hanyang University

E-mail: [email protected]

In this talk, I present an online background subtraction algorithm with superpixel-based density estimation for videos captured by a moving camera. Our algorithm maintains appearance and motion models of foreground and background for each superpixel, computes foreground and background likelihoods for each pixel based on the models, and determines pixelwise labels using binary belief propagation. The estimated labels trigger the update of appearance and motion models, and the above steps are performed iteratively in each frame. After convergence, appearance models are propagated through sequential Bayesian filtering, where predictions rely on motion fields of both labels whose computation exploits the segmentation mask. Superpixel-based modeling and label integrated motion estimation make propagated appearance models more accurate than those of existing methods, since the models are constructed on visually coherent regions and the quality of estimated motion is improved by avoiding motion smoothing across regions with different labels. We evaluate our algorithm on challenging video sequences and demonstrate significant performance improvement over state-of-the-art techniques, both quantitatively and qualitatively. This is joint work with Prof. Bohyung Han (POSTECH).

Fig 1. Overview of our proposed algorithm.

Fig 2. Visualization of learned foreground and background models.

References:

[1] Jongwoo Lim and Bohyung Han, "Generalized Background Subtraction using Superpixels with Label Integrated Motion Estimation," ECCV, 2014 (to appear).

Talk 4

A Unified Semantic Model with Discriminative / Generative Tradeoff

Sung Ju Hwang

Dept. of Electrical and Computer Engineering, UNIST

[email protected]

For many years, researchers have sought ways to leverage external semantic knowledge about the world to aid object categorization, with attributes and taxonomies being the two most popular semantic sources. In my prior work [1,2,3], I showed how to translate such semantic knowledge into structural constraints between category models, to regularize the learning of a discriminative categorization model. However, each proposed method utilized only one of the two types of semantic entities, and there was no unified model that leverages and relates them both.

In this talk, I will present a method that utilizes both attributes and taxonomy, and defines the relationship between the two types of entities. The proposed method, Unified Semantic Embedding (USE), learns a single unified semantic space in which object categories, attributes, and supercategories are explicitly embedded. Such a semantic space, which embeds heterogeneous semantic entities as vectors, enables each entity to be represented as a combination of other types of semantic entities.

Specifically, we can define each object category as a combination of a supercategory plus a sparse combination of attributes. We explicitly enforce this relationship between the semantic entities in the learned space, with an additional exclusivity regularization to learn a discriminative composition for each object category.

The proposed reconstructive regularization guides the discriminative learning process to learn a better generalizing model, and also generates a compact semantic description of each category, which enables humans to analyze what has been learned. We validate our method on the Animals with Attributes dataset for categorization performance and qualitative analysis, which shows that our method is able to improve classification performance while learning a discriminative semantic decomposition of each category.

Figure 1. Concept. We regularize each category to be a supercategory + a sparse combination of attributes, which allows the learned model to describe any object category in a compact semantic description, e.g. tiger = striped feline.
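The "supercategory + sparse combination of attributes" idea can be sketched with toy vectors as follows. The embeddings, the helper names `compose` and `reconstruction_error`, and the hand-picked weights are all hypothetical; in the method described above the embeddings and weights are learned, and the exclusivity regularization is not modeled here.

```python
def compose(supercategory, attributes, weights):
    """Return supercategory + sum of weight * attribute embedding."""
    out = list(supercategory)
    for name, weight in weights.items():
        for i, value in enumerate(attributes[name]):
            out[i] += weight * value
    return out

def reconstruction_error(category, supercategory, attributes, weights):
    """Squared distance between a category and its semantic reconstruction."""
    approx = compose(supercategory, attributes, weights)
    return sum((c - a) ** 2 for c, a in zip(category, approx))

# Hypothetical 3-D embeddings; real embeddings would be learned from data.
feline = [1.0, 0.0, 0.5]
attributes = {"striped": [0.0, 1.0, 0.0], "spotted": [0.0, 0.0, 1.0]}
tiger = [1.0, 1.0, 0.5]

# Sparse weights: only attributes listed here are nonzero.
err_striped = reconstruction_error(tiger, feline, attributes, {"striped": 1.0})
err_spotted = reconstruction_error(tiger, feline, attributes, {"spotted": 1.0})
print(err_striped, err_spotted)
```

With these toy vectors, "striped feline" reconstructs the tiger embedding exactly while "spotted feline" does not, which is the kind of compact description the caption refers to.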

References

[1] Semantic Kernel Forests from Multiple Taxonomies, S. J. Hwang, K. Grauman, and F. Sha, NIPS 2012

[2] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, K. Grauman, and F. Sha, NIPS 2011

[3] Sharing Features Between Objects and Their Attributes, S. J. Hwang, F. Sha, and K. Grauman, CVPR 2011

Talk 5

Color Transfer Using Probabilistic Moving Least Squares

Seon Joo Kim

Dept. of Computer Science, Yonsei University

[email protected]

In this talk, I will introduce a new method for color transfer, the process of changing the colors of one image to match those of another image of the same scene. The color of a scene may vary from image to image because the photographs are taken at different times, with different cameras, and under different camera settings. To solve for a full nonlinear and nonparametric color mapping in the 3D RGB color space, we propose a scattered point interpolation scheme using moving least squares and strengthen it with probabilistic modeling of the color transfer in the 3D color space to deal with misalignments and noise. Experiments show the effectiveness of our method over previous color transfer methods both quantitatively and qualitatively. In addition, our framework can be applied to various instances of color transfer, such as transferring color between different camera models, camera settings, and illumination conditions, as well as to video color transfer. This work was done in collaboration with Dr. Youngbae Hwang from KETI, and Joon-Young Lee and Professor In So Kweon from KAIST.

Figure 1. Changes in image color due to different factors. The images in the third column are our results of transferring the color of the images in the first column to that of the second column.
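To make the moving least squares idea concrete, here is a minimal single-channel sketch: for each query intensity, an affine map is fitted to matched control points with weights that decay with distance from the query. This is a simplification under stated assumptions, not the authors' method, which operates in the 3D RGB space and adds probabilistic modeling. The function name `mls_map`, the weight function, and the control points are hypothetical.

```python
def mls_map(query, src, dst, alpha=1.0, eps=1e-8):
    """Map one intensity with a locally weighted (moving) affine least-squares fit."""
    # Nearby control points get large weights, so the fit adapts to each query.
    w = [1.0 / (abs(query - s) ** (2 * alpha) + eps) for s in src]
    w_sum = sum(w)
    s_mean = sum(wi * s for wi, s in zip(w, src)) / w_sum
    d_mean = sum(wi * d for wi, d in zip(w, dst)) / w_sum
    var = sum(wi * (s - s_mean) ** 2 for wi, s in zip(w, src))
    cov = sum(wi * (s - s_mean) * (d - d_mean) for wi, s, d in zip(w, src, dst))
    slope = cov / var if var > 0 else 0.0
    # Evaluate the locally fitted affine map at the query.
    return d_mean + slope * (query - s_mean)

# Hypothetical control points: intensities of matched pixels in the two images.
src = [0.1, 0.3, 0.5, 0.7, 0.9]
dst = [0.15, 0.42, 0.60, 0.74, 0.88]   # a nonlinear tone change

mapped = [mls_map(q, src, dst) for q in (0.2, 0.5, 0.6)]
print(mapped)
```

Because the weights grow sharply near a control point, the mapping approximately interpolates the control points, and it reproduces any globally affine mapping exactly.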

References

[1] Y. Hwang, J.-Y. Lee, I. S. Kweon, S. J. Kim, “Color Transfer Using Probabilistic Moving Least Squares,” CVPR 2014

Talk 6

Photometric methods for Geometry Refinement

Yu-Wing Tai

Dept. of EE, KAIST

E-mail: [email protected]

3D acquisition has always been of interest to both the research community and the general public. State-of-the-art 3D acquisition, such as Kinect fusion, can capture a 3D model using a handheld RGB-D camera. The captured 3D models, however, usually lack accuracy in fine geometric details. In this talk, I am going to present my recent work at ICCV'13 (Multiview Photometric Stereo using Planar Mesh Parameterization) and CVPR'14 (Exploiting Shading Cues in Kinect IR Images for Geometry Refinement), which uses surface normals from photometric methods to acquire highly accurate geometric details for geometry refinement. I will present our 3D acquisition set-up and the processing pipeline, and demonstrate the quality of our refined 3D models. I will also compare our reconstructed 3D model with the 3D model from Kinect fusion.

Figure: Comparison on real data (Apollo). Left: 3D model from Kinect fusion. Right: our refined 3D model using the IR shading cues. The 3D mesh is rendered with the Phong-shaded model. (Figure 1 of [2].)

References:

[1] Jaesik Park, Sudipta Sinha, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon, Multiview Photometric Stereo using Planar Mesh Parameterization, ICCV, 2013.

[2] Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon, Exploiting Shading Cues in Kinect IR Images for Geometry Refinement, CVPR, 2014.

Talk 7

Global Search for Rotation and Focal Length

Yongduek Seo, J.C. Bazin, R. Hartley, M. Pollefeys

[email protected]

An efficient approach to identifying inliers and outliers should estimate the underlying model in such a way that the number of inliers is maximized. The most popular technique is arguably RANSAC and its variants, which have been applied to numerous computer vision tasks ranging from 3D reconstruction to object recognition. Despite its popularity, RANSAC is not guaranteed to produce the maximum number of inliers. This paper is dedicated to rotational homography with unknown focal length, which typically occurs in the context of panoramic imaging. We propose a globally optimal approach that computes the camera rotation and the focal length so that the maximum number of inlier correspondences between two images is guaranteed to be obtained.

In contrast to previous works, (i) we do not assume that the focal length is known in advance, (ii) we compute the focal length in addition to the rotation, (iii) instead of the angular error, we consider the meaningful Euclidean distance in the image space in pixels, which requires deriving the reprojection bounds in the image, and (iv) we introduce a rotation parametrization that reduces the correlation between the focal length and the rotation parameters.
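The objective being maximized can be sketched as an inlier count under a pixel reprojection threshold for a rotational homography H = K R K^-1. The sketch below only evaluates that count for given parameters; it does not implement the global search itself. It assumes a principal point at the image origin and square pixels, so K = diag(f, f, 1), and the helper names, threshold, and synthetic matches are hypothetical.

```python
import math

def rotate_y(theta):
    """Rotation matrix for a pan of theta radians about the y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def transfer(point, rotation, focal):
    """Map a pixel through H = K R K^-1 with K = diag(focal, focal, 1)."""
    ray = (point[0] / focal, point[1] / focal, 1.0)
    x, y, z = (sum(r * v for r, v in zip(row, ray)) for row in rotation)
    return (focal * x / z, focal * y / z)

def count_inliers(matches, rotation, focal, tol_px=2.0):
    """Count matches whose Euclidean reprojection error is within tol_px pixels."""
    count = 0
    for p1, p2 in matches:
        q = transfer(p1, rotation, focal)
        if math.hypot(q[0] - p2[0], q[1] - p2[1]) <= tol_px:
            count += 1
    return count

# Synthetic matches (hypothetical): four exact correspondences plus one outlier.
true_rotation, true_focal = rotate_y(0.1), 800.0
points = [(100.0, 50.0), (-200.0, 30.0), (0.0, -150.0), (300.0, 200.0)]
matches = [(p, transfer(p, true_rotation, true_focal)) for p in points]
matches.append(((50.0, 50.0), (500.0, -400.0)))

inliers_true = count_inliers(matches, true_rotation, true_focal)
inliers_wrong_focal = count_inliers(matches, true_rotation, 400.0)
print(inliers_true, inliers_wrong_focal)
```

A wrong focal length lowers the count even with the correct rotation, which is why the two sets of parameters have to be searched jointly.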

Figure. Top Left: Comparison of the number of inliers computed. Top Right: Convergence

of bounds. Bottom: Matching result.

References

1. J.C. Bazin, Y. Seo, M. Pollefeys, Globally optimal consensus set maximization through rotation search, ACCV 2012.

2. J.C. Bazin, Y. Seo, R. Hartley, M. Pollefeys, Globally optimal inlier set maximization with unknown rotation and focal length, ECCV 2014.

Talk 8

Beyond Chain Models for Visual Tracking: A Trilogy

Bohyung Han

Dept. of Computer Science and Engineering, POSTECH

[email protected]

Most probabilistic tracking algorithms rely on the first-order Markov chain, which is convenient for exploiting the temporal coherency of the target state but cannot effectively handle several critical challenges in visual tracking, such as abrupt motion, occlusion, and appearance changes. To overcome these limitations, I present novel tracking algorithms that are based on more general graphical models beyond chain models and are conceptually appropriate for more challenging environments. In this line of research, a series of three algorithms has been proposed recently by the POSTECH Computer Vision Lab; they mainly address graphical model construction, density propagation over new graphical models, and measurement without the temporal coherency assumption. The proposed algorithms achieve superior performance compared to conventional tracking methods on various challenging sequences. This is joint work with Seunghoon Hong, Hyeonseob Nam, and Suha Kwak.

Figure 1. The proposed graphical models: (a) orderless Bayesian model averaging [1], (b) tree-structured graphical model [2], (c) sequential Bayesian model averaging [3]. The new graphical models are appropriate for handling various challenges in visual tracking, since the graph structures are determined adaptively based on the characteristics of the input sequence.

References

[1] Seunghoon Hong, Suha Kwak and Bohyung Han, “Orderless Tracking through Model Averaged Posterior Estimation,” ICCV 2013

[2] Seunghoon Hong and Bohyung Han, “Visual Tracking by Sampling Tree-Structured Graphical Models,” ECCV 2014

[3] Hyeonseob Nam, Seunghoon Hong and Bohyung Han, “Online Graph-based Tracking,” ECCV 2014

Talk 9

Visual Tracking using Pertinent Patch based Appearance Models

Dae-Youn Lee*, Jae-Young Sim**, and Chang-Su Kim*

*School of Electrical Engineering, Korea University

** School of Electrical and Computer Engineering, UNIST

E-mail: [email protected], [email protected], [email protected]

Tracking performance depends heavily on the accuracy of the appearance models for the target object region and the background. We propose a novel visual tracking algorithm using patch-based appearance models. We first divide the bounding box of a target object into multiple patches, as shown in Fig. 1 (a). We then select only the pertinent patches among them, which occur repeatedly near the center of the bounding box, as shown in Fig. 1 (b), to construct the foreground appearance model. We also divide the input image into non-overlapping blocks, construct a background model at each block location, and integrate these background models for tracking. Using the appearance models, we obtain an accurate foreground probability map. Finally, we estimate the optimal object position by maximizing the likelihood, which is obtained by convolving the foreground probability map with the pertinence mask. Experimental results demonstrate that the proposed algorithm significantly outperforms state-of-the-art tracking algorithms in terms of center position errors and success rates.

Fig. 1 Pertinent patch selection: All patches in the bounding box in (a) are initial candidates to estimate the foreground appearance model. We select only pertinent patches as shown in (b) to obtain a better model. The bounding box is shown in red, the selected patches in green, and the object to be tracked in blue.
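The final localization step described in the abstract can be sketched as sliding the pertinence mask over the foreground probability map and keeping the position with the highest score. This is a minimal illustration, assuming a binary mask and a plain sliding-window sum (a correlation, without kernel flipping); the function name `locate` and the toy map are hypothetical and are not taken from the paper.

```python
def locate(prob_map, mask):
    """Return the top-left position maximizing the mask-weighted probability sum."""
    rows, cols = len(prob_map), len(prob_map[0])
    m_rows, m_cols = len(mask), len(mask[0])
    best_score, best_pos = float("-inf"), None
    for top in range(rows - m_rows + 1):
        for left in range(cols - m_cols + 1):
            # Only cells marked pertinent (mask value 1) contribute to the score.
            score = sum(
                prob_map[top + i][left + j] * mask[i][j]
                for i in range(m_rows)
                for j in range(m_cols)
            )
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score

# Toy foreground probability map (hypothetical values).
prob_map = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.2, 0.2, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.8, 0.9, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.1],
]
# Pertinence mask over a 2x2 bounding box; the top-right cell is not pertinent.
mask = [[1, 0], [1, 1]]

position, score = locate(prob_map, mask)
print(position, score)
```

Cells excluded by the mask do not affect the score, so background pixels inside the bounding box cannot pull the estimate away from the object.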

References:

[1] Dae-Youn Lee, Jae-Young Sim, and Chang-Su Kim, "Visual Tracking Using Pertinent Patch Selection and Masking," CVPR, 2014.

Talk 10

Robust Online Multi-Object Tracking

with Track Confidence and Online Appearance Learning

Seung-Hwan Bae and Kuk-Jin Yoon

Computer Vision Laboratory, GIST

E-mail: {bshwan, kjyoon}@gist.ac.kr

Over the last decade, multi-object tracking has been one of the most important problems in computer vision. However, it remains difficult in complex scenes because of frequent occlusions, similar appearances of objects, rapid motion changes, and other factors. In this paper, we propose robust online multi-object tracking algorithms that can handle these challenges effectively.

We first propose a tracklet confidence for evaluating a tracklet's reliability, and then confidence-based algorithms for local and global association during online tracking. For reliable association between tracklets and detections, we also propose novel online appearance learning algorithms using ensemble learning and incremental LDA. By effectively combining these algorithms, we build a practical framework for robust online multi-object tracking, as shown in Fig. 1. Experiments with challenging public datasets show distinct performance improvement over other batch and online tracking methods.

Fig 1. Overall framework of our approach for robust online multi-object tracking.

References:

[1] Seung-Hwan Bae and Kuk-Jin Yoon, "Robust Online Multi-Object Tracking based on Tracklet Confidence and Online Discriminative Appearance Learning," CVPR, 2014.

[2] Seung-Hwan Bae and Kuk-Jin Yoon, "Robust Online Multi-Object Tracking with Data Association and Track Management," TIP, 2014.

[Fig. 1 diagram labels. Two-step association with tracklet confidence: (1) a set of tracklets with confidence and detections (from online detections); (2) tracklets with high confidence and tracklets with low confidence; (3) local association; (4) global association; (5) tracklet confidence update. Online discriminative appearance learning: (6) online training sample collection; (7) discriminative projection space update for local and global associations. Legend: tracklet confidence (low → high), detection response, associated detection.]

Map of Seoul National University (labels: New Engineering Building 301, New Engineering Building 302, Main Gate, Rear Gate)
