kccv 2014 - hanyangcvlab.hanyang.ac.kr/kccv/2014/kccv2014_program_booklet.pdf · 2014-08-22 ·...
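Honorary Chair: Sang Uk Lee (Seoul National University)
Organizing Chair: Kyoung Mu Lee (Seoul National University)
Organizing Committee: Il Dong Yun (Hankuk University of Foreign Studies), Bohyung Han (POSTECH), Seon Joo Kim (Yonsei University),
Jongwoo Lim (Hanyang University), Kuk-Jin Yoon (GIST)
Organized by: College of Engineering, Seoul National University
Hosted by: Automation and Systems Research Institute, BK21 Plus Creative Information Technology Talent Development Program, Exo-Vision Research Group
KCCV 2014
Korean Conference on Computer Vision 2014
http://kccv.or.kr/
Date: Monday, August 25, 2014, 09:50 – 17:30
Venue: Room 118, Building 301 (New Engineering Building), College of Engineering, Seoul National University
Program (Monday, August 25, 2014)
9:50-10:00 Opening Remarks
Prof. Kyoung Mu Lee (Seoul National University)
10:00-10:10 Congratulatory Address
Prof. Sang Uk Lee (Seoul National University)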
Session 1: Segmentation and Recognition
10:10-10:40 Sequential Convex Relaxation for Mutual Information-Based Unsupervised Figure-Ground Segmentation
Prof. Junmo Kim (KAIST)
10:40-11:10 Tracking Using Motion Estimation With Physically Motivated Inter-Region Constraints
Prof. Byung-Woo Hong (Chung-Ang University)
11:10-11:40 Generalized Background Subtraction using Superpixels with Label Integrated Motion Estimation
Prof. Jongwoo Lim (Hanyang University)
11:40-12:10 A Unified Semantic Model with Discriminative / Generative Tradeoff
Prof. Sung Ju Hwang (UNIST)
12:10-14:00 Lunch
Session 2: Computational Photography and Geometry
14:00-14:30 Color Transfer Using Probabilistic Moving Least Squares
Prof. Seon Joo Kim (Yonsei University)
14:30-15:00 Photometric methods for Geometry Refinement
Prof. Yu-Wing Tai (KAIST)
15:00-15:30 Global Search for Rotation and Focal Length
Prof. Yongduek Seo (Sogang University)
15:30-15:50 Coffee Break
Session 3: Visual Tracking
15:50-16:20 Beyond Chain Models for Visual Tracking: A Trilogy
Prof. Bohyung Han (POSTECH)
16:20-16:50 Visual Tracking Using Patch based Appearance Models
Prof. Jae-Young Sim (UNIST)
16:50-17:20 Robust Online Multi-Object Tracking with Track Confidence and Online Appearance Learning
Prof. Kuk-Jin Yoon (GIST)
17:20-17:30 Closing Remarks
Talk 1
Sequential Convex Relaxation for Mutual Information-Based
Unsupervised Figure-Ground Segmentation
Youngwook Kee and Junmo Kim
Dept. of EE, KAIST
E-mail: [email protected], [email protected]
The unsupervised segmentation of figure and ground is inherently a chicken-and-egg problem:
Where is the object and what distinguishes it from the background? A common assumption in image
segmentation is that the object and background have different color distributions. Yet, if these are
completely unknown and overlap considerably, then the joint estimation of color
distributions and segmentation becomes a major algorithmic challenge. In this work, we propose an
optimization algorithm [1] to jointly estimate the color distributions of the foreground and background,
and separate them based on their mutual information [2] with geometric regularity. The algorithm
constructs a sequence of convex upper bounds for the mutual information-based nonconvex energy and
efficiently minimizes it on its relaxed domain. Accordingly, we attain high quality solutions for
challenging unsupervised figure-ground segmentation problems. Figure 1 shows two compelling
examples in which the proposed method separates a foreground and background that humans cannot
tell apart visually.
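To make the quantity being optimized concrete, the following minimal Python sketch estimates the mutual information between pixel intensity and a binary figure-ground label from histograms. It illustrates only the criterion, not the sequential convex relaxation of [1]; the function name `mutual_information`, the bin count, and the toy image are assumptions made for this example.

```python
import numpy as np

def mutual_information(intensity, labels, bins=32):
    """Plug-in estimate (in nats) of I(X; L) = H(X) - sum_l P(l) H(X | L=l)."""
    x = np.asarray(intensity, dtype=float).ravel()
    fg = np.asarray(labels).ravel().astype(bool)
    edges = np.histogram_bin_edges(x, bins=bins)  # shared bins for all histograms

    def entropy(values):
        counts, _ = np.histogram(values, bins=edges)
        p = counts[counts > 0] / counts.sum()
        return -np.sum(p * np.log(p))

    h_cond = 0.0
    for region in (fg, ~fg):
        if region.any():
            h_cond += region.mean() * entropy(x[region])
    return entropy(x) - h_cond

# Hypothetical toy image: left half is "background", right half is "object".
rng = np.random.default_rng(0)
true_labels = np.zeros((64, 64), dtype=bool)
true_labels[:, 32:] = True
image = np.where(true_labels,
                 rng.normal(0.7, 0.05, (64, 64)),
                 rng.normal(0.3, 0.05, (64, 64)))
random_labels = rng.random((64, 64)) < 0.5

mi_true = mutual_information(image, true_labels)
mi_random = mutual_information(image, random_labels)
print(mi_true, mi_random)
```

A labeling that matches the true regions yields a much higher mutual information than a random labeling, which is why maximizing this quantity (together with a geometric regularizer, omitted here) can drive the segmentation.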
Figure 1. Synthetic image experiments for visualization of the intensity distribution
separation. Case 1 (top row and the four leftmost PDFs): Object (white) and the background
(black) of the "CVPR'14" image are associated with a unimodal and bimodal distribution, respectively,
of the same mean and variance. Case 2 (second row and the remaining PDFs): Object and the
background are associated with two Rayleigh distributions whose even central moments
are all the same.
References:
[1] Y. Kee, M. Souiai, D. Cremers, and J. Kim. Sequential Convex Relaxation for Mutual
Information-Based Unsupervised Figure-Ground Segmentation. In CVPR, 2014.
[2] J. Kim, J. W. Fisher III, A. Yezzi, M. Cetin, and A. S. Willsky. A Nonparametric Statistical
Method for Image Segmentation Using Information Theory and Curve Evolution. IEEE
Trans. Image Process., 14(10): 1486-1502, 2005.
Talk 2
Tracking Using Motion Estimation
with Physically Motivated Inter-Region Constraints
Omar Arif(1), Ganesh Sundaramoorthi(1), Byung-Woo Hong(2), Anthony Yezzi(3)
(1)KAUST, Saudi Arabia (2)Chung-Ang University, Korea
(3)Georgia Institute of Technology, U.S.A.
We propose a method for tracking structures (e.g. ventricles and myocardium) in cardiac
images (e.g. magnetic resonance) by propagating forward in time a previous estimate of the
structures using a new physically motivated motion estimation scheme. Our method estimates
motion by regularizing only within structures so that differing motions among different
structures are not mixed. It simultaneously satisfies two physical constraints at the interface
between a fluid and a medium: the normal component of the fluid's motion must match the
normal component of the medium's motion, and the no-slip condition, which states that the
tangential velocity approaches zero near the interface, must hold. We show that these conditions lead to
PDEs with Robin boundary conditions at the interface, which couple the motion between
structures. We show that propagating a segmentation across frames using our motion estimation
scheme leads to more accurate segmentation than traditional motion estimation that does not
use physical constraints. Our method is suited to interactive segmentation, prominently used in
commercial applications for cardiac analysis, where segmentation propagation is used to predict
a segmentation in the next frame. We show that our method leads to more accurate predictions
than a popular and recent interactive method used in cardiac segmentation.
Figure 1. Tracking the Left Ventricle on cardiac MRI images
To appear in IEEE Transactions on Medical Imaging
Talk 3
Generalized Background Subtraction using Superpixels with Label
Integrated Motion Estimation
Jongwoo Lim
Div. of Computer Science & Engineering, Hanyang University
E-mail: [email protected]
In this talk I present an online background subtraction algorithm with superpixel-based
density estimation for videos captured by a moving camera. Our algorithm maintains appearance
and motion models of foreground and background for each superpixel, computes foreground
and background likelihoods for each pixel based on the models, and determines pixelwise labels
using binary belief propagation. The estimated labels trigger the update of appearance and
motion models, and the above steps are performed iteratively in each frame. After convergence,
appearance models are propagated through sequential Bayesian filtering, where predictions
rely on motion fields of both labels whose computation exploits the segmentation mask.
Superpixel-based modeling and label integrated motion estimation make propagated
appearance models more accurate compared to existing methods since the models are
constructed on visually coherent regions and the quality of estimated motion is improved by
avoiding motion smoothing across regions with different labels. We evaluate our algorithm
with challenging video sequences and present significant performance improvement over the
state-of-the-art techniques quantitatively and qualitatively. This is joint work with Prof.
Bohyung Han (POSTECH).
Fig 1. Overview of our proposed algorithm.
Fig 2. Visualization of learned foreground and background models.
References:
[1] Jongwoo Lim and Bohyung Han, "Generalized Background Subtraction using Superpixels with Label
Integrated Motion Estimation," ECCV, 2014 (to appear).
Talk 4
A Unified Semantic Model with Discriminative / Generative Tradeoff
Sung Ju Hwang
Dept. of Electrical and Computer Engineering, UNIST
For many years, researchers have sought ways to leverage external semantic knowledge about
the world to aid object categorization, with attributes and taxonomies being the two most
popular semantic sources. In my prior work [1,2,3], I showed how to translate such semantic
knowledge into structural constraints between category models, to regularize the learning of a
discriminative categorization model. However, each proposed method utilized only one of
the two types of semantic entities, and there was no unified model that leverages and relates
both.
In this talk, I will present a method that utilizes both attributes and taxonomy, and defines the
relationship between the two types of entities. The proposed method, Unified Semantic
Embedding (USE), learns a single unified semantic space in which object categories,
attributes, and supercategories are explicitly embedded. Because this semantic space embeds
heterogeneous semantic entities as vectors, each entity can be represented as a
combination of other types of semantic entities.
Specifically, we can define each object category as a combination of a supercategory plus a
sparse combination of attributes. We explicitly enforce this relationship between the semantic
entities in the learned space, with an additional exclusivity regularization to learn a
discriminative composition for each object category.
The proposed reconstructive regularization guides the discriminative learning process toward
a model that generalizes better, and also generates a compact semantic description of each
category, which enables humans to analyze what has been learned. We validate our method
on the Animals with Attributes dataset for categorization performance and qualitative
analysis, which shows that our method improves classification performance while
learning a discriminative semantic decomposition of each category.
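The idea of describing a category as a supercategory plus a sparse combination of attributes can be sketched in a few lines of Python. This is not the USE training procedure: the embeddings are random stand-ins, the exclusivity regularization is omitted, and the L1-penalized least-squares solver (ISTA), the function name `sparse_attribute_code`, and the attribute names are all assumptions chosen for illustration.

```python
import numpy as np

def sparse_attribute_code(u_cat, u_super, attrs, lam=0.1, n_iter=500):
    """ISTA for min_b 0.5 * ||u_cat - u_super - attrs.T @ b||^2 + lam * ||b||_1."""
    target = u_cat - u_super
    # Step size = 1 / Lipschitz constant of the gradient of the quadratic term.
    step = 1.0 / np.linalg.norm(attrs @ attrs.T, 2)
    b = np.zeros(attrs.shape[0])
    for _ in range(n_iter):
        grad = attrs @ (attrs.T @ b - target)
        z = b - step * grad
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return b

# Hypothetical embeddings: 5 attributes and one supercategory in a 32-D space.
rng = np.random.default_rng(0)
attribute_names = ["spotted", "striped", "aquatic", "winged", "hooved"]
attrs = rng.normal(size=(5, 32))
u_feline = rng.normal(size=32)
u_tiger = u_feline + attrs[1]          # "tiger = striped feline" by construction

weights = sparse_attribute_code(u_tiger, u_feline, attrs)
top = attribute_names[int(np.argmax(np.abs(weights)))]
print(top, np.round(weights, 3))
```

In this toy setting the recovered weight vector is dominated by a single attribute, which is the kind of compact, human-readable description the abstract refers to.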
Figure 1. Concept. We regularize each category
to be a supercategory + a sparse combination of
attributes, which allows the learned model to
describe any object category in a compact
semantic description, e.g. tiger = striped feline.
References
[1] Semantic Kernel Forests from Multiple Taxonomies, S. J. Hwang, K. Grauman and F.
Sha, NIPS 2012
[2] Learning a Tree of Metrics with Disjoint Visual Features, S. J. Hwang, K. Grauman, and
F. Sha, NIPS 2011
[3] Sharing Features Between Objects and Their Attributes, S. J. Hwang, F. Sha and K.
Grauman, CVPR 2011
Talk 5
Color Transfer Using Probabilistic Moving Least Squares
Seon Joo Kim
Dept. of Computer Science, Yonsei University
In this talk, I will introduce a new method for color transfer, the process of transforming the colors
of one image to match those of another image of the same scene. The color of a scene may
vary from image to image because the photographs are taken at different times, with different
cameras, and under different camera settings. To solve for a full nonlinear and nonparametric
color mapping in the 3D RGB color space, we propose a scattered point interpolation scheme
using moving least squares and strengthen it with a probabilistic modeling of the color transfer
in the 3D color space to deal with mis-alignments and noise. Experiments show the
effectiveness of our method over previous color transfer methods both quantitatively and
qualitatively. In addition, our framework can be applied to various instances of color transfer
such as transferring color between different camera models, camera settings, and illumination
conditions, as well as for video color transfers. This work was done in collaboration with Dr.
Youngbae Hwang from KETI, Joon-Young Lee and Professor In So Kweon from KAIST.
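As a rough illustration of scattered point interpolation with moving least squares in RGB space, the Python sketch below fits a separate weighted affine color map for each query color, with weights that decay with distance to the source control colors. It omits the probabilistic modeling described above; the function name `mls_color_transfer`, the inverse-distance weight, and the synthetic correspondences are assumptions for this example.

```python
import numpy as np

def mls_color_transfer(query, src, dst, alpha=1.0, eps=1e-6):
    """Map each RGB color in `query` using a locally weighted affine fit src -> dst."""
    query = np.atleast_2d(np.asarray(query, dtype=float))
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_h = np.hstack([src, np.ones((len(src), 1))])      # homogeneous colors (N, 4)
    out = np.empty_like(query)
    for i, q in enumerate(query):
        # Moving least squares weights: control colors near q count more.
        w = 1.0 / (np.sum((src - q) ** 2, axis=1) ** alpha + eps)
        sw = np.sqrt(w)[:, None]
        # Weighted least squares: minimize sum_k w_k * ||[src_k, 1] @ M - dst_k||^2.
        M, *_ = np.linalg.lstsq(sw * src_h, sw * dst, rcond=None)
        out[i] = np.append(q, 1.0) @ M
    return out

# Hypothetical correspondences generated by a known affine color change.
rng = np.random.default_rng(0)
src = rng.random((50, 3))
A = np.array([[0.9, 0.05, 0.0], [0.1, 0.8, 0.05], [0.0, 0.1, 1.1]])
offset = np.array([0.02, -0.01, 0.03])
dst = src @ A + offset

query = rng.random((4, 3))
mapped = mls_color_transfer(query, src, dst)
print(np.abs(mapped - (query @ A + offset)).max())
```

Because the fit is recomputed per query color, the overall mapping can be nonlinear even though each local model is affine; with the purely affine toy data above, the local fits simply reproduce the global map.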
Figure 1. Changes in image color due to different factors. The images in the third column are
our results of transferring the color of the images in the first column to that of the second column.
References
[1] Y. Hwang, J.-Y. Lee, I. S. Kweon, S. J. Kim, “Color Transfer Using Probabilistic Moving
Least Squares,” CVPR 2014
Talk 6
Photometric methods for Geometry Refinement
Yu-Wing Tai
Dept. of EE, KAIST
E-mail: [email protected]
3D acquisition has long been of interest to the research community and the general public.
State-of-the-art 3D acquisition, such as Kinect fusion, can capture a 3D model using a handheld RGB-D
camera. The captured 3D models, however, usually lack accuracy in fine geometric details.
In this talk, I am going to present my recent work at ICCV'13 (Multiview Photometric Stereo
using Planar Mesh Parameterization) and CVPR'14 (Exploiting Shading Cues in Kinect IR
Images for Geometry Refinement), which uses surface normals from photometric methods to
acquire highly accurate geometric details for geometry refinement. I will present our 3D
acquisition set-up and the processing pipeline, and demonstrate the quality of our refined 3D
models. We will also compare our reconstructed 3D model with the 3D model from Kinect
fusion.
Figure: Comparison on real data (Apollo). Left: 3D model from Kinect fusion. Right: Our
refined 3D model using the IR shading cues. The 3D mesh is rendered with the Phong-shaded
model. (Figure 1 of [2]).
References:
[1] Jaesik Park, Sudipta Sinha, Yasuyuki Matsushita, Yu-Wing Tai, In So Kweon,
Multiview Photometric Stereo using Planar Mesh Parameterization, ICCV, 2013
[2] Gyeongmin Choe, Jaesik Park, Yu-Wing Tai, In So Kweon, Exploiting Shading Cues
in Kinect IR Images for Geometry Refinement. CVPR, 2014.
Talk 7
Global Search for Rotation and Focal Length
Yongduek Seo, J.C. Bazin, R. Hartley, M. Pollefeys
An efficient approach to identify inliers and outliers should estimate the underlying model
in such a way that the number of inliers is maximized. The most popular technique is arguably
RANSAC and its variants, which have been applied to numerous computer vision tasks ranging
from 3D reconstruction to object recognition. Despite its popularity, RANSAC is not
guaranteed to produce the maximum number of inliers. This paper is dedicated to rotational
homography with unknown focal length, which typically occurs in the context of panoramic
imaging. We propose a globally optimal approach that computes the camera rotation and the
focal length so that the maximum number of inlier correspondences between two images is
guaranteed to be obtained.
In contrast to previous works, (i) we do not assume that the focal length is known in advance,
(ii) we compute the focal length in addition to the rotation, (iii) instead of the angular error, we
consider the meaningful Euclidean distance in the image space in pixels, which requires deriving
the reprojection bounds in the image, and (iv) we introduce a rotation parametrization that
reduces the correlation between the focal length and the rotation parameters.
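The objective being maximized, namely the number of correspondences whose pixel-space reprojection error falls below a threshold for a given rotation and focal length, can be sketched as follows. This Python example only evaluates that inlier count for candidate parameters; it does not implement the globally optimal search. The homography model H = K R K^{-1} with K = diag(f, f, 1) (principal point at the origin, square pixels), the function names, and the synthetic data are assumptions for illustration.

```python
import numpy as np

def rotation_matrix(axis, angle):
    """Rodrigues' formula: rotation by `angle` radians about `axis`."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    S = np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * S + (1.0 - np.cos(angle)) * (S @ S)

def apply_rotational_homography(pts, R, f):
    """Map pixel coordinates through H = K R K^-1 with K = diag(f, f, 1)."""
    K = np.diag([f, f, 1.0])
    H = K @ R @ np.linalg.inv(K)
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def count_inliers(pts1, pts2, R, f, thresh=2.0):
    """Count correspondences with Euclidean reprojection error below `thresh` pixels."""
    err = np.linalg.norm(apply_rotational_homography(pts1, R, f) - pts2, axis=1)
    return int(np.sum(err < thresh))

# Hypothetical data: 100 correspondences, 30 of them corrupted into gross outliers.
rng = np.random.default_rng(0)
f_true = 800.0
R_true = rotation_matrix([0.0, 1.0, 0.0], np.deg2rad(5.0))
pts1 = rng.uniform(-300.0, 300.0, size=(100, 2))
pts2 = apply_rotational_homography(pts1, R_true, f_true)
pts2[:30] += np.array([100.0, 0.0])

n_true = count_inliers(pts1, pts2, R_true, f_true)
n_wrong = count_inliers(pts1, pts2, R_true, 400.0)
print(n_true, n_wrong)
```

The count drops sharply when the focal length is wrong even though the rotation is correct, which illustrates why the focal length has to be estimated jointly with the rotation rather than assumed.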
Figure. Top Left: Comparison of the number of inliers computed. Top Right: Convergence
of bounds. Bottom: Matching result.
References
1. J.C. Bazin, Y. Seo, M. Pollefeys, Globally optimal consensus set maximization through
rotation search, ACCV 2012.
2. J.C. Bazin, Y. Seo, R. Hartley, M. Pollefeys, Globally optimal inlier set maximization with
unknown rotation and focal length, ECCV 2014.
Talk 8
Beyond Chain Models for Visual Tracking: A Trilogy
Bohyung Han
Dept. of Computer Science and Engineering, POSTECH
Most probabilistic tracking algorithms rely on the first-order Markov chain, which is convenient
to exploit temporal coherency of target state but is not able to effectively handle several critical
challenges in visual tracking such as abrupt motion, occlusion, appearance changes, and so on.
To overcome these limitations, I present novel tracking algorithms that are based on more
general graphical models beyond chain models and are conceptually appropriate for more
challenging environments. In this line of research, a series of three algorithms has been
proposed recently by the POSTECH Computer Vision Lab; they mainly address graphical model
construction, density propagation over new graphical models, and measurement without
the temporal coherency assumption. The proposed algorithms achieve superior performance
compared to conventional tracking methods in various challenging sequences. This is joint
work with Seunghoon Hong, Hyeonseob Nam, and Suha Kwak.
Figure 1. The proposed graphical models. The new graphical models are appropriate for handling
various challenges in visual tracking since the graph structures are determined adaptively based on the
characteristics of the input sequence.
References
[1] Seunghoon Hong, Suha Kwak and Bohyung Han, “Orderless Tracking through Model
Averaged Posterior Estimation,” ICCV 2013
[2] Seunghoon Hong and Bohyung Han, “Visual Tracking by Sampling Tree-Structured
Graphical Models,” ECCV 2014
[3] Hyeonseob Nam, Seunghoon Hong and Bohyung Han, “Online Graph-based Tracking,”
ECCV 2014
(a) Orderless Bayesian model averaging [1]
(b) Tree-structured graphical model [2]
(c) Sequential Bayesian model averaging [3]
Talk 9
Visual Tracking using Pertinent Patch based Appearance Models
Dae-Youn Lee*, Jae-Young Sim**, and Chang-Su Kim*
*School of Electrical Engineering, Korea University
** School of Electrical and Computer Engineering, UNIST
E-mail: [email protected], [email protected], [email protected]
Tracking performance highly depends on the accuracy of appearance models for the target
object region and the background. We propose a novel visual tracking algorithm using patch-
based appearance models. We first divide the bounding box of a target object into multiple
patches as shown in Fig. 1 (a). Then we select only pertinent patches among them as shown in
Fig. 1 (b), which occur repeatedly near the center of the bounding box, to construct the
foreground appearance model. We also divide the input image into non-overlapping blocks,
construct a background model at each block location, and integrate these background models
for tracking. Using the appearance models, we obtain an accurate foreground probability map.
Finally, we estimate the optimal object position by maximizing the likelihood, which is
obtained by convolving the foreground probability map with the pertinence mask. Experimental
results demonstrate that the proposed algorithm outperforms state-of-the-art tracking
algorithms significantly in terms of center position errors and success rates.
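The final localization step, scoring each candidate position by combining the foreground probability map with the pertinence mask and taking the maximum, can be sketched in Python as below. The sketch uses cross-correlation (equivalent to convolution with a flipped mask); the function name `locate_object`, the binary mask, and the synthetic probability map are assumptions for illustration and not the authors' implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def locate_object(fg_prob, mask):
    """Return the top-left (row, col) that maximizes the mask-weighted foreground
    probability, together with the full score map."""
    windows = sliding_window_view(fg_prob, mask.shape)   # (H-h+1, W-w+1, h, w)
    score = np.einsum("ijkl,kl->ij", windows, mask)      # correlate map with mask
    idx = np.unravel_index(np.argmax(score), score.shape)
    return (int(idx[0]), int(idx[1])), score

# Hypothetical foreground probability map with one bright object region.
fg_prob = np.full((40, 60), 0.05)
fg_prob[10:18, 25:37] = 0.9

# Hypothetical pertinence mask over an 8x12 bounding box: pixels of patches
# judged non-pertinent get zero weight.
mask = np.ones((8, 12))
mask[3:5, 5:7] = 0.0

position, score = locate_object(fg_prob, mask)
print(position)
```

Weighting by the mask means that only pixels belonging to pertinent patches contribute to the score, so background pixels inside the bounding box do not pull the estimate away from the object.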
Fig. 1 Pertinent patch selection: All patches in the bounding box in (a) are initial candidates
to estimate the foreground appearance model. We select only pertinent patches as shown in (b)
to obtain a better model. The bounding box is shown in red, the selected patches in green, and
the object to be tracked in blue, respectively.
References:
[1] Dae-Youn Lee, Jae-Young Sim, and Chang-Su Kim, "Visual Tracking Using
Pertinent Patch Selection and Masking," CVPR, 2014.
Talk 10
Robust Online Multi-Object Tracking
with Track Confidence and Online Appearance Learning
Seung-Hwan Bae and Kuk-Jin Yoon
Computer Vision Laboratory, GIST
E-mail: {bshwan, kjyoon}@gist.ac.kr
Over the last decade, multi-object tracking has been one of the most
important problems in computer vision. However, it remains a difficult problem in complex
scenes because of frequent occlusions, similar appearances of objects, rapid motion changes,
and other factors. In this paper, we propose robust online multi-object tracking algorithms that
can handle those challenges effectively.
We first propose a tracklet confidence measure for evaluating a tracklet's reliability and then
confidence-based algorithms for local and global association during online tracking. Here, for
reliable association between tracklets and detections, we also propose novel online appearance
learning algorithms using ensemble learning and incremental LDA. By effectively combining
these algorithms, we build a practical framework for robust online multi-object tracking as
shown in Fig. 1. Experiments with challenging public datasets show distinct performance
improvement over other batch and online tracking methods.
Fig 1. Overall framework of our approach for robust online multi-object tracking.
References:
[1] Seung-Hwan Bae and Kuk-Jin Yoon, "Robust Online Multi-Object Tracking based on
Tracklet Confidence and Online Discriminative Appearance Learning," CVPR, 2014.
[2] Seung-Hwan Bae and Kuk-Jin Yoon, "Robust Online Multi-Object Tracking with Data
Association and Track Management," TIP, 2014.
[Fig. 1 panel labels] Online detections; (1) a set of tracklets with confidence and detections; (2) tracklets with high confidence and tracklets with low confidence; (3) local association; (4) global association; (5) tracklet confidence update; (6) online training sample collection; (7) discriminative projection space update for local and global associations. Stages: two-step association with tracklet confidence; online discriminative appearance learning. Legend: tracklet confidence (low → high), detection response, associated detection.
[Map of Seoul National University: New Engineering Building 301, New Engineering Building 302, main gate, rear gate]