Experience with Simple Approaches
Wei Fan‡ Erheng Zhong† Sihong Xie† Yuzhao Huang† Kun Zhang$
Jing Peng# Jiangtao Ren†
‡IBM T. J. Watson Research Center  †Sun Yat-sen University
$Xavier University of Louisiana  #Montclair State University
RDT: Random Decision Tree (Fan et al. '03)
"Encoding data" in trees. At each node, an unused feature is chosen randomly.
A discrete feature is unused if it has never been chosen previously on the decision path from the root to the current node.
A continuous feature can be chosen multiple times on the same decision path, but with a different threshold value each time.
Stop when one of the following happens: a node becomes too small, all examples at a node belong to the same class, or the total height of the tree exceeds a limit.
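The construction rules above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the node layout and names are assumed, and only the height limit is enforced here, since the data-driven stopping tests (node too small, or all examples of one class) require training data to be attached.

```python
import random

def build_rdt(features, used=frozenset(), depth=0, max_depth=4):
    """Grow one random decision tree structure.

    features: {name: ("discrete", values) or ("continuous", (lo, hi))}
    used: discrete features already chosen on this root-to-node path.
    """
    # A discrete feature may appear at most once per path; a continuous
    # feature stays available (it will get a fresh random threshold).
    candidates = [f for f, (kind, _) in features.items()
                  if kind == "continuous" or f not in used]
    if depth >= max_depth or not candidates:
        return {"leaf": True}
    name = random.choice(candidates)          # feature chosen at random
    kind, domain = features[name]
    if kind == "discrete":
        children = {v: build_rdt(features, used | {name}, depth + 1, max_depth)
                    for v in domain}
        return {"leaf": False, "feature": name, "children": children}
    lo, hi = domain
    thr = random.uniform(lo, hi)              # a different threshold each time
    return {"leaf": False, "feature": name, "threshold": thr,
            "children": {"<": build_rdt(features, used, depth + 1, max_depth),
                         ">=": build_rdt(features, used, depth + 1, max_depth)}}

# The three features from the illustration that follows:
feats = {"B1": ("discrete", (0, 1)), "B2": ("discrete", (0, 1)),
         "B3": ("continuous", (0.0, 1.0))}
tree = build_rdt(feats)
```

On any path of the resulting tree, B1 and B2 occur at most once each, while B3 may recur with distinct thresholds.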
Illustration of RDT
[Figure: a random decision tree over features B1 ∈ {0,1}, B2 ∈ {0,1}, and B3 (continuous). B1 is chosen randomly at the root (test B1 == 0?); B2 is chosen randomly below it (B2 == 0?); B3 is chosen randomly with a random threshold of 0.3 (B3 < 0.3?) and again, deeper on the same path, with a different random threshold of 0.6 (B3 < 0.6?).]
Probabilistic view of decision trees (PETs)
[Figure: decision tree on the iris data. Root split Petal.Length < 2.45 isolates setosa (leaf counts setosa/versicolor/virginica = 50/0/0); the split Petal.Width < 1.75 then separates a versicolor leaf (0/49/5) from a virginica leaf (0/1/45).]
For an example x reaching the versicolor leaf (0/49/5):
P(setosa|x,θ) = 0
P(versicolor|x,θ) = 49/54
P(virginica|x,θ) = 5/54
Given an example x, a PET outputs P(y|x,θ) = N_{y,L} / N_L, the fraction of class-y training examples among the N_L examples at the leaf L that x reaches (e.g., C4.5, CART). These estimates serve as:
• confidences in the predicted labels
• note that the dependence of P(y|x,θ) on θ is non-trivial
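The versicolor leaf above makes the frequency estimate P(y|x,θ) = N_{y,L}/N_L concrete; a minimal sketch using those leaf counts:

```python
# Class counts at the versicolor leaf of the iris tree above
# (setosa / versicolor / virginica = 0 / 49 / 5).
counts = {"setosa": 0, "versicolor": 49, "virginica": 5}
n_leaf = sum(counts.values())                     # N_L = 54 examples at the leaf
probs = {y: c / n_leaf for y, c in counts.items()}
# probs["versicolor"] == 49/54, probs["virginica"] == 5/54, probs["setosa"] == 0
```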
Problems of probability estimation via conventional DTs
1. Probability estimates tend to approach the extremes of 1 and 0.
2. Additional inaccuracies result from the small number of examples at a leaf.
3. The same probability is assigned to the entire region of space defined by a given leaf.
C4.4 (Provost '03)
BC44 (Zhang '06)
RDT (Fan '03)
bRDT
"bRDT" is the averaging of RDT and BC44, where RDT is Random Decision Tree and BC44 is bagged C4.4.
RDT: pr(y|x)
BC44: pb(y|x)
bRDT: [pr(y|x) + pb(y|x)] / 2
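The averaging above is a direct class-by-class mean of the two posteriors; a small sketch (the class labels and numeric values are illustrative, not from the talk):

```python
def brdt(pr, pb):
    """Average the RDT and bagged-C4.4 posterior estimates class by class."""
    return {y: (pr[y] + pb[y]) / 2 for y in pr}

# Illustrative two-class posteriors for one example x:
pr = {"pos": 0.75, "neg": 0.25}   # RDT estimate pr(y|x)
pb = {"pos": 0.25, "neg": 0.75}   # bagged C4.4 estimate pb(y|x)
posterior = brdt(pr, pb)          # {"pos": 0.5, "neg": 0.5}
```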
Sampling strategy for Tasks 1 & 2
For station Z, negative instances are partitioned into "blocks" such that the size of each block is approximately 3 times that of the positives.
[Figure: the negative instances are partitioned into Block 1 through Block n, shown alongside the positive instances.]
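The partition can be sketched as follows; the function name and the instance counts are illustrative assumptions, not details from the talk:

```python
def make_blocks(negatives, positives, ratio=3):
    """Partition the negatives into blocks of size ~ratio * len(positives)."""
    block_size = ratio * len(positives)
    return [negatives[i:i + block_size]
            for i in range(0, len(negatives), block_size)]

neg = list(range(100))   # illustrative negative instance ids
pos = list(range(10))    # illustrative positive instance ids
blocks = make_blocks(neg, pos)
# 100 negatives with 10 positives -> blocks of 30, 30, 30, and a remainder of 10
```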
Tasks 1 & 2 – Results. For station V, rows 2 and 3 correspond to Tasks 1 and 2. The optimal classifiers of Tasks 1 and 2 are the same for stations W, X, Y, and Z, so there is only one row for each of these 4 stations.
[Figure: Task 1 – ROC curves for stations V, W, X, Y, Z]
[Figure: Task 2 – ROC curves for stations V, W, X, Y, Z]
Task 3 – Feature Expansion
X → (X, X², ln(X+1))
Example: three instances with only one feature; A and B are positive while C is negative: A(0.9), B(1.0), C(1.1).
Distance(A, B) = Distance(B, C): 0.01 vs. 0.01 (squared Euclidean distances)
After expansion: A(0.9, 0.81, 0.64), B(1.0, 1.0, 0.69), C(1.1, 1.21, 0.74)
Distance(A, B) < Distance(B, C): 0.049 vs. 0.056
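The numbers above can be verified with squared Euclidean distances; a small sketch (`expand` and `sqdist` are illustrative names):

```python
import math

def expand(x):
    """Map a single feature x to the expanded vector (x, x^2, ln(x+1))."""
    return (x, x * x, math.log(x + 1))

def sqdist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

A, B, C = 0.9, 1.0, 1.1
# Before expansion, B sits exactly between A and C: both gaps are 0.01.
# After expansion, B moves closer to the positive A than to the negative C.
d_ab = sqdist(expand(A), expand(B))   # ~0.049
d_bc = sqdist(expand(B), expand(C))   # ~0.056
```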
Task 3 – Result of test 3
Parameter-free