image analysis for malicious advertisement detection

Image analysis for fraudulent advertisements Jithendranath J V

Upload: jithendranath-joijoide

Post on 26-Dec-2014

Category:

Technology


TRANSCRIPT

Page 1: Image analysis for malicious advertisement detection

Image analysis for fraudulent advertisements

Jithendranath J V

Page 2: Image analysis for malicious advertisement detection

2 04/10/2023

Advertisement

Awesome

Awful

Page 3: Image analysis for malicious advertisement detection

Image Analyzer for Creative Tester

User Issue / Yahoo! Challenge

Roadmap Theme & Goal

• Creative Tester receives approximately a million creatives per day to be tested for malicious content. Of these, 2%–5% fall into the "Windows mimic" category. These need to be detected and banned as early as possible, with minimal human intervention.

• Need to validate brand safety and ensure quality impressions for advertisers.

• The Trust and Safety team, in collaboration with Sciences, built an Image Analyzer module that detects malicious advertisements, such as Windows mimics or fake brands with phony downloads, and tags them appropriately so they can be banned.

Value Proposition/Positioning – To reduce the manual effort in recognizing and banning malicious advertisements that can be visually identified as fraudulent.

Page 4: Image analysis for malicious advertisement detection

IONIX / CT Ecosystem

[Diagram: IONIX / CT ecosystem. Labels recovered from the diagram:]

• Cqueuer (RMX Apps)
• IONIX Creative Tester (CT)
• TRF_PROD DB
• Primary/Secondary Creative/Click_URL Review
• Media Guard Manual Audit Queue
• Domain Lookup Service
• Media Trust (3rd Party)
• Virus Checker (ClamAv / Trend Micro)
• Image Analyzer
• Downloaders (Chrome, Firefox, IE)
• Flash Checker
• Creative Feed based on Advertisers profile
• Min-bar/Technical Tags
• Creatives/LineItems get banned with Min-Bar Classifiers
• Creatives Banned

Page 5: Image analysis for malicious advertisement detection

IA Internals - Modeler

Feature extraction

SIFT

SURF

CBOW

K Means Computation

Histogram Generation

Model Generation (SVM)
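
The K Means and Histogram Generation steps listed above can be sketched as follows. This is a minimal illustration, not the Image Analyzer's actual code: the toy 2-D points stand in for real SIFT/SURF descriptors, the function names are hypothetical, and the evenly spaced initial centres are a simplification (real implementations usually pick them at random or with k-means++). The final SVM training step is left out.

```python
import math

def nearest(codebook, descriptor):
    """Index of the codebook centre closest to the descriptor (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(codebook[i], descriptor))

def kmeans(descriptors, k, iters=20):
    """Plain Lloyd's k-means; returns k centres (the visual vocabulary)."""
    # Assumption: deterministic, evenly spaced initial centres for the sketch.
    centres = [list(descriptors[i * len(descriptors) // k]) for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for d in descriptors:
            buckets[nearest(centres, d)].append(d)
        for i, bucket in enumerate(buckets):
            if bucket:  # keep the old centre if a cluster went empty
                centres[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centres

def histogram(codebook, descriptors):
    """Normalised count of descriptors per visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest(codebook, d)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

# Toy 2-D "descriptors" for two training images (hypothetical data).
image_a = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05), (5.0, 5.1)]
image_b = [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1), (0.0, 0.0)]

codebook = kmeans(image_a + image_b, k=2)
hist_a = histogram(codebook, image_a)  # one fixed-length vector per image
hist_b = histogram(codebook, image_b)
```

Each image ends up as one fixed-length histogram regardless of how many descriptors it produced, which is what lets an SVM be trained on images of different sizes.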

Page 6: Image analysis for malicious advertisement detection

IA Internals - Classifier

Feature extraction

SIFT

SURF

CBOW

Histogram Generation

Classification (SVM)
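
At classification time the vocabulary is already fixed, so the K Means step drops out and only the histogram and SVM scoring remain. A rough sketch, assuming a linear SVM (the slides do not state the kernel) and using made-up codebook, weight and bias values:

```python
import math

def histogram(codebook, descriptors):
    """Normalised count of descriptors per visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        word = min(range(len(codebook)),
                   key=lambda i: math.dist(codebook[i], d))
        counts[word] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def decision_value(weights, bias, hist):
    """Linear SVM score w.h + b; larger means more likely malicious."""
    return sum(w * h for w, h in zip(weights, hist)) + bias

def classify(codebook, weights, bias, descriptors, threshold=0.0):
    """Return (flagged, score) for one image's descriptors."""
    score = decision_value(weights, bias, histogram(codebook, descriptors))
    return score >= threshold, score

# Hypothetical trained model: a two-word vocabulary and linear weights.
CODEBOOK = [(0.0, 0.0), (5.0, 5.0)]
WEIGHTS = [2.0, -2.0]
BIAS = 0.0

descriptors = [(0.1, 0.0), (0.0, 0.2), (0.1, 0.1), (4.9, 5.0)]
flagged, score = classify(CODEBOOK, WEIGHTS, BIAS, descriptors)
strict_flagged, _ = classify(CODEBOOK, WEIGHTS, BIAS, descriptors,
                             threshold=1.24538)
```

Raising the threshold trades recall for precision: the same image is flagged at the default threshold but not at the stricter one.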

Page 7: Image analysis for malicious advertisement detection

Performance – Precision and Recall

Threshold   Precision   Recall
 3.64068    0.81818     0.00402
 2.45085    0.81818     0.06827
 1.85615    0.81818     0.13253
 1.24538    0.8125      0.19679
 0.91759    0.8125      0.26104
 0.29167    0.76522     0.3253
 0.06556    0.74638     0.38956
-0.18092    0.72327     0.45382
-0.52049    0.67895     0.51807
-0.78885    0.65198     0.58233
-1.18095    0.55102     0.64659
-1.75574    0.41163     0.71084
-2.07244    0.34281     0.7751
-2.52684    0.27941     0.83936
-3.13143    0.22636     0.90361
-4.45694    0.1649      1

[Chart: Precision vs. Recall]

[Chart: Precision, Recall and Threshold across the 16 operating points]
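
To make the numbers on this slide concrete, the sketch below shows how precision and recall follow from classifier scores at a given threshold, and how an operating point could be picked from the slide's table. The scores and labels are made-up toy data and the function names are hypothetical; only OPERATING_POINTS is copied from the slide.

```python
# (threshold, precision, recall) triples copied from the slide.
OPERATING_POINTS = [
    (3.64068, 0.81818, 0.00402), (2.45085, 0.81818, 0.06827),
    (1.85615, 0.81818, 0.13253), (1.24538, 0.8125, 0.19679),
    (0.91759, 0.8125, 0.26104), (0.29167, 0.76522, 0.3253),
    (0.06556, 0.74638, 0.38956), (-0.18092, 0.72327, 0.45382),
    (-0.52049, 0.67895, 0.51807), (-0.78885, 0.65198, 0.58233),
    (-1.18095, 0.55102, 0.64659), (-1.75574, 0.41163, 0.71084),
    (-2.07244, 0.34281, 0.7751), (-2.52684, 0.27941, 0.83936),
    (-3.13143, 0.22636, 0.90361), (-4.45694, 0.1649, 1.0),
]

def precision_recall(scores, labels, threshold):
    """Precision and recall when every score >= threshold is flagged."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def best_point(points, min_precision):
    """Highest-recall operating point that still meets the precision floor."""
    ok = [p for p in points if p[1] >= min_precision]
    return max(ok, key=lambda p: p[2]) if ok else None

# Toy scores and ground-truth labels (True = malicious), purely illustrative.
scores = [2.0, 1.0, 0.5, -0.5, -1.0]
labels = [True, True, False, True, False]

toy_precision, toy_recall = precision_recall(scores, labels, threshold=0.75)
chosen = best_point(OPERATING_POINTS, min_precision=0.80)
```

With a precision floor of 0.80, the table gives a threshold of 0.91759, which catches roughly a quarter of the malicious creatives (recall 0.26104).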

Page 8: Image analysis for malicious advertisement detection

IA Integration with CT

Feature Extractor

K Means

Histogram

Classifier

Servlet

Creative Tester

Image Analyzer

HTTP

Page 9: Image analysis for malicious advertisement detection

IA - API Example

Request:

{
  "requests": [
    {
      "imgid": "1",
      "imgurl": "http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
      "run_wnddlg": true
    }
  ]
}

Response:

{
  "responses": [
    {
      "imgurl": "http://ionix.zenfs.com/ct/dev2/screenshots/5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg",
      "imgid": "1",
      "classifiers": [
        {
          "classifier": "wnddlg",
          "status": true,
          "result": true,
          "conf": 0.40216639639794
        }
      ]
    }
  ]
}
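
A caller such as Creative Tester might build the request and read the response along these lines. This is only a sketch: the helper names are hypothetical, no HTTP call is made because the slides do not give the servlet endpoint, and the reading of "status" (classifier ran successfully), "result" (verdict) and "conf" (confidence) is an assumption based on the field names.

```python
import json

def build_request(images, run_wnddlg=True):
    """Serialise (imgid, imgurl) pairs into the request body from the slide."""
    return json.dumps({
        "requests": [
            {"imgid": imgid, "imgurl": url, "run_wnddlg": run_wnddlg}
            for imgid, url in images
        ]
    })

def flagged_images(response_text, min_conf=0.0):
    """Map imgid -> [(classifier, conf)] for positive verdicts.

    Assumption: a verdict counts only when "status" and "result" are both
    true and "conf" is at least min_conf.
    """
    flagged = {}
    for resp in json.loads(response_text)["responses"]:
        for clf in resp.get("classifiers", []):
            if (clf.get("status") and clf.get("result")
                    and clf.get("conf", 0.0) >= min_conf):
                flagged.setdefault(resp["imgid"], []).append(
                    (clf["classifier"], clf["conf"]))
    return flagged

IMG_URL = ("http://ionix.zenfs.com/ct/dev2/screenshots/"
           "5d079b5de50f6b30602e4a00b84a6e49e9443af7.jpg")

request_body = build_request([("1", IMG_URL)])

# The response from the slide, re-serialised for the example.
sample_response = json.dumps({
    "responses": [{
        "imgurl": IMG_URL,
        "imgid": "1",
        "classifiers": [{"classifier": "wnddlg", "status": True,
                         "result": True, "conf": 0.40216639639794}],
    }]
})

hits = flagged_images(sample_response)
strict_hits = flagged_images(sample_response, min_conf=0.5)
```

The min_conf parameter is where a caller could apply its own cutoff on the confidence score before tagging a creative for a ban.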

Page 10: Image analysis for malicious advertisement detection

Yahoo! Confidential & Proprietary.

Sample – Classified images

Page 11: Image analysis for malicious advertisement detection

What Does Success Look Like

• Who are the customers?
– RMX and APT creative serving systems.
– Moneyball (going forward).

• Success metrics
– Reducing the manual effort needed to identify Windows-mimic advertisements.
– This would be measured by the confidence score generated by the system, which would eventually allow the process to be fully automated.
– Reduction in customer complaints.

• Key business stakeholders who have validated or will validate success
– Serving systems
– Business teams
– Manual review teams

Page 12: Image analysis for malicious advertisement detection

Competitive Landscape

• 3rd party ad verification companies, etc.

• What differentiates our product/solution?
– Avoiding the need to expose and send out demand inventory.
– Flexibility to keep improving the algorithms for higher precision/recall.
– Quick turnaround time for validation.
– Building highly targeted models (for example, fake Facebook or fake Adobe).

Page 13: Image analysis for malicious advertisement detection