Machine Learning CROSS: First-Half Slides

Machine Learning CROSS, Part 1. Engineer Support CROSS 2013/01/17

Upload: shohei-hido

Post on 04-Dec-2014


Category:

Technology



DESCRIPTION

Slides for the first half of the Machine Learning CROSS session at Engineer Support CROSS 2014.

TRANSCRIPT

  • 1. Machine Learning CROSS, Part 1. Engineer Support CROSS 2013/01/17

  • 2. Agenda
  • 3. CROSSNG
  • 4. CROSS
  • 5. NIPS2013. Google/Amazon/Facebook/Yahoo/Microsoft. Facebook CEO
  • 6. 2013
  • 7. Web
  • 8. Agenda
  • 9. HIDO Shohei. Twitter: @sla. 2002: IPA; 2006:; 2006-2012: IBM; 2012-: Jubatus; 2013-: PFI & Chief Research Officer
  • 10. Yahoo! JAPAN. 2011-, 20127 Yahoo! JAPAN, 2005-2010, 2002-2005 A.T., 1992-2002, 20003
  • 11. Web50
  • 12. Web
  • 13. ALBERT - @komiya_atsushi. Web / AWS /
  • 14. FFRI, Inc. (@junichi_m) 20134
  • 15. @myui (NAIST) 20093 20104. DataGeek. XML, Many-core (64) (Lock-free), (MonetDB), Apache Hive. https://github.com/myui/hivemall (CVR). Hivemall. 14 IPA, PFI
  • 16. Gunosy. Gunosy (SNS Gunosy). 25. CEO. (ex) etc. Gunosy Inc.
  • 17. Agenda
  • 18. 1. a) c) 100% b) d)
  • 20. x. Web
  • 21. x, y. Web. x, y. x= y={, } x= y={, }
  • 22. Web. I/O. x, y
  • 23. Web
  • 25. 24 365
  • 27. and VW, and, and AKB. 1 perl
  • 29. R, Weka, Matlab, SPSS; SciPy, Shogun; bigML, Bazil; Mahout, Jubatus, Oryx, hivemall
  • 30. Agenda
  • 32. YDN
  • 33. DB, Query, Short List, Long List, Short List
  • 34. CTR (Click-Through-Rate). 42
        Ad   Col. 2   CTR    Col. 2 x CTR
        A    20       0.25   5.0
        B    15       0.20   3.0
        C    30       0.15   4.5
        D    100      0.01   1.0
    A, C
  • 35. pua CTR
  • 36. http://dl.acm.org/citation.cfm?id=2501978
  • 39. by RIT. etc.
  • 40. 2010, 750ml
  • 41. 1. HTML, DB. 2. DB 2. Table data Generation. Chateau d'Issan 1994. Database: This is a wine from Margaux. ... Annotation. Rule: wine from x => x is a Region. This is a wine from Lafite Rothschild. New Region!
  • 43. text / non-text. Classify text/non-text
  • 44. About ALBERT. http://bit.ly/alb_recruit. 2013 ALBERT Inc.
  • 45. http://chiefmartec.com/2014/01/marketing-technology-landscape-supergraphic-2014/
  • 46. Display Advertising, Data Management Platform, CRM, Marketing Automation. http://chiefmartec.com/2014/01/marketing-technology-landscape-supergraphic-2014/
  • 47. k-means, k-
  • 48. ALBERT
  • 49. 12
  • 50. CTR, CTR
  • 51. @[] CVR (Conversion Rate). CVR = #CV / #CLICKS. CV (| ). Terabytes, 60-100GB, 1000. RDB, TSV, HDFS. CV 0.95 AUC. 1 AUC. Hivemall 325-10. 1000 map (#mapslot). Hive+UDF, Columnar (ORC). CLICK 0.2%. CTR = #CLICKS / #Impression. 500
  • 52. Hadoop/Hive ELT (Extract-Load-Transform): HDFS, Hadoop/Hive, Hive, CVR, view, table, UDF, Label, Web service, transform, Logs, Join, extract, load, OLTP DBs, ID, Transform script, KDD Cup 2012, ETL, UDF, Hive Transform
  • 53. (1) (e.g., (SGD)). SGD (CW/AROW/SCW), Passive Aggressive. concept drift? Lazy? 10
  • 54. (2) /(?) shuffle. CW/AROW..? CW/AROW/SCW 3. Hadoop cluster, Postgres, Training data, OLTP transactions, node, Incremental learning, Prediction model, Cloudera Oryx, DB-Hadoop Hybrid machine learning, Batch learning
  • 55. FFRI, Inc.
  • 57. 2006, 2013. http://www.av-test.org/en/statistics/malware/
  • 58. or API n-gram: NtCreateFile_NtWriteFile_NtCloseHandle. TPR: 90%, FPR: 15%. FPR: 1% NG
  • 59. DB1000. API n-gram 3
  • 60. Gunosy Inc.
  • 61. Gunosy. Gunosy (SNS Gunosy). 25. CEO. (ex) etc. Gunosy Inc.
  • 63. Preferred Infrastructure (PFI). 20063. Sedue, Bazil, Jubatus. IR
  • 64. Jubatus: Hadoop. Hadoop, CEP (Complex Event Processing). Jubatus. NTT SIC. http://jubat.us/
  • 65. Bazil. PDCA, ASP. OS, Web GUI
  • 66. Agenda
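Slide 34 lists four ads (A to D), each with a number, a CTR, and the product of the two, and then names A and C. The sketch below is one possible reading of that slide, not the speaker's confirmed method: it assumes the second column is a per-click bid and that the two ads with the largest bid times CTR are chosen. The field name `bid`, the helper names, and `k=2` are assumptions, because the column header did not survive in the transcript.

```python
from typing import List, NamedTuple


class Ad(NamedTuple):
    name: str
    bid: float  # assumption: the unlabeled second column on slide 34
    ctr: float  # click-through rate from slide 34


def expected_value(ad: Ad) -> float:
    """Score an ad by bid x CTR (the fourth column on the slide)."""
    return ad.bid * ad.ctr


def select_top(ads: List[Ad], k: int) -> List[Ad]:
    """Return the k ads with the highest bid x CTR, best first."""
    return sorted(ads, key=expected_value, reverse=True)[:k]


# Rows copied from slide 34.
ADS = [
    Ad("A", 20, 0.25),
    Ad("B", 15, 0.20),
    Ad("C", 30, 0.15),
    Ad("D", 100, 0.01),
]

winners = [ad.name for ad in select_top(ADS, k=2)]
print(winners)
```

Under this reading, ad D has the largest value in the second column but the lowest score, which illustrates why the ranking depends on CTR as well.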
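Slide 41 states an annotation rule, "wine from x => x is a Region", and applies it to two sentences. A minimal sketch of how such a rule could be expressed as a regular expression follows. The pattern (a run of capitalized words after "wine from"), the function name `annotate`, and the `known_regions` set are illustrative assumptions; the slide does not say how the rule was implemented.

```python
import re
from typing import Optional, Set, Tuple

# Assumption: x is a run of capitalized words following "wine from".
RULE = re.compile(r"wine from ([A-Z][\w']*(?: [A-Z][\w']*)*)")


def annotate(sentence: str, known_regions: Set[str]) -> Optional[Tuple[str, bool]]:
    """Apply the rule 'wine from x => x is a Region'.

    Returns (region, is_new), where is_new is True when the extracted
    region is not yet in known_regions, or None if the rule does not fire.
    """
    match = RULE.search(sentence)
    if match is None:
        return None
    region = match.group(1)
    return region, region not in known_regions


known = {"Margaux"}  # hypothetical seed database
print(annotate("This is a wine from Margaux.", known))
print(annotate("This is a wine from Lafite Rothschild", known))
```

The second sentence yields a value that is not in the seed set, which corresponds to the "New Region!" note on the slide.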
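Slide 51 gives two ratio definitions: CVR = #CV / #CLICKS and CTR = #CLICKS / #Impression. The short sketch below only restates those formulas. The function names, the zero-denominator handling, and the example counts are hypothetical and are not taken from the slides.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = #CLICKS / #Impression (returns 0.0 when there are no impressions)."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(conversions: int, clicks: int) -> float:
    """CVR = #CV / #CLICKS (returns 0.0 when there are no clicks)."""
    return conversions / clicks if clicks else 0.0


# Hypothetical counts, chosen only to exercise the formulas.
impressions, clicks, conversions = 1_000_000, 2_000, 40
print(click_through_rate(clicks, impressions))
print(conversion_rate(conversions, clicks))
```

Returning 0.0 for an empty denominator is a design choice for this sketch; a real pipeline might prefer to drop such rows or smooth the estimate.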
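Slide 58 mentions API n-grams with the example feature `NtCreateFile_NtWriteFile_NtCloseHandle`. A minimal sketch of how such features could be built from a sequence of API call names is shown below, assuming that n consecutive calls are joined with underscores and counted. The function name, the default n=3, and the sample trace are assumptions; the slides do not describe the actual feature extraction.

```python
from collections import Counter
from typing import List


def api_ngrams(calls: List[str], n: int = 3) -> Counter:
    """Count n-grams of consecutive API calls, joined with '_'."""
    return Counter(
        "_".join(calls[i:i + n]) for i in range(len(calls) - n + 1)
    )


# Hypothetical trace; the first three names come from slide 58.
trace = ["NtCreateFile", "NtWriteFile", "NtCloseHandle", "NtCreateFile"]
features = api_ngrams(trace, n=3)
print(features)
```

The resulting counts could then serve as input features for a classifier; a trace shorter than n simply produces no features.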