  • Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions [EMNLP 2018] Sho Yokoi (Tohoku U./AIP)*, Sosuke Kobayashi (PFN), Kenji Fukumizu (ISM), Jun Suzuki*, Kentaro Inui*

    github.com/cl-tohoku/phsic
    pip install phsic-cli
    https://bit.ly/2SfpZmv
    arXiv:1809.00800

    Computing Co-occurrence Strength in NLP
    observed: paired data $\mathcal{D} = \{(x_i, y_i)\}_i \sim P_{XY}$
    task: compute the “co-occurrence/association/relation strength” of $(x, y) \in \mathcal{X} \times \mathcal{Y}$

    Collocation Extraction
      (New, York)
      (in, the)

    Dialogue Response Selection
      (I've lost my …, I saw it at …)
      (I've lost my …, I don’t know)

    PMI: easy to learn, but inappropriate for sparse data

    $\mathrm{PMI}(x, y) = \log \dfrac{n \cdot \#(x, y)}{\#(x, \cdot)\,\#(\cdot, y)}$
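The count-based PMI formula above can be sketched in a few lines of Python. This is a minimal illustration, not the `phsic` implementation: the function name `pmi` and the toy `pairs` list are invented here. It also shows why raw counts struggle with sparse data: any pair that was never observed together gets a score of negative infinity, however plausible it is.

```python
import math
from collections import Counter


def pmi(pairs, x, y):
    """Count-based PMI: log(n * #(x, y) / (#(x, .) * #(., y))).

    `pairs` is a list of observed (x, y) tuples (hypothetical toy input).
    """
    n = len(pairs)
    joint = Counter(pairs)                # #(x, y)
    left = Counter(a for a, _ in pairs)   # #(x, .)
    right = Counter(b for _, b in pairs)  # #(., y)

    if left[x] == 0 or right[y] == 0:
        return float("nan")   # x or y never observed: PMI is undefined
    if joint[(x, y)] == 0:
        return float("-inf")  # pair never co-occurred: log(0)
    return math.log(n * joint[(x, y)] / (left[x] * right[y]))


# Toy paired data, made up for illustration only.
pairs = [
    ("new", "york"), ("new", "york"), ("new", "car"),
    ("in", "the"), ("in", "a"), ("of", "the"),
]

print(pmi(pairs, "new", "york"))  # log(6 * 2 / (3 * 2)) = log(2), about 0.693
print(pmi(pairs, "of", "a"))      # -inf: both words were seen, but never together
print(pmi(pairs, "on", "the"))    # nan: "on" never appears as x
```

With whole sentences or phrases as x and y, nearly every candidate pair falls into the last two cases, which is the sparsity problem the slide refers to.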

    $\mathrm{PMI}(x, y) = \log \dfrac{\mathbf{P}(y \mid x)}{\mathbf{P}(y)}$
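The probabilistic form and the count-based form of PMI agree once probabilities are replaced by relative frequencies. A short derivation, assuming maximum-likelihood estimates from the $n$ observed pairs (this estimation step is spelled out here for illustration and is not stated on the slide):

```latex
\begin{align*}
\mathrm{PMI}(x, y)
  &= \log \frac{\mathbf{P}(y \mid x)}{\mathbf{P}(y)}
   = \log \frac{\mathbf{P}(x, y)}{\mathbf{P}(x)\,\mathbf{P}(y)} \\
  &\approx \log \frac{\#(x, y)/n}{\bigl(\#(x, \cdot)/n\bigr)\,\bigl(\#(\cdot, y)/n\bigr)}
   = \log \frac{n \cdot \#(x, y)}{\#(x, \cdot)\,\#(\cdot, y)}
\end{align*}
```

The first line uses $\mathbf{P}(y \mid x) = \mathbf{P}(x, y) / \mathbf{P}(x)$; the second substitutes counts divided by $n$ for each probability.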