bigquery, fluentd and tagomoris #gcpja
BigQuery, Fluentd, and tagomoris
gcp ja night #28
2014/09/16
TAGOMORI Satoshi (tagomoris)
Satoshi Tagomori (@tagomoris)
LINE Corporation
Analytics Platform Team
Super Dryyyyyyyyyyyyyyyyyyyyyyyy
極度乾燥(しなさい) ("Extreme dry (do it).")
BigQuery
What all of us love :)
Fluentd
The one with the cute logo :)
Fluentd
Readable configuration
Flexible buffer system
Various input/output plugins
Very simple/easy plugin system
High performance for many uses
fluent-plugin-bigquery
Insert events from Fluentd into BigQuery
via streaming inserts
Table Sharding inserts
versions
v0.0.x ~ v0.1.x
by @tagomoris
many patches from @yugui
v0.2.x
KAIZEN platform Inc (@naoya_ito)
tagomoris: I want a new maintainer for fluent-plugin-bigquery, someone who actually uses BQ...
naoya_ito: OK, KAIZEN platform will do it!
tagomoris: Great! I'll transfer my repository to KAIZEN's account... Can I keep the commit bit for Fluentd-related fixes?
naoya_ito: Sure!
( д) ゚ ゚
Disclosure: I’m an employee of LINE now!
performance
Use Table Sharding inserts
tables table1,table2,table3,table4
Use many threads for concurrent insertion
num_threads 4
(Same as the number of tables)
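To picture why the thread count is matched to the number of tables, here is a minimal Python sketch of round-robin sharding with concurrent workers. It is not the plugin's code (the plugin is written in Ruby); ShardedInserter, fake_insert and run_demo are hypothetical names, and the insert call is a stand-in that only counts rows.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class ShardedInserter:
    """Hands each chunk of rows to the next table in round-robin order.

    Illustrative only: insert_fn stands in for a real streaming-insert call.
    """

    def __init__(self, tables, insert_fn):
        self._cycle = itertools.cycle(tables)
        self._lock = threading.Lock()  # the cycle iterator is shared by all workers
        self._insert_fn = insert_fn

    def insert(self, rows):
        with self._lock:
            table = next(self._cycle)
        # The (slow) insert itself runs outside the lock, so workers overlap.
        self._insert_fn(table, rows)
        return table


def run_demo():
    tables = ["table1", "table2", "table3", "table4"]  # as in `tables ...`
    sent = {}
    sent_lock = threading.Lock()

    def fake_insert(table, rows):
        # Hypothetical stand-in: count rows per table instead of calling an API.
        with sent_lock:
            sent[table] = sent.get(table, 0) + len(rows)

    inserter = ShardedInserter(tables, fake_insert)
    chunks = [[{"n": i}] * 10 for i in range(8)]  # 8 chunks of 10 rows each
    with ThreadPoolExecutor(max_workers=4) as pool:  # as in `num_threads 4`
        list(pool.map(inserter.insert, chunks))
    return sent


if __name__ == "__main__":
    print(run_demo())
```

With four tables and four workers, each worker can be writing to a different table at the same moment, so no single table absorbs all of the traffic.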
Features
Authentication
auth_method compute_engine
auth_method private_key
Schema
specs per field
schema_path
fetch_schema
API Quota
Number of records over streaming inserts
10,000 rows per sec per table
10MB per sec per table
Use Table Sharding
Max row size: 20KB
Max data size per insert: 1MB
Max rows per request: 500
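The per-request limits above can be respected by splitting rows into batches before sending. The following Python sketch is only an illustration under stated assumptions: batch_rows is a hypothetical helper, row size is approximated by the JSON-encoded byte length (how BigQuery actually measures size is not stated here), and KB/MB are taken as 1024-based.

```python
import json

# Limits from the quota list above (1024-based units are an assumption).
MAX_ROW_BYTES = 20 * 1024
MAX_REQUEST_BYTES = 1024 * 1024
MAX_ROWS_PER_REQUEST = 500


def batch_rows(rows):
    """Yield lists of rows that stay within the per-request limits.

    Row size is approximated by the length of its JSON encoding,
    which is an assumption rather than the service's own accounting.
    """
    batch, batch_bytes = [], 0
    for row in rows:
        size = len(json.dumps(row).encode("utf-8"))
        if size > MAX_ROW_BYTES:
            raise ValueError(f"row is {size} bytes, over the {MAX_ROW_BYTES}-byte limit")
        # Close the current batch if adding this row would break a limit.
        if batch and (
            len(batch) >= MAX_ROWS_PER_REQUEST
            or batch_bytes + size > MAX_REQUEST_BYTES
        ):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(row)
        batch_bytes += size
    if batch:
        yield batch


if __name__ == "__main__":
    small_rows = [{"id": i} for i in range(1200)]
    print([len(b) for b in batch_rows(small_rows)])
```

The per-second, per-table limits (10,000 rows and 10MB) are not handled here; those are what table sharding spreads out.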
Hobby programming & Cloud services
For hobby programming
Setting up environments is very troublesome...
Cloud services are easy to use!
Cloud-service-specific limitations/restrictions are fun to play with!
Enjoy!