copyright 2004 mysql abthe world’s most popular open source database writing storage engines brian...

42
Copyright 2004 MySQL AB The World’s Most Popular Open Source Database Writing Storage Engines Brian Aker Director of Architecture Montreal PHP Conference March 2005 MySQL AB

Upload: piers-tyler

Post on 28-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Writing Storage Engines

Brian AkerDirector of Architecture

Montreal PHP ConferenceMarch 2005

MySQL AB

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Who am I?

• Brian Aker– Director of Architecture, MySQL AB

– Author of mod_layout, the apache streaming services mod_mp3, Slash (Slashdot’s CMS System), and lot of other things on Freshmeat....

– http://mysql.com/

– http://krow.net/

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

MySQL

• 5 million installations

• 200 employees, 20+ countries– Most North American Developers live in Seattle

– We even have an office in Seattle

• No, developers never go there

Copyright 2003 MySQL AB The World’s Most Popular Open Source Database

MySQL Server

• High Performance RDBMS

• SQL-Based, aiming to be SQL-99 compliant

• Stable

• Scalable– Embedded in hardware (including JMX)

– Extremely high load applications

– Master/Slave Replication

• Easy to use

• Modular – “Storage Engines”

– Many features can be disabled at runtime and/or compile time to conserve resources

Copyright 2003 MySQL AB The World’s Most Popular Open Source Database

Client Library Support

• Libmysql c-library (think OCI)

• JDBC – Type IV JDBC Driver

• ODBC

• Perl DBD::DBI

• PHP (built in)

• ADO.Net, OleDB, Ruby, Erlang, Eiffel, Smalltalk, etc, etc. provided by third parties

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Goals

• Overview of MySQL Architecture

• Understanding of Storage Engine Architecture

• Knowledge of required methods

• Starting points for coding– sql/ is for the kernel

– mysys/ is the portable runtime

– mysql-test is for you test cases

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

What does it take?

• All code is written in simplified C++

• An example storage engine

• Your Ideas

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Server’s Kernel

Parser

Optimizer

Storage Engine

MyISAM Innodb NDB HEAP Merge

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

What is a Storage Engine?

• “Data formats on Disk”

• Examples– Innodb

– MyISAM

– BDB

– Cluster

– HEAP

– CSV

– Your’s!

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Example Table

• CREATE TABLE foo (

• a int,

• b char(4),

• c varchar(9),

• d blob)

• ENGINE = MYISAM;

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Rows

• Rows are made up of Fields

NULL INT CHAR VAR BLOB

NULL 4 4 L + 9 L + P

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

• Rows are made up of fields

Fields

NULL C H A R

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

What do I need to do to add one?

• Subclass Field in field.h

• Implement a few methods:– Storage: store(string), store(long long), store(double)

– Retrieve:val_real(), val_int(), val_str()

– Other: field_cast_type(), result_type(), cmp(), sort_string(), max_length()

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Field Store Example

• int Field_ipaddrv4::store(const char *from, uint length, CHARSET_INFO *cs){ int count; count= sscanf(from, "%u.%u%u.%u", ptr, (ptr +1), (ptr +2), (ptr +3)); if (count != 4) { bzero(ptr, 4); return -1; } return 0;}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Field val Example

• String *Field_ipaddrv4::val_str(String *val_buffer __attribute__((unused)), String *val_ptr){ int count;

count= snprintf(buffer, 15, "%u.%u.%u.%u", ptr[0], ptr[1], ptr[2], ptr[3]); val_ptr->set((const char*) buffer,count, &my_charset_latin1);

return val_ptr;}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Break Down of Storage Engine Methods

• Table Control

• Optimizer

• SQL Modifiers

• SQL Reads

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Table Control

• ::create()

• ::open()

• ::close()

• ::delete_table()

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::create()

• int ha_example::create(const char *name, TABLE *table_arg, HA_CREATE_INFO *create_info){ DBUG_ENTER("ha_example::create"); /* This is not implemented but we want someone to be able that it works. */ DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::open()

• int ha_example::open(const char *name, int mode, uint test_if_locked){ DBUG_ENTER("ha_example::open");

if (!(share = get_share(name, table))) DBUG_RETURN(1); thr_lock_data_init(&share->lock,&lock,NULL);

DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::close()

• int ha_example::close(void){ DBUG_ENTER("ha_example::close"); DBUG_RETURN(free_share(share));}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::delete_table()

• int ha_example::delete_table(const char *name){ DBUG_ENTER("ha_example::delete_table"); /* This is not implemented but we want someone to be able that it works. */ DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Optimizer

• ::info()

• ::records_in_range()

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

info()

• void ha_heap::info(uint flag)

• { records = info.records; deleted = info.deleted; errkey = info.errkey; mean_rec_length=info.reclength; data_file_length=info.data_length; index_file_length=info.index_length; max_data_file_length= info.max_records* info.reclength; delete_length= info.deleted * info.reclength; if (flag & HA_STATUS_AUTO) auto_increment_value= info.auto_increment;}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

records_in_range()

• ha_rows ha_example::records_in_range(uint inx, key_range *min_key, key_range *max_key){ DBUG_ENTER("ha_example::records_in_range"); DBUG_RETURN(10); // low number to force index usage}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

SQL Modifiers

• delete_row()

• write_row()

• update_row()

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::delete_row()

• int ha_example::delete_row(const byte * buf){ DBUG_ENTER("ha_example::delete_row"); DBUG_RETURN(HA_ERR_WRONG_COMMAND);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_archive::insert_row()

• int ha_archive::write_row(byte * buf){ char *pos; z_off_t written; DBUG_ENTER("ha_archive::write_row");

statistic_increment(ha_write_count,&LOCK_status); if (table->timestamp_default_now) update_timestamp(buf+table->timestamp_default_now-1); written= gzwrite(share->archive_write, buf, table->reclength); DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_tina::write_row()

• int ha_tina::update_row(const byte * old_data, byte * new_data){ int size; DBUG_ENTER("ha_tina::update_row"); size= encode_quote(new_data); if (chain_append()) DBUG_RETURN(-1); if (my_write(share->data_file, buffer.ptr(), size, MYF(MY_WME | MY_NABP))) DBUG_RETURN(-1); DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

SQL Reads

• Scan Reads– rnd_init(), rnd_next(), position(), rnd_pos()

• Index Reads– index_read(), index_next(), index_prev(), index_first(),

index_last()

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_tina::rnd_init()

• int ha_tina::rnd_init(bool scan){ DBUG_ENTER("ha_tina::rnd_init");

current_position= next_position= 0; records= 0; chain_ptr= chain; if (scan) (void)madvise(share->mapped_file,share->file_stat.st_size,MADV_SEQUENTIAL);

DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_tina::rnd_next()

• int ha_tina::rnd_next(byte *buf){ DBUG_ENTER("ha_tina::rnd_next");

current_position= next_position; if (!share->mapped_file) DBUG_RETURN(HA_ERR_END_OF_FILE); if (HA_ERR_END_OF_FILE == find_current_row(buf) ) DBUG_RETURN(HA_ERR_END_OF_FILE);

records++; DBUG_RETURN(0);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_tina::position()

• void ha_tina::position(const byte *record){ DBUG_ENTER("ha_tina::position"); ha_store_ptr(ref, ref_length, current_position); DBUG_VOID_RETURN;}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_tina::rnd_pos()

• int ha_tina::rnd_pos(byte * buf, byte *pos){ DBUG_ENTER("ha_tina::rnd_pos"); current_position= ha_get_ptr(pos,ref_length); DBUG_RETURN(find_current_row(buf));}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::index_read()

• int ha_example::index_read(byte * buf, const byte * key, uint key_len __attribute__((unused)), enum ha_rkey_function find_flag __attribute__((unused))){ DBUG_ENTER("ha_example::index_read"); DBUG_RETURN(HA_ERR_WRONG_COMMAND);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

ha_example::index_next()

• /* Used to read forward through the index.*/int ha_example::index_next(byte * buf){ DBUG_ENTER("ha_example::index_next"); DBUG_RETURN(HA_ERR_WRONG_COMMAND);}

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Table Scan

• ha_example::store_lockha_example::external_lockha_example::infoha_example::rnd_initha_example::extra Cash record in HA_rrnd()ha_example::rnd_nextha_example::rnd_nextha_example::rnd_nextha_example::extra End cacheing of records (def)ha_example::external_lockha_example::extra Reset database to after open

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

That is All?

• Transaction methods

• Bulk load methods

• Defrag methods

• Lot more (read handler.h)

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Autoconf

• Autoconf files in the top-level source directory– acconfig.h

– acinclude.m4

– config.in

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Additional Files

• Basic server files modified under sql/– sql/Makefile.am

– sql/handler.h

– sql/mysql_priv.h

– sql/handler.cc

– sql/mysqld.cc

– sql/set_var.cc

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Test Cases

• Test cases created under mysql-test– mysql-test/include/have_mmap.inc

– mysql-test/t/mmap.test

– mysql-test/r/mmap.result

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

Other Thoughts

• What are your goals?– Read only?

– Durable?

– Network?

Copyright 2004 MySQL AB The World’s Most Popular Open Source Database

More Information

• sql/ha_example.[h|cc]

• Look at Documents on mysql.com

• lists.mysql.com

• MySQL Support Contracts