For large tables partitioning them can significantly improve performance. If the table has a date field a good way to partition would be to use MONTH-HASH partitioning. The date field needs to be the primary key or at least part of the primary key
Creating a table with this partitioning
CREATE TABLE `stats` ( `id` int(11) NOT NULL AUTO_INCREMENT, `time` datetime, `client` int(11), `api` int(11), `num_requests` int(255), `avg_time` float, `90_percentile` float, `95_percentile` float, PRIMARY KEY (`id,time`) ) ENGINE=MyINNODB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci PARTITION BY RANGE( MONTH(time) ) PARTITIONS 12 SUBPARTITION BY HASH( TO_DAYS(time) ) SUBPARTITIONS 2 ( PARTITION pJAN VALUES LESS THAN (2), PARTITION pFEB VALUES LESS THAN (3), PARTITION pMAR VALUES LESS THAN (4), PARTITION pAPR VALUES LESS THAN (5), PARTITION pMAY VALUES LESS THAN (6), PARTITION pJUN VALUES LESS THAN (7), PARTITION pJUL VALUES LESS THAN (8), PARTITION pAUG VALUES LESS THAN (9), PARTITION pSEP VALUES LESS THAN (10), PARTITION pOCT VALUES LESS THAN (11), PARTITION pNOV VALUES LESS THAN (12), PARTITION pDEC VALUES LESS THAN MAXVALUE ) ;
Modifying an old table
Add the date field into the primary key
MariaDB [api_perf]> alter table stats DROP PRIMARY KEY, ADD PRIMARY KEY (id,time); Query OK, 0 rows affected (0.75 sec) Records: 0 Duplicates: 0 Warnings: 0
Alter the table
MariaDB [api_perf]> alter table stats -> PARTITION BY RANGE( MONTH(time) ) -> PARTITIONS 12 -> SUBPARTITION BY HASH( TO_DAYS(time) ) -> SUBPARTITIONS 2 ( -> PARTITION pJAN VALUES LESS THAN (2), -> PARTITION pFEB VALUES LESS THAN (3), -> PARTITION pMAR VALUES LESS THAN (4), -> PARTITION pAPR VALUES LESS THAN (5), -> PARTITION pMAY VALUES LESS THAN (6), -> PARTITION pJUN VALUES LESS THAN (7), -> PARTITION pJUL VALUES LESS THAN (8), -> PARTITION pAUG VALUES LESS THAN (9), -> PARTITION pSEP VALUES LESS THAN (10), -> PARTITION pOCT VALUES LESS THAN (11), -> PARTITION pNOV VALUES LESS THAN (12), -> PARTITION pDEC VALUES LESS THAN MAXVALUE -> ) -> ; Query OK, 0 rows affected (2.95 sec) Records: 0 Duplicates: 0 Warnings: 0 MariaDB [api_perf]>
Testing
This function within MySQL allows the data to be recycled back into the month partition regardless of the year. It saves having to create new partitions all the time but will still need to have a scheduled job to clear up old data.
By performing inserts over different years and checking the number of TABLE_ROWS per partition, we can see the reuse of the same partition.
MariaDB [api_perf]> insert into stats (time,client,api,num_requests,avg_time) values("2013-02-01 00:00:00",1,1,30,12.2); Query OK, 1 row affected (0.00 sec) MariaDB [api_perf]> insert into stats (time,client,api,num_requests,avg_time) values("2014-02-01 00:00:00",1,1,30,12.2); Query OK, 1 row affected (0.05 sec) MariaDB [api_perf]> insert into stats (time,client,api,num_requests,avg_time) values("2015-02-01 00:00:00",1,1,30,12.2); Query OK, 1 row affected (0.00 sec) MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> MariaDB [api_perf]> select * FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME='stats' and PARTITION_NAME="pFEB"\G; *************************** 1. row *************************** TABLE_CATALOG: NULL TABLE_SCHEMA: api_perf TABLE_NAME: stats PARTITION_NAME: pFEB SUBPARTITION_NAME: pFEBsp0 PARTITION_ORDINAL_POSITION: 2 SUBPARTITION_ORDINAL_POSITION: 1 PARTITION_METHOD: RANGE SUBPARTITION_METHOD: HASH PARTITION_EXPRESSION: MONTH(time) SUBPARTITION_EXPRESSION: TO_DAYS(time) PARTITION_DESCRIPTION: 3 TABLE_ROWS: 1 AVG_ROW_LENGTH: 16384 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL *************************** 2. row *************************** TABLE_CATALOG: NULL TABLE_SCHEMA: api_perf TABLE_NAME: stats PARTITION_NAME: pFEB SUBPARTITION_NAME: pFEBsp1 PARTITION_ORDINAL_POSITION: 2 SUBPARTITION_ORDINAL_POSITION: 2 PARTITION_METHOD: RANGE SUBPARTITION_METHOD: HASH PARTITION_EXPRESSION: MONTH(time) SUBPARTITION_EXPRESSION: TO_DAYS(time) PARTITION_DESCRIPTION: 3 TABLE_ROWS: 2 AVG_ROW_LENGTH: 8192 DATA_LENGTH: 16384 MAX_DATA_LENGTH: NULL INDEX_LENGTH: 0 DATA_FREE: 0 CREATE_TIME: NULL UPDATE_TIME: NULL CHECK_TIME: NULL CHECKSUM: NULL PARTITION_COMMENT: NODEGROUP: default TABLESPACE_NAME: NULL 2 rows in set (0.00 sec) ERROR: No query specified