tisdag 19 maj 2020

Automate MySQL Partitioning using events and procedures


The purpose of this post if to show how you can automate the creation and pruning of partitioned tables. If you want to read about partitioning I recommend reading our manual.

In short partitioning makes it possible to spread your individual tables across a file system according to the partition rules you specify.

Reasons for partition your tables might be:
- Insert performance, smaller index trees for stable insert throughput.
- Select performance, only read data from selected (aka partitioning pruning) partitions.
- Delete performance, drop partitioning is must quicker than doing range deletes of old data.

Partitioning definitions is part for the CREATE/ALTER table statements, in the samples below we will use RANGE partitions on a DATE column in our table. This design would be appropriate for log/transaction tables where you want to remove old data.

Lets first create a simple table called logs:

       
CREATE TABLE logs (
    id INT AUTO_INCREMENT NOT NULL,
    message VARCHAR(500) NOT NULL,
    code TINYINT UNSIGNED NOT NULL,
    created DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (id,created)
); 


Next step we want to partition this table using the column created, we will use range partitoning and partition data per day like:

       
ALTER TABLE logs PARTITION BY RANGE COLUMNS(created) (
    PARTITION p20200217 VALUES LESS THAN ('2020-02-17'),
    PARTITION p20200218 VALUES LESS THAN ('2020-02-18'),
    PARTITION p20200219 VALUES LESS THAN ('2020-02-19'),
    PARTITION p20200220 VALUES LESS THAN ('2020-02-20'),
    PARTITION p20200221 VALUES LESS THAN ('2020-02-21')
);


After the initial setup of our partitioned table it's now time to think about how to make sure we always have new partitions to insert data to and also remove old partitions. You can of course do this as a manual task using ALTER TABLE statements but that will become a tedious and error prune task over time.

To solve this problem we can use events and stored procedures in MySQL, we will also have a helper table to store some meta data of the tables we want to automate partitioning for.

Lets first create a table that will contain some information about the tables we want to automate partition handling for.

       
CREATE TABLE IF NOT EXISTS `MyPartTables` (
  `schema_name` VARCHAR(100) NOT NULL,
  `table_name` VARCHAR(100) NOT NULL,
  `no_of_future_partitions` INT NOT NULL,
  `max_partitions` INT NOT NULL,
  PRIMARY KEY (`schema_name`,`table_name`)
);


By looking at column names its quite clear what they mean, you simply insert one row per table you want to automate partition handling for. For our example table above this would look something like:

       
INSERT INTO MyPartTables VALUES ('ted','logs',2,20);


So, for out logs table in schema ted this means, create 2 partitions in the future and keep a maximum of 20 partitions before starting to remove old partitions.

Next we need to have some stored procedures to handle the create of new partitions and removal of old partitions. Both of these stored procedures will use INFORMATION_SCHEMA.PARTITIONS to look at current state and then add/remove any needed partitions as specified by information in MyPartTables for the specific table.

First lets have a look at AddPartitions, this procedure will fetch all rows from MyPartTables and then add any needed partitions. It's today hard coded to work only with RANGE partitioning using days for tables that look like our test table logs above.

       
CREATE PROCEDURE `AddPartitions`()
BEGIN
  DECLARE done   BOOL DEFAULT FALSE;
  DECLARE maxp   INT DEFAULT 0;
  DECLARE pdate  INT DEFAULT 0;
  DECLARE pname  VARCHAR(20);    
  DECLARE v_table_name  VARCHAR(100);
  DECLARE v_schema_name  VARCHAR(100);
  DECLARE v_no_future_part INT;

  DECLARE list CURSOR FOR SELECT `table_name`,`schema_name`,`no_of_future_partitions` FROM MyPartTables;

  DECLARE CONTINUE HANDLER FOR
  SQLSTATE '02000'
  SET done = TRUE;

  OPEN list;
  tloop: LOOP
    IF done THEN
      CLOSE list;
      LEAVE tloop;
    END IF;

    FETCH list INTO v_table_name,v_schema_name,v_no_future_part;
    -- DEBUG SELECT v_table_name,v_schema_name,v_no_future_part;

    SET pdate=TO_DAYS(DATE(NOW() + INTERVAL (v_no_future_part) DAY));
    SELECT TO_DAYS(TRIM(BOTH "'" FROM MAX(PARTITION_DESCRIPTION))) INTO maxp
           FROM INFORMATION_SCHEMA.PARTITIONS
           WHERE TABLE_NAME=v_table_name
           AND TABLE_SCHEMA=v_schema_name
           AND PARTITION_DESCRIPTION!="MAXVALUE";
    -- DEBUG' SELECT pdate, maxp;

    WHILE pdate > maxp DO
       SET maxp = maxp + 1;
       SET pname = CONCAT('p', DATE_FORMAT(FROM_DAYS(maxp), '%Y%m%d'));
       SET @qry = CONCAT('ALTER TABLE ',v_schema_name,'.',v_table_name, ' ADD PARTITION (PARTITION ',pname,' VALUES LESS THAN ("',DATE_FOR
MAT(FROM_DAYS(maxp),'%Y-%m-%d'),'"))');
       -- DEBUG SELECT @qry;
       PREPARE stmt FROM @qry;
       EXECUTE stmt;
       DEALLOCATE PREPARE stmt;
    END WHILE;
  END LOOP;
END ;;

Next lets look at the the procedure for removing old partitions, this is structured in the same way as the AddPartitions procedure above but instead of adding partitions it removes any partitions above the limit given in MyPartTables.

       
CREATE PROCEDURE RemovePartitions ()
BEGIN
  DECLARE done BOOL DEFAULT FALSE;
  DECLARE v_table_name  VARCHAR(100);
  DECLARE v_schema_name  VARCHAR(100);
  DECLARE v_max_partitions INT;
  DECLARE v_no_of_partitions_to_remove INT;

  DECLARE cur CURSOR FOR SELECT `table_name`,`schema_name`,`max_partitions` FROM MyPartTables;

  DECLARE CONTINUE HANDLER FOR
  SQLSTATE '02000'
  SET done = TRUE;

  OPEN cur;

  tloop: LOOP
    FETCH cur INTO v_table_name,v_schema_name,v_max_partitions;
    IF done THEN
      CLOSE cur;
      LEAVE tloop;
    END IF;
    -- DEBUG SELECT v_table_name,v_schema_name,v_max_partitions;
    SET @qry = CONCAT('SELECT COUNT(DISTINCT(PARTITION_DESCRIPTION)) INTO @v_no_of_partitions ',
                     'FROM INFORMATION_SCHEMA.PARTITIONS ',
                     'WHERE TABLE_NAME="',v_table_name, '" ',
                     'AND TABLE_SCHEMA = "',v_schema_name, '" ');
    -- DEBUG SELECT @qry;
    PREPARE stmt FROM @qry;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
    -- DEBUG SELECT @v_no_of_partitions, v_max_partitions;

    IF @v_no_of_partitions > v_max_partitions THEN
      SET v_no_of_partitions_to_remove = @v_no_of_partitions - v_max_partitions;
      SET @qry = CONCAT('SELECT GROUP_CONCAT(DISTINCT(PARTITION_NAME)) INTO @v_list_of_partitions_to_remove FROM ( ',
                            'SELECT PARTITION_NAME ',
                            'FROM INFORMATION_SCHEMA.PARTITIONS ',
                            'WHERE TABLE_NAME = "',v_table_name, '" ',
                            'AND TABLE_SCHEMA = "',v_schema_name, '" ',
                            'AND PARTITION_DESCRIPTION!="MAXVALUE" ',
                            'ORDER BY PARTITION_DESCRIPTION LIMIT ',v_no_of_partitions_to_remove,
                            ' ) as slabb');
      -- DEBUG SELECT @qry;
      -- DEBUG SELECT @v_partitions_to_remove;
      PREPARE stmt FROM @qry;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;

      SET @qry = CONCAT('ALTER TABLE ',v_table_name, ' DROP PARTITION ', @v_list_of_partitions_to_remove);
      -- DEBUG SELECT @qry;
      PREPARE stmt FROM @qry;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END IF;
  END LOOP;
END ;;


Now we only need to create two events that will run our add and remove partition procedures above, as we are dealing with daily partitions its enough to run the procedures once per day like:

       
CREATE EVENT AddPartitions ON SCHEDULE EVERY 1 DAY
DO
  BEGIN
  CALL AddPartitions();
END ;;

CREATE EVENT RemovePartitions ON SCHEDULE EVERY 1 DAY
DO
  BEGIN
  CALL RemovePartitions();
END ;;


If you want to try this out there is complete running example here.

This is just an example on how you can automate the procedure of handling your partitioned tables in MySQL, the code in the stored procedures is hard coded to only work with RANGE and daily partitioned tables but you adopt these to work with any kind of partition setups.

Having a metadata table like MyPartTables  makes it easy handle all of our partitioned tables.

If the result from the GROUP_CONCAT operation is large you need to increase the group_concat_max_len parameter, I recommend increasing group_concat_max_len to 50000.


söndag 10 maj 2020

Install MySQL Cluster Manager "MCM" using Systemd

For more information on how to create a service look at man systemd.service

Install

Directory for MCM installation is /opt Download latest tar package of MCM from edelivery.oracle.com or support.oracle.com (today this is package: MySQL Cluster Manager 1.4.8 TAR for Generic Linux x86 (64bit))
Run commands below once you have downloaded the MCM package:
sudo su -
cd /opt
tar xzf /tmp/mcm-1.4.8-linux-glibc2.12-x86-64bit.tar.gz
mv mcm-1.4.8-linux-glibc2.12-x86-64bit mcm
cp /opt/mcm/mcm1.4.8/etc/mcmd.ini /opt/mcm/
mkdir /opt/mcm/mcm_data
Set manager-directory to folder /opt/mcm/mcm_data in configuration file:
cat /opt/mcm/mcmd.ini | sed 's/^##manager-directory.*/manager-directory = \/opt\/mcm\/mcm_data/' >> /opt/mcm/mcmd.ini
Structure should be:
 /opt/mcm
         mcmd.ini
         mcm1.4.8
         mcm_data

 Folder mcm_data is where all data on disk will be stored by default.
 Folder mcmN.N.N contains installed version of MCM.

Systemd configuration

Create user to run the MCM service:
sudo useradd --no-create-home -s /bin/false mcm
sudo chown -R mcm:mcm /opt/mcm
Create systemd file:
Make sure to update path to correct version of mcm binaries in mcm service file below.
Create file /etc/systemd/system/mcm.service like below.
[Unit]
Description=MySQL Cluster Manager
Documentation=https://dev.mysql.com/doc/mysql-cluster-manager/en/
After=network-online.target

[Service]
User=mcm
Group=mcm
Restart=always
Type=simple

ExecStart=/opt/mcm/mcm1.4.8/bin/mcmd --defaults-file=/opt/mcm/mcmd.ini

[Install]
WantedBy=multi-user.target
Start and enable service:
sudo systemctl daemon-reload
sudo systemctl start mcm
sudo systemctl enable mcm
sudo systemctl status mcm
If the service is not started correctly, look in messages file: sudo tail -150f /var/log/messages  

Remember to set option StopOnError to 0 when creating your cluster so mcmd can restart failed data nodes (set StopOnError:ndbmtd=0 mycluster;).
You can find this and many other MySQL guides here: https://github.com/wwwted

Happy Clustering!

tisdag 5 maj 2020

Database change tracking using MySQL EE Audit

Use MySQL Enterprise Audut plugin to track data changes on specific tables.
Scripts and documentation on github: https://github.com/wwwted/Database-change-tracking
Tracked data (audit logs) can then be inserted into reporting db using parser scripts.
Limitations:
  • Changing tracking rules stopps all tracking for existing connections until the reconnect. This is a limitation in MySQL EE Audit filter handling.
  • No filtering for different users, easy to implement if needed in change_tracking.sql SP (START/STOP).
  • Only tracking INSERT/UPDATE/DELETE statements. Easy to modify if needed.

1 Enable tracking on target server

1.1 Enable audit plugin

Add below to my.cnf for JSON format: loose_audit_log_format = JSON
Load audit plugin:
mysql -uroot -proot < ./mysqlsrc/share/audit_log_filter_linux_install.sql
mysql -uroot -proot -se"show plugins" | grep audit
Set audit log to rotate on size (for test set it low so you get some files to read):
mysql> set persist audit_log_rotate_on_size=4096;

1.2 Install tracking script

mysql -uroot -proot < change_tracking.sql

2 Setup reporting server

Reporting server will host all audit logs from all sources and be the "reporting" database.

2.1 Add reporting schema and tables

mysql -umsandbox -pmsandbox -h127.0.0.1 -P8017 < report_schemas.sql

2.2 Import data into reporting database

Update script with database connection variables and path to audit files Run import: ./audit-parser-json.pl
Imported files are recorded in table audit_information.audit_jobs
Sample output:
mysql> select * from audit_information.audit_jobs;
+----+---------------------------+---------------------+-------------+
| ID | AUDIT_LOG_NAME            | PARSED_DATE_TIME    | LOG_ENTRIES |
+----+---------------------------+---------------------+-------------+
|  1 | audit.20200429T095031.log | 2020-04-29 11:51:21 |          27 |
|  2 | audit.20200429T102434.log | 2020-04-29 12:24:40 |          12 |
+----+---------------------------+---------------------+-------------+
Data is stored in table audit_information.audit_data. Scripts are not inserting all data from audit logs, just some sample columns.

3 USAGE

  • Start tracking:CALL tracking.START_TRACKING("db","table");
  • List tracked tables:CALL tracking.LIST_TRACKING();
  • Stop tracking: CALL tracking.STOP_TRACKING("db","table");
Tracking uses database tracking and one interal tables CHANGE_TRACKING.

4 Test

CREATE DATABASE test;
use test;
CREATE TABLE slafs (i INT);
CALL tracking.START_TRACKING("test","slafs");
--  (you need to log out and in again to have filter activated)
--  (filters are stored in select * from mysql.audit_log_filter;)
CALL tracking.LIST_TRACKING();
INSERT INTO test.slafs VALUES (6);
--  (run tail -f on audit-log)
CALL tracking.STOP_TRACKING("test","slafs");

Misc

Audit filters:
  • SELECT * FROM mysql.audit_log_filter;
  • SELECT * FROM mysql.audit_log_user;