Tuesday, November 24, 2009

MySQL Cluster Query execution process

MySQL Cluster executes the queries depending on the below type of scan’s it does

  1. Primary key lookup
  2. Unique key lookup
  3. Ordered index scan (i.e., non-unique indexes that use T-trees)
  4. Full table scan

Let’s say you have 4 data nodes in your cluster (NoOfReplicas=2). This means you have 2 node groups and each one has half the data. Cluster uses a hash on the primary key (unless you’ve controlled the partitioning using the 5.1 partitioning features). So for any table, half the rows are in one node group and half the rows are the in other node group.

Now for the 4 types of query execution. You can verify which type of execution is used with EXPLAIN. Here’s how each ones works:

  1. Primary key lookup - the MySQL server can calculate the hash on the primary key and know which data node the relevant row resides in. Then the MySQL server contacts that data node and receives the row.
  2. Unique key lookup - the MySQL server cannot know which data node the row might be stored in. So it contacts a pseudo-random node. That data node has a hidden table that maps the unique key values to the primary key. Then the hash on the primary key reveals where the row resides.
  3. Ordered index scans are sent in parallel to all data nodes, where they search their local t-tree.
  4. Full table scans are send in parallel to all data nodes, where they scan their primary fragment.

Summary: primary key lookups are best. If you have more than 2 nodes, throughput goes up because all nodes are actively serving different fragments of data. Ordered index lookup and full table scans are done in parallel, so more nodes leads to better performance.

No comments:

Post a Comment