Thursday 26 September 2013

Book "SQL Performance Explained"

I just read Markus Winand's wonderful book "SQL Performance Explained" which can be bought as an e-book or print edition and even be read online. The books sole topic is B-tree backed database indices.

The preface already states that "Database indexing is a development task." This is because to define an optimal index one must understand how the application queries the data.

→ This means that every developer should read this book. It will show you how to read execution plans and how to create indices for a wide range of scenarios.




Here are some key points, but you really have to read the book:

1) It is better to have a single index for all columns of a where clause than one index for each column.

2) The column order of a concatenated index has great impact on its usability so it must be chosen carefully.

A database can use a concatenated index when searching with the leading (leftmost) columns. An index with three columns can be used when searching for the first column, when searching with the first two columns together, and when searching using all columns.

→ The most important consideration when defining a concatenated index is how to choose the column order so it can support as many SQL queries as possible.

3) If you use a function (TRUNC, UPPER, ...) on a column in the where clause you must also create an index on the function

4) When searching for ranges index for equality first—then for ranges. This will keep the scanned index range as small as possible.

Example:
   WHERE date_of_birth >= TO_DATE(?, 'YYYY-MM-DD')
   AND date_of_birth <= TO_DATE(?, 'YYYY-MM-DD')
   AND subsidiary_id  = ?

→ The index should be on SUBSIDIARY_ID and then on DATE_OF_BIRTH

5) Even if you have a full copy of the production database in your development environment, the concurrent background load can still cause a query to run much slower in production.

→ Careful execution plan inspection yields more confidence than superficial benchmarks.

6) An index can be used for an order-by sort. And because databases can read indexes in both directions this can even be in the other direction. It is only important that the scanned index range is in the exact opposite order to the order by clause.

7) And a last one not only for performance but also for security: Parameterized Queries are the best way to prevent SQL injection.

No comments:

Post a Comment