Locking and Concurrency In MySQL

This is the second post in our series on the basic architecture of MySQL, following MySQL Architecture - Part 1 - Basics.

Locking and Concurrency

One of the core problems any database server has to solve is how to deal with concurrency and locking. In any data repository you have to be careful when more than one person, process, or client needs to change data at the same time.

Read/Write Locks

Systems that deal with concurrent read/write access typically implement a locking system that consists of two lock types. These locks are usually known as shared locks and exclusive locks, or read locks and write locks.
Without worrying about the actual locking technology, we can describe the concept as follows. Read locks on a resource are shared: many clients may read from the resource at the same time and not interfere with each other. Write locks, on the other hand, are exclusive: it is safe to have only one client writing to the resource at a given time, and all reads must be blocked while that client is writing. Why? Because the single writer is free to make any changes to the resource, even deleting it entirely.
In the database world, locking happens all the time. MySQL has to prevent one client from reading a piece of data while another is changing it. It performs this lock management internally in a way that is transparent much of the time.
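The shared/exclusive rule described above can be sketched in a few lines of Python. This is only an illustration of the compatibility rule, not how MySQL implements its locks; the class name SharedExclusiveLock and its try_* methods are made up for this example, and a request that cannot be granted simply returns False instead of waiting.

```python
import threading


class SharedExclusiveLock:
    """Try-style read/write lock: many readers or one writer, never both."""

    def __init__(self):
        self._state = threading.Lock()  # guards the two fields below
        self._readers = 0               # number of read (shared) locks held
        self._writer = False            # True while the write (exclusive) lock is held

    def try_read(self):
        """Grant a shared lock unless a writer currently holds the resource."""
        with self._state:
            if self._writer:
                return False
            self._readers += 1
            return True

    def try_write(self):
        """Grant the exclusive lock only if no reader and no writer holds it."""
        with self._state:
            if self._writer or self._readers:
                return False
            self._writer = True
            return True

    def release_read(self):
        with self._state:
            self._readers -= 1

    def release_write(self):
        with self._state:
            self._writer = False
```

Two clients can both call try_read() successfully, but try_write() is refused until every reader has released its lock, and while the writer holds the lock every try_read() is refused as well.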

Lock Granularity

One way to improve the concurrency of a shared resource is to be more selective about what is locked. Rather than locking the entire resource, lock only the part that contains the data you need to change. Better yet, lock only the exact piece of data you plan to change. By decreasing the amount of data that is locked at any one time, more changes can occur simultaneously—as long as they don't conflict with each other.

The downside of this is that locks aren't free. There is overhead involved in obtaining a lock, checking to see whether a lock is free, releasing a lock, and so on. All this business of lock management can really start to eat away at performance because the system is spending its time performing lock management instead of actually storing and retrieving data. (Similar things happen when too many managers get involved in a software project.)
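The trade-off between concurrency and lock overhead can be made concrete with a small model. The two helper functions below are hypothetical and deliberately simplified: they count one lock per locked object and treat two updates as compatible only when they need no lock in common.

```python
def lock_operations(rows_to_update, granularity):
    """Return how many locks an update must acquire (and later release)."""
    if granularity == "table":
        return 1                         # one lock covers every row
    if granularity == "row":
        return len(set(rows_to_update))  # one lock per distinct row
    raise ValueError("unknown granularity: " + granularity)


def can_run_together(rows_a, rows_b, granularity):
    """Return True if two updates need no lock in common."""
    if granularity == "table":
        return False                     # both need the single table lock
    if granularity == "row":
        return set(rows_a).isdisjoint(rows_b)
    raise ValueError("unknown granularity: " + granularity)
```

Updating rows 1, 2 and 3 costs one lock under table granularity but three under row granularity. In exchange, a second update touching rows 4 and 5 can run at the same time only in the row-level case.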

To achieve the best performance overall, some sort of balance is needed. Most commercial database servers don't give you much choice: you get what is known as row-level locking in your tables. MySQL, on the other hand, offers a choice in the matter. Among the storage engines you can choose from in MySQL, you'll find three different granularities of locking. Let's have a look at them.

1. Table locks

The most basic and low-overhead locking strategy available is a table lock. The table as a whole is locked on an all-or-nothing basis. When a client wishes to write to a table (insert, delete, update, and so on), it obtains a write lock that keeps all other read or write operations at bay for the duration of the operation. Once the write has completed, the table is unlocked to allow those waiting operations to continue. When nobody is writing, readers obtain read locks that allow other readers to do the same.
For a long time, MySQL provided only table locks, and this caused a great deal of concern among database geeks. They warned that MySQL would never scale up beyond toy projects and work in the real world. However, MySQL is so much faster than most commercial databases that table locking doesn't get in the way nearly as much as the naysayers predicted it would.
Part of the reason MySQL doesn't suffer as much as expected is that the majority of applications for which it is used consist primarily of read queries. In fact, the MyISAM engine (MySQL's default before version 5.5, when InnoDB took over that role) was built assuming that 90% of all queries run against it will be reads. As it turns out, MyISAM tables perform very well as long as the ratio of reads to writes is very high or very low.

2. Page locks

A page lock is slightly more expensive than a table lock but offers greater concurrency. It is applied to a portion of a table known as a page, and all the records that reside on the same page are affected by the lock. Using this scheme, the main factor influencing concurrency is the page size; if the pages in the table are large, concurrency will be worse than with smaller pages. The BDB (Berkeley DB) storage engine, shipped with older MySQL releases and since removed, used page-level locking on 8-KB pages.
The only hot spot in page locking is the last page in the table. If records are inserted there at regular intervals, the last page will be locked frequently.
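To see why page size drives concurrency, here is a rough sketch that maps rows to pages. It assumes fixed-size rows packed one after another; the 100-byte row size and both function names are invented for the example, and only the 8-KB page size comes from the text above.

```python
PAGE_SIZE = 8 * 1024  # 8-KB pages, the figure quoted above
ROW_SIZE = 100        # hypothetical fixed row size in bytes


def page_of(row_index, page_size=PAGE_SIZE, row_size=ROW_SIZE):
    """Return the number of the page that holds the row at row_index."""
    rows_per_page = page_size // row_size
    return row_index // rows_per_page


def writers_conflict(row_a, row_b, page_size=PAGE_SIZE):
    """Two writers block each other when their rows live on the same page."""
    return page_of(row_a, page_size) == page_of(row_b, page_size)
```

With these numbers, 81 rows fit on a page, so writers touching rows 0 and 80 block each other even though they change different records. Shrinking the page to 1 KB puts those two rows on different pages and the conflict disappears. Newly appended rows always land on the highest-numbered page, which is the hot spot mentioned above.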

3. Row locks

The locking style that offers the greatest concurrency (and carries the greatest overhead) is the row lock. In most applications, it's relatively rare for several clients to need to update the exact same row at the same time. Row-level locking, as it's commonly known, is available in MySQL's InnoDB tables. InnoDB doesn't use a simple row locking mechanism, however. Instead it uses row-level locking in conjunction with a multiversioning scheme, so let's have a look at that.

Multi-Version Concurrency Control

There is a final technique for increasing concurrency: Multi-Version Concurrency Control (MVCC). Often referred to simply as versioning, MVCC is used by Oracle, by PostgreSQL, and by MySQL's InnoDB storage engine. MVCC can be thought of as a new twist on row-level locking. It has the added benefit of allowing nonlocking reads while still locking the necessary records only during write operations. Some of MVCC's other properties will be of particular interest when we look at transactions in the next section.
So how does this scheme work? Conceptually, any query against a table will actually see a snapshot of the data as it existed at the time the query began—no matter how long it takes to execute. If you've never experienced this before, it may sound a little crazy. But give it a chance.
In a versioning system, each row has two additional, hidden values associated with it. These values represent when the row was created and when it was expired (or deleted). Rather than storing the actual time at which these events occur, the database stores the version number at the time each event occurred. The database version (or system version) is a number that increments each time a query begins. We'll call these two values the creation id and the deletion id.
Under MVCC, a final duty of the database server is to keep track of all the running queries (with their associated version numbers). Let's see how this applies to particular operations:

SELECT
When records are selected from a table, the server must examine each row to ensure that it meets several criteria:
· Its creation id must be less than or equal to the system version number. This ensures that the row was created before the current query began.
· Its deletion id, if not null, must be greater than the current system version. This ensures that the row wasn't deleted before the current query began.
· Its creation id can't be in the list of running queries. This ensures that the row wasn't added or changed by a query that is still running.
Rows that pass all of these tests may be returned as the result of the query.
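The three tests can be written as a single predicate. This is a conceptual sketch of the rules exactly as stated here, not InnoDB's actual code; the function and parameter names are chosen for the example.

```python
def is_visible(creation_id, deletion_id, system_version, running_queries):
    """Return True if a row version may be returned to the current query."""
    # The row must have been created at or before the current version.
    if creation_id > system_version:
        return False
    # The row must not have been deleted at or before the current version.
    if deletion_id is not None and deletion_id <= system_version:
        return False
    # The row must not come from a query that is still running.
    if creation_id in running_queries:
        return False
    return True
```

For a query running at system version 5, a row created at version 3 and never deleted is visible, a row created at version 6 is not, and a row created by still-running query 3 is hidden until that query finishes.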

INSERT
When a row is added to a table, the database server records the current version number along with the new row, using it as the row's creation id.

DELETE
To delete a row, the database server records the current version number as the row's deletion id.

UPDATE
When a row is modified, the database server writes a new copy of the row, using the version number as the new row's creation id. It also writes the version number as the old row's deletion id.
The result of all this extra record keeping is that read queries never lock tables, pages, or rows. They simply read data as fast as they can, making sure to select only rows that meet the criteria laid out earlier. The drawbacks are that the server has to store a bit more data with each row and do a bit more work when examining rows.
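Putting the four operations together, a toy versioned table might look like the sketch below. It is a single-threaded illustration under simplifying assumptions: the class VersionedTable and its methods are hypothetical, every statement finishes before the next one starts (so the list of running queries is left out), and old row versions are never cleaned up.

```python
class VersionedTable:
    """Toy MVCC table: rows carry a creation id and a deletion id."""

    def __init__(self):
        self.version = 0  # system version, bumped when a query begins
        self.rows = []    # every row version ever written

    def begin_query(self):
        """Start a query and return the system version it runs at."""
        self.version += 1
        return self.version

    def _expire(self, key, version):
        # Mark the live version of the row (if any) as deleted at `version`.
        for row in self.rows:
            if row["key"] == key and row["deleted"] is None:
                row["deleted"] = version

    def insert(self, key, value):
        version = self.begin_query()
        self.rows.append(
            {"key": key, "value": value, "created": version, "deleted": None}
        )

    def delete(self, key):
        self._expire(key, self.begin_query())

    def update(self, key, value):
        version = self.begin_query()
        self._expire(key, version)  # old copy gets a deletion id
        self.rows.append(           # new copy gets a creation id
            {"key": key, "value": value, "created": version, "deleted": None}
        )

    def select(self, as_of):
        """Return the rows visible to a query that began at version as_of."""
        return {
            row["key"]: row["value"]
            for row in self.rows
            if row["created"] <= as_of
            and (row["deleted"] is None or row["deleted"] > as_of)
        }
```

A short usage example shows the snapshot behavior: a reader that began before an update keeps seeing the old value, without taking any lock.

```python
table = VersionedTable()
table.insert("a", 1)
snapshot = table.begin_query()  # a long-running read starts here
table.update("a", 2)
table.insert("b", 9)
print(table.select(snapshot))             # {'a': 1}
print(table.select(table.begin_query()))  # {'a': 2, 'b': 9}
```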


