When data in a database are used and changed by multiple processes, you have to make that piece of data isolated from other ones. In other words, you can't let two or more changes happen for one particular field in a database.
The "Status" field shows what happened when our application tried to call "Phone" of "Name" on "Last Check".
Now you want to run your app to constantly keep the table up to date. To do that, it keeps calling all numbers in a multithreaded loop, in which number of threads are total number of the rows. Therefore, all rows should be busy at the same time.
Here is a simple flowchart of each thread (but a bad one):
Before we talk about the problem that will occur, please see the famous Sleeping Barber problem, to understand why we need Mutex (Mutual Exclusion) and Semaphore.
For instance:
Now we have some understanding. The next step comes to mind is to have a "flag" to let other threads know that particular phone number is under process, so no intervention should occur. As an example, we add a flag column to our table, and will set it to "true" when is in use, and "false" when it is not.
And here will be the flowchart (again, not so good):
This is a better solution and a naive developer might think it's sufficient. It actually is a good solution if we're dealing with an array of objects in RAM, instead of a database resides in a hard disk and perhaps also needs some time to establish connection and getting the response. Most relational databases such as MySQL, MSSQL, Oracle and Postgres support different isolation levels. We also have predefined Table-lock and Row-lock strategies for different engines. For example, InnoDB engine is a very good choice when it comes to concurrency and its isolation level is
Changing the isolation level to "Serializable" makes almost same behavior, unless we would be able to write only one row at a time.
That means changing the isolation level to the highest (Serializable) will NOT help to prevent a problem like this:
Do it this way:
Let me provide an example. Assume that you have a phone book table that contains numerous rows of names, phone numbers, and status:
Name
|
Phone
|
Status
|
Last Check
|
John Doe
|
111-111-1111
|
OK
|
1-1-2014 00:00:12 UTC
|
Mrs. Peach
|
222-222-2222
|
OK
|
1-1-2014 00:00:14 UTC
|
Curious George
|
333-333-3333
|
Disconnected
|
1-1-2014 22:22:22 UTC
|
Roger Waters
|
444-444-4444
|
Busy
|
1-1-2001 01:00:12 UTC
|
Bill Gates
|
555-555-5555
|
Busy
|
1-1-1998 10:05:23 UTC
|
The God
|
000-000-0000
|
Unreachable
|
1-1-2014 20:00:10 UTC
|
The "Status" field shows what happened when our application tried to call "Phone" of "Name" on "Last Check".
Now you want to run your app to constantly keep the table up to date. To do that, it keeps calling all numbers in a multithreaded loop, in which number of threads are total number of the rows. Therefore, all rows should be busy at the same time.
Here is a simple flowchart of each thread (but a bad one):
Before we talk about the problem that will occur, please see the famous Sleeping Barber problem, to understand why we need Mutex (Mutual Exclusion) and Semaphore.
For instance:
- Thread A "Selects" one row (John Doe)
- At the same time, Thread B "Selects" the same row.
- Thread A calls John Doe and it's ringing. That means the "Status" is OK.
- Thread B calls John Doe and it is busy, because thread A is calling.
- Thread A "Updates" the row to OK
- Thread B immediately "Updates" the row to Busy, that was not what we really meant.
Now we have some understanding. The next step comes to mind is to have a "flag" to let other threads know that particular phone number is under process, so no intervention should occur. As an example, we add a flag column to our table, and will set it to "true" when is in use, and "false" when it is not.
Name
|
Phone
|
Status
|
Last Check
|
Flag
|
John Doe
|
111-111-1111
|
OK
|
1-1-2014 00:00:12 UTC
|
True
|
Mrs. Peach
|
222-222-2222
|
OK
|
1-1-2014 00:00:14 UTC
|
True
|
Curious George
|
333-333-3333
|
Disconnected
|
1-1-2014 22:22:22 UTC
|
False
|
Roger Waters
|
444-444-4444
|
Busy
|
1-1-2001 01:00:12 UTC
|
True
|
Bill Gates
|
555-555-5555
|
Busy
|
1-1-1998 10:05:23 UTC
|
False
|
The God
|
000-000-0000
|
Unreachable
|
1-1-2014 20:00:10 UTC
|
False
|
And here will be the flowchart (again, not so good):
This is a better solution and a naive developer might think it's sufficient. It actually is a good solution if we're dealing with an array of objects in RAM, instead of a database resides in a hard disk and perhaps also needs some time to establish connection and getting the response. Most relational databases such as MySQL, MSSQL, Oracle and Postgres support different isolation levels. We also have predefined Table-lock and Row-lock strategies for different engines. For example, InnoDB engine is a very good choice when it comes to concurrency and its isolation level is
REPEATABLE READ
, that means we can have multiple reads (Select) on each row, but only one write (Update) against the same row, and we can have multiple writes against multiple rows.Changing the isolation level to "Serializable" makes almost same behavior, unless we would be able to write only one row at a time.
That means changing the isolation level to the highest (Serializable) will NOT help to prevent a problem like this:
- Thread A "Selects" one row (John Doe)
- At the same time, Thread B "Selects" the same row. (it is possible in serializable mode)
- Thread A calls John Doe and it's ringing. That means the "Status" is OK.
- Thread B calls John Doe and it is busy, because thread A is calling.
- Thread A "Updates" the row to OK
- If thread B immediately tries to "Update" the row at the same time of Thread A, we will get an error (because the row was locked at the moment, this is a deadlock situation)
- If thread B has a slight delay (most of the time), that means the job of Thread A is done and it can set the row to Busy, same problem as above.
Now we can talk about the real solution. There are three different approach we can do to prevent this race conditions:
- (Not a good idea, but yet a solution): Set max connection to your database to "1", and increase the timeout to a bigger number. This makes all queries get into a line and get executed one by one.
- (A better solution, not the best): You can lock the table/row even when you're doing a Select, by adding "Lock in Share Mode" at the end of select query. If you're doing this, I strongly suggest to lower your isolation level to something like
READ COMMITTED.
- (The good solution) instead of using "true" and "false" in our flag, we can have something like "UUID" and "False". Then we need to conduct the UPDATE query ALONG with our SELECT as a sub-query. This approach keeps the other threads away while the UPDATE is doing it's job (and the row is locked by the DB engine).
Here is an example:
Instead of doing following two queries in above (bad) approaches:
-> Mysql: Select name,phone From mytable Where Flag='False'; -> Script: set name_var = name -> Mysql: Update mytable Set Flag='True' Where name = ${name_var};
Do it this way:
-> Script: generate a UUID (or a long random string or number) -> Mysql: Update mytable Set flag= ${UUID} Where name in (Select name From (Select name From mytable Where flag='False' Order By RAND() Limit 1 ) As dummyTableName); -> Mysql: Select name,phone From mytable Where Flag=${UUID};