While consistency is one of the ACID rules for database transactions, users keep asking for more concurrency. That's simply because concurrency and response time together form the equation of database throughput, in other words, PERFORMANCE.
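One way to make that equation concrete is Little's Law, a standard queueing-theory identity (the numbers here are purely illustrative): throughput = concurrency / response time. With 200 concurrent transactions, each completing in 0.1 seconds, the server sustains roughly 200 / 0.1 = 2,000 transactions per second.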
Measuring database performance depends not only on how many transactions the database server (RDBMS) can perform, but also on whether it can keep that number high when multiple users are connected to the server at once.
To understand the relation between concurrency and consistency, think of it as a justice scale: on one plate we put concurrency, and on the other we put consistency. Drastically increasing one of them will decrease the other.
RDBMS vendors have used multiple methods for concurrency control, but the goal is the same: provide your end user with the best possible concurrency and consistency (remember the scale) while still satisfying the ACID rules. Vendors like Oracle use multi-version concurrency control, while vendors like IBM (DB2), Sybase, and Microsoft use two-phase locking concurrency control. Let's first understand how each one works, then analyze how that can affect banking environments.
In the two-phase locking method, if transaction A is updating a row and transaction B tries to read the same row, then transaction B has to wait until transaction A commits or rolls back; only then does the server return the final value to transaction B. Suppose you are selecting a book's price while the author is updating that price: you can't get the price until the author commits or rolls back whatever change he is making. In the database world, this waiting is called LOCK WAIT TIME.
On the other hand, if transaction A is reading some record and transaction B tries to read the same record, then there is no need to wait at all. But if transaction C tries to modify that record, it will have to wait until both transactions A and B finish reading and release their read locks.
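To make that blocking behavior concrete, here is a minimal Python sketch of a per-row reader-writer lock, the kind of primitive a two-phase-locking lock manager is built on. It is a simplified model for illustration, not any vendor's actual implementation:

```python
import threading

class RWLock:
    """Toy reader-writer lock with two-phase-locking semantics:
    readers share the lock, a writer needs it exclusively."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0        # transactions currently holding a read lock
        self._writing = False    # True while a transaction holds the write lock

    def acquire_read(self):
        with self._cond:
            while self._writing:          # reader waits out the writer:
                self._cond.wait()         # this is the LOCK WAIT TIME
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()   # last reader leaves; a writer may proceed

    def acquire_write(self):
        with self._cond:
            # a writer waits until every reader and any other writer is done
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True

    def release_write(self):
        # commit/rollback is where the exclusive lock is finally released
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

In the book example, the author's update holds the write lock until commit or rollback, and every buyer's acquire_read call blocks for that entire time.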
In multi-version concurrency control, if transaction A is updating a row and transaction B tries to read it, the server keeps TWO VERSIONS of the data: one committed version, which is returned to transaction B, and one that is being updated but not yet committed. The committed version is available to all other readers, while the uncommitted version is visible only to transaction A. In other words, WRITERS DON'T BLOCK READERS. From that concept we can see where the problem arises: once transaction A commits the change, the value that transaction B read is no longer correct, and if transaction B reads the same record again after the commit, it gets a different value. This is called an unrepeatable read. The same visibility rule applies when two transactions are merely reading the same data; the only difference is that no unrepeatable read can occur in that case.
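Here is a minimal sketch of that versioning behavior for a single row. The transaction ids and the two-version bookkeeping are invented for illustration; real servers keep whole version chains, but the visibility rule is the same:

```python
class MVCCRow:
    """Toy model of one row under multi-version concurrency control."""

    def __init__(self, value):
        self.committed = value   # the version every other reader sees
        self.pending = None      # the writer's uncommitted version
        self.writer = None       # id of the transaction holding the write lock

    def read(self, txn):
        # The writer sees its own uncommitted version; everyone else gets
        # the last committed one: writers don't block readers.
        return self.pending if txn == self.writer else self.committed

    def write(self, txn, value):
        if self.writer not in (None, txn):
            raise RuntimeError("row is write-locked")   # writers DO block writers
        self.writer, self.pending = txn, value

    def commit(self, txn):
        if txn == self.writer:
            self.committed = self.pending
            self.pending = self.writer = None

row = MVCCRow(100)
row.write("A", 80)      # transaction A updates the row, not yet committed
print(row.read("B"))    # 100 -- B reads immediately, no waiting
row.commit("A")
print(row.read("B"))    # 80  -- B's earlier read is not repeatable
```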
Back to the main subject: how banking, or any monetary transaction, can be affected by multi-version concurrency control. If you check your balance while a deduction is being applied to it, the balance you read will no longer be accurate once the deduction commits. It can't corrupt other withdrawals or deductions, though, because in the end writers still block writers even under multi-versioning, exactly as in the two-phase locking mechanism.
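Reusing the toy MVCCRow from the sketch above, two concurrent deductions against the same balance illustrate the point:

```python
balance = MVCCRow(100)
balance.write("A", 70)   # withdrawal A takes the write lock on the row
balance.read("C")        # 100 -- a balance check succeeds, but may already be stale
balance.write("B", 50)   # raises RuntimeError: withdrawal B must wait for A
```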
The above behavior (an incorrect balance, for example) can never happen under two-phase locking, since your balance query will simply wait until the deduction is committed or rolled back.
From another point of view, when it comes to non-monetary transactions, like checking product details on amazon.com, an unrepeatable read does no harm if it buys more concurrency for the millions of users trying to read a product's price or description.
What remains to be said is that in 2010, according to the University of Sydney, and quoting from their article here:
“Dr Michael Cahill's PhD thesis entitled Serializable
Isolation for Snapshot Databases saw him develop an algorithm to avoid data
inconsistency in databases accessed by multiple users.
The algorithm tackles problems occurring when many clients
access a database simultaneously. While organizations like banks lock their
databases so only one person can access a single piece of information at a
time, this isn't tenable for retail websites where thousands of people often
want the same book title at the same time. Most large databases therefore
opt for what's called a multi version system. The existing multi version
algorithms work well most of the time but their potential to produce
inconsistent data can have serious consequences, says Dr Cahill. He cites the
example of doctors scheduled to work at the same time changing their rosters
simultaneously so that no one is rostered at all.
"I looked at algorithms used in real
systems and found a practical way to maintain consistency," he says.
"I've changed the algorithm in multi version systems so they keep track of
more information. By tracking some additional information about concurrent
operations, the algorithm guarantees the traditional kind of correctness expected
of a database, or, to use the correct terminology, prevents anomalies." ”