Moving to history tables
In the last post I talked about how we made the speed of actually writing all those log-records much faster. It has to date been so fast that no a single report of a problem has been filed. you find that post here.
Once the data was written to the log-tables, it had to be moved to the history tables.This was a process that took around 16 hours. It was never allowed to run for that long as it had to be stopped before the business day started.
This move was done from an instance on EXADATA to a database on an old server. Yes, I can hear you all think “AHA! It must be the slow database link.”. That was the leading thought when I started looking at it. And yes it sure was slow, but no it was not having a big impact. The other area that had been tuned and tweaked over and over and over and … was the insert over the database link to the history tables. Sure enough it was taking a long time. However measuring it showed that it only accounted for 20% of the total time. Reducing that would let us save over three hours. While that would be good, where did the rest of the time go?
It went to a place no one had suspected. Nor tweaked Nor had any measurements been made. What else is new?
It was the part of the process that was EXADATA only. It was the delete of the rows that had been moved. Huh? How could this be? Well it turns out that deleting data based on an in-clause was not as fast as one would think (or at least want). The process was based on selecting a set of primary key values and putting them into a temporary table, this table was then used to identify rows to insert into the history table and to delete rows.
Yes, there are quite a few things in this process that one could attempt to optimise However, no matter what, the speed would probably not be fast enough. If it ended up being, would it handle the projected growth of the business? And is there really no better way than essentially row by row processing?
Sure there is. Sometimes SQL is not the best or only tool at our disposal.
Everything doesn’t have to be done with SQL. ™
We had already removed the indexes, so the delete should now be faster. It was, just barely fast enough. Thus, just with that change we had squeezed into the seemingly unattainable window for this process. But business is growing and we would within weeks be back to tuning and tweaking.
Back to the idea of not using SQL for everything. But first, let’s revisit the approach that led to success with the write speed. What assumptions are made that we can question? Well… Why are we writing this data over a database link to the slowest database we have in our production environment? It has always been that way and yes we’re worried about the impact of doing this on the EXADATA. Both the impact of letting ad-hoc searches be made as well as the impact of storing all this data on the EXADATA. The storage concern is well founded as the log-data takes up close to 1.5 TB and the volume of logs written are increasing.
However, when we question this we all agree that these are assumed problems and assumed solutions to those problems. Based on that a PoC is produced to show what would happen if we could keep the historic log data in the same database instance on the EXADATA.
With the historic tables in the same database, we get a whole new set of tools to use. I build a PoC showing how data can be moved from the operational tables (the one logs are written to) to the historic ones in under a second for the whole days volume. To achieve this I partition the table on range where the partition key is the time when the log was inserted. Next part is to use a technology in the database called exchange partition.
When exchanging a partition, no data is actually moved. The partition with today’s data is via exchange partition moved from table A to table B. However, this move is only updating metadata in the database. That is to say that the only change was to specify which table the partition belongs to. The rows in the partition remains in the same exact physical location on disk as they were from the beginning. They are not changed, not read, and not touched in any way.
This is what makes such a move so fast. Even better, it is transactionally safe. If a query started while it belonged to table A, it will be read even though it was moved to another table in the middle of that query. Queries on table A that starts after the move will of course not see the moved data at all.
Sub-second moving data of millions or billions rows is something that cannot be done with SQL no matter how much one tunes the SQL. So again, SQL is not the only tool at your disposal.
With this we proved that the process can be fast enough. I have not discussed it here, but during this process we also showed that the ad-hoc searches was of no concern either. EXADATA smart scan handles the actual ad-hoc queries very well and most of them are actually sub-second response time even with no indexes. This is for 1 billion+ row tables. Yes smart scan is one part of it and storage indexes is another. I will not discuss those in these posts, but take my word for it, when the time they took were presented the concern was immediately forgotten.
In the next post in this series, I will discuss how we dealt with the concern over the amount of disk space we would use now and in the future if we let the historic data stay on the EXADATA.