Too frequent commits can lead to performance problems, but can they also affect how much undo the updates generate?
Almost every Oracle developer with even a couple of months of experience has noticed that very frequent committing causes performance problems. But if committing also increases the I/O requests the database issues, then it would affect the performance of other processes as well. In Oracle that can happen in many ways, yet with commit processing I usually hear that it is not really a problem: only the committing process is affected, as long as there is enough processing power available.
The thinking is that a commit is mostly a CPU-consuming operation, so if the offending process completes soon enough and your server has more CPU available than it really needs, then what is the problem? If that were true, this would turn into another “Sure, we could have written the code better, but it works now and we have more urgent problems to solve. We’ll look at this next time we have to change this code.” That is, it will not be changed until it causes enough problems to embarrass the development manager.
Let’s look at how frequent commit processing can generate a lot of I/O and reduce the usability of a shared resource. We will look at two cases: one forcing the effect with the smallest possible table, and one showing how it plays out with a larger table.
Single-row, single-column table
Let’s first create a new schema to use for the test.
conn system

drop user commfreq cascade;
create user commfreq identified by commfreq;
grant create session, resource, select_catalog_role to commfreq;
alter user commfreq quota unlimited on users;
alter user commfreq default tablespace users;

conn commfreq/commfreq

create table test_tb (col_a number not null);
insert into test_tb (col_a) values (0);
commit;
exec dbms_stats.gather_table_stats(user, 'test_tb', cascade=>true)
We now have a schema with one small table and no index. This is the smallest table you can have in Oracle: it contains just one row with one column.
To see how many megabytes of undo the current session has generated, execute this query.
select round(to_number(b.value)/1024/1024, 1) "MB UNDO"
  from v$statname a
 inner join v$sesstat b on a.statistic# = b.statistic#
 inner join v$mystat  c on b.sid = c.sid
 where a.name = 'undo change vector size'
   and c.statistic# = 0;
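As a side note, v$mystat already contains statistics only for the current session, so if you need just your own session’s figure, the join to v$sesstat can be dropped. A minimal equivalent sketch, assuming the same standard v$ views and statistic name:

```sql
select round(b.value / 1024 / 1024, 1) "MB UNDO"
  from v$statname a
 inner join v$mystat b on a.statistic# = b.statistic#
 where a.name = 'undo change vector size';
```

Either form works; the longer one generalizes to inspecting other sessions via v$sesstat.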
To make this measurement useful we’ll create a new session before each test case.
We’ll first test how much undo we use if we commit after updating the row 10,000 times.
conn commfreq/commfreq

begin
  for i in 1..10000 loop
    update test_tb set col_a = col_a + 1;
  end loop;
  commit;
end;
/
On my test system this generated 1.1 MB of undo.
The only change we need to make is to move the commit into the loop; then we repeat the test and run the query again.
conn commfreq/commfreq

begin
  for i in 1..10000 loop
    update test_tb set col_a = col_a + 1;
    commit;
  end loop;
end;
/
This time the same update produced 1.7 MB of undo. The difference may not look large at first, but it is about 55% more undo.
Remember that this is a tiny table; even though the absolute numbers are small, the relative increase in undo is significant.
While this is interesting, it is not very representative of real database applications. We’ll now look at something more realistic to see whether the observed behavior changes.
To set up this test we’ll remove our test row and add a column we can use to make the average row length more realistic. Then we’ll insert 1,000,000 rows into the table, filling the new column with fixed padding data.
conn commfreq/commfreq

delete from test_tb;
alter table test_tb add col_b varchar2(80) not null;

insert into test_tb (col_a, col_b)
select rownum
     , rpad('ABC', 80, 'X')
  from dual
connect by level <= 1000000;

exec dbms_stats.gather_table_stats(user, 'test_tb', cascade=>true)
The first test is similar to the first one in the previous section: we read all rows and update each one, setting col_b to the lowercase version of its current value.
conn commfreq/commfreq

begin
  for r in (select col_a from test_tb) loop
    update test_tb set col_b = lower(col_b) where col_a = r.col_a;
  end loop;
  commit;
end;
/
Since we’re not committing until all rows have been updated, every changed block still generates undo, but the overhead is minimal because only one transaction is involved.
The whole update of one million rows, row by row, generated 187.2 MB of undo.
Our next test is of course to repeat the same update, but this time to commit after each update.
conn commfreq/commfreq

begin
  for r in (select col_a from test_tb) loop
    update test_tb set col_b = upper(col_b) where col_a = r.col_a;
    commit;
  end loop;
end;
/
This time the update generated 248.1 MB of undo. That is about 33% more undo than the single-transaction run produced.
It is clear that committing too frequently causes significantly more undo to be generated. This does not, of course, mean you should always commit exactly once, but you are not reducing undo I/O by committing often. Rather, commit so that your unit of work matches what makes business sense, and, if possible, such that you don’t have to write complex restart logic.
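To make that last point concrete, here is a sketch of batched commits with simple restart bookkeeping, reusing test_tb from above. The test_progress table, the batch size of 10,000, and the fixed 1,000,000-row ceiling are illustrative assumptions for this example, not a recommendation:

```sql
-- illustrative progress table: records the highest col_a value committed so far
create table test_progress (last_done number not null);
insert into test_progress values (0);
commit;

declare
  v_batch constant number := 10000;  -- assumed batch size
  v_last  number;
begin
  -- resume from wherever the previous run committed
  select last_done into v_last from test_progress;
  while v_last < 1000000 loop
    update test_tb
       set col_b = lower(col_b)
     where col_a > v_last
       and col_a <= v_last + v_batch;
    v_last := v_last + v_batch;
    update test_progress set last_done = v_last;
    commit;  -- one commit per batch; progress survives a failure
  end loop;
end;
/
```

If the job fails midway, rerunning it resumes from the last committed batch instead of starting over, because the progress row is committed in the same transaction as the batch it describes.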
One note about the updates used in this post: if you want to update all rows or many rows, you would of course never do it row by row if you have any interest in getting it done fast. I have done so here only to show the impact on undo for processes that must do row-by-row processing. That is how most systems operate, and batch processing of data is usually not as easy as just issuing one very large update per table.
Please leave feedback if you’ve read this. I’m interested in both style and content comments, as well as corrections if you find technical or grammatical errors in my blog posts.