Multitable insert

Mutlitable insert is a feature I've read about many times, but I've never taken the time to read up on it or play with the way this SQL statement works. I figure making a post about it will give me the reason I need to learn more about it.

It is not so much a new statement as an extension of the existing INSERT statement. It allows you to insert into different tables and with different conditions from a select statement. That is, it is a variant of the common "insert into select from" version of the insert statement.

There are of course many uses of this. Splitting a table into multiple tables seems like an obvious use, but more common may be when data is added to more than one table at the same time. One such situation could be a situation where a blob is placed in it's own table or where a parent table needs to be maintained while loading the dependent table with new data.

In our test we'll read all rows from all_ind_columns and insert into the tables test_tb and test_col. The two tables will of course hold tables and indexed columns in tables respectively.

Lets first set up the user and the tables.

conn system
create user multiins identified by multiins;
alter user multiins quota unlimited on users;
grant create session, resource to multiins;
conn multiins/multiins
create table test_tb as
select table_name
from user_ind_columns
where 1 = 0;
create table test_col as
select table_name, column_name
from user_ind_columns
where 1 = 0;

We now have a table test_tb that holds just a table_name and a table test_col that holds table_name and column_name. This is a traditional parent child table relationship. The task is now to in one statement load both tables with data from all_ind_columns so we get one row per table in test_tb and one per indexed column in test_col.

when cn = (select min(column_name)
from all_ind_columns a
where a.table_name = tn) then
into test_tb values (tn)
when 1=1 then
into test_col values (tn, cn)
select table_name  tn
,column_name cn
from all_ind_columns;

Pretty nice. This can be used to denormalize tables or just load tables in parallel when the data is in one single data source. It can even be used to have one join pull data and then put the data into different tables. The number of conditions and tables is virtually unlimited. I believe the limit is set to 127 and I'm sure I'll reach the limit for a single statement (64,000 bytes) before I reach that limit.

The subselect used in the first condition is to make sure that we only insert the table name once. In each table, there will just be one column that has the lowest value as column names are unique within a table.

You'd of course want to add primary keys and a foreign key if these tables were to now be used in a real application.

April 29th, 2007 by