Sometimes the results of a SQL is not at all what you expected. Most of the time it is a simple error you’ve made. Very rarely is it an issue that after hours of looking at it still looks like SQL does something wrong, But it still happens.
This post is about one such situation that occurred last week. If you read the last blog post, you’ll recognize some of this. It turned up when I looked at transforming that SQL to create the many million rows I needed.
with ds as ( select level rad from dual connect by level < 5 ) , a as ( select trunc(dbms_random.value(1900, 2020)) bs from dual union all select 2025 from dual ) , b as ( select a.bs, substr(a.bs, ds.rad, 1) dgt from a, ds ) select b.bs, b.dgt from b where b.dgt >= 0 order by b.bs ;
When I execute this it should get me eight rows, Four with a number somewhere between 1900 and 2020 and four with 2025. Those four then also has each digit in those number broken out as the second value in the result table.
An example would be like this:
1989 9 1989 9 1989 8 1989 1 2025 2 2025 5 2025 2 2025 0
The first four are 1989 with the digits 1,9,8,9 broken out and the second is 2025 with the digits 2,0,2,5 broken out.
Now the problem is that as written the SQL will return results that should not be possible. An example of the result it returns is.
1931 1 1989 8 2004 0 2008 8 2025 2 2025 5 2025 2 2025 0
Note the first four rows, for some reason it turned them into four different years. It should not be possible.
What happens seems to be a query transformation gone wrong. It does things that it should not when the execution of a SQL is not deterministic (due to the use of dbms_random).
The thing that triggers it to transform seems the be the very last condition in the SQL “where b.dgt >= 0”. That condition should never be false with the data generated in the SQL, it is there just to showcase the bug. If you remove it you will get the expected result.
Same thing if you change “union all” to “union”, that also makes it produce a correct result. Note that as the query is written, there should be no difference in the result between union all and union.
An argument against it having anything to do with query transformation is that the hint NO_QUERY_TRANSFORMATION has no impact on the behavior.
I may of course have overlooked something here, but I have reviewed this behavior with two colleagues I enjoy debating SQL with.
This blogpost is largely as a reminder to myself and to anyone who may be interested in seeing the odd behavior non deterministic SQL in pre-factoring clauses seem to cause in some odd situations.