Category: Challenge

May 18th, 2007 by Mathias

Both Gary and Philip postd working versions. Gary's didn't work for me when I had the same letter more than one time in the string. Philips did work and the need to deal with each letter as if it is unique can of course be questioned. The intent was for a solution that could handle that and as Philip showed adjusting that is as easy as to just add a distinct.

My version is based on not using analytical functions of which I thought connect by was a part. Looking at it in another way, I wanted something that could be adopted to other databases.

My starting version of the solution would be:

   with v    as (select 'ABC' v from dual)
,base as (select substr(v.v,1,1) nv, 1 nn from v
union all
select substr(v.v,2,1) nv, 2 nn  from v
union all
select substr(v.v,3,1) nv, 3 nn  from v)
select a.nv||b.nv||c.nv
from base a, base b, base c
where a.nn not in(b.nn, c.nn)
and b.nn not in (c.nn)

What's not to like? I get to use one with construct that takes the output from another with construct as it's input. :-)
The "base" with gives us a result table that has one letter from the input string on each row. I'm then joining that with itself three times (as I have three letter in the string) to get a cartesian product that holds all cominations. It has the valid combinations (where all three letters are used) and invalid ones (where some letters are used more than one time). The last thing is to remove the invalid combinations. I do this by checking if the number I assign to each value (nn) occurs more than one time in the result. For AAA we would have nn for table alias A, B, and C be 1. I only have to check if each value occurs later on in the column list as if it was equal to an earlier one, the test for that earlier one would detect that the same was used.

So it works… Interesting and satisfying at the same time. However, the union is not too lovable. How can we avoid it? Well, the easy way would of course be to use the connect by trick, but as I intended to not use it I'll have to cheat and use anoter unrelated table (view).

   with v    as (select 'ABC' v from dual)
,base as (select substr( v.v,rownum,1) nv
,rownum nn from v, all_objects
where rownum <= length(v.v))

select a.nv||b.nv||c.nv
from base a
,base b
,base c
where a.nn not in (b.nn, c.nn)
and b.nn not in (c.nn)

I'm usig all_objects which should have more rows than the length of a string you want all combinations. It works, but I still prefer the connect by version if I were to use it. This way it does satisfy the limitation I put up for this.

So while this works, it is not as dynamic as one could want it to be. You need a different SQL if you have a four letter string. It ieasy to change this for that, but it is still annoying to have to have many different ones. An alternative would be to write PL*SQL to generate the SQL and then execute it. While that may be nicer, I wanted a solution that was a single statement.

I did find a way to do it. I'm sure it can be done easier and nicer. Maybe analytic functions would make the SQL faster and easier to understand. However, This way combines XML, with clauses, and dynamic in-line queries. For fancy use of the Oracle database, it's almost optimal… How practical it would be to use this very often and with long strings would have to be tested before getting it into a production scenario.

   with v as (select 'ABC' v from dual)
,base as (select substr(v.v,rownum,1) nv
,rownum nn
from v
where rownum <= length(v.v))       ,sel  as (select 'select ' ||                        substr(max(sys_connect_by_path('a'||
nn||'.nv', '||')),3)||' data' a
from base
connect by prior nn = nn - 1
start with nn = 1)
,frm  as (select 'from ' ||
'base a'||nn, ',')),2) a
from base
connect by prior nn = nn - 1
start with nn = 1)
,whr2  as (select 'a'||(nn-1)||'.nn not in('||
'a'||nn||'.nn', ', ')),2) ||
')' cond
,max(rownum) rn
from base
where nn > 1
connect by prior nn = nn + 1
start with nn = length( v.v)
group by nn
order by nn)
,whr  as (select 'where ' || substr(max(
sys_connect_by_path(cond, ' and '))
,5) a
from whr2
where not rn = length(v.v)
connect by prior rn = rn + 1
start with rn = length( v.v) - 1)
select extractvalue(t.column_value,'/DATA')
from sel a
,frm b
,whr c
'for $root in $vals
return $root/ROWSET/ROW/DATA'
passing xmltype(dbms_xmlgen.getxml(
'with v    as (select ''ABC'' v
from dual)
,base as (select substr(v.v,rownum,1) nv
,rownum nn
from v
where rownum <= length(v.v))'                              ||a.a||' '                              ||b.a||' '                              ||c.a)) as "vals") t; 

The formatting is a bit forced here as I need it to fit in the line size google allows on this blog. I will not explain in detail what I have tried to achieve in this SQL. If you would want me to talk about what I'm doing in a post, leave a comment and I'll try to write it up in a post soon.

In short, I'm generating different pieces of the SQL statement needed in the with clauses and them I'm using that to create the XML document with all these cominations ina single XML document. I'm then using some XML function and XMLTable with an XML Query to get it back out form one row (one XML doc) to multiple rows to get it back to relational data.

The threaad that started me thinking about this was this thread. There are some permutations things there that solves this with pure analytics. But why do this the easy way when you can make it more complicated. Laurent's solution is probably the only one that scales well with longer strings.

I hope you enjoyed this and hopefully even learned something. I know I did, both that Cartesian products can be used in ways I hadn't really thought of and that I need to learn more about analytical functions.

I also learned more about XML function and how to move data in and out of these dynamic in-line queries.

Posted in Challenge, Oracle, SQL, XML

May 16th, 2007 by Mathias

While maybe a pretentious title, I've always enjoyed getting and giving what I call SQL Challenges. It is basically a short explanation for what you should do with pure SQL and some limitations to how you can do it and then you get to try to come up with a solution. It is frequently more about being able to do it, than it is that one would actually implement it that way. The idea is that you lear new ways to work with SQL or think about SQL and that will help you write better SQL for things you'd actually do in your application.

The "I" is in case this is something someone likes and I think of other ones in the future. It the first one isn't I, then the second one cannot really be II, right?

To the challenge. I thought of this after reading a question on a forum a while back. The final more performant solution was not SQL, but I thought it was interesting to actually do this in SQL. In case anyone recognizes this from the forum, please don't link to it. I will link to it when I show my version of a solution to this challenge. The challenge is about challenging yourself to learn something, it's not about winning over everyone else. Learning is a personal adventure, not some form of competition.

The problem this challenge poses is to find all combinations of the letters in a string. Lets say that it is always three letters in this string to reduce complexity and make it more possible to solve it with pure SQL.

If the string contains ABC, you'd expect the query to return:

If it returns more or less, then you have not managed to get just the combinations. AAA, for example, is not a valid answer as there is just one A in the input.

You should limit your SQL to not use analytical functions at all, and not PL*SQL. The SQL should be written such that you only need one SQL statement and no other objects are created to support your single SQL statement.

Also consider that you will want a solution that is possible to rewrite for at least 8 letters instead of the three you're writing the SQL for. Several ways of solving this will make the statement too large soon after three letters.

Hopefully this is complicated enough to be a challenge, I know it took me quite some time to figure out how I could solve it.

Posted in Challenge