Challenge

SQL Challenge I

While maybe a pretentious title, I’ve always enjoyed getting and giving what I call SQL Challenges. It is basically a short explanation for what you should do with pure SQL and some limitations to how you can do it and then you get to try to come up with a solution. It is frequently more about being able to do it, than it is that one would actually implement it that way. The idea is that you lear new ways to work with SQL or think about SQL and that will help you write better SQL for things you’d actually do in your application.

The “I” is in case this is something someone likes and I think of other ones in the future. It the first one isn’t I, then the second one cannot really be II, right?

To the challenge. I thought of this after reading a question on a forum a while back. The final more performant solution was not SQL, but I thought it was interesting to actually do this in SQL. In case anyone recognizes this from the forum, please don’t link to it. I will link to it when I show my version of a solution to this challenge. The challenge is about challenging yourself to learn something, it’s not about winning over everyone else. Learning is a personal adventure, not some form of competition.

The problem this challenge poses is to find all combinations of the letters in a string. Lets say that it is always three letters in this string to reduce complexity and make it more possible to solve it with pure SQL.

If the string contains ABC, you’d expect the query to return:

ABC
ACB
BAC
BCA
CAB
CBA

If it returns more or less, then you have not managed to get just the combinations. AAA, for example, is not a valid answer as there is just one A in the input.

You should limit your SQL to not use analytical functions at all, and not PL*SQL. The SQL should be written such that you only need one SQL statement and no other objects are created to support your single SQL statement.

Also consider that you will want a solution that is possible to rewrite for at least 8 letters instead of the three you’re writing the SQL for. Several ways of solving this will make the statement too large soon after three letters.

Hopefully this is complicated enough to be a challenge, I know it took me quite some time to figure out how I could solve it.

5 Comments

  1. Gary Myers

    Incomplete specification.What if the string contains the same character repeated (eg ABA) ?
    If that is ‘out of scope’, then the following works.
    Start with select substr(val,rownum,1) let, length(val) lenfrom (select ‘ABC’ val from dual) connect by level to get each characterThen
    select sys_connect_by_pat (let,’;’) x, lenfrom (select substr(val,rownum,1) let, length(val) len from (select ‘ABC’ val from dual) connect by level CONNECT BY NOCYCLE let != PRIOR let;To get each combination of the characters.
    Finallyselect replace(x,’;’), len from (select sys_connect_by_pat (let,’;’) x, len from (select substr(val,rownum,1) let, length(val) len from (select ‘ABC’ val from dual) connect by level CONNECT BY NOCYCLE let != PRIOR let)where length(replace(x,’;’)) = len;To only get those with ALL the characters.

  2. Mathias

    Gary,
    Thank you for your comment. To clarify the challenge. If you get ABA, then you would still get six rows as the result. Each letter needs to be treated as if it is unique.
    You’d get this from ABA:AABABAAABABABAABAA
    That is each A is represented twice in the results. The letters in the input string should never affect how many rows you get back.
    You’re using analytical functions. I believe this is solvable without them and my intention is to solve this without use of them.
    Mathias

  3. philipsd6

    OK, here’s my first attempt at a solution:
    var string varchar2exec :string := ‘ABC’;
    with data as ( select rownum as rn, substr(:string, level, 1) as letter from dual connect by level )select d1.letter || d2.letter || d3.letterfrom data d1, data d2, data d3where d1.rn d2.rnand d1.rn d3.rnand d2.rn d3.rn;
    The main downside to this query is that you are required to know the length of the string prior to constructing the query, due to the number of self joins being equal to the length of the string, not to mention the increasing number of rn rn predicates as the length of the string increases.

  4. Mathias

    philipsd6,
    The problem with the string being fixed is hard to solve in a single statement.
    Your version is pretty good, but connect by is an analytical function. I think this can be solved without any analytical functions. That is not hard to change in your query. The way you write the where clause will result in a very large statement very fast with longer strings.

  5. Yeah, like I said, the self-joins and ever-increasing number of predicates make my first attempt unscalable for longer strings.
    I disagree that ‘connect by’ is an analytic function though they do something rather similar come to think of it (perhaps that is a topic for another day!) So with that in mind, here’s my second attempt:
    var string varchar2exec :string := ‘ABC’;
    with letterset as ( select rownum as rn, substr(:string, level, 1) as letter from dual connect by level )select distinct replace(sys_connect_by_path(letter, ‘/’), ‘/’) as combosfrom lettersetwhere level = length(:string)connect by nocycle prior rn rn;
    That’s pretty darn generic there! I had to switch to my 10g database to do it though I haven’t been able to figure out how to do this in 9i though, where nocycle isn’t available. But hey, if you really think it can be done without “connect by”, I’ll give that a crack.
    I also wasn’t satisfied with your constraint about how “each letter needs to be treated as if it is unique” because it means for a string like “ABA” you’ll get duplicates back. So I stuck in the distinct to make sure you only get the unique combinations of the letters. That just seems to make more sense to me.

Leave a Reply to Mathias Cancel

Your email address will not be published. Required fields are marked *

*