Brandon Ling Brandon Ling - 7 months ago 12
SQL Question

Oracle query returns no result if extending time on condition

If I do:

SELECT count(*) FROM XX where "date" >= '8-APR-2015' and "date" <= '8-APR-2016'


It would return many rows, but if I do:

SELECT count(*) FROM XX where "date" >= '8-APR-2010' and "date" <= '8-APR-2016'


It returns 0. How is that possible? If anything I would get more rows because I'm increasing the range that is valid for retrieval. Any ideas?

EDIT:

NLS_TIMESTAMP_FORMAT 'DD-MON-RR HH.MI.SSXFF
NLS_DATE_FORMAT DD-MON-RR

Answer

If you look at the execution plans for the two queries, particularly the predicate information, you'll see that the first one does:

---------------------------------------------------------------------------     
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |     
---------------------------------------------------------------------------     
|   0 | SELECT STATEMENT   |      |     1 |    13 |     3   (0)| 00:00:01 |     
|   1 |  SORT AGGREGATE    |      |     1 |    13 |            |          |     
|*  2 |   TABLE ACCESS FULL| XX   |     1 |    13 |     3   (0)| 00:00:01 |     
---------------------------------------------------------------------------     

Predicate Information (identified by operation id):                             

   2 - filter("date">=TO_TIMESTAMP('8-APR-2015') AND                            
              "date"<=TO_TIMESTAMP('8-APR-2016'))                               

while the second does:

----------------------------------------------------------------------------    
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |    
----------------------------------------------------------------------------    
|   0 | SELECT STATEMENT    |      |     1 |    13 |     0   (0)|          |    
|   1 |  SORT AGGREGATE     |      |     1 |    13 |            |          |    
|*  2 |   FILTER            |      |       |       |            |          |    
|*  3 |    TABLE ACCESS FULL| XX   |     1 |    13 |     3   (0)| 00:00:01 |    
----------------------------------------------------------------------------    

Predicate Information (identified by operation id):                             
---------------------------------------------------                             

   2 - filter(NULL IS NOT NULL)                                                 
   3 - filter("date">=TO_TIMESTAMP('8-APR-2010') AND                            
              "date"<=TO_TIMESTAMP('8-APR-2016'))                               

And since NULL IS NOT NULL is never true, that gets zero rows. But that's down to your NLS settings. With other format masks it does not have that filter step.

You can get a sense of what's happening if you look at how those to_timestamp() calls are being evaluated with your format NLS settings:

alter session set nls_timestamp_format = 'DD-MON-RR HH.MI.SSXFF';

select to_char(to_timestamp('8-APR-2015'), 'YYYY-MM-DD') as from_1,
  to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_1,
  to_char(to_timestamp('8-APR-2010'), 'YYYY-MM-DD') as from_2,
  to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_2
from dual;

FROM_1     TO_1       FROM_2     TO_2     
---------- ---------- ---------- ----------
2015-04-08 2016-04-08 2020-04-08 2016-04-08

The first pair of dates look OK - 2015 is before 2016. But the second 'from' has come out as 2020, not 2010; and since Oracle is smart enough to realise that 2020 is later than 2016, it knows there can be no data that matches, and adds the impossible condition to short circuit and avoid redundant data access.

Compare that with a mask that handles four-digit years properly:

alter session set nls_timestamp_format = 'DD-MON-RRRR HH.MI.SSXFF';

select to_char(to_timestamp('8-APR-2015'), 'YYYY-MM-DD') as from_1,
  to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_1,
  to_char(to_timestamp('8-APR-2010'), 'YYYY-MM-DD') as from_2,
  to_char(to_timestamp('8-APR-2016'), 'YYYY-MM-DD') as to_2
from dual;

FROM_1     TO_1       FROM_2     TO_2     
---------- ---------- ---------- ----------
2015-04-08 2016-04-08 2010-04-08 2016-04-08

Now the second from 'date' is correct.

The difference is down to how the RR format mask behaves, though this specific behaviour isn't really documented.

What's actually happening is down to Oracle's helpfulness in trying to be flexible in interpreting format masks. As it says in the docs, just under the table of datetime format elements, "Oracle Database converts strings to dates with some flexibility" - but the effects of that are sometimes a bit unexpected.

It's actually the bit after RR that's throwing it out. You can see that with this little demo:

with t as (
  select 1998 + level as year from dual connect by level < 16
)
select year, to_char(to_timestamp(to_char(year), 'RR HH'), 'YYYY-MM-DD HH24:MI:SS')
from t;

      YEAR TO_CHAR(TO_TIMESTAM
---------- -------------------
      1999 1999-04-01 00:00:00
      2000 2000-04-01 00:00:00
      2001 2020-04-01 01:00:00
      2002 2020-04-01 02:00:00
      2003 2020-04-01 03:00:00
      2004 2020-04-01 04:00:00
      2005 2020-04-01 05:00:00
      2006 2020-04-01 06:00:00
      2007 2020-04-01 07:00:00
      2008 2020-04-01 08:00:00
      2009 2020-04-01 09:00:00
      2010 2020-04-01 10:00:00
      2011 2020-04-01 11:00:00
      2012 2020-04-01 12:00:00
      2013 2013-04-01 00:00:00

The RR model only seems to look at the first two digits of the year, but when being helpful it also tries to handle four-digit years for you, and that is working for 2015 and 2016. And it would work for other years if the mask didn't have a time component. But it does, and it's preferring to interpret the third and fourth characters of your four-digit year using the HH part of the mask.

So for 2010, it's seeing the '10', decides it can interpret that as an HH value, does that, and then only converts the remaining two digits '20' using the RR mask - which it treats as 2020. So you end up with 10am on April 8th 2020. The same thing happens for 2000 (though you can't tell the difference) through to 2012. When you get to 2013, '13' is no longer valid for the HH mask, so it goes back to treating all four digits as the year. If the NLS format mast had HH24 then it would 'break' for 2013-2023 as well.


The moral is to never rely on NLS settings. (And never use 2-digit years, or 2-digit year masks). Convert strings to dates/timestamp explicitly:

where "date" >= to_timestamp('8-APR-2015', 'DD-MON-YYYY')
and "date" <= to_timestamp('8-APR-2016', 'DD-MON-YYYY');

... though preferably not with month names as they are also NLS-dependent, though you can specify you want English translation:

where "date" >= to_timestamp('8-APR-2015', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH')
and "date" <= to_timestamp('8-APR-2016', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE=ENGLISH');

Or even better for fixed values, use ANSI date/timestamp literals:

where "date" >= timestamp '2010-04-08 00:00:00'
and "date" <= timestamp '2016-04-08 00:00:00';