geneSummons geneSummons - 4 years ago 107
Java Question

Java - How to unit test TimSort and "general contract violation" issues

This question is related to "Comparison method violates its general contract!" - TimSort and GridLayout
and several other similar "general contract violation" questions. My question is particularly related to Ceekay's answer at the bottom of the page about "How to test the TimSort implementation". In my case I have fixed the application bug that brought me here which was due to a symmetry violation, but I am having trouble creating a unit test to expose that violation (if the fix is commented out or unfixed in future).

public class TickNumber implements Comparable<TickNumber> {
protected String zone;
protected String track;
}
public class GisTickNumber extends TickNumber implements Comparable<TickNumber> {
private String suffix;
}


I've left out all the implementation details, but basically a Tick number is a 4 digit number where the first two digits are the zone and the second two digits are the track. GisTickNumbers can have alpha characters in the zone and or track fields, and they can optionally have an alpha suffix of one or two characters. Valid ticks are all integers in the range
[0000, 9999]
(even when represented as Strings). All valid Tick numbers are valid Gis Tick numbers, but valid Gis Ticks can also look like
A912, R123, 0123G, A346*
.

My symmetry violation was that in the GisTick
compareTo
, I was accounting for the possible suffix, but in the plain Tick
compareTo
I was not accounting for it. Thus, if 'this' was a
0000
Tick and 'that' was a
0000*
Gis Tick,
0000.compareTo(0000*)
would return 0. While if 'this' was a
0000*
Gis Tick and 'that' was a
0000
Tick,
0000*.compareTo(0000)
would return 1. A clear symmetry violation (once the shrouds pulled back)

According to Ceekay in an answer to the linked question,



  1. Create a list with 32 or more objects.

  2. Within that list, there needs to [be] two or more runs.

  3. Each run must contain 3 or more objects.



Once you meet those [three] criteria you can begin testing for this failure.


I believe I have set up such a list of TickNumber (and GisTickNumber) objects for my unit test, but I can't seem to get the test to fail. Even though the list has over 100 objects, more than two runs, and each run contains about 10 objects. So, my question is what other characteristics does the list of objects under test need to satisfy in order for a call to
Collections.sort(testList)
to fail due to "general (symmetry) contract violation"?


  • and yes, I commented out the fix before I ran the unit test that I was expecting to fail.


Answer Source

Solved:
I ended up debugging to a breakpoint where I could view the toString() representation of the Objects in the List getting sorted, and was then able to extract the TickNumber information from the rest of that data and eventually use that extracted data in my unit test. Finally, I went back and removed list items until I crafted what seems to be a list that satisfies the "minimal requirements" for triggering symmetry related "general contract violations".

I'm not sure how to generalize my specific solution into generic characteristics a list must satisfy in order to trigger TimsSort and this "general contract violation". But here goes...

  • The list must contain 64 elements (49 + 1 + 12 + 1 + 1)
  • The list must contain a run of 50, where for 49 of the 50 elements the compare result is 0 (i.e. comparisons match)
    • Within the front half of that "matching run" there must be 1 element that sorts before all the others in the run (all the others in the run match when compared), and that single odd element must also "symmetry mismatch" the element at the end of the other runs.
  • The list must contain a minimum of 2 other runs of three or more elements (my test list has a run of 8 followed by a run of 4)
    • The other half of the "symmetry mismatch" must be the last item in the run of 4 (the second other run).
  • The list must contain an element at (end - 1) position that sorts to the beginning of the sorted list
  • The list must contain an element at (end) position that sorts somewhere in the middle of the sorted list

I'm pretty sure the above bullets are not an exhaustive list of general requirements a list must satisfy to expose a symmetry violation when the list is sorted, but they worked for me in one specific case.

Specifically, my crafted test list starts with 49 TickNumber objects where Tick = "9999", and somewhere in the front half of the 49 Ticks there is a "9910" Tick, for a total of 50 Tick numbers in this opening pseudo-run. (Pseudo because "9910" breaks up the unsorted run of 49 matching "9999" Ticks.) The "9910" Tick in the opening run is one-half of the symmetry mismatch I am testing for. Then the test list contains 12 GisTickNumber objects as a run of 8 ("9915*", "9920*", "9922*", "9931*", "9933*", "9934*", "9936*", "9939*"), followed by a run of 4 ("9907*", "9908*", "9909*", "9910*"). Note that the last item in the run of 4 is the other half of the "symmetry mismatch" I am testing for. Finally, the list caps off with a "9901" TickNumber object that will lead off the sorted list, and a "9978*" GisTickNumber object that sorts somewhere in the middle. I have tried removing and/or rearranging the Objects in the test list to no avail. The Unit test will start issuing false-positive (success) results if, for example, the "9901" element is removed from the test list. (false-positives will also occur if "9901" is moved to the front of the unsorted list)

Note: I suspect that the plain TickNumber part of the "9910" symmetry mismatch can appear anywhere in the opening run before the MIN_RUN'th element. In other words, if MIN_RUN is 32 and the leading run in my test list has 50 elements with 49 that compare "the same", then the "9910" symmetry mismatch element can appear at any position in the run less than position 32. This supposition hasn't been proven; but I have empirically determined that the symmetry mismatch element can't appear near the end of the leading run, and that it can appear in multiple spots near the start of the leading run. (one different spot per test run)

In general, if any of these conditions are not "exactly right" you won't trigger the "general contract violation" even though you are testing list data where comparisons should violate the contract.

In my case, the only TickNumber objects that match in my test list are the 49 "9999" Ticks and the 2 ("9910" and "9910*") Ticks that violate symmetry on comparison.

Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download