bcrist - 6 months ago 38

C++ Question

I've been porting Sebastiano Vigna's **xorshift1024*** PRNG to be compatible with the standard C++11 uniform random number generator contract and noticed some strange behavior with the

`jump()`

According to Vigna, a call to

`jump()`

`next()`

`jump()`

`next()`

`jump();`

next();

should leave the generator in the same state as

`next();`

jump();

since both should be equivalent to

`for (bigint i = 0; i < (bigint(1) << 512) + 1; ++i)`

next();

assuming

`bigint`

Unfortunately, this doesn't work with the reference implementation Vigna provides (which I will include at the end for posterity; in case the implementation linked above changes or is taken down in the future). When testing the first two options using the following test code:

`memset(s, 0xFF, sizeof(s));`

p = 0;

// jump() and/or next() calls...

std::cout << p << ';';

for (int i = 0; i < 16; ++i)

std::cout << ' ' << s[i];

calling

`jump()`

`next()`

`1; 9726214034378009495 13187905351877324975 10033047168458208082 990371716258730972 965585206446988056 74622805968655940 11468976784638207029 3005795712504439672 6792676950637600526 9275830639065898170 6762742930827334073 16862800599087838815 13481924545051381634 16436948992084179560 6906520316916502096 12790717607058950780`

while calling

`next()`

`1; 13187905351877324975 10033047168458208082 990371716258730972 965585206446988056 74622805968655940 11468976784638207029 3005795712504439672 6792676950637600526 9275830639065898170 6762742930827334073 16862800599087838815 13481924545051381634 16436948992084179560 6906520316916502096 12790717607058950780 9726214034378009495`

Clearly either my understanding of what

`jump()`

`jump()`

Xorshift1024* reference implementation; http://xorshift.di.unimi.it/xorshift1024star.c

`/* Written in 2014-2015 by Sebastiano Vigna (vigna@acm.org)`

To the extent possible under law, the author has dedicated all copyright

and related and neighboring rights to this software to the public domain

worldwide. This software is distributed without any warranty.

See <http://creativecommons.org/publicdomain/zero/1.0/>. */

#include <stdint.h>

#include <string.h>

/* This is a fast, top-quality generator. If 1024 bits of state are too

much, try a xorshift128+ generator.

The state must be seeded so that it is not everywhere zero. If you have

a 64-bit seed, we suggest to seed a splitmix64 generator and use its

output to fill s. */

uint64_t s[16];

int p;

uint64_t next(void) {

const uint64_t s0 = s[p];

uint64_t s1 = s[p = (p + 1) & 15];

s1 ^= s1 << 31; // a

s[p] = s1 ^ s0 ^ (s1 >> 11) ^ (s0 >> 30); // b,c

return s[p] * UINT64_C(1181783497276652981);

}

/* This is the jump function for the generator. It is equivalent

to 2^512 calls to next(); it can be used to generate 2^512

non-overlapping subsequences for parallel computations. */

void jump() {

static const uint64_t JUMP[] = { 0x84242f96eca9c41dULL,

0xa3c65b8776f96855ULL, 0x5b34a39f070b5837ULL, 0x4489affce4f31a1eULL,

0x2ffeeb0a48316f40ULL, 0xdc2d9891fe68c022ULL, 0x3659132bb12fea70ULL,

0xaac17d8efa43cab8ULL, 0xc4cb815590989b13ULL, 0x5ee975283d71c93bULL,

0x691548c86c1bd540ULL, 0x7910c41d10a1e6a5ULL, 0x0b5fc64563b3e2a8ULL,

0x047f7684e9fc949dULL, 0xb99181f2d8f685caULL, 0x284600e3f30e38c3ULL

};

uint64_t t[16] = { 0 };

for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)

for(int b = 0; b < 64; b++) {

if (JUMP[i] & 1ULL << b)

for(int j = 0; j < 16; j++)

t[j] ^= s[(j + p) & 15];

next();

}

memcpy(s, t, sizeof t);

}

Answer

OK, I'm sorry but sometimes this happens (I'm the author).

Originally the function had two memcpy(). Then I realised then a circular copy was needed. But I replaced just the *first* memcpy(). Stupid, stupid, stupid. All files on the site have been fixed. The arXiv copy is undergoing update. See http://xorshift.di.unimi.it/xorshift1024star.c

Incidentally: I didn't "publish" anything wrong in the scientific sense, as the jump() function is not part of the ACM Trans. Math. Soft. paper—it just has been added few weeks ago on the site and on the arXiv/WWW version. The fast publication path of the web and arXiv means that, sometimes, one distributes unpolished papers. I can only thank the reporter for reporting this bug (OK, technically StackOverflow is not reporting bugs, but I got an email, too).

Unfortunately, the unit test I had did not consider the case p ≠ 0. My main concern was that the correctness of the computed polynomial. The function, as noted above, is correct when p = 0.

As for the computation: to each generator corresponds a characteristic polynomial P(x). The jump polynomial for k is just x^k mod P(x). I use fermat to compute such powers, and then I have some scripts generating the C code.

Of course I can't test 2^512, but since my generation code works perfectly from 2 to 2^30 (the range you can easily test), I'm confident it works at 2^512, too. It's just fermat computing x^{2^512} instead of x^{2^30}. But independent verifications are more than welcome.

I have code working only for powers of the form x^{2^t}. This is what I need to compute useful jump functions. Computing polynomials modulo P(x) is not difficult, so one could conceivably have a completely generic jump function for any value, but frankly I find this totally overkill.

If anybody is interested in getting other jump polynomials, I can provide the scripts. They will be part, as it happens for all other code, of the next xorshift distribution, but I need to complete the documentation before giving them out.

For the record, the characteristic polynomial of xorshift1024* is x^1024 + x^974 + x^973 + x^972 + x^971 + x^966 + x^965 + x^964 + x^963 + x^960 + x^958 + x^957 + x^956 + x^955 + x^950 + x^949 + x^948 + x^947 + x^942 + x^941 + x^940 + x^939 + x^934 + x^933 + x^932 + x^931 + x^926 + x^925 + x^923 + x^922 + x^920 + x^917 + x^916 + x^915 + x^908 + x^906 + x^904 + x^902 + x^890 + x^886 + x^873 + x^870 + x^857 + x^856 + x^846 + x^845 + x^844 + x^843 + x^841 + x^840 + x^837 + x^835 + x^830 + x^828 + x^825 + x^824 + x^820 + x^816 + x^814 + x^813 + x^811 + x^810 + x^803 + x^798 + x^797 + x^790 + x^788 + x^787 + x^786 + x^783 + x^774 + x^772 + x^771 + x^770 + x^769 + x^768 + x^767 + x^765 + x^760 + x^758 + x^753 + x^749 + x^747 + x^746 + x^743 + x^741 + x^740 + x^738 + x^737 + x^736 + x^735 + x^728 + x^726 + x^723 + x^722 + x^721 + x^720 + x^718 + x^716 + x^715 + x^714 + x^710 + x^709 + x^707 + x^694 + x^687 + x^686 + x^685 + x^684 + x^679 + x^678 + x^677 + x^674 + x^670 + x^669 + x^667 + x^666 + x^665 + x^663 + x^658 + x^655 + x^651 + x^639 + x^638 + x^635 + x^634 + x^632 + x^630 + x^623 + x^621 + x^618 + x^617 + x^616 + x^615 + x^614 + x^613 + x^609 + x^606 + x^604 + x^601 + x^600 + x^598 + x^597 + x^596 + x^594 + x^593 + x^592 + x^590 + x^589 + x^588 + x^584 + x^583 + x^582 + x^581 + x^579 + x^577 + x^575 + x^573 + x^572 + x^571 + x^569 + x^567 + x^565 + x^564 + x^563 + x^561 + x^559 + x^557 + x^556 + x^553 + x^552 + x^550 + x^544 + x^543 + x^542 + x^541 + x^537 + x^534 + x^532 + x^530 + x^528 + x^526 + x^523 + x^521 + x^520 + x^518 + x^516 + x^515 + x^512 + x^511 + x^510 + x^508 + x^507 + x^506 + x^505 + x^504 + x^502 + x^501 + x^499 + x^497 + x^494 + x^493 + x^492 + x^491 + x^490 + x^487 + x^485 + x^483 + x^482 + x^480 + x^479 + x^477 + x^476 + x^475 + x^473 + x^469 + x^468 + x^465 + x^463 + x^461 + x^460 + x^459 + x^458 + x^455 + x^453 + x^451 + x^448 + x^447 + x^446 + x^445 + x^443 + x^438 + x^437 + x^431 + x^430 + x^429 + x^428 + x^423 + x^417 + x^416 + x^415 + x^414 + x^412 + x^410 + x^409 + x^408 + x^400 + x^398 + x^396 + x^395 + x^391 + x^390 + x^386 + x^385 + x^381 + x^380 + x^378 + x^375 + x^373 + x^372 + x^369 + x^368 + x^365 + x^360 + x^358 + x^357 + x^354 + x^350 + x^348 + x^346 + x^345 + x^344 + x^343 + x^342 + x^340 + x^338 + x^337 + x^336 + x^335 + x^333 + x^332 + x^325 + x^323 + x^318 + x^315 + x^313 + x^309 + x^308 + x^305 + x^303 + x^302 + x^300 + x^294 + x^290 + x^281 + x^279 + x^276 + x^275 + x^273 + x^272 + x^267 + x^263 + x^262 + x^261 + x^260 + x^258 + x^257 + x^256 + x^249 + x^248 + x^243 + x^242 + x^240 + x^238 + x^236 + x^233 + x^232 + x^230 + x^228 + x^225 + x^216 + x^214 + x^212 + x^210 + x^208 + x^206 + x^205 + x^200 + x^197 + x^196 + x^184 + x^180 + x^176 + x^175 + x^174 + x^173 + x^168 + x^167 + x^166 + x^157 + x^155 + x^153 + x^152 + x^151 + x^150 + x^144 + x^143 + x^136 + x^135 + x^125 + x^121 + x^111 + x^109 + x^107 + x^105 + x^92 + x^90 + x^79 + x^78 + x^77 + x^76 + x^60 + 1