Scott Scott - 1 month ago 11
Javascript Question

Pubnub.history() paging

I'm trying to implement paging for pubnub.history(). Here is what I have so far:

Attempt 1: This gets the first count messages from the beginning of time:

function getHistory(chnl, count, token) {
    var pagesize = 10;
    console.log('Starting at ' + token + ', getting ' + count);
    pubnub.history({
        channel : chnl,
        callback : function(m) {
            var msgs = m[0];
            var msgcount = msgs.length;
            var startToken = m[1];
            var endToken = m[2];
            for (var i = 0; i < msgcount; i++) {
                var msg = msgs[i];
                processMessage(chnl, msg);
            }
            if (count >= pagesize) {
                getHistory(chnl, count - pagesize, endToken);
            }
        },
        count: pagesize,
        start: token,
        reverse: true
    });
}


Attempt 2: This gets the most recent pagesize messages over and over again. token is the same every time:

function getHistory(chnl, count, token) {
    var pagesize = 10;
    console.log('Starting at ' + token + ', getting ' + count);
    pubnub.history({
        channel : chnl,
        callback : function(m) {
            var msgs = m[0];
            var msgcount = msgs.length;
            var startToken = m[1];
            var endToken = m[2];
            for (var i = 0; i < msgcount; i++) {
                var msg = msgs[i];
                processMessage(chnl, msg);
            }
            if (count >= pagesize) {
                getHistory(chnl, count - pagesize, startToken);
            }
        },
        count: pagesize,
        end: token,
        reverse: false
    });
}


I need to get the last count messages, ending with the most recent. How can I modify my function to do this?




UPDATE: My new API v4 code:

function getHistory(chnl, count, token) {
    var pagesize = 10;
    console.log('Starting at ' + token + ', getting ' + count);
    pubnub.history({
        channel : chnl,
        callback : function(m) {

        },
        count: pagesize,
        start: token,
        reverse: true
    }, function(status, response) {
        var msgs = response.messages;
        var msgcount = msgs.length;
        var startToken = response.startTimeToken;
        var endToken = response.endTimeToken;
        for (var i = 0; i < msgcount; i++) {
            var msg = msgs[i];
            processMessage(msg);
        }
        if (count >= pagesize) {
            getHistory(chnl, count - pagesize, endToken);
        }
    });
}


And here is what it is doing right now if I request 1000:


2016-10-25 13:29:59.060 pubnub.js:23 Starting at 14772721032416580, getting 620
2016-10-25 13:29:59.385 pubnub.js:23 Starting at 14772748031396800, getting 610
2016-10-25 13:29:59.678 pubnub.js:23 Starting at 14772778027380036, getting 600
2016-10-25 13:30:00.024 pubnub.js:23 Starting at 14772808029462014, getting 590
2016-10-25 13:30:00.440 pubnub.js:23 Starting at 14772838027743668, getting 580
2016-10-25 13:30:00.772 pubnub.js:23 Starting at 14772868026780088, getting 570
2016-10-25 13:30:01.116 pubnub.js:23 Starting at 14772898023567196, getting 560
2016-10-25 13:30:01.426 pubnub.js:23 Starting at 14772928025649280, getting 550
2016-10-25 13:30:01.726 pubnub.js:23 Starting at 14772958026355252, getting 540
2016-10-25 13:30:02.113 pubnub.js:23 Starting at 14773220007794584, getting 530
2016-10-25 13:30:02.457 pubnub.js:23 Starting at 14774200653664656, getting 520
2016-10-25 13:30:02.606 pubnub.js:23 Starting at 14774200653664656, getting 510
2016-10-25 13:30:02.690 pubnub.js:23 Starting at 14774200653664656, getting 500
2016-10-25 13:30:02.790 pubnub.js:23 Starting at 14774200653664656, getting 490
2016-10-25 13:30:02.922 pubnub.js:23 Starting at 14774200653664656, getting 480
2016-10-25 13:30:03.015 pubnub.js:23 Starting at 14774200653664656, getting 470



As you can see in this instance, there weren't actually 1000 messages, so at 13:30:02.457 it started getting the same message over and over. I then requested 300 and it worked, but it got the first 300, not the most recent 300.

So two issues still remain:


  1. Get all messages if requested number is more than exist.

  2. Get most recent x messages.


Answer

PubNub Storage - Paging with History

Some setup

  • I have put together a recursive paging function and a set of steps to test it. Keep in mind that this code is as simple as possible while still being functional enough for a POC; it does not follow best practices.
  • This uses the PubNub JavaScript SDK v4, rather than v3, for which the question was originally asked.

First, initialize an instance of PubNub using your own keys

// init PubNub
var pubnub = new PubNub({
    publishKey   : 'pub-...',
    subscribeKey : 'sub-...'
})

Second, create a function to populate a channel

We want to create a channel whose messages have an order that is easy to follow: message #1, message #2, etc.

We use setInterval to add some delay between the published messages, for two reasons. First, we want to be sure that the messages are stored in the order we publish them (remember, it's all async, including on the server side). Second, we want some time spacing between our message timetokens.

function pub(channel, total) {
    var i = 1;

    var looper = setInterval(
        function() {
            pubnub.publish({
                channel: channel,
                message: "message #" + i++
            });

            if (i > total) {
                clearInterval(looper);
            }
        }, 
    400);
}

You can create several different test channels with different numbers of messages, but one with 32 messages ought to do the trick for simple history paging tests. You want a non-round number so you can page by 5 or 10 and see what the uneven final page gives you. It is useful to have another channel with a round number of messages to see that case as well.
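To make the point about round and non-round totals concrete, here is a small sketch that lists how many messages each page would hold. The helper name pageSizes is made up for this illustration and does not call PubNub at all.

```javascript
// Hypothetical helper: sizes of the pages you would get when paging
// through `total` stored messages, `pagesize` messages at a time.
function pageSizes(total, pagesize) {
    var sizes = [];
    for (var remaining = total; remaining > 0; remaining -= pagesize) {
        sizes.push(Math.min(pagesize, remaining));
    }
    return sizes;
}

console.log(pageSizes(32, 5));  // the last page is short
console.log(pageSizes(30, 10)); // every page is full
```

With a round total every page is full, so a paging loop that stops on a short page will likely issue one extra request that comes back empty before it knows it is done.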

Finally, create the history paging function

It's a bit crude, but it gets the job done without too much fanciness getting in the way of showing what is happening. It uses recursion, and not optimally.

function getMessages(args, callback) {
    pubnub.history(
        {
            // search starting from this timetoken
            start: args.startToken,
            channel: args.channel,
            // false - search from the newest message towards the oldest
            // true - search from the oldest message towards the newest
            reverse: args.reverse,
            // limit number of messages per request to this value; default/max=100
            count: args.pagesize,
            // include each returned message's publish timetoken
            includeTimetoken: true,
            // prevents JS from truncating 17 digit timetokens
            stringifiedTimeToken: true
        },
        function(status, response) {
            // holds the accumulation of resulting messages across all iterations
            var results = args.results;
            // the retrieved messages from history for this iteration only
            var msgs = response.messages;
            // timetoken of the first message in response
            var firstTT = response.startTimeToken;
            // timetoken of the last message in response
            var lastTT = response.endTimeToken;
            // if no max results specified, default to 500
            args.max = !args.max ? 500 : args.max;

            if (msgs != undefined && msgs.length > 0) {
                // display each of the returned messages in browser console
                for (var i = 0; i < msgs.length; i++) {
                    var msg = msgs[i];
                    console.log(msg.entry, msg.timetoken);
                }

                // first iteration, results is undefined, so initialize with first history results
                if (!results) results = msgs;
                // subsequent iterations: results holds the previous iterations' results, so
                // append if reverse is true, otherwise prepend to the beginning of results
                else results = args.reverse ? results.concat(msgs) : msgs.concat(results);
            }

            // nothing has been retrieved at all (e.g. empty channel), so use an empty array
            if (!results) results = [];

            // show the total messages returned out of the max requested
            console.log('total    : ' + results.length + '/' + args.max);

            // we keep asking for more messages if the number of messages returned by the last
            // request equals the pagesize AND we have not yet reached the total number requested,
            // i.e. !(msgs.length < pagesize || total >= max)
            if (msgs && msgs.length == args.pagesize && results.length < args.max) {
                getMessages(
                    {
                        channel:args.channel, max:args.max, reverse:args.reverse, 
                        pagesize:args.pagesize, startToken:args.reverse ? lastTT : firstTT, results:results
                    }, 
                    callback);
            }
            // we've reached the end of possible messages to retrieve or hit the 'max' we asked for
            // so invoke the callback to the original caller of getMessages providing the total message results
            else callback(results);
        }
    );
}

Test the Paging

We will assume that we have a channel named test1 with 32 messages in Storage as follows:

  • message #1
  • message #2

...

  • message #32

each with a unique timetoken.

Invoke the getMessages function to get the following desired results:

1. get maximum of 50 messages starting from the oldest message, 5 messages at a time

getMessages(
    {
        channel: 'test1', max: 50, pagesize: 5, reverse: true
    },
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

2. get maximum of 20 messages starting from the oldest message, 5 messages at a time

getMessages(
    {
        channel: 'test1', max: 20, pagesize: 5, reverse: true
    },
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

3. get maximum of 50 messages starting from the newest message, 5 messages at a time

getMessages(
    {
        channel: 'test1', max: 50, pagesize: 5, reverse: false
    },
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

4. get maximum of 20 messages starting from the newest message, 5 messages at a time

getMessages(
    {
        channel: 'test1', max: 20, pagesize: 5, reverse: false
    },
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

So you should see that the order of the returned messages of each invocation of history is always oldest to newest (ascending). But the order in which the messages are searched depends on the reverse parameter.

  • reverse:true will retrieve blocks of messages from the oldest to the newest: 1,2,3,4,5 | 6,7,8,9,10 | ...
  • reverse:false will retrieve blocks of messages from the newest to the oldest: 28,29,30,31,32 | 23,24,25,26,27 | ...

So to reiterate, order of the messages in each response is the same, but the direction of search/retrieve is different. NOTE: this is something the new Storage/History design will clear up with simpler, more robust APIs (for example: getMessagesSince, getMessagesBefore, getMessagesBetween).
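The difference between message order and search direction can be sketched with a tiny in-memory model. This is not the PubNub SDK: fakeHistory and pages are names invented here, plain integers stand in for timetokens, and the model assumes the behavior described in this answer (pages are always ascending, and the start cursor is exclusive).

```javascript
// In-memory stand-in for one history request (NOT the PubNub SDK).
// `stored` is an ascending array of numbers acting as timetokens.
function fakeHistory(stored, opts) {
    var candidates = stored.filter(function (t) {
        if (opts.start === undefined) return true;
        // the start cursor is exclusive in both directions
        return opts.reverse ? t > opts.start : t < opts.start;
    });
    // reverse:true takes the oldest `count`, reverse:false the newest `count`;
    // either way the page itself stays in ascending order
    return opts.reverse
        ? candidates.slice(0, opts.count)
        : candidates.slice(Math.max(0, candidates.length - opts.count));
}

// Pages through the whole fake channel and returns the list of pages.
function pages(total, pagesize, reverse) {
    var stored = [];
    for (var n = 1; n <= total; n++) stored.push(n);

    var out = [], start, page;
    do {
        page = fakeHistory(stored, { start: start, count: pagesize, reverse: reverse });
        if (page.length > 0) {
            out.push(page);
            // next cursor: last item when going forward, first item when going back
            start = reverse ? page[page.length - 1] : page[0];
        }
    } while (page.length === pagesize); // a short (or empty) page means we are done
    return out;
}

console.log(pages(12, 5, true));
console.log(pages(12, 5, false));
```

Stopping on a short page is also what keeps the loop from requesting the same page forever when fewer messages exist than were asked for.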

Now let's introduce the startToken parameter so we can retrieve messages from or up to a given point in the channel's history timeline. The timetoken you specify must be picked from an existing message in the channel's history. It will be different for you than for me since we published at different times with different keys. I will use startToken:14774567814936359 for the following examples; let's assume this is the timetoken for message #17.

5. get maximum of 20 messages starting from message at timetoken 14774567814936359, 5 messages at a time (remember that start param is exclusive so you will not get message #17 back in your results)

getMessages(
    {
        channel: 'test1', max: 20, pagesize: 5, reverse: true,
        startToken:"14774567814936359"
    }, 
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

NOTE: You have to pass the timetoken values as strings (notice that the startToken param value is quoted above, and below); otherwise JavaScript will round them, because a 17-digit timetoken is larger than the largest integer a JavaScript number can represent exactly (2^53 - 1). For example, 14774567814936359 will become 14774567814936360 and you will get unexpected results. Go ahead and try it without the quotes and you will see that the message at the provided timetoken is retrieved even though it should be excluded. You can see the rounded timetoken in the history URL that was submitted by checking the Network tab of your browser.
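The rounding is easy to check in a browser console or Node without touching PubNub. A minimal sketch using the example timetoken from above (the variable names are just for illustration):

```javascript
var asNumber = 14774567814936359;   // numeric literal: silently rounded
var asString = "14774567814936359"; // string: every digit preserved

console.log(Number.isSafeInteger(asNumber)); // false
console.log(asNumber === 14774567814936360); // true
console.log(String(asNumber) === asString);  // false
```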

6. get maximum of 20 messages prior to the message at timetoken 14774567814936359, 5 messages at a time

getMessages(
    {
        channel: 'test1', max: 20, pagesize: 5, reverse: false,
        startToken:"14774567814936359"
    }, 
    function(results) {
        console.log("results: \n" + JSON.stringify(results));       
    }
);

We could go on and on with more examples, but I'll let you (Scott) and others ask for how-to examples and I can provide as necessary.

Ordering history results

The getMessages function assembles the resulting array of messages in ascending order even when messages are retrieved from newest to oldest. It does this by prepending each new page of history results to the accumulation of previous iterations' results.
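The accumulation step on its own can be sketched as follows. mergePage is a hypothetical helper written for this illustration, and plain numbers stand in for the message objects.

```javascript
// Hypothetical helper mirroring the accumulation step described above:
// append when paging oldest-to-newest, prepend when paging newest-to-oldest.
function mergePage(results, page, reverse) {
    if (!results) return page.slice();
    return reverse ? results.concat(page) : page.concat(results);
}

// Two pages arriving newest-first (reverse:false)
var acc;
acc = mergePage(acc, [28, 29, 30, 31, 32], false);
acc = mergePage(acc, [23, 24, 25, 26, 27], false);
console.log(acc); // ascending, even though the newer page arrived first
```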

If the desire is to sort messages in descending timetoken order, then it is fairly simple to implement a sort comparator function to do so. I provide a simple example of this here:

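function sortHistory(messages, desc, callback) {
    messages.sort(function(a, b) {
        var e1 = desc ? b : a;
        var e2 = desc ? a : b;
        // compare timetokens as strings; parseInt would round 17-digit values
        var t1 = String(e1.timetoken);
        var t2 = String(e2.timetoken);
        // a shorter digit string is a smaller number
        if (t1.length !== t2.length) return t1.length - t2.length;
        return t1 < t2 ? -1 : (t1 > t2 ? 1 : 0);
    });

    callback(messages);
}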

This sortHistory function accepts the messages array (an array of JSON elements: {entry, timetoken}), a flag that indicates the desired sort order (ascending is the default; set to true for descending), and a function that will be called with the resulting sorted array of messages.
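Because timetokens are 17-digit values, beyond the range where JavaScript numbers are exact, a comparator can also work on BigInt values in runtimes that support them. This is a minimal sketch, not part of the PubNub SDK: compareTimetokens is a name made up here, and the sample objects only imitate the {entry, timetoken} shape with invented timetokens.

```javascript
// Hypothetical comparator: orders two history items by timetoken using BigInt,
// so no digits are lost. Pass desc = true for newest-first.
function compareTimetokens(a, b, desc) {
    var t1 = BigInt(a.timetoken);
    var t2 = BigInt(b.timetoken);
    if (t1 === t2) return 0;
    var order = t1 < t2 ? -1 : 1;
    return desc ? -order : order;
}

// Invented sample data; the first two timetokens differ by only 2,
// which a plain Number cannot tell apart at this magnitude.
var sample = [
    { entry: "message #2", timetoken: "14774567814936361" },
    { entry: "message #1", timetoken: "14774567814936359" },
    { entry: "message #3", timetoken: "14774567814936362" }
];

sample.sort(function (a, b) { return compareTimetokens(a, b, true); });
console.log(sample.map(function (m) { return m.entry; }));
```

Timetokens published 400 ms apart are far enough from each other that the rounding would not change the order in practice, so this matters mainly for messages published almost simultaneously.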

Now let's call getMessages with reverse:true. This means that the messages will be retrieved from oldest to newest, so if we want ascending sort order, the results are already in that order.

getMessages({channel: 'test1', max: 100, pagesize: 5, reverse: true}, 
    function(results) {
        console.log("presorted: \n" + JSON.stringify(results));       
    }
);

If we want to sort the messages in descending order, we just add a call to our sortHistory function.

getMessages({channel: 'test1', max: 100, pagesize: 5, reverse: true}, 
    function(results) {
        console.log("before sort: \n" + JSON.stringify(results));

        // sort messages in descending order
        sortHistory(results, true, function(sorted) {
            console.log("after sort: \n" + JSON.stringify(sorted));
        });        
    }
);

And if we retrieve the messages with reverse:false, newest to oldest, then we still get the messages assembled in ascending order even though the iterations arrive in chunks of size pagesize from newest to oldest (16,17,18,19,20 | 11,12,13,14,15...). But if we want to sort descending, again, we just need to call the sortHistory function.

getMessages({channel: 'test1', max: 100, pagesize: 5, reverse: false}, 
    function(results) {
        console.log("before sort: \n" + JSON.stringify(results));

        // sort messages in descending order
        sortHistory(results, true, function(sorted) {
            console.log("after sort: \n" + JSON.stringify(sorted));
        });        
    }
);

I updated the jsfiddle with new code and I made a lot of mods to this answer.

See the PubNub JavaScript SDK Storage (history) docs and the PubNub Storage and History tutorial for full details.