Vinay billa - 3 months ago
R Question

Function to identify changes done previously

BACKGROUND

I have a list of 16 data frames. One of them is shown below; all the others have a similar format. The DateTime column is of Date class, while the Value column is of time series class.

> head(train_data[[1]])

DateTime Value
739 2009-07-31 49.9
740 2009-08-31 53.5
741 2009-09-30 54.4
742 2009-10-31 56.0
743 2009-11-30 54.4
744 2009-12-31 55.3


I am performing forecasting for the Value column across all the data.frames in this list. The following line of code feeds data into the UCM model.

train_dataucm <- lapply(train_data, transform, Value = ifelse(Value > 50000 , Value/100000 , Value ))


The transform call is used to scale down large values because UCM has some issues rounding off large values (I don't know why, though). I learned that from user @KRC in this link

One data frame was affected because it had large values, which were scaled down by the condition above. All the other data frames remained unaffected.

> head(train_data[[5]])
DateTime Value
715 2009-07-31 139901
716 2009-08-31 139492
717 2009-09-30 138818
718 2009-10-31 138432
719 2009-11-30 138659
720 2009-12-31 138013


I only found this out because I manually checked each one of the 15 data frames.

PROBLEM


  1. Is there a function that can identify the data frames that were
    affected by the condition I inserted?

  2. The function should pick out the affected data frames and collect them into a list.



If I can do this, then I can reverse the transformation on those values and get the actual values back.
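Since the transform line divides by 100000 rather than taking a logarithm, reversing it is a multiplication by the same factor. A minimal sketch, assuming the untransformed data is still available to tell which elements were scaled; the values and the names `original`, `scaled`, `was_scaled` and `restored` are hypothetical and only for illustration:

```r
# Hypothetical values; the threshold (50000) and divisor (100000)
# are taken from the transform line in the question.
original <- data.frame(Value = c(139901, 139492, 138818))

# Same element-wise scaling as in the question
scaled <- transform(original, Value = ifelse(Value > 50000, Value / 100000, Value))

# Reverse only the elements that were actually scaled,
# using the untransformed values to build the mask
was_scaled <- original$Value > 50000
restored <- scaled
restored$Value[was_scaled] <- restored$Value[was_scaled] * 100000
```

Because `ifelse` works element by element, a data frame with a mix of small and large values would only have its large values scaled, which is why the sketch restores by mask rather than multiplying the whole column.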

This way I can give the correct forecasts with minimal human intervention.

I hope I have stated the problem clearly.

Thank You.

Answer

Simply check whether any of the values in a data frame exceeds the threshold:

has_too_high_values = function (df)
    any(df$Value > 50000)

And then collect them, e.g. using Filter:

Filter(has_too_high_values, train_data)
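If the positions of the affected data frames are also needed (for example, to match them up with forecasts later), the same predicate can be combined with `vapply` and `which`. This is only a sketch on a toy list: `toy_data` and its values are made up for illustration, and `na.rm = TRUE` is an added assumption so that missing values do not produce an `NA` flag.

```r
# Toy stand-in for train_data (hypothetical values, not the real data)
toy_data <- list(
  data.frame(DateTime = as.Date(c("2009-07-31", "2009-08-31")),
             Value = c(49.9, 53.5)),
  data.frame(DateTime = as.Date(c("2009-07-31", "2009-08-31")),
             Value = c(139901, 139492))
)

# Same check as above; na.rm = TRUE is an assumption added here
has_too_high_values <- function(df) any(df$Value > 50000, na.rm = TRUE)

# One logical flag per data frame, then the positions of the affected ones
affected_flags <- vapply(toy_data, has_too_high_values, logical(1))
affected_idx <- which(affected_flags)

# The affected data frames themselves, as a list
affected <- toy_data[affected_idx]
```

Here `affected` holds the same data frames that `Filter` would return, while `affected_idx` keeps track of where they sit in the original list.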