Maxime P. - 4 years ago 86
R Question

Deduce information from matched pairs programmatically

I would like to analyze data.

My dataset consists of 1408 observations (704 for type 1 and 704 for type 2) and 49 variables. Part of it is shown below.

The goal is to analyze the gender of the type 1 subjects (sellers) who overcharged.

Data
Subject ID  Gender  Period  Matching group  Group  Type  Overcharging
654         1       1       73              1      1     NA
654         1       2       73              1      1     NA
654         1       3       73              1      1     NA
654         1       4       73              1      1     NA
708         0       1       73              1      2     1
708         0       2       73              1      2     0
708         0       3       73              1      2     0
708         0       4       73              1      2     1
435         1       1       73              2      1     NA
435         1       2       73              2      1     NA
435         1       3       73              2      1     NA
435         1       4       73              2      1     NA
546         0       1       73              2      2     0
546         0       2       73              2      2     0
546         0       3       73              2      2     1
546         0       4       73              2      2     0


For example, in matching group 73 there are 2 groups (1 and 2), and in each group there are two types (1 and 2). For each type 1 subject (seller) we have no information about what he did (overcharged or not). But we do have information about whether the buyers (type 2) were overcharged.

If I can identify a buyer who was overcharged, this means that the seller this buyer interacted with overcharged him. So all I need to look at is the gender of the seller in the same group as the buyer.

In matching group 73 we know, for instance, that in period 1 subject 708 (the buyer in group 1) was overcharged. Since I know that this subject belongs to group 1 and matching group 73, I can identify the seller who overcharged him: subject 654, with gender = 1.

In group 2 (matching group 73), we know that in period 3 subject 546 was overcharged. Since I know that this subject belongs to group 2 and matching group 73, I can identify the seller who overcharged him: subject 435, with gender = 1.
....
I would like to do this for all the observations I have.

However, I really don't know how to code this condition in R.

This is what I tried, but it doesn't do what I need:

for (matchinggroup[type==1]==matchinggroup[type==2] &
group[type==1]==group[type==2] & period[type==1]==period[type==2])
{
if ((overtreatment==1), na.rm=TRUE)
sum(gender==1[type==1], na.rm=TRUE)
}


The output I would like to get is:

sum(overcharging==1[gender==1&type==1])
>3
sum(overcharging==1[gender==0&type==1])
>0
sum(overcharging==0[gender==1&type==1])
>5
sum(overcharging==0[gender==0&type==1])
>0


Thank you for your time and consideration! Any help is appreciated.

Answer Source

Not exactly sure what your desired output is, but consider this:

Data <- read.table(header = TRUE,
                   text = "Subject_ID  Gender   Period   Matching_group   Group    Type  Overcharging
654        1           1            73         1        1      NA
654        1           2            73         1        1      NA
654        1           3            73         1        1      NA
654        1           4            73         1        1      NA 
708        0           1            73         1        2       1
708        0           2            73         1        2       0
708        0           3            73         1        2       0
708        0           4            73         1        2       1
435        1           1            73         2        1      NA
435        1           2            73         2        1      NA
435        1           3            73         2        1      NA
435        1           4            73         2        1      NA    
546        0           1            73         2        2       0
546        0           2            73         2        2       0
546        0           3            73         2        2       1
546        0           4            73         2        2       0
")

dat1 <- subset(Data, Overcharging==1)

This selects all the buyer rows (Type 2) in which the buyer was overcharged. You can then find the matching seller (Type 1) for each of them using this loop:

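# Collect the matching seller row for each overcharged buyer row
out <- data.frame()

for(i in seq_len(nrow(dat1))){
  dat2 <- dat1[i,]   # one overcharged buyer observation
  # Seller (Type 1) in the same period, matching group and group
  df <- Data[Data$Period==dat2$Period & Data$Matching_group==dat2$Matching_group &
     Data$Group==dat2$Group & Data$Type==1,]
  out <- rbind(out, df)
}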

Which will give you:

    Subject_ID Gender Period Matching_group Group Type Overcharging
1         654      1      1             73     1    1           NA
4         654      1      4             73     1    1           NA
11        435      1      3             73     2    1           NA
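
If what you are after is the counts from your expected output, a join may be simpler than a loop. The following is only a sketch, not something tested on your full data: it assumes that each combination of Matching_group, Group and Period contains exactly one seller and one buyer, as in your sample. The names keys, sellers, buyers, pairs, counts, Seller_ID and Seller_gender are made up for this illustration.

```r
# Sample data from the question (same column names as above)
Data <- read.table(header = TRUE,
                   text = "Subject_ID Gender Period Matching_group Group Type Overcharging
654 1 1 73 1 1 NA
654 1 2 73 1 1 NA
654 1 3 73 1 1 NA
654 1 4 73 1 1 NA
708 0 1 73 1 2 1
708 0 2 73 1 2 0
708 0 3 73 1 2 0
708 0 4 73 1 2 1
435 1 1 73 2 1 NA
435 1 2 73 2 1 NA
435 1 3 73 2 1 NA
435 1 4 73 2 1 NA
546 0 1 73 2 2 0
546 0 2 73 2 2 0
546 0 3 73 2 2 1
546 0 4 73 2 2 0
")

# Columns that identify one seller-buyer interaction
# (assumption: one seller and one buyer per combination)
keys <- c("Matching_group", "Group", "Period")

# Sellers (Type 1) carry the gender of interest,
# buyers (Type 2) carry the observed outcome
sellers <- Data[Data$Type == 1, c(keys, "Subject_ID", "Gender")]
buyers  <- Data[Data$Type == 2, c(keys, "Overcharging")]

# Rename the seller columns so the merged result is unambiguous
names(sellers)[names(sellers) == "Subject_ID"] <- "Seller_ID"
names(sellers)[names(sellers) == "Gender"]     <- "Seller_gender"

# One row per interaction: the seller plus the buyer's outcome
pairs <- merge(sellers, buyers, by = keys)

# Seller gender against overcharging; fixed factor levels keep
# the empty gender = 0 row in the table
counts <- table(
  Seller_gender = factor(pairs$Seller_gender, levels = c(0, 1)),
  Overcharging  = factor(pairs$Overcharging,  levels = c(0, 1))
)
print(counts)
```

On the sample data this gives 3 overcharged and 5 non-overcharged interactions for sellers with gender 1, and 0 for both with gender 0, which matches the numbers in your question. If a group can contain several sellers or buyers in the same period, merge() would pair every seller with every buyer, so you would need an additional key.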