Maxime P. -4 years ago 86
R Question

# Deducing information from matched pairs programmatically

I would like to analyze some data.

My dataset is composed of 1408 observations (704 of type 1 and 704 of type 2) and 49 variables. Part of it is shown below.

I want to analyze the gender of the type 1 subjects (sellers) who overcharged.

`````` Data
Subject ID  Gender   Period   Matching group   Group    Type  Overcharging
654        1           1            73         1        1      NA
654        1           2            73         1        1      NA
654        1           3            73         1        1      NA
654        1           4            73         1        1      NA
708        0           1            73         1        2       1
708        0           2            73         1        2       0
708        0           3            73         1        2       0
708        0           4            73         1        2       1
435        1           1            73         2        1      NA
435        1           2            73         2        1      NA
435        1           3            73         2        1      NA
435        1           4            73         2        1      NA
546        0           1            73         2        2       0
546        0           2            73         2        2       0
546        0           3            73         2        2       1
546        0           4            73         2        2       0
``````

For example, in matching group 73 there are 2 groups (1 and 2), and each group contains two types (1 and 2). For each type 1 subject (seller), we have no information about what he did (overcharge or not). But we do have information about whether the buyers (type 2) were overcharged.

If I can identify a buyer who was overcharged, this means that the seller this buyer interacted with overcharged him. So all I need to look at is the gender of the seller in the same group as the buyer.

In matching group 73 we know, for instance, that in period 1 subject 708 (the buyer in group 1) was overcharged. Since this subject belongs to group 1 and matching group 73, I can identify the seller who overcharged him: subject 654, with gender = 1.

In group 2 (matching group 73), we know that in period 3 subject 546 was overcharged. Since this subject belongs to group 2 and matching group 73, I can identify the seller who overcharged him: subject 435, with gender = 1.
....
I would like to do this for all the observations I have.

However, I really don't know how to code this condition in R.

This is what I tried, but it doesn't fit my needs:

``````  for (matchinggroup[type==1]==matchinggroup[type==2] &
group[type==1]==group[type==2] & period[type==1]==period[type==2])
{
if ((overtreatment==1), na.rm=TRUE)
sum(gender==1[type==1], na.rm=TRUE)
}
``````

The output I would like to get is:

``````    sum(overcharging==1[gender==1&type==1])
>3
sum(overcharging==1[gender==0&type==1])
>0
sum(overcharging==0[gender==1&type==1])
>5
sum(overcharging==0[gender==0&type==1])
>0
``````

Thank you for your time and consideration! Any help is appreciated.

Not exactly sure what your desired output is, but consider this:

``````Data <- read.table(header = T,
text = "Subject_ID  Gender   Period   Matching_group   Group    Type  Overcharging
654        1           1            73         1        1      NA
654        1           2            73         1        1      NA
654        1           3            73         1        1      NA
654        1           4            73         1        1      NA
708        0           1            73         1        2       1
708        0           2            73         1        2       0
708        0           3            73         1        2       0
708        0           4            73         1        2       1
435        1           1            73         2        1      NA
435        1           2            73         2        1      NA
435        1           3            73         2        1      NA
435        1           4            73         2        1      NA
546        0           1            73         2        2       0
546        0           2            73         2        2       0
546        0           3            73         2        2       1
546        0           4            73         2        2       0
")

dat1 <- subset(Data, Overcharging==1)
``````

This selects every observation in which a buyer was overcharged. You can then find the matching seller for each of those observations using this loop:

``````out <- data.frame()

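for (i in seq_len(nrow(dat1))) {
  # one observation of an overcharged buyer
  dat2 <- dat1[i, ]
  # the seller (Type 1) in the same period, matching group and group
  df <- Data[Data$Period == dat2$Period & Data$Matching_group == dat2$Matching_group &
             Data$Group == dat2$Group & Data$Type == 1, ]
  out <- rbind(out, df)
}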
``````

Which will give you:

``````    Subject_ID Gender Period Matching_group Group Type Overcharging
1         654      1      1             73     1    1           NA
4         654      1      4             73     1    1           NA
11        435      1      3             73     2    1           NA
``````