user3910073 - 9 months ago 42

R Question

I have a data frame like this:

    NAME DIST
    A    0
    A    1
    A    100
    A    2
    A    1
    A    4
    A    500
    A    1
    A    1

What I want is an efficient way to create a new column NEWNAME in which the name changes whenever DIST > 100, and all rows between two such changes share the same name:

    NAME DIST NEWNAME
    A    0    A
    A    1    A
    A    100  A
    A    2    A2
    A    1    A2
    A    4    A2
    A    500  A3
    A    1    A3
    A    1    A3

I have done this with a for loop, but I am looking for a more efficient, idiomatic R solution. Here is my loop-based code:

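    k <- 0
    n <- nrow(df)
    df$NEWNAME <- as.character(df$NAME)  # start from the original names
    for (l in seq_len(n)) {
      if (df$DIST[l] >= 100) {
        k <- k + 1
        # rename this row and all rows after it
        df$NEWNAME[l:n] <- paste0(df$NAME[l:n], k)
      }
    }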

Thanks in advance

Answer

You can do this to create your new column:

```
df$NEWNAME <- paste0(df$NAME, cumsum(df$DIST >= 100))
```
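To see why the one-liner works, it can help to look at the intermediate values. The sketch below rebuilds the example data and splits the expression into steps. The variable names `flag` and `group` are illustrative only, and the threshold is assumed to be `>= 100`:

```r
# Example data from the question
df <- data.frame(NAME = rep("A", 9),
                 DIST = c(0, 1, 100, 2, 1, 4, 500, 1, 1),
                 stringsAsFactors = FALSE)

# TRUE on every row where DIST reaches the threshold
flag <- df$DIST >= 100

# cumsum() treats TRUE as 1 and FALSE as 0, so this is a running
# counter that increases by one at each flagged row
group <- cumsum(flag)

# Append the counter to the existing name
df$NEWNAME <- paste0(df$NAME, group)
```

Because every step is vectorised, no explicit loop over the rows is needed.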

I used your data as `df` and assumed you meant greater than *or equal to* 100:

```
df=data.frame("NAME"=rep("A", 9), "DIST"=c(0,1,100,2,1,4,500,1,1))
```

**EDIT**

If the new name should instead start on the row after the one where DIST >= 100, shift the column down by one row afterwards:

```
# lag() with a `default` argument is dplyr::lag(), not base R's stats::lag()
df$NEWNAME2 <- dplyr::lag(df$NEWNAME, n = 1, default = "A0")
head(df, 5)
##   NAME DIST NEWNAME NEWNAME2
## 1    A    0      A0       A0
## 2    A    1      A0       A0
## 3    A  100      A1       A0
## 4    A    2      A1       A1
## 5    A    1      A1       A1

Source (Stackoverflow)