Dipto Dipto - 1 month ago 5
R Question

ggplot line plot: labels at the beginning of the line get repeated

I have a data.table of values as follows:

> head(dt)
id x y
1: 1 0.00 0.000000e+00
2: 1 0.05 4.761905e-05
3: 1 0.10 9.523810e-05
4: 1 0.15 1.428571e-04
5: 1 0.20 1.904762e-04
6: 1 0.25 2.380952e-04


I am using the following code (inspired from here) to create a line plot with the id of each line added as a label at the beginning of the line:

library(ggplot2)
plt<-ggplot(data=dt, aes(x=x, y=y, group=id, color=id))
plt<-plt+geom_line()+geom_text(aes(label=id, color=id, x=0, y=y), hjust=.1)+ theme(legend.position="none")
plt


The plot I am getting, however, looks like this:
[image: the resulting plot]
The labels seem to be repeated several times!

How should I modify the code so that each label appears only once?

Here is a subset of the data I am using:

> dput(subset(dt, dt$id<12))
structure(list(id = c("1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "10",
"10", "10", "10", "10", "10", "10", "10", "10", "10", "10", "11",
"11", "11", "11", "11", "11", "11", "11", "11", "11", "11", "11",
"11", "11", "11", "11", "11", "11", "11", "11", "11"), x = c(0,
0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55,
0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1, 0.45, 0.5, 0.55,
0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1, 0, 0.05, 0.1,
0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75,
0.8, 0.85, 0.9, 0.95, 1, 0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3,
0.35, 0.4, 0.45), y = c(0, 4.76190476190476e-05, 9.52380952380952e-05,
0.000142857142857143, 0.00019047619047619, 0.000238095238095238,
0.000285714285714286, 0.000333333333333333, 0.000380952380952381,
0.000428571428571429, 0.000476190476190476, 0.000523809523809524,
0.000571428571428571, 0.000619047619047619, 0.000666666666666667,
0.000714285714285714, 0.000761904761904762, 0.00080952380952381,
0.000857142857142857, 0.000904761904761905, 0.000952380952380952,
0.107380952380952, 0.0976190476190476, 0.0878571428571428, 0.0780952380952381,
0.0683333333333333, 0.0585714285714286, 0.0488095238095238, 0.039047619047619,
0.0292857142857143, 0.0195238095238095, 0.00976190476190475,
0, 0.195238095238095, 0.18547619047619, 0.175714285714286, 0.165952380952381,
0.156190476190476, 0.146428571428571, 0.136666666666667, 0.126904761904762,
0.117142857142857, 0.0785714285714286, 0.0707142857142857, 0.0628571428571428,
0.055, 0.0471428571428571, 0.0392857142857143, 0.0314285714285714,
0.0235714285714286, 0.0157142857142857, 0.00785714285714285,
0, 0.157142857142857, 0.149285714285714, 0.141428571428571, 0.133571428571429,
0.125714285714286, 0.117857142857143, 0.11, 0.102142857142857,
0.0942857142857143, 0.0864285714285714)), .Names = c("id", "x",
"y"), sorted = "id", class = c("data.table", "data.frame"), row.names = c(NA,
-63L), .internal.selfref = <pointer: 0x0000000006520788>)

Answer

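The labels repeat because geom_text() receives the full data set and draws one label for every row, all at x=0. You can try this: it creates a new data frame holding only the rows where x==0, so each id appears once.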

library(dplyr)    
subdt<-dt%>%group_by(id)%>%filter(x==0)

or with data.table

subdt<-setDT(dt)[, .SD[x==0], by=id]
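
If neither package is at hand, the same subset can be sketched in base R. The toy data frame below is made up for illustration (it only mimics the id/x/y shape of dt), and the names toy and sub_toy are hypothetical. The check at the end confirms that the subset holds exactly one row per id, which is what keeps each label from repeating.

```r
# Toy data in the same shape as dt (id, x, y); the values are invented.
toy <- data.frame(
  id = rep(c("1", "10", "11"), each = 3),
  x  = rep(c(0, 0.5, 1), times = 3),
  y  = c(0, 0.0005, 0.001, 0.2, 0.1, 0, 0.16, 0.08, 0),
  stringsAsFactors = FALSE
)

# Keep only the rows at x == 0, i.e. the start of each line.
sub_toy <- toy[toy$x == 0, ]

# One row per id means one label per line.
stopifnot(
  !anyDuplicated(sub_toy$id),
  nrow(sub_toy) == length(unique(toy$id))
)
```

This assumes every line has a point at exactly x == 0; an id without such a row would simply get no label.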

The plot:

ggplot(data=dt, aes(x=x, y=y, group=id, color=id))+geom_line()+geom_text(data=subdt, aes(label=id, color=id, x=0, y=y), hjust=.1)
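
For a self-contained check, here is a minimal sketch of the same idea on invented toy data (toy and label_toy are hypothetical names, not part of the question). It builds the plot with a label layer that uses only the x == 0 rows, then uses ggplot2's layer_data() to confirm that the text layer contains one row per id.

```r
library(ggplot2)

# Toy data in the same shape as dt (id, x, y); the values are invented.
toy <- data.frame(
  id = rep(c("1", "10", "11"), each = 3),
  x  = rep(c(0, 0.5, 1), times = 3),
  y  = c(0, 0.0005, 0.001, 0.2, 0.1, 0, 0.16, 0.08, 0),
  stringsAsFactors = FALSE
)

# Label data: one row per id, taken at the start of each line.
label_toy <- toy[toy$x == 0, ]

plt <- ggplot(toy, aes(x = x, y = y, group = id, color = id)) +
  geom_line() +
  # x, y and color are inherited from ggplot(); only the data differs.
  geom_text(data = label_toy, aes(label = id), hjust = 0.1) +
  theme(legend.position = "none")

# Layer 2 is the text layer; it should hold exactly one row per id.
n_labels <- nrow(layer_data(plt, 2))
stopifnot(n_labels == length(unique(toy$id)))
```

Printing plt draws the lines with a single label at the left end of each one.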

Edit: Following aosmith's comment I changed the subset to the y where x==0 which makes more sense.