Peter Nsanze Peter Nsanze - 4 months ago 8
R Question

r programming binding dataframes with matching country names

I have two data frames of country data.

df1
has all the countries of the world.
df2
has a subset of countries but has the populations in one of its columns.
I want to take the population data and add it to
df1
where the country names are a match.

If
df1$Column1 = df2$Column1
(same country name) then populate
df1$Column2
(currently empty) with the information from
df2$Column2
(country's population) where the row is the the one for that country match.

I tried to merge the two using the column "Name" which they both have for country names. total <- merge(map,Co2_2x, by="NAME") But I get empty rows in my new dataframe but the columns are all there.

I'd like to be able to say "for this row and column matrix position in df1 (the country), get the row (country name match in df2) and column X (population data). Then put it in this row and column Y matrix position in df1 (new population column in df1 for the matched country name)"... There must be an easier way :-)

I'd like to fill map$measure with data from Co2_2x$premium where the countries match.

Thanks!

library(XML)
library(raster)
library(rgdal)
download.file("http://thematicmapping.org/downloads/TM_WORLD_BORDERS_SIMPL-0.3.zip",destfile="TM_WORLD_BORDERS_SIMPL-0.3.zip")
unzip("TM_WORLD_BORDERS_SIMPL-0.3.zip",exdir=getwd())
polygons <- shapefile("TM_WORLD_BORDERS_SIMPL-0.3.shp")

polygons
map <- as.data.frame(polygons)

map$Measure <- 0

library(rvest)
Co2 <- read_html("https://en.wikipedia.org/wiki/List_of_countries_by_carbon_dioxide_emissions")

Co2_2x<-Co2 %>%
html_nodes("table") %>%
.[[1]] %>%
html_table()

names(Co2_2x)[2]<-paste("premium")

names(Co2_2x)[1]<-paste("NAME")


total <- merge(map,Co2_2x, by="NAME")

Answer

I have trouble importing Co2, but I think that you just need to add the all.x=T option, as follows :

total <- merge(map,Co2_2x, by="NAME",all.x=T)

Have a look at the documentation for details.

Comments