I have a question about the output I'm getting from the match function. I have two data frames that differ in their number of rows and in their row names (by "row names" I mean the labels in the first column, x_1 and x_2). I want to derive two new data frames from them that have the same number of rows and the same row names. One way to do this is to match the row names of one data frame to the other.
Here's my code below so far:
    x_1 <- c("A1", "A1", "B10", "B10", "B10", "B10", "C100", "C100", "C100", "C100")
    y_1 <- round(seq(1, 24, length = 10), 2)
    A <- data.frame(x_1, y_1)

    x_2 <- c("A1", "B10", "C100", "D1", "D200", "G210")
    y_2 <- round(seq(1, 24, length = 6), 2)
    B <- data.frame(x_2, y_2)
Since A and B do not share all of their row names, I want to make new versions of A and B with every row name that is not common to both removed.
    m_1 <- names(table(A$x_1))
    m_2 <- names(table(B$x_2))
    comb_names <- union(m_1[!(m_1 %in% m_2)], m_2[!(m_2 %in% m_1)])

    A_1 <- A[!A$x_1 %in% c(comb_names), ]
    B_1 <- B[!B$x_2 %in% c(comb_names), ]

    newB_1 <- B_1[match(A_1$x_1, B_1$x_2), ]
newB_1 is a data frame built from B_1 whose rows have been matched to the row names in A_1.
My question is this: when I run names(table(newB_1$x_2)), I still get all the original row names from B, including the ones that should have been deleted by B_1 <- B[!B$x_2 %in% c(comb_names), ]. However, when I print newB_1 itself, it gives me the right output.
    names(table(newB_1$x_2))
    "A1" "B10" "C100" "D1" "D200" "G210"

    newB_1
     x_2  y_2
      A1  1.0
      A1  1.0
     B10  5.6
     B10  5.6
     B10  5.6
     B10  5.6
    C100 10.2
    C100 10.2
    C100 10.2
    C100 10.2
In fact, the same thing holds for names(table(B_1$x_2)), which suggests that B_1 <- B[!B$x_2 %in% c(comb_names), ] isn't deleting the names contained in comb_names:
    table(B_1$x_2)

      A1  B10 C100   D1 D200 G210
       1    1    1    0    0    0
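One possible reading of the zero counts, which I have not confirmed: if x_2 is stored as a factor (the default for data.frame() before R 4.0.0), the rows may well be removed while the unused factor levels are kept, and table() reports every level. The sketch below reproduces this under that assumption by setting stringsAsFactors = TRUE explicitly; the vector `keep` is a hypothetical stand-in for the common names.

```r
# Assumption: x_2 is a factor, as it would be by default in R < 4.0.0.
x_2 <- c("A1", "B10", "C100", "D1", "D200", "G210")
y_2 <- round(seq(1, 24, length = 6), 2)
B <- data.frame(x_2, y_2, stringsAsFactors = TRUE)

# Hypothetical stand-in for the names common to both data frames.
keep <- c("A1", "B10", "C100")
B_1 <- B[B$x_2 %in% keep, ]

nrow(B_1)         # 3: the unwanted rows are gone
levels(B_1$x_2)   # still all six levels, so table() shows zero counts

# droplevels() removes factor levels that no longer occur in the data.
B_1_clean <- droplevels(B_1)
levels(B_1_clean$x_2)
```

If that assumption holds, the subsetting line is doing its job and only the leftover levels make table() look wrong.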
My final question is: how can I completely delete the row names that are not common to both data frames A and B, so that I end up with two data frames with equal row names? That is, I don't want the names D1, D200 and G210 to appear in the new data frame at all.
I hope the above makes sense, and I would be very happy to clarify any ambiguities. I would like to know how to modify my code to get the desired output, but alternative approaches that replicate the result are also welcome.
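To make the desired result concrete, here is a rough sketch of the kind of alternative I have in mind. It assumes the name columns are factors (forced here with stringsAsFactors = TRUE) and uses intersect() plus droplevels() instead of my comb_names construction; it is an illustration of the target output, not necessarily the best way to get there.

```r
# Same example data as in the question; factors are an assumption.
x_1 <- c("A1", "A1", "B10", "B10", "B10", "B10", "C100", "C100", "C100", "C100")
y_1 <- round(seq(1, 24, length = 10), 2)
A <- data.frame(x_1, y_1, stringsAsFactors = TRUE)

x_2 <- c("A1", "B10", "C100", "D1", "D200", "G210")
y_2 <- round(seq(1, 24, length = 6), 2)
B <- data.frame(x_2, y_2, stringsAsFactors = TRUE)

# Names present in both data frames (compared as character strings).
common <- intersect(as.character(A$x_1), as.character(B$x_2))

# Keep only the common names, then drop the factor levels left unused.
A_1 <- droplevels(A[A$x_1 %in% common, ])
B_1 <- droplevels(B[B$x_2 %in% common, ])

# Repeat the rows of B_1 so they line up with A_1 row by row.
newB_1 <- B_1[match(A_1$x_1, B_1$x_2), ]

names(table(newB_1$x_2))
```

With this, newB_1 has as many rows as A_1, and D1, D200 and G210 no longer show up in the table.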