R Data Trimming – How to Remove Specific Characters in R

rtrim

I have a data set which looks like:

> data <- c("IGHV1-2*02 F, or  IGHV1-2*03 F", "IGHV3-23*01 F, or IGHV3-23*04 F",
>           "IGHV2-70*01 F", "IGHV7-4-1*01")

For each element I would like to keep only the first identifier ("V1-2", for example) and delete everything that follows, including the "*".
So I tried the following:

> data.substr<-substr(data,4,9)
> data.substr1<-gsub("*","",data.substr)

but I still can't get rid of the "*", probably because it serves as a placeholder…
Does anyone have an idea?

Best Answer

gsub("[*].*$","",data)

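Put the * in square brackets so that it is treated as a literal character rather than as a regex quantifier. The .* that follows matches everything after it up to the end of the string ($), so the asterisk and everything after it are replaced with an empty string.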
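
As a rough sketch of how this plays out on the data from the question: the variable names trimmed, short and no_star below are made up for illustration, and stripping the leading "IGH" is an assumption about the desired output ("V1-2") rather than part of the answer above.

```r
data <- c("IGHV1-2*02 F, or  IGHV1-2*03 F",
          "IGHV3-23*01 F, or IGHV3-23*04 F",
          "IGHV2-70*01 F",
          "IGHV7-4-1*01")

# Remove the first literal "*" and everything after it.
# "[*]" is a character class containing only the asterisk, so it is
# matched literally instead of being read as a quantifier.
trimmed <- gsub("[*].*$", "", data)
print(trimmed)

# Assumption: the asker also wants the leading "IGH" dropped,
# which would leave "V1-2", "V3-23", and so on.
short <- sub("^IGH", "", trimmed)
print(short)

# To delete only the asterisk itself (as in the original attempt),
# fixed = TRUE makes gsub treat the pattern as a plain string.
no_star <- gsub("*", "", substr(data, 4, 9), fixed = TRUE)
print(no_star)
```

Escaping the asterisk with a double backslash ("\\*.*$") should behave the same as the bracketed form; the brackets simply avoid the extra backslashes in an R string.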
