WithColumn
Return a new SparkDataFrame by adding a column or replacing the existing column that has the same name.
Usage
withColumn(x, colName, col)
# S4 method for SparkDataFrame,character
withColumn(x, colName, col)
Arguments
- x
- a SparkDataFrame. 
- colName
- a column name. 
- col
- a Column expression (which must refer only to this SparkDataFrame), or an atomic vector of length 1 to be used as a literal value. 
Details
Note: This method introduces a projection internally. Calling it many times,
for instance in a loop that adds one column per iteration, can therefore generate
large query plans that cause performance issues and even a StackOverflowException.
To avoid this, use select to add multiple columns at once.
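The note above can be illustrated with a minimal sketch. It assumes a local Spark installation that sparkR.session() can start; the toy data, the column names x1 to x3, and the variables existing, added, and wide are hypothetical and chosen only for illustration.

```r
library(SparkR)
sparkR.session()

# Toy SparkDataFrame with a single numeric column (illustrative data)
df <- createDataFrame(data.frame(col1 = c(1, 2, 3)))

# Pattern to avoid: every iteration adds another projection to the plan
# for (i in 1:3) df <- withColumn(df, paste0("x", i), df$col1 * i)

# Alternative: build all Column expressions first ...
existing <- lapply(columns(df), function(name) df[[name]])
added <- lapply(1:3, function(i) alias(df$col1 * i, paste0("x", i)))

# ... then add them with a single select, which is one projection
wide <- select(df, c(existing, added))
printSchema(wide)

sparkR.session.stop()
```

Passing one list of Column objects to select keeps the existing columns and appends the new ones in one step, so the size of the plan does not grow with the number of added columns.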
See also
Other SparkDataFrame functions: 
SparkDataFrame-class,
agg(),
alias(),
arrange(),
as.data.frame(),
attach,SparkDataFrame-method,
broadcast(),
cache(),
checkpoint(),
coalesce(),
collect(),
colnames(),
coltypes(),
createOrReplaceTempView(),
crossJoin(),
cube(),
dapplyCollect(),
dapply(),
describe(),
dim(),
distinct(),
dropDuplicates(),
dropna(),
drop(),
dtypes(),
exceptAll(),
except(),
explain(),
filter(),
first(),
gapplyCollect(),
gapply(),
getNumPartitions(),
group_by(),
head(),
hint(),
histogram(),
insertInto(),
intersectAll(),
intersect(),
isLocal(),
isStreaming(),
join(),
limit(),
localCheckpoint(),
merge(),
mutate(),
ncol(),
nrow(),
persist(),
printSchema(),
randomSplit(),
rbind(),
rename(),
repartitionByRange(),
repartition(),
rollup(),
sample(),
saveAsTable(),
schema(),
selectExpr(),
select(),
showDF(),
show(),
storageLevel(),
str(),
subset(),
summary(),
take(),
toJSON(),
unionAll(),
unionByName(),
union(),
unpersist(),
withWatermark(),
with(),
write.df(),
write.jdbc(),
write.json(),
write.orc(),
write.parquet(),
write.stream(),
write.text()
Examples
if (FALSE) {
sparkR.session()
path <- "path/to/file.json"
df <- read.json(path)
# Add a new column computed from an existing one
newDF <- withColumn(df, "newCol", df$col1 * 5)
# Replace an existing column
newDF2 <- withColumn(newDF, "newCol", newDF$col1)
# Replace it with a literal value (an atomic vector of length 1)
newDF3 <- withColumn(newDF, "newCol", 42)
# Use extract operator to set an existing or new column
df[["age"]] <- 23
df[[2]] <- df$col1
df[[2]] <- NULL # drop column
}