scala - How to flatMap a nested DataFrame in Spark


I have a nested string, shown below, and I want to flatMap it to produce unique rows in Spark.

My DataFrame contains:

a,b,"x,y,z",d 

I want to convert it to produce output like:

a,b,x,d
a,b,y,d
a,b,z,d

How can I do that?

Basically, how can I flatMap and apply a function inside a DataFrame?

Thanks.

Spark 2.0+

Use Dataset.flatMap:

val ds = df.as[(String, String, String, String)]
ds.flatMap {
  case (x1, x2, x3, x4) => x3.split(",").map((x1, x2, _, x4))
}.toDF
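
For completeness, here is a minimal end-to-end sketch of that approach. The session setup and the column names x1..x4 are placeholder assumptions, not part of the question:

import org.apache.spark.sql.SparkSession

// Hypothetical entry point; adjust to your own session.
val spark = SparkSession.builder.appName("FlatMapExample").getOrCreate()
import spark.implicits._  // provides the tuple encoders for .as[...] and .toDF

val df = Seq(("a", "b", "x,y,z", "d")).toDF("x1", "x2", "x3", "x4")

val result = df.as[(String, String, String, String)].flatMap {
  // emit one output tuple per element of the comma-separated third field
  case (x1, x2, x3, x4) => x3.split(",").map((x1, x2, _, x4))
}.toDF("x1", "x2", "x3", "x4")

result.show()
// +---+---+---+---+
// | x1| x2| x3| x4|
// +---+---+---+---+
// |  a|  b|  x|  d|
// |  a|  b|  y|  d|
// |  a|  b|  z|  d|
// +---+---+---+---+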

Spark 1.3+

Use the split and explode functions:

val df = Seq(("a", "b", "x,y,z", "d")).toDF("x1", "x2", "x3", "x4")
df.withColumn("x3", explode(split($"x3", ",")))
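
Note that split and explode live in org.apache.spark.sql.functions, and the $"..." column syntax comes from the session's implicits, so this version needs (assuming a SparkSession named spark):

import org.apache.spark.sql.functions.{explode, split}
import spark.implicits._  // enables $"x3" and Seq(...).toDF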

Spark 1.x

Use DataFrame.explode (deprecated in Spark 2.x):

df.explode($"x3")(_.getAs[String](0).split(",").map(Tuple1(_)))
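
If memory serves, this variant appends the generated column (named _1, from Tuple1's field name) next to the existing columns rather than replacing x3, so a cleanup step is usually wanted afterwards. A sketch under that assumption:

// drop the original column and rename the generated one back to x3
df.explode($"x3")(_.getAs[String](0).split(",").map(Tuple1(_)))
  .drop("x3")
  .withColumnRenamed("_1", "x3")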
