Column functions
Jolan Rensen edited this page Aug 2, 2022 · 3 revisions
As in the Scala API for Columns, many of the operator functions have been ported over to Kotlin.
For example:
ds.select( col("colA") + 5 )
// datasets can also be invoked to get a column
ds.select( ds("colA") / ds("colB") )
dataset.where( col("colA") `===` 6 )
// or alternatively
dataset.where( col("colA") eq 6 )
In short, all supported operators are:
- ==
  - same as equals()
- !=
  - same as !equals()
- eq / `===`
  - in Scala: ===
  - in Java: equalTo()
- neq / `=!=`
  - in Scala: =!=
  - in Java: notEqual()
- -col(...)
  - same in Scala
  - in Java: negate(col())
- !col(...)
  - same in Scala
  - in Java: not(col())
- gt
  - in Scala: >
  - same in Java but also infix
- lt
  - in Scala: <
  - same in Java but also infix
- geq
  - in Scala: >=
  - same in Java but also infix
- leq
  - in Scala: <=
  - same in Java but also infix
- or
  - in Scala: ||
  - same in Java but also infix
  - `||` is unfortunately an illegal function name on Windows
- and / `&&`
  - in Scala: &&
  - in Java: and()
- +
  - same in Scala
  - in Java: plus()
- - (binary)
  - same in Scala
  - in Java: minus()
- *
  - same in Scala
  - in Java: multiply()
- /
  - same in Scala
  - in Java: divide()
- %
  - same in Scala
  - in Java: mod()
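The mapping above rests on two Kotlin language features: operator overloading and infix functions. The sketch below illustrates the mechanism with a hypothetical `Expr` class standing in for Spark's Column. It is not the library's implementation; `Expr`, `column` and `operand` are names assumed purely for illustration. It also shows why `eq` / `===` exist at all: Kotlin's `==` always delegates to equals() and returns a Boolean, so it cannot build a column expression.

```kotlin
// Hypothetical stand-in for a Spark Column: it only records the expression as text.
class Expr(val sql: String) {
    override fun toString(): String = sql
}

// Hypothetical helpers (not part of any library) to create a column reference
// and to render an operand that is either another Expr or a literal value.
fun column(name: String) = Expr(name)
private fun operand(value: Any): String = if (value is Expr) value.sql else value.toString()

// Operator overloads: `a + b` compiles to a.plus(b), `-a` to a.unaryMinus(), `!a` to a.not().
operator fun Expr.plus(other: Any) = Expr("($sql + ${operand(other)})")
operator fun Expr.div(other: Any) = Expr("($sql / ${operand(other)})")
operator fun Expr.unaryMinus() = Expr("(-$sql)")
operator fun Expr.not() = Expr("(NOT $sql)")

// Infix functions: `a eq b` compiles to a.eq(b).
// Backticks allow symbolic function names such as `===`.
infix fun Expr.eq(other: Any) = Expr("($sql = ${operand(other)})")
infix fun Expr.`===`(other: Any) = this eq other
infix fun Expr.gt(other: Any) = Expr("($sql > ${operand(other)})")
infix fun Expr.and(other: Expr) = Expr("($sql AND ${other.sql})")

fun main() {
    val a = column("colA")
    val b = column("colB")
    println(a + 5)                    // (colA + 5)
    println(a / b)                    // (colA / colB)
    println(a `===` 6)                // (colA = 6)
    println(!(a gt 1) and (-b eq 2))  // ((NOT (colA > 1)) AND ((-colB) = 2))
    // `==` is plain equals(): it yields a Boolean, not an expression.
    println(a == column("colA"))      // false
}
```

Note that prefix operators bind more tightly than named infix functions, so `-b eq 2` is read as `(-b) eq 2`; parentheses are still advisable when combining several infix calls.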
 
Secondly, there are some quality-of-life additions as well.
In Kotlin, ranges are often used to express inclusive or exclusive bounds. So, instead of between(a, b) you can now write:
dataset.where( col("colA") inRangeOf 0..2 )
Also, for columns containing map- or array-like types, instead of getItem() you can use the index operator:
dataset.where( col("colB")[0] geq 5 )
Finally, thanks to Kotlin reflection, we can provide a type-safe and refactor-safe way to create TypedColumns and, with those, a new Dataset from pieces of another using the select() function:
val dataset: Dataset<YourClass> = ...
val newDataset: Dataset<Tuple2<TypeA, TypeB>> = dataset.select(col(YourClass::colA), col(YourClass::colB))
// Alternatively, for instance when working with a Dataset<Row>
val typedDataset: Dataset<Tuple2<String, Int>> = otherDataset.select(col<_, String>("a"), col<_, Int>("b"))
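To make the three additions above more concrete, here is a minimal, self-contained sketch of how such helpers can be expressed in Kotlin. It deliberately avoids Spark: `Expr`, `TypedExpr`, `column` and `User` are hypothetical stand-ins, and only the shape of `inRangeOf`, the index operator and the property-reference overload is meant to mirror what the page describes.

```kotlin
import kotlin.reflect.KProperty1

// Hypothetical stand-in for a Spark Column with Java-style between() and getItem() methods.
open class Expr(val sql: String) {
    fun between(lower: Any, upper: Any) = Expr("($sql BETWEEN $lower AND $upper)")
    fun getItem(key: Any) = Expr("${sql}[${key}]")
    override fun toString(): String = sql
}

// Hypothetical typed variant: T is the row type, U is the column's value type.
class TypedExpr<T, U>(sql: String) : Expr(sql)

// Untyped column reference by name.
fun column(name: String) = Expr(name)

// A property reference such as User::age carries the name and both types,
// so renaming the property in an IDE updates the column reference as well.
fun <T, U> column(property: KProperty1<T, U>) = TypedExpr<T, U>(property.name)

// A ClosedRange is inclusive on both ends, which maps directly onto between(lower, upper).
infix fun Expr.inRangeOf(range: ClosedRange<*>) = between(range.start, range.endInclusive)

// `expr[key]` compiles to expr.get(key), which forwards to getItem(key).
operator fun Expr.get(key: Any) = getItem(key)

// Hypothetical row class used only for this example.
data class User(val name: String, val age: Int)

fun main() {
    println(column("colA") inRangeOf 0..2)  // (colA BETWEEN 0 AND 2)
    println(column("colB")[0])              // colB[0]
    val age: TypedExpr<User, Int> = column(User::age)
    println(age)                            // age
}
```

Because the type parameters are inferred from the property reference, assigning `column(User::age)` to a `TypedExpr<User, String>` would fail to compile, which is the kind of safety the reflection-based `col(YourClass::colA)` form is aiming for.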