Description
This issue tracks research on extending Kotlin DataFrame with first-class integration support for Spring JDBC in Gradle-based projects.
The goal is to explore and define:
- a recommended integration pattern between Spring JDBC and Kotlin DataFrame,
- a minimal, idiomatic example for users,
- a short guide describing how DataFrame fits into Spring JDBC–based architectures.
The focus is on read-oriented use cases (analytics, reporting, data processing), not ORM or CRUD replacement.
Proposed pattern:
Use Spring JDBC as a thin SQL execution layer (JdbcTemplate / NamedParameterJdbcTemplate), and convert query results directly into Kotlin DataFrame for in-memory transformations, aggregation, and analysis. Spring manages infrastructure, while DataFrame handles tabular data logic.
Benefits
- Seamless adoption of DataFrame in existing Spring JDBC projects
- Clear separation between SQL access and data transformation logic
- No ORM or Spring Data dependency required
- Idiomatic Kotlin API for analytics and reporting
- Lower entry barrier for users already using Spring JDBC
Best practice
```kotlin
@Component
class SalesAnalyticsRepository(
    private val jdbcTemplate: JdbcTemplate
) {
    fun salesDf(from: LocalDate, to: LocalDate): DataFrame<SaleRow> =
        // Spring's Kotlin `query` extension returns T?, so the result is unwrapped here
        checkNotNull(
            jdbcTemplate.query(
                """
                select date, region, amount
                from sales
                where date between ? and ?
                """,
                from, to
            ) { rs ->
                // the extractor runs on the Spring-managed connection;
                // the whole ResultSet is converted to a DataFrame in one pass
                rs.toDataFrame<SaleRow>()
            }
        )
}

data class SaleRow(
    val date: LocalDate,
    val region: String,
    val amount: BigDecimal
)
```
```kotlin
@Service
class SalesReportService(
    private val repo: SalesAnalyticsRepository
) {
    fun revenueByRegion(from: LocalDate, to: LocalDate): DataFrame<*> =
        repo.salesDf(from, to)
            .groupBy { region }
            .aggregate {
                sum { amount }
            }
}
```
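For the Gradle-based projects this issue targets, a minimal dependency setup could look like the sketch below. Artifact coordinates are the published ones for Spring Boot and Kotlin DataFrame, but the version number and the choice of PostgreSQL driver are placeholders to verify against current releases:

```kotlin
// build.gradle.kts (sketch; the DataFrame version is a placeholder)
dependencies {
    implementation("org.springframework.boot:spring-boot-starter-jdbc") // JdbcTemplate + DataSource auto-config
    implementation("org.jetbrains.kotlinx:dataframe:0.15.0")            // Kotlin DataFrame core + JDBC reading
    runtimeOnly("org.postgresql:postgresql")                            // any JDBC driver; PostgreSQL as an example
}
```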
Anti-patterns
```kotlin
@Service
class SalesService(
    private val jdbcTemplate: JdbcTemplate
) {
    fun revenueByRegion(): DataFrame<*> {
        val df = jdbcTemplate.query(
            "select * from sales" // unbounded full-table read
        ) { rs ->
            rs.toDataFrame()
        }
        // anti-pattern: SQL access and transformation logic are mixed in one service method
        return df
            .filter { amount > BigDecimal.ZERO }
            .groupBy("region")
            .aggregate {
                sum("amount")
            }
    }
}
```
or
```kotlin
interface SalesRepository {
    fun findAll(): DataFrame<*> // anti-pattern: unbounded, untyped full-table load
}
```
Possible API changes:
```kotlin
inline fun <reified T : Any> JdbcTemplate.readDataFrame(
    sql: String,
    limit: Int? = null,
    inferNullability: Boolean = true,
    dbType: DbType? = null
): DataFrame<T>

fun NamedParameterJdbcTemplate.readDataFrame(
    sql: String,
    params: Map<String, Any?>,
    limit: Int? = null,
    inferNullability: Boolean = true,
    dbType: DbType? = null
): AnyFrame

inline fun <reified T : Any> NamedParameterJdbcTemplate.readDataFrame(
    sql: String,
    params: Map<String, Any?>,
    limit: Int? = null,
    inferNullability: Boolean = true,
    dbType: DbType? = null
): DataFrame<T>

// A possible implementation for the untyped JdbcTemplate overload:
fun JdbcTemplate.readDataFrame(
    sql: String,
    limit: Int?,
    inferNullability: Boolean,
    dbType: DbType?
): AnyFrame =
    checkNotNull(
        query(sql) { rs ->
            DataFrame.readResultSet(
                resultSet = rs,
                dbType = dbType ?: DbType.from(rs.metaData),
                limit = limit,
                inferNullability = inferNullability
            )
        }
    )
```
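Assuming the proposed extensions above, a call site could look like this sketch. None of the `readDataFrame` functions exist yet; the names and parameters follow the proposal, and the table and column names are illustrative:

```kotlin
@Repository
class OrdersRepository(
    private val jdbc: NamedParameterJdbcTemplate
) {
    // Sketch only: readDataFrame is the proposed extension, not a shipped API
    fun largeOrders(minTotal: BigDecimal): AnyFrame =
        jdbc.readDataFrame(
            sql = "select id, customer, total from orders where total >= :minTotal",
            params = mapOf("minTotal" to minTotal),
            limit = 10_000,          // guard against unbounded reads
            inferNullability = true  // derive column nullability from the data
        )
}
```

Named parameters keep the SQL layer on Spring's side, while the resulting `AnyFrame` hands tabular logic to DataFrame, matching the separation described above.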
Architecture notes:
Spring manages:
- DataSource
- Connection
- Transactions
DataFrame manages:
- Schema inference
- Type mapping
- In-memory tabular operations
Spring JDBC extensions:
- Glue code only
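Because the proposed extensions would execute through `JdbcTemplate`, they would automatically participate in Spring-managed connections and transactions. A sketch, again assuming the proposed `readDataFrame` signature from above:

```kotlin
@Service
class SnapshotService(private val jdbc: JdbcTemplate) {

    @Transactional(readOnly = true)
    fun consistentSnapshot(): Pair<AnyFrame, AnyFrame> {
        // Both reads run on the same transaction-bound connection that Spring manages;
        // DataFrame only ever sees the ResultSet (sketch, proposed API)
        val sales = jdbc.readDataFrame("select * from sales", limit = null, inferNullability = true, dbType = null)
        val refunds = jdbc.readDataFrame("select * from refunds", limit = null, inferNullability = true, dbType = null)
        return sales to refunds
    }
}
```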
My thoughts on why this should be done after a possible broader Spring integration
Top-5 UX Problems for Spring Users
1. **Duplicate configuration.** Spring applications already define a `DataSource`, but KDF requires a separate `DbConnectionConfig`.
2. **Connection lifecycle mismatch.** Spring manages connections and transactions, while KDF expects users to reason about connection handling explicitly.
3. **Extra cognitive load (`DbType`).** Spring users must manually think about `DbType`, even though the database is already known and configured.
4. **Unclear best practices.** Users are unsure whether they are integrating KDF with Spring in a correct and idiomatic way.
5. **Perception mismatch.** The lack of Spring-friendly entry points creates the impression that KDF is not designed for Spring-based applications.
Inspired by the functionality added in https://x.com/intellijidea/status/2000566592384422334