Hms standalone rest server with Spring Boot #6327

Open
difin wants to merge 1 commit into apache:master from difin:hms-standalone-rest-server-with-spring-boot

@difin
Contributor

@difin difin commented Feb 19, 2026

What changes were proposed in this pull request?

The Standalone REST Catalog Server is reimplemented to use Spring Boot instead of plain Java:

  • Server framework – Uses Spring Boot with an embedded Jetty server instead of raw servlet wiring.
  • Health checks – Adds Actuator liveness and readiness probes; readiness verifies HMS connectivity via Thrift.
  • Observability – Exposes Prometheus metrics for Kubernetes HPA and monitoring.
  • Configuration – Keeps port and other settings in MetastoreConf but bridges them into Spring (e.g., via system properties) so Spring Boot uses the configured port.
  • Graceful shutdown – Uses Spring Boot’s shutdown handling with a configurable timeout.
  • Standalone packaging – Adds a spring-boot-maven-plugin “exec” JAR for running the server as a standalone process.
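
The configuration bridging mentioned in the list above can be sketched roughly as follows. This is only an illustration, not the PR's code: the class `PortBridge`, the method `bridgePort`, and the use of a plain `Map` in place of `MetastoreConf` are assumptions made to keep the example self-contained. The key `metastore.catalog.servlet.port` and the default `8080` are taken from the properties snippet quoted later in this review, and `server.port` is the standard Spring Boot property.

```java
import java.util.Map;

final class PortBridge {
    // Key and default taken from the properties snippet in this PR; the Map is a
    // hypothetical stand-in for MetastoreConf.
    static final String METASTORE_PORT_KEY = "metastore.catalog.servlet.port";
    static final int DEFAULT_PORT = 8080;

    /**
     * Resolves the port from the given settings and publishes it as the
     * "server.port" system property, which Spring Boot reads at startup.
     */
    public static int bridgePort(Map<String, String> settings) {
        String raw = settings.get(METASTORE_PORT_KEY);
        int port = (raw == null || raw.isBlank())
            ? DEFAULT_PORT
            : Integer.parseInt(raw.trim());
        System.setProperty("server.port", Integer.toString(port));
        return port;
    }

    public static void main(String[] args) {
        int port = bridgePort(Map.of(METASTORE_PORT_KEY, "9001"));
        System.out.println("server.port=" + System.getProperty("server.port")
            + " (resolved " + port + ")");
    }
}
```

For this to take effect, the bridge would have to run before `SpringApplication.run(...)`, since the embedded server port is resolved while the application context starts.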

Why are the changes needed?

Spring Boot improves how the Standalone REST Catalog Server is run and operated:

  • Kubernetes support – Liveness and readiness probes (/actuator/health/*) let Kubernetes reliably route traffic and restart unhealthy pods. Readiness includes an actual HMS connectivity check instead of a simple config check.
  • Observability – Prometheus metrics enable HPA, dashboards, and alerting, which is standard for production deployments.
  • Operational behavior – Graceful shutdown and a well-defined lifecycle reduce the chance of dropped requests during restarts.
  • Maintainability – Spring Boot replaces custom servlet wiring and configuration, and aligns with common patterns for cloud-native Java services.

Does this PR introduce any user-facing change?

If the standalone REST Catalog server is deployed in Kubernetes:

  • Liveness/readiness probes – Configure HTTP probes to use the new actuator endpoints:
    -- Liveness: httpGet: /actuator/health/liveness
    -- Readiness: httpGet: /actuator/health/readiness
  • Metrics/HPA – Prometheus scraping or custom metrics use /actuator/prometheus.

How was this patch tested?

Integration tests in TestStandaloneRESTCatalogServer and TestStandaloneRESTCatalogServerJwtAuth run the Spring Boot standalone HMS REST catalog server and verify liveness/readiness probes, Prometheus metrics, REST catalog operations, and JWT auth with Keycloak (Testcontainers).

@deniskuzZ
Member

	Suppressed: java.lang.NullPointerException: Cannot invoke "org.keycloak.admin.client.Keycloak.close()" because "this.keycloak" is null
		at org.apache.iceberg.rest.extension.OAuth2AuthorizationServer.stop(OAuth2AuthorizationServer.java:181)
		at org.apache.iceberg.rest.extension.HiveRESTCatalogServerExtension.afterAll(HiveRESTCatalogServerExtension.java:124)
		... 1 more
Caused by: java.lang.ClassNotFoundException: jakarta.annotation.Priority

webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT,
properties = {
"spring.main.allow-bean-definition-overriding=true",
"spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration"
Member

do we need these properties?

Contributor Author

@difin difin Feb 25, 2026

this property is needed: "spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration"

without it tests fail with the following:

Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.zaxxer.hikari.HikariDataSource]: Factory method 'dataSource' threw exception; nested exception is org.springframework.boot.autoconfigure.jdbc.DataSourceProperties$DataSourceBeanCreationException: Failed to determine a suitable driver class
	at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185)
	at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:641)
	... 140 more
Caused by: org.springframework.boot.autoconfigure.jdbc.DataSourceProperties$DataSourceBeanCreationException: Failed to determine a suitable driver class
	at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.determineDriverClassName(DataSourceProperties.java:186)
	at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.initializeDataSourceBuilder(DataSourceProperties.java:125)
	at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration.createDataSource(DataSourceConfiguration.java:48)
	at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration$Hikari.dataSource(DataSourceConfiguration.java:90)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
	... 141 more

}

private static void deleteDirectoryStatic(File directory) {
if (directory.exists()) {
Member

@deniskuzZ deniskuzZ Feb 20, 2026

Don't we have a utility for the recursive delete, e.g. FileUtils.deleteDirectory(directory)?

Contributor Author

Fixed

String livenessUrl = "http://localhost:" + port + "/actuator/health/liveness";
try (CloseableHttpClient httpClient = HttpClients.createDefault()) {
HttpGet request = new HttpGet(livenessUrl);
try (CloseableHttpResponse response = httpClient.execute(request)) {
Member

you can put both under same try and avoid nesting

Contributor Author

Refactored all nested try blocks to use a single try-with-resources.
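
As an illustration of the refactoring discussed here, the sketch below declares two resources in a single try-with-resources statement. `Tracked` is a hypothetical stand-in for the HTTP client and response used in the tests; it only records open and close events so the ordering is visible.

```java
import java.util.ArrayList;
import java.util.List;

final class SingleTryDemo {
    /** Hypothetical stand-in for a closeable client or response; logs its lifecycle. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Opens both resources in one try header and returns the recorded event order. */
    public static List<String> run() {
        List<String> log = new ArrayList<>();
        // Both resources share one try block; no nesting is required.
        try (Tracked client = new Tracked("client", log);
             Tracked response = new Tracked("response", log)) {
            log.add("use " + client.name + "+" + response.name);
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Resources declared in the same header are closed in reverse order of declaration, so the response is closed before the client, which matches the behavior of the nested form.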

LOG.info("=== Test: Health Check ===");
String healthUrl = "http://localhost:" + restCatalogServer.getPort() + "/health";
public void testPrometheusMetrics() throws Exception {
Member

@deniskuzZ deniskuzZ Feb 20, 2026

do we use spring actuator? nice :)

String configUrl = restCatalogServer.getRestEndpoint() + "/v1/config";

String configUrl = "http://localhost:" + port + "/iceberg/v1/config";
Member

@deniskuzZ deniskuzZ Feb 20, 2026

can we use URI template and apply format with port and path?

Contributor Author

Added a URI template and helper method.
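
A URI template with a helper method might look like the sketch below. The names `Endpoints`, `URI_TEMPLATE`, and `endpoint` are hypothetical and not taken from the PR; only the URL shape comes from the test lines quoted above.

```java
import java.util.Locale;

final class Endpoints {
    // Hypothetical template: one place that defines the URL shape for all test requests.
    private static final String URI_TEMPLATE = "http://localhost:%d/%s";

    /** Builds a localhost URL for the given port and path; a leading slash in the path is tolerated. */
    public static String endpoint(int port, String path) {
        String relative = path.startsWith("/") ? path.substring(1) : path;
        // Locale.ROOT keeps the port digits locale-independent.
        return String.format(Locale.ROOT, URI_TEMPLATE, port, relative);
    }

    public static void main(String[] args) {
        System.out.println(endpoint(8080, "iceberg/v1/config"));
        System.out.println(endpoint(8080, "/actuator/health/liveness"));
    }
}
```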

@deniskuzZ
Member

deniskuzZ commented Feb 20, 2026

to support OAuth / JWT Authentication don't we need SecurityConfig?

@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.oauth2ResourceServer()
            .jwt(); // validate JWT tokens
    }
}

cc @okumin

<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
<version>${spring-boot.version}</version>
Member

should we move versions to dependency management?

Contributor Author

done

<repositories>
<repository>
<id>central</id>
<url>https://repo.maven.apache.org/maven2</url>
Member

Hm, I am not sure about this.
cc @zabetak, @abstractdog: should we whitelist Spring dependencies in Nexus? I originally thought that if a dependency is missing from the local repository, we always fetch it from Central.

Contributor

I guess it should work without defining this repository, spring artifacts are present in central:
https://mvnrepository.com/artifact/org.springframework.boot

Member

Yeah, the repo definition here is strange; I don't think we need it.

Contributor Author

Yes, I added that to check whether the build passes with these definitions, not for production.
It wasn't able to find spring boot dependencies through the wonder repo.

Contributor Author

@difin difin Feb 20, 2026

I guess it should work without defining this repository, spring artifacts are present in central: https://mvnrepository.com/artifact/org.springframework.boot

@abstractdog no it didn't work, you can see that previous build failed saying these dependencies were missing in wonder.

Contributor

@abstractdog abstractdog Feb 20, 2026

I know it sounds silly, but the same happens when I add a new Tez staging artifact repo: it magically fails to fetch all artifacts 2-3 times, and then it succeeds. I still think you should remove this, and we can check the wonder Artifactory logs to see what exactly happened when it tried to connect to the remote Maven Central (if it tried at all :D).
I'll check whether I can give you easy access to the Artifactory logs (e.g. via kubeconfig).

Contributor Author

It passed on rerun after removing these custom changes.
Thanks, @abstractdog.

* Used by Kubernetes readiness probes to determine if the server is ready to accept traffic.
*/
@Component
public class HMSReadinessHealthIndicator implements HealthIndicator {
Member

should we move it under the health package?

Contributor Author

Done

*
* <p>Multiple instances can run behind a Kubernetes Service for load balancing.
*/
@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
Member

@deniskuzZ deniskuzZ Feb 20, 2026

can we refactor and extract configuration like

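// Spring has no @SpringConfiguration; fully qualified because Configuration below is Hadoop's class
@org.springframework.context.annotation.Configuration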
public class IcebergCatalogConfiguration {
    private static final Logger LOG = LoggerFactory.getLogger(IcebergCatalogConfiguration.class);

    private final Configuration conf;

    public IcebergCatalogConfiguration(Configuration conf) {
        this.conf = conf;
    }

    @Bean
    public Configuration hadoopConfiguration() {
        return conf;
    }

    @Bean
    public ServletRegistrationBean<HttpServlet> restCatalogServlet() {
        // Determine servlet path and port
        String servletPath = MetastoreConf.getVar(conf, ConfVars.ICEBERG_CATALOG_SERVLET_PATH);
        if (servletPath == null || servletPath.isEmpty()) {
            servletPath = "iceberg";
            MetastoreConf.setVar(conf, ConfVars.ICEBERG_CATALOG_SERVLET_PATH, servletPath);
        }

        int port = MetastoreConf.getIntVar(conf, ConfVars.CATALOG_SERVLET_PORT);
        if (port == 0) {
            port = 8080;
            MetastoreConf.setLongVar(conf, ConfVars.CATALOG_SERVLET_PORT, port);
        }

        LOG.info("Creating REST Catalog servlet at /{}", servletPath);

        // Create servlet from Iceberg factory
        var descriptor = HMSCatalogFactory.createServlet(conf);
        if (descriptor == null || descriptor.getServlet() == null) {
            throw new IllegalStateException("Failed to create Iceberg REST Catalog servlet");
        }

        return new ServletRegistrationBean<>(descriptor.getServlet(), "/" + servletPath + "/*");
    }
}

and then

@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class StandaloneRESTCatalogServer {

    private static final Logger LOG = LoggerFactory.getLogger(StandaloneRESTCatalogServer.class);

    @Bean
    public Configuration hadoopConfiguration() {
        Configuration conf = MetastoreConf.newMetastoreConf();
        // Load system properties
        for (String prop : System.getProperties().stringPropertyNames()) {
            conf.set(prop, System.getProperty(prop));
        }

        // Validate mandatory config
        String thriftUris = MetastoreConf.getVar(conf, ConfVars.THRIFT_URIS);
        if (thriftUris == null || thriftUris.isEmpty()) {
            throw new IllegalArgumentException("metastore.thrift.uris must be configured to connect to HMS");
        }

        LOG.info("Hadoop Configuration initialized, HMS Thrift URIs: {}", thriftUris);
        return conf;
    }

    public static void main(String[] args) {
        SpringApplication.run(StandaloneRESTCatalogServer.class, args);
        LOG.info("Standalone REST Catalog Server started successfully");
    }
}

Contributor Author

done


# Server configuration
# Port is set via MetastoreConf.CATALOG_SERVLET_PORT
server.port=${metastore.catalog.servlet.port:8080}
Member

should we use application.yml ?

Contributor Author

done

<repositories>
<repository>
<id>central</id>
<url>https://repo.maven.apache.org/maven2</url>
Member

same nexus related issue

Contributor Author

Yes, before adding this, the build on Jenkins failed because it could not find the Spring Boot dependencies.

Contributor

In that case we should configure Jenkins so that the local repository mirrors the required remote one; let me check the current configuration.

Contributor

@abstractdog abstractdog Feb 20, 2026

we have all these remote repos configured for the single virtual repository (wonder), so it should work:
[Screenshot 2026-02-20 at 16 48 29]

I would try to remove the repo from pom.xml, and we'll investigate in the next round what's happening
I can even check the logs of the artifactory if you need, in case of errors

Contributor Author

Passed on rerun without these custom settings.

<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-servlets</artifactId>
<version>${jetty.version}</version>
Member

@deniskuzZ deniskuzZ Feb 20, 2026

isn't it defined in dependency management somewhere? if not, maybe it should be

Contributor Author

done

<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-logging</artifactId>
Member

why exclude logging?

Contributor Author

spring-boot-starter-web pulls in spring-boot-starter-logging, which uses Logback as the SLF4J implementation.

The exclusion is there so Hive can keep using Log4j2 instead of Spring Boot’s default Logback.

</properties>
<dependencyManagement>
<dependencies>
<!-- Align all Jetty artifacts to Hive's version; Spring Boot 2.7.18 defaults to 9.4.53 -->
Member

@deniskuzZ deniskuzZ Feb 20, 2026

should we align hive to spring deps maybe?

Contributor Author

@difin difin Mar 3, 2026

This is a much broader upgrade.
Currently Hive uses Spring 5.3.39, Jetty 9.4.57.v20241219 and servlet API 4.0.
Spring Boot 2.7.18 is compatible with Spring 5.3.39.
Moving to Spring Boot 3.0+ requires Spring 6.0+, Jetty 11+, Servlet API 5.0.

<executions>
<execution>
<goals>
<goal>repackage</goal>
Member

@deniskuzZ deniskuzZ Feb 20, 2026

do we even need to define the goal?

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <mainClass>org.apache.iceberg.rest.standalone.StandaloneRESTCatalogServer</mainClass>
    </configuration>
</plugin>

Contributor Author

This goal is needed for creating an executable Spring Boot fat JAR.
It needs to be added in this fashion because this module uses Hive’s parent POM instead of spring-boot-starter-parent, so the Spring Boot plugin does not get any default lifecycle bindings.

@difin difin force-pushed the hms-standalone-rest-server-with-spring-boot branch from b48799e to d27f6b3 Compare February 25, 2026 22:45
@difin difin force-pushed the hms-standalone-rest-server-with-spring-boot branch from d27f6b3 to c8c91f2 Compare March 4, 2026 00:59
@difin difin force-pushed the hms-standalone-rest-server-with-spring-boot branch from c8c91f2 to 19640da Compare March 4, 2026 14:50
@difin difin force-pushed the hms-standalone-rest-server-with-spring-boot branch from 19640da to a054c87 Compare March 4, 2026 20:56
@difin
Contributor Author

difin commented Mar 4, 2026

to support OAuth / JWT Authentication don't we need SecurityConfig?

@EnableWebSecurity
public class SecurityConfig extends WebSecurityConfigurerAdapter {
    @Override
    protected void configure(HttpSecurity http) throws Exception {
        http.oauth2ResourceServer()
            .jwt(); // validate JWT tokens
    }
}

We don’t need Spring Security for JWT/OAuth2 here. Auth is handled by the Hive metastore’s ServletSecurity, which wraps the Iceberg REST Catalog servlet in HMSCatalogFactory. That layer extracts the Bearer token and validates it with SimpleJWTAuthenticator (JWT) or OAuth2Authenticator (OAuth2). This is the same path used by the embedded HMS REST catalog, so the standalone server reuses that logic instead of introducing a separate Spring Security filter chain. Adding Spring Security would duplicate and potentially conflict with the existing auth handling.

I also added JWT integration tests for the Standalone REST Catalog server in TestStandaloneRESTCatalogServerJwtAuth, using Keycloak (Testcontainers) as the token issuer and the same ServletSecurity / SimpleJWTAuthenticator pipeline as the embedded HMS REST catalog.
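
To make the first step of that pipeline concrete, here is a minimal sketch of extracting a Bearer token from an `Authorization` header value. It is not Hive's `ServletSecurity` code: `BearerTokens` and `extract` are hypothetical names, and the validation step (signature, issuer, and expiry checks performed by the authenticators named above) is deliberately left out.

```java
import java.util.Optional;

final class BearerTokens {
    private static final String PREFIX = "Bearer ";

    /**
     * Returns the token carried by an Authorization header value, or empty if
     * the header is missing, uses another scheme, or carries no token.
     */
    public static Optional<String> extract(String authorizationHeader) {
        if (authorizationHeader == null) {
            return Optional.empty();
        }
        String header = authorizationHeader.trim();
        // Auth scheme names are case-insensitive, so compare the prefix ignoring case.
        if (!header.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            return Optional.empty();
        }
        String token = header.substring(PREFIX.length()).trim();
        return token.isEmpty() ? Optional.empty() : Optional.of(token);
    }

    public static void main(String[] args) {
        System.out.println(extract("Bearer abc.def.ghi").orElse("<none>"));
        System.out.println(extract("Basic dXNlcjpwdw==").orElse("<none>"));
    }
}
```

A request without a usable token would typically be rejected before it reaches the catalog servlet; how the real filter responds is not shown in this conversation.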

@difin difin changed the title from "Hms standalone rest server with spring boot" to "Hms standalone rest server with Spring Boot" Mar 5, 2026
@difin difin force-pushed the hms-standalone-rest-server-with-spring-boot branch from a054c87 to adc0bc2 Compare March 5, 2026 14:53
@okumin
Contributor

okumin commented Mar 7, 2026

to support OAuth / JWT Authentication don't we need SecurityConfig?

DISCLAIMER: I might not be understanding Spring with Servlet correctly.
We may eventually be able to use Spring's tooling once everything is migrated to Spring. As of today, if Spring can leverage ServletSecurity, we don't need to move there immediately.
