Hms standalone rest server with Spring Boot#6327
Conversation
2e24788 to
196fd31
Compare
|
| webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT, | ||
| properties = { | ||
| "spring.main.allow-bean-definition-overriding=true", | ||
| "spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration" |
There was a problem hiding this comment.
this property is needed: "spring.autoconfigure.exclude=org.springframework.boot.autoconfigure.jdbc.DataSourceAutoConfiguration"
without it tests fail with the following:
Caused by: org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.zaxxer.hikari.HikariDataSource]: Factory method 'dataSource' threw exception; nested exception is org.springframework.boot.autoconfigure.jdbc.DataSourceProperties$DataSourceBeanCreationException: Failed to determine a suitable driver class
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:185)
at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:641)
... 140 more
Caused by: org.springframework.boot.autoconfigure.jdbc.DataSourceProperties$DataSourceBeanCreationException: Failed to determine a suitable driver class
at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.determineDriverClassName(DataSourceProperties.java:186)
at org.springframework.boot.autoconfigure.jdbc.DataSourceProperties.initializeDataSourceBuilder(DataSourceProperties.java:125)
at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration.createDataSource(DataSourceConfiguration.java:48)
at org.springframework.boot.autoconfigure.jdbc.DataSourceConfiguration$Hikari.dataSource(DataSourceConfiguration.java:90)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
... 141 more
| } | ||
|
|
||
| private static void deleteDirectoryStatic(File directory) { | ||
| if (directory.exists()) { |
There was a problem hiding this comment.
don't we have utils to do the recursive delete FileUtils.deleteDirectory(directory)
| String livenessUrl = "http://localhost:" + port + "/actuator/health/liveness"; | ||
| try (CloseableHttpClient httpClient = HttpClients.createDefault()) { | ||
| HttpGet request = new HttpGet(livenessUrl); | ||
| try (CloseableHttpResponse response = httpClient.execute(request)) { |
There was a problem hiding this comment.
you can put both under same try and avoid nesting
There was a problem hiding this comment.
Refactored all nested try blocks to use a single try-with-resources.
| LOG.info("=== Test: Health Check ==="); | ||
| String healthUrl = "http://localhost:" + restCatalogServer.getPort() + "/health"; | ||
| public void testPrometheusMetrics() throws Exception { |
There was a problem hiding this comment.
do we use spring actuator? nice :)
| String configUrl = restCatalogServer.getRestEndpoint() + "/v1/config"; | ||
|
|
||
| String configUrl = "http://localhost:" + port + "/iceberg/v1/config"; |
There was a problem hiding this comment.
can we use URI template and apply format with port and path?
There was a problem hiding this comment.
Added a URI template and helper method.
|
to support OAuth / JWT Authentication don't we need SecurityConfig? cc @okumin |
| <dependency> | ||
| <groupId>org.springframework.boot</groupId> | ||
| <artifactId>spring-boot-starter-actuator</artifactId> | ||
| <version>${spring-boot.version}</version> |
There was a problem hiding this comment.
should we move versions to dependency management?
itests/qtest-iceberg/pom.xml
Outdated
| <repositories> | ||
| <repository> | ||
| <id>central</id> | ||
| <url>https://repo.maven.apache.org/maven2</url> |
There was a problem hiding this comment.
em, i am not sure about this
cc @zabetak, @abstractdog should we whitelist spring dependencies in Nexus? I originally thought that if a dependency is missing from the local repository, we always fetch it from Central."
There was a problem hiding this comment.
I guess it should work without defining this repository, spring artifacts are present in central:
https://mvnrepository.com/artifact/org.springframework.boot
There was a problem hiding this comment.
Yeah, the repo definition here is strange don't think we need it.
There was a problem hiding this comment.
Yes, I added that to check if the build passes with this definitions, not for production.
It wasn't able to find spring boot dependencies through the wonder repo.
There was a problem hiding this comment.
I guess it should work without defining this repository, spring artifacts are present in central: https://mvnrepository.com/artifact/org.springframework.boot
@abstractdog no it didn't work, you can see that previous build failed saying these dependencies were missing in wonder.
There was a problem hiding this comment.
I know it sounds silly, but the same happens when I add a new tez staging artifact repo, it magically fails to fetch all artifacts for 2-3 times, and then it succeeds: I still think you should remove this, and we can check wonder artifactory logs what exactly happened when it tried to connect to the remote maven central (if it tried at all :D )
I'll check if I can provide easy access to you to the artifactory logs (e.g. kubeconfig)
There was a problem hiding this comment.
It passed on rerun after removing these custom changes.
Thanks, @abstractdog.
| * Used by Kubernetes readiness probes to determine if the server is ready to accept traffic. | ||
| */ | ||
| @Component | ||
| public class HMSReadinessHealthIndicator implements HealthIndicator { |
There was a problem hiding this comment.
should we move it under the health package?
| * | ||
| * <p>Multiple instances can run behind a Kubernetes Service for load balancing. | ||
| */ | ||
| @SpringBootApplication(exclude = DataSourceAutoConfiguration.class) |
There was a problem hiding this comment.
can we refactor and extract configuration like
@SpringConfiguration
public class IcebergCatalogConfiguration {
private static final Logger LOG = LoggerFactory.getLogger(IcebergCatalogConfiguration.class);
private final Configuration conf;
public IcebergCatalogConfiguration(Configuration conf) {
this.conf = conf;
}
@Bean
public Configuration hadoopConfiguration() {
return conf;
}
@Bean
public ServletRegistrationBean<HttpServlet> restCatalogServlet() {
// Determine servlet path and port
String servletPath = MetastoreConf.getVar(conf, ConfVars.ICEBERG_CATALOG_SERVLET_PATH);
if (servletPath == null || servletPath.isEmpty()) {
servletPath = "iceberg";
MetastoreConf.setVar(conf, ConfVars.ICEBERG_CATALOG_SERVLET_PATH, servletPath);
}
int port = MetastoreConf.getIntVar(conf, ConfVars.CATALOG_SERVLET_PORT);
if (port == 0) {
port = 8080;
MetastoreConf.setLongVar(conf, ConfVars.CATALOG_SERVLET_PORT, port);
}
LOG.info("Creating REST Catalog servlet at /{}", servletPath);
// Create servlet from Iceberg factory
var descriptor = HMSCatalogFactory.createServlet(conf);
if (descriptor == null || descriptor.getServlet() == null) {
throw new IllegalStateException("Failed to create Iceberg REST Catalog servlet");
}
return new ServletRegistrationBean<>(descriptor.getServlet(), "/" + servletPath + "/*");
}
}
and then
@SpringBootApplication(exclude = DataSourceAutoConfiguration.class)
public class StandaloneRESTCatalogServer {
private static final Logger LOG = LoggerFactory.getLogger(StandaloneRESTCatalogServer.class);
@Bean
public Configuration hadoopConfiguration() {
Configuration conf = MetastoreConf.newMetastoreConf();
// Load system properties
for (String prop : System.getProperties().stringPropertyNames()) {
conf.set(prop, System.getProperty(prop));
}
// Validate mandatory config
String thriftUris = MetastoreConf.getVar(conf, ConfVars.THRIFT_URIS);
if (thriftUris == null || thriftUris.isEmpty()) {
throw new IllegalArgumentException("metastore.thrift.uris must be configured to connect to HMS");
}
LOG.info("Hadoop Configuration initialized, HMS Thrift URIs: {}", thriftUris);
return conf;
}
public static void main(String[] args) {
SpringApplication.run(StandaloneRESTCatalogServer.class, args);
LOG.info("Standalone REST Catalog Server started successfully");
}
}
|
|
||
| # Server configuration | ||
| # Port is set via MetastoreConf.CATALOG_SERVLET_PORT | ||
| server.port=${metastore.catalog.servlet.port:8080} |
There was a problem hiding this comment.
should we use application.yml ?
| <repositories> | ||
| <repository> | ||
| <id>central</id> | ||
| <url>https://repo.maven.apache.org/maven2</url> |
There was a problem hiding this comment.
yes, build on Jenkins failed being unable to find spring boot dependencies before adding this.
There was a problem hiding this comment.
we should set up jenkins in this case, where the local repository has a desired mirrored remote one, let me check the current configuration
There was a problem hiding this comment.
There was a problem hiding this comment.
Passed on rerun without these custom settings.
| <dependency> | ||
| <groupId>org.eclipse.jetty</groupId> | ||
| <artifactId>jetty-servlets</artifactId> | ||
| <version>${jetty.version}</version> |
There was a problem hiding this comment.
isn't it defined in dependency management somewhere? if not, maybe it should be
| <exclusions> | ||
| <exclusion> | ||
| <groupId>org.springframework.boot</groupId> | ||
| <artifactId>spring-boot-starter-logging</artifactId> |
There was a problem hiding this comment.
spring-boot-starter-web pulls in spring-boot-starter-logging, which uses Logback as the SLF4J implementation.
The exclusion is there so Hive can keep using Log4j2 instead of Spring Boot’s default Logback.
| </properties> | ||
| <dependencyManagement> | ||
| <dependencies> | ||
| <!-- Align all Jetty artifacts to Hive's version; Spring Boot 2.7.18 defaults to 9.4.53 --> |
There was a problem hiding this comment.
should we align hive to spring deps maybe?
There was a problem hiding this comment.
This is a much broader upgrade.
Currently Hive uses Spring 5.3.39, Jetty 9.4.57.v20241219 and servlet API 4.0.
Spring boot 2.7.18 is compatible with Spring 5.3.39.
Moving to Spring Boot 3.0+ requires Spring 6.0+, Jetty 11+, Servlet API 5.0.
| <executions> | ||
| <execution> | ||
| <goals> | ||
| <goal>repackage</goal> |
There was a problem hiding this comment.
do we even need to define the goal?
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<configuration>
<mainClass>org.apache.iceberg.rest.standalone.StandaloneRESTCatalogServer</mainClass>
</configuration>
</plugin>
There was a problem hiding this comment.
This goal is needed for creating an executable Spring Boot fat JAR.
It needs to be added in this fashion because this module uses Hive’s parent POM instead of spring-boot-starter-parent, so the Spring Boot plugin does not get any default lifecycle bindings.
b48799e to
d27f6b3
Compare
d27f6b3 to
c8c91f2
Compare
c8c91f2 to
19640da
Compare
19640da to
a054c87
Compare
We don’t need Spring Security for JWT/OAuth2 here. Auth is handled by the Hive metastore’s I also added JWT integration tests for the Standalone REST Catalog server in |
a054c87 to
adc0bc2
Compare
|
DISCLAIMER: I might not be understanding Spring with Servlet correctly. |




What changes were proposed in this pull request?
The Standalone REST Catalog Server is reimplemented to use Spring Boot instead of plain Java:
Standalone packaging – Adds a spring-boot-maven-plugin “exec” JAR for running the server as a standalone process.
Why are the changes needed?
Spring Boot improves how the Standalone REST Catalog Server is run and operated:
Does this PR introduce any user-facing change?
If the standalone REST Catalog server is deployed in Kubernetes:
-- Liveness: httpGet: /actuator/health/liveness
-- Readiness: httpGet: /actuator/health/readiness
How was this patch tested?
Integration tests in
TestStandaloneRESTCatalogServerandTestStandaloneRESTCatalogServerJwtAuthrun the Spring Boot standalone HMS REST catalog server and verify liveness/readiness probes, Prometheus metrics, REST catalog operations, and JWT auth with Keycloak (Testcontainers).