
Commit 85a01c2

AUTO: Docs repo sync - ScalarDL (#1056)
* AUTO: Sync ScalarDL docs in English to docs site repo
* Add docs for writing apps with HashStore and TableStore

Co-authored-by: josh-wong <joshua.wong@scalar-labs.com>
Co-authored-by: Josh Wong <23216828+josh-wong@users.noreply.github.com>
1 parent a178135 commit 85a01c2

4 files changed: +316 −23 lines
Lines changed: 112 additions & 0 deletions
---
tags:
  - Community
  - Enterprise
displayed_sidebar: docsEnglish
---

# Write a ScalarDL Application with HashStore

import JavadocLink from "/src/theme/JavadocLink.js";

This document explains how to write ScalarDL applications with HashStore. You will learn how to interact with ScalarDL HashStore in your applications, handle errors, and validate your data.

## Use the ScalarDL HashStore Client SDK

You have two options to interact with ScalarDL HashStore:

- Using [commands](scalardl-hashstore-command-reference.mdx), as shown in [Get Started with ScalarDL HashStore](getting-started-hashstore.mdx)
- Using the [HashStore Java Client SDK](https://javadoc.io/doc/com.scalar-labs/scalardl-hashstore-java-client-sdk/)

Using commands is a convenient way to try HashStore without writing an application. For building HashStore-based applications, however, the HashStore Client SDK is recommended because it runs more efficiently without launching a separate process for each operation.

The HashStore Client SDK is available on [Maven Central](https://central.sonatype.com/artifact/com.scalar-labs/scalardl-hashstore-java-client-sdk). You can install it in your application by using a build tool such as Gradle. For example, in Gradle, you can add the following dependency to `build.gradle`, replacing `<VERSION>` with the version of ScalarDL that you want to use.

```gradle
dependencies {
    implementation group: 'com.scalar-labs', name: 'scalardl-hashstore-java-client-sdk', version: '<VERSION>'
}
```
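If your project uses Maven rather than Gradle, the same artifact from Maven Central can be declared in `pom.xml` as follows. This is simply the Maven form of the coordinates above; replace `VERSION` in the same way.

```xml
<dependency>
  <groupId>com.scalar-labs</groupId>
  <artifactId>scalardl-hashstore-java-client-sdk</artifactId>
  <version>VERSION</version>
</dependency>
```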
The Client SDK APIs for HashStore are provided by a service class called <JavadocLink packageName="scalardl-hashstore-java-client-sdk" path="com/scalar/dl/hashstore/client/service" className="HashStoreClientService" />. The following is a code snippet that shows how to use `HashStoreClientService` to manage objects and collections. `HashStoreClientService` provides the same functionalities as the HashStore client commands shown in [Get Started with ScalarDL HashStore](getting-started-hashstore.mdx).

```java
// HashStoreClientServiceFactory should always be reused.
HashStoreClientServiceFactory factory = new HashStoreClientServiceFactory();

// HashStoreClientServiceFactory creates a new HashStoreClientService object in every create
// method call but reuses the internal objects and connections as much as possible for better
// performance and resource usage.
// "properties" is the path to the client configuration file.
HashStoreClientService service = factory.create(new ClientConfig(new File(properties)));
try {
  // Put the hash value of an object with metadata.
  String objectId = ...;
  String hash = ...;
  JsonNode metadata = ...;
  ExecutionResult result = service.putObject(objectId, hash, metadata);
} catch (ClientException e) {
  System.err.println(e.getStatusCode());
  System.err.println(e.getMessage());
}

factory.close();
```
:::note

You should always use `HashStoreClientServiceFactory` to create `HashStoreClientService` objects. `HashStoreClientServiceFactory` caches objects that are required to create `HashStoreClientService` and reuses them on the basis of the given configurations, so the `HashStoreClientServiceFactory` object should always be reused.

:::

For more information about `HashStoreClientServiceFactory` and `HashStoreClientService`, see the [`scalardl-hashstore-java-client-sdk` Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardl-hashstore-java-client-sdk/latest/index.html).

## Handle errors

If an error occurs in your application, the Client SDK will return an exception that carries a status code and an error message with an error code. You should check the status code and the error code to identify the cause of the error. For details, see [Status codes](how-to-write-applications.mdx#status-codes) and [Error codes](how-to-write-applications.mdx#error-codes).

### Implement error handling

The SDK throws <JavadocLink packageName="scalardl-java-client-sdk" path="com/scalar/dl/client/exception" className="ClientException" /> when an error occurs. You can handle errors by catching the exception as follows:

```java
HashStoreClientService service = ...;
try {
  // Interact with ScalarDL HashStore through a HashStoreClientService object.
} catch (ClientException e) {
  // e.getStatusCode() returns the status of the error.
}
```
## Validate your data

In ScalarDL, you occasionally need to validate your data to make sure all the data is in a valid state. You can learn the basics of how ScalarDL validates your data in [Write a ScalarDL Application in Java](how-to-write-applications.mdx#validate-your-data), so this section mainly describes how to perform validation in HashStore.

When validating [assets](data-modeling.mdx#asset) (objects and collections here) in HashStore, you only need to specify an object ID or a collection ID. Example code for validating an object is as follows:

```java
HashStoreClientService service = ...;
try {
  LedgerValidationResult result = service.validateObject("an_object_ID");
  // You can also specify an age range.
  // LedgerValidationResult result = service.validateObject("an_object_ID", startAge, endAge);
} catch (ClientException e) {
  // Handle the error by using e.getStatusCode() and e.getMessage().
}
```

Example code for validating a collection is as follows:

```java
HashStoreClientService service = ...;
try {
  LedgerValidationResult result = service.validateCollection("a_collection_ID");
  // You can also specify an age range.
  // LedgerValidationResult result = service.validateCollection("a_collection_ID", startAge, endAge);
} catch (ClientException e) {
  // Handle the error by using e.getStatusCode() and e.getMessage().
}
```

:::note

HashStore internally assigns a dedicated asset ID to an asset that represents an object or a collection. The asset ID consists of a prefix that shows the asset type and a key; for example, the prefix `o_` and an object ID are used for objects, and the prefix `c_` and a collection ID are used for collections. You will see such raw asset IDs in `AssetProof` in `LedgerValidationResult`.

:::
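The prefix scheme described in the note above can be sketched as follows. This is purely illustrative; the helper names are hypothetical and are not part of the SDK API.

```java
public class AssetIdExample {
  // Illustrative only: HashStore derives raw asset IDs by combining a
  // type prefix with the user-facing ID, as described in the note above.
  static String objectAssetId(String objectId) {
    return "o_" + objectId;
  }

  static String collectionAssetId(String collectionId) {
    return "c_" + collectionId;
  }

  public static void main(String[] args) {
    // These raw IDs are the form you would see in AssetProof entries.
    System.out.println(objectAssetId("an_object_ID"));        // o_an_object_ID
    System.out.println(collectionAssetId("a_collection_ID")); // c_a_collection_ID
  }
}
```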
Lines changed: 112 additions & 0 deletions
---
tags:
  - Community
  - Enterprise
displayed_sidebar: docsEnglish
---

# Write a ScalarDL Application with TableStore

import JavadocLink from "/src/theme/JavadocLink.js";

This document explains how to write ScalarDL applications with TableStore. You will learn how to interact with ScalarDL TableStore in your applications, handle errors, and validate your data.

## Use the ScalarDL TableStore Client SDK

You have two options to interact with ScalarDL TableStore:

- Using [commands](scalardl-tablestore-command-reference.mdx), as shown in [Get Started with ScalarDL TableStore](getting-started-tablestore.mdx)
- Using the [TableStore Java Client SDK](https://javadoc.io/doc/com.scalar-labs/scalardl-tablestore-java-client-sdk/)

Using commands is a convenient way to try TableStore without writing an application. For building TableStore-based applications, however, the TableStore Client SDK is recommended because it runs more efficiently without launching a separate process for each operation.

The TableStore Client SDK is available on [Maven Central](https://central.sonatype.com/artifact/com.scalar-labs/scalardl-tablestore-java-client-sdk). You can install it in your application by using a build tool such as Gradle. For example, in Gradle, you can add the following dependency to `build.gradle`, replacing `<VERSION>` with the version of ScalarDL that you want to use.

```gradle
dependencies {
    implementation group: 'com.scalar-labs', name: 'scalardl-tablestore-java-client-sdk', version: '<VERSION>'
}
```
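If your project uses the Gradle Kotlin DSL (`build.gradle.kts`) instead of the Groovy DSL shown above, the same dependency can be declared with the standard single-string notation:

```kotlin
dependencies {
    implementation("com.scalar-labs:scalardl-tablestore-java-client-sdk:<VERSION>")
}
```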
The Client SDK APIs for TableStore are provided by a service class called <JavadocLink packageName="scalardl-tablestore-java-client-sdk" path="com/scalar/dl/tablestore/client/service" className="TableStoreClientService" />. The following is a code snippet that shows how to use `TableStoreClientService` to manage table authenticity. `TableStoreClientService` provides the same functionalities as the TableStore client commands shown in [Get Started with ScalarDL TableStore](getting-started-tablestore.mdx).

```java
// TableStoreClientServiceFactory should always be reused.
TableStoreClientServiceFactory factory = new TableStoreClientServiceFactory();

// TableStoreClientServiceFactory creates a new TableStoreClientService object in every create
// method call but reuses the internal objects and connections as much as possible for better
// performance and resource usage.
// "properties" is the path to the client configuration file.
TableStoreClientService service = factory.create(new ClientConfig(new File(properties)));
try {
  // Execute a SQL statement.
  String sql = "SELECT * FROM employee WHERE id = '1001'";
  ExecutionResult result = service.executeStatement(sql);
  result.getResult().ifPresent(System.out::println);
} catch (ClientException e) {
  System.err.println(e.getStatusCode());
  System.err.println(e.getMessage());
}

factory.close();
```
:::note

You should always use `TableStoreClientServiceFactory` to create `TableStoreClientService` objects. `TableStoreClientServiceFactory` caches objects that are required to create `TableStoreClientService` and reuses them on the basis of the given configurations, so the `TableStoreClientServiceFactory` object should always be reused.

:::

For more information about `TableStoreClientServiceFactory` and `TableStoreClientService`, see the [`scalardl-tablestore-java-client-sdk` Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardl-tablestore-java-client-sdk/latest/index.html).

## Handle errors

If an error occurs in your application, the Client SDK will return an exception that carries a status code and an error message with an error code. You should check the status code and the error code to identify the cause of the error. For details, see [Status codes](how-to-write-applications.mdx#status-codes) and [Error codes](how-to-write-applications.mdx#error-codes).

### Implement error handling

The SDK throws <JavadocLink packageName="scalardl-java-client-sdk" path="com/scalar/dl/client/exception" className="ClientException" /> when an error occurs. You can handle errors by catching the exception as follows:

```java
TableStoreClientService service = ...;
try {
  // Interact with ScalarDL TableStore through a TableStoreClientService object.
} catch (ClientException e) {
  // e.getStatusCode() returns the status of the error.
}
```
## Validate your data

In ScalarDL, you occasionally need to validate your data to make sure all the data is in a valid state. You can learn the basics of how ScalarDL validates your data in [Write a ScalarDL Application in Java](how-to-write-applications.mdx#validate-your-data), so this section mainly describes how to perform validation in TableStore.

When validating [assets](data-modeling.mdx#asset) (records, index records, and table schemas here) in TableStore, you need to specify a table and, if necessary, a primary or index key. Example code for validating assets in TableStore is as follows:

```java
TableStoreClientService service = ...;
String tableName = "employee";
String primaryKeyColumn = "id";
String indexKeyColumn = "department";
TextNode primaryKeyValue = TextNode.valueOf("1001");
TextNode indexKeyValue = TextNode.valueOf("sales");
try {
  LedgerValidationResult result1 =
      service.validateRecord(tableName, primaryKeyColumn, primaryKeyValue);
  LedgerValidationResult result2 =
      service.validateIndexRecord(tableName, indexKeyColumn, indexKeyValue);
  LedgerValidationResult result3 = service.validateTableSchema(tableName);
  // You can also specify an age range.
  // LedgerValidationResult result1 =
  //     service.validateRecord(tableName, primaryKeyColumn, primaryKeyValue, startAge, endAge);
  // LedgerValidationResult result2 =
  //     service.validateIndexRecord(tableName, indexKeyColumn, indexKeyValue, startAge, endAge);
  // LedgerValidationResult result3 = service.validateTableSchema(tableName, startAge, endAge);
} catch (ClientException e) {
  // Handle the error by using e.getStatusCode() and e.getMessage().
}
```

:::note

TableStore internally assigns a dedicated asset ID to an asset that represents a record, an index record, or a table schema. The asset ID consists of a prefix that shows the asset type and a key; for example, the prefix `rec_`, a primary key column name, and a primary key value are used for asset IDs of records. You will see such raw asset IDs in `AssetProof` in `LedgerValidationResult`.

:::

docs/scalardl-benchmarks/README.mdx

Lines changed: 72 additions & 23 deletions
## Common parameters

### `concurrency`

- **Description:** Number of worker threads that concurrently execute benchmark transactions against the database. This parameter controls the level of parallelism during the actual benchmark execution phase. Increasing this value simulates more concurrent client accesses and higher workload intensity.
- **Default value:** `1`

### `run_for_sec`

- **Description:** Duration of the benchmark execution phase (in seconds). This parameter defines how long the benchmark will run and submit transactions to the database.
- **Default value:** `60`

### `ramp_for_sec`

- **Description:** Duration of the ramp-up period before the benchmark measurement phase begins (in seconds). During this warm-up period, the system executes transactions but does not record performance metrics, which allows the system to reach a steady state before collecting benchmark results.
- **Default value:** `0`
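To make these parameters concrete, the following sketch shows how they might appear in a benchmark configuration file. It assumes a Kelpie-style TOML configuration with a `[common]` section, as used by Scalar's benchmark tools; check the sample configuration files in the repository for the exact layout.

```toml
[common]
concurrency = 8     # Eight benchmark threads.
run_for_sec = 300   # Measure for five minutes.
ramp_for_sec = 30   # Warm up for 30 seconds before measuring.
```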

177186
## Workload-specific parameters
178187

179188
Select a workload to see its available parameters.
180189

181190
<Tabs groupId="benchmarks" queryString>
182191
<TabItem value="SmallBank" label="SmallBank" default>
183-

### `num_accounts`

- **Description:** Number of bank accounts to create for the benchmark workload. This parameter determines the size of the dataset and affects the working-set size.
- **Default value:** `100000`

### `load_concurrency`

- **Description:** Number of parallel threads used to load initial benchmark data into the database. This parameter controls how fast the data-loading phase completes. Increasing this value can significantly reduce data-loading time for large datasets. This is separate from the `concurrency` parameter used during benchmark execution.
- **Default value:** `1`

### `load_batch_size`

- **Description:** Number of accounts to insert within a single transaction during the initial data-loading phase. Larger batch sizes can improve loading performance by reducing the number of transactions but may increase the execution time of each transaction.
- **Default value:** `1`
</TabItem>
<TabItem value="TPC-C" label="TPC-C">

### `num_warehouses`

- **Description:** Number of warehouses to create for the TPC-C benchmark workload. This value is the scale factor that determines the dataset size. Increasing this value creates a larger working set and enables enterprise-scale testing.
- **Default value:** `1`

### `rate_payment`

- **Description:** Percentage of Payment transactions in the transaction mix, with the remainder being New-Order transactions. For example, a value of `50` means that 50% of transactions will be Payment transactions and 50% will be New-Order transactions.
- **Default value:** `50`

### `load_concurrency`

- **Description:** Number of parallel threads used to load initial benchmark data into the database. This parameter controls how fast the data-loading phase completes. Increasing this value can significantly reduce data-loading time, especially for larger numbers of warehouses. This is separate from the `concurrency` parameter used during benchmark execution.
- **Default value:** `1`
</TabItem>
196223
<TabItem value="YCSB" label="YCSB">
197-
| Name | Description | Default |
198-
|:-------------------|:---------------------------------------------------|:----------|
199-
| `record_count` | Number of records for benchmarking. | `1000` |
200-
| `payload_size` | Payload size (in bytes) of each record. | `1000` |
201-
| `ops_per_tx` | Number of operations in a single transaction | `2` |
202-
| `workload` | Workload type (A, C, or F). | `A` |
203-
| `load_concurrency` | Number of threads for loading. | `1` |
204-
| `load_batch_size` | Number of records in a single loading transaction. | `1` |
224+
### `record_count`
225+
226+
- **Description:** Number of records to create for the YCSB benchmark workload. This parameter determines the size of the dataset and affects the working-set size during benchmark execution.
227+
- **Default value:** `1000`
228+
229+
### `payload_size`
230+
231+
- **Description:** Size of the payload data (in bytes) for each record. This parameter controls the amount of data stored per record and affects database storage, memory usage, and I/O characteristics.
232+
- **Default value:** `1000`
233+
234+
### `ops_per_tx`
235+
236+
- **Description:** Number of read or write operations to execute within a single transaction. This parameter affects transaction size and execution time. Higher values create longer-running transactions.
237+
- **Default value:** `2`
238+
239+
### `workload`
240+
241+
- **Description:** YCSB workload type that defines the operation mix: **A** (50% reads, 50% read-modify-write operations), **C** (100% reads), or **F** (100% read-modify-write operations). Note that the workload A in this benchmark uses read-modify-write operations instead of pure blind writes because ScalarDL prohibits the blind writes. Each workload type simulates different application access patterns.
242+
243+
- **Default value:** `A`
244+
245+
### `load_concurrency`
246+
247+
- **Description:** Number of parallel threads used to load initial benchmark data into the database. This parameter controls how fast the data-loading phase completes. Increasing this value can significantly reduce data-loading time for large datasets. This is separate from the `concurrency` parameter used during benchmark execution.
248+
- **Default value:** `1`
249+
250+
### `load_batch_size`
251+
252+
- **Description:** Number of records to insert within a single transaction during the initial data-loading phase. Larger batch sizes can improve loading performance by reducing the number of transactions, but may increase the execution time of each transaction.
253+
- **Default value:** `1`
205254
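Putting the YCSB parameters together, a workload-specific configuration fragment might look like the following. The `[ycsb_config]` section name is an assumption based on Scalar's Kelpie-based benchmark configurations; check the sample configuration files in the repository for the exact section name.

```toml
[ycsb_config]
record_count = 10000   # Larger dataset than the default.
payload_size = 1024    # 1 KiB payload per record.
ops_per_tx = 4         # Four operations per transaction.
workload = "C"         # Read-only workload.
load_concurrency = 8   # Speed up the initial data load.
load_batch_size = 100  # 100 records per loading transaction.
```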
</TabItem>
</Tabs>
