Deposit Services is responsible for the transfer of custodial content and metadata from end users to repositories. End users transfer custody of their content to PASS by performing a submission through the HTML user interface, and Deposit Services subsequently transfers the custody of content to downstream repositories.
Deposit Services is deployed as "back-end" infrastructure. It has no user-facing elements. In particular, Deposit Services is unaware of the internal/external duality of resource URIs. This means that when looking at URIs in Deposit Services' logging output, some adjustment may be necessary for a developer or systems operator to retrieve the resource from their location in the network topology.
The primary mechanism for configuring Deposit Services is through environment variables. This aligns with the patterns used in development and production infrastructure which rely on Docker and its approach to runtime configuration.
| Environment Variable | Default Value | Description |
|---|---|---|
| `ACTIVEMQ_BROKER_URI` | null | the publicly-supported (i.e. official PASS) variable for configuring the JMS broker URL; used to compose `SPRING_ACTIVEMQ_BROKER_URL` |
| `DSPACE_HOST` | `localhost` | the IP address or host name of the server running the SWORD protocol version 2 endpoint |
| `DSPACE_PORT` | `8181` | the TCP port exposing the SWORD protocol version 2 endpoint |
| `ES_HOST` | `localhost` | the IP address or host name of the Elastic Search index |
| `ES_PORT` | `9200` | the TCP port running the Elastic Search HTTP API |
| `FCREPO_HOST` | `localhost` | the IP address or host name of the Fedora repository |
| `FCREPO_JMS_PORT` | `61616` | the TCP port for the STOMP protocol |
| `FCREPO_PORT` | `8080` | the TCP port running the Fedora HTTP REST API |
| `FTP_HOST` | `localhost` | the IP address or host name of the NIH FTP server |
| `FTP_PORT` | `21` | the TCP control port of the NIH FTP server |
| `PASS_DEPOSIT_HTTP_AGENT` | `pass-deposit/x.y.z` | the value of the `User-Agent` header supplied on Deposit Services' HTTP requests |
| `PASS_DEPOSIT_JOBS_CONCURRENCY` | `2` | the number of Quartz jobs that may be run concurrently |
| `PASS_DEPOSIT_JOBS_DEFAULT_INTERVAL_MS` | `600000` | the interval, in milliseconds, at which Quartz launches jobs |
| `PASS_DEPOSIT_JOBS_DISABLED` | undefined | set this environment variable to `true` to disable all Quartz jobs. By default this environment variable is undefined for the production runtime |
| `PASS_DEPOSIT_QUEUE_SUBMISSION_NAME` | `submission` | the name of the JMS queue that has messages pertaining to `Submission` resources (used by the `JmsSubmissionProcessor`) |
| `PASS_DEPOSIT_QUEUE_DEPOSIT_NAME` | `deposit` | the name of the JMS queue that has messages pertaining to `Deposit` resources (used by the `JmsDepositProcessor`) |
| `PASS_DEPOSIT_REPOSITORY_CONFIGURATION` | `classpath:/repositories.json` | points to a JSON file containing the configuration for the transport of custodial content to remote repositories. Values must be Spring Resource URIs. See below for customizing the repository configuration values |
| `PASS_DEPOSIT_TRANSPORT_SWORDV2_SLEEP_TIME_MS` | `10000` | the number of milliseconds to wait between depositing a package using SWORD, and checking the SWORD statement for the deposit state |
| `PASS_DEPOSIT_WORKERS_CONCURRENCY` | `4` | the number of Deposit Worker threads that can simultaneously run |
| `PASS_ELASTICSEARCH_LIMIT` | `100` | the maximum number of results returned in a single search response |
| `PASS_ELASTICSEARCH_URL` | `http://${es.host:localhost}:${es.port:9200}/pass` | the URL used to communicate with the Elastic Search API. Normally this variable does not need to be changed (see note below) |
| `PASS_FEDORA_BASEURL` | `http://${fcrepo.host:localhost}:${fcrepo.port:8080}/fcrepo/rest/` | the URL used to communicate with the Fedora REST API. Normally this variable does not need to be changed (see note below) |
| `PASS_FEDORA_PASSWORD` | `moo` | the password used for Basic HTTP authentication to the Fedora REST API |
| `PASS_FEDORA_USER` | `fedoraAdmin` | the username used for Basic HTTP authentication to the Fedora REST API |
| `SPRING_ACTIVEMQ_BROKER_URL` | `${activemq.broker.uri:tcp://${fcrepo.host:localhost}:${fcrepo.jms.port:61616}}` | the internal variable for configuring the URI of the JMS broker |
| `SPRING_ACTIVEMQ_PASSWORD` | null | password to use when authenticating to the broker |
| `SPRING_ACTIVEMQ_USER` | null | user name to use when authenticating to the broker |
| `SPRING_JMS_LISTENER_CONCURRENCY` | `4` | the number of JMS messages that can be processed simultaneously by each JMS queue |
If the Fedora repository is deployed under a webapp context other than `/fcrepo`, or if `https` ought to be used instead of `http`, the environment variable `PASS_FEDORA_BASEURL` must be set to the base of the Fedora REST API (e.g. `PASS_FEDORA_BASEURL=https://fcrepo:8080/rest`).

If the Elastic Search index is deployed under a URL other than `/pass`, or if `https` ought to be used instead of `http`, the environment variable `PASS_ELASTICSEARCH_URL` must be set to the base of the Elastic Search HTTP API (e.g. `PASS_ELASTICSEARCH_URL=https://localhost:9200/index`).
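Several defaults in the table are composed from other properties using `${name:default}` placeholders, some of them nested. The following is a minimal sketch of how such a template could be expanded. It is not Spring's resolver: the class `PlaceholderSketch` and its `resolve` method are hypothetical, and the sketch assumes property names (e.g. `es.host`) have already been derived from their environment equivalents (e.g. `ES_HOST`).

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; Spring performs this resolution itself.
class PlaceholderSketch {
    // Matches an innermost ${name} or ${name:default}, i.e. one with no nested placeholder.
    private static final Pattern INNERMOST =
            Pattern.compile("\\$\\{([^$\\{\\}:]+)(?::([^$\\{\\}]*))?\\}");

    // Repeatedly expands innermost placeholders until none remain.
    static String resolve(String template, Map<String, String> props) {
        String current = template;
        while (true) {
            Matcher m = INNERMOST.matcher(current);
            StringBuffer out = new StringBuffer();
            boolean found = false;
            while (m.find()) {
                found = true;
                String value = props.get(m.group(1));   // an explicit property wins
                if (value == null) {
                    value = m.group(2);                 // otherwise use the inline default
                }
                if (value == null) {
                    throw new IllegalArgumentException("No value for " + m.group(1));
                }
                m.appendReplacement(out, Matcher.quoteReplacement(value));
            }
            if (!found) {
                return current;
            }
            m.appendTail(out);
            current = out.toString();
        }
    }
}
```

With no properties set, the `PASS_ELASTICSEARCH_URL` template expands to `http://localhost:9200/pass`. Setting only `fcrepo.host` changes the host portion of the broker URL and leaves the default port in place.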
The Repository configuration contains the parameters used for connecting and depositing custodial material to downstream repositories. The format of the configuration file is JSON, defining multiple downstream repositories in a single file.
Each repository configuration has a top-level key that is used to identify a particular configuration. Importantly, each top-level key must map to a Repository resource within the PASS repository. This implies that the top-level keys in repositories.json are not arbitrary. In fact, the top level key must be one of:
- the value of a `Repository.repositoryKey` field (of a `Repository` resource in the PASS repository)
- the full URI of a `Repository` resource in the PASS repository
- a portion of the URI path of a `Repository` resource in the PASS repository
Given a Repository with a repositoryKey of my-repo and a URI of https://pass.my.edu/fcrepo/rest/repositories/77/cc/80/64/77cc8064-a918-4823-968d-2b17386db76d, any of the following top level keys are acceptable:
- `my-repo`
- `https://pass.my.edu/fcrepo/rest/repositories/77/cc/80/64/77cc8064-a918-4823-968d-2b17386db76d`
- `/repositories/77/cc/80/64/77cc8064-a918-4823-968d-2b17386db76d`
- `77cc8064-a918-4823-968d-2b17386db76d`
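The three matching rules can be sketched as a single predicate. This is an assumption-laden illustration rather than the Deposit Services implementation: `RepositoryKeySketch` and `matches` are hypothetical names, and "a portion of the URI path" is interpreted here as a simple substring test, which is more permissive than the real lookup may be.

```java
import java.net.URI;

// Hypothetical illustration of the top-level key rules; not the actual lookup code.
class RepositoryKeySketch {
    // Returns true if configKey could identify the Repository described by
    // repositoryKey (the Repository.repositoryKey value) and repositoryUri.
    static boolean matches(String configKey, String repositoryKey, URI repositoryUri) {
        if (configKey == null || configKey.isEmpty()) {
            return false;
        }
        return configKey.equals(repositoryKey)                  // rule 1: the repositoryKey
                || configKey.equals(repositoryUri.toString())   // rule 2: the full URI
                || repositoryUri.getPath().contains(configKey); // rule 3: part of the URI path (assumed: substring)
    }
}
```

Because rule 3 is a substring test in this sketch, a very short key such as `rest` would also match; a production lookup would likely need something stricter.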
Deposit Services comes with a default repository configuration, but a production environment will want to override the default. Defaults are overridden by creating a copy of the default configuration, editing it to suit, and setting PASS_DEPOSIT_REPOSITORY_CONFIGURATION to point to the new location.
Acceptable values for `PASS_DEPOSIT_REPOSITORY_CONFIGURATION` must be a form of Spring Resource URI.
The default configuration is replicated below:
```json
{
  "JScholarship": {
    "deposit-config": {
      "processing": {
        "beanName": "org.dataconservancy.pass.deposit.messaging.status.DefaultDepositStatusProcessor"
      },
      "mapping": {
        "http://dspace.org/state/archived": "accepted",
        "http://dspace.org/state/withdrawn": "rejected",
        "default-mapping": "submitted"
      }
    },
    "assembler": {
      "specification": "http://purl.org/net/sword/package/METSDSpaceSIP"
    },
    "transport-config": {
      "auth-realms": [
        {
          "mech": "basic",
          "username": "user",
          "password": "pass",
          "url": "https://jscholarship.library.jhu.edu/"
        },
        {
          "mech": "basic",
          "username": "user",
          "password": "pass",
          "url": "https://dspace-prod.mse.jhu.edu:8080/"
        },
        {
          "mech": "basic",
          "username": "dspace-admin@oapass.org",
          "password": "foobar",
          "url": "http://${dspace.host}:${dspace.port}/swordv2"
        }
      ],
      "protocol-binding": {
        "protocol": "SWORDv2",
        "username": "dspace-admin@oapass.org",
        "password": "foobar",
        "server-fqdn": "${dspace.host}",
        "server-port": "${dspace.port}",
        "service-doc": "http://${dspace.host}:${dspace.port}/swordv2/servicedocument",
        "default-collection": "http://${dspace.host}:${dspace.port}/swordv2/collection/123456789/2",
        "on-behalf-of": null,
        "deposit-receipt": true,
        "user-agent": "pass-deposit/x.y.z"
      }
    }
  },
  "PubMed Central": {
    "deposit-config": {
      "processing": {
      },
      "mapping": {
        "INFO": "accepted",
        "ERROR": "rejected",
        "WARN": "rejected",
        "default-mapping": "submitted"
      }
    },
    "assembler": {
      "specification": "nihms-native-2017-07"
    },
    "transport-config": {
      "protocol-binding": {
        "protocol": "ftp",
        "username": "nihmsftpuser",
        "password": "nihmsftppass",
        "server-fqdn": "${ftp.host}",
        "server-port": "${ftp.port}",
        "data-type": "binary",
        "transfer-mode": "stream",
        "use-pasv": true,
        "default-directory": "/logs/upload/%s"
      }
    }
  }
}
```

The default repository configuration will not be suitable for production. A production deployment needs to provide updated authentication credentials and ensure the correct value for the default SWORD collection URL (`default-collection`). Each `transport-config` section should be reviewed for correctness, paying special attention to the `protocol-binding` and `auth-realms` blocks: update the `username` and `password` elements, and ensure correct values for URLs.
Values may be parameterized by any property or environment variable.
To create your own configuration, copy and paste the default configuration into an empty file and modify the JSON as described above. The configuration must be referenced by the pass.deposit.repository.configuration property, or its environment equivalent PASS_DEPOSIT_REPOSITORY_CONFIGURATION. Allowed values are any Spring Resource path (e.g. classpath:/, classpath*:, file:, http://, https://). For example, if your configuration is stored as a file in /etc/deposit-services.json, then you would set the environment variable PASS_DEPOSIT_REPOSITORY_CONFIGURATION=file:/etc/deposit-services.json prior to starting Deposit Services. Likewise, if you kept the configuration accessible at a URL, you could use PASS_DEPOSIT_REPOSITORY_CONFIGURATION=http://example.org/deposit-services.json.
A "failed" Deposit or Submission has Deposit.DepositStatus = FAILED or Submission.AggregateDepositStatus = FAILED. When a resource has been marked FAILED, Deposit Services will ignore any messages relating to the resource when in listen mode (see below for more information on modes). Intervention (automated or manual) is required to update the failed resource.
A resource will be considered as failed when errors occur during the processing of Submission and Deposit resources. Some errors may be caused by transient network issues, or a server being rebooted, but for now Deposit Services does not contain any logic for retrying when there are low-level communication errors with an endpoint.
Submission resources are failed when:
- The Deposit Services model for a `Submission` cannot be built
- There are no files attached to the `Submission`
- Any file attached to the `Submission` is missing a location URI (the URI used to retrieve the bytes of the file)
- An error occurs saving the state of the `Submission` in the repository (arguably a transient error, but DS does not perform any retries when there are errors communicating with the repository)
See SubmissionProcessor for details. Right now, when a Submission is failed, manual intervention is required. Deposit Services does not provide any support for dealing with failed submissions. It is likely the end-user will need to re-create the submission in the user interface, and resubmit it.
Deposit resources are failed when:
- An error occurs building a package
- An error occurs streaming a package to a `Repository` (arguably transient)
- An error occurs polling (arguably transient, but DS does not perform retries) or parsing the status of a `Deposit`
- An error occurs saving the state of a `Deposit` in the repository (again, arguably transient, but DS doesn't perform retries when there are errors communicating with the repository)
See DepositTask for details. Deposits may fail for transient reasons: a server being down, an interruption in network communication, or invalid credentials for the downstream repository are just a few examples. Manual intervention is required to remediate failed deposits, but Deposit Services provides support for this case (see the retry mode documented below).
Deposit Services' primary artifact is a single self-executing jar. The behavior, or "mode" of the deposit services application is directed by command line arguments and influenced by environment variables. In the PASS infrastructure, the Deposit Services self-executing jar is deployed inside of a simple Docker container.
Deposit Services can be built by running:
mvn clean install
The main Deposit Services deployment artifact is found in deposit-messaging/target/deposit-messaging-<version>.jar. It is this jarfile that is included in the Docker image for Deposit Services, and posted on the GitHub Release page.
The mode is a required command-line argument which directs the deposit services application to take a specific action.
Listen mode is the "primary" mode, if you will, of Deposit Services. In listen mode Deposit Services responds to JMS messages from the Fedora repository by creating and transferring packages of custodial content to remote repositories.
Listen mode is invoked by starting Deposit services with listen as the single command-line argument:
$ java -jar deposit-services.jar listen
Deposit Services will connect to a JMS broker specified by the SPRING_ACTIVEMQ_BROKER_URL environment variable (optionally authenticating if SPRING_ACTIVEMQ_USER and SPRING_ACTIVEMQ_PASSWORD are present), and wait for the Fedora repository to be available as specified by FCREPO_HOST and FCREPO_PORT. Notably, listen mode does not use the index.
If the Fedora repository is deployed under a webapp context other than `/fcrepo`, the environment variable `PASS_FEDORA_BASEURL` must be set to the base of the Fedora REST API (e.g. `PASS_FEDORA_BASEURL=http://fcrepo:8080/fcrepo/rest`).
After successfully connecting to the JMS broker and the Fedora repository, deposit services will listen and respond to JMS messages relating to the submission and deposit of material to the Fedora repository. Incoming Submission resources created by end-users of the UI will be processed:
- custodial content packaged
- packages sent to destination repositories
- confirmation of custody transfer
- recording the identities of content in destination repositories
Incoming Deposit resources will be used to update the overall success or failure of a Submission.
Retry mode is used to retry a Deposit that has failed. Retry mode is invoked by starting Deposit services with retry as the first command-line argument, with an optional --uris argument, accepting a space-separated list of Deposit URIs to retry. If no --uris argument is present, the index is searched for all Deposit resources that have failed, and those are the deposits that are re-tried.
To retry all failed deposits:
$ java -jar deposit-services.jar retry
To retry specific deposits:
$ java -jar deposit-services.jar retry --uri=http://192.168.99.100:8080/fcrepo/rest/deposits/8e/af/ac/a9/8eafaca9-1f24-413a-bf1e-fbbd673ba45b --uri=http://192.168.99.100:8080/fcrepo/rest/deposits/4a/cb/04/bb/4acb04bb-4f79-40ef-8ff9-e105261aa7fb
Refresh mode is used to re-process a Deposit in the SUBMITTED state that needs its deposit status refreshed. When refresh is invoked, the optional --uris argument is used to identify the Deposit resources to refresh. Otherwise a search of the index is performed for all Deposit resources in the SUBMITTED state.
Refreshing a Deposit means that its deposit status reference will be retrieved, parsed, and processed. The status returned from the reference will be stored on the Deposit, and the status of the corresponding RepositoryCopy will be updated as well. If the Deposit status is updated to ACCEPTED, the RepositoryCopy will be updated to COMPLETE. If the Deposit status is updated to REJECTED, the RepositoryCopy will be updated to REJECTED as well.
To refresh all deposits in the SUBMITTED state:
$ java -jar deposit-services.jar refresh
To refresh specific deposits:
$ java -jar deposit-services.jar refresh --uri=http://192.168.99.100:8080/fcrepo/rest/deposits/8e/af/ac/a9/8eafaca9-1f24-413a-bf1e-fbbd673ba45b --uri=http://192.168.99.100:8080/fcrepo/rest/deposits/4a/cb/04/bb/4acb04bb-4f79-40ef-8ff9-e105261aa7fb
Modes to be supported by future releases of Deposit Services.
TODO
Deposit Services is implemented using Spring Boot, which heavily relies on Spring-based annotations and conventions to create and populate a Spring ApplicationContext, arguably the most important object managed by the Spring runtime. Unfortunately, if you aren't familiar with Spring or its conventions, it can make the code harder to understand.
The entrypoint into the deposit services is the DepositApp, which accepts command line parameters that set the "mode" of the deposit services runtime. Spring beans are created entirely in Java code by the DepositConfig and JmsConfig classes.
The listen argument will invoke the ListenerRunner, which waits for the Fedora repository to be available, otherwise it shuts down the application. Two JMS listeners are started that listen to the submission and deposit queues. The submission queue provides messages relating to Submission resources, and the deposit queue provides messages relating to Deposit resources. Deposit Services does not listen or act on messages for other types of repository resources.
The PASS_FEDORA_USER and PASS_FEDORA_PASSWORD environment variables define the username and password used to perform HTTP Basic authentication to the Fedora HTTP REST API (i.e. PASS_FEDORA_BASEURL).
The retry argument invokes the FailedDepositRunner which will re-submit failed Deposit resources to the task queue for processing. URIs for specific Deposits may be specified, otherwise the index is searched for failed Deposits, and each one will be re-tried.
The refresh argument invokes the SubmittedUpdateRunner which will attempt to re-process a Deposit's status reference. URIs for specific Deposits may be specified, otherwise the index is searched for SUBMITTED Deposits, and each one will be refreshed.
Each JMS listener (one each for the deposit and submission queues) can process messages concurrently. The number of messages each listener can process concurrently is set by the property spring.jms.listener.concurrency (or its environment equivalent: SPRING_JMS_LISTENER_CONCURRENCY).
The submission queue is processed by the JmsSubmissionProcessor, which resolves the Submission resource represented in the message, and hands off processing to the SubmissionProcessor. The SubmissionProcessor builds a DepositSubmission, which is the Deposit Services' analog of a Submission containing all of the metadata and custodial content associated with a Submission. After building the DepositSubmission, the processor creates a DepositTask and hands off the actual packaging and transfer of submission content to the deposit worker thread pool. Importantly, the SubmissionProcessor updates the Submission resource in the repository as being in progress.
There is a thread pool of so-called "deposit workers" that perform the actual packaging and transport of custodial content to downstream repositories. The size of the worker pool is determined by the property pass.deposit.workers.concurrency (or its environment equivalent: PASS_DEPOSIT_WORKERS_CONCURRENCY). The deposit worker pool accepts instances of DepositTask, which contains the primary logic for packaging, streaming, and verifying the transfer of content from the PASS repository to downstream repositories. The DepositTask will determine whether the transfer of custodial content has succeeded, failed, or is indeterminable (i.e. an async deposit process that has not yet concluded). The status of the Deposit resource associated with the Submission will be updated accordingly.
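As a rough sketch of the worker-pool idea, the following uses a fixed-size `ExecutorService` sized from `PASS_DEPOSIT_WORKERS_CONCURRENCY`, with the documented default of 4. The class `WorkerPoolSketch`, its methods, and the string "tasks" are hypothetical stand-ins; the real pool accepts `DepositTask` instances and is wired up by Spring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; Deposit Services configures its pool through Spring.
class WorkerPoolSketch {
    static final int DEFAULT_CONCURRENCY = 4;

    // Reads the pool size from an environment-like map, falling back to the default.
    static int concurrency(Map<String, String> env) {
        String value = env.get("PASS_DEPOSIT_WORKERS_CONCURRENCY");
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_CONCURRENCY;
        }
        return Integer.parseInt(value.trim());
    }

    // Submits one stand-in task per deposit URI and collects results in submission order.
    static List<String> runAll(List<String> depositUris, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String uri : depositUris) {
                Callable<String> task = () -> "processed " + uri; // stand-in for a DepositTask
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling `WorkerPoolSketch.runAll(uris, WorkerPoolSketch.concurrency(System.getenv()))` would process at most that many stand-in tasks at once.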
Certain Spring sub-systems, like Spring MVC or Spring Messaging, support the notion of a "global" ErrorHandler. Deposit services provides an implementation, DepositServicesErrorHandler, which is used to catch exceptions thrown by the JmsDepositProcessor and JmsSubmissionProcessor, and which is also adapted as a Thread.UncaughtExceptionHandler and as a RejectedExecutionHandler.
Deposit services provides a DepositServicesRuntimeException (DSRE for short), which has a field PassEntity resource. If the DepositServicesErrorHandler catches a DSRE with a non-null resource, the error handler will test the type of the resource, mark it as failed, and save it in the repository.
The take-home point is: Deposit and Submission resources will be marked as failed if a DepositServicesRuntimeException carrying a non-null resource is thrown from one of the JMS processors, or from the DepositTask. As a developer, if an exceptional condition does not warrant a failure, then do not throw a DepositServicesRuntimeException that carries a resource. Instead, consider logging a warning or throwing a DSRE with a null resource. Likewise, to fail a resource, all you need to do is throw a DSRE with a non-null resource. The DepositServicesErrorHandler will do the rest.
Finally, one last word. Because the state of a resource can be modified at any time by any actor in the PASS infrastructure, the DepositServicesErrorHandler encapsulates the act of saving the failed state of a resource within a CRI. A pre-condition for updating the resource is that it must not be in a terminal state. For example, if the error handler is updating the state from SUBMITTED to FAILED, but another actor has modified the state of the resource to REJECTED in the interim, the pre-condition will fail. It makes no sense to modify the state of a resource after it is in its terminal state. The take-home point is: the DepositServicesErrorHandler will not mark a resource as failed if it is in a terminal state.
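The error-handling rules above can be condensed into a small sketch. Everything here is hypothetical: `Resource` stands in for a `PassEntity`, `DsreSketch` for `DepositServicesRuntimeException`, and the status change is applied in memory, whereas the real handler saves the resource through a CRI.

```java
// Hypothetical illustration of the failure rules; not the DepositServicesErrorHandler itself.
class ErrorHandlerSketch {
    enum Status {
        SUBMITTED(false), FAILED(false), ACCEPTED(true), REJECTED(true);

        final boolean terminal;

        Status(boolean terminal) {
            this.terminal = terminal;
        }
    }

    // Stand-in for a PassEntity that carries a status.
    static class Resource {
        Status status;

        Resource(Status status) {
            this.status = status;
        }
    }

    // Stand-in for DepositServicesRuntimeException: an exception with an optional resource.
    static class DsreSketch extends RuntimeException {
        final Resource resource;

        DsreSketch(String message, Resource resource) {
            super(message);
            this.resource = resource;
        }
    }

    // Marks the resource FAILED only for a DSRE with a non-null, non-terminal resource.
    // Returns true if the resource was marked as failed.
    static boolean handle(Throwable t) {
        if (!(t instanceof DsreSketch)) {
            return false;               // other exceptions never fail a resource
        }
        Resource resource = ((DsreSketch) t).resource;
        if (resource == null || resource.status.terminal) {
            return false;               // nothing to fail, or the pre-condition fails
        }
        resource.status = Status.FAILED;
        return true;
    }
}
```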
A central, yet awkwardly-named, abstraction is CriticalRepositoryInteraction. This interface is used to prevent interleaved updates of individual repository resources by different threads. A CriticalRepositoryInteraction (CRI for short) isolates the execution of a Function on a specific repository resource, and provides the boilerplate (i.e. template) for retrieving and updating the state of the resource. There are four main components to CriticalRepositoryInteraction: the repository resource itself, a pre-condition, post-condition, and the critical update (i.e. the Function to be executed). The only implementation of CRI is the class CriticalPath, and the particulars of that implementation are discussed below.
- First, `CriticalPath` obtains a lock over the string form of the URI of the resource being updated. This ensures that any other threads executing a `CRI` for the same resource in the same JVM must wait their turn before executing their critical update of the resource.

  This occurs more often than one might think, as Deposit Services receives many messages for the same resource almost "all at once" when a submission occurs. The thread model for Spring and the Deposit Workers would be rife with conflicts unless something like the `CRI` was uniformly adopted in Deposit Services.
- Second, the resource is read from the repository.
- Third, the pre-condition `Predicate` is executed over the resource. If the pre-condition fails, the entire `CriticalPath` is failed, and returns.
- Fourth, the critical `Function` is executed, assured that the resource at the time it was retrieved in step 2 meets the pre-condition applied in step 3. It is assumed that the `Function` modifies the state of the resource. The `Function` may return the updated state of the resource, or it may return an entirely different object (remember the `Function` is parameterized by two types; while it must accept a `PassEntity`, it does not have to return a `PassEntity`).
- Fifth, after updating the state of the resource in step 4, an attempt is made to store and re-read the updated resource in the repository. In this step, an `UpdateConflictException` may occur, because some other process outside of the JVM may have modified the resource after step 2 but before step 5. If `UpdateConflictException` is caught, it is the responsibility of the `ConflictHandler` to resolve the conflict. Otherwise, the update is successful, and processing of the resource by the `CriticalPath` continues.
- Finally, the post-condition `BiPredicate` is executed. It accepts the resource as updated and read by step 5, and the object returned by the critical update in step 4. This determines the logical success or failure of the `CriticalPath`. Steps 1 through 5 may have executed without error, but the post-condition has final say of the overall success of the `CriticalPath`.
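The sequence of steps can be sketched against an in-memory map standing in for the repository. This is a simplified illustration under stated assumptions, not the `CriticalPath` class: `CriticalPathSketch` and its nested `Result` type are hypothetical, and conflict handling (`UpdateConflictException`, `ConflictHandler`) is omitted because an in-memory map cannot be modified by another process.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical, simplified illustration of the steps; not the real CriticalPath.
class CriticalPathSketch<T> {
    // Outcome of one critical interaction: overall success, the resource, and the update's return value.
    static final class Result<E, R> {
        final boolean success;
        final E resource;
        final R value;

        Result(boolean success, E resource, R value) {
            this.success = success;
            this.resource = resource;
            this.value = value;
        }
    }

    private final Map<String, T> repository;                         // in-memory stand-in for the repository
    private final Map<String, Object> locks = new ConcurrentHashMap<>();

    CriticalPathSketch(Map<String, T> repository) {
        this.repository = repository;
    }

    <R> Result<T, R> performCritical(String uri, Predicate<T> pre,
                                     BiPredicate<T, R> post, Function<T, R> critical) {
        Object lock = locks.computeIfAbsent(uri, k -> new Object()); // step 1: one lock per URI string
        synchronized (lock) {
            T resource = repository.get(uri);                        // step 2: read the resource
            if (resource == null || !pre.test(resource)) {           // step 3: pre-condition
                return new Result<>(false, resource, null);
            }
            R value = critical.apply(resource);                      // step 4: critical update
            repository.put(uri, resource);                           // step 5: store ...
            T stored = repository.get(uri);                          //         ... and re-read
            boolean success = post.test(stored, value);              // step 6: post-condition decides
            return new Result<>(success, stored, value);
        }
    }
}
```

The argument order (pre-condition, post-condition, critical update) mirrors the real example that follows.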
Here is a real example of a CRI in action, used when packaging and depositing custodial content to a downstream repository.
The pre-condition ensures that we are operating on Deposit resources acceptable for processing. The critical update creates a package and streams it to the remote repository, and obtains a TransportResponse. The status of the Deposit resource is modified, and the TransportResponse is returned by the critical update. Finally, the post-condition uses the state of the Deposit resource and the TransportResponse to evaluate the success of the critical update.
Behind the scenes, the CriticalPath ensures that the state of the Deposit is properly stored in the repository, and that any conflicts are handled.
After the CriticalPath executes, its CriticalResult can be examined for success or failure.
```java
CriticalResult<TransportResponse, Deposit> result = critical.performCritical(dc.deposit().getId(), Deposit.class,

        /*
         * Pre-condition: only "dirty" deposits can be processed by {@code DepositTask}
         */
        (deposit) -> {
            boolean accept = dirtyDepositPolicy.accept(deposit.getDepositStatus());
            if (!accept) {
                LOG.debug(">>>> Update precondition failed for {}", deposit.getId());
            }
            return accept;
        },

        /*
         * Post-condition: determines *physical* success of the Deposit: were the bytes of the package successfully received?
         * Note: uses the TransportResponse as well as the resource state to determine the success of this CRI
         */
        (deposit, tr) -> {
            boolean success = deposit.getDepositStatus() == SUBMITTED;
            if (!success) {
                LOG.debug(">>>> Update postcondition failed for {} - expected status '{}' but actual status " +
                        "is '{}'", deposit.getId(), SUBMITTED, deposit.getDepositStatus());
            }
            success &= tr.success();
            if (!success) {
                LOG.debug(">>>> Update postcondition failed for {} - transport of package to endpoint " +
                        "failed: {}", deposit.getId(), tr.error().getMessage(), tr.error());
            }
            return success;
        },

        /*
         * Critical update: Assemble and stream a package of content to the repository endpoint, update status to SUBMITTED
         * Note: this Function accepts a Deposit resource, but returns a TransportResponse. Both are used by the
         * post-condition to determine the success of the CRI
         */
        (deposit) -> {
            Packager packager = dc.packager();
            PackageStream packageStream = packager.getAssembler().assemble(dc.depositSubmission());
            Map<String, String> packagerConfig = packager.getConfiguration();
            try (TransportSession transport = packager.getTransport().open(packagerConfig)) {
                TransportResponse tr = transport.send(packageStream, packagerConfig);
                deposit.setDepositStatus(SUBMITTED);
                return tr;
            } catch (Exception e) {
                throw new RuntimeException("Error closing transport session for deposit " +
                        dc.deposit().getId() + ": " + e.getMessage(), e);
            }
        });
```
Deposit services primarily acts on three types of resources: Submission, Deposit, and RepositoryCopy. Each of these resources carries a status. Managing and reacting to the values of resource status is a large part of what Deposit services does.
Abstractly, Deposit services considers the value of any status to be intermediate, or terminal.
It isn't clear, yet, whether this abstract notion of intermediate and terminal needs to be shared amongst components of PASS. If so, then certain classes and interfaces in the Deposit Services code base should be extracted out into a shared component.
The semantics of terminal state are that the resource has been through a workflow of some kind, and has reached the end of that workflow. Because the workflow has reached a terminus, no additional state is expected to be placed on the resource, and no existing state of the resource is expected to change.
The semantics of intermediate state are that the resource is currently in a workflow of some kind, and has yet to reach the end of that workflow. Because the workflow has not reached a terminus, the resource is expected to be modified at any time, until the terminal state is achieved.
A general pattern within Deposit services is that resources with terminal status are explicitly accounted for (this is largely enforced by policies which are documented elsewhere), and are considered "read-only".
Submission status is enumerated in the AggregatedDepositStatus class. Deposit services considers the following values:
- `NOT_STARTED` (intermediate): Incoming Submissions from the UI must have this status value
- `IN_PROGRESS` (intermediate): Deposit services places the Submission in an `IN_PROGRESS` state right away. When a thread observes a `Submission` in this state, it assumes that another thread is processing this resource.
- `FAILED` (intermediate): Occurs when a non-recoverable error happens when processing the `Submission`
- `ACCEPTED` (terminal): Deposit services places the Submission into this state when all of its `Deposit`s have been `ACCEPTED`
- `REJECTED` (terminal): Deposit services places the Submission into this state when all of its `Deposit`s have been `REJECTED`
Deposit status is enumerated in the DepositStatus class. Deposit services considers the following values:
- `SUBMITTED` (intermediate): the custodial content of the `Submission` has been successfully transferred to the `Deposit`'s `Repository`
- `ACCEPTED` (terminal): the custodial content of the `Submission` has been accessioned by the `Deposit`'s `Repository` (i.e. custody of the `Submission` has successfully been transferred to the downstream `Repository`)
- `REJECTED` (terminal): the custodial content of the `Submission` has been rejected by the `Deposit`'s `Repository` (i.e. the downstream `Repository` has refused to accept custody of the `Submission` content)
- `FAILED` (intermediate): the transfer of custodial content to the `Repository` failed, or there was some other error updating the status of the `Deposit`
RepositoryCopy status is enumerated in the CopyStatus class. Deposit services considers the following values:
- `COMPLETE` (terminal): a copy of the custodial content is available in the `Repository` at this location
- `IN_PROGRESS` (intermediate): a copy of the custodial content is expected to be available in the `Repository` at this location. The custodial content should not be expected to exist until the `Deposit` status is `ACCEPTED`
- `REJECTED` (terminal): the copy should be considered to be invalid. Even if the custodial content is made available at the location indicated by the `RepositoryCopy`, it should not be mistaken for a successful transfer of custody.
RepositoryCopy status is subservient to the Deposit status. They will always be congruent. For example, a RepositoryCopy cannot be COMPLETE if the Deposit is REJECTED. If a Deposit is REJECTED, then the RepositoryCopy must also be REJECTED.
There are some common permutations of these statuses that will be observed:
- `ACCEPTED` `Submission`s will only have `Deposit`s that are `ACCEPTED`. Each `Deposit` will have a `COMPLETE` `RepositoryCopy`.
- `REJECTED` `Submission`s will only have `Deposit`s that are `REJECTED`. `REJECTED` `Deposit`s will not have any `RepositoryCopy` at all.
- `IN_PROGRESS` `Submission`s may have zero or more `Deposit`s in any state.
- `FAILED` `Submission`s should have zero `Deposit`s.
- `ACCEPTED` `Deposit`s should have a `COMPLETE` `RepositoryCopy`.
- `REJECTED` `Deposit`s will have a `REJECTED` `RepositoryCopy`
- `SUBMITTED` `Deposit`s will have an `IN_PROGRESS` `RepositoryCopy`
- `FAILED` `Deposit`s will have no `RepositoryCopy`
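The intermediate/terminal distinction and the congruence between `Deposit` and `RepositoryCopy` status can be summarized in code. The enums below are hypothetical stand-ins named after the statuses listed above, not the PASS model classes, and `expectedCopyStatus` simply encodes the Deposit-level rules from this section.

```java
import java.util.Optional;

// Hypothetical illustration of the status rules; the real enums live in the PASS model.
class StatusSketch {
    enum DepositStatus {
        SUBMITTED(false), ACCEPTED(true), REJECTED(true), FAILED(false);

        final boolean terminal;

        DepositStatus(boolean terminal) {
            this.terminal = terminal;
        }
    }

    enum CopyStatus {
        COMPLETE(true), IN_PROGRESS(false), REJECTED(true);

        final boolean terminal;

        CopyStatus(boolean terminal) {
            this.terminal = terminal;
        }
    }

    // Returns the RepositoryCopy status congruent with a Deposit status,
    // or empty when no RepositoryCopy is expected (a FAILED Deposit).
    static Optional<CopyStatus> expectedCopyStatus(DepositStatus status) {
        switch (status) {
            case ACCEPTED:
                return Optional.of(CopyStatus.COMPLETE);
            case REJECTED:
                return Optional.of(CopyStatus.REJECTED);
            case SUBMITTED:
                return Optional.of(CopyStatus.IN_PROGRESS);
            default:
                return Optional.empty();
        }
    }
}
```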