linkedMetadataBlocks, dataset types, and fields set as required in tsv for metadata blocks #12196

@pdurbin


@scolapasta noticed something while testing #11753 that I was able to reproduce on develop as of d988622.

Testing requires a metadata block in which fields are marked as required in the tsv itself. To test, I dropped my database and, before installing Dataverse, changed the city field in the geospatial block (and its parent, geographicCoverage) to be required. The change looks like this:

% git diff --word-diff
diff --git a/scripts/api/data/metadatablocks/geospatial.tsv b/scripts/api/data/metadatablocks/geospatial.tsv
index 1140831741..84d352a112 100644
--- a/scripts/api/data/metadatablocks/geospatial.tsv
+++ b/scripts/api/data/metadatablocks/geospatial.tsv
@@ -1,10 +1,10 @@
#metadataBlock  name    dataverseAlias  displayName                                                                                              
        geospatial              Geospatial Metadata                                                                                              
#datasetField   name    title   description     watermark        fieldType      displayOrder    displayFormat   advancedSearchField      allowControlledVocabulary       allowmultiples  facetable       displayoncreate required        parent  metadatablock_id
        geographicCoverage      Geographic Coverage     Information on the geographic coverage of the data. Includes the total geographic scope of the data.             none    0               FALSE   FALSE   TRUE    FALSE   FALSE   [-FALSE-]{+TRUE+}                geospatial
        country Country / Nation        The country or nation that the Dataset is about.                text    1       #VALUE,          TRUE    TRUE    FALSE   TRUE    FALSE   FALSE   geographicCoverage      geospatial
        state   State / Province        The state or province that the Dataset is about. Use GeoNames for correct spelling and avoid abbreviations.              text    2       #VALUE,         TRUE    FALSE   FALSE   TRUE    FALSE   FALSE   geographicCoverage       geospatial
        city    City    The name of the city that the Dataset is about. Use GeoNames for correct spelling and avoid abbreviations.               text    3       #VALUE,         TRUE    FALSE   FALSE   TRUE    FALSE   [-FALSE-]{+TRUE+}       geographicCoverage       geospatial
        otherGeographicCoverage Other   Other information on the geographic coverage of the data.               text    4       #VALUE,  FALSE   FALSE   FALSE   TRUE    FALSE   FALSE   geographicCoverage      geospatial
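
For anyone reproducing this, it can help to confirm which fields a block's tsv actually marks as required before installing. The following is a minimal sketch, not part of Dataverse: `list_required_fields` is a hypothetical helper name, and it assumes the file is tab-separated with a `#datasetField` header row that names the `name` and `required` columns, as in the excerpt above.

```shell
# Hypothetical helper: print the names of fields marked required in a
# metadata block TSV. Assumes tab-separated columns and a "#datasetField"
# header row that names the "name" and "required" columns.
list_required_fields() {
  awk -F '\t' '
    $1 == "#datasetField" {
      # Header row: locate the columns by name instead of by fixed position.
      for (i = 1; i <= NF; i++) {
        cell = $i
        gsub(/^ +| +$/, "", cell)      # tolerate stray spaces around header cells
        if (cell == "name") name_col = i
        if (cell == "required") req_col = i
      }
      in_fields = 1
      next
    }
    /^#/ { in_fields = 0; next }       # any other section header ends the field rows
    in_fields && req_col && toupper($req_col) == "TRUE" { print $name_col }
  ' "$1"
}
```

Run against the modified file (`list_required_fields scripts/api/data/metadatablocks/geospatial.tsv`), this should list geographicCoverage and city if the diff above was applied as shown.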

Here's how it looks when you enable the block ("required by Dataverse"):

Image

As expected, if you use the UI, the field is required:

Image

Also as expected, if you try to create a dataset via the API without the required field, you get an error (yes, it's fairly ugly):

curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$PARENT/datasets" --upload-file dataset-finch1.json -H 'Content-type:application/json'

{"status":"ERROR","message":"Validation Failed: Geographic Coverage City is required. (Invalid value:edu.harvard.iq.dataverse.DatasetField[ id=null ]).java.util.stream.ReferencePipeline$3@23f79b05"}

Now let's talk about linkedMetadataBlocks, which was added (by me) in the following PR to have additional fields show up "on create":

Going back to city, which is required in the scenario above, if you create a dataset type that links to its block (geospatial), like this...

{"name":"geoDatasetType","linkedMetadataBlocks":["geospatial"]}

... and then create a dataset of that dataset type without including city (see JSON below), the dataset will be created anyway. That is to say, the linkedMetadataBlocks mechanism has nothing to do with fields being required or not.

curl -H "X-Dataverse-key:$API_TOKEN" -X POST "$SERVER_URL/api/dataverses/$PARENT/datasets" --upload-file dataset-geo.json.txt -H 'Content-type:application/json'

dataset-geo.json.txt

{
  "datasetType": "geoDatasetType",
  "datasetVersion": {
    "license": {
      "name": "CC0 1.0",
      "uri": "http://creativecommons.org/publicdomain/zero/1.0"
    },
    "metadataBlocks": {
      "citation": {
        "fields": [
          {
            "value": "Darwin's Finches",
            "typeClass": "primitive",
            "multiple": false,
            "typeName": "title"
          },
          {
            "value": [
              {
                "authorName": {
                  "value": "Finch, Fiona",
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "authorName"
                },
                "authorAffiliation": {
                  "value": "Birds Inc.",
                  "typeClass": "primitive",
                  "multiple": false,
                  "typeName": "authorAffiliation"
                }
              }
            ],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "author"
          },
          {
            "value": [ 
                { "datasetContactEmail" : {
                    "typeClass": "primitive",
                    "multiple": false,
                    "typeName": "datasetContactEmail",
                    "value" : "finch@mailinator.com"
                },
                "datasetContactName" : {
                    "typeClass": "primitive",
                    "multiple": false,
                    "typeName": "datasetContactName",
                    "value": "Finch, Fiona"
                }
            }],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "datasetContact"
          },
          {
            "value": [ {
               "dsDescriptionValue":{
                "value":   "Darwin's finches (also known as the Galápagos finches) are a group of about fifteen species of passerine birds.",
                "multiple":false,
               "typeClass": "primitive",
               "typeName": "dsDescriptionValue"
            }}],
            "typeClass": "compound",
            "multiple": true,
            "typeName": "dsDescription"
          },
          {
            "value": [
              "Medicine, Health and Life Sciences"
            ],
            "typeClass": "controlledVocabulary",
            "multiple": true,
            "typeName": "subject"
          }
        ],
        "displayName": "Citation Metadata"
      }
    }
  }
}
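
To double-check that a JSON file like the one above really omits the required fields before posting it, a rough pre-flight check could look like the sketch below. `missing_fields` is a hypothetical helper name, and the check is purely textual (grep for `"typeName"` entries), not a real JSON parser, so it is only a sanity test.

```shell
# Hypothetical helper: print each given field name that never appears as a
# "typeName" value in the dataset JSON file. Textual match only, not a JSON parse.
missing_fields() {
  json_file=$1
  shift
  for field in "$@"; do
    if ! grep -Eq "\"typeName\"[[:space:]]*:[[:space:]]*\"$field\"" "$json_file"; then
      echo "$field"
    fi
  done
}
```

For example, `missing_fields dataset-geo.json.txt geographicCoverage city` prints both names for the file above, since it contains only the citation block.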

That's the question: if a dataset type pulls in a block through the linkedMetadataBlocks mechanism, should fields that are marked as required in that block's tsv be enforced? (This wasn't part of the requirements back when we worked on #10519.)

    Projects

    Status

    SPRINT- NEEDS SIZING
