Simple URL harvesting (opendata)
This harvester connects to a remote server via a simple URL to retrieve metadata records. This allows harvesting opendata catalogs such as opendatasoft, ESRI, DKAN and more.
Adding a simple URL harvester
- Site - Options about the remote site.
- Name - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
- Service URL - The URL of the server to be harvested. This can include pagination parameters such as
?start=0&rows=20
- loopElement - Property/element containing the list of record entries (given as an absolute path from the document root), e.g.
/datasets
- numberOfRecordPath : Property indicating the total count of record entries (given as an absolute path from the document root), e.g.
/nhits
- recordIdPath : Property containing the record id, e.g.
datasetid
- pageFromParam : Property indicating the first record item on the current "page", e.g.
start
- pageSizeParam : Property indicating the number of records contained in the current "page", e.g.
rows
- toISOConversion : Name of the conversion schema to use, which must be available as XSL on the GeoNetwork (GN) instance, e.g.
OPENDATASOFT-to-ISO19115-3-2018
Note
GN looks for schemas by name in https://github.com/geonetwork/core-geonetwork/tree/4.0.x/web/src/main/webapp/xsl/conversion/import. These schemas might internally include schemas from other locations, such as https://github.com/geonetwork/core-geonetwork/tree/4.0.x/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/convert. For example, to reference the fromJsonOpenDataSoft schema from the latter location directly in the admin UI, the following syntax can be used:
schema:iso19115-3.2018:convert/fromJsonOpenDataSoft
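As a rough illustration of how the Site options fit together, the following Python sketch applies them to a made-up, opendatasoft-style JSON page. It assumes that absolute paths are slash-separated property names in which numeric segments index into arrays. The helper names resolve_path and page_url, the example.org URL and the sample payload are hypothetical and are not part of GeoNetwork.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resolve_path(document, path):
    """Follow a slash-separated path such as "/result/0" through parsed JSON."""
    node = document
    for segment in path.strip("/").split("/"):
        if segment == "":
            continue  # an empty path refers to the document itself
        if isinstance(node, list):
            node = node[int(segment)]  # numeric segment: array index
        else:
            node = node[segment]  # otherwise: property name
    return node


def page_url(service_url, page_from_param, page_size_param, start, rows):
    """Return service_url with the two pagination parameters set."""
    parts = urlsplit(service_url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[page_from_param] = str(start)
    query[page_size_param] = str(rows)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical response page, shaped only to match the option values used here.
sample_page = {
    "nhits": 3,
    "datasets": [
        {"datasetid": "roads"},
        {"datasetid": "rivers"},
    ],
}

# Option values taken from the opendatasoft examples in this section.
config = {
    "loopElement": "/datasets",
    "numberOfRecordPath": "/nhits",
    "recordIdPath": "datasetid",
    "pageFromParam": "start",
    "pageSizeParam": "rows",
}

total = resolve_path(sample_page, config["numberOfRecordPath"])
records = resolve_path(sample_page, config["loopElement"])
record_ids = [resolve_path(record, config["recordIdPath"]) for record in records]

# URL of the following page, assuming a hypothetical service URL.
next_url = page_url(
    "https://example.org/api/search?start=0&rows=2",
    config["pageFromParam"],
    config["pageSizeParam"],
    start=len(records),
    rows=2,
)
```

In this sketch, numberOfRecordPath tells the caller when to stop paging, loopElement selects the entries on the current page, and recordIdPath is resolved relative to each entry rather than to the document root.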
Sample configuration for opendatasoft
- loopElement -
/datasets
- numberOfRecordPath :
/nhits
- recordIdPath :
datasetid
- pageFromParam :
start
- pageSizeParam :
rows
- toISOConversion :
OPENDATASOFT-to-ISO19115-3-2018
Sample configuration for ESRI
- loopElement -
/dataset
- numberOfRecordPath :
/result/count
- recordIdPath :
landingPage
- pageFromParam :
start
- pageSizeParam :
rows
- toISOConversion :
ESRIDCAT-to-ISO19115-3-2018
Sample configuration for DKAN
- loopElement -
/result/0
- numberOfRecordPath :
/result/count
- recordIdPath :
id
- pageFromParam :
start
- pageSizeParam :
rows
- toISOConversion :
DKAN-to-ISO19115-3-2018
- Privileges - Assign privileges to harvested metadata.
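To compare the three sample configurations at a glance, they can be written side by side as plain data. The Python sketch below only restates the values listed in the samples. The SAMPLES table and the shared_options helper are illustrative names and do not correspond to any GeoNetwork API.

```python
# The three sample configurations from this section, keyed by catalog type.
SAMPLES = {
    "opendatasoft": {
        "loopElement": "/datasets",
        "numberOfRecordPath": "/nhits",
        "recordIdPath": "datasetid",
        "pageFromParam": "start",
        "pageSizeParam": "rows",
        "toISOConversion": "OPENDATASOFT-to-ISO19115-3-2018",
    },
    "ESRI": {
        "loopElement": "/dataset",
        "numberOfRecordPath": "/result/count",
        "recordIdPath": "landingPage",
        "pageFromParam": "start",
        "pageSizeParam": "rows",
        "toISOConversion": "ESRIDCAT-to-ISO19115-3-2018",
    },
    "DKAN": {
        "loopElement": "/result/0",
        "numberOfRecordPath": "/result/count",
        "recordIdPath": "id",
        "pageFromParam": "start",
        "pageSizeParam": "rows",
        "toISOConversion": "DKAN-to-ISO19115-3-2018",
    },
}


def shared_options(samples):
    """Return the options whose value is identical in every sample."""
    first, *rest = samples.values()
    return {
        name: value
        for name, value in first.items()
        if all(other.get(name) == value for other in rest)
    }


shared = shared_options(SAMPLES)
```

In these samples only the pagination parameter names are common to all three catalogs. The paths and the conversion schema have to be adapted to each remote API.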