HealthDCAT-AP editor v4.0


HealthDCAT-AP Metadata Editor – User Guide

Quick Start

This metadata editor is designed to support data holders in creating high-quality, FAIR-compliant metadata records based on the HealthDCAT-AP specification. It is part of the efforts under the European Health Data Space (EHDS) to improve discoverability and standardization of health-related datasets.

Important: The metadata records published through this platform are intended to be openly available, in line with the objectives of the EHDS. Users are responsible for ensuring that the information uploaded is suitable for open publication and does not include sensitive or restricted content.

The tool is especially aimed at non-technical users. Thanks to its user-friendly interface and built-in validation rules, no knowledge of RDF or semantic web technologies is required to produce standards-compliant metadata.

Behind the scenes, the editor takes care of generating valid RDF files and handling technical details such as:

You can start with minimal information (a dataset title is the minimum element), publish the record to create a stable URI, and come back later to complete or enrich your metadata.

This tool is actively developed, and new releases are published regularly. If you need credentials or want to report an issue, please contact: EHDS2@sciensano.be

Create & Update Metadata

Important: You can publish your metadata only if you have valid credentials. If you need credentials or want to report an issue, please contact: EHDS2@sciensano.be

New Record

Update Existing Record

Errors

Free Text & URIs

Free Text Fields

What is a URI?

A URI (Uniform Resource Identifier) is a unique identifier for a resource on the web. In metadata, URIs reference organizations, licenses, datasets, and more.

URIs look like URLs, but they may not always lead to a webpage. Their main purpose is to ensure clarity and consistency across systems.

Why use URIs? When a URI is updated in its source (e.g., a license description), all datasets referencing it automatically benefit — enabling consistent, centralized updates.

Examples:

Tip: Use URIs from trusted vocabularies (e.g., Wikidata, DCAT registries) when possible.
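
As a rough illustration of what separates an HTTP URI from ordinary free text, the sketch below uses Python's standard urllib.parse module to check that a value has an http or https scheme and a host. The function name looks_like_http_uri is hypothetical and not part of the editor, whose own validation rules may differ.

```python
from urllib.parse import urlparse


def looks_like_http_uri(value: str) -> bool:
    """Return True if value is an absolute http(s) URI with a host part."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# A Wikidata concept URI passes; a plain keyword does not.
print(looks_like_http_uri("http://www.wikidata.org/entity/Q84263196"))
print(looks_like_http_uri("COVID-19"))
```

A check like this only confirms the shape of the value. It does not confirm that the URI resolves or that it comes from a trusted vocabulary.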

Backup & Restore

Export a Backup

Import a Backup

About HealthDCAT-AP

HealthDCAT-AP is an extension of the DCAT-AP metadata standard, tailored for describing health-related data in the context of the European Health Data Space (EHDS).

It builds on concepts like dcat:Dataset, dcat:Distribution, and dct:title using URIs and semantic vocabularies to ensure metadata is FAIR – Findable, Accessible, Interoperable, and Reusable.
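
To give a feel for what such a record looks like in RDF, the following sketch builds a very small Turtle document containing one dcat:Dataset with one dct:title. It is only an illustration: the helper name and the example.org dataset URI are hypothetical, the title is assumed to be a single line of text, and the files produced by the editor contain many more properties.

```python
def minimal_dataset_turtle(dataset_uri: str, title: str, lang: str = "en") -> str:
    """Build a tiny Turtle document: one dcat:Dataset with one dct:title."""
    # Escape backslashes and double quotes so the title stays a valid string literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
        "@prefix dct: <http://purl.org/dc/terms/> .\n"
        "\n"
        f"<{dataset_uri}> a dcat:Dataset ;\n"
        f'    dct:title "{safe_title}"@{lang} .\n'
    )


# Hypothetical dataset URI, title taken from the example in this guide.
print(minimal_dataset_turtle(
    "https://example.org/dataset/link-vacc",
    "Linking of registers for COVID-19 vaccine surveillance",
))
```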

Dataset Discovery

A set of metadata properties that describe essential attributes of the dataset, facilitating its intelligibility, relevance, and usability for a variety of purposes. These properties provide insights into the content, scope, context, etc.

Definition: A name given to the dataset.
Usage: This property can be repeated for parallel language versions of the name.
Example: Linking of registers for COVID-19 vaccine surveillance
Definition: Alternative title of the dataset such as an acronym.
Example: LINK-VACC
Definition: A keyword or tag describing the dataset.
Usage: This is a generic, free-text property that allows you to describe the dataset using informal terms. However, unlike controlled concepts, keywords do not carry semantic meaning or standardized identifiers.
  • Enter only one keyword per field. You can add as many entries as needed to cover all relevant keywords.
  • For more structured and machine-readable tagging, consider using Health Theme or Code Values, which allow you to annotate the dataset with semantic concepts (e.g., from Wikidata or controlled vocabularies). This improves discoverability and allows systems to reason over the dataset’s thematic coverage.
Example: Corona virus
Definition: A free-text account of the dataset.
Usage: This property can be repeated for parallel language versions of the description.
Example: The LINK-VACC project links selected variables from existing registries for COVID-19 vaccine surveillance, in order to ensure the monitoring of COVID-19 vaccines in the phase following their marketing authorization (post-authorization surveillance). This includes the measurement of uptake and coverage of the vaccination, the estimation of vaccine effectiveness, and continuous monitoring of the vaccine's safety. For these purposes, existing pseudonymized data on COVID-19 laboratory test results, hospitalized COVID-19 patients, COVID-19 vaccinations, underlying health problems, socio-demographic and -economic factors, and healthcare worker status are linked.
Definition: A statement about the lineage of a dataset.
Usage: Information about how the data was collected, including methodologies, tools, and protocols used.
Example: The data for the LINK-VACC project is sourced from several existing databases, including Vaccinnet+, HealthData COVID-19 database (Contact tracing and Clinic database), CoBRHA, STATBEL, and the AIM database. These databases collectively provide comprehensive demographic, clinical, and socio-economic data relevant to the project's objectives.
Definition: A free text statement of the purpose of the processing of data or personal data.
Example: The primary objective of Sciensano's LINK-VACC project is to monitor COVID-19 vaccines post-authorization and evaluate the public health value of prioritizing vaccination for people with comorbidities. This involves assessing the vaccines' effectiveness and safety in the broader population context, beyond the limited scope of clinical trials, and determining future vaccination policies in public health emergencies such as epidemics or pandemics.
Release date definition: The date of formal issuance (e.g.: publication) of the dataset.
Modification date definition: The most recent date on which the Dataset was changed or modified.
Release Date
Modification Date
Definition: A definition of the population within the dataset.
Example: The population targeted by the LINK-VACC project comprises all individuals in Belgium who have received a COVID-19 vaccine, undergone testing for COVID-19, or have been hospitalized with a confirmed diagnosis of COVID-19. The project also considers healthcare professionals and the general Belgian population for understanding vaccination coverage and effectiveness, especially among those with comorbidities and varying socio-economic backgrounds.
Definition: A geographic region that is covered by the dataset.
Usage: You may select one or more countries as well as specific NUTS levels (Nomenclature of Territorial Units for Statistics) to provide finer granularity.
If you indicate that the dataset covers an entire country, do not also select individual sub-regions (NUTS levels) within it. Doing so creates a semantic contradiction, because it suggests both full and partial coverage at the same time.
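The rule about not mixing a whole country with its own sub-regions can be sketched as a simple check. This is a minimal illustration, assuming that countries are given as the two-letter prefix used in NUTS codes (for example "BE" for Belgium; note that NUTS uses "EL" for Greece). The function name is hypothetical and the editor's own validation may work differently.

```python
from typing import Set


def overlapping_regions(countries: Set[str], nuts_codes: Set[str]) -> Set[str]:
    """Return the NUTS codes that lie inside a country already selected as a whole."""
    # A NUTS code starts with the two-letter code of the country it belongs to.
    return {code for code in nuts_codes if code[:2].upper() in countries}


# Belgium is selected as a whole, so BE10 (a Belgian region) is redundant.
print(overlapping_regions({"BE"}, {"BE10", "FR10"}))
```

An empty result means the selection is consistent. Any code returned should be removed, or the whole-country entry should be dropped instead.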
Select
Definition: A temporal period that the dataset covers.
Start Date
End Date
Definition: A language of the dataset.
Usage: This property can be repeated if there are multiple languages in the dataset.
Select
Definition: A temporal period for which the dataset is available for secondary use.
Start Date
End Date
Definition: The frequency at which the dataset is updated.
Select
Definition: The minimum spatial separation resolvable in a dataset, measured in meters.
Example: 10.3
Number
Definition: The minimum time period resolvable in the dataset.
Years
Months
Days
Hours
Minutes
Seconds
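In DCAT, a temporal resolution is expressed as an xsd:duration value such as P1D (one day) or PT12H30M (twelve and a half hours). The sketch below shows how the six fields above could be combined into such a value. It assumes non-negative whole numbers, and the function name is hypothetical. Whether the editor builds the value in exactly this way is an assumption.

```python
def to_xsd_duration(years=0, months=0, days=0, hours=0, minutes=0, seconds=0) -> str:
    """Combine the six components into an xsd:duration string, skipping zero parts."""
    date_part = "".join(
        f"{value}{unit}"
        for value, unit in ((years, "Y"), (months, "M"), (days, "D"))
        if value
    )
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"  # zero-length duration
    # The "T" separator is only written when a time component is present.
    return "P" + date_part + ("T" + time_part if time_part else "")


print(to_xsd_duration(days=1))
print(to_xsd_duration(hours=12, minutes=30))
```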

Contacts

All forms for submitting contact details related to the dataset.
To ensure accuracy, consistency, and ease of reference, contact information should preferably be provided as URIs from an authoritative register. Such URIs are global identifiers that can be dereferenced (accessed via HTTP) to retrieve RDF data.

Definition: An entity (organisation) responsible for making the dataset available.
Usage: In addition to the Publisher information, the Publisher Type (Definition: A type of organisation that makes the dataset available) must be provided as well as a Publisher Note (Definition: A description of the publisher activities).
Example of Publisher Note: Sciensano is a research institute and the national public health institute of Belgium. It is a so-called federal scientific institution that operates under the authority of the federal minister of Public Health and the federal minister of Agriculture of Belgium.

Provide Register URI

Name
URL
Mail
Trusted Data Holder (HealthDCAT-AP)
Select
Publisher Type (HealthDCAT-AP)
Select
Publisher Note (HealthDCAT-AP)
Definition: Health Data Access Body supporting access to data in the Member State.

Provide HDAB Register URI

Name
URL
Mail
Select
Definition: An entity responsible for producing the dataset.
Usage: In many cases, the creator may be the same as the contact point or publisher — for example, in small organizations or when the same person or team is responsible for authoring, publishing, and maintaining the dataset.
You may reuse the same organization or person URI across these roles when appropriate.
Definition: Contact information that can be used for sending comments about the dataset.
Usage: In many cases, the contact point may be the same as the creator or publisher — for example, in small organizations or when the same person or team is responsible for authoring, publishing, and maintaining the dataset.
You may reuse the same organization or person URI across these roles when appropriate.

Documentation

A collection of metadata properties that provide detailed information about the dataset's documentation, lineage, relationships, and legal context. To enhance accessibility, HTTP URIs are used to reference or resolve these resources, allowing users to directly access relevant information.

Definition: A page or document about this dataset.
Definition: A web page that provides access to the dataset, its distributions and/or additional information.
Example: https://sciensano.service-now.com/sp
URI
Definition: A description of a relationship with another resource.
Example: The LINK-VACC project is related to 5 other existing projects and each relationship is expressed as a "Qualified relation" providing the landing page (URL) to the project and using a controlled vocabulary to define the nature of the relationship.
Definition: An Agent having some form of responsibility for the resource.
Example: The Belgian Public Health Institute is the "processor" of the LINK-VACC dataset.
Definition: An activity that generated, or provides the business context for, the creation of the dataset.
Example: Data linkage.
Definition: A statement related to quality of the Dataset, including rating, quality certificate, feedback that can be associated to the dataset.
Definition: The version indicator (name or identifier) of a resource.
Understanding Dataset Versions: Use versioning properties to describe how this dataset relates to previous or newer versions in a series.

                        ┌────────────────────┐     dct:isVersionOf     ┌────────────────────┐
                        │  Dataset v3.0      │ ──────────────────────▶ │  Dataset v2.0      │
                        └────────────────────┘                         └────────────────────┘
                                │
                                ▼
                        dcat:version = "3.0"
                                │
                                ▼
                        ┌────────────────────┐     dct:hasVersion      ┌────────────────────┐
                        │  Dataset v3.0      │ ──────────────────────▶ │  Dataset v4.0      │
                        └────────────────────┘                         └────────────────────┘
                    

Usage:

  • Version: declares the version number of the current dataset (e.g., "3.0")
  • Is Version Of: points to the previous version of this dataset (e.g., 2.0)
  • Has Version: points to the next/newer version of this dataset (e.g., 4.0)
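
The three versioning properties can be sketched as statements about the current dataset. The example below follows the usage rules listed above and returns simple (subject, property, value) tuples. The function name and the example.org URIs are hypothetical, and the tuples only approximate the RDF that the editor generates.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def version_triples(
    current: str,
    version: str,
    previous: Optional[str] = None,
    newer: Optional[str] = None,
) -> List[Triple]:
    """Describe how the current dataset relates to its previous and newer versions."""
    triples: List[Triple] = [(current, "dcat:version", version)]
    if previous:
        # Is Version Of points to the previous version.
        triples.append((current, "dct:isVersionOf", previous))
    if newer:
        # Has Version points to the next or newer version.
        triples.append((current, "dct:hasVersion", newer))
    return triples


for triple in version_triples(
    "https://example.org/dataset/v3",
    "3.0",
    previous="https://example.org/dataset/v2",
    newer="https://example.org/dataset/v4",
):
    print(triple)
```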

Text
Definition: A description of the differences between this version and a previous version of the Dataset.
Usage: This property can be repeated for parallel language versions of the version notes.
Definition: A related dataset that is a version, edition, or adaptation of the described dataset.
URI
Definition: This property refers to a related dataset of which the described dataset is a version, edition, or adaptation.
URI
Definition: A related dataset from which the described dataset is derived.
URI
Definition: A related resource, such as a publication, that references, cites, or otherwise points to the dataset.
URI
Definition: A related resource.
Usage: Use the Relation property to indicate a general association between this dataset and another resource (such as another dataset, document, or service), where the specific nature of the relationship is not captured by a more precise property.

However, when the relationship is known and semantically clear, use one of the more specific properties instead:
  • Source – when the dataset is derived from another dataset or document
  • In Series – when the dataset belongs to a dataset series or collection
  • Qualified Relation – to express typed relationships with explicit roles (e.g., “updates”, “is replaced by”, “complements”)
  • is Version Of / has Version – to link this dataset to other versions in a version lineage
URI
Definition: The legal basis used to justify processing of personal data.
Definition: A secondary identifier of the dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID.
Definition: A dataset series of which the dataset is part.
Example: If this is the 2024 edition of a recurring National Health Survey, you can link it to the main series representing all editions. This helps users discover related datasets across different years or regions. The expected value is the URI of the main series.
URI

Categorisation

A set of metadata properties that classify and describe the key characteristics and compliance aspects of the dataset. These properties serve as filters within a data catalogue.

Definition: A category of the Dataset.
Usage: According to the HealthDCAT-AP standard, you must select at least "Health".
Select
Definition: The legislation that mandates the creation or management of the dataset.
Usage: For health datasets, the value must include the ELI of the EHDS Regulation. Multiple legislations may apply to the dataset.
Example:
  • Data Governance Act: http://data.europa.eu/eli/reg/2022/868/oj
  • EU Health Data Space Act: https://eur-lex.europa.eu/eli/reg/2025/327/oj
  • High Value Dataset Act: https://eur-lex.europa.eu/eli/reg_impl/2023/138/oj
URI
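The requirement that the EHDS Regulation appears among the applicable legislation can be sketched as a small check. The example compares only the ELI path /eli/reg/2025/327/oj taken from the list above, so that both the eur-lex.europa.eu and the data.europa.eu forms of the identifier are accepted. Treating both hosts as equivalent is an assumption of this sketch, and the function name is hypothetical.

```python
from typing import Iterable

# ELI path of the EHDS Regulation, taken from the example list in this guide.
EHDS_ELI_PATH = "/eli/reg/2025/327/oj"


def includes_ehds_regulation(legislation_uris: Iterable[str]) -> bool:
    """Return True if at least one URI ends with the ELI path of the EHDS Regulation."""
    return any(uri.strip().rstrip("/").endswith(EHDS_ELI_PATH) for uri in legislation_uris)


print(includes_ehds_regulation([
    "http://data.europa.eu/eli/reg/2022/868/oj",
    "https://eur-lex.europa.eu/eli/reg/2025/327/oj",
]))
```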
Definition: The health category to which this dataset belongs as described in the Commission Regulation on the European Health Data Space laying down a list of categories of electronic data for secondary use, Art.51.
Select
Definition: A category of the dataset or tag describing the dataset.
Usage: A dataset may be associated with multiple themes. Wikidata HTTP URIs MUST be used.
Example:
  • COVID-19: http://www.wikidata.org/entity/Q84263196
  • Cancer: http://www.wikidata.org/entity/Q12078
URI
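Fields that require Wikidata HTTP URIs expect the entity form of the address (http://www.wikidata.org/entity/...) rather than the address of the human-readable wiki page. A minimal sketch of such a format check follows. The pattern accepts both item identifiers (Q...) and property identifiers (P...), which is an assumption of this sketch, and the function name is hypothetical.

```python
import re

# Entity URIs use the http scheme, the www host and the /entity/ path.
WIKIDATA_ENTITY = re.compile(r"http://www\.wikidata\.org/entity/[QP][1-9][0-9]*")


def is_wikidata_entity_uri(value: str) -> bool:
    """Return True if value has the form of a Wikidata entity URI."""
    return WIKIDATA_ENTITY.fullmatch(value.strip()) is not None


print(is_wikidata_entity_uri("http://www.wikidata.org/entity/Q84263196"))
print(is_wikidata_entity_uri("https://www.wikidata.org/wiki/Q12078"))
```

The check covers the format only. It does not confirm that the identifier exists in Wikidata or that it denotes the intended concept.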
Definition: A type of the dataset.
Usage: A dataset may be associated with multiple dataset types like "statistical" and "High Value Dataset".
Example: Personal data (for personal electronic health data)
Select
Definition: Key elements that represent an individual in the dataset.
Usage: The list of key elements representing an individual in the dataset is expected to be comprehensive and complete.
Example: Age Exact, Blood type, Current Employment, etc. You can use the search bar inside the dropdown to quickly find what you need.
Select
Definition: An implementing rule or other specification for the dataset (e.g., a formal standard).
Usage: Wikidata HTTP URIs MUST be used. Use this field to indicate which data model or structure your dataset complies with.
Example:
  • OMOP Common Data Model: http://www.wikidata.org/entity/Q125499706
  • FHIR (Fast Healthcare Interoperability Resources): http://www.wikidata.org/entity/Q19597236
URI
Definition: Coding systems in use (e.g., ICD-10-CM, SNOMED-CT).
Usage: Wikidata HTTP URIs MUST be used.
Example:
  • ICD-10: http://www.wikidata.org/entity/P494
  • SNOMED-CT: http://www.wikidata.org/entity/Q1753883
  • Orphanet Rare Disease Ontology: http://www.wikidata.org/entity/Q24254958
URI
Definition: Health classifications and their codes associated with the dataset.
Usage: Wikidata HTTP URIs MUST be used. The most relevant code values used in the dataset should be provided (Specify Code Values).
Example: http://www.orpha.net/ORDO/Orphanet_26348
Definition: The typical minimum and maximum age range (year) of the population represented in the dataset.
Usage: The values provided should indicate the general age coverage and must not reveal any personal information.
Minimum age
Number
Maximum age
Number
Definition: Size of the dataset in terms of the number of records.
Usage: An approximate count of the records is expected.
Number
Definition: Number of records for unique individuals.
Usage: An approximate count of the records is expected.
Number

Data Access

A set of metadata properties that describe various ways to access and interact with the dataset:

  • Dataset Distribution: Describes how the dataset is made accessible.
  • Dataset Sample: Provides subsets or representative examples of the dataset to facilitate evaluation and understanding.
  • Dataset Analytics: Offers insights into the dataset by describing the analytical tools, services, or methods available to users for deriving value from the data.

Definition: Information that indicates whether the Dataset is publicly accessible, has access restrictions or is not public.
Example:
  • Public: The dataset is available under general open data rules, such as those covered by the High Value Datasets Implementing Regulation.
  • Restricted: The dataset contains protected data and is accessible only under specific conditions, as outlined in regulations like the Data Governance Act.
  • Non-public: The dataset includes resources that may contain sensitive or personal information, falling under regulations such as the EHDS Regulation.
Select
Definition: An available distribution for the dataset.
Usage: For sensitive health datasets (e.g., personal electronic health data), a distribution must include the landing page of the Health Data Access Body supporting data access.
Definition: A sample distribution of the dataset.
Usage: For sensitive data, HealthDCAT-AP requires data holders to provide a sample distribution of the dataset (e.g., mock-up data, anonymized data, synthetic data, etc.) in any computer-readable format (e.g., CSV, JSON). If applicable, a data dictionary should also be published. The data dictionary must be published using CSVW, resulting in an RDF format for the sample distribution. A more complex use case involves merging both requirements by simultaneously producing the dataset sample as tabular data along with its data dictionary using CSVW.
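A CSVW data dictionary is a small JSON document that describes the columns of a CSV file. The sketch below builds one with Python's standard json module, using the core CSVW terms url, tableSchema, columns, name, titles and datatype. The file name sample.csv and the three columns are hypothetical examples, and a real data dictionary would normally carry more detail, such as column descriptions.

```python
import json
from typing import Dict, List, Tuple


def csvw_metadata(csv_url: str, columns: List[Tuple[str, str, str]]) -> Dict:
    """Build a minimal CSVW description of one CSV file.

    Each column is given as (name, human-readable title, CSVW datatype).
    """
    return {
        "@context": "http://www.w3.org/ns/csvw",
        "url": csv_url,
        "tableSchema": {
            "columns": [
                {"name": name, "titles": title, "datatype": datatype}
                for name, title, datatype in columns
            ]
        },
    }


# Hypothetical sample file and columns, for illustration only.
metadata = csvw_metadata(
    "sample.csv",
    [
        ("age_group", "Age group", "string"),
        ("vaccination_date", "Vaccination date", "date"),
        ("dose_number", "Dose number", "integer"),
    ],
)
print(json.dumps(metadata, indent=2))
```

By CSVW convention, the output would be saved next to the sample file, for example as sample.csv-metadata.json.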
Definition: An analytics distribution of the dataset.
Usage: Data holders are encouraged to provide HTTP URIs pointing to API endpoints or document repositories where users can access or request associated resources, such as technical reports of the dataset, quality measurements, and usability indicators, or analytics services such as data visualization tools.
Example: http://atlas.ecdc.europa.eu/public/index.aspx (Surveillance Atlas of Infectious Diseases)

Technical Metadata

A set of metadata properties used primarily for metadata management within a data repository. These properties, often referred to as administrative metadata, support the internal management of datasets and are typically not intended for exchange with external systems or users. Organisations may also add their own properties to further tailor the management of datasets according to their specific needs.

Definition: The main identifier for the Dataset, e.g. the URI or other unique identifier in the context of the Catalogue.
Usage: The use of persistent dereferenceable URIs (i.e., HTTP URIs) is mandatory in the HealthDCAT-AP profile.
Example: https://opendata.schleswig-holstein.de/dataset/a4c09d4b-9922-40f2-8615-4f4d89ff339f
URI
Definition: A timestamp indicating the last date of revision of the metadata.
Usage: The EHDS Regulation states that the health data holder shall, at a minimum, on an annual basis check that its dataset description in the national catalogue is accurate and up to date.
Metadata Update Date
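The annual review obligation can be turned into a simple reminder check. The sketch below flags a record whose metadata was last updated more than 365 days ago. Using a fixed 365-day window as an approximation of "annual" is an assumption of this sketch, and the function name is hypothetical.

```python
from datetime import date, timedelta

# Approximation of "on an annual basis"; leap years are not treated specially.
REVIEW_INTERVAL = timedelta(days=365)


def review_overdue(last_metadata_update: date, today: date) -> bool:
    """Return True if the metadata was last revised more than 365 days before today."""
    return (today - last_metadata_update) > REVIEW_INTERVAL


print(review_overdue(date(2024, 1, 15), today=date(2025, 3, 1)))
print(review_overdue(date(2025, 1, 15), today=date(2025, 3, 1)))
```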

Metadata Quality Assessment

This section implements the Metadata Quality Assessment (MQA) methodology developed by data.europa.eu. It evaluates the quality and utility of the dataset’s metadata based on key FAIR-aligned dimensions such as Findability, Accessibility, Interoperability, Reusability, and Contextuality. Each dimension includes a set of indicators derived from the EU Vocabularies and standards such as DCAT-AP. The resulting scores contribute to a metadata quality certificate that helps users assess the fitness of health datasets for discovery, reuse, and integration.

This section is informative and does not contain any fields to fill in. It is designed to help you understand the current quality of your metadata and identify areas for improvement. Use the insights provided here to enrich your dataset description and align it with international standards and best practices.

Findability of your record

The following table describes the metrics that help people and machines in finding datasets. A maximum of 100 points can be scored in this area.

Accessibility of your record

The following table describes which metrics are used to determine whether access to the data referenced by the distributions is guaranteed. A maximum of 100 points can be scored in this area.

Interoperability of your record

The following table describes the metrics used to determine whether a distribution is considered interoperable. On the assumption that several distributions of a dataset carry identical content, only the distribution with the highest number of points counts toward the score. A maximum of 110 points can be scored in this area.

Reusability of your record

The following table describes which metrics are used to check the reusability of the data. A maximum of 75 points can be scored in this area.

Contextuality of your record

The following table shows some lightweight properties that provide more context to the user. A maximum of 20 points can be scored in this area.
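
The maximum points per dimension stated in this section (100, 100, 110, 75 and 20) add up to 405. The sketch below shows how an overall score and percentage could be derived from per-dimension points, and how the "highest-scoring distribution" rule for interoperability can be applied. The function names and the sample scores are hypothetical, and the exact calculation used by the editor may differ.

```python
from typing import Dict, Iterable, Tuple

# Maximum points per dimension, as stated in this section.
MAX_POINTS = {
    "Findability": 100,
    "Accessibility": 100,
    "Interoperability": 110,
    "Reusability": 75,
    "Contextuality": 20,
}


def interoperability_points(points_per_distribution: Iterable[int]) -> int:
    """Only the highest-scoring distribution counts; no distributions means 0 points."""
    return max(points_per_distribution, default=0)


def mqa_summary(scores: Dict[str, int]) -> Tuple[int, int, float]:
    """Return (total points, maximum points, percentage), capping each dimension."""
    total = sum(min(scores.get(name, 0), cap) for name, cap in MAX_POINTS.items())
    maximum = sum(MAX_POINTS.values())
    return total, maximum, round(100 * total / maximum, 1)


# Hypothetical scores for one record with two distributions.
scores = {
    "Findability": 80,
    "Accessibility": 50,
    "Interoperability": interoperability_points([40, 90]),
    "Reusability": 60,
    "Contextuality": 10,
}
print(mqa_summary(scores))
```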