Open Data Metadata Guide

Appendix A: Sample Dataset Metadata

Last updated 2 years ago

Standard Dataset Fields

The U.S. federal government created the Project Open Data metadata schema to implement the federal open data policy. The Project Open Data schema is based on the international DCAT metadata schema used by open data programs around the world and has been mapped to many standards. The schema must be presented as a JSON file to be ingested by Data.gov. It is natively available from many open data portal providers, including Azavea, Esri Open Data, NuCivic's DKAN, OpenGov, and Socrata. It is easily added to CKAN sites with an extension, or it can be generated on an ad hoc basis with other tools.
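Because the schema has to be delivered as JSON, a single dataset entry can be sketched as a plain Python dictionary and serialized with the standard library. This is a minimal illustration rather than an official template: the field names are the ones listed in this appendix, while the example values, the nested shape of `publisher` and `contactPoint`, the `dataset` wrapper key, and the `data.json` file name are assumptions made for the sketch.

```python
import json

# Hypothetical example values; only the field names come from this appendix.
dataset = {
    "@type": "dcat:Dataset",
    "title": "Street Tree Inventory",
    "description": "Location, species, and condition of city-maintained street trees.",
    "keyword": ["trees", "forestry", "public works"],
    "modified": "2024-01-15",
    # Assumed structure: publisher and contactPoint written as nested objects.
    "publisher": {"name": "Example City Public Works"},
    "contactPoint": {"fn": "Jane Doe", "hasEmail": "mailto:jane.doe@example.gov"},
    "identifier": "example-city-street-trees",
    "accessLevel": "public",
}

# Assumed wrapper: a catalog object that holds a list of dataset entries.
catalog = {"dataset": [dataset]}


def write_catalog(catalog, path="data.json"):
    """Write the catalog to a JSON file (the file name is an assumption)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(catalog, fh, indent=2)


# Serialize in memory to inspect the result before writing anything to disk.
catalog_json = json.dumps(catalog, indent=2)
```

Calling `write_catalog(catalog)` would produce the file that a harvester reads; check the exact wrapper fields your portal expects before publishing.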

| Field | Label | Definition | Required |
| --- | --- | --- | --- |
| title | Title | Human-readable name of the asset. Should be in plain English and include sufficient detail to facilitate search and discovery. | Always |
| description | Description | Human-readable description (e.g., an abstract) with sufficient detail to enable a user to quickly understand whether the asset is of interest. | Always |
| keyword | Tags | Tags (or keywords) help users discover your dataset; please include terms that would be used by technical and non-technical users. | Always |
| modified | Last Update | Most recent date on which the dataset was changed, updated, or modified. | Always |
| publisher | Publisher | The publishing entity and optionally their parent organization(s). | Always |
| contactPoint | Contact Name and Email | Contact person's name and email for the asset. | Always |
| identifier | Unique Identifier | A unique identifier for the dataset or API as maintained within an Agency catalog or database. | Always |
| accessLevel | Public Access Level | The degree to which this dataset could be made publicly available, regardless of whether it has been made available. Choices: public (data asset is or could be made publicly available to all without restrictions), restricted public (data asset is available under certain use restrictions), or non-public (data asset is not available to members of the public). | Always |
| license | License | The license or non-license (i.e., Public Domain) status with which the dataset or API has been published. See Open Licenses for more information. | If-Applicable |
| rights | Rights | This may include information regarding access or restrictions based on privacy, security, or other policies. This should also serve as an explanation for the selected “accessLevel”, including instructions for how to access a restricted file, if applicable, or an explanation for why a “non-public” or “restricted public” data asset is not “public,” if applicable. Text, 255 characters. | If-Applicable |
| spatial | Spatial | The range of spatial applicability of a dataset. Could include a spatial region like a bounding box or a named place. | If-Applicable |
| temporal | Temporal | The range of temporal applicability of a dataset (i.e., a start and end date of applicability for the data). | If-Applicable |
| distribution | Distribution | A container for the array of Distribution objects. See Dataset Distribution Fields in the Project Open Data schema for details. | If-Applicable |
| @type | Metadata Type | IRI for the JSON-LD data type. This should be dcat:Dataset for each Dataset. | No |
| accrualPeriodicity | Frequency | The frequency with which the dataset is published. | No |
| conformsTo | Data Standard | URI used to identify a standardized specification the dataset conforms to. | No |
| describedBy | Data Dictionary | URL to the data dictionary for the dataset. Note that documentation other than a data dictionary can be referenced using Related Documents (references). | No |
| describedByType | Data Dictionary Type | The machine-readable file format (IANA Media Type, also known as MIME Type) of the dataset's Data Dictionary (describedBy). | No |
| isPartOf | Collection | The collection of which the dataset is a subset. | No |
| issued | Release Date | Date of formal issuance. | No |
| language | Language | The language of the dataset. | No |
| landingPage | Homepage URL | This field is not intended for an agency's homepage (e.g., www.agency.gov), but rather for a human-friendly hub or landing page, if the dataset has one, that users can be directed to for all resources tied to the dataset. | No |
| references | Related Documents | Related documents such as technical information about a dataset, developer documentation, etc. | No |
| theme | Category | Main thematic category of the dataset. | No |

Federal Dataset Fields

The U.S. federal standard also requires the following metadata fields. Consider requiring local department codes, systems of record, and associated IT spending if they would help your open data catalog. If you do not have unique government-wide codes for these areas, consider creating them.

| Field | Label | Definition | Required |
| --- | --- | --- | --- |
| bureauCode | Bureau Code | Federal agencies, combined agency and bureau code from OMB Circular A-11, Appendix C (available as PDF and CSV) in the format of 015:11. | Always |
| programCode | Program Code | Federal agencies, list the primary program related to this data asset, from the Federal Program Inventory. Use the format of 015:001. | Always |
| dataQuality | Data Quality | Whether the dataset meets the agency's Information Quality Guidelines (true/false). | No |
| primaryITInvestmentUII | Primary IT Investment UII | For linking a dataset with an IT Unique Investment Identifier (UII). | No |
| systemOfRecords | System of Records | If the system is designated as a system of records under the Privacy Act of 1974, provide the URL to the System of Records Notice related to this dataset. | No |
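The "Always" requirements, the three access-level choices, the 255-character limit on rights, and the 015:11 and 015:001 code formats described in this appendix lend themselves to a simple automated check. The following is a minimal sketch under stated assumptions, not an official validator: the function name `check_dataset` is hypothetical, the regular expressions are inferred only from the two example formats, and `bureauCode` and `programCode` are assumed to be either a single string or a list of strings.

```python
import re

# Fields marked "Always" in the standard and federal tables of this appendix.
ALWAYS_REQUIRED = [
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
]
FEDERAL_REQUIRED = ["bureauCode", "programCode"]

ACCESS_LEVELS = {"public", "restricted public", "non-public"}

# Patterns inferred from the example formats 015:11 and 015:001 (an assumption).
CODE_PATTERNS = {
    "bureauCode": re.compile(r"\d{3}:\d{2}"),
    "programCode": re.compile(r"\d{3}:\d{3}"),
}


def check_dataset(record, federal=False):
    """Return a list of human-readable problems found in one dataset record."""
    problems = []

    required = ALWAYS_REQUIRED + (FEDERAL_REQUIRED if federal else [])
    for field in required:
        if not record.get(field):
            problems.append(f"missing required field: {field}")

    level = record.get("accessLevel")
    if level is not None and level not in ACCESS_LEVELS:
        problems.append(f"unexpected accessLevel: {level!r}")

    rights = record.get("rights")
    if rights is not None and len(rights) > 255:
        problems.append("rights is longer than 255 characters")

    for field, pattern in CODE_PATTERNS.items():
        codes = record.get(field, [])
        if isinstance(codes, str):
            codes = [codes]  # accept a single code as well as a list
        for code in codes:
            if not pattern.fullmatch(code):
                problems.append(f"{field} has unexpected format: {code!r}")

    return problems
```

A local program that does not use the federal codes can call `check_dataset(record)` with the default `federal=False`, which skips the bureau and program code requirements but still checks their format if they are present.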