Ingesting Metadata

Tip: To download an xsd file, right-click and select

Overview

The term "Ingest" refers to the process by which ECHO loads published metadata files into its metadata holdings for use by the ECHO user community. The term "Ingest job" refers to a distinguishable group of submitted metadata files that are processed and reported on by ECHO Ingest.

This page describes the metadata Ingest process performed by ECHO. It is the responsibility of the Data Partner to regularly monitor the Ingest of their published metadata.

The information provided is broken up into the following topic areas:

Ingest Workflow

ECHO continually detects submitted metadata and creates Ingest jobs, which are then processed. Each job is transformed and validated against the ECHO 10.0 metadata schema, and then sorted according to metadata action type. After these steps, a job may wait for associated browse image files or for proper sequencing. After the job is released from a waiting state, it is queued for loading. When the current Ingest job completes loading, Ingest generates an FTP-accessible Ingest report, notifies the Data Partner, and begins loading the next queued job.

For a more detailed description of the internal ECHO Ingest states, please refer to:

Delivery Methods

Data Providers submit their metadata files to a configured FTP location on the ECHO system. The ECHO Ingest process regularly detects published metadata files and bundles either a group of detected files or a single package file into an Ingest job.

Providers may deliver metadata files in two ways:

ECHO recommends that Data Providers submit metadata using the package delivery method. This method allows providers to assign a textual job name (e.g. "Ingest Job #3 - 10/21/2008") and a package sequence number. Each Ingest job is associated with a sequence number to ensure that metadata is processed in the correct order. Providers are encouraged to supply their own package sequence numbers to manage the processing of their data.
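As an illustration of why sequence numbers matter, the sketch below shows one way a provider-side script might reason about which packages can proceed in order. It assumes sequence numbers are consecutive integers and that a gap holds later packages back; neither detail is stated on this page, and the function name `releasable_jobs` is hypothetical.

```python
def releasable_jobs(last_loaded, pending):
    """Return the pending sequence numbers that can proceed in order.

    Assumption (not confirmed by this page): sequence numbers are
    consecutive integers, and a missing number blocks everything after it.
    """
    waiting = set(pending)
    ready = []
    expected = last_loaded + 1
    # Walk forward from the last loaded package until the first gap.
    while expected in waiting:
        ready.append(expected)
        expected += 1
    return ready
```

For example, if package 2 was the last one loaded and packages 3, 4, and 6 have been submitted, only 3 and 4 would proceed under these assumptions; 6 would wait until 5 arrives.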

For a more detailed description of metadata delivery methods, please refer to:

Processing Order

Each submitted metadata file contains a specific type of metadata item (e.g. collection insert, granule partial update). To ensure that metadata items are ingested correctly, metadata actions are processed in the following order:

  1. Browse inserts/replacements
  2. Collection inserts/replacements
  3. Collection partial deletes
  4. Collection partial updates
  5. Collection deletes
  6. Granule inserts/replacements
  7. Granule partial deletes
  8. Granule partial updates
  9. Granule deletes
  10. Browse deletes
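The ordering above can be sketched as a simple ranked sort. This is only an illustration of the documented order, not ECHO's implementation; the action labels (such as `"collection_insert"`) and the function name are hypothetical identifiers chosen for the example.

```python
# Processing order as listed above. The labels are hypothetical names for
# this sketch; each "insert" label covers both inserts and replacements.
PROCESSING_ORDER = [
    "browse_insert",
    "collection_insert",
    "collection_partial_delete",
    "collection_partial_update",
    "collection_delete",
    "granule_insert",
    "granule_partial_delete",
    "granule_partial_update",
    "granule_delete",
    "browse_delete",
]

# Map each action label to its position in the list.
_RANK = {action: i for i, action in enumerate(PROCESSING_ORDER)}


def sort_by_processing_order(files):
    """Sort (filename, action) pairs by the documented processing order.

    sorted() is stable, so files sharing an action keep their input order.
    """
    return sorted(files, key=lambda item: _RANK[item[1]])
```

Given a granule delete, a collection insert, a browse insert, and a granule insert submitted together, this sort places the browse insert first and the granule delete last, which keeps a granule from being loaded before the collection it belongs to.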

Last update usage: How does last update work for collection and granule partial updates? How does last update work for browse (not required)?

Ingest Reporting

Ingest reports its activity to Data Providers through email and automatically generated report files. These two reporting mechanisms are described below. For a more detailed description of metadata delivery methods, please refer to:

Email Notifications

ECHO Ingest notifies a configurable list of individuals associated with each Data Provider when an Ingest job is started and when it is completed. The automatically generated email includes identifying information for the job and the event time. Note that a job is considered 'started' when it has been added to the queue for a provider. Ingest processes received jobs in order of receipt, or by sequence number if one is provided. Emails are generated for the following situations:

Ingest Reports

When an ECHO Ingest job completes, an XML report file is generated and placed in the provider's FTP output directory. This file is created according to the schema referenced below.

The XML format used for the ECHO Ingest reports facilitates automated ingest processing and reconciliation. Each Ingest report includes the following information.

ECHO Ingest Accounting Tool (EIAT)

Data Providers may use the ECHO Ingest Accounting Tool (EIAT) to monitor their active jobs and to view the information included in the XML Ingest reports. A provider can view an overall summary of Ingest jobs being processed and the following information for the provider's Ingest jobs:

To access EIAT, visit the following links. The ECHO authentication associated with each system is used to grant access to users who have been given a "provider role" for an ECHO Data Provider.

ECHO Ingest Policies

The following policies outline a few guidelines for interacting with ECHO Ingest. Additional policies are outlined in the Data Partner's Operation Agreement, which is signed by both ECHO and the Data Partner. For questions regarding ingest policies, please contact ECHO directly at echo@echo.nasa.gov.

  1. ECHO will retain original metadata and Ingest logs for a maximum of 60 days.
  2. ECHO will retain original browse image files for a maximum of 60 days.
  3. ECHO will remove all Ingest report files in the provider output area that are more than 60 days old.
  4. ECHO will remove all reconciliation files in the provider output area that are more than 60 days old.
  5. ECHO may configure a provider for manual ingest processing if problems are detected with submitted metadata.
  6. ECHO will not manually edit metadata within the ECHO DB to correct invalid data. Data Partners should resubmit metadata with the valid data values to correct such situations.
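Because report and reconciliation files are removed after 60 days, a provider may want to check which files are approaching or past that limit before relying on them. The following is a minimal sketch of such a check; the function name, the input format (file name mapped to creation time), and the exact cutoff comparison are assumptions made for illustration, not ECHO behavior.

```python
from datetime import datetime, timedelta

# Retention window taken from the policies above.
RETENTION = timedelta(days=60)


def past_retention(files, now):
    """Return the names of files older than the 60-day retention window.

    files: dict mapping a file name to the datetime it was created.
    now:   the reference datetime to measure age against.
    """
    return sorted(
        name for name, created in files.items() if now - created > RETENTION
    )
```

A provider could run a check like this against a listing of their output directory and archive any report they still need before it ages out.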

Data Partners