Ryan Heaton edited this page Jul 6, 2011 · 28 revisions

Introduction

Genealogy work is complicated. Fortunately, we have computers to help us manage complex work. But that puts a significant responsibility on software developers to create the right tools to get the job done well.

In the past, genealogical software was primarily used to manage somebody's conclusions about their genealogy. To share those conclusions with family, or to move them to a new computer, the conclusions had to be saved to a file on disk. GEDCOM was the standard format for those files.

The world has shifted. Computers are being used much more broadly across all aspects of the genealogical research process. With the arrival of the Internet and the World Wide Web, genealogists are using computers to:

  • Make records available online as digital artifacts
  • Index and annotate online artifacts so as to make them searchable
  • Search for records and other genealogical information
  • Make conclusions based on sound evidence found in records
  • Support conclusions by accurately citing the sources of the evidence
  • Identify contradictory evidence and alternate theories
  • Share and collaborate on genealogy work

GEDCOM X is the industry standard for facilitating each of these activities.

The Conceptual Model

[Figure: GEDCOM X Conceptual Model]

Consider the depiction above, illustrating how GEDCOM X can be used to facilitate genealogy work.

Sources and Records

Information on an ancestor can be gathered from multiple sources. The depiction above shows three (of the many) possibilities: a publication, a photo, and a census. Note the important distinction between the real, physical manifestation of the sources and their digital representation. The GEDCOM X domain refers to the former as physical artifacts and the latter as sources.

To make a source useful to genealogical software, there are three different sets of data that can be represented digitally. These three sets of data make up the definition of a source in GEDCOM X:

  1. The media for the source
  2. The metadata of the source
  3. The contents of the source

The media for the source is the raw digitized version of the physical artifact. This is most often a digital image of a book, census or photo, but it could also be an audio or video file.

The metadata of the source is data "about" the source. Metadata includes things like the title, publisher, publication date, author, and (especially important to genealogical research) the bibliographic citation for the source. GEDCOM X uses the Dublin Core Metadata Initiative to define standard source metadata. The blue elements in the illustration above represent the source metadata.
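As a rough illustration of the media and metadata parts of a source, the sketch below models a source as a small Python class. The `Source` class, its field names, the `citation` helper, and the example values are all hypothetical and are not taken from the GEDCOM X specification. The metadata keys (`title`, `publisher`, `date`, `bibliographicCitation`) follow Dublin Core term names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Source:
    """Hypothetical digital description of a physical artifact."""
    # Location of the digitized image, audio, or video (the "media").
    media_uri: Optional[str] = None
    # Data "about" the source, keyed by Dublin Core term name (the "metadata").
    metadata: Dict[str, str] = field(default_factory=dict)


def citation(source: Source) -> str:
    """Return the bibliographic citation, falling back to the title."""
    return source.metadata.get("bibliographicCitation") or source.metadata.get("title", "")


# Placeholder values for illustration only.
census = Source(
    media_uri="http://example.org/images/census-page-12.jpg",
    metadata={
        "title": "Example Census, Page 12",
        "publisher": "Example Archive",
        "date": "1900",
        "bibliographicCitation": "Example Census, p. 12; Example Archive.",
    },
)
```

Calling `citation(census)` returns the stored bibliographic citation. A source with only a title falls back to that title.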

The process of making the contents of the source available is called indexing. For example, if the source is a census that lists a John Smith born January 1, 1880, then the result of indexing the source is a piece of structured digital data called a record. That record specifies a persona with the name "John Smith" and a birth event with "January 1, 1880" as the text of the date. Technically, the record is also data "about" the source (and hence can be considered source "metadata"), but it is useful to treat it as a special case because it is separate from the Dublin Core terms and has particular significance to genealogical applications.

The green elements in the illustration above represent the record data.
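The census example above could be sketched as follows. This is a minimal illustration, not the GEDCOM X record model: the `Record`, `Persona`, and `Event` classes, their fields, and the `index_census_line` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    type: str        # kind of event, e.g. "birth"
    date_text: str   # date kept exactly as it appears in the source


@dataclass
class Persona:
    name: str
    events: List[Event] = field(default_factory=list)


@dataclass
class Record:
    source_title: str  # which source was indexed (hypothetical field)
    personas: List[Persona] = field(default_factory=list)


def index_census_line(source_title: str, name: str, birth_text: str) -> Record:
    """Turn one transcribed census line into a structured record."""
    persona = Persona(name=name, events=[Event(type="birth", date_text=birth_text)])
    return Record(source_title=source_title, personas=[persona])


record = index_census_line("Example Census", "John Smith", "January 1, 1880")
```

The date is stored as text rather than parsed, mirroring the description above of the record carrying "January 1, 1880" as the text of the date.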

Conclusions

Genealogical conclusions should be based on sound evidence supported by properly cited sources. The source metadata can be used to cite the evidence for conclusion data and to support the Genealogical Proof Standard. The record data can be used to supply conclusion data and to measure its validity. The conclusion data is represented in red above.
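One way to picture the relationship between a conclusion, its citations, and record data is the sketch below. The `Conclusion` class and both helper functions are hypothetical. The comparison is a deliberately naive text match and stands in for whatever validity measure a real application would use.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Conclusion:
    subject: str   # person the conclusion is about
    fact: str      # what is being concluded, e.g. "birth date"
    value: str     # the concluded value, e.g. "January 1, 1880"
    # Bibliographic citations taken from source metadata.
    citations: List[str] = field(default_factory=list)


def is_cited(conclusion: Conclusion) -> bool:
    """A conclusion counts as cited when at least one citation is attached."""
    return len(conclusion.citations) > 0


def agrees_with(conclusion: Conclusion, record_value: str) -> bool:
    """Naive check: case-insensitive, whitespace-trimmed text equality."""
    return conclusion.value.strip().lower() == record_value.strip().lower()


# Placeholder values for illustration only.
birth = Conclusion(
    subject="John Smith",
    fact="birth date",
    value="January 1, 1880",
    citations=["Example Census, p. 12; Example Archive."],
)
```

Here `is_cited(birth)` is true because a citation is attached, and `agrees_with(birth, "january 1, 1880")` is true because the text matches after normalization. A record value that differs would be a candidate for the contradictory evidence mentioned in the introduction.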

GEDCOM X Profiles

Not everybody is interested in the entire scope of GEDCOM X, so the standard is divided into distinct components called "profiles". Each profile is designed to meet a well-defined set of related requirements. The depiction above illustrates three distinct GEDCOM X profiles: the GEDCOM X Record Profile in green, the GEDCOM X Conclusion Profile in red, and the GEDCOM X Source Profile in blue. It also illustrates that profiles can reference other profiles.

There are other GEDCOM X profiles not depicted above. For example, search data is defined by the GEDCOM X Search Profile and attribution data is defined by the GEDCOM X Attribution Profile. To read about the different GEDCOM X profiles, see the profile documentation.

Getting Started

To learn how to produce and consume GEDCOM X from your application, take a look at the Developers Guide.

To read the documentation on the GEDCOM X domain and its different profiles and namespaces, try starting at Profiles.

To get involved with the development and enhancement of the GEDCOM X standard, read about the Community.
