Some people eat, sleep and chew gum, I do genealogy and write...

Friday, October 2, 2015

What is online and what is not?

My attention was directed to a post on the Heritage Family History blog, entitled, "Time to Improve Online Coverage Details." I have since subscribed to this blog, which is written by an English genealogist named Celia.

The gist of the blog post is contained in an introductory statement which says,
"It is my opinion that genealogy websites should provide full source details and coverage dates for each of their databases. They should also clearly state where a database is not yet complete."
I don't think anyone would disagree with this statement, but as the blog post goes on to point out, this is apparently not the practice of very many of the online genealogical programs that contain genealogically important historical records. In some cases the details provided in the search results, even when the correct individual has been identified, are highly truncated and sometimes misleading. This is especially true when the original record is not easily reviewed.

This problem can arise at various levels within the structure of the searches conducted by various programs. The blog post cited above addresses one aspect of this issue, the lack of a reference list showing the coverage of any given set of records. For example, the name of the record collection may imply that it covers a certain time period, when, in fact, some of the records are missing from the collection.
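To make the coverage problem concrete, here is a minimal sketch in Python of the kind of check a published coverage list would make possible. The year range and the list of years actually present are invented for illustration; I am not describing the data or the interface of any real website.

```python
def missing_years(claimed_start, claimed_end, years_present):
    """Return the years in the claimed range for which no records exist."""
    present = set(years_present)
    return [y for y in range(claimed_start, claimed_end + 1) if y not in present]


# Hypothetical collection whose title implies coverage of 1837-1845,
# although records survive for only some of those years.
years_with_records = [1837, 1838, 1839, 1841, 1842, 1844, 1845]

gaps = missing_years(1837, 1845, years_with_records)
print(gaps)  # [1840, 1843]
```

Without a coverage list, a search for an event in one of the missing years simply returns nothing, and the researcher cannot tell whether the person is absent from the records or the records are absent from the collection.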

I should also point out that FamilySearch.org's Historical Record Collections are linked to Research Wiki articles where some of this missing information is often outlined in detail for each collection. The cited blog post focuses on Ancestry.com and Findmypast.com.

However, the issue raised goes well beyond merely identifying the content of a record collection. The real issue involves the search results obtained. In many cases, the search results do not contain enough information to adequately identify the individuals returned, and they omit crucial information that is contained in the original record. For example, I may search for someone with a relatively common name. The search may produce thousands of individuals, but in each case I must examine the original record, if it is available, before I can decide whether that result is or is not the person I am looking for. In some cases the search results include only some of the contents of the record. In my example, assuming the original record has the parents' names, if the search results omit those names, then I must look at each record individually to determine which of the entries is the correct individual.

In addition, as pointed out by Celia's blog post, the results of a search often fail to provide complete or even accurate sourcing details or any limitations in the coverage of the records.

Some search techniques involve comparing several records in a spreadsheet format to determine the identity of an ancestor. In some cases, hundreds or even thousands of names are compared from a certain geographic area in an attempt to find an elusive ancestor. The failure of the genealogical database to include vital information from the original records, such as the names of the parents, makes this process next to impossible. Even though the advanced genealogist uses a powerful tool such as a spreadsheet program to analyze the data, the missing entries force the researcher to look at each individual entry, which makes examining large numbers of records practically impossible.
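The spreadsheet comparison described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the names, places, and column headings are invented, and the two exports stand in for the same search results with and without the parents' names.

```python
import csv
import io

# Hypothetical export that includes the parents' names from the original records.
FULL_EXPORT = """name,year,parish,father,mother
John Smith,1812,Ashford,William Smith,Mary
John Smith,1812,Ashford,Thomas Smith,Ann
John Smith,1813,Wye,William Smith,Mary
"""

# The same hypothetical results with the parents' names left out.
TRUNCATED_EXPORT = """name,year,parish
John Smith,1812,Ashford
John Smith,1812,Ashford
John Smith,1813,Wye
"""


def load(text):
    """Read CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def candidates(rows, **known):
    """Keep rows that agree with every known fact.

    A row that lacks a column entirely cannot be ruled out on that fact,
    so it stays in the list and must be checked by hand.
    """
    kept = []
    for row in rows:
        if all(row.get(field) in (None, value) for field, value in known.items()):
            kept.append(row)
    return kept


full_matches = candidates(load(FULL_EXPORT), year="1812", father="William Smith")
truncated_matches = candidates(load(TRUNCATED_EXPORT), year="1812", father="William Smith")

print(len(full_matches))       # 1
print(len(truncated_matches))  # 2
```

With the parents' names present, the known father narrows the list to a single record. With the truncated results, both 1812 entries remain, and each original record has to be opened to tell them apart. Multiply that by thousands of entries and the spreadsheet is of little help.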

The decision to truncate the record search is probably made by the programmers with consideration for speed and the amount of data shown, but there is a sound genealogical reason for including the entire contents of the record in a search. Most indexes are still compiled by individual effort. Expanding the number of fields indexed would very likely cause records to be indexed at a much slower rate. These issues may have been seen as more important than displaying the entire record. But if you are faced with many pages of search results all showing the same or a very similar name, you will wish that those who chose what to display had to actually use those same records.

1 comment:

  1. This topic has been creating considerable discussion. Researchers want to create credible arguments based on traceable records. Databases have also been disappearing from major websites, making citation to the original source an absolute priority if the records are to be verifiable.
    Celia is not the only genealogist to highlight this, and the genealogists employed by these companies should be highlighting these problems to those who manage the websites.
