The 1890 Census Time Machine: Could 21st Century Tech Replicate the 1890 Population Schedules?

I really enjoy learning about the Census Bureau and its processes. So I was especially fond of Michael Hait’s “Anatomy” series and his other articles about the U.S. census back in 2009. After reading one of those census-oriented posts, I started musing about the 1890 census. Here’s a “re-deux” of a piece published in GeneaBlogie in November 2009.

The population schedules for the 1890 census were mostly destroyed in a fire that occurred in the Commerce Department building in January of 1921.  The Associated Press reported that several senators believed that cigarette smoking had been the cause of the fire.  There were calls for immediate bans on smoking in federal buildings; however, that wouldn’t happen for many more decades.

Many of the destroyed records were probably not burned but rather ruined by water damage. In any event, the absence of an 1890 census can be a real problem for some researchers. Certainly there are other sources available; these are sometimes described as “census substitutes.” They are often state censuses, federal non-population schedules, directories that list individuals by name, occupation, or address for a particular time, and other records. See Juliana Szucs’ Ancestry.com videos in the “Hidden Treasures” series. A complete census substitute may consist of a number of such records.

But the census substitutes and other records used to fill the 1890 to 1900 gap are scattered and sporadic. Although any given population schedule is just one piece of a research jigsaw puzzle, it would really be nice to have an 1890 census population schedule.

Well, let’s ask our fairy godmother to reconstruct the 1890 census! Or could we actually do it ourselves, using presently available technology? In fact, if we put aside every feasibility issue other than the technological one (cost-benefit analysis included), the answer is clearly “yes!” We’ll get to all of the sticky questions later.

My idea goes like this: we would gather all the available information relevant to the 1890 census and we would dump it into a supercomputer that would crunch the numbers and come out with population schedules for every state just as if the original population schedules existed.

Here’s the data we would input: first, the population schedules from:

  • the 1870 census
  • the 1880 census
  • the 1900 census
  • the 1910 census.

The purpose of selecting these years for census data is to engage in a bit of “triangulation” with respect to individuals and their residences in 1890.

We would also input:

  • digitized data of known migration patterns before and after 1890.
  • state census records.
  • federal non-population schedules from 1880 to 1900.
  • immigration data.
  • digitized biographical information of people known to have been alive between 1890 and 1900.
  • information about deaths known to have occurred between 1880 and 1900.
  • information about the rate of growth of each state and county in the country from 1880 to 1900.
  • marriage data from 1880 to 1900.
  • available military records from 1880 to 1900.
  • a harvesting of names from all publications from 1880 to 1900, with locations attached to the names.

That comes down to just about everything for which a record exists from 1880 to 1900. We would program the computer with a set of rules to allow it to connect individual lives and places of residence from these various sources. Another rule would have to do with linking families together. The idea is to replicate the 1890 population schedules as nearly as possible. Actually, our 21st-century 1890 census would likely be more accurate than the original (no transcription errors; no enumerator bias or error; imagine the possibilities!).
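To make the “set of rules” idea a bit more concrete, here is a minimal Python sketch of one such rule: decide whether an 1880 entry and a 1900 entry describe the same person, and if so, guess at an 1890 entry. Everything in it is an assumption made up for illustration: the `CensusRecord` fields, the thresholds, and the people in the usage note are hypothetical, and a real project would need far richer rules (households, migration, deaths, and so on).

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass(frozen=True)
class CensusRecord:
    """Hypothetical, simplified view of one person on one schedule."""
    year: int        # census year, e.g. 1880
    name: str        # name as the enumerator wrote it
    birth_year: int  # estimated from the reported age
    birthplace: str  # state or country of birth
    county: str      # county of residence


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 score; tolerates spelling variants such as Smyth/Smith."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_same_person(a: CensusRecord, b: CensusRecord,
                   min_name_score: float = 0.85,
                   max_birth_gap: int = 2) -> bool:
    """Rule: similar name, close birth year, same birthplace.

    The thresholds are illustrative guesses, not tested values.
    """
    return (name_similarity(a.name, b.name) >= min_name_score
            and abs(a.birth_year - b.birth_year) <= max_birth_gap
            and a.birthplace == b.birthplace)


def infer_1890(before: CensusRecord, after: CensusRecord) -> Optional[dict]:
    """Guess an 1890 entry from an 1880/1900 pair, or None if no link."""
    if not is_same_person(before, after):
        return None
    same_county = before.county == after.county
    birth_year = (before.birth_year + after.birth_year) // 2
    return {
        "name": after.name,
        "age_in_1890": 1890 - birth_year,
        # If the person moved, this rule alone cannot place them in 1890.
        "county": after.county if same_county else None,
        "confidence": "high" if same_county else "low",
    }
```

Given a made-up John Smyth (born about 1852 in Ohio, living in a hypothetical County A in 1880) and a John Smith (born about 1853 in Ohio, same county in 1900), `infer_1890` would link the two and place a 38-year-old in County A in 1890. If the counties differ, the sketch leaves the 1890 county blank, which is exactly where directories, state censuses, and the other sources would have to step in.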

In the abstract this is a plausible idea.  There’s enough computing power in the world to do this.  Since I first mused about this in 2009, billions of new records have been digitized and cloud computing has become “this-generational,” if no longer edgy.

Now to deal with reality.  Question one is WHY? My answer is “because it’s there.” A related question is “Haven’t we gotten along without this?”  Yes, but so what? See my answer to question one.   One might also ask, “Won’t this be tremendously expensive for a ‘so-what’?” Maybe (but don’t kill the buzz!).

Notwithstanding my rhapsodizing above, it’s fair to ask how we would judge the accuracy of this reconstructed 1890 census population schedule. As in any scientific endeavor, we would have to establish a confidence level at which we would be satisfied with the result. Think for a moment about a point that Michael Hait raised in one of his posts on census records: how reliable are any census records? We could judge the accuracy of the reconstituted 1890 census population schedules by identifying sources that would likely lead to a specific individual being located where the reconstituted schedules say he is. We would randomly sample the population schedules and compare them to other known records. Recall that among the other known records are some surviving population schedules for a very few counties in several states. Those could be part of the data set or they could be held out as a control set. That could be one of the devices we use to sample the reconstituted schedules for accuracy.
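The control-set idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: each person is reduced to a key mapped to a county, the function name `holdout_accuracy` is invented, and “accuracy” here means nothing more than the share of sampled people whom the reconstruction places in the same county as the surviving schedule does.

```python
import random


def holdout_accuracy(reconstructed: dict, surviving: dict,
                     sample_size: int, seed: int = 0) -> float:
    """Compare a random sample of surviving 1890 entries to the reconstruction.

    Both arguments map a person key (hypothetical, e.g. a name plus a birth
    year) to a county. The surviving entries must have been held out of the
    input data, or the comparison proves nothing.
    """
    if not surviving or sample_size < 1:
        raise ValueError("need at least one held-out record to sample")
    rng = random.Random(seed)   # fixed seed so the sample is repeatable
    keys = sorted(surviving)    # stable order before sampling
    sample = rng.sample(keys, min(sample_size, len(keys)))
    hits = sum(1 for key in sample
               if reconstructed.get(key) == surviving[key])
    return hits / len(sample)
```

If the reconstruction agreed with three of four sampled survivors, this would return 0.75. Whether 0.75 (or any other figure) is good enough is the confidence-level question raised above, and the sketch does not answer it.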

In some sense the accuracy sampling issue has already been answered for us because we now know everything (and more) that the Census Bureau knew in 1890!

We would subject our Replicated 1890 Census Schedules to the elements of the Genealogical Proof Standard.

  • “A reasonably exhaustive search.” This would have been done through the data collection process.
  • “Complete and accurate citation of all sources.” The computer will do this.
  • “Analyze and correlate the collected information to assess its quality as evidence.”  The computer will do this.
  • “Resolve any conflicts caused by contradictory items of evidence or information contrary to the conclusion.”  This would be a shared human/computer task.
  • “Arrive at a ‘soundly reasoned, coherently written conclusion.’ ” This would be the end product.

If the project meets the standard, then we may deem it sufficiently accurate for genealogical research purposes.

My flippant answer to the WHY question above comes down to this: (1) because we can; (2) because it would be interesting; (3) because it may spawn new applications, useful for other things, that we cannot now anticipate; (4) because it would be fun!

I’m not saying it should be done or that it’s necessary to be done but that it could be done.  I’m not exactly holding my breath waiting for anyone to come rushing in to do this project, but I’d like to hear some educated (or even whimsical) views on what I call The 1890 Census Time Machine.

So whaddya think? Let’s have some peer review!