Part 2
I have no instance of the surname Dewey in my tree, but all 16 2g-grandparents have roots in the British Isles, and most of those lines were in North America by the first half of the 18th century.
Using the exact surname "Dewey" and a birth location of Westfield, Massachusetts, I get 10 4th-6th cousins, and 75 5th-8th cousins...over five times the total number of matches that your own DNA comparison garnered.
I do have the surname Moore in my tree; none dated earlier than 1830, and none that I know of as being connected to John Moore and his line. Running that search with a Windsor, Connecticut, birthplace yielded 5 2nd-3rd cousins, 3 3rd-4th cousins, and a staggering 536 4th-6th cousins. I didn't bother importing the 5th-8th cousin list into Excel, but at that point the individual sharing amounts hadn't dropped below 20cM yet.
Regarding that 20cM amount, I believe we need to be clear that our inexpensive microarray tests simply cannot be 100% accurate, in no small part because the chip technology itself advertises call rates of > 99%, not 100%. When a small segment comparison may rely on only a few hundred tested markers, a call-rate failure of 1% can materially affect the result. And as many as 19% of those tested markers are targeted for clinical/pharmacological purposes, most of which have little applicability to genealogy.
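To put rough numbers on the call-rate point, here is a minimal sketch. It assumes each marker fails to call independently at a flat 1% rate, which is a simplification of mine and not a chip specification; the 300-marker segment size is likewise just an illustrative stand-in for "a few hundred."

```python
# Rough arithmetic only: assumes independent no-calls at a flat rate,
# which real microarray chips will not follow exactly.
CALL_RATE = 0.99  # the "> 99%" advertised figure, taken at its floor


def expected_no_calls(markers: int, call_rate: float = CALL_RATE) -> float:
    """Average number of markers in a segment that fail to call."""
    return markers * (1 - call_rate)


def prob_at_least_one_no_call(markers: int, call_rate: float = CALL_RATE) -> float:
    """Chance that a segment contains one or more no-calls."""
    return 1 - call_rate ** markers


if __name__ == "__main__":
    segment_markers = 300  # hypothetical "few hundred" marker segment
    print(f"expected no-calls: {expected_no_calls(segment_markers):.1f}")
    print(f"P(at least one):   {prob_at_least_one_no_call(segment_markers):.1%}")
```

Under those assumptions a few-hundred-marker segment would rarely be free of no-calls, and how a matching algorithm treats those gaps matters more as the segment gets smaller.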
Plus, there is empirical evidence that even AncestryDNA's interpretation methods are wrong occasionally. Well-known genetic genealogist and honorary Research Fellow in the Department of Genetics, Evolution and Environment at University College London, Debbie Kennett, has the admirable advantage of having her own DNA and that of both parents tested. She can do actual trio-phasing comparisons. She's written that she has "three matches at AncestryDNA over 15 cM which don't match either of my parents. They share respectively 19 cM, 24 cM and 25 cM." This performance rate is much better than at other testing companies, but the fact that segments of 19-25cM at Ancestry can be false positives should provide a lens for viewing results that involve smaller segments.
Similarly, we can't really use Blaine Bettinger's Shared cM Project as a de facto scientific evaluation of DNA sharing ranges. It is a crowd-sourced, self-reported, and unvetted set of data. There is no way to validate which submitted data are correct and which are not. If you read the full report PDF, you see that Blaine himself highlights the issues with user-provided data. Blaine does the best he can to attempt to normalize the data, but it's by an indiscriminate, brute-force approach: he removes 0.5% of the reported submissions from both low and high ends of the centiMorgan values for each relationship.
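To illustrate why that kind of trimming is indiscriminate, here is a minimal sketch of dropping 0.5% of submissions from each end of a sorted list. The function name and the sample values are hypothetical, and this is not Blaine's actual code, only the general technique as described above.

```python
def trim_extremes(values, fraction=0.005):
    """Drop `fraction` of the submissions from both the low and high ends.

    The cut is purely by rank: it cannot tell a valid outlier from a bad
    submission, and it leaves wrong values in the middle of the range alone.
    """
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # submissions removed from each end
    return ordered[k:len(ordered) - k] if k else ordered


if __name__ == "__main__":
    # Made-up cM values for one relationship, for illustration only.
    submissions = list(range(1000))
    kept = trim_extremes(submissions)
    print(len(submissions), "->", len(kept))
    print("lowest kept:", kept[0], "highest kept:", kept[-1])
```

A mislabeled relationship whose cM value happens to fall inside the surviving range passes straight through, while a correct but unusually low or high value gets discarded.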
If you look at the provided histograms, you'll see how rapidly they begin to diverge from a Gaussian distribution as the relationships grow more distant. With a large enough--and accurate--sample size, a Gaussian shape is what we would otherwise expect to see.
Also important to note is that Blaine attempts to offer an estimate of standard deviation only out to his "Grouping 10," where 4C1R is the most distant relationship. The data starts to become too unreliable after that point to make an attempt at SD.
It should also be noted that all the averages reported beyond 2C1R need to be taken with a grain of salt. By 3C we reach a point where roughly 8% of cousins will share no detectable DNA between them. One of the core issues with these kinds of crowd-sourced data is that they will always be underreported on the low side. The actual averages will be lower than presented.
That's why I always start with the simple Coefficient of Relationship numbers as a baseline in order to evaluate how much a reported amount might be skewed. By the CoR, 8th cousins would share on average 0.0008% of their DNA; 9th cousins 0.0002%; 10th cousins 0.00005%. For a 6800cM calculated genome, we'd be talking 0.05cM, 0.01cM, and 0.003cM respectively.
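The baseline arithmetic can be sketched as follows. It assumes full cousins of the same generation (no removals, no pedigree collapse) and uses the 6800cM genome figure from above; the function names are my own.

```python
GENOME_CM = 6800  # calculated genome size in cM, as used in the post


def cousin_cor(n: int) -> float:
    """Coefficient of Relationship for full nth cousins.

    Two shared ancestors (a couple), each connected to the cousins
    through 2 * (n + 1) meioses: 2 * (1/2) ** (2 * (n + 1)).
    """
    return 2 * 0.5 ** (2 * (n + 1))


def expected_cm(n: int, genome_cm: float = GENOME_CM) -> float:
    """Average shared cM implied by the CoR alone."""
    return cousin_cor(n) * genome_cm


if __name__ == "__main__":
    for n in (3, 8, 9, 10):
        print(f"{n}C: CoR {cousin_cor(n):.5%}, about {expected_cm(n):.3f} cM")
```

These are averages over all cousin pairs at that distance, including the many pairs who share nothing detectable, so they serve as a sanity check on reported amounts and not as a prediction for any one match.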
Following on that last point, we also need to keep in mind that, as I described in our 2023 conversation, the genetic effects of pedigree collapse dilute rapidly once the collapse ceases. If it did not, truly endogamous populations like the Rapa Nui or Ashkenazim would have been so severely affected by the lack of genetic diversity that they might not even have endured. I'll reference again my extreme--and admittedly rather silly--example from 2021 when I broke down what happened genetically with the deeply inbred Lannister family from Game of Thrones.
Jaime and Cersei Lannister were twins who had children. If the children and subsequently their children didn't again inbreed, Cersei's and Jaime's 2g-grandchildren would be down to the genetic difference between double 3rd cousins and regular 3rd cousins: a distinction of about 106-117cM versus 53-59cM. Not insignificant, but we're nudging a level where the amount of shared DNA can't readily distinguish between the pedigree collapse scenario and one where none of the parents were genetically related.
The Dewey/Moore hypotheses evidently range from 8th to 10th cousins, or 9 to 11 generations ago. At 11 generations, two full siblings could have children together and if the pedigree collapse stopped there, by 7 generations ago there would be little or no genetic evidence of it. (By the way, "If you go back 5 generations, then there are 63 ancestors..." Actually, speaking genealogically and not genetically, at 5 generations we would have 32 ancestors, not 63; the simple formula to calculate the potential number of ancestors at any generation is 2^k where k is the number of generations; self is always generation 0.)
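A quick sketch of the ancestor count, with self as generation 0. The cumulative function is my own addition: it is only a guess at where a figure like 63 could come from (every ancestor through generation 5, plus self), not something stated in the quoted text.

```python
def ancestors_at_generation(k: int) -> int:
    """Potential ancestors at generation k alone (self is generation 0)."""
    return 2 ** k


def cumulative_ancestors(k: int) -> int:
    """Potential ancestors summed over generations 1 through k."""
    return sum(2 ** g for g in range(1, k + 1))


if __name__ == "__main__":
    k = 5
    print("at generation", k, ":", ancestors_at_generation(k))
    print("generations 1 to", k, ":", cumulative_ancestors(k))
    print("including self:", cumulative_ancestors(k) + 1)
```

These are potential ancestors; pedigree collapse reduces the number of distinct individuals actually occupying those slots.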
That puts us back to the distinctions among genealogical ancestry, genetic ancestry, and genetic similarity. At AncestryDNA I show 85 matches to "Dewey" and "Westfield, Massachusetts" not because I have potentially identifiable ancestors that match those criteria, but because my roots are also in the British Isles and those 85 matches and I (assuming all are physically valid segments, which at least 90% probably are) carry chunks of DNA that have been passed down via regional, local, and even tribal/clan populations from many, many generations ago.
Without directly analyzing the segment detail (and preferably the raw data themselves) I can't make any assumptions about a shared, identifiable ancestor within the genealogical timeframe. And even deeper analysis may prove inconclusive: it may offer only a little more or a little less weight to the information as possible evidence. As broad-stroke examples, the closer the shared segment is to the ends of the chromosome, the more likely the crossover events were somewhat recent, making attribution potentially possible; if I've mapped my probable haplotypic pile-up regions and the segment falls into one of those, it's most likely from a much older source where attribution won't be possible; if the segment includes very few matching SNPs (or SNP mismatches), the more likely it's a false-positive; if the segment is small and spans an area of the chromosome where protein-coding genes are densely clustered, the more likely the match is genealogically irrelevant.
Last up for today: "DNA testing capability has moved on a lot over the last 5 to 10 years, perhaps time for a reevaluation?"
That is a correct statement. But interestingly enough, it's true far more for Y chromosome testing than for our typical autosomal tests, a technology that hasn't appreciably changed in well over a decade. The microarray was invented--well, first published--in 1991 by Stephen Fodor and colleagues. They were with the Affymax Research Institute which, not coincidentally, became the origin of the name of one of the first microarray chips, the Affymetrix GeneChip (more rabbit-hole diving: Affymetrix is now Applied Biosystems, a brand of DNA microarray products sold by Thermo Fisher Scientific after they acquired Affymetrix; Living DNA is the only major genealogy testing company that today uses a version of the Affymetrix chip).
Speaking of over a decade, our autosomal DNA results are still being compared based on the GRCh37 reference genome, the last iteration of which was published in 2013. Even GRCh38 was scheduled to be replaced by GRCh39 a year ago, but the Genome Reference Consortium has deferred that with the possibility of moving to a pangenome model rather than a single reference (the majority of which, by the way, is from one man who lived in Buffalo, New York, years ago and responded to a newspaper ad about DNA testing). There are known errors and omissions in GRCh37 (which can affect cM calculation, among other things), but it's impractical (read: costly) for our DNA testing companies to switch to a different reference model. For more information about GRCh37 vs. GRCh38, see Guo, Yan, et al. "Improvements and Impacts of GRCh38 Human Reference on High Throughput Sequencing Data Analysis." Genomics 109, no. 2 (March 1, 2017): 83–90. https://doi.org/10.1016/j.ygeno.2017.01.005.
The advent of direct-to-consumer next generation sequencing of the Y chromosome--and specifically the FTDNA Big Y test--has meant a sea-change in how we can evaluate and use yDNA information. My first commercial yDNA test was back when 12 STR markers was state of the art. In the intervening 23 years I've upgraded every time a new offering came out, including when SNP panel testing was first offered. That, unfortunately for my pocketbook, added up to a lot of incremental dollars (I've done a total of nine yDNA tests at FTDNA).
Today, and most especially in the world's most tested subclade of the yDNA haplotree, R-M269, deep NGS testing allows us to use reliable TMRCA predictions as tightly as about 83 years, or roughly 2.5 generations if we use an average generation interval of 32 years. The 23andMe, Living DNA, and now FTDNA yDNA haplogroup report as derived from autosomal testing isn't really a for-purpose yDNA test. At best, it can determine a defining SNP somewhere fairly high in the haplotree, meaning a quite old date of first appearance. My own branch on the haplotree is currently 14 levels deeper than R-M269.
The Dewey/Moore hypotheses may very well be spot on. But in the lab, to attempt to avoid confirmation bias, it's incumbent upon us to actively seek to disprove the hypothesis. There, and in the Genealogical Proof Standard, that effort extends to objectively determining the merit and strength of the evidentiary information.
My personal opinion--and that's all it is, my opinion--is that in and of itself the AncestryDNA information is insufficient to support definitive identification of common ancestors who date back to the beginning of the 17th century.
Edited: Gave the Genome Reference Consortium an incorrect name. I can't live with that, so I had to correct it. :-)