Genetic mutation found in descendants of 1760s couple - Rushing & Harrell

+20 votes
709 views

The descendants of a 1760s couple in Virginia have been found to carry a genetic mutation that causes seizures, migraines, brain bleeds, etc. Matthew Malachi Rushing and Sarah Mae Harrell, born in the mid-1760s in North Carolina, are the apparent first carriers of this mutation.

There are two profiles on WikiTree for Matthew Rushing and one for his wife.

Matthew Rushing      Matthew Malachi Rushing

Sarrah Harrell

Anyone interested in following this family line might want to read this article.

How a Rare Brain Mutation Spread Across America

WikiTree profile: Matthew Rushing
in The Tree House by Shirley Dalton G2G6 Pilot (533k points)
recategorized by Ellen Smith

More discussion here.

5 Answers

+7 votes
Fascinating article. Thanks for sharing, Shirley.
by Carolyn Martin G2G6 Pilot (283k points)
+6 votes
Fascinating article, as the Rushings in America descend from my ancestors who left the English shores in the 1600s. I have submitted details of several thousand of them to WikiTree.

I have just started submitting details of slaves who took the name Rushing; this report will send me off on another family adventure!
by Peter Rushen G2G4 (4.4k points)
+6 votes
There is a partial genealogy of the family here: https://www.angioma.org/wp-content/uploads/2020/06/CCM2GenealogyWeb.pdf

see pages 13-15.

The son appears to be here in WT twice, Rushing-219 and Rushing-139, although the pdf above gives his name as Matthew Malachi Jr. rather than John Malachi.
by Richard Rosenberger G2G6 Mach 3 (35.0k points)
Thanks so much, Richard.  Looks like there are now some good clues to start sorting out some of the next generation up.
+4 votes


An interesting article, and nice to see that genealogy can be used for something like this.  Thanks for sharing.

One has to wonder, however, how much research was actually done for the Mathew Malachi Rushing and the Sarah Mae Harrell named in the article.  Did the researcher simply rely on one of the over 1400 (poorly documented) trees on Ancestry, or go beyond that?

There is at least one other Mathew Malachi Rushing on WikiTree who has in the past been identified as the husband of Sarah Harrell (he's not) and whose family seems to be the one most often copied, with variations, in the Ancestry trees.

by Gayel Knott G2G6 Mach 3 (33.9k points)
+6 votes

You can find the URL for the preprint research paper in the link Shirley provided, but just so that you have it handy here is the PDF at ResearchSquare.

Remembering that the 62 SNPs examined and identified in the research are only indicative if considered collectively--meaning that we can't make any assumptions whatsoever about only a few of them--I thought it might be interesting to see which of our common vendor tests/versions included any of the SNPs.

These are all in, by genealogical standards, the very small segment of 104,547 base pairs that starts at position 45,039,345 on Chromosome 7 in the Build 37 (GRCh37) reference genome map assembly that all our testing companies use. This segment comprises the components of the CCM2 scaffold protein gene. In terms of the centiMorgans we use, this segment calculates to 0.04cM for the male genome, 0.24cM for the female genome, and a sex-averaged value of 0.13cM...so not one we would ever see by itself in a matching report.

Of the 62 SNPs, we can find seven of them in popular microarray tests we've taken. The position shown is the one in GRCh37, not the GRCh38 assembly that our genealogical testing companies still don't use and that the paper referenced. The allele is the one cited in the Gallione, et al. research paper, "Genetic Genealogy Uncovers a Founder Deletion Mutation in the Cerebral Cavernous Malformations 2 Gene."

SNP rsID     Chr7 Position  Allele  Company & Test Version
rs10227754   45055410       G       23andMe v3, Ancestry v1, FTDNA v2, MyHeritage v1
rs4720490    45057134       A       23andMe v3 and v4
rs6967748    45139163       T       23andMe v3 and v4, Ancestry v1 and v2, FTDNA v2, Living DNA v2, MyHeritage v1
rs7792895    45141785       T       23andMe v3 and v4, Ancestry v1, FTDNA v2, MyHeritage v1
rs2119056    45143108       T       23andMe v5, FTDNA v3, Living DNA v1
rs3801407    45143503       G       23andMe v3 and v4
rs6973982    45143892       A       23andMe v3, Ancestry v1, FTDNA v2, MyHeritage v1
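
For anyone who wants to tally the table by test version, here is a minimal Python sketch. The data are transcribed from the table above; the dictionary layout and the function name `snps_per_test` are my own choices for illustration and do not come from the paper.

```python
from collections import defaultdict

# rsID -> (GRCh37 position on chr7, cited allele, tests that include the SNP),
# transcribed from the table above.
CCM2_SNPS = {
    "rs10227754": (45055410, "G", ["23andMe v3", "Ancestry v1", "FTDNA v2", "MyHeritage v1"]),
    "rs4720490": (45057134, "A", ["23andMe v3", "23andMe v4"]),
    "rs6967748": (45139163, "T", ["23andMe v3", "23andMe v4", "Ancestry v1", "Ancestry v2",
                                  "FTDNA v2", "Living DNA v2", "MyHeritage v1"]),
    "rs7792895": (45141785, "T", ["23andMe v3", "23andMe v4", "Ancestry v1", "FTDNA v2",
                                  "MyHeritage v1"]),
    "rs2119056": (45143108, "T", ["23andMe v5", "FTDNA v3", "Living DNA v1"]),
    "rs3801407": (45143503, "G", ["23andMe v3", "23andMe v4"]),
    "rs6973982": (45143892, "A", ["23andMe v3", "Ancestry v1", "FTDNA v2", "MyHeritage v1"]),
}


def snps_per_test(table):
    """Invert the table: test version -> sorted list of rsIDs it includes."""
    by_test = defaultdict(list)
    for rsid, (_pos, _allele, tests) in table.items():
        for test in tests:
            by_test[test].append(rsid)
    return {test: sorted(ids) for test, ids in by_test.items()}


coverage = snps_per_test(CCM2_SNPS)
# List the test versions from most to fewest of the seven SNPs covered.
for test, ids in sorted(coverage.items(), key=lambda kv: -len(kv[1])):
    print(f"{test}: {len(ids)} of {len(CCM2_SNPS)}")
```

Keep in mind that this only counts overlap with seven of the 62 SNPs, and those are indicative only when considered collectively, so a tally like this says nothing about whether anyone carries the mutation.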

by Edison Williams G2G6 Pilot (441k points)
Interesting.  Thanks.
So 23andMe version 3 tested all but one of them? Was it a more comprehensive test?

Thanks, Gayel and Lucas. I think it would be extremely difficult to describe any one of the common microarray tests we do for genealogy as more comprehensive than any other.

I believe a lot of people feel that the approximately 650,000 SNPs examined by the microarrays represent a highly selective set of markers known as being the most important ones to ancestral genetics. But that's not really the case...in large part because there is no consensus set of markers that are globally applicable for that purpose. When commercial testing for genealogy first began, insider knowledge said that the primary objective was to get coverage as broad as possible across the genome, not based on research about how important any given SNP is for genealogy. As such, in earlier tests--in the portion of the genome that can be tested by microarray (about 8% of it cannot)--we had an average coverage of about 1 tested SNP out of every 4,700 or so.

You can see a breakdown of what SNPs are examined in the default Illumina Global Screening Array v3 test on page two of this PDF. For population and ancestral genetics what we're primarily interested in are markers in the portion of the genome that we know does not contain protein coding genes or actively support amino acid or RNA coding or transcription. In some instances, markers within genes are valuable for genealogy; these are generally the Mendelian alleles that indicate straightforward things like hair color, blood type, eye color, etc. But most of the protein coding part of our genome, the exome, has limited room to mutate without causing deleterious medical conditions or impacting reproduction and survival. Also, two functions during meiosis, gene linkage and linkage disequilibrium, work to prevent crossover--the action of chromosome breakage and recombination, creating the segments we use for genealogy--from ever occurring inside a gene or the flanking nucleotides next to it. Genes are kept intact during meiosis unless a serious structural abnormality happens.

In 2000 a burning question among geneticists was how many genes we actually have. There was a contest called GeneSweep that offered a US$3,000 prize for the closest estimate. Over 1,000 entries came in with guesses ranging from just under 26,000 to just over 312,000. We still don't know the exact count, but the number has generally been shrinking rather than growing the more we learn. The current approximate consensus is about 20,500 to 21,000.

Whole exome sequencing typically looks at all the known genes as well as their flanking regions. Still, that represents only about 1.5% of the genome, or around 45 or 46 million base pairs in our 3.06 billion base pair genome. You can see that of the 654,027 SNPs targeted by the GSA v3 microarray test, fully 18.2% of them are in the exome and examined expressly for clinical/medical purposes.
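
The arithmetic behind those figures can be checked with a few lines of Python. This is only a rough sketch using the rounded numbers quoted above; the variable names are mine.

```python
GENOME_BP = 3_060_000_000   # genome size quoted above
EXOME_FRACTION = 0.015      # "about 1.5%" quoted above
GSA_V3_SNPS = 654_027       # SNPs targeted by the GSA v3 microarray
GSA_EXOME_SHARE = 0.182     # 18.2% of those fall in the exome

# Approximate size of the exome in base pairs.
exome_bp = GENOME_BP * EXOME_FRACTION
# Approximate number of GSA v3 SNPs that sit in the exome.
gsa_exome_snps = round(GSA_V3_SNPS * GSA_EXOME_SHARE)

print(f"exome size: about {exome_bp / 1e6:.1f} million bp")
print(f"GSA v3 SNPs in the exome: about {gsa_exome_snps:,}")
print(f"more than one-sixth of the array? {GSA_EXOME_SHARE > 1 / 6}")
```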

Quite literally, over one-sixth of the markers our common DNA tests look at have no practical bearing on genealogy at all. We should keep that in mind when analyzing DNA matches, especially on smaller segments of less than around 15cM. Segments can be checked for gene numbers, density, and positions at the NCBI Genome Data Viewer (note that the tried and true 1000 Genomes Browser is being retired as of April, so we need to shift to the new implementation). For closer analysis, there are a variety of sources by which to obtain templates of different DNA test versions from the major companies, and if the company/version of the tests is known (and it should be identified in any comparison; it can make a big difference) a small segment can then be evaluated on a SNP-by-SNP basis to see which were actually examined by the tests. I personally do this for any segment less than 20cM, and even on larger segments depending on their positions on the chromosome. It's a step I seldom see performed, but I believe any matching or triangulation involving smaller segments can be called into question unless this, among other analyses, is performed.
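
As a rough illustration of that SNP-by-SNP check, here is a minimal Python sketch. It assumes you already have each kit's test-version template as a plain list of base-pair positions on the chromosome in question. The function name `shared_snps_in_segment` and every position below are made up for the example; real templates hold hundreds of thousands of positions.

```python
def shared_snps_in_segment(template_a, template_b, start, end):
    """Return the positions tested by BOTH kits that fall within [start, end].

    template_a / template_b: iterables of base-pair positions on one
    chromosome, taken from each kit's test-version template.
    """
    common = set(template_a) & set(template_b)
    return sorted(pos for pos in common if start <= pos <= end)


# Made-up positions, purely for illustration.
kit_a = [1_000_500, 1_004_200, 1_009_900, 1_015_300, 1_020_100]
kit_b = [1_004_200, 1_007_700, 1_015_300, 1_030_000]

shared = shared_snps_in_segment(kit_a, kit_b, start=1_000_000, end=1_020_000)
print(f"{len(shared)} SNPs tested by both kits in the segment: {shared}")
```

The point of the intersection is that a reported match can only rest on positions both test versions actually examined; a small segment with few shared positions deserves extra skepticism.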

Before I forget, the term "SNP" is not synonymous with "marker" or "base pair." To qualify as a single nucleotide polymorphism the variant has to be found in the global population. I don't think there is a true, prescriptive definition of how frequent the variant must be in the population, but we see 1% often used as a benchmark.

All this talk of coding genes and the size of the genome also brings to mind what you'll see if you Google "how identical is human DNA." The proclamation of 99.9% will blare out at you. But that's misleading (and maybe I need a blog post about this, come to think of it). In those strident references, the 99.9% refers to the exome, not the rest of the genome. Heck, until last May we had never even sequenced the entire genome, and none of today's commercial tests--whether microarray or whole genome sequencing--can look at about 7% to 8% of the human genome. If we were truly 99.9% identical then there would only be about 46 million base pairs that could possibly distinguish us from one another. But the Big Y test at FTDNA looks at 23.6 million base pairs that are genealogically relevant just on that chromosome alone. And currently at the NCBI's dbSNP database there are a total of almost 1.072 billion SNPs or SNVs (single nucleotide variants) cataloged and on file. About 98.5% of our genomes are not involved in protein coding or RNA transcription, and we are not identical on all of it.

Looping back around, the 23andMe version 3 test was performed on a customized Illumina OmniExpress chip. If you remember my earlier comment about an average distribution of roughly 1 SNP in 4,700 tested, the default version of this test--per Illumina's spec sheet--was advertised as having a 4,080 mean and 2,220 median test spacing. That segment for the CCM2 gene is 104,547 base pairs long and that version of the 23andMe test looked at 38 SNPs in that span, or 1 in every 2,751. The area was more interesting to them than the default chip's 4,080 mean, but about the same as the 2,220 median. Six of those tested SNPs happened to have been included in the Gallione, et al. research. But none of the 38 SNPs would be of much use for genealogy, so from that perspective the 23andMe v3 test was no more or less comprehensive than any of the others.
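
The spacing comparison is simple division; a short Python sketch using only the numbers quoted above (variable names are mine):

```python
SEGMENT_BP = 104_547   # length of the CCM2 segment quoted above
SNPS_TESTED = 38       # SNPs the 23andMe v3 test examined in that span
CHIP_MEAN = 4_080      # spacing figures from Illumina's spec sheet, as quoted
CHIP_MEDIAN = 2_220

# Average number of base pairs per tested SNP within the segment.
spacing = SEGMENT_BP / SNPS_TESTED

print(f"1 SNP per {spacing:,.0f} bp in the CCM2 segment")
print(f"ratio to chip mean spacing:   {spacing / CHIP_MEAN:.2f}")
print(f"ratio to chip median spacing: {spacing / CHIP_MEDIAN:.2f}")
```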

Thank you. A lot to read but I enjoy your posts.
