Thanks, Gayel and Lucas. I think it would be extremely difficult to describe any one of the common microarray tests we do for genealogy as more comprehensive than any other.
I believe a lot of people feel that the approximately 650,000 SNPs examined by the microarrays represent a highly selective set of markers known to be the most important ones for ancestral genetics. But that's not really the case...in large part because there is no consensus set of markers that is globally applicable for that purpose. When commercial testing for genealogy first began, insider knowledge said that the primary objective was to get coverage as broad as possible across the genome, not to pick SNPs based on research about how important any given one is for genealogy. As such, in earlier tests--in the portion of the genome that can be tested by microarray (about 8% of it cannot)--we had an average coverage of about 1 tested SNP for every 4,700 base pairs or so.
You can see a breakdown of what SNPs are examined in the default Illumina Global Screening Array v3 test on page two of this PDF. For population and ancestral genetics what we're primarily interested in are markers in the portion of the genome that we know does not contain protein coding genes or actively support amino acid or RNA coding or transcription. In some instances, markers within genes are valuable for genealogy; these are generally the Mendelian alleles that indicate straightforward things like hair color, blood type, eye color, etc. But most of the protein coding part of our genome, the exome, has limited room to mutate without causing deleterious medical conditions or impacting reproduction and survival. Also, two functions during meiosis, gene linkage and linkage disequilibrium, work to prevent crossover--the action of chromosome breakage and recombination, creating the segments we use for genealogy--from ever occurring inside a gene or the flanking nucleotides next to it. Genes are kept intact during meiosis unless a serious structural abnormality happens.
In 2000 a burning question among geneticists was how many genes did we actually have? There was a contest called GeneSweep that offered a US$3,000 prize for the closest estimate. Over 1,000 entries came in with guesses ranging from just under 26,000 to just over 312,000. We still don't know the exact count, but the number has generally been shrinking rather than growing the more we learn. The current approximate consensus is about 20,500 to 21,000.
Whole exome sequencing typically looks at all the known genes as well as their flanking regions. Still, that represents only about 1.5% of the genome, or around 45 or 46 million base pairs in our 3.06 billion base pair genome. You can see that of the 654,027 SNPs targeted by the GSA v3 microarray test, fully 18.2% of them are in the exome and examined expressly for clinical/medical purposes.
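To make the proportions concrete, here is a quick back-of-the-envelope check in Python. It uses only the figures quoted above (3.06 billion bp, about 1.5%, 654,027 SNPs, 18.2%). The "enrichment" ratio at the end is my own derived number, not something from Illumina's documentation, so treat it as a rough illustration.

```python
# Figures quoted in the post; all are approximate.
GENOME_BP = 3_060_000_000   # total genome size in base pairs
EXOME_FRACTION = 0.015      # exome is "about 1.5%" of the genome
GSA_V3_SNPS = 654_027       # SNPs targeted by the GSA v3 microarray
EXOME_SNP_SHARE = 0.182     # share of those SNPs that sit in the exome

exome_bp = GENOME_BP * EXOME_FRACTION
exome_snps = round(GSA_V3_SNPS * EXOME_SNP_SHARE)
other_snps = GSA_V3_SNPS - exome_snps

# Derived (my assumption of a useful summary): how over-represented the
# exome is on the chip relative to its share of the genome.
enrichment = EXOME_SNP_SHARE / EXOME_FRACTION

print(f"Exome size: about {exome_bp / 1e6:.1f} million bp")
print(f"SNPs in the exome: about {exome_snps:,}")
print(f"SNPs elsewhere: about {other_snps:,}")
print(f"Exome over-representation on the chip: about {enrichment:.0f}x")
```

In other words, roughly 119,000 of the tested SNPs fall in a region that makes up only about 46 million base pairs, so the exome is sampled far more densely than the rest of the genome.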
Quite literally, over one-sixth of the markers our common DNA tests look at have no practical bearing on genealogy at all. We should keep that in mind when analyzing DNA matches, especially on smaller segments of less than around 15cM. Segments can be checked for gene numbers, density, and positions at the NCBI Genome Data Viewer (note that the tried and true 1000 Genomes Browser is being retired as of April, so we need to shift to the new implementation). For closer analysis, there are a variety of sources from which to obtain templates of different DNA test versions from the major companies, and if the company and version of each test are known (and they should be identified in any comparison; it can make a big difference), a small segment can then be evaluated on a SNP-by-SNP basis to see which SNPs were actually examined by the tests. I personally do this for any segment less than 20cM, and even for larger segments depending on their positions on the chromosome. It's a step I seldom see performed, but I believe any matching or triangulation involving smaller segments can be called into question unless this, among other analyses, is performed.
Before I forget, the term "SNP" is not synonymous with "marker" or "base pair." To qualify as a single nucleotide polymorphism the variant has to be found in the global population. I don't think there is a true, prescriptive definition of how frequent the variant must be in the population, but we see 1% often used as a benchmark.
All this talk of coding genes and the size of the genome also brings to mind what you'll see if you Google "how identical is human DNA." The proclamation of 99.9% will blare out at you. But that's misleading (and maybe I need a blog post about this, come to think of it). In those strident references, the 99.9% refers to the exome, not the rest of the genome. Heck, until last May we had never even sequenced the entire genome, and none of today's commercial tests--whether microarray or whole genome sequencing--can look at about 7% to 8% of the human genome. If we were truly 99.9% identical across the whole 3.06 billion base pair genome, then there would only be about 3 million base pairs that could possibly distinguish us from one another. But the Big Y test at FTDNA looks at 23.6 million base pairs that are genealogically relevant just on that chromosome alone. And currently at the NCBI's dbSNP database there are a total of almost 1.072 billion SNPs or SNVs (single nucleotide variants) cataloged and on file. About 98.5% of our genomes are not involved in protein coding or RNA transcription, and we are not identical on all of it.
Looping back around, the 23andMe version 3 test was performed on a customized Illumina OmniExpress chip. If you remember my earlier comment about an average distribution of roughly 1 SNP in 4,700 tested, the default version of this chip--per Illumina's spec sheet--was advertised as having a mean spacing of 4,080 base pairs between tested SNPs and a median spacing of 2,220. That segment for the CCM2 gene is 104,547 base pairs long and that version of the 23andMe test looked at 38 SNPs in that span, or 1 in every 2,751 base pairs. So the area was covered more densely than the default chip's 4,080 mean would suggest, but at about the same density as the 2,220 median. Six of those tested SNPs happened to have been included in the Gallione et al. research. But none of the 38 SNPs would be of much use for genealogy, so from that perspective the 23andMe v3 test was no more or less comprehensive than any of the others.
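The spacing comparison is simple division, and it can be reproduced in a few lines of Python. This uses only the numbers quoted above; the variable names are mine.

```python
# Figures quoted in the post.
CCM2_SEGMENT_BP = 104_547    # length of the CCM2 segment in base pairs
SNPS_TESTED = 38             # SNPs the 23andMe v3 test examined in that span
CHIP_MEAN_SPACING = 4_080    # default chip mean spacing per Illumina's spec sheet
CHIP_MEDIAN_SPACING = 2_220  # default chip median spacing per the same sheet

# Average number of base pairs per tested SNP within the segment.
segment_spacing = CCM2_SEGMENT_BP / SNPS_TESTED

print(f"CCM2 segment: 1 SNP per {segment_spacing:,.0f} bp")
print(f"Denser than the chip mean: {segment_spacing < CHIP_MEAN_SPACING}")
print(f"Ratio to chip median: {segment_spacing / CHIP_MEDIAN_SPACING:.2f}")
```

A smaller spacing number means denser coverage, so the segment sits between the chip's median and mean figures.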