What do you recommend for the mathematics of genetic genealogy?

I have been reviewing what people are doing with their DNA results and their GEDCOMs. I have yet to try out the Genetic Genealogy Kit and affiliated tools, but I have tried out GRAMPS, RootsMagic, Ancestral Quest, Genome Mate Pro, GEDmatch, and a few of Felix Immanuel's genetic genealogy tools.

I've been going through my library of mathematics and genetics textbooks, including Schaum's Outline of Genetics (4th ed.), Snustad and Simmons' Principles of Genetics (4th ed.), and my references on vector analysis, linear algebra, and the Python programming language. I haven't been able to find what I'm looking for.

The basics of it are that a person can be represented in a genetic genealogy by their DNA plus annotations like name, date, location, and events. In simplified formal notation, their DNA can be written as a physical measurement of an experimental subject. In physics, forces and force interactions or waveform interference can be written in terms of vectors. Physical measurements of physical systems can generally be written as vectors, so your DNA should be representable as a vector.

I want to do it this way because I want to be able to decompose the vectors into subvectors representing the contributions of genetics from other family members. This way my DNA can be effectively factored recursively into maternal vs paternal, maternal grandfather vs maternal grandmother, paternal grandfather vs paternal grandmother, and so on. You could then compare the factored or phased DNA to matches shared with other family members and determine immediately where in your family tree they must be. Likewise, you could use the vector representation of other people's DNA in order to automatically generate genealogies and check for intersections.
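To make the idea concrete, here is a minimal sketch of one level of that decomposition. It assumes each person's DNA is written as a vector of genotype counts (0, 1, or 2 copies of a reference allele per SNP), which is my own choice of encoding and not something taken from any of the tools above. The function name `phase_with_mother` is hypothetical. Given a child and the mother only, it splits the child's vector into a maternal and a paternal allele vector wherever simple Mendelian logic allows, and leaves the rest unresolved.

```python
from typing import List, Optional, Tuple

Alleles = List[Optional[int]]


def phase_with_mother(child: List[int], mother: List[int]) -> Tuple[Alleles, Alleles]:
    """Split a child's genotype vector into maternal and paternal allele vectors.

    Each genotype is the count (0, 1 or 2) of a reference allele at one SNP.
    An entry is None where the site cannot be resolved from these two people.
    """
    maternal: Alleles = []
    paternal: Alleles = []
    for c, m in zip(child, mother):
        if c in (0, 2) and abs(c - m) < 2:
            # Homozygous child: both parents passed on the same allele.
            maternal.append(c // 2)
            paternal.append(c // 2)
        elif c == 1 and m in (0, 2):
            # Heterozygous child, homozygous mother: her contribution is known,
            # so the other allele must have come from the father.
            maternal.append(m // 2)
            paternal.append(1 - m // 2)
        else:
            # Both heterozygous, or a Mendelian inconsistency (likely a test error).
            maternal.append(None)
            paternal.append(None)
    return maternal, paternal


child = [0, 1, 1, 2, 1]
mother = [0, 0, 2, 1, 1]
maternal, paternal = phase_with_mother(child, mother)
print(maternal)  # [0, 0, 1, 1, None]
print(paternal)  # [0, 1, 0, 1, None]
```

Wherever both entries are resolved, the maternal and paternal vectors add up to the child's genotype, which is the "factoring" described above. Repeating the same step with a grandparent's data would split each half again.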

So what references do you all recommend for doing mathematical or quantitative genetic genealogies?

Update: For common reference.

COOP Lab at UC Davis:

asked May 27, 2016 in The Tree House by Ian Mclean G2G6 Mach 1 (13.6k points)
edited May 31, 2016 by Ian Mclean

Let's suppose you have lots of cousins and get them all tested. One in 8 will give you a Y match, your father's brothers' sons.

Half will give you an X match. They're on your mother's side.

Those cousins will also give you loads of autosomal matches. For simplicity we'll suppose that all the matches are through your mother.

So, looking at segment Blah1 to Blah2 on your Q chromosome-pair, if you have a match with maternal cousin Fred, you know one chromosome of the pair came from your mother. But you knew that anyway.

You also now know that any other match on the same chromosome comes through your mother.

But you don't know which other matches are on the same chromosome. The testing people say chromosome when they mean chromosome-pair because they can't separate the pair. This is why they have to do fuzzy matching and triangulation.

Every segment of every chromosome-pair except XY will match cousins on both sides if you have enough cousins. Which matches happen to exist in your sample isn't information, it's just a sampling artefact. But those matches won't yield any new information about who is on which side.

commented Jun 4, 2016 by Living Horace G2G6 Pilot (633k points)

RJ, This is a good question.

You begin with a false premise, based on what you have been told on this site.

"Every segment of every chromosome-pair except XY will match cousins on both sides if you have enough cousins"

The opposite is most likely true, especially for IBD, which can be traced to a unique common ancestor.

The reason for this is that you may have a segment which matches a maternal cousin Fred, but on the paternal strand, part of the strand could be via the paternal grandfather, and part of the strand via the paternal grandmother. Testing siblings, parents, aunts, uncles, and cousins will identify those that have such a condition.

In these cases, the probability of the same segment matching more than one side is near 0%.

I would also like to correct you on your statement...

"This is why they have to do fuzzy matching and triangulation."

1st, DNA services, as far as I know do not include triangulation in their matching algorithm, they provide reports that allow you to create your own TG's.

2nd, I know that Wikitree likes to characterize the matching as "Fuzzy Matching", but that is not how the term has been used, at least in the past, outside of Wikitree. The logic which determines the endpoints has been described as using fuzzy logic, which is why different DNA services may report different end points. The matching algorithms use what may be better characterized as an "educated guess" or "prediction".

A 7cM segment may actually be a 5cM or 6cM segment. This is why AncestryDNA encourages parents and children to test. Phasing the DNA data works to eliminate the fuzziness, makes the predictions more accurate, and extends the distance of the predictions.

commented Jun 4, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

A) "1st, DNA services, as far as I know do not include triangulation in their matching algorithm, they provide reports that allow you to create your own TG's."

?!?!? It's easier to tell what you refer to... I think this is a never-ending discussion...

1) FTDNA just does segment matching based on size and total in common, and doesn't display results lower than a threshold

2) AncestryDNA has DNA Circles that are secret, but we can guess they use the family trees available... ==> they have triangulation somehow...?!?!?

3) 23andMe ?!?!?

4) ?!?!?

B) A 7cM segment may actually be a 5cM or 6cM segment ?!?!?

Do you mean that something that looks like IBD is actually IBS? That sounds less likely, or do we have numbers on that?

commented Jun 4, 2016 by Living Sälgö G2G6 Pilot (297k points)

a1) "FTDNA just do segment matching based on size and total in common." Yes, and this is not triangulation.

a2) "Ancestry DNA has circles that are secret but we can guess they use the family tree available,.... ==> they have triangulation somehow"

Here is the Help on AncestryDNA Circles

"DNA Circles show you which members share DNA with one another in the genome, but not where in the genome they share that DNA. This is because our studies of genetic inheritance and DNA Circles have shown us that individuals in DNA Circles very rarely share the same matching segments"

I have been told that on Wikitree, the term triangulation means the matches are part of a Triangulated Group. This clearly tells us that AncestryDNA does not use triangulation.

a3). 23andme does not use triangulation to determine what is a match, or prediction. You can run reports that will provide you the data for you to determine what is and what is not triangulated, but they do not report on what matches are triangulated.

a4) ??

a5) "B) 7cM segment may actually be a 5cM or 6cM segment ?!?!?"

FTDNA and 23andMe may report 7cM because the endpoints are "fuzzy", but when AncestryDNA takes that same raw data and phases it, the fuzziness is nearly eliminated, and the more accurate result is 5cM.

These are both IBD because they are the same segment, but because AncestryDNA will phase data when available, it gets more accurate results. This is why the AncestryDNA minimum is 5cM and the others' is 7cM.

commented Jun 4, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

RJ, "Comes to the same thing." - Not on Wikitree. Wikitree only accepts triangulation when there is a triangulation group which shares the same segment of 7cM or greater.

1. Wikitree does not accept less than 7cM. We on Wikitree can't say a person is related via one parent, based on evidence that a less than 7cM IBS segment absolutely did not come from the other parent. IMO, this is logic 101.

Outside of wikitree, I doubt many people will agree " If you have a match with an unknown person, you'll need them to match a known relative". Simple logic tells us given only 2 choices, and we eliminate one, the other must be true. If I can prove the match is not my mother, then it must be via my father.
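The elimination argument can be written down as a small check. This is only a sketch under stated assumptions: segments are hypothetical `(chromosome, start_cM, end_cM)` tuples, the 7 cM cutoff is a placeholder rather than an agreed standard, and a missing match in the mother is taken at face value even though a testing company's thresholds can hide a real one.

```python
from typing import List, Tuple

Segment = Tuple[str, float, float]  # (chromosome, start_cM, end_cM), hypothetical layout


def overlap_cm(a: Segment, b: Segment) -> float:
    """Length in cM of the region two segments have in common (0 if none)."""
    if a[0] != b[0]:
        return 0.0
    return max(0.0, min(a[2], b[2]) - max(a[1], b[1]))


def likely_side(son_seg: Segment, mother_segs: List[Segment], min_cm: float = 7.0) -> str:
    """Guess which parent a son's matching segment came through.

    son_seg is the segment the son shares with a cousin; mother_segs are the
    segments the mother shares with that same cousin.
    """
    if son_seg[2] - son_seg[1] < min_cm:
        return "undetermined"  # too short to rule out a coincidental match
    if any(overlap_cm(son_seg, m) > 0 for m in mother_segs):
        return "maternal or both"
    return "probably paternal"


print(likely_side(("5", 40.0, 50.0), [("5", 80.0, 95.0)]))  # probably paternal
print(likely_side(("5", 40.0, 50.0), [("5", 45.0, 60.0)]))  # maternal or both
print(likely_side(("5", 40.0, 44.0), []))                   # undetermined
```

Note that no third person appears anywhere in the function: the answer comes from the son, the mother, and the match alone.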

I would like to correct you on the following.

"If you have a match with an unknown person, you'll need them to match a known relative to be able to find out which side of the tree they're on. Then the answer is immediate and doesn't need any further analysis."

Although I agree with this statement, it is not within the Wikitree guidelines or the comments made. You cannot just match. If this were the case, then you would not have to look at segments. Even though you might have a completely documented connection to a cousin and you match that cousin, you have to find a third cousin who shares a triangulated segment. Why? It adds nothing when deciding which side of the family a cousin is related on.

commented Jun 4, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)
edited Jun 4, 2016 by Ken Sargent

Magnus,

If I match an unknown cousin on a 10cM segment, and my mother does not match on any part of that segment, the probability is that this segment came via my father. There are no Triangulated Groups involved. Just to be clear, you disagree because it seems you are still supporting the claim

"If you want to know which side of your tree somebody is on, you need a triangulated 3-way match with another cousin."

If we agreed that this was a smaller IBS 6cM segment, and the mother did not share this 6cM segment, are you still supporting the claim…

Do you still disagree that matches on segments which are IBS are probably related to the son via his father?
"Every segment of every chromosome-pair except XY will match cousins on both sides if you have enough cousins." Even though segments on one strand came from the same paternal grandparent, but the other strand was split between the maternal grandfather and maternal grandmother.

commented Jun 4, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

23andMe has recently added a triangulation groups system for open profiles. The system shows both In Common With indirect matching (Ancestry.com style DNA circles) and lists what ICW matches also form triangulation groups.

--------------------------------------

I figure that we should explicitly state something that is basically important: If you share a segment of DNA with someone then they are probably related to you; segments over 7 cM are more likely to be positive matches than to be false matches, and segments over 10 cM are almost certainly not false positives (ISOGGWiki).

You don't need triangulation groups to establish that you are related to someone by DNA comparison in some way. Sharing more than 15 cM can be considered to be almost certainly a direct genetic relationship.
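As a rough illustration, the three cutoffs mentioned above can be turned into a lookup. The function name and the wording of the labels are mine; the numbers are just the 7, 10, and 15 cM figures from this comment, not a standard.

```python
def segment_confidence(cm: float) -> str:
    """Label a single shared segment by length, using the cutoffs quoted above."""
    if cm > 15:
        return "almost certainly a direct genetic relationship"
    if cm > 10:
        return "almost certainly not a false positive"
    if cm > 7:
        return "more likely a true match than a false one"
    return "not reliable on its own"


for length in (5.0, 8.5, 12.0, 22.0):
    print(length, "cM:", segment_confidence(length))
```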

Triangulation groups are necessary for establishing probable most recent common ancestors; I need my two siblings' DNA and my own to indirectly establish my mother as our common ancestor, or I need one sibling, my mother, and my own DNA to directly establish that my mother is our common ancestor by a triangulation group. I need one sibling or my mother, one of either my maternal aunt or my first cousins by my maternal aunt, and my DNA to indirectly establish either of my grandparents as a common ancestor. This logic extends up through all genetic ancestors but not necessarily for every genealogical ancestor (see the UC Davis genetic genealogy blog in the OP for details).
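For anyone who wants the three-way test spelled out, here is a minimal sketch. It assumes each pairwise match is one hypothetical `(chromosome, start_cM, end_cM)` segment and that a group counts as triangulated when all three pairs overlap on the same stretch for at least 7 cM. The cutoff is an assumption, and real data would also need the phase of each segment to be consistent, which this does not check.

```python
from typing import List, Optional, Tuple

Segment = Tuple[str, float, float]  # (chromosome, start_cM, end_cM), hypothetical layout


def common_region(segs: List[Segment]) -> Optional[Segment]:
    """Region shared by every segment in the list, or None if there is none."""
    chrom = segs[0][0]
    if any(s[0] != chrom for s in segs):
        return None
    start = max(s[1] for s in segs)
    end = min(s[2] for s in segs)
    return (chrom, start, end) if end > start else None


def triangulated(ab: Segment, ac: Segment, bc: Segment, min_cm: float = 7.0) -> bool:
    """True if the A-B, A-C and B-C matches all cover one region of min_cm or more."""
    region = common_region([ab, ac, bc])
    return region is not None and region[2] - region[1] >= min_cm


me_sibling = ("7", 10.0, 30.0)
me_mother = ("7", 15.0, 35.0)
sibling_mother = ("7", 12.0, 28.0)
print(common_region([me_sibling, me_mother, sibling_mother]))  # ('7', 15.0, 28.0)
print(triangulated(me_sibling, me_mother, sibling_mother))     # True
```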

commented Jun 5, 2016 by Ian Mclean G2G6 Mach 1 (13.6k points)

4 Answers

I have a couple of genetic matches to people who were adopted, and I would really like to figure out what branches of my family they relate to. Which is why I've been trying so hard to figure out the mathematical model for this search and sort method.

I had thought about a fan chart; I think that is the best way to represent whole genome sequences or to keep track of exome results like what has been derived by labs like 23andMe and FamilyTreeDNA. The structure I would like to find is the one which shows what fragments of DNA I got from who.

Think of it like this. At me, the structure would ideally have 100% of my DNA exactly as it is. At my parents, they'd each have roughly 50% of my DNA representing the portions they passed on to me; this would be basically my phased DNA showing exactly what I got from my father and exactly what I got from my mother.
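A quick sketch of the expected shares, generation by generation. The halving rule is exact for parents (autosomal DNA) and only an average further back, since recombination makes the actual grandparent shares vary around 25%; the function name is mine.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of my autosomal DNA inherited from one ancestor."""
    return 0.5 ** generations_back


for g, label in enumerate(["me", "parent", "grandparent", "great-grandparent"]):
    print(f"{label}: {expected_share(g):.1%}")
# me: 100.0%
# parent: 50.0%
# grandparent: 25.0%
# great-grandparent: 12.5%
```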

Normally in figuring out what my mother and father are going to pass on to a child it is a matter of some randomness and probability, but in the case where we're examining me, my DNA, and my parents and their DNA there isn't strictly a probabilistic relationship to be concerned about; we should be able to use strict differences to deduce what actually happened as compared to what could have happened from the actual measurements.

In practice for figuring out where distant cousins go in the family tree, I do think it would be a probability or at least a degrees of truth problem written in statistical or fuzzy logic.

A bonus to making the kind of map that I am thinking about is that we'd eventually see what DNA survived from my ancestors to me and see what is missing from the puzzle. With enough people represented in this same kind of structure, we could start to see where the pieces fit together, so we could reconstruct the whole genome sequences of common ancestors that we don't necessarily know. To me that would be useful for determining where I fit in the global family graph, and I imagine it would be similarly useful to other people looking for how they fit in.

commented May 29, 2016 by Ian Mclean G2G6 Mach 1 (13.6k points)

Ken, the first problem I have to solve is which side of my family they are related to me from. For my maternal grandmother's side of the family, I know with relative certainty that one of the adoptees is not related to me by my great grandparents or lower; I know all my great aunts and uncles and all their children and all the great grandchildren. There's a distinct possibility that they are related through my maternal grandfather's side of the family, but I have put figuring that out specifically on hold until I can rule out the more difficult case: they are related to me through my father's side of the family.

I know very little about my father's side of the family relatively speaking. I barely have my relatives documented out to my paternal grandparents. One of the adoptees shares X chromosome DNA with me, so I can generally assume she is a relative on my mother's side, but the main adoptee that I want to help shares only autosomal DNA with me, so it is ambiguous as to where they are in my family.

In order to figure out their parents, I need to figure out which side of the family I need to look on. From there I need to figure out our most recent common ancestor. From the most recent common ancestor, I can then trace down the line to the adoptee and at least one of their parents; the adoptee has already found their parent of record at least under a pseudonym, and from what is known of their father, I am related to them through their mother.

I'll keep the Lazarus kits in mind for this, but the Lazarus kits depend on having solved more basic problems.

commented May 30, 2016 by Ian Mclean G2G6 Mach 1 (13.6k points)

I like your model, Ian, and of course it should work. Magnus and Ken are smart cookies and they've pointed out some issues. Nevertheless the data should tell the story. And that for me is the rub: the data aren't necessarily there yet. Despite the seeming precision of these tests, we don't know the error rate or variation in results due to testing procedures, lower-level data sorts, different tolerances, thresholds, or magnitudes for categorizing the lower-level data, etc. None of the data from these tests are ready for the precision of the 'exacto knife' of a model you have in mind presently. Even the underlying proteins themselves appear to behave in unpredictable ways, so while I am hopeful better models will be developed for predictive as well as historical reasons, I'm not sure we aren't stuck in the Sherlock Holmes era for a bit longer. I would think Mathematica might do some interesting things with the data, but you are likely to have to rely on statistics and categorical analysis for the state of the art.

commented May 31, 2016 by Leake Little G2G6 Mach 1 (16.3k points)

I get the issues with the available data. But the precision isn't so much the issue anymore, and as time goes by, it is going to become less the issue. Error rates are entering into the 1% range and rapidly diminishing for individual genetic tests. Comparison between old kits and new kits or between standard kits and custom kits are the major problem at the moment.

With the cost per genome rapidly approaching 0, the issue of precision or lack of data is going to effectively go away entirely. For my personal case, I have most of my immediate family members totally on board for genetic sequencing and analysis, so I am not concerned about not having access to the minimum data necessary to pull apart my genome and figure out deductively and experimentally where I got what from whom. To me, it is simply a matter of finding and learning to use the correct tools. Or inventing them where they don't yet exist.

For me the major issue isn't the reliability of the specific genetic testing kits though. What I want is the basic mathematical model for the simplest case: the generalized family tree without pedigree collapse.

That model isn't going to depend on any of those factors, and we can actually use deviations from the simplest model as a way to infer information that wouldn't otherwise be obvious.

The mathematical model can be constructed without actually depending directly on any given test or precision. The data may not be present or up to the required precision, but we have the basic theories for vector analysis, physical measurement, computer coding, and genetics. The mathematical theory of genetic genealogy can be written before we have the data to test the theory of genetic genealogy. Data developed later can then be used to test the theory and possibly refute it or some of its assumptions.

The basic structure is actually already relatively well known: "[Identical by state data] may be useless in identifying the common ancestor but it [is useful in] an iterative process of building a decision tree [...] based on probabilities." -Ken Sargent

commented May 31, 2016 by Ian Mclean G2G6 Mach 1 (13.6k points)

Answer 1 · 2016-05-27T05:36:42+0000

The problem, as I see it, is that only a sampling of a given person's genome is tested, so that while getting a statistical measure of relatedness for a few generations is easy enough, it will only work definitively for a very few generations. Further back patterns will exist for areas of the genome, but since they are characteristic of a large number of people in a given area, it isn't possible to do what you suggest if there is a very large tested group. I've been looking a bit for a good source of technical explanations but so far I mostly see stuff for the non-technically oriented. But Wikitree is a large group and I'm pretty sure actual articles will be cited here which don't take a bunch of money to read. Meanwhile I have more to do here than I can get even started on. But I'll keep an eye on you to see if you get a handle on how to handle things.

Answer 2 · 2016-05-29T13:20:36+0000

I believe what you are saying is true but some clarification is necessary.

Using your conclusion: "This way my DNA can be effectively factored recursively into maternal vs paternal, maternal grandfather vs maternal grandmother, paternal grandfather vs paternal grandmother, and so on. You could then compare the factored or phased DNA to matches shared with other family members and determine immediately where in your family tree they must be. Likewise, you could use the vector representation of other people's DNA in order to automatically generate genealogies and check for intersections."

My two brothers also have been DNA tested, and without a tree, we can make certain conclusions about the source even though we can't specifically identify which parent or another ancestor is the source.

For example, if the two oldest brothers in my family share a segment, but the 3rd brother does not, it means that the 3rd brother received that particular segment from a different paternal grandparent and a different maternal grandparent than the other two.

If a cousin matches the 3rd brother, then you can also make some assumptions about the grandparents of the first two brothers. If we presume that the well-documented tree is correct, and by using that tree it is determined that this match is via his paternal grandfather, you can presume that other matches on that same segment to only the two oldest brothers were inherited via your paternal grandmother.

I can make this presumption because I know my parents share no segments.
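The three-brother reasoning can be checked by brute force. The sketch below assumes a single locus with no recombination inside the segment and unrelated parents, so each brother carries one copy from a paternal grandparent and one from a maternal grandparent. The labels `PGF`, `PGM`, `MGF`, `MGM` and the helper name are mine. It lists every arrangement in which the two oldest brothers share something and the third shares nothing with either.

```python
from itertools import product

# At one locus each brother carries one copy from a paternal grandparent
# (PGF or PGM) and one copy from a maternal grandparent (MGF or MGM).
STATES = list(product(("PGF", "PGM"), ("MGF", "MGM")))


def shared_sides(a, b):
    """Number of parental sides (0, 1 or 2) on which two brothers carry the same copy."""
    return sum(x == y for x, y in zip(a, b))


consistent = [
    (b1, b2, b3)
    for b1, b2, b3 in product(STATES, repeat=3)
    if shared_sides(b1, b2) > 0       # the two oldest brothers share the segment
    and shared_sides(b1, b3) == 0     # the third brother shares nothing with the first
    and shared_sides(b2, b3) == 0     # ... and nothing with the second
]

for b1, b2, b3 in consistent:
    print(b1, b2, b3)
```

Under these assumptions every surviving arrangement has the two oldest brothers identical on both sides and the third brother carrying the opposite grandparent on both sides. So once a documented cousin pins one of the third brother's copies to, say, the paternal grandfather, the older brothers' paternal copy on that segment must trace to the paternal grandmother.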

A tree and DNA are mutually dependent on each other for the answers we are asking. A tree and DNA are either consistent with each other or they are not. DNA alone cannot independently confirm or prove a particular relationship. It can only further support or refute an existing claim.

Answer 3 · 2016-06-05T09:36:08+0000

Here is my exchange with Ann Turner, aka DNACousins.

I asked for clarification on some things to make it clearer, but this morning I told her it was not needed. I understood why she answered as she did just by reading the response.

1. Conceptually, a match for a son not found in his mother can be attributed to his father. This includes both IBD and IBS, not only IBD. I have no problem limiting those matches (without triangulation) to only those that include an IBD segment, which is all I initially intended.

2. Given the answer to #1 only requires a match, it indicates that triangulation is not necessary. Since triangulation is only used to find common ancestors for those without a tree, she interpreted the question that way.

Do I really have to ask Ann to clarify her last statement by telling her that the Wikitree technical group believes that triangulation is used for something other than finding the common ancestor? Do I really have to say to her that the Wikitree technical members are not convinced by her answer to #1 because "If you want to know which side of your tree somebody is on, you need a triangulated 3-way match with another cousin"?

I tried to ask the question so not to bias her answer but I should have noted that Wikitree places a triangulation requirement on more than finding common ancestors.

From: Ann Turner

Sent: Saturday, June 04, 2016 4:32 PM

To: Kenneth Sargent

Subject: Re: I am hoping you will clear up a disagreement.

I've been wishing I could spend more time on WikiTree, but it seems like there's always something else demanding my attention.

1) Conceptually, a match for a son not found in his mother can be attributed to his father. There are a couple of "gotchas", though. The segment must be long enough that you can rule out a coincidental match. There's no consensus on how long that should be. And there is also a possibility of a false negative in the mother, e.g. at FTDNA (which requires a total of 20 cM, including small 1-3 cM pseudo-segments), AncestryDNA (with its TIMBER algorithm discounting some segments) and 23andMe (with a cap on the number of DNA Relatives). GEDmatch lets you look at everyone through the same lens.

2) There's also no consensus on whether you "need" a triangulated group. AncestryDNA uses more of a network approach. I wrote up some material about how difficult it is to assemble TGs here: http://tinyurl.com/TheTroubleWithTriangulation. But if you have the good fortune to get a triangulated group with pretty robust segment sizes, I do think it's possible to attribute it to a specific ancestral couple if it's not too many generations back. When you go back many generations, that brings up the possibility of multiple lines of descent.

Hope that helps,

Ann

On Sat, Jun 4, 2016 at 9:48 AM, Kenneth Sargent <msnkjsargent@msn.com> wrote:

Hi Ann,

I’ve been spending too much time on Wikitree, devoted almost entirely to the discussions on DNA. I suspect that Wikitree is the best source for publicly available documented trees but the discussions are not at the level as 23andme used to be. I was hoping to ask you two basic questions and get your permission to post your response. We are discussing the “mathematics of genetic genealogy”.

Your responses to these questions could significantly affect how Wikitree users think about how to use DNA in their research.

Scenario: We have the raw data for a mother and a son available to us for customization. There are matches to the son, that are not matches to his mother. More specifically for these matches, the segments are shared with the son, but none are shared with the mother. Since the data can be phased, I am presuming the process could phase the data first.

1. Is it possible, using the data available, in these cases, that a match to the son, and not the mother, is probably related to the son via the father?

2. Do you agree that “If you want to know which side of your tree somebody is on, you need a triangulated 3-way match with another cousin”? FYI – “a triangulated 3-way match” means part of a Triangulated Group. You don’t have to go further than yes or no, but feel free to comment.

Thank you

answered Jun 5, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

RJ, the scenario we have been using only involves 3 people who are not all biologically related to each other. The son and the mother are related to each other, but the DNA Cousin is only related to the son. There is no triangulation match with the son. It is a simple match which contains IBD AND possibly IBS Segments.

Given there is NO TRIANGULATION in this scenario, I maintain "you don't need a triangulated 3-way match with another cousin", which is directly contrary to your assertion. This same principle applies to the Wikitree requirement that the only approved method of confirming a father or mother requires triangulation.

I am not sure what you mean by "Most ancestors are inaccessible". We are not looking at any ancestors of the son other than the mother and father.

It seems that you believe (and wikitree) that you have to know the common ancestors in order to determine if a match is related to the father or to the mother in every case.

commented Jun 5, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

Based on your last post, I will assume some misunderstanding.

I think it important then to identify the problem with communication in this case.

1. The title implies a more technical discussion on the “mathematics of genetic genealogy”, in which a higher level of precision is assumed.

2. You stated, “If you want to know which side of your tree somebody is on, you need a triangulated 3-way match with another cousin.”

This has really been the focus of the exchange. To Ian and me, this was obviously false.

Ian contradicted this proposition by providing examples showing which side of your tree somebody is on without a triangulated 3-way match with another cousin.

3. I provided a simple scenario of the son, mother, and cousin where the son is related to the cousin via his father and repeatedly used it to show my point. I took your responses as denials. This is also without a triangulated 3-way match with another cousin.

I thought I was very specific about the scope of my statements. Please understand that I am unclear about what you believe is obvious.

Do you still believe…

“If you want to know which side of your tree somebody is on, you need a triangulated 3-way match with another cousin.”

Because if you still support this, then you can’t believe

“the son is probably related to the cousin via the son's father”, because there is no triangulated 3-way match with another cousin.

commented Jun 6, 2016 by Ken Sargent G2G6 Mach 6 (62.1k points)

The following is a pseudocode rendering of an algorithm for determining genetic relationships between anonymous or pseudonymous genetic samples.

First step: determine which side of the family a match is for your genetic genealogy via comparison to yourself and at least one parent; mitochondrial matches are going to be strictly along your matrilineal relations; X chromosome matches can be effectively treated as being strictly maternal for XY karyotypes but may be paternal for non-XY karyotypes; Y chromosome matches can be effectively treated as strictly paternal.

Second step: repeat the above for n matches to create a pool of sorted matches which have been determined to be on your father's side, your mother's side, both, or neither (you might have matches due to mutation). The choice of the size of the pool, n, needs to be based on standards for statistical significance.

Third step: find all matches that share a sex-linked segment and an autosomal segment. These are weakly patrilineal (Y chromosome), weakly matrilineal (mitochondrial), or weakly maternal (X chromosome) autosomal matches; there is a probable relationship of inheritance between the autosomal match and the sex-linked match; this is a correlative relationship but not necessarily a causal relationship. This group of matches is useful for figuring out what autosomes to target first in the search and sort.

Fourth step: analyze the matches and sort according to probable degree of relationship. Naively, order the sorts according to cM lengths, the number of shared segments, and total shared cM lengths; a more sophisticated algorithm for determining probable degree of relationship can and should be used.

Fifth step: diagram yourself at the center of a bifurcated polar coordinate system with the sorted matches plotted to intervals representing the range of probable degree of relationship over the rings radiating out from you on the appropriate side of the map. Mother's matches on one side and father's matches on the other side; I would probably exclude plotting the both or neither matches for now. The idea is to find clusters of matches that match each other and graph those clusters according to their probable degree of relationship; by graphing their probable degree of relationship to you and their probable degree of relationship to each other, you create a relative topological reference of distance and connection.

Sixth step: find all triangulations between you and your mother's matches; find all triangulations between you and your father's matches. Mark the abstract relationship of you, your mother, and your match's most recent common ancestor; at this point, the graph should begin to show a structure of relationships resembling a familiar genetic genealogy; it will likely be incomplete and will have islands of disconnected relations.

Steps beyond this really depend on what you want to accomplish. The islands can be recursively connected by performing steps 1 through 6 for each child-parent pair you can find among your matches. There's a critical threshold of matches that would result in a chain reacting algorithm that would tend towards total connectivity.
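Here is a rough Python sketch of steps one, two, and four only. Everything in it is an assumption made for illustration: the `Match` record, the idea that each parent's side is known as a set of match names, and the naive sort key (total cM, then longest segment, then segment count). It does not attempt the sex-linked filtering, the plotting, or the triangulation steps.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Match:
    name: str
    total_cm: float     # total shared cM with me
    longest_cm: float   # longest single shared segment
    n_segments: int     # number of shared segments


def pool_by_side(my_matches: List[Match], mother_ids: Set[str],
                 father_ids: Set[str]) -> Dict[str, List[Match]]:
    """Sort my matches into mother/father/both/neither pools, closest first."""
    pools: Dict[str, List[Match]] = {"mother": [], "father": [], "both": [], "neither": []}
    for m in my_matches:
        in_m, in_f = m.name in mother_ids, m.name in father_ids
        key = "both" if in_m and in_f else "mother" if in_m else "father" if in_f else "neither"
        pools[key].append(m)
    for side in pools.values():
        # Naive ordering by probable closeness: larger shared totals first.
        side.sort(key=lambda m: (m.total_cm, m.longest_cm, m.n_segments), reverse=True)
    return pools


matches = [Match("A", 120.0, 40.0, 5), Match("B", 35.0, 20.0, 2),
           Match("C", 60.0, 30.0, 3), Match("D", 9.0, 9.0, 1)]
pools = pool_by_side(matches, mother_ids={"A", "C"}, father_ids={"B"})
for side, members in pools.items():
    print(side, [m.name for m in members])
```

If only one parent has tested, the other set can be left empty and the "neither" pool then holds the candidates for the untested side, to be narrowed down by the later steps.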

A DNA cousin can know which side of the family I am on by the mirror image of the process by which I discovered what side of the family they are on.

To determine what side of my father's family tree a given match is on, more information is required. In my case, I basically do not have access to my father's DNA directly, so the best I can do is phase my DNA with my mother and my siblings to composite my father's DNA via Lazarus kits or similar.

However, I can also take all of the cousins that I am able to discern are not related to my mother and composite their DNA matches with me as well into the image of my father's DNA; I don't know what side of his tree they all are on, but I don't need to know either because I only need to know that they are not on my mother's side of the family tree. I can composite a functional image of my father between my siblings, my mother, my father's pseudonymous genetic relations, and me; the issue then is to determine his mother's or father's DNA. Obviously, his mother's DNA can't be fully reconstructed without a genetic kit from his daughter, maternal sisters, maternal aunts, or maternal uncles because I probably share no X chromosome or mitochondrial DNA in common with my father. His father can be partially reconstructed sans X chromosome and mitochondrial DNA because I share upwards of 1/4 of my autosomal DNA and almost my whole Y chromosome in common with him, but again, we would need genetic kits from my father's paternal sisters, paternal aunts, paternal uncles, or daughter. Without going through all that, I would guess that we can use my DNA and my father's partially reconstructed DNA to sort paternal DNA cousins into probable pools of paternal grandfather and grandmother matches by those who do not share DNA with me and my patrilineal uncles or paternal XY-karyotype first cousins.
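One way to see how much of that composite image exists at any point is to merge the paternal-side segments and measure what they cover. This is only a sketch: segments are hypothetical `(start_cM, end_cM)` pairs on a single chromosome, the 200 cM chromosome length is made up, and covering a region says nothing about which of my father's two copies it came from.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_cM, end_cM) on one chromosome


def merge_intervals(segments: List[Interval]) -> List[Interval]:
    """Union of possibly overlapping segments, as sorted non-overlapping intervals."""
    merged: List[Interval] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(segments: List[Interval], chromosome_length: float) -> float:
    """Fraction of the chromosome covered by at least one segment."""
    return sum(end - start for start, end in merge_intervals(segments)) / chromosome_length


paternal_segments = [(0.0, 30.0), (20.0, 50.0), (80.0, 100.0)]
print(merge_intervals(paternal_segments))          # [(0.0, 50.0), (80.0, 100.0)]
print(covered_fraction(paternal_segments, 200.0))  # 0.35
```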

Though we are now getting into why it is important to derive the mathematical genetic decomposition or "factorization" of people for comparison. The determination of which side a DNA cousin lies on of my father is answered by how the pieces fit together to form a completed puzzle and depends on the mathematical decomposition of a genome into a quantitative genetic genealogy; unlike a common puzzle where each piece has a unique fit, this puzzle can be assembled multiple ways from multiple other puzzles. The classification and categorization of which are methodically significant and mathematically possible.

commented Jun 7, 2016 by Ian Mclean G2G6 Mach 1 (13.6k points)
