A follow-on about this broader subject of ephemeral data. Blaine Bettinger posted something last night on his Genetic Genealogy Tips & Techniques Facebook group that I think is worth repeating...especially in light of the new GDPR constraints settling in all around us, as well as the--unfounded--backlash against genetic genealogy following the highly publicized capture of the Golden State Killer. I don't think Blaine will mind me quoting part of his post; the original link is: https://www.facebook.com/groups/geneticgenealogytipsandtechniques/permalink/413835562413483/.
"However, there is a lesson here. We must be PROACTIVE in our DNA efforts and remember that ANY resource/database could disappear. We as a community MUST MUST MUST find a way to stably archive the DNA evidence that we are losing at an astonishing rate. We Save the Pensions and fight against record destruction, but we are not working to find stable, long-term ways to save DNA evidence for future generations. We MUST find ways to do this for our children, grandchildren, great-grandchildren, and beyond."
(And thank you, Blaine, for the shout-out tag in that post.)
The (now) old adage that once something is posted to the internet it never goes away cuts both ways, and it is not always true. Whether data survive depends upon the type of data, how they were originally published, whether they were publicly accessible when published, as well as several other factors. In this particular case of Y-STR data, Ybase was another option that went dark first, taking thousands of instances of Y-STR data with it. Now Ysearch will do the same. These data are precious because they aren't recoverable; once they're gone, they're gone and, as time marches on, more of the people who took those DNA tests are no longer alive to contribute another sample.
The problem with all commercial testing companies in this regard is just that: they are commercial. Some may have elements of altruism at work--FTDNA did, after all, create Ysearch way back when--but it is simply not in their best fiduciary interest to openly share too much. What is not openly shared is subject to permanent deletion at a moment's notice.
But this isn't just an issue with commercial, for-profit endeavors. We learned in the last few days that the reason Oxford Ancestors had announced that they were ceasing business was not, as I had suspected, economic and legislative (i.e., GDPR) pressure, but Oxford University's plan to restructure the Science Area, which would have left the labs doing the DNA testing unavailable for up to a year beginning this July. This seems to have been averted, and Oxford Ancestors will remain open.
On the purely academic side, a yDNA study was undertaken at Washington State University back in 2004 or 2005 looking more deeply at the R1b haplogroup. Data were shared and the professor heading the study was communicative with the study participants; interesting stuff. Within 18 months, the funding was pulled and the study closed, taking the volunteer-contributed DNA data with it.
And, if you participated in the initial implementation of National Geographic's Genographic study and have tried to log into your account in the past few years, you've found that access to the study and your yDNA data went quietly away. Genographic 2.0 lives, but the data are not interchangeable between the two studies and, while the collective data from v1.0 no doubt still exist somewhere, they're completely unavailable now to the study participants.
Important genetic genealogy data affecting, quite literally, millions of people are far more fragile than we might think.