Reference # on GEDCOM

+4 votes

I looked up this topic because of issues with downloading a GEDCOM, and found this question: "I downloaded a GEDCOM of one of my ancestors to save me some work on my own personal tree and I've been working through it and cleaning it up. Each person has 2 reference numbers, the #60 as well as their own 7-9 digit number. What are these numbers for? I can't find those numbers on their pages either. An example is Cornerde-1 has the reference number 60 and 8931535. Where can I find the meaning of those numbers?"

The sole answer had to do with "stuff" that shows up in a profile when importing a GEDCOM, and thus is not relevant. This question has to do with exporting a GEDCOM.
What do these numbers mean? They are not profile numbers.
in The Tree House by George Fulton G2G6 Pilot (666k points)

2 Answers

+5 votes
60 is the privacy level. 60 is Open privacy.

8931535 is the User ID of the profile.
by Jamie Nelson G2G6 Pilot (648k points)
Thank you.

I wish I could globally delete these, as they have no value in the software I'm using. However, they get put into a predefined "fact" field, and that field cannot be deleted. My only option is to go into each and every person in the file and delete it individually.

There are other issues with exporting gedcom files as well.

If you look at the .ged file in a text viewer like Notepad, can you identify the lines under each INDI record where these values are stored? If so, I can help you remove those.

For instance,

0 @I1@ INDI
1 NAME John /Smith/
1 PVC 60
1 _UID 8931535
1 FAMS @F1@

1 REFN 11218702
2 TYPE wikitree.user_id
1 REFN 11905215
2 TYPE wikitree.page_id
1 REFN 35
2 TYPE wikitree.privacy
Thank you for the offer, but I’ve started to remove them manually and am about a third of the way through.

My next step is to fix all the citations. WikiTree just dumps these into the notes field. They need to be turned into actual citations that my software can work with.

After that, I need to remove all the formatting (HTML?) codes that mean nothing to my software.

The file I’m working with has only about 300 people, so the task is not impossible.

One thing I’ve learned from this is that I’ll not do it again.

Why am I doing this? If I run an ahnentafel report without any of the notes fields, and without the reference numbers, I get a report of about 25 pages with the name and place index. If I have it print the notes, the report is more than 150 pages, and pretty much unreadable.
If I know all the boundaries of what you are wanting to do, I can run it through a script for you. I already created one based on the info Jamie gave, and it takes around 30 seconds to process a file with 500 individuals and will remove all lines associated with the user_id, page_id and privacy.
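For anyone who wants to try this themselves, here is a minimal sketch of that kind of script. It is not the script mentioned above, just one way it could be done, assuming the export uses the REFN/TYPE layout shown earlier in this thread. The function name strip_wikitree_refns is made up for illustration.

```python
import re

# REFN types to drop. These three names are taken from the export sample
# quoted in this thread; adjust the set if your export differs.
UNWANTED_TYPES = {"wikitree.user_id", "wikitree.page_id", "wikitree.privacy"}

# Matches "<level> <tag> [value]" on a GEDCOM line.
LEVEL_RE = re.compile(r"^\s*(\d+)\s+(\S+)(?:\s+(.*))?$")


def strip_wikitree_refns(lines):
    """Return the lines with unwanted REFN blocks (REFN plus sub-lines) removed."""
    out = []
    i = 0
    while i < len(lines):
        m = LEVEL_RE.match(lines[i])
        if m and m.group(2) == "REFN":
            level = int(m.group(1))
            j = i + 1
            block_types = set()
            # Walk the lines subordinate to this REFN (deeper level numbers).
            while j < len(lines):
                sub = LEVEL_RE.match(lines[j])
                if sub is None or int(sub.group(1)) <= level:
                    break  # next record, or a malformed line we leave alone
                if sub.group(2) == "TYPE":
                    block_types.add((sub.group(3) or "").strip())
                j += 1
            if block_types & UNWANTED_TYPES:
                i = j  # skip the whole REFN block
                continue
        out.append(lines[i])
        i += 1
    return out


if __name__ == "__main__":
    import sys

    # Usage (file names are placeholders): python strip_refn.py in.ged out.ged
    if len(sys.argv) == 3:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as src:
            cleaned = strip_wikitree_refns(src.readlines())
        with open(sys.argv[2], "w", encoding="utf-8") as dst:
            dst.writelines(cleaned)
```

REFN entries with any other TYPE are kept, so references you added yourself should survive. Work on a copy of the file, not the original.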
It’s been tedious, but I’ve removed them all.

I also removed the “facts” with no data.

As I go through the file, when the other last name is a compound name, it gets split into two fields. I’ve fixed all these; fortunately there were not many.

The notes and sources are next ...
Has any work been done to provide an automated script that will leave out of the GEDCOM parts of the profile that cause other genealogy programs (like Custodian 4) to barf when trying to process a WikiTree GEDCOM?

I have found a solution with RootsMagic:

With RootsMagic it is possible to select which fields you export. So the overall process is as follows:

  1. Export the gedcom from WikiTree
  2. Import the gedcom into RootsMagic
  3. Decide what fields are unnecessary
  4. Set those unnecessary fields to “do not export” in RootsMagic 
  5. Export the file from RootsMagic as a new gedcom
  6. Import the new gedcom into a new empty RootsMagic data file.
  7. The unnecessary fields are no longer in the file.

This is actually a rather quick and easy process.

You can now proceed with the cleanup of the narrative in the notes fields.

Alternatively, should you want to keep these fields for some reason, RootsMagic allows you to set them to “do not print,” and they will not print in the narrative reports.

Thanks, I will give it a try!
+4 votes

Maybe I'm missing something, but I don't see how the reference numbers included in the GEDCOM are at all useful.

1 REFN 11218702
2 TYPE wikitree.user_id
1 REFN 11905215
2 TYPE wikitree.page_id
1 REFN 35
2 TYPE wikitree.privacy
Can the user_id and/or page_id be used on WikiTree anymore to find the user or page?

What would be more useful would be to include the WikiTree ID, e.g.:

1 REFN Braunstein-75
2 TYPE wikitree.ID
by Louis Kessler G2G5 (5.9k points)

Louis? Well, seeing as this looks like your second G2G post, may I offer you a huge welcome! Anyone who's done any serious genetic genealogy in the past several years will know you, even if they don't immediately make the association. I sincerely hope you find time to drop by here from time to time.

And I agree with you. I wrote a script, as Steve evidently did, to strip out those three REFN fields. In all the exported GEDCOMs there is a top-level line under each 0 INDI, i.e.:


Of course, that URL will import in various ways depending upon the application, but I can't see how having a separate REFN field for the WikiTree ID would be a problem.

Just FWIW, another matter eluded my ability to script for it. I did a quick experiment earlier this month and pulled my fairly small pedigree using the "limit for privacy" option. The export was 51,177 lines for 898 total records; 611 individuals and 284 families. Running it through a GEDCOM validator threw a whopping 1,368 errors.

I doubt there's any feasible way WikiTree can mitigate that since all text in the biography box on profiles exports as one massive NOTE field. There were scads of non-UTF-8 compliant characters peppered in there, and many of those forced a line break which then also threw a second error because it broke the protocol of all lines under 1 NOTE needing to be prefaced by either 2 CONT or 2 CONC. I'm actually surprised that errors-to-lines ratios like that don't cause apps like FTM and sites like Ancestry to need the Heimlich maneuver when they try to eat the GEDCOMs.
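As a rough illustration of the second error type, here is a sketch that finds lines with no level number and tag and turns them into CONT lines attached to whatever came before. It is a heuristic only: the function name and the pattern for a "valid" line are my own assumptions, and a stray text line that happens to start with a one- or two-digit number followed by a word would be mistaken for a valid line.

```python
import re

# Rough pattern for a well-formed GEDCOM line: level, optional @XREF@, tag, optional value.
VALID_LINE = re.compile(r"^\s*(\d{1,2}) (@[^@]+@ )?([A-Za-z0-9_]+)( .*)?$")


def repair_broken_continuations(lines):
    """Prefix lines that lack a level and tag with '<n> CONT ' so they parse again."""
    repaired = []
    cont_level = None  # level a continuation line should carry at this point
    for line in lines:
        text = line.rstrip("\r\n")
        m = VALID_LINE.match(text)
        if m:
            level, tag = int(m.group(1)), m.group(3)
            # After CONT/CONC stay on the same level; otherwise go one deeper.
            cont_level = level if tag in ("CONT", "CONC") else level + 1
            repaired.append(text)
        elif cont_level is not None:
            repaired.append(f"{cont_level} CONT {text}".rstrip())
        else:
            repaired.append(text)  # nothing earlier to attach it to
    return repaired
```

This only addresses lines that lost their prefix; it does nothing about the character encoding problem.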

Don't know how it might be accomplished, but stripping out those non-UTF-8 characters would be on my wish list for improvement of WikiTree GEDCOM exports.
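On the user's side, a cleanup pass over a downloaded file could look something like the sketch below. It says nothing about how WikiTree produces the export; both function names are made up, and dropping bytes silently loses whatever character was intended, so it may be worth listing the affected lines first.

```python
def lines_with_invalid_utf8(raw: bytes):
    """Return 1-based line numbers whose bytes are not valid UTF-8."""
    bad = []
    for number, line in enumerate(raw.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(number)
    return bad


def drop_invalid_utf8(raw: bytes) -> str:
    """Decode as UTF-8, discarding any byte sequences that are not valid."""
    return raw.decode("utf-8", errors="ignore")
```

Read the file in binary mode (open(path, "rb").read()) so Python does not try to decode it before these functions see it.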


Unfortunately, no, I don't see a

1 WWW tag

under every INDI.  In the GEDCOM I downloaded, I only see the WWW tag under the INDI for myself, and not anyone else.

To add to that problem, WWW is not a valid GEDCOM tag at level 1 in an INDI record.

A correct and likely better way to add the link would be to include the WikiTree profile page as a source, e.g.:

0 @I1@ INDI
1 SOUR @S1@

0 @S1@ SOUR
1 TITL WikiTree Profile Page for Surname-123 (Full name)
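To make the shape of that suggestion concrete, here is a small sketch that builds those lines for a given profile. It is only an illustration of the layout above, not anything WikiTree's exporter does; the function names and the S1 cross-reference are placeholders.

```python
def wikitree_source_record(source_xref, wikitree_id, full_name):
    """Build a level-0 SOUR record titled after the WikiTree profile."""
    return [
        f"0 @{source_xref}@ SOUR",
        f"1 TITL WikiTree Profile Page for {wikitree_id} ({full_name})",
    ]


def source_pointer(source_xref):
    """Build the level-1 line that goes under the INDI record and points at the source."""
    return f"1 SOUR @{source_xref}@"


if __name__ == "__main__":
    print(source_pointer("S1"))
    print("\n".join(wikitree_source_record("S1", "Surname-123", "Full name")))
```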

Yes, there are other problems with WikiTree GEDCOM exports, including illegal UTF-8 characters as you mention, lines without level numbers and tags, and non-inclusion of source information. Each program handles GEDCOM differently. Basic name/birth/marriage/death information should transfer well, but for all other information, different programs will import a WikiTree GEDCOM differently.

Well, the WWW tag was among the nine added to the 5.5.1 draft of October 1999, but that version wasn't released then and 5.5 stayed the standard for two decades. FamilySearch did finally release 5.5.1, very quietly, as the published standard in November 2019 without revisiting it with a new "draft" notification. Shame they didn't actually look at evaluating a full version update, a la v6.0.

I need to do test GEDCOM exports again in the next week or so. I pulled three early this month--one for my watchlist "privacy limited," and two for my "Family Tree," one each limited for privacy and file size--and all three have that WWW line between the place of death and the "FAM" tags.

I sorely wish we could get true "SOUR" entries, but I do understand the limitations the MediaWiki platform imposed on WT from the beginning. It's a trade-off we've gotten used to. But imagine if there were a true, curated, centralized database of sources and easily-copied citations we could draw from when working on profiles, and then exporting GEDCOMs with extracted source data. Would be cool.

In RootsMagic these WWW fields get put into a multimedia field, if I recall correctly. I manually put the WikiTree ID into a source field, then delete the multimedia field. Since, by default, they do not print in the narrative reports, they are not much of a problem.
