Artificial Intelligence sites where you can ask questions and get answers with citation sources

+11 votes
1.2k views
An AI-powered tool that provides answers with citation sources is a powerful resource that can help you find accurate and reliable information quickly. When you ask a question, the tool uses machine learning algorithms to analyze large amounts of unstructured data, such as articles and websites, to identify relevant information that can help answer your question. Once it has identified an answer, the tool also provides a list of sources to back up its response. You can then review these sources to determine whether they contain the information you need to support your statements, and whether they are from a reliable source you want to use. By leveraging the power of AI, this tool can save you time and effort, while ensuring that the information you use is trustworthy and accurate.
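For anyone curious what is going on under the hood, the general pattern is: retrieve documents that match your question, generate an answer from them, and return that answer with a numbered source list. Here is a rough sketch of that idea in Python; the search_web and summarize helpers are hypothetical placeholders for illustration only, not the actual API of Perplexity, Neeva, or any other vendor.

```python
# Illustrative sketch only: search_web and summarize are hypothetical
# placeholders, not any vendor's real API.

def answer_with_citations(question, search_web, summarize):
    """Answer a question and attach numbered citations to the sources used."""
    # 1. Retrieve candidate documents for the question.
    documents = search_web(question)  # e.g. a list of (url, text) pairs

    # 2. Generate an answer grounded in the retrieved text.
    answer = summarize(question, [text for _, text in documents])

    # 3. Return the answer plus a numbered source list so the reader
    #    can check every claim against the original pages.
    sources = {i + 1: url for i, (url, _) in enumerate(documents)}
    return answer, sources
```

The step that matters for WikiTree is the last one: the tool hands you the URLs, but you still have to open them and confirm they actually say what the generated answer claims.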

*[https://www.perplexity.ai/ Perplexity AI]

*[https://neeva.com/ Neeva]

I also added these links to my United States Resource page under miscellaneous. You can bookmark this site for future reference.
WikiTree profile: Space:United_States_Resources
in The Tree House by Jimmy Honey G2G6 Pilot (217k points)
retagged by Ellen Smith

This will not save as much effort as it seems, because the results still have to be confirmed. Large-language-model "artificial intelligence" works by stringing words together according to the frequencies of phrases in previously observed text. It can therefore make up results out of whole cloth, so if you try it, you need to check what is generated for validity very carefully before using it on WikiTree.
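To make that mechanism concrete, here is a toy sketch (my own illustration, nothing like a real model's code) that generates text purely from word-pair frequencies in a tiny corpus. Real LLMs use neural networks over much longer contexts, but the underlying issue is the same: the output is driven by which words tend to follow which, not by whether the resulting statement is true.

```python
import random
from collections import defaultdict, Counter

# Toy illustration: build word-pair (bigram) counts from a tiny corpus,
# then generate text by repeatedly picking a likely next word.
corpus = "john smith was born in 1820 in kent . john smith married mary in 1845 ."
words = corpus.split()

bigrams = defaultdict(Counter)
for w1, w2 in zip(words, words[1:]):
    bigrams[w1][w2] += 1

def generate(start, length=10):
    word, out = start, [start]
    for _ in range(length):
        followers = bigrams.get(word)
        if not followers:
            break
        # Pick the next word in proportion to how often it followed this one.
        word = random.choices(list(followers), weights=list(followers.values()))[0]
        out.append(word)
    return " ".join(out)

print(generate("john"))  # can produce "john smith was born in 1845 ." -- fluent, but false
```

Nothing in that loop checks facts; it only checks what sounds like the training text, which is exactly why the output needs independent verification.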

See for example this Scientific American article: AI Platforms like ChatGPT Are Easy to Use but Also Potentially Dangerous.

"Systems like ChatGPT are enormously entertaining and even mind-bogglingly human-sounding, but they are also unreliable and could create an avalanche of misinformation"

Edited to add: See this earlier G2G comment for an example where Perplexity conflated two people. 

Interesting that this was posted now.  I was doing a session with ChatGPT and asked about Gustave Anjou.  It completely made up a response and, when asked, provided a specific source for its answer.  When questioned about its response, it apologized for the incorrect information and admitted that the source it used could not be verified.  In fact, the source does not exist, even though it had told me it cannot make up sources.

This is the second round in which it has given fake information or distorted the answer.  In the absence of transparency, I could believe that these AI programs are now in a new phase and are using us to test and train the next generation of AI.

On the other hand, when I asked about the Genealogical Proof Process, it corrected me with the Genealogical Proof Standard and gave a very decent explanation of the process.

Danger Will Robinson!
Again, it is a tool, and it has nothing to do with ChatGPT. ChatGPT was designed for conversational interaction, and it will add things to make it sound more like two people talking. That doesn't mean it is bad or wrong; it is simply doing what it was designed to do.  Have you ever searched for something in Google and got back something unrelated to what you were searching for? Did you stop using Google?  It is the same thing.

It's not the same, because Google Search results are not something that can be posted directly into a WikiTree biography. Text generated by ChatGPT or Perplexity can have the right format and look plausible enough that someone might do just that. (In fact it's been considered.)

Jimmy, I appreciate that you are warning people to check the results, but I think the emphasis needs to be on caution rather than the utility or power of the tools.

I have to agree with Jim R on this one. As a very heavy user of AI tools myself, I am well aware of the many pros and cons they offer. But that also means I know to heavily fact-check the information, especially the output from Perplexity. I have tested it quite a bit, and 9 times out of 10 the information is highly conflated between multiple sources and people, unless the person in question is a notable with many sources all saying the same thing.

After extensive testing of many AI products for both work and genealogy, I am of the firm belief that all AI references should have a VERY hard warning associated with them - especially on a site like WikiTree.

@Michael: Danger Will Robinson! Exactly right! (And I am a programmer)

Michael, This is weirdly appropriate in regards to Gustave Anjou!

In other words, AI has learned its genealogical research technique from the average humans reflexively clicking on trees and documents proposed by...an algorithm based on previous choices made by human users. Can we really expect the AI to be an improvement? Garbage in->garbage out!

Maybe one could hope for algorithms based on the Genealogical Proof Standard, but where's the fun in that? For me, it would be like publishing the crossword or sudoku puzzles all filled in.

Ha! Yes, now that you point that out!

4 Answers

+8 votes
I tried Perplexity with a couple of people I am working on. For example I asked:

"Biography of Juliana de Jesus Maria y Jose (Vigil) Chaves (1827 - 1916)"

Its response was:

"Juliana de Jesus Maria y Jose (Vigil) Chaves was born on January 29, 1827 in Santa Fe, New Mexico[1][2]. She passed away on October 8, 1916[1]. Her parents were Domingo Chavez and Juliana Chavez (born Vigil)[2]. Unfortunately, there is not much information available about her life or accomplishments."

One of its sources was the WikiTree profile that I created. The other was MyHeritage.

It seems to have a tendency to mix up parents and spouses. In the example above it says the parents are Domingo Chavez (her husband) and Juliana Chavez (herself).

It was more successful when I tried a notable person and it used their Wikipedia page as a source. I think a Google search would have been almost as good, though.

I feel it won't be long before these AI tools can be really effective at doing genealogy and could save a lot of time. They need to be trained more for genealogy, though.
by Rob Pavey G2G6 Pilot (286k points)
I tried Neeva as well. It is quite useful for searching multiple sites at once. If tools like this stick around, perhaps I should add support for searching with them from the WikiTree Sourcer menu.
Rob, they aren't trained to do anything subject-specific. If they were, they would be something other than a large language model. Instead they analyse natural language by measuring frequent phrases and word sequences. They could be provided with more genealogical texts as input, but that wouldn't prevent the kind of "misunderstanding" (in scare quotes because actually no understanding happens) that causes conflation, unwarranted extrapolation, and "hallucinations".

These tools are very dangerous if used without care. They can be helpful in generating a first draft, but that needs meticulous checking before it's used for any serious purpose. I'm concerned that people may not realise this, and introduce a lot of inaccuracies into WikiTree which look plausible but are actually quite wrong.
Hi Jim, did you try Neeva? It is not generating any text - just doing a "smart" search as far as I can tell. It worked better than a Google search when given a search string like:
"J P Lacome born 1887 in New Mexico"

I guess the main thing it seems to do is recognize that this is a search for a historical person and filter the search results so it shows results from genealogy sites etc.
I would imagine they would add to conflation issues if used on a site like this one. It would have to be very advanced to be helpful.
I used Perplexity; I didn't try Neeva. I find that if you include more information in the first question and get better results, then on subsequent questions you can ask more specific, narrower questions.

I haven't tried Neeva in detail. The home page says it uses LLMs (large language models) so I don't have confidence it would be safer than the others. But if it is only extracting text from the pages it finds, instead of generating new text, maybe it's not as risky. Oddly enough the about page doesn't mention LLMs or AI at all.

It's the text that ChatGPT and Perplexity invent that needs careful validation.

This is nothing like ChatGPT; it doesn't make anything up. It searches the internet for a response and returns multiple pieces of information with inline citations to the sources. You can then view each source and decide if you want to use it. If it doesn't find a source for some piece of information, it will say it was unable to find a source for that info.  Example: you ask when and where someone was born. It finds a source with the birth date, but the source does not mention where they were born. It will return the date of birth with an inline citation and say it was unable to find a place of birth.

Again, it is a tool, the same as a census record. They can all be wrong. It is up to you to accept it or reject it.

That may be true of Neeva. Perplexity does generate text, and can conflate. I asked Perplexity about "Daniel McAuslan" with the follow-up question "Victoria 1860" and got:

Mary McAuslan was born in 1837 to Daniel McAuslan and Elizabeth Mcfarlan. She lived in California in 1860. There is no information about Victoria 1860 or any person named Daniel Barnet Lazarus in the provided search results.

The first sentence there is relevant, but the other two are about completely unrelated people.

The danger is that the hype around these tools will lead people to accept their results too readily.

Again, it footnotes everything with inline citations. You can review them and decide if it is something you want to use. No different than a Google search. It will return conflated information, and you pick what is relevant. If you type one or two words, you will get the same results as in Google: a lot of useless information. But in the end it doesn't matter, because you get results back and pick and choose what is relevant, the same as you do with any search engine. It is a tool, and how you phrase questions matters, the same way it matters how you phrase them in Google.
+5 votes

This is fantastic. I asked two questions and was given the correct answers both times, including the correct dates. I can confirm the information is correct because it came from my sourced profiles. This should prove to be a valuable tool for many, if the correct sources have been included. But as always, sources still need to be confirmed. ;-)

by Roy Tonkin G2G Astronaut (4.5m points)
+9 votes

I tried Perplexity and asked about Harriet Tubman and her children. The first answer was correct, but when I asked about Jency, a supposed child with her first husband John, it got confused. It claimed that "Jency Tubman was Harriet Tubman's niece, born in 1846 to her brother John[1][2]. She died on August 4, 1944, at the age of 98 and was buried in Liberty Cemetery, Oneonta, Blount, Alabama[3]."

When I clicked on the "detailed" feature, the answer changed, becoming "Jency Tubman was Harriet Tubman's daughter[2][3]. She was born in Bucktown, Dorchester, Maryland on January 1, 1846[1]. Jency married Robert E. Maddox on November 27, 1876 in St. Clair, Alabama and they had at least one son and four daughters[3]. She died on August 4, 1944 in Oneonta, Blount County, Alabama at the age of 98 and was buried in Liberty Cemetery" citing FamilySearch, a G2G post and Findagrave. The same sources were cited for each.

Neeva just directed me to search results.

I also asked ChatGPT about this, and the response was similar, but it invented a fictitious lineage:

"Jency Tubman Mattox was the daughter of Ben Tubman and Marie Louise Anderson Tubman. Ben Tubman was the son of Isaac Tubman and Caroline Tubman, who were both born into slavery in Maryland. Isaac was the son of Rit Green, who was Harriet Tubman's mother, which makes Jency Tubman Mattox a descendant of Rit Green and a great-grandniece of Harriet Tubman. Marie Louise Anderson Tubman was a school teacher, and she and Ben Tubman had four children together, including Jency."

by Aaron Gullison G2G6 Pilot (212k points)
edited by Aaron Gullison
+6 votes

I tried something really obscure.

what is the birth date of Bernardi Gerardi Zumbrink who died in 1742 in Iburg, Holy Roman Empire

Perplexity's answer:

No sources found. Try a more general question.

Neeva's answer:

https://neeva.com/search?q=what+is+the+birth+date+of+Bernardi+Gerardi+Zumbrink+who+died+in+1742+in+Iburg%2C+Holy+Roman+Empire&src=mkthome

At least Perplexity knew it was clueless. Neeva threw a pile of BS a mile wide at me. 

I then asked a question to which the answer can be found on the web.

What is the birthdate of Christoper Bernard Zumbrink who died in 1789 in the Prince Bishopric of Osnabruck

Perplexity's answer:

There is no information available on the birthdate of Christopher Bernard Zumbrink, who died in 1789 in the Prince Bishopric of Osnabruck. 

The answer is on WikiTree. So much for Perplexity.

 Neeva?

https://neeva.com/search?q=What+is+the+birthdate+of+Christoper+Bernard+Zumbrink+who+died+in+1789+in+the+Prince+Bishopric+of+Osnabruck&src=mkthome

Same pile of BS as before. 

by Paul Schmehl G2G6 Pilot (165k points)
edited by Paul Schmehl

Paul, your question to Perplexity about Christoper Bernard Zumbrink illustrates a general problem with large language models.

I see you created that profile on 21 February 2023.

LLMs work from a database of collected texts. There is a cutoff date for the texts that have been collected: the information available is not "live". So it's unlikely that Perplexity could have such a recent WikiTree profile in its data.

I haven't seen details for Perplexity, but ChatGPT's data does not go beyond 2021.
