I just completed a data science bootcamp, and for the major capstone project they asked us to take online data from somewhere and analyse it to find interesting, useful patterns. I decided to look at G2G interactions, to see whether personality (positivity) had any effect on how many contributions users make, and on how people respond to them.
https://github.com/brfoley76/g2g-user-prediction
The major challenge in all this was that the entire forum is so crazy positive. It was hard to find counterexamples. I even gave up trying to train a positive-negative classifier.
---EDIT: what I did and found---
To explain a bit of jargon: I went through all the G2G posts and looked at each user (about 16000 unique users), what they wrote, how many upvotes they gave, how many thanks they gave, how many contributions they made ... things like that.
Let me emphasize, none of this was private info. It was just the same kind of thing you see when you pop open a G2G question. Nothing to do with ancestors' names, or birthdates, or places lived.
When I was getting the statistics for each user in turn, I called them a 'focal user'. So, if user Foley-10331 answered a question, I wanted to see if we could tell anything about how other people responded to him. Did his total contributions matter, or how many upvotes he gave, or the emotional tone of his texts?
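For anyone curious what "getting the statistics for each user" looks like in practice, here is a rough Python sketch. It is not the code from the repo: the field names (`user`, `kind`, `thanks_received`) and the sample records are made up purely for illustration, and the real project tracks more than this.

```python
from collections import defaultdict

# Made-up example records. The field names are assumptions for illustration,
# not the schema used in the actual project.
posts = [
    {"user": "Foley-10331", "kind": "answer", "thanks_received": 2},
    {"user": "Foley-10331", "kind": "question", "thanks_received": 0},
    {"user": "Example-1", "kind": "answer", "thanks_received": 1},
]


def summarize_users(posts):
    """Tally simple per-user counts, treating each user in turn as the focal user."""
    stats = defaultdict(
        lambda: {"contributions": 0, "answers": 0, "questions": 0, "thanks_received": 0}
    )
    for post in posts:
        focal = stats[post["user"]]
        focal["contributions"] += 1  # every post counts as a contribution
        if post["kind"] == "answer":
            focal["answers"] += 1
        elif post["kind"] == "question":
            focal["questions"] += 1
        focal["thanks_received"] += post["thanks_received"]
    return dict(stats)


user_stats = summarize_users(posts)
```

Each entry in `user_stats` is then one row of features for a focal user, which is the kind of table you can compare against how other people responded.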
In the end, tone mattered and positivity mattered. But more than up-or-down-votes, or cheerfulness, what really stood out was that doing a lot of contributing, and answering (not asking) questions, were the biggest predictors of getting thanks and improving the discussions.
Maybe that's not a surprise?