Identifying spam, low-value content, and users with multiple accounts. Users need to be segmented and scored.
Statistical bias. People who post on social media are not representative of those who are inactive on it.
Need to stem and normalize text, automatically correct typos, identify and categorize text atoms and the relationships between them, and handle foreign languages.
Need to blend the gathered data with internal data. This requires fuzzy merging (maybe not at a very granular level) between internal corporate data and data obtained from social networks.
Potential privacy or liability issues, e.g. if the gathered data is used to target people individually through marketing campaigns or fraud investigations, or to penalize users (e.g. refusing a job to a candidate based on data mining of the candidate's posts on social networks).
Getting actionable, ROI-generating insights from the analyses. In the case of fraud detection or better user targeting, the lift should be easy to measure.
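
As an illustration of how the lift mentioned above might be measured, here is a minimal Python sketch. It assumes lift is defined as the response rate in the group selected by the model divided by the response rate in a control (baseline) group, which is one common definition and not necessarily the one intended here. The function names and the counts are hypothetical and made up for the example.

```python
def response_rate(responders: int, total: int) -> float:
    """Fraction of a group that responded (or was confirmed as fraud)."""
    if total <= 0:
        raise ValueError("group must contain at least one user")
    return responders / total


def lift(targeted_responders: int, targeted_total: int,
         control_responders: int, control_total: int) -> float:
    """Ratio of the targeted group's response rate to the control group's.

    A value above 1 means the targeted group responds more often
    than the baseline.
    """
    baseline = response_rate(control_responders, control_total)
    if baseline == 0:
        raise ValueError("control response rate is zero; lift is undefined")
    return response_rate(targeted_responders, targeted_total) / baseline


# Hypothetical counts, for illustration only:
# 120 responders among 2,000 targeted users vs. 150 among 5,000 control users.
targeted_lift = lift(120, 2000, 150, 5000)
print(f"lift = {targeted_lift:.2f}")
```

In practice the control group should be drawn at random from the same population, otherwise the statistical bias noted earlier also distorts the lift.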
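
The fuzzy merging between internal corporate data and social network data could look roughly like the sketch below. It is only one possible approach: it assumes records are matched on a name field, uses a simple character-based similarity from Python's standard `difflib` module, and applies an arbitrary threshold of 0.85. The field names (`customer_id`, `name`, `handle`, `display_name`) and the sample records are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " "
                   for ch in name.lower())
    return " ".join(kept.split())


def fuzzy_merge(internal, social, threshold=0.85):
    """Pair each internal record with its most similar social profile.

    A pair is kept only if the similarity reaches the (assumed) threshold.
    Returns a list of (customer_id, handle, score) tuples.
    """
    matches = []
    for record in internal:
        best, best_score = None, 0.0
        for profile in social:
            score = SequenceMatcher(
                None,
                normalize(record["name"]),
                normalize(profile["display_name"]),
            ).ratio()
            if score > best_score:
                best, best_score = profile, score
        if best is not None and best_score >= threshold:
            matches.append((record["customer_id"], best["handle"], best_score))
    return matches


# Hypothetical sample data, for illustration only.
internal = [
    {"customer_id": 1, "name": "Jon A. Smith"},
    {"customer_id": 2, "name": "Maria Garcia"},
]
social = [
    {"handle": "@jsmith", "display_name": "jon a smith"},
    {"handle": "@mgarcia", "display_name": "Maria  Garcia!"},
    {"handle": "@other", "display_name": "Zed Q"},
]

matches = fuzzy_merge(internal, social)
for customer_id, handle, score in matches:
    print(customer_id, handle, round(score, 2))
```

Comparing every internal record with every profile is quadratic, so on real volumes one would first restrict candidates (for example by city or email domain) before scoring. Matching at a coarser level, such as segments instead of individuals, also reduces the privacy exposure raised above.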