Is big data becoming more and more structured, or the other way around? 5 questions.

  • And how do we define structured?
  • Is XML data considered structured?
  • Is the XML format here to stay? It consumes a lot of disk storage, although the difference between XML and raw data is probably small after compression.
  • Does structured data mean the death of text mining?
  • Much of the so-called "structured data" comes from users who attach a label (tag or category) to their posts in message boards, blogs or social networks. This makes clustering and taxonomy building easy, but can we trust the "structure" in the data?
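
For anyone who wants to check the compression remark in the XML question on their own data, here is a minimal sketch. It serializes the same synthetic records as CSV-like raw text and as XML, then compares sizes before and after zlib compression. The record layout, the user names and the function names `build_payloads` and `size_report` are made up for illustration; real data may behave differently.

```python
import random
import zlib


def build_payloads(n_rows=2000, seed=0):
    """Return the same synthetic records serialized as CSV and as XML (bytes)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    users = ["alice", "bob", "carol", "dave"]  # hypothetical values
    csv_lines, xml_lines = [], []
    for i in range(n_rows):
        user = rng.choice(users)
        value = f"{rng.random():.4f}"
        csv_lines.append(f"{i},{user},{value}")
        xml_lines.append(
            f"<row><id>{i}</id><user>{user}</user><value>{value}</value></row>"
        )
    csv_text = "\n".join(csv_lines)
    xml_text = "<rows>\n" + "\n".join(xml_lines) + "\n</rows>"
    return csv_text.encode("utf-8"), xml_text.encode("utf-8")


def size_report(csv_bytes, xml_bytes):
    """Raw and zlib-compressed sizes, in bytes, for both encodings."""
    return {
        "csv_raw": len(csv_bytes),
        "xml_raw": len(xml_bytes),
        "csv_zlib": len(zlib.compress(csv_bytes, 9)),
        "xml_zlib": len(zlib.compress(xml_bytes, 9)),
    }


if __name__ == "__main__":
    report = size_report(*build_payloads())
    for name, size in report.items():
        print(f"{name}: {size} bytes")
    print("raw XML/CSV ratio:", round(report["xml_raw"] / report["csv_raw"], 2))
    print("compressed XML/CSV ratio:", round(report["xml_zlib"] / report["csv_zlib"], 2))
```

The repeated tags are what a general-purpose compressor handles best, so the XML-to-raw size ratio should shrink after compression. How close it gets to 1 depends on the data.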

Replies to This Discussion

A very interesting set of questions. Questions (1) and (2), I think, require much discussion. As for whether XML is here to stay, I would think that it is. Even though XML is not the solution for every problem, it can easily be used for a number of applications, structuring financial information and news being one of them.

Structured data may well make some current text mining applications obsolete, but IMHO text mining is here to stay. User-generated content in the form of text will be around for a very long time.

Tags and categories supplied by users are a great resource. I've started incorporating this "collective intelligence" in my applications, and it really makes a difference in the results I am getting, so in my experience user-generated tags have been a great help.
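
To make the tagging idea concrete, here is a rough sketch of two simple things user tags allow: grouping posts by tag, and counting which tags appear together as a crude starting point for a taxonomy. The `POSTS` data and the function names are hypothetical, and this is not the method used in the applications mentioned above.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical posts with user-supplied tags.
POSTS = [
    {"id": 1, "tags": ["xml", "big data"]},
    {"id": 2, "tags": ["text mining", "big data"]},
    {"id": 3, "tags": ["xml", "big data", "storage"]},
    {"id": 4, "tags": ["text mining"]},
]


def index_by_tag(posts):
    """Map each tag to the ids of the posts carrying it (a trivial clustering)."""
    index = defaultdict(list)
    for post in posts:
        for tag in set(post["tags"]):  # ignore duplicate tags on one post
            index[tag].append(post["id"])
    return dict(index)


def tag_cooccurrence(posts):
    """Count how often each unordered pair of tags appears on the same post."""
    pairs = Counter()
    for post in posts:
        for a, b in combinations(sorted(set(post["tags"])), 2):
            pairs[(a, b)] += 1
    return pairs


if __name__ == "__main__":
    print(index_by_tag(POSTS))
    print(tag_cooccurrence(POSTS).most_common(3))
```

Pairs that co-occur often hint at related categories. A tag that shows up only once or twice is a reminder of the trust question raised in the original post.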


On Data Science Central

© 2021 TechTarget, Inc.