I would like to share some early results from research I'm doing in the field of "graph entropy" applied to text mining problems.

**Why is graph entropy so important?**

Starting from the general concept of entropy, the following assumptions hold:

- The entropy of a graph should be a functional of the stability of the structure (so that it depicts in some way the distribution of the edges of the graph).
- Subsets of vertices that are quite isolated from the rest of the graph are characterized by high stability (low entropy).
- It's quite easy to use entropy as a measure for graph clustering.
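
To make the first assumption concrete, here is a minimal sketch of one simple way to turn "the distribution of the edges" into a number: the Shannon entropy of the normalized degree distribution. This is an illustrative assumption, not necessarily the definition of graph entropy used in the experiment below; the function name `degree_entropy` and the two toy graphs are hypothetical.

```python
import math
from collections import defaultdict


def degree_entropy(edges):
    """Shannon entropy (in bits) of the normalized degree distribution
    of an undirected graph given as a list of (u, v) pairs.

    Assumption: this is just one possible entropy-like measure, chosen
    here because it depends only on how the edges are distributed.
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = sum(degree.values())  # equals 2 * number of edges
    if total == 0:
        return 0.0
    # Each vertex gets probability mass proportional to its degree.
    return -sum((d / total) * math.log2(d / total) for d in degree.values())


# Toy graphs (hypothetical): a star with one hub and four leaves,
# and a ring of five vertices where every vertex has degree 2.
star = [("hub", leaf) for leaf in "abcd"]
ring = [(i, (i + 1) % 5) for i in range(5)]

print(degree_entropy(star))
print(degree_entropy(ring))
```

With this measure, the star, whose edges concentrate on a single hub, scores lower than the ring, whose edges are spread evenly over the vertices.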

As you can imagine, a smart definition of graph entropy can be helpful in many text mining problems.

Let's look at an application of graph entropy: extracting the relevant words from a document.

The experiment was done using the first section of the definition of "nuclear weapons".
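
Before a graph measure can be applied to a document, the text has to be turned into a graph. The sketch below shows one common way to do that, a word co-occurrence graph built with a sliding window. How the graph was actually built for this experiment is not stated here, so the function `cooccurrence_edges`, the `window` parameter, and the toy sentence are all assumptions for illustration only.

```python
import re
from collections import Counter


def cooccurrence_edges(text, window=3):
    """Build a weighted, undirected word co-occurrence graph.

    Two words are linked when they appear within `window` consecutive
    tokens of each other; the edge weight counts how often that happens.
    Assumption: a simplistic lowercase/alphabetic tokenizer with no
    stop-word removal or stemming.
    """
    words = re.findall(r"[a-z]+", text.lower())
    edges = Counter()
    for i, word in enumerate(words):
        # Look ahead at the next (window - 1) tokens.
        for other in words[i + 1 : i + window]:
            if other != word:
                edges[tuple(sorted((word, other)))] += 1
    return edges


# Toy input (hypothetical, not the text used in the experiment).
sample = "a nuclear weapon is an explosive device that releases nuclear energy"
for (u, v), weight in sorted(cooccurrence_edges(sample).items()):
    print(u, v, weight)
```

The keys of the resulting `Counter` can be passed as an edge list to any graph measure, and the weights kept or dropped depending on the measure.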

- The method based on graph entropy seems to provide the most accurate results (5 errors, versus 9 and 11 for the other two methods).
- Graph entropy better captures the core of the graph containing the relevant words.
- When I tried to expand the number of relevant features, the accuracy of the other two methods tended to worsen quickly:

