Hi everyone,

If you had to do social network analysis in R, which libraries would you use: statnet or igraph?

Thanks,
Jose

Replies to This Discussion

Hi Jose, it depends on what you are trying to do in your specific research. The size of the graph you are analyzing also matters. As you know, R keeps its data in memory, and SNA tends to require the full graph for analysis because proper sampling techniques are still being researched.

Sincerely,
Nick
CEO
Sonamine, LLC
http://www.sonamine.com
They are two different sets of libraries:

"network", "sna", and "statnet" have been developed by the same team of mostly social scientists and statisticians. network is the data structure package. sna is the classic network analysis package; it implements a lot of methods from Wasserman & Faust. statnet is the newer package, with techniques for building statistical models from network data.

igraph has been developed by a different team, which comes from the physics side of network research. Some of the methods overlap, but this package has been designed to handle large data sets efficiently and to produce some nice visualizations (with a bit of a learning curve for drawing graphs).

hope that helps.

jesse
I want to compute eigenvector centrality in the Add Health data... 80 schools with about 80,000 total students. Is one package superior to the other for this task?
Michael, eigenvector centrality can be computationally intensive as the edge count increases. What is the edge count in the Add Health data?

Based on my understanding, both of these packages require the network to be loaded into memory, so as long as your machine can handle the size of the network, you should be fine.
Thanks for the reply Nick,

The average out-degree is about five, with a maximum of ten. For now, I wouldn't mind treating each school as a separate network object to reduce the memory requirements, but only if I can automate the process so I don't have to run the same code on 80 separate networks.
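
For anyone following along, here is a rough sketch of how the per-school run could be automated. It is written in plain Python rather than R, purely to illustrate the idea, and it rests on several assumptions that are not from the Add Health documentation: the edge rows are hypothetical `(school_id, student_a, student_b)` tuples, ties are treated as undirected, and the function names `eigenvector_centrality` and `centrality_by_school` are made up for this example. It uses simple power iteration, not the routine that sna or igraph actually run.

```python
from collections import defaultdict
import math


def eigenvector_centrality(edges, max_iter=1000, tol=1e-9):
    """Power iteration on an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = list(neighbors)
    score = {n: 1.0 for n in nodes}
    for _ in range(max_iter):
        # Adding the node's own score (iterating on A + I) keeps the same
        # leading eigenvector but avoids oscillation on bipartite graphs.
        new = {n: score[n] + sum(score[m] for m in neighbors[n]) for n in nodes}
        norm = math.sqrt(sum(x * x for x in new.values()))
        new = {n: x / norm for n, x in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score


def centrality_by_school(edge_rows):
    """Group (school_id, a, b) rows by school and score each school separately."""
    by_school = defaultdict(list)
    for school, a, b in edge_rows:
        by_school[school].append((a, b))
    return {school: eigenvector_centrality(e) for school, e in by_school.items()}


# Hypothetical toy data: a triangle in one school, a three-node path in another.
rows = [
    ("school_1", "a", "b"),
    ("school_1", "b", "c"),
    ("school_1", "c", "a"),
    ("school_2", "x", "y"),
    ("school_2", "y", "z"),
]
result = centrality_by_school(rows)
for school, scores in result.items():
    print(school, {k: round(v, 3) for k, v in scores.items()})
```

Only one school's edges are scored at a time, so the same code runs unchanged over 80 schools. Note that students with no ties never appear in the edge list and so get no score, and a school whose network splits into disconnected components will see scores concentrate in one component.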
Hi Michael, if I read you correctly, this means there are about 80,000 x 5 ≈ 400,000 edges on average (and at most 80,000 x 10 = 800,000 if every student hit the maximum).

If all you want to do is calculate eigenvector centrality quickly, then I'd suggest downloading the Sonamine trial version, getting a key, and using that. It'll finish the calculation in about 5-10 minutes on a standard Windows XP laptop.

Public disclaimer: I'm the CEO of Sonamine, and we sell graph mining tools for very large networks. For example, we'll run eigenvector centrality for a 7M-edge network on a Windows laptop in about 15 minutes.

Nick
Let's suppose I am interested in analyzing basic things such as betweenness, closeness, and eigenvector centrality.

Furthermore, for each vertex (node) I am interested in storing some attributes like age, gender, location, hobbies, preferences, etc. I also need to store interactions between nodes (photos in common, comments, likes, etc.)

Based on these data I would be interested in finding similarities between nodes, clusters, influence, etc.

So, which one seems more appropriate?

Note: I don't expect to have many nodes (fewer than 5,000), although each will have several attributes, as this is pure research and I am only interested in a proof of concept.
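
To make the setup above concrete, here is a small sketch of how node attributes and typed interactions could be stored side by side, with closeness and a simple attribute similarity computed on top. It is plain Python for illustration only, not statnet or igraph code. The names, attribute values, and the helpers `neighbors_of`, `closeness`, and `hobby_similarity` are all hypothetical. Jaccard overlap on hobbies is just one possible similarity measure, chosen here because it is easy to check by hand.

```python
from collections import deque

# Hypothetical toy data: attribute names and values are illustrative only.
people = {
    "ana":  {"age": 31, "location": "Lisbon", "hobbies": {"chess", "running"}},
    "ben":  {"age": 27, "location": "Porto",  "hobbies": {"chess", "films"}},
    "cara": {"age": 35, "location": "Lisbon", "hobbies": {"films"}},
}

# Interactions stored per unordered pair, keyed by interaction type.
interactions = {
    frozenset({"ana", "ben"}):  {"comments": 4, "likes": 10},
    frozenset({"ben", "cara"}): {"photos_in_common": 2},
}


def neighbors_of(node):
    """Nodes sharing at least one recorded interaction with `node`."""
    return {other for pair in interactions if node in pair for other in pair - {node}}


def closeness(node):
    """Closeness = (reachable nodes) / (sum of shortest-path lengths), via BFS."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in neighbors_of(cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def hobby_similarity(a, b):
    """Jaccard similarity of the two people's hobby sets."""
    ha, hb = people[a]["hobbies"], people[b]["hobbies"]
    union = ha | hb
    return len(ha & hb) / len(union) if union else 0.0


for name in people:
    print(name, "closeness:", round(closeness(name), 3))
print("ana/ben hobby similarity:", round(hobby_similarity("ana", "ben"), 3))
```

With fewer than 5,000 nodes, a dictionary-based layout like this fits comfortably in memory. Both R package families discussed in this thread also support vertex and edge attributes, so the same layout should carry over.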
hi jose,
Another excellent piece of software is Pajek. It is freeware and should handle large networks.
But R is quite good too. :-)

http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Hi everyone,
I know SPSS Clementine, a data mining tool. Its web graph is especially good for social analytics on categorical data.

Thanks..
