A Data Science Central Community
This is a part of our continuing series on engineering and analytics at LinkedIn. If this isn’t your cup of Java, check back tomorrow for regular LinkedIn programming. Else, check out our Engineering Blog. - Ed.
Sharing knowledge is part of our core culture at LinkedIn, whether it’s through hackdays or contributions to open-source projects. We actively participate in academic conferences, such as KDD, SIGIR, and WWW, and industry conferences such as OSCON and Strata.
A few months ago, we decided to take it a step further by opening the Open Tech Talk series that we host at our Mountain View headquarters to the general public. Here are three of my favorite big-data tech talks that we’ve hosted at LinkedIn thus far:
What do graphs look like? How do they evolve over time? How do you handle a graph with a billion nodes? Chris presents a comprehensive list of static and temporal laws, grounded in recent observations on real graphs. He then presents tools for discovering anomalies and patterns in graphs. Finally, he gives an overview of the PEGASUS system, which is designed to handle billion-node graphs using Hadoop.
Algorithmically matching items to users in a given context is essential for the success and profitability of large-scale recommender systems in areas such as content optimization, computational advertising, search, shopping, and movie recommendation. In this talk, Deepak discusses some of the key technical challenges by focusing on a concrete application: content optimization on the Yahoo! front page. He also briefly discusses response prediction techniques for serving ads on the RightMedia Ad exchange.
3. Big Data in Real Time: Processing Data Streams at LinkedIn by Jay Kreps (LinkedIn)
My colleague, Jay Kreps, discusses the state of up-and-coming stream processing technologies and how they fit in the broader data infrastructure ecosystem — from live storage systems to Hadoop. He explores problems that are amenable to real-time stream processing, solutions that change and shape the way we think about data, and challenges and lessons that we have learned while building LinkedIn’s data infrastructure. A must-see presentation.
In addition to compelling speakers, Open Tech Talks offer attendees a low-pressure environment in which to reconnect with people they know and to meet others who share their professional interests. For those who cannot attend, we live-stream the talks and post the full recordings on YouTube.
We at LinkedIn are proud to do our part in sharing this knowledge with the engineering community, and we hope you’ll join us for our future Open Tech Talks.