All Blog Posts Tagged 'hdfs' (2)

Making data science accessible – HDFS

By Dan Kellett, Director of Data Science, Capital One UK

Disclaimer: This is my attempt to explain some of the ‘Big Data’ concepts using basic analogies. There are inevitably nuances my analogy misses.

What is HDFS?

When people talk about ‘Hadoop’ they are usually referring to either the efficient storing or processing of large amounts of data. MapReduce is a framework for efficient processing using a parallel, distributed algorithm…

Added by Dan Kellett on July 21, 2016 at 2:00am — No Comments

Working With Large Data Sets

This is an excerpt from my blog post Working With Large Data Sets...



Over the past 18 months I’ve moved from working on the SMTP proxy to working on our other systems, all of which make use of the data we collect from each connection. It’s a fair amount of data: each connection can produce up to 2Kb, and our servers receive approximately 1,000 of these pieces of data per second, a rate that is fairly sustained due to our global…

Added by Phil Whelan on September 28, 2010 at 2:02pm — 1 Comment

