A Data Science Central Community
Separating Facts from Fiction
Many expect that choosing a cloud platform is the best way to get a BI tool with the maximum level of self-service. This is where the myth that Cloud BI equals self-service takes hold. Unfortunately, it is not necessarily true, because business analytics by its nature involves a constant need to add, tweak and manage data. A Cloud BI solution built on traditional BI technology and architecture will therefore still require countless service hours, a poor return on investment, and a rigid data environment with little tolerance for change. That is quite the opposite of the personal control that Cloud BI’s self-service association implies.
While this may sound counter-intuitive, since it stands in direct contrast to the success of self-service cloud applications like SalesForce or Google Analytics, one rule of thumb will steer you clear of a Cloud BI solution that cannot provide the self-service benefits you’re looking for: if a BI solution is not self-service on-premises, it sure as heck won’t be self-service in the cloud. That’s right. The trick to finding a solid Cloud BI tool is to choose software that is self-service BI, not managed-service BI, whether you plan to deploy it in the cloud or on-premises, and even if you are seeking a fully-managed BI service.
Choosing to install BI software on a “virtual” computer on Amazon EC2 is different from selecting a fully-managed BI service in the cloud, which is what most self-proclaimed cloud vendors offer today. Both are often referred to as “Cloud BI” or “BI in the Cloud”.
The former is a tactical choice that merely concerns where the computer will be located: on the floor in your basement, or on the floor in Amazon’s basement. The latter, going for a fully-managed Cloud BI service, is a more strategic decision and calls for the same considerations any organization must weigh before outsourcing its BI solution to a third party on-premises.
With the success of many operational cloud applications, businesses expect to experience fully-managed Cloud BI the same way they experience other fully cloud-based self-service applications like SalesForce for CRM, Google Analytics for traffic analysis or Zendesk for help-desk management, all of which provide true self-service.
Unfortunately, this is not the case with business intelligence, since traditional BI solutions require end users to either call customer support or bother their IT department every time they want to add data, alter a field or change a data visualization in a report or dashboard. The same process of transferring and handling data that is required for an on-premises deployment will also be required for a cloud platform. There are several reasons for this, most of them technical in nature:
Technically speaking, BI software is best deployed as “close” as possible to the data it feeds off of. If this data is on the Amazon cloud, it makes sense to place your BI software on the Amazon cloud as well, because there will be minimal overhead in transferring data from the sources to the BI software for analysis. But if the source data is on Amazon and the BI software is on Rackspace, that data would need to be transferred (over the Internet) from Amazon to Rackspace. Similarly, if the data is on-premises and the BI software is installed in the cloud, the source data would need to be uploaded (over the Internet) to the cloud first.
Keeping this data synchronized on an ongoing basis, combined with the frequent introduction of new data sets or sources, will make things more complicated and leave you feeling like you’re constantly treading water. To top it all off, data that is transferred over the open Internet needs to be encrypted and then decrypted, slowing things down by an additional 60%.
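To put the transfer cost in perspective, here is a minimal back-of-the-envelope sketch in Python. The function name `transfer_seconds`, its parameters, and the example figures (a 50 GB data set, a 100 Mbps link) are hypothetical and chosen only for illustration. The overhead fraction is simply a multiplier on the raw transfer time, and 0.6 mirrors the 60% figure quoted above rather than any measured value.

```python
def transfer_seconds(size_gb: float, bandwidth_mbps: float,
                     overhead_fraction: float = 0.0) -> float:
    """Rough time, in seconds, to move a data set across a network link.

    size_gb           -- data set size in decimal gigabytes (illustrative input)
    bandwidth_mbps    -- sustained link speed in megabits per second
    overhead_fraction -- extra time as a fraction of the raw transfer time,
                         e.g. 0.6 for the 60% slowdown quoted in the article
    """
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    size_megabits = size_gb * 8 * 1000  # 1 GB = 8,000 megabits (decimal units)
    raw_seconds = size_megabits / bandwidth_mbps
    return raw_seconds * (1 + overhead_fraction)


if __name__ == "__main__":
    # Hypothetical scenario: 50 GB of source data, 100 Mbps link.
    raw = transfer_seconds(50, 100)
    slowed = transfer_seconds(50, 100, overhead_fraction=0.6)
    print(f"raw transfer: {raw / 60:.0f} min, with overhead: {slowed / 60:.0f} min")
```

Even under these generous assumptions, every full refresh of the data costs more than an hour of transfer time, and that cost recurs each time the source data changes.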
Users will need to punch through these challenges just to get to the point where more BI-specific challenges emerge. These well-known, ongoing challenges involve data warehousing, data modeling, query formulation and data visualization, and if traditional BI technology is used, they require a specialized pro to tackle them.
The fundamental difference between “Cloud BI” and operational cloud applications like SalesForce or Google Analytics is that the latter are self-service by design, whereas a BI tool that is not self-service on-premises cannot be self-service in the cloud either.
One of the biggest reasons that SalesForce and Google Analytics are true self-service applications is that the same application is used for data entry, administration and operation. This means SalesForce and Google control (and can therefore pre-engineer) the entire data architecture, from how data is stored to what the user can do with it. In SalesForce, data is entered manually by sales teams, and in Google Analytics the data is collected automatically by Google via scripts embedded in your website. The data is then sent to and stored on SalesForce and Google servers respectively, and structured to best serve each application’s pre-defined purpose.
BI solutions, on the other hand, do not generate new data but rather plug into existing data landscapes, which changes everything from an engineering perspective and makes it all the more complicated. The data is almost always located somewhere other than the BI software itself, and it can be generated by many different applications, in countless different formats, and stored in a variety of locations.
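A small Python sketch can illustrate why this is harder than the pre-engineered case. It assumes two made-up sources, a CSV export and a JSON feed, that describe the same kind of record under different field names. All names here (`load_all`, `LOADERS`, the field maps, the sample data) are hypothetical and do not come from any particular BI product.

```python
import csv
import io
import json


def rows_from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """Parse JSON text holding one object or a list of objects."""
    data = json.loads(text)
    return data if isinstance(data, list) else [data]


# One loader per source format; every new format needs a new entry.
LOADERS = {"csv": rows_from_csv, "json": rows_from_json}


def load_all(sources):
    """Merge heterogeneous sources into rows with a shared schema.

    sources -- iterable of (format, raw_text, field_map) tuples, where
               field_map maps source field names to the shared names
               "customer" and "amount" (an assumed target schema).
    """
    unified = []
    for fmt, text, field_map in sources:
        for row in LOADERS[fmt](text):
            record = {target: row[source] for source, target in field_map.items()}
            record["amount"] = float(record["amount"])  # CSV yields strings
            unified.append(record)
    return unified


if __name__ == "__main__":
    crm_export = "Account,Amount\nAcme,1200\n"
    web_feed = '[{"client": "Acme", "revenue": 300}]'
    print(load_all([
        ("csv", crm_export, {"Account": "customer", "Amount": "amount"}),
        ("json", web_feed, {"client": "customer", "revenue": "amount"}),
    ]))
```

Each new source means another loader and another field map, which is exactly the kind of ongoing adaptation that an application controlling its own data entry never has to perform.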
While it is impossible to pre-engineer a solution that fits every possible data landscape, you can use a solution that is easily adapted to your existing data landscape and can be made to fit whatever changes lie ahead. This is just as true on-premises as it is in the cloud, especially if you’re not using a single cloud provider or location. So again, self-service flexibility and control in this area is not in itself a reason for a company to choose a cloud platform.