Cloud-Based BI for On-Premise Data

The title of this article is no oxymoron. Why should cloud-based Business Intelligence (Cloud BI) solutions be limited to cloud-based data only? In fact, you should expect your Cloud BI solutions to work well with all your data, including on-premise data.

Wait a moment, you’d say, my on-premise data sits behind firewalls and is not accessible from the cloud! Nor should it be accessible from the cloud lest data security be compromised!

Oh, but I insist:

  • Cloud BI can be implemented without poking any holes in your network firewalls
  • There’s no need to copy on-premise data to the cloud
  • Cloud BI is more secure than client-server solutions that keep potentially sensitive data on portable computers
  • Cloud BI is centrally managed, allowing you to easily cut access and control the load on your data source

Let me explain:

No Holes in your Firewall

The key to cloud-based access to on-premise data is a data-access agent. For example, the Explore Analytics Agent resides in your private network and communicates with the cloud-based Explore Analytics over HTTPS. The connection is initiated by the agent, and there’s no port that accepts connections from the outside. Even though the agent initiates the connection, user queries are processed instantly, supporting interactive data exploration and live reports. And you have precise control over which data the agent can access.
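To make the direction of the connection concrete, here is a minimal sketch of an agent that pulls work from the cloud rather than listening for it. It is not the Explore Analytics Agent's actual protocol: the names FakeCloud, run_agent_once, and ALLOWED_TABLES are hypothetical, and the cloud service is simulated with an in-memory queue where a real agent would make outbound HTTPS requests.

```python
import queue
import sqlite3

# Hypothetical allow-list: the only tables this agent is permitted to query.
ALLOWED_TABLES = {"orders"}


class FakeCloud:
    """Stand-in for the cloud service. A real agent would reach it with
    outbound HTTPS requests; nothing here ever connects to the agent."""

    def __init__(self):
        self.pending = queue.Queue()  # queries waiting for the agent
        self.results = {}             # query id -> rows or error

    def fetch_work(self):
        """Called by the agent (outbound). Returns a query or None."""
        try:
            return self.pending.get_nowait()
        except queue.Empty:
            return None

    def post_result(self, query_id, payload):
        """Called by the agent (outbound) to deliver a result."""
        self.results[query_id] = payload


def run_agent_once(cloud, db):
    """One cycle: pull a query, check the allow-list, run it locally, push the result."""
    work = cloud.fetch_work()
    if work is None:
        return False
    query_id, table, sql = work
    # Simplification: a real agent would build or parse the SQL itself
    # instead of trusting the table name declared alongside it.
    if table not in ALLOWED_TABLES:
        cloud.post_result(query_id, {"error": "table not allowed"})
    else:
        cloud.post_result(query_id, {"rows": db.execute(sql).fetchall()})
    return True


# Demo: an on-premise database and two queued user queries.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5)],
)

cloud = FakeCloud()
cloud.pending.put(
    (1, "orders", "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region")
)
cloud.pending.put((2, "payroll", "SELECT * FROM payroll"))

while run_agent_once(cloud, db):
    pass

print(cloud.results[1])  # {'rows': [('east', 12.5), ('west', 5.0)]}
print(cloud.results[2])  # {'error': 'table not allowed'}
```

Every call in the sketch goes from the agent to the cloud, which is why no inbound port is needed; the allow-list is one simple way to express control over what the agent may read.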

No Need to Copy Data

Queries are performed on-premise and results are streamed back to the user. Copying data to the cloud is not necessary.

Don’t Keep Sensitive Data on Portable Computers

Your users use laptops and tablets. These devices are easily lost or stolen, potentially leaving you to deliver the bad news to your customers that their data might have been compromised. A SaaS web-based solution that can be accessed from anywhere using nothing more than a browser is thus more secure, because it eliminates the need to keep data on users’ laptops or tablets. You can easily change a password to cut off access from a lost device. Explore Analytics is specifically designed for self-service BI, replacing the need to use Excel spreadsheets.

Centrally Managed

With centralized data access, you can easily limit the number of concurrent queries, or limit the amount of data that you allow users to fetch in a single query. Since data is filtered and aggregated at the source, there’s never a need to fetch more than a few thousand rows. When users leave your company, off-boarding is easy. Moreover, if you need to temporarily cut off all access to your data, then Cloud BI makes it a cinch.

Conclusion

Using Cloud BI solutions can greatly reduce costs and improve your users’ experience, and can be accomplished without sacrificing data security. The availability, confidentiality, and integrity of your data can be controlled at the level required by your business and the sensitivity of your data.

About Explore Analytics

Explore Analytics is a self-service SaaS BI tool for data analysis, visualization, and reporting. For more information, see the Explore Analytics website.

 

Real-Time Access to SaaS Data

Introduction

Data stored in SaaS applications is often inaccessible to BI tools. This is a major headache for early adopters of SaaS applications. With on-premise applications, IT departments can bypass the application and access data directly from the underlying database. With multi-tenant SaaS applications, such direct database access is not available because the database is shared with other customers.

Understanding the Problem

Ideally, all data access should go through the application. There are some very compelling reasons to go through the application:

  • The application manages data-level access rights. For example, allowing a user to only see data for their region.
  • The application manages data at a business-object level. Such data objects are often assembled via object-relational mapping of application objects to relational database tables.
  • Multitenant SaaS applications restrict users from seeing data that belongs to other tenants.

For these reasons, bypassing the application to access data directly from the underlying database is not a good idea in general, and is not possible with SaaS applications.
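A small sketch shows what is lost when the application is bypassed. Here the application layer adds the tenant and region restrictions to every query, while a direct query against the table sees everything. The schema and the names User and sales_for are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    tenant_id: int
    region: str


def sales_for(db, user):
    """Application-level access: the tenant and region predicates are always applied."""
    return db.execute(
        "SELECT product, amount FROM sales "
        "WHERE tenant_id = ? AND region = ? ORDER BY product",
        (user.tenant_id, user.region),
    ).fetchall()


# Demo: one shared table holding rows for two tenants.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (tenant_id INTEGER, region TEXT, product TEXT, amount REAL)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "east", "widget", 100.0),
        (1, "west", "widget", 200.0),
        (2, "east", "gadget", 300.0),
    ],
)

print(sales_for(db, User(tenant_id=1, region="east")))  # [('widget', 100.0)]

# Direct database access skips those rules and exposes every tenant's rows.
print(db.execute("SELECT COUNT(*) FROM sales").fetchone()[0])  # 3
```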

Current Strategies

Let’s review the strategies that applications currently provide for data access.

Data Export

Most if not all applications allow users to export data into a file, typically Excel or CSV, that can be loaded into a spreadsheet or imported into a BI tool. This approach is easy to use and works with most tools; however, it suffers from several serious drawbacks:

  • Data is outdated as soon as it is exported
  • Works well for small data sets, but takes too long to move large amounts of data
  • Works well for single tables, but not so well when the analysis requires data from multiple related tables

Web Services

SaaS applications typically provide a Web Service API for data access. Access is direct and is managed by the application. In principle, this is the desired solution. However, due to a lack of standards, most SaaS applications provide limited APIs that are useful for obtaining specific records or for exporting data, but are not suited for query and reporting because they lack an expressive query language such as SQL.

Specifically, the missing pieces are:

  • Lack of support for aggregate queries. For example, requesting sales totals grouped by product and region. Without such an API, BI tools have to request potentially very large data sets and aggregate them on their own side. This very quickly becomes prohibitive for real-time data reporting.
  • Lack of support for table joins and data filtering (other than the most basic). For example, requesting all the orders for customers of a given sales person within a certain range of order size.
  • Lack of a standard API similar to SQL and ODBC/JDBC. This lack of standard means that BI vendors need to develop a connector for every application that they support and every application vendor has to implement their own API.
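The two example queries from the list above can be sketched with an in-memory SQLite database standing in for the application's data. The schema and sample rows are invented for illustration. The first part contrasts an export-style API, where the client fetches every row and aggregates locally, with an aggregate query answered at the source; the second part is the join with filtering.

```python
import sqlite3
from collections import defaultdict

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, sales_person TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER,
                     product TEXT, region TEXT, amount REAL);
INSERT INTO customers VALUES (1, 'Acme', 'kim'), (2, 'Globex', 'lee');
INSERT INTO orders VALUES
  (1, 1, 'widget', 'east', 100.0),
  (2, 1, 'widget', 'east', 250.0),
  (3, 2, 'widget', 'west', 400.0),
  (4, 1, 'gadget', 'east', 900.0);
""")

# Export-style API: every order row crosses the wire, then the client aggregates.
all_rows = db.execute("SELECT product, region, amount FROM orders").fetchall()
client_totals = defaultdict(float)
for product, region, amount in all_rows:
    client_totals[(product, region)] += amount

# Aggregate query: the source returns one row per (product, region) group.
source_totals = db.execute(
    "SELECT product, region, SUM(amount) FROM orders "
    "GROUP BY product, region ORDER BY product, region"
).fetchall()

print(len(all_rows), "rows transferred vs", len(source_totals))  # 4 rows transferred vs 3
print(source_totals)
# [('gadget', 'east', 900.0), ('widget', 'east', 350.0), ('widget', 'west', 400.0)]

# Join with filtering: orders for one sales person's customers within a size range.
kim_orders = db.execute(
    "SELECT o.id, c.name, o.amount FROM orders o "
    "JOIN customers c ON c.id = o.customer_id "
    "WHERE c.sales_person = ? AND o.amount BETWEEN ? AND ? "
    "ORDER BY o.id",
    ("kim", 200, 1000),
).fetchall()
print(kim_orders)  # [(2, 'Acme', 250.0), (4, 'Acme', 900.0)]
```

With four rows the difference is trivial, but the transferred row count grows with the data under the export style and only with the number of groups under the aggregate query.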

Data Warehousing

Given that SaaS applications do not provide an API for real-time data access, the typical, yet rather expensive, solution is to export data from the application into a relational database and then run reports against this database.

In addition to being expensive to set up and maintain, this solution also suffers from the fact that the data is accurate only as of the last time it was exported. Frequent data synchronization makes the solution even more expensive, and yet it is never real-time. Users today expect to see up-to-the-minute data, not yesterday’s data.

Standard Data Access API for SaaS Applications

The BI and SaaS vendor communities need to collaborate on defining an API for real-time data access. Technologically, this is not very hard, and it was done for relational databases back in the early nineties. I believe that the leadership must come from the SaaS vendor community because this is the community that stands to gain the most by solving this problem. If you belong to that community, then consider this a call to action. Please contact me if you’d like to develop this idea further.