Sunday, July 05, 2009

Operational BI - Part 2

Having established the need for Operational BI in my previous post, I will now sketch out the architecture of an Operational BI solution.

The four important blocks to consider while designing an O-BI system are:

  1. Sourcing/Extraction Module
  2. Transformation & Load Module
  3. Data Retention Module
  4. Reporting Module

The Sourcing/Extraction Module covers the strategy for extracting data from the source, including the change data capture (CDC) design and the data transfer mechanism. Choosing the right extraction strategy can spell success or failure for the project. Almost 70% of O-BI projects fail because the wrong sourcing strategy was implemented.
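
To make the CDC idea concrete, here is a minimal sketch of one common approach, a timestamp "high-watermark" extract. It is only an illustration under stated assumptions, not the design this series will settle on: the `updated_at` column, the `extract_changes` function and the sample rows are all hypothetical.

```python
from datetime import datetime

def extract_changes(rows, last_watermark):
    """Return rows modified after last_watermark, plus the new watermark.

    Assumes every source row carries a reliable `updated_at` timestamp
    (a hypothetical column used only for this illustration).
    """
    changed = [r for r in rows if r["updated_at"] > last_watermark]
    # If nothing changed, keep the old watermark so the next run starts there.
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark

# Hypothetical source table contents.
source_rows = [
    {"id": 1, "amount": 10.0, "updated_at": datetime(2009, 7, 5, 9, 0)},
    {"id": 2, "amount": 25.0, "updated_at": datetime(2009, 7, 5, 9, 30)},
    {"id": 3, "amount": 40.0, "updated_at": datetime(2009, 7, 5, 10, 15)},
]

changed, watermark = extract_changes(source_rows, datetime(2009, 7, 5, 9, 15))
print([r["id"] for r in changed])  # rows 2 and 3 changed after 09:15
print(watermark)
```

Each run extracts only what changed since the previous run, which keeps the load on the source system small. Note that a pure timestamp approach cannot see deleted rows, so it is only one of several CDC options.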

The Transformation & Load Module covers the kind of loading tool-set that suits an O-BI system, including the expected load volumes, the loading patterns and the hand-shaking mechanisms with the source.

The Data Retention Module covers the parameters required for estimating the size of the sliding data storage windows.
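
As a rough illustration of what such an estimate might look like, here is a back-of-the-envelope sketch. The parameters (rows per hour, bytes per row, retention period and an overhead factor for indexes) are my assumptions for this example, and the numbers are made up.

```python
def window_size_bytes(rows_per_hour, bytes_per_row, retention_hours, overhead=1.0):
    """Estimate storage for a sliding window holding `retention_hours` of data.

    All four parameters are assumptions chosen for illustration;
    `overhead` is a multiplier for indexes and similar extras.
    """
    return int(rows_per_hour * bytes_per_row * retention_hours * overhead)

# Hypothetical figures: 50,000 rows/hour, 200 bytes/row, 72-hour window, 50% overhead.
size = window_size_bytes(50_000, 200, 72, overhead=1.5)
print(size)                      # 1080000000
print(round(size / 1024**3, 2))  # size in GiB
```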

And finally, the Reporting Module covers the kind of reports that an operational executive needs for making tactical decisions on an hour-by-hour basis.

These modules will be discussed in detail in my upcoming posts.

Saturday, April 18, 2009

Operational BI - Part 1

The genesis of BI has always been the need to find the BIBLE of decision making. But over the past decade, BI has transformed itself from a night watchman into more of a 24/7 call-center representative. It has become real-time. What drove this change? Why was the mutation to real-time necessary? What are the challenges in data integration? And finally, how can Operational BI (O-BI) be coupled with the Enterprise Analytical Reporting framework? I will assess each of these questions in greater detail and arrive at a design pattern for modeling an Operational BI solution.

Let us motivate the need for an operational BI solution with an example.

A store manager at a retail outlet manages various aspects of retailing - visual merchandising, customer experience, resource scheduling, loss prevention, product management (ordering, receiving, pricing, inventory). Let me explain each one of these facets of the retailing business briefly.

  • Visual Merchandising: Promotion of the sale of goods through visual appeal in the stores (source: Wikipedia).
  • Customer Experience: Reduced customer wait-time in the check-out counters.
  • Resource Scheduling: Monitoring the efficiency of the employee schedule for improved load balance of employee work-hours.
  • Loss prevention: Real-time monitoring of 'shrinkage' because of shoplifting, employee embezzlement, credit card fraud, system errors and many more.
  • Product Management: Real-time monitoring of product inventory.

Given this background, I will connect these process areas through a business case that puts a Store Manager in trouble, and show how an Operational BI solution can save his day.

Let us assume that the Store Manager has access to a reporting solution which refreshes once a day. He notices that daily sales have dropped compared to the previous day. He drills down further to investigate the cause of the decline and finds that the drop can be traced to one particular hour of the day. A deeper look reveals that an increased average customer wait-time during that hour caused a poor conversion rate. The wait time is finally attributed to a reduced work-force in that hour, because the employees took a longer lunch break (since they had turned up very early to work).

This problem could have been rectified easily if the store manager had had access to the data earlier. With real-time access, he would have noticed the dip in sales as it happened and could have taken corrective action, limiting the loss of sales in that hour. With a decent business case established for a real-time BI system, let's analyse what an operational BI is and how it helps solve the problem.
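
The kind of check a real-time system would run for the store manager can be sketched in a few lines. This is only a sketch under my own assumptions: the `flag_sales_dips` function, the 20% threshold and the hourly figures are all hypothetical.

```python
def flag_sales_dips(today, baseline, threshold=0.2):
    """Return (hour, drop) pairs where today's sales fall more than
    `threshold` (as a fraction) below the baseline for the same hour."""
    dips = []
    for hour, base in sorted(baseline.items()):
        current = today.get(hour)
        if current is None or base <= 0:
            continue  # hour not reported yet, or no usable baseline
        drop = (base - current) / base
        if drop > threshold:
            dips.append((hour, round(drop, 2)))
    return dips

# Hypothetical hourly sales; today's data has only arrived up to 1 pm.
yesterday = {11: 1200.0, 12: 1500.0, 13: 1400.0, 14: 1300.0}
today = {11: 1180.0, 12: 1450.0, 13: 840.0}

print(flag_sales_dips(today, yesterday))  # [(13, 0.4)]
```

With a daily refresh, the 1 pm dip only shows up the next morning. With hourly data, the same comparison raises the flag while there is still time to react.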

The architecture of Operational BI and the challenges associated with it will be posted in the next article.

Tuesday, October 21, 2008

Infobright's column-based data warehousing

Infobright, an open source data warehousing start-up, addresses the performance issues that usually come with a data warehouse by implementing a highly compressed column-oriented store. The data is stored in columns instead of rows, which reduces I/O because of the compression ratios obtained on the columns. Data is stored in 65K blocks, or nodes, along with a metadata store describing the relationships between columns.
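
To see why a column layout compresses well, here is a toy sketch. It uses simple run-length encoding, which is my choice for illustration and not a description of Infobright's actual compression algorithms; the sample rows are made up.

```python
from itertools import groupby

# Hypothetical sales rows: (date, store, product).
rows = [
    ("2008-10-21", "store_1", "soda"),
    ("2008-10-21", "store_1", "chips"),
    ("2008-10-21", "store_1", "soda"),
    ("2008-10-21", "store_2", "soda"),
]

def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]

def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

columns = to_columns(rows)
encoded = [run_length_encode(col) for col in columns]
print(encoded[0])  # [('2008-10-21', 4)]
print(encoded[1])  # [('store_1', 3), ('store_2', 1)]
```

Values within a single column tend to repeat, so each column collapses into a handful of pairs, whereas the same values interleaved row by row would offer far fewer repeats to exploit.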

Some of Infobright's key customers are RBC Royal Bank and Xerox. The company claims its product is ideal for data warehouses ranging from 500GB to 30TB, and its compression ratios are close to 40:1 according to its community blogs. The most attractive feature is compatibility with existing Business Intelligence tools like Business Objects and Pentaho.

I wasn't very convinced by the concurrency on offer: the product supports 50-100 users with 5-10 concurrent queries. I will watch the progress of this exciting new player in the already crowded BI market.

Thursday, July 10, 2008

Release Early Release Often

The Release Early Release Often (RERO) technique proposes releasing early and often instead of making one big-bang release. This approach is typical of tech startups and open source projects. That’s the reason we see many of Google’s products still in beta, with updates released about once a month. We decided to try the strategy on a big Master Data Management (MDM) project, and the experiment turned out to be successful. The rest of this essay describes our experience with that implementation.

User Thrill

Important features of the application were phased across distinct releases. Some of them were Hierarchy & Workflow management, Security and Exception reporting. The interval between releases was as short as 2 weeks, which meant the users saw features being added every 2 weeks. We captured user feedback on each release and made sure we addressed it in the next one. This approach had a two-pronged benefit: the users experienced the application very early, and we found the bugs early. By the time we reached the UAT phase, the application was close to zero defects. We were a bit skeptical about whether user participation would be high, but since the product was there to be played with, it naturally attracted them.

Incremental Application testing

The application was being tested from the day the first beta was released, which was effectively the “Go Live” day. Although a few unpleasant bugs left some negative impressions on the users, they knew that the application was in its beta stage and that the next release would carry the fixes. In fact, our testing team grew from a 3-member team to a 6-member virtual team (the 3 additional members were business users).

Support framework

To enable such a dynamic release process, the revision control and the code review/release systems must be efficient, because there will be multiple releases instead of one. The integration testing should be solid. The unit testing before each release should be good enough that bugs do not distract your users completely and defeat the purpose. Meticulous planning of the releases is also key to success. The development tools that you use should be agile and adaptable enough to accept and implement the users’ feedback for the next release.

Conclusion

The experiment turned out to be a success. This strategy should work for most of your implementations, unless it’s a maintenance project whose deliverable takes less than a week.

Wednesday, June 25, 2008

Which MDM approach is right for you?

MDM has matured a great deal over the past 5 years. Most MDM implementations follow one of 2 approaches.

  1. Operational MDM (the tougher of the two)
  2. Analytical MDM

Operational MDM enables synchronization of master entities and their attributes between the transaction processing systems. Why does one need such an MDM? Let's take an example. ABC Corporation is a manufacturing firm. It conducts roadshows and marketing campaigns to advertise its products. The salespeople collect customer information during those roadshows and feed it into their IT systems for follow-up. A different set of sales representatives collects feedback from customers on the products sold. They too enter the customer feedback into their IT systems. These are 2 different sets of CRM processes.

What typically happens in a mature company is that a set of batch processes picks up the master data from one system and transfers it to the other. This introduces delay, inconsistency, inaccurate data and a lot of manual reconciliation (the same customer name can be entered by 2 different salespeople, or the latest survey from a salesperson can erase previously collected information about the customer). So IT develops custom programs to clean up and reconcile the data, but still cannot manage to do all this in real time.

This mess can be reduced or eliminated by deploying an operational MDM. Operational MDM tools solve the synchronization problem using complex match-merge algorithms. Some of the vendors currently in the market are Siperian, IBM, Purisma, Oracle and SAP.
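
To give a feel for what match-merge means, here is a deliberately simple sketch. Commercial tools use far more sophisticated (often fuzzy or probabilistic) matching; this example only matches on a normalized name, and the field names and sample records are hypothetical.

```python
def match_key(record):
    """Normalize the name so trivially different spellings match.
    Real tools match on many attributes; this is an assumption for illustration."""
    return " ".join(record["name"].lower().split())

def merge(records):
    """Merge records that share a match key into one master record.
    Later non-empty values win; empty values never overwrite existing data."""
    golden = {}
    for rec in sorted(records, key=lambda r: r["updated"]):
        master = golden.setdefault(match_key(rec), {})
        for field, value in rec.items():
            if value not in (None, ""):
                master[field] = value
    return list(golden.values())

# The same customer entered by 2 different salespeople.
records = [
    {"name": "Jane  Doe", "phone": "555-0100", "email": "", "updated": 1},
    {"name": "jane doe", "phone": "", "email": "jane@example.com", "updated": 2},
]

masters = merge(records)
print(masters)
```

The two entries collapse into a single master record that keeps the phone number from the first and the email from the second, so the later survey does not erase what was collected earlier.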

Analytical MDM is the architectural approach to take if the problem revolves around inconsistent reporting for business performance management. In simple terms, inconsistent hierarchies are being reported, which calls for a unified reporting view of the master data. The audience for this system is the downstream data warehousing and business intelligence applications. Some of the MDM vendors selling their expertise in this area are Kalido, Oracle and IBM.

An organization ultimately has to build both these models to address its MDM needs. But which one to choose first depends on which problem is higher on its priority list.

Friday, June 20, 2008

Teradata's reseller alliance with Trillium

Teradata Corporation announced its reseller alliance partnership with Trillium Software. Teradata will now combine its warehouse product with Trillium's Data quality tools and its own MDM products. Overall, this seems to be a good strategy for Teradata, because now Teradata's customers can leverage Trillium's data quality abilities on their huge databases.

Because of this alliance, customers will get a power-packed database, data quality tools and an MDM suite. However, Information Difference has ranked Teradata's MDM low in its quadrant compared to the likes of SAP, Oracle and Siperian.

Thursday, June 19, 2008

Buy or Make - Financial Analytics

Today, I had a consulting assignment with a company focussing on Server Virtualization. The objective was to lay out the factors influencing a Make vs Buy (mVb) decision, and their risk quotients, for a Financial Analytics Solution.

Some of the factors are:

  1. What is the business requirement and is the requirement very unique?
  2. How urgent is the application?
  3. What is the technology Strategy of the Organization?
  4. Does the off-the-shelf product address most of the requirements, and is it flexible enough to be customized?
  5. How does the present make-buy decision relate to the strategy?
  6. Are there the right people and support systems to support the application, in case of a build?
  7. Does the financial tool address internationalization needs?
  8. Are there security measures built into the tool, given that it hosts sensitive data?
  9. Can the Integration of the Packaged Solution into the process control system be done seamlessly?
  10. What is the underlying technology? In this case, what is the ERP system? It would make sense to buy the analytical solution from the same vendor as the ERP system, if it addresses your requirements.
  11. Will the TCO be reduced because of the Buy approach?
  12. Does it reduce cost?

After these questions were answered, the following matrices were prepared to sum up the decision:

  1. High Level Requirement x Priority x Effort Estimation Matrix
  2. Benefit Comparison Matrix
  3. Risk Comparison Matrix
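
A comparison matrix of this kind often boils down to a weighted score per option. The sketch below shows the arithmetic only; the criteria, weights and scores are entirely hypothetical and are not the ones used in the actual assignment.

```python
def weighted_score(scores, weights):
    """Sum of score x weight over all criteria."""
    return sum(scores[criterion] * weights[criterion] for criterion in weights)

# Hypothetical criteria weights (1 = least important, 5 = most important).
weights = {"fit_to_requirements": 5, "time_to_deliver": 3, "total_cost": 4}

# Hypothetical scores per option (1 = poor, 5 = excellent).
options = {
    "make": {"fit_to_requirements": 5, "time_to_deliver": 2, "total_cost": 2},
    "buy": {"fit_to_requirements": 3, "time_to_deliver": 5, "total_cost": 4},
}

totals = {name: weighted_score(scores, weights) for name, scores in options.items()}
print(totals)  # {'make': 39, 'buy': 46}
```

The totals are only as good as the weights, so the matrix works best as a way to make the trade-offs explicit rather than as a final verdict.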