Saturday, August 16, 2008

Linked Question response

What Business Intelligence trends have you observed?

In addition to any trends you are observing, which ones do you feel will make the most positive impact for business users of BI?


There have been some positive trends, although the road towards next generation or "Agile BI" has definitely been a slow one. So what constitutes Agile Business Intelligence?

1 - A greater connection with and allegiance to the end user; in other words, co-opting users into the development process. This includes the democratization of BI: a more open, less costly licensing approach which recognizes that BI is applicable across the enterprise, not just in the hands of a select few.

2 - BI is getting faster, providing more value and becoming more cost-effective.

3 - Greater flexibility in terms of including ad hoc reporting capabilities, coordination with unstructured data sources and user collaboration.

Perhaps the best way to bundle these considerations into a single definable trend is to view BI as becoming a more responsive, rapid and cost-effective part of the enterprise.


copyright 2008, Semantech Inc.

Tuesday, April 22, 2008

What is Master Data ?

I was speaking with someone this week on the topic of MDM - it occurred to me not long after we began the conversation that we more than likely had differing perspectives on what Master Data Management meant. That has inspired me to take a stab at a more formal definition than I've seen anywhere else thus far...

MDM, The Core Concept:

Let's start with what Master Data is not. Master Data is not:
  1. Meta-data, which is a description of data (or "data about data," as it's commonly called).
  2. Ontology, Taxonomy or Vocabulary - Master data can be derived from these but is not in itself a formal semantic construct.
  3. A software tool - ultimately, Master Data is technology-agnostic; it is a logical construct which can be defined through various modeling tools and realized through a variety of data management software solutions. At the point where Master Data becomes tightly coupled with any one software tool or any one modeling technique, it will likely lose a great deal of its potential value to the enterprise.
So, then, what is it? How would we characterize what can and cannot become Master Data?
  1. It may be considered "data of record" or an authoritative data source, but it need not be. Data of record implies that there is a system of record with sanctioned data elements that are not meant to be repeated in other systems across the enterprise. Alternatively, it might refer to data entities which are determined to be unique and authoritative across the enterprise regardless of their current use (in a system).
  2. Master Data is reference data, sort of, if we consider reference data to be a definitive set of element definitions or entities associated with a particular business domain, sub-domain or problem space. In this capacity, Master Data may serve multiple roles, including discovery, registry or repository access, and data dictionary foundation.
  3. Benchmark - this is a critical consideration; any data entities defined as Master Data elements within an enterprise are unlikely to remain unchanged. Eventually there will be variations of Master Data sets. These variations must be traceable back to their source, and there must also be a mechanism whereby others in the enterprise can understand where, why and how those modifications occurred 'atop' the core sets of Master Data. Thus the Master Data is a baseline or benchmark against which the data chain of custody can be managed or tracked.
  4. It can be a canonical data model or data exchange model - this is important in cases where the core data architecture has not yet been designed or deployed, or in cases where it is anticipated that there will be a major or radical transformation of the existing architecture to a new one. The model can contain Master Data elements or sets within it.
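
The "benchmark" idea in point 3 can be made a little more concrete with a rough sketch. Everything here is an assumption made for illustration (the class names, the fields and the sample entities are hypothetical and do not come from any MDM product): each variation of a Master Data set keeps a pointer to the record it was derived from, along with where and why it was changed, so the chain of custody can be walked back to the baseline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterRecord:
    """One version of a Master Data entity (hypothetical structure)."""
    record_id: str
    attributes: dict
    parent_id: Optional[str] = None  # None marks the baseline
    changed_by: str = ""             # where the modification was made
    reason: str = ""                 # why the modification was made


class MasterDataCatalog:
    """Keeps baselines and their variations so lineage can be traced."""

    def __init__(self):
        self._records = {}

    def add_baseline(self, record_id, attributes):
        record = MasterRecord(record_id, dict(attributes))
        self._records[record_id] = record
        return record

    def derive(self, parent_id, record_id, changes, changed_by, reason):
        # A variation starts from its parent's attributes and applies changes.
        parent = self._records[parent_id]
        merged = {**parent.attributes, **changes}
        record = MasterRecord(record_id, merged, parent_id, changed_by, reason)
        self._records[record_id] = record
        return record

    def lineage(self, record_id):
        """Return the chain from the given record back to its baseline."""
        chain = []
        current = self._records[record_id]
        while current is not None:
            chain.append(current)
            if current.parent_id is None:
                current = None
            else:
                current = self._records[current.parent_id]
        return chain


catalog = MasterDataCatalog()
catalog.add_baseline("customer-core", {"name": "text", "country": "text"})
catalog.derive(
    "customer-core",
    "customer-billing",
    {"tax_id": "text"},
    changed_by="billing system",
    reason="invoicing requires a tax identifier",
)
for record in catalog.lineage("customer-billing"):
    print(record.record_id, "<-", record.parent_id, "|", record.reason)
```

The point of the sketch is only that the baseline is never overwritten; variations sit 'atop' it and carry their own explanation.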
MDM in Today's Implementations

Much of what we now refer to as MDM solutions was born out of previous product solutions that were described as meta-data management solutions. For many of the MDM solutions on the market, the "repository or registry" architectural construct / pattern is how this capability is harnessed. Another architectural approach related to MDM might be referred to as the middleware design - this extends MDM into data transport and is focused on supporting accurate message translation. And of course there are solutions that combine both aspects.

Perhaps we can consider that there are at least two philosophical approaches to MDM, along with a hybrid of the two:
  • Passive MDM - This is most closely aligned with the original meta-data management solutions, with a central repository to support discovery and high-level data reconciliation.
  • Active MDM - This is most closely aligned with solutions stemming from EAI, middleware and ETL-based solutions, where data reconciliation rules are applied at multiple levels and in more detail.
  • Hybrid MDM - Both of the approaches above are relatively weak in dealing with bi-directional reconciliation focused heavily on transactional systems (it is much easier to reconcile historical data from multiple sources than real-time data from multiple sources). Hybrid MDM applies both previous techniques and new ones to tackle the most problematic use cases.
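
As a rough illustration of the difference between the passive and active approaches, here is a minimal sketch. The system names, the sample records and the "most trusted source wins" rule are all assumptions chosen for the example; real products apply far richer rules.

```python
# Passive MDM: a registry that only records where each master entity lives.
# The system and table names below are hypothetical.
registry = {
    "customer": ["crm.contacts", "billing.accounts"],
    "product": ["erp.items"],
}


def discover(entity):
    """Return the source systems known to hold the given entity."""
    return registry.get(entity, [])


# Active MDM: reconciliation rules applied to the records themselves.
def reconcile(records, trusted_order):
    """Merge records for one entity, preferring the more trusted sources.

    `trusted_order` lists sources from most to least trusted. Empty values
    never overwrite a value that another source has supplied.
    """
    merged = {}
    # Walk from least to most trusted so trusted values are written last.
    for source in reversed(trusted_order):
        for field_name, value in records.get(source, {}).items():
            if value not in (None, ""):
                merged[field_name] = value
    return merged


records = {
    "crm.contacts": {"name": "Acme Corp.", "phone": ""},
    "billing.accounts": {"name": "ACME", "phone": "555-0100"},
}
print(discover("customer"))
print(reconcile(records, trusted_order=["crm.contacts", "billing.accounts"]))
```

The passive side answers "where is it?"; the active side decides "which value wins?". A hybrid would need both, plus some way of pushing the reconciled value back to the transactional sources.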
It is clear to anyone who has worked with database development and data-system integration that the ability to reconcile data sources adds tremendous value to the enterprise, helping to improve performance, integrity and overall efficiency. Being able to automate some of this using COTS tools is even more appealing. However, there is still a set of processes which ultimately takes precedence if one is to deploy a successful MDM solution. The enterprise data governance approach must be defined first, the data and business environment must be modeled, and if ownership of data sets is to be handed off to user groups (either fully or partially), the impact on both governance and model maintenance must be considered and mitigated in advance.

As we've discovered with nearly every IT technology and product over the past 40 years, implementation without process or architectural considerations leads to many issues, often more issues than existed before the new technology was introduced. MDM is no exception - the most important thing to consider here is that deployment of MDM software can significantly impact both solution performance and integrity (and there is no guarantee that the impact will be positive).


Saturday, April 19, 2008

ETL versus MDM

Over the past few weeks I've been engaged in a number of conversations with folks regarding the true nature of Master Data Management (MDM). The more I've thought about it, the more it seems to me that there needs to be better definition or consensus on the topic. The reality is that MDM has been largely a marketing-driven term more than an emerging disruptive technology. MDM solutions do not follow any industry-standard MDM methodology, and most of the capabilities within these COTS products are beginning to find their way into solutions not defined as MDM.

To understand what this all means we need to take a step back for a moment and think about where MDM came from. In the early 2000s there was a small set of meta-data repository tools. The problem with these, though, was that they proved difficult to use, had a fairly significant solution footprint and, more importantly, seemed somewhat disconnected from the data sources they were charged with tracking.

The notion of what is 'Master Data' is also related to the Data Warehousing trend going back to the mid-1990s. The idea embodied by most DW solutions at that time was the desire to obtain a "single version of the truth" or one set of standard data entities. In the case of the DW solutions, though, this meant the actual entities and not the meta-data about the entities. This is an important distinction, as it soon became apparent that consolidating data architectures was much more difficult than characterizing or defining them and later governing them through meta-data.

So the concept behind MDM has its roots in a series of attempts to obtain higher levels of management control over diverse sets of enterprise data. This began to imply a data governance lifecycle as well as the need to be able to reconcile data through a meta-data management layer. However, the folks who came up with "MDM" products were not the only ones who understood this. Similar capabilities were being built into products on both the front end and the back end of the data management lifecycle. The back end, i.e. the part which interfaces directly with the DBMSs from a data loading perspective, is referred to as ETL. ETL stands for Extract, Transform and Load, but of course ETL tools have become more sophisticated than their original name might imply.
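
For readers who have not worked with ETL directly, the three steps the name refers to can be sketched in a few lines. This is only a toy illustration using Python's standard library; the CSV input, the cleanup rules and the `customer` table are all invented for the example and say nothing about how any commercial ETL tool works.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extract: read raw rows from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: apply simple, assumed cleanup rules to each row."""
    cleaned = []
    for row in rows:
        cleaned.append({
            "name": row["name"].strip().title(),
            "country": row["country"].strip().upper(),
        })
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer (name TEXT, country TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer (name, country) VALUES (:name, :country)",
        rows,
    )
    conn.commit()


source = "name,country\n  acme corp ,us\nglobex,de\n"
conn = sqlite3.connect(":memory:")
load(transform(extract(source)), conn)
print(conn.execute(
    "SELECT name, country FROM customer ORDER BY name"
).fetchall())
```

Notice that the transform step is where reconciliation rules would naturally live, which is one reason ETL tools and MDM tools have been growing toward each other.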

The most commonly recognized COTS tool dedicated to ETL is Informatica - but now Informatica has extended its solution to encompass what is being referred to as MDM. Traditionally, the easiest way to distinguish MDM from ETL was the fact that MDM solutions were somewhat abstracted from the full data lifecycle, providing a reference point or front-end discovery mechanism. That, as we've noted, is changing.

Prediction - Given the all-encompassing title implied by 'MDM,' I believe that the concept and the related tools are going to grow into that title. MDM will extend across the data governance spectrum to eventually encompass all of it.



Monday, March 24, 2008

Data Federation Presentation

Vision Statement - Agile Business Intelligence

It may be time for us to reconsider some of our previous assumptions regarding data integration. For decades, we’ve presumed that the best method for managing data was through strict conformance and control – we viewed the enterprise as static, remaining stable once properly defined. We’ve discovered that often quite the opposite is true (or that in fact we must contend with multiple enterprises). Today we are facing more complexity in data integration than ever before: more data sources, greater volumes of data, more solution paradigms to deal with and greater expectations for cross-domain data exchange.

Moreover, data integration has become the linchpin within holistic architectures based upon services and sophisticated business process orchestration. Agile Intelligence is not a vendor-focused solution; it is a technology agnostic philosophy designed to address the new realities of enterprise integration. It provides a context and methodology that aligns technical solutions with the expectations that drive them. Agile Intelligence provides a solution focus that pulls the big picture together without losing sight of those whom it serves, the end-users.


Sunday, March 23, 2008

An Enterprise view of Master Data Management

What do your enterprise’s databases, applications and documents all have in common? Do they share a common data model? Not likely. But all of these resources do represent enterprise information and can be characterized and tracked using metadata. Some studies have shown that the time we spend looking for information resources gobbles up 25 to 30% of the average workday. Metadata is not just a technical consideration; it can define productivity in our knowledge economy.

Master Data Management (MDM) has arisen out of the pragmatic need to control metadata across disparate data sources and applications within a given enterprise. The problem thus far with MDM solutions is that they tend to be focused just within a single enterprise. The principles behind Master Data Management, however, can be extended to accommodate semantic mediation between disparate enterprises or functional domains. Innovators are positioning MDM as the discovery fabric between service-oriented environments and the semantic governance layer within unique enterprise enclaves. This new role is referred to as Enterprise Master Data Management, eMDM. End users are the key to making this work, as they are responsible for defining the semantic layer and the rules for discovery between domains.
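
To give a feel for what semantic mediation between two domains might look like in its simplest form, here is a small sketch. The two domains, their field names and the shared master terms are all hypothetical; the assumption is simply that each domain's users maintain a mapping from their local vocabulary to a common set of master terms.

```python
# Each domain maps its local field names to shared master terms.
# All names below are invented for the example.
SALES_TO_MASTER = {"cust_nm": "customer_name", "ctry": "country"}
MASTER_TO_SUPPORT = {"customer_name": "clientName", "country": "clientCountry"}


def mediate(record, source_to_master, master_to_target):
    """Translate a record from one domain's vocabulary to another's.

    Returns the translated record plus the list of source fields that
    have no agreed mapping, so users can see what still needs defining.
    """
    translated = {}
    unmapped = []
    for field_name, value in record.items():
        master_term = source_to_master.get(field_name)
        target_field = master_to_target.get(master_term)
        if target_field is None:
            unmapped.append(field_name)
        else:
            translated[target_field] = value
    return translated, unmapped


sales_record = {"cust_nm": "Acme", "ctry": "US", "region": "East"}
print(mediate(sales_record, SALES_TO_MASTER, MASTER_TO_SUPPORT))
```

The list of unmapped fields matters as much as the translation itself, since it is the end users who decide whether and how those gaps should be closed.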

In Agile Business Intelligence, the nature of data transformation takes on a new character, focusing on the steps necessary to aid user-driven discovery and manipulation. Attempts to build complex logic and data integration at or near the data source level often represent developer assumptions about user need, usually without the benefit of full validation. Worse yet, with the old techniques many of those assumptions end up being ‘hard-coded’ into solutions. The Master Data Management best practice is a good illustration of how and why logic needs to migrate closer to the end user. The metadata we gather to help determine which types of data discovery occur most often is validated by users, then stored here and used to make other decisions regarding performance optimization within the federated data layer. In many ways, data integration is moving to more of a “search engine” model: data heuristics rather than data modification.




Thursday, March 20, 2008

Introduction to ABI

Hello and welcome to the Agile Business Intelligence (ABI) Blog. You may be wondering just what ABI is - what makes it different from ordinary Business Intelligence or Decision Support. That's precisely what we intend to explore on this Blog. We'll start at the beginning though, with the definition:

Agile Business Intelligence -

The Internet has redefined people’s expectations about information. Access to information has become easier, faster and most importantly, more flexible. The search engine discovery interface in many ways is the ultimate ad hoc reporting tool. Over the past year, companies have been scrambling to imitate that interface on the desktop and even combine the two. Why is it so popular, even considering that the results of current search engines are not particularly accurate? The answer is simple; the user is in total control of the discovery process. A new generation of BI capabilities is bringing that user control to more sophisticated report generation tools with much more accurate results. This is Agile BI.

From the developer’s perspective, instead of making assumptions we use our relationship with the end users to let them drive and validate logic from their perspective. Agile BI is focused on utilizing ad hoc capabilities to drive discovery and determine optimization strategies. Agile BI provides the user toolset for influencing the rest of the Agile Data Architecture. Based upon user queries and activities, we gather metadata, optimize caches and determine cross-domain mapping strategies. Some have referred to this capability as “Operational BI”; however, that doesn’t capture the context of what’s really happening across the entire architecture.
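
As a very rough sketch of what "gathering metadata from user queries to optimize caches" could mean in practice, consider the following. The class, the entity names and the threshold are assumptions made for illustration only; the idea is that usage is recorded per ad hoc query and the most frequently requested entities become candidates for caching.

```python
from collections import Counter


class QueryUsageLog:
    """Counts how often each data entity appears in ad hoc user queries."""

    def __init__(self):
        self._counts = Counter()

    def record(self, entities):
        # Count each entity once per query, even if it is named twice.
        for entity in set(entities):
            self._counts[entity] += 1

    def cache_candidates(self, min_hits):
        """Return entities queried at least `min_hits` times, busiest first."""
        return [
            entity
            for entity, hits in self._counts.most_common()
            if hits >= min_hits
        ]


log = QueryUsageLog()
log.record(["customer", "orders"])
log.record(["customer"])
log.record(["customer", "product"])
print(log.cache_candidates(min_hits=2))
```

Here the users, not the developers, end up deciding what gets optimized, simply by what they ask for.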

Agile BI is not merely another channel for managing Business Intelligence; it is a new way to view the entire practice of BI. This best practice assumes that BI is of value to the whole enterprise, not merely 10% of it. This practice also assumes that for BI to work, the end user must become a partner in the development process. Most importantly, this is where we gain most of our insights into the overall performance of our Agile Data Architecture.
