Saturday, August 16, 2008
Linked Question response
In addition to any trends you are observing, which ones do you feel will make the most positive impact for business users of BI?
There have been some positive trends, although the road towards next-generation or "Agile BI" has definitely been a slow one. So what constitutes Agile Business Intelligence?
1 - A greater connection with, and allegiance to, the end user; in other words, co-opting users into the development process. This is the democratization of BI: a more open, less costly licensing approach which recognizes that BI is applicable across the enterprise, not just in the hands of a select few.
2 - BI is getting faster, providing more value and becoming more cost effective.
3 - Greater flexibility in terms of including ad hoc reporting capabilities, coordination with unstructured data sources and user collaboration.
Perhaps the best way to bundle these considerations into a single definable trend is to view BI as becoming a more responsive, rapid and cost-effective part of the enterprise.
copyright 2008, Semantech Inc.
Tuesday, April 22, 2008
What is Master Data?
MDM, The Core Concept:
Let's start with what Master Data is not. Master Data is not:
- Meta-data, which is a description of data (or "data about data," as it is commonly called).
- Ontology, Taxonomy or Vocabulary - Master data can be derived from these but is not in itself a formal semantic construct.
- Software Tool - ultimately, Master Data is technology-agnostic; it is a logical construct which can be defined through various modeling tools and realized through a variety of data management software solutions. At the point where Master Data becomes tightly coupled with any one software tool or any one modeling technique, it will likely lose a great deal of its potential value to the enterprise.
Master Data may, however, be any of the following:
- Data of record - it may be considered "data of record" or an authoritative data source, though not necessarily. Data of record implies that there is a system of record with sanctioned data elements that are not meant to be repeated across other systems in the enterprise. Alternatively, it might refer to data entities which are determined to be unique and authoritative across the enterprise regardless of their current use (in a system).
- Reference data, sort of - if we consider that reference data is a definitive set of element definitions or entities associated with a particular business domain, sub-domain or problem space. In this capacity, Master Data may serve multiple roles, including discovery, registry or repository access, and data dictionary foundation.
- Benchmark - this is a critical consideration; any data entities defined as Master Data elements within an enterprise are unlikely to remain unchanged. Eventually there will be variations of Master Data sets. These variations must be tracked back to their source, and there must also be a mechanism whereby others in the enterprise can understand where, why and how those modifications occurred 'atop' the core sets of Master Data. Thus the Master Data is a baseline or benchmark against which the data chain of custody can be managed or tracked.
- It can be a canonical data model or data exchange model - this is important in cases where the core data architecture has not yet been designed or deployed, or in cases where it is anticipated that there will be a major or radical transformation of the existing architecture to a new one. The model can contain Master Data elements or sets within it.
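To make the "benchmark" idea a little more concrete, here is a minimal Python sketch of a chain of custody for one Master Data element. It is only an illustration under stated assumptions: the class names (MasterDataVersion, CustodyLog), the field names and the sample "Customer" definitions are all hypothetical and do not come from any particular MDM product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class MasterDataVersion:
    """One version of a Master Data element (all field names are illustrative)."""
    element: str                 # e.g. "Customer"
    version: int
    definition: str
    derived_from: Optional[int]  # version this one modifies; None for the baseline
    changed_in: str = ""         # where the modification occurred
    reason: str = ""             # why it occurred


@dataclass
class CustodyLog:
    """Keeps every version so a variation can be traced back to its baseline."""
    versions: dict = field(default_factory=dict)

    def add(self, v: MasterDataVersion) -> None:
        self.versions[(v.element, v.version)] = v

    def trace(self, element: str, version: int) -> list:
        """Return the chain from the given version back to the baseline."""
        chain = []
        current = self.versions.get((element, version))
        while current is not None:
            chain.append(current)
            if current.derived_from is None:
                break
            current = self.versions.get((element, current.derived_from))
        return chain


# Hypothetical usage: a baseline definition and one variation layered on top of it.
log = CustodyLog()
log.add(MasterDataVersion("Customer", 1, "A party that buys goods", None))
log.add(MasterDataVersion("Customer", 2, "A party that buys goods or services",
                          derived_from=1, changed_in="Billing system",
                          reason="Services added to the catalog"))
chain = log.trace("Customer", 2)
traced_versions = [v.version for v in chain]
```

The point of the sketch is simply that each variation carries a pointer to the version it modified, plus the where and why, so anyone in the enterprise can walk back to the baseline.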
Many of what we now refer to as MDM solutions were born out of earlier products that were described as meta-data management solutions. For many of the MDM solutions on the market, the "repository or registry" architectural construct / pattern is how this capability is harnessed. Another architectural approach related to MDM might be referred to as the middleware design - this extends MDM into data transport and is focused on supporting accurate message translation. And of course there are solutions that combine both aspects.
Perhaps we can consider that there are at least three philosophical approaches to MDM:
- Passive MDM - This is most closely aligned to the original Meta-data management solutions with a central repository to support discovery and high level data reconciliation.
- Active MDM - This is most closely aligned with solutions stemming from EAI, middleware and ETL-based products, where data reconciliation rules are applied at multiple levels and in more detail.
- Hybrid MDM - Both of the approaches above are relatively weak in dealing with bi-directional reconciliation focused heavily on transactional systems (it is much easier to reconcile historical data from multiple sources than real-time data from multiple sources). Hybrid MDM applies both previous techniques and new ones to tackle the most problematic use cases.
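The difference between the passive and active approaches can be sketched in a few lines of Python. This is a toy illustration, not how any vendor implements it: the registry contents, the system and field names ("crm", "billing", "cust_nm") and the name-normalization rule are all assumptions made up for the example.

```python
# Passive MDM: a registry that only records where each master element lives.
# (System and field names below are hypothetical.)
REGISTRY = {
    "customer_name": [("crm", "cust_nm"), ("billing", "customer_full_name")],
}


def discover(master_element):
    """Passive: return the (system, field) pairs that hold a master element."""
    return REGISTRY.get(master_element, [])


# Active MDM: reconciliation rules applied to the data values themselves.
def normalize_name(value):
    """Collapse runs of whitespace and apply consistent capitalization."""
    return " ".join(value.split()).title()


RULES = {"customer_name": normalize_name}


def reconcile(master_element, records):
    """Active: read the value from each source system and apply the element's rule.

    `records` maps a system name to one record (a dict) from that system.
    """
    rule = RULES[master_element]
    result = {}
    for system, field_name in discover(master_element):
        result[system] = rule(records[system][field_name])
    return result


# Hypothetical usage: the same customer spelled differently in two systems.
records = {
    "crm": {"cust_nm": "  acme   corp "},
    "billing": {"customer_full_name": "ACME CORP"},
}
locations = discover("customer_name")
reconciled = reconcile("customer_name", records)
```

Here discover() is all a passive repository offers, while reconcile() builds on it to change the values in flight. A hybrid approach would add the harder part that this sketch leaves out: pushing reconciled values back to transactional systems in near real time.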
As we've discovered with nearly every IT technology and product over the past 40 years, implementation without process or architectural considerations leads to many issues, often more issues than existed before the new technology was introduced. MDM is no exception - the most important thing to consider here is that deployment of MDM software can significantly impact both solution performance and integrity (and there is no guarantee that the impact will be positive).
Saturday, April 19, 2008
ETL versus MDM
To understand what this all means we need to take a step back for a moment and think about where MDM came from. In the early 2000s there was a small set of meta-data repository tools. The problem with these was that they proved difficult to use, had a fairly significant solution footprint and, more importantly, seemed somewhat disconnected from the data sources they were charged with tracking.
The notion of what is 'Master Data' is also related to the Data Warehousing trend going back to the mid-1990s. The idea embodied by most DW solutions at that time was the desire to obtain a "single version of the truth" or one set of standard data entities. In the case of the DW solutions, though, this meant the actual entities and not the meta-data about the entities. This is an important distinction, as it soon became apparent that consolidating data architectures was much more difficult than characterizing or defining them and later governing them through meta-data.
So the concept behind MDM has its roots in a series of attempts to obtain higher levels of management control over diverse sets of enterprise data. This began to imply a data governance lifecycle as well as the need to be able to reconcile data through a meta-data management layer. However, the folks who came up with "MDM" products were not the only ones who understood this. Similar capabilities were being built into products on the front end and back end of the data management lifecycle. The back end, i.e. the part which interfaces directly with the DBMSs from a data loading perspective, is referred to as ETL. ETL stands for Extract, Transform and Load, but of course ETL tools have become more sophisticated than their original name might imply.
The most commonly recognized COTS tool dedicated to ETL is Informatica - but now Informatica has extended its solution to encompass what is being referred to as MDM. Traditionally, the easiest way to distinguish MDM from ETL was the fact that MDM solutions were somewhat abstracted from the full data lifecycle, providing a reference point or front-end discovery mechanism. That is, as we've noted, changing.
Prediction - Given the all-encompassing scope implied by the name 'MDM,' I believe that the concept and the related tools are going to grow into it. MDM will extend across the data governance spectrum to eventually encompass all of it.
Monday, March 24, 2008
Vision Statement - Agile Business Intelligence
Moreover, data integration has become the linchpin within holistic architectures based upon services and sophisticated business process orchestration. Agile Intelligence is not a vendor-focused solution; it is a technology-agnostic philosophy designed to address the new realities of enterprise integration. It provides a context and methodology that aligns technical solutions with the expectations that drive them. Agile Intelligence provides a solution focus that pulls the big picture together without losing sight of those whom it serves: the end users.
Sunday, March 23, 2008
An Enterprise view of Master Data Management
Master Data Management (MDM) has arisen out of the pragmatic need to control metadata across disparate data sources and applications within a given enterprise. The problem thus far with MDM solutions is that they tend to be focused just within a single enterprise. The principles behind Master Data Management, however, can be extended to accommodate semantic mediation between disparate enterprises or functional domains. Innovators are positioning themselves to use MDM as the discovery fabric between service-oriented environments and the semantic governance layer within unique enterprise enclaves. This new role is referred to as Enterprise Master Data Management, eMDM. End users are the key to making this work, as they are responsible for defining the semantic layer and the rules for discovery between domains.
In Agile Business Intelligence, the nature of data transformation takes on a new character, focusing on the steps necessary to aid user-driven discovery and manipulation. Attempts to build complex logic and data integration at or near the data source level often represent developer assumptions about user need, usually without the benefit of full validation. Worse yet, with the old techniques many of those assumptions end up being 'hard-coded' into solutions. The Master Data Management best practice is a good illustration of how and why logic needs to migrate closer to the end user. The metadata we gather to help determine the types of data discovery that occur most often is validated by users, then stored here and used to make other decisions regarding performance optimization within the federated data layer. In many ways, data integration is moving to more of a "search engine" model - data heuristics rather than data modification.

Thursday, March 20, 2008
Introduction to ABI
Agile Business Intelligence -
The Internet has redefined people’s expectations about information. Access to information has become easier, faster and most importantly, more flexible. The search engine discovery interface in many ways is the ultimate ad hoc reporting tool. Over the past year, companies have been scrambling to imitate that interface on the desktop and even combine the two. Why is it so popular, even considering that the results of current search engines are not particularly accurate? The answer is simple; the user is in total control of the discovery process. A new generation of BI capabilities is bringing that user control to more sophisticated report generation tools with much more accurate results. This is Agile BI.
From the developer's perspective, instead of making assumptions we use our relationship with the end users to let them drive and validate logic from their perspective. Agile BI is focused on utilizing ad hoc capabilities to drive discovery and determine optimization strategies. Agile BI provides the user toolset for influencing the rest of the Agile Data Architecture. Based upon user queries and activities, we gather metadata, optimize caches and determine cross-domain mapping strategies. Some have referred to this capability as "Operational BI"; however, that doesn't capture the context of what's really happening across the entire architecture.
Agile BI is not merely another channel for managing Business Intelligence; it is a new way to view the entire practice of BI. This best practice assumes that BI is of value to the whole enterprise, not merely 10% of it. This practice also assumes that for BI to work the end user becomes a partner in the development process. Most importantly, this is where we gain most of our insights on the overall performance of our Agile Data Architecture.
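As a rough illustration of gathering metadata from user queries and using it to decide what to cache, consider the Python sketch below. It is a simplification under stated assumptions: the QueryUsageLog class, its method names and the sample dimensions ("region", "product", "month") are hypothetical, and a real architecture would track far more than simple counts.

```python
from collections import Counter


class QueryUsageLog:
    """Counts which combinations of dimensions users ask for in ad hoc queries."""

    def __init__(self):
        self._counts = Counter()

    def record(self, dimensions):
        # Sort so that ["region", "product"] and ["product", "region"]
        # count as the same discovery pattern.
        self._counts[tuple(sorted(dimensions))] += 1

    def cache_candidates(self, top_n=2):
        """Return the most frequently requested combinations, most common first."""
        return [combo for combo, _ in self._counts.most_common(top_n)]


# Hypothetical usage: three ad hoc queries from end users.
usage = QueryUsageLog()
usage.record(["region", "product"])
usage.record(["product", "region"])
usage.record(["month"])
candidates = usage.cache_candidates(top_n=1)
```

The design choice worth noting is that the cache decision is derived from what users actually ask for, rather than from what a developer guessed they would ask for up front.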


