
Archive for the ‘Data Governance’ Category

Is MDM Dead?

Wednesday, March 3rd, 2010

By Mike Shultz, Infoglide Software CEO

Andrew White of Gartner recently posed a question about whether master data management (MDM) is dead. He didn’t actually suggest that the demise of master data management is imminent. He was challenging whether our current terminology adequately clarifies the current reality about MDM and associated product areas.

Certainly the terms describing many markets and types of products are being associated with MDM. Jackie Roberts of DATAForge pointed out that the definition of MDM now seems to include “data integrity, data quality, entity resolution, matching, data integration, governance, metrics and analysis.”

While entity resolution was mentioned in her list, our obsessive focus on entity resolution (aka identity resolution) leads to the conclusion that, rather than being subsumed, its role is growing. Wayne Eckerson at TDWI seems to agree that identity resolution is a critical component of the recent MDM acquisitions. In his post about the acquisitions by Informatica and IBM of Siperian and Initiate Systems, respectively, he described the two transactions this way:

“You could say that Siperian is mostly MDM, but with identity resolution and other capabilities, whereas Initiate is mostly about identity resolution, but with MDM and other capabilities.”

Identity resolution is becoming an integral part of many product areas. Within MDM itself, creating a single-entity view is best done with an identity resolution engine. Data mining is greatly enhanced by the addition of entity resolution. Dan Power of Hub Solution Designs wrote about how key identity resolution is to data matching. We’ve talked about how social CRM can resolve identities of individuals across multiple disparate data sources using identity resolution, as well as “rationalize multiple variations and errors and anomalies that block finding existing customers within their systems”.

Although identity resolution technology has been years in the making, it has only recently risen into the consciousness of most analysts and customers. Because of its ability to bring enhanced clarity to ambiguous data, advanced identity resolution is now beginning to have a significant impact across many data-centered disciplines.

Identity Resolution Daily Links 2010-02-27

Saturday, February 27th, 2010

[Post from Infoglide] Attacking Subscription Fraud with Identity Resolution

“In March 2006, the Communications Fraud Control Association (CFCA) estimated that annual global fraud losses in the telecom sector were between $54 billion and $60 billion, and the losses continue to be substantial. Many types of fraud have been identified, but by far the most prevalent is subscription fraud.”

ITBusinessEdge: Analyst: SAP Missed Out During Recent MDM Acquisition Spree

“SAP, on the other hand, has had a lot of issues in the past couple of years. They haven’t made a direct MDM acquisition since they acquired A2i years and years ago, which was a PIM vendor and they’ve just been working off of that architecture and been trying to improve it.”

Liliendahl On Data Quality: Data Quality Tools Revealed

“Data matching is the ability to compare records that are not exactly the same but are so similar that we may conclude, that they represent the same real world object.”

BeyeNETWORK: Master Data Management: Moving Forward…

“So now that MDM has been around for a while, and the master data terminology has drifted into our standard vocabulary, it might be worth stepping back and asking a different question: Is MDM the revolutionary approach to organizational data consolidation and enterprise information management or is it devolving into yet another (of many) data management tools?”

Identity Resolution Daily Links 2010-01-29

Friday, January 29th, 2010

[Post from Infoglide] Master Data Movement

“I read with interest yesterday’s article at SeekingAlpha which discusses rumors swirling around the MDM software industry.  According to the article, sources suggest that two deals are very near completion.  The first of those rumored transactions would see Informatica picking up MDM provider Siperian.  On the heels of their acquisitions of Identity Systems and AddressDoctor, the Siperian purchase could not be totally unexpected – but would most certainly create some ripple effect worth watching.”

[Post from Infoglide] Connecting the Dots: We May Be Closer Than We Think

“Paul Rosenzweig, former Deputy Assistant Secretary for Policy at the Department of Homeland Security, recently posted an intriguing piece on Harvard National Security Journal about connecting the dots regarding the Christmas Bomber. He makes a strong case that a decision to stop research on data analytic tools in 2003 has contributed to the problem analysts face today in making sense of the massive and manifold data sources they sift through.”

Forrester Blog: Introducing The MDM Market’s Newest 800lb Gorilla: Informatica Acquires Siperian!

“In the short term, I’m sure Informatica will be more than happy to continue to collect revenue from Oracle while keeping this partnership alive, but don’t expect future negotiated contracted terms to remain very reasonable as Informatica gains traction with its MDM strategy. No matter how often Oracle says how happy they are to maintain a friendly state of co-opetition with strategic partners, I don’t anticipate they will want to run the risk of a competitor pulling the rug out from under its aggressive MDM strategy.”

News8Austin: Community forum poses questions about Fusion Center

“According to department officials, sharing information with neighboring jurisdictions as well as state and federal agencies ensures that crime history and other information is shared outside the city limits. The department said it the center will be one that ‘analyzes information in order to best detect, respond and hopefully prevent criminal and terrorist activity — as well as other public safety hazards.’”

Ramon Chen: Informatica + Siperian Acquisition = Premier MDM Platform

“As expected, Informatica has announced that it has acquired Siperian (disclosure, my former company) for $130M… If predictions are correct, this will be a relative ‘bargain’ when compared with the upcoming IBM and Initiate Systems tie up which is expected to be 4 to 5x Initiate’s $90M annual revenues.”

The Big Story: Evolution

Wednesday, November 11th, 2009

Technology writer Chris Calnan’s story opened with a comment about Infoglide that nicely sums up the evolution of the broader market for identity resolution and entity analytics: “The market may have finally caught up with Infoglide Software Corp.’s technology.”

While identity resolution technology has evolved rapidly over the past decade, its market visibility only emerged fairly recently. It was barely two years ago in mid-2007 when Gartner analyst Mark Beyer dubbed it “entity resolution and analysis” and pointed out that it “was previously an obscure, but gradually developing, technology that has come to the forefront as a result of world events and market forces.” Gartner singled it out as an “On the Rise” technology within operational business intelligence.

That first Gartner “hype cycle” showed entity resolution and analysis entering at the earliest stage. A year later in mid-2008, a broader report on data management depicted it significantly higher on the curve in the opinion of the Gartner analyst team. In both reports, its estimated time to “mainstream adoption” was 2-5 years, the second fastest category.

At the end of 2008, noted consultant and speaker Jill Dyché of Baseline Consulting issued her predictions for 2009. Along with predictions about SaaS, data governance, BI, and MDM, she said that “Identity Resolution will get its due.” Rob Karel of Forrester had written several months before about Informatica’s acquisition of one of the two closest Infoglide competitors (IBM EAS being the other one). Identity Systems was acquired from Nokia for $85 million.

As we progressed further into 2009, the most meaningful indicator of identity resolution’s growing importance surfaced: an escalating identification with the space by other companies. IBM, Infoglide, and Informatica were joined by Initiate Systems, Intelligent Search, and Netrics, each of whom began incorporating messaging around identity and entity resolution.

For our customers and for us, this is all good news. As our evolving space becomes better known and more highly valued, customers will have more alternatives and our own visibility will increase. The future of identity resolution looks bright, and we all win.

[Distributed earlier this week in our quarterly publication, Identity Resolution Quarterly]

Identity Resolution Daily Links 2009-10-19

Monday, October 19th, 2009

By the Infoglide Team

information management: Multi-Entity MDM Enablement

“Most efforts, however, are executed in surroundings inhibited by existing infrastructure (legacy applications, tools, hardware and integration), dispersed organizational structures and suboptimal processes. This reality introduces challenges in architecting and deploying efficient and effective multi-entity MDM solutions.”

BAM INTEL: BAM’s Thinking on the New DHS Standards

“Public Fusion Centers must be seen by citizens and policy-makers to play a direct role in the response to disasters as well as intelligence gathering. They cannot remain in the intelligence-sharing role only and not take some of the spotlight when their good work prevents or lessens the impact of America’s next disaster.”

newsday.com: OPINION: Revolution right in your doctor’s hand

“For doctors and their patients (in other words, all of us), the electronic health record is a far more revolutionary idea than those that brought us the ability to download a song, post a video online or read and send e-mails when you’re on a camping trip. While those other innovations indirectly enhance the quality of life, they are designed for entertainment or business purposes. The EHR directly improves quality of life because the end result of its design is better health.”

SmartData Collective: Data May Require Unique Data Quality Processes

“All data quality projects can appear the same from afar but ultimately can be as different as stars and planets. One of the biggest ways they vary is in the data itself and whether it is chiefly made up of name and address data or some other type of data.”

To Move or Not to Move: That is the Question

Wednesday, September 30th, 2009

By Robert Barker, Infoglide Senior VP & Chief Marketing Officer

A continual theme at IdentityResolutionDaily is maintaining the privacy and confidentiality of data at all times. Two recent posts concerned fusion centers and citizen profiling, but the same issues apply to virtually any application of entity resolution technology. The fact is that, in some cases, anonymous identity resolution is a requirement for more sensitive identity resolution implementations.

The strong emphasis in data management for the last decade or so has been to implement data warehouses, data marts, and master data management. When bundled with associated processes like data extraction, transformation, and cleansing, these methods have been widely accepted as the best approach to solve any data problem. Here at IdentityResolutionDaily, we tend to talk about this over-handling of data as “data deterioration.”

A more basic approach is simply working with data sources undisturbed in their native environments. New principles suggest that you should perform scoring analyses as close to the source as possible. By exploiting existing security layers already in place, the need to add new layers of security is obviated.

Of course, for key sources of operational data, existing IT policies may deny direct access. In other cases, it may be necessary or preferable to move data for other reasons. For example, achieving desired performance parameters may dictate working with an extracted subset of the data rather than the entire data store.

The point I’m making is not to forbid moving data or creating data marts under any circumstances. Rather, I’m suggesting that the most rational approach is the following:

  1. Develop solutions that adapt easily to multiple, disparate, remote data sources.
  2. Default to leaving data where it lives whenever and wherever possible.
  3. Provide the appropriate levels of entity anonymity within the solution and with the least possible intrusion to the enterprise.
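
The three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any product: the `LocalSource` class, the `anonymize` helper, the salt values, the sample rows, and the 0.8 threshold are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for a real similarity engine. Each source scores its own rows where they live and hands back only a one-way token and a score.

```python
import hashlib
from difflib import SequenceMatcher


def anonymize(record_id, salt):
    """Return a one-way token so callers never see the source's real key."""
    digest = hashlib.sha256((salt + record_id).encode("utf-8")).hexdigest()
    return digest[:16]


class LocalSource:
    """Hypothetical stand-in for a data source that is queried in place."""

    def __init__(self, name, rows, salt):
        self.name = name
        self._rows = rows    # {record_id: name string}; never leaves the source
        self._salt = salt

    def score(self, query, threshold=0.8):
        """Score rows locally; return only (source, token, score) tuples."""
        hits = []
        for record_id, value in self._rows.items():
            similarity = SequenceMatcher(None, query.lower(), value.lower()).ratio()
            if similarity >= threshold:
                token = anonymize(record_id, self._salt)
                hits.append((self.name, token, round(similarity, 2)))
        return hits


def federated_search(sources, query):
    """Fan the query out to every source and merge the anonymized hits."""
    hits = [hit for source in sources for hit in source.score(query)]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Invented sample data for two disparate sources.
crm = LocalSource("crm", {"c-17": "Mary Smith", "c-42": "Robert Jones"}, salt="crm-salt")
billing = LocalSource("billing", {"b-9": "Mary Smyth"}, salt="billing-salt")

for hit in federated_search([crm, billing], "Mary Smith"):
    print(hit)
```

Only scores and tokens cross the boundary between a source and the caller, so the security layers already protecting each source continue to do their job. A source that cannot be queried directly could sit behind the same interface over an extracted subset.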

Internal and External Views of Identity

Thursday, August 27th, 2009

By John Talburt, PhD, CDMP, Director, UALR Laboratory for Advanced Research in Entity Resolution and Information Quality (ERIQ)

In an earlier post, I stated my view that identity resolution and entity resolution are somewhat different processes.  In particular, I consider identity resolution as a special form of entity resolution in which entity references are resolved by comparing them to the characteristics of a given set of known entities.  Regardless of the approach, identity plays an important role in all forms of entity resolution.

The identity of an entity is a set of attributes and rules for comparing the attribute values that allow it to be distinguished from all other entities of the same type in a given context.  A key feature is that identity is context-dependent, i.e., it depends upon the total set of entities under consideration.  For example, a common scheme for creating email addresses in an organization uses a person’s first two initials and last name, e.g. jrtalburt.  In a small organization, this is usually sufficient to make a unique address for each employee.  However, applying the same scheme to a much larger pool of users, such as the yahoo.com or gmail.com domains, quickly shows that these attributes are insufficient to distinguish one user from another.
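
The email example can be made concrete with a small sketch. The names below are invented for illustration, and the helper functions are hypothetical; the only thing taken from the scheme above is "first two initials plus last name." The same attributes that are unique in a small pool collide as soon as the pool grows.

```python
from collections import defaultdict


def email_local_part(first, middle, last):
    """Build an address from the first two initials and the last name."""
    return (first[0] + middle[0] + last).lower()


def colliding_addresses(people):
    """Return {address: [people]} for every address claimed more than once."""
    by_address = defaultdict(list)
    for person in people:
        by_address[email_local_part(*person)].append(person)
    return {addr: owners for addr, owners in by_address.items() if len(owners) > 1}


# Invented names: a small organization, then a larger pool containing it.
small_org = [("Jane", "R", "Talburt"), ("Mary", "A", "Smith"), ("Mary", "B", "Jones")]
large_pool = small_org + [("James", "R", "Talburt"), ("Michael", "A", "Smith")]

print(colliding_addresses(small_org))    # no address is shared
print(colliding_addresses(large_pool))   # jrtalburt and masmith are each shared
```

Nothing about the people changed between the two calls; only the context did, which is why the attribute set that defines identity has to be chosen with the whole population in mind.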

For a more relevant business example, consider the case of a customer, Mary Smith.  For simplicity, assume that the totality of her adult residential address history comprises:
  1. Mary Smith, 123 Oak St, Anytown, NY, 1998-06 to 2000-03
  2. Mary Jones, 234 Elm St, Anytown, NY, 2000-04 to 2002-11
  3. Mary Jones, 345 Pine St, Anytown, NY, 2002-12 to present

Despite having used 2 names and 3 addresses, these are all references to the same person. There are two ways to view the issue of identity as illustrated by this history.

One is to start with the identity based on vital statistics, e.g. Mary Smith, a female born on December 3, 1980, in Anytown, NY, to parents Robert and Susan Smith, then to follow that identity through its various representations of name and address as shown above.  This “internal view of identity” is the view of Mary Smith herself and might well be the view of a sibling or other close relative, someone with complete knowledge about her address history.  The internal view of identity represents a closed universe model in which all of the possible occupancy variants are known to the internal viewer (system) and any occupancy record not equivalent to one of the known variants must belong to some other identity.

On the other hand, an external view of identity is one in which some number of address records for a customer’s identity have been linked, but the viewer (system) does not know if it is the complete history.  Given another customer address record not equivalent to one of the records in the history, it must be determined if it does or does not belong to Mary’s history.

Suppose that a system has only the first two address records of Mary’s history.  In this case, the system’s knowledge of Mary’s identity would be incomplete.  It may be incomplete either because the third address record is not in the system (has not been acquired) or because the system hasn’t linked it to the first two records.  In the latter case, the system would assume that the third record is part of a different customer’s identity.  Even though an internal viewer would know that the third address record should also be part of Mary’s complete history, the external viewer has not made that determination.

Conversely, an external viewer may assemble an inaccurate view of Mary’s history by linking the first two records of her address history to an address for a different Mary Smith.  These entity resolution failures, incomplete and inaccurate histories, are information quality dimensions and indicate why the areas of entity resolution and information quality are so closely related. (Several classes of failures were discussed in another recent post.)

In an external view, the identity of the customer is equivalent to the set of occupancy records that have been resolved (i.e. linked).  The known address records comprise the external viewer’s (or system’s) entire knowledge of the customer’s identity.  If additional occupancy records are acquired and are correctly determined to be for this same customer, then the system’s knowledge about this identity increases.

The external view of identity reflects the experience of a business or government agency using entity resolution tools and processes in an effort to link disparate records into a single view of a customer or agency client.  The “external view of identity” represents an open universe model because if the system is presented with a new occupancy record, it does not necessarily follow that the new record must be part of a different identity.  It may or may not be part of an existing identity, something that the ER process must decide.

The major point to note is that an internal viewer is in a position to judge the quality of an external view.  With complete knowledge, the internal viewer can determine if any particular external viewer has omitted some records (completeness) or has linked records from different identities or failed to link records for the same identity (accuracy).
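
A minimal sketch of that judgment, assuming Mary’s three occupancy records are numbered 1 to 3 and that record 4 is a hypothetical record belonging to a different Mary Smith. The `assess` function and its field names are illustrative only; this is not the index mentioned in the next paragraph.

```python
def assess(external_clusters, true_history):
    """Judge an external view (a list of linked record sets) against the
    complete history known to the internal viewer."""
    seen = set().union(*external_clusters)
    # Completeness: records of this identity the external viewer never acquired.
    omitted = true_history - seen
    # Clusters that contain at least one record of this identity.
    touching = [cluster for cluster in external_clusters if cluster & true_history]
    # Accuracy, false links: foreign records linked in with this identity.
    false_links = set().union(*touching) - true_history
    # Accuracy, missed links: the identity is spread over several clusters.
    split = len(touching) > 1
    return {
        "omitted": sorted(omitted),
        "false_links": sorted(false_links),
        "split": split,
    }


mary = {1, 2, 3}   # internal view: the complete, correct history

print(assess([{1, 2}], mary))        # third record never acquired
print(assess([{1, 2}, {3}], mary))   # third record present but not linked
print(assess([{1, 2, 4}], mary))     # linked to a different Mary Smith's record
```

The three calls correspond to the three failures described above: an omitted record, a record that was acquired but left unlinked, and a false link to another person’s record.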

Along with Dr. Wang at MIT, I have introduced a quality metric in the form of an index for assessing the similarity of two identity resolutions.  In cases where one resolution represents an internal view (correct) and the other is an external view, the index provides a metric for entity resolution accuracy. I plan to explain this metric in my next post.

Identity Resolution Daily Links 2009-08-24

Monday, August 24th, 2009

By the Infoglide Team

CRMBuyer: The BI Outlook: A Bright Spot of Growth in a Gloomy Economy

“Investing in business intelligence is important for a company now more than ever, agreed Bill Barberg, president of Insightformation and an expert in Balanced Scorecard methodology. Sound business intelligence helps companies make fact-based decisions as they try to navigate in today’s stormy economy, he told CRM Buyer. ‘Business intelligence can help companies make much better decisions,’ he said.”

OCDQ Blog: Adventures in Data Profiling (Part 3)

“In Part 3, you will continue your adventures by using a combination of field values and field formats to begin your analysis of the following fields: Birth Date, Telephone Number and E-mail Address.”

SearchSOA.com: SOA with MDM prevents messaging confusion

“Increasingly, organizations are designing SOA into the MDM architecture from the beginning, says Dan Power, president and founder of consulting firm Hub Solution Designs Inc. in Hingham, Mass. This creates challenges in meshing the real-time realities with the need to keep the data accurate.”

iHealthBeat: Privacy and Security: Experts Focus on Legal Issues Surrounding EHR Use at AHIMA Summit

“Linda Kloss, AHIMA CEO, said many vendors have not focused on developing legally defensible EHR systems. In addition, health care providers have not created a demand for such functionality.”

Identity Resolution Daily Links 2009-07-31

Friday, July 31st, 2009

[Post from Infoglide] Data Finds Data in Real-Time Entity Resolution

“Jeff Jonas of IBM recently quoted from a chapter called ‘Data Finds Data’ that he co-wrote for a book entitled Beautiful Data: The Stories Behind Elegant Data Solutions, and I was impressed by how well this passage describes the effective use of entity resolution software (e.g., IRE 2.2)…”

IT-Director.com: GRC is not enough

[Philip Howard] “If you think about these different forms of risk, they can mostly be managed within existing GRC frameworks: business risk, data and IT governance and compliance cover five of these seven types of risk. But they don’t cover fraud or cyber attacks or similar security issues.”

SunSentinel.com: Roofer ducked $400,000 in worker’s comp premiums

“Investigators with the state’s Division of Insurance Fraud said Robert McDonald, owner of Gulfstream Roofing Inc., funneled $3 million in payroll through several fake companies between 2002 and 2006, claiming the money was being paid to insured subcontractors instead of his own workers.”

BNET Healthcare: What Can US Learn From European Health IT Experience?

“The three countries also use universal patient identification numbers in health care. This is much easier to do in Europe than it is in the U.S., where the mistrust of government is so high that the issue of having a single patient identifier number is no longer even under discussion. There’s also the small matter of our low EHR adoption rate, which is less than 20 percent for physicians and lower for hospitals. By contrast, most physicians in the three European countries are using some kind of EHR.”

Identity Resolution Daily Links 2009-07-27

Monday, July 27th, 2009

By the Infoglide Team

information management: Multidomain Master Data Management for Business Success

“All data that flows through an enterprise can be categorized into six different types: who, what, when, where, how and why. Master data is about who, what, when and where. ‘Who’ data is about the parties of interest that matter most to a business or organization including stakeholders, benefactors, customers, suppliers, owners, providers, partners, etc.”

HSToday: DHS Highlights Intelligence Improvements in Report Marking 9/11 Report Anniversary

“To date, 72 fusion centers have been designated throughout the country, with DHS having provided more than $340 million from fiscal years 2004-2009 to state and local governments to support these centers. DHS also deployed the Homeland Security Data Network to 29 fusion centers, which allows the federal government to share information and intelligence with states and provides fusion center staff access to the most current terrorism-related information.”

The Healthcare IT Guy: Guest Article: Why Doctors Hate Electronic Medical Records

“The fact is that doctors love high-tech. They have reason to hate EMRs but not computers and iPhones.”

DecisionStats: Interview Jim Harris Data Quality Expert OCDQ Blog

“Jim Harris - ‘I know that Gartner has reported that 25% of critical data within large businesses is somehow inaccurate or incomplete and that 50% of implementations fail due to lack of attention to data quality issues.’”

