Monday, October 10, 2005

Can a Data Warehouse Be a COTS Solution?

Data-driven decision making has become quite the buzzword in education over the past few years, and the data warehouse is usually at the center of it. What exactly is a data warehouse? Typically, a data warehouse is a central repository of data that enables longitudinal reporting across a time dimension. Such a repository would be able to answer a question like: "How are my students performing on the CRCT from year to year?" or "Are my district test scores increasing year to year for each of my demographic subgroups?"
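To make the "year to year for each subgroup" question concrete, here is a minimal sketch of that kind of longitudinal rollup. Everything in it is an assumption made for illustration: the `TestResult` fields, the `percent_meeting_by_year` function, and the sample rows are invented, and a real warehouse would answer this with a query against its own tables rather than an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TestResult:
    """One student's result on one test in one school year (hypothetical shape)."""
    student_id: str
    school_year: int
    subgroup: str
    met_standard: bool


def percent_meeting_by_year(results):
    """Return {subgroup: {school_year: percent of tested students meeting standard}}."""
    # (subgroup, year) -> [students meeting standard, students tested]
    counts = defaultdict(lambda: [0, 0])
    for r in results:
        tally = counts[(r.subgroup, r.school_year)]
        tally[0] += int(r.met_standard)
        tally[1] += 1

    trend = {}
    for (subgroup, year), (met, tested) in sorted(counts.items()):
        trend.setdefault(subgroup, {})[year] = round(100.0 * met / tested, 1)
    return trend


# Invented sample rows, not real district data.
SAMPLE = [
    TestResult("s1", 2004, "Group A", True),
    TestResult("s2", 2004, "Group A", False),
    TestResult("s1", 2005, "Group A", True),
    TestResult("s2", 2005, "Group A", True),
    TestResult("s3", 2004, "Group B", False),
    TestResult("s3", 2005, "Group B", True),
]

if __name__ == "__main__":
    for subgroup, by_year in percent_meeting_by_year(SAMPLE).items():
        print(subgroup, by_year)
```

The point of the sketch is only that the time dimension (here `school_year`) sits alongside the student and subgroup data, so a trend falls out of a simple grouping.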

COTS usually stands for commercial off-the-shelf. I am using it here in the sense of customizable off-the-shelf solutions: software products that can be customized to meet a client's specific needs or unique characteristics. I don't think anyone will dispute that every district has its own names for things, its own measures of success and its own business rules around data. I also don't think anyone will dispute that, for the most part, all districts use the same basic set of data to deliver required reporting to state and federal agencies.

For the longest time, vendors have approached education data warehouse efforts as custom engagements that require a different design every time. But is that really the case? Let's take NCLB reporting as an example. All districts have to report from the same basic set of data to determine Adequate Yearly Progress (AYP). Districts are required to record and report the progress of different subgroups of students as defined by performance on certain high-stakes tests. This information must be reported at the school level, since AYP is a school-level measure.

If this is the case, for a specific test, can we not build a solution that uses a pre-defined set of data and deliver that solution to multiple districts? Some customization is necessary. Different states have different tests, so customization is necessary as you move from state to state, but that customization is minimal. Different states also have different AYP targets: some set a figure for a particular year, others a figure that must be achieved over a range of years, such as three. So there is another customization, but we are still using the same basic data set (student, school and test data) to deliver the required information.

Tests vary greatly, you say, and this is not a trivial effort. Perhaps not trivial, but not complex either. Tests are primarily criterion-referenced or norm-referenced, and the general structures are replicable from one test to another. AYP rules are vastly different and would require intense effort, you say. Not true either, I argue, because every state has a measure, a single data point, for achievement over a single year or multiple years. There may be multiple data points within a range of years, but each is still a data point.
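One way to picture the argument is to treat the state-specific part as a small piece of configuration applied to the same yearly data. The sketch below is a deliberate simplification under assumptions of my own: the `AypRule` fields, the `meets_target` function, the test name and the target percentages are all hypothetical, and actual AYP determinations involve more rules than a single comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AypRule:
    """Hypothetical per-state configuration; not any state's actual rule."""
    test_name: str
    target_percent: float
    years_averaged: int = 1  # 1 = single-year measure, 3 = three-year average


def meets_target(percent_by_year, rule, year):
    """Compare the average of the most recent N years against the state's target.

    percent_by_year maps school year -> percent of students meeting standard.
    Raises KeyError if a year inside the averaging window is missing.
    """
    first_year = year - rule.years_averaged + 1
    window = [percent_by_year[y] for y in range(first_year, year + 1)]
    return sum(window) / len(window) >= rule.target_percent


# Invented figures for one school; the shared data set is the same for both rules.
SCHOOL_PERCENTS = {2003: 50.0, 2004: 60.0, 2005: 70.0}

SINGLE_YEAR_RULE = AypRule(test_name="STATE_TEST_X", target_percent=65.0)
THREE_YEAR_RULE = AypRule(test_name="STATE_TEST_X", target_percent=65.0,
                          years_averaged=3)

if __name__ == "__main__":
    print(meets_target(SCHOOL_PERCENTS, SINGLE_YEAR_RULE, 2005))
    print(meets_target(SCHOOL_PERCENTS, THREE_YEAR_RULE, 2005))
```

Moving from one state to another would, in this simplified picture, mean swapping the rule object rather than redesigning the underlying student, school and test data.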

Granted, there are other customizations that have to occur, but I think that vendors can build and sell an entry-level data warehouse solution that is low cost and delivers the functional reporting requirements that most districts have today with minimal customization. I would argue that there are several vendors who could deliver a near off-the-shelf package today, if districts would buy it. Would they? The fact that there is no COTS solution out there today suggests that this idea may not be a good one. Or is it that vendors make more money delivering customized proprietary solutions that offer different value to different audiences? The real money is made on customization. So why wouldn't you create a loss leader that fuels the appetite for information? Give districts their basic reporting needs and then let them pay you for district-specific reporting needs.

Many districts have reporting needs and wants far beyond the required reporting. Again, most of these reporting needs depend on a set of data that all districts collect in their administrative systems. Vendors in the market have modeled and implemented other areas of reporting as well. All of the major vendors have deployed solutions that report on HR, Finance, Programs and Services, Food and Nutrition, etc. I would again argue that many of these could be transformed into COTS-like solutions with minimal effort.

And why would districts want to host and maintain their own solutions? Let vendors build an on-demand (to borrow IBM's term) system that districts pay a per-student or per-teacher fee to use. Total cost of ownership would be lower for most, if not all, districts, and the level of service and availability would likely be better as well.

I am sure my argument has holes, but I think it also has merit. Poke some holes in it for me and let's figure out how to deliver real value at a low cost to districts across America. Save them some money and redirect time and money to the classroom, where the real impact on student achievement occurs.
