What is Real about Real-Time Data Warehousing?
Author's Note: Real-time data warehousing (RTDW) is real. This article provides an excellent example of an RTDW while highlighting the meaning and the use of real time. I have disguised the identity of the company because of the highly competitive nature of their industry.
Silicon Valley Semiconductor (SVS) is a manufacturer of custom application-specific integrated circuits (ASICs) for large- and moderate-volume firms. The ASICs they produce become the heart of consumer products such as the sensor controls in night-vision goggles, the brains of a late-model car and other devices.1 Each product is unique, though they all share the same manufacturing processes.
SVS specializes in efficient, quick-turnaround production and does the design work for about 60 percent of their clients. They charge based on level of effort for design and an estimated cost for setup, which is pretty much the industry standard. Their difference is in production pricing. SVS bids for deals with a pricing model that allows them to know, to a much greater extent, the real costs of production at various levels of volume.
Historically, supporting this pricing strategy has been a laborious and extended manual process. It worked well for getting profitable new business. However, as they became more successful, they also had more repeat business and more revisions to existing jobs. A time lag of 45 days to get information was no longer acceptable. Treating each customer cost-estimation exercise as a one-off project left them in the dark regarding profitability of continuing business.
SVS set a new goal to calculate customer job profitability more precisely, specifically and frequently. More precisely means getting information for runs, lots and jobs, not just for the duration of the engagement. More specifically means getting more actual costs or, at least, more finely adjusted allocations. As for more frequently, they were initially happy with getting weekly availability as opposed to the former 45-day lag. A consultant then jazzed them up about daily data. Why stop there? This is just cutting the batch process more narrowly without changing the essence of the process. The best solution is real-time data capture providing a continuous business intelligence process.
At the beginning, their intended solution was to implement an integrated application suite from order processing through material procurement to shop floor manufacturing. This new "integrated" system passed transactions up and down the application suite fluidly and in real time. It effectively supported a more comprehensive view of where any job stood at a particular moment. Why did I put quotes around integrated? Because the core of their need, integrated continuous analytics, is not offered by this, or any, operational suite. No holistic mechanism is available to gather revenue and cost elements from order management, procurement, manufacturing, quality control, labor tracking and all the other sources of data that affect profit margin, some of which are outside the package. The suite provides no specific support for their pricing strategy.
Finance built a new costing model that maximized directly attributed costs and kept allocations to a minimum. The planning group built a new profitability model that varied price with more precision during the life of the engagement. The new application suite could not feed either model the data it needed any better than before. The process remained manual, and the workload actually increased in order to adapt to all the system changes and accommodate the more sophisticated business process.
What they needed was a new analytic process from end to end. Their first notion was to create an operational data store (ODS). Luckily, they came to understand that this concept was a bad fit for their needs. An ODS is volatile, transactional, operationally modeled and near real time. They needed something nonvolatile, analytic, dimensionally modeled and truly continuous.
Many companies veer toward an ODS when their time requirements become tighter.2 The mistake is to assume that acquiring data more frequently than daily is somehow incompatible with the essence of analytic data warehousing. You can have nonvolatility, periodic snapshots and time-variant data with real-time feeds.
As SVS began to design their real-time business intelligence environment, an early lesson was that various types of data required different timings and different sourcing methods. For instance:
- Contract pricing has two components. The initial job price is set at the beginning. The mod price is the adjusted price for the next job of a given module. Before the RTDW, it was reset quarterly or when a major revision was negotiated. The RTDW was intended to help them calculate cost and infer price over shorter intervals, such as by month and by week, by starting at the lot or run level. A lot is a batch of raw material required for the job, and lots set inventory cost. There are generally multiple production runs (continuous operation of the production cluster) for any given lot of raw material.
  - Initial job price is acquired daily from the order module for each new job order.
- Material cost is set by the procurement process when a lot is purchased. SVS uses actual cost instead of standard cost or an inventory cost such as last in/first out.
  - Material cost is acquired daily when a new purchase receipt is acknowledged.
- Labor cost is tracked in a time management system for production workers and in the human resources system for everyone else. Production labor is specifically attributed to a machine cluster and, therefore, to a job run. All other labor must be allocated based on the time a job is on a machine cluster as a percentage of the total machine time.
  - Direct production labor utilization is available soon after the end of each shift.
  - Other manufacturing labor allocation is available weekly from human resources.
  - All other labor is available only once per month.
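The machine-time allocation rule for non-production labor can be sketched as follows. This is a minimal illustration, assuming a single pool of indirect labor cost and a table of machine hours per job. The function name, job identifiers and figures are hypothetical and are not taken from SVS.

```python
def allocate_indirect_labor(machine_hours_by_job, indirect_labor_cost):
    """Split a pool of indirect labor cost across jobs.

    Each job receives a share equal to its machine time as a
    fraction of the total machine time in the period.
    """
    total_hours = sum(machine_hours_by_job.values())
    if total_hours == 0:
        # No machine time recorded, so nothing can be attributed.
        return {job: 0.0 for job in machine_hours_by_job}
    return {
        job: indirect_labor_cost * hours / total_hours
        for job, hours in machine_hours_by_job.items()
    }


# Hypothetical figures for illustration only.
machine_hours = {"job_a": 30.0, "job_b": 50.0, "job_c": 20.0}
allocation = allocate_indirect_labor(machine_hours, 10_000.0)
for job, cost in allocation.items():
    print(f"{job}: {cost:.2f}")
```

Because the shares are fractions of the same total, the allocated amounts always add back up to the original pool, which makes the result easy to reconcile against the source system.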
So far, none of the data is inherently real time. Most is not even available daily. In fact, most allocated data has to be calculated during a baseline period of time from summary information.3 The following data elements are truly available on a continuous basis:
- Machine cluster throughput tracks the output productivity in terms of volume, defects, machine time per unit and other variables. This data is available as a continuous stream from each machine control unit.
- Material inventory depletion is the quantity of each specific material withdrawn from inventory for production, rework, scrap or return to vendor. This data is available immediately via the inventory management system as a message stream.4
These elements are the primary components of variable cost and are the only things available in real time. Is the value of collecting this data in real time worth the very real cost of building the new acquisition infrastructure? Does the business really need continuous, real-time analysis?
The production control system is responsible for reporting the throughput of the last run and tracking the running statistics. Near real-time production reporting already exists. It is not practical to calculate the change in profitability or margin-cost elements day by day. The deltas are too small to be meaningful, and most factors don't change daily. So, where is the value?
By continuously collecting data, it is possible to analyze revenue and cost elements during any relevant time frame. Trends can be assembled with any periodicity you choose without delay. Not only is the unacceptably long lag time eliminated, there is no constraint on your ability to produce any conceivable time slice for analysis and do it immediately.
Most of the magic in an RTDW comes from real-time data capture. We record every meaningful event and every significant change to reference data as they occur. This eliminates the most vexing part of the whole business intelligence process: detecting and extracting changes that matter from the operational data sources. We can eliminate the E and L in ETL (extract/transform/load) and concentrate on the T, from which all the value derives.
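To illustrate why continuously captured events place no constraint on the time slice, here is a small sketch that rolls the same event list up by shift, by day and by week. It assumes each event is a simple (timestamp, quantity) pair. The function name, the sample events and the eight-hour shift length are assumptions made for the example, not details from SVS.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def slice_events(events, start, width):
    """Sum event quantities into buckets of `width`, counted from `start`.

    Returns a dict mapping each bucket's start time to its total.
    """
    totals = defaultdict(float)
    for timestamp, quantity in events:
        if timestamp < start:
            continue  # ignore events before the analysis window
        index = (timestamp - start) // width  # whole buckets elapsed
        totals[start + index * width] += quantity
    return dict(sorted(totals.items()))


# Hypothetical throughput events: (time captured, units produced).
events = [
    (datetime(2024, 1, 1, 6, 15), 120.0),
    (datetime(2024, 1, 1, 14, 40), 95.0),
    (datetime(2024, 1, 2, 7, 5), 130.0),
    (datetime(2024, 1, 9, 9, 30), 110.0),
]
start = datetime(2024, 1, 1)

by_shift = slice_events(events, start, timedelta(hours=8))
by_day = slice_events(events, start, timedelta(days=1))
by_week = slice_events(events, start, timedelta(weeks=1))
```

The periodicity is only a parameter of the query. Nothing about the capture has to change to move from shifts to days to weeks, which is the flexibility a batch feed with a fixed cycle cannot offer.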
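The capture side can be pictured with a toy, in-process publish-and-subscribe bus, the kind of mechanism the material inventory feed is described as using. This is only a sketch under stated assumptions: the class names, the topic name and the message fields are hypothetical, and a real messaging backplane would run across processes and machines.

```python
from collections import defaultdict


class MessageBus:
    """Toy publish-and-subscribe bus (hypothetical, in-process only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Every subscriber to the topic receives every message.
        for handler in self._subscribers[topic]:
            handler(message)


class WarehouseCapture:
    """Append-only capture: events are added, never updated or deleted."""

    def __init__(self):
        self._log = []

    def record(self, message):
        # Store a copy so later changes upstream cannot alter history.
        self._log.append(dict(message))

    def events(self):
        return tuple(self._log)


bus = MessageBus()
replenishment_seen = []  # stands in for an existing operational subscriber
warehouse = WarehouseCapture()

bus.subscribe("inventory.depletion", replenishment_seen.append)
bus.subscribe("inventory.depletion", warehouse.record)  # just another subscriber

bus.publish("inventory.depletion",
            {"material": "wafer-200mm", "quantity": 25, "reason": "production"})
bus.publish("inventory.depletion",
            {"material": "wafer-200mm", "quantity": 2, "reason": "scrap"})
```

The warehouse never polls or extracts. It receives each change as it is published, and because it only appends, the captured history stays nonvolatile even though the feed is continuous.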
A real-time data warehouse eliminates the data availability gap. Continuous processing without delay opens up significant new opportunities for the practice of business intelligence.
At SVS, the impact was tangible and measured in dollars straight to the bottom line. For the first time, they can measure the efficiency gain from increases in production volume as they occur. The declining cost curve can be projected with much higher accuracy. This has allowed them to share the cost savings with their customers in the form of incrementally reduced prices while still maintaining a healthy margin. They have become highly price competitive without giving up profitability.
The SVS RTDW had unforeseen benefits as well. It became possible to identify patterns of material consumption that allowed them to fine-tune their replenishment process. This helped them reduce both procurement and inventory costs. They were also able to identify subtle changes in the defect rate across machine clusters that allowed them to schedule preventive maintenance early and, thus, avoid costly outages.
These unforeseen benefits are the silver lining of a creatively designed business intelligence environment. The ability to serve needs not initially anticipated is the hallmark of a successful solution. The continuous availability provided by a real-time data warehouse eliminates the most significant barrier to flexibility and productivity of the traditional approach. Batch is dead. It is time to begin your transition to real time.
References
1. These are not necessarily the products produced by SVS. They represent the range and diversity of their ASIC designs.
2. An interesting related fact is that most ODS implementations are still batch-fed databases. Some get transaction level data only once a day. Many of the rest are some form of replicated database that is generally a clone of the operational source. Very few have real-time data feeds.
3. This business case has many other forms of allocated data such as general and administrative expense as well as capital costs. Many of these do not make sense to calculate more frequently than once a month.
4. These two real-time data elements are interesting because they use two different technology pipes. The machine data is available by tapping into the transaction stream hardwired between the production control units and the quality control system. The material data is received via a new messaging backplane that uses a publish-and-subscribe mechanism. This allows any application to "plug in" if it needs to receive feeds of inventory utilization. Both the warehouse management system and the continuous replenishment subsystem were designed to read this message flow. The RTDW just became another subscriber.