Like any other business domain, retail is data hungry, so much so that data is the new currency. Data, or the lack of it, is also a key differentiator between success and failure. Recall the David-and-Goliath battle of the movie business: Blockbuster versus Netflix. Netflix won in large part on the strength of its data. I’m sure you’ve heard many times before that data is currency, but how do you recognize a good data state versus a bad one? How do you get to a state where data is not an onerous burden but a shining currency that helps the company drive and generate revenue?
Let us first discuss what makes up a good data state. This is easier said than done. Where does one even start on that journey?
Understand the Company Strategy and Key Business Initiatives
First and foremost, you need to understand the company strategy and the key initiatives supporting that strategy. For example, if you are Netflix and your strategy is to move out of the DVD business and become number one in the video streaming business, then you need to understand the key Netflix initiatives that will help the company execute that strategy. Further, you will need timely, consistent and adequate data to help you measure the effectiveness of those initiatives.
Having understood the company strategy and key initiatives, you can get a sense of where company executives want to take the company. Now you need to understand the data and its state. This understanding is the key to identifying where the data falls short of supporting those initiatives and the company’s strategic direction. You will also need to baseline your current data to know where you are today and to envision the state you need to get to.
Once you have internalized the strategy, start identifying the gaps. Gaps can be on all fronts: data state, your teams’ readiness and technologies. In other words, understand the gaps in terms of people, process, data and technologies.
Understand the Data and its State-of-the-Union
You need to ask yourself the following three questions about your data:
Is your data timely? “Timely” implies that the data is available to all data consumers by the time you have committed to them.
Is your data accessible? “Accessible” means that authorized and approved users can get to the data easily and in a self-service mode.
Is your data consistent? “Consistent” data is data that you and other consumers of that data can trust. Is data integrity assured?
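To make the three questions concrete, here is a minimal sketch of how each could be turned into an automated check. Everything in it is an assumption made for illustration: the LoadRecord fields, the role names, and the choice of row-count reconciliation as a stand-in for "consistent". Real integrity checks usually go well beyond matching counts.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LoadRecord:
    """Hypothetical metadata about one data load (illustrative only)."""
    dataset: str
    committed_by: datetime   # when the data was promised to consumers
    delivered_at: datetime   # when the data actually became available
    source_row_count: int    # rows counted in the upstream system
    loaded_row_count: int    # rows that arrived downstream


def is_timely(load: LoadRecord) -> bool:
    # Timely: delivered no later than the committed time.
    return load.delivered_at <= load.committed_by


def is_consistent(load: LoadRecord) -> bool:
    # Simplest possible integrity check: row counts reconcile.
    return load.source_row_count == load.loaded_row_count


def is_accessible(user_roles: set[str], allowed_roles: set[str]) -> bool:
    # Accessible: the user holds at least one role approved for the dataset.
    return bool(user_roles & allowed_roles)
```

Running checks like these on every load gives you a baseline of your current data state that you can track over time.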
Your next investigation should be around the state of master (or reference) data. In the retail sector, master or reference data is for the following entities:
Another important aspect of master data is the hierarchies of these reference data models: how they are created, how they relate to one another, and how they are then joined to transactions.
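As a rough illustration of how a reference data hierarchy gets joined to transactions, the sketch below rolls sales amounts up a product hierarchy. The hierarchy itself (SKU to category to department) and all names in it are hypothetical examples, not taken from any particular retailer or system.

```python
from collections import defaultdict

# Hypothetical child -> parent links in a product hierarchy.
PARENT = {
    "sku-123": "running-shoes",
    "sku-456": "running-shoes",
    "sku-789": "sandals",
    "running-shoes": "footwear",
    "sandals": "footwear",
}


def ancestors(node, parent=PARENT):
    """Return every level above `node`, nearest parent first."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def roll_up(sales, parent=PARENT):
    """Sum (sku, amount) pairs at the SKU level and at every level above it."""
    totals = defaultdict(int)
    for sku, amount in sales:
        totals[sku] += amount
        for level in ancestors(sku, parent):
            totals[level] += amount
    return dict(totals)
```

If a SKU is missing from the hierarchy, its sales never reach the category or department totals, which is one way weak reference data quietly distorts reporting.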
As I mentioned above, the last things you need to focus on are the teams, processes and technologies that store, replicate and move the data from one system to another.
Teams’ readiness points to the competence of people in understanding the data and the technologies supporting it. Do people understand the need for strong and resilient reference data? Do people have a data culture? Data culture simply means that people argue based on data and not on emotions and egos.
Finally, the technologies part should be obvious: if gaps exist due to a lack of technologies, then those technologies need to be brought in. For example, if a gap points to a lack of real-time data, then there must be a solution, in-house or hosted, that provides real-time data. Technologies like Hadoop, Kafka or Sqoop, among others, may be needed for this type of solution if it is hosted in-house.
Every company that has some type of retail and transactional work will need a few key systems. The first such system is one that allows you to recognize sales and orders. All data from sales and orders flows through this system. What is the difference between an order and a sale? An order has an order header and one or more order lines. It also has shipments and shipment lines. Further, it may have invoices and invoice lines, with multiple invoices generated for a given order when there are different ship-to/bill-to addresses. When all of these are collapsed into one single item and revenue is actually accrued to a given book of business, it may be referred to as a “sale”. Simply stated, the final dollars that the company gets from an order are the sale. Hence there should be a system to capture and store the order and sale data. This is typically the ecommerce platform and/or point-of-sale system in a retail company.
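The order-versus-sale distinction can be sketched as a small data model. This is a simplified illustration, not a real platform's schema: the class and field names are hypothetical, shipments are left out for brevity, and treating the sum of invoice amounts as the "sale" is an assumption about how revenue is recognized.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class OrderLine:
    sku: str
    quantity: int
    unit_price: Decimal


@dataclass
class Invoice:
    bill_to: str
    amount: Decimal


@dataclass
class Order:
    """Order header holding its lines and invoices (shipments omitted)."""
    order_id: str
    lines: list[OrderLine] = field(default_factory=list)
    invoices: list[Invoice] = field(default_factory=list)

    def ordered_amount(self) -> Decimal:
        # What the customer asked for: quantity times price, per line.
        return sum((l.quantity * l.unit_price for l in self.lines), Decimal("0"))

    def sale_amount(self) -> Decimal:
        # Assumption: the "sale" is what was actually invoiced.
        return sum((inv.amount for inv in self.invoices), Decimal("0"))
```

The two amounts can differ, for example when a line is cancelled before it ships, which is why the order and the sale are worth storing as separate facts.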
There should be another system that consolidates orders and sales from the myriad upstream transactional systems. That system is typically called the Operational Data Store, or ODS. It is needed because it is the first place where data from all the different channels comes together in a single data store. The ODS can also lend itself to some level of lightweight reporting.
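A minimal sketch of the consolidation step might look like the following. The channel names, the record shape, and the rule that the latest record for an order wins are all assumptions made for illustration; a real ODS would also handle schema differences, late-arriving data and auditing.

```python
from typing import Iterable


def consolidate(channel_feeds: dict[str, Iterable[dict]]) -> list[dict]:
    """Merge per-channel order records into one list.

    Each record is tagged with its channel. When the same order_id appears
    more than once within a channel, the last record seen replaces the
    earlier one (an assumed rule, not a standard).
    """
    consolidated: dict[tuple[str, str], dict] = {}
    for channel, records in channel_feeds.items():
        for record in records:
            key = (channel, record["order_id"])
            consolidated[key] = {**record, "channel": channel}
    return list(consolidated.values())
```

Keying on the channel plus the order ID avoids collisions when two upstream systems happen to issue the same order number.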
Further downstream, there is a need for an aggregating system, typically referred to as the data warehouse or data mart. This is the system that largely helps with heavy-duty reporting, analytics and perhaps a bit of predictive analytics.
Finally, you may need a system to do some amount of predictive analytics, statistical “what-if” scenarios, data science and research. Typically, this type of work is best done through the power of distributed computing. Big Data technologies like Hadoop, Storm and Spark do this fairly well. Teradata, DB2 and Oracle can do it as well; however, the price point changes dramatically.
As the scale becomes bigger, that is to say, as the number of orders per day and the number of visits to the e-commerce site increase dramatically, the solutions may change somewhat. But if you have architected the solutions right and built them on the right technologies, they may not change dramatically. Building on systems that are scalable and extensible allows you to do that. In other words, design for ten years ahead and build for the next year or so. This ensures you don’t lock yourself into a huge upfront cost.