Agile Business Intelligence and Data Lake Architecture
Building a central data warehouse is a long and expensive process. At the end of that process we often find that the initial requirements were not completely met, were incorrectly specified, or had changed dramatically. In fact, BI requirements are as fluid and volatile as the business itself. With our foundation in custom software development, we are always looking for ways to improve the effectiveness and timeliness of BI projects. We have found that data warehouse projects are typically organized around a waterfall approach that suffers from several issues:
- A very long development cycle: medium-to-large data warehouses often take 2-3 years to release usable data marts and systems.
- Data models that do not accommodate changes in the business environment and requirements. Needs are often identified only once the data is in use, leading to expensive changes and rework.
- Highly precise models that require expensive data cleansing, exacting ETL programs, and careful tracking of upstream data sources.
In the software development world, much progress has been made with Agile processes, which allow projects to proceed in short iterations and release usable systems frequently. Agile processes are also more responsive to business change.
This presentation outlines a new architecture based on the concept of data lakes: unstructured data landing areas that support experimentation, analysis, and the creation of usable data much more quickly than standard data warehouse methods.
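The contrast with a warehouse can be sketched in a few lines. This is a minimal, hypothetical illustration (the directory layout and field names are invented, not from the presentation): raw records land in the lake untransformed, and structure is imposed only at read time, so no upfront model or ETL pipeline is needed before analysts can experiment.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical landing area: sources dump raw records as-is,
# with no upfront schema or cleansing step.
lake = Path(tempfile.mkdtemp()) / "landing" / "orders"
lake.mkdir(parents=True)

raw_events = [
    '{"order_id": 1, "amount": 120.0, "region": "EMEA"}',
    '{"order_id": 2, "amount": 80.5}',  # missing field tolerated at ingest
]
for i, event in enumerate(raw_events):
    (lake / f"event_{i}.json").write_text(event)

# "Schema on read": structure is applied only when the data is queried,
# so the model can evolve with the questions being asked.
records = [json.loads(p.read_text()) for p in sorted(lake.glob("*.json"))]
total = sum(r["amount"] for r in records)
regions = {r.get("region", "UNKNOWN") for r in records}
print(total)    # 200.5
print(regions)
```

In a traditional warehouse, the second record would be rejected or reworked before loading; here it lands immediately and the gap is handled when (and if) an analysis needs that field.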
Luke Shannon from Pivotal Labs.
Canadian CIO Survey Results
Jim Love from IT World.