Dynamic Data Models using Hadoop and Spark

Self-Serve BI can be difficult to make work when dealing with large amounts of data. Data models tend to change frequently and become difficult to maintain. Allowing users to download vast amounts of data to their local machines also creates more support issues when resources are inadequate. We have found that it is better to recognize that the business usually knows what data it needs, and then automatically generate a disposable data model for each request. Users can then use whatever tool they want for their analysis. This lets the business bring in only the data it needs and reduces the need to maintain a very large data model.

Ultimately, we have a system that performs all of the CPU- and memory-intensive calculations the business needs in our Hadoop cluster. The system then generates a new data model that users can analyze with almost any Self-Serve BI tool. Once the analysis is complete, the data model can be thrown away or kept for auditing purposes.


Amila Kottege is a software developer currently working for Ontario Teachers’ Pension Plan. He focuses on the Asset Liability Model, a simulation that projects the plan’s liabilities. He has been a software developer for eight years, four of them at Bank of America Merrill Lynch, where he worked on a variety of back-end systems ranging from an automated trade approval system to a margin calculator.


Amila Kottege, Dynamic Data Models

Annual General Meeting

This meeting will also serve as our Annual General Meeting. Elections will be held for positions on the Board for the forthcoming year. Please contact our President or any other Board member if you are interested in assisting IRMAC for the year 2016-2017.


2016 AGM Minutes


May 18, 2016


4:15 pm - 6:00 pm


The Albany Club, 91 King Street East (near King/Yonge subway), Toronto, ON M5C 1G3.
