Big Data Analytics for High-Dimensional and Heterogeneous Datasets

Vandana Janeja, Information Systems
Akshay Grover
Jay Gholap

As the diversity and volume of data grow, there is an increasing need to evaluate big data frameworks and how well they adapt to analytics techniques. In this project, we will evaluate the performance of big data solutions across multiple analytic approaches. Publicly available healthcare data will serve as a test bed for evaluating analytics techniques, particularly ensemble-based learning. Key parameters to be measured include algorithmic outcomes (such as diversity and size of training samples), usability, adaptability and modularity, robustness, and efficiency.