The course offers insight into the problems of handling large volumes of data, covering statistical comprehension, data storage, and data processing. It also covers the theoretical formulation of the algorithms, to support better analysis of the outcomes of the proposed strategies.
The course focuses on problems in large-scale data mining, computational architectures, and setting up Python software environments. It also addresses MapReduce, Hadoop, statistical data analysis, and data storage, including relational vs non-relational databases and MongoDB.
Upon completing the course the student:
– knows different strategies and algorithms for data science;
– is capable of using tools and proper computational frameworks that enable big data processing;
– knows how to develop advanced, specific knowledge and skills in programming languages for data science, e.g. Python, and their associated libraries or modules.