Description: This topic is formulated in accordance with the data management phase of the SmartHelm project. As part of the project we obtain different categories of data: order management data and navigation data (structured), weather data and geo-information data (semi-structured), and EEG sensor data (unstructured). The heterogeneous data is transferred through various protocols such as REST APIs, general file transfer and web API services. The transferred data arrives in various formats (.csv, JSON, .xdf, etc.) and should be extracted from the sources, transformed into a uniform format using data processing tools, and finally loaded into the Data Warehouse (DWH). Based on a literature study, a state-of-the-art ETL approach should be chosen that is best suited to in-house data storage rather than cloud-based ETL tools. All data is stored in an in-house database.
Data integration plays a key role before the data is stored in the main data storage (DWH), because the stored data must be usable for data analysis and data evaluation techniques. This thesis therefore offers scope to research the concepts of data warehousing and ETL methods in depth. In addition, the methods best suited to our data requirements and goals can be implemented in practice.
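The extract, transform and load steps described above can be illustrated with a minimal sketch. It is not the project's implementation: the sample records, field names, the `staging` table and the use of SQLite as a stand-in for the in-house DWH are all assumptions made for illustration, since the actual SmartHelm schemas and tools are not specified here.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample inputs; the real SmartHelm schemas are not given in the text.
ORDERS_CSV = "order_id,timestamp,address\n17,2021-03-01T09:15:00,Main St 1\n"
WEATHER_JSON = '{"observations": [{"time": "2021-03-01T09:00:00", "temp_c": 4.5}]}'


def extract_csv(text):
    """Extract: parse CSV text into a list of dicts (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Extract: parse JSON text and return its list of observations."""
    return json.loads(text)["observations"]


def transform(source, timestamp, record):
    """Transform: map any source record to one uniform row
    (source name, timestamp, original record serialized as JSON)."""
    return (source, timestamp, json.dumps(record, sort_keys=True))


def load(conn, rows):
    """Load: write the uniform rows into a staging table.
    SQLite stands in for the in-house warehouse (assumption)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging (source TEXT, ts TEXT, payload TEXT)"
    )
    conn.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run all three steps for both sample sources; return the number of rows loaded."""
    rows = [transform("orders", r["timestamp"], r) for r in extract_csv(ORDERS_CSV)]
    rows += [transform("weather", r["time"], r) for r in extract_json(WEATHER_JSON)]
    load(conn, rows)
    return len(rows)
```

Calling `run_etl(sqlite3.connect(":memory:"))` loads both sample records into one table. Keeping the original record as a JSON payload is only one possible uniform format; a thesis would compare it against alternatives such as a fully modeled star schema.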
|Home institution||Department of Computing Science|
|Type of work||practical / application-focused|
|Type of thesis||Master's degree|
AIM: To identify and implement suitable ETL approaches and data warehousing techniques for storing structured, semi-structured and unstructured data in an in-house Data Warehouse, with the following aspects.
The main goal of the thesis is to build a scalable data storage system that can be applied to the data collected in the project from various sources.
Language: The thesis can be written in either German or English. No restrictions.