Basic data
Title | Creating a dataset of natural explanatory conversations (about cooking*) [Master] |
Description | The aim is to create an annotated dataset of human-to-human dialogue in YouTube cooking videos* that can serve as a resource for training ML models to generate conversational explanations of the cooking process. This involves identifying videos with multiple speakers, speaker diarization (partitioning the audio and/or transcript according to speaker identity), identifying conversational interaction between the speakers, and investigating whether these interactions qualify as ‘conversational explanations’ of the video content. Contact: Ray Kodali. Relevant literature: Speaker diarization: https://arxiv.org/pdf/2101.09624.pdf *We focus on the process of cooking because there is related ongoing work at DFKI, but other instructional scenarios are possible. |
Home institution | Department für Informatik |
Type of work | practical / application-oriented |
Thesis type | Master |
Author | Ilira Troshani |
Status | available |
Task description | |
Prerequisites |
Programming in Python, ideally experience with processing video and audio data |
Created | 14.04.2022 |
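
The steps named in the description (finding videos with multiple speakers, then finding conversational interaction between them) could start from diarized transcript segments. The following is only a minimal sketch under stated assumptions, not a method prescribed by this listing: it assumes diarization has already produced speaker-labelled, time-stamped segments, and the names `Segment`, `talk_time`, `is_multi_speaker`, `find_exchanges` and the thresholds `min_share` and `max_gap` are hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # diarization label, e.g. "A" (assumed input format)
    start: float  # segment start in seconds
    end: float    # segment end in seconds
    text: str     # transcript text for this segment


def talk_time(segments):
    """Return total speaking time in seconds per speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg.speaker] += seg.end - seg.start
    return dict(totals)


def is_multi_speaker(segments, min_share=0.1):
    """True if at least two speakers each hold >= min_share of the talk time.

    min_share is an illustrative threshold, not a value from the listing.
    """
    totals = talk_time(segments)
    overall = sum(totals.values())
    if overall == 0:
        return False
    active = [s for s, t in totals.items() if t / overall >= min_share]
    return len(active) >= 2


def find_exchanges(segments, max_gap=2.0):
    """Group time-ordered segments into candidate conversational exchanges.

    A group continues while each segment starts within max_gap seconds of
    the previous segment's end. Only groups with at least two distinct
    speakers are kept.
    """
    exchanges, current = [], []
    for seg in sorted(segments, key=lambda s: s.start):
        if current and seg.start - current[-1].end > max_gap:
            if len({s.speaker for s in current}) >= 2:
                exchanges.append(current)
            current = []
        current.append(seg)
    if len({s.speaker for s in current}) >= 2:
        exchanges.append(current)
    return exchanges


# Invented example data, for illustration only.
segments = [
    Segment("A", 0.0, 4.0, "First we chop the onions."),
    Segment("B", 4.5, 7.0, "Why so finely?"),
    Segment("A", 7.2, 10.0, "They cook more evenly that way."),
    Segment("A", 30.0, 35.0, "Now the pan goes on the heat."),
]

if is_multi_speaker(segments):
    for exchange in find_exchanges(segments):
        print([f"{s.speaker}: {s.text}" for s in exchange])
```

Whether a detected exchange actually counts as a conversational explanation would still have to be decided by annotation; the sketch only narrows down the candidates.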