Thesis - Interactive Labeling of Brands / Python

IT & Technology


Bachelor’s or Master’s thesis with the goal of designing and developing an interactive labeling system for identifying brands in advertisements from scanned newspaper archives.

WHO CAN APPLY? Only students enrolled at KIT (Karlsruher Institut für Technologie) in one of the degree programs Wirtschaftsinformatik, Wirtschaftsingenieurwesen, Informationswirtschaft, or Technische Volkswirtschaftslehre.


As the digitization of the world's libraries and print archives continues steadily, the demand for automated processing of such documents grows. Researchers and practitioners would like to process these documents digitally with tools from computer vision (CV) and optical character recognition (OCR). Furthermore, they would like to search and filter by document metadata. However, all of this presumes that the extracted features and metadata are available. As state-of-the-art machine learning (ML) classifiers still do not reach the desired accuracy levels, especially on old documents or those from fringe contexts, manual labeling effort is required.


For the scope of this thesis, we limit the context to identifying the brands in advertisements from scanned pages of newspapers and magazines. This is an interesting use case for advertising researchers, among others. Associated colleagues at the University of Mannheim (UniMA) have already roughly extracted the brands of advertisements in the British magazine "The Economist", ranging from the 1840s to today. They used OCR to arrive at a simple representation of the advertising brand. We expect the thesis student to develop an interactive labeling system that supports extending this brand identification and yields a cleaner representation. Interactive labeling strives to combine automatic steps (e.g. a trained model) with incremental user input. The work packages comprise:

  • analyzing the state of the art of such instance identification tools (potentially by conducting a structured literature review)
  • consulting with the researchers at UniMA regarding their needs and requirements
  • developing an interactive labeling system as part of a design science research process
  • writing a thesis document according to research group requirements and participating in our thesis colloquium
Design science research is a well-established methodology in the information systems field that takes a scientific view of artifacts, such as the labeling system to be developed during this thesis. So-called design knowledge can be derived from the development process and the finished artifact.


We expect the student to be familiar with web development. The system should be developed with a modern web application frontend framework or be forked from an existing open-source labeling system. Furthermore, we expect the backend to be based on standard Python frameworks. Experience with these is required as well.


If you are interested in this topic, please apply for this thesis via Campusjäger.

Work Hours:

30 - 40 hours per week

About the company:
The research group “Information Systems & Service Design” (ISSD), headed by Prof. Mädche, focuses its research, education, and innovation on designing interactive intelligent systems. The research group belongs to the Institute of Information Systems and Marketing (IISM) and is embedded into the Information Systems & Engineering group. ISSD is also part of the Karlsruhe Service Research Institute (KSRI). The research group is positioned at the intersection of Information Systems (German: Wirtschaftsinformatik) and Human-Computer Interaction (HCI). Our mission is to create impactful scientific knowledge for designing interactive intelligent systems that enable humans to perform activities more efficiently, effectively, and meaningfully. We believe that delivering cutting-edge knowledge and inspiring education must go hand in hand with an ongoing dialog with the public to maximize the impact of our work in organizations and society. The group is organized in three research departments: Digital Experience & Participation, Intelligent Enterprise Systems, and Digital Service Design & Innovation. Current topics of research are Human-AI Interaction, Cognitive Interaction Technologies, Physiological Computing Systems, Interactive Business Intelligence & Analytics Systems, and Interactive Systems Engineering.