64-861-P1 Project Computer Vision (Part 1)

Course offering details

Instructors: Prof. Dr. Simone Frintrop; Ehsan Yaghoubi

Event type: Project

Displayed in timetable as: MProj - CV

Hours per week: 4

Language of instruction: English

Min. | Max. participants: - | 15

Comments/contents:
Objective:


  • The objective of this course is to give students practical preparation for both industry jobs and academic research positions in deep learning and computer vision.
  • Further, students will be prepared for a master's thesis: they learn how to organize a larger project, from literature research through implementation and evaluation to writing up the results (the report will be much shorter than a thesis and written in the style of a research paper).
  • Students will gain experience managing and working in a project team, with the work documented in a Git project shared by the group.


Meetings:

  • We will have weekly meetings to discuss practical and state-of-the-art deep learning systems.
  • The meetings also include short presentations by students, brainstorming, discussion of currently influential papers, and feedback on your progress.


Topics:
Students are encouraged to select a topic related to one of the following areas. Other topics are possible after a discussion of your 1-page project proposal.

  • Attention mechanism in Transformers and CNNs (e.g., cross-attention versus self-attention; a combination of spatial, channel and temporal attention mechanisms; etc.)
  • Fusion models (e.g., fusing Transformers and CNNs to create efficient models, feature fusion, etc.) 
  • Cross-modal and multi-modal data learning (e.g., zero-shot learning from a combination of text-audio-visual data, learning from unlabeled cross-modal large datasets, etc.)
  • Audio-visual data processing (e.g., audiovisual localization, audiovisual separation, audiovisual representation/classification, etc.)
  • Behaviour and personality analysis (e.g., personality estimation, sentiment analysis, the relationship between behavioural and visual cues, etc.)
  • Biometric recognition (e.g., person re-identification, face recognition, human attribute recognition, etc.)


Computational resources:
Students will be provided with remote access to computing resources available at the department for the completion of their tasks. 

Requirements:
Students who lack knowledge of the following subjects but would still like to take this course are encouraged to watch some online lectures before coming to class. Suggested links are provided under "Literature".

  • Python programming
  • PyTorch library
  • Basics of CNNs and Transformers 


Assessment:

  • Students have to submit a scientific report along with the project code on a Git web page.
  • A final presentation of about 15 minutes is required.


Additional notes:

  • The course runs for two semesters: WiSe 22/23 and SoSe 23.
  • The meetings will be held in person unless otherwise specified. Hybrid meetings are possible in special cases.
  • The course will be in English.

Didactic concept:
The project language will be English. The course relies on the latest research results in computer vision and includes both reading research papers and implementing the corresponding algorithmic solutions. Programming will be done primarily in Python. Additionally, libraries such as PyTorch, NumPy, and OpenCV can be used.

Students will prepare project proposals, which will be discussed and revised with feedback from the instructors to determine the final project topic. Students will organize and carry out the project independently, with guidance and feedback from the instructors. Weekly meetings will be used to track progress.

A good command of the Python programming language, or the willingness to acquire it through self-study, is required. The lectures Computer Vision 1 and 2 provide sufficient background knowledge in computer vision, and it is recommended to attend both before the project. It is possible to take the lectures in parallel with this course, but this requires significant additional self-study. Prior knowledge of the aforementioned tools and frameworks, and of machine learning in general, is helpful.

Literature:
Literature mainly depends on the project topic and consists of recently published research papers.

Basic knowledge of Python programming, PyTorch, CNNs, and Transformers is required:

+ Must know:



+ Recommended, but not necessary:
The following playlists are very extensive and can be watched selectively.



 

Appointments
No.  Date               From   To     Room   Instructors
1    Th, 20. Oct. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
2    Th, 27. Oct. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
3    Th, 3. Nov. 2022   14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
4    Th, 10. Nov. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
5    Th, 17. Nov. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
6    Th, 24. Nov. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
7    Th, 1. Dec. 2022   14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
8    Th, 8. Dec. 2022   14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
9    Th, 15. Dec. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
10   Th, 22. Dec. 2022  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
11   Th, 12. Jan. 2023  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
12   Th, 19. Jan. 2023  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
13   Th, 26. Jan. 2023  14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
14   Th, 2. Feb. 2023   14:00  17:00  R-031  Prof. Dr. Simone Frintrop; Ehsan Yaghoubi
Instructors
Prof. Dr. Simone Frintrop
Dr. Ehsan Yaghoubi