Seminar on Machine Learning and Data Mining

Stream Mining and Concept Drift

The seminar is available in TUCaN under module number 20-00-0102.

When and Where?

The kick-off meeting is on Tuesday, April 17, at 17:10h in A213. Please note that the room and the kick-off date differ from those listed in TUCaN.

The dates of subsequent meetings are given below. Unless mentioned otherwise, they will be on Tuesdays at 17:10h in A213.

Content

In the course of this seminar we will try to get an overview of the current state of research in a particular domain. This year's topic is Stream Mining and Concept Drift, i.e., methods that learn from an incoming stream of data, with a particular focus on the problem that the concept to be learned may change over time. We will cover both important traditional work and recent papers published in workshops, journals, and conferences.

 

Organization

The language used in the seminar will be English.

The topics for the talks will be assigned in the kick-off meeting. Do not miss the kick-off meeting if you want to participate in the seminar.

Prior knowledge is not required, but a background in data mining and machine learning will be helpful. Participation is limited to 20 students. If more students are interested, those with prior knowledge in data mining and knowledge discovery will be preferred. The selection will be made at the kick-off meeting. If there are more qualified people than topics, we will select at random.

The students are expected to give a 20-minute talk on the material they are assigned, followed by feedback, questions, and discussion. Although each topic is typically associated with a single paper, the point of the talk is not to reproduce the entire contents of the paper exactly, but to communicate the key ideas of the methods that are introduced in the paper. Thus, the content of the talk should exceed the scope of the paper and demonstrate a thorough understanding of the material. See also our general advice on giving talks.

For further questions, feel free to send an email to ml-sem@ke.tu-darmstadt.de. No prior registration is needed. However, please still send us an email so that we can estimate the number of participants beforehand and have your email address for possible announcements. Also make sure that you are registered in TUCaN.

Talks

The talks are expected to be accompanied by slides. The students have to send their slides to ml-sem@ke.tu-darmstadt.de one week before the talk. We will use this opportunity to provide early feedback on common problems such as too many slides, too much text on the slides, small font sizes, etc. The talk and the slides should be in English.

There will be two talks in each meeting. As mentioned above, each topic is associated with one paper. The talk should not reproduce the content of the paper exactly, but should instead communicate the key ideas of the introduced method.

All papers should be freely available on the internet or in the ULB. Note that some sources, such as SpringerLink, often only work from the campus network (sometimes not even via VPN). If you cannot find a paper, contact us.

Grading

The slides, the presentation, and the question-and-answer session of the talk will influence the overall grade. Furthermore, students are expected to participate actively in the discussions, and this will also be part of the final grade.

We may also require a short written report.

To achieve a grade in the 1.x range, the talk needs to go beyond merely reciting the content of the given material and include your own ideas, your own experience, or even demos. An exact recitation of the papers will lead to a grade in the 2.x range. A weak presentation and a lack of engagement in the discussions may lead to a grade in the 3.x range, or worse. Please also read our guidelines for giving a talk very carefully.

In addition to the grading, we will give public feedback immediately after each talk, and we are considering a best presentation award at the end of the seminar.

Topics

Here is a list of topics. Each topic consists of two seminar talks (indicated by the bullet list). For each seminar talk, we give 1-3 papers as a starting point. However, note that you are not supposed to reproduce the papers in full detail. For most talks, you should explain the method that is introduced in the paper(s), and show where and how it can be used. Often you will find much better examples or use cases in later publications on these methods. See also our guidelines for giving a talk.

Overview (8. 5. 2018)

Concept Drift (8. 5. 2018)

Decision Trees (15. 5. 2018)

  • Maximilian W.
    Domingos, P., & Hulten, G. (2000). Mining high-speed data streams. Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 71–80.
  • Christoph S.
    Gama, J., Fernandes, R., & Rocha, R. (2006). Decision trees for mining data streams. Intelligent Data Analysis, 10, 23–45.

Rule Learning (22. 5. 2018)

Ensemble Methods - Bagging and Boosting (29. 5. 2018)

  • Stefan W.
    Oza, N. C. (2005). Online bagging and boosting. 2005 IEEE International Conference on Systems, Man and Cybernetics, 3, 105–112.
  • Daniel W.
    Scholz, M., & Klinkenberg, R. (2007). Boosting classifiers for drifting concepts. Intelligent Data Analysis, 11(1), 3–28.

Ensemble Methods - Other (5. 6. 2018)

Clustering (12. 6. 2018)

Statistical Learning (19. 6. 2018)

Active Learning and Forgetting (26. 6. 2018)

Context Tracking (3. 7. 2018)

Nonstationary Environments and Feature Drift (10. 7. 2018)

Contact

Knowledge Engineering Group

Department of Computer Science
TU Darmstadt

S2|02 D203
Hochschulstrasse 10

D-64289 Darmstadt

Secretary's office:
Phone: +49 6151 16-21811
Fax: +49 6151 16-21812
Email: info@ke.tu-darmstadt.de

 