Resources

This site contains datasets, applications, tools and other resources publicly provided by the KE Group.

The following list gives a short description of the available resources:

Datasets

  • EUR-Lex text collection
    The EUR-Lex text collection provides a large multilabel classification benchmark with up to 4000 different classes.
  • Datasets for Graded Multilabel Classification
    The well-known BeLaE Dataset, plus two new datasets from medical text classification and movie ratings.
  • Incident-Related Twitter Datasets
    These datasets comprise labeled tweets from 10 major cities in the English-speaking world. The tweets were selected and labeled for the domain of incident detection.
  • Medical Concept Embeddings
    Concept vector representations learned from a large labeled background corpus. These were used for computing the semantic similarity between terms from the medical domain.

Ontologies

  • UI² Ontology
    The UI² Ontology is a formal ontology for describing user interfaces, their components, and the possible interactions with them.

Software

  • Computer Poker Bots
    A small repository of Computer Poker Bots.
  • Attachment Checker
    A Thunderbird plugin that learns to warn you when you forget to attach a file to your message.
  • Classification GUI
    A graphical user interface for intuitively assigning concepts from an ontology to a set of documents, so that a (multilabel) classification dataset can be developed quickly and easily.
  • Peewit
    A light-weight meta-framework for machine learning experiments.
  • FeGeLOD
    A tool for generating machine-learning features from Linked Open Data.
  • Explain-a-LOD
    A tool for generating possible explanations for statistics based on Linked Open Data.
  • SeCo
    A framework for Separate-and-Conquer Rule Learning.
  • Perceptrovement
    A highly modular framework built around the efficient Perceptron algorithm, with a large collection of effective extensions.
  • MoB4LOD
    A framework for creating customized browser applications for Linked Open Data.
  • JFreeWebSearch
    A free Java library for performing web searches (i.e., no registration or API key required).
  • Ontology Matching Tools
    The KE group has developed a variety of ontology matching tools.
  • Graded Multilabel Classification, Code and Data
    The code and data used for our paper on pairwise graded multilabel classification. In this setting, a label is not simply present or absent but can take one of several grades, e.g., stars.
  • Poogle
    A browser extension/add-on for personalized privacy-protected web search.
  • AiTextML
    Learns continuous vector representations jointly for words, documents, and labels from corpora of labelled documents, also making use of label descriptions. This also enables zero-shot learning, i.e., predicting labels for which no documents were observed during training.

Computing

  • Students Pool
    Students who are active in our group can use our infrastructure, including our pool of six Linux-based computers in room D205.
  • Computing Cluster
    Get to know our research computing cluster.