Projects

Collaborative Research Center SFB 876: Providing Information by Resource-Constrained Data Analysis

The group was part of the SFB 876 in subproject A2. For further details, please see https://sfb876.tu-dortmund.de/.

The following team members were involved in the SFB 876:

Prof. Dr. Erich Schubert
Room: OH14 R334
Phone: +49 231/755-7876
erich schubert () tu-dortmund de

Professor of data mining, working on cluster analysis, outlier detection, and change detection.

Andreas Lang
Room: OH14 R336
Phone: +49 231/755-6093
andreas lang () tu-dortmund de

PhD student working on tree-based acceleration of cluster analysis.

Lars Lenssen
Room: OH14 R335
Phone: +49 231/755-8254
lars lenssen () tu-dortmund de

PhD student working on clustering objectives.

Erik Thordsen
Room: OH14 R335
Phone: +49 231/755-8255
erik thordsen () tu-dortmund de

PhD student working on intrinsic dimensionality of data.

Project publications

Publications from our subproject:

Lars Lenssen and Erich Schubert.
Sparse Partitioning Around Medoids
In: Machine Learning under Resource Constraints -- Fundamentals 1, 182-196, 2023.
[DOI: 10.1515/9783110785944-005] | [Preprint (arXiv)] | [BibTeX]
Erich Schubert and Andreas Lang.
Data Aggregation for Hierarchical Clustering
In: Machine Learning under Resource Constraints -- Fundamentals 1, 215-226, 2023.
[DOI: 10.1515/9783110785944-005] | [Preprint (arXiv)] | [BibTeX]
Franka Bause, Erich Schubert and Nils M. Kriege.
EmbAssi: embedding assignment costs for similarity search in large graph databases
In: Data Min. Knowl. Discov. 36(5), 1728-1755, 2022.
[DOI: 10.1007/s10618-022-00850-3] | [BibTeX]
Erik Thordsen and Erich Schubert.
ABID: Angle Based Intrinsic Dimensionality - Theory and analysis
In: Inf. Syst. 108, 101989, 2022.
[DOI: 10.1016/j.is.2022.101989] | [BibTeX]
Andreas Lang and Erich Schubert.
BETULA: Fast clustering of large data with improved BIRCH CF-Trees
In: Inf. Syst. 108, 101918, 2022.
[DOI: 10.1016/j.is.2021.101918] | [BibTeX]
Erich Schubert and Lars Lenssen.
Fast k-medoids Clustering in Rust and Python
In: J. Open Source Softw. 7(75), 4183, 2022.
[DOI: 10.21105/joss.04183] | [BibTeX]
Erich Schubert and Lars Lenssen.
Fast k-medoids Clustering in Rust and Python
Open-Source software, Zenodo, 2022.
[DOI: 10.5281/zenodo.6802320] | [BibTeX]
Lars Lenssen and Erich Schubert.
Clustering by Direct Optimization of the Medoid Silhouette
In: Similarity Search and Applications - 15th International Conference, SISAP 2022, Bologna, Italy, October 5-7, 2022, Proceedings, 190-204, 2022. Best student paper award.
[DOI: 10.1007/978-3-031-17849-8_15] | [Preprint (arXiv)] | [BibTeX]
Erich Schubert and Peter J. Rousseeuw.
Fast and eager k-medoids clustering: O(k) runtime improvement of the PAM, CLARA, and CLARANS algorithms
In: Inf. Syst. 101, 101804, 2021.
[DOI: 10.1016/j.is.2021.101804] | [BibTeX]
Erich Schubert, Andreas Lang and Gloria Feher.
Accelerating Spherical k-Means
In: Similarity Search and Applications - 14th International Conference, SISAP 2021, Dortmund, Germany, September 29 - October 1, 2021, Proceedings, 217-231, 2021.
[DOI: 10.1007/978-3-030-89657-7_17] | [Preprint (arXiv)] | [BibTeX]
Erich Schubert.
A Triangle Inequality for Cosine Similarity
In: Similarity Search and Applications - 14th International Conference, SISAP 2021, Dortmund, Germany, September 29 - October 1, 2021, Proceedings, 32-44, 2021.
[DOI: 10.1007/978-3-030-89657-7_3] | [Preprint (arXiv)] | [BibTeX]
Franka Bause, David B. Blumenthal, Erich Schubert and Nils M. Kriege.
Metric Indexing for Graph Similarity Search
In: Similarity Search and Applications - 14th International Conference, SISAP 2021, Dortmund, Germany, September 29 - October 1, 2021, Proceedings, 323-336, 2021.
[DOI: 10.1007/978-3-030-89657-7_24] | [Preprint (arXiv)] | [BibTeX]
Andreas Lang and Erich Schubert.
BETULA: Numerically Stable CF-Trees for BIRCH Clustering
In: Similarity Search and Applications - 13th International Conference, SISAP 2020, Copenhagen, Denmark, September 30 - October 2, 2020, Proceedings, 281-296, 2020.
[DOI: 10.1007/978-3-030-60936-8_22] | [Preprint (arXiv)] | [BibTeX]