Showing 1–6 of 6 results for author: Raymaekers, J

  1. arXiv:2302.03931  [pdf, other]

    stat.ML cs.LG stat.ME

    Fast Linear Model Trees by PILOT

    Authors: Jakob Raymaekers, Peter J. Rousseeuw, Tim Verdonck, Ruicong Yao

    Abstract: Linear model trees are regression trees that incorporate linear models in the leaf nodes. This preserves the intuitive interpretation of decision trees and at the same time enables them to better capture linear relationships, which is hard for standard decision trees. But most existing methods for fitting linear model trees are time consuming and therefore not scalable to large data sets. In addit…

    Submitted 8 February, 2023; originally announced February 2023.

    Journal ref: Machine Learning, 2024

  2. Silhouettes and quasi residual plots for neural nets and tree-based classifiers

    Authors: Jakob Raymaekers, Peter J. Rousseeuw

    Abstract: Classification by neural nets and by tree-based methods are powerful tools of machine learning. There exist interesting visualizations of the inner workings of these and other classifiers. Here we pursue a different goal, which is to visualize the cases being classified, either in training data or in test data. An important aspect is whether a case has been classified to its given class (label) or…

    Submitted 26 February, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

    Journal ref: Journal of Computational and Graphical Statistics 2022, Volume 31, 1332-1343

  3. arXiv:2101.01494  [pdf, other]

    stat.ML cs.LG

    Weight-of-evidence 2.0 with shrinkage and spline-binning

    Authors: Jakob Raymaekers, Wouter Verbeke, Tim Verdonck

    Abstract: In many practical applications, such as fraud detection, credit risk modeling or medical decision making, classification models for assigning instances to a predefined set of classes are required to be both precise as well as interpretable. Linear modeling methods such as logistic regression are often adopted, since they offer an acceptable balance between precision and interpretability. Linear me…

    Submitted 24 September, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

    Comments: New version: duplicate paragraph omitted

  4. arXiv:2010.00950  [pdf, other]

    stat.ML cs.LG stat.ME

    Regularized K-means through hard-thresholding

    Authors: Jakob Raymaekers, Ruben H. Zamar

    Abstract: We study a framework of regularized $K$-means methods based on direct penalization of the size of the cluster centers. Different penalization strategies are considered and compared through simulation and theoretical analysis. Based on the results, we propose HT $K$-means, which uses an $\ell_0$ penalty to induce sparsity in the variables. Different techniques for selecting the tuning parameter are…

    Submitted 2 October, 2020; originally announced October 2020.

  5. arXiv:2007.14495  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Class maps for visualizing classification results

    Authors: Jakob Raymaekers, Peter J. Rousseeuw, Mia Hubert

    Abstract: Classification is a major tool of statistics and machine learning. A classification method first processes a training set of objects with given classes (labels), with the goal of afterward assigning new objects to one of these classes. When running the resulting prediction method on the training data or on test data, it can happen that an object is predicted to lie in a class that differs from its…

    Submitted 19 May, 2021; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: Appeared online, Technometrics

    Journal ref: Technometrics 2022, Vol. 64, pages 151-165

  6. Transforming variables to central normality

    Authors: Jakob Raymaekers, Peter J. Rousseeuw

    Abstract: Many real data sets contain numerical features (variables) whose distribution is far from normal (Gaussian). Instead, their distribution is often skewed. In order to handle such data it is customary to preprocess the variables to make them more normal. The Box-Cox and Yeo-Johnson transformations are well-known tools for this. However, the standard maximum likelihood estimator of their transformati…

    Submitted 21 November, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Journal ref: Machine Learning, 2021