We announce a lecture by Daniel Schmidt, Associate Professor of Computer Science at the Department of Data Science and AI, Monash University, Australia (https://research.monash.edu/en/persons/daniel-schmidt).
The talk will take place on Friday, 29 March 2024, at 13:00 in the Seminar Room of the Mathematics Division, SEMFE (ΣΕΜΦΕ).
Title: Prevalidated ridge regression as a highly-efficient drop-in replacement for logistic regression for high-dimensional data
Abstract: Linear models are widely used in classification and are particularly effective for high-dimensional data, where linear decision boundaries (separating hyperplanes) often separate classes well, even for complex data. A recent example of a technique that makes effective use of linear classifiers is the ROCKET family of classifiers for time series classification. One reason the ROCKET family is so fast is its use of a linear classifier based on standard squared-error ridge regression. Fitting a linear model based on squared error is significantly faster and more stable than fitting a standard regularised multinomial logistic regression based on logarithmic loss (i.e., regularised maximum likelihood), as in the latter case the solutions can only be found via a numerical search.
While fast, squared-error ridge regression has one drawback: it is unable to produce probabilistic predictions. I will demonstrate some very recent work on how to use regular ridge regression to train L2-regularised multinomial logistic regression models for very large numbers of features, including choosing a suitable degree of regularisation, with a time complexity that is no greater than that of a single ordinary least-squares fit. This is in contrast to logistic regression, which requires a full refit for every value of the regularisation parameter considered and for every fold used in cross-validation. Our new approach allows models based on linear classifier technology to provide well-calibrated probabilistic predictions with minimal additional computational overhead. If time permits, I will also discuss some thoughts on when such linear classifiers would be expected to perform well.
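
For readers unfamiliar with why ridge regression is cheap to tune, the following is a minimal sketch of the standard textbook shortcut, not the speaker's prevalidation method: a single SVD of the feature matrix gives the ridge solution and the exact leave-one-out errors for every candidate regularisation value, with no refitting. The function name `ridge_loo_path`, the absence of an intercept, and the synthetic data are assumptions made purely for illustration; the step that turns such fits into calibrated probabilities is the subject of the talk and is not shown here.

```python
import numpy as np

def ridge_loo_path(X, Y, lambdas):
    """Ridge fits and exact leave-one-out (LOO) errors for many lambdas.

    Hypothetical helper for illustration. Assumes no intercept term and
    strictly positive lambdas. X is (n, p); Y is (n, k), e.g. one-hot labels.
    """
    # One thin SVD, reused for every lambda: X = U diag(s) Vt
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    UtY = U.T @ Y
    results = []
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)              # per-direction shrinkage
        Y_hat = U @ (shrink[:, None] * UtY)       # fitted values H @ Y
        # Diagonal of the hat matrix H = U diag(shrink) U^T
        h = np.einsum("ij,j,ij->i", U, shrink, U)
        # Exact LOO residuals for ridge: e_i / (1 - h_ii)
        loo_resid = (Y - Y_hat) / (1.0 - h)[:, None]
        loo_mse = float(np.mean(loo_resid**2))
        # Ridge coefficients: V diag(s / (s^2 + lam)) U^T Y
        coef = Vt.T @ ((s / (s**2 + lam))[:, None] * UtY)
        results.append((lam, loo_mse, coef))
    return results

# Usage on synthetic data: 3 classes encoded as one-hot regression targets.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 200))               # more features than samples
labels = rng.integers(0, 3, size=60)
Y_demo = np.eye(3)[labels]
path = ridge_loo_path(X_demo, Y_demo, [0.1, 1.0, 10.0, 100.0])
best_lam, best_mse, best_coef = min(path, key=lambda r: r[1])
predicted_class = np.argmax(X_demo @ best_coef, axis=1)
```

The scores `X_demo @ best_coef` are real-valued and are not probabilities, which is the limitation of plain ridge classification that the abstract refers to.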