On Regularisation Methods for Analysis of High Dimensional Data

High dimensional data are rapidly growing in many domains due to the development of technological advances which helps collect data with a large number of variables to better understand a given phenomenon of interest. Particular examples appear in genomics, fMRI data analysis, large-scale healthcare...

Main author:	Sirimongkolkasem, Tanin
Co-author:	Drikvandi, Reza
Format:	BB
Language:	en_US
Published:	Springer Nature 2020
Subjects:	De-biased lasso; High dimensional data; Linear regression model
Online access:	https://doi.org/10.1007/s40745-019-00209-4 http://tailieuso.tlu.edu.vn/handle/DHTL/9409

id	oai:localhost:DHTL-9409
record_format	dspace
spelling	oai:localhost:DHTL-9409 2020-09-14T09:40:21Z 2020-09-14T09:37:32Z 2019 BB en_US Annals of Data Science (2019), Volume 6, Issue 4, pp 737–763 application/pdf Springer Nature
institution	Trường Đại học Thủy Lợi
collection	DSpace
language	en_US
topic	De-biased lasso; High dimensional data; Linear regression model
description	High dimensional data are rapidly growing in many domains due to the development of technological advances which helps collect data with a large number of variables to better understand a given phenomenon of interest. Particular examples appear in genomics, fMRI data analysis, large-scale healthcare analytics, text/image analysis and astronomy. In the last two decades regularisation approaches have become the methods of choice for analysing such high dimensional data. This paper aims to study the performance of regularisation methods, including the recently proposed method called de-biased lasso, for the analysis of high dimensional data under different sparse and non-sparse situations. Our investigation concerns prediction, parameter estimation and variable selection. We particularly study the effects of correlated variables, covariate location and effect size which have not been well investigated. We find that correlated data when associated with important variables improve those common regularisation methods in all aspects, and that the level of sparsity can be reflected not only from the number of important variables but also from their overall effect size and locations. The latter may be seen under a non-sparse data structure. We demonstrate that the de-biased lasso performs well especially in low dimensional data; however, it still suffers from issues, such as multicollinearity and multiple hypothesis testing, similar to the classical regression methods.
author2	Drikvandi, Reza
author_facet	Drikvandi, Reza; Sirimongkolkasem, Tanin
format	BB
author	Sirimongkolkasem, Tanin
author_sort	Sirimongkolkasem, Tanin
title	On Regularisation Methods for Analysis of High Dimensional Data
title_sort	on regularisation methods for analysis of high dimensional data
publisher	Springer Nature
publishDate	2020
url	https://doi.org/10.1007/s40745-019-00209-4 http://tailieuso.tlu.edu.vn/handle/DHTL/9409
work_keys_str_mv	AT sirimongkolkasemtanin onregularisationmethodsforanalysisofhighdimensionaldata
_version_	1787740147522994176