Knowledge management overview of feature selection problem in high-dimensional financial data: cooperative co-evolution and MapReduce perspectives

A N M Bazlur Rashid; Tonmoy Choudhury

doi:http://dx.doi.org/10.21511/ppm.17(4).2019.28

Received August 30, 2019;

Accepted November 20, 2019;

Published December 26, 2019
Author(s)
A N M Bazlur Rashid (ORCID: https://orcid.org/0000-0002-8672-5023), Tonmoy Choudhury (ORCID: https://orcid.org/0000-0002-7745-0048)
Article Info
Volume 17 2019, Issue #4, pp. 340-359

This work is licensed under a Creative Commons Attribution 4.0 International License

The term “big data” characterizes the massive amounts of data generated by advanced technologies in different domains. It is described by the 4Vs (volume, velocity, variety, and veracity), which indicate the amount of data that can only be processed via computationally intensive analysis, the speed of its creation, the different types of data, and its accuracy. High-dimensional financial data, such as time-series and space-time data, contain a large number of features (variables) but a small number of samples, and are used to measure various real-time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features. Because of this high dimensionality, many domains, including finance, lack the analytic tools to mine the data for knowledge discovery. Feature selection is an optimization problem: find a minimal subset of relevant features that maximizes classification accuracy and reduces computation. Traditional statistical feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm based on a divide-and-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-to-use distributed, scalable, and fault-tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-the-art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions.
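To make the ideas in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a synthetic two-class dataset, a nearest-centroid classifier as a stand-in for any classifier, a hypothetical size penalty of 0.01 per selected feature, and a static split of the features into three groups. Every function name and parameter here is illustrative. The built-in `map` call only mimics the shape of a MapReduce map phase in a single process; it is not a distributed implementation.

```python
import random

def make_toy_data(n_samples=40, n_features=12, informative=(0, 5, 9), seed=0):
    """Synthetic data (an assumption): only `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on selected columns."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, lab in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += int(min(dist, key=dist.get) == lab)
    return correct / len(y)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy, penalise subset size (the penalty weight is assumed)."""
    return accuracy(X, y, mask) - penalty * sum(mask)

def assemble(context, group, individual):
    """Plug one group's sub-mask into the shared full-length mask."""
    full = context[:]
    for j, bit in zip(group, individual):
        full[j] = bit
    return full

def cooperative_coevolution(X, y, n_groups=3, pop_size=8, generations=15, seed=1):
    rng = random.Random(seed)
    n = len(X[0])
    size = -(-n // n_groups)  # ceiling division
    # Divide and conquer: each group of feature indices is one sub-problem.
    groups = [list(range(s, min(s + size, n))) for s in range(0, n, size)]
    context = [1] * n  # start from "all features selected"
    best_fit = fitness(X, y, context)
    pops = [[[rng.randint(0, 1) for _ in grp] for _ in range(pop_size)]
            for grp in groups]
    for _ in range(generations):
        for grp, pop in zip(groups, pops):
            # The current sub-mask competes too, so fitness never decreases.
            candidates = pop + [[context[j] for j in grp]]
            # "Map" step: score every candidate independently.
            scores = list(map(
                lambda ind: fitness(X, y, assemble(context, grp, ind)),
                candidates))
            # "Reduce" step: keep the single best candidate.
            best_idx = max(range(len(candidates)), key=scores.__getitem__)
            champion = candidates[best_idx]
            context, best_fit = assemble(context, grp, champion), scores[best_idx]
            # Next generation: the champion plus bit-flip mutants of it.
            pop[:] = [champion] + [
                [1 - b if rng.random() < 0.2 else b for b in champion]
                for _ in range(pop_size - 1)]
    return context, best_fit

X, y = make_toy_data()
mask, score = cooperative_coevolution(X, y)
print("selected features:", [j for j, bit in enumerate(mask) if bit])
```

Because each candidate is scored independently of the others, the scoring step is the natural part to distribute; the sub-populations interact only through the shared context mask.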

Keywords
big data, computational techniques, knowledge discovery, meta-heuristics, optimization, parallel programming, problem decomposition
JEL Classification
M11, M15, C61, C63
References
143
Tables
2
Figures
7

- Figure 1. General feature selection process
- Figure 2. Overall categories of evolutionary computation for feature selection
- Figure 3. A general architecture of cooperative co-evolutionary algorithm
- Figure 4. An outline of cooperative co-evolutionary algorithm
- Figure 5. A typical MapReduce workflow shuffled list
- Figure 6. The basic flowchart of a MapReduce model
- Figure 7. Feature selection techniques based on MapReduce

- Table 1. Feature selection techniques based on cooperative co-evolution
- Table 2. Feature selection techniques based on cooperative co-evolution and MapReduce

Problems and Perspectives in Management