This paper presents a new classification and regression algorithm based on artificial intelligence. The main feature of this algorithm, called Code2Vect, is the nature of the data it can treat: qualitative or quantitative, continuous or discrete. In contrast to artificial intelligence techniques that rely on “Big Data,” this approach works with a reduced amount of data, within the so-called “Smart Data” paradigm. The main purpose of the algorithm is to represent high-dimensional data and, more specifically, to group and visualize these data according to a given target. To that end, the data are projected into a vector space equipped with an appropriate metric, which groups the data according to their affinity with respect to a given output of interest. A second application of the algorithm lies in its prediction capability: as with common data-mining techniques such as regression trees, the output is inferred from a given input, here while accommodating the heterogeneous data types described above. To illustrate its potential, two applications are addressed, one concerning the representation of high-dimensional and categorical data and the other featuring the prediction capabilities of the algorithm.
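The projection-then-prediction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the one-hot encoding, the pairwise loss (projected squared distances matched to squared output differences), the step-halving gradient descent, and the inverse-distance-weighted prediction are all assumptions chosen for illustration, as are the function names and the toy records.

```python
import numpy as np

def encode(records, levels):
    """One-hot encode qualitative fields; pass quantitative fields through.

    levels[k] is None for a numeric field, else the list of allowed categories.
    """
    rows = []
    for rec in records:
        row = []
        for value, lv in zip(rec, levels):
            if lv is None:
                row.append(float(value))
            else:
                row.extend(1.0 if value == c else 0.0 for c in lv)
        rows.append(row)
    return np.array(rows)

def stress(W, D, T):
    """Mean squared mismatch between projected squared distances and targets."""
    Z = D @ W.T
    return float(np.mean(((Z ** 2).sum(axis=1) - T) ** 2))

def fit_projection(X, y, dim=2, steps=500, seed=0):
    """Fit a linear map W (dim x n_features) by gradient descent with step halving."""
    rng = np.random.default_rng(seed)
    i, j = np.triu_indices(len(X), k=1)
    D = X[i] - X[j]                  # pairwise input differences
    T = (y[i] - y[j]) ** 2           # assumed target: squared output differences
    W = 0.1 * rng.standard_normal((dim, X.shape[1]))
    lr, loss = 0.1, stress(W, D, T)
    history = [loss]
    for _ in range(steps):
        Z = D @ W.T
        err = (Z ** 2).sum(axis=1) - T
        grad = 4.0 * (err[:, None] * Z).T @ D / len(D)
        W_new = W - lr * grad
        new_loss = stress(W_new, D, T)
        if new_loss < loss:          # accept only improving steps
            W, loss = W_new, new_loss
        else:
            lr *= 0.5                # otherwise shrink the step size
        history.append(loss)
    return W, history

def predict(W, X_train, y_train, x_new, eps=1e-9):
    """Inverse-distance-weighted average of training outputs in the projected space."""
    d2 = (((X_train - x_new) @ W.T) ** 2).sum(axis=1)
    w = 1.0 / (d2 + eps)
    return float(w @ y_train / w.sum())

# Hypothetical toy data: one quantitative field and one qualitative field.
levels = [None, ["steel", "alu"]]
records = [(1.0, "steel"), (2.0, "steel"), (1.0, "alu"), (2.0, "alu")]
y = np.array([1.0, 2.0, 3.0, 4.0])

X = encode(records, levels)
W, history = fit_projection(X, y)
y_hat = predict(W, X, y, encode([(1.5, "alu")], levels)[0])
```

Because the prediction is a weighted average of the training outputs, it always stays within their range; any extrapolating scheme would need a different interpolation rule.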
Clara Argerich Martín; Ruben Ibáñez Pinillo; Anais Barasinski; Francisco Chinesta
@article{CRMECA_2019__347_11_754_0,
  author = {Clara Argerich Mart{\'\i}n and Ruben Ib\'a\~nez Pinillo and Anais Barasinski and Francisco Chinesta},
  title = {\protect\emph{Code2vect}: {An} efficient heterogenous data classifier and nonlinear regression technique},
  journal = {Comptes Rendus. M\'ecanique},
  pages = {754--761},
  publisher = {Elsevier},
  volume = {347},
  number = {11},
  year = {2019},
  doi = {10.1016/j.crme.2019.11.002},
  language = {en},
}
Clara Argerich Martín; Ruben Ibáñez Pinillo; Anais Barasinski; Francisco Chinesta. Code2vect: An efficient heterogenous data classifier and nonlinear regression technique. Comptes Rendus. Mécanique, Data-Based Engineering Science and Technology, Volume 347 (2019) no. 11, pp. 754-761. doi: 10.1016/j.crme.2019.11.002. https://comptes-rendus.academie-sciences.fr/mecanique/articles/10.1016/j.crme.2019.11.002/