Code2vect: An efficient heterogenous data classifier and nonlinear regression technique

Clara Argerich Martín ¹ ; Ruben Ibáñez Pinillo ¹ ; Anais Barasinski ² ; Francisco Chinesta ³

¹ PIMM, Arts et Métiers Institute of Technology, CNRS, CNAM, HESAM University, 151, boulevard de l'Hôpital, 75013 Paris, France
² University of Pau & Pays Adour, E2S UPPA, IPREM UMR5254, 64000 Pau, France
³ ESI GROUP Chair @ PIMM, Arts et Métiers Institute of Technology, 151, boulevard de l'Hôpital, 75013 Paris, France

Comptes Rendus. Mécanique, Data-Based Engineering Science and Technology, Volume 347 (2019) no. 11, pp. 754-761.

Abstract

This paper presents a new classification and regression algorithm based on artificial intelligence. The main feature of this algorithm, called Code2Vect, is the nature of the data it handles: qualitative or quantitative, continuous or discrete. Unlike artificial intelligence techniques that rely on “Big Data,” this approach works with a reduced amount of data, within the so-called “Smart Data” paradigm. The main purpose of the algorithm is to represent high-dimensional data and, more specifically, to group and visualize these data according to a given target. To that end, the data are projected into a vector space equipped with an appropriate metric that groups data according to their affinity with respect to a given output of interest. A second application of the algorithm lies in its prediction capability: as with common data-mining techniques such as regression trees, the output is inferred from a given input, here for data of the heterogeneous nature described above. To illustrate its potential, two applications are addressed, one concerning the representation of high-dimensional and categorical data and another featuring the prediction capabilities of the algorithm.
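The abstract does not state the formulation, so the following is only a minimal sketch of the general idea (project heterogeneous data into a low-dimensional vector space where distances reflect the output of interest, then infer the output for a new input). Everything in it is an assumption made for illustration and not the authors' implementation: the one-hot encoding of qualitative features, the linear map `W`, the stress-like loss minimized by plain gradient descent, the inverse-distance interpolation, the helper names `one_hot`, `fit_mapping` and `predict`, and the toy data.

```python
import numpy as np

def one_hot(labels):
    """Encode qualitative labels as one-hot rows (one column per distinct label)."""
    cats = sorted(set(labels))
    return np.array([[1.0 if v == c else 0.0 for c in cats] for v in labels])

def fit_mapping(X, y, dim=2, steps=2000, lr=0.01, seed=0):
    """Learn a linear map W (dim x n_features) such that the distance
    ||W x_i - W x_j|| approximates the output gap |y_i - y_j|.
    Assumed loss: mean over all pairs of (distance - gap)^2."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(dim, X.shape[1]))
    dX = X[:, None, :] - X[None, :, :]          # pairwise input differences
    gap = np.abs(y[:, None] - y[None, :])       # pairwise output gaps
    for _ in range(steps):
        dZ = dX @ W.T                            # pairwise differences after mapping
        dist = np.linalg.norm(dZ, axis=-1) + 1e-9
        resid = (dist - gap) / dist
        # Gradient of the mean squared mismatch with respect to W.
        grad = 2.0 * np.einsum("ij,ijk,ijl->kl", resid, dZ, dX) / len(X) ** 2
        W -= lr * grad
    return W

def predict(W, X_train, y_train, x_new, eps=1e-9):
    """Infer the output at x_new by inverse-distance weighting in the mapped space."""
    d = np.linalg.norm((X_train - x_new) @ W.T, axis=1)
    w = 1.0 / (d + eps)
    return float(w @ y_train / w.sum())

# Toy, made-up data: one qualitative feature (material) and one quantitative
# feature (a thickness already scaled to [0, 1]), with a scalar output.
material = ["steel", "alu", "steel", "alu", "cfrp", "cfrp"]
thickness = np.array([0.1, 0.2, 0.8, 0.9, 0.4, 0.6])
y = np.array([1.0, 0.5, 3.0, 2.0, 4.0, 5.0])

X = np.hstack([one_hot(material), thickness[:, None]])
W = fit_mapping(X, y)
points_2d = X @ W.T                              # 2D coordinates that could be plotted
x_query = np.array([0.0, 1.0, 0.0, 0.5])         # columns: alu, cfrp, steel, thickness
print(points_2d.shape, predict(W, X, y, x_query))
```

With a two-dimensional target space, `points_2d` can be drawn as a scatter plot colored by `y` to inspect how the points group. A linear map is the simplest choice here; it cannot always match every pairwise gap, so the result is a least-squares compromise.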

Received: 2019-05-20
Accepted: 2019-08-01
Published: 2019-11-01

DOI: 10.1016/j.crme.2019.11.002

Keywords: Machine learning, Data representation, Classification, Categorical data, Neural network, High-dimensional data, Regression

@article{CRMECA_2019__347_11_754_0,
     author = {Clara Argerich Mart{\'\i}n and Ruben Ib\'a\~nez Pinillo and Anais Barasinski and Francisco Chinesta},
     title = {\protect\emph{Code2vect}: {An} efficient heterogenous data classifier and nonlinear regression technique},
     journal = {Comptes Rendus. M\'ecanique},
     pages = {754--761},
     publisher = {Elsevier},
     volume = {347},
     number = {11},
     year = {2019},
     doi = {10.1016/j.crme.2019.11.002},
     language = {en},
}

Clara Argerich Martín; Ruben Ibáñez Pinillo; Anais Barasinski; Francisco Chinesta. Code2vect: An efficient heterogenous data classifier and nonlinear regression technique. Comptes Rendus. Mécanique, Data-Based Engineering Science and Technology, Volume 347 (2019) no. 11, pp. 754-761. doi : 10.1016/j.crme.2019.11.002. https://comptes-rendus.academie-sciences.fr/mecanique/articles/10.1016/j.crme.2019.11.002/
