Bregman divergences for physically informed discrepancy measures for learning and computation in thermomechanics

ABSTRACT: In the context of convex thermomechanics, we propose tools based on the concept of Bregman divergence, a notion introduced in the 1960s and used in learning and optimization as well. This study is motivated by the need for "discrepancy measures" between physically constrained fields, which are used in traditional algorithms and analysis methods as well as in data-driven modelling and applications. We also give a characterization of symmetric Bregman divergences through their generating function, which can only be a quadratic function. Some properties of the Bregman divergence and of the newly introduced concept of Bregman Gap between couples of dual quantities are given.


Introduction
In thermomechanics, as in many multi-physics applications, there is a growing need for metrics, "discrepancy measures", or other concepts derived from a distance in order to compare fields of quantities of interest. This need is driven mainly, though not only, by the development of large amounts of digitized data coming from experiments, computations, or in-operation measurements. Indeed, a very large majority of algorithms and data processing methods rely, at one time or another, on the comparison of fields. We can cite mechanical analysis, inverse problems, and model reduction, as well as classification or knowledge extraction from massive data. To process data correctly, or more generally fields with different physical dimensions, and to introduce the a priori knowledge available from modelling work and from the description of the physics, it is essential to go beyond the usual norms of $\mathbb{R}^n$ or even of functional spaces, which have been used until now mainly to deal with a single physics. The increasingly frequent appearance of high-dimensional spaces and probabilistic tools also raises questions about the usual norms and even about the notion of neighborhood ([1], [2]).

In the case of thermomechanics, for example, "solution fields" such as displacement, temperature, strain and stress tensors are structured both locally and spatially by the conservation laws, the thermodynamic requirements, the constitutive equations, and the symmetries; some invariant quantities or global properties are also well established. Even if, in practice, data from experiments, measurements, or even computations are corrupted by some noise, it seems that incorporating as much of this a priori knowledge as possible can lead to a guarantee of objectivity and performance, especially when mixing data of different types or origins. Coming back to thermomechanics, the general concept of energy has been used for decades, and some authors have already
proposed derived concepts that have proved fruitful in this direction in various applications. To mention only a few: the Kohn-Vogelius functional [3] for distributed parameter identification, the constitutive relation error ([4], [5], [6]) for finite element error estimation and model updating or identification of material parameters, the Reciprocity Gap [7] for crack identification, and the Equilibrium Gap [8] used together with image correlation techniques.

Let us take a simple illustration. Suppose we want to identify the rigidity of springs distributed along a bar in the traction-compression regime. The bar is clamped at one end and subjected to a prescribed force C at the other. The identification can be performed provided the corresponding displacement is measured, and therefore known, all along the bar. Two situations are considered: in the first one, a), the springs are linear, whereas in the second one, b), a supplementary cubic nonlinearity is added. Both systems are governed by a second-order differential equation for the displacement along the bar, with zero displacement at the clamped end and the force C prescribed at the other end. In case a) the linear rigidity is sought, whereas in case b) it is known and the rigidity of the cubic term is sought. A classical approach is to minimize a squared measure of discrepancy between the actual displacement (obtained for the true value of the rigidity) and the displacement computed for a given value of the rigidity. Figure 1 displays, for cases a) and b), the usual least-squares function, the energy of the difference of the two fields, and what will be called below the symmetrized Bregman divergence between the two fields (which in case a) turns out to be exactly the energy of the difference). It can be observed that, even in this simplistic case, the functions G2 and H3 are better suited to the minimization procedure. Indeed, the ratios of the Hessians are respectively 1.2 and 5.9 at the true values of the rigidity in cases a) and b), the difference being much more pronounced in the nonlinear case.

This paper is devoted to the
introduction of various concepts involving, or derived from, the Bregman divergence and leading to various forms of discrepancy measures, in a special sense, for thermomechanical quantities. These concepts encompass some of the concepts cited previously and have already been used for solving nonlinear Cauchy problems in various situations in thermomechanics ([9]-[11]). The paper is organized as follows. First, the Bregman divergence is defined and various properties are described, together with the definition of derived concepts such as the Symmetrized Bregman Divergence and the Bregman Gap. Then generating functions for the Bregman divergence and derived concepts are given for a large class of constitutive relations or situations in thermomechanics. Finally, some properties useful for elementary computational geometry are given.

The Bregman divergence: main properties and generalization

2.1 The Bregman divergence
The Bregman divergence was introduced by Bregman [12] in the context of convex programming [13], although the name "Bregman divergence" was coined later [14]. It has also been studied and used for some years in statistical inference and data processing [15], for problems such as estimation, detection, classification, compression, and recognition, in particular for classification or clustering applications [16], and, more recently, in deep learning applications. In learning applications, the Bregman divergence is used as a discrepancy measure, and in some optimization algorithms [17] it is recognized as superior to the usual Euclidean distance. We give here a slight generalization of the Bregman divergence that includes the case of non-differentiable generating functions, because such functions are encountered in thermomechanics.

Definition: Let J be a proper convex function, $J(e): \mathbb{R}^n \to \mathbb{R}$. The Bregman divergence generated by J between two points $(e_1, e_2)$ belonging to $\mathrm{dom}(J)^2$ is the non-negative scalar:

$$D_J(e_1, e_2) = J(e_1) - J(e_2) - \langle p_{12},\, e_1 - e_2 \rangle \tag{1}$$

where $p_{12}$ is a particular element of the subdifferential $\partial J(e_2)$, which reduces to $\nabla J(e_2)$ when J is differentiable at $e_2$.

Fig. 2: Geometrical interpretation of the Bregman divergence in $\mathbb{R}^d$, d = 1 and 2, for differentiable generating functions

Roughly speaking, the Bregman divergence is the remainder of the first-order Taylor expansion of J. An extension of the Bregman divergence to infinite-dimensional spaces has been provided [18]. The positivity of the Bregman divergence results simply from the very definition of a convex function, which lies "above its tangent planes". The Bregman divergence can be used as a discrepancy measure between two vectors, as it takes strictly positive values if the two vectors are different (when the generating function is strictly convex) and is zero on any pair (e, e). The definition (1) for non-differentiable generating functions is chosen so as to obtain a "medium" discrepancy measure, and it also leads to a non-zero divergence between zero and other points for generating functions that are positively homogeneous of degree one. The geometrical interpretation is slightly different from the case of differentiable generating functions: the value $p_{12}$, which is an element of $\partial J(e_2)$ by convexity of the subdifferential, turns out to be also the mean value of the directional derivatives in the directions $(e_1 - e_2)$ and $(e_2 - e_1)$.

There are situations in the field of learning, for example in image classification, that do not require finer concepts such as true metrics, some properties of which, for example the triangle inequality, do not really make sense there [19]. Nevertheless, even if $D_J(e_1, e_2)$ vanishes when $e_1 = e_2$ (and conversely if the generating function J is strictly convex), Bregman divergences are not distances, as in general they are neither symmetric nor do they satisfy the triangle inequality. In Table 1, some Bregman divergences are displayed; the first two are genuine distances, whereas the last ones, which are not, are used in learning applications and statistics.
Table 1: Examples of generating functions and associated Bregman divergences

Nevertheless, the following interesting properties [16], which are useful in optimization as well as in computational geometry, have been established and have been extended to the infinite-dimensional context [18].

Convexity. $D_J(e_1, e_2)$ is always convex in the first argument, but not necessarily in the second argument: take for example $J(e) = e^3$ on $\mathbb{R}^+$, for which $D_J(e_1, e_2) = e_1^3 + 2 e_2^3 - 3 e_1 e_2^2$ is not convex in $e_2$ when $e_2 < e_1/2$.

Linearity. The Bregman divergence is a linear operator with respect to generating functions: for $\lambda \ge 0$,

$$D_{J_1 + \lambda J_2}(e_1, e_2) = D_{J_1}(e_1, e_2) + \lambda\, D_{J_2}(e_1, e_2) \tag{3}$$

Although this is generally not the case, the Bregman divergence can nevertheless be symmetric; it is therefore interesting to derive a characterization of the symmetric Bregman divergences, because the preceding properties are then clearly improved.

Proposition: Characterization of symmetric Bregman divergences. The Bregman divergence generated by the convex function J is symmetric iff J is a (positive) quadratic function.
The proof is provided in Annex A. This condition is stronger than the condition of positive homogeneity of degree two. For example, in $\mathbb{R}$, the Bregman divergence generated by the positively homogeneous function of degree two equal to $a x^2$ for $x \ge 0$ and to $b x^2$ for $x < 0$, with a ≠ b, cannot be symmetric. Indeed, for any x > 0: $D_J(x, -x) = (a + 3b)\,x^2 \ne (3a + b)\,x^2 = D_J(-x, x)$. This result, perhaps disappointing because of the restriction it imposes on the generating functions J, nevertheless has as a corollary that symmetric Bregman divergences are actually distances (namely Mahalanobis distances in finite dimension).

Corollary: Symmetric Bregman divergences satisfy the triangle inequality and are therefore distances.
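As a numerical illustration of the definition and of the symmetry characterization above, the following Python sketch evaluates the Bregman divergence for three generating functions. It is only a sketch under stated assumptions: the helper name `bregman`, the matrix `M`, the coefficients `a` and `b`, and the sample points are hypothetical choices made for illustration, and only differentiable generating functions are handled, so that the gradient at the second point plays the role of $p_{12}$.

```python
import math

def bregman(J, grad_J, e1, e2):
    """D_J(e1, e2) = J(e1) - J(e2) - <grad J(e2), e1 - e2> (vectors as lists)."""
    g = grad_J(e2)
    return J(e1) - J(e2) - sum(gi * (x - y) for gi, x, y in zip(g, e1, e2))

# Quadratic generating function J(e) = e^T M e, with M symmetric positive definite
# (hypothetical values): the divergence is a squared Mahalanobis-type distance.
M = [[2.0, 0.5], [0.5, 1.0]]

def J_quad(e):
    return sum(e[i] * M[i][j] * e[j] for i in range(2) for j in range(2))

def grad_quad(e):
    return [2.0 * sum(M[i][j] * e[j] for j in range(2)) for i in range(2)]

# Positively homogeneous function of degree two that is NOT quadratic:
# a*x^2 for x >= 0 and b*x^2 for x < 0 (assumed form, a != b, both positive).
a, b = 1.0, 3.0

def J_pw(e):
    x = e[0]
    return (a if x >= 0 else b) * x * x

def grad_pw(e):
    x = e[0]
    return [2.0 * (a if x >= 0 else b) * x]

# Negative entropy on positive vectors: generates the generalized Kullback-Leibler divergence.
def J_ent(e):
    return sum(x * math.log(x) for x in e)

def grad_ent(e):
    return [math.log(x) + 1.0 for x in e]

e1, e2 = [1.0, 2.0], [3.0, 0.5]

d_quad_12 = bregman(J_quad, grad_quad, e1, e2)   # symmetric case
d_quad_21 = bregman(J_quad, grad_quad, e2, e1)

d_pw_12 = bregman(J_pw, grad_pw, [1.0], [-1.0])  # equals (a + 3b) x^2 with x = 1
d_pw_21 = bregman(J_pw, grad_pw, [-1.0], [1.0])  # equals (3a + b) x^2 with x = 1

d_ent_12 = bregman(J_ent, grad_ent, e1, e2)      # non-symmetric case
d_ent_21 = bregman(J_ent, grad_ent, e2, e1)

print(d_quad_12, d_quad_21)
print(d_pw_12, d_pw_21)
print(d_ent_12, d_ent_21)
```

Only the quadratic generating function gives equal values when the two arguments are swapped, which is consistent with the proposition.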

Symmetrized Bregman divergences, Bregman gaps and ε-Bregman gaps
For applications in thermomechanics, it can be desirable to have concepts of dissimilarity that, first, give the same role to both components of the pair $(e_1, e_2)$ to be analyzed and, second, involve primal and dual variables as well, because their dual product has a strong energetic meaning. That is the reason why we introduce first the notion of symmetrized Bregman divergence and second the notion of Bregman gap. The symmetrized Bregman divergence is simply taken as the sum of the two Bregman divergences (the idea seems to date back to 1946 [20]):

$$D^s_J(e_1, e_2) = D_J(e_1, e_2) + D_J(e_2, e_1) \tag{4}$$

Other symmetrized forms have been proposed, among them the expression (5) and the symmetrized divergence (6) of Chen et al. [22], a positive quantity because J is convex. The properties of the symmetrized divergence (6), mainly for the scalar case, rely strongly on the regularity of the generating function, which must be at least $C^2$. However, in the context of thermomechanics, potentials useful for defining generating functions may fail to be differentiable (see part 3). These definitions are not equivalent, even for symmetric Bregman divergences, and the first definition (4) is chosen here.
In the framework of linear elasticity under small perturbations, if e is the linearized strain tensor $\varepsilon$ and J is taken as the elastic energy density, the symmetric Bregman divergence is the so-called error in (elastic) constitutive equation, which also involves the stress tensor $\sigma$, or twice the energy of the difference, or even, up to a multiplicative coefficient, the Kohn-Vogelius functional integrand:

$$D^s_J(\varepsilon_1, \varepsilon_2) = (\sigma_1 - \sigma_2) : (\varepsilon_1 - \varepsilon_2) = 2\, J(\varepsilon_1 - \varepsilon_2)$$

Moreover, when the generating function J of the Bregman divergence is twice differentiable, a quadratic expression of the symmetric divergence can be given using the value of the Hessian at some point $\bar e$ of the segment $[e_1, e_2]$, thanks to the mean value theorem; obviously the point $\bar e$ depends on the pair $(e_1, e_2)$:

$$D^s_J(e_1, e_2) = \langle \nabla^2 J(\bar e)\,(e_1 - e_2),\, e_1 - e_2 \rangle, \qquad \bar e \in [e_1, e_2].$$

With a view to applications in thermomechanics, it is of interest to introduce a concept that enjoys symmetry while simultaneously involving dual and primal variables. More precisely, in applications concerning data structured by an underlying physics (experimental data or data from numerical simulations involving partial differential equations), it is useful to have divergences or discrepancy measures between couples of dual variables (e.g. stress and strain tensors, temperature gradient and heat flux, ...).

Definition: Bregman Gap. Let J be a convex, not necessarily differentiable function. The Bregman gap $BG_J$ generated by J between $e_1$ and the couple of dual quantities $(e_2, p_2)$, $p_2 \in \partial J(e_2)$, is the non-negative quantity:

$$BG_J(e_1, [e_2, p_2]) = J(e_1) - J(e_2) - \langle p_2,\, e_1 - e_2 \rangle \tag{9}$$

Here, in contrast with the definition (1) of Bregman divergences for non-differentiable functions, no choice is made for the element in the subdifferential of J at $e_2$, because it is itself an argument of the Bregman Gap. In turn, this notion allows one to define a symmetric gap between any couples of dual variables $[e_1, p_1]$ and $[e_2, p_2]$.

Definition: Symmetrized Bregman Gap. Generated by the convex function J, the Symmetrized Bregman Gap $BG^s_J$ between the two couples of dual quantities $(e_1, p_1)$ and $(e_2, p_2)$, $p_i \in \partial J(e_i)$, is the non-negative scalar:

$$BG^s_J([e_1, p_1], [e_2, p_2]) = BG_J(e_1, [e_2, p_2]) + BG_J(e_2, [e_1, p_1])$$
A straightforward calculation gives the following expression for the Symmetrized Bregman Gap:

$$BG^s_J([e_1, p_1], [e_2, p_2]) = \langle p_1 - p_2,\, e_1 - e_2 \rangle \tag{10}$$

The generating function no longer appears in this expression, but its positivity relies strongly on the existence of a convex function J such that $p_i \in \partial J(e_i)$, because this expression is nothing but the monotonicity of the subdifferential of convex functions. In applications, the determination of J is then not necessary as soon as the $p_i$ are available, although its existence is mandatory. Going back to the example of linear elasticity, where the elastic energy density is quadratic (provided the reference state is stress-free), the Symmetrized Bregman Gap between two states $(\varepsilon_1, \sigma_1)$ and $(\varepsilon_2, \sigma_2)$, or their discrepancy measure, is simply twice the elastic energy of the difference of the strain tensors. Again, an expression already existing in the literature for applications to elastic materials or structures with quadratic potentials is recovered.

More generally, the convex framework ([24], [25]) used here leads to two properties involving the convex conjugate J* of the generating function J.

The Bregman Gap generated by the convex function J is nothing but the Legendre-Fenchel residual.

If J* is the Legendre-Fenchel conjugate function of J, then: $D_J(e_1, e_2) = D_{J^*}(\nabla J(e_2), \nabla J(e_1)) = D_{J^*}(p_2, p_1)$.

As an example, the double-well free energy potential used in phase transformation models is not convex, but its conjugate is convex, although not strictly convex, and can be calculated by taking advantage of the fact that the biconjugate of J is also the largest closed convex function smaller than J.

We can use the notion of ε-subdifferential, which comes from the optimization community ([26], [27]) and has proved to be the right notion of approximation for the subdifferential, and incorporate it in a definition of ε-Bregman Gaps.
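The reduction of the Symmetrized Bregman Gap to the generating-function-free expression (10) can be checked on a small scalar example. The sketch below is a minimal illustration, assuming the non-differentiable generating function J(e) = |e|; the function names and the sample couples are hypothetical choices made for this check.

```python
def bregman_gap(J, e1, e2, p2):
    """Scalar Bregman gap BG_J(e1, [e2, p2]) = J(e1) - J(e2) - p2 * (e1 - e2).

    p2 is assumed to be a subgradient of J at e2; it is an argument, not a choice
    made inside the function.
    """
    return J(e1) - J(e2) - p2 * (e1 - e2)

def symmetrized_gap(e1, p1, e2, p2):
    """Expression (10): (p1 - p2) * (e1 - e2); J itself is not needed."""
    return (p1 - p2) * (e1 - e2)

J = abs  # non-differentiable at 0

# Two couples of dual quantities for J(e) = |e|:
e1, p1 = 2.0, 1.0    # for e1 > 0 the only subgradient is 1
e2, p2 = 0.0, -0.3   # at e2 = 0 any value in [-1, 1] is a subgradient

sum_of_gaps = bregman_gap(J, e1, e2, p2) + bregman_gap(J, e2, e1, p1)
direct = symmetrized_gap(e1, p1, e2, p2)

# Couples that are NOT related through a convex function can give a negative value:
not_monotone = symmetrized_gap(2.0, -1.0, 0.0, 1.0)

print(sum_of_gaps, direct, not_monotone)
```

The last value illustrates why the existence of a convex generating function remains mandatory even though it does not appear in (10).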

 
Definition: ε-subgradients, ε-subdifferentials [26, 27]. Let J be any convex function and pick any e in dom J. For a positive scalar ε, a vector p is called an ε-subgradient of J at e when the following property holds:

$$J(e') \ge J(e) + \langle p,\, e' - e \rangle - \varepsilon \qquad \text{for all } e'.$$

The set of all ε-subgradients at e is called the ε-subdifferential of J at e and is denoted by $\partial_\varepsilon J(e)$.
In the context of linear elasticity, the ε-stress tensors associated with a strain tensor e are then the sum of the actual stress tensor σ associated with e by the constitutive equation and any stress tensor with elastic energy less than ε. As an example in one dimension, the ε-subdifferential of the function $J(e) = |e|$ is illustrated in Figure 5.

Fig. 5: ε-subdifferential of $J(e) = |e|$ as a function of e along the real axis for ε = 0.4. For ε = 0, the subdifferential (heavy grey line) is the segment [-1, 1] at e = 0, and reduces to {1} and {-1} for e > 0 and e < 0 respectively.

Definition: ε-Bregman Gap. Let J be a convex, not necessarily differentiable function, and ε a non-negative scalar. The ε-Bregman gap ε-$BG_J$ generated by J is defined between $e_1$ and the couple of approximate dual quantities $(e_2, p_2)$, where $p_2$ is an ε-subgradient of J at $e_2$.

The interest of this definition is that it leads to a positive discrepancy measure without knowing the generating function, provided only that it exists, with an approximation ε for the relation $p \in \partial J(e)$.

Unfortunately, no property of separate convexity makes sense for $BG^s$ or for ε-$BG^s$, because their domains are not convex unless the generating function is quadratic. Indeed, convexity would require the convexity of the domains, and therefore the following implication, which is not ensured except for quadratic functions:

$$p_1 \in \partial J(e_1),\; p_2 \in \partial J(e_2) \;\Longrightarrow\; \lambda p_1 + (1 - \lambda)\, p_2 \in \partial J\big(\lambda e_1 + (1 - \lambda)\, e_2\big), \qquad \lambda \in [0, 1].$$
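The one-dimensional example J(e) = |e| can be made explicit. The sketch below computes the ε-subdifferential as an interval, using the standard characterization that p is an ε-subgradient at e exactly when J(e) + J*(p) - p e ≤ ε, with J*(p) = 0 on [-1, 1] for the absolute value. The function names are hypothetical, and the shifted quantity in `shifted_gap_abs` is an assumption of this sketch, read directly off the ε-subgradient inequality; it is not presented as the paper's definition of the ε-Bregman gap.

```python
def eps_subdifferential_abs(e, eps):
    """Interval (lo, hi) of eps-subgradients of J(e) = |e| at the point e."""
    if eps < 0:
        raise ValueError("eps must be non-negative")
    if e > 0:
        # |e| - p*e <= eps  <=>  p >= 1 - eps/e, together with |p| <= 1
        return (max(-1.0, 1.0 - eps / e), 1.0)
    if e < 0:
        # |e| - p*e <= eps  <=>  p <= -1 + eps/|e|, together with |p| <= 1
        return (-1.0, min(1.0, -1.0 + eps / (-e)))
    return (-1.0, 1.0)

def shifted_gap_abs(e1, e2, p2, eps):
    """|e1| - |e2| - p2*(e1 - e2) + eps (assumed form for this check).

    By the eps-subgradient inequality this is non-negative whenever p2 is an
    eps-subgradient of |.| at e2.
    """
    return abs(e1) - abs(e2) - p2 * (e1 - e2) + eps

eps = 0.4
interval_at_1 = eps_subdifferential_abs(1.0, eps)     # narrow interval ending at 1
interval_near_0 = eps_subdifferential_abs(0.1, eps)   # already the full segment
interval_at_m2 = eps_subdifferential_abs(-2.0, eps)   # narrow interval starting at -1
exact_at_1 = eps_subdifferential_abs(1.0, 0.0)        # eps = 0: usual subdifferential

# Non-negativity check with the extreme eps-subgradient at e2 = 1
lo, hi = interval_at_1
grid = [k / 10.0 for k in range(-30, 31)]
min_shifted = min(shifted_gap_abs(e1, 1.0, lo, eps) for e1 in grid)

print(interval_at_1, interval_near_0, interval_at_m2, exact_at_1, min_shifted)
```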

Thermodynamic potentials as Bregman generating functions in thermomechanics
The objective of this section is to build suitable Bregman Divergences and Bregman Gaps for the processing of data that originate either from physical experiments or from numerical simulations of these same physical phenomena. As mentioned in the introduction, it is believed that taking into account the specific structure imparted to these data by the physical constraints, conservation laws, or constitutive equations can lead to better performance of the algorithms used for processing this kind of data. Indeed, metrics and scalar products are at the heart of processing methods such as the Proper Orthogonal Decomposition, used for the sparse representation of the data, the identification of structures buried in it, or the reduction of models for the PDEs governing it, and of classification, clustering, and regression algorithms in learning, which often proceed by minimizing discrepancy functions between data. The applications to nonlinear cases, arising notably from the nonlinearity of the constitutive equation (for small deformations), are appealing for the definition of non-quadratic pseudo (squared) metrics.
While the merits of various metrics or pseudo-metrics, scalar or pseudo-scalar products, can be considered to be more or less the same when applied to data corresponding to a single physics, the situation is very different in the case of multi-physics data relating to coupled phenomena. This is due, on the one hand, to the question of the physical units of the data and, on the other hand, to the need to make the best use of the a priori information contained in the coupled equations satisfied by the data. Even with uncertainties, it is desirable to construct more elaborate metrics than the usual Euclidean metrics of $\mathbb{R}^n$, and the concepts associated with the Bregman divergence seem fruitful in that respect.
The main reason for this is the flexibility they allow in the choice of generating functions. In mechanics, in thermics, or more generally as soon as thermodynamic phenomena are involved, it is natural, when choosing generating functions for Bregman divergences, to turn to thermodynamic potentials or to dissipation functions and potentials. Indeed, in a large majority of cases, these are convex functions of their arguments. Furthermore, these potentials were fundamentally conceived to introduce an energetic notion but also to define the relations between dual variables, and it is precisely this last feature that can be exploited with Bregman divergences. It then appears that the gap functions defined by the symmetric Bregman gaps coincide with some forms of errors in constitutive equations. In multiphysics situations, advantage can be taken of the additivity property of the Bregman divergence (cf. part 2.1), $D_{J_1 + J_2} = D_{J_1} + D_{J_2}$, for combining various potentials, either coupled or involving different physical quantities, in order to build pseudo-metrics covering all the data involved and taking into account all the couplings between them. In this section, advantage is taken of the convex framework of generalized standard materials to build ad hoc Bregman gaps, and an extension is made to the generalization of this framework by implicit standard materials. Nevertheless, the classical framework of linear elliptic problems, where the (linear) constitutive equations are incorporated directly in the conservation laws, is analyzed first.
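The additivity property invoked above for multiphysics data can be illustrated on a two-component state. In the sketch below, a quadratic "mechanical-like" potential and an entropy-type "thermal-like" potential are combined with a weight; both potentials, the stiffness `k`, the weight `lam`, and the states are hypothetical choices made only to show how two divergences with different physical content add up.

```python
import math

def bregman(J, grad_J, e1, e2):
    """D_J(e1, e2) for a differentiable generating function J (vectors as lists)."""
    g = grad_J(e2)
    return J(e1) - J(e2) - sum(gi * (x - y) for gi, x, y in zip(g, e1, e2))

k = 3.0    # hypothetical stiffness
lam = 2.0  # hypothetical non-negative weight between the two potentials

# State e = [strain-like component, positive temperature-like component]
def J_mech(e):
    return 0.5 * k * e[0] ** 2

def grad_mech(e):
    return [k * e[0], 0.0]

def J_therm(e):
    return e[1] * math.log(e[1]) - e[1]

def grad_therm(e):
    return [0.0, math.log(e[1])]

def J_total(e):
    return J_mech(e) + lam * J_therm(e)

def grad_total(e):
    return [gm + lam * gt for gm, gt in zip(grad_mech(e), grad_therm(e))]

s1, s2 = [0.2, 1.5], [-0.1, 0.8]

d_total = bregman(J_total, grad_total, s1, s2)
d_parts = bregman(J_mech, grad_mech, s1, s2) + lam * bregman(J_therm, grad_therm, s1, s2)

print(d_total, d_parts)
```

Because each potential carries its own physical units, the weighted sum yields a single scalar discrepancy for the whole state without any ad hoc rescaling of the components.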

Potentials for data from physical fields
When the data to be processed come from vector fields, obtained from experiments or from numerical simulations, that obey a system of PDEs more or less accurately, energy quantities such as the potentials from which these equations derive can be used to build generating functions for Bregman divergences. An important first field of application is provided by the framework of linear symmetric PDEs. If, under the quasi-static assumption, the data field u(x) on a spatial domain Ω is supposed to obey such a system of equations, the variational form of these equations reads:

$$u \in V, \qquad a(u, v) = l(v) \quad \forall v \in V_0,$$

where a is a symmetric bilinear form defined on the product of function spaces U(Ω) × U(Ω), l is a linear form on U, $V \subset U$ is the space of admissible fields (with respect to boundary conditions on parts of ∂Ω), and $V_0$ is its tangent space. If a is continuous and coercive and l is continuous, the solutions of the system are characterized by the minimization of a potential P:

$$u = \arg\min_{v \in V} P(v), \qquad P(v) = \tfrac{1}{2}\, a(v, v) - l(v).$$

The convexity and positivity of a lead us to consider the global Bregman divergence, i.e. the one having as argument the field u over Ω, generated by the quadratic form associated with a, $J(u) = a(u, u)$, because J is positive, convex and, in addition, J(u) = 0 implies that u is zero as well. J being a quadratic function, the Bregman divergence generated by J is symmetric and takes the simple form:

$$D_J(u, v) = a(u, u) - a(v, v) - 2\, a(v, u - v) = a(u - v, u - v) = \int_\Omega A\big(u(x) - v(x),\, u(x) - v(x)\big)\, dx,$$

where A denotes the density of the bilinear form a. The previous expression is an abuse of notation because in general A is also a function of the derivatives of u and v: $A(u, \partial u / \partial x, \ldots, v, \partial v / \partial x, \ldots)$. A(u, u) can often be interpreted as the density of a global energy or dissipation $V(u) = \tfrac{1}{2}\, a(u, u)$, as illustrated in the table below. In some situations where A is only a function of the derivatives of u, and not of the values of u, attention must be paid to the space V, which is usually a quotient space, in order to recover strict convexity by eliminating the "rigid modes" with no energy. The following table displays some examples.

Table: quadratic energies or dissipations associated with various linear physics (columns: Physics; U space; Energy or dissipation density A; the listed physics include scalar conduction or diffusion (thermal, electrical, Darcean flow) and incompressible Stokes flow)

The discrepancy measure generated by the Bregman divergence is of particular interest compared with the usual Euclidean metrics in situations where the medium is anisotropic and/or heterogeneous [28].

Another important field of application arises in fluid mechanics. For the Navier-Stokes equations, where the variable of interest is the velocity field v, two quadratic functions can be used for generating Bregman divergences: the kinetic energy, built on $\|v\|^2$, and the enstrophy, built on $\|\operatorname{rot} v\|^2$. The enstrophy is the square of the vorticity for incompressible flows. Using the associated Bregman divergence then allows two velocity fields (or more precisely their fluctuations) to be compared in terms of their vorticity content; this can be of interest, for example, in POD in order to identify coherent structures such as vortices in a flow.
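For the linear symmetric setting discussed in this section, a discrete counterpart of the global Bregman divergence can be sketched with a stiffness matrix. The example below assumes a one-dimensional bar clamped at its first node, discretized with uniform two-node elements of unit rigidity; the function names, the mesh, and the two sample fields are hypothetical. It checks that the divergence generated by J(u) = a(u, u) coincides with the energy form of the difference a(u - v, u - v).

```python
def stiffness_1d(n, h):
    """Matrix of a(u, v) = integral of u'v' on a bar with n free nodes, node 0 clamped."""
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Element joining the node stored at index i-1 (clamped node if i == 0)
        # and the node stored at index i.
        K[i][i] += 1.0 / h
        if i > 0:
            K[i - 1][i - 1] += 1.0 / h
            K[i - 1][i] -= 1.0 / h
            K[i][i - 1] -= 1.0 / h
    return K

def a_form(K, u, v):
    """Discrete bilinear form a(u, v) = u^T K v."""
    n = len(u)
    return sum(u[i] * K[i][j] * v[j] for i in range(n) for j in range(n))

def bregman_from_definition(K, u, v):
    """D_J(u, v) with J(u) = a(u, u) and gradient 2 K v, written as in the definition."""
    diff = [ui - vi for ui, vi in zip(u, v)]
    return a_form(K, u, u) - a_form(K, v, v) - 2.0 * a_form(K, v, diff)

n, h = 4, 0.25
K = stiffness_1d(n, h)
nodes = [(i + 1) * h for i in range(n)]
u = [x for x in nodes]        # linear displacement field
v = [x * x for x in nodes]    # quadratic displacement field

d_uv = bregman_from_definition(K, u, v)
d_vu = bregman_from_definition(K, v, u)
diff = [ui - vi for ui, vi in zip(u, v)]
energy_of_difference = a_form(K, diff, diff)

print(d_uv, d_vu, energy_of_difference)
```

Replacing the unit rigidity by element-wise values would give a heterogeneous medium, for which this energy-based discrepancy no longer coincides with a scaled Euclidean distance between nodal values.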

Generating functions in thermomechanics for standard generalized materials
The theory of Standard Generalized Materials (SGM) [29] is a coherent thermodynamic framework for modelling constitutive equations using only convex potentials. In addition to the usual description of the thermomechanical state by the state variables (ε, σ, T), that is (mechanical deformation, stress tensor, temperature), some internal (hidden) variables, collectively denoted by α, are introduced. Then, complementing the free energy (Gibbs energy), a function of (ε, α, T), a pseudo-potential of dissipation D appears as a function of the rates of the internal variables, $D(\dot\alpha)$, and the constitutive equations are split into two groups: the state laws, which derive from the free energy, and the complementary laws, which derive from the pseudo-potential of dissipation D.

The first group expresses the relation between dual variables (S is the entropy, A the thermodynamic force); the second group gives the evolution equation of the internal variable α and is extended to non-differentiable pseudo-potentials thanks to the notion of subdifferential: $A \in \partial D(\dot\alpha)$. This extension is necessary when modelling constitutive equations involving yield functions, which are associated with dissipation pseudo-potentials that are positively homogeneous of degree one.
For simplicity of presentation, we now limit ourselves to isothermal evolutions. The implicit Euler time-incremental form of these constitutive equations, (11), shows which (thermodynamically) conjugate couples describe the thermodynamic state. The resulting discrepancy measure reduces to the so-called Drucker mechanical error ([30], [31]) when the weighting parameter takes the value ½, that is, when the free energy and the dissipation are equally balanced in building the discrepancy measure.
For perfect isotropic elasto-plasticity, for example, the primal state is described by the couple $(\varepsilon, \varepsilon^p)$, where $\varepsilon^p$ is the plastic strain; the free energy is simply the elastic energy of the elastic strain $\varepsilon - \varepsilon^p$, and the pseudo-potential of dissipation D is a non-differentiable, positively homogeneous function of degree one. In the corresponding Symmetrized Bregman Gap, the elastic energy gap is of course retrieved when the weighting parameter is zero.

By an approach reverse to that of section 3.1, Global Symmetrized Bregman Gaps are simply built by integrating over the geometrical domain of interest Ω. It is worth noting that a direct approach on the domain Ω is also possible, following [32]. Because the Bregman Gap takes exactly the value of the Fenchel residual [25] of its generating function J, involving its conjugate J*,

$$r_F(e, p) = J^*(p) + J(e) - \langle p, e \rangle,$$

the Symmetrized Bregman Gaps are closely related to the errors in constitutive equations widely used in thermomechanics. Furthermore, it is possible to define a dual Bregman Gap with J* as generating function, with the property:

$$BG_J(e_1, [e_2, p_2]) = BG_{J^*}(p_2, [p_1, e_1]).$$

This result simply reflects the fact that conjugate thermodynamic potentials can equivalently be used for building the Bregman Gaps within the framework of Standard Generalized Materials. Nonetheless, it has to be kept in mind that even if the generating function J no longer appears in the final expression (7) or (8) of the Bregman gaps, the existence of a convex generating function is the essential condition for these expressions to be positive. In finite elasticity, for example, the Coleman-Noll error, which involves the deformation gradient F and which is exactly the expression of the Bregman Gap, is not a positive quantity. Fortunately, it is possible here to turn to a weaker notion of convexity, namely polyconvexity [33], and then to use (F, Cof F, det F) as primal variables in a correctly built, and consequently positive, Bregman Gap.
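The equality between the Bregman Gap and the Fenchel residual, and the dual Bregman Gap property stated above, can be verified on a scalar potential whose conjugate is known in closed form. The sketch below assumes J(e) = e⁴/4, for which p = e³ and J*(p) = (3/4)|p|^(4/3); this potential and the sample values are hypothetical choices, not taken from the paper.

```python
def J(e):
    """Hypothetical convex potential J(e) = e^4 / 4."""
    return e ** 4 / 4.0

def dual_variable(e):
    """p = J'(e) = e^3."""
    return e ** 3

def J_star(p):
    """Legendre-Fenchel conjugate of J: J*(p) = (3/4) |p|^(4/3)."""
    return 0.75 * abs(p) ** (4.0 / 3.0)

def bregman_gap(F, x1, x2, y2):
    """BG_F(x1, [x2, y2]) = F(x1) - F(x2) - y2 * (x1 - x2)."""
    return F(x1) - F(x2) - y2 * (x1 - x2)

def fenchel_residual(e, p):
    """r_F(e, p) = J*(p) + J(e) - p * e."""
    return J_star(p) + J(e) - p * e

e1, e2 = 1.5, -0.5
p1, p2 = dual_variable(e1), dual_variable(e2)

gap_primal = bregman_gap(J, e1, e2, p2)        # BG_J(e1, [e2, p2])
residual = fenchel_residual(e1, p2)            # Fenchel residual of the couple (e1, p2)
gap_dual = bregman_gap(J_star, p2, p1, e1)     # BG_{J*}(p2, [p1, e1])

print(gap_primal, residual, gap_dual)
```

In the dual gap the roles are exchanged: the primal quantity e1 acts as the (sub)gradient of J* at p1.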

Bregman gaps generated by bi-potentials
Unfortunately, while the SGM framework is very wide, there are still many constitutive laws that do not fit into it, such as Armstrong-Frederick hardening plasticity, Drucker-Prager elastoplasticity, Cam-Clay models, some damage models, or even Coulomb friction. An extension of the framework of Standard Generalized Materials has been proposed [34] in order to take into account these behaviors, which do not obey the normality law in the complementary equation. In this formulation, the pseudo-potential of dissipation becomes a function of both the primal variable e and its dual variable p, which justifies the name of bi-potential, or implicit potential, since the relationship between e and p is expressed by an implicit subdifferential equation. Since the monotonicity property $\langle p_1 - p_2,\, e_1 - e_2 \rangle \ge 0$ is violated for non-standard materials, it can be expected that the reduction of the Symmetrized Bregman Gap to expression (10) no longer holds for Symmetrized Bregman Gaps generated by a bi-potential. It can be shown that, for constitutive relations that do not enter the Standard Generalized Material framework but do enter the implicit potential framework, convex generating functions for Bregman gaps can be built with bi-potentials. This considerably extends the field of application of the Symmetrized Bregman Gaps.

Expression of Symmetrized Bregman Gaps for various modelling in thermomechanics
The following table gathers some Symmetrized Bregman Gaps associated with phenomena (or models) encountered in thermomechanics. Global Symmetrized Bregman Gaps are obtained by spatial integration over the domain occupied by the solid or over the interface between two contacting solids.

Table: Symmetrized Bregman Gaps for various phenomena (column headers: Phenomenon; Generating function)
The expression of Global Symmetrized Bregman Gaps can sometimes be drastically simplified by additionally using the equilibrium equation satisfied by the stress field and the compatibility of the strain tensor field. This is the case for the Drucker error (14).
This leads to a significant reduction of the computational burden.

4 Computational geometry with Bregman concepts

Various learning techniques such as SVM, clustering or partitioning, manifold learning, topological data analysis, classification, etc. rely on basic problems of computational geometry, where a distance or a metric, usually named loss function in learning applications, plays a central role. In order to use Bregman divergences or Bregman gaps as loss functions, hopefully better adapted to the processing of data structured by physics, we give below some essential geometric properties. Methods for dimension reduction such as the Proper Orthogonal Decomposition, or generalizations of it, can also benefit from more physically informed scalar products or, even more importantly, inner products on product spaces of different physical quantities [35]. Computational geometry with Bregman divergences can be challenging because of the non-symmetry of the original Bregman divergence but also, and above all, because of the violation of the triangle inequality, except for quadratic Bregman divergences (Mahalanobis distance). Nevertheless, the convex basis of the Bregman divergence leads to properties that allow almost all the operations of computational geometry, sometimes at the cost of slightly altered concepts. This section gives some geometric properties and indications on the computation of geometric quantities that are of interest for applications.

Centroid properties and Bregman balls
The first and essential geometric property of the Bregman divergence, especially for applications in clustering, is the following. It states that the mean of a finite set S of N points minimizes the sum of the Bregman divergences, or of the Bregman gaps, between the points of S and a candidate point, whatever the choice of the Bregman divergence or gap.

Property: Centroid property for Bregman divergences [16] and for Symmetrized Bregman Gaps
For any set S of N points $e_i$ in $IR^n$, and any Bregman divergence $D_J$, one has:
$$\bar{e} = \frac{1}{N}\sum_{i=1}^{N} e_i = \arg\min_{x} \sum_{i=1}^{N} D_J(e_i, x)$$
For any set S of N couples $[e_i, p_i]$ in $IR^{2n}$ of dual points with respect to any convex function J, the same property holds for the symmetrized Bregman gap $BG_J^s$: the mean of the couples minimizes the sum $C_J$ of the symmetrized Bregman gaps to the couples of S.

Proof: Using the expression of $BG_J^s$, the derivatives of $C_J$ with respect to x and y can be calculated. Equating these two derivatives to zero leads to the announced result because the symmetrized Bregman gap, hence $C_J$, is separately convex. This also shows that the minimum is unique. ■

The property extends to the probabilistic context with the expectation replacing the mean, and an exhaustiveness property [36] states that the centroid property is in fact a characterization of Bregman divergences when the minimization is extended over all probability distributions. If the centroid property holds for a positive function F such that F(x, x) = 0, then F is necessarily a Bregman divergence. By the same arguments, this can also be established for the Symmetrized Bregman Gap.

The generalization of the ball and sphere concepts, usually defined by means of a distance, is straightforward for both Bregman divergences and gaps. Nevertheless, due to the non-symmetry of $D_J$, two distinct kinds of balls or spheres appear.

Definition: Bregman balls ([37], [38]) and Bregman Gap balls
The Bregman ball $BB_J$, respectively the second type Bregman ball $BB'_J$, associated to the generating function J, with center c and pseudo-radius $\rho$, is the set:
$$BB_J(c,\rho) = \{x : D_J(x,c) \le \rho\}, \qquad BB'_J(c,\rho) = \{x : D_J(c,x) \le \rho\}$$
The Bregman Gap ball, associated to the generating function J, with center $[e_c, p_c]$ and radius $\rho$, is the set of couples $[e,p]$ such that $BG_J^s([e,p],[e_c,p_c]) \le \rho$.

When the generating function is the Euclidean norm, the usual notions of balls and spheres are recovered. As the Bregman divergences and symmetrized Bregman gaps are convex with respect to their first variable, Bregman balls and Bregman Gap balls are convex sets (whereas the second type Bregman balls $BB'_J$ are generally not).

To get some insight into the concept of Bregman balls, it is interesting to look at origin-centered balls, although these balls are not invariant by translation of their center. Recalling that the generating function can always be chosen with $J(0)=\nabla J(0)=0$, simple expressions are obtained:
$$BB_J(0,\rho) = \{x : J(x) \le \rho\}, \qquad BB'_J(0,\rho) = \{x : \langle \nabla J(x), x\rangle - J(x) \le \rho\}$$
These examples show how the physical information that one can have a priori on the data can be taken into account, with a potentially strong qualitative impact on the manipulation and comparison of these data. The concept of (convex) ball seems to be the right way of dealing with questions such as nearest neighbor determination or minimization of dissimilarity measures between physical fields. If the Bregman divergence is strictly convex, then the Smallest Enclosing Bregman Ball, in the sense of the smallest radius $\rho$, is unique, and an efficient algorithm has been given for computing it [39].
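The centroid property and the two kinds of Bregman balls can be checked numerically. The following is a minimal sketch, not the authors' implementation: the generating function (the generalized Kullback-Leibler one, $J(x)=\sum_k x_k \log x_k - x_k$), the random data and all function names are illustrative assumptions.

```python
import numpy as np

def bregman_divergence(J, grad_J, e1, e2):
    """D_J(e1, e2) = J(e1) - J(e2) - <grad J(e2), e1 - e2>."""
    return J(e1) - J(e2) - float(np.dot(grad_J(e2), e1 - e2))

# Illustrative generating function (an assumption, not taken from the text):
# J(x) = sum(x log x - x) on the positive orthant (generalized KL divergence).
def J_kl(x):
    return float(np.sum(x * np.log(x) - x))

def grad_J_kl(x):
    return np.log(x)

def sum_of_divergences(points, x, J, grad_J):
    """Cost sum_i D_J(e_i, x), minimized over the second argument x."""
    return sum(bregman_divergence(J, grad_J, e, x) for e in points)

def in_bregman_ball(x, c, rho, J, grad_J, second_type=False):
    """Membership in BB_J(c, rho) (first type) or BB'_J(c, rho) (second type)."""
    d = (bregman_divergence(J, grad_J, c, x) if second_type
         else bregman_divergence(J, grad_J, x, c))
    return d <= rho

rng = np.random.default_rng(0)
points = rng.uniform(0.5, 2.0, size=(20, 3))   # N = 20 points in IR^3
mean = points.mean(axis=0)

cost_at_mean = sum_of_divergences(points, mean, J_kl, grad_J_kl)
# Cost at randomly perturbed candidate centroids, for comparison.
perturbed_costs = [
    sum_of_divergences(points, mean + rng.normal(scale=0.05, size=3),
                       J_kl, grad_J_kl)
    for _ in range(200)
]
print("mean is the best candidate:", cost_at_mean <= min(perturbed_costs))

# Non-symmetry of D_J: the same point can lie in one kind of ball only.
c = np.array([1.0, 1.0, 1.0])
x = np.array([2.0, 1.0, 1.0])
print("first type :", in_bregman_ball(x, c, 0.35, J_kl, grad_J_kl))
print("second type:", in_bregman_ball(x, c, 0.35, J_kl, grad_J_kl, second_type=True))
```

The comparison with perturbed candidates is only a sanity check; the exact statement is that the sum of divergences at x exceeds its value at the mean by N times $D_J(\bar{e}, x)$.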

Computational geometry with Bregman concepts
This section is devoted to some basic tools in computational geometry that are commonly used in situations where a true distance or a true scalar product exists. Much work has been done with Bregman divergences in the learning community, especially for classification algorithms or for building Voronoi diagrams ([38]-[42]). This required adapting methods that rely on the triangle inequality, a property that is not satisfied by most Bregman concepts, except of course for the Mahalanobis divergences, which derive from norms (cf. section 2.1). Some results can be extended to Bregman gaps for use in thermomechanics and, more generally, in the processing of physically based data. The first useful concept is the bisector. Interestingly, bisectors are hyperplanes independently of the generating function, even when it is strongly nonlinear.

Definition: Bregman bisectors [38] and Bregman gap bisectors
The Bregman bisector of a pair $(e_1, e_2)$ is the set of equidistant points x with respect to a Bregman divergence:
$$BBs_J(e_1,e_2) = \{x : D_J(x,e_1) = D_J(x,e_2)\}$$
The Bregman gap bisector of a couple of dual quantities $([e_1,p_1],[e_2,p_2])$ is the set of equidistant points with respect to the symmetrized Bregman gap:
$$BGBs_J([e_1,p_1],[e_2,p_2]) = \{[e,p] : BG_J^s([e,p],[e_1,p_1]) = BG_J^s([e,p],[e_2,p_2])\}$$
Bisectors are hyperplanes, and the two points or couples lie on different sides of this hyperplane. For the Bregman bisector, this is because the difference $D_J(x,e_1) - D_J(x,e_2) = \langle \nabla J(e_2) - \nabla J(e_1), x\rangle + J(e_2) - J(e_1) + \langle \nabla J(e_1), e_1\rangle - \langle \nabla J(e_2), e_2\rangle$ is affine in x. The Bregman divergence being non-symmetric, there is another notion of Bregman bisector, namely of the second type [38], which is generally not a hyperplane:
$$BBs'_J(e_1,e_2) = \{x : D_J(e_1,x) = D_J(e_2,x)\}$$
Regarding the Bregman balls, the classical approach for determining an orthogonal projection on a distance-defined ball cannot be followed, because it relies strongly on the use of the triangle inequality.
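A projection on a Bregman ball can nevertheless be computed without the triangle inequality. The sketch below is an illustration under explicit assumptions, not the authors' algorithm: the projection of e is taken to be the minimizer of $D_J(x,e)$ over the first type ball $\{x : D_J(x,c)\le\rho\}$, and J is the generalized Kullback-Leibler generating function, for which $\nabla J = \log$. Under these assumptions, stationarity of the Lagrangian $D_J(x,e) + \lambda\,(D_J(x,c)-\rho)$ gives $\nabla J(x) = (1-\theta)\nabla J(e) + \theta\nabla J(c)$ with $\theta = \lambda/(1+\lambda)$, so that a bisection on $\theta$ suffices. The function names are hypothetical.

```python
import numpy as np

def kl_divergence(x, y):
    """Generalized Kullback-Leibler divergence, D_J for J(x) = sum(x log x - x)."""
    return float(np.sum(x * np.log(x / y) - x + y))

def project_on_bregman_ball(e, c, rho, tol=1e-12):
    """Assumed projection: argmin of D_J(x, e) subject to D_J(x, c) <= rho.

    The candidate x(theta) moves on the segment joining grad J(e) and
    grad J(c) in gradient space; D_J(x(theta), c) decreases with theta,
    so theta is found by bisection.
    """
    if kl_divergence(e, c) <= rho:
        return e.copy()                      # e already belongs to the ball
    lo, hi = 0.0, 1.0                        # x(lo) outside, x(hi) inside
    while hi - lo > tol:
        theta = 0.5 * (lo + hi)
        x = np.exp((1.0 - theta) * np.log(e) + theta * np.log(c))
        if kl_divergence(x, c) > rho:
            lo = theta
        else:
            hi = theta
    return np.exp((1.0 - hi) * np.log(e) + hi * np.log(c))

c = np.array([1.0, 1.0])
e = np.array([3.0, 0.5])
rho = 0.1
e_hat = project_on_bregman_ball(e, c, rho)
print("projection:", e_hat)
print("divergence to the center:", kl_divergence(e_hat, c))
```

For a quadratic J the gradient map is linear, and the same construction reduces to moving along the straight segment joining e and c.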
In the very special case where J is quadratic, that is, when $D_J$ is a Mahalanobis distance, the projection $\hat{e}$ of e lies at the intersection of the hypersphere and the line segment joining e and the center c of the ball. This is because the gradient at x is simply Ax, with A a symmetric positive definite matrix. In all other cases, the characterization of $\hat{e}$ tells us that the alignment only takes place in the gradient space.

Because the symmetric Bregman divergences are in fact true (squared) distances, and with a view to deriving efficient approximation procedures, the following definition of a subclass, relating a Bregman divergence to a symmetric one, is of interest.

Definition: μ-similar Bregman divergences [44]
A Bregman divergence $D_J$ defined on a domain K of a vector space V is μ-similar for some 0 < μ < 1 if there exists a symmetric Bregman divergence $D_Q$, generated by a function Q, such that:
$$\mu\, D_Q(e_1,e_2) \le D_J(e_1,e_2) \le D_Q(e_1,e_2) \qquad \forall e_1, e_2 \in K$$
This concept can then be used to derive approximate methods involving the Bregman divergence, or to obtain quickly a first result to be further refined, because, for example, the associated Bregman balls can be bounded by ellipsoids. As noted by [AB], most Bregman divergences that are used in practice for clustering are μ-similar when restricted to a domain X that avoids the possible singularities of J. In the thermomechanics context, however, this is not the case. As an example, the Bregman divergence generated by the function positively homogeneous of degree two
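The μ-similarity bounds defined above can be illustrated numerically. The sketch below rests on assumptions chosen for illustration only: the generalized Kullback-Leibler divergence restricted to the box $[a,b]^n$ with $0<a<b$, compared with the Mahalanobis divergence generated by $Q(x)=\frac{1}{2}x^{T}Ax$, $A = I/a$. Since the Hessian of this J is $\mathrm{diag}(1/x_k)$, with entries between $1/b$ and $1/a$ on the box, the bounds hold with $\mu = a/b$.

```python
import numpy as np

def kl_divergence(x, y):
    """Generalized Kullback-Leibler divergence, D_J for J(x) = sum(x log x - x)."""
    return float(np.sum(x * np.log(x / y) - x + y))

def mahalanobis_divergence(x, y, A):
    """Symmetric Bregman divergence generated by Q(x) = 0.5 x^T A x."""
    d = x - y
    return 0.5 * float(d @ A @ d)

# Illustrative box [a, b]^n, chosen to avoid the singularity of J at 0.
a, b, n = 0.5, 4.0, 3
A = np.eye(n) / a          # upper-bounding quadratic form
mu = a / b                 # similarity constant for this box

rng = np.random.default_rng(1)
pairs = rng.uniform(a, b, size=(1000, 2, n))
ratios = [kl_divergence(x, y) / mahalanobis_divergence(x, y, A)
          for x, y in pairs]

# mu * D_Q <= D_J <= D_Q is equivalent to mu <= D_J / D_Q <= 1.
print("mu =", mu, " min ratio =", min(ratios), " max ratio =", max(ratios))
```

The constant μ degrades as the box approaches the singularity (a tending to 0), which is why the restriction of the domain matters in the definition.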

Conclusion
In this paper, we try to extend the concept of Bregman divergence to thermomechanics, and we show how the large domain covered by the convex framework allows discrepancy measures to be built systematically between fields of thermodynamic variables, including couples of dual ones. Most of the already existing "errors" or "gaps" can then be encompassed. This gives rise to discrepancy measures between variables with different physical dimensions, encountered especially in multi-physics applications, which makes truly coupled approaches possible. We also give some useful properties related to these concepts for geometrical computation, which is at the heart of the algorithms used or already envisioned for the processing of massive data produced by experiments, computations or in-service measurements.
Annex A: Proof of the characterization of symmetric Bregman divergences
The condition for symmetry of $D_J$ reads, for the generating convex function J (written here for a differentiable J):
$$D_J(e_1,e_2) = D_J(e_2,e_1) \iff 2\,[J(e_1) - J(e_2)] = \langle \nabla J(e_1) + \nabla J(e_2),\, e_1 - e_2\rangle$$
Note that the condition on J for the symmetry of $D_J$ ensures recursively that $f_e$ is infinitely directionally differentiable in each direction e. This result also shows that J is necessarily positively homogeneous of degree 2. In addition, returning to the equation characterizing the symmetry of $D_J$, and using $e_1 = -e_2 = e$, one establishes that k enjoys the symmetry $k(-e)=k(e)$ because $J(-e)=J(e)$. The strictly convex generating functions of symmetric Bregman divergences are then parabolic functions along each line crossing the origin.
Step 2: Going back to the condition for symmetry of $D_J$, we have for any $(e_1, e_2)$: We are now led to show that the function f = F − G is negative along the segment [0,1]. But f(0)=0, so that we can consider the derivative of f, which turns out to be expressed by:

Fig.1: a) Error functions for the identification of , b) Error functions for the identification of  0 0 , 8 25 , 0.1 , 50 C    If the generating function J is differentiable at point e2, then the Bregman divergence reduces to:

Fig. 3: Geometrical interpretation of the Bregman divergence in IR for a generating function that is non-differentiable at e=0
The choice retained within the subdifferential of J at e2 does not, however, guarantee the directional continuity of the Bregman divergence around the point of non-differentiability e2. Consider for example
2 definitions have been given: the Jensen-Bregman divergence [21]:

Fig. 3: Geometric interpretation of the Symmetrized Bregman divergence in IR for differentiable generating functions
If the generating function of the Bregman divergence is differentiable, then the symmetrized Bregman divergence takes a very simple form, which can be used to calculate it:
divergence, although symmetric by construction, still does not generically possess the triangle property; nevertheless, it can be shown that the violation of the triangle inequality is limited by a quantity related to the Hessian bounds [23]:
It is worth noting that the symmetrized Bregman divergence, as a function of (e1,e2), is generally not separately convex. This can be illustrated by the simple counter-example of the generating function $J(e) = e - \sin e$ on $[0,\pi]$. The symmetrized divergence with the origin is not convex over the entire segment $[0,\pi]$. The same counter-example applies to the Jensen-Bregman divergence (5) and to the Chen et al. symmetrized divergence (6) as well.
energy of the difference of the stress tensors 1 2

Fig. 4: Original function J (dotted line) and its biconjugate (solid line) that can be used as generating function for a Bregman divergence and associated concepts
However, the separation property of the generated Bregman divergence fails in the interphase segment [-1,1].
In applications, the dual quantities p, a gradient or a sub-gradient, are often prone to errors, coming either from computations or from measurements. If an estimate of the approximation level is known,

Bi-potentials [34]
Let e and p be two vectors of variables in duality. A function b of $IR^n \times IR^n$ with values in $]-\infty, +\infty]$ is a bi-potential for (e,p) if and only if:
i) b is separately convex and lower semi-continuous for e and p
ii) b satisfies the generalized Fenchel inequality: $b(e,p) \ge \langle e, p\rangle$ for all (e,p)
The material models defined using a bi-potential for the dissipation pseudo-potential are called Implicit Generalized Standard Materials. Indeed, in the particular case where b is a separate variables function, g and f are mutually conjugate. Therefore, Generalized Standard Materials are a special case of Implicit Generalized Standard Materials for which the bi-potential takes the special form:
bi-potentials can be used as generating functions for a Bregman Gap provided that the definition of the latter is adapted as follows.

Definition: Bregman Gap generated by a bi-potential
The Bregman gap generated by the bi-potential b(e,p) is defined by the positive quantity:
This definition generalizes the initial definition of Bregman gaps: if the bi-potential is a separate variables function, then the genuine definition is retrieved. Similarly, Symmetrized Bregman gaps can be derived.

Definition: Symmetrized Bregman Gap generated by a bi-potential
Let b be a bi-potential, the Symmetrized Bregman Gap generated by b is:
As soon as the bi-potential b is (separately) strictly convex, nullity of the Symmetrized Bregman Gap generated by b implies the equality between primal state variables e1 and e2 and the associated dual variables p1 and p2. However, the normality law, characteristic of Generalized Standard Materials, being deduced from the inequality

Equating to zero the two derivatives leads to the announced result because the symmetrized Bregman gap, hence $C_J$, is separately convex. This also shows that the minimum is unique. ■
The property extends to the probabilistic context with the expectation replacing the mean,
function is the Euclidean norm, the usual notions of balls and spheres are recovered. As the Bregman divergences and symmetrized Bregman gaps are convex with respect to their first variable, Bregman balls and Bregman Gap balls are convex sets (whereas the second type Bregman balls
of these different balls are displayed on figure 4 for various generating functions.

Figure 6: Bregman balls: a) BB' and BB for the Itakura-Saito and Kullback-Leibler Bregman divergences, from [NN1]; b) Symmetrized Bregman Gap origin-centered balls generated by the elastic potential for an isotropic elastic material with different moduli in traction and compression
bisector of a couple of dual quantities ([e1, p1], [e2, p2]) is the set of equidistant points x with respect to the symmetrized Bregman gap:
Let e be a point in $IR^n$ and $BB_J(c,\rho)$ a Bregman ball of center c and radius $\rho$, with e not belonging to the closure of $BB_J(c,\rho)$. The projection $\hat{e}$ on $BB_J(c,\rho)$ is:
Because the ball is strictly convex, the projection lies on the boundary of $BB_J(c,\rho)$, or hypersphere. $\hat{e}$ can then be determined by the stationary point $(\hat{e}, \hat{\lambda})$ of the following Lagrangian:
necessarily a quadratic function, as shown previously, so that the Bregman divergence generated by Q is a Mahalanobis distance. μ-similar Bregman divergences enjoy the following properties: they are approximately symmetric and satisfy a variant of the double triangle inequality [44]:


and m=a/b.But the Bregman divergence generated by the function Consider an orthonormal basis (Ej)j=1,N, then for any N-uplets (j)in IR N and any index 1 ≤ j ≤ N: strictly negative for  in ]0, 1[ as soon as if C=0.

sets. There exists a Bregman projection onto convex sets C with uniqueness if J is differentiable at e.
This expression makes particular sense in a thermodynamic framework where the thermodynamic potentials (cf. part 3) are used as generating functions: coherently, dual potentials are in correspondence in the Bregman divergence through the dual variables. Because the biconjugate J** of any function J is convex (like its conjugate J*), one can use it in order to derive a Bregman divergence for a nonconvex function. And then we have again (because J***=J*):
D as the generating function, where $\alpha$ is a non-dimensional weighting factor between the free energy potential and the dissipation pseudo-potential ($\alpha \in [0,1]$, and $\alpha < 1$ if D is homogeneous of degree one, to preserve strict convexity). This leads to the following Symmetrized Bregman Gap on incremental couples of dual state variables [e, p], with
Proof of separate convexity of the Bregman Gap
The following inequality has to be established.