
Euclidean
Question: How is it possible for different events in different places to be simultaneous when even light takes some time to travel from one to the other (Entanglement)?
Answer: Let's first recall the 3D Pythagorean theorem, then the 4D, and the answer to the question will largely come by itself.
The picture on the right shows a cuboid. It is a right four-sided prism with 8 vertices, three mutually perpendicular edges at each vertex, and an envelope of six rectangles, congruent in opposite pairs.
In the rectangle ABCD the square of the diagonal is BD^{2} = AB^{2} + AD^{2}, or, what is the same, w^{2} = x^{2} + y^{2}, according to the marks in the picture. The rectangle BB'D'D again gives the square of a diagonal, l^{2} = w^{2} + z^{2}, which is l^{2} = x^{2} + y^{2} + z^{2} expressed by the edges of the cuboid. That is the 3D Pythagorean theorem.
Let us now imagine the same cuboid in the course of time t and light crossing the path l with the speed c. The series of positions of light on that path defines the 4D spacetime of Minkowski, an abstract space of events, that is, a series of events of the light. A photon that waves, not to say ticks, along the path DB' is a particle outside of time; for it, time stands still.
That is why we redefine the 4D Pythagorean theorem as s^{2} = x^{2} + y^{2} + z^{2} − (ct)^{2}, with an interval (s = 0) of zero length if we are talking about the mentioned photon on its path. Then we look at small pieces of such paths:
ds^{2} = dx^{2} + dy^{2} + dz^{2} − (c dt)^{2},
and further at even smaller, infinitesimal paths. This is the "interval" between events of special relativity.
An event at place D at time t = 0 is at the "same position" as the event at B' at time t = l/c. Their distance measured by the relativistic interval is zero. Similarly, the events at places A and C' are at zero distance, as are D' and B, and A' and C, respectively.
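The zero interval of such a light-like path can be checked numerically. Below is a minimal Python sketch; the cuboid edges of 1, 2, 2 meters are an illustrative assumption (giving the diagonal l = 3), and `interval_sq` is a hypothetical helper name, not from the text.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def interval_sq(dx, dy, dz, dt):
    """Squared spacetime interval s^2 = dx^2 + dy^2 + dz^2 - (c dt)^2."""
    return dx*dx + dy*dy + dz*dz - (C*dt)**2

# Cuboid with edges 1, 2, 2 (meters): diagonal l = 3.
l = math.sqrt(1**2 + 2**2 + 2**2)
t = l / C  # the time light needs to cross the diagonal D -> B'

print(interval_sq(1, 2, 2, t))  # effectively zero: a light-like (null) interval
print(interval_sq(0, 0, 0, 1))  # negative: a purely time-like separation
```

For the photon the interval vanishes (up to floating-point rounding), while a purely time-like separation gives a negative square, exactly as the redefined 4D theorem above requires.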
I agree that it would be (mathematically) unacceptable if we did not consider light to be timeless while at the same time understanding the time of its travel as relative, added by the observer, that is, as the observer's experience.
Pseudo-Euclidean
Question: Good, let that stand as the answer, but I will rephrase the first question: how is it possible for a "pseudosphere" of virtual photons to be larger and larger and at the same time simultaneous (Entanglement)?
Answer: We cannot be sure that the two photons traveling along the diagonals DB' and AC' of the cuboid in the picture from the previous answer leave and arrive at the same time, as it would appear at first glance, because the events of their journey over the time Δt = l/c, simultaneous from the point of view of one observer, are not simultaneous for many other observers in the same 4-dim Minkowski spacetime.
However, a "time" perpendicular to the 3-dimensional physical space is mathematically conceivable, and perhaps such dimensions are real, in which all of that space is "simultaneous". It is therefore logical, and non-contradictory because there are algebras of such vector spaces, and hence we have the right and obligation to consider it seriously in the field of physics as well. As for probability theory, or information, that is no longer contested territory.
It is possible to construct, let us call them, time dimensions that would be orthogonal to the physical 4-dim spacetime. They would be independent of the different observers of our continuum and, like the linear independence of vectors, would retain the property of "simultaneity" with respect to any of them. Such "pseudospheres" would indeed be "virtual" (like the photons of Feynman diagrams), propagating from a single charge (electron) until an eventual interaction with another occurs.
In this way, we can also understand the quantum entanglement from the question given to me, i.e. "spooky action at a distance" — which is instantaneous in relation to any observer, and at the same time without transmission of information.
Vectors
Question: You say "vectors", what do they have to do with "dimensional geometry"?
Answer: Vector spaces are an example and demonstration of the logical possibility of multiple dimensions. They are almost everywhere around us in applications.
In school, we first learn vectors as "oriented segments". In the picture on the right, we see them as "arrows" that represent the same vector when they have the same direction (they are parallel), the same orientation (left-right, or up-down), and the same intensity (they are of equal length). So, we can move them in parallel (translate them) and add them by concatenation.
If at the common origin of the vectors a and b in the picture there were a pillar that could measure the direction, orientation, and intensity of the force acting on it, and these vectors interpreted those forces, then the combined action of the two vectors would result in the diagonal vector of the shown parallelogram. This is why we often say that vectors are added along the "parallelogram of forces".
However, the vectors a and b can also represent (by direction, orientation, intensity) velocities, say of a calm river in relation to the shore and of a boat in relation to the water of the river. The resultant a + b then represents the velocity vector of the boat relative to the shore. Note that the commutative law (a + b = b + a) applies to vectors, which seems even more important for the addition of these velocities than for the previous forces. The associative law also applies to three vectors, (a + b) + c = a + (b + c), which is easy to check with the corresponding picture in the case of oriented segments.
Extending a vector by itself is multiplication by a whole number, and the proportional increase or decrease of a vector is its multiplication by a real number greater or less than one. Multiplying by a negative number also means changing the orientation of the vector. We can choose the type of numbers for multiplying the given vectors, and once we do that, we call those numbers scalars.
Scalars must form a mathematical structure that we call a "body" (German: Körper), after the German mathematician Dedekind, who was the first to describe this structure, or a "field"; examples are the set of real numbers ℝ and the set of complex numbers ℂ, and with such a choice the vector space is called real or complex, respectively. Often, a given vector space and its associated scalar body are denoted separately, say X and Φ.
This simplifies the writing of (only) the four axioms that define the vector space:
 α(x + y) = αx + αy,
 (α + β)x = αx + βx,
 α(βx) = (αβ)x,
 1⋅x = x,
where α, β ∈ Φ are arbitrary scalars and x, y ∈ X arbitrary vectors, as well as the recording of various consequences.
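The four axioms above can be verified directly for ordered triples of real numbers. A minimal Python sketch follows; the helper names `add` and `mul` and the sample values are assumptions chosen for illustration.

```python
# Vectors as 3-tuples of real numbers, scalars as floats.
def add(x, y):
    """Componentwise vector addition."""
    return tuple(a + b for a, b in zip(x, y))

def mul(alpha, x):
    """Multiplication of a vector by a scalar."""
    return tuple(alpha * a for a in x)

x, y = (1.0, 2.0, 3.0), (4.0, -5.0, 0.5)
alpha, beta = 2.0, -3.0

assert mul(alpha, add(x, y)) == add(mul(alpha, x), mul(alpha, y))  # alpha(x+y) = alpha x + alpha y
assert mul(alpha + beta, x) == add(mul(alpha, x), mul(beta, x))    # (alpha+beta)x = alpha x + beta x
assert mul(alpha, mul(beta, x)) == mul(alpha * beta, x)            # alpha(beta x) = (alpha beta)x
assert mul(1.0, x) == x                                            # 1*x = x
print("all four axioms hold for these triples")
```

The same four checks pass for tuples of any length, which is the point of the coordinate representation discussed next.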
Using these axioms, it is easy to prove that the points of the Cartesian rectangular coordinate system Oxyz can be represented as vectors. For example, the point A(a_{x}, a_{y}, a_{z}) as an oriented segment from the origin O to the endpoint A, and also as a vector a = (a_{x}, a_{y}, a_{z}). Then, in general, ordered sequences similar to these coordinates can be viewed as vectors. The number of mutually perpendicular coordinate axes of the Cartesian system is called the number of dimensions, and to it corresponds the length of the sequences, that is, the number of components of the associated vectors.
There is an isomorphism, a mutual one-to-one mapping that preserves properties, between the objects of classical and of analytic geometry, and hence the term "vector dimensions" carries over to "geometric dimensions". More precisely, an isomorphism is a mapping of a common structure between two mathematical forms of the same type that can be reversed by an inverse mapping. Only one step further is the application of these spaces in physics.
Examples
Question: What can vectors be?
Answer: In the picture on the left is the vector a = (x, y, z), an oriented segment from the point O to the point A = (x, y, z) in a system of three coordinates. The sum of two such vectors is:
a_{1} + a_{2} = (x_{1}, y_{1}, z_{1}) + (x_{2}, y_{2}, z_{2})
= (x_{1} + x_{2}, y_{1} + y_{2}, z_{1} + z_{2})
= (x, y, z) = a,
where the components are added:
x = x_{1} + x_{2}, y = y_{1} + y_{2}, z = z_{1} + z_{2}.
However, the real numbers themselves are also vectors, with a suitable choice of scalars, and so are the complex numbers. Ordered strings of the n + 1 coefficients a = (a_{0}, a_{1}, ..., a_{n}) of polynomials f(x) = a_{0} + a_{1}x + ... + a_{n}x^{n} of the given n-th degree form an (n + 1)-dimensional vector space. The set of all real continuous functions on the segment [α, β], denoted C(α, β), is also a vector space, where λx (for a real parameter λ) represents the function y(t) = λx(t), while z = x + y means z(t) = x(t) + y(t).
Solutions of the differential equation p_{0}(t)y^{(n)} + p_{1}(t)y^{(n−1)} + ... + p_{n}(t)y = 0 form a vector space, with the usual addition and multiplication. The set of solutions u = u(x, y) of the string vibration equation a^{2}∂^{2}u/∂x^{2} = ∂^{2}u/∂y^{2}, also with the usual addition and multiplication, is a vector space. Therefore, waves behave as vectors, and the amplitudes of interfering waves add up as vectors.
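That polynomials add like their coefficient vectors can be sketched in a few lines of Python; `poly_eval` and `poly_add` are hypothetical helper names, and the sample polynomials are arbitrary choices for illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate f(x) = a0 + a1 x + ... + an x^n for coeffs (a0, ..., an)."""
    return sum(a * x**k for k, a in enumerate(coeffs))

def poly_add(p, q):
    """Adding polynomials = componentwise addition of coefficient vectors."""
    return tuple(a + b for a, b in zip(p, q))

p = (5.0, -3.0, 1.0)   # x^2 - 3x + 5
q = (1.0, 2.0, 0.0)    # 2x + 1
s = poly_add(p, q)     # x^2 - x + 6

x = 2.0
print(poly_eval(s, x), poly_eval(p, x) + poly_eval(q, x))  # the same number
```

Evaluating the sum of the coefficient vectors gives the same result as adding the evaluated polynomials, which is exactly the vector-space structure claimed above.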
The solutions of the Schrödinger equation (a partial differential equation of the second order) also form a vector space. It is the space of wave functions, whose physical representations are quantum states. However, linear operators, whose representations are the processes acting on quantum states, also form vector spaces. There is a special connection between processes (operators) and the states (vectors) on which they act, so we say that the former (vector spaces of operators) are dual to the latter.
These are well-known examples of the algebra of vector spaces, among many others, and there is no need to repeat their individual proofs here; you can find them in regular mathematics classes of that level.
Dot Product
Question: Those vectors look powerful, but what do they have to do with "perception information"?
Answer: Let's look at the Cartesian 2D rectangular coordinate system Oxy in the picture on the right and, in it, two vectors a = (a_{x}, a_{y}) and b = (b_{x}, b_{y}) which span the angle ∠AOB = φ. The projection of the vector b onto the vector a, with B → B', is OB' = b cos φ, and conversely, the projection of the vector a onto b would be a cos φ, where a = |a| and b = |b| are the intensities.
The scalar product of two vectors is defined as the product of the projection of the first onto the second and the intensity of the second (and equally the other way around):
a⋅b = ab cos φ = a_{x}b_{x} + a_{y}b_{y}.
The second equality follows from the consideration of perpendicular vectors. Namely, let e_{x}, e_{y} be the unit vectors of the coordinate axes, the so-called orts. Then e_{x}⋅e_{x} = 1⋅1⋅cos 0^{o} = 1, e_{x}⋅e_{y} = 1⋅1⋅cos 90^{o} = 0, e_{y}⋅e_{y} = 1, e_{y}⋅e_{x} = 0. However:
a⋅b = (a_{x}e_{x} + a_{y}e_{y})⋅(b_{x}e_{x} + b_{y}e_{y}) = a_{x}b_{x} + a_{y}b_{y}.
Consequently, by multiplying the vectors by themselves, we find the intensities:
a = √(a⋅a) = √(a_{x}^{2} + a_{y}^{2}), b = √(b⋅b) = √(b_{x}^{2} + b_{y}^{2}).
The "powerful vectors", as you say, now give a "powerful multiplication" that turns vectors into scalars and, in real representations, gives the "information of perception".
For example, by reducing the angle φ the vectors get closer, the projection of the first on the second lengthens and their (scalar) product becomes larger. And with a good interpretation, it becomes the interpretation of "adaptation", the mutual adjustment of perceiving subjects, which, like searching for a "key to a lock", becomes the way to "emergence". With that formalism, we actually explain part of the spontaneous processes of reducing the total information of an individual.
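The two expressions for the scalar product, through the angle and through the components, can be compared in a short Python sketch; the vector values and the helper name `dot` are illustrative assumptions.

```python
import math

def dot(a, b):
    """Componentwise scalar product a.b = ax*bx + ay*by."""
    return sum(ai * bi for ai, bi in zip(a, b))

a = (3.0, 0.0)
b = (2.0, 2.0)

# Via intensities and the spanned angle: a.b = |a||b| cos(phi)
ia = math.sqrt(dot(a, a))
ib = math.sqrt(dot(b, b))
phi = math.atan2(b[1], b[0]) - math.atan2(a[1], a[0])

print(ia * ib * math.cos(phi))  # approximately 6.0
print(dot(a, b))                # 6.0
```

Shrinking the angle φ toward zero drives cos φ toward 1 and the product toward its maximum |a||b|, which is the "adaptation" effect described above.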
Inner Product
Question: Some scalar products (Dot Product) are not perceptual information, you say, and yet what if that information is the "structure of the world"?
Answer: Many interpretations of perceptual information are very non-obvious to us. Apparently there are none, and that is why this theory was not noticed earlier, neither in the mechanics of Archimedes, nor in the power of teamwork, nor in the nerve of emotions. But at the root of such phenomena are actually its laws, the laws of the information of perception.
Next, the picture on the left shows a Venn diagram of the mathematics I am recounting here. A small part of it is the "inner product space" with the multiplication x⋅y = ❬x, y❭. A broader theory than that is the "normed spaces", ∥x∥ = √❬x, x❭. The norm defines the metric d(x, y) = ∥x − y∥, but not the discrete or some other exotic ones. In fact, there are also differences between the "scalar spaces" (or "inner product spaces") of different authors, which become standards, but we will not deal with those details here.
"Vector" and "metric" spaces also differ in detail, so the inclusions from the diagram are correct. However, there are other frameworks, say topological spaces, which are metricfree at. Shapes, or graphs, for example, are mathematical forms that conceptually are not valued, not studied quantities. However, they are all information.
"Unitary space" is the vector space X over the field Φ of real ℝ or complex numbers ℂ, on which is given the inner (scalar) product of vectors satisfying the following axioms for all vectors x, y, z ∈ X and scalars λ ∈ Φ:
 ❬x, y❭ = ❬y, x❭^{*}
 ❬λx, y❭ = λ❬x, y❭
 ❬x + y, z❭ = ❬x, z❭ + ❬y, z❭
 if x ≠ 0, then ❬x, x❭ > 0,
where z^{*} = a − ib is the conjugate of the complex number z = a + ib ∈ ℂ, for a, b ∈ ℝ and the imaginary unit i^{2} = −1. Note that the product ❬x, y❭ ∈ Φ is therefore, in general, a complex number in a complex space. The fourth axiom states that the scalar square of a nonzero vector is a positive real number. A real unitary space is also called a Euclidean vector space.
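A short Python sketch can confirm these four axioms for the standard inner product on ℂ², ❬x, y❭ = ∑_{k} x_{k}y_{k}^{*}, taken linear in the first argument as in the text; the sample vectors and the scalar are arbitrary assumptions.

```python
# Standard inner product on C^2, linear in the FIRST argument (as in the text).
def ip(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = (1 + 2j, 3 - 1j)
y = (0.5 - 1j, 2 + 2j)
z = (-1 + 0j, 1 + 1j)
lam = 2 - 3j

assert ip(x, y) == ip(y, x).conjugate()                                  # axiom 1
assert ip(tuple(lam * a for a in x), y) == lam * ip(x, y)                # axiom 2
assert ip(tuple(a + b for a, b in zip(x, y)), z) == ip(x, z) + ip(y, z)  # axiom 3
assert ip(x, x).imag == 0 and ip(x, x).real > 0                          # axiom 4
print("unitary-space axioms hold for this inner product")
```

Note that conventions differ: some authors make the inner product linear in the second argument instead, one of the "differences between authors" mentioned earlier.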
Due to the 2nd and 3rd axioms, the product ❬x, y❭ is a linear functional (functional analysis) in the first argument, ❬αx + βy, z❭ = α❬x, z❭ + β❬y, z❭, and also a so-called antilinear functional in the second argument, ❬z, αx + βy❭ = ❬αx + βy, z❭^{*} = α^{*}❬z, x❭ + β^{*}❬z, y❭.
For example, a vector space with the "ordinary" scalar product (Dot Product) is a type of unitary space. Namely, if X is the space of 3-dim directed segments, in the way we demonstrated it in 2-dim, in the rectangular coordinate system Oxyz for the vectors a = (a_{x}, a_{y}, a_{z}) and b = (b_{x}, b_{y}, b_{z}) we get a⋅b = a_{x}b_{x} + a_{y}b_{y} + a_{z}b_{z}. In general, the analogous n-dim vector space X of oriented segments is real unitary, with the inner product ❬a, b❭ = ∑_{k} a_{k}b_{k}, where the sum runs over all indices k = 1, 2, ..., n.
The theory of unitary spaces was the framework for the book "Information of Perception", and the second framework was its well-known application in physics presented in the book "Quantum Mechanics". However, a lot remains unfinished, both then and now.
Operators
Question: Operators are vectors?
Answer: Yes, it is obvious that the four axioms of vector spaces (Vectors) hold for linear operators. In general, they are functions, mappings that maintain homogeneity and additivity for all vectors on which they act and their corresponding scalars, respectively A(λx) = λA(x) and A(x + y) = A(x) + A(y). In short, these properties are called linearity, defined by A(αx + βy) = αA(x) + βA(y), for every α, β ∈ Φ and x, y ∈ X.
This universality of operators and of the vectors in their domain allows us to use similar notations for both, except for special needs that will be specifically emphasized. A special case is, say, the product ❬x, y❭ as a linear functional in the first argument, which confirms the 2nd and 3rd axioms of unitary spaces (Inner Product).
The implication of this sameness is the same treatment of "processes" as "states", already recognized in quantum mechanics and typical of "information theory" (mine, unofficial). Therefore, after the proof of at least six dimensions of the universe, we may place in them a vector x = (x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}), where x_{k} = i_{k}ct_{k}, the imaginary units (i_{k}^{2} = −1) are defined by quaternions and can be mutually different, and c is the speed of light in a vacuum. Such is the quaternion q = iσ, where σ is a Pauli matrix (Quantum Mechanics).
The interesting thing about this idea is that we can take any four of the six coordinates of the universe, three of which we will consider as spatial, r = (x, y, z), and the fourth as the time ict. It is, of course, only a theory for now, but it is logically indisputable at the level of unitary spaces. Either way, the unification of states and changes will mean the inseparability of the concepts of space and time, no matter how physics develops further.
Invariant
Question: What happens in oblique coordinate systems?
Answer: Let φ = ∠xOy be the angle between the abscissa and the ordinate (the Ox and Oy axes, respectively) in the picture on the left, and let the angle between the abscissa and the vector a from O to the point A be α = ∠xOA. I will use the script "Notes II", its part "15. Variant Vectors", here briefly.
If e_{x} and e_{y} are the unit vectors of these axes, then we write the vector of the given point A(x, y) as a = xe_{x} + ye_{y}. These are the "covariant" coordinates. And when we write the same point by perpendicular projections onto the given axes, A = A'(x', y'), then we work with "contravariant" coordinates. In general, however, we denote covariant and contravariant coordinates with subscripts and superscripts, for example A(a_{1}, a_{2}) and A'(a^{1}, a^{2}). When working with matrices, the covariant types are rows, and the contravariant are columns.
The cosine theorem gives a^{2} = x^{2} + y^{2} + 2xy cos φ for the distance of the point from the origin in covariant marks in the picture, where a = OA = |a| = ∥a∥, depending on the chosen way of writing. Considering the transformations of these coordinates:
x' = x + y cos φ, y' = x cos φ + y, x = (x' − y' cos φ)/sin^{2}φ, y = (y' − x' cos φ)/sin^{2}φ,
the square of the same length is not written in the same form in contravariant coordinates alone. But a^{2} = xx' + yy', which is elaborated extensively for 2D coordinate systems in the aforementioned script.
Moreover, after rotation by the angle θ of a given oblique coordinate system around its origin, the values of these coordinates change, but the square of the distance remains the same form (a^{2} = xx' + yy'). The same sense of co or contravariance remains because rotation is an isometric transformation (it does not change the distances between points), so the distance of point A from the origin O is also unchanged. Details are in the linked script.
So, in oblique coordinate systems there are "invariances" in writing distances and scalar products by co- and contravariant coordinates. In general, in an n-dimensional oblique system of coordinates, we write the same point in co- and contravariant form:
OA^{2} = a^{2} = a_{1}a^{1} + a_{2}a^{2} + ... + a_{n}a^{n}.
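The invariant a^{2} = xx' + yy' can be verified numerically. The sketch below assumes the 2D transformation x' = x + y cos φ, y' = x cos φ + y between covariant and contravariant coordinates, as reconstructed from the script's setup, with arbitrary illustrative values.

```python
import math

# Oblique 2D system with angle phi between the axes.
# Covariant (x, y): coefficients in a = x e_x + y e_y along the axes.
# Contravariant (x', y'): perpendicular projections onto the axes.
def contravariant(x, y, phi):
    return x + y * math.cos(phi), x * math.cos(phi) + y

def length_sq_cos(x, y, phi):
    """Cosine theorem: a^2 = x^2 + y^2 + 2xy cos(phi)."""
    return x*x + y*y + 2*x*y*math.cos(phi)

def length_sq_mixed(x, y, phi):
    """Invariant mixed form: a^2 = x x' + y y'."""
    xp, yp = contravariant(x, y, phi)
    return x * xp + y * yp

phi = math.radians(60)
print(length_sq_cos(2, 3, phi))    # approximately 19.0
print(length_sq_mixed(2, 3, phi))  # the same value
```

Both expressions agree for any angle φ, which is the invariance of the mixed co- and contravariant writing of the distance.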
This knowledge also gives additional meaning to the "information of perception". Namely, if the factors in the summation of these "perceptions" are co- and contravariant coordinates, then they express some inherent and unchanging properties. These are the properties of "self-coupling".
Basis
Question: How does a matrix represent a linear operator?
Answer: That is an elementary question. The choice of the basis is important, as is the way the operator acts on the basis vectors. We now work with regular transformations of n-dimensional spaces, which means that the sequence e_{1}, ..., e_{n} of the first basis goes into the sequence e'_{1}, ..., e'_{n} of vectors of the second basis of the same space. In the picture on the left is a general matrix representation of a mapping A : v → u of some 3-dim space.
1. The basis vectors are not necessarily mutually perpendicular, but they are mutually independent within each of the bases; we usually reduce them to unit vectors and take enough of them to represent any vector of the given space. So it is, respectively, for the indices k, l = 1, 2, 3 of the vectors from the image, and hence the sum shown there.
When the indices run j, k, l = 1, 2, ..., n, the same applies. That sum tells us everything important about matrix representations of linear operators. You can write out the matrix equation e' = Be that transforms the bases yourself, and so understand the ways of multiplying a vector by a matrix, as well as a matrix by a matrix, say from the notation u = ABe.
2. The continuation of the story is the eigenvalues (λ) and corresponding eigenvectors (x) of the linear operator (A), such that:
(A − λI)x = 0,
det(A − λI) = 0,
λ^{n} + p_{1}λ^{n−1} + ... + p_{n−1}λ + p_{n} = 0.
This is a polynomial equation of n-th degree in λ, the unknown eigenvalue, with solutions λ_{1}, ..., λ_{n}. I remind you, a regular operator was assumed, so the matrix is invertible, and these solutions are distinct. Each of the eigenvalues (λ_{k}) yields an eigenvector (x_{k}) satisfying the initial characteristic equality (Ax_{k} = λ_{k}x_{k}).
3. The picture shows an example of a symmetric matrix of the third order (3×3) with three eigenvalues λ_{k} ∈ {3, 6, 7} and their corresponding vectors x_{k}. In general, for any matrix, the eigenvectors are not necessarily orthogonal. But, with symmetric matrices, the eigenvalues are always real and the corresponding eigenvectors are mutually perpendicular, i.e. scalar products of different are zero (x_{j}⋅x_{k} = 0, if j ≠ k).
Let's try to understand it using vector invariance (Invariant). When the matrix is symmetric, it is equal to its transpose (columns replaced by rows), so covariant multiplication on the left of that matrix gives the same result as contravariant multiplication on the right, and both result in the invariant square of the intensity, a real number.
The eigenvectors associated with different eigenvalues are linearly independent, so they form the columns of matrices that span the given space. In particular, from Ax'_{1} = λ_{1}x'_{1} and x_{2}A = λ_{2}x_{2} with different lambdas, multiplying the first equation by x_{2} on the left and the second by x'_{1} on the right gives λ_{1}x_{2}x'_{1} = λ_{2}x_{2}x'_{1}, because for a symmetric matrix both left sides equal x_{2}Ax'_{1}. Since λ_{1} ≠ λ_{2}, this is possible only if the product of the vectors is zero, that is, when they are mutually perpendicular.
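For a 2×2 symmetric matrix this orthogonality is easy to check by hand. The following Python sketch solves the characteristic equation directly; the matrix [[2, 1], [1, 2]] is an illustrative assumption, not the example from the picture.

```python
import math

# Symmetric 2x2 matrix [[a, b], [b, d]]: eigenvalues are the roots of the
# characteristic polynomial lambda^2 - (a + d) lambda + (ad - b^2) = 0.
def eig_sym2(a, b, d):
    tr, det = a + d, a*d - b*b
    disc = math.sqrt(tr*tr - 4*det)       # non-negative for symmetric matrices
    l1, l2 = (tr + disc) / 2, (tr - disc) / 2
    # An eigenvector for lambda solves (a - lambda)x + b y = 0, i.e. (b, lambda - a).
    v1 = (b, l1 - a) if b else (1.0, 0.0)
    v2 = (b, l2 - a) if b else (0.0, 1.0)
    return (l1, v1), (l2, v2)

(l1, v1), (l2, v2) = eig_sym2(2.0, 1.0, 2.0)
print(l1, l2)                        # 3.0 1.0
print(v1[0]*v2[0] + v1[1]*v2[1])     # 0.0: the eigenvectors are perpendicular
```

Both eigenvalues come out real and the scalar product of the two eigenvectors is zero, as the argument with the symmetric matrix predicts.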
4. Even simpler than this, I wrote earlier about eigenvalues (Eigenvalue), and now is the opportunity to add to those explanations something about the symmetric matrix. In linear algebra, a real symmetric matrix represents a self-adjoint operator in an orthonormal basis of a real inner (scalar) product space. The corresponding Hermitian matrix, equal to its conjugate transpose, serves for a complex space, and then the covariant vector is the transposed and conjugated contravariant one.
5. In information transmission, the symmetry (a_{jk} = a_{kj}) of the channel matrix A means an equal conditional probability of transmitting the j-th into the k-th signal as of the reverse, k → j, for each pair (j, k). That is why, even when raising such a matrix to powers A^{n}, as the exponent n = 1, 2, 3, ... increases, the symmetry remains, as do the distribution vectors (of independent, complete outcomes) of increasingly even probabilities, and the real eigenvalues.
Perhaps you can understand intuitively, without proof by calculation, that the product of two symmetric matrices (a composition of channels) will not be a symmetric matrix when they do not commute, which relates to indeterminacy.
Question: Can you clarify for me (1), the coordinates of a vector in different bases?
Answer: In arbitrary bases e = (e_{1}, ..., e_{n}) and e' = (e'_{1}, ..., e'_{n}), a vector x = (ξ_{1}, ..., ξ_{n}) = (ξ'_{1}, ..., ξ'_{n}) can be written as the sum x = ∑_{k} ξ_{k}e_{k} = ∑_{j} ξ'_{j}e'_{j}. Let a linear operator A be given that maps the first basis into the second, so that e'_{k} = Ae_{k} = ∑_{j} a_{jk}e_{j}, respectively for k = 1, 2, ..., n. Then:
∑_{j}(ξ_{j} − ∑_{k} a_{jk}ξ'_{k})e_{j} = 0,
ξ_{j} = ∑_{k} a_{jk}ξ'_{k}.
If we denote by x(e) the matrix of the vector x in the base e, by x(e') the matrix of the same vector in the base e', and by A(e) = (a_{ij}) the matrix of the operator A in the base e, then:
x(e') = [A(e)]^{−1}x(e).
So, we have the operator A : e → e' and its inverse matrix A^{−1} : x(e) → x(e') that translates the old coordinates into the new ones.
Let B : e' → e'' be the operator that translates further, from the second to the third base; then x(e) = A(e)B(e')x(e''), and the operator C = BA translates the base (e) directly into the base (e''), so that:
[C(e) − A(e)B(e')]x(e'') = 0,
C(e) = A(e)B(e').
The operator matrices are multiplied in the order opposite to the operators, and they are taken in different bases.
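The reversed order of matrix multiplication can be illustrated with a small Python sketch; the base-change matrices A and B and the coordinate column below are arbitrary assumptions chosen for the check.

```python
# Operators compose as C = BA, while their matrices multiply as C(e) = A(e) B(e').
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, x):
    """Matrix times coordinate column."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

A = [[1.0, 1.0], [0.0, 1.0]]   # matrix of the base change e -> e'
B = [[2.0, 0.0], [1.0, 1.0]]   # matrix of the base change e' -> e''

C = matmul(A, B)               # C(e) = A(e) B(e'), NOT B A
x_epp = [3.0, -2.0]            # coordinates x(e'') of some vector

print(matvec(C, x_epp))                 # [7.0, 1.0]
print(matvec(A, matvec(B, x_epp)))      # [7.0, 1.0] -- the same coordinates x(e)
```

Multiplying the matrices in the other order generally gives a different result, which is precisely the point of the remark above.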
6. The theme of these proofs is the linear independence of vectors. Two or more vectors {x_{1}, ..., x_{n}} are said to be "linearly independent" if none of them can be written as a linear combination of the others, i.e. when from α_{1}x_{1} + ... + α_{n}x_{n} = 0 it necessarily follows that α_{1} = ... = α_{n} = 0. The basis vectors are linearly independent, and the regularity of the operator (matrix) means, above all, that it translates linearly independent vectors into linearly independent vectors again, and then that we can recover the first from the second.
The existence of linearly independent vectors in "information theory" will be a justification of the (hypo)thesis that there are corresponding independent conditions, that is, states. If it is not true that "everything depends on everything", as is colloquially usually thought, then it follows that there are "objective uncertainties", and the encouragement that there are also "objective coincidences".
Hermitian
Question: What operators does quantum mechanics use?
Answer: Standard quantum mechanics uses Hermitian (linear, complex-field scalar) operators, with real eigenvalues and a complete set of orthonormal eigenvectors. The picture on the right shows an example of a Hermitian matrix. Its diagonal values are real, and its transposed (symmetric) elements are complex conjugates.
Observable, measurable quantities are characteristic values of Hermitian operators. By measuring, for example, energy, the system finds itself in a state beyond which there is no transition to another state with another energy. Hermitian operators conveniently provide this property in the form of orthogonality of eigenstates, and with eigenvalues estimating the probabilities of measurement outcomes.
When the eigenstates (vectors) of an observable are not orthogonal, there is still a way to pass to quantum mechanics and assign probabilities, if the given operator (representing a process, or a measurement) has a complete set of eigenstates with real eigenvalues. Real eigenvalues and complete eigenstates are needed to represent physically observed values. The notion of orthogonality can be relaxed and replaced by the weaker requirement of "biorthogonality". The derived quantum theory is called biorthogonal quantum mechanics.
Biorthogonal
Question: What is "biorthogonality", why?
Answer: When we have a Hermitian operator, all of its eigenvalues are real numbers that define the probabilities of the outcomes. The corresponding eigenvectors are a linearly independent and complete set that determines all the required states. However, if some eigenvalues repeat, then the eigenfunctions (vectors) of the Hermitian operator need not be orthogonal.
When the scalar (Inner) product is nonzero, in the case of non-orthogonal vectors, computational difficulties and limitations arise that can be partly overcome by an auxiliary, biorthogonal basis. The idea is to join to each vector of the basis e = (e_{1}, ..., e_{n}), the basis inherent to the operator that simulates the given process (experiment), one perpendicular vector of a new basis e' = (e'_{1}, ..., e'_{n}). Shortly, ❬e_{i}, e'_{j}❭ = δ_{ij}, using the Kronecker delta symbol.
Not only does such a biorthogonal basis always exist, with its vectors also linearly independent, but we can understand it with the help of the picture above. At the vertices O and O' are angles with perpendicular legs, and such angles are always equal or supplementary. With the base of the normal, their vertices form a chordal quadrilateral OAO'B and, as we see, there are countless of them in the plane of the angle ∠AOB. Leaving the plane of the angles, we make a leap into a new dimension.
However many arms there are like OA and OB, on which the basis vectors e lie, any two of them form such an image with such possibilities. Simply put, we do not have to imagine the n-dimensional space of (degenerate) eigenvectors of the Hermitian operator all at once (if it is larger than 3-dim); we can do it in pairs. In sketched circles like k in the picture, each pair of eigenvectors lies in one of the n(n − 1)/2 intersections of planes with the n-dim sphere, which we know exists even though it is unimaginable to us.
Well, the biorthogonal vectors lie on the legs of the given perpendicular angles and form the basis e', where ❬e_{i}, e'_{j}❭ = δ_{ij}, respectively for all indices i, j = 1, 2, ..., n. A biorthogonal basis is not made up of mutually orthogonal vectors; moreover, the angles between the new vectors are always equal to the angles between the old corresponding ones, but the pairwise orthogonality with the given vectors helps in the calculation.
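In coordinates, a biorthogonal basis is easy to compute in the real case: if the given basis vectors are the columns of a matrix E, then the rows of E⁻¹ are the biorthogonal vectors. A minimal 2-dim Python sketch, with an arbitrarily chosen oblique basis as the assumption:

```python
# Biorthogonal basis in R^2: given basis vectors as columns of E,
# the rows of E^-1 satisfy <e_i, e'_j> = delta_ij (since E^-1 E = I).
def inverse2(E):
    (a, b), (c, d) = E
    det = a*d - b*c   # nonzero because the basis vectors are independent
    return [[d/det, -b/det], [-c/det, a/det]]

E = [[1.0, 1.0],      # columns: e_1 = (1, 0), e_2 = (1, 1), an oblique basis
     [0.0, 1.0]]
Einv = inverse2(E)
ep = [Einv[0], Einv[1]]   # rows of E^-1 are the biorthogonal vectors e'_1, e'_2

for i in range(2):
    for j in range(2):
        ei = [E[0][i], E[1][i]]                      # i-th original basis vector
        dot = ei[0]*ep[j][0] + ei[1]*ep[j][1]
        assert dot == (1.0 if i == j else 0.0)       # <e_i, e'_j> = delta_ij
print("biorthogonality holds:", ep)
```

Here e'_1 = (1, −1) is perpendicular to e_2 and e'_2 = (0, 1) is perpendicular to e_1, exactly the pairwise orthogonality with the given vectors described above.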
In addition to the computational convenience, the typically mathematical logic, and the unusual accuracy of the predictions of quantum phenomena, which is taken for granted, the existence of biorthogonal bases fits particularly well with my information theory. First of all, because of the thesis that our reality is generated by the ongoing development of the universe, and it remains surrounded by a sea of options among which there are the most incredible ones, some of which are unlikely, while everything revolves around a few more certain ones. That is why we can live with relatively few senses (sight, hearing, smell, touch, taste, ...).
Dual Space
Question: Can you explain the "dual space" of vectors to me?
Answer: Simply put, the dual, or adjoint, space contains a special type of functions, functionals, which map vectors to scalars, \(f : X \to Φ \). Their peculiarity is linearity, \( f(αx + βy) = αf(x) + βf(y) \), for all \( x, y ∈ X \) and \( α, β ∈ Φ \). For a given vector space \(X\) we will denote the space of functionals by \(X^*\); it is also a vector space, dual to the given one.
By changing the coordinate system, the vector records are changed, but some of their mutual relations remain. Thus, the child's toy in the picture above can pass from hand to hand, and go to another room, but the possibility of colorful geometric figures passing through the corresponding holes remains unchanged. Similarly, vectors lying in one plane, by a transformation of coordinates, will still be some vectors belonging to a certain same plane. By changing the coordinate system, the subspace passes into the corresponding subspace, and this property will preserve the linearity possessed by the functionals \(X^*\) of the given vector space \(X\).
For example, if we have a 3D real vector space \(X = \mathbb{R}^3\), then \( f(x,y,z) = 3x - 4y + 5z \) is a member of \( X^* \).
Another example: when \( X \) is the space of quadratic (second degree) polynomials \( p(x) = ax^2 + bx + c \), let the first of the associated spaces consist of functionals of the form \( f_1(p) = p(1) \), so \( f_1(x^2 - 3x + 5) = 3 \). Functionals of the form \( f_2(p) = p(2) \) also form a space dual to \( X \), with \( f_2(x^2 - 3x + 5) = 2^2 - 3⋅2 + 5 = 3 \); however, it is not the same (dual) space as in the case of the first functionals, because different polynomials of the given vector space are mostly mapped by these two functionals into different numbers (scalars).
A third example: when \( X \) is the space of square matrices of the third order (3×3), the functional f can be the matrix trace,
\[ f\left(\begin{matrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{matrix}\right) = 1 + 5 + 9 = 15, \]
because the matrix trace is a linear mapping.
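That the trace is indeed a linear functional on 3×3 matrices can be checked directly. The Python sketch below uses the matrix from the example above; the second matrix B and the scalars are arbitrary assumptions.

```python
def trace(M):
    """Matrix trace: the sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))

def mat_comb(alpha, A, beta, B):
    """Linear combination alpha*A + beta*B of two square matrices."""
    n = len(A)
    return [[alpha*A[i][j] + beta*B[i][j] for j in range(n)] for i in range(n)]

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
B = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 2.0]]

print(trace(A))  # 15.0, as in the example above
# Linearity: f(alpha A + beta B) = alpha f(A) + beta f(B)
assert trace(mat_comb(2.0, A, -1.0, B)) == 2.0*trace(A) - 1.0*trace(B)
```

So the trace maps matrices (vectors of this space) to scalars linearly, which is precisely the definition of a member of the dual space.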
The dual space is, intuitively, the space of "rulers" (or measuring instruments) of a given vector space. Its elements measure vectors. Measurement is what makes the dual space so important in, say, differential geometry.
Thus, the absence of a canonical isomorphism (one independent of the choice of basis, or of the coordinate system) between a vector space and its dual can be understood as the need to calibrate the measurement. There is no canonical (coordinate-independent) way to provide a single calibration for the space. However, if we are measuring the measuring instruments themselves, then there is a canonical way of doing so — if we judge them by the way they act on what they should measure. We then refer them to the essence, as in the above example of the toys, or of the subspace.
Therefore, the dual space of the dual space, which we denote \( X^{**} \), is like a "measurement of measurements" and it is canonically isomorphic to \( X \), although there is no such strict isomorphism between \(X\) and its first dual \(X^*\). Because of this great similarity, in many situations we treat those "second duals" as equal spaces, then we write \( X^{**} = X \).
Conservation III
Question: Are there mappings that preserve the scalar product?
Answer: Yes, such are unitary operators, \( U : X \to X\), for which \( \langle U(x ), U(y)\rangle = \langle x, y \rangle\), for all \( x, y ∈ X\). They are ubiquitous in quantum mechanics because they well represent processes for which conservation laws apply.
A trivial unitary operator is the identity function. Rotation in \( \mathbb{R}^2 \) is one of the simplest unitary operators. The complex plane \(\mathbb{C}\) is a vector space similar to the real one, in which multiplication by a number of the form \( e^{iθ}\), for \( θ \in \mathbb{R}\), is a unitary operator. In quantum mechanics, the Hermitian operators of momentum, \( \hat{p} = -i\hbar\frac{\partial}{\partial x} \), and energy, \( \hat{E} = i\hbar \frac{\partial }{\partial t} \), generate further unitary operators of exponential form, among many others.
First of all, based on the above definition, we need to check whether and when the unitary operator is linear and a bijection (a "1-1" and "onto" mapping).
1. To find out, let's consider the vector \( v = U(αx + βy) - αU(x) - βU(y) \) and let \( z \in X \) be arbitrary. Then, the dot (scalar) product:
\[ \langle U(αx + βy) - αU(x) - βU(y), U(z) \rangle = \] \[ = \langle U(αx + βy), U(z)\rangle - α \langle U(x), U(z)\rangle - β \langle U(y), U(z)\rangle \] \[ = \langle αx + βy, z\rangle - α\langle x, z\rangle - β\langle y, z\rangle = 0, \]means that \( v \) is perpendicular to all vectors of the given space, and especially to itself, so \( v = 0 \), i.e. \( U(αx + βy) = αU(x) + βU(y)\). Hence \(U\) is a linear operator. This proof of linearity also holds for infinite-dimensional spaces, but the claim that \(U\) is a bijection remains only for finite-dimensional \(X\). ∎
For inner (scalar) product spaces of finite dimension, every injection (a "1-1" mapping) is an isomorphism, and unitary operators are special isomorphisms that preserve lengths and inner products. However, in the space of square-summable sequences (ℓ^{2}), for example, we have the right-shift operator, which is an injection but not a surjection: it prepends a zero to each sequence, so its range misses all sequences with a nonzero first term. Thus an operator may preserve scalar products in an infinite-dimensional space and still fail to be a bijection (a "1-1" and "onto" function), hence fail to be invertible.
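A minimal sketch of the right-shift operator, here acting on finite truncations of sequences (numpy assumed; the function name is my own choice): it preserves inner products, yet every image begins with zero, so it cannot be surjective.

```python
# Sketch: the right-shift operator on (truncated) square-summable sequences.
# It preserves inner products but its range misses every sequence whose
# first term is nonzero, so it is not "onto".
import numpy as np

def right_shift(x):
    """(x1, x2, ...) -> (0, x1, x2, ...); here on finite numpy arrays."""
    return np.concatenate(([0.0], x))

x = np.array([1.0, -2.0, 0.5])
y = np.array([3.0, 4.0, -1.0])

# Inner products are preserved:
assert np.isclose(np.dot(right_shift(x), right_shift(y)), np.dot(x, y))
# Not surjective: every image starts with 0, so e.g. (1, 0, 0, ...) has no preimage.
assert right_shift(x)[0] == 0.0
```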
Applied to the "reality in the sea of uncertainty", which could be part of the theory of information, it turns out that the absolute reversibility of current processes should be reexamined. But about that  later.
From the very definition of unitary operators, \( \langle Ux, Uy\rangle = \langle x, y\rangle \), follows the importance of the characteristic equation \( Ux = \lambda x\). By convention we write functionals with the argument in parentheses, \( f(x)\), and other linear operators without parentheses, and that is how we proceed below. By the way, we also use bra-ket notation, Dirac brackets, for co- and contravariant vectors, which becomes very practical with this agreed writing.
2. The eigenvalues of a unitary operator have unit modulus. Namely, \(Ux = λx\) and \( \langle x, x\rangle = \langle Ux, Ux\rangle = \langle λx, λx \rangle = |λ|^2 \langle x, x\rangle \) give \( |λ| = 1 \). ∎
The eigenvalues of the unitary operators are, therefore, unimodular, i.e. modules are 1, and of the form \( λ = e^{iφ} = \cos φ + i\sin φ \), for \( φ \in \mathbb{R}\). Let us further show that the eigenvectors of unitary operators corresponding to different eigenvalues, as in Hermitian eigenvectors, are orthogonal.
3. If \( Ux = e^{iα}x \) and \( Uy = e^{iβ}y \), we have in turn:
\[ \langle x, y\rangle = \langle Ux, Uy\rangle = \langle e^{iα}x, e^{iβ}y\rangle = e^{i(α - β)}\langle x, y\rangle, \] \[ \left(1 - e^{i(α - β)}\right)\langle x, y\rangle = 0, \] \[ α \ne β \implies \langle x, y\rangle = 0. \]This means that the eigenvectors corresponding to different eigenvalues are mutually perpendicular. ∎
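The two claims, unimodular eigenvalues and orthogonal eigenvectors, can be checked numerically on the simplest unitary operator mentioned above, a rotation of \(\mathbb{R}^2\) (a sketch assuming numpy; the angle is arbitrary):

```python
# Sketch: a rotation operator is unitary; its eigenvalues are unimodular
# and its eigenvectors for distinct eigenvalues are orthogonal.
import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Unitarity: <Ux, Uy> = <x, y> for all x, y, equivalently U^T U = I.
assert np.allclose(U.T @ U, np.eye(2))

lam, vec = np.linalg.eig(U)
# Eigenvalues e^{+i theta}, e^{-i theta} have modulus 1:
assert np.allclose(np.abs(lam), 1.0)
# Eigenvectors of distinct eigenvalues are orthogonal (complex inner product):
assert np.isclose(np.vdot(vec[:, 0], vec[:, 1]), 0.0)
```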
Unitary spaces, the spaces in which unitary operators act, are vector spaces supplied with a scalar product. That is why they are the backbone of the perception of information, but their additional importance for the (later) theory of information will come with the law of conservation of information, which is still questionable.
Norms
Question: What is considered the "intensity" of a vector?
Answer: First of all, it is the length of an "oriented segment", and then, by analogy, anything that satisfies at least the following axioms:
- \( \|x\| \ge 0 \),
- \( \|x\| = 0 \iff x = 0 \),
- \( \|λx\| = |λ|\,\|x\| \),
- \( \|x + y\| \le \|x\| + \|y\| \),
for arbitrary vectors \(x, y \in X \) and every scalar \(λ \in \Phi \). A vector space in which every vector \(x\) is associated with a number \(\|x\|\) with properties 1-4 is called a normed vector space.
In the picture, we see a "parallelogram of forces". With it, we demonstrate the positive lengths of vectors (axioms 1-2), the proportional scaling of their lengths (3), and the triangle inequality (4). Substituting \(z = x + y\), the fourth gives
\[ \big|\|x\| - \|y\|\big| \le \|z\| \le \|x\| + \|y\|, \]i.e. each side of the triangle is less than the sum of the other two and greater than their difference. It can be seen from the picture that \( \|x - y\| \) is the distance AB, and \( \|x + y\| \) is the distance OC, so
\[ \frac12(\|x + y\| - \|x - y\|) \le \|x\| \le \frac12(\|x + y\| + \|x - y\|) \]is the triangle inequality for the triangle O, 2A, C. Much is gained from these few axioms.
With the relation \( \|x\| = \sqrt{\langle x, x \rangle} \) one can move from unitary to normed spaces. Therefore, in the same vector space it is possible to have at least as many types of norms as there are types of scalar products. Further:
\[ \|x + y\|^2 = \langle x+y, x+y\rangle = \langle x,x\rangle + \langle x,y\rangle + \langle y,x\rangle + \langle y,y\rangle \] \[ \|x - y\|^2 = \langle x-y, x-y\rangle = \langle x,x\rangle - \langle x,y\rangle - \langle y,x\rangle + \langle y,y\rangle \]and from there by adding, for all \(x, y \in X\), we find:
\[ \|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2 \]which is the "parallelogram law", known from geometry. By calculating:
\[ \|x + y\|^2 - \|x - y\|^2 = \] \[ = \langle x + y, x + y \rangle - \langle x - y, x - y \rangle \] \[ = 2\langle x, y \rangle + 2\langle y, x\rangle \]we see that in the real vector space (\( \Phi = \mathbb{R} \)) the equality holds
\[ \langle x, y\rangle = \frac14(\|x + y\|^2 - \|x - y\|^2), \]and in complex (\( \Phi = \mathbb{C} \))
\[ \langle x, y\rangle = \frac14(\|x + y\|^2 - \|x - y\|^2) + \frac{i}{4}(\|x + iy\|^2 - \|x - iy\|^2). \]Thus, one can define a scalar product using a norm, knowing only that the norm satisfies the specified parallelogram relation (without the need for the above axioms).
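The parallelogram law and the real polarization identity are easy to verify numerically for the Euclidean dot product (a sketch assuming numpy; the sample vectors are arbitrary):

```python
# Numeric sketch of the parallelogram law and the real polarization identity
# for the Euclidean norm / dot product.
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
n = np.linalg.norm

# ||x+y||^2 + ||x-y||^2 = 2||x||^2 + 2||y||^2
assert np.isclose(n(x + y)**2 + n(x - y)**2, 2*n(x)**2 + 2*n(y)**2)

# <x, y> = (||x+y||^2 - ||x-y||^2) / 4
assert np.isclose(np.dot(x, y), (n(x + y)**2 - n(x - y)**2) / 4)
```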
However, there are also normed spaces from which it is not possible to derive a scalar product. Such is C_{[-1,1]}, the space of continuous functions on the interval [-1, 1] with the norm
\[ \|x\| = \max_{-1 \le t \le 1} |x(t)|. \]Indeed, for the functions:
\[ x(t) = \begin{cases} t & \quad 0 \le t \le 1 \\ 0 & \quad -1 \le t < 0 \end{cases} \] \[ y(t) = \begin{cases} 0 & \quad 0 \le t \le 1 \\ t & \quad -1 \le t < 0 \end{cases} \]we will have \( \|x\| = \|y\| = \|x + y\| = \|x - y\| = 1 \), so the parallelogram equality does not hold for them. Therefore, in C_{[-1,1]} there is no scalar product such that \( \max_{-1 \le t \le 1} |x(t)| = \sqrt{\langle x, x\rangle} \).
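Sampling the two functions on a grid gives a numeric confirmation of this counterexample (a sketch assuming numpy; the grid resolution is an arbitrary choice):

```python
# Sketch of the counterexample: with the max-norm on C[-1,1], the functions
# x(t) = t for t >= 0 (else 0) and y(t) = t for t < 0 (else 0) violate the
# parallelogram law, so no scalar product can generate this norm.
import numpy as np

t = np.linspace(-1.0, 1.0, 2001)
x = np.where(t >= 0, t, 0.0)
y = np.where(t < 0, t, 0.0)

maxnorm = lambda f: np.max(np.abs(f))

assert maxnorm(x) == maxnorm(y) == 1.0
assert maxnorm(x + y) == maxnorm(x - y) == 1.0

# Parallelogram law fails: 1 + 1 != 2*1 + 2*1
lhs = maxnorm(x + y)**2 + maxnorm(x - y)**2
rhs = 2*maxnorm(x)**2 + 2*maxnorm(y)**2
assert lhs != rhs
```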
Finally, let's look at a few of the most famous, or most commonly used, norms, such as Euclidean real and complex, \(\ell_p\)norm, and maxnorm.
1. The absolute value \( \|x\| = |x| \) is the norm of 1-dim vector spaces, of real or complex numbers. The absolute value of a real number \(x \in \mathbb{R}\) is an unsigned number, its positive value. The complex number \( z = a + ib \), where \(a, b \in \mathbb{R}\), has the absolute value \( |z| = \sqrt{a^2 + b^2}\), also called the modulus of the number \(z\). It is easy to prove the (above) axioms of the norm for these.
2. In the space \( R_p^n \) of real sequences \( \vec{x} = (\xi_1, ..., \xi_n)\) of length \(n = 1, 2, 3, ...\) and parameter \(p \ge 1\), the norms are of the form \( \|x\| = (|\xi_1|^p + ... + |\xi_n|^p)^{1/p} \). Especially for \(p = 2\) we call it the Euclidean real norm, denote the space by \( R^n\), and for \(n = 1\) we write only \(R\).
3. It is similar in the space \( C_p^n \) of sequences of complex numbers \( \vec{z} = (\zeta_1, ..., \zeta_n)\) of length \(n \) and parameter \(p \ge 1\). The norms are of the form \( \|z\| = (|\zeta_1|^p + ... + |\zeta_n|^p)^{1/p} \), where \( |\zeta| \) is the modulus of the number \( \zeta \in \mathbb{C}\). For \(p = 2\) it is called the Euclidean complex norm and is written shorter \( C^n\), and for \(n = 1\) just \(C\).
The first three axioms for the last two cases are obviously true, and the fourth follows from Minkowski's inequality (Quantum Mechanics, Theorem 1.3.4), and it is similar with the next one.
4. The vectors \( \vec{z} = (\zeta_1, \zeta_2, ...)\) of the space \(\ell_p\), with parameter \(p \ge 1\), are infinite sequences of numbers such that the series \( \sum_{k=1}^\infty |\zeta_k|^p \) converges. The norm is introduced with \[ \|z\| = \left(\sum_{k=1}^\infty |\zeta_k|^p\right)^{1/p}. \] Instead of ℓ_{1} we write ℓ.
5. When in the previous \( p \to \infty\), the "ℓ_{∞}" norm becomes the "max" norm
\[ \|z\| = \max_{1\leq k < ∞} |\zeta_k|. \]Both of the latter apply equally to real and complex sequences. Among the mentioned examples, only in the 2nd and 3rd will the parallelogram relation be valid, and only those two, the Euclidean real and complex norms, have corresponding scalar products.
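A small sketch (numpy assumed) of the \(\ell_p\) norms on a finite sequence, including their passage toward the max-norm as \(p\) grows:

```python
# Sketch: l_p norms of a finite sequence, and their limit, the max-norm.
import numpy as np

z = np.array([3.0, -4.0, 1.0])

def lp_norm(v, p):
    return np.sum(np.abs(v)**p)**(1.0 / p)

assert np.isclose(lp_norm(z, 1), 8.0)            # |3| + |-4| + |1|
assert np.isclose(lp_norm(z, 2), np.sqrt(26.0))  # Euclidean norm
# As p grows, the l_p norm approaches the max-norm:
assert np.isclose(lp_norm(z, 200), np.max(np.abs(z)), atol=1e-2)
assert np.isclose(np.linalg.norm(z, np.inf), 4.0)
```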
This is a warning about the limitations of linear algebra in the range of perceiving information. It is intuitively expected that the spaces with scalar products form a subclass of the normed ones (Inner Product), or, say, that nature saves on communications (minimalism), so that "not everything communicates with everything".
Norms II
Question: Do you know anything about operator norm?
Answer: The linear operators in question are of the vector type and therefore can have "intensity", i.e. the norm. We determine it by means of the ratio of the norms of the input and the output, or the original and the copy, like a cinema projection of a celluloid strip onto a film screen.
Elaborating the scene geometrically, we would compare it to a conic section (Conics), a projection from a narrow light source at an angle to the wall, which takes the forms of ellipses, parabolas, and hyperbolas. So, seeing a limited original with an unlimited copy is easier.
Starting from the scalar product in the vector space \(X\), we create another vector space, \(X \to X\), of all linear operators, which is normed but not unitary. Furthermore, let \(X\) and \(Y\) be unitary (scalar product) spaces over the field Φ with scalar products \(\langle x, x'\rangle_x\) and \(\langle y, y'\rangle_y\), with orthonormal bases \( e_1, ..., e_n \in X\) and \( f_1, ..., f_m \in Y\) respectively, and let's consider the linear operator \(A : X \to Y\). We have:
\[ Ae_j = \sum_{i=1}^m \alpha_{ij}f_i, \quad (j = 1, ..., n) \] \[ x = \sum_{j=1}^n \xi_je_j \implies \|x\|_x^2 = \sum_{j=1}^n |\xi_j|^2 \]for an arbitrary vector \(x \in X\). Its image is \(y = Ax \in Y\), where:
\[ y = Ax = \sum_{i=1}^m \left(\sum_{j=1}^n \alpha_{ij}\xi_j\right)f_i \] \[ \|y\|_y^2 = \sum_{i=1}^m\left|\sum_{j=1}^n \alpha_{ij}\xi_j\right|^2 \le \sum_{i=1}^m \left(\sum_{j=1}^n |\alpha_{ij}|^2\right) \left(\sum_{j=1}^n |\xi_j|^2\right). \]The Cauchy–Schwarz inequality is applied, \( |\langle x, y\rangle| \le \|x\|⋅\|y\|\), with the equal sign if the vectors are linearly dependent. Hence:
\[ \|Ax\|_y \le M\|x\|_x \quad (x \in X) \] \[ M = \left(\sum_{i=1}^m \sum_{j=1}^n |\alpha_{ij}|^2\right)^{1/2} \] \[ \left\| A\frac{x}{\|x\|_x}\right\|_y \le M. \]If we take all the unit vectors \(\|x\|_x = 1\) of the space \(X\), then the operator \(A\) translates that unit sphere into a bounded set of the space \(Y\), with \(M \ge \|y\|_y \) for every image vector \(y \in Y \). Then \( \|Ax\|_y \le M\) for all unit vectors of the space \(X\), so the supremum (least upper bound) does not exceed \(M\) either. Therefore, the operator norm of the operator \(A\) is:
\[ \|A\| = \sup_{\|x\|_x = 1} \|Ax\|_y, \qquad \|Ax\|_y \le \|A\|⋅\|x\|_x. \]In addition, \( \|A\|\) is the smallest of the numbers for which the last inequality holds, and in the case of finite dimensionality of these operators, we have the estimate
\[ \|A\| \le \left(\sum_{i=1}^m \sum_{j=1}^n |\alpha_{ij}|^2\right)^{1/2}. \]It is easy to prove that such an "operator norm" satisfies the vector norm axioms (Norms) and that it is indeed a vector norm.
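The bound \(M\) above can be compared with the actual operator norm numerically (a sketch assuming numpy, whose `norm(A, 2)` returns the spectral norm of a matrix):

```python
# Sketch: the operator (spectral) norm never exceeds the bound
# M = (sum |a_ij|^2)^(1/2) derived above.
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(3, 4))

op_norm = np.linalg.norm(A, 2)      # sup of ||Ax|| over unit ||x||
bound_M = np.sqrt(np.sum(A**2))     # the bound M (square root of sum of squares)

assert op_norm <= bound_M + 1e-12
# And ||Ax|| <= ||A|| * ||x|| for any x:
x = rng.normal(size=4)
assert np.linalg.norm(A @ x) <= op_norm * np.linalg.norm(x) + 1e-12
```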
1. Now let's look at the same through the example of the norm \( \|x\|_1 = \sum_{j=1}^n |\xi_j|\), which induces the "column sum" norm of the operator. Let the vector be \(x \neq 0\), so we have:
\[ \|Ax\|_1 = \sum_{i=1}^n\left|\sum_{j=1}^n \alpha_{ij}\xi_j\right| \le \sum_{i=1}^n \sum_{j=1}^n |\alpha_{ij}||\xi_j| = \] \[ = \sum_{j=1}^n |\xi_j| \sum_{i=1}^n |\alpha_{ij}| \le M \sum_{j=1}^n |\xi_j| = M \|x\|_1. \]The triangle inequality for absolute values is used, and we need an estimate of the number \(M\). From the obtained \( \|Ax\|_1 \le M \|x\|_1 \) it follows:
\[ M = \max_{1 \le j \le n } \sum_{i=1}^n |\alpha_{ij}| = \sum_{i=1}^n |\alpha_{ik}|, \]for some \(k \in \{1, ..., n\} \). Such \( \|A\| = M\) is the norm of the given operator.
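A sketch of this "column sum" norm on a concrete matrix (numpy assumed; its `norm(A, 1)` computes exactly the maximum absolute column sum):

```python
# Sketch: for the l_1 vector norm, the induced operator norm is the largest
# column sum of absolute values.
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  0.5]])

col_sums = np.sum(np.abs(A), axis=0)      # [4.0, 2.5]
M = np.max(col_sums)

assert M == 4.0
assert np.isclose(np.linalg.norm(A, 1), M)   # induced 1-norm agrees

# Check ||Ax||_1 <= M * ||x||_1 on a sample vector:
x = np.array([2.0, -1.0])
assert np.sum(np.abs(A @ x)) <= M * np.sum(np.abs(x)) + 1e-12
```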
2. For a general second-order matrix
\[ \hat{A} = \left(\begin{matrix} a & b \\ c & d \end{matrix}\right) \]by the (above) definition, we are looking for the operator norm:
\[ \|\hat{A}\|^2 = \max\{ \|\hat{A}x\|^2 : \|x\| = 1 \} = \] \[ = \max\{(ax + by)^2 + (cx + dy)^2 : x^2 + y^2 = 1\} \] \[ = \max\{(b^2 + d^2) + 2(ab + cd)x\sqrt{1-x^2} + (a^2 - b^2 + c^2 - d^2)x^2 \}, \]under the condition \( 0 \le x \le 1\), which is the task of finding the conditional maximum of a function, usually by the method of Lagrange multipliers.
3. The same task, finding the operator norm of a matrix, is also solved by the square root of the largest eigenvalue of the product of the transposed and the given matrix:
\[ \hat{A}^\tau \hat{A} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a^2 + c^2 & ab + cd \\ ab + cd & b^2 + d^2 \end{pmatrix}, \] \[ \begin{vmatrix} a^2 + c^2 - \lambda & ab + cd \\ ab + cd & b^2 + d^2 - \lambda \end{vmatrix} = 0. \]For example, for \( (a, b, c, d) = (1, 2, 3, 4) \) this is \(\lambda^2 - 30\lambda + 4 = 0\), with solutions \(\lambda_1 \approx 29.866 \) and \(\lambda_2 \approx 0.134 \), which means that \( \|\hat{A}\| = \sqrt{\lambda_1} \approx 5.465 \).
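The example \((a, b, c, d) = (1, 2, 3, 4)\) can be reproduced numerically (a sketch assuming numpy):

```python
# Sketch reproducing the example: the operator norm is the square root of
# the largest eigenvalue of A^T A.
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

eigvals = np.linalg.eigvalsh(A.T @ A)    # symmetric matrix, real eigenvalues
lam1, lam2 = max(eigvals), min(eigvals)

# Characteristic polynomial lambda^2 - 30*lambda + 4 = 0:
assert np.isclose(lam1 + lam2, 30.0) and np.isclose(lam1 * lam2, 4.0)
assert np.isclose(lam1, 29.866, atol=1e-3)
assert np.isclose(np.sqrt(lam1), np.linalg.norm(A, 2))
assert np.isclose(np.sqrt(lam1), 5.465, atol=1e-3)
```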
The explanation of the search for the norm of the operator with characteristic quantities can be seen briefly from the equation:
\[ \|\hat{A}\| = \max_{\vec{x} \ne 0 } \frac{\|\hat{A}\vec{x}\|}{\|\vec{x}\|} = \max_{\vec{x} \ne 0 } \frac{\sqrt{\vec{x}^\tau \hat{A}^\tau \hat{A} \vec{x}}}{\|\vec{x}\|} = \sqrt{\lambda_{\max}}, \]where the product \( \hat{A}^\tau \hat{A}\) of the transposed and the given matrix is a symmetric matrix with real non-negative eigenvalues \(\lambda\), the largest of which is \(\lambda_{\max}\).
4. Matrices of order 3×3, with the norm taken as the largest modulus of an entry:
\[ \hat{A} = \hat{B} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} \implies \|\hat{A}\| = \|\hat{B}\| = 1, \] \[ \hat{A}\hat{B} = \hat{A}^2 = 3\hat{A}, \quad \|\hat{A}\hat{B}\| = 3, \]show that norms of operators need not be multiplicative.
Finally, let's add that physical states (vectors that operators act on) are spatial representations, processes (operators that act on vectors) are temporal representations of abstract vector spaces, and that they are completely in line with unofficial "information theory". These analogies are a reflection of the "objectivity of chance" and consequences (Dimensions) which are not found in physics so far.
Cauchy sequence
Question: What are Banach and Hilbert spaces?
Answer: A space with no gaps, more precisely one in which every Cauchy sequence converges, is called a complete space. A complete normed space is a Banach space, and a complete unitary space is a Hilbert space.
For a sequence of vectors \(x_n \in X \) (n = 1, 2, 3, ...) we say it is convergent, and that it converges to the vector \( x_0 \), if for each \(\varepsilon > 0 \) there is \( N(\varepsilon) \in \mathbb{N} \) such that \(\| x_n - x_0 \| < \varepsilon \) for each \(n \ge N(\varepsilon) \). We write "\( x_n \to x_0 \), when \(n \to \infty \)", and
\[ \lim_{n \to \infty} x_n = x_0. \]For a sequence of vectors \(x_n \in X \) (n = 1, 2, ...) we say it is a Cauchy sequence if for each \(\varepsilon > 0 \) there is a natural number \( N(\varepsilon) \) such that \(\|x_m - x_n \| < \varepsilon \) for all \(m, n \ge N(\varepsilon) \). We call the space "complete", or a "Cauchy space", when each Cauchy sequence converges to a point (vector) of that space.
However, every convergent sequence is also a Cauchy sequence, while the converse does not hold. For example, a sequence of rational numbers \(q_k \in \mathbb{Q} \) that converges to the irrational number \(\sqrt{2} \) is, within the space of rational numbers, a non-convergent Cauchy sequence. Therefore, the space of rational numbers is "incomplete", it has gaps, and it is "completed" by adding limits, the numbers to which rational sequences can converge, whereupon it becomes the real line (\( \mathbb{R} \)).
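A classic concrete instance is Newton's iteration for \(\sqrt{2}\), whose iterates are all rational (a sketch using Python's standard `fractions` module for exact arithmetic):

```python
# Sketch: a Cauchy sequence of rational numbers (Newton's iteration for
# sqrt(2)) that has no limit inside Q -- exact rational arithmetic.
from fractions import Fraction

q = Fraction(1)                  # q_0 = 1; every iterate stays rational
seq = [q]
for _ in range(6):
    q = (q + 2 / q) / 2          # Newton step for x^2 = 2
    seq.append(q)

# Consecutive terms get arbitrarily close (the Cauchy property) ...
assert abs(seq[-1] - seq[-2]) < Fraction(1, 10**10)
# ... and the squares approach 2, yet no rational term equals sqrt(2):
assert abs(seq[-1]**2 - 2) < Fraction(1, 10**10)
assert all(t**2 != 2 for t in seq)
```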
A typical example of a Banach space (complete normed) is the space \(C_{[0,1]} \) of continuous functions \( x(t)\) for \( t \in [0, 1] \) with the norm \(\| x \|_b = \max_{0 \le t \le 1} |x(t)| \). Differently equipped, the space \(C_{[0,1]} \) becomes a Hilbert space (complete unitary), normed by
\[ \|x\|_h = \left[ \int_0^1 |x(t)|^2 \ dt\right]^{1/2}, \]and with scalar product
\[ \langle x, y\rangle = \int_0^1 x(t)y^*(t)\ dt, \]where \(y^* \) is the complex conjugate of \(y \). It is the Hilbert space \(L_2(0,1) \) (capital el two) of measure theory. Equivalent to it is the Hilbert space \(\ell_2 \) (small el two) of infinite sequences \( x = (\xi_1, \xi_2, ...) \) of scalars \(\xi_k \in \Phi \) such that
\[ \sum_{k=1}^\infty |\xi_k|^2 < +\infty, \]that is, the sum of absolute squares converges.
We know that there are phenomena that we can measure but not perceive with our eyes. Such is the sound we hear. There are also sounds that we do not sense with our ears, of such low frequencies that we can still feel them as vibrations in the body. There are tremors that collapse buildings, but also waves that, like neutrinos, hardly communicate with the environment.
The theory I deal with accepts such and even more "strange" phenomena, and works with them as real because they are (in the mathematical sense) correct even though sometimes outside the domain of physics. When we talk about uncertainty or about the transmission of information, then about communication, instead of about superposition, action, and interaction, these could be those "unnatural" situations. The differences between them are similar, indeed very analogous, to the differences between vector and metric spaces that may be confusing us now.
Metrics
Question: What do you consider measurement and how much of it is in algebra?
Answer: Success, length, and information can be measured, but measurability in general should be distinguished from the interaction by which one of the parties would perceive the other. This is how you should look at this problem and understand how much algebra has only scratched the surface of it so far.
In algebra, the axioms of the metric are reached from the normed vector space \(X\) over the scalars \(\Phi\), the real or complex numbers, by assigning to each pair of vectors the real number \(d(x, y) = \|x - y\| \). The "metric space" obtained in this way is not necessarily linear, so it is a broader concept than the normed space (of the same elements from \(X\)). The resulting definition of measurement is the same as that of functional analysis, and rests on only four axioms:
- \(d(x, y) \ge 0\) for all \(x, y \in X\);
- \(d(x, y) = 0\) if and only if \(x = y\);
- \(d(x, y) = d(y, x)\) for all \(x, y \in X\);
- \(d(x, y) \le d(x, z) + d(z, y)\) for all \(x, y, z \in X\).
From the 1st axiom, we see that we measure with nonnegative real numbers, from the 2nd that it is some mutual incompatibility of two vectors, and from the 3rd that it is not like (dis)sympathizing because it is always mutually equal, symmetrical. But only the 4th axiom, the triangle inequality, establishes that this "measurement" is analogous to the measurement of lengths. It is therefore inappropriate for measuring communication, for example, but we will see that it is not inapplicable.
1. Each of the mentioned Norms has a corresponding metric. So:
\[ d(x, y) = \left(\sum_{k=1}^n |\xi_k - \eta_k|^p\right)^{1/p} \quad (1 \le p < \infty)\]is a metric in the space of strings \(x = (\xi_1, ..., \xi_n)\) and \( y = (\eta_1, ..., \eta_n)\). When all \(\xi_k, \eta_k \in \mathbb{R}\) we denote this metric space by \(R_p^n\), and if the scalars are complex numbers, by \(C_p^n\). In the case \(p = 2\) we call them the Euclidean metrics, real and complex, and when \(p \to \infty\) we get the "max-metric":
\begin{equation} d(x, y) = \max_{1 \le k \le n} |\xi_k - \eta_k| \end{equation}which we also denote by \(R_\infty^n\), or \(C_\infty^n\).
2. In the space of infinite sequences, \(n \to \infty\), the above metrics are denoted by \(\ell_p\), provided the series \(\sum_{k=1}^\infty |\xi_k|^p\) converge. Instead of \(\ell_1\) we write \(\ell\).
3. Usually, these metrics are proved using Minkowski's inequality:
\[ \left(\sum_{k=1}^n |\alpha_k + \beta_k|^p\right)^{1/p} \le \left(\sum_{k=1}^n |\alpha_k|^p\right)^{1/p} + \left(\sum_{k=1}^n |\beta_k|^p\right)^{1/p} \]where \(p \ge 1\), which we recognize as the "triangle inequality". The equality holds iff \(|\alpha_k| : |\beta_k| =\) const, for all \(k = 1, 2, ..., n\). We prove the inequality using Hölder's Inequality \((p^{-1} + q^{-1} = 1)\):
\[ \sum_{k=1}^n |\alpha_k \beta_k| \le \left(\sum_{k=1}^n |\alpha_k|^p\right)^{1/p} \left(\sum_{k=1}^n |\beta_k|^q\right)^{1/q} \]which we could call the "multiplicative" triangle inequality, alongside the additive one of Minkowski. The equality holds iff \(|\alpha_k|^p : |\beta_k|^q =\) const, for all \(k = 1, 2, ..., n\). When \(p = q = 2\) it becomes the Cauchy-Schwarz inequality. The proofs are also in the book Quantum Mechanics, in the section "1.3.1 Metric space".
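Both inequalities are easy to spot-check numerically (a sketch assuming numpy; the exponent \(p = 3\) and the sample vectors are arbitrary choices):

```python
# Numeric sketch of Minkowski's and Hoelder's inequalities for p = 3 and
# the conjugate exponent q = 3/2, so that 1/p + 1/q = 1.
import numpy as np

rng = np.random.default_rng(2)
a, b = np.abs(rng.normal(size=6)), np.abs(rng.normal(size=6))
p = 3.0
q = p / (p - 1.0)                       # conjugate exponent

lp = lambda v, r: np.sum(v**r)**(1.0 / r)

# Minkowski (triangle inequality for l_p):
assert lp(a + b, p) <= lp(a, p) + lp(b, p) + 1e-12
# Hoelder:
assert np.sum(a * b) <= lp(a, p) * lp(b, q) + 1e-12
# p = q = 2 gives Cauchy-Schwarz:
assert np.sum(a * b) <= lp(a, 2) * lp(b, 2) + 1e-12
```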
It would be too long for this occasion to list the more wellknown metrics, so for the sake of demonstration, I will list only two of the unusual ones, one known and one unknown.
4. The space \(s\) of infinite sequences \(x = (\xi_1, \xi_2, ...)\) and \(y = (\eta_1, \eta_2, ...)\) has a distance given by the expression:
\[ d(x, y) = \sum_{k=1}^\infty \frac{1}{2^k}\frac{|\xi_k - \eta_k|}{1 + |\xi_k - \eta_k|}. \]The triangle relation follows from the inequality
\[ \frac{|\alpha + \beta|}{1 + |\alpha + \beta|} \le \frac{|\alpha|}{1 + |\alpha|} + \frac{|\beta|}{1 + |\beta|}, \]where we put \(\alpha = \xi_k - \zeta_k\) and \(\beta = \zeta_k - \eta_k\), and use the facts that \(|\alpha + \beta| \le |\alpha| + |\beta|\) and that the function \(t/(1 + t)\) is monotonically increasing for \( t \ge 0\).
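A sketch of this metric on truncated sequences (numpy assumed; truncation at a finite length is an approximation, justified by the rapidly decaying \(2^{-k}\) weights):

```python
# Sketch of the metric on the space s of sequences, computed on truncations:
# d(x, y) = sum over k of 2^-k * |x_k - y_k| / (1 + |x_k - y_k|).
# The 2^-k weights make the series converge for arbitrary sequences,
# and the whole distance is always below 1.
import numpy as np

def d(x, y):
    diff = np.abs(np.asarray(x, float) - np.asarray(y, float))
    k = np.arange(1, len(diff) + 1)
    return np.sum(2.0**(-k) * diff / (1.0 + diff))

x, y, z = [1.0, 5.0, -2.0], [0.0, 5.0, 3.0], [2.0, -1.0, 0.0]

assert d(x, x) == 0.0
assert np.isclose(d(x, y), d(y, x))              # symmetry
assert d(x, y) <= d(x, z) + d(z, y) + 1e-12      # triangle inequality
assert d(x, y) < 1.0                              # bounded by sum of 2^-k
```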
5. In the book Physical Information, Economic Institute Banja Luka 2019, under the title "2.3 Binomial distribution", you will find an interpretation of the Bernoulli distribution \(\mathcal{B}(n, p) \) and the Shannon information \(S_1 = -p \log p - q \log q \), when \( n = 1 \) and \(q = 1 - p \in (0, 1)\), say in the case of tossing an (unfair) coin that lands tails with probability \(p\) and does not with probability \(q\).
I constructed there a different information, labeled \(L_n\), of the same binomial distribution \(\mathcal{B}(n, p) \), for which \(L_1 = S_1\) but \(L_n = nL_1\) holds. The name "physical" is due to imitating the conservation law, and we use its additivity here to define a metric. Say, if \(x\) and \(y\) are points of the "space" of binomial distributions, with \(m\) and \(n\) the numbers of repetitions of the experiment, then
\[ d(x, y) = |L_m - L_n| = |m - n|S_1 \]is a metric.
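Following the text's definitions \(L_1 = S_1\) and \(L_n = nL_1\), a minimal sketch of this "physical information" metric (pure standard-library Python; the value \(p = 0.3\) is an arbitrary choice):

```python
# Sketch: with L_n = n * L_1 and L_1 = S_1 (Shannon information of one toss),
# d(x, y) = |L_m - L_n| = |m - n| * S_1 is a metric on binomial distributions
# B(n, p) with a fixed p.
from math import log2, isclose

p = 0.3
q = 1.0 - p
S1 = -p * log2(p) - q * log2(q)      # Shannon information of one toss

L = lambda n: n * S1                 # the "physical" information L_n

d = lambda m, n: abs(L(m) - L(n))    # = |m - n| * S1

assert isclose(d(5, 2), 3 * S1)
assert d(4, 4) == 0.0
assert d(2, 7) <= d(2, 5) + d(5, 7) + 1e-12   # triangle inequality
```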
6. Two metric spaces \(X\) and \(Y\) are isometric, if there is a biunique correspondence \(f\) between them such that for every pair of points \(x_1, x_2 \in X\)
\[ d_y(y_1, y_2) =d_x(x_1, x_2), \]where \(y_1 = f(x_1)\) and \(y_2 = f(x_2)\). The function \(f : X \to Y\) is an isometry between the spaces \(X\) and \(Y\).
The last definition (6) proves to be very useful when comparing various metric spaces due to their multitude, in situations where it is necessary to observe or convey their common properties.
Contraction
Question: Can you explain the alternative theories of gravity to me?
Answer: It's a different person, but it's a similar question (Freefall), so the answer will be a supplement, like a new item there, and a little more detailed. I emphasize, only the paths are alternatives, not the results.
Let's imagine a body A of mass \(m\) which, from a state of rest, begins to fall, slowly at first and then ever faster, towards the center of gravity of mass \(M\). Body A moves inertially and, relative to the initial position of the observer B, at the moment \(t\) it is at a distance \(r\) from the center and has speed \(v\). Forces are not felt in a falling body, while the relative observer will interpret this relatively accelerated movement by the action of the force \(F = GMm/r^2\); and we, C, say that forces change probabilities and that the body accelerates relatively because those probabilities are not the same for A and B.
To the question of who is right, the answer is that all three are, A, B, and C. Length contraction (Space-Time, 1.4.1 Lorentz transformation) along the direction of movement is \(\Delta r = \Delta r_0/\gamma\), where the Lorentz coefficient is
\[ \gamma = \frac{1}{\sqrt{1 - \frac{v^2}{c^2}}} \approx 1 + \frac12\frac{v^2}{c^2}, \]with \(v\) the radial velocity of the body and \(c \approx 300\ 000\) km/s the speed of light in vacuum. The circumference of the circle around the center is the same for both observers A and B, while the radial length is greater for the descending one, which means that for him (A) the ratio of the circumference to the diameter of the circle (with the center at the center of gravity) becomes ever smaller than \(\pi = 3.14159...\), and that the gravitational space is of increasing spherical curvature. The time units of the falling body are longer, \(\Delta t = \Delta t_0 \gamma\), relative to the static observer.
In the continuation of the mentioned book (Space-Time, Theorem 1.4.4), Lorentz transformations are applied in the small (infinitesimal) neighborhood of the falling body, and it is shown that the Lorentz coefficient is then
\[ \gamma_g = \frac{1}{\sqrt{1 - \frac{2GM}{rc^2}}} \approx 1 + \frac{GM}{rc^2}, \]which corresponds to the conversion of potential energy into kinetic energy of body A falling relative to the observer B. Such relative length contraction \(\Delta r = \Delta r_0/\gamma_g\), slowing of time by dilation \(\Delta t = \Delta t_0 \gamma_g\), and the increases in mass \(m = m_0\gamma_g\) and energy of the falling body \(E = E_0\gamma_g\), agree with the equations of general relativity. The text follows my work of several years earlier (Times in Relativistic Motion).
From those infinitesimal Lorentz transformations, equation (1.199) of the book, the metric (1.205) is obtained
\[ ds^2 = (\gamma_g dr)^2 + r^2(\sin^2\theta\, d\varphi^2 + d\theta^2) - (c\,dt/\gamma_g)^2, \]which is exactly equal to Schwarzschild's (1.137). This is the known solution of Einstein's general relativity equations for centrally symmetric gravitational fields.
Let's note that the kinetic energy obtained by free fall corresponds to the change of the potential \(E_k = mv^2/2 = GMm/r \), and its change through \(r\) corresponds to the work of the force (\(dE = Fdr\)). Hence the conclusion is that we can consider the change in probability "under the influence of a force" as proportional to the potential. That's why it makes sense to say, according to the assumption of information theory, that "force changes probabilities". Moreover, the results also point us to the relationship between force and probability.
A body "falls" toward the center of gravity moving into smaller units of length, sticking to higher probability densities of place (Compton effect, Collision). It moves towards a slower flow of time (fewer realizations of randomness), towards greater certainty.
Universe
Question: Are you into universe metrics?
Answer: A little, and only theoretically. In addition to relativity, which fits unusually well into my theory of information, although not with the principles that Einstein would have accepted, cosmic theories that have been derived from it are a little shaky.
I mean the Friedmann model
\[ ds^2 = \left(\frac{dr^2}{1 - kr^2} + r^2 d\Omega^2\right)R(t)^2 - c^2dt^2, \]where \( d\Omega^2 = d\theta^2 + \sin^2\theta\, d\varphi^2 \), and the dimensionless quantity \(R(t)\) increases over time and reflects the size of the universe. I analyzed the Rotating Universe separately, not imagining that all that space really revolves around us, but for reasons that I will explain below. Among the twisted models I played with was the torus.
Starting from the uncertainty at the base of (almost) everything, and nature's behavior as if it doesn't want itself to be like that, we can imagine one peak of reality, everything in one point. Such is possible in the world of bosons themselves, say at the beginning of the world, 13.8 billion years ago at the time of the Big Bang. The spontaneous development of events would lead to the relief of the density of uncertainty, and these are the moments of the creation of the first fermions, the decay of Higgs bosons.
I say it could look like that, but it might not be like that, because we also have the ergodic theorem (Informatics Theory II, 61.2.), which would make it look like that, no matter what it really was. However, both are somewhat fiction, as is the past that we look at through telescopes today.
Well, the uncertainty is being diluted today by the gradual (more frequent) creation of bosons from fermions, making galaxies more and more distant, and with space that remembers and Growing. Space accumulates memories, and its specific uncertainty (per unit of volume) is decreasing, so time slows down. In my information theory, the speed of time is defined by the whole amount of (random) outcomes.
We do not notice the slowing down of time in the current present, because the units of time are shortened just enough for us to perceive the speed of light as constant. We could observe both effects (contraction of length and dilation of time) if we could observe the future from the past, or the past from the present. The latter is easier than the former but is insufficiently clear due to the real growth of space between galaxies and their escape.
As the law of conservation of information applies analogously to energy, and then to mass, observed from the past, they would relatively increase together with the slowing down of time, so that everything would "unbelievably" resemble the free fall of a body into the gravitational field in the previous answer (Contraction). Hence the metric of space
\[ ds^2 = (dr/\gamma_g)^2 + r^2 d\Omega^2 - (\gamma_g c\,dt)^2. \]There is no coefficient in front of \(d\Omega^2\), as in Friedmann's model, so that the analogy with falling into the gravitational field is greater. But it is also absent because changing that number \(R\) would shrink (expand) the universe around us, so the galaxies wouldn't move away from us in a straight line. Friedmann's model, in turn, is without it because in it there are no changes in the time course of the present.
In contrast to the rotating universe, where time would flow ever more slowly at places further away from us, which would therefore be more attractive, the reverse happens here. Time runs faster in distant places, but their recession velocity is also greater, so by dilation their time runs slower. The two effects could offset each other if the time of distant galaxies flowed as fast as ours, which is not a possibility in the rotating model.
In this way, the past is pushed away from us both by the (average statistical) spatial distance of the galaxies and by the principled spontaneous tendency of the present towards less information.
Metric II
Question: What is a metric tensor?
Answer: The metric tensor g_{μν} is a metric of small distances and the basis of tensor calculus. It is a second-order tensor, twice covariant. A tensor is a kind of generalization of a scalar, vector, or matrix.
Consider the points A = (A_{x}, A_{y}) and B = (B_{x}, B_{y}) in the Cartesian coordinate system Oxy. Write them in polar coordinates Orφ; the coordinate transformations
\[ x = r\cos\varphi, \qquad y = r\sin\varphi \]
are understood from the picture. Namely, for the point A we have x = A_{x} and y = A_{y}, while r = a = OA and φ = ∠A_{x}OA. It would be similar for the point B; however, we now put ∠B_{x}OB = φ + Δφ and OB = r + Δr, and then pass to the infinitesimals \(\Delta r \to dr\) and \(\Delta\varphi \to d\varphi\).
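The transformation between the two coordinate systems can be sketched numerically; this is a minimal illustration, and the function names `to_polar` and `to_cartesian` as well as the sample point are assumptions for the example only.

```python
import math

def to_polar(x, y):
    # r = distance from the origin O, phi = angle from the x-axis
    return math.hypot(x, y), math.atan2(y, x)

def to_cartesian(r, phi):
    # the transformations x = r cos(phi), y = r sin(phi) from the text
    return r * math.cos(phi), r * math.sin(phi)

# a hypothetical point A = (A_x, A_y)
Ax, Ay = 3.0, 4.0
r, phi = to_polar(Ax, Ay)
x, y = to_cartesian(r, phi)
print(r)                                             # 5.0
print(abs(x - Ax) < 1e-12 and abs(y - Ay) < 1e-12)   # True
```

The round trip Cartesian → polar → Cartesian returns the original point, which is what makes the change of coordinates legitimate for rewriting the metric.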
The metric of the Cartesian rectangular system is defined by the Pythagorean theorem, which we then translate into the metric of the polar system using differential calculus:
\[ AB^2 \to d\ell^2, \] \[ dx^2 + dy^2 = [d(r\cos\varphi)]^2 + [d(r\sin\varphi)]^2 = \] \[ = (\cos\varphi\ dr - r\sin\varphi\ d\varphi)^2 + (\sin\varphi\ dr + r\cos\varphi\ d\varphi)^2, \] \[ dx^2 + dy^2 = dr^2 + r^2\ d\varphi^2. \]In general it is
\[ d\ell^2 = g_{11}\ d\xi^1d\xi^1 + g_{12}\ d\xi^1d\xi^2 + g_{21}\ d\xi^2d\xi^1 + g_{22}\ d\xi^2d\xi^2, \]in the system of (contravariant) coordinates \(O\xi^1\xi^2\); in the Cartesian metric \(g_{11} = g_{22} = 1\), and in the polar \(g_{11} = 1\), \(g_{22} = r^2\). Note that these metrics have no mixed components of the metric tensor, \(g_{12} = g_{21} = 0\). This will always be the case when the coordinates of such a pair are mutually perpendicular. Otherwise \(g_{\mu\nu} = g_{\nu\mu}\) for \(\mu \ne \nu\), due to the symmetry of the metric. When \(d\ell = 0\) the points are in the same place.
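That the polar metric \(dr^2 + r^2\,d\varphi^2\) really reproduces the Cartesian \(dx^2 + dy^2\) can be checked with a small finite-difference sketch; the point \((r, \varphi)\) and the step sizes below are arbitrary choices for the example.

```python
import math

def polar_dl2(r, dr, dphi):
    # g11 = 1, g22 = r^2, g12 = g21 = 0
    return dr * dr + r * r * dphi * dphi

# a small displacement around a hypothetical point (r, phi)
r, phi = 2.0, 0.7
dr, dphi = 1e-6, 1e-6

# the same displacement expressed in Cartesian coordinates
x1, y1 = r * math.cos(phi), r * math.sin(phi)
x2, y2 = (r + dr) * math.cos(phi + dphi), (r + dr) * math.sin(phi + dphi)
dl2_cart = (x2 - x1)**2 + (y2 - y1)**2

dl2_polar = polar_dl2(r, dr, dphi)
print(abs(dl2_cart - dl2_polar) / dl2_polar < 1e-5)  # True
```

The tiny residual difference is of higher order in the displacements and vanishes in the limit \(dr, d\varphi \to 0\), which is exactly what the differential derivation above asserts.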
For example, in the 3D system of Descartes, the Pythagorean theorem is
\[ d\ell^2 = dx^2 + dy^2 + dz^2. \]Let's transfer it to spherical coordinates \(Or\theta\varphi\) by transformations:
\[ \begin{cases} x = r\sin\theta\cos\varphi \\ y = r\sin\theta\sin\varphi \\ z = r\cos\theta \end{cases} \]and working as with the polar coordinates we arrive at the metric
\[ d\ell^2 = dr^2 + r^2\ d\theta^2 + r^2 \sin^2\theta\ d\varphi^2. \]Since here too there are no mixed-index coefficients of the metric tensor, the spherical system is also orthogonal.
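The same finite-difference check works in three dimensions; this is a sketch with an arbitrary sample point, and the helper name `spherical_to_cartesian` is an assumption of the example.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    # the transformations from the text; note z = r cos(theta)
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

# a small displacement around a hypothetical point (r, theta, phi)
r, theta, phi = 1.5, 0.9, 0.4
dr, dth, dph = 1e-6, 1e-6, 1e-6

p1 = spherical_to_cartesian(r, theta, phi)
p2 = spherical_to_cartesian(r + dr, theta + dth, phi + dph)
dl2_cart = sum((b - a)**2 for a, b in zip(p1, p2))

# the spherical metric dl^2 = dr^2 + r^2 dtheta^2 + r^2 sin^2(theta) dphi^2
dl2_sph = dr**2 + r**2 * dth**2 + (r * math.sin(theta))**2 * dph**2
print(abs(dl2_cart - dl2_sph) / dl2_sph < 1e-5)  # True
```

Again the two expressions agree to first order, confirming the diagonal (orthogonal) form of the spherical metric.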
We add the time coordinate as an imaginary length ξ = ict, where t is the time in which light travels the length with speed c. The interval is then \(ds^2 = d\ell^2 - c^2dt^2\), and the expression "at the same place" becomes "simultaneous". With such additions the metric tensor is no longer always "positive definite": it need not satisfy the inequality \( g_{11}g_{22} - g_{12}^2 > 0 \).
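A minimal sketch of this indefiniteness, in 1+1 dimensions with \(c = 1\) for simplicity (the function name `interval2` is an assumption of the example):

```python
# Interval ds^2 = dl^2 - c^2 dt^2, with c = 1 for simplicity.
def interval2(dx, dt, c=1.0):
    return dx * dx - (c * dt)**2

# a light signal: dx = c dt, so the interval is zero
# ("at the same place" / "simultaneous" in the sense of the text)
print(interval2(3.0, 3.0))  # 0.0

# a timelike separation gives a negative squared interval
print(interval2(1.0, 2.0))  # -3.0

# the 2x2 metric diag(1, -1) violates g11*g22 - g12^2 > 0
g11, g22, g12 = 1.0, -1.0, 0.0
print(g11 * g22 - g12**2)   # -1.0
```

The zero interval for light and the negative determinant are the two signatures distinguishing the Minkowski metric from a positive definite (Euclidean) one.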
