
Why can’t we add two temperatures?: Part #3

In the previous article, we looked at how positions of any kind form an affine space, whose differences are vectors. In this final part of the series, we look for a more lightweight mathematical structure suited to scalar quantities. As mentioned earlier, we use real numbers as values for things like temperature, but we can’t bring over all the familiar operations like addition (which the reals satisfy thanks to their field structure) to our physical quantities.

Well, algebra has no shortage of mathematical structures, with and without strange esoteric properties. But it’s surprisingly hard to track down a suitable answer starting from vector spaces or fields, even if you knew what you were looking for. And it isn’t Wikipedia’s fault either: it has a nice little box (snapshot below) titled Algebraic Structures that organises many related entities. Expanding all the ‘[show]’s, however, gets you no closer (see below).

The dive

The way algebra works is that there are many structures in which sets are tied up with operations, and axioms are attached to these structures. Each structure then becomes a full-blown topic of study in its own right: groups (which you’d likely have encountered in school), rings, fields, vector spaces, Banach spaces, Hilbert spaces, and more, forming a dizzying array of entities. Fields are a relatively specialised structure with a lot of axioms, closed under two different operations. So let’s start dropping axioms and operations from fields till we get to something useful, shall we?

A field is listed under ‘ring-like’ structures in that box. The least restrictive structure in that section is the cheekily named rng, a set closed under addition and multiplication but without the equivalent of a × 1 = a that we take for granted. But addition is a no-go, so there’s more chopping required still. ‘Group-like’ looks more interesting because those structures only need one operation, and hopefully something there drops even that one operation? Modules and algebras (yes, they are types of algebraic structure studied in the subject of abstract algebra, and yes, I agree they really needed to choose a different name) have even more structure and axioms piled on top of fields (modules are almost vector spaces, in fact), so they’re an immediate no-go.

And while group-like certainly delivers on the fewer-axioms bit (see above, though only group-like structures get their own detailed table in the “Outline of algebraic structures”), they never drop the single operation (usually thought of as addition). Even the exotic, goopy-sounding “partial magma”, which is barely more than a set and only requires a subset of elements to obey the (addition) operation, still needs an operation. And because subtracting one position from another yields a displacement, which is a genuine vector and thus an object outside the original set, that operation can’t be subtraction either. Downward we go then, abandoning even more order and structure as we descend into savagery: sets.

Sets. Just collections of numbers, with a way to tell whether something is in them or isn’t … and, in our case, a total ordering, since it’s pretty important to be able to compare temperatures. But we still need to be able to subtract and then work with the differences. So this feels a little too bare.

Now, for all the physical quantities we care about, we could do a little better by adding the notion of a distance between any two values, to get a metric space. And this might have been barely sufficient, but for the fact that a distance has to be non-negative, whereas differences can point either way. And we haven’t even gotten to allowing the differences themselves to be added, subtracted, divided, etc.

Torsors

But waitaminute, surely this has been done before? Why are we doing basic algebra research on an extremely straightforward property of our physical world? Fortunately, there is indeed prior (obscure) art on this.

So it turns out our elusive algebraic structure is called a torsor, and it’s extremely likely you’re hearing about it for the first time in your life. It also turns out that dropping properties wasn’t going to lead us to much of anywhere. Instead, the differences between values are happily treated as ordinary numbers in the blessedly axiom-rich field of real numbers, while the absolute values themselves form a bare set that is merely associated with this field. Torsors are also called principal homogeneous spaces, where the “homogeneous” part brings in a transitive group action and the “principal” part makes that action free as well. In fact, affine spaces turn out to be precisely torsors whose group is a vector space.

Just like affine spaces, torsors are defined as a set (call it T_abs for our temperature test-case) with a group action on it. Here, for our purposes, the group is the set (call it T_Δ) of real numbers under addition. Because its elements represent temperature differences, T_Δ spans the entire range −∞ C° to +∞ C°. The set T_abs spans [−273.15 °C, +∞ °C), and the torsor properties guarantee that for any two absolute temperatures there exists a unique temperature difference between them. The restriction that an invalid absolute temperature (absolute hot, anyone?) must not result from an arithmetic operation will have to be enforced in the definition of the group action.
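
To make this concrete, here is a minimal sketch (in Python, with made-up class names AbsoluteTemperature and TemperatureDelta) of how the torsor discipline might translate into types: deltas form the group and combine freely among themselves, while absolute values only admit the group action and mutual subtraction.

```python
# A minimal sketch of the temperature torsor: TemperatureDelta plays the
# group (reals under addition), AbsoluteTemperature plays the torsor set.
# Class and method names here are illustrative, not from any real library.
from dataclasses import dataclass

ABSOLUTE_ZERO_C = -273.15  # lower bound of the torsor set, in °C

@dataclass(frozen=True)
class TemperatureDelta:
    degrees: float  # a difference, in C° -- an ordinary real number

    def __add__(self, other: "TemperatureDelta") -> "TemperatureDelta":
        return TemperatureDelta(self.degrees + other.degrees)

    def __neg__(self) -> "TemperatureDelta":
        return TemperatureDelta(-self.degrees)

@dataclass(frozen=True)
class AbsoluteTemperature:
    celsius: float  # an element of T_abs, in °C

    def __post_init__(self):
        if self.celsius < ABSOLUTE_ZERO_C:
            raise ValueError("below absolute zero")

    # the group action: absolute + delta -> absolute
    def __add__(self, delta: TemperatureDelta) -> "AbsoluteTemperature":
        if not isinstance(delta, TemperatureDelta):
            return NotImplemented  # absolute + absolute is deliberately undefined
        return AbsoluteTemperature(self.celsius + delta.degrees)

    # the torsor property: the difference of two absolutes is a delta
    def __sub__(self, other: "AbsoluteTemperature") -> TemperatureDelta:
        return TemperatureDelta(self.celsius - other.celsius)

warm = AbsoluteTemperature(30.0)
cool = AbsoluteTemperature(10.0)
print(warm - cool)           # TemperatureDelta(degrees=20.0)
print(cool + (warm - cool))  # AbsoluteTemperature(celsius=30.0)
# warm + cool                # TypeError: no such operation is defined
```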

That’s it! In one stroke, we have solved several niggling issues of how to represent these physical quantities. More examples:

  • Voltage, or more accurately electric potential difference, is only ever defined between two points. An electric potential does exist at each point but is undetermined, and by definition possibly unknowable. Potentials at points in Euclidean space are members of a bare set of reals, while potential differences form the group/field of reals; we only ever evaluate the differences, even though the calculations are based on electric fields at those points. Absolute potential is an ℝ-torsor.
  • Time is an even more relatable case: there is a pressing practical need for picking a zero, but the choice is otherwise meaningless. Time instants are members of a bare set of reals, and time intervals are members of the group/field of reals (they can be negative, since an interval naturally results from subtracting one time instant from another). Timezones would be an interesting concept to model here, and are perhaps better off defined as fixed offsets, i.e. the “vectors” of this picture. Time is an ℝ-torsor.

I have vivid memories of dealing with time programmatically, where the correct handling of, and distinction between, instants and intervals was what sold me on a particular software library.
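
As it happens, Python’s standard datetime module enforces exactly this distinction – datetime objects are instants and timedelta objects are intervals – so the torsor discipline may be something you have already used without naming it:

```python
# Python's standard datetime module already treats time this way:
# datetime (an instant) and timedelta (an interval) are distinct types.
from datetime import datetime, timedelta

a = datetime(2023, 3, 1)
b = datetime(2025, 7, 3)

gap = b - a                     # instant - instant -> interval (timedelta)
print(gap)                      # 855 days, 0:00:00
print(a + gap)                  # instant + interval -> instant: 2025-07-03 00:00:00
print(gap + timedelta(days=1))  # intervals add freely among themselves
# a + b                         # TypeError -- instants cannot be added
```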

Torsors also apply perfectly to memory pointers – the type of the pointer could be embedded into the group action as a size parameter that ends up multiplying the offsets. A void pointer would then be a bare set with no group action at all, which is essentially why you can’t do pointer arithmetic on one. Therefore, pointers are ℤ-torsors!
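
Here is an illustrative sketch (again in Python, with a made-up TypedPointer class; real C/C++ pointers behave analogously but are of course not implemented like this) of how the element size can be baked into the group action:

```python
# An illustrative sketch of a typed pointer as a Z-torsor: integer offsets
# act on addresses, scaled by the element size baked into the action.
from dataclasses import dataclass

@dataclass(frozen=True)
class TypedPointer:
    address: int    # absolute byte address -- an element of the torsor set
    elem_size: int  # size parameter baked into the group action

    def __add__(self, offset: int) -> "TypedPointer":
        # group action: pointer + integer offset -> pointer
        if not isinstance(offset, int):
            return NotImplemented  # pointer + pointer is deliberately undefined
        return TypedPointer(self.address + offset * self.elem_size, self.elem_size)

    def __sub__(self, other: "TypedPointer") -> int:
        # pointer - pointer -> integer offset (the group element)
        assert self.elem_size == other.elem_size, "incompatible pointer types"
        return (self.address - other.address) // self.elem_size

p = TypedPointer(address=0x1000, elem_size=8)  # e.g. a pointer to 8-byte doubles
q = p + 3
print(hex(q.address))  # 0x1018
print(q - p)           # 3
# p + q                # TypeError: adding two pointers is meaningless
```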

To summarise: we represent temperature with a real number and a unit, and we expect to add/subtract/multiply/divide it because the real numbers come bundled with closure under addition, multiplication, and their inverses (which is what makes them a field in algebra). However, for many physical quantities that doesn’t make sense, and we need a structure with less built-in arithmetic than a field. The right way is to classify absolute temperatures as a different type of object from temperature differences (which are just numbers), and to explicitly define addition only between an absolute and a difference. This structure is called a torsor.

Diving deeper

This actually felt like a small book, phew. Here are a few further good resources:

  • This MathOverflow answer goes into the origin of the term.
  • Matt Baker’s alternate construction of proportion spaces is later shown by commenters to be equivalent to a heap.
  • And finally Terence Tao (perhaps the greatest living mathematician) has written a beautiful piece on the ways to axiomatise physical units and dimension. He also mentions torsors for the same reason that I do (yay!).

The heap construction is also quite cleverly suited to the matter at hand, managing without the need for a separate set and group. Instead, it uses a single ternary operation (our usual arithmetic operations are all binary!) which more or less comes out to a−b+c, with some properties: it satisfies the equivalent of a−a+b = b, and, for five elements of the heap, a−b+(c−d+e) = (a−b+c)−d+e. The elements are the same as the elements of the torsor itself: absolute temperatures, absolute potentials, time instants, absolute pointers, etc. This very much does satisfy the restriction we need. The only reason I prefer the torsor construction is that it explicitly defines the differences/intervals as members of the group, which obviously have a rightful existence of their own.
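
As a tiny sketch (plain Python floats standing in for absolute temperatures), the ternary operation and its two laws look like this:

```python
# A sketch of the heap view: a single ternary operation on absolute values,
# informally "a - b + c", with no separate set of differences needed.
def heap_op(a: float, b: float, c: float) -> float:
    """Ternary heap operation on absolute temperatures in °C."""
    return a - b + c

t1, t2, t3 = 30.0, 10.0, 25.0
print(heap_op(t1, t1, t2))  # 10.0 -- the identity-like law: a - a + b = b
print(heap_op(t1, t2, t3))  # 45.0 -- another absolute temperature
# para-associativity: (a-b+c)-d+e == a-b+(c-d+e)
print(heap_op(heap_op(t1, t2, t3), 5.0, 1.0)
      == heap_op(t1, t2, heap_op(t3, 5.0, 1.0)))  # True
```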

Epilogue

I want to be clear that there is no original work here, lest someone mistake it for such. I have simply scoured the internet and am paraphrasing existing work targeted at the apparently-niche need of merely most physical parameters (!). I have been liberal in hyperlinking multiple good resources, because I needed all of them together to get the picture. I am not exactly a mathematician by training, let alone by profession, so I gladly welcome any corrections or discussions. I haven’t added a lot of mathematical notation because I am not doing rigorous work here and would not like to give anyone that idea. (Plus, I’m hoping to add LaTeX capability to my blog another day.) And if anyone liked the use of typographical symbols like degree signs, set symbols, and the proper minus sign, I’ll be happier still. Adiós, amig(o/a)s!

Why can’t we add two temperatures?: Part #2

In the first part, I talked about intuitive reasons why some physical quantities can’t be subjected to the full repertoire of arithmetic operations. There’s more fun to be had, however, in the mathematical aspects of this limitation for related quantities like positions and pointers (which are not addable for a closely related but distinct reason from that of temperature).

Let’s check out your room

Let’s ask a different but equally simple question to the one in the last part: why can’t we add positions in 3-D space (or 2-D space, for that matter)?

By which I mean: if you fix any given point in your bedroom as your origin, and mark the positions of various objects as 3-tuples (x, y, z), it makes literally no sense to add the position of your unwashed-clothes-bearing chair to that of your underused washbasket (with a caveat for later). It does make sense to subtract them, though, which gives you the 3-tuple displacement between the two (pointing from your washbasket to your chair, if you’re keeping track). It is also probably congruent to “too far to move stuff” currently.

A room with clothes on a chair and a separate washbasket
Courtesy Google Gemini


So now that we have brought in displacements, what can we say about them? Pretty straightforward: a displacement is a vector quantity, which just means it has a magnitude and a direction. How do displacements differ from positions (after all, both are 3-tuples)? The general answer on the internet is to distinguish between bound and free vectors. The terminology is fairly widespread in teaching material – I also encountered it, but none of us picked up on the distinctions involved, and it never seemed to matter. The biggest problem is that no one tells us what operations can be done on these bound and free vectors. They also tend to be emphasised in the definitions, but eventually the distinction is lost (see here, for example, towards the end, even though great care was taken initially) and any ordered sequence of 3 numbers becomes a vector. After that, only intuition keeps calculations from becoming nonsensical. There’s more structure required for the bound vectors, so we have to go back to some definitions.

What’s a vector anyway?

Given the utter simplicity of the question, for which apparently even STEM graduates (let alone school teachers) can’t give the proper answer, we should probably be more rigorous than “a vector is a quantity with magnitude and direction”.

Source: https://xaktly.com/MathVectors.html

To do that, we first have to figure out what a scalar is. Again, the simple physics answer is: a physical quantity (such as temperature) that can be represented by a magnitude alone (a number) together with a unit. But merely being a number doesn’t tell us whether certain scalars can be added or multiplied together. In math, though, a scalar is a member of a field, and can be added to, subtracted from, multiplied by, or divided by another (non-zero, for division) scalar to yield another scalar. A very beautiful way of rewording this is that the field is closed under these four operations.

Building on this, a vector space over a scalar field is (basically) a set of tuples of scalars together with two binary operations: + (which applies to two vectors and yields another vector) and × (which applies between a scalar and a vector and yields another vector), satisfying several axioms that I’ll not get into here. Among many other examples, the displacements between positions in your room that we talked about above can be added and multiplied/stretched by a scalar, and are clearly vectors.

Already, you can see that the vector space is also closed under addition, but multiplication of two vectors isn’t part of the deal.

This closure property makes a pretty powerful statement – given any possible vectors and the two operations, you have no hope of getting out of that vector space with any combination of them. Sounds scary if you’re in a vector jail, but very satisfyingly safe if viruses were vectors. (Unfortunately, viruses are carried by a very different kind of vector, albeit one with surprising etymological commonality.) More to the point, closure is amazing if you’re trying to set boundaries, or prove what you can and can’t do with things and with operations on those things. The same goes for velocities, accelerations, angular velocities, forces, yada yada yada, you get the drift.

But back to our position-addition problem. The main reason I’m not going into the vector space axioms is that they’re utterly irrelevant to what’s coming next. As we saw above, the scalars that vectors are defined with come from a field. And a field is a set that is closed under addition, multiplication, and their inverse operations (no prizes for guessing which those are). But the “scalars” of our positions, i.e. the underlying x-, y-, or z-components in the positions of your chair and washbasket, don’t themselves make sense to add (with the aforementioned caveat). That’s not least because the result depends nonsensically on whether the origin is your room corner or the other side of the country. The components of position just can’t be meaningfully added, and thus don’t even begin to form a field.
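
A quick numeric sketch (plain tuples for coordinates, with made-up numbers) shows the origin dependence directly: shifting the origin changes the “sum” of two positions but leaves their difference untouched.

```python
# A quick numeric check of the origin-dependence argument: shifting the
# origin changes the "sum" of two positions but not their difference.
chair      = (2.0, 3.0, 0.0)   # coordinates relative to the room corner
washbasket = (1.0, 1.0, 0.0)

def shift(p, origin):
    """Re-express a position relative to a different origin."""
    return tuple(pi - oi for pi, oi in zip(p, origin))

def add(p, q):
    return tuple(pi + qi for pi, qi in zip(p, q))

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

new_origin = (100.0, -50.0, 7.0)   # the other side of the country, say
print(add(chair, washbasket))                                        # (3.0, 4.0, 0.0)
print(add(shift(chair, new_origin), shift(washbasket, new_origin)))  # (-197.0, 104.0, -14.0)
print(sub(chair, washbasket))                                        # (1.0, 2.0, 0.0)
print(sub(shift(chair, new_origin), shift(washbasket, new_origin)))  # (1.0, 2.0, 0.0)
```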

Incidentally, we can now recognise that voltages and time, also “scalar” quantities in physics, do not form a field.

A familiar yet rigorous term

It turns out that geometry, done only a little more rigorously than I learned it in high school, always had the perfect answer to this issue of position arithmetic. It even has a very familiar-sounding name which pops in and out of common parlance and mathematical boilerplate setup. The concise answer is that positions in our real world belong to a Euclidean space. And Euclidean spaces in general are NOT vector spaces! Rather, they are a very important type of structure called an affine space. An affine space is a set whose members are called points (sound familiar?), together with an associated vector space and an action (we’re almost there). This action is a kind of function that does addition between a point and a vector to give… another point!
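
A minimal sketch of that action in code (Python, with made-up Point and Displacement classes) might look like this: the only mixed operation is point + displacement, and subtracting two points yields a displacement.

```python
# A minimal sketch of the affine-space action: Points and Displacements are
# different types, and the only mixed operation is point + displacement -> point.
from dataclasses import dataclass

@dataclass(frozen=True)
class Displacement:          # element of the associated vector space
    dx: float
    dy: float
    dz: float

@dataclass(frozen=True)
class Point:                 # element of the affine space
    x: float
    y: float
    z: float

    def __add__(self, d: Displacement) -> "Point":
        if not isinstance(d, Displacement):
            return NotImplemented      # point + point stays undefined
        return Point(self.x + d.dx, self.y + d.dy, self.z + d.dz)

    def __sub__(self, other: "Point") -> Displacement:
        return Displacement(self.x - other.x, self.y - other.y, self.z - other.z)

chair, basket = Point(2, 3, 0), Point(1, 1, 0)
print(chair - basket)             # Displacement(dx=1, dy=2, dz=0)
print(basket + (chair - basket))  # Point(x=2, y=3, z=0)
```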

There are further technical conditions that successively add the expected intuitive properties of physical space onto the mathematical object. The Euclidean space also requires the presence of a dot product on its associated vector space, which doesn’t come built-in with a regular vector space.

One of these technical conditions essentially says that, once you fix one particular point in the affine space as the origin, there exists a corresponding vector for every point and vice versa. And this correspondence is the source of the sloppy thinking that got us into this problem in the first place.

We were taught, loosely, that points were some kind of vectors, which some people call “bound vectors” or “position vectors”. The truth is that the vectors from a given origin point of the affine space to every other point are displacement vectors, or “free vectors”. Adding the displacement vectors corresponding to two points is tantamount to marking the point that completes a parallelogram with the origin and those two points. And that parallelogram depends completely on what the origin is, and is thus a fairly useless object.

The caveat

However, it turns out that a weighted sum of points, provided the weights sum to one, is independent of whatever point was picked as the origin. Such a weighted sum is called the barycentre, or centre of mass. So in the case of your room, this limited form of addition for positions (points, to be precise) does exist, and can, for example, put you midway between your washbasket and your overfull chair, for maximum dilemma.
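
Here is a small sketch (plain Python, made-up coordinates) checking that such an affine combination gives the same point no matter which origin the coordinates are expressed in:

```python
# A check that an affine combination (weights summing to 1) does not depend
# on the origin: the midpoint comes out the same under either origin.
def affine_combination(points, weights):
    assert abs(sum(weights) - 1.0) < 1e-12, "weights must sum to one"
    return tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(3))

chair, basket = (2.0, 3.0, 0.0), (1.0, 1.0, 0.0)
print(affine_combination([chair, basket], [0.5, 0.5]))  # (1.5, 2.0, 0.0)

# Re-express both positions relative to a far-away origin and recombine:
origin = (100.0, -50.0, 7.0)
shifted = [tuple(c - o for c, o in zip(p, origin)) for p in (chair, basket)]
midpoint_shifted = affine_combination(shifted, [0.5, 0.5])
# Shift the answer back to room coordinates -- it is the same point.
print(tuple(m + o for m, o in zip(midpoint_shifted, origin)))  # (1.5, 2.0, 0.0)
```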

As you can see, kind of the whole point of affine spaces is that there is no meaningful zero-point for the entire universe. You just have to pick some arbitrary point and call it the origin, but its choice doesn’t affect anything. The tradeoff, however, is that the points themselves have no intrinsic representation in terms of numbers. You have to resort to the associated vector space for coordinates, at which point changing the origin changes those components themselves.

Phew!

So that solves our problem of how to treat positions: positions are points in an affine space, vectors can be added to them, but two points can only be subtracted from one another, and no other operations are defined on points. There are some decent references for this available if you search hard enough, and some that address it in the much more general and mind-bending setting of differential geometry. This finally resolves the mess without requiring “free” and “bound” vectors.

As for memory pointers, they are basically positions of a kind, and thus it makes sense that adding doesn’t work for them either. This is unlike temperature, where finding and using a true zero as in the Kelvin scale (or Rankine scale! there’s more than one way to skin that cat!) enables addition.

So if we now return to our original question from part 1, of adding temperatures, and ask what mathematical structure represents them best, do we have an answer? Well… a relatively simple solution is to consider a 1-D affine space, or just the real line with affine structure added. However, it seems excessive to involve a field, a vector space, and an inner product when we just want to talk about “scalars”! So, in the next (and final) part, we continue our quest to pare down the confusing cruft around our most basic physical quantities.

Why can’t we add two temperatures?: Part #1

A very simple question took me down the depths of mathematics: Why can’t we add two temperatures like 30 °C and 10 °C and get 40 °C?

Of course, temperatures in Kelvin can be added, and of course temperature differences can be added. There is a little-used scientific convention of denoting temperature differences as, e.g., C°, while absolute temperatures are °C. This difference in notation, and the very significance of the Kelvin scale, presage some fairly fundamental underlying issues which we will explore in this article. (Or rather, we will rediscover what made physicists choose this distinction in the first place.) But… wait… how are temperature differences obtained again?! Yep… we have the weird situation of the Celsius (and Fahrenheit) scales allowing subtraction while not allowing addition.

Importantly, it’s not even the only such physical quantity. Absolute time also can’t be added – I challenge you to add 1st March 2023 and 3rd July 2025 and tell me the result! Absolute electric potential (which gives rise to voltage) cannot be added either – indeed, absolute electric potential cannot even be assigned a unique numeric value (although the potential at infinity is commonly fixed to be zero, that’s not really a privileged choice).

Why exactly do we have this problem? Before we dive in, be aware that “why?” is not always an answerable question. In fact, taken too far, it leads to the problem of infinite epistemic regress. With that caveat, let’s go however far we can.

What do we mean by temperature, first of all? Apparently, the degree of hotness or coldness (I didn’t like this definition in middle school and I still don’t. Edit: “the parameter defining the equilibrium distribution of energies among statistically independent particles” and “the parameter characterising objects in thermal equilibrium” feel much better.). It is written as a number with a unit, in this case Celsius, though any scale would work. We imagine we could put any number in there, and that typically means we use real numbers (in this case with the restriction that T ≥ −273.15 °C), and hence it looks like you could apply all arithmetic operations to it. But clearly addition, and also multiplication, don’t fit for temperature (if you recall that multiplication can be viewed as repeated addition).

No physics envy here

It turns out that the most famous theoretical justification comes not from physics or mathematics, but from psychology (with statistics having a claim to it). And although a few formidable mathematicians such as Tukey (a co-inventor of the extremely consequential Fast Fourier Transform, and also apparently the originator of the words “bit” and “software”!) have contributed to the field, I am not alone in feeling that there’s been a conspicuous lack of mathematical attention (to put it mildly).

The contribution from psychology is the “scales of measurement” framework, first proposed by Stevens. He divides measurements into the nominal scale (just names, like colours), the ordinal scale (names with ranks, like college grades), the interval scale (numbers with differences but no ratios), and the ratio scale (numbers with a true zero, so that all arithmetic operations are allowed; edit: also called an absolute scale). If you think he was secretly an algebraist LARPing as a psychologist, I’m with you, dear reader.

Stevens’ classification answers our original question by saying that the lack of a true zero denies the Celsius scale the interpretation “temperature is a quantity of something”, making it an interval scale. Since a Celsius temperature is not quite a quantity of heat, adding two of them doesn’t give us a real quantity of any kind. The same applies to time (because the Gregorian calendar isn’t counting time from the Big Bang) and to electric potential (because the physical theory simply doesn’t allow a true zero).

And – this will be interesting to some of you who aren’t into the natural sciences (unnatural scientists?) – pointers to memory in languages such as C or C++ also can’t be added. However, there is a much subtler and more interesting mathematics underlying pointer arithmetic, which I will cover in one or more follow-ups to this article. Stay tuned!

Postscript: Measurement theory

Stevens’ contributions attracted only a little attention from natural scientists for a long time. Tukey’s additions frankly don’t feel like a substantial upgrade to the framework. Chrisman’s further expanded framework doesn’t add much either, with one very important exception: cyclic quantities like months or times within a day, which were sorely needed (more in the upcoming articles in this series!). Only recently has this field gotten the rigour it deserves, under the name Measurement Theory, and it is best accessed via a good review, given that it doesn’t have a Wikipedia article yet.