Posts Tagged ‘statistical inference’

Intersections

January 1, 2021

Let’s look at a singular elliptic curve. Elliptic curves are used in cryptography. Cryptography uses an encoding algorithm to lengthen the time it takes to decode the encrypted content, maintaining the secrecy of that content for some period of time. It is a secret for now. Notice that encryption cannot maintain that secrecy forever.

The other thing that these algorithms do is compress and decompress content. The normal distribution is typically n–dimensional. We can visualize in 2 dimensions; we cannot visualize in n dimensions. We were taught to think in terms of a 2–dimensional compression of some n–dimensional content. We are thinking in terms dictated by a compression algorithm. A statistic is the result of some compression algorithm, so we are constantly compressing data into the values of a given statistic. The mean is a compression of the data it is reporting on. A statistical inference is likewise a compression, a compression of two constituent compressions, or of two statistics: Ω and β.
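Here is a minimal sketch of that compression in Python; the data set, its shape, and the random draw are my own stand-ins, not anything from the post. The mean collapses a thousand rows into one point, and the 2–dimensional view collapses n dimensions into the two we can actually draw.

```python
import numpy as np

# Hypothetical data set: 1,000 observations in 5 dimensions (n = 5 here).
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5))

# The mean "compresses" 1,000 rows into a single 5-dimensional point.
mean_vector = data.mean(axis=0)

# A 2-D view "compresses" 5 dimensions into 2 by simply dropping the rest.
two_d_view = data[:, :2]

print(mean_vector.shape)   # (5,)      -- one point stands in for the whole cloud
print(two_d_view.shape)    # (1000, 2) -- the picture we can actually draw
```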

So I’m still working my way through Elliptic Tales. I got an email from Quanta Magazine that linked to an article, “A Master of Numbers and Shapes Who Is Rewriting Arithmetic,” about Peter Scholze, one of the youngest Fields medalists ever. He became a Fields medalist back in 2018 for his work on p–adic numbers, which ties to elliptic curves through finite fields and modular arithmetic. Most of us are not familiar with that arithmetic, but we should be. As product managers, our addressable market population is a finite field.
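If modular arithmetic is unfamiliar, here is a minimal sketch of what makes the integers mod a prime p a finite field: every nonzero element has a multiplicative inverse. The prime p = 7 is my choice, picked only because it is small enough to print.

```python
p = 7  # any prime works; 7 is just small enough to print everything

# Addition and multiplication wrap around mod p.
print((5 + 4) % p)   # 2
print((5 * 4) % p)   # 6

# What makes Z/p a *field*: every nonzero element has an inverse.
# Fermat's little theorem gives it directly: a^(p-2) mod p.
for a in range(1, p):
    inverse = pow(a, p - 2, p)
    assert (a * inverse) % p == 1
    print(f"{a}^-1 mod {p} = {inverse}")
```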

One of the first things you learn when working your way into this arithmetic is that we have two types of elliptic curves: singular and non–singular elliptic curves. Later, a third class gets added, so you have three classes of curves. Singular elliptic curves intersect themselves.
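One concrete way to tell the two types apart, sketched in Python with my own example curves rather than the book’s figures: a cubic in short Weierstrass form, y² = x³ + ax + b, is singular exactly when its discriminant is zero.

```python
def discriminant(a, b):
    # Discriminant of y^2 = x^3 + a*x + b (short Weierstrass form).
    return -16 * (4 * a**3 + 27 * b**2)

# Non-singular: y^2 = x^3 - x  ->  discriminant != 0, a genuine elliptic curve.
print(discriminant(-1, 0))    # 64

# Singular: y^2 = x^3 - 3x + 2 = (x - 1)^2 (x + 2)  ->  discriminant == 0,
# and the curve crosses itself at the point (1, 0).
print(discriminant(-3, 2))    # 0
```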

I’ve highlighted the two tangents where the curve intersects itself. Keep in mind that the curve we are looking at is a projection, a compression. It is hiding a fact from us. It is practicing its magician’s act. The intersection shown in this view is an intersection, not a union. The curve is one curve, not an aggregate of two curves.

A parameter, time (t), was added. The parameter deconstructs the intersection by decompressing it. Before that parameterization, the situation was Boolean, aka an intersection existed. It was a bit. After parameterization, it is a space, like the difference between a point and an interval in probability. We have also moved our point of view from above the curve on the xy–plane to the z–axis being southeast of our former point of view. We had to move. As product managers, our company, as a signal processor, had to change its perspective, its point of view. We moved; we saw. Given the red elliptic curve in the xy–plane, the green line is perpendicular to the xy–plane. The light green line is not the y–axis, my bad.
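Here is a minimal sketch of that decompression, assuming the standard nodal cubic y² = x²(x + 1) as the self-intersecting curve, since the post’s figure isn’t reproduced here. The parameter t pulls the single crossing point apart into two distinct moments, t = −1 and t = +1.

```python
import numpy as np

# Nodal cubic y^2 = x^2 (x + 1): one curve that crosses itself at the origin.
# A standard parameterization (set y = t*x and solve): x = t^2 - 1, y = t*(t^2 - 1).
def point(t):
    return t**2 - 1, t * (t**2 - 1)

# The node (0, 0) is visited twice, at two different "times".
print(point(-1.0))   # (0.0, 0.0)
print(point(+1.0))   # (0.0, 0.0)

# Sanity check: every parameterized point lies on the curve.
for t in np.linspace(-2.0, 2.0, 9):
    x, y = point(t)
    assert abs(y**2 - x**2 * (x + 1)) < 1e-9
```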

What is true for this intersection is true for many other intersections. You can imagine an intersection that you drive through every day on the way home. We don’t intersect there, because the traffic lights keep us apart temporally and physically. The black line in the parameterization looks to be a Bézier curve, another parameterization. Compression collapses the temporal dimension. This is a trick from projective geometry that the author of the book uses often. It simplifies things. But, at times, it oversimplifies things. As product managers, the technology adoption lifecycle (TALC) is organized around task sublimation, but we are oversimplifying things in the pursuit of popularity, rather than task sublimation. Yes, most of our users are not experts, yet that expertise is what brought our products to life.
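Since the black line reads as a Bézier curve, here is a minimal sketch of a Bézier curve as just another parameterization over t; the three control points are mine, chosen only for illustration.

```python
import numpy as np

def quadratic_bezier(p0, p1, p2, t):
    # B(t) = (1-t)^2 * P0 + 2(1-t)t * P1 + t^2 * P2, for t in [0, 1].
    p0, p1, p2 = map(np.asarray, (p0, p1, p2))
    return (1 - t)**2 * p0 + 2 * (1 - t) * t * p1 + t**2 * p2

# Hypothetical control points; t plays the same role as the time parameter above.
for t in np.linspace(0.0, 1.0, 5):
    print(t, quadratic_bezier((0, 0), (1, 2), (2, 0), t))
```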

That z–axis is something that we can own exclusively. Likewise that Bézier curve. Both make significant impacts on the commodities we sell in the continuous innovation portion of the TALC. We have to invent those parameterization axes. Invent, as in file patents and engage in trade secret practices.

In the next figure, I tried to show the projections involved.

I used light purple lines to project the front of the elliptic curve and its tails to the parameterization. The tails of the diagram look correct, but the thick black line is not touching the front of the purple line projected from the elliptic curve. The orange line is my correction. The pink grid is doing something with the numbers. The thick dark purple line is infinity or the edge of the finite field.

On the TALC, that thick purple line is well to the right of the last phase of the TALC. It is the line that represents oversimplification. The pursuit of simplicity can leave much money on the table, and it leaves our expert users in the past. They will need to seek other products and vendors once our simplifications exit their cognitive models and flows. Beware.

Poincaré Disk

September 13, 2020

The Poincaré disk is one model of hyperbolic space. Try it out here.

Infinity is at the edge of the Poincaré disk, aka the circle. The Poincaré disk is a three–dimensional bowl seen from above. Getting where you want to go requires traveling along a hyperbolic geodesic. And, projecting a future will understate your financial outcomes. Discontinuous innovation happens here.
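Here is a minimal sketch of why infinity sits at the edge, using the standard distance formula for the Poincaré disk model; the sample points marching toward the rim are my own.

```python
import numpy as np

def poincare_distance(u, v):
    # Hyperbolic distance in the unit Poincaré disk:
    # d(u, v) = acosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2)))
    u, v = np.asarray(u, float), np.asarray(v, float)
    delta = np.sum((u - v)**2)
    denom = (1 - np.sum(u**2)) * (1 - np.sum(v**2))
    return np.arccosh(1 + 2 * delta / denom)

# Equal Euclidean steps cost more and more hyperbolic distance near the rim.
origin = (0.0, 0.0)
for r in (0.5, 0.9, 0.99, 0.999):
    print(r, poincare_distance(origin, (r, 0.0)))
```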

A long time ago, I installed a copy of Hyperbolic Pool. I played it once or twice. Download it here. My copy is still installed, so I’d say go there. Alas, it did not work when I tested it from this post. My apologies. Hyperbolic space was a frustrating place to shoot some pool.

I’ve done some research. More to do.

A few things surprised me today. The Wikipedia entry for Gaussian curvature has a diagram of a torus. The inner surface of the hole exhibits negative curvature. The outer surface of the torus exhibits positive curvature. That was new to me.
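A minimal sketch of that diagram’s claim, using the standard torus parameterization with radii R = 2 and r = 1 (my choices): the Gaussian curvature works out to K = cos(v) / (r(R + r cos v)), positive on the outer equator and negative on the inner one.

```python
import numpy as np

R, r = 2.0, 1.0   # distance from the axis to the tube center, and tube radius

def gaussian_curvature(v):
    # Standard result for a torus of revolution; v sweeps around the tube.
    return np.cos(v) / (r * (R + r * np.cos(v)))

print(gaussian_curvature(0.0))      # outer equator: +1/3, positive curvature
print(gaussian_curvature(np.pi))    # inner equator: -1.0, negative curvature
```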

I’ve blogged on tori and cyclides in the context of the long and short tails of pre-normal normal distributions, aka skewed, kurtotic normal distributions that exist before normality is achieved. These happen while the mean, median, and mode have not yet converged. I’ve claimed that the space where this happens is hyperbolic from the random variable’s birth, the Dirac function that appears when the random variable is asserted into existence, and stays hyperbolic until the distribution becomes normal.
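Here is a minimal sketch of that convergence claim; the underlying normal draw and the sample sizes are my assumptions. With a handful of points the sample is visibly skewed and its mean and median disagree; as data accumulates they converge.

```python
import numpy as np

rng = np.random.default_rng(42)
draws = rng.normal(loc=0.0, scale=1.0, size=100_000)

def sample_skewness(x):
    x = np.asarray(x, float)
    return np.mean((x - x.mean())**3) / x.std()**3

for n in (5, 20, 100, 1_000, 100_000):
    x = draws[:n]
    print(f"n={n:>6}  mean={x.mean():+.3f}  median={np.median(x):+.3f}  "
          f"skew={sample_skewness(x):+.3f}")
```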

Here are the site search results for

There will be some redundancy across those search results. In these search results, you will find that I used the term spherical space. I now use the term elliptical space instead.

We don’t ever see hyperbolic space. We insist that we can achieve normality with a few data points. It takes more than 211 data points to achieve normality. We believe the data is in Euclidean “ambient” space. We do linear algebra in that ambient space, not in hyperbolic space. Alas, the data is not in ambient space. The space changes. Euclidean space is fleeting: waiting at n-1, arriving at n, departing at n+1, but computationally convenient. Maybe you’ll take a vacation, so the data collection stalls, and until you get back, your distribution will be stuck in hyperbolic space waiting, waiting, waiting to achieve actual normality.
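I won’t vouch for the 211-point figure here, but a minimal sketch, with sample sizes and trial counts of my choosing, shows how unreliable small samples are: the spread of the sample skewness across repeated small samples is wide and narrows only slowly as n grows.

```python
import numpy as np

rng = np.random.default_rng(7)

def skewness(x):
    return np.mean((x - x.mean())**3) / x.std()**3

for n in (10, 50, 211, 1_000):
    skews = [skewness(rng.normal(size=n)) for _ in range(2_000)]
    print(f"n={n:>5}  skewness spread (std over 2,000 trials) = {np.std(skews):.3f}")
```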

Statistics insists on the standard normal. We assert it. Then, we use the assertion to prove the assertion.

Machine learning, being built on neurons and neural nets, insists on the ambient space because Euclidean space is all its neurons and neural nets know. Euclidean space is convenient. Curvature in machine learning is all kinds of inconvenient. Getting funded is not just a convenience. It might be the wrong thing to do, but we do much wrong these days. Restate your financials so that the numbers for the future, taken from elliptical space, paint a richer future than the hyperbolic numbers your accounting system just gave you.

And one more picture. This one is from an n-dimensional normal, a collection of Hopf Fibered Linked Tori. Fibered, I get, but I’ve stayed out of it so far. Linked happens, but I’ve yet to read all about it.

The thin torus in the center of the figure results from a standard normal in Euclidean space. Its distribution is symmetrical. Both of its tails are on the same dimensional axis of the distribution. They have the same curvature. The rest of the dimensions have a short tail and a long tail. Curvature is the reciprocal of the radius. The fatter portions of the cyclides represent the long tails. Long tails have the lowest curvatures. The thinner portions of the cyclides represent the short tails. Short tails have the highest curvatures. Every dimension has two tails, in the sense that we can only visualize in 2-D. These tori and cyclides are defined by their tails.
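The curvature claim in plain numbers, a minimal sketch with made-up tail radii: curvature is the reciprocal of the radius, so the long tail gets the low curvature and the short tail gets the high one.

```python
# Hypothetical radii for the long tail and the short tail of one dimension.
long_tail_radius, short_tail_radius = 4.0, 0.5

print(1 / long_tail_radius)    # 0.25  -- long tail, low curvature
print(1 / short_tail_radius)   # 2.0   -- short tail, high curvature
```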

Keep in mind that the holes of the tori and cyclides are the cores of the normals. The cores are not dense with data. Statistical inference is about tails. And regression to the tails is about tails, but in the post-Euclidean, elliptical-space, n+m+1 data point sense. One characteristic of regression to the tails, aka thick-tailed distributions, is that their cores are much more dense than that of the standard normal.
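A minimal sketch of that last claim, with my own stand-in for a thick-tailed distribution, a Student-t with 3 degrees of freedom rescaled to unit variance: once the two distributions share a variance, the thick-tailed one packs noticeably more of its mass into the core.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

normal = rng.normal(size=n)
heavy = rng.standard_t(df=3, size=n) / np.sqrt(3.0)   # rescale t(3) to unit variance

core = 0.25  # an arbitrary "core" half-width
print("normal core mass:      ", np.mean(np.abs(normal) < core))
print("thick-tailed core mass:", np.mean(np.abs(heavy) < core))
```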

Hyperbolic space will only show up on your plate if you are building a market for a discontinuous innovation. Almost none of you do discontinuous innovation, but even continuous innovation involves elliptical space, rather than the ambient Euclidean space or the space of machine learning. We pretend that Euclidean space is our actionable reality. Even with continuous innovation, the geometry of that space matters.

Enjoy!