Central Limit Theorem
I watched some YouTube videos on the central limit theorem in which, according to that theorem, a population can be sampled with samples of size 30. The presenter implied that you could take as many such samples as needed to cover the population. But the point was that each sample would have 30 entities in it.
I don’t know, but 30 seems too small. It is nowhere near 2^11. Even 2^11 is too small to get us to a symmetric normal, one without skew and kurtosis.
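The presenter's claim is easy to check with a quick simulation. This is a sketch of my own, not the presenter's code; the exponential parent population and the counts are assumptions chosen to make the skew obvious. Even with a badly skewed parent, the means of size-30 samples cluster tightly around the population mean:

```python
import random
import statistics

# A skewed parent population: exponential, far from normal.
random.seed(42)
population = [random.expovariate(1.0) for _ in range(100_000)]

def sample_means(pop, n, trials):
    """Draw `trials` samples of size n and return their means."""
    return [statistics.mean(random.sample(pop, n)) for _ in range(trials)]

means = sample_means(population, n=30, trials=2_000)

# The means cluster near the population mean (1.0 for a rate-1
# exponential), and their spread shrinks roughly as sigma/sqrt(n).
print(round(statistics.mean(means), 2))
print(round(statistics.stdev(means), 2))
```

The distribution of those means is far more symmetric than the parent, which is the theorem's point, whatever one thinks of 30 as a magic number.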
I drew a picture with my blunt tools. I didn’t use a sphere packing algorithm. I just drew a multiplication table. A surprise jumped out.
A normal has a circular footprint. A normal sits inside a square. So what shape are we talking about here? How do we get to 30? 5^2, or 25, is too small, and 6^2, or 36, is too large. We are talking people here, so we cannot, say, have a side of the square root of 30 people.
The red lines are ellipses sitting inside rectangles. They are not normals yet. They are pre‑normals or long‑post‑normals. They are either hyperbolic or spherical. And, somehow, according to the central limit theorem, when added together, they add up to a standard normal. That implies that their mean is 0 and their standard deviation is 1.
A circle implies the absence of a correlation. A rectangle implies the presence of a correlation, or a bias.
Notice that the samples for 1 and 36 are outside the circle. They are omitted from the population. Oh well.
Synthetic Data
Mashhood Ahmed’s discussion of synthetic data came up again out on LinkedIn. See the discussion.
Project management has long been a research topic. Software engineering research is similar. The data justifying and validating the practices has existed for a long time; it is there if anybody goes looking for it. Yes, maybe your way is different. But getting consistent data for yourself and your organization should be simple. Capture it. Analyze it. Integrate it.
Once you know the parameters and the constraint envelopes, you can generate synthetic data on those parameters and constraints. You can run a real project using those synthetic parameters and constraints and then see what your organization delivers. Capturing your outcomes lets you forecast where the parameters and constraints will take you before you go.
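A minimal sketch of that idea, under assumptions of my own: a hypothetical set of captured task durations, a simple normal fit for the parameters, and a floor as the constraint envelope. None of this is a prescription for which distribution to fit; it only shows the capture-then-generate loop.

```python
import random
import statistics

random.seed(1)

# Hypothetical captured task durations (days) from past projects.
observed = [3.1, 4.7, 2.8, 5.2, 3.9, 4.4, 6.0, 3.3, 4.1, 5.5]

# Estimate the parameters; here, a plain normal fit.
mu = statistics.mean(observed)
sigma = statistics.stdev(observed)
FLOOR = 1.0  # constraint envelope: no task finishes in under a day

def synthetic_durations(n):
    """Generate n synthetic durations on the fitted parameters,
    clipped to the constraint envelope."""
    return [max(FLOOR, random.gauss(mu, sigma)) for _ in range(n)]

batch = synthetic_durations(1_000)
print(round(statistics.mean(batch), 1))
```

Run a project plan against a batch like this, compare the synthetic forecast with what the organization actually delivers, and fold the actuals back into `observed` for the next round.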
When I talk about the technology adoption lifecycle, I know what my distributions will look like before I have any customers. I know what the processes are going to be. I know the evolution. My current obsession with regressions to the tail is just a matter of knowing what I can expect and knowing what to do about it when I see it coming. This business of financial turbulence bleeding into adjacent processes and dependent processes is trouble. How do you put that in a box? How do you deal with the coupling and cohesion of that turbulent system? How do you make it an object?
I build my long tail representing my application as I build my application. Use tells me about my tails. Does the tail match up with the requested requirements? The resulting tails validate the survey data that led to the requirements. The resulting tails confirm marketing organizations’ delivery of the appropriate users. With a user interface, there is plenty of ongoing data collection. Call them surveys if you like.
In too many correlation classes, a given correlation is arbitrarily thrown away. A correlation is actually a component of a tail. There are many tails. And, I’ve seen one system of correlations get replaced by another under the assumption that there is only one tail. When a UI control asks you to select one of three possible choices, there are three tails. Or, is that choice just pointless data to be stored? Is that choice eventually expressed by some component of the system? Three choices give you three different probabilities, and three departing Markov chains of probabilities to add to the tail of the predecessor that, assuming only one tail, led us to that control. In AI, the overall UI would have been a small world.
Knowing your expected distributions and putting synthetic data in them should not be a problem.
Open Source Software
Today, I came across a job description. They wanted a product manager for a product that aims to replace Dreamweaver. The product was written for programmers. We used Dreamweaver. Were we programmers? The product is open source software.
Open-source software development is supposed to deliver better software than other development processes. It does this because the programmer is a member of the user community. That programmer knows the carried content. Most programmers know the carrier but have to be taught the carried content. These latter programmers are not users. Those two types of programmers present us with very different propositions.
Hell, I remember an organization that produced carrier type products. That company defined the world. They wanted to become a product company. That meant listening to the outside world, listening to users and others that defined their world, their carried content. In the end, they could not make that leap.
Barcodes and Persistent Homology
In my YouTube watching, I revisited barcodes. I got it this time. Start with a collection of points. Each of those points is the center of a circle of a given radius. All the circles are the same size. Increase the radius of all those circles. At some radius, the circles begin to overlap. Continue to increase the radii. Some space gets surrounded. That surrounded space is a hole. When that happens, a hole is born. Continuing to increase those radii, the surrounded space, the hole, disappears at some radius. That marks the death of the hole.
The barcode starts at the radius where the hole is born. The barcode ends at the radius where the hole dies. That hole is exhibiting a lifecycle. That hole is a topological hole, not an algebraic hole. Tori and cyclides are topological structures that have holes. They show up in the curvatures of the tails around a normal over the lifecycle of that normal. Barcodes tell us how large the holes in those topological structures happen to be.
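Computing the barcodes for actual holes (dimension-1 homology) takes a persistent homology library such as GUDHI or Ripser. But the same birth-and-death bookkeeping can be sketched by hand for connected components (dimension 0): every point is born at radius 0, and a component dies when the growing circles merge it into another. This is my own toy point cloud, two small clusters, and a union-find over edges sorted by length:

```python
from itertools import combinations
import math

# Hypothetical point cloud: two clusters of three points each.
points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5.5, 6)]

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

# Union-find over points; parent[i] tracks each point's component.
parent = list(range(len(points)))

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving
        i = parent[i]
    return i

# Process edges shortest-first, as the growing radius would reach them.
edges = sorted((dist(p, q), i, j)
               for (i, p), (j, q) in combinations(enumerate(points), 2))

bars = []  # (birth, death) for each component that dies in a merge
for d, i, j in edges:
    ri, rj = find(i), find(j)
    if ri != rj:
        parent[ri] = rj
        bars.append((0.0, d / 2))  # circles of radius d/2 first touch

for birth, death in bars:
    print(f"[{birth}, {round(death, 3)})")
```

The short bars are the merges inside each cluster; the one long bar is the merge between the clusters, and its length is what makes the two-cluster structure visible in the barcode.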
In the figure above, on the left, I put in a point cloud of blue points. These points are in a multidimensional space. I drew my first circles of radius r1. I cheated and moved the points so they enclosed that big red space, our hole. The blue points generated a deformed torus. On the right, I drew circles six points larger. That is radius r2. I overlaid those circles on the earlier ones. I failed to cover the hole. You can still see a red area in the center. Radius r2 needs to be one point larger. I added that point as shown. I did test the figure on the left, but my tools subtracted two points instead of one.
If the points came from survey data for a particular requirement, the barcode would show you the requirement’s lifecycle. In the figure on the left, removing point A representing a customer or user would prevent that hole’s birth.
If you could find the rate involved in the process of moving from r1 to r2, you could put a date on the birth and death of that requirement. The radius r1 tells you how much time you have to deliver that requirement from the start of the demand in your survey data.
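If you assume a fixed growth rate for the radius, the conversion from a bar's endpoints to calendar dates is just arithmetic. The rate, the survey date, and the bar endpoints below are all hypothetical numbers of my own, purely illustrative:

```python
from datetime import date, timedelta

# Assumed: the filtration radius grows with demand at a fixed rate.
RATE = 0.05                    # radius units per day (not measured)
survey_date = date(2024, 1, 15)  # when the survey data was collected

def radius_to_date(r):
    """Map a barcode radius to a calendar date at the assumed rate."""
    return survey_date + timedelta(days=round(r / RATE))

birth_r, death_r = 0.6, 2.1    # hypothetical bar endpoints
print(radius_to_date(birth_r))  # when demand for the requirement is born
print(radius_to_date(death_r))  # when that demand dies out
```

The gap between the survey date and the birth date is the delivery window the section describes: the time you have to ship the requirement before the demand it answers arrives.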
Enjoy!