From Time Series to Machine Learning

December 4, 2017

This post, “Notes and Thoughts on Clustering,” on the Ayasdi blog brought me back to some reading I had done a few weeks ago about clustering. It was my kind of thing. I took a time series view of the process. Another post on the same blog, “The Trust Challenge–Why Explainable AI is NOT Enough,” boils down to knowing why the machine learning application learned what it did, and where it went wrong. Or, to make it simpler, why did the weights change? Those weights change over time, hence the involvement of time series. Clustering likewise changes in various ways as n changes, with n standing in for time, so again a time series is involved.

Time is what blew those supposedly random mortgage packages up. The mortgages were temporally linked, not random. That was the problem.

In old 80’s style expert systems, the heuristics were mathematics, so for most of us the rules, the knowledge was not transparent to the users. When you built one, you could test it and read it. It couldn’t explain itself, but you could or someone could. This situation fit rules 34006 and 32,***. This is what we cannot do today. The learning is statistical, but not so transparent, not even to itself. ML cannot explain why it learned what it did. So now there is an effort to get ML to explain itself.

Lately, I’ve been looking at time series in ordinary statistics. When you have fewer than 36 data points, the normal is a bad representation. The standard deviations expand and contract depending on where the next data point is. And, the same data point moves the mean. Then, there is skew and kurtosis. In finance class, there is skew risk and kurtosis risk. I don’t see statistics as necessarily a snapshot thing, only done once you have a mass of data. Acquiring a customer happens one customer at a time in the early days of a discontinuous innovation in the bowling alley. We just didn’t have the computing power in the past to animate distributions over time or by each data point. We were asked to shift to the Poisson distribution until we were normal. That works very well because the underlying geometry is hyperbolic, which explains why investors won’t put money on those innovations. The projections into the future get smaller and smaller the further out you go. The geometry hides the win.

It turns out there is much to see. See the “Moving Mean” section in the “Normals” post for a normal shifting from n=1 to n=4. Much changes from one data point to the next.

I haven’t demonstrated how clustering changes from one data point to the next. I’ll do that now.

Clustering DP1

At n=1, we have the first data point, DP1. DP1 is the first center of the first cluster, C1. The radius would be the default radius before any iterating that radius to some eventual diameter. It might be that the radius is close to the data point or at r=1.

At the next data point, DP2, it could have the same value as DP1. If so, the cluster will not move. It will remain stationary. The density of the cluster would go up. But, the standard deviation would be undefined.

Or, DP2 would be different from DP1 so the cluster will move and the radius might change. A cluster can handily contain three data points. Don’t expect to have more than one cluster with fewer than four data points.

Clustering DP2

At n=2, both data points would be in the first cluster. Both could be on the perimeter of the circle. The initial radius would be used before that radius would be iterated. With two points, the data points might sit on the circle at the widest width, which implies that they sit on a line acting as the diameter of the circle, or they could be closer together, nearer the poles of the circle or sphere. C2 would be a calculated point, CP2, between the two data points, DP1 and DP2. The center of the cluster moves from C1 to C2, also labeled as moving from DP1 to CP2. The radius did not change. Both data points are on a diameter of the circle, which means they are as far apart as possible.

The first cluster, CL1, is erased. The purple arrow indicates the succession of clusters, from cluster CL1 centered at C1 to cluster CL2 centered at C2.

P1 is the perimeter of cluster CL1. P2 is the perimeter of cluster CL2. It takes a radius and a center to define a cluster. I’ve indicated a hierarchy, a data fusion, with a tree defining each cluster.

With two data points the center, C2 and CP2, would be at the intersection of the lines representing the means of the relevant dimensions. And, there would be a standard deviation for each dimension in the cluster.
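As a minimal sketch of that calculation, assuming two-dimensional data points with made-up coordinates (the post gives none), the center can be taken as the per-dimension mean and the spread as a per-dimension population standard deviation. The function names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical 2-D data points; the coordinates are illustrative only.
dp1 = (2.0, 4.0)
dp2 = (6.0, 10.0)

def cluster_center(points):
    """Center of the cluster: the mean of each dimension."""
    return tuple(mean(axis) for axis in zip(*points))

def cluster_spread(points):
    """Population standard deviation of each dimension."""
    return tuple(pstdev(axis) for axis in zip(*points))

cp2 = cluster_center([dp1, dp2])
sigmas = cluster_spread([dp1, dp2])
print(cp2)     # center between DP1 and DP2
print(sigmas)  # one standard deviation per dimension
```

With these two points the center lands halfway between them on each axis, which is the "calculated point between the two data points" idea.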

New data points inside the cluster can be ignored. The center and radius of the cluster do not need to change to accommodate these subsequent data points. The statistics describing the cluster might change.

A new data point inside the cluster might be on the perimeter of the circle/sphere/cluster. Or, that data point could be made to be on the perimeter by moving the center and enlarging the radius of the cluster.

The new data point inside the cluster could break the cluster into two clusters both with the same radius. That radius could be smaller than the original cluster. Overlapping clusters are to be avoided. All clusters are supposed to have the same radius. In the n=3 situation, one cluster would contain one data point, and a second cluster would contain two data points.

A new data point outside the current cluster would increase the radius of the cluster or divide into two clusters. Again, both clusters would have the same radius. That radius might be smaller than the original cluster.
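To watch clusters change one data point at a time, here is a rough sketch of a simple sequential, fixed-radius assignment rule. This is a swapped-in simplification, not the procedure used for the figures: a new point joins the nearest cluster whose center is within the radius, otherwise it starts a new cluster. It never resizes or splits a cluster. The points, the radius of 1.5, and the function name `assign` are all assumptions for illustration.

```python
import math

def assign(clusters, point, radius):
    """Add `point` to the nearest cluster whose center lies within `radius`;
    otherwise start a new cluster. Each cluster is a list of points."""
    def center(pts):
        return tuple(sum(axis) / len(pts) for axis in zip(*pts))

    best = None
    for pts in clusters:
        d = math.dist(center(pts), point)
        if d <= radius and (best is None or d < best[0]):
            best = (d, pts)
    if best is None:
        clusters.append([point])   # too far from every center: new cluster
    else:
        best[1].append(point)      # the center shifts toward the new point
    return clusters

clusters = []
# Hypothetical data points arriving one at a time.
for dp in [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5), (5.0, 5.0)]:
    assign(clusters, dp, radius=1.5)
    print(len(clusters), [len(c) for c in clusters])
```

In this run the first three points share one cluster and the fourth, far away, starts a second one.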

Clustering DP3

With n=3, the center of the new cluster, C3, is located at CP3. CP3 would be on the perimeter of the cluster formerly associated with the first data point, DP1. The purple arrows indicate the overall movement of the centers. The purple numbers indicate the sequence of the arrows/vectors. We measure radius 3 from the perimeter of the third cluster and associate that with CP3, the computed center point of the third cluster, CL3.

Notice that the first cluster no longer exists and was erased, but remains in the illustration in outline form. The data point DP1 of the first cluster and the meta-data associated with that point are still relevant. The second cluster has been superseded as well but was retained in the illustration to show the direction of movement. The second cluster retains its original coloring.

Throughout this sequence of illustrations, I’ve indicated that the definition of distance is left to a metric function in each frame of the sequence. These days, I think of distributions prior to the normal as operating in hyperbolic space; at the normal, the underlying space becomes Euclidean; and beyond the normal, the underlying space becomes spherical. I’m not that deep into clustering yet, but n drives much.

Data points DP1 and DP2 did not move when the cluster moved to include DP3. This does not seem possible unless DP1 and DP2 were not on a diameter of the second cluster. I just don’t have the tools to verify this one way or another.
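One way to check this is to compute the smallest circle that holds all three points and compare its radius to the fixed cluster radius. The sketch below assumes two dimensions, a fixed radius of 1, and made-up coordinates; `enclosing_radius` is a hypothetical helper. If DP1 and DP2 sit at opposite ends of a diameter, the only circle of that radius containing both is the one they already define, so any DP3 outside it forces the radius to grow.

```python
import math
from itertools import combinations

def enclosing_radius(p1, p2, p3):
    """Radius of the smallest circle that contains three 2-D points."""
    pts = [p1, p2, p3]
    candidates = []
    # Case 1: two of the points form a diameter and the third falls inside.
    for a, b in combinations(pts, 2):
        center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
        r = math.dist(a, b) / 2
        if all(math.dist(center, p) <= r + 1e-12 for p in pts):
            candidates.append(r)
    if candidates:
        return min(candidates)
    # Case 2: all three points sit on the circle (circumradius = abc / 4K).
    a, b, c = math.dist(p2, p3), math.dist(p1, p3), math.dist(p1, p2)
    area = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
               - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2
    return a * b * c / (4 * area)

r = 1.0            # hypothetical fixed cluster radius
dp3 = (0.0, 1.5)   # hypothetical new point outside the unit circle

on_diameter = enclosing_radius((-1.0, 0.0), (1.0, 0.0), dp3)
off_diameter = enclosing_radius((-0.5, 0.0), (0.5, 0.0), dp3)
print(on_diameter > r)    # the radius would have to grow
print(off_diameter <= r)  # a circle of radius r can hold all three
```

Under these assumptions the suspicion holds: with a fixed radius, the cluster can only move to take in DP3 if DP1 and DP2 were not on a diameter.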

The distance between the original cluster and the second was large. The distance is much smaller between the second and third clusters.

This is the process, in general, that is used to cluster those large datasets and their snapshot view. Real clustering is very iterative and calculation intensive. Try to do your analysis with data that is normal. Test for normalcy.

When I got to the fourth data point, our single cluster got divided into two clusters. I ran out of time revising that figure to present the next clusters in another frame of our animation. I’ll revise the post at a later date.

More to the point an animated view is a part of achieving transparency in machine learning. I wouldn’t have enjoyed trying to see the effects of throwing one more assertion into Prolog and trying to figure out what it concluded after that.





November 27, 2017

Unit of Measure

Back in an earlier post, A Quick Viz, Long Days, I was wondering if the separate areas on a graphic were caused by the raster graphics package I was using, or if they were real. If a pixel is your unit of measure, then the discontinuities are real. The unit of measure drives the data. So yes, those disconnected areas would be Poisson distributions tending to the normal as the units of measurement get smaller.

In this figure, I changed the unit of measure used to measure the top shape. I increased the size of the unit square moving down the page. Then, for each of the measured shapes, I counted complete units and used Excel to give me a moving mean and standard deviation with time (n) moving left to right on each figure. In the first measurement, I generated a histogram of the black numbers below the shape.

A graph of the moving averages appears above each shape in gray. A graph of the moving sigmas appears above each shape in black. This helps us see the maximum or minimum sigmas and means. It also reveals uninominal to multinominal structure, or how many normals are involved. In all cases, the means were uninominal involving a single normal. The results from the smallest pixel show that the sigma was binominal. The middle pixel resulted in three sigmas as the distribution was trinominal. The largest pixel resulted in a uninominal. In all three cases, the shape generated skewed distributions.

No time series windows were used.

Where the data was smaller than a pixel, it is highlighted in red and omitted from the pixel counts. You can see how the data was reduced each time the pixel size went up. The grids imposing the pixelizations were not applied in a standard way. We did not have an average when the grids were applied. The red pixels could be counted with Poisson distributions. They are waiting to trend to the normal. Or, they could be features waiting for validation. In a discontinuous innovation portfolio, they could be lanes in the bowling alley waiting for their client’s period of exclusion to expire, or waiting to cross the chasm. Continuous innovations do not cross Moore’s chasm. Continuous innovations might face scale chasms or downmarket moves via disruption or otherwise. All of these things impede progress through the customer base. They would be red. Do you count them or not?
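The effect of the unit size on the counts can be sketched in a few lines. This assumes the shape is measured column by column and that only whole units are counted. The column heights and unit sizes below are made up, and `complete_units` is a hypothetical name.

```python
# Hypothetical column heights of a shape, in arbitrary length units.
heights = [0.8, 2.5, 4.1, 6.0, 3.2, 1.4, 0.6]

def complete_units(heights, unit):
    """Whole units that fit in each column; partial units are dropped."""
    return [int(h // unit) for h in heights]

for unit in (1.0, 2.0, 4.0):
    counts = complete_units(heights, unit)
    omitted = sum(1 for c in counts if c == 0)  # the "red" columns
    print(unit, counts, "total:", sum(counts), "omitted columns:", omitted)
```

As the unit grows, more columns fall below one unit and drop out of the count, which is the data reduction described above.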

Grids have size problems just like histogram bins.

A Moving Mean

When you first start collecting data each data point changes the normal massively. We hide this by using a large amount of data after the fact, rather than like a time series building out a normal towards the standard normal, or a Poisson distribution and increasing the number of data points until the normal is achieved.

When watching a normal go from 1 to n, it matters where the next data point comes from. If the data point is the third or more, it will be inside or outside the core, or, as an outlier, outside the distribution entirely. In the core, an area defined by being plus or minus one sigma, one standard deviation from the mean, the density goes up, the sigma might shrink. That sigma won’t get wider. Outside the core, in the tail, the sigma might get wider. The sigma won’t get narrower. These would change the circumference of the circle representing the footprint of the normal. An outlier makes the normal wider. That outlier would definitely move the mean.
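A small sketch makes the core-versus-tail behavior concrete. It assumes a population standard deviation and three made-up data points; `sigma_after` is a hypothetical helper.

```python
from statistics import mean, pstdev

data = [10, 20, 30]            # hypothetical starting data
mu, sigma = mean(data), pstdev(data)

def sigma_after(data, x):
    """Population standard deviation once x is added to the data."""
    return pstdev(data + [x])

inside_core = 22   # within one sigma of the mean
in_tail = 60       # well outside the core

print(round(sigma, 2))
print(round(sigma_after(data, inside_core), 2))  # smaller than sigma
print(round(sigma_after(data, in_tail), 2))      # larger than sigma
print(mu, mean(data + [in_tail]))                # the mean moves toward the tail point
```

For the population sigma, a point inside the core always pulls the sigma in. A point has to be slightly more than one sigma out, by a factor of sqrt((n+1)/n), before the sigma widens, so the boundary is close to the edge of the core but not exactly on it.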

So what is the big deal about moving the mean? It moves the core. It’s only data. No. That normal resulted from the sum of all the processes and policies of the company. A population makes demands of the company and the product. When the core moves, some capabilities are no longer needed, some attitudes are no longer acceptable. On the financial side of the house, skew risk and kurtosis risk are real. When the core moves, the tails move. The further the core moves, the further the tail moves in the direction of the outlier.

Sales is a random process. Marketing is not. We don’t much notice this when we are selling commodity goods, but with a discontinuous innovation, that outlier sale has many costs that we have never experienced. The technology adoption lifecycle is only random when you pick where you start, your initial position, in the middle and work towards the death of the category. Picking the late mainstream phase because it’s all you know, leaves a lot of money on the table and rushes that population to the buy before the business case they need to see is ready to be seen. But, picking late mainstream also means you’re fast following. Don’t worry. The innovation press will still call your company innovative. Hell, yours is purple and the market leader’s version is brown.

But, let’s say you began in the beginning and through the early phases coming out of the tornado as the market leader. You will have gone from a Poisson distribution to the three sigma normal to the six, to the twelve, to more. Your normal will dance around before it sets its anchor at the mean and stays put while it grows outward in sigmas.

That outlier that sales demands and we refused eventually will be reached. Sales just got ahead of itself and cost the company quite a bit trying to build the capabilities the outlier takes for granted.

I sat down with a spreadsheet and sold one customer, built the normal, and sold another, built another normal. That first customer was narrow and very tall. It’s as tall as that normal will ever be. It looks like a Dirac function. Of course, there is no standard deviation when you have a single data point. I fudged the normal by giving it a standard deviation of one. And, the standard normal looks like any other standard normal. Only the measurement scales changed from one normal to the next. The normals get lower and wider as the population gets larger.
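The "lower and wider" behavior follows from the height of a normal density at its mean, 1 / (sigma * sqrt(2 * pi)). Here is a minimal sketch, assuming unit-area curves and illustrative sigma values that are not taken from the figures.

```python
import math

def normal_peak(sigma):
    """Height of a unit-area normal density at its mean."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative sigmas only; doubling sigma halves the peak.
for sigma in (1, 2, 4, 8):
    print(sigma, round(normal_peak(sigma), 4))
```

If the curve is scaled by the number of data points instead of being held to unit area, the heights will differ from these.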

Normal Distribution N eq 1

I did this without a spreadsheet, but I got normals with a kurtosis value, but no skew or kurtosis are produced by those standard normal generators. So this first figure is the first data point. It may be a few weeks until the next sale. Or, this might be a developer’s view of some functionality that certainly hasn’t been validated yet. Internal agilists never dealt with this problem. The unit measure is a standard deviation, a sigma.

Normal Distribution N eq 2 and 3

In the figure above, DP1 is the first data point and the first mean. So I went on to the next data point.

Here, in the figure above, the distribution for the second data point, DP2, is the gold one. The standard deviation was 13. The mean for the gold distribution is represented by the blue line extending to the peak of the gold distribution. The black vertical lines extending upwards to the gold distribution demark the core of the gold normal. In the top-down view, the normal and its core are shown as black circles. With a standard deviation of 13, three standard deviations are 39 units wide.

The next data point, the third data point, DP3, gives us the third mean. This mean is shown as a red line extending to the top of the pink distribution. In the top-down view, this normal and its core are shown as red circles. Notice that the height of this normal is lower than that of the gold normal. Also notice that this new data point is inside the core of the previous normal, so this normal contracts. With a standard deviation of 11, three standard deviations are 33 units wide. The third mean moved, so there is some movement of the distribution.

Horizontally and Vertically Correct

The figure above is illustrative but wrong. The vertical scale is off. So I rescaled the normals generated for the second and third data points. And, a fourth data point was added as an outlier. No normal was generated for it. That would be the next thing to do in this exploration.

The black arrows at the foot of the gold normal show the probability mass flowing into the pink normal. The white area is shared by both distributions.

Where I labeled the mean, median, and mode as the same is not real either. The distribution is not normal. I tried to draw the skewed distribution shown by the numbers from the spreadsheet. Eventually, I left that to the spreadsheet. In a skewed distribution all three numbers separate. The mean is closest to the tail.

In the top-down view, the outer circle is associated with the outlier.

The means moved from 5 to 18 to 20, and to 34 in response to the addition of the outlier at 75. The footprint of the normal expands with the addition of the outlier, and contracts in response to the addition of the third data point at 24.
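These numbers can be reproduced with a running mean and a running population standard deviation. One caution: the value of the second data point is not stated in the post. The 31 used below is inferred from the mean moving from 5 to 18, so treat it as an assumption.

```python
from statistics import mean, pstdev

# DP1 = 5, DP3 = 24, and the outlier = 75 come from the text.
# DP2 = 31 is inferred from the second mean of 18 (an assumption).
data = [5, 31, 24, 75]

for n in range(1, len(data) + 1):
    seen = data[:n]
    print(n, mean(seen), round(pstdev(seen), 2))
```

Rounded, the means come out as 5, 18, 20, and 34, and the sigmas at n = 2 and n = 3 match the 13 and 11 quoted for the gold and pink normals.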

The distribution is like gelatin.

Now, I got out the spreadsheet. I built a histogram and then put the line graph of a normal over it. The line graph doesn’t look normal at all.

Histogram w normal

So I took the normal off.

Histogram wo normal

This showed three peaks, which drove the normal to show us a trinomial that was right or positively skewed. This data has a long way to go before it is really normal. When I tried to hand draw the distribution, it looked left or negatively skewed. Adding the outlier caused this.
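The direction of the skew can be checked numerically. The sketch below uses the population skewness (the mean of the cubed z-scores), which may not be the same estimator a spreadsheet uses. The second data point, 31, is not given in the post. It is inferred from the means quoted earlier and should be read as an assumption.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mu, sigma = mean(values), pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)

# 5, 24, and 75 are from the text; 31 is inferred (an assumption).
data = [5, 31, 24, 75]

print(round(skewness(data[:3]), 2))  # before the outlier at 75
print(round(skewness(data), 2))      # after the outlier at 75
```

With these values the sign of the skew flips from negative to positive once the outlier is added.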

No, I’m not going to add another data point and keep on going. I’ll wait until I get my programmer to automate this animation. I did try to get a blog up for our new company, but WordPress has not gotten easier to use since the last time I set up a blog. Anyway, they told us in statistics class that the normal wouldn’t stabilize below 36 data points. We looked at this. Use a Poisson distribution instead. Set some policy about how many data points you have to have before you call a question answered.

Hypothesis Testing over time

In Agile, the developer wants to get to validation as quickly as possible. Using the distributions at n = 2 and n = 3, we can test a hypothesis. We will test at n = 3 (now) and n = 3 - 1 = 2 (previous). Since n = 3 contracted, we could accept H1 previously and no longer accept H1 now.
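Here is one possible reading of that, sketched in code. It assumes the acceptance region is the core, the mean plus or minus one sigma, which the post does not state. The means and sigmas are the ones quoted for n = 2 and n = 3. The value under test, 7, and the function name `in_core` are made up for illustration.

```python
def in_core(value, mean, sigma):
    """True when value lies within one sigma of the mean."""
    return abs(value - mean) <= sigma

h_value = 7  # hypothetical value under test

previous = in_core(h_value, mean=18, sigma=13)  # distribution at n = 2
now = in_core(h_value, mean=20, sigma=11)       # distribution at n = 3

print(previous, now)  # True False
```

The same value sits inside the core at n = 2 and outside it at n = 3, because the core both moved and contracted.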

I did not compensate for the skew in the original situation. The top-down view shows that, with skew, rejecting a hypothesis depends on direction. In our situation, the mean only moved to the right or the left. With another axis, the future distribution could move up or down, so there is even more sensitivity to skew and kurtosis. And, these sensitivities are financial risks. Sales to outliers translate into skew and kurtosis. These sales can also be costly in terms of, again, the cost of the capabilities needed to service the account.

Beware of subsets. With any given subset, that subset will likewise need 36 or more data points before the normal stabilizes. Skew risk and kurtosis risk will be realized otherwise.


Upmarket and Downmarket

November 4, 2017

A while back I ran across a developer coding for the upmarket. It took me a while to recall what an upmarket move was. Geez. And, when you’re talking upmarket, there is a down market. I don’t think in those terms since they are late main street and the horizontal phase issues. Not my game.


I decided to look at them from the standpoint of the technology adoption lifecycle, so I drew two figures to take a look at them.

Market Definition--Down Market

I drew the downmarket case starting with the technology adoption lifecycle (TALC) as a normal of normals. The company is in the late mainstreet phase. This is usually where a company builds a downmarket strategy. Companies in this phase are on the decline side of the TALC. Growth is really a matter of consuming the market faster and reaching the end of the road, the death of the category, sooner. Growth is a stock market trick. Going downmarket is a way to grow by actually increasing the size of the population that the company is facing.

I labeled the baseline of the TALC “Former.” Then, I drew another line under the TALC. This line should be long enough to contain the population that the company is moving downmarket to capture. I labeled this line “Planned.” Then, I drew a standard normal to sit on this new line extending from the original normal. I did not normalize the new normal.

The current market is a subset of the new down-marketed market. The new market need not be centered at the mean of the current market. The population will be new so the mean and standard deviation could differ. The standard normal view of the TALC assumes a symmetrical distribution. This need not be the case. Having two means does make a mess of the statistics. It might not look like a binomial. It will exhibit some kurtosis. Separating the means will take time and planning. If the company is public, it must provide guidance before making such efforts. Don’t switch before providing those projections to the investors.

I went with having one mean in the figure.

The downmarket effort starts with making the decision. The decision will require some infrastructural changes to the marketing and sales efforts at a minimum. It will also require some UX and code revisions to give the downmarket user relevant interfaces. Simple things become much harder when the user doesn’t have the funds they need. The cognitive model may differ from that of the upmarket. These problems may or may not be an issue with your software. The decision might be made across products, particularly in a company organized around their bowling alley. That could mean that this downmarket might be a permanent element across all products.

After some period of time, the decision to move downmarket will become operational. Sales may continue in the current markets as other sales efforts address the new downmarket or the current market might be deemphasized or delayed. I removed it. I color coded the lost earnings in yellow and notated it with a negative sign (-). I color coded the gained earnings in green and notated it with a positive sign (+). The gained earnings are dwarfed by the lost earnings as the scale of the market grows and subsequently hits the first scale constraint. Then, the downmarket move will stop until the current population and projected population can be supported. Efforts to support the increase in scale can start earlier before the scale constraint generates a crisis.

Beyond the first scale constraint, the gains begin to drown the losses. Then, the next scale constraint kicks in. Once again the downmarket move will stop until the infrastructure can support the needs being generated by the downmarket move.

Beyond the second scale constraint, the losses dry up and the gains continue out until the convergence of the normal with the x-axis happens, aka the death of the category. Another managerial action will need to be taken to further extend the life of the category.

Notice that I moved the baseline downward beyond the second scale constraint. I labeled this “Overshoot.” I did this to make the losses look continuous. Initially, the curve sat on the original downmarket baseline, but this gave a sawtooth-shaped curve. I’m unsure at the time of this writing which representation is better. As shown, the convergence with the baseline of the normal shows up on the “Overshoot” line.

Pricing will drive the speed of the downmarket realization. Pricing might impair the downmarket move. The net result of the downmarket move will be an increase in seats, which turns into an increase in eyeballs, and an extension of the life of the category. Financial results will depend on price, policies, and timeframes.


In the TALC, we usually start in the upmarket and work our way to the downmarket as we move from early (left) to late (right) phases, from growth to decline. Hardly ever does a company move upmarket after being a lower priced commodity.

Market Definition--Up Market

Here I started with the TALC again. I selected a target population, a smaller population, and drew a horizontal line, above which would be the upmarket. The upmarket as a horizontal slice across the normal is shown in yellow and gold. Renormalizing that gets us the green and orange normals. The purple arrow behind the normals provides an operational view as sales grow the eventual standard normal shown in orange. The zeros convey how the market is not growing. The higher prices of an upmarket might shrink the size of the market.

When converting an existing market to a higher price, we can consider the market to be Poisson, eventually a kurtotic normal shown with the gray normals, and finally a standard normal without kurtosis. The figure skips the Poisson distribution and begins with the kurtotic normal. Normals with small populations are taller. They shrink towards the standard normal. When a normal is kurtotic it exhibits a slant which disappears as the kurtosis goes away.

I called all of these changes in the size, shape, and slant of the normal the “Price Dance.” This dance is illustrated with the purple arrows. Once the standard normal is achieved, kurtosis risk is removed. As the standard normal gains sigmas, the risk is reduced further.

The Poisson distribution representing the initial sales at the higher price puts the product back in hyperbolic space. Once the single sigma, standard normal is achieved, the product is in Euclidean space. From the single-sigma standard normal, the sigmas increase. That puts the product in spherical space where the degrees of freedom of strategy and tactics increase, making many winning strategies possible. In the hyperbolic space, those degrees of freedom are less than one. Euclidean space has a single degree of freedom. This implies that the Euclidean space is transitory.

The net result of the upmarket move will be an increase in revenues depending on pricing. The number of seats will remain constant with optimal pricing, which in turn leaves eyeballs unchanged. Upmarket moves shorten the life of the category.


Downmarket moves take a lot of work, more work than an upmarket move. In both cases, the marketing communications will change. Upmarket moves get you more dollars per seat, but you would have to be selling the product. The number of seats does not change or falls with an upmarket move. Downmarket moves get you more seats, more eyeballs, and given pricing, more revenues if any are independent revenues from eyeballs. Downmarket moves extend the life of the category/product/company. Upmarket moves shorten those lives.

Downmarket and upmarket moves are orthodox strategies and tactics. Talk with your CFO. I’d rather keep the lanes of my bowling alley full.


A Quick Viz, Long Days

October 29, 2017

Three days ago, out on Twitter, a peep tweeted a graph that was supposed to show how a market event amounted to nothing. The line graph dropped below the baseline, rose above the baseline, and dropped again to the baseline. It was a quick thing that had me spending the rest of the day, and parts of the following three days, hammering on it.

The peep’s point was that nothing happened. Grab a hammer and join me in building a case showing just how much did happen.

This was their graph. If you’re in a hurry, you won’t notice the net loss.

I rotated the minima so I could see if the loss was completely recovered. It was not. The vertical symmetry is asymmetric. Rotating the minima reveals a gap, labeled A, showing that the upside did not completely recover the value lost during the first downside.

The second downside loss stops at the line labeled B, the new baseline. There is a gap between the initial baseline and the final baseline. The gap between the baselines is larger than the gap between the peaks. I copied the gap between the peaks and put it below the initial baseline to demonstrate that the loss at A did not account for all the loss between the baselines. Subtracting the loss A from the loss between the baselines gives us the gap labeled B.

Notice that the baseline at B moves up slightly. I just saw this after drawing many diagrams. I annotated my error. We will ignore this slight upside. It is just one more thing that the peep and I overlooked. I will remove it from subsequent diagrams.

Going back to the first diagram, we had a downside, an upside, and another downside. The first downside (A) and the second downside (B) account for the difference between the initial and final baselines.

In the figure on the right, I explored the symmetries. The vertical red lines represent the events embedded in the signal. The notation for the symmetry for an event n spans the interval from n-1 to n+1. These spans are shown in gray.

Since I rotated the minima, the symmetry above the signal is actually a vertical (y-axis) symmetry around the origin. I drew purple lines from the vertex at the top to the vertexes at the baseline. Then, I moved the purple lines to the top of the figure. They looked symmetric, but are slightly asymmetric. The left side was three units wide; the right, four units wide.

Both of the horizontal (x-axis) symmetries are asymmetric. The gray box notation demonstrates that these signal components are very asymmetric.

Asymmetries indicate locations where something was learned or forgotten. The repeal of the Glass-Steagall Act often gets cited as one of the causes of the housing crisis. It was a forgetting. In Stewart Brand’s “How Buildings Learn,” buildings learn by accretion. We accrete synapses as we learn. When we put a picture on a wall, the wall learns about our preferences. The next resident may not pull that nail out, so such remodeling artifacts accrete. Our house becomes our home, because we teach our house, and our house learns. So it is with evolution.

Before I created the box notation, I was drawing the upside and downside lines and rotating them to see how much area was involved in each of the asymmetries. I’m using the rotation approach in the figure to the left. I’ve annotated the three asymmetries. The white areas are cores, and the orange areas are tails. The asymmetry annotated at the top of the figure is, again, horizontal. The tail is just a line as the asymmetry is slight. The cores are symmetric about vertical lines, not shown, that represent the events encoded into the signal.

In an earlier figure, I just estimated the area of the tail. When I highlighted that area, because I use MS Paint to draw these things and it dithers, I got a line of green areas, rather than a single area. I numbered them in order. They are labeled as Area Discontinuities. In a sense, they would be Poisson distributions in individual Poisson games. In area 8, those Poisson distributions become a single normal distribution. That normal has more than 32 data points. With 20 data points, that normal can be estimated. In a sense, there is a line through those Poissons and the normal. This is what happens in the technology adoption lifecycle as we move from early adopters, each with their own Poisson game, and sum towards the vertical/domain specific market of which the early adopter is a member. This line is one lane of Moore’s bowling alley.

Where the figure mentions “Slower,” that is just about the slope of that last diagonal, the second loss. The red numbers refer to the earlier unrefined gaps we are now calling A and B.

When there are tails, the normal distribution involved will exhibit kurtosis. I built a histogram of the data in the area that I highlighted in green and then, looked at the underlying distribution along the line through those areas. There seemed to be two tails: one thicker and one thinner. Of course, all of this is meaningless, as it results from the dithering. With a vector rendering, there would only be one more consistent area.

The tiny thumbnail in the middle of the thumbnails at the bottom right of the figure shows a negatively skewed normal, but in another interpretation, the distribution is four separate normals. Where I mentioned theta, the associated angle quantifies the kurtosis.

One more thing is happening where a Poisson distribution finally becomes a normal distribution: the geometry shifts from hyperbolic to Euclidean.




In the next figure, I look at the black swan view of the signal. A black swan is usually drawn as a vertical line cutting off the tail of the normal distribution, labeled Original and highlighted with yellow and light green. Here we are talking generally. In the figure after that, we will use this to show how the three black swans generate the signal that we’ve been discussing. The negative black swan throws away the portion of the distribution remaining beyond the event driving the black swan, then the remaining data is used to renormalize the remaining subset of the original data. The lifetime of the category is reduced. The convergence with the x-axis contracts, aka moves towards the y-axis. The positive black swan moves the distribution down. The normal becomes enlarged, so it sits on the new x-axis below the original baseline. The new distribution includes the light green and green areas in the figure. The lifetime of the category is lengthened. The convergence moves out into the future, aka moves further away from the y-axis.
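The negative black swan case can be sketched numerically: cut the data off at the event, then refit a normal to what remains. The sketch below is only one way to read "renormalize." It assumes an original normal with a mean of 50 and a sigma of 10 and an event at 60, all made-up numbers, and it stands in for real data with evenly spaced quantiles of that normal.

```python
from statistics import NormalDist

original = NormalDist(mu=50, sigma=10)  # hypothetical original distribution

# Deterministic stand-in for data: evenly spaced quantiles of the original.
data = [original.inv_cdf(i / 200) for i in range(1, 200)]

cutoff = 60  # hypothetical event; everything beyond it is thrown away
remaining = [x for x in data if x <= cutoff]

# Refit a normal to the remaining subset of the data.
refit = NormalDist.from_samples(remaining)

print(round(original.mean, 1), round(original.stdev, 1))
print(round(refit.mean, 1), round(refit.stdev, 1))
```

Under these assumptions the refit normal has a smaller sigma, so it is narrower and taller, and its mean shifts away from the discarded tail.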

In the continuous innovation case, the positive black swan will stay aligned with the driving event. The normal distribution is enlarged just enough to converge with the new x-axis below the prior x-axis. In the discontinuous innovation case, the positive black swan would begin at the B2B early adopter phase of the technology adoption lifecycle. In the discontinuous case, the size of the addressable market would drive the size of the normal, and it is not correlated with the prior distribution.

Now we go back to the example we’ve worked on throughout this post. We will apply the black swan concepts to the signal using the diagram below. There are three black swans. A negative black swan generates the first loss. A positive black swan follows with a recovery that does not fully recover the value lost in that first loss. This recovery is followed by another negative black swan that contributes to the net loss summed up by the signal. The normals are numbered 0 through 3. The numbers are to the right of the events, and they are on the baseline of the associated normal. The original distribution (0) is located at the event driving the first black swan. The new distribution (1) is associated with the first loss, the first negative black swan. The x-axis of this black swan is raised above the original x-axis. This distribution lost the projected data to the right of the event, data expected from the future. Renormalizing the distribution makes it higher from peak to the new baseline, and the distribution contracts horizontally. The rightmost convergence of the normal with the x-axis is where the category ends. The leftmost convergence is fixed. The x-axis represents time. The end of the category will arrive sooner unless some other means to generate revenues is found, aka a continuous innovation is found. The first gain, aka the positive black swan, generates a larger distribution (2). The x-axis is lower than the immediately prior x-axis. The convergence moves into the future relative to the immediately prior distribution. This is followed by another loss, the second loss, the second negative black swan. Here the x-axis rises above the previous x-axis. The distribution (3) is renormalized and is smaller than the immediately previous distribution (2).

From a signal perspective, the original signal input was above the output. The black swans move the signal to the line labeled “Restatement.” The shapes of the original and the restatement generate and output the same signal.


Next, we look at the logic underlying the signal. I’ll use the triangle model. In that model, every line is generated by a decision tree represented by a triangle. The x-axis has decision trees, aka triangles, associated with it. Each interval on the x-axis has its own decision tree. The y-axis has its own intervals and decision trees. The events that drove the black swan model drive the intervals and associated decision trees.


The pink triangles represent the y-axis decision trees involved in the losses. The green triangle represents the y-axis decision tree for the gain. The green triangle is higher than the gain, because it does not recover the entire loss from the first loss. I annotated the shortfall. The asymmetry in the vertical axis that we discussed earlier appears on the upper right side of the triangle, which is thicker. This thickness is not constant. The colors and the numbers show the patterns involved on that side of the triangle. The axis of symmetry associated with the green triangle is an average between the baseline of the input signal and the baseline of the output signal. Putting this symmetry axis would increase the asymmetry of the representation.

The erosion would be shown more accurately as subtrees, rather than a single subtree starting at the vertex, like a slice of pie.

On the x-axis, each triangle is shown in blue. The leftmost triangle consists of a blue triangle and a yellow triangle. The blue triangle represents the construction of the infrastructure that generates that interval of the signal. The yellow triangle represents the erosion of that infrastructure. The black swan, the first loss, resulted from that erosion.

Keep in mind that negative black swans reduce the probability, so they move their baselines up vertically. Positive black swans increase the probability, so they move their baselines down vertically.

In the very first figure, I annotated the asymmetries and symmetries. Asymmetries are very important because they inform us that learning is necessary. Asymmetries in the normal distribution show up as kurtosis due to samples being too small to achieve kurtosis-free normality or symmetry.

The vertical orientation of those pink triangles was new to me as I wrote this. They represent the infrastructure to stop loss, a reactive action. The results may appear positive, but in the long run, they represent exposure. These actions will be instanced for the situation being faced. Given that a black swan can happen at any moment, you don’t want to have to invent a response. You want to move from reactive to predictive to proactive time orientations as quickly as possible. Many people see OODA loops as a reactive mechanism. The military trains on the stuff, on the infrastructure, with decision trees being part of that infrastructure. Know before you go. Eliminate or reduce those asymmetries before you get into the field, before the black swan shows up.

The original signal view, the black swan/distribution view, and the logical view are tied together by the red lines representing the events.


I drew another figure that is a bit cleaner about the signal view.


Even if the signal looks like nothing, a net zero, take a closer look. There was much to be seen. Much learning got done to produce the result. Know before you go.





The Mortgage Crisis

September 5, 2017

Last week, I came across another repetition of what passes for an explanation of the mortgage crisis. It claimed that the problem was the propensity of low-quality loans. Sorry, but no. I’m tired of hearing it.

A mortgage package combines loans of all qualities, of all risks. But, being an entity relying on stochastic processes, it must be random. Unfortunately, those mortgage packages were not random. This is the real failing of those mortgage packages. Mortgages happen over time and are temporally organized, as in not random.

The housing boom was great for bankers up to the point where they ran out of high-quality loans. At that point, the mortgage industry looked around for ways to make lower quality loans. Mortgage packages gave them the means. So when fifty loans got sold in a given week, the lender packaged them into one package. Some of those loans were refinancing loans on high-quality borrowers. Rolling other debts into the instrument improved the borrower’s credit but didn’t do much for the mortgage package. Still, the averages worked out; otherwise, throw a few of the pre-mortgage packaging loans, high-quality loans, in there to improve the numbers. A few people had to make payments to their new mortgage holding company. Their problem.

But, the real risk was that all of the original fifty loans originated from the same week. They were temporally organized. That breached the definition of the underlying necessities of stochastic systems. That was the part of the iceberg that nobody could see. That’s the explanation that should be endlessly retweeted on Twitter.
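To make the same-week problem concrete, here is a small simulation sketch. It compares packages of fifty loans that default independently with packages whose loans all share one origination week. The default probabilities, the two-state "good week/bad week" factor, and the function names are all made-up assumptions for illustration, not data from the crisis.

```python
import random
from statistics import mean, pvariance

LOANS = 50        # loans per package, as in the fifty-loan example
TRIALS = 4000     # simulated packages per scenario (assumption)
P_DEFAULT = 0.10  # hypothetical average default probability


def independent_package(rng):
    """Defaults in a package whose loans fail independently of each other."""
    return sum(rng.random() < P_DEFAULT for _ in range(LOANS))


def same_week_package(rng):
    """Defaults when every loan shares the conditions of its origination week."""
    # Hypothetical shared factor: a bad week or a good week, averaging P_DEFAULT.
    p = 0.18 if rng.random() < 0.5 else 0.02
    return sum(rng.random() < p for _ in range(LOANS))


rng = random.Random(42)
independent = [independent_package(rng) for _ in range(TRIALS)]
same_week = [same_week_package(rng) for _ in range(TRIALS)]

print("mean defaults:", mean(independent), mean(same_week))
print("variance:", pvariance(independent), pvariance(same_week))
```

Under these assumptions the averages look about the same, which is why the numbers worked out on paper, while the spread of outcomes for the same-week packages is several times wider.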

Why? Well, we are no longer living in a production economy. You can make money without production. You can make money from the volatility economy. You can make money off of puts and calls and packages of those. That allows you to make money off of your own failures to run a successful business. Just hedge. The volatility economy is a multitude of collections of volatility based on a stochastic system, the stock market. And, with the wrong lessons having been learned about mortgage packages, the regulators want to regulate mortgage packages and other stochastic systems. Or, just make them flat out illegal because they didn’t know how to regulate them. I’m not against regulation. Constraints create wealth. I just see the need for stochastic systems.

Too many stories are wrong, yet endlessly repeated on Twitter. Kodak, …. 3M, …. Only one writer who wrote about Kodak understood the real story. With 3M, their innovation story was long past and still being told when the new CEO gutted the much-cited program.

From the product manager view, where do stochastic systems fit in? The bowling alley is a risk package akin to a mortgage package. But, if you are an “innovative” company much-cited in the innovation press these days, don’t worry, your innovation is continuous. The only innovations showing up in the bowling alley are discontinuous. Likewise, crossing the chasm, as originally defined by Moore, was for discontinuous innovations. Those other chasms are matters of scale, rather than the behavior of pragmatism slices.

But, back on point, we engage in stochastic systems even beyond the bowling alley. A UI control has a use frequency. When it has a bug, that use frequency changes. Use itself is a finite entity unless you work at making your users stay in your functionality longer. All of that boils down to probabilities. So we have a stochastic system on our hands. In some cases, we even have a volatility economy on our hands.


A Different View of the TALC Geometries

August 25, 2017

I’ve been trying to convey some intuition about why we underestimate the value of discontinuous innovation. The numbers are always small, so small that the standard financial analysis results in a no go decision, a decision not to invest. That standard spreadsheet analysis is done in L2, a Euclidean space. This analysis gets done while the innovation is in hyperbolic space so the underestimation of value would be the normal outcome.

In hyperbolic space, infinity is away at the edge, at a distance. In hyperbolic space, the unit measure appears smaller at infinity when viewed from Euclidean space. This can be seen in a hyperbolic tiling. But, we need to keep something in mind here and throughout this discussion: the areas of the circle are the same in Euclidean space. The transform, the projection into hyperbolic space, makes it seem otherwise. That L2 financial analysis assumes Euclidean space while the underlying space is hyperbolic, where small does not mean small.

Hyperbolic Tiling

How many innovations, discontinuous ones, have been killed off by this projection? Uncountably many discontinuous innovations have died at the hands of small numbers. Few put those inventions through the stage-gated innovation process because the numbers were small. The inventors who used different stage gates and pushed on without worrying about the eventual numbers succeeded wildly. But, these days, the VCs insist on the orthodox analysis, typical of the consumer commodity markets, that nobody hits one out of the ballpark and pays for the rest. The VCs hardly invest at all and insist on the immediate installation of the orthodoxy. This leads us to stasis and much replication of likes.

I see these geometry changes as smooth just as I see the Poisson to normal to high sigma normals as smooth. I haven’t read about differential geometry, but I know it exists. Yet, there is no such thing as differential statistics. We are stuck in data. We can use Markov chain Monte Carlo (MCMC) to generate data to fit some hypothetical distribution from which we would build something to fit and test fitness towards that hypothetical distribution. But, in sampling that would be unethical or frowned upon. Then again, I’m not a statistician, so it just seems that way to me.
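For readers who have not seen MCMC generate data, here is a minimal random-walk Metropolis sketch. The target, a standard normal, is my stand-in for "some hypothetical distribution," and the step size, seed, and sample count are arbitrary assumptions.

```python
import math
import random
from statistics import mean, pstdev


def log_target(x):
    """Log-density, up to a constant, of the hypothetical target (standard normal)."""
    return -0.5 * x * x


def metropolis(n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: propose a move, accept with probability min(1, ratio)."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)  # symmetric proposal
        log_ratio = log_target(proposal) - log_target(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal  # accept; otherwise the chain stays where it is
        samples.append(x)
    return samples


samples = metropolis(20000)
print(f"mean ~ {mean(samples):.2f}, std ~ {pstdev(samples):.2f}")
```

The chain only ever needs ratios of the target density, so the generated data drift towards the hypothetical distribution without anyone having collected a single real sample.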

I discussed geometry change in Geometry and numerous other posts. But, in hunting up things for this post, I ran across this figure.

Geometry Evolution

I usually looked at the two-dimensional view of the underlying geometries. So this three-dimensional view is interesting. Resize each geometry as necessary and put them inside each other. The smallest would be the hyperbolic geometry. The largest geometry, the end containment, would be the spherical geometry. That would express the geometries differentially in the order that they would occur in the technology adoption lifecycle (TALC) working from the inside out. Risk diminishes in this order as well.

Geometry Evolution w TALC

In the above figure, I’ve correlated the TALC with the geometries. I’ve left the technical enthusiasts where Moore put them, rather than in my underlying infrastructural layer below the x-axis. I’ve omitted much of Moore’s TALC elements focusing on those placing the geometries. The early adopters are part of their vertical. Each early adopter owns their hyperbola, shown in black, and seeds the Euclidean of their vertical, shown in red, or normal of the vertical (not shown). There would be six early adopter/verticals rather than just the two I’ve drawn. The thick black line represents the aggregation of the verticals needed before one enters the tornado, a narrow phase at the beginning of the horizontal. The center of the Euclidean cylinder is the mean of the aggregate normal representing the entire TALC, aka category born by that particular TALC. The early phases of the TALC occur before the mean of the TALC. The late phases start immediately after the mean of the TALC.

The Euclidean shown is the nascent seed of the eventual spherical. Where the Euclidean is realized is at a sigma of one. I used to say six, but I’ll go with one for now. Once the sigma is larger than one, the geometry is spherical and tending to more so as the sigmas increase.

From the risk point of view, it is said that innovation is risky. Sure, discontinuous innovation (hyperbolic) has more risk than continuous (Euclidean), and commodity continuous (spherical) has less risk still. Quantifying risk, the hyperbolic geometry gives us an evolution towards a singular success. That singular success takes us to the Euclidean geometry. Further data collection takes us to the higher sigma normals, the spherical space of multiple pathways to numerous successes. The latter, the replications, are hardly risky at all.


Nesting these geometries reveals gaps (-) and surpluses (+).





The Donut/Torus Again

In an earlier post, I characterized the overlap of distributions used in statistical inference as a donut, as a torus, and later as a ring cyclide. I looked at a figure that described a torus as having positive and negative curvature.

Torus_Positive_and_negative_curvature


So the torus exhibits all three geometries. Those geometries transition through the Euclidean.

Torus 2

The underlying distributions lie on the torus as well. The standard normal has a sigma of one. The commodity normal has a sigma greater than one. The saddle and peaks refer to components of a hyperbolic saddle. The statistical process proceeds from the Poisson to the standard normal to the commodity normal. On a torus, the saddle points and peaks are concurrent and highly parallel.

Torus 3


The Average, or the Core

August 4, 2017

Tonight I ended up reading some of the Wolfram MathWorld discussion of the Heaviside Step Function among other topics.  I only read some of it like most things on that site because I bump into the limits of my knowledge of mathematics. But, the Heaviside step function screamed loudly at me. Well, the figure did, this figure.


Actually, the graph on the left. The Heaviside step function can look like either depending on what one wants to see or show.

The graph on the left is interesting because it illustrates how the average of two numbers might exist while the reality at that value doesn’t. Yes, I know, not quite, but let’s just say the reality is the top and bottom line, and that H(x)=1/2 value is a calculated mirage. All too often the mean shows up where there is no data value at all. Here, the mean of 0 and 1 is (0+1)/2. When we take the situation to involve the standard normal, we know we are talking about a measurement of central tendency, or the core of the distribution. That central tendency or core in our tiny sample is a calculated mirage. “Our average customer …” is mythic, a calculated mirage of a customer in product management speak.
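A tiny sketch of the calculated mirage, using the two levels of the step function and a made-up list of seats per customer. The seat counts are hypothetical numbers I picked to be bimodal.

```python
from statistics import mean

step_levels = [0, 1]           # the two values the step function actually takes
seats = [2, 3, 2, 40, 38, 41]  # hypothetical seats per customer (made-up data)

for label, data in (("step", step_levels), ("seats", seats)):
    avg = mean(data)
    nearest_gap = min(abs(x - avg) for x in data)
    print(f"{label}: mean={avg}, in data={avg in data}, "
          f"nearest real value is {nearest_gap} away")
```

In both cases the mean lands where there is no observation at all, which is the sense in which the "average customer" is mythic.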

Cute w Nomal

Here I put a standard normal inside the Heaviside step function. Then, I show the mean at the x=1/2 of the Heaviside step function. The core is defined by the inflection points of the standard normal.

The distribution would show skew and kurtosis since n=2. A good estimate of the normal cannot be had with only two data points.

More accurately, the normal would look more like the normal shown in red below. The red normal is higher than the standard normal. The height of the standard normal shown in blue is around 4.0. The height of the green normal is about 2.0. The red normal is around 8.0. I’ve shown the curvature circles generated by the kurtosis of the red distribution. And, I’ve annotated the tails. The red distribution should appear more asymmetrical.

more acurately

Notice that the standard deviations of these three distributions drive the height of the distribution. The kurtosis clearly does not determine the height, the peakedness or flatness of the distribution, but too many definitions of kurtosis define it as peakedness, rather than the height of the separation between the core and the tails. The inflection points of the curve divide the core from the tail. In some discussions, kurtosis divides the tails from the shoulders, and the inflection points divide the core from the shoulders.
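The link between standard deviation and height can be checked directly. A normal density peaks at 1/(sigma·sqrt(2·pi)), and its inflection points sit one sigma either side of the mean, while its excess kurtosis is zero whatever the sigma. The sketch below uses illustrative sigmas of my own choosing, not values read off the figure.

```python
import math
from statistics import NormalDist


def peak_height(sigma):
    """Height of a normal density at its mean: 1 / (sigma * sqrt(2 * pi))."""
    return 1.0 / (sigma * math.sqrt(2.0 * math.pi))


for sigma in (0.5, 1.0, 2.0):  # illustrative sigmas, not read off the figure
    dist = NormalDist(mu=0.0, sigma=sigma)
    lower, upper = dist.mean - sigma, dist.mean + sigma  # inflection points
    print(f"sigma={sigma}: peak={dist.pdf(dist.mean):.3f} "
          f"(formula {peak_height(sigma):.3f}), core from {lower} to {upper}")
```

Halving sigma doubles the peak and doubling sigma halves it, with no change in kurtosis, so height alone says nothing about kurtosis.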

To validate a hypothesis, or bias ourselves to our first conclusion, we need tails. We need the donut. But, before we can get there, we need to estimate the normal when n<36 or we assert a normal when n≥36; otherwise, skew and kurtosis risks will jerk our chains. “Yeah, that code is so yesterday.”

And, remember that we assume our data is normal when we take an average. Check to see if it is normal before you come to any conclusions. Take a mean with a grain of salt.


Another find was an animation illustrating convolution from Wolfram MathWorld “Convolution.” What caught my eye was how the smaller distribution (blue) travels through the larger distribution (red). That illustrates how a technology flows through the technology adoption lifecycle. Invention of a technology, these days, starts outside the market and only enters a market through the business side of innovation.

The larger distribution (red) could also be a pragmatism slice where the smaller distribution (blue) illustrates the fitness of a product to that pragmatism slice.


The distributions are functions. The convolution of the two functions f*g is the green line. The blue area represents “the product g(tau)f(t-tau) as a function of t.” It was the blue area that caught my eye. The green line, the convolution, acts like a belief function from fuzzy logic. Such functions are subsets of the larger function and never exit that larger function. In the technology adoption lifecycle, we eat our way across the population of prospects for an initial sale. You only make that sale once. Only those sales constitute adoption. When we zoom into the pragmatism step, the vendor exits that step and enters the next step. Likewise when we zoom into the adoption phase.
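For a hands-on feel of the convolution, here is a small discrete sketch in plain Python. The two lists are hypothetical discretized shapes standing in for the red and blue functions in the animation. They are not taken from it.

```python
def convolve(f, g):
    """Discrete convolution: (f*g)[t] = sum over tau of g[tau] * f[t - tau]."""
    out = [0.0] * (len(f) + len(g) - 1)
    for t in range(len(out)):
        for tau in range(len(g)):
            if 0 <= t - tau < len(f):
                out[t] += g[tau] * f[t - tau]
    return out


# Hypothetical discretized shapes: a wide "red" function and a narrow "blue" one.
red = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]
blue = [0.25, 0.50, 0.25]

green = convolve(red, blue)
print([round(v, 4) for v in green])
print("total mass:", round(sum(green), 6))
```

Each entry of the result sums the products g(tau)f(t-tau) for one value of t, which is the quantity the blue area shows as the smaller function slides through the larger one.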

Foster defined disruption as the interval when a new technology’s s-curve is steeper than the existing s-curve. We can think of a population of s-curves. The convolution would be the lesser s-curves, and the blue area represents the area of disruption. Disruption can be overcome if you can get your s-curve to exceed that of the attacker. Sometimes you just have to realize what was used to attack you. It wasn’t the internet that disrupted the print industry, it was server logs. The internet never competed with the print industry. Foster’s disruptions are accidental happenings when two categories collide. Christensen’s disruptions are something else.
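Here is a sketch of that steeper-than test using two logistic s-curves. The logistic form, the midpoints, and the rates are my assumptions for illustration. Foster’s own curves are empirical and need not be logistic.

```python
import math


def s_curve(t, midpoint, rate, ceiling=1.0):
    """Logistic s-curve value at time t."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def slope(t, midpoint, rate, ceiling=1.0):
    """Derivative of the logistic s-curve at time t."""
    y = s_curve(t, midpoint, rate, ceiling)
    return rate * y * (1.0 - y / ceiling)


# Hypothetical parameters: an incumbent that matured early and a later, faster attacker.
incumbent = dict(midpoint=5.0, rate=1.0)
attacker = dict(midpoint=12.0, rate=2.0)

times = [i * 0.5 for i in range(41)]  # t = 0.0 .. 20.0 in half steps
steeper = [t for t in times if slope(t, **attacker) > slope(t, **incumbent)]
print(f"attacker is steeper from t={steeper[0]} to t={steeper[-1]}")
```

The interval opens well after the incumbent’s own steepest point, which fits the idea that the attacked party could respond by steepening its own curve.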


Notes on the Normal Distribution

July 24, 2017

Pragmatism Slices and Sales

Progress through the technology adoption lifecycle happens in terms of seats and dollars. If you use alternate monetizations, rather than sell your product or service, drop the dollars consideration. Beyond those monetizations even if you sell your product or service, dollars are flaky in terms of adoption. But the x-axis is about population, aka seats.

Sales drive the rate of adoption in the sense that a sale moves the location of the product or service, the prospect’s organization(s), and the vendor’s organization(s) under the curve. By sales, I mean the entire funnel from SEO to the point where the sales rep throws the retained customer under the bus. But, I also mean initial sales, the point where prospects become customers. That sale moves adoption from left to right, from the early phases towards the late phases, from category birth to category death.

But, there are two kinds of sales: the initial sale, aka the hunter sale, and the upgrade sale, aka the farmer sale. What struck me this week was how the farmer sale does absolutely nothing with regard to progress through the various entities’ locations under the adoption curve. So let’s look at this progress.


People in a pragmatism slice reference each other. They do not reference people in other pragmatism slices.

In the figure, the hunter sales move the adoption front across the adoption lifecycle from left to right. The hunter sales rep made four sales. The farmer sales rep made four sales as well that generated revenues, but no movement across the lifecycle.


The size of the normal representing the addressable markets in the technology adoption lifecycle is fixed. It does not grow. A single company has a market allocation that tells us how much of that normal they own. With discontinuous innovations, that allocation to the market leader maxes out at 74%. Beyond that, antitrust laws kick in. Such a market leader would be a near-monopolist. Their market leadership will be the case until they exit the category, or face a Foster disruption. Intel was the market leader until NVIDIA brought a different technology to market. With continuous innovations, we are dealing with many players in a commodity market. The allocations are small. Market leaders can change every quarter.


In this figure, I started with a standard normal distribution (dark yellow) representing 100% of a category’s market. I represented the near monopolist’s market allocation of 74% as a normal distribution (light blue) inside of the larger normal. Then, I drew the circles (orange and blue) representing the curvature of the kurtoses of these distributions. The light blue distribution cannot get any larger. It is shown centered at the mean of the category’s normal. It could be situated anywhere under the category’s normal. Once a vendor has sold more than 50% of its addressable market, that vendor starts looking for ways to grow, ways to move the convergence of the vendor’s distribution as far to the right as possible. They try to find a way to lengthen the tail on the right. They run into trouble with that.

While a normal distribution represents the technology adoption lifecycle, the probability mass gets consumed as sales are made. The probability mass to the left has been consumed. So there is very little mass to allocate to the tail. In placing those curvature circles, I looked for the inflection points and made the circles tangent to the normals there. For the proposed tail, I drew its curvature circle. The thick black line from the mean to the topmost inflection point doesn’t leave enough probability mass to allocate to the tail, so the tails would be lower and the curvature circle would be larger. The thick red line from the mean to the bottommost inflection point leaves enough probability mass to allocate to the tail. It’s important that the curves represented by the black and red lines be smooth.

The points of convergence for the 74% normal, the 100% normal, and the long tail appear below the x-axis of the distribution. The mass between the convergences of the 100% normal and the long tail is outside the category’s normal distribution. The normal under the normal model used a kurtosis of zero. But, with the long tail, the kurtosis is no longer zero. That growth is coming from something other than the product or service of the vendor. And, the mass in the tail would not come from the normal inside the category’s normal. The normal was deformed when the mass was allocated towards the tail. But, again, that still does not account for the mass beyond the category normal. That mass beyond the category normal is black-swan-like and hints towards skew risk and kurtosis risk. Look for it in the data. These distributions just show a lifecycle of the category and vendor normals. The data should reflect the behaviors shown in the model. The pragmatism slices move as well. Taking a growth action that concatenates the tail can dramatically change your phase in the technology adoption lifecycle. Each phase change requires some, possibly massive, work to get the products and services to fit the phase they find themselves addressing.

Booms stack the populations in the technology adoption lifecycle. See Framing Post For Aug 12 Innochat: The Effects of Booms and Busts on Innovation for that discussion.

I drew my current version of Moore’s adoption lifecycle.

The Technology Adoption Lifecycle

Moore built his technology adoption lifecycle on top of Rogers’ model of the diffusion of innovation. Rogers identified the populations involved in technology adoption like the innovators, early adopters, early and late majorities, and laggards. Moore went further and teased out technical enthusiasts and the phobics. Moore changed the early majority to vertical markets and the late majority to horizontal markets. Moore identified several structural components like the bowling alley, the chasm, and the tornado.

I’ve made my own modifications to Moore’s model. The figure is too abundant. Another incidence of my drawing to think, rather than to communicate.

TALC setup

The technology adoption lifecycle provides the basis for the figure. The technology adoption lifecycle is about the birth, life, and death of categories that arise from discontinuous innovation. This leaves aside the categories that can be created via management innovation discussed in an HBJ article over the last year. A category is competed for during the Tornado and birthed when market power selects the market leader. Immediately after the birth of a category, the competing companies consolidate, or exit. Their participation in the category ends. The category can live a long time, but eventually, the category dies. Its ghost disappears into the stack. The horse is still with us. Disruption is a means of killing a category, not about competing in the disrupted category. Disruption happens to adjacencies, not within the category sponsoring the disruptive discontinuous innovation.

The populations are labeled with red text. Most of the phase transitions are shown with red vertical lines. The transition to the early majority is shown with a black line, also labeled “Market Leader Selected.” The vertical labeled with red text consists of the early adopter (EA) and the next phase that Moore called the vertical market. Some technical enthusiasts would be included in the vertical as well, but are not shown here as such.

Notice that I’ve labeled the laggard phase device and the phobic phase cloud. The cloud is the ultimate task sublimation. The device phase is another task sublimation. These are not just form factors. They are simpler interfaces for the same carried use cases. The carrier use cases are different for every form factor. Moving from early majority to late majority phases also involved task sublimation, as described by Moore. Laggards need even simpler technology than consumers. Phobics don’t want to use computers at all. The cloud provides admin-free use. The cloud is about the disappearance of both the underlying technology in the carrier layer and the functionality in the carried layer. Notice that after the cloud the category disappears. There are no remaining prospects to sell.

The technical enthusiasts, as defined by Moore, were a small population at the beginning of the normal. But, there are technical enthusiasts in the Gladwell sense all the way across the lifecycle. They are a layer, highlighted in orange, not a vertical slice, or phase. I’ve shown both views of the technical enthusiasts. The IT horizontal people would show up as technical enthusiasts if the product or service was being sold into the IT horizontal. This distinction is made in my Software as Media Model. The technical enthusiasts are concerned with the carrier layer of the product or service.

Moore’s features are shown as brown rectangles. These features include the chasm, the tornado, and the bowling alley. Specific work, tactics, and strategies address the chasm, the tornado, and the bowling alley. These are labeled as pre-chasm, pre-tornado, and keeping the bowling alley full. They show up as blue rectangles. Another feature stems from de-adoption, the “Need (for a) New Category,” and appears as a blue rectangle. This latter feature happens because nothing was done to create a new category before it was needed. Or, such an effort failed. The point of keeping the bowling alley full is to create new categories based on discontinuous innovation on an ongoing basis. I’ve seen a company do this. But, these days discontinuous innovation is very rare. Discontinuous innovations can, but do not always, cause (Foster) disruptions. Christensen’s disruptions happen in the continuous innovation portion of the adoption lifecycle.

The lifecycle takes a discontinuous innovation to market and keeps the category on the market via continuous innovation. Plant the seed (discontinuous), harvest the yield (continuous). This division of the lifecycle is labeled in white text on a black rectangle towards the bottom of the figure. Discontinuous innovation generates economic wealth (inter-). Continuous innovation generates an accumulation of cash (intra-). A firm does not own the economic wealth it generates. That economic wealth is shared across firms. I am unaware of any accounting of such.

At the very top of the lifecycle, the early and late phases are annotated. The early phases constitute the growth phase of the startup. The late phases constitute the decline phase. The decline phase can be stretched out, as discussed in the previous section. When the IPO happens in the early phases, but not before the Tornado, the stock price sells at a premium. When the IPO happens in the late phases, the stock price does not include such a premium. The Facebook IPO bore this out. It’s typical these days, these days of continuous innovation, that no premium is involved.

Founders, at least in carrier business, with discontinuous innovation are engineers, not businessmen, so at some point, they have to hire businessmen to put the biz orthodoxy in place. VCs these days require a team that is already orthodox. The hype before the Shake Shack IPO demonstrates that innovation has moved on from software. Orthodox businesses are now seen as innovative, but only in the business model, continuous innovation sense. Shark Tank and VCs don’t distinguish the technology startup from other startups. The innovation press confuses us as well. It used to be that the CFO and one other person had an MBA, now everyone has one. But, in an M&A, the buyer doesn’t want to spend a year integrating the business they just bought. The merger won’t succeed unless the buyer can launch their own tornado and bring in new customers in the numbers they need. The Orthodoxy needs to be in place at least a year before the IPO, or the stock price will underperform the IPO a year after the IPO.

From a statistical point of view, the process of finding a new technology involves doing Levy flights, aka a particular kind of random walk, until that new technology is found. It should not be related to what you are doing now, aka to your install base. You are building a brand new company for your brand new category. Google’s Alphabet does this. Your company would become a holding company. Managing the diversity inherent in the technology adoption lifecycle becomes the problem. “No, that company is in a different phase, so it can’t do what our earlier company does now.” Contact me to find out more.
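
For readers who have not met the term, here is a minimal sketch of a Levy flight. It is an illustration under my own assumptions, not a model of any real technology search: the walk is two-dimensional, each direction is uniformly random, and each step length is drawn from a Pareto distribution so that most steps are short and a few are very long. The names `levy_flight`, `alpha`, and `seed` are hypothetical.

```python
import math
import random


def levy_flight(steps, alpha=1.5, seed=None):
    """Return the (x, y) positions of a 2-D walk with heavy-tailed step lengths.

    Assumption: step lengths are Pareto(alpha) with a minimum length of 1,
    so most steps are short and a few are very long.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(steps):
        length = rng.paretovariate(alpha)        # heavy-tailed, always >= 1
        angle = rng.uniform(0.0, 2.0 * math.pi)  # direction chosen uniformly
        x += length * math.cos(angle)
        y += length * math.sin(angle)
        path.append((x, y))
    return path


if __name__ == "__main__":
    path = levy_flight(1000, seed=7)
    jumps = [math.dist(a, b) for a, b in zip(path, path[1:])]
    print("median jump:", sorted(jumps)[len(jumps) // 2])
    print("longest jump:", max(jumps))
```

The occasional long jump is the point of the analogy: it carries the search far away from where you are now, rather than circling the install base.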

After the Levy flights, we search for early adopters. Use Poisson games to look at that. The Poisson distributions tend to the normal. Those normals become higher dimensional normals. The standard normal has six sigma; the later normals in later phases of the lifecycle have more than six sigma. These divisions translate into geometries. The nascent stages of the lifecycle occur in a hyperbolic geometry where the distant is small from a Euclidean perspective generated by the inherent L2 geometry of linear algebra. Artists draw the distant as small in perspective drawings. They call that foreshortening. We foreshorten our financial forecasts and small is bad. But, as the Poisson becomes a normal, those financial forecasts stop foreshortening. The idea we threw away becomes obviously invaluable after the founder builds a market, a technology, a product or service, a company, value chains,… The distributions change, and the geometries change. Once you move beyond six sigma, the geometry becomes spherical. In such geometry, there are many ways for followers with different strategies to win. We start with a very narrow way to win in the hyperbolic, arrive at the one way to win in the Euclidean, and find ourselves in the many ways to win in the Spherical. Or, damn, so many fast followers, geez.

Last but not least, we come to the Software as Media model. Media is comprised of carrier layers and carried content layers. The phases of the adoption lifecycle change layers when they change phases. The technical enthusiast is about the carrier layer; the early adopter, the content layer; the vertical, the content layer; the horizontal, the carrier layer; the device, both; and the cloud, carrier. At the point where you need another category, it could be either. But, these oscillations involve the market and the way the vendor does business. Each phase is vastly different. The past has nothing to do with the present. Yes, the practices were different, but they fit their market. They were not better or worse unless they did not fit their market.

Designers whining about the 80’s were not around then. They take today’s easiness as a given and think the past should have been done their way. The past taught. We learned. And, as we cross the technology adoption lifecycle, that crossing being an Ito process, the memories are deep. We learned our way here. And, when we repeat the cycle, our organizations are not going to start over. They don’t have to if properly structured. Call me on that as well. But, usually they don’t start over from scratch, but should, because they forgot the prior phase, as they moved to the next.


The Curvature Donut

July 23, 2017

In last month’s The Cones of Normal Cores, I was visualizing the cones from the curvatures of a skewed normal to the eventual curvatures of a standard normal distribution. The curvatures around a standard normal appear as a donut, or a torus. Those curvatures are the same all the way around the normal in a 3-D view. That same donut around a skewed normal appears as a deformed donut, or a ring cyclide. In the skewed normal the curvatures differ from one side to the other. These curvatures differ all the way around the donut.

The curvature donut around the standard normal sits flatly on the x-axis and touches the inflection points of the normal curve. Dropping a line from each inflection point down to the x-axis gives us a point on that axis. The origin of the circle of that particular curvature would lie on a line rising 45 degrees above the x-axis from that point.

The curvature donut of a skewed normal would sit flatly on the x-axis, but might be tilted as the math behind a ring cyclide is symmetrical to another x-axis running through the centers of the curvatures. In January’s Kurtosis Risk, we looked at how skew is a tilt of the mean by some angle theta. This tilt is much clearer in More On Skew and Kurtosis. That skewness moves the peak and the inflection points but the curve stays smooth.

So I’m trying to overlay a 2-D view of a skewed distribution on a 3-D view of a ring cyclide.

Ring Cyclide

I’ve used a red line to represent the distribution. The orange areas are the two tails of the 2-D view. The curvatures show up as yellow circles. The inflection points on the distribution are labeled “IP.” The core is likewise labeled although the lines should match that of the tilted mean.

I think as I draw these figures, so this one has a gray area and a black vertical line on the ring cyclide that are meaningless. Further, I have not shown the orientation of the ring cyclide as sitting flat on the x-axis.

The ring cyclide occurs when skewness and kurtosis occur. A normal distribution exhibits skewness and kurtosis when the sample size, N, is less than 36. When N<36, we can use the Poisson to approximate or estimate the normal. Now, here is where my product management kicks in. We use Poisson games in Moore’s bowling alley to model Moore’s process as it moves from the early adopter to the chasm. The chasm being the gateway to the vertical market that the early adopter is a member of. We stage gated that vertical before we committed to creating the early adopter’s product visualization. We get paid for creating this visualization. It is not our own. The carried component always belongs to the client. The carrier is our technology and ours alone.

So let’s look at this tending to the normal process.

Conics as Distribution Tends to Normal

I was tempted to talk about dN and dt, but statistics kids itself about differentials. Sample size (N) can substitute for time (t). The differentials are directional. But, in statistics, we take snapshots and work with one at a time, because we want to stick to actual data. Skew and kurtosis go to zero as we tend to the standard normal, aka as the sample size gets larger. Similarly, skew risk and kurtosis risk tend to zero as the sample size gets larger.
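
One way to put numbers on this, as a sketch under a stated assumption: treat the count of adoptions as Poisson with mean `lam`, and let that mean grow along with the sample size. For a Poisson distribution the skewness is 1/sqrt(lam) and the excess kurtosis is 1/lam, so both shrink toward the zero of the normal as the mean grows. The function name `poisson_shape` is hypothetical, and tying `lam` to N is my assumption, not something the figures show.

```python
import math


def poisson_shape(lam):
    """Skewness and excess kurtosis of a Poisson distribution with mean lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    return 1.0 / math.sqrt(lam), 1.0 / lam


if __name__ == "__main__":
    # Assumption: the Poisson mean stands in for the growing sample size.
    for lam in (1, 4, 9, 36, 100, 400):
        skew, kurt = poisson_shape(lam)
        print(f"mean={lam:4d}  skew={skew:.3f}  excess kurtosis={kurt:.4f}")
```

Kurtosis falls off faster than skew here, as 1/lam against 1/sqrt(lam), so the skew is the last of the two to fade.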

The longer conic represents the tending to normal process. The shorter conic tends to work in the inverse direction from the normal to the skewed normal. Here direction is towards the vertex. In a logical proof, direction would be towards the base.

The torus, the donut associated with the standard normal, like its normal, is situated in Euclidean space. However, the ring cyclide is situated in hyperbolic space.

An interesting discussion on Twitter came up earlier this week. The discussion was about some method. The interesting thing is what happens when you take a slice of the standard normal as a sample. The N of that slice might be too small, so skew and kurtosis return, as do their associated risks. This sample should remain inside the envelope of the standard normal, although it is dancing. I’m certain the footprints will. I’m uncertain about the cores in the vertical sense. Belief functions of fuzzy logic do stay inside the envelope of the base distribution.
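
The return of skew in a small slice can be sketched by simulation. The assumptions here are mine, not the Twitter discussion's: the slice is a simple random draw from a standard normal, and skew is measured with the plain moment estimator. `sample_skew` and `mean_abs_skew` are hypothetical helper names.

```python
import random
import statistics


def sample_skew(values):
    """Moment estimator of skewness: m3 / m2**1.5 (no small-sample correction)."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean((v - mean) ** 2 for v in values)
    m3 = statistics.fmean((v - mean) ** 3 for v in values)
    return m3 / m2 ** 1.5


def mean_abs_skew(n, trials=500, seed=0):
    """Average |skew| over many samples of size n drawn from a standard normal."""
    rng = random.Random(seed)
    return statistics.fmean(
        abs(sample_skew([rng.gauss(0.0, 1.0) for _ in range(n)]))
        for _ in range(trials)
    )


if __name__ == "__main__":
    for n in (10, 36, 1000):
        print(f"N={n:5d}  average |skew| = {mean_abs_skew(n):.3f}")
```

The parent distribution has zero skew, so whatever skew shows up in a slice comes from the small N alone.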

Another product manager note: that slice of the standard normal happens all the time in the technology adoption lifecycle. Pragmatism orders the adoption process. Person 7 is not necessarily seen as an influencer of person 17. This happens when person 17 sees person 7 as someone that takes more risk than they or their organization does. They are in different pragmatism slices. Person 17 needs different business cases and stories reflecting their lower risk willingness. These pragmatism slices are a problem in determining who to listen to when defining a product’s future. We like to think that we code for customers, but really, we code for prospects. Retained customers do need to keep up with carrier changes, but the carried content, the use cases and conceptual models of carried content, rarely change. The problem extends to content marketing, SEO, ancillary services provided by the company, and sales qualifications. Random sales processes will collide with the underlying pragmatism structure. But, hey, pragmatism, aka skew and kurtosis, is at the core of problems with Agile not converging.

In terms of the technology adoption lifecycle, the aggregated normal that it brings to mind is actually a collection of Poisson distributions and a series of normal distributions. The footprint, the population of the aggregated normal, does not change over the life of the category, provided you are not one of those who leave their economy of scale with a pivot. Our place in the category is determined in terms of seats and dollars. When you’re beyond having sold 50% of your addressable population you are in the late market. The quarter where you left the early market and entered the late market is where you miss the quarter and where the investors are told various things to paper over our lack of awareness that the lost quarter was predictable.

If you know anything about the ceiling problem, the sample distribution reaching beyond the parent normal, let me know.

I’ve actually seen accounting visualizations showing how the Poissons tend to the normal.


The Postmodern UI

July 8, 2017

A tweet dragged me over to an article in The New Republic, a journal that I’m allergic to. But I read the article, America’s First Postmodern President, with my product manager hat on. It is an article about the postmodern world we live in, a world of constant, high-dimensional, directionless change. And, it became obvious to me that I’m not a postmodernist while Agile is exactly that, postmodernist, so our software products reflect that.

No politics here. The quotes might go that way, but I will annotate the quotes to get us past that. I’ll ignore the politics. Here the discussion will be product, UI, design, Agile.

For Jameson, postmodernism meant the birth of “a society of the image [textual/graphical/use case] or the simulacrum [simulation] and a transformation of the ‘real’ [the carried content] into so many pseudoevents.” Befitting the “postliteracy [Don’t make me read/YouTube it] of the late capitalist world,” the culture of postmodernism would be characterized by “a new kind of flatness or depthlessness [no hierarchy, no long proofs/arguments/logics/data structures/objects], a new kind of superficiality [the now of the recursion, the memorylessness of that recursion’s Markov chain] in the most literal sense” where “depth [cognitive model/coupling width/objects] is replaced by surface [UI/UX/cloud–outsourced depth].” Postmodernism was especially visible in the field of architecture, where it manifested itself as a “populist” revolt “against the elite (and Utopian) austerities of the great architectural modernisms: It is generally affirmed, in other words, that these newer buildings [applications/programs/projects/products/services] are popular works, on the one hand, and that they respect the vernacular of the American city fabric, on the other; that is to say, they no longer attempt, as did the masterworks and monuments of high modernism [No VC funded, logarithmic hits out of the financial ballpark], to insert a different, a distinct, an elevated, a new Utopian language into the tawdry and commercial sign system [UX as practiced now] of the surrounding city, but rather they seek to speak that very language, using its lexicon and syntax as that has been emblematically ‘learned from Las Vegas [for cash and cash alone, no technological progress/reproduction by other people’s means].’”


For Baudrillard, “the perfect crime” was the murder of reality, which has been covered up with decoys (“virtual reality” and “reality shows” [and UIs]) that are mistaken for what has been destroyed. “Our culture of meaning is collapsing beneath our excess of [meaningless] meaning [and carrier impositions], the culture of reality collapsing beneath the excess of reality, the information culture collapsing beneath the excess of information[multiplicities in the spherical geometry where every model models correctly in the financial/cash sense]—the sign and reality sharing a single shroud,” Baudrillard wrote in The Perfect Crime (1995)…[political cut].

What a mess. It helped that this morning in those Saturday morning, light-weight introspective moments the notion of objects being bad and the reassertion of functional programming was leaving us with data scattered in the stack via recursion, and the now of the current system stack with nothing to see of how we got here. But, hey, no coupling between functions through the data structure, something I never thought about until some mention in the last two weeks. Yes, the alternative to static would do that no matter how dynamic.

Those gaps, the architecture enabling us to escape those tradeoffs we make in our products, the slowness of feedback from our users, and the feedback from the managers as if they were users–a flattening–all disappear when we go postmodern, when we go flat. That jack in your car becomes worthless when your emergency tire goes flat.

Still, I don’t like surface without depth; the absence of a cognitive model; the painted on UI, the erasure of the deep UX/CX/BX/MX/EX, the surface of machine learning, and programmers writing up other people’s disciplines as if those disciplines don’t matter, as if those years spent in school learning that discipline don’t matter, that the epistemical/functional cultures don’t matter–but, of course, they don’t matter because the programmer knows all the content they encode, and management lays off all the content anyway ending their Markov chains and filling their resumes so full of cheap labor jobs so you can’t see the underlying person. Thirty years of doing something, the depth, forgotten because seven years have passed, still leaves depth, but hiring passion over experience gets us to that postmodernist surface. Oh, well. When better is surface, when success is reality TV, when…

The danger of a sweeping theory like postmodernism is that it can produce despair.

But, that’s where we are this morning, sweeping theory, not despair.