## Archive for August, 2016

### A Discontinuity in a Sequence

August 22, 2016

In my last post, The Grid, we looked at how grids imprison sequences. We discovered a discontinuity, a hole, among the sequences laminated into the larger sequence, the sequences of differences between z-score values. I called them out and left much unsaid. We’ll continue that discussion in this post.

In mathematics, we have holes in our graphs. We have holes in what each of us knows about math. In algebra class, we’re restricted to the reals, so we’re told no solution exists. It turns out many of those solutions are complex numbers, not reals. There are plenty of holes, potholes.

Then, we have asymptotes. We can approach them, but we can’t cross them with a function because they are manifolds, something that falls into that wide category of math we don’t know yet.

I remember stepping into a gopher hole. After that, I kept a close eye on the ground where my feet were stepping. One day a lieutenant colonel stopped his staff car so we could have a conversation about why I didn’t salute his staff car. “Gopher holes, sir.” Not that I had to worry, my colonel would have laughed the incident off. It was one of those days when the graph you live in has a few new nodes and the graph’s normal distribution changes.

The z-score sequence is directed from core to tail–away and towards. Oddly, humans use the same kind of dimension, technically a half of a dimension. We are 2.5-D beings, not 3D beings. But, we round off dimensions for our mathematical convenience.  If it’s not easy, it’s not math–easy being very relative. Consider that z-score sequence to be a vector. Consider the hole to accommodate another intersecting vector that for the moment we will consider orthogonal, or simply perpendicular.

Being orthogonal in statistics means that the vectors intersecting in that manner are not correlated, which is often loosely called independent. The cosine of 90 degrees is zero, so the cosine of correlation is zero, so the vectors are not correlated.
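
The cosine reading of correlation can be checked with a small sketch. It assumes the "vectors" are two data series with their means subtracted; the function names and the sample data are made up for illustration and are not from the original post.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation, computed as the cosine of the angle
    between the two mean-centered data vectors."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cx = [x - mean_x for x in xs]
    cy = [y - mean_y for y in ys]
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return dot / (norm_x * norm_y)

def angle_degrees(r):
    """Angle whose cosine is r, clamped to guard against rounding."""
    return math.degrees(math.acos(max(-1.0, min(1.0, r))))

# Made-up data: the second series is orthogonal to the first once
# centered, the third is a straight multiple of it.
xs = [1.0, 2.0, 3.0, 4.0]
ys_orthogonal = [1.0, -1.0, -1.0, 1.0]
ys_aligned = [2.0, 4.0, 6.0, 8.0]

print(angle_degrees(pearson_r(xs, ys_orthogonal)))
print(angle_degrees(pearson_r(xs, ys_aligned)))
```

A right angle gives r = 0, and an angle near zero gives r near 1. Zero correlation only rules out a linear relationship, so treat "independent" here as shorthand.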

The vector passing through the hole in the z-score sequence has its own distribution. In the end, the data comprising that distribution will be added to the z-score sequence’s distribution. For now, that distribution is unknown, and like all unknowns constitutes a source of risk.

Now, we can imagine a flow through the subsequences. Imagine each layer as a pipe. That gives us some plumbing, aka some fluidics. No, I’m not going there tonight. But, I did draw it just to assess its probabilities. Of course, I ignored some of the subsequences. In modeling, you put in what you think is important and you leave out the rest.

Just for the Bayesian priors, s, t, and u all started with a probability of 0.50. That gave us the probability after the first mix, p(st)=0.25. Then we dealt with the second mix, which had us adjusting the probabilities so they summed to 1.00, leaving us with p(st)=0.333 and p(u)=0.666. Oh, we’ve crossed an approximation boundary.
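
The arithmetic of the two mixes can be sketched as follows. This assumes the first mix multiplies the two priors (treating s and t as independent) and the second mix rescales the results so they sum to one; `normalize` is a hypothetical helper name.

```python
def normalize(weights):
    """Rescale a dict of non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Priors from the text.
p_s = p_t = p_u = 0.50

# First mix: s and t combine (assumed independent, so multiply).
p_st = p_s * p_t  # 0.25

# Second mix: st meets u, and the two are rescaled to sum to 1.
mixed = normalize({"st": p_st, "u": p_u})
print(mixed)
```

The exact values are 1/3 and 2/3. Writing them as 0.333 and 0.666 is where the approximation boundary gets crossed, since the truncated figures sum to 0.999.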

I finally gave in to reading David Hand’s “The Improbability Principle.” Hand refers to Borel’s theorem about the impossibility of events with sufficiently small probabilities. Borel wanted us to understand that p=1 and not more than 1. It takes a while to get to the point. Borel is modeling via probability, so the impossible events are left out, but due to Borel’s theorem, we are assured that we can simplify the situation via omission and keep going, all things being logically consistent.

We are not leaving the hole out. Everybody else probably has left it out. It’s not in the z-score table screaming out to be seen. We stumbled across it with much labor. But, we will start with the vector being orthogonal. I took a top-down view for the next graphic.

Here we start at the global maximum of the z-score differences sequence, the axis of symmetry or rotation, on the left. The sequence runs to infinity somewhere off the page to the right. The hole appears in light blue. The hole is where the sequence vector intersects the orthogonal vector. The long-term mean will come to rest at the intersection. The r variable is the indicator of correlation. The angle between the sequence vector and the actual vector (shown in red), theta, illustrates a positive correlation. So the distribution will come to rest on the actual vector (red).

We started with a surprise unknown at the hole. Once discovered, we have to find its measure. So we assert the distribution’s existence. This has the effect of putting a Dirac function at the center of the distribution. With more data, we have a Poisson distribution. We can use that Poisson distribution to approximate the normal distribution until we have collected 30 or more data points. The figure is wrong, but I had to make the Poisson distributions large enough to show up. The Poisson distributions would still be inside or under the normal distribution. As the Poisson approaches the normal, the mean moves around until it settles at the core intersection, aka the mean as shown in the diagram, and the distribution would exhibit skewness and kurtosis.
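
As a rough check on the Poisson drifting toward the normal, here is a minimal sketch that computes skewness and kurtosis straight from the Poisson probabilities. The name `poisson_shape` and the use of the Poisson mean `lam` as the knob are assumptions made for illustration; the rule of thumb above is stated in terms of the number of data points collected, which is a different quantity.

```python
import math

def poisson_shape(lam):
    """Return (skewness, kurtosis) of a Poisson distribution with mean lam,
    computed by summing over its probability mass function."""
    # Sum far enough into the right tail that the leftover mass is negligible.
    k_max = int(lam + 20 * math.sqrt(lam) + 50)
    probs = []
    p = math.exp(-lam)      # P(K = 0)
    for k in range(k_max + 1):
        probs.append(p)
        p *= lam / (k + 1)  # P(K = k + 1) from P(K = k)
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    m3 = sum((k - mean) ** 3 * pk for k, pk in enumerate(probs))
    m4 = sum((k - mean) ** 4 * pk for k, pk in enumerate(probs))
    return m3 / var ** 1.5, m4 / var ** 2

for lam in (1, 5, 30):
    skew, kurt = poisson_shape(lam)
    print(f"lam={lam:>2}  skewness={skew:.3f}  kurtosis={kurt:.3f}")
```

As `lam` grows, the skewness falls toward 0 and the kurtosis toward 3, the values of the normal distribution.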

Here I show the evolution of that hole. The Dirac function generates a line at infinity, here labeled PE, as in potential energy. Potential energy is used here to hint at information physics. Strong writing on information physics puts it this way: potential energy is position, not some form of energy, just a physics bookkeeping sleight of hand. Next, the Poisson distribution is generated along the line of positive correlation in its continuous form (blue line and blue area). Poisson distributions speak loudly to the myth of deregulation being valuable in a business. The constraint, here a policy constraint (gray), moves the probabilities stretching out to infinity and concentrates them into the histograms inside the constraint, which makes the business more focused and less costly. Beware of this myth. The constraint generates the higher histograms (red volumes with orange tops) in the discrete form and generates the higher curve (dark red) as opposed to the original curve (blue) in the continuous form. Constraints create value.

Last, the normal distribution reaches its equilibrium distant from the Poisson distribution on the timeline (gray). The normal has lost the directional sense that the Poisson distribution provided. The data is close in distance but spread out over time. The potential energy of the assertion that generated the Dirac signal flows down to the normal and beyond as the normal gets wider and loses height, aka becomes flat. The normal here is situated in Euclidean space. The Dirac and Poisson are situated in hyperbolic space. Beyond the normal shown, where the normal becomes flat, those normals find themselves in the spherical space. Financial analysis as it is conducted today is carried out in spherical space. In that space, multiple analyses give good answers. In hyperbolic space, no analysis gives good answers.

Think of your data efforts as dynamic undertakings. Statistics uses the static view as the means to honest statistics; dynamics is prohibited. Statisticians take snapshots, but technology adoption is a dynamic proposition.

Standard normals hide much. All normal distributions look the same in the standard normal form. At times seeing the real normal will tell us much.
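
To see how standardizing hides the real shape, here is a small sketch. It assumes the usual z-score transform, subtracting the mean and dividing by the standard deviation; the two data sets are made up.

```python
import statistics

def standardize(values):
    """Convert raw values to z-scores: (value - mean) / standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]

# Made-up data: same center, very different spreads.
narrow = [9.0, 10.0, 11.0]
wide = [-90.0, 10.0, 110.0]

print(standardize(narrow))
print(standardize(wide))
```

Both lists come out as the same z-scores, even though one spread is a hundred times the other. The width of the real normal is exactly what the standard form throws away.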

### The Grid

August 18, 2016

It’s been said of mathematical proofs that they start somewhere and end somewhere else. Grids behave in the same manner. Grids might be rectangular or square.

Grids might be laid out on some modulo, which greatly restricts their shape and how they shape the content they contain or in our verbiage “carry”. In the end, a grid starts somewhere and ends somewhere else.

Each of the rows could have kept on going, but the rule about row population prevents this, and instead, puts the red numbers on the next line.

A table of z-scores takes an infinite ray and chops it up at decreasing and later increasing intervals. The z-score table in the back of my statistics book gives the wrong impression when it chops the entries up into rows of ten z-scores, a modulo of 10. The shape of the table controls the shape of the carried z-scores. The z-scores have their own shape, but it is lost here.

Just to make the table as media reality clearer, I’ve changed the carrier, the grid, as I changed the number of columns. I changed the metadata or meta carrier to change the number of columns. Being a carrier or the carried is a matter of shifting contexts in the stack.

Oops! This carrier is smaller than the last. We’ve run out of carrier before we’ve run out of carried content. Those excess numbers fall into a jumble on the floor. Some of the numbers that remained in the table did not move. Others moved. I’ve highlighted the ones that did not move. They remind me of Ito processes, processes with fixed-size memories. A Markov process is an Ito process with zero memory (n=0). In our table, the rows are memories that vary between zero and ten (0 ≤ n ≤ 10). This memory problem is what the Hilbert Curve was invented to solve. A value placed on a Hilbert space-filling curve never moves. Hilbert curves forget nothing in our Ito process sense even as the resolution or densities vary. In terms of the last post, Matrix Composition, the processes never move even as the customers and the products move on.
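
The carrier and carried relationship can be sketched with a plain list laid into grids of different widths. The helper names `lay_out` and `unmoved`, the thirty numbers, and the grid sizes are all assumptions for illustration; they are not the z-score table from the figures.

```python
def lay_out(sequence, columns, rows):
    """Fill a rows x columns grid row by row.
    Returns (grid, overflow), where overflow is what the carrier could not hold."""
    capacity = rows * columns
    carried = list(sequence)
    placed, overflow = carried[:capacity], carried[capacity:]
    grid = [placed[i:i + columns] for i in range(0, len(placed), columns)]
    return grid, overflow

def unmoved(grid_a, grid_b):
    """Values sitting at the same (row, column) position in both grids.
    Assumes the values are distinct, so equal value means same item."""
    same = []
    for row_a, row_b in zip(grid_a, grid_b):
        for a, b in zip(row_a, row_b):
            if a == b:
                same.append(a)
    return same

numbers = list(range(1, 31))
wide, spill_wide = lay_out(numbers, columns=10, rows=3)
narrow, spill_narrow = lay_out(numbers, columns=7, rows=3)

print(spill_narrow)           # what fell on the floor
print(unmoved(wide, narrow))  # what stayed put
```

With a row-by-row fill, only the start of the first row keeps its position when the column count changes. Everything else is moved by the carrier, not by anything in the carried sequence.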

When the carried is a sequence, it remains a sequence. The grid becomes sparse or ceases to be a rectangle or a square when the sequence dances. z-scores are such a sequence. The z-score sequence is really a collection of sequences.

Here I’ve put each sequence making up the larger sequence on its own line. Here we put a parsing rule in place. The first number that is larger than the previous number goes to the next line. Then, we add the next equal in value numbers pushing back to the front indicated by the red vertical line. This works until the new line is longer than the prior lines. Then we add another rule. Push the front of the lower value number further to the right and add spacers or holes on the lines above where necessary, so the lower values are aligned at their front. Spacers change the shape of the surface of the curve. Holes run through the solid mass of the curve. Those two rules let the sequences express their “natural” shape. The grid is going where it will. The shape of the curve, the shape the grid will follow, might surprise you.
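
The first parsing rule can be sketched in a few lines. This covers only the line-break rule (a number larger than its predecessor starts a new line); the push and spacer rule is not implemented, and the sample differences are made up rather than taken from a z-score table.

```python
def split_runs(sequence):
    """Split a sequence into lines, starting a new line whenever
    a value is larger than the one before it."""
    lines = []
    for value in sequence:
        if lines and value <= lines[-1][-1]:
            lines[-1].append(value)  # same or smaller: stay on this line
        else:
            lines.append([value])    # larger (or first value): new line
    return lines

# Made-up sequence of differences, for illustration only.
diffs = [40, 40, 39, 40, 39, 38, 39, 38]
for line in split_runs(diffs):
    print(line)
```

Each printed line is one of the subsequences laminated into the larger sequence. Aligning their fronts and inserting spacers would be the next step, and that is where the holes show up.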

Iterations and releases would behave similarly. If you put too much in an iteration, you end up pushing the boundary of the next iteration or release. Or you move the current iteration into the next release and ship what you have, a working iteration.

As a product manager, are you imposing a modulo on your roadmaps, or are your roadmaps going where they go without enforcement? Are you mining the shape of your roadmaps for surprise? Yes, we impose some rules about delivering value in each release. We have an upgrade tempo, but the functionality carried by the roadmap dances to its own shape.

Are your carriers clearly separated from your carrieds? Are your populations facing your carriers or your carrieds? Remember that the IT horizontal is carrier facing. Most of what we do these days is likewise carrier facing even though we might be selling to consumers. Are we turning consumers into administrators with this carrier focus?

The push rule provides a new kind of outcome if we were being probabilistic about outcomes. Z-scores have holes in them.

### Matrix Composition

August 14, 2016

Watch this first, “Matrix algebra as composition.” A firm is a sequence of matrix multiplications. When we do anything, we are left with a need for each transformation, a sequence of such, and the evolution of that sequence over time. Your fast followers won’t match your evolution, and they won’t match your sequence, your composition. They will start somewhere else, and go directly to the product emerging from your composition. The fast follower will duplicate your output without duplicating your firm.
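
A minimal sketch of composition as matrix multiplication, assuming each step of the firm is a small linear transformation. The `scale` and `shear` matrices and the helper names are hypothetical; they stand in for whatever transformations a real firm performs.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def compose(*steps):
    """Collapse a sequence of transformations into one matrix.
    The first step listed is the first one applied."""
    result = steps[0]
    for step in steps[1:]:
        result = matmul(step, result)
    return result

def apply(matrix, vector):
    """Apply a matrix to a column vector given as a flat list."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Hypothetical transformations.
scale = [[2, 0], [0, 1]]
shear = [[1, 1], [0, 1]]

print(compose(scale, shear))                 # scale first, then shear
print(compose(shear, scale))                 # shear first, then scale
print(apply(compose(scale, shear), [1, 1]))  # one matrix, same end result
```

The two orderings give different matrices, so the sequence matters. The composed matrix alone does not reveal the steps that produced it, which is roughly the fast follower's position: they can copy the output without the composition.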

With discontinuous innovations, we start off with a client, just one, but a firm, not a single individual, with a wide width of use cases to cover. We start with a lot of potential. We picked that client with our bowling alley strategy in mind. We pick one in the middle of the industrial classification tree, so we can move up or down the tree as we go. That enables us to span not just the firm, but the whole industry, the whole ecology, the whole value proposition. Eventually, we will be in a simpler place described in the previous paragraph. But, our composition in matrix terms is deep. Our fast followers are thin. So keep your cards to yourself and fake the tells, so the competition chases its imaginary illusions, instead of you.

The differences across the technology adoption lifecycle are immense. We hire for each function, we tune each function, then we cross a technology phase boundary, and change the focus of our functionality. Call this latter thing forgetting. But, that means we cannot repeat the function in the future when the demand for another discontinuity requires it. Apple is stuck now. The length of time that a company is stuck is a reflection of how much it forgot. Repeated discontinuous innovation requires remembering, rather than forgetting. Repeated discontinuous innovation requires an organizational structure that can improve its processes and its customer knowledge. Not the stuff of innovation consultants. Even if Christensen suggested it long before his effect-cause confused disruption idea became the rage. The cost accountants couldn’t go there. So the organizational structure required goes unaddressed.

But, what of Christensen’s separation as he called it? Everyone is probably thinking separation as in spin outs or its cousins. But, there is another way to separate. It’s hard work. It doesn’t anchor itself to economies of scale. Discontinuous innovations require new markets that might merge, or not, decades down the road into one of the company’s economies of scale. The company has a tempo modulating continuous innovation with discontinuous innovations. The former serving existing customers. The latter finding new never before addressed customers.

Software as media provides a hint. In the software as media model, we split the carrier from the carried. The distinction is difficult at times. What is strictly speaking about the carrier, the software, and what is about the content of the domain? Addition is a carrier (red) of the carried things being added (blue), so 01+01=10. But, if it is something carrier being added, like loop indexes, the whole thing would be carrier, as in 01+01=10.

An organization is also a media, so it has carrier and carried layers. The carried layer would be focused on the customer. The carrier layer would be focused on things that don’t require customer inputs like the process of shipping goods to the customer. The staff that had customer relationships would flow through the firm with the customers. The staff that had process knowledge would stay in the phase specific organizations and keep improving those phase specific processes.

The technologies would flow through the organization as well. The technology would  be productized at the B2B early adopter client engagement. The technology and the product would then flow into the vertical phase, then the IT horizontal phase, and beyond. But, when the bowling alley has a free lane the next technology would take it. The processes across the phase specific divisions would be fully loaded all the time as would the staff attending to those processes.

The IT horizontal oscillation switches the focus from the carried to the carrier and the next adoption phase shifts the focus back to the carried. In this situation, the customer specific staff would not be fully loaded, but would have time to gain more in depth knowledge of the domain constituting the carried.

A company organized in such a way would have to manage the separation. Cross talk between the managers in the different phases needs to be suppressed. A best practice in the tornado, “free,” doesn’t work beyond the tornado. Sales reps love tornados, but tornado sales forces are unlike the sales forces serving both retained customers and new prospects. “Free” fails in all other contexts except the merger tornado.

Each phase has its own operational foci. A factor analysis of each would reveal that those organizations in a specific phase are alike, and different from the organizations in all the other phases. Each organization has its own factor analysis: as in factors and factor weights. The parent company would look like a holding company and have holding company problems like understanding that there are no synergies across the held organizations.

Know where you are. Don’t do what everybody else is doing, particularly those companies that don’t know where they are. Know that funding phases are not synchronized with adoption phases. Many of those so-called technology companies are not technology companies at all. Most of them are technology users, not technology makers. They are coding content, not carrier. They are doing continuous innovation and throwing away the results from discontinuous possibilities because the hyperbolic realities don’t look like the familiar spherical geometries they are used to. Yeah, I know, too much.

### More On Skew and Kurtosis

August 9, 2016

After the last blog post, Donuts, I was still puzzled about where skew and kurtosis come from. I’ve chased enough rabbits into their holes with this one. I’m tired of the obsession, so I’ll write this one up and let go of it until I cross paths with it again down the road.

It was stats. It became math. It’s on its way to becoming a set of tools like the black swan that can be applied within product management in the sense that here is the investment. How do we code down the kurtosis? I found mentions of skew risk and kurtosis risk. They are not a playground yet.

Skew and kurtosis were and still are descriptive. Later came the “summary statistics” that our spreadsheets generate for us, but read that again “summary statistics.” With kurtosis one number is describing two things. Well, those two things are connected or are part of one thing, a thing never described anywhere in the literature I’ve cruised through, another donut. Then, there is the matter of that angle, that I found a hint for after I found it on my own. The angle accounts for the two kurtoses, and the new donut.

The literature talks about moments. Skew is the third moment; kurtosis, the fourth. Then, there is another view that talks integrals, and another that talks derivatives.
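
The moments view can be sketched directly. This assumes the common convention of standardized moments with the population standard deviation (so a normal distribution has kurtosis 3); spreadsheets and libraries often apply sample corrections or report excess kurtosis, so their numbers can differ. The sample data is made up.

```python
import statistics

def standardized_moment(values, order):
    """Average of z-scores raised to the given power."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sum(((v - mu) / sigma) ** order for v in values) / len(values)

def skewness(values):
    return standardized_moment(values, 3)  # third standardized moment

def kurtosis(values):
    return standardized_moment(values, 4)  # fourth standardized moment

# Made-up data: one symmetric set, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 3, 9]

print(skewness(symmetric), kurtosis(symmetric))
print(skewness(right_tailed), kurtosis(right_tailed))
```

The symmetric set has zero skew, and the long right tail pushes the third moment positive. Either way each is a single number, which is the summary statistic complaint above.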

For myself, it boils down to derivatives being about inflection points. Three of them: 1) a global maximum, and 2) two concavity change inflection points. That’s all there is for all of that calculus. There are a few more concavity changes, but no more points. The fifth and sixth derivatives sit on top of the third and fourth. They drive the curve, but don’t present us with any additional inflection points.

All the mentions of leptokurtic, platykurtic, and mesokurtic are just terminology from long ago lacking in any numeric definition or reality. Sometimes we are told the data has these characteristics, but we need to keep in mind that we are describing a curve, rather than the data. We use summary statistics and distributions to make the data itself disappear. So whatever is going on is not the result of the data doing anything. The data stands around in lines we call histograms.

One of my early pursuits was a search for slant asymptotes. Well, there are none. There is a horizontal asymptote. It is a cubic rather than a straight line. The cubic crosses the x-axis at the origin. It leaves us wondering where our convergences are with the line “formerly known as the x-axis.” Anyway, when you have a horizontal asymptote, you won’t have any slant asymptotes.

Next I looked to extrapolate something I read about setting up bins in regard to a given range of numbers. The binomial approximates the normal when the bins capture the data evenly.
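
How closely the binomial tracks the normal can be sketched numerically. This assumes the normal being matched has mean np and variance np(1-p), and reads "capturing the data evenly" as p = 0.5; `max_gap` is a hypothetical helper that reports the largest difference between the two curves.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def max_gap(n, p):
    """Largest gap between the binomial and its matching normal."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
               for k in range(n + 1))

print(max_gap(10, 0.5))   # few bins, even split
print(max_gap(100, 0.5))  # many bins, even split
print(max_gap(10, 0.1))   # few bins, lopsided split
```

More bins shrink the gap, and a lopsided split widens it. That lopsided case is the one the half-filled decision tree below runs into.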

The bin widths had to be the same even if the data width doesn’t completely fill those bins. Maybe we only have data to fill the right half of the base of the decision tree.

I didn’t draw a distribution for this decision tree. The distribution will be skewed with a long tail to the left and the short tail to the right. The first box plot below shows what the distribution resulting from the above decision tree will look like. The second box plot is not skewed and is shown for comparison purposes only.

When looking at box plots, if the line dividing the box does not divide the box into two equal size partitions, the distribution is skewed. Likewise, if the tails are not of equal length, even if the box has equal partitions, the distribution is skewed. Likewise, if the outliers, not shown, are not of equal distance from the mode, the distribution is skewed. These outlier skews are sensitive. Measures of coskewness and cokurtosis are about sensitivity in the financial/investment domain. Beware of outliers. I’ve said it before, say no to sales when they present you with deal demands from outliers.
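
The first two box plot checks can be sketched as follows. This assumes the whiskers run all the way to the minimum and maximum (many box plots cap them and draw outliers separately) and that quartiles use the inclusive method of Python's `statistics.quantiles`; the function names and data are made up.

```python
import math
import statistics

def box_plot_parts(values):
    """Lengths of the four segments of a simple box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "lower_whisker": q1 - min(values),
        "lower_box": median - q1,
        "upper_box": q3 - median,
        "upper_whisker": max(values) - q3,
    }

def skew_signals(values):
    """Which of the two visual skew checks fire for this data."""
    parts = box_plot_parts(values)
    return {
        "box_unequal": not math.isclose(parts["lower_box"], parts["upper_box"]),
        "whiskers_unequal": not math.isclose(parts["lower_whisker"],
                                             parts["upper_whisker"]),
    }

print(skew_signals([1, 2, 3, 4, 5, 6, 7, 8, 9]))     # symmetric
print(skew_signals([1, 2, 3, 4, 5, 6, 7, 20, 40]))   # even box, long upper tail
print(skew_signals([1, 2, 3, 4, 5, 9, 13, 17, 21]))  # uneven box and tails
```

The second data set is the case called out above: the box splits evenly, yet the tails give the skew away.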

The boxplot view gives a hint to the angle driving skew and kurtosis. Keep in mind that without skew, there is no kurtosis, or the kurtosis has a summary statistic value of 3, aka no kurtosis.

I ran some lines out from the unskewed mode and skewed mode. The angle between them ties to kurtosis. I didn’t read this anywhere, but later did find some diagrammatic hints from other writers out on the internet. Notice that the mean never moves and that the vertical line labelled mode is also the mean in the unskewed case. Notice that there are two different kurtosis measures apparent in this view. This is where the summary statistic goes off in the weeds unless it is an index to both kurtoses. Given that we started with the standard normal and deformed it in a consistent manner, the two kurtoses should be correlated and indexed. I’ve not come across such.

Kurtoses are measured by curvature. The kurtosis curves are intrinsic curves. There are no controls off the line as in the Bezier curves we’ve discussed in the past. Curvature is the reciprocal of the radius of the circle fitted to the curve, aka 1/r.
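
The 1/r relationship can be sketched on the normal curve itself. This uses the standard calculus formula for the curvature of a graph y = f(x), namely |y''| / (1 + y'^2)^(3/2), applied to the normal density. Tying that curvature to kurtosis is the framing of this post, not something the code demonstrates.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def curvature(x, mu=0.0, sigma=1.0):
    """Curvature of the normal density curve at x."""
    y = normal_pdf(x, mu, sigma)
    d1 = -(x - mu) / sigma ** 2 * y                          # first derivative
    d2 = ((x - mu) ** 2 / sigma ** 4 - 1 / sigma ** 2) * y   # second derivative
    return abs(d2) / (1 + d1 * d1) ** 1.5

def radius_of_curvature(x, mu=0.0, sigma=1.0):
    """Radius of the fitted circle, the reciprocal of curvature."""
    k = curvature(x, mu, sigma)
    return math.inf if k == 0 else 1 / k

print(radius_of_curvature(0.0))  # at the peak
print(radius_of_curvature(1.0))  # at the inflection point, one sigma out
```

At the inflection points the curvature is zero, so the fitted circle has infinite radius. That is one reason a circle can be hard to draw large enough.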

Notice the gap between the blue line and the red one. I couldn’t make that circle big enough. But, this two dimensional view misses that there are kurtoses in every direction around the distribution. Here we’ve shown the largest and the smallest. Those encompassing the distribution would be smaller than the largest and larger than the smallest. Sweeping these kurtoses around would give us a lopsided donut.

I leave it up to your imagination to sweep the ellipse around the core of the distribution to form the donut. I made a mistake by limiting the red lines to the tails of the distribution indicated by the outer circle. The actual radii would extend beyond the circle for the longer tails and not touch the outer circle for the shorter tails.

Have fun with it.