Stuff Left Out

January 17, 2016

My review of math has me learning things that I was never taught before. There were concepts left for later, if later ever arrived.

We had functions. At times we had no solutions. Later, they became functions without the right number of roots in the reals, as some of the solutions were in the space of a left-out concept, complex roots. For a while, after a deeper understanding of complex roots, I'd go looking for the unit circle, but what was that unit circle centered on? Just this week, I found it centered on c, a variable in those equations like ax² + bx + c. So that open question comes to a close, but is that enough of an answer? I don't know yet.
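
A quick numeric check of where those complex roots live; a sketch in Python, with a quadratic of my own choosing, not one from class:

```python
import cmath

def complex_roots(a, b, c):
    """Roots of ax^2 + bx + c via the quadratic formula, complex-safe."""
    d = cmath.sqrt(b * b - 4 * a * c)   # discriminant may be negative
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

# x^2 - 2x + 5 has discriminant -16, so the roots are complex conjugates.
r1, r2 = complex_roots(1, -2, 5)
center = (r1 + r2) / 2        # midpoint of the conjugate pair
radius = abs(r1 - center)     # distance from the midpoint to either root
print(r1, r2, center, radius)
```

For this example the conjugate pair straddles the real axis at -b/(2a); mapping that back onto the unit-circle question is left as the open question it was.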

We had trigonometry, because we had trigonometric functions. We have the unit circle and an equation; notice that it's not a function, because as a function it would fail the vertical line test. When an equation violates the vertical line test, we create a collection of functions covering the same values in a way that does not violate the vertical line test. Trigonometry does that for us without bothering to explain the left-out concept, manifolds.

One of Alexander Bogomolny's (@CutTheKnotMath) tweets linked to a post about a theorem from high school geometry class about how a triangle with its base as a diameter of a circle and a vertex on the circle had angles that added up to 180 degrees. Yes, I remembered that. I hadn't thought about it in decades. But, there was something left out. That only happens in a Euclidean geometry.

Well, I doodled in my graphics tools. I came to hypothesize about where the vertex would be if it were inside the circle, but not on it, with the base of the triangle on the diameter. It would be in a hyperbolic geometry was my quick answer; outside the circle, same base, spherical. Those being intuitive ideas. A trapezoid can be formed in the spherical case using the intersections of the circle with the triangle. That leaves us with the angles adding up to more than 180 degrees.

I continued to doodle. I ended up with the bases being above or below the diameter, and the vertex on the circle. Above the diameter, the geometry was hyperbolic; below, spherical. I got to a point using arc length where I could draw proper hyperbolic triangles. With the spherical geometries there were manifolds.

The point of all this concern about geometries involves how linear analyses break down in non-Euclidean spaces. The geometry of the Poisson games that describe the bowling alleys of the technology adoption lifecycle (TALC) is hyperbolic. The verticals and the horizontals of the TALC approach the normal. Those normals at, say, six sigma give us a Euclidean geometry. Beyond six sigma, we get a spherical geometry, and with it a redundancy in our modeling: twelve models, all of them correct, pick one.

We've looked at circles for decades without anyone saying we were constrained to the Euclidean. We've looked at circles without seeing that we were changing our geometries without thinking about it. We left out our geometries, geometries that inform us as to our non-linearities that put us at risk (hyperbolic) or give us too much confidence (spherical). We also left out the transitions between geometries.

The circle also illustrates why asking customers about their use cases and cognitive models, or even observing, doesn’t get us far enough into what stuff was left out. Back in school, it was the stuff that was left out that made for plateaus, stalls, stucks, and brick walls that impeded us until someone dropped a hint that led to the insight that dissolved the impedances and brought us to a higher level of performance, or a more complete cognitive model.


January 8, 2016

I just finished reading a review of “Are we Postcritical?,” a book about literary criticism.  It mentions things like Marxism and Structuralism used as the basis of literary criticism. It goes further to mention a stack of critical frameworks.

These critical frameworks are applied to all art forms from literature to sculpture to architecture to “design” in its various outlets. You might find a Structuralist critique of the iPhone. Or, a Marxist critique of some computer game.

But, developers hardly go looking for the critical reviews of their own art. We hear it on Twitter. Designers tell us the UIs of the 80's and 90's suck. Well, sure, had we been trying to serve the same users we serve today. I've got news for you, they didn't suck. They served the geeks they were intended to serve. Nothing today serves geeks. Sorry geeks, but we serve consumers today. So today, we can think about that same stack of critical frameworks that gets applied to "design."

To get to this "design" as a form of art, we would have to be artists. Yes, some Agilists come across as artists in their demand to be free from management and economic considerations and purposes. But, back in the early phase of the technology adoption lifecycle this wasn't so. Under the software engineering approach, we had a phase called "design!" Yes, we applied critical frameworks that had nothing to do with literary criticism, yet were just as critical as these art- and artist-oriented critical frameworks.

Every domain has its own critical frameworks. Before we drop a bomb, we take a photo. We drop a bomb, then we send another drone to take a photo. That photo is graded via a critical framework, and the bombing is assessed. We could, if we liked, apply a structuralist framework during this assessment, but mostly we'll ask "Hey, do we need to send another drone to take the target out?" "No, it's a kill. Tell PR there was zero collateral damage." That being just an example of a geek's critical framework.

More to the point we might say O(n), which becomes possible once we have true parallelism. Yes, everything will have to be rewritten to pass the critical framework of algorithm design. Oh, that word again, “design.”

Accountants have critical frameworks. Likewise, hardware engineers and software developers. They just don't happen to be artist-oriented frameworks. It just kills me when an artist-designer puts down a design critiqued by some critical framework unknown to them. And, it kills me that these designers want to be the CEO of firms selling to consumers without recognizing all the other critical frameworks that a CEO has to be responsive to, like those of the SEC, FCC, and various attorneys general.

Design is the demarcation of and implementation of a balance between enablers and constraints, be that with pastels and ink, plaster and pigments, or COBOL. The artist wants to send a message. The developer wants to ship a use case. What's the difference? At some scale, none at all. Design, in its artistic/literary senses, has yet to reach the literate/artistic code of developers. Design, in that art sense, is an adoption problem. But, we do have computers doing literary criticism, so one day computers will do code criticism. Have fun with that.

And, yes, since this is a product management blog, some day we’ll have a critical framework for product management. We already do, but who in the general public will be reading that in their Sunday papers? Maybe we’re supposed to be reading about it in the sports section. The sportscaster calling out the use of, oh, that design pattern.

Yes, artist-oriented design is important, as is brand. Have fun there, but there is a place for it, and that place is in the late phases of the technology adoption lifecycle, not the phases that birthed the category and value chains that underlay all that we do today. Late is not bad; it's just limited to the considerations of cash, not wealth. Oh, an economic critical framework.






December 24, 2015

In this blog, I've wondered why discontinuous innovation is abandoned by orthodox financial analysis. Some of that behavior is due to economies of scale. Discontinuous innovation creates its own category on its own market populations. Discontinuous innovation doesn't work when forced into an existing population by economies of scale, or economies of an already serviced population.

But, there is more beyond that. In "The Innovator's Dilemma," Christensen proposed separation as the means to overcome those economies-of-scale problems. Accountants asserted that it cost too much, so in the end his best idea did not get adopted. Did he use the bowling alley to build his own category and market population? No. Still, his idea was dead on, but not in the familiar way of separation as a spin out.

Further into the math, however, we find other issues, like how the underlying geometry of the firm evolves from a Dirac function, to a set of Poisson games, to an approach to the normal, to the normal, and the departure from that normal. The firm's environment starts in a taxicab geometry (hyperbolic) where linearity is fragmented, becomes Euclidean where linearity is continuous, and moves on to spherical where multiple linearities, multiple orthodox business analyses, work simultaneously.

With all these nagging questions, we come to the question of coordinate systems. Tensors were the answer to observing phenomena in different frames of reference. Tensors make transforms between systems with different coordinate systems simple. Remember that mathematicians always seek simpler. For a quick tutorial on tensors, watch Dan Fleisch explain tensors in "What's a Tensor."

In seeking the simpler, mathematicians start off hard. In the next video, the presenter talks about some complicated stuff; see "Tensor Calculus 0: Introduction." Around 48:00/101:38 into the video, the presenter claims that the difficulties in the examples were caused by the premature selection of the coordinate systems. Cylindrical coordinates involve cylindrical math, and thus cylindrical solutions; polar, similarly; linear, similarly. Tensors simplified all of that. The solutions were analytical, thus far removed from the geometric intuition. Tensors returned us to our geometric intuitions.
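
A toy illustration of that coordinate-independence, my sketch rather than the video's: a vector's components change when the frame rotates, but its length, a coordinate-free property, does not.

```python
import math

def rotate(v, theta):
    """Components of the same vector expressed in a frame rotated by theta."""
    x, y = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

v = (3.0, 4.0)
w = rotate(v, math.pi / 6)                # same vector, different coordinate system
print(math.hypot(*v), math.hypot(*w))     # the length survives the transform (up to rounding)
```

Picking the rotated frame changed every component, but not the thing the components describe. That is the property the tensor machinery protects at scale.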

The presenter says that when you pick a coordinate system, “… you’re doomed.” “You can’t tell. Am I looking at  a property of the coordinate system or a property of the problem?” The presenter confronts the issue of carried and carrier, or mathematics as media. I’ve blogged about this same problem in terms of software or software as media. What is carried? And, what is the carrier presenting us with the carried?

Recently, there was a tweet linking to a report on UX developer hiring vs infrastructure developer hiring. These days the former is up and the latter is down. Yes, a bias towards stasis, and definitely away from discontinuous innovation in a time when the economy needs the discontinuous more than the continuous. The economy needs some wealth creation, some value chain creation, some new career creation. Continuous innovation does none of that. Continuous innovation captures some cash. But, all we get from Lean and Open and fictional software is continuous innovation, replication, mismanaged outside monetizations and hype, so we lose to globalism and automation.

We can change the world significantly, or serve ads. We're choosing to serve ads.

Back to the mathematics.

I'm left wondering about kernels and how they linearize systems of equations. What does a kernel that linearizes a hyperbolic geometry look like? A spherical kernel, likewise? We linearize everything regardless of whether it's linear or not. We've selected an outcome before we do the analysis, just like going forward with an analysis embedding a particular coordinate system. We've assumed. Sure, we can claim that the mathematics, the kernel, makes us insensitive, or enables us to be insensitive, to the particular geometry. We assume without engaging in what-ifs.

Kernels like coordinate systems have let us lose our geometric intuition.

There should be a way to do an analysis of discontinuous innovation without the assumptions of linearity, linearizing kernels, a Euclidean geometry, and a time-sheared temporality.

Time-sheared temporality was readily apparent when we drove Route 66. That tiny building right there was a gas station. Several people waited there for the next Model T to pull in. The building next to it is more modern by decades.

This is the stuff we run into when we talk about design or brand, or use words like early-stage, a mess that misses the point of the technology adoption lifecycle: only the late main street and later phases involve orthodox business practices typical of F2000 firms. That stuff didn't work in the earlier phases. It doesn't work when evaluating discontinuous innovation.


Is the underlying technology yet to be adopted? Does it already fit into your firm’s economies of scale? Wonder about those orthodox practices and how they fail your discontinuous efforts?





December 14, 2015

More statistics this week. Again, surprise ensued. I'll be talking math, but thinking product management. I've always thought in terms of my controls and their frequency of use, but when does the data converge? When does it turn into a time series on me? Agile and Minimal Viable Product are experiment based. But, how deep is our data?

So while everything I'm going to say here may not be new to anyone, maybe you'll be surprised somewhere along the way.

First, we start with the definition of probability.

01 Probability 01

The stuff between the equal signs is predicate calculus, or mathematical logic, the easy stuff. It’s just shorthand, shorthand that I never used to get greasy with my notes. In college, I wanted to learn it, but the professor didn’t want to teach it. He spent half the semester reviewing propositional calculus, which was the last thing I needed.

Moving on.

01 Probability 02

What surprised me was "conditions or constraints." That takes me back to formal requirements specification, in the mid to late 80's, where they used IF…THEN… statements to prove the global context of what program proving could only prove locally. Requirements were questions. Or, Prolog assertions that proved themselves.

Constraints are the stuff we deal with in linear programming, so we get some simultaneous equations underpinning our probabilities.

01 The World

The red stuff is the particular outcome. Anything inside the box is the sum of all outcomes. Just take the space outside the distribution as zero, or ground.

Lately, I got caught on the issue of what is the difference between iteration and recursion. I Googled it. I read a lot of that. I've done both. I've done recursive COBOL, something my IT-based, aka data processing, professor didn't like. No, it was structured coding all the way. Sorry, but I was way early with objects at that point. But, back to the difference; no, none of it really struck me as significant.

What I really wanted was some explanation based on the Ito/Markov chain notions of memory. So I'll try to explain it from that point of view. Let's start with iteration.


02 Iteration

Iteration has some static or object variables where it saves the results of the latest iteration. I'm using an index and the typical for-loop constructs. There are other ways to loop.

That's code, but more significant is the experiment that we are iterating. The conditions and context of the experiment tell us how much data has to be stored. In iterations, that data is stored so that it can be shared by all the iterations. Recursion will put this data elsewhere. The iteration generates or eats a sequence of data points. You may want to process those data points, so you have to write them somewhere. The single memory will persist beyond the loop doing the iteration, but it will only show you the latest values.
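
A minimal sketch of that single shared memory, with names of my own choosing:

```python
def iterate_mean(data):
    """The running (count, total) pair is the iteration's single memory,
    shared by every pass of the loop; only the latest values survive it."""
    count, total = 0, 0.0
    for x in data:          # the typical for-loop construct
        count += 1
        total += x
    return total / count

print(iterate_mean([1.0, 2.0, 3.0, 4.0]))   # 2.5
```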

It can take a long time to iterate to, say, the next digit in pi. We can quickly forecast some values with some loose accuracy, call it nearly inaccurate, and replace the forecast with accurate values once we obtain those accurate values. Estimators and heuristics do this roughing out, sketching for us. They can be implemented as iterations or recursions. Multiprocessing will push us to recursion.

03 Iteration w Heuristic

Notice that I’ve drawn the heuristic’s arc to and from the same places we used for our initial iterations or cycles. The brown line shows the heuristic unrolled against the original iterations. This hints towards Fourier Analysis with all those waves in the composition appearing here just like the heuristic. That also hints at how a factor analysis could be represented similarly. Some of the loops would be closer together and the indexes would have to be adjusted against a common denominator.

Throughout these figures I've drawn a red dot in the center of the state. Petri nets use that notation, but I'm not talking Petri nets here. The red dots were intended to tie the state to the memory. The memory has to do with the processing undertaken within the state, and not the global notions of memory in Markov chains. The memory at any iteration reflects the state of the experiment at that point.


In recursion, the memory is in the stack. Each call has its own memory. That memory is sized by the experiment, and used during the processing in each call. Iteration stops on some specified index, or conditions. Recursion stops calling down the stack based on the invariant and switches to returning up the stack. Processing can happen before the call, before the return, or between the call and the return. Calling and returning are thin operations; processing, thick.
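
The same computation as a recursion, a sketch to make the memory-per-call point concrete: each call gets its own (count, total) on the stack, the empty list serves as the invariant, and the processing happens between the call and the return.

```python
def recurse_mean(data):
    def helper(rest):
        if not rest:                       # the invariant: stop calling, start returning
            return 0, 0.0
        count, total = helper(rest[1:])    # thin: the call down the stack
        return count + 1, total + rest[0]  # thick: processing on the way back up
    count, total = helper(list(data))
    return total / count

print(recurse_mean([1.0, 2.0, 3.0, 4.0]))  # 2.5, same as the iterative version
```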

04 Recursion

The individual memories are shown as red vertical lines inside the spiral or tunnel. We start with calls and when we hit the invariant, the blue line, we do the processing and returning. We start at the top of the stack. Each call moves us towards the bottom of the stack, as defined by the invariant. Each return moves us back towards the top of the stack. The graph view shows the location of the invariant. The calling portion of the tunnel is shorter than the processing and returning portion of the tunnel.

Notice that I'm calling the invariant the axis of symmetry. That symmetry would be more apparent for in-order evaluation. Pre-order and post-order evaluation would be asymmetrical, giving rise to skewed distributions.

Recursion is used in parsers and in processing trees, potentially game trees. In certain situations we are looking for convergences of distributions or sequences.

05 Convergence and Sequence

The black distribution here represents a Poisson distribution. This is the Poisson distribution of the Poisson game typical of the early adopter in the bowling alley of the technology adoption lifecycle. That Poisson distribution tends to the normal over time through a series of normals. The normals differ in the widths of their standard deviations. That increase in widths over time is compensated for by lower heights, such that the area under each of those normals is one.

We also show that each call or iteration can generate the next number in a sequence. That sequence can be consumed by additional statistical processing.

06 Numeric Convergence

Here, in a more analytic process, we are seeking the convergence points of some function f(n). We can use the standard approach of specifying a bound, ε, for the limit, or a more set-theoretic limit where two successive values are the same, aka cannot be in the same set. Regardless of how that limit is specified, those limits are the points of convergence. Points of convergence give us the bounds of our finite world.
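
Both stopping rules can be sketched in a few lines; the function and the ε here are my choices, not anything from the figures:

```python
import math

def converge(f, x0, eps=1e-12, max_iter=10_000):
    """Iterate x -> f(x) until two successive values are within eps."""
    x = x0
    for _ in range(max_iter):
        nxt = f(x)
        if abs(nxt - x) < eps:     # the epsilon bound on the limit
            return nxt
        x = nxt
    raise RuntimeError("no convergence inside the finite world")

# The fixed point of cos(x), about 0.739, found by plain iteration.
print(converge(math.cos, 1.0))
```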

Throughout I’ve used the word tunnel. It could be a spiral, or a screw. Wikipedia has a nice view of two 3D spirals, take a look. I didn’t get that complex here.

07 3D Sphere Spiral


When you experiment, and every click is an experiment in itself, or in aggregate, how long will it take to converge to a normal distribution, or to an analytic value of interest? What data is being captured for later analysis? What constraints and conditions are defining the experiment? How will you know when a given constraint is bent or busted, which in turn breaks the experiment and subsequent analysis?




Box Plots and Beyond

December 7, 2015

Last weekend, I watched some statistics videos. Along with the stuff I know, came some new stuff. I also wrestled with some geometry relative to triangles and hyperbolas.

We'll look at box plots in this post. They tell us what we know. They can also tell us what we don't know. Tying box plots back to product management, they give us a simple tool for saying no to sales. "Dude, your prospect isn't even an outlier!"

So let’s get on with it.

Box Plots

In the beginning, yeah, I came down that particular path, the one starting with the five number summary. Statistics can take any series of numbers and summarize them into the five number summary. The five number summary consists of the minimum, the maximum, the median, the first quartile, and the third quartile.

boxplot 01
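
The five-number summary is cheap to compute. A stdlib sketch; quartile conventions vary, and "inclusive" here matches what most textbooks compute by hand:

```python
from statistics import median, quantiles

def five_number_summary(data):
    """Minimum, first quartile, median, third quartile, maximum."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, median(data), q3, max(data)

print(five_number_summary([1, 3, 5, 7, 9, 11, 13]))   # (1, 4.0, 7, 10.0, 13)
```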

Boxplots are also known as box and whisker charts. They also show up as candlestick charts. We usually see them in a vertical orientation, and not a horizontal one.

boxplot 04

Notice the 5th and 95th percentiles appear in the figure on the right, but not the left. Just ignore them and stick with the maximum and minimum, as shown on the left. Notice that outliers appear in the figure on the left, but not on the right. Outliers might be included in the whisker parts of the notation or beyond the reach of the whiskers. I go with the latter. The figure on the left puts the outliers more than 3/2 of the IQR above the upper quartile, or more than 3/2 of the IQR below the lower quartile. Others say 1.5 * IQR, which is the same thing. Notice that there are other data points beyond the outliers. We omit or ignore them.

The real point here is that the customer we talk about listening to is somewhere in this notation. Even when we are stepping over to an adjacent step on the pragmatism scale, we don't do it by stepping outside our outliers. We do it by defining another population and constructing a box and whiskers plot for that population. When sales, through the randomizing processes they use, brings us a demand for functionality beyond the outliers of our notations, just say no.

We really can’t work in the blur we call talking to the customer. Which customer? Are they really prospects, aka the potentially new customer, or the customer, as in the retained customer? Are they historic customers, or customers in the present technology adoption lifecycle phase? Are they on the current pragmatism step or the ones a few steps ahead or behind? Do you have a box and whisker chart for each of those populations, like the one below?


This chart ignores the whiskers. The color code doesn’t help. Ignore that. Each stick represents a nominal distribution in a collective normal distribution. Each group would be a population. Here the sticks are arbitrary, but could be laid left to right in order of their pragmatism step. Each such step would have its own content marketing relative to referral bases. Each step would also have its own long tail for functionality use frequencies.

Now, we’ll take one more look at the box plot.


Here the outliers are shown going out to +/- 1.5 IQRs beyond the Q1 and Q3 quartiles. The IQR spans the values between Q1 and Q3. It's all about distances.
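
The fences are cheap to compute, which makes the "just say no" test mechanical. A sketch, with made-up data and the usual 1.5 multiplier:

```python
from statistics import quantiles

def fences(data, k=1.5):
    """Lower and upper fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

data = [2, 3, 4, 5, 5, 6, 7, 8, 30]          # 30 is the salesman's prospect
lo, hi = fences(data)
outliers = [x for x in data if x < lo or x > hi]
print(lo, hi, outliers)                       # the prospect falls beyond the upper fence
```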

The diagram also shows Q2 as the median and correlates Q2 with the mean of a standard normal distribution. Be warned here that the median may not be the mean, and when it isn't, the real distribution would be skewed and non-normal. Going further, keep in mind that a box plot is about a series of numbers. They could be z-scores, or not. Any collection of data, any series of data, has a median, a minimum, a maximum, and quartiles. Taking the mean and the standard deviation takes more work. Don't just assume the distribution is normal or fits under a standard normal.

Notice that I added the terms upper and lower fence to the figure, as that is another way of referring to the whiskers.

The terminology and notation may vary, but in the mathematics sense, you have a sandwich. The answer is between the bread, aka the outliers.

The Normal Probability Plot

A long while back, I picked up a book on data analysis. The first thing it talked about was how to know if your data was normal. I was shocked. We were not taught to check this before computing a mean and a standard deviation. We just did it. We assumed our data fit the normal distribution. We assumed our data was normal.

It turns out that it's hard to see if the data is normal. It's hard to see on a histogram. It's hard to see even when you overlay a standard normal on that histogram. You can see it on a box and whiskers plot. But, it's easier to see with a normal probability plot. If the data, once ordered, forms a straight line on the plot, it's normal.
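
A stdlib-only sketch of the idea behind the plot: pair the ordered data with theoretical normal quantiles and measure how straight the pairing is with a correlation. The plotting-position formula here is one common convention among several.

```python
import math
from statistics import NormalDist, mean

def probability_plot_r(data):
    """Correlation between ordered data and theoretical normal quantiles.
    Values near 1.0 mean the plot is nearly a straight line, i.e. normal-ish."""
    n = len(data)
    ordered = sorted(data)
    theo = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, my = mean(ordered), mean(theo)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ordered, theo))
    sxx = sum((x - mx) ** 2 for x in ordered)
    syy = sum((y - my) ** 2 for y in theo)
    return sxy / math.sqrt(sxx * syy)

print(probability_plot_r([-1.2, -0.8, -0.3, -0.1, 0.0, 0.2, 0.4, 0.7, 1.1]))
```

Skewed or heavy-tailed data pulls this correlation down, which is the numeric version of the line bending at the tails.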

The following figure shows various representations of some data that is not normal.

Not Normal on Histogram Boxplot Normal Probability Plot

Below are some more graphs showing the data to be normal on normal probability plots.

Normal Data

And, below are some graphs showing the data to not be normal on normal probability plots.

Non-normal Data

Going back to the first normal probability plot, we can use it to explore what it is telling us about the distribution.

Normal Probability Plot w Normal w Tails

Here I drew horizontal lines where the plotted line became non-normal, aka where the tails occur. Then, I drew a horizontal line representing the mean of the data points excluding the outliers. Once I exclude the tails, I've called the bulk of the graph, the normal portion, the normal component. I represent the normal component with a normal distribution centered on the mean. I've labeled the base axis of the normal as x0.

Then, I went on to draw vertical lines at the tails and the outermost outliers. I also drew horizontal lines from the outermost outliers so I could see the points of convergence of the normal with the x-axis, x0. I drew horizontal lines at the extreme outliers. At those points of convergence I put black swans of lengths equal to the heights, or thicknesses, of the tails, giving me x1 and x2.

Here I am using the notion that black swans account for heavy tails. The distribution representing the normal component is not affected by the black swans. Some other precursor distributions were affected, instead. See Fluctuating Tails I and Fluctuating Tails II for more on black swans.

In the original sense, black swans create thick tails when some risk causes future valuations to fall. Rather than thinking about money here I’m thinking about bits, decisions, choices, functionality, knowledge–the things financial markets are said to price. Black swans cause the points of convergence of the normal to contract towards the y-axis. You won’t see this convergence unless you move the x-axis, so that it is coincident with the distribution at the black swan. A black swan moves the x-axis.

Black swans typically chop off tails. In a sense, they remove information. When we build a system, we add information. As used here, I'm using black swans to represent the adding of information. Here the black swan adds tail.

Back to the diagram.

After all that, I put the tails in with a Bezier tool. I did not go and generate all those distributions with my blunt tools. The tails give us some notion of what data we would have to collect to get a two-tailed normal distribution. Later, I realized that if I added all that tail data, I would have a wider distribution and consequently a shorter distribution. Remember that the area under a normal is always equal to 1. The thick blue line illustrates such a distribution that would be inclusive of two tails on x1. The mean could also be different.
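
That wider-so-shorter trade can be checked directly: the peak of a normal is 1/(σ√2π), and a crude Riemann sum shows the area holding at 1 as σ grows. The σ values are mine:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

for sigma in (1.0, 2.0, 4.0):
    peak = normal_pdf(0.0, 0.0, sigma)      # 1 / (sigma * sqrt(2*pi))
    step = sigma / 100
    area = sum(normal_pdf(-10 * sigma + i * step, 0.0, sigma)
               for i in range(2001)) * step  # +/- 10 sigma covers essentially everything
    print(sigma, peak, round(area, 4))      # peak halves as sigma doubles; area stays 1
```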

One last thing: the distribution behind the normal probability plot I used was said to be a symmetric distribution with thick tails. I did not discover this. I read it. I did test symmetry by extending the x1 and x2 axes. The closer together they are, the more symmetric the normal distribution would be. It's good to know what you're looking at. See the source for the underlying figure and more discussion at


Always check the normalcy of your data with a normal probability plot. Tails hint at what was omitted during data collection. Box plots help us keep the product in the sandwich.


Fluctuating Tails II

November 22, 2015

In my last post, "Fluctuating Tails I," we explored the effects of a black swan on a single normal distribution. In this post, we will look at the effects of a black swan on a multinormal distribution from the perspective of a linear regression.

Let's start off with the results of a linear regression of multidimensional data. These regressions give rise to an ellipse containing the multidimensional data. This data also gives rise to many normal distributions summed into a multinormal distribution.

 Black Swan Imacts on Multivarite Distribution 1

I modified the underlying figure. The thick purple line on the distribution on the p(y) plane represents the first black swan. The thin purple line projects the black swan across the ellipse resulting from the regression. The data to the right of the black swan is lost. The perpendicular brown lines help us project the impacts on to the distribution on the p(x) plane. The black swan would change the shape of the light green ellipse, and it would change the shape of the distribution, shown in orange, on the p(x) plane.

In the next figure, we draw another black swan on the p(y) plane distribution further down the tail. We use a thin black line to represent the second black swan. This black swan has fewer impacts.

Black Swan Imacts on Multivarite Distribution 4

In neither of these figures did I project the black swan onto the p(x) plane distribution, or draw the new x’ and y’ axes as we did in the last post. I’ll do that now.

Black Swan Imacts on Multivarite Distribution 3

Here we have projected the black swan and moved the x and y axes.

Notice that the black swan is asymmetrical, so the means of the new distributions would shift. This means that any hypothesis testing done with the distributions before the black swan would have to be done again. Correlation and strength tests depend on the distance between the means of the hypotheses (distributions).
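
The mean shift is easy to demonstrate by simulation; the sample, the seed, and the cut point are all my choices:

```python
import random
from statistics import mean

random.seed(7)
sample = [random.gauss(100.0, 15.0) for _ in range(10_000)]

swan = 110.0                                   # the black swan cuts here
survivors = [x for x in sample if x <= swan]   # data beyond the swan is lost

print(mean(sample), mean(survivors))
# The truncated mean falls below the original, so hypothesis tests built on
# the pre-swan distributions no longer apply.
```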

Parameter Distributions

After drawing these figures, I went looking for Lévy flight parameters. I wanted to show how a black swan would affect jumps in a Lévy random walk. I settled instead on a Rice distribution.

 Rice Distribution

The shades of blue in the figure are the standard deviations of sigmas from the mean. Sigma is one parameter of the Rice distribution. V is another.

Rice Distribution Parameters

Here are the PDFs and CDFs of a Rice distribution given the relevant parameter values. The blue vertical line through both of the graphs is an arbitrary black swan. Some of the distributions are hardly impacted by the black swan. A particular distribution would be selected by the value for the parameter v. The distributions would have to be redrawn after the black swan to account for the change in the ranges of the distributions. Once redrawn, the means would move if the black swan was asymmetrical. This is the case for the Rice distribution and any normal distributions involved.
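
For the curious, the Rice PDF can be sketched with nothing but the stdlib; the Bessel function I0 comes from its power series, and the parameter values and the black swan's location are my choices:

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I0 of the first kind via its power series."""
    return sum((x * x / 4.0) ** k / math.factorial(k) ** 2 for k in range(terms))

def rice_pdf(x, v, sigma):
    """Density of the Rice distribution with parameters v and sigma."""
    s2 = sigma * sigma
    return (x / s2) * math.exp(-(x * x + v * v) / (2 * s2)) * bessel_i0(x * v / s2)

# A black swan at x = swan zeroes the density beyond it; what survives has to
# be renormalized so the remaining area is 1 again, which moves the mean.
swan, step = 2.5, 0.01
total = sum(rice_pdf(i * step + step / 2, 2.0, 1.0) for i in range(1200)) * step
kept = sum(rice_pdf(i * step + step / 2, 2.0, 1.0) for i in range(int(swan / step))) * step
print(round(total, 3), round(kept, 3))   # full area near 1; the swan keeps less
```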

If the parameters themselves were distributions, a black swan would eliminate parameter values and the distributions for those parameter values.

When we base decisions on statistical hypothesis testing, we need to deal with the impacts of black swans on those decisions.


Fluctuating Tails

November 13, 2015

On twitter tonight, @tdhopper posted a citation to a journal article on the fluctuating tails of power law distributions. In my last post, I mentioned how black swans moved the tail of a normal distribution. So I took a look at those power law distributions. We’ll talk about that first. Then, I’ll go back and look at tail fluctuations and more for normal distributions.

Power Law Distributions

I drew a two-tailed distribution. This distribution has an axis of symmetry. In the past, I talked about this axis of symmetry as being a managerial control. In the age of content marketing, a question we might ask is what is the ratio of developers to writers. The developers would have their tail, a frequency of use per UI element histogram, and the writers would have their tail, the SEO page views histogram. Add a feature, a few pages–not just one. So that axis of symmetry becomes a means of expressing a ratio. That ratio serves as a goal, or as a funding reality. Adding features or pages would constitute fluctuations in the tails of a power law distribution.

The commoditization of some underlying technology, say the day relational databases died, would result in loss of functionality and content. That would be a black swan. In its original sense, the financial one, against a normal distribution, the losses would be in stock price. In a more AI sense that I've written about before, world size, the losses would be in bits.

So I’ve illustrated three cases of fluctuating tails for a power law distribution.

Fluctuations in Power Law Distributions as Change in Angle of Axis of Symmetry

The first power law distribution is A, shown in orange. Its tails have a ratio of 1:1. Each tail has the same length. On the figure, the arrowheads represent the points of convergence and provide us with a side of a rectangle representing the size of our world. The point of convergence is represented by a black dot for emphasis.

The second power law distribution is B, shown in green. Its tails have a ratio of 2:1, as in x:y. The green arrow gives us the right side of our rectangular world. Changing the angle of the axis of symmetry is one way of expressing fluctuation or volatility. The axis of symmetry runs from the origin to the opposing corner of that rectangle.

The third example is C shown in red. This power law distribution has undergone a black swan. The black swan is represented by a black vertical line intersecting the power law distribution B. That point of intersection becomes the new point of convergence for power law distribution C. Notice that this means the black swan effectively moves the x-axis. This makes the world smaller in width and height. The new x-axis is indicated by the red x’ axis on the figure. If this figure were data driven the ratio for the axis of symmetry could be determined. Black swans are another means of expressing fluctuation. Realize that stock price changes act similarly to black swans, so there is daily volatility as to the location of the x-axis.

Normal Distributions

I’ve talked about the normal distributions and black swans in the past. But, this time I found some tools for making accurate normal distributions where I freehanded them in the past. The technology adoption lifecycle is represented by a normal distribution. The truth is that it is at least four different normal distributions and a bowling ally’s worth of Poisson distributions. And, if you take the process back to the origination of a concept in the invisible college you’ll find a Dirac function.

Let’s look at a standard normal, a normal with a mean of zero and a standard deviation of 1, and a normal with a mean of zero and a larger standard deviation. The value of that larger standard deviation was limited by the tool I was using, but the point I wanted to make is still obvious.

Let's just say that the standard normal is the technology adoption lifecycle (TALC). Since I focus on discontinuous innovation, I start with the sale to the B2B early adopter. That sale is a single lane in the bowling alley. That sale can likewise be represented by a Poisson distribution within the bowling alley. The bowling alley as a whole can be represented as a Poisson game.

The distribution with a larger standard deviation is wider and shorter than the standard normal. That larger standard deviation happens as our organizations grow and we serve economies of scale. Our margins fall as well. That larger standard deviation is where our startups go once they M&A. Taking a Bayesian view of the two normals, the systems under those distributions are by necessity very different. The larger normal is where F2000 corporations live, and what MBAs are taught to manage. Since VCs get their exits by selling the startup to the acquirer, the VCs look for a management that looks good to the acquirer. They are not looking for good managers of startups.

After drawing the previous figure, I started on the normal undergoing a black swan. With a better tool, I was surprised.

Now a warning, I started this figure thinking about Ito processes beyond Markov processes, and how iteration and recursion played there. Programmers see iteration and recursion as interchangeable. Reading the definitions makes it hard to imagine the difference between the two. The critical difference is where the memory or memories live. Ultimately, there is a convergent sequence, aka there is a tail. The figure is annotated with some of that thinking.
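A trivial Python example makes that memory difference concrete (the functions are mine, just an illustration):

```python
def sum_iterative(xs):
    """Iteration: the memory lives in one local accumulator the loop updates in place."""
    total = 0
    for x in xs:
        total += x
    return total

def sum_recursive(xs):
    """Recursion: the memory lives on the call stack; each pending frame holds one partial state."""
    if not xs:
        return 0
    return xs[0] + sum_recursive(xs[1:])
```

Both converge to the same answer, but the iterative version carries its past in a single variable while the recursive version carries it in a stack of unfinished calls; that is where the memories live.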

So the figure.

Black Swan 01

I started with a standard normal, shown in dark blue. The gray horizontal line at y=4 is the top of the rectangle representing the size of the world, world n, associated with the standard normal. This is the world before the black swan.

The black swan is shown in red. The new x-axis, x’, runs through the point where the normal intersects with the horizontal line representing the black swan. Notice that the normal is a two-tailed distribution, so the new x-axis cuts the normal at two points. Those points define the points of convergence for a new thinner normal distribution. I drew that normal in by hand, so it’s not at all accurate. The new normal is shown in light blue. The red rectangle represents the new distribution’s world, world n+1.

The new distribution is taller. This is one of the surprises. I know the area under each of the two normals equals one, but how many times have you heard that without grasping all of that property's consequences? Where you can see the new normal in the diagram, what you are looking at is the new learning that's needed. Again, taking a Bayesian/TALC point of view.

Between the new x-axis and the old x-axis, we have lost corporate memory and financial value.  The width of the new distribution is also thinner than the original distribution. This thinning results from corporate memory loss.
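That taller, thinner post-black-swan normal can be checked numerically. This Python sketch (names mine) truncates a standard normal at the black swan's new points of convergence and renormalizes so the surviving area is one again:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative probability via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def truncated_pdf(x, lo, hi, mu=0.0, sigma=1.0):
    """Renormalize the surviving slice of the normal so its area is 1 again."""
    if not (lo <= x <= hi):
        return 0.0
    mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return normal_pdf(x, mu, sigma) / mass

# Cut the standard normal at +/-1, the new points of convergence:
peak_before = normal_pdf(0.0)
peak_after = truncated_pdf(0.0, -1.0, 1.0)
```

The peak at the mean goes from about 0.40 to about 0.58: forcing the area back to one pushes the curve up, which is the surprise.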

I also annotated some time considerations. This would be TALC related. The black swan happens at some very recent past, which we can consider as happening now. Using the black swan as an anchor for a timeline lets us see how a black swan affects our pasts, and our futures. Those memory losses happen in our tails.

The original x-axis represents, in the AI sense, the boundary between the implicit and explicit knowledge. I know that’s pushing it, but think about it as the line between what we assert and on what we run experiments.

I drew a Dirac distribution on the diagram, but it doesn't happen at the mean or where a normal would be. It is a weak signal. It happens prior to any TALC related Poisson games. Oddly enough, I've crossed paths with a Dirac when I asserted a conditional probability.

So here is a Dirac distribution, not normalized, just for the fun of seeing one. This is from Wikipedia.


Please leave comments. Thanks.

Flashbulb Pop!

November 2, 2015

I’ve went through more of that math for life sciences book. It’s taking forever and it’s at that point where you want it to end, but there is one more chapter taunting you. The discussion is about how samples tend to the normal, and how the sum of normals is another normal. It sounds straightforward enough. But, there was other reading, other thinking, and a surprise.

I’ve talked about how a distribution defines a world, a rectangular world, and how a black swan chops off the smooth convergence and creates a thick tail. It moves the convergences of that world defining distribution, so you end up with a smaller world. Wall street puts a valuation on those worlds, so smaller means a massively lower price for your stock.

I’ve been reading about random walks and Levy flights. The length of the jump and the direction of the jumps in these things is controlled by several parameters each under their own distribution. So instead of having one distribution we have several. And, that black swan cuts through them as well. If we are making a jump in a Levy flight, and we’re not there yet, that black swan would force us to backtrack and make a different jump. We’d stop drilling in the Arctic. That black swan is operating on our networks.

I’ve also come across the notion of causation. Correlation is not causation. We hear that all the time. But, what is correlation? Correlation is a pair of nodes in a network. Causation is an arc between nodes in a network. The network is a collection of distributions connected by other distributions. This was the lesson of “Wetware.” [I tried to find the link, but I’m not certain at this time.] In biochemistry, we had the Krebs cycle, a nice circular pathway describing metabolism. Alas, the cycle isn’t real. It’s a name we put on a collection of random walks constrained by physical shapes.

Our networks include value chains, and they get cut by our black swan as well. That smaller world that the black swan brings us involves all of our networks, not just the one describing our progress across the technology adoption lifecycle, or across our market populations. What we really have is a multinormal collection of distributions all being sliced at the same time. We can’t make the strategic jumps we intended to make. We can’t continue our daily routine in the tactical space either.

The multinormal distribution is also the best way to think about populations for discontinuous innovation. Innovating for the same and adjacent populations, the populations of our economies of scale, is continuous–one of our normals. Discontinuous innovation has us addressing and tuning ourselves to a population beyond our economies of scale, a yet to be discovered population–another normal. Keeping those normals separate is essential, but Christensen couldn't sell that idea to cost accountants, despite it being the way to create economic wealth, rather than just capturing more cash. Keeping those normals separate would be essential to our organizations, because our current position in our category's technology adoption lifecycle is tuned into our built organizational structure.

You can’t sell to late market pragmatists and early adopters with the same sales force, or same marketing, or same business model. Is it any wonder that existing companies don’t do discontinuous innovation? Is it any wonder that the typical analysis of an F2000 company doesn’t work for discontinuous innovation? The first assumption would be our experience, our templates, our economies of scale matter. Well, no, and that’s long before you get to the differences in the geometries of the pre-six sigma company and the 42 sigma company. It fundamentally boils down to separate populations and their normals. And, that huge paper slicer we call the black swan. Chop. Opportunities gone in a pop of a flashbulb, in a moment unsuspected, but delusionally well known.


HDR for Product Managers

August 4, 2015

I discovered HDR a while back. It’s been a while. I stopped paying attention to cameras a long time ago. Then, just browsing through the photography section at B&N, I came across my first mention of HDR. I came to a rudimentary understanding of it. I thought it seemed like a data warehouse and I left that thought there to bounce around in my head. So it’s been bouncing ever since.

I bought a book on some photo manipulating software last year. It’s not like I use that kind of software. But, it was something to read on an airplane, but not what you’d call an airplane read. Tonight, I was wondering if I could sell it at Half-Price Books, but no, there was a chapter on HDR, so I have to read that. I have a better understanding of what it is.

What is HDR?

HDR imaging, high-dynamic-range imaging, captures a larger dynamic range than a camera without it. It captures 32 bits where a normal camera captures 16 bits. In a photograph, that translates into larger tonal differences between white and black. It lets you do the Zone System in your camera rather than in a darkroom. It attempts to see the way an eye sees. A camera takes one image at one point in time. The human eye sees a series of images and adjusts the contrast on the fly as it attends to various locations in the scene. In the mathematical sense, HDR is local to the normal camera's global. In the ordinary photograph the global swamps the local and details get lost, for better or worse.

In the old days you bracketed the exposure and took three shots, -1, 0, and +1, hoping that when you got back to the darkroom you'd have a shot you could use. HDR automatically takes a much wider sequence of exposure settings, constructs a photo where all the details were captured, and then puts all the pieces back together from the different shots. Back in the old days, one of the shots was going to be the picture, the best one. In HDR, each shot might have something to contribute to the final picture.
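As a sketch of the idea, not of any real HDR algorithm, here is a toy exposure fusion in Python: weight each pixel by how well exposed it is, then average across the bracketed shots. Everything here, names and weighting alike, is my own simplification.

```python
def fuse_exposures(exposures):
    """Fuse bracketed shots; pixel values are floats in [0, 1].

    Each pixel is weighted by its distance from mid-gray, so well-exposed
    mid-tones dominate and blown-out or crushed pixels contribute little.
    """
    fused = []
    for px_stack in zip(*exposures):
        weights = [1.0 - abs(v - 0.5) * 2.0 for v in px_stack]
        total_w = sum(weights) or 1.0  # guard: all-black/all-white stacks
        fused.append(sum(w * v for w, v in zip(weights, px_stack)) / total_w)
    return fused

# Three bracketed "images" (a single scanline) at -1, 0, +1 exposure:
under = [0.05, 0.10, 0.40]
normal = [0.20, 0.50, 0.95]
over = [0.55, 0.90, 1.00]
result = fuse_exposures([under, normal, over])
```

Each shot contributes wherever it happens to be the well-exposed one, which is the point: no single exposure is "the picture."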

So enough already. How does that get me to something a product manager could use?

Frequency of Use Histograms

As product managers (PMs) we might be envious of the histograms that product marketing managers (PMMs) get from their SEO analytics and log files. They get a long tail. They understand which conversions worked and which ones didn't work. They know how their content network is pushing prospects, buying roles, customers, and users to their next sale or use. They can optimize their content to their audiences. Product managers could have the same thing.

Each content conversion has a frequency that you get from the analytics that sum up the clicks recorded in the server logs. Notice the clicks. They don't look like UI elements, but at an abstract level, that's exactly what they are. So in the parallel universe of product managers, we get the use frequencies for every control in our UI. Since that isn't an off-the-shelf thing, it would have to be built in. When a user clicks on button A, the button makes a request to a server, the server logs the request, and serves nothing, finishing the request. Then, using the same SEO analytics tools, sum up the requests in various ways over various periods, and that gives you the frequency of use histogram for the controls in a collection of controls inside your application or across a collection of applications. Product managers and product marketing managers would have analytic equality. They both have their frequency of use histograms.
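A minimal sketch of that logging-and-summing pipeline in Python, with hypothetical log lines and control names of my own invention:

```python
from collections import Counter

# Hypothetical server log lines: timestamp, user, UI control id.
log_lines = [
    "2015-08-04T10:00:01 user1 buttonA",
    "2015-08-04T10:00:05 user2 buttonA",
    "2015-08-04T10:00:09 user1 menuSave",
    "2015-08-04T10:01:12 user3 buttonA",
    "2015-08-04T10:02:30 user2 menuSave",
    "2015-08-04T10:03:00 user1 sliderZoom",
]

def frequency_of_use(lines):
    """Sum logged clicks per UI control: the product manager's long tail."""
    return Counter(line.split()[2] for line in lines)

histogram = frequency_of_use(log_lines)
```

Sort that Counter by count and you have the frequency of use histogram, with the rarely touched controls out in the tail.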

I’ve written about these frequency of use histograms in other posts.

I wrote a post where I put the PM histogram on the x-axis and the PMM histogram on the y-axis and coordinated them across the axis of symmetry of an exponential curve. But, it must be lost in a prior blog. That axis of symmetry is one point of control. It would determine the length of the long tails of the product document set/touchpoint collection and product controls.

So we have our frequency of use histograms. In the product manager's histogram, each bar would represent a single feature or a rollup of feature frequencies in a given use case. The aggregation would depend on the analyst, the product manager.

Data Warehouses

A data warehouse aggregates data in different ways. The sums of a single data item could be represented by one histogram; aggregates by another. Aggregates can also be represented by pie charts. In the end, data warehouses contain histograms.
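A toy Python version of that roll-up, with a hypothetical fact table, shows how each aggregation is just another histogram:

```python
from collections import defaultdict

# Hypothetical fact table: (region, product, units_sold) rows.
rows = [
    ("west", "widget", 10),
    ("west", "gadget", 4),
    ("east", "widget", 7),
    ("east", "widget", 3),
]

def aggregate(rows, key_index):
    """Roll the fact table up along one dimension; the result is a histogram."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[key_index]] += row[2]
    return dict(sums)

by_region = aggregate(rows, 0)   # one histogram
by_product = aggregate(rows, 1)  # another histogram, same facts
```

Same data, different groupings: every slice of the warehouse is a bar chart waiting to be drawn.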

Back to the HDR

In photo editing software a photo is, likewise, represented via histograms. A data warehouse is like an ordinary photo. It represents a firm at one moment in time, one interval. Cameras use exposure settings to define the time interval that becomes a photo's one moment in time. HDR captures a sequence of various intervals of a given scene, and aggregates the various components of the scene through a wide range of aggregations or data fusions. A data warehouse has captured all of its data, all of its light. A wide range of aggregations, exposures within a data warehouse, would be delivered as a result of different SQL queries later aggregated to show the local objects in a global picture. Integration with the firm's or customer's factor analysis might drive the contrasts within the system.

Prospects talk to marketers every time they click. Users talk to product managers every time they click. Make sure every click in your lean experiments gets logged. Listen to what users are saying to you with every click.

A Night at the Bookstore, yes, a Bricks and Mortar Space

July 23, 2015

When I go to the bookstore or a university library, I pick out a stack of books in my areas of interest, and try to scan through them enough to justify taking them off the shelf. I was supposed to finish a particular book, but that didn’t happen. Instead, I spent some time looking through the following at a high level:

  1. EMC^2 (the author), Data Science and Big Data Analytics,
  2. Lea Verou, CSS Secrets, and
  3. Adam Morgan, A Beautiful Constraint.

In Data Science …, I came across a very clear diagram of how power (or significance) gets narrower and taller as sample size increases. Consider each sample to be a unit of time. That leads us to the idea that power arrives over time. These statistics don't depend on the data. They are about the framing of the underlying studies. The data might change the means and the standard deviations. If the means are narrowly separated, you're going to need a larger sample size to get the distributions to be narrow enough to be clearly separated, which is the point of the power statistic. Their arrivals and departures will change the logic of the various hypotheses. You could, under this paradigm, see the disruptions of Richard Foster's Innovation, a book Christensen referenced in his Innovator's Dilemma before Christensen took an inside-out view of disruption, a view of scientist/engineer-free innovation, as the arrivals and departures of the steeper slopes at the price-performance curve intersections.
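You can watch power arrive over time with a few lines of Python. This is the standard approximation for a one-sided, one-sample z-test, not anything from the book, and the function name is mine:

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_one_sample(delta, sigma, n, z_crit=1.645):
    """Approximate power of a one-sided one-sample z-test at alpha = 0.05.

    delta is the true separation between the null and alternative means.
    Power grows with n because the sampling distributions narrow.
    """
    return normal_cdf(delta * math.sqrt(n) / sigma - z_crit)

# Narrowly separated means need a larger sample to reach the same power:
low_n = power_one_sample(delta=0.2, sigma=1.0, n=25)    # roughly 0.26
high_n = power_one_sample(delta=0.2, sigma=1.0, n=200)  # roughly 0.88
```

Treat each sample as a unit of time and the second number is the power that has arrived by n = 200.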

As an aside, this week in a twitter-linked blog post by a never to be named product manager, I came across the weakest definition of our "all the rage" disruptive innovation, as being akin to a classroom disruption. So far has our vocabulary fallen. No. No. But, it is a buzzword after all. Louder with the buzz please. "I can't hear you."

There was also a graph of Centroids (Clusters) that turn out to look like a factor analysis in the sense of steep and long to ever flatter and shorter spans.

There was also a discussion of trees. A branching node in the middle of the tree was called an internal node. I typically divide a tree into its branch nodes and its leaf nodes. I didn't read it closely, so the distinction is lost on me.

This book is not an easy elementary statistics book. I will buy it and take a slow read through it.

In CSS Secrets, there were a lot of things new to me. I did some CSS back in the day, so sprinting through this was interesting. Yes, you can do that now. What? Align text on any path, using embedded SVG. The real shocker was tied to Bezier curves and animation. Various cubic-Bezier curves showed how to "Ease In"; "Ease In and Out," which looks like the S-curve of price-performance fame; "Ease Out"; and the familiar "Linear." The names of the curves could be framed as talking about business results. There were more curves, but there are only a limited number of cubic-Bezier curves. Higher-order curves were not discussed. A cubic-Bezier curve has two end points and two control points. In the animation sense, the curve feeds values to the animated object. The cubic-Bezier curve is not capable of driving, by itself, full-fledged character animation, but it's a beginning. We, the computer industry, are easing out of Moore's law as we speak.
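For the curious, here is a Python sketch of how an easing function in the style of the CSS cubic-bezier() works: endpoints pinned at (0,0) and (1,1), two free control points, and a bisection solve because the animation hands us x and wants y (the code is mine, not from the book).

```python
def cubic_bezier(p1x, p1y, p2x, p2y):
    """Return an easing function modeled on CSS cubic-bezier(p1x, p1y, p2x, p2y)."""

    def coord(t, a, b):
        # Cubic Bezier with endpoints 0 and 1 reduces to this polynomial in t.
        u = 1.0 - t
        return 3 * u * u * t * a + 3 * u * t * t * b + t ** 3

    def ease(x):
        # Solve x(t) = x for t by bisection, then return y(t).
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if coord(mid, p1x, p2x) < x:
                lo = mid
            else:
                hi = mid
        return coord((lo + hi) / 2.0, p1y, p2y)

    return ease

# The S-curve from the book's "Ease In and Out" figure:
ease_in_out = cubic_bezier(0.42, 0.0, 0.58, 1.0)
```

With those symmetric control points, ease_in_out(0.5) lands back at 0.5; feed it a clock that runs 0 to 1 and it paces the animated object slow-fast-slow, the familiar S-curve.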

In A Beautiful Constraint, we are looking at a biz book, in the self-help sense. It describes the mindset, method, and motivation for overcoming constraints on one's performance. We start out as victims. We have to overcome path dependence. We do that with propelling questions and what the author calls Can-If questions. With a Can-If question we are asking about the "How," sort of the developer's how, rather than the requirements elicitor's what. Breaking the path dependency has us asking ourselves or our team about believing it's possible, knowing where to start, and knowing how much we want to do it.

An interesting statement was that Moore's law is actually a path dependence. Intel's people didn't let the law break. They always found a way to preserve the "law." But, Moore's law was really a sigmoid curve. It flattens at the top. Breaking the constraint there requires much more investment and delivers almost no return, so Intel's people are easing out of it. They, like Microsoft, will have to find another discontinuous innovation to ride. The cloud is not such a thing. In fact, the cloud is old and there won't be a near monopolist in that category. It's not the next discontinuous innovation. It is really the disappearance, the phobic and non-adopter phases–the phases at the convergence at the end of the category. The device space is in the laggard phase, yes laggard, but it is still 10x bigger than pre-merger late mainstreet. The normal of Moore's technology adoption lifecycle is really a sum of a bunch of normals, which leaves us unable to see the reality of the category that the discontinuous innovation gave rise to. The end is near.

Anyway, that was tonight’s reading/browsing/carousing. Enjoy.

