## Search Scarcity

I’m still working my way through “Chaos Theory Tamed” by Garnett P. Williams. I had to return it to the library, so it will come up again in the near future. He was discussing how to tile a trajectory to reconstruct the attractor when out of the blue the Product Strategy session I presented at Pcamp Seattle09 came back to me. I was talking about how the frequencies of feature use would look like a power law distribution, or the long tail. One of the attendees brought up that the situation was really a thick tail.

Well, the ensuing Eureka moment had me googling around to see if I understood the thick tail even after starting into “The Black Swan” twice. I’m still not finished. I recalled it being drawn like a long tail, except that it’s higher, so a vertical line connected the distribution to the x-axis. No, this vertical line is not part of the distribution. The Eureka moment centered around the question of what’s on the other side of the apparent end of the thick tail. So off we go.

Tilings

Well, this is a tiling. Imagine that you’re trying to count your cash cows out in the pasture. You take a photo. Then, you drop a grid over the photo to isolate the number of cows you have to count at any particular moment. Did I count that cow or not?

Tilings do show up in the business world. They’re called the monthly, quarterly, and annual close. Another word for them is measurement lattices.

In the chaos book, we are looking at trajectories, or more specifically the points that comprise a trajectory.

Pseudo Phase Space Trajectory

We drop the tiling over the trajectory, so we can count up the points.

A Trajectory Tiled with Numbered Tiles

It took 45 tiles to tile the trajectory. The number of tiles is significant, because some n/45, where n is the number of points in a given tile, gives you the probability of a point occurring in that tile. Some of the points are on the border between two or more tiles, so counting can be fun. My counting rule was to count these border points in each of the tiles bordered. Strangely enough, this is no big deal apparently.
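The counting rule can be sketched in a few lines of Python. This is only an illustration: the square tiles, the `tile_counts` helper, and the sample points are my own assumptions, not the book's data. Each tile's count is divided by the total of all the counts so that the probabilities sum to one, which is also an assumption about how to normalize.

```python
from collections import Counter


def _touching(value, size):
    """Tile indices along one axis that a coordinate touches.

    A coordinate sitting exactly on a grid line touches the tile on
    each side of that line. No outer grid bounds are enforced here.
    """
    quotient, remainder = divmod(value, size)
    index = int(quotient)
    return [index - 1, index] if remainder == 0 else [index]


def tile_counts(points, tile_size=1.0):
    """Count points per tile; border points count in every tile they border."""
    counts = Counter()
    for x, y in points:
        for col in _touching(x, tile_size):
            for row in _touching(y, tile_size):
                counts[(col, row)] += 1
    return counts


# Hypothetical trajectory points; (1.0, 0.5) sits on a border between two tiles.
points = [(0.5, 0.5), (1.0, 0.5), (1.5, 0.5), (1.5, 1.5)]
counts = tile_counts(points)

# Assumed normalization: divide by the total of the tile counts.
total = sum(counts.values())
probs = {tile: n / total for tile, n in counts.items()}
```

Because the border point is counted twice, four points produce five counts, which is why the total of the counts, rather than the number of points, is used as the divisor in this sketch.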

Tiled Counts and Probabilities

I color coded each tile. The orange tiles are traversed by the trajectory. The yellow and blue ones are not. The yellow tiles comprise a single contiguous area. The blue tiles comprise another separate contiguous area.

Color Coded Tiling

Here’s a closer look. The points are ordered in a time series. The first point is over in tile 10. Two increments later we are in tile 11. Imagine we are in a spaceship, or making a motion control video. Many of the attractor graphs look like caves, so we’d be spelunking. You, maybe, but no, not me. I’m not really headed towards chaos here. The tile numbers stand in as aliases for the points moving forward.

Tiled Topological Object with Numbers and Colors

The attractor here is a torus or a doughnut. The upper corners wrap around and touch the lower corners. The arrows represent a continuity. I may be far ahead of myself here. We won’t go there.

So far we have a path traversed over time, and each point on that path has a probability.

Markov Chain

Here I color coded the tiles. The yellow ones represent the walls of the cave. We can’t go there. The blue and pink tiles define the continuous pathways until the cave forks. The blue pathways are definitely traveled. The pink tiles are not on the main pathway. The arrows show you the choice at a given fork.

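The last figure is really a map of a Markov process. Markov processes have finite memory, unlike, say, processes built from the normal or Gaussian distribution: the next step depends only on where you are now, not on the whole path that got you there. A Markov chain is a Markov process that moves among a discrete set of states, which here are the tiles.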
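If the tile numbers are treated as states, the arrows at each fork can be estimated from the order in which the tiles are visited. Here is a minimal sketch, assuming a made-up visit sequence (only the start in tile 10 and the move to tile 11 two increments later come from the figure) and a hypothetical helper named `transition_probabilities`.

```python
from collections import Counter, defaultdict


def transition_probabilities(tile_sequence):
    """Estimate P(next tile | current tile) from a time-ordered list of tiles."""
    pair_counts = defaultdict(Counter)
    # Pair each visit with the visit that follows it.
    for here, there in zip(tile_sequence, tile_sequence[1:]):
        pair_counts[here][there] += 1
    return {
        here: {there: n / sum(nexts.values()) for there, n in nexts.items()}
        for here, nexts in pair_counts.items()
    }


# Hypothetical visit order, invented for illustration.
visits = [10, 10, 11, 12, 11, 12, 13]
chain = transition_probabilities(visits)
# chain[12] holds the choice at the fork out of tile 12.
```

Each row of the result only looks at the current tile, which is the finite memory the Markov process is known for.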

OK, so what does that have to do with me, a product manager? It turns out that the technology adoption lifecycle is a Markov process. Each phase of the lifecycle is significantly different from the one on either side of it. That Moore drew it as a normal distribution means that it hides time. He even went so far as to say it wasn’t a clock. Sure. Like Champy saying process re-engineering wasn’t object oriented.

I’ve drawn the technology adoption lifecycle as a Markov process.

TALC as Markov Chain

Moore’s early adopter is in a vertical. Not shown are the chasm, the bowling alley, and the tornado, among many adoption structures. The diagram shows a layered architecture. The technical enthusiasts are not a population that only shows up at the pre-market phase. They underlie the entire lifecycle.

I know the folks that think that doing something on the web makes them a technologist won’t like this figure. But, it’s the way I see it. Consumer web is consumer. The web is just a means of selling something to that consumer. And, you are in the business of the stuff you are selling. Do you have an exec that came from that industry? Do you have an exec that came from each of the industries you monetize around? You should.

It turns out that Markov processes are based on Poisson distributions. The main point of my presentation in Seattle was to expose people to something called Poisson games, or games of unknown population. Each of the phases in the technology adoption lifecycle gets its own Poisson game. See my slides for a proposed session at the Orange County Product Camp 10, “So you don’t have a market? Great!” If you’ve been a regular reader of this blog, you may have seen these slides before.

A single Poisson game could represent a single functional culture, particularly in the bowling alley.

All the probabilities in our tiling taken together comprise a probability distribution, a surface. The long tail is a distribution. So are the Poisson distribution, normal (Gaussian) distribution, and the thick tail.

The Convergences of Probability Distributions

Notice that these distributions converge at different rates. The Poisson distribution converges quickly, the normal a little slower, the long tail quite a ways further out than the normal, and the thick tail about ten or more multiples of the long tail.
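To get a feel for how differently these tails thin out, here is a rough sketch that asks how much probability is still left beyond a point far out on the x-axis. The parameters are invented for illustration, and modeling the long tail and the thick tail as power laws with exponents 2 and 1 is my assumption, not a definition. The sketch only checks the broad ordering: the Poisson and normal tails are all but gone where the power laws still carry visible weight.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for a Poisson variable with mean lam."""
    term = math.exp(-lam)  # P(X = 0)
    below = 0.0
    for i in range(k):
        below += term
        term *= lam / (i + 1)  # step from P(X = i) to P(X = i + 1)
    return max(0.0, 1.0 - below)


def normal_tail(x, mean, sd):
    """P(X >= x) for a normal variable."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


def power_law_tail(x, x_min, alpha):
    """P(X >= x) for a Pareto (power law) variable, valid for x >= x_min."""
    return (x_min / x) ** alpha


far_out = 20
tails = {
    "poisson": poisson_tail(far_out, lam=4),
    "normal": normal_tail(far_out, mean=4, sd=2),
    "long tail": power_law_tail(far_out, x_min=1, alpha=2),   # assumed exponent
    "thick tail": power_law_tail(far_out, x_min=1, alpha=1),  # assumed exponent
}
```

With these made-up parameters the thick tail still holds about 5 percent of its probability beyond 20, while the Poisson and normal tails hold far less than a millionth.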

So why would you use one distribution as opposed to another? We use the normal, because it is built from familiar statistics–the mean and the standard deviation. We use the normal habitually. The other distributions are defined by parameters other than those that define the normal distribution.

When laying out the long tail, Chris Anderson saw lots of low volume markets that did not exist prior to the internet, more specifically the search surplus provided by web search engines. I saw the Beatles store as a center of the artifactual culture of the Beatles subculture, not just a market, but an anchor of meaning. I also saw the power law distribution, or long tail, as an organizer of clicks on a software application’s interface. An application is a network of features at one layer, a network of concepts at another, a network of tasks (modules of work done by users) at yet another. Each of these things, a feature, a concept, a task, would be just a point in space linked via a Markov chain. Yes, a feature has a probability. This is not news to SEO people. But, these probabilities are not taken into account by product managers, particularly when they think that adding features is key.

Attention is limited. Use is habitual. The probabilities would thin if every new feature were actually used.

Even those pedagogical pathways we discussed back in Visualizing Functional Culture are networks ultimately comprising Markov chains.

To back up this UI as long tail idea even more, it turns out that factor analysis builds a discrete version of the long tail. In factor analysis, three factors generally cover 85 percent of the variance. That implies that your application has three features that consume 85 percent of your users’ attention. You can squeeze out the last 14.95 percent of the variance before you hit the noise that will block further exploration. Going past that would be like going to infinity. A test budget limits tests. A time budget limits factor analysis similarly.
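The same question can be asked of feature use directly: how many of the top features does it take to cover 85 percent of use? A small sketch, assuming invented click counts and a hypothetical helper named `features_needed`:

```python
def features_needed(use_counts, share=0.85):
    """Smallest number of top-ranked features whose use reaches `share` of all use."""
    ranked = sorted(use_counts, reverse=True)
    total = sum(ranked)
    running = 0
    for rank, count in enumerate(ranked, start=1):
        running += count
        if running >= share * total:
            return rank
    return len(ranked)


# Hypothetical clicks per feature, invented for illustration.
clicks = [500, 250, 120, 60, 40, 20, 10]
top = features_needed(clicks)
```

With these made-up numbers the top three features reach 870 of 1,000 clicks, so three features cover the 85 percent. Real usage logs would tell you whether your application looks like this.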

So what of the thick tail? It’s a long tail that is further away from the x and y axes that it eventually converges with. That means that there is even more space under the curve for more markets–not just markets for goods, but ideas, messages, use, or financial disasters. That vertical line represents the end of the known or imagined world–the end of search itself. But, I’ve gotten ahead of the projector. Slide please!

Search Spectrum

Here we are looking at search.

We go to the mall. We go to the bookstore. At the bookstore we look at their front list, the best sellers and the new books–the hits, or we might just look at the shelves deeper into the store–the back list, the thing that makes one bookstore chain different from the others–the long tail. We are searching. And, the merchandizers, the marketers, the sellers have organized the goods, so we can find them. All this existed before the internet and information architecture came along. This search is indicated by the dark green below the timeline of the graph at what I called search enablers.

Channels and brands organize search.

On the far left side we have the bibliographic maturity lifecycle. Consider it to be a miniature technology adoption lifecycle. It starts out with the creator. The creator finds some apostles. The creator and the apostles exchange research in an invisible college. Ultimately, the members of the invisible college are trying to create a conceptualization that a peer edited journal would accept, and in doing so publish a work embodying the idea, so it escapes the ultimate niche of its speciation, the individual, and spreads into the larger world where it will thrive or go extinct. When an idea is first published by a peer edited journal, the idea is said to have achieved bibliographic maturity. Now, professional reference librarians (dark green again) can find it, which enables the idea to spread and become adopted by the larger academic population. Eventually, one of those academics will write a book, a searchable entity (lighter green). Yes, search, but not search abundance. The breakout into the general/commercial population will take years and years. Or, maybe a blog would hurry things along.

The hit portion of the long tail is the commercial portion of the search spectrum that has search, but not search abundance. It does have search abundance to the degree that it is aliased online, and that alias has been consumed by a search engine’s spider. It’s only after the Pareto split at the inflection point of the power law’s curve that the internet search engines kick in and provide search surplus. Yes, internet search is not perfect. Stick to the PR line buddy.

Before the invisible college, the idea, if it had been conceived, lay hidden in search scarcity. The long tail lies in search surplus, but beyond its point of convergence, the thick tail, the unrepentant, not gonna search me, unknown, has its way with its search scarcity. Search is just the filling sandwiched between two pieces of white bread search scarcity.

Well, there might be more to this sandwich, because many things still can’t be searched. Search expands. Search scarcity stands hardly bothered. Search scarcity is a bully.

Notice the line labeled Now. Right next to it are the past and the future. The long tail, the thick tail, the normal, and the Poisson sit on a time line. We’re back to talking about processes again. Throw a tiling on it and let’s get moving. Funny, Slavic time has no notion of the arrow of time. Instead, time–this moment–is a container. That container has been here forever and will be here after we leave it. We step into a moment of time like we step into an LDAP or DOM container. That moment in time is going nowhere. “Beam us up Scotty.” If search scarcity weren’t infinite, the constant seeding of the near term spaces of search scarcity might bother the bully.

Yes, the science fiction writer’s content seeds the future. Eureka moments seed the future. Our dreams seed the future. Marketers do their share of plowing and sowing of the future. Is it searchable? It will be.

The Seeds of the Future

As marketers, we seed a horizon, a planned expanse. When some of those seeds sprout into searchable content, we generate a point in the search space and add it to the regression cloud, expanding the regression line that seems to inform the boundary between search surplus and search scarcity. The limits of regression are with us always. When the priestly computation personnel of the ancients did something like subtract down to zero, a non-existent concept at the time, they fell through the cracks and died. Likewise, regression spanks us when we step beyond the regressed data cloud in any direction.

Plant your seeds in the fields of search scarcity. Put it on your roadmap. Farmers use tractors with GPS units. What do your fields look like?

And, before I shut the machine down and head out, there is the idea that our ordered convergences of statistical distributions define just how much space we might end up searching. I’m tying this back to the triangle model as an abstraction of a realization here in terms of divergence and convergence, searching and deciding.

Realization: Search Divergence Followed By Convergence

So there you have it. Sure, ideas are a dime a dozen, but that might just be paying too much. They’re like lettuce. If you buy it today, eat it today. There’s an infinity of ideas. We might not be ready for them, but search scarcity assures us that it will be a while before an idea creeps up on us and astonishes us with a Eureka moment! That shouldn’t happen, but hey, we are not web search engines. We’ve got fields to plow.

Oh, back to Seattle. The fact that the distribution might be a thick tail as opposed to a long tail does not disturb the motivations. Consider that software applications are used beyond their designed intent all the time. In Architecture, some architects have written about the need to escape the lifelessness of contemporary architecture by escaping the program (use/requirements) put forth by the clients/eventual owners/managers. They want to build for the emergent use. We provide macro facilities, so users can explore. In doing so, we build in the thick tail, because that exploration that begins with search is part of the product. So, yes, I can see it, the thick tail. Thanks for pointing this out to me long before all the pieces fell into place, so I could understand it.

Let’s have a conversation–cluetrain and all that. That comment text box below is sitting there at the boundary between the long tail and the thick tail. It is search scarcity up close–the transporter. Comment, please. I’ll learn a lot from you. Thanks.
