Narrative exploration of code

June 19th, 2010

I’ve noticed that a lot of frameworks and libraries don’t have narrative documentation. Instead, they only have a description of each module without a story to go with them.

This post will outline a possible strategy to develop narrative understanding of software libraries.

Exploratory programming is creating prototypes and experimenting with them to develop a deeper understanding of the problem domain and requirements.

Exploratory programming for understanding libraries will be based partially on exploratory testing methodologies. As such, there are several tools that can be used from the testing world:

  • Functional testing
  • Scenario testing
  • Focus/Defocus

Functional testing, in this context, means surveying each module, class or method and experimenting with it in a manner consistent with its documentation. The point is to learn what the software is doing. This can be difficult when you don’t know exactly what a class does or what you’re looking for, but it can give you a handle on the search space.

Scenario testing, in this context, means creating scenarios of use and trying to use the library to meet them. This is generally more challenging than functional testing, but it lets you learn about the software more rapidly than just reading the manual.

Focus/defocus means switching between detail and overview. Focusing on details is sometimes important, but remember to step back and look at the big picture. Switch between these modes if you get stuck: try something new, or do something unexpected.
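The functional-testing survey described above can be partly mechanized. The following is only a rough sketch: it uses the standard library's `inspect` module to list the public functions and classes of a module along with the first line of each docstring. The `survey` helper is a name made up for this example, and `json` is just a stand-in for whatever library is being explored.

```python
import inspect
import json  # stand-in for the library being explored


def survey(module):
    """Return {name: first docstring line} for public functions and classes."""
    summary = {}
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and dunder names
        if inspect.isfunction(obj) or inspect.isclass(obj):
            doc = inspect.getdoc(obj) or ""
            summary[name] = doc.splitlines()[0] if doc else "(no docstring)"
    return summary


catalog = survey(json)
for name, first_line in sorted(catalog.items()):
    print(f"{name}: {first_line}")
```

A listing like this only gives the search space; the experimenting still has to be done by hand, one entry at a time.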

That’s it for now.

Twisted

June 18th, 2010

I’m going to start with Bruce Eckel’s description, then move on to the Twisted online documentation.

As I progress, I’ll ask the mailing list questions. This should let me build a narrative tutorial out of my experiences that could then be integrated back into the Twisted documentation.

Antipathy Redux

June 16th, 2010

I’m getting an itch to work on Antipathy.

This time, I’m using a slightly different strategy:

  • PyMunk for collision detection
  • cocos2d or grease for the game framework (pyglet for graphics)
  • Twisted (of course) for networking/eventing
  • Humane interface (with zooming) as much as possible!

Besides that, I’m trying to simplify the game design and development. I just want some graphics first, then networking, and I’ll go from there.

Until then, though, I’ll be running through tutorials over and over and over again.

Sequitur-Based Reinforcement Learning Chatbot

June 16th, 2010

A reinforcement-learning-based chat bot can be implemented using the context-free grammars created by the Sequitur algorithm.

As phrases are learned by the chat bot, the Sequitur algorithm creates grammar subtrees based on the existing corpus. These subtrees are cataloged and can be used as a state space. Simplistically, a state consists of a bit vector equal in length to the number of subtrees, with a bit set when that specific subtree was seen in the input.
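A minimal sketch of that state encoding follows. It does not implement Sequitur itself: the `catalog` here is a hand-written stand-in for the token sequences that the learned grammar rules would expand to, and `contains` and `state_vector` are names invented for this example.

```python
def contains(tokens, phrase):
    """True if `phrase` (a tuple of tokens) occurs contiguously in `tokens`."""
    n = len(phrase)
    return any(tuple(tokens[i:i + n]) == phrase
               for i in range(len(tokens) - n + 1))


def state_vector(utterance, catalog):
    """One bit per cataloged subtree: 1 if it was seen in the input."""
    tokens = utterance.lower().split()
    return [1 if contains(tokens, phrase) else 0 for phrase in catalog]


# Hypothetical catalog; a real one would come from the Sequitur grammar.
catalog = [("how", "are"), ("are", "you"), ("good", "morning")]

state = state_vector("Hello how are you", catalog)
print(state)
```

If new subtrees are only ever appended to the end of the catalog, the meaning of the earlier bit positions stays stable, which may help with keeping states consistent as the state space grows.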

Based on the state, various actions can be taken. New phrases could potentially be generated using the same grammar model, or other actions could be taken.

Reinforcement “simulations” or dreams could be executed between each user input.

Primary difficulties: keeping states consistent as the state space grows, and generating new phrases based on the captured grammar.

Architecture View of a Software Program

May 27th, 2010

Michael Feathers’s “Working Effectively with Legacy Code” describes architecture as a simplification of how software actually works. Details and inconsistencies are glossed over, and the gist of the software is the focus.

As such, it highlights the most important pieces of the software.

What if we could visualize the most important pieces? A simplified UML structure diagram or sequence diagram would be sufficient, and that’s probably what a lot of projects do. But what if we could go further?

The architecture description is hierarchical. You start by describing the central piece and then add the details and exceptions later. Could we make a semantic zooming view of software architecture, so that the most important parts appear at the highest zoom level and more details become apparent as you zoom in, finally showing the actual code?

The question, of course, would be how to do this without adding too much overhead to describing the software itself.

In Search of Serendipity

May 22nd, 2010

One of my goals over this summer, and in life in general, is to increase my exposure to serendipity. To do this, I’m going to do a couple of things:

  • Keep an open mind and find joy and excitement in novelty
  • Try to meet and talk to strangers more

The second part is the most difficult for me. I can easily talk to strangers if I don’t know anyone at a party, but I have difficulty talking to strangers in the grocery store, or wherever.

The conventional wisdom is basically “Just do it”, which may work. But I’ll have to keep track of it.

I can easily go up and talk to strangers if I have some extrinsic motivation, such as a dare or proving a point to someone. The issue is doing it without the extrinsic motivation.

The other part is where exactly I should go to meet interesting people. I can keep my eyes open everywhere, but I’d like to maximize the opportunities I have as well.

My goal is going to be simply making the approach, even if it annoys or confuses someone, rather than getting a positive response. I’m quite sure I’m able to do that; I just need to follow through.

Black Swans and Bayesian Belief (part 2)

May 21st, 2010

To be robust against black swans, you must select subjective priors that assign a relatively high probability to rare events, e.g. Mandelbrotian rather than Gaussian. Otherwise, your inductive bias will lead you to believe that rare events are effectively impossible.

This does not protect against other types of model error, “true black swans” that exist because your assumptions are wrong, but it reduces the probability of Gray Swans becoming Black Swans.
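The effect of the prior’s tail can be illustrated with a few numbers. In this sketch, “Mandelbrotian” is taken to mean a power-law tail, modeled here with a Pareto distribution. The exponent `alpha=2.0` and scale `x_min=1.0` are arbitrary assumptions, and the two distributions are not scale-matched, so only the rough order of magnitude is meaningful.

```python
import math


def gaussian_tail(k):
    """P(X > k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def pareto_tail(k, alpha=2.0, x_min=1.0):
    """P(X > k) for a Pareto variable; alpha and x_min are illustrative."""
    return 1.0 if k < x_min else (x_min / k) ** alpha


for k in (2, 5, 10):
    print(f"k={k:>2}  gaussian={gaussian_tail(k):.3e}  pareto={pareto_tail(k):.3e}")
```

Under the Gaussian the tail probability collapses very quickly as `k` grows, while under the power law it shrinks only polynomially, so extreme events keep a non-negligible weight.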

Pascal’s Wager, Optimal Bayesian Decision and Serendipity

May 20th, 2010

Bayesian decision theory basically states that, for each possible action, you sum over all possible theories of events, multiplying the probability of each event by its payoff, and then pick the action with the highest total.
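As a small worked sketch of that calculation: the theories, probabilities, and payoffs below are invented for illustration, and `expected_payoff` and `best_action` are hypothetical helper names.

```python
def expected_payoff(action, theories):
    """Sum of probability * payoff over every theory, for one action."""
    return sum(prob * payoffs[action] for prob, payoffs in theories)


def best_action(actions, theories):
    """The action with the highest expected payoff."""
    return max(actions, key=lambda a: expected_payoff(a, theories))


# Each theory: (probability it is true, payoff of each action if it is).
theories = [
    (0.7, {"umbrella": -1, "no umbrella": 0}),    # it stays dry
    (0.3, {"umbrella": -1, "no umbrella": -10}),  # it rains
]
actions = ["umbrella", "no umbrella"]

for action in actions:
    print(action, expected_payoff(action, theories))
print("best:", best_action(actions, theories))
```

The calculation needs both a probability and a payoff for every theory, which is exactly what is missing in the harder cases.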

The question is, what do you do when you don’t know what the probability of an event is? What do you do if you don’t know what the payoff is?

The answer is in Pascal’s wager: assuming you can determine whether the payoff is bounded or unbounded (either negative or positive), you base your decision solely on the payoff and not on the probability.

This is of course a heuristic and can be abused by silly people, but it’s still a good general description.

If you are looking for high-payoff events whose timing you can’t predict, you have to maximize your serendipity. That’s what I’m trying to focus on more and more in my life: maximizing serendipity.

Cython trees and graphs

May 4th, 2010

I wonder how fast Cython extension types are when it comes to graph-like algorithms, e.g. constructing and traversing trees and directed graphs.

I’ve seen a lot of examples of how much faster Cython is for stuff that involves arrays and loops, but I really need to experiment and see if I can increase the speed of the trees used for decision trees, for instance.

It’d also be interesting to compare the speed against a purely static language and against Python dictionaries.
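A pure-Python baseline would be the natural starting point for that experiment. The sketch below only measures that baseline: the `Node` class, the tree depth, and the repeat count are arbitrary choices for illustration. A Cython version (for example, `Node` rewritten as a `cdef class`) would have to be written and timed separately against the same two operations.

```python
import timeit


class Node:
    """Tree node; __slots__ avoids a per-instance attribute dict."""
    __slots__ = ("value", "left", "right")

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def build_tree(depth):
    """Build a complete binary tree; depth 0 is a single leaf."""
    if depth == 0:
        return Node(0)
    return Node(depth, build_tree(depth - 1), build_tree(depth - 1))


def count_nodes(root):
    """Iterative depth-first traversal that counts the nodes."""
    stack, count = [root], 0
    while stack:
        node = stack.pop()
        count += 1
        if node.left is not None:
            stack.append(node.left)
        if node.right is not None:
            stack.append(node.right)
    return count


DEPTH = 12  # arbitrary size for this sketch
build_time = timeit.timeit(lambda: build_tree(DEPTH), number=10)
tree = build_tree(DEPTH)
walk_time = timeit.timeit(lambda: count_nodes(tree), number=10)
print(f"build: {build_time:.4f}s  traverse: {walk_time:.4f}s")
```

Timing construction and traversal separately seems worthwhile, since allocation and pointer chasing may benefit from static typing to different degrees.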

Exploratory Testing is a Heuristic Approximation of Cleanroom Statistical Testing

March 13th, 2010

I figured this out talking to my friend Daniel earlier this week.

In cleanroom software development, a usage model is created of the expected use of the system. This includes states of usage, system stimuli and their probabilities. It’s basically a Markov chain of usage from the user’s perspective.

This model is used to generate test cases, which consist of lists of stimulus inputs. To be complete, the stimuli must include everything that can happen to the software: timing differences, OS interleaving, hardware failures, etc. Often, specific classes of test cases are generated to test features of the product that are unlikely to be encountered in practice but that must provide high levels of robustness.

These test cases are a very small subset of the infinite possible number of test cases from any usage model. Because the model is formalized, specific measurements of reliability can be generated from testing.
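To make the usage model concrete, here is a toy sketch of generating a test case by walking a Markov chain. The states, stimuli, and probabilities are invented for illustration and are not taken from any real cleanroom project; `generate_test_case` is a hypothetical helper.

```python
import random

# state -> list of (stimulus, next state, probability) arcs; all values invented
USAGE_MODEL = {
    "start": [("login", "logged_in", 1.0)],
    "logged_in": [
        ("search", "logged_in", 0.6),
        ("view_item", "viewing", 0.3),
        ("logout", "end", 0.1),
    ],
    "viewing": [("back", "logged_in", 0.7), ("logout", "end", 0.3)],
}


def generate_test_case(model, rng, start="start", end="end", max_steps=50):
    """Random walk through the usage model; returns the list of stimuli."""
    state, stimuli = start, []
    while state != end and len(stimuli) < max_steps:
        arcs = model[state]
        stimulus, state = rng.choices(
            [(s, nxt) for s, nxt, _ in arcs],
            weights=[p for _, _, p in arcs],
        )[0]
        stimuli.append(stimulus)
    return stimuli


rng = random.Random(0)  # seeded so a run can be reproduced
for _ in range(3):
    print(generate_test_case(USAGE_MODEL, rng))
```

Each run of the walk is one sampled test case. Raising the weight of a rarely used arc is one way to force coverage of the unlikely-but-critical features mentioned above.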

Exploratory testing, specifically as professed by the context-driven school of testing, acknowledges the same limitation: there are an infinite number of possible test cases and only a finite amount of time to execute them.

Instead of front-loading the model design process to generate test cases, an exploratory tester actively develops a model of usage as she tests, taking into account risk of potential failure, etc.

This reminds me of the contrast between extreme programming and the design part of cleanroom. Both use an incremental approach to software release, peer reviews and strong specifications (in the form of unit tests in XP, box models in cleanroom). Extreme programming does this actively as the software is made and running, while cleanroom creates it “on paper” before code is generated. In the same way, exploratory testing generates the model actively against the running system, while cleanroom creates the model “on paper” beforehand.

Exploratory testing is a heuristic approach to covering a sample of the most important test cases in order to report quality-related information, while cleanroom is a formalized approach to doing the same thing.