Life between the buildings, the pilot study

I have been meaning to blog this for ages, but have been too busy playing with the data to really get anything out in writing. Lilia, Anjo and I have been working on the pilot study portion of our paper, Life between the buildings: An approach for defining a weblog community (pdf on wrong computer, will add in the morning). The journey through the pilot study has been an interesting one. We have learned many programs, defined many methodological problems, and finally begun to form a way to (hopefully) define weblog communities.

To me, an important distinction in this paper is between weblog networks and weblog communities. Networks are (somewhat) easier to define in that they can be shown mathematically. This person links to this person, who links to these people, and in turn these people link back to those people… and so on and so forth. Linking, however, does not a community make! Community definition takes more than archeology… it also takes quite a bit of ethnography. Anyone can have connections… I have around 300 links in my RSS (which functions as my blogroll). I do not, however, reside in a community of these bloggers (virtually residing, that is).

In order to define a weblog community, one must delve into the way that these bloggers maintain their network structure. You need to look at different measures, such as how many different types of connections they maintain with a person (back channel communication, face to face meetings, partnerships, etc.) and how they communicate with each other. Taken in connection with the number and type of links mined from weblog entries over time, interesting pictures begin to emerge.

Light green is me (Lilia)
Blue – KM blogs
Red – educational blogs
Orange – internet research blog
Green – A-list
Grey – all not coded
picture via Lilia’s flickr
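
For anyone who wants to play with the network versus community distinction, here is a minimal Python sketch. It is not the method from the paper, just an illustration: the blog names, the tie data, and the threshold of two tie types are all made up. The idea is that mutual links alone give you a network, while a pair only counts toward a community if the bloggers also keep up other kinds of contact.

```python
# Hypothetical data: (source blog, target blog) links mined from weblog entries.
links = [
    ("alice", "bob"), ("bob", "alice"),
    ("alice", "carol"),
    ("carol", "dave"), ("dave", "carol"),
]

# Hypothetical ethnographic data: other kinds of contact between a pair of bloggers.
other_ties = {
    frozenset(("alice", "bob")): {"back channel", "face to face"},
    frozenset(("carol", "dave")): set(),
}


def reciprocal_pairs(links):
    """Return the pairs of blogs that link to each other (the 'network' view)."""
    link_set = set(links)
    return {
        frozenset((a, b))
        for a, b in link_set
        if a != b and (b, a) in link_set
    }


def community_ties(links, other_ties, min_tie_types=2):
    """Keep reciprocal pairs that also share at least min_tie_types other kinds of contact.

    The threshold is an arbitrary assumption for this sketch.
    """
    return {
        pair
        for pair in reciprocal_pairs(links)
        if len(other_ties.get(pair, set())) >= min_tie_types
    }


network = reciprocal_pairs(links)
community = community_ties(links, other_ties)
```

With this toy data, both the alice/bob and carol/dave pairs show up in `network`, but only alice/bob survives in `community`, because carol and dave link to each other without any other recorded contact.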

One interesting example was based on geographical relations between the participants. Despite the fact that collaborating online removes physical barriers, many bloggers were clustered together geographically, with the most prominent group working out of the Netherlands.

Another interesting example can be found in the picture above. The visualization shows that this particular network of weblogs tends to cluster around topics of interest. I believe that weblogs in all genres cluster in this way, which creates fuzzy boundaries in the network, as topical clustering is subject to change over time and with the activities of its members. It is these fuzzy boundaries that allow for the entrance and exit of core members.


A lot was learned through the various methods of defining this network, too much for a blog entry 🙂 To read more about the various stages of link mining and spidering, read our AOIR paper.