Now I Have a Blog Too

Christopher Finke is a software engineer at Mahalo. He is available for birthday parties and bar mitzvahs.

Tracking the top Diggers

May 23rd, 2008

For the past 15 months, I've been maintaining a list of the top 100 (and top 1000) Digg users. As Digg has become less and less relevant to my interests (and the interests of the greater tech community), I've decided to stop updating these lists. However, I have chosen to provide the tools I have used in case someone else has an interest in tracking Digg statistics.

Caveat: This whole thing was thrown together without any regard for coding or design conventions. Also, I can't guarantee that you won't be banned from Digg for using any of this information - I was, a few times. Proceed with caution.

First, the data structures. The Top Diggers lists are supported by two database tables: `digg` and `digg_users`. `digg` holds the set of front-page stories and their submitters, while `digg_users` stores aggregated information about each user in the system.

CREATE TABLE `digg` (
  `user` varchar(128) NOT NULL default '',
  `submission` varchar(255) NOT NULL default '',
  `date` datetime NOT NULL default '0000-00-00 00:00:00',
  PRIMARY KEY  (`submission`),
  KEY `user` (`user`)
);

CREATE TABLE `digg_users` (
  `user` varchar(128) NOT NULL default '',
  `frontpage` int(11) NOT NULL default '0',
  `dugg` int(11) NOT NULL default '0',
  `submitted` int(11) NOT NULL default '0',
  `profileViews` int(11) NOT NULL default '0',
  `frontpagestatic` int(11) NOT NULL default '0',
  `frontpagetotal` int(11) NOT NULL default '0',
  `submittedstatic` int(11) NOT NULL default '0',
  `image` varchar(128) NOT NULL default '',
  UNIQUE KEY `username` (`user`)
);

Next, some data to get you started. Here is a SQL dump of the two tables described above, including information on about 13000 Digg users (for `digg_users`), as well as the last 4 stories on the Digg homepage at the time of the last update (for `digg`). Import this data into your database in preparation for the next step.

The next step: data retrieval. The main work of updating the top 100 list is done by a Python script:

import digg

# Grab the last X pages of popular stories
digg.update_news(100)

# Update the top X profiles
digg.update_profiles(110)

Of course, this code makes no sense without the digg.* methods, downloadable here. This script also requires the excellent BeautifulSoup Python HTML parser. You will have to modify digg.py to change the database connection parameters.

For those who don't care to read through the code, it achieves two main objectives: Find out who submitted any frontpage stories since the last update (it stops when it hits a story already in the `digg` table), and using that information, determine the new top 100 users and update their profile information.
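If it helps to see the "stop at the first known story" idea in isolation, here is a minimal sketch of it. It is not the code from digg.py: the function names (`make_db`, `record_new_stories`, `top_submitters`) are made up for illustration, it uses an in-memory SQLite table instead of MySQL, it assumes the scraped stories arrive newest first as (user, submission, date) tuples, and it ranks users simply by counting rows in `digg`, which may not match how the real lists were ranked.

```python
import sqlite3


def make_db():
    # In-memory stand-in for the `digg` table (SQLite here, not the MySQL schema above).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE digg ("
        "user TEXT NOT NULL, submission TEXT PRIMARY KEY, date TEXT NOT NULL)"
    )
    return conn


def record_new_stories(conn, stories):
    """Store stories (assumed newest first) until one already in `digg` turns up.

    Returns the number of rows added.
    """
    added = 0
    for user, submission, date in stories:
        known = conn.execute(
            "SELECT 1 FROM digg WHERE submission = ?", (submission,)
        ).fetchone()
        if known:
            break  # everything older was recorded by a previous run
        conn.execute(
            "INSERT INTO digg (user, submission, date) VALUES (?, ?, ?)",
            (user, submission, date),
        )
        added += 1
    conn.commit()
    return added


def top_submitters(conn, limit=100):
    """Rank users by how many stored front-page stories they submitted (an assumption)."""
    return conn.execute(
        "SELECT user, COUNT(*) AS stories FROM digg "
        "GROUP BY user ORDER BY stories DESC, user ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

Running `record_new_stories` a second time with the same scraped list adds nothing, because the very first story is already in the table.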

The last step is data presentation. The information in the database tables needs to be transformed into a readable HTML file. I'll leave this step as an exercise for the reader, but to get you started, this SQL query will get you the data you want in an easy-to-read format:

SELECT
   user `Username`,
   frontpagetotal `Frontpage Stories`,
   submitted `Stories Submitted`,
   dugg `Stories Dugg`,
   profileViews `Profile Views`
FROM digg_users
WHERE
   frontpagetotal <= submitted
ORDER BY
   `Frontpage Stories` DESC,
   `Stories Submitted` ASC,
   `Stories Dugg` DESC
LIMIT 100
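For the presentation step, one possible starting point is a small Python function that turns the query's rows into an HTML table. This is only a sketch under assumptions: `render_table` and `COLUMNS` are hypothetical names that are not part of digg.py, and the rows are assumed to be tuples in the same column order as the query above.

```python
from html import escape

# Column headings, in the same order as the SELECT list in the query.
COLUMNS = [
    "Username",
    "Frontpage Stories",
    "Stories Submitted",
    "Stories Dugg",
    "Profile Views",
]


def render_table(rows):
    """Return an HTML table string for an iterable of row tuples."""
    header = "".join(f"<th>{escape(name)}</th>" for name in COLUMNS)
    lines = ["<table>", f"  <tr>{header}</tr>"]
    for row in rows:
        # Escape every value so odd usernames cannot break the markup.
        cells = "".join(f"<td>{escape(str(value))}</td>" for value in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Feeding it the rows fetched by whatever database library you use, then writing the returned string to a file, would give a bare-bones version of the list page.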

So to sum up, if you want to manage your own "Top 100 Diggers" list, take the following steps:

1. Import the dump of digg data linked above.
2. Set up and run your scripts.
3. Create a readable version of the data.

Have fun, and beware the Digg ban-hammer.

Does Mahalo crash your Firefox?

May 22nd, 2008

We at Mahalo have had quite a few reports of Firefox 3 crashing when visiting Mahalo.com. (Here's a video of it happening.) We'd love to get this fixed, but we are unable to duplicate the problem on our own machines.

If this happens to you, here is some information that you could send us to help us out:

  1. Your operating system and Firefox versions. Example: Mac OS X 10.4.11, Firefox 3.0RC1.
  2. Whether you're logged into Mahalo when it happens.
  3. prefs.js. This file is found in your Firefox profile directory, and it contains any changes you've made to Firefox's default settings. It doesn't contain any especially private or personally identifiable information. (Where is my Firefox profile directory?)
  4. If you can get Mahalo to crash Firefox in a clean profile (one without any personal information like passwords or an extensive browsing history), then sending us a ZIP of your entire profile directory would be extremely helpful. If the crashing happens in your regular, every-day browsing profile, please don't send this to us if there's any chance that it includes information that you don't want anyone else to know, like usernames, passwords, or browsing history.

So if you'd like to help us fix this problem, send as much of this information as you can to finke@mahalo.com. Just remember: We're all in this together. We don't want Firefox to crash any more than you do.

WTB: Los Angeles Systems Engineer

May 9th, 2008

Mahalo is growing, and although Dan is some sort of wizard, he has to start getting more than 20 minutes of sleep per night. We're looking for another Los Angeles-based Systems Engineer to complement (and compliment!) him:

"You should be expert in massively scalable architectures, how MySQL and Linux interact, how MySQL and memcache interact, sharding, replication (including multiple master replication) and how to tune MySQL based on various schemas for maximum performance and availability. You are a HANDS ON implementor, a get-it-done kind of developer. The right person is a self starter with the 'general get it factor.' You work well with a team of like-minded engineers, and have a genuine desire for excellence."

See Jason's blog for instructions on applying for this Systems Engineer job in Los Angeles. Tell 'em Finke sent you.

I got yer links... right here!

April 29th, 2008

It's Grand Theft Auto IV week at Mahalo. If you play video games like I do, you'll probably want to read the Grand Theft Auto 4 Cheats before you check out the Grand Theft Auto 4 Walkthrough.

If the Wii and Mario Kart are more your style, check out my brother's blog. He updates regularly with new insights into the Wii and Wii games.

If you like to read about food, or if eating is one of your hobbies, check out Fast Food Critic. Just try not to read the nutrition information posted with each review.

And for no reason at all, here's our dog Pedro trying on his Halloween costume a few years back. You can see the Scape-o-lantern on the coffee table behind him.

Slashdotter updated to 2.0

April 26th, 2008

I've updated the Slashdotter Firefox extension to make it compatible with Slashdot's latest design changes. "Hide/Show Replies" is working again, the BSD section has been added to the "Styles" options panel, and most of the code has been rewritten because we're no longer living in a Firefox 1.0 world. Thanks to Michael Bunzel for the patches!