Oct 29, 2006

Some interesting tweaks for Firefox

For when I get around to installing it:

(This is from the Slashdot article Firefox 2 Downloads Top 2 million in 24 hours.)


(Score:5, Informative)
by teslatug (543527) on Saturday October 28, @10:57AM (#16622640)
Here are some of the settings that I've gathered so far to get Firefox 2.0 to my liking:

In about:config
* browser.tabs.closeButtons to 3 for one close tab button
* browser.tabs.selectOwnerOnClose to false for successive reading and closing
* browser.tabs.tabminwidth to 20 for displaying tab scrolling in extreme cases only
* browser.urlbar.hideGoButton to true, no use for the Go button
* dom.disable_window to true, fix various window annoyances
* network.prefetch-next to false for not wasting my bandwidth

In userChrome.css for disabling the List all tabs which annoys me when using the close button:
/* Disable Container box for "List all Tabs" Button */
.tabs-alltabs-stack {
    display: none !important;
}

Feel free to add your own to the thread.

Oct 27, 2006

LaTeX document fidelity

Here's something I've been meaning to talk about. Came across this comment in an MSDN blog yesterday. The blogger, Rick Schaut, is a software design engineer for Microsoft Word on the Mac. He makes the following claim:
To relate this back to the equation editing problem, the problem with any TeX-based way is that you won't get identical layout from one platform to another. TeX (whichever flavor you're talking about) is designed to maximize the quality of the output for each given platform, but that sacrifices some aspects of layout compatibility.

Suppose, for example, you have two people working on a paper. One uses a Windows computer, the other uses a Macintosh. Should that paper include a rather lengthy equation, there's a good chance that the equation might fit on one line when the document is opened on the Mac yet not fit on one line when the document is opened on the Windows computer.
Here, Mr Schaut, you are happily wrong :-) TeX and LaTeX cleverly outsource the work of device- and operating system-independent document rendering; they merely specify the document's internal structure. Once a TeX processing system has turned the document into a PostScript or PDF file, you can merrily distribute it anywhere you want with full assurance that the rendering will be inviolate, down to the last full stop on the last line of each paragraph.

And of course, unless I'm missing something here, any plain LaTeX source file will compile to the exact same PDF file whether it's on Windows or the Mac; that's a given, because the processing that TeX applies to the source file is the same regardless of operating system.

Of course, there is the entirely separate issue of distributing LaTeX source files or DVI files of your documents; but given all the potential incompatibilities you can face there, why even bother to distribute anything other than PDFs? LaTeX + PDF is the only sane way to go.

Oct 25, 2006

Temptation, thy name is Internet Explorer 7

OK, I know I said I'd upgrade to IE7 later, but I seriously couldn't resist the temptation -- especially since I have nothing to lose by upgrading now; IE is not my default browser and I'm not using anything that might not be ready for the upgrade. So how am I finding it?

Internet Explorer 7 feels much better than 6. Tab support is awesome, and rather faster than it was before, although still not as fast as Firefox's. And the keyboard shortcuts for switching among tabs should definitely be simpler than Ctrl-Tab and Ctrl-Shift-Tab -- they should really be Ctrl-PgDown and Ctrl-PgUp like they are in Firefox. That said, I've just discovered the Quick Tabs feature (shortcut: Ctrl-Q), which may just be the best tab-related feature I've seen yet in a browser. Press the key and IE shows a tab containing thumbnail previews of each of your tabs, and you can go to the tab you want with a single click. This is amazingly fast, and really cool.

Everything else they've done, including getting rid of the menu bar and putting the address bar right below the title bar along with the navigation buttons, I applaud because it just increases the screen viewing space in the browser. Who uses the menus in their browser all that much, anyway? And if you need it, you can just right-click on any toolbar and turn it on. Or just press the Alt key to turn it on temporarily.

IE still doesn't have an extension manager as convenient as the one in Firefox, but that's OK, I've invested a lot in Firefox, including future plans with Greasemonkey; and I wouldn't really switch back to IE even if it had awesome extensions. (Well, maybe if they had a Greasemonkey-compatible extension....)

So to conclude? IE7 feels like a hit. Definitely go for it. But as for Firefox 2? I've downloaded the installer, but I'll still wait out the couple of weeks to get it through Mozilla's updating channels once they've made sure all the best extensions are compatible with it.

Eid mubarak

Eid day yesterday was great. I think Bangalis really know how to turn in a good Eid party, if they're in the mood. Some friends got hold of a Bangladeshi biryani chef, who cooked us a big pot of awesome chicken biryani. We got together at their place and had a good lunch, and a nice break from the pressure of the upcoming exams. I know there's a risk of all these breaks from the pressure turning into a permanent break from studies -- but hopefully we'll keep things under control here.

Firefox 2 & Internet Explorer 7

Firefox 2 is coming out in a few hours, but I'm really very satisfied with 1.5. I read, though, that 1.5 will soon have a minor update that paves the way for an automatic update to version 2, in a couple of weeks. I'm taking that upgrade strategy, because I'm a little worried that my current extensions, especially the very useful SessionSaver extension, will need some time to adapt properly to FF2. So hopefully in a couple of weeks that will be the case. But I'll probably wait a wee bit more even then while checking up on their status, i.e. what people have to say about them.

Not to sound nostalgic, but it seems like just yesterday that I downloaded the all-new Internet Explorer 5 over my slow dial-up line in Sharjah and tried it out, after reading a glowing review full of lavish screenshots in the Emirates' Windows User Magazine. And it was an awesome browser, the best of the best after using Netscape's weird-looking offering. But yeah, anyway ... IE7 is out now, and I know I'm going to upgrade. But similarly to Firefox, I'll wait till Microsoft pushes IE7 out to us through its Windows Update facility -- it gives everyone some breathing room and just feels right.

I won't really be using IE7 -- not with FF2 probably installed by then -- but obviously, it will be necessary to have it because of the bugfixes and interesting new features to try out.

It strikes me now that in about a couple of weeks, I'll be flying back to Dhaka, where my laptop will be totally cut off from internet access -- our desktop PC in Dhaka is the only machine which will be connected (hopefully, anyway). So it might be more than a month before any software gets updated on the laptop. That's fine by me, I guess. I'll still be able to try out FF2 on the home desktop. Hm, might download it now and take it to Dhaka to install, to save the download hassle when I get there.

Oct 20, 2006

Bayesian multi-category classification in Gmail?

What this would do is something like what POPFile does for email clients like Thunderbird or Outlook or whatever: automatically categorise incoming emails based on keywords they contain. POPFile is trainable and it's supposed to reach a pretty high accuracy after a couple of weeks. POPFile doesn't actually put your emails into different folders in your email program; it just marks them as belonging to one category or other (say `Work', `Family', `Junk', `Stamp collecting', etc.). Then you set up your email program so that it puts these emails into different folders, or deletes them, or forwards them, or whatever.

What POPFile does

OK first of all, you're asking why would you want some software like POPFile to classify emails for you when you can just set up filters in your email program to do that based on who they're from, what the subject is, and so on? The reason is your email client's filters are static: they do not learn about new family members who are sending you email, nor about new correspondents from your workplace, nor about new junk mailers you have to deal with all the time. In fact, programs like Thunderbird already have Bayesian filtering to deal with this problem of continually-changing junk mails -- it's just that POPFile goes one step further, to try to identify your mails as belonging to arbitrary categories that you set up.

The idea is, once you've set up POPFile to recognise email from your family, from your work, and from your stamp collecting buddies, it will correctly identify these different types of emails, say, about 99.99% of the time. The rest of the time, which is presumably a piffling amount of time, you'll be telling POPFile something like `no, this isn't junk, it's just my little brother, mark it as ``Family'' '. And POPFile will continue to learn, using the Bayesian statistical analysis.

So the end result is, you can set up your email programs to put email marked `Work' in the right folder, and so on, without having to worry about updating your filters all the time.
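
To make the idea concrete, here's a rough sketch of this kind of trainable, multi-category Bayesian classification. This is not POPFile's actual code (POPFile is written in Perl and is far more refined); it's a minimal multinomial naive Bayes with add-one smoothing, and the class name, method names and sample messages are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.message_counts = Counter()          # category -> messages trained
        self.vocabulary = set()                  # every word seen in training

    @staticmethod
    def tokenize(text):
        # Crude tokenizer (an assumption): lowercase runs of letters/digits.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, category):
        """Record one message as belonging to `category`."""
        words = self.tokenize(text)
        self.word_counts[category].update(words)
        self.message_counts[category] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the category with the highest posterior log-probability."""
        if not self.message_counts:
            raise ValueError("classifier has not been trained yet")
        words = self.tokenize(text)
        total_messages = sum(self.message_counts.values())
        vocab_size = max(len(self.vocabulary), 1)
        best_category, best_score = None, -math.inf
        for category, seen in self.message_counts.items():
            # log P(category) + sum of log P(word | category)
            score = math.log(seen / total_messages)
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                # Add-one smoothing so unseen words don't zero out a category.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


classifier = NaiveBayesClassifier()
classifier.train("quarterly budget meeting moved to friday", "Work")
classifier.train("please review the project report before the meeting", "Work")
classifier.train("mum says dinner is at home on sunday", "Family")
classifier.train("your little brother passed his exams", "Family")
classifier.train("rare stamp auction catalogue now available", "Stamp collecting")

print(classifier.classify("budget report for the project"))       # -> Work
print(classifier.classify("dinner with your brother on sunday"))  # -> Family
```

Correcting a mistake is just another call to `train` with the right category, which is the `no, this isn't junk, it's my little brother' step: the word counts shift, and similar messages are more likely to land in the right category next time.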

Now here's my question: why should email program users get these benefits exclusively? Why can't we have something like this for webmail users? Specifically, Gmail users (like me, and it seems half the world nowadays)? Maybe we can. It boils down to three things: Gmail's JavaScript functions, POPFile's statistical categorisation methods, and Firefox's Greasemonkey extension.

What Greasemonkey does

Basically, Greasemonkey allows you to customise Web pages in Firefox in almost unlimited ways with a little JavaScript programming, using that page's Document Object Model and any JavaScript functions defined in it. Check out http://persistent.info/archives/2005/03/01/gmail-searches for an idea about just how powerful Greasemonkey is, and what it can do to Gmail.

I've tried out the above hack, and it actually does work, with a few hiccups. Furthermore, I've tried programming Greasemonkey scripts myself and I can tell you it's a really powerful way of customising websites which you love to make them even more useful. There are actually a ton of scripts people have written out there, and the best place to get them is userscripts.org. Check it out.

POPFile for Gmail?

OK, now we know we can extend Gmail's functionality in amazing ways with Greasemonkey hacks like the one above. Essentially, Greasemonkey is giving us the means to program a user interface for the new Bayesian classification features we want in Gmail. Greasemonkey scripts are written in JavaScript. Now I'm pretty sure the `business logic' of POPFile, which is currently written in Perl, can be ported to JavaScript without too much trouble. The end result: an interface in Gmail that tags incoming messages and quickly allows you to check for and correct mistakes, training it, and bringing the convenience of automatic Bayesian classification to Gmail. Anybody up for it?

Oct 19, 2006

The Trojan War was never this good

Read Dan Simmons' Ilium and Olympos a couple of months ago, but haven't gotten round to talking about them till now. First of all, it's true that they're actually one book published as two, probably because if they were published in one piece nobody would buy a book that fat, and sales would be half as much as they were with two books instead of one.

Second, the book is not about the gods and Trojans and Greeks of the recreated Trojan war battlefield of far-future Mars; it's really about the future of humanity and what shape it might take. Simmons draws from a lot of literary sources, primarily Shakespeare (The Tempest) but also Proust (stuff I'm not familiar with) and Wells (i.e. the Eloi and Morlocks ideas from his Time Machine).

The thing is, the story starts off with the scholic Thomas Hockenberry telling of the recreated war, and it's immediately gripping, especially to a guy like me who grew up reading his sci-fi on one hand and Greek/Norse/Egyptian mythology on the other. It's gripping for all the reasons the original mythologies are gripping -- the heroes and their stories are larger than life, etc. But the Trojan War storyline intercuts with that of the humans on Earth and the Moravecs on Jupiter, which takes the wind out of it somewhat, because you have all these new characters you didn't know before that you have to deal with, and you just want to get back to reading what Achilles did next.

Achilles by the way is the most interesting character in the story and Simmons lavishes him with detailed description, enough to satisfy any geek. Achilles the man-killer, Achilles the god-killer, Achilles the fleet-footed, Achilles this, and Achilles that. For some reason I kept imagining Brad Pitt as Achilles throughout the story, and it fit, right to the end. (But Eric Bana as Hector didn't -- Hector needs a stronger jawline, and a taller, more muscular figure).

The stories do converge, but they approach convergence from different points, and there's a lot of suspense. I won't bother with a detailed analysis of the thing here, but it's definitely enjoyable. I do want to talk about some of Simmons' ideas for the future of humanity though. Humans ten thousand years in the future are a sad, childlike lot: every need is catered to by robot servants, they don't know how to read because they don't need to, and they spend most of their time partying and pursuing other pleasures. Sounds perfect, but there's no intellectual stuff, no advanced thought. Simmons has characters in the books disparagingly refer to them as `post-literate'. Ouch.

But these Eloi do have an interesting feature: they have been genetically modified to contain a hundred cybernetic functions, like a map/locator function that projects holographic images of the person being located; body status query functions; and advanced stuff like infonet access, the infonet being a semi-conscious web of information evolved from the internet which now blankets the planet. This infonet is extremely powerful -- it contains a huge amount of data, like information about every molecule in every cell of a tree the infonet user might be looking at. It's described as being totally overwhelming. You see the information, but you don't understand most of the knowledge contained in it. Oh, and you activate these functions by visualising combinations of coloured geometric shapes in your mind's eye. At least, until you can do it without thinking.

The `old-style', Earth-human protagonists introduced have a destiny to fulfill -- to recover the ability to use these advanced functions and recover the technological knowledge lost to the human race. But that's about it. There is some stuff about recovering some ten thousand humans encoded in a tachyon beam orbiting the Earth, but that's just another problem in the myriad collection of problems and mysteries the humans are faced with.

The infonet plays a large part in the book, actually -- combined with some really wild interpretations of quantum theory and post-human technology. It's a good read, but I still think the Trojan War part of the story should have been a different story altogether -- or rather, the story of the old-style humans on Earth should have been a different story, say The Final Fax. The Trojan War parts of the books would have made a kick-ass movie -- especially Achilles' visit to the pit of Tartarus in Hades, in the presence of the original Greek gods, the Titans, imprisoned there by Zeus.

Oct 17, 2006

If only they used this instead of E-Views

(Interesting note: just found out that the Cochrane behind the famous Cochrane-Orcutt method was at Monash, http://www.buseco.monash.edu.au/depts/ebs/.)

We started doing the basics of econometrics -- things like regression and ANOVA -- in Excel last year, but moved to E-Views this semester to do more advanced stuff like time-series analysis. That's too bad, because there's a much better program we can use: R (http://www.r-project.org/). The main reason is it's free -- we can download and use it at home, so we don't have to depend on the computer labs being open and free to get our assignments done. Here's a good article that talks about why R is great: http://jackman.stanford.edu/papers/download.php?i=22.

And yes, I know R is mostly command line and teaching it at Monash would take up too much of our time, taking our focus away from the econometrics theory. But R can be customised and tailored to the Monash courses with a little effort; and it has a Tcl/Tk widget set built-in which can be used to implement graphical versions of the stuff they teach us using E-Views -- things like restricted model F tests (Wald tests), AR(1) estimation, weighted OLS estimation, things like that.

That said, I'm still learning R and it's sometimes been frustrating to try matching my results on time series data to what my textbook, Wooldridge, says I should get. Things like ARMA(p, q) estimation seem to be built in to non-obvious places like the gls function in the nlme package. But it works, for the most part. Using Excel after R -- especially R's matrix handling -- feels like going backwards now.
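
For what it's worth, the AR(1) estimation mentioned above isn't magic. Here's a rough sketch of the textbook Cochrane-Orcutt iteration for a single-regressor model, written in plain Python purely for illustration. It is not how E-Views or R's gls function does it internally; the function names and the simulated data are my own assumptions, and it drops the first observation on each pass, as the basic method does.

```python
import random


def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cochrane_orcutt(x, y, iterations=10):
    """Estimate y_t = a + b*x_t + u_t, where u_t = rho*u_{t-1} + e_t.

    Returns (a, b, rho) after a fixed number of passes (hypothetical helper).
    """
    intercept, slope = ols(x, y)  # start from ordinary OLS
    rho = 0.0
    for _ in range(iterations):
        residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
        denominator = sum(u ** 2 for u in residuals[:-1])
        if denominator == 0:
            break  # perfect fit, nothing left to model
        # Regress u_t on u_{t-1} (no intercept) to estimate rho.
        rho = sum(u0 * u1 for u0, u1 in zip(residuals, residuals[1:])) / denominator
        # Quasi-difference the data, dropping the first observation.
        x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        intercept_star, slope = ols(x_star, y_star)
        # The transformed intercept is a*(1 - rho); undo the scaling.
        intercept = intercept_star / (1 - rho)
    return intercept, slope, rho


# Simulated data with known parameters (made up for this example).
random.seed(42)
true_a, true_b, true_rho = 2.0, 3.0, 0.7
x, y, u = [], [], 0.0
for _ in range(2000):
    u = true_rho * u + random.gauss(0, 1)
    xt = random.uniform(0, 10)
    x.append(xt)
    y.append(true_a + true_b * xt + u)

a_hat, b_hat, rho_hat = cochrane_orcutt(x, y)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, rho = {rho_hat:.3f}")
```

With a couple of thousand simulated observations the estimates should land close to the true values of 2, 3 and 0.7, which is a handy sanity check to run before comparing against a textbook example.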