senecaplanet

Clay Shirky's hidden fees

I recently finished reading Here Comes Everybody by Clay Shirky. If you haven't read the book, but want a flavour of it, check out his talk at TED:
Institutions versus collaboration

For those not familiar with Shirky or the book, he's what I guess you would call a new media theorist. He's not quite a Marshall McLuhan for our time, but his stuff is in the same vein. What happens when you add new communications technology to human society? The tech is of course the Internets.

The chapter "Everyone is a media outlet" is a must read for anyone left earning a living as a journalist. It will either blow your mind or make you want to blow your brains out. It all depends on how you cope with change.

However, there was one quote from that chapter that leapt out at me as wrong:

Publishing used to require access to a printing press, and as a result the act of publishing something was limited to a tiny fraction of the population, and reaching a population outside a geographically limited area was even more restricted. Now, once a user connects to the internet, he has access to a platform that is at once global and free.

Free is one of those funny words. It has two meanings: free as in speech and free as in beer. I think that he is using it here in the latter sense. In a later chapter he refers explicitly to blogging as something without cost:

Another advantage of blogs over traditional media outlets is that no one can found a newspaper on a moment's notice, run it for two issues, and then fold it, while incurring no cost but leaving a permanent record.

This reminds me of the old joke, "We have freedom of the press, but only for those who own a press." What I mean is that if blogger.com is your press, Google owns it. Google may not charge you money for it, but bills are still being paid somewhere. And even if you do buy your own hosting, something that most Internet users don't do, you still don't own the press. The infrastructure between your computer and some other random Internet user is massive, complex and expensive. For this new press to be both free as in speech and free as in beer requires a great deal of government and industry investment.

I think that if we want the Internet to remain free as in speech, we shouldn't skate past the fact that it is not free as in beer.

But seriously, it's a great book. I had it in my Toronto Public Library queue for five months before I got it. That's almost as long as I've been waiting for the Button-down mind of Bob Newhart.

 

Adding xpcshell tests to core modules

A few more words about xpcshell testing.

The devmo tutorial explains how to write the tests. However, the information for adding tests to the tree is perhaps not totally clear because it is geared towards extension developers:

You should store your tests near the source of code you want to test. For example, imagine you want to add tests for your component in extensions/myextension/mycomponent, you will add a tests/ directory. You can store different kind of tests here (reftests etc.), and for xpcshell, you should create a unit directory and store all your javascript files inside it. Example: extensions/myextension/mycomponent/tests/unit/test_foo.js. The unit name is important because for the moment the xpcshell unit tests framework recognize only this name (especially when you want to launch a single test with check-interactive).

Then you should create a extensions/myextension/mycomponent/tests/Makefile.in which will contain XPCSHELL_TESTS = unit. Of course, in the extensions/myextension/mycomponent/Makefile.in you will add tests/ into the DIRS variable.

In practice, the work of setting up directories and makefiles may already be done. Here's some bash to list all the modules that have xpcshell unit tests and where they are located:

find ./ -name "Makefile.in" | xargs grep XPCSHELL_TESTS

If your module has one, it's just a matter of dropping your test into the correct directory. For the most part, unit tests are found in module_name/tests/unit.

For example, here's an abridged version of the xpcom tree that shows the interface I want to test and the javascript file I use to test it:

xpcom
|-- tests
|    `--unit
|        `--test_nsIProcess.js
`-- threads
     `--nsIProcess.idl

To manually run all the tests in the unit directory:

make -C obj-ff/xpcom/tests check

Or to run just one:

make SOLO_FILE="test_nsIProcess.js" -C obj-ff/xpcom/tests check-one

 

Unit test for nsIProcess exists

New patch:
http://jamesboston.ca/patches/patch112808.txt

I've figured out how to move my unit test into the tree along with a few simple programs that are used by the test. There is an existing directory for unit tests at xpcom/tests/unit and that's where test_nsIProcess.js can now be found. I also created a new directory, xpcom/tests/simple-programs, that contains the source for, well, simple programs. When the tree is built, they get compiled into executables that I can use to test the functionality of nsIProcess. Links to the executables appear in obj-ff/dist/bin which is convenient for me.

Now it's just a matter of polishing the actual unit test code to make sure it covers every edge case. At the moment, it simply runs through all the functions to see if the return codes make sense.

 

Getting paths inside an xpcshell

One of the issues I've had with creating a unit test for nsIProcess is figuring out the full path to my unit tests within the build system.

The answer is much simpler than I expected.

var file = Components.classes["@mozilla.org/file/directory_service;1"]
                     .getService(Components.interfaces.nsIProperties)
                     .get("CurProcD", Components.interfaces.nsIFile);
print(file.path);

This will return /home/james/mozilla/src/obj-ff/dist/bin which is exactly what I need.

The other issue is building a few simple executable files at compile time. I need binaries that do things like exit, never exit, and crash. (These are actually the files I need the full path for.)

Ted Mielczarek has pointed me in the right direction with regard to makefiles (which are my nemesis). Just write a TestProgram.cpp and then set SIMPLE_PROGRAMS = TestProgram in the makefile.
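To make the three behaviours concrete (exit, never exit, crash), here is a minimal sketch in C of what a test has to tell apart. It is not the TestProgram.cpp mentioned above: it is POSIX-only, it uses fork() in place of separate SIMPLE_PROGRAMS binaries, and every name in it (act, spawn_and_observe, the enums) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical names: the three kinds of helper program the post asks for. */
enum behavior { EXITS_CLEANLY, NEVER_EXITS, CRASHES };
enum outcome  { OUT_EXITED, OUT_HAD_TO_KILL, OUT_CRASHED, OUT_ERROR };

/* What each helper would do in its own main(). Never returns. */
static void act(enum behavior b)
{
    switch (b) {
    case EXITS_CLEANLY: _exit(0);
    case NEVER_EXITS:   for (;;) pause();   /* sleep until a signal arrives */
    case CRASHES:       abort();            /* dies from SIGABRT */
    }
    _exit(1);
}

/* Start a child with the given behaviour and report how it ended. */
enum outcome spawn_and_observe(enum behavior b)
{
    pid_t pid = fork();
    if (pid < 0)
        return OUT_ERROR;
    if (pid == 0)
        act(b);

    int status = 0;
    int killed_by_us = 0;
    pid_t r = 0;
    struct timespec tick = { 0, 10 * 1000 * 1000 };   /* 10 ms */

    /* Poll for about half a second to see whether the child ends by itself. */
    for (int i = 0; i < 50 && r == 0; i++) {
        r = waitpid(pid, &status, WNOHANG);
        if (r == 0)
            nanosleep(&tick, NULL);
    }
    if (r == 0) {
        /* Still running: this is the case a kill() method has to handle. */
        kill(pid, SIGKILL);
        killed_by_us = 1;
        r = waitpid(pid, &status, 0);
    }
    if (r != pid)
        return OUT_ERROR;
    if (killed_by_us)
        return OUT_HAD_TO_KILL;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return OUT_EXITED;
    if (WIFSIGNALED(status))
        return OUT_CRASHED;
    return OUT_ERROR;
}

/* Usage: one check per behaviour. */
void demo_behaviors(void)
{
    assert(spawn_and_observe(EXITS_CLEANLY) == OUT_EXITED);
    assert(spawn_and_observe(NEVER_EXITS) == OUT_HAD_TO_KILL);
    assert(spawn_and_observe(CRASHES) == OUT_CRASHED);
}
```

In the real tree each behaviour would presumably be its own small executable, with the xpcshell test doing the observing through nsIProcess rather than waitpid().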

The only other stumbling block left is figuring out where the unit tests should go in the tree. I've been using the example file in the devmo tutorial. There is probably more makefile pain ahead.

 

Unicode variant of PR_CreateProcess?

I've consulted with Wan-Teh Chang, the NSPR module owner, regarding Unicode support. Apparently, Unicode support has been added to NSPR's file I/O functions (to support Unicode file pathnames). However, the new functions have only been implemented on Windows. Moreover, they are conditionally compiled only if the MOZ_UNICODE macro is defined.

Searching for that macro leads to functions for opening files and directories. I'm not sure that's what I'm looking for or if it's useful to what I'm doing. I'm a bit confused, because I already know that nsIFile (or nsILocalFile) supports Unicode for paths and filenames. And I have no idea what to do about Unix and Unicode. That's actually a much bigger problem in nsIProcess because Mac and Linux both call PR_CreateProcess, whereas on Windows I have the option of avoiding that call by using existing code.

Wan-Teh indicated that a Unicode variant of PR_CreateProcess is something that could make it into the tree.

 

Lizard Feeder

This is interesting:
http://khan.mozilla.org/~lorchard/lizardfeeder/

It's a real-time feed of bug reports, code commits and other Mozilla stuff.

 

Patch to add Unicode support to nsIProcess

New patch:
http://jamesboston.ca/patches/patch111108.txt

After discussions with Benjamin Smedberg, Mark Finkle, Jason Orendorff, and Ted Mielczarek (among others) about Unicode, I've decided to decouple nsIProcess from the Netscape Portable Runtime, at least as far as process creation goes. I may still be able (need?) to use it for piping between processes. Benjamin brought home to me how long it would take to get Unicode support into the NSPR. Unfortunately, my approach to managing processes depends on the NSPR. So that approach is out. Unicode in the NSPR will have to be my next project.

Currently, nsIProcess only uses the NSPR to start non-Windows processes. It doesn't support Unicode at all. My plan is to port the Unix process creation code over from the NSPR into nsIProcess.

I've gone back to an early patch I wrote to fix the kill method under Windows without using the NSPR, so that's back in. I've also merged code for Unicode support from an extension written by dafi:
http://dafizilla.wordpress.com/2008/10/08/nsiprocess-windows-and-unicode/

So with this patch Windows users at least can start and stop processes and use Unicode arguments. I've tested this with some Aramaic and it works.

The run method is now defined in xpidl to take wide characters like this:

unsigned long run(in boolean blocking,
                  [array, size_is(count)] in wstring args,
                  in unsigned long count);

In practice this translates to:

NS_IMETHODIMP  
nsProcess::Run(PRBool blocking,
               const PRUnichar **args,
               PRUint32 count,
               PRUint32 *pid)

The method that assembles the command line looks like this now:

static int assembleCmdLine(PRUnichar *const *argv,
                           PRUnichar **cmdLine)
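The body of assembleCmdLine isn't shown here, so the following is only a sketch of what a wide-character version has to do: join the arguments with spaces and quote them so that the Windows argument parser splits them back into the same strings. It assumes the commonly documented Windows quoting rules (backslashes are literal unless they precede a double quote). It uses wchar_t as a stand-in for PRUnichar (which is 16 bits wide, whereas wchar_t is 32 bits on Linux), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* An argument must be quoted if it is empty or contains whitespace or a quote. */
static int needs_quotes(const wchar_t *arg)
{
    return *arg == L'\0' || wcspbrk(arg, L" \t\n\v\"") != NULL;
}

/* Append one argument at out, quoting if needed; returns the new end. */
static wchar_t *append_arg(wchar_t *out, const wchar_t *arg)
{
    if (!needs_quotes(arg)) {
        size_t len = wcslen(arg);
        wmemcpy(out, arg, len);
        return out + len;
    }
    *out++ = L'"';
    for (const wchar_t *p = arg; ; p++) {
        size_t backslashes = 0;
        while (*p == L'\\') {
            p++;
            backslashes++;
        }
        if (*p == L'\0') {
            /* Double trailing backslashes so they don't escape the closing quote. */
            for (size_t i = 0; i < 2 * backslashes; i++)
                *out++ = L'\\';
            break;
        }
        /* Before a quote: double the backslashes and add one to escape the quote.
         * Before anything else: backslashes are literal. */
        size_t emit = (*p == L'"') ? 2 * backslashes + 1 : backslashes;
        for (size_t i = 0; i < emit; i++)
            *out++ = L'\\';
        *out++ = *p;
    }
    *out++ = L'"';
    return out;
}

/* Returns 0 and stores a malloc'd string in *cmd_line, or -1 on failure.
 * argv must be NULL-terminated; the caller frees *cmd_line. */
int assemble_cmd_line(const wchar_t *const *argv, wchar_t **cmd_line)
{
    const wchar_t *const *a;
    size_t cap = 1;                       /* terminating NUL */
    for (a = argv; *a; a++)
        cap += 2 * wcslen(*a) + 3;        /* worst-case escaping, quotes, space */

    wchar_t *buf = malloc(cap * sizeof *buf);
    if (!buf)
        return -1;

    wchar_t *out = buf;
    for (a = argv; *a; a++) {
        if (a != argv)
            *out++ = L' ';
        out = append_arg(out, *a);
    }
    *out = L'\0';
    *cmd_line = buf;
    return 0;
}

/* Usage: a non-ASCII argument (two Syriac letters) passes through untouched. */
void demo_cmd_line(void)
{
    const wchar_t *const argv[] = { L"prog", L"hello world", L"\u0710\u0712", NULL };
    wchar_t *line = NULL;
    assert(assemble_cmd_line(argv, &line) == 0);
    assert(wcscmp(line, L"prog \"hello world\" \u0710\u0712") == 0);
    free(line);
}
```

Note that nothing in the quoting logic depends on the character width. The Unicode support comes from carrying wide strings all the way to the process creation call.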

 

Piping using the Netscape Portable Runtime

Since my approach to fixing nsIProcess has been to take advantage of code that already exists in the Netscape Portable Runtime, I've been hunting in there for something that might be useful for inter-process communication. What I need is something for capturing the stdout of a process or for sending stuff to its stdin. Well, it looks like the NSPR has an API for i/o redirection.

nsprpub/pr/include/prproces.h#59

NSPR_API(void) PR_ProcessAttrSetStdioRedirect(
    PRProcessAttr *attr,
    PRSpecialFD stdioFd,
    PRFileDesc *redirectFd
);

nsprpub/pr/include/prio.h#1891

NSPR_API(PRStatus) PR_CreatePipe(
    PRFileDesc **readPipe,
    PRFileDesc **writePipe
);

The PRProcessAttr* appears to be the key here. If you look at the implementation for PR_ProcessAttrSetStdioRedirect you can see that PRProcessAttr* contains the information about piping:

nsprpub/pr/src/misc/prinit.c#520

PR_ProcessAttrSetStdioRedirect(
    PRProcessAttr *attr,
    PRSpecialFD stdioFd,
    PRFileDesc *redirectFd)
{
    switch (stdioFd) {
        case PR_StandardInput:
            attr->stdinFd = redirectFd;
            break;
        case PR_StandardOutput:
            attr->stdoutFd = redirectFd;
            break;
        case PR_StandardError:
            attr->stderrFd = redirectFd;
            break;
        default:
            PR_ASSERT(0);
    }
}

Although PRProcessAttr* is the fourth parameter in the PR_CreateProcess signature, it isn't used in the nsIProcess invocation:

xpcom/threads/nsProcessCommon.cpp#371

PR_CreateProcess(mTargetPath.get(), my_argv, NULL, NULL);

I think I can use the NSPR API to create pipes, attach them to a PRProcessAttr with PR_ProcessAttrSetStdioRedirect, and then pass that PRProcessAttr* to PR_CreateProcess. In this way I can expose i/o piping for processes in nsIProcess.
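To show the shape of that plan without depending on the NSPR, here is a minimal sketch using plain POSIX calls. It is an analogy, not NSPR code: pipe() stands in for PR_CreatePipe, and dup2() in the child stands in for what PR_ProcessAttrSetStdioRedirect asks PR_CreateProcess to do. The function name run_and_capture_stdout is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with its stdout redirected into a pipe, copy up to cap-1
 * bytes of its output into buf, and return its exit code (-1 on error). */
int run_and_capture_stdout(char *const argv[], char *buf, size_t cap)
{
    int fds[2];                           /* fds[0] = read end, fds[1] = write end */
    if (cap == 0 || pipe(fds) != 0)       /* compare: PR_CreatePipe */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: make the pipe's write end its stdout, then run the program.
         * Compare: redirecting PR_StandardOutput to writePipe. */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);                       /* only reached if exec failed */
    }

    /* Parent: close the write end, otherwise read() never sees end-of-file. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Usage: capture the output of `echo hello`. */
void demo_capture(void)
{
    char out[64];
    char *const argv[] = { "echo", "hello", NULL };
    assert(run_and_capture_stdout(argv, out, sizeof out) == 0);
    assert(strcmp(out, "hello\n") == 0);
}
```

The detail most likely to carry over to an NSPR version is the bookkeeping around the pipe ends: the parent has to close its copy of the write end, or the read side blocks forever.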

There isn't any documentation for this that I can find, but someone has been good enough to write some unit tests for this stuff that give me an idea of how it works:

nsprpub/pr/tests/pipeping.c#119

I think this stuff may have been created with file i/o in mind, but I'm not sure that matters.

 

Getting a picture of the PDXR

Trying to get the PDXR project moving again. The team met with David Humphrey last week and we met again ourselves to map out a few things. The goal is to create enough documentation that it could be implemented by another team (although I hope it won't come to that).

Samer and I put together a simple diagram of the project architecture so that we would have a model to refer to as we try to describe what the project is:

I'm going to try and elaborate on the diagram here a bit.

Mozilla-Central is the main repository for Mozilla source code. This is the raw fodder out of which documentation will be created. How it is created doesn't really appear on our diagram unless you count the "How does this happen?" note. Well, how it happens is through a static analysis tool called Dehydra. In layman's terms, Dehydra reads the Mozilla source code and annotates it in a way that can be turned into documentation in an automated fashion. This part is another project entirely and as it relates to our project, the relevant work is being done by David Humphrey. However, our team will need to set up the Dehydra tool to feed our update server. (More on that in a moment.) David has assured us that this part is not a huge job. It might be useful to sit down with him sometime in the near future and get our own development setup running to play around with and use for research. (David has a setup, but we wouldn't want to muck it up.) For now, let's just take it as read that the Mozilla source code has been processed and fed to the update server.

So what is the update server? Although this project is all about building a tool for browsing documentation off-line, that documentation is in a constant state of flux. It would be useful to be able to refresh it when need be. The update server needn't be terribly complex. A simple Apache server and some cgi work should suffice. However, it needs to be worked out exactly what kind of update is being pushed. Is it a binary diff of the sqlite database? And if so, will it need to keep a collection of diffs based on an original dataset in order to service users who haven't synced for some time? There's also the question of whether the update mechanism built into Mozilla for extensions is useful to us. We need to do some research on how extension updates work.

Now we come to the heart of the project: the extension. As our diagram shows, the Portable DXR (our browsable source code documentation) is an application that lives inside our extension. The diagram should probably be revised a bit to show that the cgi tools (Perl or possibly Python support) exist as part of an embeddable web server. When you remove the DXR part, what you are left with is the platform on which it runs. It's also worth pointing out that the sqlite support is available in two ways. Local storage capabilities using sqlite are built into Mozilla applications and can be accessed by an extension using C++ or JavaScript (and I believe Python). However, sqlite database files can be accessed by any software that understands the format, so sqlite support will also be available through the cgi tools. The DXR application will probably use the cgi tools to access sqlite files if our presentation layer is traditional XHTML. But if we decide to use XUL, we can use the browser's built-in functionality directly from JavaScript. In either case, I suspect we will end up leveraging the built-in sqlite support for the updates. But we need to do some research on that.

The last part is the presentation of the documentation, the DXR proper. David has something now that uses XHTML and Ext JS for the front end. We could port that to the extension, but there would need to be modifications for sure. For instance, David's DXR doesn't give the user the option of updating. If we are combining this with Prism and XULRunner, the only UI elements the user will see will be in the displayed page. So we would need to include UI elements there ourselves.

I've been reading some blog posts from Mark Finkle about mixing XUL and HTML. I think that may be one way to go. Also, we would want to take advantage of Samer's jQuery skills.

I haven't said anything yet about how one embeds a server in an extension. I already have a proof-of-concept extension. It uses SHTTPD and supports Perl and PHP for cgi. However, the server runs as a separate process from the browser and this could lead to problems where one is running and not the other. The plan is to find a suitable lightweight server and compile it as a binary extension to the browser. We won't have to write the server from scratch, but there will be considerable work finding a suitable server that we can hack to work as a binary component on Windows, Mac and Linux. And then there is the not-trivial task of getting the cgi tools to run across all three platforms.

Probably our next step is to do some research. Actually, if we are doing this in a proper project management way, we need to map out how we are going to do the research, how long it should take, who is going to do it, and so forth.

So that's kind of where the project is right now.

 

Patch works but the approach is wrong

I had a chance to test the latest patch on Linux. The same solution works for both Mac and Linux. That means that the run and kill methods of nsIProcess now work across three platforms as described here:
developer.mozilla.org/En/NsIProcess.

But the way I return the PID is just too hacky to make it into the tree. My approach assumes too much about the internal structure of a PRProcess type that can't be verified at compile time or run time. It's a brittle solution. Benjamin Smedberg, the guardian of XPCOM, recommends re-implementing PRProcess for my needs.

What I really need to do is publish the nsIProcess2 API design I promised several weeks back. It would be easier to figure out what needs to be re-implemented and get advice if I had an over-arching design to guide me.

 
