davidflanagan.com: books, tools and techniques for programmers

Books & Tools

Techniques

Jude

Jude is my Java documentation browser. It combines Sun's definitive javadocs with the easy-to-use format of Java in a Nutshell, and tops it off with easy keyboard-based navigation and full-text searching.

Jude is available for free evaluation.

See the user's guide for more info.

Java in a Nutshell

The 5th edition is now out, with complete coverage of Java 5.0!

It includes a fast-paced tutorial on the language, and a compact quick-reference for the core Java API.

Java Examples in a Nutshell

The 3rd edition, updated for Java 1.4

This edition has all-new coverage of the NIO and JavaSound APIs, completely rewritten Servlets and XML chapters, and coverage of new Java 1.4 features (assertions, logging, preferences, SSL, etc.) added throughout. A great book for those who like to learn by example. 193 working examples: 21,900 lines of carefully commented code to learn from.

JavaScript: The Definitive Guide

This is my JavaScript bestseller. It has been called the "bible" for JavaScript programmers.

Brendan Eich, the creator of JavaScript, called it a "must-have reference".

Java 1.5 Tiger: A Developer's Notebook

Amazon incorrectly credits me as the main author on this book. I'm actually the second author: really more of a consultant. This is a good book about all the language changes in the latest version of Java.

Effective Java

I didn't write this excellent book, but I wish I had.

Author Josh Bloch is probably best known for the collections classes in the java.util package. His experience and wisdom are apparent in this book. I learned from it and recommend it highly.

January 11, 2006

May deferred scripts be executed out-of-order?

The HTML specification defines a defer attribute for <script> tags. It is a hint to the browser that the script does not include calls to document.write() that produce output. Browsers are free, therefore, to continue parsing and rendering the document while waiting for the script to load.

My understanding of this attribute has always been that scripts may only be deferred until the next non-deferred script is encountered. That is, if the defer attribute is present, a script may be deferred, but it must still be executed in the order in which it appears.

This is not Microsoft's understanding, however. In IE, a deferred script runs out-of-order. Consider the following code:

<script>alert('first script');</script>
<script defer src="deferred.js"></script>
<script>alert('third script');</script>

IE only defers scripts that are loaded from external files, so the second line uses the src attribute to include the deferred.js file. This file simply contains another alert() call. When run in IE, we see that the first script is executed, then the third script is executed, and then the second script is executed.
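The two readings of the attribute can be modeled in a few lines of JavaScript. This is only a simulation of the ordering, not browser code: the function name executionOrder and its deferOutOfOrder flag are hypothetical names made up for this sketch, and the out-of-order branch simply mirrors the behavior observed above.

```javascript
// Model each <script> tag as { name, defer }. This simulates ordering only;
// it does not load or run real scripts.
function executionOrder(scripts, deferOutOfOrder) {
  if (!deferOutOfOrder) {
    // In-order reading: defer is only a hint, so document order is preserved.
    return scripts.map(function (s) { return s.name; });
  }
  // Behavior observed in IE: non-deferred scripts run as they are parsed,
  // and deferred external scripts run afterward.
  var immediate = scripts.filter(function (s) { return !s.defer; });
  var deferred = scripts.filter(function (s) { return s.defer; });
  return immediate.concat(deferred).map(function (s) { return s.name; });
}

var page = [
  { name: "first", defer: false },
  { name: "deferred.js", defer: true },
  { name: "third", defer: false }
];

console.log(executionOrder(page, false).join(", ")); // first, deferred.js, third
console.log(executionOrder(page, true).join(", "));  // first, third, deferred.js
```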

My impulse is to bash Microsoft for getting this wrong, of course. The HTML spec does explicitly say "All SCRIPT elements are evaluated in order as the document is loaded." I cannot be 100% certain that Microsoft is wrong, however, because the spec does not mention execution order when describing the defer attribute:

defer [CI]
When set, this boolean attribute provides a hint to the user agent that the script is not going to generate any document content (e.g., no "document.write" in javascript) and thus, the user agent can continue parsing and rendering.

Microsoft deserves credit, I suppose, for actually having implemented the defer attribute; no other browsers appear to utilize the hint. I think they got the implementation wrong, though. In the absence of other implementations, however, it doesn't matter so much whether Microsoft follows the spec or not: we can just treat deferred scripts as an IE-only feature.

If you've got a different interpretation of the HTML spec, or know of other browsers that honor the defer attribute, let's hear about it in the comments.

permalink | Comments (0)

January 02, 2006

When are document elements available to scripts?

As far as I can tell, the DOM and HTML specifications are silent on the issue of when DOM elements become available to scripts embedded within the document. That is, if you have an element <div id="foo">, when can you call getElementById('foo') and be sure that you'll get a valid result?

To be safe, we typically wait for the onload event. (Although the specifications don't explicitly guarantee that, either.) It appears to be common practice for browser vendors to make the DOM available while it is being parsed. That is, any script in the body of a document can access any elements that appear before it in the document.

Can anyone with more knowledge of the specifications, or with more experience writing production code, comment on this issue?
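A cautious script can combine both approaches: try the lookup immediately and fall back to the load event if the element is not there yet. The sketch below is one way to write that. The helper name withElement is hypothetical, and the document and window objects are passed in as parameters only so the function can also be exercised outside a browser.

```javascript
// Hypothetical helper: look up an element now and, if it has not been parsed
// yet, retry once the load event fires.
function withElement(doc, win, id, callback) {
  var element = doc.getElementById(id);
  if (element) {
    callback(element); // Already parsed: the element appeared before this script.
    return;
  }
  var previous = win.onload; // Preserve any handler that is already registered.
  win.onload = function (event) {
    if (typeof previous === "function") previous.call(win, event);
    callback(doc.getElementById(id)); // Still null if no such element exists.
  };
}
```

In a page this would be called as withElement(document, window, "foo", function (div) { ... }). Whether the fallback branch is ever needed for scripts placed after the element is exactly the open question here.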

This is, of course, related to the "onload problem" which Dean Edwards purports to have solved. In the comments to that post, Dean asserts that placing scripts at the very end of the body does not guarantee that those scripts will have access to the whole document. (Dean's experience carries a lot of weight, but he does not actually have code that demonstrates that this technique is unreliable.)

Comments are open.

permalink | Comments (8) | TrackBack (0)

December 05, 2005

Flash Persistence Security: false alarm, or lingering flaw?

Okay, I've done some more investigation. After Brad Neuberg reported (in the comments of a previous blog entry here) that he could not reproduce the security hole I reported using his AMASS library, I tried myself. I couldn't reproduce it with AMASS either.

I think I've figured out what is going on. In my experiments with Flash persistence, I was using a little SWF movie that did nothing but load and store persistent data. When it started up, it would load the persistent data and pass it to JavaScript with an FSCommand. This is what opened up the hole: another webpage, in a different domain, could hotlink the SWF, setting allowScriptAccess=always, and still receive the FSCommand and the persistent data.

I modified my SWF so that the FSCommand just notifies JavaScript that data is ready. Then the JavaScript code has to ask for that data with a GetVariable command. This seems to be more secure: according to the Macromedia website, the Flash player (in version 7 and later) doesn't allow cross-domain invocation of GetVariable and SetVariable. There might still be a problem here for player versions less than 7, but I think I can defend against that in the ActionScript code.
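On the JavaScript side, the notify-then-ask handshake might look roughly like the sketch below. The specific names are assumptions made up for illustration, not the ones in my movie: "dataReady" stands for whatever command the movie sends, and "_root.persistentData" for the ActionScript variable that holds the loaded data.

```javascript
// Builds a handler for FSCommand notifications from one movie.
// "dataReady" and "_root.persistentData" are hypothetical names.
function makeFSCommandHandler(movie, onData) {
  return function (command, args) {
    if (command !== "dataReady") return; // Ignore unrelated commands.
    // The notification carries no data. The page has to ask for it, which
    // gives the player a chance to apply its cross-domain check.
    onData(movie.GetVariable("_root.persistentData"));
  };
}
```

In a browser, the handler would be installed under the global name the player looks for, which as I understand it is the movie's id followed by _DoFSCommand: for a movie with id "store", window.store_DoFSCommand = makeFSCommandHandler(document.getElementById("store"), handleData), where handleData is whatever function consumes the data.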

Note that when my SWF is hotlinked, it can access the shared data. All I've done is modify it so that it no longer exports the data. I argue that there is still a security flaw here. When my SWF is hotlinked by a different domain, it should not be able to access the persistent data at all. The security team at Macromedia might argue that the security hole was in my code, and that I created it with my ignorant use of fscommand().

My apologies for raising what may have been a false alarm.

permalink | Comments (0)

December 02, 2005

Flash Persistence: one more try

[Update: this may be a false alarm. See how I resolved the problem.]

I'm still trying to find a workaround to the Flash persistence security hole that opens up when you script your Flash movies from JavaScript.

What I want to try now is to code my ActionScript so that it compares the URL of the Flash movie of which it is a part (something like _root.movieClip._url) to the URL of the web page in which the movie is embedded. If they are not the same, then the ActionScript knows that it is being hotlinked, and can refuse to return any persistent data.
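The comparison itself is simple. Here it is sketched in JavaScript rather than ActionScript, with a hypothetical function name, and treating "the same" as "same scheme, host and port", which is my assumption about what the check should mean. It shows only the test, not how the movie would obtain the page URL.

```javascript
// Hypothetical check: a movie counts as hotlinked when it is served from a
// different origin (scheme, host and port) than the page that embeds it.
function isHotlinked(movieUrl, pageUrl) {
  return new URL(movieUrl).origin !== new URL(pageUrl).origin;
}

console.log(isHotlinked("http://site-a.example/store.swf",
                        "http://site-a.example/index.html")); // false
console.log(isHotlinked("http://site-a.example/store.swf",
                        "http://site-b.example/page.html"));  // true
```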

My problem is that I don't know the ActionScript programming environment or API well, and I can't find a way to find the URL of the document in which the movie is embedded. It seems like there ought to be a way, but I don't know...

Anyone know how to do it?

permalink | Comments (4)

December 02, 2005

More on the Flash Persistence Security Hole

[Update: this may be a false alarm. See how I resolved the problem.]

I've been looking for a way to work around the security hole in the Flash persistence mechanism that I described in the previous post.

Since the hole occurs when one site hotlinks a Flash movie on another site, it seemed to me that the obvious solution would be to prevent hotlinking. So I set up a .htaccess file using the Apache mod_rewrite module and the HTTP_REFERER header. It worked: offsite links to the SWF file were prevented.

But this wasn't good enough. I didn't take caching into account. Suppose a user has just visited Site A, and that site uses a scriptable Flash movie to store persistent data locally. The user then visits Site B, which is the malicious one. Site B hotlinks the movie on Site A and scripts it to gain read access to the persistent data. My .htaccess file on the server was supposed to prevent this kind of hotlinking. But the server never gets involved. Site B doesn't need to ask the Site A server to send it the movie: the movie is already in the browser's cache, and Site B can read Site A's persistent data with no trouble.

Anyone have a solution to this problem? I'm ready to give up and declare that Flash-based persistence is only good as a cache for information that is already publicly available on the network.

And as a meandering aside, let me say that there are some interesting uses of it there. I gather that XMLHttpRequest bypasses the browser cache, at least in certain browsers. Flash persistence could be used to create a persistent cache for XMLHttpRequest. This might be useful for a web site that had a bunch of pages that all did XSLT using the same stylesheet, which was retrieved with XMLHttpRequest.

Adam Vandenberg has an XMLHttpRequest caching solution but the cache only persists for the lifetime of the current page. Flash persistence might be an interesting addition to his caching code. (Of course, we'd have to confirm that this Flash security hole only gives read access to the cached data; otherwise any website could corrupt the Flash cache of any other.)

permalink | Comments (5)

December 01, 2005

Security Flaw in Flash-based Persistence

[Update: this may be a false alarm. See how I resolved the problem.]

Inspired by Brad Neuberg's AMASS project, I've been experimenting with using the Flash plugin as a client-side persistence mechanism, as an alternative to cookies. It allows the storage of much more data, and that data doesn't have to be uploaded to the server for every webpage that uses it. This seems important, and projects like AMASS seem like they have breakthrough potential.

However, I've discovered something I'm worried about. Persistence in Flash is done with the SharedObject class. By default persistent data is not shared among Flash movies: it is only accessible to the movie that stored it. SharedObject has a way to loosen this up, like the path attribute of a cookie. You can specify that all Flash movies in the same directory can share persistent data, for example. Or you can specify that all Flash movies from the same website can share data.

This may be fine when the data is locked inside the Flash movies. But when Flash movies are scriptable, we run into problems. Both AMASS and my own experiments with Flash-based persistence use a simple invisible movie to make the necessary calls to the SharedObject. Both script the Flash plugin with JavaScript so that we can persist client-side JavaScript data.

And this is where the problems arise. Flash governs access to the data based on the URL of the movie; it doesn't seem to care about the URL of the document within which the movie is embedded. Suppose, therefore, that Site A uses a scriptable Flash movie to persistently store some data from JavaScript. Site B can hotlink that scriptable movie right off of Site A, and the JavaScript code on Site B can read the data stored by the JavaScript code on Site A. (In my experiments, Site B cannot overwrite or delete the data stored by Site A, but it can read it.)

My testing was done with Firefox 1.0.6 and the Linux version of the Flash player 7.0 r25.

Update: I have now obtained the same results using Firefox 1.02 and Flash version 8 on a Windows machine.

I think this is a problem. I suspect that it is a big problem and that Macromedia needs to fix it. I'm only beginning to dabble in Flash and just learning about how to script Flash movies. I don't know if this security problem will affect movies that use SharedObject but do not explicitly open themselves up to scripting of that SharedObject. It could be that if you play a Flash-based game at Site C, site D could figure out a way to read your high score in that game.

Your thoughts are welcome in the comments, especially if you understand Flash and Flash scripting better than I do!

permalink | Comments (4)

November 04, 2005

Amazon Commodifies Human Intelligence

Amazon has a new web service which they call Mechanical Turk, after a fake chess-playing robot from the 18th century.

Amazon describes it like this:

Complete simple tasks that people do better than computers. And, get paid for it.

Most of the tasks available for completion today involve looking at a set of streetscape photographs and picking the one that is the best picture of a specified business. For completing such a task, you can earn 3 cents.

For the more ambitious, you can help Amazon re-catalog the automotive parts they sell, earning 40 cents for re-writing the title and bullet points for a custom license plate holder or an exhaust tube extension.

At first I was intrigued by this. Interesting use of technology. (There's even an API for it: hey, I could use Mechanical Turk tasks as a captcha for blog comments, and readers would earn money for me...) Free market goodness. And as Amazon says:

Choose from thousands of tasks, control when you work, and decide how much you earn.

What's not to like?

Then I noticed Amazon's subtitle for the Mechanical Turk service: "Artificial Artificial Intelligence". That seems a little creepy. Reducing humans to cheap computing power. But I don't think I'm reading too much into this. Here are two paragraphs from the FAQ:

When we think of interfaces between human beings and computers, we usually assume that the human being is the one requesting that a task be completed, and the computer is completing the task and providing the results. What if this process were reversed and a computer program could ask a human being to perform a task and return the results? What if it could coordinate many human beings to perform a task?
Amazon Mechanical Turk provides a web services API for computers to integrate "artificial, artificial intelligence" directly into their processing by making requests of humans. Developers use the Amazon Mechanical Turk web services API to submit tasks to the Amazon Mechanical Turk web site, approve completed tasks, and incorporate the answers into their software applications. To the application, the transaction looks very much like any remote procedure call: the application sends the request, and the service returns the results. In reality, a network of humans fuels this artificial, artificial intelligence by coming to the web site, searching for and completing tasks, and receiving payment for their work.

I don't know if I'm more disturbed by the philosophical implications of this Matrix-lite inversion of control, by the commodification of intelligence it represents, or by the implications for workers' rights. A system like this will just accelerate the race to the bottom: even offshore workers' jobs won't be safe. Heck, they could probably train pigeons to do some of the image recognition tasks they're currently paying 3 cents for. 3 cents probably buys a lot of pigeon pellets.

I'm not qualified to really critique Mechanical Turk the way it needs to be critiqued. Perhaps former O'Reilly editor Steve Talbott will write about it in his wonderful NetFuture newsletter.

Mechanical Turk is still a beta service. I hope it fails.

permalink | Comments (6) | TrackBack (2)

October 24, 2005

java.io.Console

Build 57 of Java 6 includes a new class java.io.Console.

Javadocs are at the link above. And you can also learn more at Alan Bateman's blog.

Highlights:

Obtain the Console object with System.console(). It returns null if there is no console.
A readPassword() method reads input that is not echoed on the console.
A readLine() method reads a line of input from the user. Much simpler than wrapping a BufferedReader around an InputStreamReader around System.in.
Console also supports printf() and format() methods, just like System.out does.

permalink | Comments (1)
