Stuart Thompson’s Tech Blog

Web 2.0 & Quality Development

Archive for April, 2007

Gravatar

Posted by stuartthompson on April 23, 2007

I discovered today, way behind the blogosphere eight-ball, a site that provides online avatars: gravatar.com.  A gravatar is a Globally Recognized Avatar, which means an 80×80 image is associated with an email address.  The gravatar service will serve images from its database for use on blogs where you post comments or any other location where an avatar-accompanied email address makes the content a little more personal.  The service is very simple: to see the image associated with an email address, you add the MD5 hash of that email address (I use this utility) as a querystring parameter named gravatar_id to the url: http://www.gravatar.com/avatar.php.

In addition to the gravatar_id parameter, you can specify the location of a default image to use if the email address is not recognized, a size (in pixels) to which the image should be scaled, and even an MPAA-style rating (G, PG, R, X) that represents the highest “tolerance” your site allows.  Setting the rating to PG, for example, would mean that only G and PG images are shown on your site.  This is a very appealing reason to include gravatar support in the comments section of a blog.
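The request described above can be sketched in a few lines of Python.  This is a minimal sketch rather than the service's official client: the gravatar_id parameter and the endpoint come from the description above, while the querystring names size, rating, and default, the trimming and lower-casing of the address before hashing, and the function name gravatar_url are assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlencode

# Endpoint as described in the post.
GRAVATAR_ENDPOINT = "http://www.gravatar.com/avatar.php"


def gravatar_url(email, size=None, rating=None, default=None):
    """Build a gravatar image URL for an email address.

    Assumptions: the address is trimmed and lower-cased before hashing,
    and the optional parameters are named size, rating, and default.
    """
    normalized = email.strip().lower()
    digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()

    params = [("gravatar_id", digest)]
    if size is not None:
        params.append(("size", str(size)))
    if rating is not None:
        params.append(("rating", rating))
    if default is not None:
        params.append(("default", default))

    # urlencode escapes the default image URL so it survives as a parameter.
    return GRAVATAR_ENDPOINT + "?" + urlencode(params)
```

Calling gravatar_url("someone@example.com", size=80, rating="PG") yields a link that can be dropped into an img tag without exposing the address itself.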

The MD5 hash is a great way to avoid having someone scrape the links for email addresses and allows you to share your gravatar image as a url without fear of revealing the original email address.  There are far easier ways to farm email addresses from blogs (as many people who have been bot-spammed have discovered) but it is very refreshing to see the folks at gravatar considering this aspect when designing their service.

I registered for a gravatar this morning, but currently receive a stock image when requesting the image back from the service.  I’m presuming that there is some kind of delay, possibly incurred while a person approves submitted images.  That would certainly give extra strength to the value of the ratings system.  This ties in to the only weakness I’ve found with the service so far.  The documentation on the sign-up process and usage of the system is a little sparse for anyone but blog developers and in-depth users, which may lead to confusion for some folks.  I couldn’t find any documentation indicating that a sign-up process might be delayed for 24 or 48 hours while images are reviewed, although I didn’t find any documentation to the contrary either.  I’ve completed the sign-up successfully, tested that other people who have got gravatar working show up when I calculate the MD5 of their email and plug it in, so I’ll just have to be patient and check again in the morning to see if my address registration has worked.

Fortunately it looks as though SubText, on which this blog is hosted, is already integrated with gravatar because an image citing a gravatar url is displayed next to each comment entry in the posts.  There seems to be another problem regarding the specification of a default image, which I’ll work out over the next couple of days, but for now it’s a promising new way to add a little personality to any feedback received.  It is kind of cool to think that I’m being rewarded for still using the same hotmail account I set up eight years ago too.  If any blogs are migrated or upgraded to a gravatar-aware hosting platform, then any comments I’ve left over the last eight years with that address will magically be accompanied by my new gravatar.  Further, because it is a lookup service versus being a static copy, I can update my gravatar to avoid the image being as stale as the picture on my driver’s license.

UPDATE:  My gravatar image is now working and viewable here.

Posted in Services

Quantum Computing and Public Key Cryptography

Posted by stuartthompson on April 23, 2007

I have just finished reading Decoding the Universe by Charles Seife, which tells a fascinating tale about the emergence of information theory over the past hundred years.  I have read many texts on classical physics, quantum theory, and the mathematics behind modern information technology.  However, while many have given good insight into their particular field, none have done so wonderful a job of tying all of these disciplines together as Charles Seife does in this text.  The great scientific revolution of our generation is that of information theory and this book presents summaries of the many great theories and articles that have led to the discovery that information theory is a superset of the many classical and quantum physics theories developed over the last few hundred years.

One particular section of the book discusses quantum information, specifically the qubit (derived from quantum bit), which is similar to a classical computer bit except that it has an additional third state indicating both 0 and 1 simultaneously.  Without going too deeply into the complete science behind a qubit, suffice it to say that physicists have found techniques to place very small items such as electrons into a state known as superposition.

The discovery of superposition stems back to a classical physics question of whether light was a wave or a stream of particles.  In 1801 Thomas Young devised an experiment that was thought to show that light was a wave: he shone a beam of light through two narrow slits and observed an interference pattern upon a viewing screen beyond.  Interference patterns are very typically associated with waves, as intersecting waves reinforce each other (crest meeting crest) or cancel each other (crest meeting trough), drawing the light and dark bands on Young’s viewing screen.  However, waves propagate through mediums; that is how sound is heard on an eardrum.  The wave itself is not shaking the eardrum; rather, a set of air particles displaced by the sound source are moving in a wave-like formation and wiggling the eardrum.  So what was a light wave moving?  What material was it displacing?

It was proposed that an invisible medium known as the luminiferous ether was what light propagated through.  It was an attempt to measure this ether that caused the development of the interferometer, a device that would cause a beam of light to be split down two paths and then re-combined before hitting a viewing scope.  The paths were of equal length; however, because light was travelling in two different directions along those paths, one of them should be affected by the ether more harshly than the other.  Because the Earth is moving very quickly, there would have to be a natural ether wind, and a light wave jostling along the path in line with the wind should travel at a different speed from one on the path running perpendicular to the same wind.  However, the experiment concluded that the two beams always reach the viewing scope at exactly the same time.  That is to say that the light travelled at exactly the same speed irrespective of the angle at which the experiment was conducted.  The experiment was even repeated at the top of a very high mountain to ensure that the original laboratory and its environment were not acting as a wind break.

It was the interferometer experiment that later led to the discovery of superposition.  The same concept of splitting down two pathways to recombine later was used to send electrons down two paths to recombine at a sensor.  However, the experiment was attempted with only a single electron.  It was expected that the electron would pick one of the two paths, as such a particle is indivisible and cannot be split.  However, the first run of the experiment seemed to indicate that the electron took both paths at once.  The viewing scope at the end of the path showed an interference pattern only possible if two electrons had collided and interfered with each other simultaneously.  “Faced with two mutually exclusive choices, the electron chooses both.”  Even more amazing was the result of placing a laser along one of the paths and repeating the experiment.  It appeared that the electron chose both paths only until it was measured.  At that point the electron definitively chose only a single path, either by being observed by the laser on the path being watched or by not creating an interference pattern at the end and hitting the end sensor, thereby indicating it chose the unwatched path exclusively.  In quantum mechanics, the electron is said to be in superposition.  In another quantum term, the electrons upon the two paths are said to be entangled (see entanglement), meaning that when one “chose” to exist the other simultaneously chose not to.  It is the two concepts of superposition and entanglement that led to quantum computing.

It was later found that other particles could be forced into other types of superposition.  For example, atoms could be manipulated in such a way that they are both spin up and spin down simultaneously (mutually exclusive states for an atom) and hence be placed into superposition.  This brings us back to the idea of a qubit.  In the same way that a classical computer bit (0 or 1) could be stored in silicon memory, a qubit could now be stored in one of three states by an atom being either spin down (0), spin up (1), or both simultaneously (0 & 1).  It was this principle that was used to build the first quantum computer.

In 1994, Peter Shor developed Shor’s Algorithm for factoring a number N in O((log N)^3) time and O(log N) space.  The concept exploited the idea of superposition, of a particle being in two states at once, and of forcing the particle to “choose” a state (along with the related concept of entanglement).  In the most basic terms, the quantum computer sets up qubits using atoms in both spin up and spin down states simultaneously to model its virtual memory, which is similar to how classical computer science would set up memory with bits as 0s and 1s.  Then it exploits the idea that all of these qubit-carrying particles are entangled to test all possibilities of their combination of states at once.  With some mathematical massaging, the probabilities of these qubits to “choose” a particular state (and therefore force their entangled complements to choose the opposite state) are tickled in a couple of very hairy math passes until the resulting “chosen” states of the atoms reveal the “chosen” answer.  This is a vast over-simplification of the process, but the experiments performed using these theories get the correct results in vastly fewer operations than would be required by a classical computer.  The speed-up depends on the problem: for unstructured search, Grover’s algorithm shows that a task requiring n steps on a classical computer requires on the order of root n steps on a quantum computer, while factoring with Shor’s algorithm gains far more.  Root n is not a big speed increase when considering a four-step classical process takes only two on a quantum computer, but with the thought that a process requiring 256 operations classically takes 16 quantum steps, or that 2,304 steps take only 48 on a quantum computer, the speed increase becomes clear very quickly.

In 2001, in an experiment performed by Isaac Chuang, a 7-qubit quantum computer correctly factored the number 15 into 5 and 3.  While not impressive on the surface, the fact that a quantum computer was able to factor a number using Shor’s algorithm has some interesting ramifications.  While building quantum computers on larger scales is not yet a possibility, were a 256-qubit quantum computer actually constructed, it could complete in a matter of minutes the kind of attack that would take a classical computer an eternity.  It has been calculated that the most powerful classical computer in existence, performing billions of operations per second, would need to run for the entire lifetime of the universe to brute-force crack modern cryptographic keys.  However, a 256-qubit quantum computer could perform the same attack in just a few minutes, rending modern cryptography wide open.
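The number theory behind factoring 15 can be followed classically.  The sketch below is not a quantum program and is not taken from the book: it assumes the standard reduction from factoring to order finding, and the brute-force loop in find_order stands in for the step that a quantum computer would perform with superposition.  The function names are hypothetical.

```python
from math import gcd


def find_order(a, n):
    """Return the smallest r > 0 with a**r % n == 1.

    Brute force stands in for the quantum period-finding step.
    Requires gcd(a, n) == 1, otherwise no such r exists.
    """
    r, value = 1, a % n
    while value != 1:
        value = (value * a) % n
        r += 1
    return r


def factor_via_order(n, a):
    """Try to split n using the order of a modulo n; None means pick another a."""
    shared = gcd(a, n)
    if shared != 1:
        # Lucky guess: a already shares a factor with n.
        return shared, n // shared

    r = find_order(a, n)
    if r % 2 == 1:
        return None  # odd order gives no usable square root

    half = pow(a, r // 2, n)
    if half == n - 1:
        return None  # trivial square root of 1, no factor revealed

    return gcd(half - 1, n), gcd(half + 1, n)
```

With n = 15 and a = 7 the powers of 7 modulo 15 run 7, 4, 13, 1, so the order is 4, the half-way power is 4, and gcd(3, 15) and gcd(5, 15) give the factors 3 and 5.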

Of course, quantum computers cannot yet be built on that scale, nor have the theories of scaling to such sizes ever been field tested.  Furthermore, the technology required to build such a machine is unimaginably more difficult to obtain and construct than a modern home PC, so the cyphers used to secure internet transactions are more than safe for the present time.  It is, however, quite fascinating to think that such computational power could one day become available.  Projects such as cancer research, genome mapping, and even SETI would be revolutionized if they could harness the kind of power that quantum computing is predicted to possess.

If you’re interested in learning more about information theory or quantum computing, I can highly recommend both Decoding the Universe and Charles Seife’s other book Zero, which I read a couple of years ago.  Seife has an excellent way of wording concepts that are otherwise completely over my head without abstracting so much information as to neuter the subject in question.

Posted in Cryptography, Quantum Computing

Ambiguous Error Messages

Posted by stuartthompson on April 18, 2007

Last night I came across an instance of poor software design that is all too common in modern applications, one that has troubled and annoyed me ever since I became involved with computers and software development. Worse than my ruffled feathers, this is an issue that is responsible for such a vast amount of wasted time and money that it never ceases to amaze me how little has been done to correct it.

I have a Canon Powershot A710 IS digital camera. It’s a middle of the range compact digital camera with just enough features to give me power without having any menus or buttons that I don’t understand. One of the features I enjoy and use regularly is the ability to shoot 640×480 videos by simply turning a dial to the little video icon. I recorded such a video last night of about nine minutes in length, which was actually the longest I had yet recorded on the device (as I discovered later). When I bought the camera, Canon supplied with it a suite of software for managing the download of pictures and video to my PC. I have installed this software on a couple of my computers but last night I happened to be synching with a PC that did not have the software installed. This was no big loss. I’ve synchronized without using the Canon software several times. I don’t use any of the advanced features, I simply transfer the pictures to a folder on my PC and use other software applications for catalog organization, photo touch-up, and publishing at a later date. Windows XP ships with the Microsoft Scanner and Camera Wizard, which had been servicing my “get them off the memory card and onto disk” requirements quite sufficiently for several months.

However, last night while transferring a handful of pictures, some video clips, and my nine-minute movie, I was presented with an error dialog stating that there was a problem with the current picture and it could not be transferred. I was presented with two options “Try Again” and “Cancel”. A note indicated that choosing “Cancel” would cause the pictures transferred thus far to be deleted from their destination; my hard drive. I clicked “Try Again” hoping that some power glitch or other anomaly had simply occurred and that all would be fine. After about thirty seconds of nothing, I received the same dialog with the same options. I chose to “Cancel” this time and watched a progress bar indicating that the pictures were being deleted from the disk. After the process completed, another dialog informed me that the transfer process had failed and embedded in an otherwise redundant paragraph of text was a link for more information about the error. Wanting to correct the problem and complete the transfer, I clicked over the link to receive another pop-up dialog that stated:

“The following problem occurred while copying pictures: Not enough storage is available to complete this operation.”

Aha! Progress was beginning to emerge. My first thought was to check the available space on my hard drive; 11.1GB. Not a massive amount, but certainly enough to transfer a video from a camera with a 2GB storage card. After a little more thought and research, I discovered that the software involved in transferring images and video from a camera was the most haphazard layering of TWAIN drivers on WIA drivers on USB drivers with some other dribble gluing them all together. I’m not a low-level digital transfer guru. I’m familiar with the terms and have a pretty good idea of what goes on at each layer from an interface and bits perspective, but I certainly couldn’t debug errors being logged by a TWAIN driver on my system from a mini-dump file.

I checked the camera, first of all to see that the original pictures and video hadn’t been erased during the “Cancel” process from before, but also to see if the nine-minute video would play back on the camera natively. I was suspecting a possible corruption problem on the camera, perhaps something on the SD card had flaked or perhaps the hardware didn’t perform well with little to no memory available natively on the device. I had almost filled the 2GB card in the camera and was beginning to wonder if it used a cache in the transfer process and if, perhaps, that was the source of the “Not enough storage is available to complete this operation” message. The video played fine, indicating no corruption natively, so I proceeded under the caching assumption that the camera needed some virtual breathing room in order to complete the transfer. I fired up the Microsoft Scanner and Camera Wizard again and this time selected to transfer everything except the long video that had caused the problem. Everything else transferred just fine, including several other short video clips I had taken. I confirmed that they had transferred to the PC and then deleted them from the camera to free up some space.

Here we go! Firing up the Microsoft Scanner and Camera Wizard for a third time, I began the transfer process of the one and only item left on the camera’s memory; my long video. After about thirty seconds of inactivity, I was presented with the same error message I had received earlier. Hmm, not the virtual breathing room then. My last thought before I wrote off the video (not something I wanted to do as it was of our cat Peaches playing with her toys, who is having kidney and urinary problems and is going to be put down tomorrow) was to install the Canon software that had come with the camera and see if it could transfer the video. I dug out the CD, which ended up being in my “Ultimate Dance Party” CD case out in my car (long story, don’t ask – I catalog digitally for a reason!) and installed the software on my PC. Voila (or Walla as seems to be the trend now – Google it), the video transferred without error. I confirmed that it played back successfully in Quicktime, approximately nine minutes in length, occupying about 975MB on disk.

Not satisfied that one application magically transferred it while the other spewed, I began investigating my hard drive for other clips that had transferred successfully. The longest I had previously transferred was 780MB in size, confirming that this was the longest clip I’d recorded on my A710 IS camera. I found it an odd place to draw an imaginary line in the sand, but began formulating the idea that perhaps the nine-minute video was too large for the Microsoft Scanner and Camera Wizard to transfer. Perhaps it caches some of the movie as part of the transfer, or uses some ancient component of WIA or TWAIN that hasn’t been re-written since the late eighties and has some arbitrary “no movie will ever be larger than 800MB” limitation in it. Perhaps part of the algorithm in one of the many moving parts just couldn’t figure out how to begin transferring that particular movie. Software is complex, and software operating upon multiple general purpose moving layers is even more complex. I accept that as part of being a software engineer.

However, and this is a very big however, this is the job and purpose of error messages and reporting. I can count the number of times I’ve received a meaningful and useful error message from a piece of software on the fingers of one hand. I cannot imagine the number that represents the number of times I’ve had a meaningless piece of rhetoric presented to me when a piece of software fails; probably close to the famously canonical 10^80 atoms in the universe. The only purpose, the very reason we have error messages, is to meaningfully convey to the user of an application an explanation of why an operation failed.

This leads me to one of the most infuriating observations I have made time and time again in my career as a software engineer. When a piece of software is being developed by a programmer, in most cases the programmer can isolate a point of failure in a piece of code and tell you exactly the steps that would lead to the situation occurring. If you allow them to sit down in an office and explain the myriad reasons why a particular hardware bit being set or a third-party component returning a specific error message leads to their software encountering a failure, they will talk for hours and hours about the underlying reasons why the piece of software failed. Why then, when it comes to trapping that error in their software, a process which takes time and planning to achieve, do they settle for returning the error message “An error occurred. The operation cannot be completed.”? They have the information available to them. When you ask them to research the error at a later date, a volume of comments exists in the actual source code about why this particular error occurs. Why does this information become suddenly unimportant when it is time to present it to the user? Because most programmers are driven by the goal of making the software work and meeting a stated list of requirements by a specified deadline. Furthermore, no part of those requirements details the error messages that should be presented to the user of their application. Everyone blindly assumes the software will always just work, and yet the same set of people will instantly concede that no software is perfect and there are a whole heap of reasons why their software might legitimately fail under certain circumstances.

This leads me to the conclusion that software development teams need to be more aware of, and more responsible for, the failure conditions of their application and make error situations a priority from the first phases of software design and architecture. Questions about what should happen when the camera is unplugged or a cat pees on the keyboard should be asked very early on and considered a fundamental part of the success conditions of the application. The idea that a piece of software has succeeded when it reports a fatal error probably raises many eyebrows skyward and causes much scratching of heads, but the concept is pretty obvious when you think about it. If a software application gracefully presents a meaningful reason as to why it has failed, providing all of the information to the user about where the failure occurred, then it has succeeded in correctly handling the situation. From a software quality perspective, this case is just as important as completing the requested task and displaying a confirmation number, and yet error cases are given some of the lowest priorities and the least development time. When was the last time you sat in a software planning meeting and discussed the cases under which a particular failure could occur, how this could affect layers consuming your software, and how you could most meaningfully provide information about the error either to an end user or a consumer of your component? Yet from a maintenance cost, ease of use, and customer satisfaction perspective, these discussions carry phenomenally more weight than whether to use a hyperlink or a button for sorting a DataGrid.

For software engineers who are looking for additional challenge in their work or for the ability to bring increased quality and value to the software that they develop, spending some quality time on exception planning and management is a gold mine of opportunity just waiting to be unlocked. Try revisiting a simple application you’ve written recently and listing out some of the known ways in which it could fail. Then examine the behavior of your application and think whether or not your parents could meaningfully digest the error information provided to come to a conclusion about how they might correct it and successfully complete their work. Software isn’t used by engineers as a majority and even when it is they are engineers like me who have expertise in other fields. I can often diagnose why an ASP.NET web application failed and what the error might mean, even on sites I’ve never worked on. However, when presented with an error from a TWAIN device driver consumer I have absolutely no clue what happened and would have to walk a pretty long road of knowledge gathering to even start understanding where to look for the failure. Consider that the majority of software users have absolutely no development skills whatsoever and it becomes pretty clear that software needs to do an outstanding job of reporting why errors have occurred.

Several companies, Microsoft included, have made attempts to bring more richness to their error reporting. Unfortunately, this usually results in a troubleshooting guide with only the most basic reasons for failure being listed (see “Is the printer turned on?” for details) and these are all too often compiled into a single indigestible list of all possibilities instead of being more context specific. It will take a long time for software to become more verbose with error reporting, especially considering that many errors occur in external software components that haven’t been rewritten in over a decade. However, that does not excuse a programmer who has additional information about an error from excluding it in their error message. Be verbose, take the time to really document why the line of code that only runs when an error occurs is being executed. Put down everything you know and really try to think about the reasons why the error might have occurred rather than simply stating “There was an error with the camera.” It will not only improve the usability and quality of your application, but might also make you learn something about the error case that allows you to handle it more effectively. Perhaps it is an error that can be handled and corrected without the user even knowing something went wrong. More often than not, programmers will choose to error out because something unexpected happened without even taking the time to figure out why it was unexpected. This masks real design flaws in an application and only perpetuates the “too little information” trend that causes such frustration and expense in software today. Break the trend and really document what is happening at all points. Also, educate your project managers and requirements teams on why error conditions are important and why taking the time to structure the output of error messages is cost effective. 
Planning a list of 100 well thought out error messages, using resource files to allow for the translation of these error messages in future releases, and taking the time to really understand the reasons behind error conditions will save literally hundreds of thousands of dollars spent developing FAQs, supporting forums, and paying for a technical support department to answer questions that a simple error message could have conveyed instantly. How many times has a call come in regarding “Error 471: Failure to communicate” that was received by a level 1, then a level 2 technical support assistant, forwarded on to a developer, researched, and then reported as “Oh yeah, they have to disable the wireless adapter first or the transfer won’t work”? Next a patch is released to check if the wireless adapter is disabled before attempting transfer, and documentation is created to distribute and manage the use of the patch. Tech support lists are updated and a new code branch is created for the new version. Instead, this condition could have been investigated as a success scenario up front and a meaningful error message like “Try disabling the wireless adapter (click here for instructions) and try the transfer again.” could have been displayed.  Even better, the software could have been coded to attempt disabling the adapter first, or research could have been done into how the wireless software API could be utilized to achieve this. All of this would have been discovered if the error scenario had been taken seriously up front and would have cost vastly less than the “pick it up in support later” solution. Users really like software that just works. They gain confidence in your products and have a positive experience that they associate with using your solutions when those solutions speak to them in English and give meaningful reasons why they can’t obey instructions. Consider that the next time you don’t ask a developer how they handled the camera being unplugged.
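As a minimal sketch of what that might look like in code, the Python below attaches three pieces of information to every failure: what failed, why, and what the user can try next.  Everything in it is hypothetical: the TransferError class, the check_transfer function, and the 800MB limit (which simply mirrors my guess earlier in this post) are illustrations, not the behavior of any real camera wizard.

```python
class TransferError(Exception):
    """An error that says what failed, why, and what the user can try next."""

    def __init__(self, what, why, suggestion):
        super().__init__(f"{what} {why} {suggestion}")
        self.what = what
        self.why = why
        self.suggestion = suggestion


MEGABYTE = 1024 * 1024

# Hypothetical limit, standing in for the 800MB ceiling guessed at above.
ASSUMED_BUFFER_LIMIT_BYTES = 800 * MEGABYTE


def check_transfer(file_name, file_bytes, free_disk_bytes,
                   buffer_limit_bytes=ASSUMED_BUFFER_LIMIT_BYTES):
    """Raise a descriptive TransferError if the file cannot be copied."""
    what = f"Could not copy {file_name} ({file_bytes // MEGABYTE}MB)."

    if file_bytes > free_disk_bytes:
        raise TransferError(
            what,
            f"The destination drive has only {free_disk_bytes // MEGABYTE}MB free.",
            "Free up disk space or choose another destination, then try again.")

    if file_bytes > buffer_limit_bytes:
        raise TransferError(
            what,
            f"This wizard cannot transfer files larger than "
            f"{buffer_limit_bytes // MEGABYTE}MB.",
            "Use the software supplied with the camera or a card reader instead.")
```

Had the wizard reported something like the second message, my evening would have been about an hour shorter.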

Posted in Electronics

The ObjectDataSource Web Control

Posted by stuartthompson on April 17, 2007

There exists a wide variety of ways in which data for a web page can be retrieved from a data store. One of the ways introduced in ASP.NET 2.0 involves using the ObjectDataSource control. This article discusses the ObjectDataSource control and how it fits into the ASP.NET 2.0 toolkit. A working example is provided demonstrating how the ObjectDataSource control can be used with a typed data set to provide search results to a GridView that are filtered based upon search parameters entered into other controls on the same page. The results can be paged and sorted within the grid, demonstrating some of the additional capabilities that the ObjectDataSource control provides.

What is it?
In high-level terms, the ObjectDataSource is described as “…a middle-tier object with data retrieval and update capabilities” and “…acts as a data interface for data-bound controls”. Fancy! What does this really mean? In basic terms, it can be thought of as wrapping up the code necessary to open a connection to a data store, execute a query and return a result set, as well as provide caching, paging, and parameter support.

The ObjectDataSource can be configured to retrieve some data so that other controls don’t have to worry about data store access. The benefit of this approach over having the controls on the page source the data from a database themselves is that it provides a single structured location to define the data retrieval operation and isolates that retrieval, ostensibly making maintenance and extensions to the page more robust and reliable. As a further benefit, being a web server control means that many uses require no compiled code-behind implementation and the configuration of the ObjectDataSource can be changed in the .aspx page directly, eliminating the requirement for recompilation during maintenance.

For those wishing to follow along with additional reference information, the full declarative syntax for the ObjectDataSource control can be found here: http://msdn2.microsoft.com/en-us/library/ms227436.aspx.

For the purposes of this discussion, a simpler implementation path will be taken: the typed data set. Strongly typed data sets were introduced in the .NET 1.0 framework and have been vastly enhanced in the .NET 2.0 framework. I would strongly advise anyone who wrote off .NET 1.0 typed data sets as over-bloated and under-implemented sledgehammers to take a look at the .NET 2.0 typed data set before passing judgment. For ease and efficiency of development and maintenance with little to no performance impact at run-time, typed data sets and the ObjectDataSource control can form one of the most powerful duos in any ASP.NET 2.0 developer’s toolkit. More information on typed data sets can be found here: http://msdn2.microsoft.com/en-us/library/ms228150.aspx.

Basic Syntax
The basic syntax to declare an ObjectDataSource on an ASPX page looks like the following:
<asp:ObjectDataSource ID="srcSearchResults" runat="server"
    SelectMethod="GetData" TypeName="VideoManagerTableAdapters.VideoTableAdapter">
    <SelectParameters>
        <asp:ControlParameter ControlID="ddlGenre" Name="Genre"
            PropertyName="SelectedValue" Type="String" />
    </SelectParameters>
</asp:ObjectDataSource>

There are a few interesting points to note about this basic declaration. The first is the TypeName attribute, which identifies the class used to perform the actual data access operations. This gives the first clue as to how the ObjectDataSource control performs its work. The second is the SelectMethod attribute, which names the method on that class that will be used to perform a select query. When a control using the ObjectDataSource is bound, the method specified in the SelectMethod attribute is invoked on an instance of the class specified in the TypeName attribute, with the declared SelectParameters supplied as arguments. The results of this query are then used by the control that is binding data.
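To make that pattern concrete, here is a rough sketch of what "invoke SelectMethod on an instance of TypeName" amounts to, written with plain reflection. It is a conceptual illustration only, not the control's actual implementation: the VideoLibrary class, its hard-coded data, and the SelectSketch helper are all hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for a data access class such as a TableAdapter.
public class VideoLibrary
{
    public List<string> GetData(string genre)
    {
        // A real class would query the data store here; this sketch
        // filters a small hard-coded list instead.
        string[,] videos =
        {
            { "Sample Video A", "Action" },
            { "Sample Video B", "Comedy" },
            { "Sample Video C", "Comedy" }
        };

        List<string> matches = new List<string>();
        for (int i = 0; i < videos.GetLength(0); i++)
        {
            if (videos[i, 1] == genre)
            {
                matches.Add(videos[i, 0]);
            }
        }
        return matches;
    }
}

public static class SelectSketch
{
    // Conceptually what happens at bind time: create an instance of the
    // type named by TypeName, then invoke the method named by SelectMethod,
    // passing the parameter values as arguments.
    public static object Select(string typeName, string selectMethod, params object[] args)
    {
        Type type = Type.GetType(typeName, true);
        object instance = Activator.CreateInstance(type);
        MethodInfo method = type.GetMethod(selectMethod);
        return method.Invoke(instance, args);
    }
}
```

Calling SelectSketch.Select("VideoLibrary", "GetData", "Comedy") plays the role of the declaration above: the type name and method name are just strings, and the genre value stands in for what the ControlParameter would read from ddlGenre.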

Sourcing Objects
This raises an important question about how the class specified in the TypeName attribute of an ObjectDataSource is defined. Any object that satisfies the requirements of the ObjectDataSource can be specified in this attribute. For more information on the specifics of those requirements and for instructions on how to implement a custom sourcing object, see this article:

What about parameters?
When executing a query to return a result set from a data store, it is often desirable to provide parameters to the query in order to narrow the results that are returned. Imagine a search page on a video library web site that allows a user to specify a genre, director, and title as possible search parameters. When executing the query to return the list of results, any user input provided for those three attributes would need to be passed as well. The ObjectDataSource provides a mechanism for supplying parameters to queries. Additionally, the ObjectDataSource allows a different set of parameters to be provided for each of the types of queries configured on it, but more on this later. The syntax used to supply parameters should be instantly familiar to developers who have worked up through the various releases of ADO, including the popular ADO 2.5 (ADODB) used with VB6 and the even more popular ADO.NET released with the .NET 1.0 framework.

The ObjectDataSource control recognizes seven parameter types. Each of these can be provided to any of the five query types. The following tables list the query and parameter types along with a brief description of each.

Query Type            Description
SelectParameters      The collection of parameters to supply to the SELECT query.
InsertParameters      The collection of parameters to supply to the INSERT query.
UpdateParameters      The collection of parameters to supply to the UPDATE query.
DeleteParameters      The collection of parameters to supply to the DELETE query.
FilterParameters      The collection of parameters to supply for the Filter expression.

Parameter Type Name   Description
ControlParameter      A parameter sourced from another ASP.NET web server control.
CookieParameter       A value sourced from a cookie.
FormParameter         An item of form-post data received from an HTTP post.
Parameter             A literal parameter with either a hard-coded value or sourced programmatically.
ProfileParameter      A parameter sourced from a property of the current user's profile.
QueryStringParameter  A parameter sourced from a query-string value.
SessionParameter      A value read from the session object.

What else can it do?
In a subsequent article, I am going to discuss the additional features and potential pitfalls of the ObjectDataSource control. These include paging, filtering, and sorting, how caching can work, and the most common traps and errors that you might encounter while using the control. For now, it is worthwhile playing with the ObjectDataSource control on a few test pages, examining its syntax and thinking about its application as a middle-tier enabling layer. In essence, that is what the ObjectDataSource is trying to provide: a managed way to employ a middle tier in your solution architecture that handles some of the grey-area items between layers, such as connection string management, parameter marshalling, and the provision of a single point of reference for user interface layer controls.

Working Example
The following section contains a full working example using an ObjectDataSource and a typed data set to bind search results to a GridView on an .aspx page. The ObjectDataSource takes parameters from other controls on the page to provide filtering capabilities to the search.

Creating the SQL database
Open SQL Server Management Studio and connect to a SQL database server. Create a new database by right-clicking the Databases folder and selecting New Database… Enter ObjectDataSourceDemo as the name of the database and click OK. Expand the new database in the object explorer window and add a new table to the database by right-clicking the Tables folder and selecting New Table… Fill out the table definition with the columns VideoID, Title, Director Name, and Genre (the columns used by the query later in this example), being sure to set VideoID as the Primary Key and its Identity Specification (Is Identity) to Yes. Press Ctrl-S to save the table, which will display the Choose Name dialog. Enter Video as the name of the table and click OK.

Open the table by expanding the Tables folder in the object explorer, right-clicking the Video table in the list, and selecting Open Table. Enter a few rows of data in the table and then close SQL Server Management Studio. Remember that VideoID is an identity column and does not need to have data typed into it. An ID will be assigned automatically when you navigate to the next row.

NOTE: In a real solution, the data for directors and genre would not be stored directly as strings within the Video table. For the simplicity and brevity of this example, the contents of several logical tables have been compressed into a single table.

Defining a Typed Data Set
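In Visual Studio, create a new C# web site in a new solution. To add a typed data set, right-click the project in the solution explorer and select Add->New Item to show the Add New Item dialog. Select DataSet from the list of item types, enter a name, and then click the Add button. This example assumes the data set is named VideoManager, which is why the ObjectDataSource later refers to the type VideoManagerTableAdapters.VideoTableAdapter.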

This will open the newly added data set in design mode. From here you can add items to the data set and visually build its definition. For the purposes of this article, we’re going to add a TableAdapter to the data set for use in an ObjectDataSource. For more information on typed data sets and table adapters, see this article: http://aspnet.4guysfromrolla.com/articles/020806-1.aspx

First, right-click the design pane and select Add -> TableAdapter. This will add a table adapter to the data set and open the TableAdapter Configuration Wizard. Create a new database connection to the SQL server and select ObjectDataSourceDemo as the database to connect to.

Click OK on the Add Connection dialog and then click Next to save the connection string to a configuration file. For this example we will just use SQL statements to query the database. Select the radio button marked Use SQL statements and click Next. Type the following SQL into the dialog box and click Next.

SELECT Title, [Director Name], Genre
FROM Video
WHERE Genre = @Genre

Finally, accept the default names for the Fill and Return actions of the TableAdapter and click Next. Click Finish to close the dialog. The new table adapter will be displayed in the design pane and you can now save the dataset.

(The newly created TableAdapter)

Creating the ASPX page
In the solution explorer pane, open the Default.aspx page that was added when the web site project was created. Add the following code between the <form> tags on the page:

<asp:DropDownList ID="ddlGenre" runat="server" AutoPostBack="true">
    <asp:ListItem Text="Action" Value="Action" />
    <asp:ListItem Text="Comedy" Value="Comedy" />
    <asp:ListItem Text="Mystery" Value="Mystery" />
</asp:DropDownList>
<br />
<asp:GridView ID="gvSearchResults" runat="server"
    AutoGenerateColumns="false" DataSourceID="srcSearchResults">
    <Columns>
        <asp:BoundField DataField="Title" HeaderText="Video Title" />
        <asp:BoundField DataField="Director Name" HeaderText="Director Name" />
        <asp:BoundField DataField="Genre" HeaderText="Genre" />
    </Columns>
</asp:GridView>
<asp:ObjectDataSource ID="srcSearchResults" runat="server"
    SelectMethod="GetData" TypeName="VideoManagerTableAdapters.VideoTableAdapter">
    <SelectParameters>
        <asp:ControlParameter ControlID="ddlGenre" Name="Genre"
            PropertyName="SelectedValue" Type="String" />
    </SelectParameters>
</asp:ObjectDataSource>

Running the Sample
While not the most advanced example out there, this demonstrates the use of an ObjectDataSource control as a data source proxy for another ASP.NET web control. The DropDownList control is set to post the page back when the selected item in the list is changed. Upon this post-back, the GridView control is data-bound. It is during this data-binding step that the data source specified for the GridView, in this case our ObjectDataSource control, performs its work and retrieves the search results from the database. It uses the selected value of the genre DropDownList control as a parameter to supply during the search.

Posted in ASP.NET, WebControls

Overloaded Indexers cause Ambiguous Match

Posted by stuartthompson on April 13, 2007

Developing custom web server controls can be a powerful way to provide opportunity for code re-use while retaining strong design-time support and Visual Studio integration.  This is especially useful when developing a solution that will be handed off to a maintenance team as it allows full control of the rendering and behavior of your control whilst retaining ease of use and configuration for the maintenance team at a later date.  One of the common patterns that appears in such server controls is the definition of a child collection of items.  This is akin to the collection property Columns on the System.Web.UI.WebControls.DataGrid control.  When defined in .aspx, it looks like the following:

<asp:DataGrid id="myGrid" runat="server">
    <Columns>
        … column definitions …
    </Columns>
</asp:DataGrid>

This collection is used in the definition of the data-grid to define information about the columns that the control should create when it is rendered.  Similarly, a custom control might have a collection of child items that represent domain objects within your custom solution.

I came across an error in one custom server control I developed that manifested itself as a System.Web.HttpParseException with the error message ‘Parser error: Ambiguous match found’.  The collection property, shown as <ChildItems> in the code below, was highlighted as the source of the error:

<ctl:CustomControl id="myControl" runat="server">
    <ChildItems>
        … child item definitions …
    </ChildItems>
</ctl:CustomControl>

After a little bit of research, I found a Microsoft Knowledge Base article (#823194) that seemed to describe the symptoms I was observing.  The article indicated that the problem was in the definition of the custom collection itself.  If the collection contained an overloaded indexer, then the ASP.NET framework would raise the HttpParseException and display the observed “Parser error: Ambiguous match found” message when trying to instantiate the control in the page hierarchy.

I quickly popped open the code for the collection class and saw that I had indeed added an overloaded indexer as part of a recent update to the control.  I had added the ability to retrieve an item from the collection by item name, as it was useful to another control that aggregated the one exhibiting the error.  There was already an indexer defined for the collection that used an integer to retrieve an item from a specified position.  In terms of C# syntax and for other consumers of the class, having an overloaded indexer is perfectly legitimate; however, testing revealed that it was indeed the new this[string] indexer that was causing the problem when ASP.NET tried to instantiate the control programmatically and deserialize the .aspx definition into that object instance.  After commenting out the indexer and re-running my tests, the problem was gone.  I have since replaced the overloaded indexer with an additional GetBy…() method and updated the documentation accordingly.  This wasn’t a particularly tricky bug to track down, but it’s interesting nonetheless.
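The before and after shapes of such a collection can be sketched as follows. Everything here is illustrative: ChildItem, the two collection classes, the GetByName method name, and the IndexerProbe helper are hypothetical names, not the code from the original control. The probe relies on the fact that C# indexers compile to a property named Item, so a by-name reflection lookup finds two candidates once the indexer is overloaded. Whether the ASP.NET parser performs exactly this lookup is an assumption on my part; the sketch only shows how an overloaded indexer can become ambiguous to reflection.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class ChildItem
{
    public string Name;
    public ChildItem(string name) { Name = name; }
}

// Before: two indexers. Legal C#, but both compile to a property named "Item".
public class OverloadedChildItemCollection
{
    private readonly List<ChildItem> items = new List<ChildItem>();

    public void Add(ChildItem item) { items.Add(item); }

    public ChildItem this[int index] { get { return items[index]; } }

    public ChildItem this[string name]
    {
        get { return items.Find(delegate(ChildItem i) { return i.Name == name; }); }
    }
}

// After: the positional indexer stays, and the lookup by name becomes a method.
public class ChildItemCollection
{
    private readonly List<ChildItem> items = new List<ChildItem>();

    public void Add(ChildItem item) { items.Add(item); }

    public ChildItem this[int index] { get { return items[index]; } }

    // Returns the first item with the given name, or null if none matches.
    public ChildItem GetByName(string name)
    {
        return items.Find(delegate(ChildItem i) { return i.Name == name; });
    }
}

public static class IndexerProbe
{
    // Reports whether looking up the indexer property by name is ambiguous.
    public static bool IsIndexerLookupAmbiguous(Type type)
    {
        try
        {
            type.GetProperty("Item");
            return false;
        }
        catch (AmbiguousMatchException)
        {
            return true;
        }
    }
}
```

The probe reports an ambiguous lookup for OverloadedChildItemCollection and an unambiguous one for ChildItemCollection, while callers of the second class can still retrieve items by position or by name.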

Posted in ASP.NET, C#

CascadingDropDownList and Page Validation

Posted by stuartthompson on April 9, 2007

While working with the AJAX Control Toolkit (http://ajax.asp.net), I came across something interesting with the CascadingDropDownList control (the toolkit’s CascadingDropDown extender) and ASP.NET page validation.

What Does the CascadingDropDownList Do?

The control is used to create tiered drop-down lists that each depend upon parent values for their own data population.  The canonical example given regards using three DropDownList controls to narrow a selection of a car.  The first list displays a list of manufacturers, the second a list of models, and the final a list of common packages, with each successive list control populating only the relevant values based upon its parent.  For example, selecting “Ford” from the manufacturer list would populate “Focus, Sierra, Probe, F-Series, etc…” in the model list.  Selecting “F-Series” would populate “F150, F250, F350, etc…” in the package list.

What’s the Problem?

When the CascadingDropDownList control renders the child lists, it makes an AJAX server request to a web service supplying the selected parent value and requesting the list of relevant child values.  It then uses client-side script to populate the child DropDownList control with the list of returned values. This means that the contents of the child drop-down lists are being modified on the fly by JavaScript and the final values in the list will not match the list that the ASP.NET page thinks it rendered.  The ASP.NET security model contains validation upon a postback to ensure that the contents of a drop-down list being posted back match those that were rendered to avoid injection attacks (see the section below on injection attacks if you want a refresher).

When the page posts back, an exception is raised by the ASP.NET framework because the contents of the originally rendered list don’t match the newly populated list.  ASP.NET provides a way to “permit” a control to be modified on the fly; however, it requires a call to ClientScriptManager.RegisterForEventValidation for each of the additional valid values that might appear.  Since we don’t know (server-side) ahead of time which option the user will choose client-side, we can’t know the list of options that the JavaScript control will add, which prevents us from using that solution.

The Atlas Solution

The only official solution provided by the Atlas team is to disable validation for the page:

http://ajax.asp.net/ajaxtoolkit/Walkthrough/CCDWithDB.aspx

However, even that team admits that this must be done with extreme caution and a complete understanding of the consequences.  It is fine to disable the automatic validation for a page as long as you are validating the received values manually or if you really trust the users of the application not to inject.  (hint: even on intranet apps you never trust the users not to inject because automated bots and viruses running on client machines inside the intranet are becoming advanced enough to post injections to page controls without the user’s knowledge, not to mention the “password on a sticky note on the monitor” hole.)

Why is the Solution a Problem?

The solution can be made to work.  Server-side validation of incoming data from input controls (including DropDownList controls) is a good practice to begin with.  However, it raises some interesting limitations with regard to the standard integration models used by most developers.  If, for example, you are using the <ObjectDataSource> control to take parameters from your page controls and feed their selected values directly into a stored procedure or SQL query, you won’t get the opportunity to manually validate the input.  This means that a control on the page that is used as a parameter in the ObjectDataSource could have an injection string posted back as its value and sent directly to the stored procedure without the opportunity for server-side validation.  This opens up the injection hole and could only be solved by coding some pretty awkward stored procedures.  There is a better way.

Finding a Solution

Things We Know

  • I love the CascadingDropDownList control and want to still use it.
  • To avoid ASP.NET raising validation errors, page-validation must be disabled for the page.
  • Disabling automatic validation and failing to perform manual validation can open a security hole.

Compromises?

  • For the particular page using CascadingDropDownList, don’t use the ObjectDataSource directly.  Bind to the control with the three lines of code it takes in the code-behind file.  You can still use the TableAdapters and the fantastic ASP.NET 2.0 DataSet pattern; you just need to validate the incoming parameters with a few assertions before running off with the input to the database.
  • As a lesser, potentially still dangerous, and (to my mind) somewhat messy solution, you could bullet-proof your stored procedures with validation – but this feels like a future stumble and could get ugly in maintenance, testing, and debugging.  Just write the manual validation code and be done!
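The manual validation suggested in the first compromise can be quite small. The following is a minimal sketch, assuming the posted values are either integer IDs or strings drawn from a known set. PostedValueValidator and its method names are hypothetical, and the code that reads the posted value and calls the TableAdapter is deliberately left out.

```csharp
using System;
using System.Collections.Generic;

public static class PostedValueValidator
{
    // Accepts a posted list value only if it is a positive integer ID.
    // Anything else, such as "1 or 1=1;", is rejected.
    public static bool TryGetId(string postedValue, out int id)
    {
        return int.TryParse(postedValue, out id) && id > 0;
    }

    // Accepts a posted string only if it is one of the values the server
    // already knows to be valid (for example, the list of genres).
    public static bool IsAllowed(string postedValue, ICollection<string> allowedValues)
    {
        return postedValue != null && allowedValues.Contains(postedValue);
    }
}
```

In the code-behind, the idea would be to run the posted value through a check like this first and only call the TableAdapter's query method when the check passes, treating a failure as a bad request rather than passing the value along.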

Injection Attacks You Say?

Without page validation a security hole appears.  Consider the following scenario.  A page is rendered with three items in a drop-down list, arbitrarily the names of three authors.  The ID and name of the author are rendered as the value and text respectively.  Because this is client-side html, the contents of the list can be modified before the page is posted back.  Say you insert the following into the html for the drop-down list:

<option value="1 or 1=1;">Foo</option>

This adds a new item to the list whose text is Foo and whose value is 1 or 1=1;

If the query being run was:

SELECT *
FROM   Author
WHERE  AuthorID = @selectedAuthorID

and the posted value is spliced into the SQL text (rather than bound as a typed parameter), then the executed SQL would now end with:

WHERE AuthorID = 1 or 1=1;

Why is that a Bad Thing?

Well, for one thing, the stored procedure or SQL query will now return every item in the table because OR 1=1 will always evaluate to true.  Secondly, there are far, far worse injections that can be performed depending upon the rights of the user that the web application uses to connect to the database.  Because most applications have a single user account for the entire web application (restricting admin features only by url or declarative security), that account usually has vastly elevated rights.  You can now inject any SQL command after the WHERE clause.  For more information on SQL injection attacks, see http://www.unixwiz.net/techtips/sql-injection.html
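The mechanics are easy to see with plain strings. This sketch assumes the vulnerable code builds its SQL by concatenation; InjectionDemo and BuildUnsafeSql are hypothetical names, and no database is involved.

```csharp
using System;

public static class InjectionDemo
{
    // Unsafe: the posted value is spliced straight into the SQL text, so
    // whatever the client sends becomes part of the statement itself.
    public static string BuildUnsafeSql(string postedValue)
    {
        return "SELECT * FROM Author WHERE AuthorID = " + postedValue;
    }

    // Safer shape: the SQL text is fixed and the value travels separately
    // as a parameter, so it is treated as data rather than as SQL.
    public const string ParameterizedSql =
        "SELECT * FROM Author WHERE AuthorID = @selectedAuthorID";
}
```

Passing the doctored value "1 or 1=1;" to BuildUnsafeSql yields a statement ending in WHERE AuthorID = 1 or 1=1; whereas the parameterized text never changes no matter what is posted. Validating the value on the server before it gets anywhere near the query is still worthwhile in either case.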

Posted in AJAX, ASP.NET