Thursday, April 27, 2006

I haven't bought any IT books for a few months, until today when I picked up a copy of

Practical .NET for Financial Markets ( Apress, Shetty and Jayaswal )
Due to a two hour delay on my train commute home I managed to find time to dive into it.

First impressions: what a great book. I am halfway through chapter one. This chapter has no code or IT in it at all. Instead it details the people and institutions involved in trading in the equities market, how they relate to each other, the lifecycle of a trade and the processes involved.

Future chapters also look to have a heavy dose of business explanation intermixed with the architecture of the code written to solve the business problems, with lots of real world code in C#.

Looks like Yogesh Shetty has a lot of experience in IT for major financial institutions ( C#, SQL Server etc. ). Samir Jayaswal heads a Product Management & Product Development group for treasury and risk management products at 3i Infotech Ltd. My guess is he provides a lot of the business explanation.

Great concept: business and IT given equal status in the book. Chapter 1 is easy to read yet in no way shallow, and I am learning ( or reminding myself of ) a lot. It is very good to have it written by people who earn their living developing the systems they write about.

Also picked up a copy of Pro C# 2005 and the .NET 2.0 Platform by Andrew Troelsen, mainly because all my C# books are pre version 2.0 and this looked like a good one.

Thursday, April 20, 2006

SQL For Developers

Why is it that so many otherwise skilled developers skimp so much on their SQL knowledge ? Databases are a huge part of many applications, yet until a few years ago I was as guilty as anyone of avoiding learning too much about SQL.

I guess most of us know how to string together a basic SELECT, INSERT or DELETE statement, but how many of us reach for a SQL cursor far too often when writing a stored procedure or trigger ?

Any of Joe Celko's books are a great way of getting into more advanced SQL, but the best way is to decide you want to learn about SQL and not just avoid it as much as possible. Often it offers the easiest way to speed up your code.

Some big speed increases can be gained with very minor changes.

I have also seen SQL which checks if there are any unprocessed messages like this :
SELECT COUNT(*) FROM MyMessages WHERE Processed = 0
This SQL is usually built up in a hardcoded string and executed using the relevant objects for that language. Then the program looks at the number returned and checks if it's greater than zero.

It's usually much faster to execute this
SELECT CASE WHEN EXISTS(SELECT MessageID FROM MyMessages WHERE Processed = 0 ) THEN 'YES' ELSE 'NO' END

In the first case the SQL engine has to find and count every matching row, which without a suitable index means reading the whole table. In the second case the SQL engine can stop looking as soon as it finds a single match. Big speed difference.

Telling the database to create an index on the column Processed can also speed this up enormously.
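
The two ways of asking "is there anything left to process ?" can be tried side by side. This is only a rough sketch using Python's built-in sqlite3 module rather than SQL Server, so the SQL dialect differs slightly from the T-SQL above. The helper make_demo_db, the index name and the sample data are all made up for illustration.

```python
import sqlite3


def make_demo_db(unprocessed_ids=()):
    """Build a throwaway in-memory table shaped like MyMessages (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyMessages ("
        "MessageID INTEGER PRIMARY KEY, Processed INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO MyMessages (MessageID, Processed) VALUES (?, ?)",
        [(i, 0 if i in unprocessed_ids else 1) for i in range(1, 1001)],
    )
    # The index mentioned in the post; the name is an assumption.
    conn.execute("CREATE INDEX IX_MyMessages_Processed ON MyMessages (Processed)")
    return conn


def has_unprocessed_count(conn):
    """Count every unprocessed row, then compare the total with zero."""
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM MyMessages WHERE Processed = 0"
    ).fetchone()
    return total > 0


def has_unprocessed_exists(conn):
    """Ask only whether at least one unprocessed row exists."""
    (found,) = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM MyMessages WHERE Processed = 0)"
    ).fetchone()
    return found == 1


if __name__ == "__main__":
    conn = make_demo_db(unprocessed_ids={500})
    print(has_unprocessed_count(conn), has_unprocessed_exists(conn))
```

Both functions give the same answer; the difference is how much work the engine is allowed to skip. On a table this small you will not see it, so treat the sketch as a way to check the logic rather than as a benchmark.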

Also consider putting more SQL into stored procedures rather than hardcoding it into your application. The engine can cache and reuse the execution plan, it lets you reduce the attack surface open to hackers ( the code only needs access to the proc, not the underlying table ), and if you need to change the business logic it's a SQL script to run, not a full application build.

Code that uses SQL cursors typically runs an order of magnitude slower than the same SQL written without them. That means your program runs *much* slower than it needs to. The reason they get used so much is that developers are conditioned to think in terms of looping through a set of results. Cursors can usually be removed by thinking about the problem in terms of processing SETS of data and avoiding loops.

Replacing cursors with SET based operations is not an easy hack, you need to learn to think about the data processing in a different way. But once you can make this shift in the way you think about the problems, the benefit of tearing out those cursors is vast speed increases.
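
To make the shift in thinking a bit more concrete, here is a small sketch of the same change written both ways. It again uses Python's sqlite3 module instead of SQL Server, and the Python loop only stands in for a cursor. The Orders table, the PricePence column and the 10% discount rule are invented for the example.

```python
import sqlite3


def make_orders():
    """Create a tiny hypothetical Orders table, prices held in pence."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, PricePence INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO Orders (OrderID, PricePence) VALUES (?, ?)",
        [(1, 5000), (2, 20000), (3, 10000), (4, 15000)],
    )
    return conn


def discount_row_by_row(conn):
    """Cursor-style thinking: fetch each row, decide, then update it individually."""
    rows = conn.execute("SELECT OrderID, PricePence FROM Orders").fetchall()
    for order_id, price in rows:
        if price > 10000:
            conn.execute(
                "UPDATE Orders SET PricePence = ? WHERE OrderID = ?",
                (price * 90 // 100, order_id),
            )


def discount_set_based(conn):
    """Set-based thinking: describe the set of rows and change them in one statement."""
    conn.execute(
        "UPDATE Orders SET PricePence = PricePence * 90 / 100 WHERE PricePence > 10000"
    )


def prices(conn):
    """Return all prices in OrderID order so the two approaches can be compared."""
    return [p for (p,) in conn.execute("SELECT PricePence FROM Orders ORDER BY OrderID")]
```

The loop issues one statement per row, while the set-based version hands the whole job to the engine in a single UPDATE. The end result is the same table either way.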

A simple example: I would always have thought a cursor was needed to find duplicate rows. Maybe I would have a cursor running through the table and some variables to load the row into. If this row matched what was already in the variables I had found a duplicate. But this SET based SQL runs *much* faster :

-- Show duplicate customer names in customer table
SELECT
CustomerName, COUNT(*)
FROM Customer
GROUP BY CustomerName
HAVING COUNT(*) > 1

If you are running Microsoft SQL Server, try typing the SQL into Query Analyzer, highlighting the code and pressing CTRL-L. This shows you the approach the engine is taking to process the SQL: the indexes and joins it will use and what percentage of time it thinks each part of the operation will take. Often this shows you where changes to the SQL will give the biggest speed increase.

SQL isn't just for DBAs, mastering it will improve your employability and enable you to make big speed increases to your application.

Wednesday, April 19, 2006

What sort of developer are you ?

I guess a lot of people have already come across Microsoft's categorisation of developers into Mort, Elvis and Einstein : http://blogs.msdn.com/ericlippert/archive/2004/03/02/82840.aspx

A Mort is a developer who is more knowledgeable about the business than the IT. They are interested in getting the job done as fast as possible with as little code as possible. Drag and drop and click-wiring up events is the order of the day. Even ( yeuch ) copy and paste coding.

There is a temptation to label all Mort programmers as poor coders, almost hobbyist programmers. But I have come across Mort programmers who can consistently write large amounts of bug free functionality fast. They are also more likely to write functionality the business wants, even when faced with specs that are too high level and ambiguous.

A Mort programmer isn't likely to write the world's most scalable code. But it's often good enough and another programmer can come along and fix a lot of the speed issues afterwards.

I don't count myself as a Mort, though I'm not sure whether I fit the Elvis or the Einstein mould better. Duplicate code makes me feel dirty. I want to see a clear separation of layers between the user interface and business logic. I enjoy crafting a well designed class hierarchy or API ( whilst avoiding the pitfalls of over design and patternitis ). I read books and articles and learn new technologies when they look important.

If I need to enhance a Mortified code structure I am quite likely to make the decision to develop new code in the same style. It doesn't make sense to have inconsistent code that the original developer can't follow. On the other hand if it's "my" code I feel justified in writing Einstein code, as long as the Einstein code is such for a reason and presents an easy to use interface to the Mort code.

But there's no denying Morts have a valid place in development and don't always deserve the implied label of inferiority. They are often close to the business. The client doesn't directly see how well structured the code is, and not all Mort code is more buggy than "proper" code. Maybe one reason is that Morts are content to keep following the same coding pattern without the need to always try out the latest techniques.

Tuesday, April 18, 2006

Assuming you are already using source-code control the next step is automating the build process.

One of my current company's main products is a WIN32 application written in Borland Delphi. It consists of several EXE files and a bunch of DLLs. When I started here and got asked to do a build there were several issues :
  • The builds have always been done on my PC so prudence dictates that this one should also be done on mine ( e.g. in case another developer has a different patch level on one or more of their software components ). But maybe I am halfway through making some changes to some source-code on my local PC. It will be a pain to copy that code somewhere, make sure I build with the required source, then copy my changes back again.
  • I would much rather somebody else could do the build, but without risking a bad build due to different compiler options or patching etc.
  • To do the build there are lots of small projects to manually build, which is time consuming and error prone. I also have to manually set version information on each project.
  • After the build there's more work to do. Create a new SQL script folder in source-control ready for the next version. Add a file NotReleasedYet.txt and another version information changing script to this new folder. My manager will want a list of changes made in this newest build, which means hunting down the changes in emails and copying them into another email.
Nowadays this is how we do a build ( any skynet worker can do this, you don't need any development tools on your PC ) :
  • In "Internet Explorer" open a web-page
  • Type in the version number we want to build
  • Select our email address from a drop-down list
  • Optionally type in our ICQ ( instant messaging ) number
  • Press the relevant button depending on which product we want built
  • Wait for an email confirming the build has started
  • Optionally monitor instant messages describing progress
  • Wait for the email saying the build was successful ( if it failed you will be told that too ). A build-log is attached in case of failure. Otherwise the email tells you where to find the new version and gives a link to a document describing the changes in this version and a link to a directory containing any SQL scripts to run.
  • An RSS feed of software changes is also updated.
  • The new source-control SQL script folder is automatically created, the new SQL scripts are created and added to source-control. The existing source-code is automatically labelled in source-control.
This has saved an unbelievable amount of time. As a matter of routine, after I check in any source code changes I do a test build to check I didn't break the build. It takes me no extra time to do this, I just wait for the success or failure email.

This was done using the software "FinalBuilder", some custom C# code ( including a website and windows-service ) and an old ( slow ) spare server machine.

You may be thinking you haven't got the time to waste setting all this up.
Neither did we. I didn't stop developing to write all this in one go.

First write a basic FinalBuilder project which automates some of the steps, leaving the rest as manual. This will save some time each time you make a build. Every build now saves a little bit of time; use that time to keep increasing the automation until you reach diminishing returns, which I think we now have.

A future post might describe this process in more detail.

Having an automated build puts you in a good position should you want to add more automation :

  • If you are using unit-testing the build process can also run some or all of the tests and email people if the tests failed. This could potentially be done anytime code is checked in.
  • Static code analysis could also be automatically run, e.g. if it's .NET code maybe FxCop could run and email back warnings.
  • If you have an installer then the build process could be expanded to automatically create a new installation, e.g. with InstallShield.

One of my regular reads is Coding Horror and a recent post had a list of things all development teams should do : http://www.codinghorror.com/blog/archives/000568.html

Number 1. Do you use source control ?
I guess most developers say "of course, doesn't everybody ?" but that isn't always the case, especially at smaller companies.

One of the first things I did at my current company was introduce source control when I found source-code was in various places on the network and this directory here "probably" contains the latest version.
  • Updated source code files were zipped up and emailed to other developers.
  • A file comparison tool was used to merge each set of changes into the "master" source.
  • The master source was kept in a directory on the senior developers machine.
  • This could be time consuming and error prone.
  • Sometimes one developer's emailed changes wouldn't make it into the main build.
  • Occasionally somebody would lose their changes and would need to code things again.

Even in a team of one Source-Control can be very very useful.
  • If used properly it allows you to go back to an earlier, known good, version of the code.
  • It lets you apply a fix to an older version of the code if the newer code hasn't gone through testing yet but you need a fast release.
  • You can see when that nasty bug was added to the source code ( and who was to blame ! ) and more importantly identify what could have prevented this bug arising in the first place.
  • It's step one on the way to being able to create builds at the press of a button.
We started off with "Microsoft Visual SourceSafe" and later moved to a tool called "Team Coherence". "Team Coherence" has some nice features like being able to do a search by filename ( Yes I know, but believe it or not VSS didn't have this ! ), see a source file's history as you highlight it, and work faster over an internet connection.

Even if politics means you can't force all developers to use the source control system, it's worth using. As long as somebody checks in all changes you get the benefits. As those benefits become more visible and obvious it may become easier to get everybody checking in their own changes.

And as I mentioned, without source control you will find it difficult to add automated builds and move towards continuous integration ( if that's your bag ).

Browsers have been around for a few years now, and it's bad that some things most developers need are still so difficult/kludgy. Why is it still a pain to tell the difference between the user closing the browser and using the back/forward history buttons ? Yes, I recently tangled with OnBeforeUnload.

Microsoft ( rightly ) get castigated because IE is not the world's most CSS compliant browser. But in this case the problem is that the standards don't seem to address a common and simple development need.

When somebody logs on to our software a row gets created in a SQL table. They logout and the row is deleted. If they are already logged on they can't login again as the same user. Instead an administrator needs to go in and unlock them. This is part of our licensing system. An issue comes in the ASP.NET version of one of our applications: the user can just close the browser instead of clicking logout. So we needed to make the software logout the user however they exit the browser ( short of killing the task in task manager ). My manager, not unreasonably, expected this to be a trivial thing to code. Well it's not rocket science but there are way too many steps here.

First port of call was the JavaScript event OnBeforeUnload. A quick Google tells me that attaching code here isn't foolproof and that it must run fast, otherwise it won't get to finish running. So I decide to go for an asynchronous AJAX type call.

I have used AJAX.NET before but this project hasn't got any other need for AJAX calls at present, and downloading the latest version of AJAX.NET shows that it needs more changes to web.config than the earlier version I had used. It seemed silly to do all that and add another DLL just for this.

So back to basics: a JavaScript include file with a function which manually creates the XMLHttpRequest object and calls a webpage, signout.aspx. Create the new webpage and edit its page_load method to retrieve the current userID from the session state and use SqlConnection and SqlCommand objects to do the deletion.

Added this JavaScript call in the OnBeforeUnload event of a body tag in the default.aspx page to test. Works great. Erm, hang on a minute, it also unlocks the user if they click on a link on the page, or use back, or use forwards. Oops. Another Google shows that yes, this really is the best event to use.

OK I can fix the logout when the user clicks on a link by attaching some JavaScript to the body tag's OnClick event. It sets a global ( yeuch ) variable to true which tells my main JavaScript function to ignore the next logout request.

OK that works but what about back/forward ? ( And what about that link which renders a Crystal Report as PDF ? That page isn't even HTML, it's pure PDF, so no JavaScript can be attached. )

OK I could handle keypresses, look for the backspace or forward keys and set the global ignore-logout-request variable. This gets worse. And it still doesn't help if the user clicks the browser's back button.

The change of thinking came with the realisation that I cannot have my logout code fire only when the user closes the browser window ( installing browser addons isn't an option because it's horrible and because our application users might be in an internet cafe ).

So instead I decide to live with the fact that the user record will get deleted if the user uses the history feature. To counter this each page must re-add the record if it's gone. So now the logon process stores all the information it needs to reinsert this record in session state. Each page's page_load method checks for this missing record and re-adds it if needed. In the extremely unlikely event that the user has meanwhile logged on with the same username elsewhere, e.g. in the WIN32 application, the browser has to tell them this and prevent further actions until they logout of there.
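
The check each page makes can be sketched roughly as follows. This is not the real code ( that lives in C# page_load methods against SQL Server ); it is a Python and sqlite3 stand-in for the logic only. The table name LoggedOnUsers, its SessionID column and the function names are all assumptions made for the example.

```python
import sqlite3


def make_logon_table():
    """Create a hypothetical logon table with one row per logged-on user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE LoggedOnUsers (UserName TEXT PRIMARY KEY, SessionID TEXT NOT NULL)"
    )
    return conn


def ensure_logon_record(conn, user_name, session_id):
    """Re-add the logon row if it has gone missing.

    Returns True if this session may carry on, False if the same user
    is now logged on under a different session.
    """
    row = conn.execute(
        "SELECT SessionID FROM LoggedOnUsers WHERE UserName = ?", (user_name,)
    ).fetchone()
    if row is None:
        # The unload handler deleted the row (back/forward/link), so put it back.
        conn.execute(
            "INSERT INTO LoggedOnUsers (UserName, SessionID) VALUES (?, ?)",
            (user_name, session_id),
        )
        return True
    # A row exists: fine if it is ours, otherwise somebody else holds the logon.
    return row[0] == session_id
```

Each page would make this one call with the values the logon process saved in session state. A production version would also need to cope with two requests racing between the SELECT and the INSERT, which this sketch ignores.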

That just leaves the PDF report page which has no way of calling JavaScript code.
In this case the logon record is purposely deleted as this page is loaded. So if they exit the browser while displaying the report all is OK, they are already logged out. If they go back to the previous page then page_load on that page takes care of putting the record back.

So everything worked out but boy that was a lot of hoops to jump through !

It introduces a maintenance issue in that any new webpages added must make this ( one line ) call to ensure the user record is there. Maybe I could do away with this by adding a descendant of the Page class and having all my pages descend from there, or by adding the logic into the HTTP pipeline elsewhere, but this way is at least easy to understand when reading the code and it works.

This is the first entry in my blog. What to put in a blog ? Firstly nothing that might come back and bite me. Information might want to be free but anyone ( like future potential employers ) can and will Google a name and take first impressions from what it returns. Call me paranoid but this seems scary. This seems to rule out politics and religion. Nobody would want to read a diary of what I do in my spare time or about friends/family, so IT it is.

I work as a developer for a small software house in the City of London, very near St Paul's. Before that I worked for Enron. I have also worked for a dotcom ( during the dotcom boom ) and another software house in the City. The languages I currently use on a daily basis are Borland Delphi, T-SQL, C# and JavaScript.