Monday, March 30, 2009

Virtual Windows Clusters

I have been using Virtual Server R2 for almost two years now and it has proved quite useful as a testing environment. I was able to use it to create both a Windows 2003 and a Windows 2008 cluster, which I have used for testing and documentation purposes. There is an excellent tutorial from Microsoft on creating a two-node cluster using Virtual Server which describes creating a Windows 2003 cluster in great detail. I used this to create my first test cluster and as a starting point for creating a Windows 2008 cluster.

Since you will be creating several VMs for your virtual cluster environment I recommend creating a Parent Virtual Machine, as described in the Microsoft tutorial. Make sure you apply all the updates, security patches and the virtual machine additions. You will need to update Virtual Server to get the latest additions, which support Windows Server 2008. You should then run sysprep on the image to ensure that any VMs created from this base image will contain unique security information.

You can create an answer file before performing the sysprep which will greatly simplify the install process. By running sysprep with an answer file you can completely automate the final steps of the install process. Therefore when you create new VMs based on the parent disk it will be like starting up a brand new machine from right out of the box. I chose to set my license key, timezone, screen resolution and networking setup. So when I create new VMs everything is automatically set to my base settings.

Microsoft supplies a graphical tool which helps create an answer file. For Windows 2003 you can use SetupMgr.exe which will walk you through generating an answer file. You can get the latest sysprep update for Windows 2003 server here. For Windows 2008 I recommend getting the Windows Automated Installation Kit (AIK) which you can get here. It is a bit more difficult to use than setup manager but it is worth the trouble.

Once you have run sysprep on the parent disk go to the vhd file and mark it read-only. You can now use this disk as a parent disk for new virtual machines. This is done by using a Differencing Disk which creates a unique disk based on the specified parent. This saves both a lot of time and disk space.
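To make the idea of a differencing disk concrete, here is a minimal conceptual sketch in Python. It is not how Virtual Server stores VHD files; the class names and the block-level model are assumptions made purely for illustration. It only shows why a read-only parent plus per-VM differences saves space.

```python
class ParentDisk:
    """Read-only base image, modelled as block number -> data (toy model)."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        # Blocks that were never written read back as empty data here.
        return self._blocks.get(block_no, b"")


class DifferencingDisk:
    """Stores only the blocks that differ from the parent."""

    def __init__(self, parent):
        self.parent = parent
        self.changed = {}

    def write(self, block_no, data):
        # Writes never touch the parent, which is why it can be read-only.
        self.changed[block_no] = data

    def read(self, block_no):
        # Prefer the child's own copy, otherwise fall through to the parent.
        if block_no in self.changed:
            return self.changed[block_no]
        return self.parent.read(block_no)


base = ParentDisk({0: b"boot", 1: b"windows"})
node1 = DifferencingDisk(base)
node2 = DifferencingDisk(base)
node1.write(1, b"windows+node1 settings")
```

In this sketch both nodes share the parent's blocks, and only the single block changed on node1 takes up extra space.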

With your parent(s) created you are ready to build your cluster environment. To create a cluster you will need the following items:

  • A domain controller
  • Shared storage
  • Two Networks
  • Two VMs for nodes
  • Win 2003 Server Enterprise or Win 2008 Server Enterprise

All nodes in a Windows cluster must be members of a domain so you must have a domain controller. To keep everything in the same place I configured a VM running Windows 2003 Server as my domain controller. I use this domain controller for both the Windows 2003 and the Windows 2008 cluster. I also configured three networks: Public, Private and Storage. These networks run in Guest Only mode so they are not accessible to other machines. This isolates my test network, making it easier to manage.

The Public network is used to communicate with the cluster and its applications. The Private network is used for communication between cluster nodes; the 2003 and 2008 clusters run on separate subnets. The Storage network is used for iSCSI access from the Windows 2008 cluster. Although this is not strictly necessary, it is recommended that a dedicated network be used for the shared storage.

Both Windows 2003 and Windows 2008 require some type of shared storage to work correctly. Virtual Server allows up to two VMs to connect to a shared drive using a virtual SCSI controller. This provides basic functionality for a Windows 2003 Cluster but only allows for a single shared drive. Windows 2008 Clustering does not use this shared SCSI system so you must use an iSCSI solution instead. Windows 2008 includes an iSCSI initiator which allows the server to connect to shared storage over the network. However, you will need additional software to create an iSCSI target. This is included with Windows Storage Server which is an embedded solution from Microsoft and difficult to obtain. I used StarWind software to create iSCSI targets on my Domain Controller that I could use from the new cluster. It is a very lightweight download and simple to install and configure. There is a free version which limits the disk size to 2GB but is perfectly adequate for testing.

Windows 2003 Enterprise includes the Cluster Manager by default. For Windows 2008, however, you need to add the Failover Clustering feature to both VMs that will serve as cluster nodes. This can be done from the Initial Configuration Tasks dialog under Customize This Server. Once the Failover Clustering feature has been installed you create the cluster by going to Control Panel->Administrative Tools->Failover Cluster Manager.

Before creating a new cluster verify that the shared storage has been initialized and formatted. You will also have to configure each node to access the shared storage. Until the cluster is completely installed you will need to configure the storage with only one of the nodes running at a time.

Now that all the prep work is done you are ready to install your cluster. Boot up node 1 and start either the Cluster Manager or Failover Cluster Manager. This will start the Cluster Wizard which will walk you through creation of your cluster. The Failover Cluster Wizard will give you an option to run the Validate a Configuration Wizard. I recommend doing this to verify that you have set up your VM environment correctly. After you have successfully configured the first node, start your second node and add it to the cluster using the Cluster Manager.

Now that you have a virtual cluster environment set up you can install Advantage.

Friday, March 27, 2009

Configure Advantage on a Windows 2003 Cluster

Advantage Database Server supports Windows clustering on Windows 2003 Server Enterprise. I described Failover Clustering using Windows 2008 Server in a previous post. Although the Advantage help file has a very complete set of instructions for Running Advantage on a Windows Cluster, I thought I would include some screenshots and some additional thoughts.

Advantage Database Server needs to be installed on each node of the cluster. Additionally you should configure a shared data drive, accessible to all nodes of the cluster, that will store the data and Advantage log files. This shared drive is in addition to the Quorum drive used by the cluster. There are also several Advantage configuration settings you must make for it to work properly in a cluster environment. I have included just the outline here; there are screenshots and more details in the Windows 2008 Cluster post.

NOTE: The Use Clustering configuration parameter is only used for NetWare clustering.

Windows 2003 Clusters use Application Groups to define which applications and services can be moved throughout the cluster. You create an application group from the Cluster Administrator. Simply right-click on the cluster you wish to add an application to and select Create Application Group. This will bring up the Cluster Application Wizard.

2K3_App

Next you create a new Virtual Server and a new Resource group. Specify a name for the resource group (I chose Ads2k3Cluster); there is also an area for entering a description.

Resource Group

The next step is to provide information on how to access the virtual server which allows clients outside the cluster to access Advantage. This consists of a network (NetBIOS) name and an IP address. I used the same name I specified for my resource group and specified a new virtual IP address.

Virtual Server

You do not need to modify any of the Advanced Properties. Clicking Next will take you to a confirmation screen. Select Yes, create a cluster resource for my application now and click Next. The next screen will prompt for an Application Resource type; choose Generic Service and click Next.

Resource Type

Specify a resource name on the next screen; there is no need to configure any advanced properties. Clicking Next will bring up the Generic Service Parameters. Use advantage for the service name. You can specify any valid startup parameters but none are necessary.

Service Parameters

You may specify registry keys which will be copied to every node in the cluster ensuring that all nodes use the same configuration settings. The last page displays a summary of the new application group. Clicking finish will create the new group.

Configuration Complete

After the cluster group has been configured you need to open your new resource and add some dependencies. Click on the resources folder in the Cluster Manager. Then select the resource you just created, right-click and choose properties.

Configure Resource

You will need to add the IP Address, Network Name and the Data Drive. This will ensure that all of these resources are available when Advantage starts. Remember to store your log files on the shared data drive. You will also store your adsserver.ini file on this drive to ensure that the server-side aliases remain consistent when the Advantage service is moved between nodes.

You are now ready to bring the Advantage application group online. Remember to configure the Windows firewall to allow UDP traffic for the Advantage communication port (6262 by default). Since you are using server-side aliases and not network shares you will have to turn off rights checking using the ignore rights option on all clients.

Now all that is left is the testing. Verify that you can connect to Advantage from a machine outside the cluster. Then use the Cluster Manager to move the Advantage service to another node and test connectivity again. Congratulations, you now have a working Advantage cluster.

One final note: Advantage clients will not automatically reconnect in the case of a node failure. The failover cluster will automatically start Advantage on another node but the clients will have to restart their applications.

Wednesday, March 25, 2009

Configure Advantage on a Windows 2008 Cluster

Advantage Database Server includes support for Windows Clustering, which is now called Failover Clustering. You must use the Enterprise version of Windows 2008 Server in order to use clustering. Additionally you will need to use a Storage Area Network (SAN) with either a Fibre Channel connection or an iSCSI connection. You can get more information on Windows 2008 Clustering here.

The first step is installing Advantage on every node in the cluster. You will need to specify some options for this to work properly. First make sure that the Suppress Message Boxes option is checked. This ensures that any messages requiring user input from the server are not displayed. If a message is displayed by Advantage (e.g. users still connected) the service may seem to hang when a shutdown command is sent. This option is on the Misc. Settings tab of the Advantage Configuration Utility.

Misc Settings

You will also want to change the error log and transaction log paths to a shared storage location. This ensures that all nodes of the cluster write to the same error log. It also ensures that transactions are rolled back in the event that the node running Advantage fails. The node that takes over the Advantage service will restore all the data back to the state it was in prior to the uncommitted transactions. These paths are defined on the File Locations tab.

File Locations

The final configuration is to set the startup type to Manual to ensure that Advantage is only running on one node at a time. Once Advantage has been installed and configured on each of the nodes it is time to add Advantage as a clustered service. This is done through the Failover Cluster Manager snap-in.

Cluster Management

Choose Configure a Service or Application to open the High Availability Wizard, then choose Generic Service. Select Advantage Database Server from the list of available services. In my case it was at the bottom of the list. Next you must create a client access point; this is how other machines on the network will access the service. You provide both a NetBIOS name and an IP address for the service.

ClientAccess

Next you specify a storage device for the service. Control of this storage device will move with the service, ensuring that all information needed by the service is available to the node running Advantage. Make sure you select the device with the drive letter you chose when configuring the error log and transaction log paths. All of the data accessed by Advantage needs to be stored on this device as well. You can choose to replicate registry settings so that every node has the same values. Use the Advantage server registry key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Advantage

Once all of these steps have been performed you will see a confirmation screen. Verify that all the settings are correct and click Next. The Service will be created and started. You can now view a summary of your new service.

Summary

Now that the cluster and the Advantage service have been added it is time to configure your data location. Because this is a clustered environment network shares are problematic, so you should use server-side aliases instead. The adsserver.ini file, which contains the aliases, is stored in the same location as the error log. This means that it will be available to each node of the cluster, ensuring that the data path does not change regardless of which node Advantage is running on. Remember that you will have to turn off rights checking using the ignore rights option.

Now all that is left is the testing. Verify that you can connect to Advantage from a machine outside the cluster. Then use the Failover Cluster Manager to move the Advantage service to another node and test connectivity again. Congratulations, you now have a working Advantage cluster.

One final note: Advantage clients will not automatically reconnect in the case of a node failure. The failover cluster will automatically start Advantage on another node but the clients will have to restart their applications.

Monday, March 23, 2009

Book Review – The Non-Designer’s Design Book

The Non-Designer’s Design Book: Design and Typographic Principles for the Visual Novice by Robin Williams (not to be confused with the comedian) is an easy-to-read book about page layout. While most of the book discusses traditional print media (e.g. business cards, flyers, brochures, mailers) there are many references to Web layout. I also think many of the design principles discussed can be applied to application user interfaces.

The book opens with the premise that once you know what to look for you will begin to see it all around you. Therefore by learning good design principles you will be better able to describe why something has a good design. We already intuitively know whether a design is appealing or not but perhaps we can’t pin down exactly why. By learning the design principles in this book we can learn how to avoid poor design in our own creations.

Ms. Williams breaks down design by introducing the four basic principles of design: Contrast, Repetition, Proximity and Alignment. The author leaves it up to the reader to invent their own easy-to-remember acronym for these principles. Each principle is discussed in its own chapter. I found myself experiencing several “ah-hah” moments while reading through these chapters. There are many examples throughout the book, first with poor design and then a re-design based on the principle being discussed. As they say, “seeing is believing.”

Although these principles seem rather straightforward they are not always applied. Contrast is not simply the use of colors (e.g. black on white) but the use of size, weight and font. The key to using contrast is to be bold. If you pick a different font for your titles make sure that it is a distinctly different font. When changing size ensure that the difference is obvious so that attention is called to the point you are trying to make.

Repetition in design is not about repeating yourself; it is about providing a consistent look throughout your design. This can be easily seen in magazine and newspaper articles. Each article has a headline and a body and the pages demonstrate a consistent layout throughout the publication. This can be applied in your applications as well. If you have several forms throughout your application try to make sure they have a consistent look. For instance, if your data entry forms have a toolbar, use the same toolbar for all data entry forms.

I have been guilty of misusing alignment quite often. Ms. Williams feels that centered text is used too often and generally distracts from the message you want to convey. Essentially pick an alignment and stick with it to improve consistency. Her suggestion is to take a look at your professionally designed business card and notice how all the information is aligned.

Proximity is grouping related items together. I think proximity is well used by the development community. Just think of all the “standard” controls we have for grouping, such as group boxes, panels and tab controls. Proximity is where we shine as developers. Through our menus, listboxes, comboboxes and other controls we demonstrate proficiency in this principle. The act of grouping things together seems to come naturally to a developer.

These principles work together to provide the overall effect. Alignment can be used to provide repetition, by using the same alignment for all your text. Alignment can also be used as contrast, by making one block of text centered or right-justified while everything else is left-justified. Similarly proximity can be achieved using alignment or contrast. Repeating the same alignment or font groups ideas together.

There is a very good chapter which discusses the use of color in your design. It ranges from an introduction to how colors are created (CMYK vs. RGB) to how to determine which colors go together, where a color wheel can be a useful tool. I found the discussion of shades, tints and hues to be very useful.

There are exercises throughout the book which help reinforce the concepts of the chapters. There is also a full chapter of hints and tips which illustrates proper use of the four design principles described in earlier chapters. The “do this not that” approach clearly demonstrates how the application of the basic principles makes a tremendous impact on the final result.

There are three chapters dedicated to type, which is a very useful tool in the design world. Although our type choices are a bit more limited in the development world (we are at the mercy of the typefaces installed by our users), I still think there is a lot of good information in this section. There is a very thorough discussion of what makes typefaces different (e.g. serif vs. sans-serif) and when each should be used. The versatility of type is demonstrated as each of the four principles can be used with your typeface.

The bottom line: I found this book both informative and easy to read. Ms. Williams gives a very easy-to-follow introduction to design and demonstrates the four basic principles using many examples. There is good advice on putting together common materials such as business cards, brochures, advertising and web pages. Many of the principles can be applied to the design of your user interface as well.

Wednesday, March 18, 2009

Database Replication – Part IV

In this post I will discuss handling replication conflicts. A replication conflict can occur if a replicated record was changed since the last time it was replicated to the subscriber. This is most likely to occur when replication to the subscriber has been delayed due to a connection issue or other replication error. The default behavior with Advantage is a “last in wins” approach and the record will be updated to reflect the latest changes.

  • Part I – Overview and Server to Server strategies
  • Part II – Multiple server strategies
  • Part III – Replication filtering
  • Part IV – Replication conflicts

Let me start with a bit of background. I refer to the source database as the publisher, the target database as a subscriber and the item to be replicated as an article. Horizontal filtering limits the records through the use of a filter expression. Vertical filtering specifies the fields to be replicated. Although these strategies can apply to database replication in general, some of the information I discuss is specific to Advantage Database Server. You can get an overview of Advantage replication in our online help files.

What Causes a Conflict

In order for a conflict to occur the same record must be modified by two or more people before a replication occurs. For example, someone at the corporate office updates a customer’s contact information and that same customer’s information is updated at a branch office at the same time. In this case the record at the branch office does not contain the same data as the original record at the corporate office.

Conflicts rarely occur in an environment where a connection is always available, allowing changes to be replicated almost instantly. The chance of conflicts rises when the connection is not available and many changes are waiting to be replicated. Conflicts are also likely when replication is used by disconnected users (e.g. traveling salespeople) who update information that was also changed while they were disconnected. In these situations conflicts can be minimized by limiting changes to the same records.

Replication Internals

When a record is updated Advantage first computes a CRC of the original record data (NOTE: ModTime and RowID fields are not included in the checksum). If there is a horizontal filter defined it is checked to verify that the record needs to be replicated. If the record passes the filter or no filter is defined it will be added to the replication queue. A thread is then scheduled to process the replication queue.

If a CONFLICT trigger is defined on the table in the subscriber database a CRC for the target record will be computed. This value will be compared to the CRC value that was calculated on the original record by the publisher. If these values match then there is no conflict and the update is applied to the subscriber. If the values are different the conflict trigger will be executed.
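The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Advantage's implementation: the record layout, the way fields are serialized before the checksum, and the function names are all assumptions. Only the exclusion of ModTime and RowID and the match/mismatch behavior come from the description above.

```python
import zlib

# Per the note above, these fields are left out of the checksum.
EXCLUDED_FIELDS = {"ModTime", "RowID"}


def record_crc(record):
    """CRC over the record's data fields (the serialization is an assumption)."""
    parts = [f"{name}={record[name]!r}"
             for name in sorted(record) if name not in EXCLUDED_FIELDS]
    return zlib.crc32("|".join(parts).encode("utf-8"))


def apply_replicated_update(subscriber_row, original_crc, new_values,
                            conflict_trigger=None):
    """Apply an update at the subscriber, or hand it to the conflict trigger."""
    if conflict_trigger is not None and record_crc(subscriber_row) != original_crc:
        # The trigger runs instead of the normal update.
        conflict_trigger(subscriber_row, new_values)
        return "conflict"
    # No trigger defined, or the CRCs match: the update is simply applied.
    subscriber_row.update(new_values)
    return "applied"


publisher_original = {"RowID": 7, "Name": "Acme", "Phone": "555-0100", "ModTime": "09:00"}
original_crc = record_crc(publisher_original)

# Differs only in an excluded field, so there is no conflict.
in_sync = {"RowID": 7, "Name": "Acme", "Phone": "555-0100", "ModTime": "09:05"}
# Edited locally at the subscriber, so the CRCs differ.
edited = {"RowID": 7, "Name": "Acme", "Phone": "555-0199", "ModTime": "09:05"}

conflicts = []


def log_conflict(row, new_values):
    conflicts.append((dict(row), dict(new_values)))


result_in_sync = apply_replicated_update(in_sync, original_crc, {"Phone": "555-0123"}, log_conflict)
result_edited = apply_replicated_update(edited, original_crc, {"Phone": "555-0123"}, log_conflict)
```

Without a conflict trigger the same call always applies the update, which corresponds to the default "last in wins" behavior.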

Replication can use either a primary key or some subset of the searchable fields in the table to uniquely identify records at the subscriber. When conflicts are expected, the primary key should be used to identify the replicated records. Otherwise, if one or more of the searchable fields being used to identify the record have changed at the target, the record will not be found. This will cause an error which will need to be resolved before replication can continue.
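The following Python sketch shows why the primary key is the safer identifier when the subscriber's copy may have been edited. The table layout and the helper function are hypothetical and exist only to illustrate the lookup problem.

```python
def find_row(rows, key_fields, key_values):
    """Return the first row whose key_fields match key_values, else None."""
    for row in rows:
        if all(row[field] == key_values[field] for field in key_fields):
            return row
    return None


# The subscriber's copy, where Phone was edited locally.
subscriber_rows = [{"CustID": 42, "Name": "Acme", "Phone": "555-0199"}]

# The record as it looked at the publisher before its own update.
publisher_before = {"CustID": 42, "Name": "Acme", "Phone": "555-0100"}

# Identified by primary key: still found despite the local edit.
by_primary_key = find_row(subscriber_rows, ["CustID"], publisher_before)

# Identified by searchable fields that include the edited one: not found.
by_fields = find_row(subscriber_rows, ["Name", "Phone"], publisher_before)
```

Here by_primary_key is the subscriber's row, while by_fields is None, which is the situation that stops replication until it is resolved.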

Conflict Triggers

A conflict trigger behaves like an INSTEAD OF trigger, meaning the specified operation is performed rather than the normal update operation. Updates are the only events that raise a conflict trigger. Inserts and deletes occur regardless of whether the subscriber has changed the specified record.

The conflict trigger is written just like any other Advantage trigger as either an SQL script or Advantage Extended Procedure. This allows for a vast array of functionality in handling the conflict. An example conflict trigger can be found in the following Advantage tech-tip.

Regardless of how you choose to handle the conflict the changes made by the conflict trigger at the subscriber will not be reflected at the publisher. This is true even if the subscriber publishes its changes to the original publisher of the change. Since the conflict trigger runs as part of a replication event it is not pushed back to the publisher which would cause a loop condition.

Monday, March 16, 2009

Database Replication – Part III

In this post I will discuss using filters to limit the data that is replicated. Filters can be applied to limit the number of records, limit the fields or a combination of these. With Advantage these filters are defined on the publication, so all subscribers to the publication will have the same filters applied. If you need specific filters for each subscriber you will need to create a publication and a subscription for each subscriber at the publisher.

  • Part I – Overview and Server to Server strategies
  • Part II – Multiple server strategies
  • Part III – Replication filtering
  • Part IV – Replication conflicts

Let me start with a bit of background. I refer to the source database as the publisher, the target database as a subscriber and the item to be replicated as an article. Horizontal filtering limits the records through the use of a filter expression. Vertical filtering specifies the fields to be replicated. Although these strategies can apply to database replication in general, some of the information I discuss is specific to Advantage Database Server. You can get an overview of Advantage replication in our online help files.

Example Scenario

To demonstrate filtering we will use an example scenario. We have a corporate office which needs to run reports on consolidated data from all the stores. The corporate office provides pricing information to each of the stores. To meet these requirements we will use a spoke and hub model as shown below.

 Example Scenario

In our example these are convenience type stores which can sell gasoline along with various snacks and groceries.

Horizontal Filtering

In the example described above our stores sell a variety of convenience items, however not all stores sell gasoline and other automotive products. Since product information is updated at the corporate server we will filter the products based on the type of store. This is an example of a horizontal filter since the data is being limited by a condition.

In this simple example the products table has a type column which specifies which type of store stocks the item. This allows us to limit the data that is replicated to the stores based on the value in this column. An example publication for a type 1 store is shown below.

Horizontal Filter

Horizontal filters are applied using any valid expression, making them very powerful. Any record that meets the conditions in the expression will be replicated to all subscribers of the publication.
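As a rough illustration of what a horizontal filter does, here is a small Python sketch. It does not use Advantage's filter expression syntax; the sample rows, the column names and the helper function are assumptions chosen to match the store-type example.

```python
# Hypothetical products table; Type says which kind of store stocks the item.
products = [
    {"ProductID": 1, "Name": "Coffee", "Type": 1},
    {"ProductID": 2, "Name": "Motor oil", "Type": 2},
    {"ProductID": 3, "Name": "Chips", "Type": 1},
]


def horizontal_filter(rows, predicate):
    """Keep only the rows for which the filter expression is true."""
    return [row for row in rows if predicate(row)]


# Rows that would be replicated to subscribers of a "type 1" publication.
type1_rows = horizontal_filter(products, lambda row: row["Type"] == 1)
```

Every column of a matching row is still replicated; only the set of rows is reduced.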

Vertical Filtering

Another requirement of our example is to provide employee information to all the stores. However, not all of the information in the employee table is required at the store level, for instance social security numbers and home addresses. Replicating the essential data without the personal information can be done using a vertical filter.

A vertical filter specifies which columns are replicated from a particular table. The subscriber’s table only needs to have the replicated columns defined. Therefore the table structures can be different, although the fields which are replicated must have the same properties as in the publisher’s table.

An example of the vertical filter for the employee table is below:

Vertical Filter

It is important to remember that all filters are defined in the publication and will apply to all subscribers of the publication. Filters are article (table) specific and both types of filters can be applied to an article.
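A vertical filter, and the combination of both filter types on one article, can be sketched in Python as follows. The employee columns, sample values and the filter_article helper are hypothetical; this is a conceptual model rather than Advantage's publication definition.

```python
def filter_article(rows, predicate=None, columns=None):
    """Apply an optional horizontal filter, then an optional vertical filter."""
    selected = [row for row in rows if predicate is None or predicate(row)]
    if columns is None:
        return [dict(row) for row in selected]
    # Vertical filter: copy only the listed columns.
    return [{col: row[col] for col in columns} for row in selected]


employees = [
    {"EmpID": 1, "Name": "Pat", "StoreID": 10, "SSN": "000-00-0001", "HomeAddress": "1 Main St"},
    {"EmpID": 2, "Name": "Lee", "StoreID": 20, "SSN": "000-00-0002", "HomeAddress": "2 Oak Ave"},
]

store_columns = ["EmpID", "Name", "StoreID"]

# Vertical filter only: every employee, without the personal columns.
store_view = filter_article(employees, columns=store_columns)

# Both filters on the same article: store 10's employees, store columns only.
store10_view = filter_article(
    employees, predicate=lambda row: row["StoreID"] == 10, columns=store_columns
)
```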

In the next part of this series I will discuss replication conflicts.

Friday, March 13, 2009

Database Replication – Part II

In this post I will discuss multiple-server replication strategies. These strategies meet more of the requirements found in the real world. These scenarios can be very useful for load balancing, updating common information, providing a central repository and allowing multiple locations to have the same data. Other articles in this series are listed below.

  • Part I – Overview and Server to Server strategies
  • Part II – Multiple server strategies
  • Part III – Replication filtering
  • Part IV – Replication conflicts

Let me start with a bit of background. I refer to the source database as the publisher and the target database as a subscriber. Although these strategies can apply to database replication in general some of the information I discuss is specific to Advantage Database Server. You can get an overview of Advantage replication in our online help files.

Multiple One-Way Scenario

There are essentially two ways that this scenario can be implemented: a single publisher can have multiple subscribers, or multiple publishers can replicate to a single subscriber. This scenario applies if the changes made at the subscribers are not needed by the central server.

In the first case a central server pushes changes out to multiple subscribers. For example the corporate office sends catalog updates to individual stores. This can also be used for load balancing with all changes being applied to the single publisher and then queried at the subscribers. This ensures that the users get the same data regardless of which subscriber they are connected to.

Central Publisher

The second case could be used as a central backup or reporting server. Each of the publishers pushes its changes up to a central server. This provides a consolidated database of all activity across all locations. For instance, a group of retail stores replicate their sales data to a corporate office, which can then run sales reports that include data from all locations.

Central Subscriber

Multiple Two-Way Replication

When all locations need to have the same data, two-way replication is required. Keep in mind that the more servers involved in the replication scenario, the greater the likelihood of conflicts. There are two basic approaches to this situation: each server is a publisher/subscriber to every other server, or each server is a publisher/subscriber to a central server.

When a central server is not used each server will need to have n-1 replication partners. With 4 servers each server will have 3 subscribers and be a subscriber to 3 publishers. This solution can get complicated very quickly since a connection between each pair of servers is necessary to push changes. Additionally, an extra user license is required for each publisher the server is subscribed to.

Multiple Publisher Subscribers

You can simplify this scenario by using a Spoke and Hub model. In this model each location replicates to a central server which uses forwarding to push changes to all the other locations. Although the spoke and hub requires an additional server you get the benefit of having a single consolidated database for reporting as well as a central location for common updates.

 Spoke and Hub
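The difference in complexity between the two approaches is easy to put into numbers. The short Python sketch below counts subscriptions under the assumption that each direction of a publisher/subscriber pair is one subscription; the function names are made up for this example.

```python
def full_mesh_subscriptions(servers):
    """Each server publishes to the other n-1 servers."""
    return servers * (servers - 1)


def hub_and_spoke_subscriptions(locations):
    """Each location publishes to the hub and the hub publishes back to it."""
    return 2 * locations


# Compare both models for a growing number of locations.
comparison = {
    n: (full_mesh_subscriptions(n), hub_and_spoke_subscriptions(n))
    for n in (4, 8, 16)
}
```

With 4 servers the full mesh needs 12 subscriptions against 8 for the hub model (which also needs the extra hub server), and the gap widens quickly as locations are added.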

Hybrid

These models can be combined to provide additional functionality. For instance you may be using a spoke and hub model to ensure that all your locations have the most current data. In addition you want to have another server at the central office to use as a real-time backup and reporting solution. To do this you simply add another one-way replication to the central server.

Hybrid

In the next post I will be discussing using replication filters to limit the data that is replicated.

Wednesday, March 11, 2009

Database Replication – Part I

Database replication is the ability to maintain multiple copies of the same database across multiple servers. This series of posts will discuss various replication strategies and give examples on when these strategies may be used. This will be done in four parts:

  • Part I – Overview and Server to Server strategies
  • Part II – Multiple server strategies
  • Part III – Replication filtering
  • Part IV – Replication conflicts

Advantage replication is an asynchronous push implementation which minimizes the impact on applications and provides the most up-to-date information to all servers. It is important to note that replication is not synchronization. Advantage does not compare the databases and apply changes. Replication simply pushes changes from one database to another.

I will use the following terminology when talking about replication. The source database is referred to as the publisher, the target database is the subscriber. Servers can be both publishers and subscribers and any server can be a publisher to multiple subscribers or a subscriber to many publishers. With that in mind let’s discuss some replication strategies.

One-Way Replication

One-way replication is a publisher pushing changes to one or more subscribers. The subscribers do not send any changes back to the publisher. Therefore changes are not normally made at the subscriber or changes made to the subscriber are not needed by the publisher.


Although it is the simplest form of replication, it is nevertheless very powerful. For example, this model can be used to provide a warm standby solution to increase availability. A subscriber will receive all updates from the publisher in real time. If the publisher suffers a catastrophic failure, the clients can use the subscriber to continue working.

Two-Way Replication

Two-way replication allows each server to be both a publisher and subscriber. Changes made at either server will be pushed to the other server.


This model can be very useful when you have two or more locations which need to manipulate a common set of data. This method may provide better performance than requiring one location to connect to a database via a WAN connection. However, problems can arise if both locations are manipulating the same records at the same time. Advantage provides a mechanism for dealing with replication conflicts which I will discuss in Part IV of this series.

Forwarding

Forwarding, also referred to as chaining, allows a subscriber to pass replicated records on to its own subscribers. This eliminates the need for the first publisher to define an additional subscriber. With Advantage, forwarding is specified for each subscription and is off by default.


One potential pitfall when using forwarding is creating a loop. This is where the last server in the chain replicates to the first server in the chain ( A –> B –> C –> A ). Advantage will detect this situation and the update will not be applied. There is more information in the Advantage Help File.
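
If you want to catch a loop on paper before configuring anything, a planned topology can be checked with a small script. The sketch below is not how Advantage detects loops internally (the help file covers that); it is just a depth-first search over a hypothetical dictionary that maps each server to the servers it forwards to.

```python
def find_forwarding_loop(forwards_to):
    """Return one loop as a list of server names, or None if there is none.

    forwards_to is a hypothetical planning structure, e.g.
    {"A": ["B"], "B": ["C"], "C": ["A"]}.
    """
    in_progress, done = set(), set()
    path = []

    def visit(server):
        in_progress.add(server)
        path.append(server)
        for target in forwards_to.get(server, ()):
            if target in in_progress:
                # Target is already on the current chain: loop found.
                return path[path.index(target):] + [target]
            if target not in done:
                loop = visit(target)
                if loop:
                    return loop
        path.pop()
        in_progress.discard(server)
        done.add(server)
        return None

    for server in list(forwards_to):
        if server not in done:
            loop = visit(server)
            if loop:
                return loop
    return None


if __name__ == "__main__":
    print(find_forwarding_loop({"A": ["B"], "B": ["C"], "C": ["A"]}))
    print(find_forwarding_loop({"A": ["B"], "B": ["C"]}))
```

The first call reports the A, B, C, A chain from the example above; the second returns None because the chain simply ends at C.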

In part 2 I will discuss scenarios which involve two-way replication between multiple servers.

Monday, March 9, 2009

Book Review – Facts and Fallacies of Software Engineering

Robert L. Glass has been working in the computer industry for over 45 years in both academia and the aerospace industry. He has written over 20 books and 75 technical papers about software development. In Facts and Fallacies he discusses many of the facts of developing software along with a few fallacies. He believes that many of these facts are often forgotten and that most of them are the subject of some controversy.

The book is very well organized, with each fact and fallacy presented in the same format: first a discussion of the fact, next the controversy and finally the source(s) for the fact. There are 55 facts grouped into four categories: About management, About the life cycle, About quality and About research. These are, conveniently, the first four chapters of the book. The last three chapters are the categories for the 10 fallacies: About management, About the life cycle and About education.

Chapter one is about management and it contains five subcategories: People, Tools and Technologies, Estimation, Reuse and Complexity. The facts about people are related to the information in Peopleware and The Mythical Man Month. These include “Adding people to a late project makes it later” and “The working environment has a profound impact on productivity and quality”. The facts about tools aren’t too unexpected, for instance “Software developers talk a lot about tools, but seldom use them”. The facts on reuse are worth reading, particularly the discussions of reuse in the large.

The section on complexity contains an interesting fact: “Eighty percent of software work is intellectual. A fair amount of it is creative. Little of it is clerical”. I found this interesting since most people envision a developer as someone sitting in front of a computer writing code all day long. Although I think this was more true before the personal computer, it still has merit. I tend to design on the fly as I am coding, which is probably not the best practice. It is much more effective to take the time to create a good design document, which takes thought and some creativity. However, don’t forget to keep up on your clerical skills.

Chapter two contains facts relating to the life cycle. Again these facts are grouped together under several subheadings. Since there are 22 facts in this chapter I thought I would highlight a few of them. Under design “There is seldom one best design solution to a software problem”, not much controversy here since there are several ways to solve problems. The single best solution is a very elusive beast and it may even be a waste of time hunting it down. Under error removal “Error removal is the most time consuming phase of the life cycle”, although this seems obvious we hate to admit that software can contain so many errors. We spend a lot of our time finding and correcting bugs during the software development process. In fact this process continues well after the software is released.

Although all the facts about testing have valuable insights, “One hundred percent coverage is still far from enough” stood out to me. The premise here is that you can have a unit test which tests every single function in your code. You can also have tests that “press every button” on your application. However, you can never have a set of tests that calls every function in every way it can be called. A function can fail if a specific sequence of functions is called before it, and replicating every possible combination is impossible.

Reviews and inspections get their own category, since this process can greatly improve code quality and functionality. The one fact that stood out for me was “Postdelivery reviews (some call them “retrospectives”) are important and seldom performed”, probably because of my Army background and its use of After Action Reviews (AARs). It is very important to look back on a project and determine what went well and what can use some improvement or even be eliminated the next time. Hindsight is 20/20 after all.

The section on Maintenance is a must read. Maintenance is the biggest portion of software development, not only fixing bugs but adding new features can be considered maintenance. If I had to pick one fact it would be number 43 “Maintenance is a solution, not a problem”. By fixing bugs we make the product more stable and reliable. We can also discover inefficiencies during the maintenance process adding to the performance of the product. By adding new features into the scope of maintenance you can easily justify spending most of your time and efforts doing maintenance.

Chapter three focuses on Quality. I discussed fact number 46 in an earlier post but I think fact number 47 “Quality is not user satisfaction, meeting requirements, achieving cost and schedule, or reliability” deserves a closer look as well. Although all of these things look like elements of a quality product they are in fact something different.

User satisfaction = Meets requirements + delivered when needed + appropriate cost + quality product.

Since quality product is included in user satisfaction then these two things must be separate. Each of these items is very important since user satisfaction is generally what keeps you in business. Therefore it is important to address all of these items. Quality itself is only one of the elements of user satisfaction and is defined by its own attributes outlined in fact 46: portability, reliability, efficiency, usability, testability, understandability and modifiability. The rest of chapter three goes into more detail on these attributes of quality.

Ten fallacies are discussed in the final three chapters of the book. The two that stood out to me were “You can’t manage what you can’t measure” and “You teach people how to program by showing them how to write programs”.

The ability to measure developers is somewhat elusive. Programming is a very creative process, which makes it very difficult to quantify neatly. Besides that, programmers are generally very intelligent people who can figure out how to game any measurement technique. Joel Spolsky wrote a great article on what motivates developers and demonstrates the fallacy of motivation through measurement. This, along with Robert Glass’ discussion, really demonstrates the fallacy of management through measurement.

The last fallacy “You teach people how to program by showing them how to write programs” seemed more of a fact to me until I read the discussion. Essentially we don’t learn how to write in our native language before we learn how to read. We have to understand how sentences are constructed and how we relate them together before we are expected to write on our own. However, in the programming world we read a tutorial, insert your language of choice here, which starts with “Hello World” and moves into more complex tasks quickly. Before even reading a well designed program you are required to write a Twenty Questions Game. You can learn a lot by reading good and bad code as well as writing your own code. So perhaps the way we learn written language could be applied to learning programming as well.

The bottom line: If you have actually made it this far you can probably tell that I enjoyed the book. Robert Glass spent a lot of time collecting information for all his facts and fallacies and providing opposing viewpoints for most of them. I found his insights interesting and relevant to the discipline of software development.

Friday, March 6, 2009

Time to “Spring Forward”

It’s the most wonderful time of the year, Daylight Saving Time (DST) that is. Ah yes, those warm bright spring evenings where all the kids complain that it is too early to go to bed because “it’s still light out”. Perhaps that is only my house, since it is light here until 10PM during the summer. Such are the benefits of living against the edge of a time zone.

In the United States, DST changed in 2007 from beginning the first Sunday in April to beginning the second Sunday in March. It now ends on the first Sunday in November instead of the last Sunday in October. Although the change officially occurs at 2AM in your local time zone, I recommend setting your clock before you go to sleep.
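
If you would rather compute the dates than remember them, the current rule (second Sunday in March through the first Sunday in November) is easy to sketch in Python. The helper names below are my own, and the rule only applies to United States dates from 2007 onward.

```python
from datetime import date, timedelta


def nth_sunday(year: int, month: int, n: int) -> date:
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    days_until_sunday = (6 - first.weekday()) % 7
    return first + timedelta(days=days_until_sunday + 7 * (n - 1))


def us_dst_range(year: int) -> tuple:
    """Start and end dates of US DST under the rule in effect since 2007."""
    if year < 2007:
        raise ValueError("this sketch only covers 2007 and later")
    return nth_sunday(year, 3, 2), nth_sunday(year, 11, 1)


if __name__ == "__main__":
    start, end = us_dst_range(2009)
    print(start, end)  # 2009-03-08 2009-11-01
```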

DST was originally proposed by Benjamin Franklin, which I learned by watching National Treasure, when he observed that people were burning candles at night but not waking up when the sun rose. It was introduced during World War I in an effort to conserve fuel used for electric power. Germany and Austria first established daylight saving time on April 30th 1916. The United States first adopted daylight saving time in March 1918.

The reasons for DST are varied, ranging from better use of daylight to energy savings and safety. Although the energy savings generated by the time change are a matter of debate, there are some other benefits. People are safer drivers during daylight hours and research has shown that accidents are reduced during DST.

There is a downside, however. Some people have a difficult time adjusting to the time change and the incidence of heart attacks increases during the first week of DST. When “falling back”, pedestrian fatalities from cars increase, especially around the 6 o’clock hour. On a brighter note, there is a noticeable decrease in heart attacks the week after the change back to standard time.

So what does all this trivia have to do with computers and more specifically Advantage? It has to do with the way that Windows keeps track of time. Windows stores all timestamps in Greenwich Mean Time (GMT) and adjusts based on the time zone you select. During daylight saving time an additional hour is added to the offset so all of the file times are displayed as 1 hour later. See this Microsoft article.

This is simply a cosmetic effect as the timestamp itself is not changed. However, it can still cause some confusion. To help identify the version of Advantage files we time stamp the files with our version number. For instance version 9.0 of Advantage has a time of 9:00 AM, the date varies depending on which service release you have installed. When your computer adjusts for DST the time displays as 10:00 AM.

When I worked in Advantage Tech Support I used the file time as a quick way to verify that all of the files in the server directory were the same version, keeping in mind that these times may display differently if the machine automatically adjusts for DST. If the user set the computer’s clock manually instead, the displayed file times would not change. So if the files had a time of 9:10 AM and the customer insisted they were using version 8.1, then you should ask if the machine automatically adjusts for DST.
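
The mental arithmetic is simple enough to sketch in Python. This is only an illustration: the function name is made up, and the mapping of minutes to the minor version (10 minutes per minor release) is inferred from the 9:00 AM and 9:10 AM examples above rather than taken from any Advantage documentation.

```python
from datetime import datetime, timedelta


def version_from_file_time(displayed: datetime, dst_shift_applied: bool) -> str:
    """Guess the version from the file time shown in Explorer.

    dst_shift_applied: True when the machine adjusts for DST automatically
    and DST is in effect, so the displayed time is one hour later.
    """
    actual = displayed - timedelta(hours=1) if dst_shift_applied else displayed
    # Assumption: hour is the major version, each 10 minutes is one minor version.
    return f"{actual.hour}.{actual.minute // 10}"


if __name__ == "__main__":
    print(version_from_file_time(datetime(2009, 3, 9, 10, 0), True))   # 9.0
    print(version_from_file_time(datetime(2009, 3, 9, 9, 10), True))   # 8.1
    print(version_from_file_time(datetime(2009, 3, 9, 9, 10), False))  # 9.1
```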

Enjoy your weekend and remember you will get that lost hour back in November.

Wednesday, March 4, 2009

The Quality Debate

Quality is a very subjective thing and often times difficult to define exactly. A quality product for one person may not meet the needs for another. This ambiguous definition along with the need to produce a product in a timely manner means that quality sometimes suffers. Everyone wants to produce a quality product but unfortunately sometimes we settle for good enough.

Quality is one of those topics which evoke passionate responses from people. For example, back in January Jeff Atwood and Joel Spolsky talked about quality during the Stack Overflow Podcast. Jeff said “quality really doesn’t matter that much”, which generated some comments from Uncle Bob, which in turn generated a post from Joel. Here is the offending statement in context.

And I find, sadly, to be completely honest with everybody listening, quality really doesn't matter that much, in the big scheme of things... There was this quote from Frank Zappa: "Nobody gives a crap if we're great musicians." And it really is true. The people that appreciate Frank Zappa's music aren't going, "that guitar was really off." They're hearing the whole song; they're hearing the music, they're not really worried whether your code has the correct object interfaces, or if it's developed in a pure way, or written in Ruby or PHP... they don't really care about that stuff.

In this context I have to agree with Jeff, just like Joel did. Software development is a very complex process that is typically driven by business and marketing needs. Meeting deadlines and producing features can become the most important aspect of building software. Whether or not the code was developed using Agile principles, Extreme Programming or Pair Programming usually doesn’t matter that much to the business. What matters is that a working product is delivered. Therefore the quality of the code isn’t as important as the quality of the product that ships, which I believe is what Jeff and Joel were getting at.

Uncle Bob’s post argues that quality is important at all levels. It is difficult to argue this point since enforcing quality at all levels will produce a high quality product. I agree with Uncle Bob that there are tools which help, such as Test Driven Development, which can certainly improve the overall quality of the product. Additionally, following the SOLID Principles of Object Oriented Design improves overall quality and is worth the effort to implement. So I have to agree with Uncle Bob as well: “quality does matter”.

I think the crux of the matter is how we define quality. Jason Young, author of YTechie, wrote a good post about quality as well. I think he sums up quality nicely:

First of all, we need to understand that quality isn’t a Boolean. It’s not “yes”, you have quality, or “no”, you don’t have quality. Quality is a gradient, but it’s even worse than that. Everyone sees it differently, and everyone experiences a different aspect of it. In short, quality is a multidimensional gradient!

I think Robert Glass provides a way to measure this gradient in Facts and Fallacies of Software Engineering. This definition of quality comes from Fact 46 - Quality is a collection of attributes.

Quality in the software field is about a collection of seven attributes that a quality software product should have: portability, reliability, efficiency, usability (human engineering), testability, understandability, and modifiability. Various software people provide somewhat different sets of names for those attributes but this list is pretty generally accepted and has been for almost 30-something years.

These two definitions are what make a product manager’s life very difficult. The product manager needs to identify the quality gradient that the potential customers are looking for. When developing a personal finance application, things like usability, understandability and efficiency may be the most important. So this product may fit into Jeff’s point of view, where testability and modifiability are a lower priority. However, a development tool such as a database server has a very different audience. Testability, portability and reliability are probably the most important factors. This means that much more attention needs to be paid to testing and quality of code to provide the most efficient and reliable product possible.

So what do I think? I believe quality matters at all levels and that you should pick the tools, principles and methods which produce the highest quality product for your market. This varies with the product you’re producing, and I think using Robert Glass’ quality attributes provides a good mechanism for creating a quality product. I do realize that some quality sacrifices are made because of time constraints or long feature requirements. However, these can be minor sacrifices and can always be improved upon since software is a living product.

Monday, March 2, 2009

FAQs – February 2009

New Environment Variable

The ads.ini file can be used to set several Advantage properties such as aliases, server type, communication type, etc. When an Advantage application starts it first looks in the directory where the exe is run from, then in the Windows directory. In most cases this is sufficient; however, Windows Vista does not allow users to modify directories under Program Files.

Advantage will now (versions 9.1.0.0 and 8.1.0.38) use an environment variable called ADS_PATH as a search path to the ads.ini file. This overrides the default lookup order of exe directory, then Windows directory, then search path. For more information you can read this knowledgebase article.
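
To make the lookup order concrete, here is a rough Python sketch of where an application might expect ads.ini to be found. It is not Advantage code. Several details are assumptions on my part: that ADS_PATH may hold more than one directory separated by the platform path separator, that nothing else is searched once ADS_PATH is set, and that the Windows directory can be read from the WINDIR variable. Check the knowledgebase article for the real behavior.

```python
import os


def ads_ini_candidates(exe_dir, env=None):
    """List the places ads.ini would be looked for, in order (illustrative only)."""
    env = os.environ if env is None else env
    ads_path = env.get("ADS_PATH")
    if ads_path:
        # Assumption: ADS_PATH replaces the default order entirely.
        dirs = [d for d in ads_path.split(os.pathsep) if d]
    else:
        # Default order from the text: exe directory, Windows directory, search path.
        dirs = [exe_dir]
        if env.get("WINDIR"):
            dirs.append(env["WINDIR"])
        dirs.extend(d for d in env.get("PATH", "").split(os.pathsep) if d)
    return [os.path.join(d, "ads.ini") for d in dirs]


if __name__ == "__main__":
    for candidate in ads_ini_candidates(os.getcwd()):
        print(candidate)
```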

ARC SQL Debugging

One of the new features in ARC 9.x is the SQL debugger. This is very useful for writing and testing SQL scripts. As an additional feature of the debugger you can right-click on a stored procedure and select Debug. This will bring up the SQL window populated with an Execute Procedure command and the parameter names. Simply fill in the parameter values and run the script; the SQL stored procedure will execute and stop at the breakpoints you have set.

After you have debugged the SQL procedure once using this method the EXECUTE command is saved so the next time you select Debug from the context menu the same debug script, with the parameter values, will be opened. This is very convenient since you do not have to go back and determine valid parameter values each time you wish to debug the script.

However, if the procedure parameters change you will have to manually change the script since it is cached. Once the change is made it will be saved and the new SQL will be used for executing the procedure. These cached scripts can be found in My Documents\Advantage Data Architect\DebugDriverScripts and are named <connectionname>.debug.<procedurename>.sql. Deleting this file will force ARC to regenerate the script.
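
If you need to clear several of these cached scripts, the file name can be built from the pattern above. The Python sketch below only assembles the path; the documents folder, connection name and procedure name in the example are placeholders, not values ARC defines.

```python
from pathlib import PureWindowsPath


def debug_script_path(documents_dir, connection_name, procedure_name):
    """Build the cached debug script path from the naming pattern in the text."""
    return (
        PureWindowsPath(documents_dir)
        / "Advantage Data Architect"
        / "DebugDriverScripts"
        / f"{connection_name}.debug.{procedure_name}.sql"
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(debug_script_path(r"C:\Users\me\Documents", "Sales", "GetOrders"))
```

Deleting the file at that path forces ARC to regenerate the script the next time you select Debug.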

Are Deleted Records Backed Up by Advantage Online Backup?

With DBF files the deleted records are always backed up. There is a setting for DBF tables named ShowDeleted which will display records marked for deletion. This setting is not used during backups but it is used when exporting DBFs from ARC (see this KB Item). Backups of Advantage tables (ADT) never include deleted records.

When a backup is restored a re-index is performed to ensure that it is current. Any records marked for deletion in a DBF file will be restored, but still marked as deleted. A restore of an ADT file will not include any deleted records.

Slow Index Open After Abnormal Server Shutdown

If the Advantage Server is shut down abnormally (e.g. power failure, service hang or a user ending the process), a recovery process is initiated when the server is started again. Any transactions that were in progress when the service was stopped are rolled back, restoring the data to the state it was in before the transaction was started. With Advantage 8 and newer, indexes are also checked for corruption when they are first opened after an abnormal shutdown.

If the index is corrupted the Advantage server will re-index the file when possible. This causes the initial open of the table to be delayed until the re-index is completed. This re-index is only applied if index caching is enabled, which is the default. You can disable index caching by setting the MAX_CACHE_MEMORY configuration value to 0. However, this can reduce performance of Advantage so we recommend using the default cache settings.