With the changes we have made to create an object-oriented version of RHESSys, the model has taken a giant leap forward in terms of the ease with which it can be maintained and its interoperability with other related models. Unfortunately, given the immense size of RHESSys, the amount of work that had to be done, and our time constraints, we were forced to prioritize our work or risk not getting anything done. Therefore, some work still remains to make RHESSys the best possible piece of software. This document is intended to give future developers of RHESSys an index of this work, so that they are aware of these issues and do not have to stumble across them on their own. Hopefully, future developers will maintain this index so that it is always up to date. Each of the following categories in this document is an area of RHESSys that needs an improvement we have not yet been able to provide. Wherever possible, I have tried to include a brief description, a list of the files, classes and functions that are affected, and finally my suggestions for fixing these problems.
The problem here is that base stations are currently created at the world level even though they properly reside at the zone level. This is poor style and poor cohesion, but more importantly, it creates a problem with destructors. As mentioned, base stations are stored at the zone level. This is proper, since the climate data represented by a base station is applicable only to the particular zone object and all objects contained by that zone. Thus, we took pains to hide the base station of a given zone from all higher levels of the hierarchy and also from other zones. The problem is that, given the manner in which the world file was parsed in the old version of RHESSys, we had to create the base stations at the world level. However, given our attempt to hide this data from other parts of the spatial model, we have to destroy the base stations at the zone level. This causes a problem because multiple zones could share a base station without realizing it (they hide their base station info from each other). Thus, it is possible for a zone to destroy its base station and for another zone to then attempt to destroy the same base station.
As of 4/26/1999 the problem with the destructors has been fixed. A reference counting mechanism has been implemented, so that each base station keeps track of how many zones are pointing at it and destroys itself only once that number reaches zero. While it is still problematic that the base stations are created at the world level, no better solution has been thought of to date. The old text for a suggested solution to this problem has been left for future reference.
The constructors for the following classes are affected:
My suggestion would be to implement some sort of reference counting on the base station objects, so that each base station keeps track of how many zones are pointing at it and destroys itself only after its reference count drops to zero. This solution may be hard to implement if each zone creates its own base station, so it works best in combination with the process described above, where the base stations are created at the world level.
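The reference counting idea can be sketched roughly as follows. This is a minimal illustration, not the code that was actually added to RHESSys: the class names `BaseStation` and `Zone` and the methods `attach()`, `release()` and `ref_count()` are hypothetical, and the real classes carry far more state.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real base station class; only the
// reference counting logic is shown.
class BaseStation {
public:
    BaseStation() : m_ref_count(0) {}

    // Called by a zone when it starts pointing at this base station.
    void attach() { ++m_ref_count; }

    // Called by a zone when it stops pointing at this base station.
    // Returns true if this call destroyed the base station.
    bool release() {
        if (m_ref_count > 0 && --m_ref_count == 0) {
            delete this;  // last zone has let go
            return true;
        }
        return false;
    }

    std::size_t ref_count() const { return m_ref_count; }

private:
    // Private destructor: only release() may destroy a base station,
    // so a zone cannot delete one that other zones still share.
    ~BaseStation() {}

    std::size_t m_ref_count;
};

// Hypothetical zone that shares a base station created elsewhere
// (for example, at the world level).
class Zone {
public:
    explicit Zone(BaseStation* station) : m_station(station) {
        m_station->attach();
    }
    ~Zone() { m_station->release(); }

    Zone(const Zone&) = delete;
    Zone& operator=(const Zone&) = delete;

private:
    BaseStation* m_station;
};
```

With this arrangement, two zones constructed with the same `BaseStation*` can be destroyed in either order, and the base station is deleted exactly once. In present-day C++, `std::shared_ptr` provides the same behavior without hand-written counting.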
This is not so much a problem as it is a suggestion for a powerful enhancement. Currently, the model creates base stations by reading a whole lot of data from files that store climate information. This is a good solution when the model is being run on dates in the past where climate information is known. This method is insufficient, however, when the model is being run on future dates when the climate data cannot be known. With the current model, to run the simulation for future dates, you would have to run a climate model, and store all that data, then reformat that data in the manner that RHESSys reads it, and then finally run RHESSys. A much better solution would be to allow the user to dynamically load a base station implementation, so that the user could specify whether or not she wants to create base stations from information stored in files or by linking to a climate model. The foundation for this work has already been laid, because we have completely hidden the climate data from all other parts of RHESSys. That is, any module that needs climate information can only access it by making a function call to the appropriate base station. Thus, changing the implementation of a base station would not require changing the implementation of any other module, and developers are free to completely overhaul the implementation of the base stations without worrying about changing the implementation of any other module.
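One way to picture a swappable base station implementation is an abstract interface with one file-backed and one model-backed subclass. The sketch below is only an assumption about how this could look: `BaseStationSource`, `FileBaseStation`, `ModelBaseStation`, `make_base_station()` and the single `temperature()` accessor are all hypothetical names, and the actual declarations in base_station.h are not reproduced here.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface: the only way other modules see climate data.
class BaseStationSource {
public:
    virtual ~BaseStationSource() = default;
    // Temperature for a zero-based simulation day.
    virtual double temperature(int day) const = 0;
};

// Stands in for the existing approach: all values are loaded up front.
class FileBaseStation : public BaseStationSource {
public:
    explicit FileBaseStation(std::vector<double> values)
        : m_values(std::move(values)) {}
    double temperature(int day) const override {
        return m_values.at(static_cast<std::size_t>(day));
    }
private:
    std::vector<double> m_values;
};

// Stands in for a linked climate model: values are computed on request.
// The linear trend is a placeholder, not a real climate model.
class ModelBaseStation : public BaseStationSource {
public:
    ModelBaseStation(double start, double trend_per_day)
        : m_start(start), m_trend(trend_per_day) {}
    double temperature(int day) const override {
        return m_start + m_trend * day;
    }
private:
    double m_start;
    double m_trend;
};

// Chooses an implementation from a user-supplied keyword.
std::unique_ptr<BaseStationSource> make_base_station(const std::string& kind) {
    if (kind == "file") {
        std::vector<double> sample = {10.0, 11.5, 9.0};  // illustrative data
        return std::make_unique<FileBaseStation>(sample);
    }
    if (kind == "model") {
        return std::make_unique<ModelBaseStation>(10.0, 0.5);
    }
    throw std::invalid_argument("unknown base station kind: " + kind);
}
```

Callers hold only a `BaseStationSource` pointer, so switching between the two implementations does not touch any code that requests climate data.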
The only things affected by adding this functionality would be those classes and functions described in the file base_station.h. Extensive changes to these classes and functions will probably be necessary; however, I believe that the advantage gained would be well worth the work.
My suggestion would be to add functionality to the command line class that would allow the user to input command line parameters specifying whether base stations are to be read from files or implemented by a climate model. If the command line specifies that the model should read from files, that functionality is already in place. If the command line specifies that the base stations should use a separate climate model, I would recommend a producer/consumer model. That is, RHESSys, probably in the base station constructor, would set up a series of shared buffers analogous to the buffers that are currently used to store the climate information. The buffers would then be filled by the climate model and emptied by RHESSys. Synchronization primitives would, of course, have to be used to ensure that the two sides access the buffers correctly. In this manner the climate information could easily be implemented as a simple producer/consumer relationship.
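A shared buffer of the kind described above might look roughly like the following. This is a sketch under assumptions: the class name `ClimateBuffer` and its `put()` and `get()` methods are hypothetical, a single `double` stands in for a full climate record, and the standard C++ mutex and condition variable are used as the synchronization primitives.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical bounded buffer shared by a climate model (producer)
// and a base station (consumer).
class ClimateBuffer {
public:
    explicit ClimateBuffer(std::size_t capacity) : m_capacity(capacity) {}

    // Producer side: blocks while the buffer is full.
    void put(double value) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_full.wait(lock, [this] { return m_values.size() < m_capacity; });
        m_values.push_back(value);
        m_not_empty.notify_one();
    }

    // Consumer side: blocks while the buffer is empty.
    double get() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_empty.wait(lock, [this] { return !m_values.empty(); });
        double value = m_values.front();
        m_values.pop_front();
        m_not_full.notify_one();
        return value;
    }

private:
    const std::size_t m_capacity;  // tunable buffer size
    std::deque<double> m_values;   // oldest value at the front
    std::mutex m_mutex;
    std::condition_variable m_not_full;
    std::condition_variable m_not_empty;
};
```

In use, the climate model would run in its own thread and call `put()` for each new value, while the base station would call `get()` whenever another module requests climate data. The capacity passed to the constructor bounds how much climate data is held in memory at once.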
The reader may be alarmed that the cost of computation for RHESSys has just jumped dramatically because we may have to compute the temperature each time we request it, instead of just reading it out of a buffer. It is important to note, however, that the cost of computation will be only minimally increased over the cost of running both models back to back. In addition, you may actually get a gain in efficiency when you factor in the fact that when you run the two models concurrently, you don't need to reformat the output of the climate model in order to input it into RHESSys. Furthermore, running the models concurrently will save a lot of memory. Currently every piece of climatic data for every day is stored in a giant array that exists throughout the duration of the simulation. This is obviously a very expensive way to store this information. A producer/consumer model, however, would allow designers to fine-tune their array size to an optimal one.
This is not a bug, but fixing it would greatly help future developers work on the code. In the old version of RHESSys, data in a given struct was accessed explicitly through that struct. Thus, it was immediately clear to a reader whether the variable being referenced was a member of the struct or a variable local to the function. Now, however, it is not so clear, because member functions access member variables directly by name. This can create some confusion over where a given variable lives. I cannot stress enough how much a fix would improve the readability of the code.
Every class and every member function is affected by this problem.
Adopt a naming convention that makes it clearer what type of variable is being manipulated. One suggestion would be to prefix all member variables with the string "m_", as it is unlikely to be used for any other variable.
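As a small illustration of the suggested convention, consider the sketch below. The class `Patch` and its members are hypothetical and chosen only to show how the prefix separates member variables from parameters and locals.

```cpp
// Hypothetical class used only to illustrate the "m_" prefix.
class Patch {
public:
    explicit Patch(double area) : m_area(area), m_water_depth(0.0) {}

    void add_rain(double rain_volume) {
        // "depth" is local to this function; the "m_" names are members.
        double depth = rain_volume / m_area;
        m_water_depth += depth;
    }

    double water_depth() const { return m_water_depth; }

private:
    double m_area;         // member variable
    double m_water_depth;  // member variable
};
```

Inside `add_rain()`, a reader can tell at a glance that `depth` and `rain_volume` disappear when the function returns, while `m_area` and `m_water_depth` belong to the object.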
Currently, the GUI supports a protocol for receiving visualization data from RHESSys during execution. For information specifically on this protocol, click here. For the GUI to correctly interpret the model's output, RHESSys must also support this protocol. Given the amount of time it took to convert the model from C to C++, we had no time to add the functionality to support this communication protocol, but the following are suggestions on how to do so.
There are two suggestions for implementation, and the files that are affected will be slightly different for each. However, I believe that in either case it will be possible to piggyback the required functionality on existing functionality without an adverse effect on the system. Most likely the tec class will be changed, as will the output functions of each class.
One suggested method involves allowing the GUI to modify the tec file before inputting it to RHESSys. The second method would be to implement a new command line flag to indicate that output visualization is desired.
In method 1, the GUI would take the information input in the visualization pane and use it to add new events to the tec file. These events would include a date and a command of the form "visualize_patch_C02_daily_on". For every "on" command there would be a corresponding "off" command to tell the model to stop sending visualization data. As the function tec::execute_tec() cycled through the simulation, it would determine whether visualization output is required at the current date and time. If so, it could call a function to trace through the spatial hierarchy, determine the value of the requested variable and pipe it to the GUI. This function could be a new function written specifically to support visualization, or the output functions of each class could be modified to support both the existing output capabilities and the visualization output.
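The on/off bookkeeping in method 1 could be sketched as below. Everything here is an assumption made for illustration: the `TecEvent` struct, the `visualization_active()` function, and the use of a plain integer day in place of the full date and time found in a tec file. It is not how tec::execute_tec() is actually written.

```cpp
#include <string>
#include <vector>

// Hypothetical tec event: a simulation day plus a command string.
// A real tec file uses a full date and time instead of one integer.
struct TecEvent {
    int day;
    std::string command;
};

// Replays the events up to and including `day` and reports whether
// visualization of `variable` is switched on at that point.
// Assumes `events` is sorted by day, and that commands have the form
// "<variable>_on" and "<variable>_off".
bool visualization_active(const std::vector<TecEvent>& events,
                          const std::string& variable, int day) {
    bool active = false;
    for (const TecEvent& event : events) {
        if (event.day > day) {
            break;  // later events have not happened yet
        }
        if (event.command == variable + "_on") {
            active = true;
        } else if (event.command == variable + "_off") {
            active = false;
        }
    }
    return active;
}
```

For example, with an "on" event at day 10 and the matching "off" event at day 20, the function reports true for days 10 through 19 and false otherwise. A simulation loop could call it once per time step to decide whether to send data to the GUI.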
Method 2 would involve adding functionality to the command_line class. A new flag would have to be added to the command_line to indicate that visualization is wanted. In addition, the user would have to indicate which variable is to be output and how often (hourly, daily, monthly, or yearly). The command_line object would store the name of that variable and the frequency at which to output it. Then, functionality could be added to the "output_*" functions of each class and the execute_*_output_event functions of the tec class. The execute_*_output_event functions would now be called if either visualization or normal file output is wanted. The output functions would be modified to check the command line for what to output and demultiplex the variable that is stored there. Then the output functions would send the variable according to the protocol mentioned above.
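The command line side of method 2 might be sketched as follows. The flag name `-vis`, the `VisualizationRequest` struct and both function names are hypothetical; the real command_line class and its existing flags are not shown.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

enum class Frequency { Hourly, Daily, Monthly, Yearly };

// What the command_line object would need to remember.
struct VisualizationRequest {
    bool enabled = false;
    std::string variable;
    Frequency frequency = Frequency::Daily;
};

// Converts the frequency keyword given on the command line.
Frequency parse_frequency(const std::string& text) {
    if (text == "hourly") return Frequency::Hourly;
    if (text == "daily") return Frequency::Daily;
    if (text == "monthly") return Frequency::Monthly;
    if (text == "yearly") return Frequency::Yearly;
    throw std::invalid_argument("unknown frequency: " + text);
}

// Scans the arguments for the hypothetical flag
//   -vis <variable> <frequency>
// and ignores everything else.
VisualizationRequest parse_visualization(const std::vector<std::string>& args) {
    VisualizationRequest request;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] != "-vis") {
            continue;
        }
        if (i + 2 >= args.size()) {
            throw std::invalid_argument("-vis needs a variable and a frequency");
        }
        request.enabled = true;
        request.variable = args[i + 1];
        request.frequency = parse_frequency(args[i + 2]);
        i += 2;  // skip the two values just consumed
    }
    return request;
}
```

The output functions could then consult the stored request: if `enabled` is set, they look up `variable` and send it at the stored `frequency` using the visualization protocol.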
The GUI is currently capable of graphing the changing value of a variable over time on a line graph. However, the JGraphicWindow is not capable of graphing the timestamp of the variable in conjunction with its value. The line graph that is produced does not indicate the specific time at which the variable had a specific value; instead, the user is only able to see the progression of a variable's value over a given time period.
JGraphicWindow, StdoutTalker, CommandObj, HydroGUI
The Java classes, specifically JGraphicWindow, must be modified if the user wants to be able to determine the exact time a variable had a specific value.
Also, some modifications would have to be made to several classes if the user wanted to graph more than one variable on the same graph. The JGraphicWindow would need to be modified so that it would maintain a list of the variables it was displaying. Also, StdoutTalker would need to be modified so that it could associate a list of variables with a specific window. The HydroGUI would need to be modified so that the user could select the variables that would be displayed together. Finally, the CommandObj would need to be modified to pass the information from the GUI to StdoutTalker.
Author: Hank Hoffmann