UNT Y2K Watch*

By Coy Hoggard, Senior Director of Administrative Computing and UNT's Year 2000 Compliance Officer

The Year 2000 Computer Problem (also called "The Y2K Computer Problem," or just "The Y2K Problem") – is there anyone on the face of the planet who has not heard about it? Surely not! Do we really need another article discussing "The Y2K Problem?" Perhaps not! But knowing about the problem and understanding how it may affect businesses and citizens are two different matters entirely. In this and subsequent articles I’ll attempt to explain the problem as I see it, how it could affect us here at UNT, how it could affect us in our lives away from the campus, what’s being done about the problem, and what we all can do to help.

Exactly what is the Y2K Problem?

Well, the problem is that computers are often programmed to treat a date as though it contains six digits, with two of those representing the year. We should realize that’s just the way most of us also represent dates when we write them down. The typical shorthand representation for August 1, 1998, for example, is 8/1/98. Using the same notation, January 1, 2000, January 1, 1900, and January 1, 2100 would all be represented as 1/1/00. See the pattern?

Since there is no explicit century designation, dates repeat themselves every 100 years using our traditional 6-digit date representation. For human beings, this is not usually a problem, because we use some judgement when evaluating information. We consider the context and many other things without even realizing it. When we see a date of 1/1/00, for example, we can usually tell whether it makes more sense for the date to have been January 1, 2000, January 1, 1900, or some other date where the year ends with two zeros.

Computers have no such inherent judgement. Unless programmed otherwise, they will simply treat a year of 00 as being 98 years earlier than a year of 98, because 00 is a lower value numerically than 98. This can result in incorrect arithmetic calculations when determining elapsed time (as when calculating someone’s age based on year of birth, aging accounts receivable, etc.). It can also cause incorrect results when simply comparing dates to see whether one date is prior to or after another date. And, it can cause reports and on-line lists to be incorrectly sequenced if sorted by date. We often want date-sensitive reports to be sorted such that the most current information is presented first, followed by older items. Since years from 2000 onward, when represented by only two digits, will be numerically less than years just prior to the turn of the century, the newer information may very well fall to the bottom of the report rather than being at the top where one would expect it.
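The three failures just described (elapsed time, comparison, and sort order) can be reproduced in a few lines. What follows is only a minimal sketch: the (YY, MM, DD) tuple layout and the function name are illustrative assumptions, not code from any UNT system.

```python
# Dates held as (YY, MM, DD) tuples with a 2-digit year.
# The layout and names here are hypothetical, for illustration only.

def elapsed_years(start_yy, end_yy):
    """Naive elapsed-time calculation on 2-digit years."""
    return end_yy - start_yy

# Someone born in 1965, evaluated in 1998 and then in 2000.
age_in_1998 = elapsed_years(65, 98)   # 33, correct
age_in_2000 = elapsed_years(65, 0)    # -65, should be 35

# Comparison: is January 1, 2000 later than August 1, 1998?
aug_1_1998 = (98, 8, 1)
jan_1_2000 = (0, 1, 1)
is_later = jan_1_2000 > aug_1_1998    # False, although it should be True

# Sorting a report so the most current item comes first.
records = [aug_1_1998, (99, 12, 31), jan_1_2000]
newest_first = sorted(records, reverse=True)
# January 1, 2000 ends up last instead of first.

print(age_in_1998, age_in_2000, is_later)
print(newest_first)
```

Nothing in the sketch is "broken" in the usual sense; each line does exactly what it was told to do with the numbers it was given, which is why these errors can go unnoticed until the results are examined.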

Where can the problem be found?

Is it in the computer hardware or the software? The answer is that the problem is often found in both computer hardware and software, and some other places as well. I have chosen to categorize the problem areas as follows:

  • Computer Hardware (equipment). Most, if not all, computers which maintain the time internally (in the hardware) do so by starting with some base point in time (a date) and counting time intervals from that point forward. The base date used varies by system, as it is simply a date picked by the designer. Often it is based on the date the machine was designed or developed. The base date used by IBM-Compatible PCs, for example, is January 4, 1980, which is considered to be the "birth date" of the PC. The time interval may be seconds, fractions of a second, or some other interval, typically based on the pulsing of a crystal inside the machine. When an operating system function requests time of day (to service a request from a program, for example), either the operating system itself or a program on a chip within the machine converts the internal count to a traditional date and time format. Problems with this approach arise when the interval counter (which is stored in binary [base 2] format) overflows. When a binary counter (or one in any number base, for that matter) overflows, the high-order digit is lost, and the counter rolls over to zero.

    Using an analogy from a non-computer device that most folks are familiar with, consider the odometer of an automobile. Although they vary somewhat, a typical odometer will accommodate a number consisting of 5 digits representing miles plus 1 additional digit representing tenths of miles. An odometer with this capacity will represent up to 99,999.9 miles. Then it rolls over to zero, with the high-order digit being lost, causing the odometer to show zero miles, when the correct mileage is actually 100,000.0 miles. Exactly the same thing happens to the computer’s time interval counter when it overflows. It rolls over to zero. Depending on the system, this may be reported as a date with a year of 00, or it may be reported as the base date. Either result is incorrect. Devices that we do not think of as being computers can suffer from this same problem. Early model Global Positioning System (GPS) receivers, for example, will (if not corrected) fail at midnight the evening of August 21, 1999 because that’s when their internal time interval counter will overflow. At this point, these devices will "think" that the current date is January 6, 1980, because that is the "base date" used in these devices.

  • Operating Systems. Even if the computer hardware is Y2K compliant, the operating system may still report an incorrect date. A common problem is for the operating system to report a date with only a 2-digit year, resulting in a year of 00 being reported for the year 2000, for example.
  • Language Translators/Compilers. Programmers do not develop programs using the computer’s actual language. Although the very first stored-program computers were programmed using machine language, it quickly became apparent that this approach was simply too tedious, time-consuming, and error-prone to be practical. So, the early computer scientists (by whatever name they were called at the time) began to devote a significant amount of their time to developing programming languages and language translators (compilers, assemblers, interpreters) which would convert the code written by the programmer into the machine language required for the target computer to run the program. Most language translators save the translated code so that the program does not have to be translated each time it is executed (run). The machine language program produced by the language translator is called an "object program," whereas the program actually coded by the programmer is called the "source program."

    All programming languages are designed to free programmers from some of the more tedious details of developing the machine-language code required by the computer hardware. Examples of widely used programming languages include COBOL, FORTRAN, BASIC, and "C." The choice of programming language is made for a variety of reasons, but in all cases there is some kind of language translator (which is itself a specialized computer program) that produces the actual machine-language program based on the source program coded by the programmer. A commonly-used program function calls for the program which is executing (running) to acquire the current date from the system (for inclusion on a report or other display, or for posting to an updated record in a database file, for example). Even if the hardware and operating system provide dates with 4-digit years, a language translator that is not Y2K Compliant could cause the executable program to store the date with only a 2-digit year, producing incorrect results.

  • "Platform" and Utility Software. The line between this category of software and "applications software" is sometimes blurry. Consider a word processing or electronic spreadsheet package, for example. Are those applications? Many people would say "yes." How about a database package such as MS Access? Is that an application? Again, many people would say "yes." I take a different approach, however, and consider all those (especially the database product) to be tools which are used to develop true applications. That’s more true, I’ll admit, for a spreadsheet than it is for a word processor, and more true still for a database product than for either a word processor or spreadsheet product. I’ve chosen to consider all products of this type as "Platform and Utility Software." Any of these products can have problems dealing with dates at and beyond January 1, 2000. In most cases later versions of these software products will have been made Year 2000 Compliant, and most major vendors have been forthcoming about the status of their products.

    It is worth noting, however, that even a software product that is fully Year 2000 Compliant can be used in such a way as to cause date-related problems. For example, in a typical spreadsheet application one can define a field as being a date field, but then still enter dates using 2-digit years. The software will infer the century (first 2 digits of the year), using a "windowing" technique. It is possible that the software will store a 4-digit year 100 years away from the year you intended. The safe thing to do is to enter the full 4-digit year, thus removing all ambiguity. Sometimes spreadsheets and database files are created using simple numeric field definitions for storing dates. When this is done, the platform software simply processes the date information the same as it would any other numeric data, and if the date information is stored with only a 2-digit year, then results are likely to be incorrect.

  • Electronically Stored Data. Dates are stored in most electronic data files, often using the MM/DD/YY format, or other format with a 2-digit year. This can cause computer programs to perform incorrect calculations of elapsed time (age, elapsed time since most recent payment, etc.). It also can result in invalid comparisons, making a date that is actually in the future (beyond Dec. 31, 1999) appear to be some time in the past. It can also cause sorted data (based on dates) to be incorrectly presented.

    Generally speaking, there are two approaches to correcting this problem. One approach is to use a "windowing" technique which attempts to discern the correct 4-digit year based on the value of the stored 2-digit year. One might determine, for example, that any stored year with a value of less than 30 actually means that the first 2 digits of the year should be "20," while for years equal to or greater than 30, the first 2 digits should be "19." Using this approach, for example, a stored year with the value 01 would be interpreted to mean 2001 while a year with value 98 would be interpreted to mean 1998. Using the terms generally associated with "date windowing," it is said that in our example, we are using 30 as the "pivot year" or "swing year." One problem with windowing is that if the file(s) have dates spanning more than 100 years, then windowing is not reliable. Another is that the pivot year must "slide" over time. At some point, for example, assuming that 2-digit years with a value equal to or greater than 30 are 19xx dates will be incorrect, since the time will come when "30" is intended to be 2030, not 1930. Windowing also does not necessarily take care of the sorting problem, since the stored files still contain only a 2-digit representation of the year. Windowing is a pragmatic approach that will work acceptably in some situations, avoiding the massive effort involved in restructuring the data files. But, it will not work reliably in all cases, and it leaves the date issue as a potential problem at some level.

    The other approach to dealing with the issue of dates in stored files is to restructure the files, expanding all date fields so that they will contain a 4-digit year. This approach solves the problem once and for all (well, at least until 1/1/10000, when a 4-digit year is no longer adequate). Depending on the situation, however, the amount of effort involved may make this approach impractical to accomplish within the available time, considering that all programs which read the data file(s) may need to be modified.

  • Devices with Embedded Chips/Processors. Perhaps the most problematic category is another type of hardware (equipment). These are devices which we do not typically consider to be computers, but which have some type of embedded processor chip with a computer program burned onto the silicon which allows the device to "act smart." Examples of such devices with embedded processor chips include consumer products such as multi-function digital watches, VCRs, and microwave ovens. Examples in the medical field include heart pacemakers and kidney dialysis machines. Examples in building environment systems include "intelligent" heating / cooling systems, lawn sprinkler systems, and building access systems. Other examples are some elevators, airplanes, factories, electrical generation and distribution systems, automated valves in gas and water distribution systems, and communications systems.

    The list is endless. This category of item (embedded chip) is much more problematic than is immediately obvious because there are so many of them, because they are difficult to identify (many are hidden inside mechanical devices and are not easily recognized), and because it’s difficult to determine whether they are date-sensitive. It’s also difficult to determine whether they will have date-related problems if they are known to be date-sensitive. Since there is typically no source program available to the user (or maybe even the vendor) of these devices, evaluating and/or correcting these items is difficult at best. Setting the date forward to test for compliance in a "live" device may trigger the very problem that one wishes to avoid.

    Some devices that do not appear to be date-sensitive (i.e., are not using the date in any discernible way) may, in fact, be doing some type of date processing internally which can cause unpredictable problems when the year rolls from 1999 to 2000, if the device uses only a 2-digit year. This is because the internal chip may have a dozen or so functions designed into it, but the device using it may make use of only 1, 2, or 3 of those functions. The processing for the other functions is still taking place within the chip. Without having detailed knowledge of the programming on the chip, it is impossible to predict the results of the chip encountering a date with year value of "00." If the chip is doing any kind of trend analysis, for example, when the 2-digit year rolls over from 99 to 00, it will appear that the current date is 100 years prior to the most recent prior reading, which is a nonsensical situation. The results are simply unpredictable without knowledge of the program code on the chip.
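Two of the mechanisms in the list above lend themselves to a short sketch: the rollover of a fixed-width interval counter, and date windowing with a pivot year. The code below is an illustration under stated assumptions, not a description of any particular device or of UNT's remediation code. The 10-bit week counter and the January 6, 1980 base date are the commonly documented GPS values rather than figures taken from this article, and all function and variable names are hypothetical.

```python
from datetime import date, timedelta

# --- Part 1: a fixed-width interval counter rolling over ---------------
GPS_EPOCH = date(1980, 1, 6)   # assumption: commonly documented GPS base date
WEEK_BITS = 10                 # assumption: the counter keeps 10 bits of weeks

def date_from_week_count(true_weeks, bits=WEEK_BITS, base=GPS_EPOCH):
    """Date a device would report if it keeps only the low `bits` bits."""
    stored = true_weeks % (1 << bits)   # high-order bits are lost, like an odometer
    return base + timedelta(weeks=stored)

# --- Part 2: date windowing with a pivot ("swing") year ------------------
PIVOT = 30   # the pivot year used in the article's example

def expand_year(yy, pivot=PIVOT):
    """Infer a 4-digit year from a stored 2-digit year."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a 2-digit year")
    return (2000 if yy < pivot else 1900) + yy

# Expanding before sorting restores chronological order.
stored_years = [98, 1, 99, 0]
expanded = sorted(expand_year(yy) for yy in stored_years)

print(date_from_week_count(1023))   # last week before the counter overflows
print(date_from_week_count(1024))   # counter has wrapped back to the base date
print(expanded)                     # [1998, 1999, 2000, 2001]
```

The windowing function also shows the limitation noted in the list: a birth year of 1925 stored as 25 would be expanded to 2025, because with a pivot of 30 the window can only cover 1930 through 2029.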

Why does the Y2K Problem exist? How did it come about?

Were programmers and other computer scientists so stupid that they did not understand that using 2-digit years in their designs would cause problems when the century rolled over? Many articles in the trade journals emphasize that programmers used 2 digits to represent the year due to the expense associated with storing full 4-digit years in times when electronic storage (memory [RAM in today’s terminology] and disk storage) was quite expensive. I’ve even seen one study showing that the savings resulting from using 2 digits to represent the year in a system developed in the early 1970s easily offset the cost of correcting the problem today. Well, that may be nice to know, but it is only indirectly related, in my opinion, to the real reason that 2-digit dates were used.

In my opinion, there were two primary reasons why programmers used 2-digit dates. First, today’s business computing systems have their roots in automated electronic data processing systems that used punched cards for data storage. Although there were some variations, by far the most widely used of the punched cards was the 80-column card used by IBM. The punched card was widely used prior to the advent of widespread use of computers for business data processing. The earlier accounting machines used plugboard type control panels in which wires could be inserted in a variety of ways by expert panel wirers to control the behavior of the machines. A variety of machines allowed the punched cards to be arranged in different sequences (using a "sorter"), merged with other files (using a "collator"), listed, with control totals as needed (using a "tabulating machine"), duplicated (using a "reproducer"), etc. Although these machines were much more mechanical than electronic, relatively sophisticated systems were developed using the punched card and the variety of machines available for manipulating the cards and the data they contained.

The cards were logically divided into "fields" of fixed length. In a payroll application, for example, the first 9 columns might be allocated to contain the individual’s Social Security Number, the next 20 or 25 columns might be used for name, etc. As business functions required that an increasing amount of information be maintained, it became increasingly difficult to store all the information required about any given entity in a single 80-column punched card. Using multiple cards to store information about a single entity increased the complexity and difficulty of operation and maintenance of these systems exponentially, and was especially problematic prior to the advent of computers, since the machinery available had extremely limited storage capacity, and typically the contents of one card were totally lost when the next card was read. This made it difficult if not impossible to print a report line with information from more than one card. For these reasons, minimizing the size of the data fields in punched cards was usually very important. As more information was required about an entity, a typical approach was to see if other fields (such as name, address, etc.) could be shortened to allow the additional information to be added in the existing card format.

In the early 1960s, electronic computers were beginning to be widely used to supplement or replace some of the electromechanical equipment used in these punched card systems. These computers were equipped with card readers, card punches, and printers, allowing them to duplicate the functions of at least the tabulating machine and reproducing machine. Often card sorters and collators were retained to allow card files to be physically manipulated, although if the computer was equipped with adequate disk and/or tape storage, the data from these card files could be arranged internally for reporting without physically sorting and merging the card files. Additionally, the internal storage ("memory") of the computer allowed information from multiple cards to be retained internally, allowing report information from multiple cards to be printed in the order needed instead of in the order encountered when reading the cards. However, using multiple punched cards to store data about a single entity was still a problem. Because the punched cards were just pieces of cardboard with holes punched in them, they could easily be lost or mis-filed. This happened quite often. The programming effort to verify that all the cards needed for a particular entity were, in fact, present could dwarf the effort to program the actual business function. So the practice of minimizing the sizes of data fields continued as computers were used to replace some of the punched card equipment. To use four digits to represent the year in an application developed in 1965, for example, would have been considered by most programmers to be absurd if the space was needed for other information. In 1964 or 1965, a more likely debate would have been whether to use one digit or two to represent the year. The belief was that this system would be replaced well before the end of the decade, so why "waste" two digits when one would do just as well?

I began my career in electronic data processing (EDP) in 1961 as a tabulating machine operator (which included plugboard panel wiring). In 1961 I began programming second generation computers (IBM 1400 series and 7000 series systems), and by late 1964 I was a working manager here at North Texas State University (that was well before the name change to University of North Texas), supervising one or two programmers and doing lots of programming myself. At that time NTSU still used punched card files and associated card handling equipment, but had also acquired two digital computers: one IBM 1620 devoted primarily to academic usage, and one IBM 1440 devoted to business and administrative functions. Both of these computers were equipped for reading and writing (punching) card files. When we were given a system or application to design and write programs for, I did not calculate the cost of storing dates in 1, 2, or 4 digit format; I simply considered how we could accomplish what was needed within an acceptable timeframe, using the equipment that we already had in place. Rarely did we do a system, application, or project which warranted purchasing additional or updated equipment. I have a hard time imagining myself going to the upper administration and saying something like "Hmmm – I guess we could get this done by the time you’ve said that you ABSOLUTELY MUST HAVE IT – except that this two digit year thing is going to cause a real problem at the end of the century." My guess is that the answer would have been something along the lines of "Get out of my office, get back to work, and don’t EVER bother me again with a problem that won’t occur for 35 years in a system that will probably last no more than five!"

Contributing to this situation, I think, is that most people doing programming in the 1960s and 1970s were relatively young, and to a young person, ten, twenty, or thirty years seems like forever. At my current advanced age, I look back at something that happened 25 years ago, and it may seem like it happened only yesterday (well, maybe last week). But to a 30 year old, 25 years is an eternity. In 1965 I was 30 years old. The turn of the century (i.e., January 1, 2000) was 35 years in the future. That was five years more than I’d even been alive to that point! To think that shortcomings in systems and programs that I designed in 1965 would be a problem at the turn of the century (35 years in the future) would have been unimaginable to me. It would have been inconceivable to me to think that any trace of those systems or programs would still be around at the end of the century.

As anticipated, the systems designed and programs written in the 1960s and 1970s were, in fact, replaced with other, more modern systems. Punched cards gave way to magnetic tape and disk for file storage. But quite often the basic design of the original data files was brought along, complete with the abbreviated year format. Then as newer systems were developed, the attitude of "well, that’s the way it’s always been done," and "there’s no need having this specific file have 4-digit years when all the others only have 2-digit years – we’ll just fix them all at the same time when we get around to it" prevailed. Well, the time when we must "get around to it" is upon us!

The other primary reason that we have the Y2K Problem, in my opinion, is that this is the way we’ve all thought about date representation all our lives. When we write the date, we almost always abbreviate it to MM/DD/YY format. This leads to a particular mindset that carries over to our work. The punched card history of information systems can partially explain why the problem exists in current information systems. But it does not explain why chip designers designed those devices as late as the mid to late 1990s using only 2-digit years. This situation is more a result of the "that’s the way we represent dates" mentality. But before we’re too quick to condemn, we should consider that this situation exists to some degree outside the world of computing and electronic communications. When one partner in a long-term marriage dies, it is quite common for a double grave marker to be purchased and installed, marking the final resting place of the deceased partner and identifying the location where the surviving partner will eventually be buried. In addition to the name, date of birth and date of death of the deceased, typically the name and birth date of the surviving spouse are chiseled into the headstone prior to its being delivered to the gravesite. The date of death is, of course, left blank to be added upon the death of the remaining partner. Quite often, however, "19" is pre-cut into the stone, indicating the first two digits of the anticipated year of death. But, some of these currently surviving spouses will frustrate the stone carver’s expectations by living beyond December 31, 1999. So, we have yet another "Y2K Problem."

So, what’s being done about the problem at UNT?

At UNT, as at most organizations, the problem was first believed (or assumed) to be strictly a problem with legacy, mainframe-based applications and environments (equipment, operating systems, and platform software). That’s still where most of the effort is being, and will be, expended. UNT is in better shape as regards its PCs than most organizations, due to the dedication and competence of our Microcomputer Maintenance Shop, the staff of which is dedicated to assisting with the evaluation and remediation (as necessary) of PCs used in the various academic and departmental units. The Computing Center’s Network and Microcomputer Support Division is committed to evaluation and remediation of the campus network and centrally supported PC software.

Problems with spreadsheets, databases, office equipment, research equipment, and campus infrastructure (environmental systems, for example) are of concern. Although the effort to remediate these items is not nearly as large as is the effort for the mainframe systems, it is not nearly as straightforward. These items are difficult to locate, and often difficult to analyze. It’s often difficult to determine whether an item even might possibly have a date-related problem. How does one know, for example, whether a piece of research equipment even has an internal date, and if so, how it’s used? The assessment must be done by someone familiar with the functioning of the device. In some cases, the manufacturer may be the only one who can give an accurate assessment, and in some cases the original manufacturer may no longer be in business. In fact, very few items are turning out to have date-related problems. But for the ones that do, the consequences can vary from being only a nuisance (as with a piece of office equipment) all the way to life-threatening (as with medical equipment). Industry predictions indicate that about 5 percent of all embedded processors will fail as the Year 2000 transition occurs. The problem is that no one knows which 5 percent will fail. As someone said, "It’s a low probability but high risk situation."

UNT is committed to taking necessary actions to ensure that no critical functions of the University will fail or suffer serious degradation due to the Y2K Problem. It is the intent of the University that all items (including equipment and software) which are critical to the functioning of any organizational unit of the University which may have a date-related failure prior to, at, or beyond January 1, 2000 be evaluated for year 2000 compliance and any necessary corrective action taken by December 31, 1998. If it will not be possible or practical to take corrective action to make an item year 2000 compliant within this timeframe, then the State of Texas’ Department of Information Resources (DIR) requires that a contingency plan be prepared to indicate how the organizational unit will deliver critical services at an acceptable level should the at-risk item suffer date-related failures.

The UNT Computing Center has assumed responsibility for making certain that all software and equipment that is supported centrally by the Computing Center is or will be made to be year 2000 compliant. This includes, but is not limited to, our Student Information Management System (SIMS), the Human Resources Management Information System (HRMIS - which produces our payroll checks), and the General Ledger Accounting System (GLAS – which gets our vendors paid). The Computing Center will also be responsible for making sure that centrally supported office software such as the MS Office 97 suite of products is compliant. We will also be responsible for the centrally supported computers (including the IBM mainframe, JOVE, etc.) as well as for the campus network equipment and network operating systems.

The Physical Plant will be responsible for the campus infrastructure devices, including HVAC and building access systems. Individual units must be responsible for departmentally owned equipment (including microcomputers), for access systems which control an area within a building, for specialized software not centrally supported, and for the contents of individual spreadsheets, databases, and other files, including those created by centrally supported "platform" software. That is, even if a spreadsheet product is Y2K compliant, the spreadsheet still can be created in such a way as to produce invalid results at the turn of the century.

The staff of the Computing Center, the Microcomputer Maintenance Shop, and the Physical Plant are willing to assist as appropriate in evaluating equipment, software and files which may be date sensitive, but the ultimate responsibility for departmentally controlled items rests with those units. To some degree, responsibility and "ownership" belong with everyone, not just the Year 2000 Coordinator or the Information Technology Manager.


*The July 1998 issue of Benchmarks Online touched on Y2K campus issues and included some links for more information on the problem. You can see how Mr. Hoggard has thrown himself into the role as the University's Year 2000 Compliance Officer by following this link. -- Editor