Members present: Craig Berry, Tim Christian, Chris Strauss, Dallas Newell, Michael Hatch, Allen Bradley, Ginger Boone, Kelly Wood, Ramu Muthia, Robert Pierce, Paul Hons, Maurice Leatherbury, Jim Curry
Members absent: Bill Buntain, Abraham John, Cyndie Harris, Eric DuChemin, Rich Anderson, Bruce Pollock
Guests present: Richard Harris, Curtis Hort, Mike Wright
Maurice described the series of problems that the Unix and Internet mail gateways had experienced over the previous one-and-a-half weeks. Starting mid-week in the previous week, a runaway process on a machine in Materials Sciences produced a ping flood that overloaded the campus router, leaving that hardware unable to handle Internet traffic to and from campus. That problem was resolved within four or five hours and traffic resumed its normal flow. Late in that week, we received notification that Jove may have had its security compromised. After checking the system, we could not detect any problems, although we implemented extra security checking over the weekend. Early the next week, however, we found evidence that unauthorized person(s) had obtained root access to Jove. To prevent further compromises, we brought Jove down to re-install the operating system and thus wipe out any possible Trojan horses, etc.
Upon completing that task and installing additional security traps, we attempted to restart Jove but discovered bad memory in the machine that prevented the system from rebooting successfully. Sun service came to make repairs but brought the wrong type of memory and had to return late in the day. The memory replacement didn't solve the problem, however, since it was then discovered that there was a bad controller board in the machine. Eventually, it took Sun two controller boards and a day-and-a-half to get the machine running.
When Sun replaced the controller boards, they reversed the order of the disk cabling to the two controllers, so when the system disk was re-formatted in order to reinstall the OS, the wrong disk was affected. That error resulted in further delays in getting Jove back on line, since it wasn't discovered until Marc St.-Gil was recalled from his vacation in Colorado.
Because of the security breach, other central Unix systems were brought down, including Sol (the research machine) and Calliope (the central Web server). While we originally suspected that Calliope had been compromised as well, further investigation found no evidence that it had been affected by the hackers, so it was brought back on line. At the time of the DCSMT meeting, the Jove disk array that had been mistakenly reformatted was being restored from tape.
Because of the large number of undeliverable mail messages destined for Jove while it was down, the SMTP gateway either slowed to a fraction of its normal speed or crashed completely and had to be rebooted periodically. That affected the delivery of Groupwise Internet mail, and thus that mail delivery system was prevented from communicating with the outside world. A temporary workaround for the problems with the SMTP gateway had been installed and mail was flowing between it and the Groupwise gateway, although not as fast as normal.
Tim Christian volunteered his services, along with Duane Gustavas', to help in architecting a better mail delivery system that would help prevent future problems such as the one(s) just encountered. Mike Hatch reported that he had been able to get his Pegasus mail server to deliver mail by pointing it to a Kansas State gateway, and he suggested that we work on making arrangements with outside agencies to do similar things when problems occur.
Maurice reported that the Desktop Applications Standardization subcommittee had met and had modified the original draft "standards" by changing the title to "guidelines" and by including an option for each distributed area to make its own decision about the implementation of WordPerfect products. Also, the recommendation of Microsoft Access as a preferred DBMS was changed to say that it may not be appropriate for individual desktop use because of its complexity.
Jim Curry reported that the Call Tracking subcommittee had decided that the Remedy Action Request System best met UNT's needs for call tracking after examining Clientele, Apriori, several Lotus Notes applications, SA Expert, Vendata Heat, Intel Landesk, and the option of developing a system in-house. The committee found that Remedy was the most flexible, could be obtained at a mid-range price (between $30K and $80K), and could be tied into other campus systems such as the paging system.
Tim Christian enumerated the following benefits from adopting a campus-wide call tracking system:
The preliminary plans for implementing Remedy are to form an implementation team with Tim Christian as the project leader and Chris Strauss as the technical support person for the management of the product. Tim outlined a tentative timeline for bringing the product up, and after discussion the DCSMT members decided that January or February of 1998 would be a feasible target date for a full rollout of the system. The Computing Center has agreed to fully fund the purchase of the hardware, software, and initial training necessary to bring the system up; that cost will be in the range of $100,000 over a three-year period.
Tim announced that his planning document for the rollout could be found at http://www.cascss.unt.edu/calltrax. After additional discussion, Maurice called for a vote on whether Remedy would be adopted by each distributed computing support area as a campus-wide system, and the attendees voted unanimously to approve that adoption. In addition, the following persons signed a document that stated "Yes, I do buy into Remedy!": Tom Newell (HSC), Craig Berry, Tim Christian, Mike Wright, Chris Strauss, Dallas Newell, Mike Hatch, Allen Bradley, Ginger Boone, Kelly Wood, Ramu Muthiah, Robert Pierce, Curtis Hort, Paul Hons, Richard Harris, and Maurice Leatherbury.
Jim Curry reported that MicroMaintenance was now supporting internal-only Zip drives, both IDE and SCSI, and internal-only SCSI Jaz drives.
The meeting adjourned at 3:45.
Page last modified on September 15, 1997 by Philip Baczewski