Benchmarks Online

WWW@UNT.EDU

It Worked! -or- How we transitioned to new WebCT server hardware and kept 18,000+ students happy.

By Austin Laird, Distance Learning Administrator, Central Web Support

Online learning here at the University of North Texas has grown by leaps and bounds over the past several years. Our primary, centralized means of delivering course content online is WebCT. This semester (Fall 2003) we have over 955 UNT course sections enrolled in over 473 WebCT courses. Our total unique student enrollment is over 18,300, for a total number of seats in WebCT courses just under 30,000. What is more exciting is that 4,200 students are strictly distance education students. That means that nearly 7% of the student population of the University of North Texas doesn’t necessarily ever set foot on the actual campus. For these students, the University of North Texas is found online and in WebCT.

Why did we move to a new machine?

These numbers show that WebCT is a mission-critical application here at the University of North Texas. As the Distributed Learning Administrator, I am charged with keeping WebCT available 24x7 for our students and faculty. Our WebCT users are accustomed to this high level of availability and expect that anytime they go to webct.unt.edu they’ll find a functioning and responsive application. Recently, however, we experienced several unexpected downtimes due to server hardware failure. We also experienced slower-than-normal performance from our WebCT server.

Unplanned downtime and slow server response are unacceptable in our WebCT environment, so we quickly sought a way to move to new WebCT hardware to remedy these problems.

How did we make the transition?

We had been planning to move to new WebCT hardware during the next semester break and already had this hardware in place. We decided to go ahead and move to the new hardware during the semester because we could not risk more unexpected downtime. Sun Microsystems had replaced every part of our old WebCT server, but we were still having problems with hardware failures. On Tuesday, September 16th we made the transition. This, of course, is all easier said than done, so what follows is the path we took to a successful migration.

System Specification

Hardware:

Old: Sun UltraSparc II E450 with dual 400 MHz processors, 1 GB of RAM, 120 GB mirrored internal hard drives

New: Sun Fire UltraSparc III 280R with dual 1 GHz processors, 4 GB of RAM, 170 GB fibre-attached SAN (expandable as we need it)

Software:

Old: Solaris 8, Sun Solstice Disksuite Volume Management software, UFS, WebCT 3.8

New: Solaris 8, Veritas Volume Manager, Veritas File System, WebCT 3.8

We chose to move to Veritas File System because it is better than UFS at handling our millions of small files. We use Veritas Volume Manager to build one large volume from the drives that the SAN presents to the machine.

Data:

Approximately 120 gigabytes of data, made up of 8.5+ million files in the WebCT file system structure!
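As a rough check on the "millions of small files" point, the average file size follows from the two figures above. The short Python calculation below is only a back-of-the-envelope sketch: it assumes decimal gigabytes (10^9 bytes) and treats both figures as approximate.

```python
# Approximate figures from the article.
# Assumption: "gigabyte" means 10**9 bytes (decimal), not 2**30.
TOTAL_BYTES = 120 * 10**9
FILE_COUNT = 8_500_000

# Mean size of one file in the WebCT file system structure.
avg_bytes = TOTAL_BYTES / FILE_COUNT

print(f"average file size: about {avg_bytes / 1000:.0f} KB")
```

That works out to roughly 14 KB per file, so any copy or restore of this data set is dominated by per-file overhead rather than by raw transfer speed.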

Method:

We were challenged with finding a way to move these 8.5+ million files from the old server to the new one without losing the changes that students and faculty were making to them. We did not have the option to take the machine offline long enough to copy all the data directly from one machine to the other; WebCT would have been down for nearly 36 hours had we chosen this route. We also could not simply copy the files while the system was live, both because files were constantly changing and because the copy would have negatively impacted our I/O. After much investigation and testing, we decided on the following method:

  1. We took the latest full system tape backup of the old WebCT system and restored it to the new WebCT system. This process brought our new system up to date as of September 7th. It was also our first indication of the performance improvement that the new hardware and the new file system would give us: it took 7 hours to restore the 120+ GB of data in 8+ million files, whereas on the old system it would have taken approximately 36 hours!
     
  2. Once this process was completed, and we had worked out some bugs in the Veritas File System settings, we began synchronizing the two systems with rsync. Rsync is an open source tool that provides incremental file transfer by comparing the source system (in this case the old WebCT machine) to the new system and determining which files have changed and thus which files need to be copied from the old machine to the new one. Initially, beginning on September 12th, we ran rsync across the entire file system. This was a very processor-intensive and time-consuming process (30+ hours), but it ensured the system was up to date as of at least the 12th.
     
  3. Next, we began running rsync only on the parts of the WebCT file system that we knew had been changed since September 12th. We determined what to rsync based on what courses had been accessed (as reflected in the Apache web server logs of WebCT). We ran an rsync on Monday, September 15th overnight to catch up all changes that had been made to the system since the 12th.
     
  4. Throughout the day Tuesday, we ran several incremental rsyncs to maintain consistency between the two machines.
     
  5. Our last rsync before we brought down the WebCT server was at approximately 9:00PM. We took down the WebCT application at 11:30PM so that users were no longer able to log in to the system and make changes. With the system down, we could take information from the latest Apache logs (from 9:00PM to 11:30PM) and rsync just those courses that had been accessed during that 2½ hour block, thus minimizing our downtime. As it turns out, over 300 courses had been accessed by thousands of users during that 2½ hour timeframe! We began an rsync of these 300+ courses that lasted until about 3:00AM.
     
  6. After we had completed this final rsync to ensure all the most recent updates and changes were copied to the new system, we had to reinstall the WebCT application. This was necessary to ensure that all of the paths that are hard-coded in WebCT scripts were set properly (we had changed the mount point of the WebCT application). This process took approximately 45 minutes. We also re-configured the new machine to reflect the same IP address and hostname as the old machine.
     
  7. Once the re-install of the WebCT software was complete, we were able to begin comparing the two installations. We checked key parts of key courses that we knew were accessed frequently and compared them to make sure they were identical. We also tested all the pieces of WebCT to make sure they were working. FrontPage access to the server was not completely functional at this point, but it is not necessary for the functioning of WebCT. We fixed that piece over the days following the move to the new server.
     
  8. At approximately 6:30AM on Wednesday, September 17th, we gave everyone access to WebCT again, on the new system.
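The log-driven rsync in steps 3 and 5 can be sketched in a few lines of Python. This is a minimal illustration, not the scripts we actually ran: the `/SCRIPT/<course>/` URL layout, the directory paths, the host name `newwebct`, and the choice of the `-a` and `--delete` rsync options are all assumptions made for the example. The idea is simply to collect the set of course IDs that appear in the Apache access log and to build one rsync command per course directory.

```python
import re
import shlex

# Hypothetical URL layout: assumes the course ID is the path segment
# that follows /SCRIPT/ in the request line of each access-log entry.
COURSE_RE = re.compile(r'"(?:GET|POST) /SCRIPT/([A-Za-z0-9_.-]+)/')


def accessed_courses(log_lines):
    """Return the sorted, de-duplicated course IDs found in access-log lines."""
    courses = set()
    for line in log_lines:
        match = COURSE_RE.search(line)
        if match:
            courses.add(match.group(1))
    return sorted(courses)


def rsync_commands(courses, src_root, dest_host, dest_root):
    """Build one rsync argument list per course directory.

    -a preserves permissions and timestamps; --delete removes files on the
    destination that no longer exist on the source. Both options are
    assumptions for this sketch.
    """
    commands = []
    for course in courses:
        src = f"{src_root}/{course}/"
        dest = f"{dest_host}:{dest_root}/{course}/"
        commands.append(["rsync", "-a", "--delete", src, dest])
    return commands


if __name__ == "__main__":
    # Made-up log lines in Apache common log format.
    sample_log = [
        '10.0.0.1 - - [16/Sep/2003:21:05:12 -0500] '
        '"GET /SCRIPT/BIOL1010/scripts/serve_home HTTP/1.1" 200 5120',
        '10.0.0.2 - - [16/Sep/2003:21:07:40 -0500] '
        '"GET /images/logo.gif HTTP/1.1" 200 812',
        '10.0.0.3 - - [16/Sep/2003:22:15:03 -0500] '
        '"POST /SCRIPT/ACCT2020/scripts/quiz HTTP/1.1" 200 2048',
    ]
    courses = accessed_courses(sample_log)
    # Hypothetical paths and host name; the commands are printed, not run.
    for cmd in rsync_commands(courses, "/webct/courses", "newwebct", "/webct/courses"):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review the list before anything is copied. Restricting rsync to the touched course directories is what keeps the final pass short compared with the 30+ hour run across the entire file system.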

Results

Our method for transferring and synchronizing files was very successful. We have had only one or two problems that were attributable to this transition. In those cases, we were able to go back to the old server and pull the information that had not copied properly from one machine to the other. We’ve found that our server performance is vastly improved. The improvement comes from a combination of the faster Veritas file system and the improved hardware. Transactions that might have taken minutes to complete on the old server (user queries, for instance) now take a few seconds. As one user told us, “the new server is sooooooooooooooooooooooo much faster!!” Most important to us, we now have a reliable WebCT server again.