Windows 7 Installation …

I started the move to Windows 7 last night.  Just to make things interesting, I also moved to the 64-bit version.  The base installation was quick and painless.  I was installing on a Dell Latitude D630 with an nVidia video controller.

Windows 7 did a pretty good job of finding and configuring my hardware.  The only drivers that I needed were for the video controller and the fingerprint reader.  Everything seems to be working well and the computer is running noticeably faster with lower resource usage.

Here is a list of the software that I have had problems with:

  • Alcohol 52% – Alcohol 120% does work and they are working on an update for Alcohol 52%.  This is causing me more irritation than anything, since I have converted all my install CDs to ISO images and use Alcohol to mount them.
  • Acronis True Image Echo Workstation 9.1 – I just get a message that this program will not run on Windows 7.  I could not find anything about this on the Acronis web site, but there is a new version out.  The explorer shell extension seems to be working for browsing previously created images.  I may get stuck purchasing the new version.  I use this for backup and recovery.  Update: Purchased Acronis Backup & Recovery 10 Workstation and it supports Windows 7.
  • Microsoft SQL Server 2008 Developer Edition – This installs but would not run until Service Pack 1 was installed.

    Here is a list of the software that has installed without any problems:

    July Patch Cycle …

    Windows Vista – 4
    Windows Server 2003 – 5
    Microsoft VirtualPC – 1
    Microsoft Office 2007 – 2
    Microsoft SQL Server 2008 – 1 (Service Pack 1, Just installed SQL 2008 last week on test server.  Update has been out since May.)

    New updates in WSUS – 19

    After a week of testing, we have not encountered any problems with the July updates and released them for installation.  I was also quite happy that I didn’t need to reboot the servers for a change.

    ERP Upgrade … (Updated)

    We upgraded our ERP system about six weeks ago.  This was one of the worst upgrades that we have ever gone through.  During the testing we didn’t uncover any real problems; however, once we went live everything went to hell.  The upgrade was from Fourth Shift 7.4 running on MS SQL Server 2000 to Fourth Shift 7.5 running on MS SQL 2005.

    The first main part of the upgrade was the installation of MS SQL 2005.  This part went pretty quickly and easily.  After uninstalling SQL 2000 and installing SQL 2005 SP2, we reattached all our databases and tested the applications that were attaching to them.  There were some problems with security settings not coming back correctly, but they were fixed without any trouble.  I really feel that SQL 2005 is a step back from SQL 2000.  In our sandbox we are using SQL 2008, which is much better than 2000 or 2005.  I can’t wait for our ERP system to start supporting SQL 2008.

    Next came the meat and potatoes of the upgrade, and that was Fourth Shift.  The installation was very easy and well documented.  It took about three hours from start to finish and we had a running system.  We upgraded a couple of workstations afterward to make sure that the client would also install and run.  We left the rest of the clients to be updated the next day.  We only have twenty users on the system and they come in over a two-hour period in the morning, so we can pretty much update the clients as the users are coming in.

    For the first two days after the upgrade it seemed that everything was running well.  It took a couple of days to start seeing that some of the reports were not working correctly.  The sales reports were not updating correctly.  After digging around in the logs we couldn’t find anything.  There were no errors, and the process that updates the sales reports was reporting that it was finishing without any errors.  This is where SQL 2000 was much nicer.  There is no EASY way to look at the steps in an SSIS package.  It ended up being a problem with the upgrade itself.  The report update package was installed on the server and was running fine, except for the part where none of the steps for the package were installed.  We were issued a new license key file and had to reinstall the upgrade over our current installation to get the reports working right.  This ended up taking a week to get resolved.

    Next came the first end of month.  None of our data extracts that used the general ledger would run.  It turns out that there was a problem with extracting GL data in the new version, so they stopped supporting that feature of the program.  They just didn’t bother to include that little nugget in the upgrade documentation.  The extract works fine in small databases, but once it gets over 750,000 records it quits working.  Our sandbox was only a subset of our full production database.  That may have to change going forward.  The really cool part of all this is that they have a service to convert your extracts to DTS packages that can be run directly from SQL Server.  The DTS packages run much faster than the extract function within the ERP system.  The only problem with it is that they charge $3500 for the first extract and $200 for each additional extract.  We have 225 extracts that are used by the accounting department.  That works out to over $48,000 for them to fix something that they broke.  Not a bad racket if you can get it.  It took almost three weeks for us to recreate the primary extracts as DTS packages.  We still have a canned report that Fourth Shift generates that is not working.  We have had an open case on this item for six weeks now.  (Update) On July 14 we received a fix for one of the open issues.  It took eight weeks to fix a sorting problem in a report.
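
    The conversion fee quoted above can be checked with a few lines of Python.  This is only a sketch of the arithmetic using the numbers from the post; the function and constant names are made up for illustration.

```python
# Pricing as quoted in the post: $3,500 for the first extract,
# $200 for each additional one.  Names here are illustrative only.
FIRST_EXTRACT_FEE = 3500
ADDITIONAL_EXTRACT_FEE = 200


def conversion_cost(extract_count):
    """Return the total fee in dollars for converting `extract_count` extracts."""
    if extract_count <= 0:
        return 0
    return FIRST_EXTRACT_FEE + ADDITIONAL_EXTRACT_FEE * (extract_count - 1)


total = conversion_cost(225)
print(f"${total:,}")  # $48,300
```

    So the "over $48,000" figure is $3,500 plus 224 additional extracts at $200 each.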

    Last night was the second end of month since the system went in.  Everything seems to have run fine, so I guess we are getting all the kinks worked out.  The other thing that has happened during all of this is that our ERP vendor was purchased by a second software company and a holding company.  That is usually not a good sign and may explain why we have had an outstanding action with them for six weeks on a report that they can reproduce the error on.

    Network Upgrade Project …

    In December we had one of our NETGEAR 48 port switches die.  We had planned on replacing all of our switches this year with more enterprise-ready switches.  We ordered three new switches this month to go with the one that was purchased in December.  Before we ever got around to installing them we lost two of the old switches.  It has been kind of odd that we have now had three switches die in such a short period of time.

    A big part of this upgrade project has been pulling cable.  When we are done the entire network will have been rewired and the wiring cabinet will have been moved to the server room.  The current location could be part of the problem with the failing switches: right now the wiring cabinet is located in a wood cabinet above the break room, where it is very dusty and gets extremely hot during the summer.  By the end of this week all of the office computers will be wired into the new cabinet and be gigabit enabled. (Yeah!)  There has been a big difference for all the users of our ERP and CRM applications.  The ERP is a VERY thick client.  Once it is running there is not much of a difference, but now it starts much quicker.  The CRM runs better too.  Unfortunately, the reason it was running so slowly was that over time its scope has changed and some of the additions that I have made have been too chatty with the SQL Server.
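
    For anyone wondering what "chatty" looks like in practice, here is a minimal sketch.  It is not the CRM code: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the contacts table and both function names are hypothetical.  The point is only that fetching the same rows with one query per item costs many more round trips than a single batched query.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts (id INTEGER PRIMARY KEY, company_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO contacts (company_id, name) VALUES (?, ?)",
    [(cid, f"contact-{cid}-{n}") for cid in range(1, 51) for n in range(3)],
)


def chatty_lookup(conn, company_ids):
    """Chatty pattern: one query (one round trip) per company."""
    result, queries = {}, 0
    for cid in company_ids:
        rows = conn.execute(
            "SELECT name FROM contacts WHERE company_id = ? ORDER BY id", (cid,)
        ).fetchall()
        queries += 1
        result[cid] = [row[0] for row in rows]
    return result, queries


def batched_lookup(conn, company_ids):
    """Batched pattern: one query for all companies, grouped on the client."""
    placeholders = ",".join("?" for _ in company_ids)
    rows = conn.execute(
        f"SELECT company_id, name FROM contacts "
        f"WHERE company_id IN ({placeholders}) ORDER BY id",
        list(company_ids),
    ).fetchall()
    result = {cid: [] for cid in company_ids}
    for cid, name in rows:
        result[cid].append(name)
    return result, 1


ids = list(range(1, 51))
chatty_result, chatty_queries = chatty_lookup(conn, ids)
batched_result, batched_queries = batched_lookup(conn, ids)
print(chatty_queries, batched_queries)  # 50 1
```

    Both functions return the same data.  On a slow network every extra round trip adds latency, which is why a faster network hides the problem without fixing it.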

    Maybe once this is done I will post some before and after pictures.

    Network Problem

    Monday started with a bang this week. We were having lots of problems with network connections dropping randomly throughout the building. In the middle of troubleshooting this, a third of the network just disappeared. The first place to look when this happens is always the wiring cabinet. I took a quick look there and didn’t see anything out of the "ordinary". It was on my second trip to the wiring cabinet that I noticed that one of our 48 port switches was not blinking; all the lights were on and nobody was home. I went over to the server room to check the main switch that acts as the backbone for the network and it was showing no connection back to the other switch. I tried resetting the switch and still nothing. I then tried a factory reset of the switch and it never came back on. Guess what we didn’t have a spare of: a switch. It is amazing how many small routers you can find lying around in a company, since that is what I had to use to get things back up and running. I did get a new switch shipped in overnight and it was installed Thursday morning. We were already looking to do some major infrastructure upgrades next year, so we just started the project a little early.  The new switch was a Netgear ProSafe 48-port Gigabit L3 Managed Stackable Switch.  There has been a noticeable improvement in throughput on the network.  We put this switch in the server room, where most of the bandwidth was needed.

    Entitlement Issues …

    What is it about employees and entitlement?  I think that I have talked about the fact that we will let people bring in home computers and we will work on them as we have time.  We currently have a manager who always has their home computer, work computer or both computers in being worked on.  Today he decided that we are not taking his home computer problems seriously enough and he wants it fixed now.  You would have had to be there to understand the utter stupidity of this comment.  He is standing in my office demanding that I fix his computer today, while we have one of our three primary switches down and a third of the building has no network access.  The only other person that has been a pain about stuff like this is the person he was hired to replace.  Makes me wonder if this entitlement issue is just something that comes with being a sales manager.

    Recovering From a Corrupt Registry Hive

    We had a computer come in the office that was getting the following error today:

    Windows XP could not start because the following file is missing or corrupt:
    \WINDOWS\SYSTEM32\CONFIG\SYSTEM

    I found several solutions to fix this, but all of them would cause the registry to be restored to the default Windows installation state.  This does not seem like a very good solution at all.  After digging around trying to find a way to run System Restore from the Recovery Console, I found a post that explained how to restore files backed up in a Restore Point from the Recovery Console.

    The Steps required are:

    1. Log into the recovery console using a Windows install disk.
    2. Navigate to the \windows\system32\config directory and rename the file system to something like system.bak
    3. Navigate to the System Volume Information directory.
      cd \
      cd system~1
      cd _resto~1
    4. A quick dir command will give you a list of directories named RP and then  a number.  If you look at the timestamp for these directories it will let you know when the restore point was created.  Look for one that is dated JUST before you started to have this problem and navigate into it.
      cd rp#
    5. Within the RP# directory there will be a directory named snapshot. This is the directory with the registry hives in it, so you will want to go there now.
      cd snapshot
    6. The SOFTWARE hive is named _REGISTRY_MACHINE_SOFTWARE and the SYSTEM hive is named _REGISTRY_MACHINE_SYSTEM.  Now we need to copy this hive into the location of the corrupt hive.
      copy _REGISTRY_MACHINE_SOFTWARE \windows\system32\config\software
      or
      copy _REGISTRY_MACHINE_SYSTEM \windows\system32\config\system
    7. With any luck you can now type exit and let Windows reboot.
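
    Step 4 is the only part that takes any judgment: picking the newest restore point that predates the problem.  The Recovery Console cannot run scripts, so the following Python is only an illustration of that selection rule.  The directory names and timestamps are made up, and pick_restore_point is a hypothetical helper.

```python
from datetime import datetime


def pick_restore_point(restore_points, problem_started):
    """Return the name of the newest restore point created before the problem began.

    `restore_points` maps a directory name (for example "RP102") to the
    timestamp shown by dir.  Returns None if every restore point is too new.
    """
    earlier = [
        (created, name)
        for name, created in restore_points.items()
        if created < problem_started
    ]
    if not earlier:
        return None
    return max(earlier)[1]  # tuples sort by timestamp first


# Hypothetical dir listing; real names and dates will differ.
listing = {
    "RP101": datetime(2008, 11, 3, 9, 15),
    "RP102": datetime(2008, 11, 5, 18, 40),
    "RP103": datetime(2008, 11, 8, 7, 5),
}

print(pick_restore_point(listing, datetime(2008, 11, 7, 12, 0)))  # RP102
```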

    This solution was information combined from the following two sources:
    Running System Restore from the Recovery Console (well, sort of)
    How to recover from a corrupted registry that prevents Windows XP from starting

    Virus, Virus everywhere, but nothing seems to stop them …

    It seems as if there is a zero day exploit floating around out in the wild that is still unknown.  In the last 6 years we have had fewer than 5 computers get infected at work.  In the last three weeks we have had a steady stream of calls about computers acting weird.  Almost all of them have had trojans installed and two of them have been rooted.  All of the computers affected have been fully up to date with both patches and AV definitions.  At work we are using avast! for our antivirus software, and had not had a problem until the last few weeks.  Like most IT shops we will look at personal computers when things are not too busy.  We’ve had lots of home computers coming in infected also.  Most of them have also been fully up to date.  Only one of them was really asking for a problem.  Some of the antivirus software packages that have been on these computers are AVG, McAfee and Trend Micro.

    I’m not really sure what all this means, but it is getting old.

    Major Systems Upgrade

    We have been making major changes at work the last six weeks. We replaced both of our old Dell PowerEdge 4600 servers with three new Dell PowerEdge 2900 servers. I went from having half a terabyte of storage to having two terabytes of storage. Two of the servers are running VMware ESX 3.5 and the other server is my Microsoft SQL server. When we purchased the new servers we did not get any tape drives on them because we wanted to change our backup method during the change.

    The basic server layout is:
    Physical Server 1 (PowerEdge 2900)

    • VM 1: Domain Controller
    • VM 2: File and Print Server
    • VM 3: Application Server

    Physical Server 2 (PowerEdge 2900)

    • VM 4: Intranet Server
    • VM 5: Domain Controller (Primary)
    • VM 6: Application Server

    Physical Server 3 (PowerEdge 2900)

    • ERP and MS SQL Server

    Physical Server 4 (PowerEdge 4600)

    • Backup Server

    For backups we are using Acronis TrueImage Echo Server to back up VM 2, VM 4, VM 5 and Physical Server 3. We are creating full images on Sunday, when the network is slowest, and pushing them to the backup server. This process is taking about one hour. Monday through Friday we are doing differential backups on the same four machines and pushing them to the backup server. On the SQL server we are also doing log file backups every 30 minutes. These files are created locally and then copied to the backup server. The differential backup is only taking about five minutes total for the four servers. On Saturday we are using the ability of TrueImage to merge the full image from the previous Sunday and the last differential image from Friday into one file and then removing all the pieces from the week. At this point we have one current image for each of the four servers.

    All of the nightly processing done on these four servers is handled by a VBScript. (Note: The script has been saved as a text file for security on the web site.) Each night the backup server backs itself up to tape, which includes all the image files from the other servers.

    We are not backing up VM 1 because it is a mirror of the other domain controller, and in the event that we had to restore everything, I have never had much luck getting two domain controllers to resync afterward. VM 3 only changes when one of the applications on it is updated, so after any application changes we make a snapshot of the VM and burn it to DVD to store off site. VM 6 has one data directory that changes, so I pull that directory each night when the backup server backs up; otherwise it is treated the same as VM 3.
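
    The weekly image rotation described above (full on Sunday, differential Monday through Friday, merge on Saturday) boils down to a lookup on the day of the week.  The sketch below is written in Python purely for illustration and is not the VBScript mentioned in the post.  The action labels and function names are made up, and it does not call Acronis at all.

```python
from datetime import date, timedelta

# Hypothetical labels for the three kinds of nightly image job.
FULL = "full image"
DIFFERENTIAL = "differential image"
MERGE = "merge Sunday full + Friday differential"


def nightly_action(day):
    """Map a calendar day to the image job described in the post."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if weekday == 6:
        return FULL
    if weekday == 5:
        return MERGE
    return DIFFERENTIAL


def week_plan(sunday):
    """Return (date, action) pairs for the seven days starting on `sunday`."""
    if sunday.weekday() != 6:
        raise ValueError("week_plan expects a Sunday")
    return [
        (sunday + timedelta(days=offset), nightly_action(sunday + timedelta(days=offset)))
        for offset in range(7)
    ]


# 1 June 2008 was a Sunday; any Sunday works.
for day, action in week_plan(date(2008, 6, 1)):
    print(day.isoformat(), action)
```

    The 30-minute SQL log backups run on their own timer and are left out of this sketch.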

    These changes have made a world of difference to our nightly processing. When everything was going to tape it was taking almost four hours total for all the servers to back up, and the one time we did have a major hardware failure it took sixteen hours to get everything up and running. We did a test run after everything was up and running, and the full restore took under two hours to have things running.

    I also finally got an answer to my question about our corporate pain threshold for data loss in a major failure. I have always been working to keep my exposure to less than four hours. Well, it turns out the corporate standard is one WEEK. I was shocked to hear that. I am pretty sure that I will have NO problem meeting that requirement.

    I’m Baaaack …

    What a great week. The weather was nice, but the training was GREAT. It is always amazing how you go somewhere for training on one thing and pick up other things that really help in other areas. We had noticed that our system had been slowing down over time and nothing seemed to help. I was talking to another IT Manager who is running the same ERP release that we are on and who was having the same problem earlier this year. She mentioned a few undocumented settings that she found out about through tech support. I tried the changes on our sandbox and got about a 40% increase in speed. When your nightly processing takes almost six hours that is huge. Once I applied the changes to the live system we saw about a 25% increase. The lower return on the live system most likely had to do with the fact that it is running on much better hardware, so it was taking a smaller performance hit than the sandbox. I will still take a 25% increase for three hours’ worth of work on a Sunday.

    I did also learn quite a bit about Reporting Services and have to say that I am impressed with it. I’m not sure if I like it more than Crystal Reports, but you can do a lot with it and it is free if you are using Microsoft SQL Server. I started translating some of the custom reports that we have built over the years to Reporting Services. One of the things that is going to bite us about the new version is that they have moved all of the support databases into SQL Server. They used to keep all the sales reporting data in a Microsoft Access database that will no longer be available in the new version. We have about 125 custom reports built using just this one database that need to be updated to work with SQL Server. They have provided a tool to move this database into SQL Server with our current version and keep it updated with the Access database while you are working on updating all your reports.

    Well, this should be fun working on these things over the next several months.