
Rdr Errors on NT Terminal Server 4

Recently a major problem befell one of my bigger customers, and I was at a loss as to its cause. Several possible causes were suggested, and it took a few attempts before the problem was eventually solved. This article describes the symptoms and the method I used to solve it. Because of the nature of the problem, you are unlikely to see it on any operating system other than Windows NT Terminal Server, though it may be possible.

The problem manifests itself

Our problem was glaringly simple: the server (with all of its 25 users) would, as the users described it, “freeze” and then require a reboot. Each freeze caused all users to lose their work and placed the main database in serious danger of corruption. While the instability lasted, the server needed to be rebooted upwards of three times over the course of a day.

The troubleshooting methodology

This was one of the more challenging problems I’ve come across, and it forced me to use some alternative means of tracking it down. The first step was to isolate when and where the problem was occurring. I knew it was on the server, and the users had pinned down the time well. I then checked the server itself: the state of its memory, disconnected users, and the event log.

The event log was the only place that gave me a clue about the problem. The pattern was obvious: the server posted successive ‘rdr’ errors to the system event log with error number 3013, then a few session errors (event ID 22), and finally a “This server was shut down unexpectedly” error (post-SP4). This ‘rdr’ information would give me all I needed, once I understood what ‘rdr’ was and what it meant to the Terminal Server.
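The pattern described above (a run of successive ‘rdr’ errors with number 3013) can be picked out of an exported event list with a few lines of Python. This is only a minimal sketch: the Event record, the rdr_error_runs function, and the "OtherSource" placeholder are all hypothetical names made up for illustration, and it assumes the log has already been exported into a simple list in time order.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str    # event source as shown in the log, e.g. "Rdr"
    event_id: int  # numeric event ID


def rdr_error_runs(events, event_id=3013, min_run=2):
    """Return (start_index, length) for each run of consecutive Rdr errors."""
    runs = []
    start = None
    for i, ev in enumerate(events):
        is_rdr = ev.source.lower() == "rdr" and ev.event_id == event_id
        if is_rdr and start is None:
            start = i                      # a new run begins here
        elif not is_rdr and start is not None:
            if i - start >= min_run:       # keep only runs that are long enough
                runs.append((start, i - start))
            start = None
    # Handle a run that continues to the very end of the list.
    if start is not None and len(events) - start >= min_run:
        runs.append((start, len(events) - start))
    return runs


# Hypothetical sample: three Rdr 3013 errors followed by a session error.
sample = [
    Event("Rdr", 3013),
    Event("Rdr", 3013),
    Event("Rdr", 3013),
    Event("OtherSource", 22),  # placeholder source name, not from a real log
]
print(rdr_error_runs(sample))
```

A run that shows up shortly before each reboot would point at the same pattern I found by reading the log by hand.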

Getting information from Microsoft was clunky, cumbersome and in many cases useless. I strongly hope Microsoft changes the way it delivers information to technicians and users, because it could use a refresher course in support from a leader (Novell). Using Microsoft’s support and other sources, I set out to understand what ‘rdr’ was.

Redirected

I finally found out what this was: ‘rdr’ refers to a driver in Windows NT called the Redirector. The driver’s main purpose is to redirect file requests to network drives. Consider a mapped drive in Windows that points to the share \\server\cdrive. When you access a file on the mapped drive, the Redirector takes over and sends your request to the server. The Redirector also handles substituted drives.
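To make the idea concrete, here is a rough Python sketch of the translation involved: a drive-letter path is looked up in a mapping table and rewritten as a path on the server share. It is a conceptual illustration only, not how the rdr.sys driver is implemented. The to_unc function, the X: drive letter, and the file names are assumptions made up for the example; only \\server\cdrive comes from the text above.

```python
import ntpath  # Windows path rules, usable on any platform


def to_unc(path, mappings):
    """Translate a drive-letter path to a UNC path using a mapping table."""
    drive, rest = ntpath.splitdrive(path)
    target = mappings.get(drive.upper())
    if target is None:
        return path  # not a mapped drive, so nothing to redirect
    return target.rstrip("\\") + rest


# Hypothetical mapping: X: is mapped to the share from the example above.
mappings = {"X:": r"\\server\cdrive"}

print(to_unc(r"X:\docs\report.doc", mappings))
print(to_unc(r"C:\TEMP\scratch.tmp", mappings))
```

Every request for a path on a mapped drive has to go through a step like this, which is why the number of users matters.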

Terminal Insanity?

What does this mean to the Terminal Server? This was the burning question for me. I knew there was a Redirector and I understood how it is used, but why would it not be working on a Terminal Server? Could this be a memory problem? Could it be that the server can’t handle 25 users? Is the driver ‘rdr.sys’ corrupted? All of these questions swirled around in my mind.

Try and try again

I tried a few things to no avail. The server seemed to be dying at the start of the day, when all of the users logged in at 8:30am and put the server under an extreme load. I asked the users to stagger their logins until everyone was on the server. This seemed to work at first, but it did not prevent the failures later in the day and, in the end, did not stop the morning failures either.

I attempted to manipulate caching on the server by changing the following registry values:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Rdr\Parameters
Value Name: UseWriteBehind
Data Type: REG_DWORD
Set To: 0

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Lanmanworkstation\parameters
Value Name: UtilizeNTCaching
Data Type: REG_DWORD
Set To: 0

These entries were recommended by Microsoft as a solution to rdr errors in one of their Knowledge Base articles. They did not solve my problem, however; the server continued to crash.
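If you want to try the same two values, one way to avoid typing mistakes is to generate a REGEDIT4-style .reg file and review it before importing it. The sketch below is a hedged illustration: the SETTINGS list and the render_reg function are hypothetical names, the key paths and value names are copied exactly from the entries above, and the output file name is made up. Back up the registry before importing anything.

```python
# Key paths and value names copied from the entries listed above.
SETTINGS = [
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Rdr\Parameters",
     "UseWriteBehind", 0),
    (r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Lanmanworkstation\parameters",
     "UtilizeNTCaching", 0),
]


def render_reg(settings):
    """Render (key, value name, DWORD value) tuples as REGEDIT4 text."""
    lines = ["REGEDIT4", ""]
    for key, name, value in settings:
        lines.append(f"[{key}]")
        # REG_DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{value:08x}')
        lines.append("")
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Hypothetical output file name; newline="" keeps the CRLF line endings.
    with open("rdr-caching.reg", "w", newline="") as fh:
        fh.write(render_reg(SETTINGS))
```

Keeping the values in one list also makes it easy to produce a second file that restores the previous settings if the change does not help, as it did not in my case.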

I then started to remove shared directories. A few were unused, so I removed all of them, but there was still no resolution to this perplexing issue.

As a matter of course, I decided to download and install Terminal Server Service Pack 6 on the server to be sure I was up to date. The service pack had been out for at least 6 months at the time, and I was not aware of any major issues with it. The problems continued.

It comes to me

The answer came to me, probably as I was sitting on the toilet: every Windows computer has shares and mappings, and the load they create is aggravated by the number of users on a Terminal Server. The Redirector must work extremely hard to handle all of these requests. One of the first things I did when installing the server was to create what is called a ‘root drive’, or home drive, for all of the users on the server. Microsoft recommends this to make sure each user’s files are stored in one spot (on my server it was the W: drive). Microsoft says: “Use the RootDrv.cmd file to choose a different drive to which the application will write configuration and data files. Consider using the user’s home directory instead.” The RootDrv.cmd batch file asks which drive to create as the user’s root drive and, on login, associates this drive with the user’s home directory.

This is good thinking, since all of a user’s files and configuration details should be stored in a central home directory, and possibly even on a separate drive. If you look closer at usrlogon.cmd (the login script) on the server, it actually uses the subst command to locally map the root drive for the user as he/she logs in.

This plan works well for 5, 10 or even 15 users, but I think at some point there is a breakdown when all users are using the W: drive to access files for every purpose (mail, configuration, local files). All of these operations rely on the Redirector to process file requests that end up on a local disk. Yes, this is inefficient.

My first order of business was to modify all mail profiles. I changed the location of the .PST files from W:\MAIL to D:\HOME\%username%\MAIL and configured all MS Word profiles to use D:\HOME\%username%\My Documents. This produced immediate results, and the rdr errors stopped. As a matter of course, I changed every reference that did not need to be on the W: drive to point to the local drive itself (including the TEMP variable). The W: drive would then be used only for configuration files.
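The rewrite itself is mechanical: any path on the root drive becomes the same path under the user’s folder on the local disk. A minimal Python sketch of that rule follows. The relocate function, the user name jsmith, and the file names are hypothetical; only W:, D:\HOME and the MAIL folder come from my setup. I made the changes by hand in each profile, so treat this as an illustration of the mapping rather than the tool I used.

```python
import ntpath  # Windows path rules, usable on any platform


def relocate(path, username, root_drive="W:", home_base=r"D:\HOME"):
    """Rewrite a root-drive path to the user's folder on the local disk."""
    drive, rest = ntpath.splitdrive(path)
    if drive.upper() != root_drive.upper():
        return path  # not on the root drive, leave it alone
    # Drop the leading backslash so join() keeps the home folder prefix.
    return ntpath.join(home_base, username, rest.lstrip("\\"))


# Hypothetical user and file names.
print(relocate(r"W:\MAIL\mailbox.pst", "jsmith"))
print(relocate(r"C:\TEMP\scratch.tmp", "jsmith"))
```

Paths that are already on a local drive pass through unchanged, which matches the goal of leaving only configuration files on W:.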

What was learned?

While attempting to solve this problem, one key point was hammered home: the Terminal Server 4.0 product is great, but it is not rock-solid. The structure, usage and operation of the operating system have to be controlled and understood in order to react to issues such as the rdr errors.
