AWS 2019 Server WinHTTP Web Proxy Auto-Discovery Service Errors

We have been battling an issue for several months that seems to affect only Amazon AWS servers running Windows Server 2019. So far, 5 of our servers have exhibited this behavior.

So what exactly is the issue? Suddenly the server is no longer able to establish outgoing connections. Websites hosted on the server still function fine unless they make calls to 3rd party web services, you know, for things like email and credit card processing! Any outgoing communication just times out. If you log into the server, open a browser, and navigate to any website, it just hangs: no error, no timeout, it just hangs forever.

When this happens, if you review the Windows event log you will find a 7024 event: “The WinHTTP Web Proxy Auto-Discovery Service service terminated with the following service-specific error: The endpoint mapper database entry could not be created.” Googling this error mainly finds pages where folks are discussing consumer Windows and typical virus issues, which really don’t apply to our situation. I did find a couple of references to AWS, and both of those were also 2019 servers, so I think this is a specific AWS + 2019 issue and doesn’t really have anything to do with WAS.
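If you would rather check for the event from a PowerShell prompt than click through Event Viewer, something like the following should work. This is only a sketch: it assumes the events are logged by the "Service Control Manager" provider in the System log, and the message filter is just a wildcard match on the service name quoted above.

```powershell
# Look for 7024 events in the System log.
# Assumption: they are logged under the "Service Control Manager" provider.
$filter = @{
    LogName      = 'System'
    ProviderName = 'Service Control Manager'
    Id           = 7024
}

# Get-WinEvent raises an error when nothing matches, so suppress it
# and wrap the result in @() to always get an array back.
$crashes = @(Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Where-Object { $_.Message -like '*WinHTTP Web Proxy Auto-Discovery*' })

$crashes | Select-Object TimeCreated, Id, Message | Format-List
```

An empty result just means no matching 7024 event is still in the log.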

What is really strange is that on every server where we have had this issue, the timing is very consistent on that server but varies from server to server. We had one server that would do this exactly every 21 days, another every 6 weeks, and yet another every 30 days. VERY STRANGE!! We tried a number of things to prevent this issue, from turning off automatic proxy detection to forcing some related services to run all the time instead of being set to manual, but nothing helped.
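To see the pattern on your own server, you can print the gap between consecutive 7024 events. This is a rough sketch, and it only works if the System log still holds more than one of them; with the default log size, older events may already have rolled off.

```powershell
# Collect all 7024 events from the System log, oldest first.
$crashes = @(Get-WinEvent -FilterHashtable @{ LogName = 'System'; Id = 7024 } -ErrorAction SilentlyContinue |
    Sort-Object TimeCreated)

# Print how many days passed between each event and the one before it.
for ($i = 1; $i -lt $crashes.Count; $i++) {
    $gap = $crashes[$i].TimeCreated - $crashes[$i - 1].TimeCreated
    '{0:u}  {1:N1} days since previous' -f $crashes[$i].TimeCreated, $gap.TotalDays
}
```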

The reason this has been going on for several months is that there is no way to test a fix other than waiting for the issue to happen again. It has been a very frustrating process. Even more frustrating, we haven’t figured out a way to resolve the issue once it happens: restarting services, resetting the network adapter, etc. Nothing seems to help other than a reboot.

After battling this thing for several months, we have finally given in and taken a brute force approach to the issue. When the 7024 error happens, we reboot the server via a PowerShell script. It’s not an elegant solution, but it sure is better than finding out that your credit card processing has been down for a week!

So if you are having this issue with one of your servers, here are the steps to apply the fix we are using.

Run PowerShell as administrator and execute the following command

Get-ExecutionPolicy

Make sure it’s Unrestricted. If it is not, change it with Set-ExecutionPolicy Unrestricted. Then execute the following command, which lets our script add custom events to the event log so we can see when the server reboots because of this error.

New-EventLog -LogName "Application" -Source "Proxy Crash"
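New-EventLog raises an error if the source is already registered, so if you are repeating these steps (or scripting them for several servers) a guarded version may be handy. This is a sketch that assumes an elevated Windows PowerShell session, which SourceExists needs in order to search all logs.

```powershell
# Show the effective execution policy for this session.
Get-ExecutionPolicy

# Register the "Proxy Crash" source only if it does not exist yet.
if (-not [System.Diagnostics.EventLog]::SourceExists('Proxy Crash')) {
    New-EventLog -LogName 'Application' -Source 'Proxy Crash'
}
```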

Now save the following code as ProxyCrash.ps1. I placed this in a folder (C:\ProxyCrash). You can place it where you like, just be sure to make the appropriate adjustments in the following steps.

$LogName = 'System'                        # Event log to search
$EventIDFilter = 13                        # EventID to filter by; 13 is Shutdown
$AfterDate = (Get-Date).AddMinutes(-30)    # How far back to look (30 minutes)

# Find shutdown events in $LogName logged after $AfterDate
$Events = Get-EventLog -LogName $LogName -After $AfterDate | Where-Object { $_.EventID -eq $EventIDFilter }

# Reboot only if at most one shutdown event was found in that window
IF ( $Events.count -le 1 ) {

    Write-EventLog -LogName "Application" -Source "Proxy Crash" -EventID 9977 -EntryType Error -Message "The service 7024 failed.  Rebooting Server Now."
    Restart-Computer -Force

} ELSE {

    Write-EventLog -LogName "Application" -Source "Proxy Crash" -EventID 9978 -EntryType Error -Message "The service 7024 failed, and reboot has occurred in the past 30 minutes.  Stopping reboot."

}

So what exactly is the above doing? Well first let me say I am not a PowerShell master, but fortunately wxPerts does have access to one, and he wrote the script for us.

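One of our fears was that if something else went wrong, and the 7024 error wasn’t caused by this issue, the server could get itself into an infinite reboot loop. So the script first looks for shutdown events (EventID 13) in the last 30 minutes (specified in $AfterDate). If the server has already been rebooting in that window, it just makes an entry in the event log (EventID 9978) but doesn’t reboot the server. Otherwise it makes an entry in the event log (EventID 9977) and reboots the server. Note that because the check is -le 1, a single shutdown event in the window still allows the reboot; it takes more than one to stop it.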
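If you want to try the loop guard without actually rebooting anything, here is a dry-run sketch. Test-RebootAllowed and its MaxAllowed parameter are hypothetical names made up for this example, not part of the script above. Setting MaxAllowed to 1 mirrors the -le 1 check, while the default of 0 is a stricter variant that skips the reboot after any recent shutdown.

```powershell
# Hypothetical helper: decides whether a reboot would be allowed.
function Test-RebootAllowed {
    param(
        [int]$RecentShutdownCount,   # shutdown events seen in the look-back window
        [int]$MaxAllowed = 0         # 0 = skip the reboot after any recent shutdown
    )
    return ($RecentShutdownCount -le $MaxAllowed)
}

$window = (Get-Date).AddMinutes(-30)

# @() forces an array so .Count is reliable for zero, one, or many results.
$recent = @(Get-EventLog -LogName 'System' -After $window -ErrorAction SilentlyContinue |
    Where-Object { $_.EventID -eq 13 })

# Dry run: only report the decision, never reboot.
if (Test-RebootAllowed -RecentShutdownCount $recent.Count) {
    'Would reboot'
} else {
    'Would skip reboot'
}
```

Get-EventLog is only available in Windows PowerShell (the version that ships with Server 2019), not in PowerShell 7.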

Now we need to create a scheduled task that will be triggered anytime the 7024 error occurs. So run Event Viewer and search for EventID 7024.

Right click on one of the events and select “Attach Task to This event” which starts a wizard.

Name the task something that makes sense to you, like “Reboot if Proxy Crash”

On the next screen there is nothing to do

On the next screen we want to Start a program

And that program will be Powershell.exe and the argument will be the script we created earlier. (C:\ProxyCrash\ProxyCrash.ps1)

We need to make some adjustments to the task once it is created. Notice the checkbox on the finish window that will open the properties for the task. If you forget to check it, no worries, just go into the Task Scheduler and edit the properties there.

We want to change the task so that it runs whether or not anyone is logged in (“Run whether user is logged on or not”), and we want it to run with the highest privileges (“Run with highest privileges”) so it is able to reboot the server.

Note that it defaults to the Administrator login, since that was what I was logged in with, but it will require you to reenter the password to set up the task to use that login.
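If you have several servers to set up, the wizard steps above can probably be replaced with a single schtasks.exe call from an elevated prompt. Treat this as a sketch, not what we actually ran: it assumes the script lives at C:\ProxyCrash\ProxyCrash.ps1, and it runs the task as the SYSTEM account instead of the Administrator login, which avoids storing a password but is a different choice from the one described above.

```powershell
# Create an event-triggered task equivalent to "Attach Task To This Event".
# /SC ONEVENT with /EC and /MO fires the task when an event matching the
# XPath query (EventID 7024) is written to the System log.
# /RU SYSTEM is an assumption; the steps above used the Administrator login.
schtasks.exe /Create /TN "Reboot if Proxy Crash" `
    /TR "powershell.exe -NoProfile -File C:\ProxyCrash\ProxyCrash.ps1" `
    /SC ONEVENT /EC System /MO "*[System[(EventID=7024)]]" `
    /RU SYSTEM /RL HIGHEST
```

Tasks that run as SYSTEM run whether or not anyone is logged in, and /RL HIGHEST corresponds to the highest privileges setting.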

To test that everything is good, open the Task Scheduler, find our new task under Event Viewer Tasks, right click on it, and select Run, at which point it should reboot your server. If it doesn’t, you missed one of the steps above. PowerShell can be quite finicky, so check everything closely, especially the script path and the name of the event source.
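If the test run doesn’t reboot the server, the task’s last result code can help narrow things down. A small sketch, assuming the task was named “Reboot if Proxy Crash” as suggested earlier:

```powershell
# Show when the task last ran and what it returned.
# A LastTaskResult of 0 means the last run completed successfully.
Get-ScheduledTask -TaskName 'Reboot if Proxy Crash' |
    Get-ScheduledTaskInfo |
    Select-Object TaskName, LastRunTime, LastTaskResult, NumberOfMissedRuns
```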

At this point we now have a task that will reboot the server anytime the 7024 error occurs. This is what I call treating the symptom instead of the disease, but unfortunately in this case we never could find a cure for the disease.

The last thing we want to do is create a couple of custom views in the Event viewer so we can monitor the activity of this script and the 7024 errors.

Open Event Viewer, right click and choose “Create Custom View”

We want to do By Log and select the Windows Logs

And we want to specifically filter on the two custom EventIDs from our script 9977 and 9978

Name the custom view something that makes sense to you like “Proxy Crash Reboots”

I also created a custom view to see the 7024 errors themselves.
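The same information is available from PowerShell if you prefer that to a custom view. This sketch reads the most recent entries written under the “Proxy Crash” source; the limit of 20 is an arbitrary choice.

```powershell
# List the latest entries our script wrote to the Application log.
# 9977 = rebooted, 9978 = reboot skipped.
# Get-EventLog raises an error when the source has no entries, so suppress it.
Get-EventLog -LogName 'Application' -Source 'Proxy Crash' -Newest 20 -ErrorAction SilentlyContinue |
    Select-Object TimeGenerated, EventID, EntryType, Message
```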

Hopefully this helps someone else that has been battling this same issue and if we ever find a cure for the actual disease we will be sure to update everyone.

7 thoughts on “AWS 2019 Server WinHTTP Web Proxy Auto-Discovery Service Errors”

  1. I’m running into this issue, and I’m going to give your solution a try. I noticed in your script you use IF ( $Events.count -le 1 ) instead of IF ($Events.count -lt 1 ) and I was wondering what the justification for -le over -lt was?

  2. Hello, I have this issue too. I had the RDP port open to the world, which I think is a very bad idea on windows systems. I left that port open just for known IPs for now. Do you happen to have that port also opened to the world?

    1. No, it is never a good idea to have the RDP port open to the world. What happens is the hackers cruise the AWS IP block constantly, and when they find an open RDP port they start banging away trying to get access. Eventually, your server will be so busy rejecting login attempts that it can’t do anything else.

      What we do is allow RDP only for specific IPs in the Security Group for our EC2 servers. This can be a pain if your IP changes often, but if you are on an internet service such as cable, generally your IP only changes when the router is rebooted, and even then it’s within a range. So Step 1 is to find out what your IP is (not your local one, but what the rest of the world sees your IP as); https://whatismyipaddress.com/ is good for this. Now use a CIDR calculator to get the entire range; https://www.ipaddressguide.com/cidr#range is good for that. So say your IP is 171.84.148.134: you put the from and to range in the calculator as 171.84.148.0 and 171.84.148.255 and it gives you 171.84.148.0/24. You put that in the RDP entry for your AWS security group, and now as long as only the last section of your IP changes, you are good. And you only have to worry about hackers in your same neighborhood (which usually isn’t the issue). If the last 2 sections of your IP change, then you can do the same thing with the calculator, but it will likely give you a few entries to make.

      The only challenge you then have is if you travel. If I need to log into one of our servers while traveling, I will just log in to AWS, and when you add an entry you will notice it has an option for MYIP, I use that, it automatically fills in my current IP and then I am good. Once I am done, I delete the entry from my security group.
