AD Replication Troubleshooting Steps (DNS)

 

Quick Checks:

-Confirm that network browsing is working and that the server's network profile (and with it the firewall profile) has not changed to Public.

-Use Event Viewer to get more detail. It is the easiest quick resource, but note the timeline of the errors. Many times you will see errors logged while the server is booting up and initializing. Ignore those and pay attention to the logs written after boot up completes (i.e. don’t chase your tail).

-Most alerts are generic in their details/description. Make sure you are addressing the ones that matter for your current AD forest and domain functional level. If in doubt, escalate to someone who knows. You can melt down your domain by following bad online advice.

-Check the DNS settings on the network interface and point your DC to another DC first, then to itself second. There are different views on this, but I do so to avoid DNS isolation.

-Know your FSMO role holders. If you don’t know what FSMO role holders are, you should not be learning on production. The quickest way to check them is:

NETDOM /query FSMO

-If you need to transfer FSMO roles, see my FSMO posts.
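The DNS ordering from the quick checks above (another DC first, then the DC itself) can be written down as a small helper. This is only a sketch of the ordering rule: the function name `preferred_dns_order` and the addresses in the example are hypothetical, and it assumes the DC lists its own IP address rather than a loopback address.

```python
def preferred_dns_order(own_ip, other_dc_ips):
    """Return DNS servers ordered as: other DCs first, this DC last."""
    seen = set()
    others = []
    for ip in other_dc_ips:
        ip = ip.strip()
        # Skip blanks, duplicates, and this DC's own address.
        if ip and ip != own_ip and ip not in seen:
            seen.add(ip)
            others.append(ip)
    # The DC itself goes last, so it always asks a partner DC first.
    return others + [own_ip]


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    print(preferred_dns_order("192.168.1.10", ["192.168.1.11"]))
```

The result is only the order to type into the interface's DNS settings; it does not change anything on the server.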

Open a Command Prompt as an administrator:

repadmin.exe /showreps

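- This shows each DC's inbound replication partners and the result of the last replication attempt with each one (newer versions of repadmin name this switch /showrepl)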

repadmin.exe /replsummary

- This will give you a DC replication summary

On DCs, run “dcdiag” for a quick, searchable summary.

- The following command exports detailed diagnostics for every DC to the path specified

DCDIAG /e /v /c /f:C:\DCDiagReport.txt

-You can then review the report for errors and more detail. The report takes a few minutes to run, and longer with more DCs and/or errors/retries.
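Reviewing a long verbose report by eye is slow, so a first pass can be scripted. The sketch below pulls out lines that mention a failed test or an error. The function names and the marker strings are my own choices, and the file encoding is an assumption, so adjust both to what your report actually contains.

```python
def failed_test_lines(lines, markers=("failed test", "error")):
    """Return the stripped lines that contain any marker (case-insensitive)."""
    hits = []
    for line in lines:
        text = line.strip()
        if text and any(marker in text.lower() for marker in markers):
            hits.append(text)
    return hits


def scan_report(path, encoding="utf-8"):
    """Scan a saved dcdiag report; the encoding default is an assumption."""
    # errors="replace" keeps the scan going if the file has unexpected bytes.
    with open(path, encoding=encoding, errors="replace") as report:
        return failed_test_lines(report)
```

For example, `scan_report(r"C:\DCDiagReport.txt")` would return the suspicious lines from the report exported above. Treat the result as a starting point and read the surrounding lines in the full report before acting on anything.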

 

The following is a recent real AD replication issue I troubleshot:

repadmin.exe /replsummary was reporting error 1722 ("The RPC server is unavailable"):



This is a generic error that can be caused by many different communication problems.

In this case I confirmed that the RPC services were running and set to Automatic, as expected.

Based on the errors I saw in DCDiag, I ran a quick DNS-specific test:

DCDIAG /TEST:DNS

-Be patient with this test; it takes a few minutes.

On one DC it reported no errors. On the other server it took longer but was filled with errors. One in particular caught my attention:



It caught my attention because 192.168.1.22 in this case is NOT a current DC or DNS server.

I then confirmed that the troubled server's network interface had the wrong DNS address.

That was causing its DNS queries to go nowhere, which in turn broke replication.

After changing it to a correct DNS server, all errors cleared and replication started working perfectly.
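The check that found the problem, comparing the DNS servers configured on the interface against the servers that really are DCs/DNS servers, is easy to express in code. This is a minimal sketch: the function name `unknown_dns_servers` is hypothetical, and apart from 192.168.1.22 (the stale address from this incident) the addresses are made up for illustration.

```python
def unknown_dns_servers(configured, known_dns_servers):
    """Return configured DNS addresses that are not known DC/DNS servers."""
    known = {ip.strip() for ip in known_dns_servers}
    return [ip.strip() for ip in configured if ip.strip() not in known]


if __name__ == "__main__":
    # 192.168.1.22 is the stale address from the write-up;
    # the other addresses are hypothetical.
    known = ["192.168.1.10", "192.168.1.11"]
    configured = ["192.168.1.22", "192.168.1.10"]
    print(unknown_dns_servers(configured, known))
```

Any address this returns is one the server is querying that no current DC or DNS server answers on, which is the situation that broke replication here.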

Final Note to remember:

IT'S DNS! Even when it can't be DNS, IT'S DNS  
