Fix Cluster Poisoned Error 0xC0130017

Quick answer: From a healthy node, force-evict the poisoned node with Remove-ClusterNode -Name <NodeName> -Force, clear its stale cluster state (Clear-ClusterNode -Name <NodeName> -Force), then rejoin it with Add-ClusterNode.

What “Cluster Poisoned” Actually Means

You see STATUS_CLUSTER_POISONED (0xC0130017) when a cluster node thinks it’s been excluded from the cluster consensus. This happens after a network partition, a quorum loss event, or an improper shutdown. The node still has its old view of the cluster, but the other members have moved on without it. The cluster service refuses to let it participate because it’s seen as a “poison” node — one that might have stale data or could cause split-brain if it re-joins.

The culprit here is almost always a node that lost connectivity during quorum arbitration. For example, a 5-node cluster needs 3 votes for a majority. If 2 nodes go down, the remaining 3 still hold quorum, and if one of those then drops, dynamic quorum may adjust the vote count so the survivors keep running. When a down node comes back, it brings an outdated cluster database. The cluster service marks it as poisoned to protect integrity.
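To make the vote counting concrete, here is a minimal sketch of the majority rule for a fixed number of votes. The function names Get-QuorumMajority and Test-HasQuorum are hypothetical, and the sketch deliberately ignores dynamic quorum and witness votes, both of which change the numbers on a real cluster.

```powershell
# Hypothetical helper: votes needed for a majority out of a fixed total.
function Get-QuorumMajority {
    param([Parameter(Mandatory)][int]$TotalVotes)
    [int][math]::Floor([double]$TotalVotes / 2) + 1
}

# Hypothetical helper: does a partition holding $VotesHeld votes keep quorum?
function Test-HasQuorum {
    param(
        [Parameter(Mandatory)][int]$VotesHeld,
        [Parameter(Mandatory)][int]$TotalVotes
    )
    $VotesHeld -ge (Get-QuorumMajority -TotalVotes $TotalVotes)
}

Get-QuorumMajority -TotalVotes 5              # 3
Test-HasQuorum -VotesHeld 3 -TotalVotes 5     # True
Test-HasQuorum -VotesHeld 2 -TotalVotes 5     # False
```

A partition that fails this check stops hosting resources, which is the point at which the rest of the cluster can move on without it.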

Fix Steps

Step 1: Check Cluster Status from a Healthy Node

Log into a node that’s still running normally. Open PowerShell as Administrator and run:

Get-ClusterNode

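Look for the poisoned node. It usually shows a state of “Down” (occasionally “Paused”). The default output lists little beyond name and state, so confirm the poison status in the System event log or the cluster log if you are unsure. Note its name.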
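On a larger cluster it can help to filter the list down to nodes that are not “Up”. The sketch below assumes the FailoverClusters module is available on the healthy node; Select-UnhealthyNode is a hypothetical helper name, not a built-in cmdlet.

```powershell
# Hypothetical helper: pass through only node objects whose State is not 'Up'.
function Select-UnhealthyNode {
    param([Parameter(ValueFromPipeline)]$Node)
    process {
        # State is an enum on real node objects; compare its string form.
        if ("$($Node.State)" -ne 'Up') { $Node }
    }
}

# Usage on a healthy cluster node:
# Get-ClusterNode | Select-UnhealthyNode | Format-Table Name, State
```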

Step 2: Evict the Poisoned Node

From the healthy node, force evict the poisoned one:

Remove-ClusterNode -Name <PoisonedNodeName> -Force

The -Force flag suppresses the confirmation prompt. Without it, the cmdlet stops and waits for you to confirm the eviction.

Step 3: Clean the Node’s Local Cluster State

Now go to the poisoned node itself. Stop the cluster service:

Stop-Service -Name ClusSvc

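Wipe the local cluster database. This is stored in the registry under HKLM\Cluster. The supported way to clear it is to run Clear-ClusterNode -Force on the node itself; the manual wipe below is the blunt fallback if that fails: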

Remove-Item -Path HKLM:\Cluster -Recurse -Force

Delete the cluster logs too — they can hold stale references:

Remove-Item -Path C:\Windows\Cluster\*.log -Force

Then restart the node.
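Because the two Remove-Item commands in this step are destructive, it may be worth saving a copy of what they delete before running them. This is a minimal sketch: Backup-ClusterState and the C:\ClusterBackup folder are assumptions for illustration, and the hive is only exported if HKLM:\Cluster is actually present at the time.

```powershell
# Hypothetical helper: copy the cluster hive and logs aside before wiping them.
function Backup-ClusterState {
    param([string]$BackupDir = 'C:\ClusterBackup')   # assumed location

    New-Item -ItemType Directory -Path $BackupDir -Force | Out-Null

    # Export the registry hive to a .reg file if it is currently loaded.
    if (Test-Path -Path 'HKLM:\Cluster') {
        reg.exe export 'HKLM\Cluster' (Join-Path $BackupDir 'cluster-hive.reg') /y
    }

    # Copy the log files that the cleanup step removes.
    Copy-Item -Path 'C:\Windows\Cluster\*.log' -Destination $BackupDir -ErrorAction SilentlyContinue
}

# Usage on the poisoned node, before the Remove-Item commands:
# Backup-ClusterState -BackupDir 'C:\ClusterBackup'
```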

Step 4: Rejoin the Node to the Cluster

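Once the node is back up, add it to the cluster. Add-ClusterNode configures and starts the cluster service on the node, so there is normally no need to start ClusSvc by hand (after the wipe, a manual start may fail because there is no configuration to load):

Add-ClusterNode -Name <NodeName> -Cluster <ClusterName>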

If it complains about the node already existing, go back to Step 2 and evict again. You might need to also run Clear-ClusterNode from the healthy node first.

Step 5: Validate and Test

Check the cluster status:

Get-ClusterNode | Format-Table Name, State

All nodes should show “Up”. Run a test failover to make sure resources move correctly:

Move-ClusterGroup -Name <GroupName> -Node <TargetNode>
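If you would rather not leave the group on the test node, the move can be wrapped so it records the original owner and moves the group back afterwards. This is a sketch only: Test-ClusterFailover is a hypothetical wrapper around the real Get-ClusterGroup and Move-ClusterGroup cmdlets, and it does no error handling.

```powershell
# Hypothetical wrapper: move a group to a target node, then back to its original owner.
function Test-ClusterFailover {
    param(
        [Parameter(Mandatory)][string]$GroupName,
        [Parameter(Mandatory)][string]$TargetNode
    )

    # Remember where the group lives before the test.
    $original = (Get-ClusterGroup -Name $GroupName).OwnerNode.Name

    Move-ClusterGroup -Name $GroupName -Node $TargetNode | Out-Null
    $afterMove = (Get-ClusterGroup -Name $GroupName).OwnerNode.Name

    # Put the group back on its original owner.
    Move-ClusterGroup -Name $GroupName -Node $original | Out-Null

    [pscustomobject]@{
        Group     = $GroupName
        Original  = $original
        MovedTo   = $afterMove
        Succeeded = ($afterMove -eq $TargetNode)
    }
}

# Usage:
# Test-ClusterFailover -GroupName '<GroupName>' -TargetNode '<TargetNode>'
```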

Alternative Fixes if the Main One Fails

Sometimes Remove-ClusterNode won’t work because the cluster thinks the poisoned node is critical. Try these:

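Reset the machine account password: On the poisoned node, run netdom resetpwd /s:<DomainDC> /ud:<Domain>\<Admin> /pd:* to repair its domain trust, then retry. This does not touch quorum; it only helps when a broken secure channel is what blocks the eviction or rejoin.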
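Use Cluster.exe (legacy): cluster node <NodeName> /forcecleanup. Cluster.exe is deprecated and is only present if the Failover Cluster Command Interface feature is installed. It is less reliable on Server 2016+ but still works in a pinch.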
Rebuild the node entirely: If the registry cleanup fails repeatedly, remove the node from the domain, clean install Windows, and rejoin. This is overkill but the most reliable fallback.
Check for firewall issues: Ensure ports 3343 (UDP/TCP for cluster traffic) and 445 (SMB for cluster database) are open between nodes.
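The port check in the last item can be scripted with Test-NetConnection. The sketch below is an assumption-laden helper (Test-ClusterPort is a hypothetical name) and it only probes TCP, so it says nothing about UDP 3343.

```powershell
# Hypothetical helper: TCP reachability check for the cluster ports named above.
function Test-ClusterPort {
    param(
        [Parameter(Mandatory)][string[]]$ComputerName,
        [int[]]$Port = @(3343, 445)
    )
    foreach ($computer in $ComputerName) {
        foreach ($p in $Port) {
            $result = Test-NetConnection -ComputerName $computer -Port $p -WarningAction SilentlyContinue
            [pscustomobject]@{
                Computer = $computer
                Port     = $p
                TcpOpen  = $result.TcpTestSucceeded
            }
        }
    }
}

# Usage from one node against the others:
# Test-ClusterPort -ComputerName '<NodeA>', '<NodeB>' | Format-Table
```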

Prevention Tips

You’re far less likely to see this error on a properly configured cluster. Here’s what I do after fixing this:

Set a witness: file share or cloud witness. A 2-node cluster without a witness is asking for this exact problem.
Use dynamic quorum: it’s on by default in Server 2012 and later. Check with (Get-Cluster).DynamicQuorum; a value of 1 means it is enabled.
Keep network separation: at least two separate networks for cluster traffic (one heartbeat, one client). If one fails, the other keeps arbitration alive.
Never force-shut a cluster node: use graceful shutdowns. An unexpected power loss is a common trigger for cluster poisoning.
Monitor cluster logs daily: smaller issues like “could not reach node” warnings often precede a full poison event. Catch them early.
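Several of these checks can be pulled into one quick summary. The sketch below assumes the FailoverClusters module and uses a hypothetical function name, Get-QuorumSummary; the commented commands after it use the real Get-ClusterNode and Get-ClusterLog cmdlets, with C:\Temp as an assumed output folder.

```powershell
# Hypothetical helper: one-line view of the quorum settings discussed above.
function Get-QuorumSummary {
    $cluster = Get-Cluster
    $quorum  = Get-ClusterQuorum
    [pscustomobject]@{
        Cluster       = $cluster.Name
        DynamicQuorum = [bool]$cluster.DynamicQuorum   # 1 = enabled, 0 = disabled
        QuorumType    = $quorum.QuorumType
        Witness       = $quorum.QuorumResource
    }
}

# Usage on any cluster node:
# Get-QuorumSummary
# Per-node votes as the cluster currently counts them:
# Get-ClusterNode | Format-Table Name, State, NodeWeight, DynamicWeight
# Collect the last 60 minutes of cluster logs for review:
# Get-ClusterLog -Destination 'C:\Temp' -TimeSpan 60
```

An empty Witness value in the summary is the cue to configure one, for example with Set-ClusterQuorum.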

One last thing: don’t bother with a full cluster rebuild unless you’ve tried the steps above on all nodes. I’ve seen teams panic and rebuild a 16-node cluster over this when a simple registry wipe fixed it in 10 minutes.