This post is for anyone who administers a Juniper SSL VPN. I saw an issue in our environment recently that was created by an unexpected interaction between two different systems that were working to enforce our computer security policy. Because the way the systems were configured is pretty common and because the issue is not specifically warned against by Juniper, I'm going to share it here.

In hindsight it makes perfect sense that the configuration I'm about to describe turned into a problem. What was interesting (at least to me) is that each setting on its own was harmless and, in fact, best practice. Here's the breakdown:

  • A role is configured on the VPN with a Network Connect resource attached to it
  • A host checker policy that requires the Windows firewall be turned on is bound to the role
  • Host checker dynamic policy evaluation is enabled
  • An Active Directory Group Policy is in place which forces the Windows firewall on when domain computers are not connected to the corporate network and off when connected to the corporate network

Here's how the host checker rule and the GPO ended up working against each other. Microsoft TechNet tells us that Windows decides whether it is connected to the corporate network based on whether it can reach a domain controller.

The "domain network" location type is detected when the local computer is a member of an Active Directory domain, and the local computer can authenticate to a domain controller for that domain through one of its network connections. - http://technet.microsoft.com/en-us/library/cc753545%28WS.10%29.aspx

In other words, if the computer can reach a domain controller, the firewall will be forced off by our GPO. And note, it doesn't just turn the firewall off on the interface that's connected to the domain network, it turns it off globally.

Here is what happened to our users. They would connect to the VPN from somewhere on the Internet, and at this point their firewalls would be on. After they logged in and passed the host checker firewall policy, Network Connect would initialize and create a tunnel back to corporate. Of course, our Network Connect policy allows connectivity to the domain controllers, so Windows immediately believes it's connected to the domain network... and turns the firewall off. Begin chain reaction.

Dynamic policy evaluation on the VPN sees the firewall turn off and strips the user of the role which grants them Network Connect ("Staff Full"). As soon as the role is stripped, the VPN tears down the Network Connect tunnel.

jknight(User Realm)[Staff, Staff Full] - Roles for user jknight on host 174.4.101.156 changed from <Staff, Staff Full> to <Staff> during policy reevaluation.
jknight(User Realm)[Staff, Staff Full] - Host Checker policy 'Corp - FW Test' failed on host 174.4.101.156  for user 'jknight'. Reason: 'Microsoft Windows Firewall  does not comply with policy. Compliance requires firewall to be turned on.'.
jknight(User Realm)[Staff] - Closed connection to TUN-VPN port 443 after 14 seconds, with 17475 bytes read (in 68 chunks) and 23060 bytes written (in 74 chunks)
jknight(User Realm)[Staff] - Network Connect: Session ended for user with IP 192.168.8.205

After a period, Windows realizes it cannot communicate with any domain controllers, decides it's not connected to the domain network, and turns the firewall back on.

jknight(User Realm)[Staff] - Host Checker policy 'Corp - FW Test' passed on host 174.4.101.156  for user 'jknight'.
jknight(User Realm)[Staff] - Roles for user jknight on host 174.4.101.156 changed from <Staff> to <Staff,Staff Full> during policy reevaluation.
jknight(User Realm)[Staff, Staff Full] - Network Connect: Session started for user with IP 192.168.8.205

The cycle now repeats endlessly.
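
The feedback loop can be sketched as a tiny state machine. This is a simplified model, not anything taken from the VPN or from Windows: the function name `simulate`, the `host_check_enabled` flag, and the idea of discrete "ticks" are all assumptions made for illustration, and the real timing is uneven (Windows takes a while to notice the domain controllers are gone).

```python
def simulate(ticks: int, host_check_enabled: bool = True) -> list[str]:
    """Model the host checker / GPO interaction for a number of evaluations.

    Hypothetical, simplified model: one tick is one dynamic policy
    evaluation on the VPN followed by one network location check on Windows.
    """
    firewall_on = True   # the user starts out on the Internet, firewall on
    events = []
    for _ in range(ticks):
        # VPN side: the role carrying Network Connect is granted only while
        # the firewall is on (or always, if the firewall rule is removed).
        tunnel_up = firewall_on if host_check_enabled else True
        events.append("tunnel up" if tunnel_up else "tunnel down")

        # Windows side: a domain controller is reachable only through the
        # tunnel, and the GPO forces the firewall off on the domain network.
        domain_controller_reachable = tunnel_up
        firewall_on = not domain_controller_reachable
    return events
```

With the firewall rule in place, `simulate(4)` alternates between "tunnel up" and "tunnel down" forever. Calling it with `host_check_enabled=False` keeps the tunnel up on every tick, since nothing on the VPN side reacts to the firewall state any more.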

From the user's point of view, their Network Connect tunnel only manages to pass traffic for a few seconds before being torn down for the first time. After that it begins to cycle up and down so fast that it doesn't pass any traffic. While this is happening, the user access log on the VPN is logging multiple messages like this every second:

jknight(User Realm)[Staff] - Request to connect to JKNIGHT_LAPTOP port 443 permission denied

This log message was a telltale sign that showed up every time a user was experiencing this issue.
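
If you want to check an exported user access log for this pattern, something like the following could be a starting point. It is only a sketch: the regular expressions assume lines shaped exactly like the excerpts in this post (real exports usually carry timestamps and other fields in front, so the patterns would need adjusting), and the function name `flapping_users` and the threshold of two role changes are my own assumptions.

```python
import re
from collections import Counter

# Patterns written against the log excerpts shown in this post.
# Assumption: each line starts with "user(realm)[roles] - message".
PREFIX = r"^(?P<user>[^(\s]+)\(.*?\)\[.*?\] - "
ROLE_CHANGE = re.compile(
    PREFIX + r"Roles for user \S+ on host \S+ changed from <[^>]*> to <[^>]*>"
)
DENIED = re.compile(
    PREFIX + r"Request to connect to \S+ port \d+ permission denied"
)


def flapping_users(lines, min_role_changes=2):
    """Return {user: (role_changes, denied_requests)} for suspect users.

    A user is reported once their roles changed at least
    `min_role_changes` times during policy reevaluation.
    """
    role_changes = Counter()
    denied = Counter()
    for line in lines:
        line = line.strip()
        match = ROLE_CHANGE.match(line)
        if match:
            role_changes[match["user"]] += 1
            continue
        match = DENIED.match(line)
        if match:
            denied[match["user"]] += 1
    return {
        user: (count, denied[user])
        for user, count in role_changes.items()
        if count >= min_role_changes
    }
```

Feeding it the lines of a log file (for example `flapping_users(open(path))`, where `path` is wherever you saved the export) gives a per-user count of role changes and denied requests, which makes a cycling session easy to spot.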

For us, the solution was to remove the host checker firewall policy. The GPO which forces the firewall on when computers leave the office is critical for ensuring our computers comply with our security policy, so removing or changing it wasn't an option. The risk of having the firewall turned off while a computer is in a Network Connect session is largely mitigated by disabling split tunneling on all Network Connect sessions. This prevents two-way communication between the computer and any host on the Internet other than the VPN host. So in the end, the host checker policy is gone and Network Connect is now stable for all users.

Hopefully this post helps others avoid the same misconfiguration.