Understanding EZVPN Authentication

Introduction

I have been learning EZVPN lately, and there is one statement that most books repeat without really explaining: “If you are using pre-shared key authentication with EZVPN, it uses aggressive mode for IKE Phase 1. If you are using digital certificates, it uses main mode.” That is all well and good, but I needed to understand WHY this is the case. I find that truly understanding the why of a technology is ultimately the key to getting really good at it. So, let’s dig into that concept.

IPSEC negotiation basics

IPSEC negotiations consist of two phases, known as phase 1 and phase 2. Phase 1 is mainly used for authentication and to create a protected bidirectional tunnel (the ISAKMP SA) for the phase 2 negotiations. At the end of phase 1, both peers have authenticated each other and we have a secured tunnel in which to do the phase 2 negotiations. Phase 2 negotiates the parameters for the actual secure tunnel used for data flow between the peers. At the end of phase 2 we have two unidirectional IPSEC SAs to protect the data flow between the peers.

The purpose of this article is not to cover IPSEC authentication in detail, but we should understand the basics of how it works. When two peers want to establish an IPSEC connection to one another, they need to authenticate each other somehow. After all, you want to know that the peer you are talking to really is who you think it is. We can do this by configuring a pre-shared key on both peers. Alternatively, we can use RSA key pairs or digital certificates.

Phase 1 can happen in either main mode (MM) or aggressive mode (AM). The main difference is that main mode uses six messages between the peers, while aggressive mode uses only three. Aggressive mode is therefore faster, but the speed comes at the price of security. The reason is that in aggressive mode the IKE ID that identifies each peer is sent unencrypted in the first two messages, whereas in main mode the IKE ID is not sent until messages 5 and 6, which are encrypted. This is a key point to remember for this discussion.
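To make the difference concrete, here is a rough sketch of the two exchanges for pre-shared key authentication. The payload names (SA, KE, Ni, IDii and so on) follow the notation in RFC 2409; the IkeMessage type and the first_initiator_id helper are names I made up for illustration, not anything from a real IKE implementation.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class IkeMessage(NamedTuple):
    name: str
    direction: str
    payloads: Tuple[str, ...]
    encrypted: bool


I2R = "initiator -> responder"
R2I = "responder -> initiator"

# Main mode with pre-shared keys: six messages, identities only in MM5/MM6.
MAIN_MODE = (
    IkeMessage("MM1", I2R, ("SA",), False),
    IkeMessage("MM2", R2I, ("SA",), False),
    IkeMessage("MM3", I2R, ("KE", "Ni"), False),
    IkeMessage("MM4", R2I, ("KE", "Nr"), False),
    IkeMessage("MM5", I2R, ("IDii", "HASH_I"), True),
    IkeMessage("MM6", R2I, ("IDir", "HASH_R"), True),
)

# Aggressive mode with pre-shared keys: three messages, identities up front.
# AM3 is modeled as cleartext here, as in the RFC 2409 message diagrams.
AGGRESSIVE_MODE = (
    IkeMessage("AM1", I2R, ("SA", "KE", "Ni", "IDii"), False),
    IkeMessage("AM2", R2I, ("SA", "KE", "Nr", "IDir", "HASH_R"), False),
    IkeMessage("AM3", I2R, ("HASH_I",), False),
)


def first_initiator_id(exchange: Sequence[IkeMessage]) -> Optional[IkeMessage]:
    """Return the first message that carries the initiator's IKE ID."""
    for message in exchange:
        if "IDii" in message.payloads:
            return message
    return None


if __name__ == "__main__":
    for label, exchange in (("main", MAIN_MODE), ("aggressive", AGGRESSIVE_MODE)):
        msg = first_initiator_id(exchange)
        state = "encrypted" if msg.encrypted else "in the clear"
        print(f"{label} mode: {len(exchange)} messages, initiator ID in {msg.name} ({state})")
```

The thing to notice is where IDii shows up: in the very first message for aggressive mode, and not until the fifth (encrypted) message for main mode.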

EZVPN PSK Authentication

So, why then do we HAVE to use IKE aggressive mode when using PSK for authentication with EZVPN? It has to do with the specifics of the IPSEC phase 1 negotiations. In short, the PSK is used in the calculation of the session keys in the middle of phase 1, after MM4 but before MM5.

With a normal L2L IPSEC tunnel using PSK authentication, how do we tie a specific PSK to a specific peer? We usually tie the PSK to an IP address or hostname using the crypto isakmp key command or a crypto keyring. So, when router A initiates a connection to router B, router B knows who it is dealing with right away based on the source IP address. Based on that, router B can find the pre-shared key it has for that specific peer. It can use that pre-shared key prior to MM5 to calculate the session keys used to actually protect the MM5 and MM6 messages.
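Conceptually, the L2L lookup on router B is nothing more than a table keyed by the peer address. Here is a tiny sketch of that idea. The addresses, keys and the psk_for_peer function are all made up for illustration; this is not how IOS stores its keys internally.

```python
from typing import Dict, Optional

# Hypothetical L2L key table on router B: peer IP address -> pre-shared key.
# The addresses come from the documentation ranges; the keys are placeholders.
L2L_KEYS: Dict[str, str] = {
    "192.0.2.1": "psk-for-router-a",
    "198.51.100.7": "psk-for-router-c",
}


def psk_for_peer(source_ip: str) -> Optional[str]:
    """Find the PSK using only the source IP address of the first IKE packet."""
    return L2L_KEYS.get(source_ip)


if __name__ == "__main__":
    # Router B can do this lookup as soon as MM1 arrives, long before MM5.
    print(psk_for_peer("192.0.2.1"))
    print(psk_for_peer("203.0.113.50"))  # unknown peer, no key configured
```

Because the source IP address is available from the very first packet, the responder has the key in hand well before it is needed.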

With EZVPN, we configure the PSK under the EZVPN group configuration, and it is not tied to any specific IP address or hostname. The remote host initiating the connection has to send the EZVPN group ID to the server at some point, so the server knows which group the client wants to join. The remote peer does this by setting the IKE ID to the name of the EZVPN group it wants to join. Remember, with main mode the IKE ID is not sent from the initiator to the responder until MM5, where it is encrypted with the session keys that are partly derived from the PSK. Therein lies the problem. Main mode cannot be used because the server would not know the group name until MM5, so it would have no way to figure out which PSK to use for the negotiations with the remote client. Without the PSK, it has no way to calculate the session keys.

If we use aggressive mode, the IKE ID that identifies the EZVPN group is sent by the initiator in AM1, so the EZVPN server knows right away: “OK, this peer is for EZVPN group x, and group x has such and such pre-shared key. Great.” Now the VPN server has everything it needs to compute the session keys later on.
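The EZVPN case can be sketched the same way, except the table is keyed by group name instead of peer address. Again, the group names, keys, the PskUnavailable exception and the psk_for_ezvpn_client function are all hypothetical and only here to illustrate the lookup problem.

```python
from typing import Dict, Optional

# Hypothetical EZVPN group table on the server: group name -> pre-shared key.
EZVPN_GROUPS: Dict[str, str] = {
    "SALES": "sales-group-key",
    "ENGINEERING": "eng-group-key",
}


class PskUnavailable(Exception):
    """Raised when the server cannot yet tell which PSK applies."""


def psk_for_ezvpn_client(ike_id_seen_so_far: Optional[str]) -> str:
    """Look up the group PSK from the IKE ID received so far, if any."""
    if ike_id_seen_so_far is None:
        # Main mode after MM4: the group name has not arrived yet.
        raise PskUnavailable("group name not received yet")
    try:
        return EZVPN_GROUPS[ike_id_seen_so_far]
    except KeyError:
        raise PskUnavailable(f"unknown EZVPN group {ike_id_seen_so_far!r}") from None


if __name__ == "__main__":
    # Aggressive mode: AM1 already carried the group name as the IKE ID.
    print(psk_for_ezvpn_client("SALES"))
    # Main mode: after MM4 the server still has no IKE ID to look up.
    try:
        psk_for_ezvpn_client(None)
    except PskUnavailable as err:
        print(f"main mode fails: {err}")
```

The source IP address is no help here, because any client from any address may ask for any group.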

Examples always do it for me…

Trying to use MM for ISAKMP Phase 1 with EZVPN would look something like this, assuming R1 is the initiator (the EZVPN client) and R2 is the responder (the EZVPN server).

MM1: R1 -> R2: Security proposals (hash, encryption, DH group, authentication type)

MM2: R2 -> R1: Accepted security proposal

MM3: R1 -> R2: DH key exchange (public value and nonce)

MM4: R2 -> R1: DH key exchange (public value and nonce)

Now, here is the critical step. After MM4, R2 (the responder) would calculate the DH shared secret. The DH shared secret is used in conjunction with the PSK and a lot of math to generate a series of session keys. There are actually three session keys generated at this stage, and they are used to protect the MM5 and MM6 messages that follow, as well as a few other things. OK, so R2 goes to generate the session keys. R2 needs the PSK to do that... oh wait, WE DON’T KNOW IT, because the PSK is tied to the EZVPN group ID, which R1 has not yet communicated to us. EPIC FAIL.
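For anyone curious about the “lot of math”, here is a minimal sketch of the key derivation that RFC 2409 describes for pre-shared key authentication. I am assuming HMAC-SHA1 as the negotiated prf, and all of the input values are placeholders. The function names are my own. Note that the encryption key derived from SKEYID_e may need to be expanded further depending on the cipher, which is not shown.

```python
import hashlib
import hmac
from typing import Tuple


def prf(key: bytes, data: bytes) -> bytes:
    """Pseudo-random function; HMAC-SHA1 is assumed as the negotiated hash."""
    return hmac.new(key, data, hashlib.sha1).digest()


def derive_psk_session_keys(
    psk: bytes,
    ni: bytes,         # initiator nonce (from MM3)
    nr: bytes,         # responder nonce (from MM4)
    dh_secret: bytes,  # g^xy, the DH shared secret
    cky_i: bytes,      # initiator cookie from the ISAKMP header
    cky_r: bytes,      # responder cookie from the ISAKMP header
) -> Tuple[bytes, bytes, bytes]:
    """Return (SKEYID_d, SKEYID_a, SKEYID_e) as laid out in RFC 2409."""
    # With pre-shared keys, the PSK itself is the prf key for SKEYID.
    skeyid = prf(psk, ni + nr)
    seed = dh_secret + cky_i + cky_r
    skeyid_d = prf(skeyid, seed + b"\x00")             # keying material for phase 2
    skeyid_a = prf(skeyid, skeyid_d + seed + b"\x01")  # authenticates ISAKMP messages
    skeyid_e = prf(skeyid, skeyid_a + seed + b"\x02")  # encrypts MM5, MM6 and later
    return skeyid_d, skeyid_a, skeyid_e


if __name__ == "__main__":
    keys = derive_psk_session_keys(
        psk=b"example-group-key",
        ni=b"initiator-nonce",
        nr=b"responder-nonce",
        dh_secret=b"placeholder-dh-shared-secret",
        cky_i=b"\x11" * 8,
        cky_r=b"\x22" * 8,
    )
    for name, key in zip(("SKEYID_d", "SKEYID_a", "SKEYID_e"), keys):
        print(name, key.hex())
```

The very first line of the derivation needs the PSK, and everything else is built on top of it. That is exactly the value R2 cannot look up after MM4 when the client is an EZVPN client.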

So, the EZVPN server needs to already HAVE the PSK by the time MM4 ends, and this is impossible with EZVPN by nature of how it is designed. Using aggressive mode allows the remote peer to send the group it wants to join in the IKE ID straight away in the first message. The server can then tie that group ID to the respective PSK and continue the negotiations successfully.

So why does it work with digital certificates? When using certificates for authentication, the session keys derived after MM4 and prior to MM5 have nothing to do with a PSK because, well, we aren’t using a PSK to authenticate!
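To see the difference side by side, here is a short sketch of how RFC 2409 computes the base secret (SKEYID) for the two authentication methods. As before, HMAC-SHA1 as the prf is my assumption, the function names are my own, and the inputs are placeholders.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    """Pseudo-random function; HMAC-SHA1 is assumed as the negotiated hash."""
    return hmac.new(key, data, hashlib.sha1).digest()


def skeyid_for_psk(psk: bytes, ni: bytes, nr: bytes) -> bytes:
    """Pre-shared key authentication: the PSK is the prf key."""
    return prf(psk, ni + nr)


def skeyid_for_signatures(ni: bytes, nr: bytes, dh_secret: bytes) -> bytes:
    """Signature (certificate) authentication: only nonces and g^xy are used."""
    return prf(ni + nr, dh_secret)


if __name__ == "__main__":
    ni, nr = b"initiator-nonce", b"responder-nonce"
    dh_secret = b"placeholder-dh-shared-secret"
    # Everything the responder needs here has arrived by the end of MM4.
    print(skeyid_for_signatures(ni, nr, dh_secret).hex())
    # Here the responder must already know which PSK applies to this peer.
    print(skeyid_for_psk(b"example-group-key", ni, nr).hex())
```

With signatures, every input to SKEYID has been exchanged by the end of MM4, so the server can encrypt and decrypt MM5 and MM6 without knowing who the client is yet.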

I hope that helps somebody else out.  I could not find a lot of good resources on this.  I would like to thank Piotr Matusiak for helping me get this down over on the OSL

Comments

  • LAMO says:

    Excellent, thanks for your explanation.

    1- (when router A initiates a connection to router B, router B knows who it is dealing with right away based on the source IP address) vs (the IKE ID is not sent from the initiator to the responder until MM 5 where it is encrypted with the session keys that are partly derived from the PSK)

    What is the difference between the source IP and the IKE ID? In most cases they are the same. I know it could be the FQDN, but here the initiator exposes its address at the beginning, so where is the security in encrypting it afterwards?

    2- Can you provide any detailed documentation of how DH is used in generating the shared secret? I mean a very detailed explanation of all the key exchanges until the shared key is generated.

    Thanks in advance
