Solving a Final Remaining Performance Impact with Multi-Threaded Operation by Using Connection-State Mapping in the Highly-Modified OpenVPN Source Code [Implementation]

In my previous blog post, I started observing strange performance issues with my network-wide, highly-modified OpenVPN setup. Everything ran fast in bulk-mode; in mtio-mode, however, things did not work as smoothly as expected (speed tests were good, TCP seemed alright, but UDP appeared fairly impacted). When I thought about it in more depth, I realized that my multi-threaded implementation of OVPN was simply handing any data read off of the TUN interface to whichever thread was available, to try and maximize parallel performance. The issue with doing it this way was that I was breaking up the ordering of packets within each UDP or TCP “stream” of data, and less sophisticated applications were not able to handle this as well as expected (TCP seemed to fare a bit better, but you could still sense a hesitation or lag in the connection).

I wrote about this observation to the OpenVPN devs and they agreed that this setup could cause connection problems because of the constant mass-reordering of packets it requires. I then came up with the split tunnel solution in my previous post, which did help to temporarily solve the issue; however, I wanted to find a way to solve it in the source code as well. I did not want to implement packet tracking and ordering, as I would then basically be re-implementing the functionality and complexity of TCP all over again. Instead, I chose another way to prevent this issue: a simple version of connection state tracking and mapping (similar to how iptables conntrack would work). All I need to do is parse and extract the source+destination addresses from the packet header, place them in a mapping table for a brief amount of time, and associate that connection “stream” of data with an available thread, so that packets belonging to the same stream go through the same thread and stay in order. When you combine this change with the bulk-mode operation, it makes for a very snappy and performant VPN experience overall. I was able to implement this change in fewer than 75 lines of code with only a single helper method!
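To make the idea concrete, here is a minimal sketch of that kind of mapping table. It is not the code from my fork. Every name in it (`flow_pick_thread`, `flow_table`, `FLOW_SLOTS`, `FLOW_TTL_SEC`) is hypothetical and made up for illustration. It also assumes raw IPv4 packets as read from a TUN device, a fixed number of worker threads, a single reader thread calling the function, and an arbitrary 5-second idle timeout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define FLOW_SLOTS   64  /* hypothetical table size */
#define FLOW_TTL_SEC 5   /* hypothetical idle timeout, in seconds */

struct flow_entry {
    uint32_t src;        /* IPv4 source address, as stored in the packet */
    uint32_t dst;        /* IPv4 destination address, as stored in the packet */
    int      thread;     /* worker thread index assigned to this flow */
    time_t   last_seen;  /* 0 means the slot has never been used */
};

static struct flow_entry flow_table[FLOW_SLOTS];
static int next_thread = 0;  /* round-robin cursor for new flows */

/*
 * Pick a worker thread for one raw IPv4 packet.
 * Packets with the same source+destination pair keep getting the same
 * thread while the mapping is fresh. Returns -1 if the buffer does not
 * look like an IPv4 packet. Assumes now > 0 and a single caller.
 */
int flow_pick_thread(const uint8_t *pkt, size_t len, int num_threads, time_t now)
{
    if (num_threads <= 0 || len < 20 || (pkt[0] >> 4) != 4)
        return -1;

    uint32_t src, dst;
    memcpy(&src, pkt + 12, 4);  /* source address: header bytes 12..15 */
    memcpy(&dst, pkt + 16, 4);  /* destination address: header bytes 16..19 */

    struct flow_entry *free_slot = NULL;
    for (int i = 0; i < FLOW_SLOTS; i++) {
        struct flow_entry *e = &flow_table[i];
        int live = e->last_seen != 0 && now - e->last_seen <= FLOW_TTL_SEC;
        if (live && e->src == src && e->dst == dst) {
            e->last_seen = now;  /* refresh the mapping */
            return e->thread;
        }
        if (!live && free_slot == NULL)
            free_slot = e;       /* remember the first reusable slot */
    }

    /* New (or expired) flow: hand it to the next thread in rotation. */
    int t = next_thread % num_threads;
    next_thread = (t + 1) % num_threads;

    /* If the table is full, the flow is simply not tracked this time. */
    if (free_slot != NULL) {
        free_slot->src = src;
        free_slot->dst = dst;
        free_slot->thread = t;
        free_slot->last_seen = now;
    }
    return t;
}
```

The reader loop would call something like `flow_pick_thread(buf, n, num_threads, time(NULL))` for each packet read from the TUN interface and queue the packet to the returned thread. A linear scan over a small fixed table keeps the sketch short; a hash on the address pair would be the obvious next step if the table needed to grow.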

Complete Code Change Commits: github.com/stoops/openvpn-fork/compare/master…bust
