Quick Blog Links

about
  root @
  MacOS Applications   [x-code + objective-c]
  SSHPass Automation Program   [python / app]
  DHCP/ARP Relay-Bridge ~ Proxying   [c / networking]
  ARP Sync + Route Replacement   [python / networking]
  DNS-VPN ~ UDP-SSL   [python / networking]
  OpenVPN Mods ~~ BULK ++ MTIO ++ DUAL   [c / networking]
  Linux Kernel IP-DF Flag Header Rewrite   [kernel / networking]

written
  Secure LAN Communication   [College Thesis]
  College Project – Teaching Hacking!   [Course Paper]
  ARM Assembly – A Basic Introduction…   [Blog Post]

configs
  Mac Mini ++ Lenovo Mini   Firewalling ~ eb|iptables
  Cisco and OpenWRT   Ubiquiti and OpenWRT

gear
  Home Stuff | WiFi Networks

# Note: github.com/fossjon <- I lost access due to missing 2fa, so now I'm using -> github.com/stoops
# rebuild blog.html: pull 3 pages of the feed and emit one <a> link per post
for p in `seq 1 3` ; do
  curl -sL "https://fossjon.com/feed/?paged=$p" | grep -Ei '<(title|link)>' \
    | sed -e 's@<title@~<title@g' | tr ' \t\r\n' ' ' | tr -s ' ' | tr '~' '\n' \
    | sed -e 's@^.*<title>\(.*\)</title>.*<link>\(.*\)</link>.*$@<a href="\2" style="text-decoration:none;font-family:monospace;">\1</a><br/>@' \
    | grep -i '/fossjon.com/'
done > blog.html
   

Year End Summary – OpenVPN Modifications in a Screen Cap – Sometimes Lines of Code Removed is a Better Metric!

I’ve spent a few months ironing out the remaining edge cases and code paths to get this highly modified version of OpenVPN running as stably as I intended. It’s basically a lighter-weight, TCP-focused version of OVPN with a number of huge unused libraries removed, including WIN32, which I haven’t run anything on for at least a couple of decades now. The summary of all the modifications comes out to roughly 3,000 lines added and over 25,000 lines of code removed! I did not remove the UDP library or protocol files since UDP was the OG VPN protocol and it doesn’t really get in the way of things, so I may just leave it be anyway.

Warning: This is a really long screen capture to scroll through!

~

Secure VPN DNS Service – Forwarding/Proxying/Caching/Ciphering – Python

Since I am running a network-wide VPN tunnel on behalf of the clients on the network, any plaintext UDP-based DNS packets are protected from your ISP seeing them; however, the VPN server’s ISP would still be able to see them all. I decided to write a new Python DNS server which listens on the VPN client side, with all plaintext UDP DNS traffic redirected to it locally. It then creates a TCP SSL connection to a DNS server through the VPN tunnel and performs the query via DNS-over-TLS instead. The response can also be cached, which helps reduce the number of UDP DNS packets being sent over the VPN tunnel as well!
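For illustration, here is a minimal sketch of the core loop (the full implementation is in the fdns.py link below); the 1.1.1.1:853 upstream, the bare question-section cache key, and the lack of TTL expiry are all simplifications of mine:

import socket, ssl

CACHE = {}  # question section -> raw DNS answer (no TTL expiry in this sketch)

def recv_exact(sock, n):
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("short TLS read")
        data += chunk
    return data

def resolve_dot(query, server="1.1.1.1", port=853):
    # DNS-over-TLS: a 2-byte length prefix plus the DNS message over TLS/TCP
    ctx = ssl.create_default_context()
    with socket.create_connection((server, port), timeout=5) as raw:
        with ctx.wrap_socket(raw, server_hostname=server) as tls:
            tls.sendall(len(query).to_bytes(2, "big") + query)
            size = int.from_bytes(recv_exact(tls, 2), "big")
            return recv_exact(tls, size)

def serve(host="127.0.0.1", port=53):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    while True:
        query, client = sock.recvfrom(4096)
        key = query[12:]  # cache on the question section, past the 12-byte header
        answer = CACHE.get(key)
        if answer is None:
            answer = resolve_dot(query)
            CACHE[key] = answer
        # splice the client's transaction ID onto the (possibly cached) answer
        sock.sendto(query[:2] + answer[2:], client)

serve()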

Source Code: https://github.com/stoops/xfer/blob/main/fdns.py

~

Finally Able to Insert a Proper Bi-Directional, Multi-Threaded Layer of Core Operations into the Highly-Modified OpenVPN Source Code!

Edit-Edit: After spending several more days tracing and tracking down connection-state edge cases, I was able to greatly improve the performance of this modification. I had to rewrite most of the ssl, socket, packet, forward, and multi/io libraries, as well as remove some unneeded code paths, including the fragment, ntlm, occ, ping, proxy, reliable, and socks libraries. I also filed a couple more code-quality PRs ( #4 ~ #5 ).

Edit: After spending several days of testing and debugging, I found a couple of bugs in the OpenVPN source code that I have reported to the developers ( #1 ~ #2 ~ #3 ). I was finally able to get 3 parallel session key states negotiated and renegotiated, something the framework had never accounted for! The framework allows for a 3-bit key-ID field to be used and negotiated.

Session-State-Key-IDs(1,2,3): Client to Server Traffic Communication

Session-State-Key-IDs(4,5,6): Server to Client Traffic Communication

Session-State-Key-IDs(0,7): Backup General Traffic Communication
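A hypothetical Python sketch of that ID split (my own illustration, not the fork's code); the per-direction pools mirror the mapping above, and the rotation matches the renegotiations in the logs below:

# 3-bit key-ID space (0..7) carved into per-direction pools (illustration only)
C2S_POOL = (1, 2, 3)   # client -> server traffic
S2C_POOL = (4, 5, 6)   # server -> client traffic
BAK_POOL = (0, 7)      # backup / general traffic

def next_key_id(pool, current):
    """On renegotiation, rotate to the next ID within the same pool,
    so each direction always keeps a private slice of the 3-bit space."""
    return pool[(pool.index(current) + 1) % len(pool)]

assert next_key_id(C2S_POOL, 1) == 2   # matches no=1 -> no=2 in the logs
assert next_key_id(BAK_POOL, 7) == 0   # matches no=7 -> no=0
assert next_key_id(S2C_POOL, 4) == 5   # matches no=4 -> no=5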


We first negotiated our set of 3 keys for each of the 4 threads in this 1 process:
Session State Key IDs: no=1, no=7, no=4

2025-11-02 21:05:23 TCPv4_CLIENT DUAL keys [5] [ [key#0 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=1 sid=0eb81f5c fb6a0553] [key#1 state=S_INITIAL auth=KS_AUTH_FALSE id=0 no=0 sid=00000000 00000000] [key#2 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=7 sid=be550b82 c30fb5ec] [key#3 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=4 sid=d0f67472 3dc74469] [key#4 state=S_INITIAL auth=KS_AUTH_FALSE id=0 no=0 sid=00000000 00000000]]

We then renegotiated each of the 3 keys for each of the 4 threads for a total of 12x keys:
Session State Key IDs: no=2, no=0, no=5
Lame State Key IDs: no=1, no=7, no=4

2025-11-02 21:06:09 TCPv4_CLIENT DUAL keys [5] [ [key#0 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=2 sid=68cc0579 08edeaed] [key#1 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=1 sid=0eb81f5c fb6a0553] [key#2 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=0 sid=66f6bd41 5b9c05bd] [key#3 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=5 sid=af1447d1 95e7c215] [key#4 state=S_GENERATED_KEYS auth=KS_AUTH_TRUE id=0 no=4 sid=d0f67472 3dc74469]]

The second key listed after each primary key is considered a lame key, meant only for backup purposes; lame keys are set to expire and will be rotated out upon the next key renegotiation.

When it comes to tunnelling and proxying data, there are in general two independent pipeline directions: read-link->send-tunn && read-tunn->send-link. I separated out some shared limiting variables in the bulk-mode source code (the c2.buf && m->pending variables) so that data processing can operate independently for RL->ST and RT->SL. I also added a separate additional session state cipher key in the new dual-mode, so that the PRIMARY key handles client->server encryption/decryption independently and the new THREAD key can be used for server->client traffic communication.
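As a rough sketch of that separation (my own illustration, not the fork's C code), each direction can be seen as its own loop over its own buffer and its own key:

import threading

def pipeline(read_fn, send_fn, cipher_fn, size=1500):
    """One independent direction: a private buffer and a private session key,
    so RL->ST and RT->SL never contend on shared state."""
    while True:
        buf = read_fn(size)        # e.g. read-link  (or read-tunn)
        send_fn(cipher_fn(buf))    # e.g. send-tunn  (or send-link)

# hypothetical wiring: PRIMARY key for client->server, THREAD key for server->client
# threading.Thread(target=pipeline, args=(read_link, send_tunn, primary_encrypt)).start()
# threading.Thread(target=pipeline, args=(read_tunn, send_link, thread_encrypt)).start()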

~

Using TCP+BULK+MTIO+DUAL modes together, with iptables performing dynamically distributed load balancing over 2x OpenVPN process tunnels running with 16x available threads, I am now able to push 144,000 bytes worth of data at a given time onto the 1500-byte-MTU VPN link!

~

Commit Code: github.com/stoops/openvpn-fork/compare/mtio…dual

Pull Request: github.com/OpenVPN/openvpn/pull/884

Complete Commits: github.com/stoops/openvpn-fork/compare/master…bust

~

Linux Routing – Load Balancing

So, I also learned another new lesson recently: the Linux kernel way back in the day used to properly load balance routes (nexthop weights) by using a routing cache to remember which connection was associated with which route. That cache implementation was later removed and replaced with sending individual packets randomly down each listed route (similar to how I initially approached the OpenVPN load balancing between multiple threads). This, however, breaks connection states and packet ordering, as the two route links may be going to separate places entirely. So the Linux kernel developers then implemented a basic hash mapping algorithm that always associates a connection stream with the same routing hop. This is a very limited form of load balancing: the same source+destination address pair always maps to the same hash, which always matches the same routing path every time (and it is even more limiting if you are using a source NAT).

It turns out there is another trick to get more dynamically distributed load-balanced routing under Linux, which is to use the iptables mangle table and connmark new connection states so that the conntrack table can save+restore the specified packet markings. You can then set an ip rule to pick up these firewall marks and associate them with a different routing table. On top of this, you can use a random or modulo algorithm to evenly set different marks, giving you even greater routing variety!

# restore any previously saved connection mark onto packets of known flows
$ipt -t mangle -A PREROUTING -i lan -m mark --mark 0x0 -j CONNMARK --restore-mark
# alternate still-unmarked (new) flows between mark 8 and mark 9
$ipt -t mangle -A PREROUTING -i lan -m statistic --mode nth --every 2 --packet 0 -m mark --mark 0x0 -j MARK --set-mark 8
$ipt -t mangle -A PREROUTING -i lan -m statistic --mode nth --every 1 --packet 0 -m mark --mark 0x0 -j MARK --set-mark 9
# save the chosen mark into conntrack so the whole flow keeps one route
$ipt -t mangle -A PREROUTING -i lan -m mark ! --mark 0x0 -j CONNMARK --save-mark
# register two routing tables and point the fwmarks at them
echo "8 vpna" >> /etc/iproute2/rt_tables
echo "9 vpnb" >> /etc/iproute2/rt_tables
ip rule add fwmark 8 table vpna
ip rule add fwmark 9 table vpnb
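# (assumed example) give each table its own default route out a tunnel device
ip route add default dev tun0 table vpna
ip route add default dev tun1 table vpnb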

You can now add your VPN tunnel routing rules to both new ip routing tables; the last two lines above are a hypothetical example pointing each table (vpna and vpnb) at its own tunnel device!

~

Solving a Final Remaining Performance Impact with Multi-Threaded Operation by using Connection-State Mapping in the Highly-Modified OpenVPN Source Code [Implementation]

In my previous blog post, I started observing strange performance issues with my network-wide, highly-modified OpenVPN setup. Everything ran fast and speedy in bulk-mode; however, in mtio-mode things did not work as smoothly as expected (speed tests were good and TCP seemed alright, but UDP appeared fairly impacted). When I thought about it more in depth, I realized that my multi-threaded implementation of OVPN was simply throwing any data read off of the TUN interface into any available thread, all at the same time, to try and maximize parallel performance. The issue with doing it this way was that I was breaking up the ordering of packets in the UDP and TCP “streams” of data, and less capable applications were not able to handle this as well as expected (TCP seemed to fare a bit better, but you could still sense a hesitation or lag to the connection).

I wrote about this observation to the OpenVPN devs and they agreed that this setup could cause connection problems by forcing mass-reordering of packets all the time. I then came up with the split tunnel solution in my previous post, which did help to temporarily solve the issue; however, I wanted to find a way to solve it in source code as well. I did not want to implement packet tracking and ordering, as I would then basically be re-implementing the entire functionality and complexity of the TCP protocol all over again. Instead, I chose another way to help prevent this issue: a simple version of connection state tracking and mapping (similar to how iptables conntrack works). All I need to do is parse the source+destination addresses out of the packet header, place them in a mapping table for a brief amount of time, and associate that connection “stream” of data with an available thread. Combined with bulk-mode operation, this makes for a very snappy and performant VPN experience overall. I was able to implement the change in less than 75 lines of code with only a single helper method!
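A minimal Python sketch of that mapping idea (my illustration only; the real change is in the C commits linked below, and the TTL value and hash fallback here are assumptions of mine):

import struct, time

FLOW_TTL = 2.0  # assumed: how long an idle flow stays pinned to its thread
flows = {}      # (src_ip, dst_ip) -> (thread_index, last_seen)

def pick_thread(packet, nthreads):
    """Pin packets of one IPv4 flow to one worker so their order survives."""
    now = time.time()
    src, dst = struct.unpack_from("!4s4s", packet, 12)  # IPv4 header bytes 12..20
    key = (src, dst)
    entry = flows.get(key)
    if entry is not None and (now - entry[1]) < FLOW_TTL:
        idx = entry[0]                 # known flow: reuse its thread
    else:
        idx = hash(key) % nthreads     # new or expired flow: pick a thread
    flows[key] = (idx, now)
    return idx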

Complete Code Change Commits: github.com/stoops/openvpn-fork/compare/master…bust

~

~

Split VPN Tunnelling and Routing Based on Packet Protocol and Port for Improved Network Performance and App Compatibility

So I’ve been running a highly modified version of OpenVPN, and speed and performance have been pretty good overall; however, I noticed that a few apps using custom UDP-based API calls for data transfer (the Ring live video app, for example) would appear very choppy and blurry. I suspect this is because the multi-threaded version of OpenVPN does not preserve UDP packet ordering (whereas UDP protocols like QUIC seem to be more advanced and capable).

I am now experimenting with a solution to this issue: running two VPN tunnels at the same time, one bulk-mode-multi-threaded for the majority of smart protocols and data, and a second bulk-mode-single-threaded for some of the simpler UDP protocols that assume a specific ordering of packets arriving from a server or service. This technique requires iptables mangle packet marking as well as ip rule mark matching with additional routing table rules (the same approach shown in the Linux Routing post above).

iPhone 17 Base – The best of the iPhone releases this year… almost perfect!

I have been a fan of smaller-sized iPhones ever since I initially purchased the 12 Mini (as well as the 13 Mini, which I still have as part of my collection today). A smaller phone is easier to carry and use in light-weight, one-handed operation while I am out with something in my other hand, for example answering phone calls or looking up information while waiting in line. Ideally, this size would be less than 6″ diagonal, preferably 5.7″ (+/- 0.2″), with as small of a camera bump as possible.

I had switched over to the iPhone 15 Pro for the always-on-display feature, which is important to me so that I can catch any missed important notifications at a quick glance while I’m working at my desk and my phone is resting on the table. However, the 15 Pro is far from “small” as it comes in at 6.1″ (which technically makes it the smallest AOD screen Apple has offered to date).

This year, Apple surprised me by adding the AOD screen to the regular base iPhone 17, which would save me quite a bit of money on future upgrades by not having to buy the Pro versions. However, the screen comes in at 6.3″, and that is the only option available. If Apple were to offer this phone in 5.7″, it would be an instant buy from me tomorrow… it’s almost perfect!

Update: I just learned of this new signed memory security feature that was implemented at the A19 hardware level which is really tempting me to upgrade now… [Apple]

~

Buffer Bloat Buster – Running a TCP VPN tunnel with large memory buffer sizes

I have been running the highly modified version of OpenVPN (a TCP VPN) with large sysctl memory buffer sizes set, and I am trying a new add-on experiment: using an extra dedicated thread to send dummy/mock UDP data at a specific bit rate, for example 1024 kbps, through the tunnel at a constant and consistent pace. It works in both client and server modes, either discovering the virtual IP addresses or letting you specify one yourself, and it automatically sends random bytes to each IP in an attempt to keep the tunnel active and alive at all times!
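A minimal Python sketch of the idea (the real feature is the --bust-mode option shown below; the payload size and the UDP discard port here are my own assumptions):

import os, socket, time

def bust_keepalive(dest_ip, kbps, payload=512, port=9):
    """Send random UDP padding at a steady bit rate to keep the tunnel
    (and its large buffers) warm; port 9 is the UDP discard service."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = (payload * 8) / (kbps * 1000.0)  # seconds per datagram
    while True:
        sock.sendto(os.urandom(payload), (dest_ip, port))
        time.sleep(interval)

# ~250 kbps of mock upload toward the server's virtual IP:
# bust_keepalive("10.0.0.1", 250)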

Server-to-Client (~750kbps down)

[--bust-mode 750 0.0.0.0]

~

Client-to-Server (~250kbps up)

[--bust-mode 250 10.0.0.1]

~

Patch Diff: https://github.com/stoops/openvpn-fork/compare/mtio…bust

Implementing the work that the OpenVPN devs once decided to abandon!

So I’ve been running the highly modified version of OpenVPN (bulk mode + mtio mode) at the core of my home network for a few weeks now, and I have spent several days tuning up the code, which now allows me to achieve near full line speed on my WAN link. I have also submitted my proof-of-concept pull requests to the OVPN GitHub code base for anyone to take a look at. The devs there informed me that they were once pursuing making OpenVPN multi-threaded; however, they gave up on that idea some time back and now prefer their “DCO” kernel module hook instead. I suspect that when WireGuard became popular, more priority was placed on kernel-level speed and performance for OVPN to compete, which is not a bad choice; however, it is still good to optimize the code base in ways other than at the kernel level only. For my use case in particular, my argument would be that there are many kinds of OpenVPN customers running clients on embedded hardware or non-Linux OS’s, and that improvements to user-level operation would still be a good selling point and distinguishing feature. That is something that cannot easily be done with WireGuard, which in my opinion made a very limiting overall design decision (kernel-level + udp-protocol only). Being able to run OVPN at the user level with the tcp-protocol has solved the small-sized MTU issue I was running into with all my WiFi clients, and thus has greatly improved the overall speed and performance of my network-wide VPN setup, which is why I decided to work on this project framework instead!

I was able to update my original multi-threaded code to operate a bit more cleanly, and it no longer depends on any thread-locking techniques. Instead, it uses a file-descriptor substitution/swapping technique in which the TUN read file descriptor is separated from the TUN write file descriptor and remapped to one end of a socket pair, which is used for signalling and linked to an extra management thread. This change allows the new management thread to perform dedicated TUN device reads and pre-fill 1500 bytes worth of data into each of 6 context buffers for 4 simultaneous streams. This results in a potential for up to 36,000 bytes of tunnel data being processed, encrypted, and transmitted in parallel, all at the same time!
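A rough Python sketch of the fd-swap idea (my illustration only; the fork does this in C, and the fill() callback here is a hypothetical stand-in for the context-buffer pre-fill):

import os, socket, threading

def swap_tun_read_fd(tun_fd, fill):
    """Replace the TUN *read* fd with one end of a socketpair: a dedicated
    management thread does the real TUN reads and pre-fills buffers, then
    pokes the pair so workers wake up without taking any locks."""
    left, right = socket.socketpair()

    def manager():
        while True:
            pkt = os.read(tun_fd, 1500)  # only this thread touches the TUN
            fill(pkt)                    # stash into a pre-filled context buffer
            left.send(b"\x01")           # signal: one more buffer is ready

    threading.Thread(target=manager, daemon=True).start()
    return right.fileno()  # workers now select() on this instead of the TUN fd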

Edit: I have updated this functionality to incorporate these 4 phases:

  • Data Read / Process
  • Thread Association / Ordering
  • Buffer Ordering / Compaction
  • Data Process / Send

Links

~

Something I wish I knew how to do years ago – SO_ORIGINAL_DST – Proxy Related

Note to future self: something that I’ve been doing completely inefficiently in the past, getting the destination IP address of a redirected packet from iptables/Linux in an official manner!

Edit: I was trying to read this code block snippet and I’m not sure how the source address is used here, other than the TCP connection socket file descriptor possibly being used, which might mean it doesn’t work for UDP redirects…?

https://github.com/darkk/redsocks/blob/master/base.c#L216

/* from redsocks/base.c: recover the pre-REDIRECT destination of a socket */
static int getdestaddr_iptables(int fd, const struct sockaddr_in *client, const struct sockaddr_in *bindaddr, struct sockaddr_in *destaddr)
{
    socklen_t socklen = sizeof(*destaddr);
    int error;

    /* conntrack answers via the connected fd itself, which is why the
     * client/bindaddr arguments are never touched in this lookup */
    error = getsockopt(fd, SOL_IP, SO_ORIGINAL_DST, destaddr, &socklen);
    if (error) {
        log_errno(LOG_WARNING, "getsockopt");
        return -1;
    }
    return 0;
}
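
For reference, the same lookup can be done from Python; a small sketch (the socket module does not export SO_ORIGINAL_DST, so the constant from linux/netfilter_ipv4.h is hard-coded):

import socket, struct

SO_ORIGINAL_DST = 80  # from linux/netfilter_ipv4.h

def get_original_dst(conn):
    """Return the pre-REDIRECT (ip, port) of an accepted TCP socket."""
    # the kernel hands back a struct sockaddr_in (16 bytes)
    data = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    port, packed_ip = struct.unpack("!2xH4s8x", data)
    return socket.inet_ntoa(packed_ip), port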

Re-Modifying OpenVPN Source Code to Allow for Dual-Connection, Multi-Threaded, Load-Balanced Operation for Even More Performance!

This is a continuation of this original post in exploring the modifications that can be made to the OpenVPN source code to increase its overall performance: [post]

I’m still exploring how I can make this perform better and optimize the code more, but I was finally able to build on top of the bulk-mode changes from the last post and make a multi-threaded server and client model work together. It was tough to do because of the complexity of the code, as well as the need to avoid interfering with the TLS connection state variables and memory along the way.

I was able to make the code spin off 4 threads which all share a common TUN interface for bulk-reads, and then create 4 separate TCP connections that each perform a large bulk-transfer. The server load balances the dual connections from the client across the threads based on the connecting IP address. I am also running 4 VPN processes with 4 TUN devices and using IP routing next-hop weights to load-balance the traffic between them.

Update: I just implemented an extra management thread that is dedicated to reading from the shared TUN device and bulk-filling the context buffers, so that the worker threads can all process the data in parallel in a non-locking fashion (6 x 1500 x 4 == 36,000 bytes)! A sketch of the idea follows below.
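(My illustration only; queue.Queue here stands in for the fork's lock-free buffer handoff in C.)

import os, queue

NBUFS, MTU, NTHREADS = 6, 1500, 4
work = [queue.Queue() for _ in range(NTHREADS)]  # one queue per worker

def tun_manager(tun_fd):
    """Dedicated reader: bulk-fill 6 x 1500-byte buffers per worker pass
    (6 x 1500 x 4 == 36,000 bytes in flight across the 4 streams)."""
    t = 0
    while True:
        batch = [os.read(tun_fd, MTU) for _ in range(NBUFS)]  # bulk-read
        work[t].put(batch)   # each worker drains only its own queue
        t = (t + 1) % NTHREADS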

Config Tips:

  • Ensure that your VPS WAN interface has a 1500 MTU (my provider was setting it to 9000)
  • Perform some basic sysctl network/socket/packet memory/buffer/queue size tuning (16777216)
  • Set the TUN MTU == 1500 && TX QUEUE == 9000 (properly sized middle pipe link)
  • Push && pull the snd ++ rcv buffer sizes from the server config to the client options (16777216)
  • Use elliptic curve keys and stream cipher crypto (more efficient algos for the CPU)
  • No more need for compression, fragmentation, or MSS clamping (--mssfix 0)
  • Use smaller nftables conntrack timeout values for fewer forwarded-traffic connection-table states (time_wait for udp/tcp)

Bulk-Mode ++ MTIO-Mode

~

Source Code: https://github.com/stoops/openvpn-fork/compare/bulk…mtio

Pull Request: https://github.com/OpenVPN/openvpn/pull/818/files

~

Modifying OpenVPN Source Code to Allow for Bulk-Reads, Max-MTU, and Jumbo-TCP for Highly Improved Performance!

So some time back, I wrote a highly-performant, network-wide, transparent-proxy service in C. It was incredibly fast because it could read 8192 bytes off of a client’s TCP socket directly and proxy the data in one write call over TCP to the VPN server, without needing a tunnel interface with a small-sized MTU that bottlenecks reads+writes to <1500 bytes per function call.

I thought about it for a while and came up with a proof of concept that incorporates similar ideas into OpenVPN’s source code. The improvements in summary:

  • Max-MTU which now matches the rest of your standard network clients (1500 bytes)
  • Bulk-Reads which are properly sized and multiply called from the tun interface (6 reads)
  • Jumbo-TCP connection protocol operation mode only (single larger write transfers)
  • The performance improvements above now allow for 6 reads x 1500 bytes == 9000 bytes per transfer call (see the sketch after this list)
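
A minimal sketch of that read-batching idea in Python (my illustration; the 2-byte length framing is an assumption, not the fork's actual wire format):

import os, socket, struct

MTU, BATCH = 1500, 6  # 6 reads x 1500 bytes == one ~9000-byte jumbo transfer

def bulk_forward(tun_fd, tcp_sock):
    """Batch several TUN reads into one jumbo TCP write call."""
    chunks = []
    for _ in range(BATCH):
        chunks.append(os.read(tun_fd, MTU))  # each read returns one packet
    # frame each packet with a 2-byte length so the peer can split them
    payload = b"".join(struct.pack("!H", len(c)) + c for c in chunks)
    tcp_sock.sendall(payload)  # one call pushes up to ~9000 bytes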

As you can see below, this was a speed test performed on a Linux VM running on my little Mac Mini, which pipes all of my network traffic through it, so the full-sized MTU which the client assumes doesn’t have to be fragmented or compressed at all! 🙂

Note: Also, the client/server logs show the multi-batched TUN READ/WRITE calls along with the jumbo-sized TCPv4 READ/WRITE calls.

Note-Note: My private VPS link is 1.5G, my internet link is 1G, and my upload speed is hard rate-limited by iptables; this test was also done via a WiFi network client and not the actual host VPN client itself, which to me makes it a bit more impressive.

Edit: The small size MTU problem that can affect both WireGuard and OVPN-UDP is documented here by another poster: https://gist.github.com/nitred/f16850ca48c48c79bf422e90ee5b9d95

~

I created a new GitHub repo with a branch+commit which has the changes made to the source code.

Patch Diff:

Fork Pull:

Mailing List:

~

Generating colorful iOS backgrounds in less than 50 lines of JS and some basic photo editing skillz

<script>
	//scroll to bottom
	//shot 1475 x 935 (2950 x 1870)
	//sips --rotate -35 a.png --out b.png ; cp -fv b.png c.png
	//crop 768 x 1665
	//noise 5, vintage 15
	function a() {
		var l = [
			[113,  73, 173, 1.25, 1, "purple"],
			[171,  51,  51, 0.50, 1, "red"],
			[245, 115,  35, 0.75, 1, "orange"],
			[255, 215, 125, 0.75, 5, "gold"],
			[ 69, 139,  69, 0.50, 1, "green"],
			[ 33, 153, 243, 1.25, 3, "blue"],
			[113,  73, 173, 1.15, 1, "purple"],
			[113,  73, 173, 1.00, 1, "purple"],
		];
		var o = [0.97, 0.93];  // alpha: solid color rows vs. gradient rows
		var p = 750;           // target overall height (px) used to derive steps
		var h = 15;            // height of each color strip (px)
		// number of gradient rows between color stops, padded to an even-ish count
		var n = parseInt((((p/h)-1)/(l.length-1))-1);
		if (n < 1) { n = 1; }
		n = ((n + 3) + (n % 2));
		var t = 0;
		var u = "";
		var z = document.getElementById("a");
		for (var i = 0; i < l.length; ++i) {
			var k = (i + 1);
			var r = l[i][3];
			var s = l[i][4];
			var m = parseInt(n * r);
			u += (l[i][5]+" "+i+" "+n+" "+r+" "+s+" "+m+"\n");
			for (var j = 0; j < s; ++j) {
				z.innerHTML += ("<div style='height:"+h+"px; background:rgba("+l[i][0]+", "+l[i][1]+", "+l[i][2]+", "+o[0]+");'></div>");
			}
			for (var j = 0; (j < m) && (k < l.length); ++j) {
				var a = ((((l[k][0] - l[i][0]) - 1) / (m + 1)) * (j + 1));
				var b = ((((l[k][1] - l[i][1]) - 1) / (m + 1)) * (j + 1));
				var c = ((((l[k][2] - l[i][2]) - 1) / (m + 1)) * (j + 1));
				var d = (l[i][0] + a);
				var e = (l[i][1] + b);
				var f = (l[i][2] + c);
				z.innerHTML += ("<div style='height:"+h+"px; background:rgba("+d+", "+e+", "+f+", "+o[1]+");'></div>");
			}
			if (i < (l.length - 1)) { t += ((m * h) + (h * s)); }
		}
		alert(n+"\n"+t+"\n"+u);
	}
</script>
<body onload="a();">
	<div id="a"></div>
</body>

~

TurnTable – A MacOS App In Swift – Starting From Scratch And Copying iTunes!

It’s been a while since I’ve posted here on the good old blog; I’ve been busy with life and work. However, that may change soon, as the big 5 banks in Canada are now forcing everyone into a mandated RTO back in downtown Toronto. I had to move out of the city some years back due to the cost-of-living crisis here, so I may be out of a job come September.

App Store: https://apps.apple.com/ca/app/turntable/id6747615304?mt=12

Anyway, I started a new macOS app in Swift called TurnTable, written from scratch to try and copy the old spirit and simplicity of the original iTunes application. It doesn’t have anything fancy implemented yet, as I just wrote it all today, but the source code is of course up on my GitHub. I will try to add more features to it over time when I get a free chance to do so!

Source Code: https://github.com/stoops/TurnTable/tree/main

~