May the Circle be Unbroken

Two weeks ago my career in IT came full circle. Fifty-five years ago, as a rising high school senior, I took a summer course in computers at Columbia University in NYC. It was a very comprehensive course which covered theories of computing, algorithms, compilers, etc. in the mornings and then taught us programming in Fortran and assembler for Columbia’s newly updated IBM 7094 computer in the afternoons. For many years, that course was my only formal training in computers and IT. Most of what I have learned since then was either self-taught or learned on the job.

The 7094 had recently been updated from a 7090. The hottest question at the time was, what is the difference between a 7090 and a 7094? The answer was 4. In addition to a few other changes in clock speed, instruction set, etc., a major change between the two computers was that the 7094 had 4 more index registers. The 36-bit instructions for these computers had 3 Tag bits set aside to indicate the use of an index register, often used for stepping through memory in loops. In the 7090 each of these 3 bits selected one of 3 index registers. For the 7094, a binary decoder circuit was added so that the 3 bits together could select one of 7 index registers, thus the update from 7090 to 7094.
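For anyone who wants to see where the 4 comes from, here is a minimal sketch in Python of the two ways of reading the 3 Tag bits. The function names are made up for this illustration, the registers are simply labeled by number, and the sketch deliberately says nothing about what the real hardware did when more than one bit was set on a 7090.

```python
def select_7090(tag):
    """7090 style: each Tag bit (values 1, 2, 4) is its own select line."""
    if not 0 <= tag <= 7:
        raise ValueError("the Tag field is only 3 bits wide")
    # Report which of the three select lines are raised by this tag value.
    return [bit for bit in (1, 2, 4) if tag & bit]


def select_7094(tag):
    """7094 style: the 3 bits are decoded as a number; 0 means no indexing."""
    if not 0 <= tag <= 7:
        raise ValueError("the Tag field is only 3 bits wide")
    return [tag] if tag else []


if __name__ == "__main__":
    regs_7090 = {r for tag in range(8) for r in select_7090(tag)}
    regs_7094 = {r for tag in range(8) for r in select_7094(tag)}
    # Distinct registers reachable with each scheme, and the difference.
    print(len(regs_7090), len(regs_7094), len(regs_7094) - len(regs_7090))
```

With one select line per bit, 3 bits can only ever name 3 registers. Decoding the same 3 bits as a binary number gives 7 non-zero values, which is where the 4 extra registers come from.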

Part of my decision to take a course in computers during that summer was the news that my high school, Brooklyn Technical High School, was planning to get a new IBM 1130 computer in the fall. In addition to taking the course at Columbia that summer, I also went to the IBM offices on Maiden Lane in lower Manhattan and bought a set of manuals for the IBM 1130 so that I would be ready to use the computer when it arrived. In fact, I ended up being the first person to successfully run a program on Tech’s 1130. My oldest brother was going to school at Brooklyn College at the time, and was working on the IBM 1620 which they had there. I was able to visit his office and use a keypunch machine there to punch my program onto the punch cards then used as input for computers. As I recall, the program was to calculate great-circle bearings from NYC to other points on the globe. My father was a ham radio operator and he had installed a large, rotatable antenna on a tower alongside our home in Queens. He asked me to give him a table of bearings to point his antenna depending on where in the world he wanted to speak to someone. So I wrote the program and punched it onto cards at Brooklyn College (along with the Monitor Control Records, an early form of JCL for the 1130). Then one day I walked into the computer room at Brooklyn Tech and asked the teacher there, I believe it was the head of the electronics course, if I could use the computer. He stared in amazement as I walked over to the card reader, inserted my card deck and then proceeded to run the program, which generated many pages of output tables. He was sufficiently impressed by my ability to get the computer working that from that day on I was always welcome to come into the computer room, even if I was cutting a class to do it.
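For the curious, a bearing table of this kind can be sketched in a few lines of modern Python. This is an illustration only, not the original 1130 program: the function name, the short list of destinations and the approximate coordinates are all assumptions made for this example. It uses the standard formula for the initial great-circle bearing between two latitude/longitude points.

```python
import math

# Approximate (latitude, longitude) in degrees; west longitudes are negative.
NYC = (40.7128, -74.0060)
DESTINATIONS = {
    "London": (51.5074, -0.1278),
    "Tokyo": (35.6762, 139.6503),
    "Sydney": (-33.8688, 151.2093),
}


def initial_bearing(origin, target):
    """Initial great-circle bearing from origin to target, in degrees from true north."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, target)
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    # atan2 returns -180..180 degrees; fold that into 0..360 for a compass heading.
    return math.degrees(math.atan2(y, x)) % 360.0


if __name__ == "__main__":
    for name, coords in DESTINATIONS.items():
        print(f"{name:10s} {initial_bearing(NYC, coords):6.1f}")
```

The result is the direction to point the antenna at the start of the path; the heading along a great circle changes as you travel, but the antenna only cares about the first leg.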

One interesting sidelight about the 1130 was that it was a desk-sized computer console with a removable rotating magnetic disk cartridge (IBM 2315) behind a panel in the desk stand. The monitor program, as well as the compiler and other utilities, was loaded onto the disk drive from 2 boxes of punched cards (about 4000 total cards). One item which I seem to have neglected to notice in the manuals I bought was that you had to turn off the disk drive with a switch behind the front panel before you shut off the computer. Otherwise, the retracting head of the magnetic drive would scribble all over the disk as the power was removed. We wondered why we had to load the 2 boxes of cards onto the magnetic drive every day, until someone realized that the switch inside the front panel would save us that chore.

There is irony in the fact that this IBM 1130 was bigger and more powerful than the IBM 1620 computer at the college I went to, although my college also managed to purchase an IBM 1130 during the summer before my senior year. More about that in another post.

The full circle I was talking about in the first paragraph of this post is that 2 weeks ago I started working as a system administrator in the computer research facility of the computer science department at Columbia. So my career in computing, which started with a course at Columbia 55 years ago, has now returned me to Columbia.

Sometimes You Just Need a Man-in-the-Middle (MITM)

Ok, to be politically correct I suppose it should be called a Person-in-the-Middle, but the acronym PITM is just too close to PITA for me. Maybe I’ll change it to Machine-in-the-Middle since that is what it usually is, but for me it was just a process on the receiving machine.

For those of you who don’t know it, MITM is often used in the context of an attack on web browsing. When a browser (like Chrome, Edge, Firefox, …) connects to a server, the name of the server is converted to an IP address via DNS and then the packets are routed between the browser and the server. Of course, if someone can corrupt your DNS or stick a malicious router into the path between your browser and the server, then they can read all of the packets that go back and forth. That is called a MITM attack.

The S in HTTPS is for Secure and indicates that the connection between the browser and the server is encrypted using Transport Layer Security (TLS), the successor to the earlier Secure Sockets Layer (SSL), which it turns out wasn’t really as secure as people hoped it would be. This is intended to protect against MITM attacks by making sure that the machine-in-the-middle can’t read the encrypted packets. When a browser connects to a server over HTTPS, the server supplies its Certificate to the browser so that the browser can confirm that it is connecting to the correct server. The certificate contains the name of the server and is cryptographically signed by a Certificate Authority (CA) which confirms that the server name belongs to the server. This is called Public Key Infrastructure (PKI), which is well beyond the scope of this blog post. If the name of the server in the certificate doesn’t match the name of the server you asked the browser to connect to, if the certificate is signed by a CA that your browser doesn’t trust, or if there is another problem with the certificate, such as it having expired, then your browser will put up a warning about the certificate before letting you see the web site.

Some corporate web proxy servers, which connect computers in a corporate environment to the Internet, include a MITM which allows them to snoop on what their employees are doing on the Internet, even when the employee is using HTTPS. To keep their employees from getting the browser certificate warning, the proxy server has to create, on the fly, a certificate for the specific server which the browser is connecting to. Since no legitimate CA will sign such a certificate, the proxy server has its own CA, and any browsers which use that proxy have to install the CA certificate from the proxy server as a Trusted CA.
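The same checks a browser makes can be seen in Python’s standard ssl module. The sketch below is only an illustration: the two function names and the extra_ca_file parameter are made up for this example, and the second function needs a live network connection to do anything useful.

```python
import socket
import ssl


def make_client_context(extra_ca_file=None):
    """Build a client-side TLS context that verifies the server like a browser would."""
    # The default context checks both the CA signature chain and the host name.
    ctx = ssl.create_default_context()
    if extra_ca_file is not None:
        # Hypothetical path to an additional CA certificate (PEM), for example
        # the CA of a corporate proxy; this is the "install as Trusted CA" step.
        ctx.load_verify_locations(cafile=extra_ca_file)
    return ctx


def fetch_server_certificate(hostname, port=443):
    """Connect over TLS and return the verified certificate as a dict."""
    ctx = make_client_context()
    with socket.create_connection((hostname, port), timeout=5) as raw:
        # server_hostname is the name the certificate must match; a mismatch,
        # an untrusted CA or an expired certificate raises an ssl.SSLError
        # subclass here instead of showing a browser-style warning.
        with ctx.wrap_socket(raw, server_hostname=hostname) as tls:
            return tls.getpeercert()
```

A program that trusts a proxy’s CA this way will accept the proxy’s on-the-fly certificates without complaint, which is exactly why installing such a CA deserves some thought.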

This blog post is about why I needed a MITM in order to solve a problem I was having. We have 6 Amazon Web Services (AWS) Elastic Compute Cloud (EC2) Virtual Machines (VMs) and I wanted to have all of them send their logs to one common VM for analysis. These VMs came with RSyslogD, a standard Linux/Unix system logging utility which I planned to use. Of course, the version of RSyslogD installed was 5.8.10, which was released in 2010 and last updated in 2012. For comparison, the latest version of RSyslogD is 8.2104. Since I am a card-carrying Certified Information Systems Security Professional (CISSP), I decided that the logs should be transmitted to the common log server over TLS even though the servers were all in our AWS Virtual Private Cloud (VPC), meaning that no other computers should have access to our packets.

Configuring RSyslogD to use TLS wasn’t too hard, but we also had some Python programs written in-house which I wanted to have transfer their logs directly to the common log server without using RSyslogD on the local VM. If you read my earlier blog post about Python you know how much I love looking for and using standard Python modules. I was able to find a Python module which said it interfaced the standard Python logging module to the Syslog protocol used by RSyslogD. Of course, it didn’t support sending the packets over TLS, so I had to modify the module to wrap the packets in TLS. I was able to get the Python programs to send their logs successfully to the common log server using the Syslog protocol without TLS, but when I wrapped the packets with TLS the logs were ignored. I could see that the TLS connections were being made, but since TLS encrypts the packets I couldn’t see what was in the encrypted packets that kept them from working. The packets going between the RSyslogD processes on the servers were working, but my TLS-wrapped Python packets were being ignored.

To figure out what the issue was, I needed a MITM where I could decrypt the packets and inspect them to see what the difference was between the working logs from the RSyslogD process and the non-working logs from my Python module. Since I was already so embedded in Python for this, I decided to write a MITM module for Python which would accept the encrypted TLS connection from the source, decrypt and display the logs, and then re-encrypt them and pass them on to the RSyslogD process on the common log server. Normally a MITM module is blocked by browsers because it supplies a certificate whose name doesn’t match the server you are connecting to, or is signed by an untrusted CA, but in this case I didn’t have that problem. For these TLS connections I had created our own local CA which only needed to be trusted by the VMs in our VPC. Since I had already been doing all this Python coding with TLS, I had no problem cobbling together a simple MITM module, with which I quickly discovered that the working messages had the length of the log message inserted before the actual log message. Adding that to my Python TLS wrapper module got the log messages flowing cleanly. Of course, if I had Googled “Syslog over TLS protocol”, as I should have, I would have quickly found RFC 5425, which would have given me the needed answer without the necessity of a MITM, but what would have been the fun in that?
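The length prefix is what RFC 5425 calls octet counting: each message on the TLS stream is sent as the decimal byte count of the message, one space, and then the message itself. Here is a minimal sketch of both directions in Python. It is not the code from my wrapper or my MITM module, and the function names are made up for this example.

```python
def frame(message: bytes) -> bytes:
    """Prefix a syslog message with its length in bytes and a single space."""
    return str(len(message)).encode("ascii") + b" " + message


def split_frames(buffer: bytes):
    """Split received bytes into complete messages plus any unfinished tail.

    Returns (messages, remainder); the remainder should be kept and
    prepended to the next chunk read from the connection.
    """
    messages = []
    while True:
        sep = buffer.find(b" ")
        if sep == -1:
            break  # the length field itself has not fully arrived yet
        length_field = buffer[:sep]
        if not length_field.isdigit():
            raise ValueError(f"bad length prefix: {length_field!r}")
        end = sep + 1 + int(length_field)
        if len(buffer) < end:
            break  # the message body has not fully arrived yet
        messages.append(buffer[sep + 1:end])
        buffer = buffer[end:]
    return messages, buffer


if __name__ == "__main__":
    wire = frame(b"<14>first message") + frame(b"<14>second message")
    for msg in split_frames(wire)[0]:
        print(msg.decode("utf-8"))
```

On the sending side the change is a one-liner: pass each encoded log line through frame() before writing it to the TLS socket. The length counts bytes, not characters, so it has to be computed after the message is encoded.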