Deadbolt: User Management for Databases

Not too long ago, I posted about a project I created for Careerbuilder called CloudSeed. The goal of the project was to make life easier for my team as we began our rapidly accelerating move to the AWS cloud. The project has been largely successful for our team, and as such, we open-sourced it in the hope that others may find it equally useful. Now, I present to you another tool with just such an origin story: Deadbolt

Deadbolt was once again the product of trying to make life easier for our DBAs. Most of them were accustomed to working with Microsoft SQL Server, but now had to adjust to primarily using MySQL and Amazon Aurora. One of their most noted pain points was the lack of Windows Authentication. With Windows Authentication, they never needed to worry about bootstrapping users onto a new system, because Active Directory (AD) did that for them! Without it, users had to be manually created and granted access on each system. Since “manually” is a dirty word on our team, I set to work.

We first researched how other companies got around this issue. It seemed there was a common LDAP package for MySQL that many teams had come to rely upon, so we began digging deeper into that solution. Eventually, we hit a wall: AWS does not allow AD or LDAP authentication for RDS resources. This was a major blow, as we had just finished migrating the last EC2 MySQL instances to RDS. We had a problem with no reliable solution, so it was time for a homegrown one.

The Plan

First off, we needed a management portal. This would allow us to add or remove users, assign them to databases, and assign permissions. We decided on a user-to-group-to-system approach, allowing us to create groups representing functional teams, each with access to the systems owned by that team.

Second, we needed a way to create the users on different systems with the same password. We added a password portal to the system with the ability to hash a password in each database flavor’s preferred format and construct the appropriate user creation/update query.
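For the MySQL flavor, one way to produce such a hash is the mysql_native_password scheme. The sketch below is an illustration under that assumption, not Deadbolt's actual code; the function name is mine.

```python
import hashlib

def mysql_native_password_hash(password):
    # mysql_native_password stores '*' + HEX(SHA1(SHA1(password))),
    # so the portal never has to send the plaintext to the server.
    stage1 = hashlib.sha1(password.encode("utf-8")).digest()
    stage2 = hashlib.sha1(stage1).hexdigest().upper()
    return "*" + stage2

# The result can then be dropped into a (MySQL 5.x style) creation query:
#   CREATE USER 'alice'@'%' IDENTIFIED BY PASSWORD '<hash>';
```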

Lastly, we needed a way to store these hashed passwords securely, so that any new systems would be auto-populated with users. The hashes are not secure enough on their own, so they need to be re-encrypted before storage.

The Execution

I am a sucker for the MEAN stack (with MySQL in place of Mongo), so I started building a Node API with Express. The API handles all authentication, management, and password functionality server-side. Users can then interact with the API through a portal created in Angular and served via Express. The portal looks slightly different depending on whether the logged-in user is a full admin, a group admin, or a developer. Developers can only use the portal to reset their password; group admins can assign new users and permissions within the groups they administrate; and full admins can add and remove systems, groups, and users.

Deadbolt Portal

The appearance of the portal to a full admin

All actions taken through the API or portal are recorded in a history table on the backing database. This database also holds the user information, hashed passwords, group mappings, and system information necessary for propagating users to the databases. All sensitive info is encrypted, per the last point in the plan, using Amazon KMS or AES depending on how the API is configured at set-up.

Try it yourself!

This tool has made managing a growing number of RDS systems a breeze, since we no longer have to worry about users. We can add a system to Deadbolt, and within 5 seconds all the user accounts are created and given appropriate access. If this sounds like something you could use, please give it a try. It’s free and open source, and I will happily answer any questions you may have.

Deadbolt can be found on the Careerbuilder Open Source Org here:



I make good on my promises. I may be 3-6 months late, but I promised you guys that I was open-sourcing one of my work projects that I was super excited about, and the day has finally come. This delay was worth it, however. The timing coincides with Careerbuilder officially adopting an open-source policy which allows developers to open non-business-specific code, as well as use and contribute to open-source libraries. I am proud to say that my project, CloudSeed, is the first to follow this policy and become a publicly viewable open-source repo!

What is it?

Magic in a bottle. Or rather, a server I guess, since no bottles are involved. In essence, CloudSeed provides an easy way to build and maintain “stacks” of resources in the Amazon Cloud. It leverages Amazon’s CloudFormation framework, but strives to remove the human error to which that framework is incredibly prone and to increase maintainability. If you have ever used CloudFormation and gotten lost in the sea of JSON needed for a large template, this tool will help you. For my readers unfamiliar with the Amazon Cloud and its tools, let me give a little background.

The first acronym in the coming alphabet soup here is “AWS”. This is Amazon Web Services, the platform that provides businesses with an externally hosted cloud environment. This means we can move expensive datacenters into AWS, and cut the maintenance costs, location issues, and uptime concerns.

aws services

AWS has a lot of services

The next big guys are probably the services. There’s a BUNCH of these guys, and Amazon adds more every day, but the big ones I’m going to mention are RDS, EC2, and S3. In order, those are Relational Database Service, Elastic Compute Cloud, and Simple Storage Service. RDS is a service providing hosted databases, so you do not need to spin up a server and install a database instance yourself. EC2 is a service offering complete control of a private cloud and the servers therein. One can create complex networks populated with servers in different regions and subnets, just like local datacenters. S3 is a hosted storage system, as the name implies. It allows for easy storage and retrieval of data via an API.

There are many more acronyms, but for the sake of not turning this post into a glossary, I will define the others as we need them.


What about CloudFormation?

All of these services make the Amazon cloud a very useful tool. This means that it gets used for some very important business functions. Things that important always need a backup. So you could feasibly keep a network map and record every change you make, so that if anything happens you can rebuild it according to your notes. And that would totally work, until it doesn’t. It doesn’t take long for a network to grow far too complicated to rebuild by hand. That’s why Amazon created CloudFormation!

CloudFormation is yet another Amazon service. This one accepts JSON templates representing all the services, networks, and servers needed to make a stack, and builds it according to the configurations you set! When you need to add a server, you add it in the JSON template, then rebuild. It keeps a record of all your changes, and makes sure your stack is always consistent and rebuildable at a moment’s notice. Clearly, the smartest way to use AWS is to ditch their web interface and maintain your entire infrastructure through CloudFormation!
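For readers who have never seen one, a CloudFormation template is just JSON describing resources. A minimal, hypothetical template of the kind our team wrote (a VPC with a single subnet) might look like this; the resource names are illustrative:

```json
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Minimal stack: one VPC containing a single subnet",
  "Resources": {
    "AppVpc": {
      "Type": "AWS::EC2::VPC",
      "Properties": { "CidrBlock": "10.0.0.0/16" }
    },
    "AppSubnet": {
      "Type": "AWS::EC2::Subnet",
      "Properties": {
        "VpcId": { "Ref": "AppVpc" },
        "CidrBlock": "10.0.1.0/24"
      }
    }
  }
}
```

At this size it is perfectly manageable; the trouble described below starts when one file grows to describe routes, peering, security groups, and ACLs all at once.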

At least, that’s the theory…

If it ain’t broke…

The reality is that CloudFormation has its own problems. Small stacks are easy to build, configure, and maintain in JSON. Large ones get considerably less easy. Couple that with some questionable choices on Amazon’s side (really, ACL rules are their own part, but security group rules aren’t?) and you end up crippling yourself if you vow to use only CloudFormation.

Our team made an honest go of it. We did pretty well, too, except for the 10-20 minutes needed to debug the new template each time we created a stack. However, it soon got out of control. We added a VPN tunnel to one of our subnets, and it created a hellscape of complicated network interactions. Now the template for creating a subnet (previously only about 100 lines) had to account for VPC peering and routes to the VPN tunnel. Security concerns meant those new routes needed security groups and access control lists, and before you knew it, the template titled “Subnet.json” made a LOT more than just a subnet. When we needed to change a single ACL rule for one of our subnets, we had to dig through ~2000 lines of JSON to find the relevant portion. It’s important to note that this was the case whether or not that specific subnet needed those other features.

I ain't lion

WAY too much JSON for 3 subnets…

Our workflow was broken, and I was assigned to fix it.

I proposed a new workflow using unique templates per stack, which solved the complicated-stacks issue but ended with a ton of duplicated code. I was confident this was the correct approach, though, so I made a simple Python script to illustrate my point. By keeping a record of the common code (mostly the actual resources needed to build a stack), I could prompt for parts and their required parameters. This allowed us to assemble unique stacks without having to copy and paste common blocks. It worked so well that I spent quite a bit of time adding to those common blocks of code, which I began referring to as “Parts”. I was so proud that I showed my team, and watched their dissatisfied faces. They didn’t like it. A script was still too error-prone and much too hard to visualize.

Bunch of whiners if you ask me…

Well, nobody asked me. Instead, I decided to help them visualize it with a simple UI. Side note: UIs are freaking hard. I made an agile card titled “Create CloudSeed UI”, which I estimated at a 2. Here I am, 3 months later, still working on it.

I set my sights on the MEAN stack, mostly because I already knew Node/Express, and my closest coworker had spent the week preaching the word of MongoDB. It was only a day or so before I had a nice-looking API set up to handle stack saves and loads, serve all of the parts files, and push stacks to AWS. Then came Angular. I struggled my way to a mostly working Angular app by the end of the week, littered with a poor understanding of the framework, misused Bootstrap classes, and a non-functioning set of Material Design classes. Regardless of its hideous face, I was proud. The app now presented the user with a collection of parts which could be independently configured and assembled into a working single-file template.

Look how bootstrappy

Buttons > !Buttons

I had replaced the arduous task of typing out a new stack with a few simple clicks. I could click “VPC”, “Subnet”, “Subnet”, “Subnet”, “EC2Instance” and instantly have a stack ready for configuration. More importantly, existing stacks could now be updated as easily as using the web portal for AWS, only this way they were tracked for future rebuilds. The stacks were auto-committed to a git repo for reliable change tracking, and loading them up filled the info into user-friendly forms for easy editing.

Where are we now?

Mobile apps are ALSO hard

It’s mobile-friendly too!

Since CloudSeed’s first showing, I have been pulled in many different directions. The app has been dramatically improved from its first state, but there is still much to be done. I have learned a lot about web apps since I started, and would love to bring that knowledge back to CloudSeed. However, the nature of my job has made it such that I can only really afford to work on CloudSeed when it directly solves a problem we are having within our team. It is primarily for that reason that I sought to open source this project. I want to see what other teams can do with it, and look forward to making it a truly useful tool for all AWS users.

If you or your team use CloudFormation and think this could be a good fit for you, please give it a try! The code is hosted at Careerbuilder’s new Open Source github account and I would be happy to help set you up if you experience any problems! A big thank you to the Careerbuilder Open Source Development Council for allowing me to see this through, and as always, code on.

Open Source All the Things

First off, I’m sorry it’s been so long (just over 2 months by my count). I have no excuses for the lack of posts, besides a lack of content. I have recently moved into a new position at work in which I am actually fulfilled, which has been disastrous for my side projects’ productivity. In addition, I adopted the world’s cutest and neediest dog 2 months ago, and have spent much of my free time taking care of her (for more of the pup, follow her Instagram @me.oh.maya).

Why Now?

So why can I write now? I finally did something again, and damn if it didn’t feel good. I have been working on a webapp at work (true MEAN stack, Mongo and all) which I hope to share with you soon, barring legal issues from the company, and I suddenly got inspired. I’m sure since I’ve been gone you’ve all had time to read my entire posting history and remember all about the super-secret .NET MealManager project. Well, it’s no longer super-secret! As of yesterday, the MealManager Windows application has been open sourced. I do this mostly out of futility, since I dread going back to fix some of the tech debt only to end up with a closed-source application. If you have a Windows machine and want to give it a try, grab the source at the usual spot

What’s this got to do with webapps?

Everything. You see, in making the webapp for work, I glimpsed the future. OK, maybe nothing that dramatic, but I definitely saw the error of my ways in developing a platform-locked installed application. Webapps allow for easy cross-platform support, constant availability, and much lower device requirements. Moreover, Node.js and AngularJS make it pretty easy to tie your server-side and client-side code together in a way that makes the pain of WPF seem pointless.

So a webapp is the obvious next step for this project. For that matter, it is the next step for MANY of my projects in waiting, blocked by my indecision over platform. My Bartenderbot has been resting in pieces between its last life and its new one for far too long; perhaps a webapp interface is the motivation I need to resurrect it. Our new puppy could probably use a webcam monitoring system for while my fiancée and I are at work, so that could probably fit into a MEAN stack. Lastly, I’m getting married soon! How cool would it be to have a wedding webapp (wedapp?) instead of a flat website? All of these are of great interest to me as I further explore what I can do with my new friends AngularJS and Mongo.

What now?

For now, I will work on webapps. Maybe before I get a finished product to show off, I can do a short post about the MEAN stack, fun tricks for building an API, and the process of converting a monolith to a scalable webapp. I am cautious not to promise too much, since I need to be planning that wedding I mentioned, but I promise to try.

In other news, my favorite music streaming service died this week (RIP Grooveshark), and took with it a wonderful community of programmers and generally nice people from the WritheM Radio broadcast. I wanted any of my readers that care to know the community is finding itself a new home at, so feel free to drop in and join for some good music and better discussion. If you see me there (handle is §wimmadude66), say hello! Otherwise, stay tuned for more posts soon-ish, and code on.

REST Easy – An Edison Project

This post represents the end (for now) of my series on the Intel Edison board (part 1 and part 2), wherein I describe my latest project. This was an app designed partly for use in my apartment, partly as a trial in programming for such a small device (storage- and power-wise). In the end, I managed to squeeze out a minimum viable product with lots of room for growth. There was definitely a need for scaling back initially, since the grand design for the project would have overflowed the memory just on install. After swapping some tools for their smaller counterparts, trading uglier code for less space, and dropping some top-level design features, the app was able to build and run on an Edison in almost any storage state, allowing it to run quietly alongside whatever other tools you have decided to use on your Edison board.

What’s it Do?

For once, I have a project which needs no complicated justification for its use. This app deploys a very simple REST API (get the name now?) which can manipulate the many GPIO pins that make this device a real player in the Internet of Things. With no need to understand APIs, no knowledge of JavaScript, and very little understanding of Linux, you can manipulate your Edison’s pins from anywhere with an internet connection. The idea is simple: the Edison has a built-in wifi adapter, which, combined with a DHCP reservation, gives the Edison a permanent local IP address. Forwarding port 3000 (the REST API port) to that address allows you to send commands through your router to the wireless device from anywhere in the world. Sure, you could already SSH in and manually manipulate each pin. What this API REALLY enables is extension. When you can reduce a series of complicated SSH commands to the press of a button, that’s progress. With REST Easy, the button need only send an HTTP request to the API, and all changes will propagate.
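For a concrete sense of what that button press boils down to, here is a hedged sketch of building such a request in Python. Port 3000 and the /api/pins endpoint come from this post; the router IP and the exact payload fields are illustrative placeholders.

```python
import json

# Stand-in for your router's public IP (203.0.113.0/24 is a documentation range)
ROUTER_IP = "203.0.113.7"
url = "http://%s:3000/api/pins" % ROUTER_IP

# Ask the API to drive pin 130 high; field names mirror the pin JSON
payload = json.dumps({"ID": 130, "direction": "out", "value": 1})

# Sending it with only the standard library might look like:
#   from urllib.request import Request, urlopen
#   req = Request(url, payload.encode("utf-8"),
#                 {"Content-Type": "application/json"})
#   urlopen(req)  # POST, since a data body is attached
```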


Very, very simplistically, the app is a prebuilt Node.js application which installs and runs itself via the included install script. Beyond that, the only other trick is a library of very simple Python scripts which use JSON to communicate between the Linux representation of the GPIO pins and the API. These scripts are really aggregate readers/writers of all the status files associated with each pin.

I have no idea who thought this was a good way to handle GPIO

This is a simple view of the file structure for each pin


When a request for the status of a pin is received, the script grabs information about every aspect of that pin from the file system (/sys/class/gpio/gpio<pin#>/[direction, edge, value, etc.]) and compiles it into a JSON object which can be easily sent back over the API. The reverse is also true: when a POST command is sent to the API with a JSON object (or array), the Python script breaks down the object and sets each field to the value in the received data. The JSON object for each pin looks like this:

{
    "ID": 130,
    "active_low": 1,
    "direction": "out",
    "edge": "none",
    "value": 0,
    "power": {
        "async": "disabled",
        "control": "auto",
        "runtime_enabled": "disabled",
        "runtime_status": "unsupported",
        "runtime_active_kids": 0,
        "runtime_active_time": 0,
        "runtime_suspended_time": 0,
        "runtime_usage": 0
    }
}
This should be pretty easy to understand, even with no prior experience with JSON. The words before the colons are the keys, and those after are the values. The format is how JavaScript serializes objects to flat files, hence the name JavaScript Object Notation. This makes it doubly useful in a JavaScript-based API: upon receipt of a JSON payload, the data is immediately converted into an object, which can be manipulated with standard object operations, including dot notation, rather than string manipulation and regexes.
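The aggregation those Python scripts perform can be sketched roughly as below. The function name and exact field list are assumptions for illustration (the real scripts also walk the power/ files); the base path parameter just makes the sketch testable outside an Edison.

```python
import os

def read_pin(pin_id, base="/sys/class/gpio"):
    """Aggregate one pin's sysfs status files into a JSON-ready dict."""
    pin_dir = os.path.join(base, "gpio%d" % pin_id)
    state = {"ID": pin_id}
    for field in ("active_low", "direction", "edge", "value"):
        with open(os.path.join(pin_dir, field)) as f:
            raw = f.read().strip()
        # numeric files (value, active_low) become ints; the rest stay strings
        state[field] = int(raw) if raw.lstrip("-").isdigit() else raw
    return state
```

Writing works the same way in reverse: each key in a received JSON object maps back to one file under the pin's directory.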

What’s Left

By this point you should be expecting that this project has not fully realized its potential (how many projects ever do?). There are several features on my list which will hopefully be added in the near future, and many more which I plan to leave out until needed, unless some willing soul wants to contribute. Some of my planned features:

  • API Keys/Auth
    • The app comes with a SQLite database and authentication system (endpoints /users and /login), but as of yet they are not used for anything. The plan is to require a login every so often to release an API key. This would protect your board from attack by anyone who knows your IP.
  • Aliases
    • While pins are currently accessed by passing their Linux ID number, it would be even better to assign an alias to a pin and use that to access the pin’s state.
  • Groups
    • Many processes using GPIO rely on the simultaneous change of more than one pin, in order to change a multiplexer input or perform some other multi-bit operation. The app does not currently support simultaneously changing pins; however, multiple pins can be accessed in the same request by passing an array of IDs or JSON objects for GET and POST respectively.

I have no intention right now of working on a front end for this application. That task, along with any other features not in the above list, is up to you, the user. This code is 100% open source and available in the usual spot on my github. Feel free to alter, use, adapt, and destroy any of it to suit your needs. The install process is rather simple:

  1. Download the source code and package all but the “installer” directory into a tar.gz archive (this will be done for you when a release medium is linked).
  2. Send the installer folder and tar.gz package to your Edison.
  3. SSH in and run the shell script included in the installer folder.
  4. Assuming your Edison is already set up for wifi, you should be able to access the global GET endpoint at: <Edison’s IP>:3000/api/pins

Go forth and do wonderful things with this. If you use it for something cool, I only ask that you mention me in the documentation and, of course, share a link to what you’ve built so I can admire your work! Next week I will be writing up a very short summary of some projects currently on my plate, so stay tuned!

Crowd Computing

Last year, I was a part of the inaugural HackGT. This is an annual hackathon sponsored by Georgia Tech, which seeks to gather programmers from all around the country for one weekend to develop the best app they can. The grand prize is $60,000. The prize drew a lot of interest, but what compelled me to participate was the presence of a variety of big companies with new technologies. One such pre-announced presence was Intel, with an early look at the Edison board I wrote about last week. The board fascinated me, and the ability to hack on one for a weekend before it was even available for purchase ensured my name would be on the HackGT signup list.


If this word is unfamiliar to you, it’s time to learn. Hackathons are spreading, becoming more frequent at software companies, schools, clubs, and even cities (see HackATL), because of their tendency to produce minimum-viable-product prototypes in a short amount of time. Essentially, a hackathon is just a gathering of programmers with the resources needed for extended programming sessions. Often these hackathons feature diversions and entertainment to allow for breaks, food and drink so you never need to leave, and caffeine and/or alcohol for those late-night coding sessions. At the end of the 24-72 hour span, the apps created by the participating teams and individuals are presented to judges in order to determine winners. These winners could be awarded prizes, have their idea produced, or even be offered a job.

Crowd Computing

Crowd computing was my HackGT project, done over a 48-hour period with 2 teammates. (See how much more sense that makes after the intro?) The idea was to create a big data platform on a tiny board. These Edison boards were great, but they lacked the space and computational power of traditional computers. In theory, however, their price meant there would be many of them. The number of boards, combined with their tendency to be used for passive computation, made them ripe for use in cloud computing. Essentially, jobs that couldn’t run on one board could run on LOTS of boards working together. A simple website would allow you to enroll your board in the program by installing a tiny script. This script reports to the webserver every couple of minutes to verify the availability of resources on the board. When a job submitted to the website needs boards, yours could be chosen, and its un-utilized resources used to compute a portion of the job. When a few dozen boards contribute like this, the resulting power is pretty astounding.

Our app leverages the MapReduce framework common in big data applications, with a tiny twist. Since the boards are hardly big enough to function as nodes, we had to use something with a little more power as the master node. The webserver played that role, running the mapper scripts and distributing data and a reducer script to the Edisons. From there, each board would execute the reducer script on its small portion of data, then return the output to the webserver along with an ID denoting which board the data belonged to. In our proof-of-concept demo, we used a very simple example. A single Edison would first attempt to sort the entire text of War and Peace alphabetically in a very short Python script. Simply allocating the space for the novel was a struggle, and once the sort began, the RAM overflowed and the board rebooted. This was expected: the task is simply too large for the memory and computational capabilities of the device. For contrast, we uploaded the same task to our webservice, to which we had registered 6 boards. A mapper script was created along the following lines:

def map(text):
    words = text.split(' ')
    letters = dict()
    for word in words:
        # map each word to a list keyed by its first letter
        key = word.lower()[0]
        letters.setdefault(key, []).append(word)
    return letters

This split the book into 26 arrays by starting letter (plus a few for symbols). Now we had smaller chunks to work with. The webserver sent a single array of data to each device, along with the index of that array. Since “A” comes first, a machine would receive all the words beginning with “A”, plus an ID of 0. The device also received a short Python script telling it to sort the list, then send the results and original ID back to the webserver. This process repeated until all the arrays of words had been sorted and returned. At that point, the webserver would run its handler, which sorts the lists by ID. Since “A” had an ID of 0, “B” was ID 1, and so on, the result was a completely sorted novel in a short period of time. In our example it took around 15 seconds to sort the entire book. When some of the devices are in use, it may take longer to lobby for CPU time and memory, but the idea remains the same.
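The whole flow above can be simulated in-process with a toy sketch. The structure is illustrative, not the hackathon code: the "server" maps words into per-letter buckets, each "board" sorts one bucket, and the server's handler reassembles the results by ID.

```python
def run_job(text):
    # map phase (on the server): bucket words by first letter
    buckets = {}
    for word in text.split():
        buckets.setdefault(word.lower()[0], []).append(word)

    # assign each bucket an ID in alphabetical order, as in the example
    keyed = list(enumerate(sorted(buckets)))

    # reduce phase (one "board" per bucket): sort the chunk, return (ID, result)
    results = [(i, sorted(buckets[k], key=str.lower)) for i, k in keyed]

    # the server's handler: reorder chunks by ID and concatenate
    ordered = []
    for _, chunk in sorted(results):
        ordered.extend(chunk)
    return ordered

run_job("banana apple cherry avocado")
# -> ['apple', 'avocado', 'banana', 'cherry']
```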

Where are we now?

The code is on my github. It was only recently open-sourced, and there’s a reason it took this long: the code is VERY sloppy. One of the downsides of hackathons is that programming competence tends to decrease with tiredness. After 36 straight hours of working on the code, we began to make VERY bad mistakes. Compound that with a teammate leaving in the middle of the night, frustration with new technologies, and a poor internet connection, and you get a mess. I’m not entirely sure that what is on github will still work, and I know that what was on the webserver no longer works. However, over the next few weeks I intend to revisit it and clean up large sections of the code, hopefully producing a live website soon enough. Please feel free to contribute and fork, or just stay tuned for a beta invite if you own an Edison board (and if you don’t, you totally should).

Visit the code HERE

That’s all for this week. Next week I will wrap up my discussion on the Edison for now with my latest and current project: “Rest Easy”. Until then, raise a glass and code on.

The Internet of Things

Surely by now you have seen this term somewhere, maybe even subconsciously. This term (or its acronym, IoT) has been used to describe anything and everything with an internet connection, to the point that the average consumer probably has no idea what it means. The idea is really pretty simple, though: newer devices should have internet-linked controls, like an API or a web interface, meaning that following a random IP is more likely to lead you to a device than to a person at a computer. In truth, very few devices really need to be called part of the IoT. Most IPs today connect to a computer of some kind, with a small number of exceptions. There are internet-capable thermostats, a few home-automation controllers, and even a toaster with an internet connection, qualifying them as IoT, but the smart TVs and netbooks being advertised as such probably should not qualify.

Intel Edison

May I present to you: the exception to my above rant. Developer devices such as the Raspberry Pi have always had the potential to qualify, but more often than not, in my experience, people use them as small portable computers. The Edison is a little bit different.

The board featuring a rare appropriate use of IoT


This board is a fraction of the size of a Raspberry Pi, because it cuts down on a few notable things. Everything, for example. There is no HDMI, no USB, no Ethernet connector; just a single ridged port on the bottom in an unfamiliar form. This device is not intended to be your portable media device. It is a true member of the Internet of Things. With one of the “blocks” (boards which connect to the unusual connector and extend functionality) you can add GPIO pins, USB power, and a serial connection. Other blocks can add batteries or sensors to increase the usefulness of the device, but for this post, I am going to focus on just the one board.

Soldering in the 4×14 GPIO pins, connecting the 2 microUSB slots to a computer, and opening a serial connection reveals a teeny tiny Linux machine. It runs a distro smaller than the Raspberry Pi’s, called “Yocto”, to suit its limited computing power and memory. Clearly, this is not meant to replace anyone’s laptop. Instead, Yocto is the perfect size for utilizing the other huge benefit of the Edison: a built-in wifi antenna. No more dongles, no tethering oneself to an Ethernet port; just a wirelessly connected computing machine. Configuring the built-in wifi over the serial connection allows for SSH access on the network, meaning one more cable can be eliminated. Conceivably, the serial/power block could be replaced with a battery once SSH is established, making it completely wireless.


What good is a computer without netflix?

Most developers I’ve given this pitch to are already drooling at this point, but for those yet to see the implications, let me explain. This device can run an app, control hardware, or even just host a website. Setting up an IP for the Edison means you can run computations on it, control things from its GPIO pins, or connect to hosted content over the internet, with no human interaction required. Personally, I have 2 IoT projects currently in development on my Edison (blog posts on both to come, but feel free to check out the first here) which can be loaded and set free, interfacing only through an API. Clearly this is not a device for consumers, but rather a platform for developers. This is the current state of much of the Internet of Things, and probably will be until widespread use of home automation and web-based controllers begins. So no, it won’t stream Netflix for you, but with the right software, it could dim the lights, turn up the speakers, and play the movie at the push of a button on a website. Personally, I think that’s pretty cool.

I am trying to work more on the two projects in progress on the Edison in order to release a blog post about at least one of them next Friday, so stay tuned for that. Until next time, raise a glass and code on.

Interview Question – Tail

In my hiatus, I finished my last semester at Georgia Tech and began my job search. I had an offer going into the semester, so I wanted to explore my options before its deadline near the end of the year. As such, in those 3 months of school, I participated in over 20 interviews with several companies across the US. During those interviews, I was exposed to a new style of programming: interview programming (sometimes called “whiteboard coding”). This style of development is so fundamentally different from my experience in classes and work that I decided to dedicate this post (and possibly a few more to come) to a few of the more interesting questions I encountered.

The Problem

Throughout my interviews, I was asked an assortment of odd questions. Some were complicated: “write an algorithm to determine the shortest sequence of words in order out of a paragraph”, and others were downright absurd: “If you had to spend a day as a giraffe, what do you think would be your biggest problem?”. The problem at the center of this post was a seemingly tame one, and possibly my favorite of the many questions I was asked. It is worth noting that one of the key secrets to most interview questions is to find a clever use for a data structure. While many questions seem to have a simple brute-force answer, it is typically a good idea to examine the question for possible uses of data structures (often a hash map). This question was no different.

The problem posed was to implement the Linux command “tail” in O(n) time. For those unfamiliar with the command, running tail on a file or stream displays the last n lines (where n can be specified with an argument). This is useful for monitoring logs, as new records are often appended to the end. It seems a simple problem: just print lines [length−n, length]. But that first requires indexing every line of the file, which causes problems with files shorter than n lines, very large files, and streams. So we need an approach general enough to scale up or down without performance losses.

Hash Map?

Sadly, this time the answer is not a hash map. It is, however, another common data structure which can be creatively applied to solve our problem. A linked list is a very simple data structure: it is made up of nodes, small wrapper objects containing data and a pointer to another node. The list is referenced by storing the node at the head of the chain. This is a convenient structure for our needs because you can keep a short list in memory without concerning yourself with the contents of the whole file.

So we start with a node class containing a line of text and a pointer to another node. This still leaves us with the problem of finding the end of the file, and we have no index. So we need a class to manage the list. This class is responsible for iterating through the file and creating a node for each line. Since we have a controller, we no longer need to turn the entire file into a list. Instead, we append to our list until its length (tracked by the control class) is equal to the requested n lines of text. From that point on, in addition to appending to the end of the list, we move the head pointer forward, which effectively drops a node from the front of the list. Repeating this algorithm through a file or stream returns the expected results regardless of file size. At the end of the process, the list starts at most n lines from the end of the file and ends on the last line, and iterating it prints the correct lines. If a file is shorter than the specified length, the whole file will be printed, and should the file be very large, only the last lines will be printed without filling memory with the entire file.
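The approach above can be sketched in a few lines of Python (the interview answer could be in any language; the class and function names here are my own):

```python
class Node:
    """One line of text plus a pointer to the next node."""
    def __init__(self, line):
        self.line = line
        self.next = None

def tail(lines, n=10):
    """Return the last n lines of an iterable in one O(length) pass.

    The list grows until it holds n nodes; after that, every append
    to the back is paired with moving the head forward one node, so
    at most n lines are ever held in memory.
    """
    head = last = None
    length = 0
    for line in lines:
        node = Node(line)
        if last is None:
            head = last = node
        else:
            last.next = node
            last = node
        if length < n:
            length += 1
        else:
            head = head.next  # slide the window: drop the oldest line
    out = []
    while head is not None:
        out.append(head.line)
        head = head.next
    return out
```

Because `lines` is only iterated once, the same function works on an open file handle or a stream, e.g. `tail(open("app.log"), 20)`.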

The Train Rolls On

This problem had an elegant solution. Think of it as a sliding window or rolling train. The train adds cars until it hits its maximum length, then rolls down the tracks. The contents of the train at the end of the tracks are the desired lines. Visualizations like this can help in an interview, where the questions tend to be more about the way you think, rather than the best solution. Even if you were unable to implement the solution, explaining your approach with imagery shows you understand the problem.

So take some time this week to brush up on your data structures and the everyday struggles of being a giraffe. As always, raise a glass and code on.

The Backlog

It’s finally time for my return to regular posting. Over the last year, the rest of my life caught up with me and conspired to keep me away from any and all free time in which to work on this blog. Just since my last post, I have graduated, gotten engaged, moved twice, and started a new full-time job! It’s been a pretty busy 6 months. On the bright side, thanks to the long hiatus, I have a lot of material to cover over the next few blog posts. Last semester I completed 4 long-term projects in a variety of languages and platforms, and already at work I’m learning Spark, Chef, Scala, Postgres, and more. There will certainly be no shortage of topics. Additionally, before choosing my current job, I ran the gauntlet of interviews with several companies, meeting lots of cool people in the industry. In particular I want to give a shout-out to the Microsoft team in Charlotte, NC. You guys were the first readers I ever met in person, and I hope to make it interesting for you in the coming months! Without further ado…

The Meal Manager

I have discussed this software previously, but this semester, some SERIOUS work got done. At Georgia Tech, you are required to complete your degree with a “capstone” project: a semester-long project simulating a real company development cycle. My team chose to develop the Meal Manager into a working app, and after a semester’s worth of frustration, hard work, and learning experiences, the payoff was tremendous.

Actually details the contents of my kitchen right now...

This is what you see when you first open the app

Voila! It might not look like much, but there is a lot going on here. First, if you compare to the original screenshots, you’ll see my baby got a face-lift. We decided to model the UI after the Windows 8 “metro” patterns, using matte colors and lots of squares. The UI is also now completely configurable through settings, allowing for different color schemes and “look-and-feel”. Additionally, you may notice that a lot of the grid-like structure from the initial app has been replaced by the more modern stack panel look. Behind the UI, the changes are much larger.

This is making me hungry...

More Squares and a lot of recipes.

This screen is where you browse recipes. This picture only shows a few, but our database now contains 48,438 unique recipes! One of the benefits of working as a team was the ability to work on multiple high-priority long-term tasks simultaneously. One of the teammates graciously volunteered to be our data scientist, and set about scraping recipe sites for recipes. Additionally visible on this page are the search and filter controls. The “have” vs “all” radio buttons are especially helpful here. They control a filter on all displayed recipes, determining whether or not to display recipes which meet the criteria, but require ingredients not currently in your pantry. Someone wanting a meal to make right now would use the “have” button, search for some tags or ingredients they like, and choose a recipe. On the other hand, someone planning meals for the week could look at the “all” view of the same search to find recipes they can plan for.
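The app itself is a WPF desktop program, but the “have” vs. “all” logic is simple enough to sketch. Here is an illustrative Python version; the field names and mode strings are my own, not the app’s actual code:

```python
def visible_recipes(recipes, pantry, mode="all"):
    """Filter the browse screen.

    mode="all"  -> show every recipe (useful when planning ahead)
    mode="have" -> show only recipes whose ingredients are all
                   already in the pantry (useful for "right now")
    """
    if mode == "all":
        return list(recipes)
    stocked = set(pantry)
    return [r for r in recipes if set(r["ingredients"]) <= stocked]
```

The same predicate composes with the search and tag filters: a recipe is displayed only if it survives every active filter.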

Speaking of planning…

Built this calendar control by hand. APPRECIATE IT!!!!

Double the views for double the fun!

This is the planner view. When you want to make a recipe in the future, you can plan it from here or the recipe view, and it will appear on the calendar. The calendar also has two view modes for different use cases. Some people want to see the whole month at a time; others just need to know the upcoming week. As such, pressing the “Change view” button will swap between a full month view and an agenda-style view of the upcoming 7 days. The information on the calendar is preserved across views, so don’t worry about choosing a favorite yet. It’s OK to love them equally, I know I do.

I don't have a witty comment here. Forgive me

Shopping made easy

This is the end result of your valiant planning effort. The shopping list view lets you select a date range, and it will create a list of every ingredient required to make the meals planned in that window, excluding those already in the pantry. Currently it is a binary system, concerned only with having/not having rather than quantities, but there are changes in the works to correct that.
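Conceptually, the shopping list is a set difference over the planned date range. A minimal Python sketch of that logic (again illustrative; the real app is WPF and the field names here are hypothetical):

```python
from datetime import date

def shopping_list(planned_meals, pantry, start, end):
    """Everything needed for meals planned in [start, end],
    minus what is already stocked. Binary for now: presence
    only, no quantities."""
    needed = set()
    for meal in planned_meals:
        if start <= meal["date"] <= end:
            needed.update(meal["ingredients"])
    return sorted(needed - set(pantry))
```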

So what’s left?

We worked hard on this.

There were really a TON of branches...

Look at all those branches converging

It took some time to mesh as a pseudo-company, but once we got rolling, the biggest limitation was the time we had together. Given another semester, this code could probably be production-ready. However, we managed to make a fair bit of progress. I have dubbed this build a “minimally viable product”, ready for beta testing. There are several bugs, a few features we wanted but didn’t get to, and I’m sure some which we haven’t even thought of. In the next couple weeks I will be migrating the software from using a shared development database to a local database included in the installer. After that, the beta test begins. We have a few people who have volunteered, but if you would like to join the beta and help us out, comment below and I will notify you when the build is ready.

It’s a new year, and one in which I hope to write here a lot more, so please stay tuned. Most importantly though, and as always, raise a glass and code on.

Woodbuntu’s Monitoring Software

Given that my recently created (and blogged about) home theater PC is functionally a piece of furniture, the last thing we need is to be constantly checking on it. Rather, the machine should manage itself and notify us (the users) when updates or intervention are needed. As such, I played around with my favorite scripting language to make some monitoring tools.


Using a very, very simple Python script, I am able to read in a file and send its contents to a list of recipients. Once that was built, we needed a way to create the file for each email, including HTML formatting. Another simple Python script reads in update lists and status files to create an HTML file, which can then be sent by the original tool. The last step was automating the process: a few bash scripts placed in the automatically called locations (init.d, cron) consolidate the necessary information into a file which can be passed through the Python scripts.
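The scripts themselves aren’t reproduced here, but the sender boils down to the standard library’s smtplib and email modules. A minimal sketch, with placeholder addresses and SMTP host (the real tool reads its To, From, and Auth fields from configuration):

```python
import smtplib
from email.mime.text import MIMEText

def build_message(path, subject, sender, recipients):
    """Wrap the contents of an HTML report file in an email message."""
    with open(path) as f:
        msg = MIMEText(f.read(), "html")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    return msg

def send_report(msg, sender, recipients, password,
                host="smtp.example.com", port=587):
    """Deliver the message over an authenticated TLS connection."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(sender, password)
        server.sendmail(sender, recipients, msg.as_string())
```

A cron job then only has to generate the HTML file and call the sender with the file path and a subject line.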

actual received email

I should log on and update the end table!

The result is a daily email or two (perhaps I will consolidate the daily emails into one soon!) alerting us of any power cycles, updates, or necessary changes. There is support for a few extra features which I have not implemented yet, but will soon (read: eventually). These tools can be used for many things I have not thought of, I’m sure, so I made them highly generic. The builder can accept any flags you want to define, so its behavior can be expanded without limit. The sender can send any file desired, with a customizable subject set on the command line. Both tools are written fairly clearly in the hopes they can be reused for new and different things to suit each need. The code is available in the usual spot here and is intended for Linux systems. I must clarify that these tools have only been tested on the end table PC, so excuse any compatibility errors.

To install, simply download the repository, edit the sender with the To, From, and Auth fields of your choosing, then run it as root.

Hope you get some use out of this tool, and as always, raise a glass and code on.


Ever watched an old spy movie? I’m talking your classic heists: expensive loot, black tights, ski masks, and the inevitable laser security grid. What if you could own one of those for yourself? For my bartenderbot project (which I am now affectionately calling pi bar) I have been tasked with creating a novel sensor. Since the nature of the product leads to inebriated users, the system needs to be fairly idiot-proof. As such, I am creating a laser security grid for cups.


Beware the laser array

Cup Security

The cups used in this application have a fairly limited size range. The code hard-scales recipes to max out at 12 oz to prevent overflow in Solo cups and shakers, and the smallest drink that makes much sense is 1 oz. As such, we can make some assumptions about a cup’s volume from another property: its height. This laser security array will read that height (approximately) and give us a best guess of cup size. It is in no way precise, but it will hopefully prevent drunken users from pouring a 12 oz Long Island Iced Tea into a 1 oz shot glass.

So where do lasers come into this? EVERYWHERE. The new design for the cup platform will look like the museum floor around a priceless sculpture centerpiece. One side of the platform now consists of a small project box with 3 inconspicuous metal washers sticking out of the side at various heights. The other side looks identical, minus the washers; instead there are small holes in the box, with no visible parts.



The super secret dark boxes contain 2 cardboard sheets, keeping the insides as dark as possible. This is important for the tiny light-dependent resistors (LDRs) nestled in the back. The LDRs are wired to 5V through a simple comparator circuit, so we can infer the amount of light from their resistance levels. These resistors supply around 25 megaohms of resistance in relative darkness, and close to 0 ohms in direct light. I think you can see where this is going. When there is no cup on the platform, all three lasers hit the LDRs head on, supplying our detector circuit with the information we need to determine a cup’s presence. When at least one is blocked, we know SOMETHING is in the way. Using this, we can tell approximately how tall our vessel is! Three lasers allow us to distinguish a shot glass, a lowball, and a highball. We are ignoring specialty glasses like martini and margarita glasses for now, for simplicity. But this can’t solve all our problems. You see, glasses are often made of glass, and glass is used in many applications precisely for its ability to NOT block light. So how does the system handle glasses?
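The height logic itself is tiny: a taller vessel blocks every beam below its rim, so only the highest blocked beam matters. A sketch in Python (assuming the beams are ordered bottom to top; the function and labels are my own, not the pi bar’s actual firmware):

```python
def classify(blocked):
    """Guess the vessel from (bottom, middle, top) beam-blocked flags.

    A taller glass blocks every beam below its rim, so the highest
    blocked beam determines the guess.
    """
    bottom, middle, top = blocked
    if top:
        return "highball"
    if middle:
        return "lowball"
    if bottom:
        return "shot glass"
    return "no cup"
```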



LDRs are not binary; they are analog (which is another whole problem that will probably get its own blog post once I solve it), meaning they have a large range of resistance dependent on the AMOUNT of light, rather than just its presence. So when a laser passes through a glass, as long as the beam is weakened, we will be OK. Since drinking glasses are made of glass far more than a micron thick, the scattering of light as it changes medium is quite noticeable. Pinholes on the sensor box will allow direct, uninterrupted laser light to hit the LDRs head on, but scattered light will hit with much less intensity. A circuit can be used to measure the voltage change from the increased resistance and “call it” at a certain point. That point is called a threshold. This threshold will need to be experimentally determined due to the near-infinite variety of drinking vessels.
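In software terms, the experiment boils down to picking a cutoff between observed “beam clear” and “beam blocked” readings. A sketch, assuming a 10-bit ADC-style value where brighter light reads higher (the numbers and function names are hypothetical):

```python
def calibrate(clear_readings, blocked_readings):
    """Pick a threshold halfway between the dimmest 'beam clear'
    reading and the brightest 'beam blocked' reading observed."""
    return (min(clear_readings) + max(blocked_readings)) / 2

def beam_blocked(adc_reading, threshold):
    """True when the light level falls below the threshold: either an
    opaque cup blocks the beam, or a glass wall scatters it enough."""
    return adc_reading < threshold
```

Calibrating per beam, with a variety of glasses on the platform, should absorb most of the vessel-to-vessel variation.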


Project Progress

The laser array has eaten up a fair amount of my time, but this weekend should be a productive one for the pi bar. The design has been created, so a rudimentary parts list is ready for shopping. Once a large amount of PVC and lumber has been acquired, the machine should begin to take shape. I hope to be posting a blog entry with a frame soon, so stay tuned, and keep coding.