Code

CloudSeed

I make good on my promises. I may be 3-6 months late, but I promised you guys that I would open-source one of my work projects that I was super excited about, and the day has finally come. The delay was worth it, however. The timing coincides with Careerbuilder officially adopting an open-source policy which allows developers to open non-business-specific code, as well as use and contribute to open-source libraries. I am proud to say that my project, CloudSeed, is the first to follow this policy and become a publicly viewable open-source repo!

What is it?

Magic in a bottle. Or rather, a server I guess, since no bottles are involved. In essence, CloudSeed provides an easy way to build and maintain “stacks” of resources in the Amazon Cloud. It leverages Amazon’s CloudFormation framework, but strives to remove human error (to which that framework is incredibly prone) and increase maintainability. If you have ever used CloudFormation and gotten lost in a sea of JSON needed for a large template, this tool will help you.  For my readers unfamiliar with the Amazon Cloud and its tools, let me give a little background.

The first acronym in the coming alphabet soup here is “AWS”. This is Amazon Web Services, and it refers to the platform which provides businesses with an externally hosted cloud environment. This means we can move expensive datacenters into AWS, and cut the maintenance costs, location issues, and uptime concerns.

aws services

AWS has a lot of services

The next big guys are probably the services. There’s a BUNCH of these guys, and Amazon adds more every day, but the big ones I’m going to mention are RDS, EC2, and S3. In order, those are Relational Database Service, Elastic Compute Cloud, and Simple Storage Service. RDS is a service providing hosted databases, so you do not need to spin up a server and install a database instance yourself. EC2 is a service offering virtual servers that you control completely, along with the private network they live in. One can create complex networks populated with servers in different regions and subnets, just like local datacenters. S3 is a hosted storage system, as the name implies. It allows for easy storage and retrieval of data via an API.

There are many more acronyms, but for the sake of not turning this post into a glossary, I will define the others as we need them.

What about CloudFormation?

All of these services in the Amazon cloud make it a very useful tool. This means that it gets used for some very important business functions. Things that important always need a backup. So you could feasibly keep a network map and record every change you make, so that if anything happens you can rebuild it according to your notes. And that would totally work, until it doesn’t. It doesn’t take too long for a network to get far too complicated to rebuild by hand. That’s why Amazon created CloudFormation!

CloudFormation is yet another Amazon service. This one accepts JSON templates representing all the services, networks, and servers needed to make a stack, and builds it according to the configurations you set! When you need to add a server, you add it in the JSON template, then rebuild. It keeps a record of all your changes, and makes sure your stack is always consistent and rebuildable at a moment’s notice.  Clearly, the smartest way to use AWS is to ditch their web interface and maintain your entire infrastructure through CloudFormation!
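For readers who have never seen one, here is a rough idea of what such a template looks like. This is a minimal sketch written as a Python dictionary and dumped to JSON; the logical names `DemoVpc` and `DemoSubnet` and the CIDR ranges are made up for illustration and are not taken from our real stacks.

```python
import json

# A deliberately tiny CloudFormation template: one VPC and one subnet inside it.
# "DemoVpc", "DemoSubnet" and the CIDR ranges are invented for this example.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack: one VPC with a single subnet",
    "Resources": {
        "DemoVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {"CidrBlock": "10.0.0.0/16"},
        },
        "DemoSubnet": {
            "Type": "AWS::EC2::Subnet",
            "Properties": {
                # {"Ref": ...} points at another resource in the same template
                "VpcId": {"Ref": "DemoVpc"},
                "CidrBlock": "10.0.1.0/24",
            },
        },
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Even this toy stack needs about twenty lines of JSON, which gives a feel for how quickly a real network adds up.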

At least, that’s the theory…

If it ain’t broke…

The reality is that CloudFormation has its own problems. Small stacks are easy to build, configure, and maintain in JSON. Large ones get considerably less easy. Couple that with some questionable choices on Amazon’s side (Really, ACL rules are their own part, but security group rules aren’t?) and you end up crippling yourself if you vow to only use CloudFormation.

Our team made an honest go of it. We did pretty well too, excepting the 10-20 minutes needed to debug the new template each time we created a stack. However, it soon got out of control. We added a VPN tunnel to one of our subnets, and it created a hellscape of complicated network interactions. Now, the template for creating a subnet (which was previously only about 100 lines) needed the possibility to include VPC peering and routes to the VPN tunnel. Security concerns meant those new routes needed security groups and access control lists, and before you knew it, the template titled “Subnet.json” made a LOT more than just a subnet. When we needed to change a single ACL rule for one of our subnets, we now had to dig through ~2000 lines of JSON to get to the relevant portion. It’s important to note that this was the case whether that specific subnet needed those other features or not.

I ain't lion

WAY too much JSON for 3 subnets…

Our workflow was broken, and I was assigned to fix it.

I proposed a new workflow using unique templates per stack, which solved the complicated stacks issue, but ended with a ton of reused code. I was confident this was the correct approach though, so I made a simple Python script to illustrate my point. By keeping a record of the common code (mostly the actual resources needed to build a stack) I could prompt for parts and their required parameters. This allowed us to assemble unique stacks without having to copy and paste common blocks. It worked so well, I spent quite a bit of time adding to those common blocks of code, which I began referring to as “Parts”. I was so proud, I showed my team and watched their dissatisfied faces. They didn’t like it. A script was still too error prone and much too hard to visualize.
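To give a flavor of the “Parts” idea, here is a minimal sketch of how reusable blocks could be filled in and merged into one template. It is not the actual CloudSeed code: the `PARTS` dictionary, the `{placeholder}` syntax, and the `fill` and `assemble` helpers are all assumptions made up for this illustration.

```python
import json

# Hypothetical "parts": reusable resource blocks with {placeholders} for parameters.
PARTS = {
    "VPC": {
        "Type": "AWS::EC2::VPC",
        "Properties": {"CidrBlock": "{cidr}"},
    },
    "Subnet": {
        "Type": "AWS::EC2::Subnet",
        "Properties": {"VpcId": {"Ref": "{vpc}"}, "CidrBlock": "{cidr}"},
    },
}


def fill(value, params):
    """Return a copy of a part with {name} placeholders substituted."""
    if isinstance(value, str):
        return value.format(**params)
    if isinstance(value, dict):
        return {key: fill(inner, params) for key, inner in value.items()}
    if isinstance(value, list):
        return [fill(inner, params) for inner in value]
    return value


def assemble(selections):
    """Build one template from (part name, logical ID, params) tuples."""
    resources = {}
    for part_name, logical_id, params in selections:
        if logical_id in resources:
            raise ValueError(f"duplicate logical ID: {logical_id}")
        resources[logical_id] = fill(PARTS[part_name], params)
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}


stack = assemble([
    ("VPC", "MainVpc", {"cidr": "10.0.0.0/16"}),
    ("Subnet", "SubnetA", {"vpc": "MainVpc", "cidr": "10.0.1.0/24"}),
    ("Subnet", "SubnetB", {"vpc": "MainVpc", "cidr": "10.0.2.0/24"}),
])
print(json.dumps(stack, indent=2))
```

Each stack gets its own small template, while the resource definitions themselves live in exactly one place.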

Bunch of whiners if you ask me…

Well, nobody asked me. Instead, I decided to help them visualize it with a simple UI. Side note, UIs are freaking hard. I made an agile card titled “Create CloudSeed UI”, which I estimated at a 2. Here I am 3 months later, still working on it.

I set my sights on the MEAN stack, mostly because I already knew Node/Express, and my closest coworker had spent the week preaching the word of MongoDB. It was only a day or so before I had a nice looking API set up to handle stack saves and loads, serve all of the parts files, and push stacks to AWS. Then came Angular. I struggled my way to a mostly working Angular app by the end of the week, littered with a poor understanding of the framework, misused Bootstrap classes, and a non-functioning set of Material Design classes. Regardless of its hideous face, I was proud. The app now presented the user with a collection of parts which could be independently configured and assembled into a working single-file template.

Look how bootstrappy

Buttons > !Buttons

I had replaced the arduous task of typing out a new stack with a few simple clicks. I could click “VPC”, “Subnet”, “Subnet”, “Subnet”, “EC2Instance” and instantly have a stack ready for configuration. More importantly, existing stacks could now be updated as easily as using the web portal for AWS, only this way they were tracked for future rebuilds. The stacks were auto-committed to a git repo for reliable change tracking, and loading them up filled the info into user-friendly forms for easy editing.

Where are we now?

Mobile apps are ALSO hard

It’s mobile friendly too!

Since CloudSeed’s first showing, I have been pulled in many different directions. The app has been dramatically improved from its first state, but there is still much to be done. I have learned a lot about web apps since I started, and would love to bring that knowledge back to CloudSeed. However, the nature of my job has made it such that I can only really afford to work on CloudSeed when it directly solves a problem we are having within our team. It is primarily for that reason that I sought to open source this project. I want to see what other teams can do with it, and look forward to making it a truly useful tool for all AWS users.

If you or your team use CloudFormation and think this could be a good fit for you, please give it a try! The code is hosted at Careerbuilder’s new Open Source GitHub account, and I would be happy to help set you up if you experience any problems! A big thank you to the Careerbuilder Open Source Development Council for allowing me to see this through, and as always, code on.

Crowd Computing

Last year, I was a part of the inaugural HackGT. This is an annual hackathon sponsored by Georgia Tech, which seeks to gather programmers from all around the country for one weekend to develop the best app they can. The grand prize is $60,000. The prize drew a lot of interest, but what compelled me to participate was the presence of a variety of big companies with new technologies. One such pre-announced presence was Intel, with an early look at the Edison board I wrote about last week. The board fascinated me, and the ability to hack on one for a weekend before it was even available for purchase ensured my name would be on the HackGT signup list.

Hackathons

If this word is unfamiliar to you, it’s time to learn. Hackathons are spreading, becoming more frequent at software companies, schools, clubs, even cities (see HackATL) because of their tendency to produce minimum viable product prototypes in a short amount of time. Essentially, a hackathon is just a gathering of programmers with the resources needed for extended programming times. Often these hackathons feature diversions and entertainment to allow for breaks, food and drink so you never need to leave, and caffeine and/or alcohol for those late night coding sessions. At the end of the 24-72 hour span, apps created by the participating teams and individuals are presented to judges in order to determine winners. These winners could be awarded prizes, or have their idea produced, or may even be offered a job.

Crowd Computing

Crowd computing was my HackGT project, done over a 48 hour period with 2 teammates. (See how much more sense that makes after the intro?) The idea was to create a big data platform on a tiny board. These Edison boards were great, but they lacked space and computational power compared to traditional computers. In theory however, their price meant that there would be many of them. The number of boards combined with a tendency to be used for passive computation made them ripe for use in cloud computing. Essentially, jobs that couldn’t be run on one board could be run on LOTS of boards working together. A simple website would allow you to enroll your board in the program by installing a tiny script. This script reports to the webserver every couple of minutes to verify the availability of resources on the board. When a job submitted to the website needs boards, yours could be chosen, and the unused resources would be used to compute a portion of the job. When a few dozen boards contribute as such, the resultant power is pretty astounding.

Our app leverages the MapReduce framework common in big data applications, with a tiny twist. Since the boards are hardly big enough to function as nodes, we had to use something with a little more power as the master node. The webserver played that role, running mapper scripts that distribute the data and a reducer script to the Edisons. From there, the boards would each execute the reducer script on their small portion of data, then return the output to the webserver along with an ID which denotes which portion of the data it was. In our proof-of-concept demo we used a very simple example. A single Edison would first attempt to sort the entire text of War and Peace alphabetically in a very short Python script. Simply allocating the space for the novel alone was a struggle, and once the sort process began, the RAM began to overflow and the board rebooted. This was expected. This task is simply too large for the memory and computational capabilities of the device. For contrast, we uploaded the same task to our webservice, to which we had registered 6 boards. A mapper script was created along the following lines:

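from collections import defaultdict

def map(text):
    # split() with no argument drops empty strings, so word[0] is always safe
    words = text.split()
    # defaultdict creates the empty list the first time a letter is seen
    letters = defaultdict(list)
    for word in words:
        # map each word to a list by its first letter
        letters[word.lower()[0]].append(word)
    return letters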

This split the book into 26 arrays by the starting letter (plus a few for symbols) for every word in the book. Now, we had smaller chunks we could work with. The webserver sent a single array of data to each device, along with the index of the array. Since “A” comes first, a machine would receive all the words beginning with “A”, plus an ID of 0. The device also received a short Python script, which told it to sort the list, then communicate the results and original ID back to the webserver. This process repeated until all the arrays of words had been sorted and returned. At that point, the web server would run its handler, which sorts the lists by ID. Since “A” had an ID of 0, “B” was ID 1, and so on, the result was a completely sorted novel in a short period of time. In our example it took around 15 seconds to sort the entire book. When some of the devices are in use it may take longer to lobby for access to CPU time and memory, but the idea remains the same.
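The whole round trip can be simulated on a single machine. The following is a minimal sketch, not the hackathon code: the function names `map_by_first_letter`, `board_sort`, and `distributed_sort` are invented here, the “boards” are plain function calls, and IDs are assumed to follow the sorted order of the first characters so that symbol chunks get IDs too.

```python
import random
from collections import defaultdict


def map_by_first_letter(text):
    """Master step: group words by their lowercased first character."""
    groups = defaultdict(list)
    for word in text.split():
        groups[word.lower()[0]].append(word)
    return groups


def board_sort(chunk_id, words):
    """Work done on one board: sort its chunk and echo the ID back."""
    return chunk_id, sorted(words, key=str.lower)


def distributed_sort(text):
    groups = map_by_first_letter(text)
    # IDs follow the order of the keys, so chunk 0 holds the words that sort first.
    chunks = [(chunk_id, groups[key]) for chunk_id, key in enumerate(sorted(groups))]
    results = [board_sort(chunk_id, words) for chunk_id, words in chunks]
    random.shuffle(results)  # boards may answer in any order
    results.sort(key=lambda result: result[0])  # handler: put chunks back in ID order
    return [word for _, words in results for word in words]


sample = "the quick Brown fox jumps over the lazy dog And 42 apples"
print(distributed_sort(sample))
```

Because every word in a chunk shares its first character, concatenating the sorted chunks in ID order gives the same result as one big case-insensitive sort, without any single board ever holding the whole book.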

Where are we now?

The code is on my GitHub. It was just recently open-sourced, and there’s a reason it took this long. The code is VERY sloppy. One of the downsides to hackathons is that programming competence tends to decrease with tiredness. After 36 straight hours of working on the code, we began to make VERY bad mistakes. Compound that with a teammate leaving in the middle of the night and frustration with new technologies and poor internet connection, and you get a mess. I’m not entirely sure that what is on GitHub will actually work anymore, and I know that what was on the webserver no longer works. However, over the course of the next few weeks, I intend to revisit it and clean up large sections of the code, hopefully producing a live website soon enough. Please feel free to contribute and fork, or just stay tuned for a beta invite if you own an Edison board (and if you don’t you totally should).

Visit the code HERE

That’s all for this week. Next week I will wrap up my discussion on the Edison for now with my latest and current project: “Rest Easy”. Until then, raise a glass and code on.

Architecture: More Than Drawing Buildings

Chances are, if you are in the habit of speaking to programmers, you will hear someone referred to as an “Architect”. Surely, this person is not the designer of the building, still deeply involved with the company inhabiting it. Surely of all the places in a corporation in which to place such a person, the development team would not be the first choice. And yet, no dev team would be complete without at least one architect.

In case you haven’t figured it out, the architects I am referring to deal with the code architecture. Their job parallels that of a traditional architect in all ways but the medium. Where a traditional architect explores viable designs for the building appearance, the code architect is concerned with the appearance of the code. Just as the traditional architect can draw from Gothic, Renaissance, or Modern structures, a code architect uses design patterns to match the desired appearance of the code. Design patterns should be familiar to those who have perused code banks before. Patterns such as the Factory Pattern, Singleton Pattern, and Flyweight Pattern (more here) are the frameworks off of which code architects build the structure of their application. These applications may have a unique structure by the end, but often include large sections adhering to one or more patterns.

Why do I care?

I am in a unique situation for this question. As a student, I have seen and written MANY poorly architected programs. I am myself guilty of the monolithic class, every violation of the SOLID principles known to man and beast, and cyclical dependencies galore. What sets me apart is that I have seen the error of my ways. At my first development job, I got to work with a veteran of code architecture. He encouraged me to separate concerns, abstract business logic away from models, and write enterprise software that was maintainable and scalable. The difference in my code was astounding. No longer did I have multi-thousand line classes. No longer need I worry about changing constructors or variable names. The architecture allowed me to create concrete references, set it and forget it.

But… That’s a lot of work…

I know the pain. For instance, in my current project I am designing a Pantry Manager/Recipe box/Meal Planner. If I want to make a new button on the UI that saves an item to the pantry, I must follow a checklist:

My Recipe Box Architecture

  1. Create Class (in this case, something like PantryItemModel in the Models Project)
  2. Create method in the relevant Manager (in this case the PantryManager in the Manager Project)
  3. Add method definition to Manager Interface (IPantryManager in the Interfaces Project)
  4. Create method in relevant Data Access Object (DAO) class  (PantryDAO in the Data Project)
  5. Add method definition to the DAO Interface (IPantryDAO in the Interfaces Project)

God forbid I need a new manager or DAO, because then I have to alter the Manager and DAO Factories in the Factory Project. As you can see, the method call is passed down the chain. This seems unnecessary at first glance; why not just call the DAO method directly from the UI? It is all about scalability. When the project grows, it is likely to change. The data fetching and saving, however, is likely to remain constant. By abstracting the business logic away from the data logic, changes can be made on the front end which use the same methods from further back. The structure also allows for project isolation, wherein no class knows about anything but what is needed for its function.
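The checklist above can be sketched end to end in a few dozen lines. My project itself is not written in Python; this is just a compact illustration of the layering. The in-memory DAO, the quantity rule in the manager, and the snake_case method names are assumptions invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class PantryItemModel:  # Models project
    name: str
    quantity: int


class IPantryDAO(ABC):  # Interfaces project
    @abstractmethod
    def save_pantry_item(self, item: PantryItemModel) -> None: ...


class IPantryManager(ABC):  # Interfaces project
    @abstractmethod
    def save_pantry_item(self, item: PantryItemModel) -> None: ...


class InMemoryPantryDAO(IPantryDAO):  # Data project (stand-in for real storage)
    def __init__(self):
        self.items = []

    def save_pantry_item(self, item):
        self.items.append(item)


class PantryManager(IPantryManager):  # Manager project: business rules live here
    def __init__(self, dao: IPantryDAO):
        self._dao = dao

    def save_pantry_item(self, item):
        if item.quantity <= 0:  # hypothetical business rule
            raise ValueError("quantity must be positive")
        self._dao.save_pantry_item(item)


class ManagerFactory:  # Factory project: the only place that names concrete classes
    _dao = InMemoryPantryDAO()

    @classmethod
    def get_pantry_manager(cls) -> IPantryManager:
        return PantryManager(cls._dao)


def save_pantry_item(item):  # UI-side domain call
    ManagerFactory.get_pantry_manager().save_pantry_item(item)


save_pantry_item(PantryItemModel("flour", 2))
print(ManagerFactory._dao.items)
```

Swapping the in-memory DAO for a database-backed one would only touch the factory; the UI call and the manager stay exactly as they are.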

How could this benefit me?

Once the checklist above is complete for all data fetches and saves, the benefits become clear. Now when I want to save a new item in the pantry, I simply gather the form data into a new PantryItemModel, and pass that down the chain with something like this:

//domain call
private void savePantryItem(PantryItemModel pantryItem){
    IPantryManager manager = ManagerFactory.getPantryManager();
    manager.savePantryItem(pantryItem);
}

With a set of domain calls for each user control, I can rule the WORL- ahem. I mean: I can rapidly code the UI and gather all necessary data for display or saving. A couple long nights of setting up the architecture turns all the business logic and tweaking into a few hours’ work. Mistakes happen, bosses change their minds and clients complain. Isn’t it better to change the wallpaper, rather than rebuild the wall?

Blog Stuff

I would like to formally apologize for my drop-off in posts lately, and let you know what’s next. My bartenderbot project has hit a brick wall in the form of limited time. The next step is the construction of the frame for testing, and until I can find the time to hit up Home Depot for some resources, and then construct said resources, the project remains on my shelf. As such, the desire to code has manifested itself in the form of my Recipe manager thingy (which needs a name BTW) so updates on that are coming soon. Until then, grab a drink and code on.

Blast from the past, My first coding project

I’m going to be honest here, I’m updating for the poor HR representatives that have to look at this blog for research after reading my resume at the GT career fair. In order to give some personality to this mysterious author whose thoughts you read so diligently, I have decided to write about my first real coding project!

A little background: when I graduated high school, I KNEW I wanted to do Aerospace Engineering. Until orientation at GT (we call it FASET), at which point I decided I KNEW I wanted to do Biochemistry. I believed that new path was the one for me until I took a wonderful class in Computer Science. At that point, I was sure that I had been hasty before, so I waited an entire semester before realizing that this time I REALLY KNEW. So what magical class was so powerful as to make me change majors again and stick with it?

It was Jython

For the uninitiated, Jython is an implementation of the programming language “Python” that runs on the Java platform and can use common Java libraries. This allows beginner programmers (like myself at the time) to create very clean object-oriented Python code in a neat IDE with a console, and makes visual animations easy.

Jython code in the JES IDE

Jython was my first experience in coding, and it saw my first “Hello World” program, very basic number programming, image manipulations, and eventually animations. The final project of this class used a unique aspect of Jython called “Turtles”. You see, in the parallel version of this class which was required for CS majors, the end of the class involved controlling small robots with attached pens around in a pattern. These robots were called ScribblerBots, but they were relatively expensive. Instead, our class made use of the visual libraries provided by Java to create small turtles in a picture. The turtles could then be commanded just like the robots, moving and drawing lines, but they could also be used for animation. By using an array of turtles, each with a certain job, the turtles could change the background to create animation, they could change colors to act as an animated object themselves, or they could drop images onto the screen to create new visual objects.

Our Final Assignment

We were expected to use everything we learned throughout the year to create an animation which satisfied several requirements. As best I can remember them, the minimum  requirements were:

  • Use at least 4 different turtles on the screen at the same time
  • Make at least one turtle change color and size
  • Make a turtle draw, move, stop drawing, and move again
  • Use a randomizer to change at least one attribute of one turtle.  (I went a little crazy on this one)

Beyond those requirements, the assignment was very free form, and we were encouraged to be creative. I took it to heart.

What Horror Hath I Wrought?

Screenshot of the animation

Pokémon. Registered copyright or trademark or whatever, but let’s be real, it was Pokémon. My goal was to recreate a Pokémon trainer encounter and subsequent battle. For those that have a JES environment (get it here) you can download my code and watch it with full randomness at the usual place. Feel free to download and parse it to see how it works, but remember this is my first real coding work.

For those who have no interest or capability to parse the code and execute it natively, I have a screen recording of the execution.

I hope you enjoy the movie and stay tuned for more!

The Word of the Day is List!

I am still working on a post about the Bartenderbot project (gonna be a big post) so in the meantime, I’d like to talk to you about Lists.

Quick background for the non-programmer audience, bear with me programmers.
There are many ways of storing data in computer science. These ways of storing are called “Data Structures”. They range from stacks and heaps to arrays and lists. The ones I will be discussing today in the most detail are arrays and lists. Arrays are ordered lists, almost like a row of numbered boxes each holding some data. This allows data to be accessed according to its position in the array. Arrays are notated with square brackets and indexed by an integer in those brackets.

int[] arr = new int[5];  // five boxes, indexed 0 through 4
int x = arr[2];          // read the value stored in the third box

Because of their ability to store and load data quickly, arrays are quite common, and my code used to be full of them. That is until I discovered the full potential of lists.

Arrays have a fatal flaw. When an array is made, it has a set length. That is to say if you want to add data to a full array, you need to create a new longer array, copy all the old data in to the new array, then add the new data. Lists avoid this flaw.

List is actually a misnomer. java.util.List is an interface, meaning it cannot be instantiated directly. Rather, you must use a class that implements List. My favorite such class is java.util.ArrayList. ArrayLists use the aforementioned arrays as backing structures, but take care of all the messy array reassignment internally. So if you have a full ArrayList, and you want to add one more object, you only need to call list.add(object); and the list will append the new object.
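To make the “messy array reassignment” concrete, here is a minimal sketch of a list that grows a fixed-size backing array on demand. It is written in Python purely for brevity and is not how java.util.ArrayList is actually implemented: the class name `GrowableList` and the capacity-doubling policy are assumptions chosen for illustration.

```python
class GrowableList:
    """Toy list backed by a fixed-size array that is replaced when full."""

    def __init__(self, capacity=2):
        self._backing = [None] * capacity  # stands in for a fixed-length array
        self._size = 0

    def add(self, item):
        if self._size == len(self._backing):
            # Full: allocate a longer array and copy the old data across.
            bigger = [None] * max(1, 2 * len(self._backing))
            for i in range(self._size):
                bigger[i] = self._backing[i]
            self._backing = bigger
        self._backing[self._size] = item
        self._size += 1

    def get(self, index):
        if not 0 <= index < self._size:
            raise IndexError(index)
        return self._backing[index]

    def __len__(self):
        return self._size


drinks = GrowableList()
for name in ["gin", "tonic", "rum"]:
    drinks.add(name)
print(len(drinks), drinks.get(2))
```

The copying still happens, but only inside `add`, so the calling code never has to juggle temporary arrays itself.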

My past week has been centered around finishing the software for the Bartenderbot. This software relies on collections of Ingredients, Instructions and Recipes, which I had previously set up as arrays of the corresponding objects. After writing:

Instruction[] temp = new Instruction[instructions.length+1];
for(int i=0; i<instructions.length; i++){
    temp[i] = instructions[i];
}
temp[temp.length-1]=newGuy;
instructions = temp;

for the nth time, I tried switching to lists.

After reassigning variables throughout the code, it was amazing how many lines I saved. I turned for loops into foreach loops, arr[i]’s into .get(i)’s, and the previously shown add functions into .add()’s. So if you haven’t already, go make some lists and correct the overuse of arrays.