Author Archive

Innovation is Everyone’s Job

without comments

I like this blog post from the Harvard Business Review.

Control Group has a culture that attracts certain kinds of people. Sure, the culture changes as the company does, but certain things stick from iteration to iteration. I think that our acceptance of and interest in innovation is one of them. I think that we should all be innovating. Everyone has something to contribute, no matter what your title or role is.

So as an FYI, R&D is open to everyone and we will be scheduling more of those drive-bys to accommodate more schedules and interests.

Written by David Rocamora

December 7th, 2011 at 9:24 am

Posted in general

Thinkers wanted. Typists and runbook operators need not apply.

without comments

DevOps: Thinkers wanted. Typists and run-book operators need not apply.

If you replaced your runbook with a Puppet manifest, spun up a dev environment for breakfast, moved your production infrastructure to AWS, and have a few Arduinos on your desk… we want to talk to you.

Who are we? Just some geeks building the next, next thing and having a blast along the way. We work on dozens of projects every year, using the latest tools and inventing them when they don’t exist yet. We’re super busy creating new infrastructures for our clients, supporting our developers, and working on our own R&D. Your networking, database, storage, cloud, and hardware hacking chops will be challenged and honed. Since DevOps is an emerging discipline, we’re writing the playbook as we move along. So we’re looking for someone who lives and breathes this stuff, not necessarily the person with the most experience.

If you’re interested in joining our team, send us your resume or LinkedIn profile. (A GitHub account and OSS contributions will also get our attention!)

Written by David Rocamora

November 15th, 2011 at 9:03 am

New Hadoop Spin-offs: Meh.

without comments

People are crazy about Hadoop. I think that this is the fastest that I’ve ever seen a technology go from competitive advantage to commodity. This technology is so new to organizations, but also so well deployed and understood by technologists, that we are in some kind of strange no-man’s land.

I think that the real issue may be that no one knows what to do with Hadoop, not how fast it is or which version is better. I mean really, who cares if your HDFS implementation is 10% faster when you can just spin up 10% more Elastic MapReduce instances?

Written by David Rocamora

November 1st, 2011 at 11:40 am

Posted in development

Big Data: SQL Planning & Migration to Spark and Hadoop

without comments

I was in a meeting the other day discussing a problem that a client keeps running into. They need a platform to analyze trends in a rapidly growing data set, where the criteria are changing as fast as their business is changing, which, as it turns out, is pretty fast. Right now they are storing the data in a relational database and writing complex SQL queries to mine information from it. The DBA told us that he would run a query and then go to lunch, hoping it would be done by the time he got back. They need the results faster, and they know that their problem is just going to get worse as the data grows.

The kneejerk reaction to a problem like this is to get a bigger database server. Sure, this may help right now when the data is only a few hundred gigabytes, but what happens when we are dealing with a few hundred terabytes? A few hundred petabytes? This kind of solution just does not scale.

The real answer here is to step back, examine the problem, understand what the goal is, and then design a process that can achieve that goal. In this case, the problem is that a business needs to be able to understand patterns and trends in a rapidly growing data set. The goal is to be able to do this quickly and consistently even as the data grows. One process that can achieve this is to use something like Hadoop or Spark to build a cluster that can scale as the data scales.
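To make the idea a little more concrete, here is a minimal sketch of the map and reduce steps that a tool like Hadoop distributes across a cluster. It is plain Python running in one process, not Hadoop or Spark code, and it is not the system we designed for the client. The record fields (`timestamp`, `category`), the function names, and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_record(record):
    """Map step: emit one ((day, category), 1) pair per record."""
    # Assumes ISO 8601 timestamps such as "2011-10-03T09:15:00".
    day = record["timestamp"][:10]
    yield (day, record["category"]), 1


def reduce_counts(pairs):
    """Reduce step: sum the values that share a key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(partitions):
    """Map every record in every partition, then reduce the combined output.

    On a real cluster each partition would live on a different node;
    here the partitions are just lists in a single process.
    """
    mapped = chain.from_iterable(
        map_record(record) for partition in partitions for record in partition
    )
    return reduce_counts(mapped)


# Hypothetical sample data, split into two partitions.
partitions = [
    [
        {"timestamp": "2011-10-03T09:15:00", "category": "shoes"},
        {"timestamp": "2011-10-03T11:40:00", "category": "shoes"},
    ],
    [
        {"timestamp": "2011-10-03T12:05:00", "category": "hats"},
        {"timestamp": "2011-10-04T08:30:00", "category": "shoes"},
    ],
]
daily_counts = run_job(partitions)
```

Because the map step looks at one record at a time and the reduce step only combines values that share a key, the work can be split across as many machines as the data requires, which is the property a single bigger database server does not give you.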

There were concerns as soon as I brought this up: What about the schema? How do you write SQL for that? Why not just shard the database? Some of these concerns may be valid, but I feel we must evaluate this without emotion. Do people want to use the relational database because it is a better solution for the problem or because they feel comfortable with it?

I’m not sure it’s accurate to say that we are facing new problems these days, but the shape and size of our problems have changed. Now even the smallest company has something to gain from working with big data: anyone with a credit card can spin up a compute cluster. We should not be afraid to change our tools as our challenges change.

Technology is continuously evolving. This means our tools are continuously changing and so must our processes for tackling new challenges. I believe that the system we came up with in that meeting will be the one to solve our client’s problem. If someone gave us the same problem five years ago or five years from now we would probably have wildly different suggestions, but we would come to those suggestions in the same way: through deep understanding of both the problem and the technology available.

Written by David Rocamora

October 11th, 2011 at 9:19 am

Configuring Machines in the Cloud: Our Approach

without comments

We’ve done a lot of work recently to revamp the way we deploy computers in the cloud, and I wanted to share how we approach this at a fairly low level. Our software and processes are cloud agnostic, but we mostly work with Amazon Web Services because we feel that they offer the best solution for most of our clients at this time.

We maintain two base Linux images as part of our cloud toolkit. The only difference between the two images is their architecture: one is 64-bit and the other is 32-bit. The images are minimal: they have just enough software and configuration to get them off the ground and configured. We have copies of the images in each region in Amazon, but when it comes to maintenance and upgrades we really only deal with the two master images. All of the computers that we deploy in EC2 come from these two images.

The base image by itself is not very useful. When a computer is instantiated from one of the images, our toolkit combines it with our Puppet repository and some instance-specific configuration. The Puppet repository contains the Puppet manifests for how we deploy software. The repository is where we store our collective knowledge around deploying successful software. The instance-specific configuration is crafted by the developers and operations teams to pick and choose the appropriate things from the Puppet repository and to provide the very specific configuration for how to deploy the server and the application that will run on it. As the instance boots, it configures itself, installing the software and making the changes required to bring it into service.
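Here is a rough sketch of that combination step, modeled in plain Python rather than Puppet. It is not our toolkit's code. The function name `build_boot_plan`, the dictionary layout, and the module names are all hypothetical; the only thing taken from the description above is the idea that an instance-specific configuration selects items from a shared repository and supplies its own specifics.

```python
def build_boot_plan(puppet_repo, instance_config):
    """Return the ordered list of (module, parameters) an instance applies at boot.

    puppet_repo:      dict mapping module name -> default parameters
    instance_config:  dict with "modules" (names to apply, in order) and an
                      optional "overrides" dict of per-module parameters
    """
    overrides = instance_config.get("overrides", {})
    plan = []
    for name in instance_config["modules"]:
        if name not in puppet_repo:
            raise KeyError(f"module {name!r} is not in the Puppet repository")
        # Copy the defaults so the shared repository data is never mutated.
        params = dict(puppet_repo[name])
        params.update(overrides.get(name, {}))
        plan.append((name, params))
    return plan


# Hypothetical repository contents and instance-specific configuration.
puppet_repo = {
    "base": {"timezone": "UTC"},
    "nginx": {"port": 80, "workers": 2},
    "postgresql": {"port": 5432},
}
instance_config = {
    "modules": ["base", "nginx"],
    "overrides": {"nginx": {"workers": 8}},
}
boot_plan = build_boot_plan(puppet_repo, instance_config)
```

The shared defaults stay in one version-controlled place, and each instance carries only the small amount of configuration that makes it different.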

This is all pretty low level, but it provides some capabilities that make our solution very flexible:

  • With only two images to maintain, keeping software up to date is simple. We anticipate that we will be releasing new images about once a quarter to capture any updates to the packages in the base system.
  • Everything is version controlled. It is easy for us to see what a machine looked like on a specific date or understand the changes that have been made to how the software is configured on an instance.
  • The instances are very self sufficient. There is no single point of failure that would prevent instances from starting correctly.
  • This is all very portable. With just a little bit of work we can deploy things in a different region of Amazon. Also, our Puppet code and instance-specific configurations can work in more places than just Amazon. With a little bit of work to recreate the base images on another platform we can consistently and predictably recreate infrastructure anywhere, giving our clients the ability to choose the right solution for them.

This last item is something that should be on everyone’s mind (especially considering the outage at Amazon last week). As Steve said last week, everything fails and you need to design your infrastructure and applications around that. A process for redeploying your infrastructure in another AWS region or a different cloud is an important part of building a very reliable service in the cloud. It is hard to say what the next kind of failure in the cloud will be like, but with a process like ours we can be ready to deal with an outage when it happens.
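As a small illustration of why this is portable, the sketch below pairs the same instance definitions with whichever base image exists in the target region. It is an assumption-heavy model, not our actual process: `plan_redeploy`, the catalog layout, and the image identifiers are invented for the example, and nothing here calls a real cloud API.

```python
def plan_redeploy(instance_specs, image_catalog, target):
    """Pair each instance spec with the base image available in the target."""
    plan = []
    for spec in instance_specs:
        key = (target, spec["arch"])
        if key not in image_catalog:
            raise LookupError(f"no {spec['arch']} base image in {target}")
        plan.append(
            {
                "name": spec["name"],
                "target": target,
                "image": image_catalog[key],
                # The instance-specific configuration travels unchanged.
                "config": spec["config"],
            }
        )
    return plan


# Hypothetical image identifiers: one 64-bit and one 32-bit image per region.
image_catalog = {
    ("us-east-1", "64bit"): "image-east-64",
    ("us-east-1", "32bit"): "image-east-32",
    ("us-west-1", "64bit"): "image-west-64",
    ("us-west-1", "32bit"): "image-west-32",
}
instance_specs = [
    {"name": "web-1", "arch": "64bit", "config": {"modules": ["base", "nginx"]}},
    {"name": "worker-1", "arch": "32bit", "config": {"modules": ["base"]}},
]
redeploy_plan = plan_redeploy(instance_specs, image_catalog, "us-west-1")
```

Only the image lookup depends on the target; everything else about an instance is described by configuration that is the same everywhere.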

Written by David Rocamora

April 25th, 2011 at 9:00 am

Rapidly Prototyping Tagatag on Google App Engine

with one comment

Google App Engine is Google’s platform-as-a-service for developing web applications. Some people have been saying goodbye to GAE, and perhaps in response Google has announced several enhancements to the service.

In the midst of all of this, a few of us at Control Group have been developing Tagatag: an Android and iPhone application for commenting on barcodes that uses web services running on Google App Engine.

Scan this QR code with Tagatag to join the conversation!

Barcodes are everywhere around us. You can find them on advertising, products, places and even people. Tagatag provides you with a virtual paint marker to let you make your mark on all of these codes anonymously. Download the Tagatag app and give it a try. Scan a barcode to see comments people have left for you and then leave some for them.

We chose Google App Engine for the back end of Tagatag for a few reasons:

  • It’s quick – You sign up for an account, download the SDK and you’re developing. The development server in the SDK lets me run the application on my desktop and interact with the code as I’m writing it.  Uploading new versions, rolling back old ones, or performing maintenance is a snap with the GAE dashboard.
  • It’s simple – There’s not much to the web service. It’s small and simple. We used the webapp framework because we didn’t feel we needed anything else. It makes for a very concise application. Believe it or not, there are about 300 lines of code for the GAE part of Tagatag.
  • It’s scalable – We don’t have to worry about what we do when Tagatag becomes popular. We’ll just raise our billing quotas in GAE and let them handle spinning up new instances or expanding the datastore. Knowing that you don’t have to be concerned about scaling makes things a lot more fun.
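To give a feel for how small a service like this can be, here is a sketch of the core idea: comments stored against a barcode value. This is not the Tagatag code and it does not use the webapp framework or the GAE datastore. The `CommentStore` class is a hypothetical in-memory stand-in, and the barcode value is made up.

```python
from collections import defaultdict


class CommentStore:
    """In-memory stand-in for a datastore of comments keyed by barcode value."""

    def __init__(self):
        self._comments = defaultdict(list)

    def add_comment(self, barcode, text):
        """Attach an anonymous comment to a barcode."""
        text = text.strip()
        if not text:
            raise ValueError("comment text must not be empty")
        self._comments[barcode].append(text)

    def comments_for(self, barcode):
        """Return a copy of the comments for a barcode, oldest first."""
        # .get() avoids creating an empty entry for barcodes nobody has tagged.
        return list(self._comments.get(barcode, []))


store = CommentStore()
store.add_comment("0012345678905", "First!")
store.add_comment("0012345678905", "  Nice cereal.  ")
```

In a hosted service the two methods would sit behind HTTP handlers and the dictionary would be replaced by the platform's datastore, which is the part you get to stop worrying about.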

I’m happy that GAE let us bring Tagatag to you so quickly. So, when it’s available at the end of the week, be sure to download the app, tag a tag and make your mark!

Written by David Rocamora

December 6th, 2010 at 8:00 am

Crunched for time? Get in the cloud.

without comments

I am really busy these days, but a bunch of things have just been in the news that I need to comment on.

I’m working on the infrastructure for a new phase of QA testing that we are doing on a product. The infrastructure consists of a variety of physical computers, about fifty in all. Managing and maintaining them is more time consuming than managing the cloud-based computers I work with. The increased amount of attention and time that physical computers take is why I wonder about these things that I’ve read.

First, New York City has entered a “money-saving partnership” with Microsoft, signing up for some massive licensing. Fortunately this includes some cloud-based infrastructure, but it’s unfortunate that the city did not compare the Microsoft solution with something like Google Apps, or with open-source solutions like LibreOffice. Since we are paying the taxes that are being used to pay for these services, shouldn’t we be getting the best deal? So, NYC, please call me when you’re ready to talk about your infrastructure.

Have you ever been shivering from the cold in a data center while waiting on hold for the URL to a service pack because everyone’s email is down? I have, and I never want to do it again. I’m sure no one in the city wants to do it either. Why not let Google freak out about keeping your systems up all of the time so you can do some things that really matter. That’s what the cities of Los Angeles and Washington DC do (along with a lot of other people).

Microsoft is also in the news for something else: Ray Ozzie, their chief software architect, is stepping down. Ozzie seems like a sharp guy and was behind a lot of good things at Microsoft (yes, this is one of the few times you will hear me complimenting Microsoft). He’s asking his colleagues to “close our eyes and form a realistic picture of what a post-PC world might actually look like, if it were to ever truly occur.” Guess what, dude: we are in a post-PC world already.

Can I say that more people are interacting with technology that’s in the cloud via their cellphones than through their PCs? Probably not, but I will tell you that what’s going on in the cloud and mobile space is a lot more interesting than the PC space. Will PCs even be relevant in a few years? We’ll see. Also interesting to note is that these articles indicate that no one will take Ozzie’s place as chief software architect. That makes me wonder who’s driving the bus there. This probably doesn’t mean MS is going to just dry up and disappear, but will they ever be innovators again?

Well, enough pondering for now, I have to get back to punching power buttons and checking for failed hard drives — things that you never have to do in the cloud.

Written by David Rocamora

October 28th, 2010 at 10:00 am

DebianNYC “What’s in a Package?” Workshop on the 27th

without comments

Our friends at Debian-NYC will be running a workshop at Control Group on October 27th at 6:30 pm about what’s in a Debian package. Debian’s packaging system is a huge part of why it’s such a great operating system (the packaging system is also why Ubuntu is so great). This workshop will give you a tour of how it works:

This workshop will provide advanced theory useful for people modifying or creating packages. For people modifying packages, you’ll learn many typical motifs and about various build systems. For creating packages, you’ll be much better prepared to read and understand guides at a deep level. However, this is still not a step-by-step guide in “how to build packages”, but will get you very close to there.

The Debian-NYC series of workshops as a whole is designed to introduce Debian, Debian tools and techniques, and the Debian community as well as provide skills to attendees. This workshop is targeted towards technically-skilled people who would like knowledge of the Debian packaging system and how to contribute back to the Debian community. Most of the material of these workshops will be applicable to other Debian-derived distributions, such as Ubuntu.

If you’re interested, please RSVP. These workshops have always been really fun and informative. I hope to see you there!

Written by David Rocamora

October 21st, 2010 at 8:30 am

Update: Coffee at Control Group

without comments

There have been some changes since my last post about CG’s coffee maker. We’ve gotten a new “machine” that’s much simpler. It’s easier to understand and less likely to break down. It’s a 10 cup Chemex that we use with an electric kettle.

The Chemex is really easy to use. Heat up water in the kettle, position a filter in the top of the vessel, add coffee grounds, and pour hot water over it (a little at first, to let it bloom, then as much as you want to make). Sweet Maria’s has an excellent guide on making coffee with a Chemex.

Java at Control Group

This is week two of our Chemex experience. So far the kitchen is cleaner, the brew is stronger, and the coffee-notify mailing list is more active. Sure, it takes a little longer to make a pot, but I feel like it makes us more mindful of what we’re doing. Apparently we’re not the only ones that have recently switched to the Chemex.

I suppose the next steps are to grind our own beans (by hand of course). We’ll see what we can do. Stay caffeinated my friends.

Written by David Rocamora

October 19th, 2010 at 2:00 pm

Posted in general

Managing Load Increases in Style with Cloud Computing

with one comment

This picture is an exclusive behind-the-scenes look at Mercedes-Benz Fashion Week. It wasn’t taken backstage, on the runway, or at an after party. This image comes from the monitoring console in the CG war room where I was working with a few other engineers on a rapid response to a load increase on mbfashionweek.com, the event’s new website.

Since Mercedes-Benz Fashion Week only runs for a week, we needed to boost the website infrastructure quickly so that it could continue to provide updates and information about the event for seven days despite heavy traffic.

If this were a few years ago and we were using a traditional web-hosting infrastructure, this would be difficult, maybe even impossible. But fortunately this is 2010 and IMG Fashion, the company responsible for the event, was using Amazon Web Services.

The application that powered their website was good, but there were some unexpected issues preventing it from performing well in a very high traffic environment. There was no time to profile, troubleshoot, and retool parts of the application. The fastest solution to the problem was to create more web server instances and to distribute the traffic to them.

We were already using an Elastic Load Balancer to spread traffic between their two web serving instances, so adding new instances was simple. We created new EC2 instances in several different Availability Zones to ensure that the site would stay online no matter what happened. A sophisticated system built into the application’s content management system (CMS) kept content on the different web servers consistent. And we used Amazon’s Relational Database Service to handle the database tier.
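The two ideas in that paragraph, spreading new instances across Availability Zones and spreading traffic across instances, can be sketched in a few lines. This is an illustration only: it does not call any AWS API, the zone and instance names are made up, and simple round-robin is an assumption for the sake of the example rather than a description of how the Elastic Load Balancer actually routes requests.

```python
from itertools import cycle


def place_instances(count, zones):
    """Assign new instances to zones in turn so they are spread evenly."""
    zone_cycle = cycle(zones)
    return [next(zone_cycle) for _ in range(count)]


def distribute(requests, instances):
    """Round-robin requests across instances, returning {instance: [requests]}."""
    assignment = {instance: [] for instance in instances}
    # zip() stops when the finite list of requests is exhausted.
    for request, instance in zip(requests, cycle(instances)):
        assignment[instance].append(request)
    return assignment


zones = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names
placement = place_instances(5, zones)

instances = ["web-1", "web-2", "web-3"]  # hypothetical instance names
assignment = distribute(list(range(9)), instances)
```

With instances placed this way, losing any single zone still leaves servers running elsewhere, and adding an instance simply adds one more name to the rotation.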

We increased the number of servers several times over the course of the event to handle the incredible amount of traffic on the website. You can see two of those increases in the graph at 15:00 and 20:00. After the event was over and website traffic slowed down, we were able to reduce the infrastructure and costs. This shows that systems for events like Mercedes-Benz Fashion Week are perfect candidates for cloud computing: organizations can pay for exactly the amount of compute resources they need and no more.

Mercedes-Benz Fashion Week is over now. The models and designers have moved on and their EC2 instances are spinning down. And, while trends in computing seem to change as often as trends in fashion, I think we’ll see cloud computing and scalable websites stick around for a few more seasons.
