Author Archives: karthequian

About karthequian

I love creating products. I’m a full stack dev who is comfortable creating things from scratch, and I know a bunch about containers, Kubernetes, auth and agile development. I live in Austin and organize @devopsdays, @container_days and @cloud_austin. Follow me on twitter.com/iteration1

Long live ChatOps, RIP AOL IM!

I grew up in Muscat, Oman, and it was an exciting time when we got Internet at home in 1996. By 1998, all of my friends who had Internet at home were first on ICQ and then on AOL IM. AOL IM was huge when I went to college in the early 2000s and was the primary way for friends to connect and chat. Back then, it was rare to have chat rooms, and the rooms that existed were usually long-running things set up to talk about general topics.

The first time I saw value in a chat room in a professional setting was when I got invited to a Basecamp “deploy room” by fellow Agile Admin Peco (or was it Ernest?) at NI. Our quarterly release cycle was going super poorly, and all of us (100 other people) were waiting around at hour #34 trying to figure out why some random enterprise application was holding up the rest of the release process. Once I was in the room, I was able to look at the past messages between the ops team about application failures, and realized pretty quickly that our databases weren’t actually responding like they should. It took all of 10 minutes to ask someone on the ops side with credentials to run a database query, and figure out that the db creds were all wrong. 2 hours later, the release was all done…

That moment made me realize that 1×1 chats were great, but having persistent chat rooms with teams of people added value to an organization.

Recently, a colleague asked me a simple question that made me reflect. He asked, “What’s the big deal about Slack?” At work, there’s been a big push to move towards Slack, when we’ve had 1×1 chat forever. Here are my 5 most compelling reasons for doing so:

1) Collaboration++: 15 years ago, software was simpler, and there was no cloud/microservices. You’d have 1 large binary to deploy for a platform, and typically a few folks who understood the overall workings of the platform. Today, with microservices, you have a bunch of applications to deploy, and each of these has specific owners who understand the specifics. Thus, you’re going to have to have conversations with multiple folks to figure out any issues. Having this in a room setting versus a 1×1 setting gets you to a resolution faster.

2) Chat metadata: Chat is less about words, and more about conversations that include images, links, slash commands, workflows etc. Chatops tools make pasting these much easier than before, and formatted code in Slack is so much easier to read than the same code in Pidgin.

3) Chat History: Chat apps now give you history – even from when you were not online or in the chat room. This is valuable from the perspective that you can see everything from when you weren’t around, and don’t have to ask someone to keep repeating the problem over and over again. You can just scroll up, read the context, and be ready to help if you can. This is my one knock against IRC (or at least the implementation of IRC at a company I worked at); it was nice to have everyone in a spot, but it only worked when we were VPN’ed in, and had no history.

4) Pipelining with chatbots: Continuous Integration/Delivery is all the rage these days! To build a pipeline of this sort, a primary requirement is a chat system that lets your devops systems push data into it. You respond to broken builds, tests and alerts faster when the associated data is sent to a chatroom that you’re already looking at than when you have to keep checking Jenkins. Chatbots are invaluable in this scenario, and help you with information flow.

5) The new normal: A new generation of engineers already does this. It’s already part of the culture for engineers who work on open source (for example, the Kubernetes Slack) and there’s even chatter about Slack at universities now. The world is evolving towards broader conversation, and not having chatops tools will hurt your company in terms of hiring and retention.
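
To make reason 4 a bit more concrete, here is a minimal sketch of a CI job pushing a build result into a chat room. It assumes your chat tool offers an incoming webhook that accepts a JSON body with a `text` field (Slack's incoming webhooks work this way). The job name, build number and URLs are made up for illustration, and the actual POST is left commented out.

```python
import json
import urllib.request


def build_message(job: str, build_number: int, status: str, url: str) -> dict:
    """Format a CI result as a chat message payload of the form {"text": ...}."""
    label = "OK" if status == "SUCCESS" else "FAILED"
    return {"text": f"[{label}] {job} #{build_number}: {status} ({url})"}


def post_to_chat(webhook_url: str, payload: dict, timeout: float = 5.0) -> int:
    """POST the JSON payload to an incoming-webhook URL and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical job name and URLs; swap in values from your own CI system.
    message = build_message(
        "billing-service", 142, "FAILURE", "https://ci.example.com/job/142"
    )
    print(json.dumps(message))
    # post_to_chat("https://hooks.example.com/your-webhook", message)
```

Keeping the message formatting separate from the HTTP call makes it easy to test the bot without a real chat room.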

 

Agree/Disagree, or have a different perspective? Let me know by commenting below!

Filed under Agile, DevOps

Docker 101

Working at Stackengine, and now at Oracle, I’ve been working in the Docker ecosystem for the last 5 years!

While containerization has taken the IT and devops world by storm, a lot of larger enterprises might still be on the outside looking in. If you find yourself in that boat, you’re in luck!

Here’s a quick video that gets you running your very first Docker container on your Mac in under 5 minutes.

Also, I had the pleasure of traveling back to my childhood hometown of Bengaluru and presenting a workshop at Code Conf this year. I’ll create a separate post about my travels, but I got to present a workshop lab that is an Introduction to Containers. This lab is a perfect follow on to the video above, and will help you get started on your Docker journey! Let me know if you have questions.

If you’re more of a product manager, or just want to know why you’d use Docker and understand its use cases, take a look at the presentation I published on Why to Docker?, shown below.

Questions, comments, or concerns? Hit us up by leaving a comment below…

Filed under DevOps

Why to Docker?

I recently gave a presentation on “Why to Docker” for the BrightTalk summit. Here’s a list of all the things I talked about. I had a great turnout of over 300 people, and some great questions that followed. Fortunately, I finished a bit early and was able to answer a bunch of the questions (was asked about 30). I’ll end up adding answers to all the questions this weekend!

Here’s a link to the slides:

Filed under DevOps

Devops State of the Union 2015

James, Karthik, and Ernest did a webcast, Devops State of the Union 2015, for the BrightTalk Cloud Summit. It went well! We had 187 attendees on the live feed. In this blog post we’ll add resources discussed during the talk and we will seed the comments below with all the questions we received during the webcast and answer them here – you’re all welcome to join in the discussion.

The talk was intended to be an overview of DevOps, with a bunch of blurbs on current and developing trends in DevOps – we don’t go super deep into any one of them (this was only 40 minutes long!). If you didn’t understand something, we’ve added resource links. We got some questions like “what is a container” and “what is a 12-factor app,” and we didn’t have time to go into that in great detail, so check some of the links below for more.

Resources:

Filed under DevOps

ReInvent – Fireside Chat: Part 1

One of the interesting sessions at ReInvent was a fireside chat with Werner Vogels, where CEOs or CTOs of different companies/startups who use AWS talked about their applications/platforms and what they liked and wanted from AWS. It was a 3 part series with different folks, and I was able to attend the 1st one, but I’m guessing videos are available for the others online. Interesting session, giving the audience a window into the way C level people think about problems and solutions…

First up, the CTO of MongoDB…

Lots of people use Mongo to store things like user profiles etc. for their applications. Mongo performance has gotten a lot better because of SSDs.

Recently received 150 million in funding, and wants to build out a lot of tools to be able to administer Mongo better.

Apparently being a MongoDB DBA is a really high paying job these days!

User roles may be available in Mongo next year to add more security.

Werner and Eliot want to work together to bring a hosted version of Mongo, like RDS.

Next up: Twilio’s Jeff Lawson

Jeff is ex-Amazon.

Software people want building blocks and not some crazy monolithic thing to solve a problem. Telecom had this issue, and that is why I started Twilio.

Everyone is agile! We don’t have answers up front, but we figure out these answers as we go.

Started with voice, then moved to SMS followed by a global presence. Most customers of ours wanted something that didn’t have boundaries and just wanted an API to communicate with their customers.

Werner: It’s hard to run an API business. Tell us more…
Lawson: It is really hard. APIs are kinda like webapps when it comes to scaling. REST helps a lot from this perspective. Multi-tenancy issues get amplified when you have an API business.

Twilio apparently deploys 20 times a day. AWS really helps with deployment because you can bring up brand new environments that look exactly like prod and then tear them down when they aren’t needed.

When it comes to APIs, we write the documentation first and show it to our customers before actually implementing the API. Then iterate iterate iterate on the development.

Jeff’s ask: make it easier to get a VPC up and running.

Next up: Valentino with AdRoll (realtime bidding)

There’s a data collection pipe which gets like 20 TB of data every day.

Latency is king: Typically latency is around 50 ms to 100 ms. This is still a lot for us. I wish we had more transparency when it comes to latency inside AWS and otherwise…

Why DynamoDB? Didn’t find something simple at the time, and it was nice to be able to scale something without having to worry about it. We had 0 ops people to work on scaling at the time.

Read write rates: 80k reads per second (not consistent), 40k writes per second.

Why Erlang? You’re a Python god.
I started working on Python with the Twisted framework. But I realized that Python didn’t fit our use case well; the Twisted system worked just as well but it would be complicated to manage and needed a few hacks.

Today it would be hard to pick between Erlang and Go…

Filed under Cloud, Conferences

ReInvent 2013: Day 2 Keynote

I didn’t cover the day 1 keynote, but fortunately it can be found here. The day 2 keynote was a lot more technical and interesting though. Here are my notes from it:

First, we began by talking about how AWS plans its projects.

Lots of updates every year!

Before any project is started, while teams are still in the brainstorming phase, a few key things are always done before any code is written:

  • Meeting minutes
  • FAQ
  • Figure out the UX

“2 Pizza Teams”: Small autonomous teams that have roadmap ownership with decoupled launch schedules.

Customer collaboration

Get the functionality in the hands of customers as soon as possible. It may be feature limited, but it’s in the hands of customers so that they can give feedback as soon as possible. Iterate iterate iterate based on feedback. This is different from the old guard, where everything is engineering driven and unnecessarily complex.

Netflix platform….

Netflix is on stage and we’re talking about the Netflix cloud prizes and the enhancements to the different tools…looks pretty cool, and I will need to check them out. There are 14 chaos monkey “tests” to run now instead of just 1 from before.

Cloud prize winners

Werner is back and is breaking down the different facets that AWS focuses on:

  • Performance: measure everything; put performance data in log files that can be mined.
  • Security
  • Reliability
  • Cost
  • Scalability

Ilya Sukhar, CEO of Parse, is on stage now (platform for mobile apps)
-Parse Data: store data; it’s 5 lines of code instead of a bunch of code.
-Push notifications

Parse started with 1 AWS instance
From 0 to 180,000 apps

180,000 collections in MongoDB; shows differences between pre and post PIOPS

Security

IAM and IAM roles set boundaries on who can access what.
How do you do this from a DB perspective?
Apparently you can have fine-grained access controls on DynamoDB instead of writing your own code.
Each data block is encrypted in Redshift.

Cost:
Talking about how customers are using spot instances to save $.

Scalability:
The WeTransfer use case: they take care of transferring large files.

Airbnb on stage with Mike Curtis, VP of engineering
-350k hosts around the world
-4 million guests (Jan 2013)
-9 million guests today.

Host of AWS services
1k EC2 instances
Million RDS rows
50 TB for photos in S3

The ops team at Airbnb is a 5 person team.

Helps devote resources to the real problem.

AirBnB in 2011

AirBnB in 2012

Dropcam came on stage after that to talk about how they use the AWS platform. Nothing too crazy, but interestingly more inbound video is sent to Dropcam than to YouTube!

Dropcam

The keynote ended with an Amazon Kinesis demo (and a deadmau5 announcement for the replay party). From the outside, Kinesis looks like a streaming API with different ways to process data on the backend. A prototype that streamed data from Twitter and performed analytics on it was shown to demonstrate the service.

Announcements

  • RDS for PostgreSQL
  • New instance type: I2, for much better IO performance
  • DynamoDB: global secondary indexes!!
  • Federation with SAML 2.0 for IAM
  • Amazon RDS: cross-region read replicas!
  • G2 instances for media and video intensive applications
  • C3 instances are new, with the fastest processors: 2.8 GHz Intel E5 v2
  • Amazon Kinesis: real time processing, fully managed. It looks like this will help you solve scalability issues when you’re trying to build realtime streaming applications. It integrates with storage and processing services.

In case you want to watch it, the day 2 keynote is here: http://www.youtube.com/watch?v=Waq8Y6s1Cjs

And also, the day 1 keynote: http://www.youtube.com/watch?v=8ISQbdZ7WWc

Filed under Cloud, Conferences

ReInvent 2013- Scaling on AWS for the First 10 Million Users

This was the first talk by @simon_elisha I went to at ReInvent, and it was a packed room. It was targeted towards developers taking an app from inception to 10 million users. Following are the notes I took…

– “We will need a bigger box” is the first issue when you start seeing traffic to an application. A single box is an anti-pattern because there is no failover etc. Move your db off the web server; you could use RDS or something too.

– SQL or NoSQL?
Not a binary decision; maybe use both? A blended approach can reduce technical debt. Maybe just start with SQL because it’s familiar and there are clear patterns for scalability. NoSQL is great for super low latency apps, metadata data sets, fast lookups and rapid data ingestion.

So for 100 users…
You can get by using Route 53, ELB and multiple web instances.

For 10000 users…
– Use CloudFront to cache any static assets.
– Get your session state out of the webservers. Session state could be stored in DynamoDB because it’s just unrelated data.
– Also might be time for ElastiCache now, which is just hosted Redis or Memcached.
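
Here is a small sketch of what “get your session state out of the webservers” means in practice. The `SessionStore` class is a hypothetical stand-in for an external store such as DynamoDB or ElastiCache; it is backed by a dict only so that the example runs on its own, and none of the names come from the talk.

```python
import time
import uuid


class SessionStore:
    """Stand-in for an external store (for example DynamoDB or ElastiCache).

    A dict keeps the sketch runnable; a real store would be a network service.
    """

    def __init__(self, ttl_seconds: float = 1800.0):
        self._items = {}
        self._ttl = ttl_seconds

    def put(self, session_id: str, data: dict) -> None:
        # Store a copy of the data together with its expiry time.
        self._items[session_id] = (time.time() + self._ttl, dict(data))

    def get(self, session_id: str):
        entry = self._items.get(session_id)
        if entry is None:
            return None
        expires_at, data = entry
        if expires_at < time.time():
            del self._items[session_id]  # drop expired sessions lazily
            return None
        return data


class WebServer:
    """A stateless web server: all session data lives in the shared store."""

    def __init__(self, name: str, store: SessionStore):
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self.store.put(session_id, {"user": user})
        return session_id

    def whoami(self, session_id: str):
        session = self.store.get(session_id)
        return session["user"] if session else None


store = SessionStore()
web1, web2 = WebServer("web-1", store), WebServer("web-2", store)
sid = web1.login("alice")
print(web2.whoami(sid))  # a different server can serve the same session
```

Because neither server keeps anything in memory between requests, the load balancer can send a user to any instance, and instances can come and go without logging anyone out.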

Auto scaling…
Set a min and max number of servers, running in multiple availability zones. AWS makes this really simple.
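
As a rough illustration of the min/max idea, here is a sketch of the arithmetic behind a scaling decision. This is not how the AWS Auto Scaling service is implemented; the CPU-target rule, the function names and the zone names are all assumptions made for the example.

```python
import math


def desired_instances(current: int, cpu_percent: float, target_percent: float,
                      min_size: int, max_size: int) -> int:
    """Size the fleet so average CPU moves towards the target, within [min, max]."""
    if current <= 0:
        return min_size
    wanted = math.ceil(current * cpu_percent / target_percent)
    return max(min_size, min(max_size, wanted))


def spread_across_zones(count: int, zones: list) -> dict:
    """Distribute instances as evenly as possible across availability zones."""
    base, extra = divmod(count, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


# 4 servers at 90% CPU with a 60% target, bounded to between 2 and 10 servers.
count = desired_instances(4, 90, 60, min_size=2, max_size=10)
print(count, spread_across_zones(count, ["zone-a", "zone-b", "zone-c"]))
```

The min keeps you alive when traffic is quiet, the max caps your bill, and spreading across zones covers the failover problem that the single box had.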

If you end up at the 500k users situation you probably really want:
– metrics and alarms
– automated builds and deploys
– centralized logging

Must-haves for metrics and logs to collect:
– host level metrics
– aggregate level metrics
– log analysis
– external site performance

Use a product for this, because there are plenty available, and you can focus on what you’re really trying to accomplish.

Create tools to automate so you save time. Some of the ones that you can use are Elastic Beanstalk and AWS OpsWorks (more for developers), and CloudFormation and raw EC2 (more for ops). The key is to be able to repeat those deploys quickly. You will probably need to use Puppet or Chef to manage the actual EC2 instances.

Now you probably need to redesign your app when you’re at the million user mark. Think about using a service oriented architecture. Loose coupling for the win instead of tight coupling. You can probably put a queue between 2 pieces of the app.
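
A minimal sketch of putting a queue between 2 pieces, using the picture upload case as the example. Python's in-process `queue.Queue` stands in for a hosted queue here, and the function names and the thumbnail step are invented for illustration.

```python
import queue
import threading


def web_tier(jobs: queue.Queue, uploads: list) -> None:
    """Producer: accept uploads and enqueue work instead of doing it inline."""
    for name in uploads:
        jobs.put(name)
    jobs.put(None)  # sentinel: no more work


def worker_tier(jobs: queue.Queue, results: list) -> None:
    """Consumer: pull work at its own pace, independent of the producer."""
    while True:
        name = jobs.get()
        if name is None:
            break
        results.append(f"thumbnail-of-{name}")


def run(uploads: list) -> list:
    jobs = queue.Queue()
    results = []
    consumer = threading.Thread(target=worker_tier, args=(jobs, results))
    consumer.start()
    web_tier(jobs, uploads)
    consumer.join()
    return results


print(run(["cat.jpg", "dog.jpg"]))
```

The web tier only needs to know how to put a message on the queue, so the worker side can be scaled, redeployed or rewritten without touching it.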

Key tip: don’t reinvent the wheel.

Example of what to do when you have a user uploading a picture to a site.

Simple Workflow Service
– Workers and deciders: provides orchestration for your code.

When your data tier starts to break down at 5-10 million users:
– Federation
Split by function or purpose.
Gotcha: you will have issues with join queries.
– Sharding
This works well for one table with billions of rows.
Gotcha: operationally confusing to manage.
– Shift to NoSQL
Sorta similar to federation.
Gotcha: crazy architecture change. Use DynamoDB.
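
To show what the sharding option above looks like in code, here is a minimal sketch of hash-based shard routing. The shard names are hypothetical, and hashing the key modulo the shard count is just one common scheme, not something prescribed in the talk.

```python
import hashlib

# Hypothetical shard names; each would be a separate database in practice.
SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index with a stable hash.

    hashlib is used because Python's built-in hash() for strings
    changes between processes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_for(user_id: str) -> str:
    """Pick the database that holds this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]


for user in ["user-1", "user-2", "user-3"]:
    print(user, "->", database_for(user))
```

With this scheme, changing the number of shards remaps most keys to a different database, which is a big part of why sharding is operationally confusing to manage.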

Final Tips

Filed under Cloud, Conferences