Data Delivery

Do You Have a Handle on Your AWS Costs?

Just like with your credit card issuer, there can be "hidden" fees with AWS that you don't know about until you get the bill.

Amazon Web Services (AWS) is a great place to tap virtually unlimited storage and compute for your enterprise IT needs. It also gives you the ability to scale IT resources up and down without relying on big upfront capital expenditures. This can speed your time to market, or at the very least give you a good idea of the cost and/or feasibility of new IT projects.

However, this is a double-edged sword.

This ease of use, combined with AWS's cost-per-hour billing, can become expensive if not monitored and controlled. Think of it as a personal credit card, which can be used when you need an immediate purchase but don't have the cash in your wallet or purse. We know this is dangerous on many levels.

Leaving your AWS resources running 24x7 is like carrying a balance at a high credit card interest rate: the debt keeps building until you pay it down. You might have started that large EC2 instance with the intention of turning it off when you were done, but did you? How about everyone else on your team? Are they disciplined enough to do this? Do they pay down their credit cards to zero each month?

If you don't shut down your AWS EC2 instances, you are billed continuously even when you are not actively using them. If you run it, AWS will bill you. AWS will not automatically tell you that you are running its machines without any benefit. Why would it? Similarly, if you copy your data to many different areas in AWS, you're billed for that redundancy.
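To make the credit card analogy concrete, here is a back-of-the-envelope sketch in Python of what an always-on instance costs compared to one shut down outside an eight-hour workday. The hourly rate is an assumption for illustration only, not current AWS pricing.

```python
# Illustrative only: the hourly rate is an assumption, not current AWS pricing.
HOURLY_RATE = 0.50  # assumed on-demand rate for a large EC2 instance, USD/hour

def monthly_cost(hours_per_day, days=30, rate=HOURLY_RATE):
    """Cost of running one instance for the given hours per day over a month."""
    return round(hours_per_day * days * rate, 2)

always_on = monthly_cost(24)   # left running 24x7
work_hours = monthly_cost(8)   # shut down outside an 8-hour workday
print(f"24x7: ${always_on}, 8h/day: ${work_hours}, wasted: ${always_on - work_hours}")
```

Even at a modest assumed rate, two-thirds of the always-on bill buys nothing if no one is using the machine overnight and on weekends.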

Just like with your credit card issuer, there can be "hidden" fees with AWS that you don't realize until you get the bill. Take data download costs. AWS makes it easy to upload data -- uploads are unlimited and free! However, the more you upload, the more storage costs you incur. And if you want to move data out to yourself or to a collaborator outside your organization, that will cost you, too (though you can check a box for an external user and have them billed for the cost).

How about Glacier, AWS' bulk storage service? Cheaper, right? Sure, but if you want your data out of Glacier, there is an additional cost to "rehydrate" that data back to AWS Simple Storage Service (S3). Then you pay the same download cost as before -- a double charge.
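A rough sketch of that double charge, using made-up per-GB rates (assumptions for the example, not current AWS pricing):

```python
# Illustrative rates only -- assumptions for this example, not current AWS pricing.
RETRIEVAL_PER_GB = 0.01   # assumed Glacier-to-S3 "rehydration" fee, USD/GB
TRANSFER_PER_GB = 0.09    # assumed S3-to-internet download fee, USD/GB

def glacier_exit_cost(gb):
    """Total cost to pull data out of Glacier: retrieval charge plus download charge."""
    return round(gb * (RETRIEVAL_PER_GB + TRANSFER_PER_GB), 2)

print(glacier_exit_cost(5000))  # a 5 TB restore is billed twice on the way out -> 500.0
```

The point isn't the specific numbers; it's that the exit path has two meters running, and only one of them is obvious when you choose Glacier for its cheap storage rate.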

AWS does give each account a certain amount of free data downloads per month, but for larger datasets this won't save you much, even with the cheaper per-GB tiers that kick in at higher volumes. Ultimately, there are two extremes if you need to download large datasets: 1) rip the Band-Aid off and take the charge as fast as the data will download, or 2) bleed it down slowly, using as much of the free monthly allowance as possible. Most will pick something in the middle.
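The two extremes can be compared with simple arithmetic. The free allowance and per-GB rate below are assumptions for illustration, not actual AWS pricing.

```python
import math

# Assumed numbers for illustration -- not current AWS pricing.
FREE_GB_PER_MONTH = 100    # assumed free download allowance per month
RATE_PER_GB = 0.09         # assumed transfer-out rate, USD/GB

def rip_the_bandaid(total_gb):
    """Download everything at once; only one month's free allowance applies."""
    return round(max(total_gb - FREE_GB_PER_MONTH, 0) * RATE_PER_GB, 2)

def bleed_it_down(total_gb):
    """Months needed to stay entirely within the free monthly allowance."""
    return math.ceil(total_gb / FREE_GB_PER_MONTH)

print(rip_the_bandaid(2000))   # one-shot transfer charge for a 2 TB dataset
print(bleed_it_down(2000))     # months to get it out "free"
```

Note that while you bleed the data down over many months, you are still paying storage charges on whatever hasn't been downloaded yet -- one more reason most will pick something in the middle.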

AWS can charge you for other various services, which may or may not be significant for your AWS use, but those costs should be considered. My team and I have begun to dig into our use of AWS services, and though we are still early in our overall process, here are a couple of helpful tips we have found:

Just like you shouldn't look at your weight on the scale every day, don't look at your AWS costs every day. It can become an obsession and you will lose focus on the larger picture. I would suggest looking at it weekly or biweekly, depending on the size of your current AWS footprint and your AWS budget.

AWS does give you some tools for looking at billing in its interface, which is a good place to start. In particular, there is a free tool that analyzes your costs and produces a report recommending cost-saving actions. Use it -- it can give you some initial insights and start discussions with your team and end users.

As you get more comfortable with the billing and how things are charged, start looking at your billing in aggregate, particularly over time. For us, we downloaded the details from AWS (in text files), opened them in Excel and started to dig into the AWS breakdown. We had a fair number of services running, but the files aren't big and are appropriately labeled, so you can understand the data. You might be shocked by what you see, or it might be obvious.
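If you'd rather script the aggregation than do it in Excel, a few lines of pandas can produce the same breakdown. The column names and figures below are illustrative stand-ins, not the exact schema of the AWS billing file.

```python
import pandas as pd

# A tiny stand-in for the detailed billing file AWS lets you download;
# column names and amounts are made up for illustration.
billing = pd.DataFrame({
    "service": ["EC2", "EC2", "S3", "S3", "Glacier", "DataTransfer"],
    "month":   ["2016-01", "2016-02", "2016-01", "2016-02", "2016-02", "2016-02"],
    "cost":    [410.50, 395.20, 120.00, 131.75, 14.10, 88.60],
})

# Aggregate the line items: total by service, and by service per month.
by_service = billing.groupby("service")["cost"].sum().sort_values(ascending=False)
by_month = billing.pivot_table(index="month", columns="service", values="cost",
                               aggfunc="sum", fill_value=0)

print(by_service)
print(by_month)
```

The per-month pivot is the one to watch over time: it turns a pile of line items into the trend lines discussed next.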

Do this visually if you can, meaning charts and graphs. This can be powerful as you begin to build processes around AWS resources in the future. Look for trends and points of weakness, or areas where you don't understand the billing and why it occurred. We have good representatives at AWS who were able to help us with our questions.

Create an action plan. Even if you have a good handle on your costs, that can quickly change. Be proactive and look for ways to monitor and control costs. For us, there were enough costs and AWS services that we could not focus on all of them at once, so we created a small grid to help us prioritize what to address first. Below are our four categories, which we addressed in this order:

  1. Easy to implement, high cost savings
  2. Easy to implement, low cost savings
  3. Harder to implement, high cost savings
  4. Harder to implement, low cost savings

Doing them in this order allowed us to concentrate on items we could address quickly and see results in our bill. This created the momentum we needed, and we were also able to show our leadership progress.
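The grid itself is simple enough to express in a few lines of code. A minimal sketch, with made-up action items, that sorts candidates into the same order we used:

```python
# Illustrative action items -- the names here are hypothetical examples.
actions = [
    {"name": "Archive old S3 buckets", "easy": True,  "high_savings": False},
    {"name": "Schedule EC2 shutdowns", "easy": True,  "high_savings": True},
    {"name": "Re-architect pipeline",  "easy": False, "high_savings": True},
    {"name": "Trim small log storage", "easy": False, "high_savings": False},
]

# Order: easy/high, easy/low, hard/high, hard/low.
prioritized = sorted(actions, key=lambda a: (not a["easy"], not a["high_savings"]))
for rank, action in enumerate(prioritized, 1):
    print(rank, action["name"])
```

Sorting on (ease first, savings second) is what front-loads the quick wins that build momentum.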

Lastly, as your team progresses, add a defined process to evaluate the use of new or expanded services in AWS. Best practice would suggest doing this before any new or expanded AWS services are implemented. The more defined and transparent, the better. This might be hard to attain at first. Also, as your organization matures, look to automate your billing analysis, creating dashboards, alerts and monitoring as needed.
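A first step toward that automation can be as simple as a script that compares each period's spend against a budget and raises alerts. A minimal sketch with made-up figures; in practice the spend numbers would come from your billing export, and the budget is whatever your team agrees on.

```python
# Assumed budget for illustration; tune to your organization.
WEEKLY_BUDGET = 1000.00

def check_weekly_spend(spend_by_week, budget=WEEKLY_BUDGET):
    """Return alert messages for any week whose spend exceeded the budget."""
    return [f"Week {week}: ${spend:.2f} exceeded ${budget:.2f} budget"
            for week, spend in spend_by_week.items() if spend > budget]

# Hypothetical two weeks of aggregated spend from a billing export.
alerts = check_weekly_spend({"2016-W05": 940.10, "2016-W06": 1210.55})
for alert in alerts:
    print(alert)
```

Even something this crude, run on a schedule, replaces the manual "look at the bill weekly" habit with a nudge that only fires when there is something to look at.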

Keep an eye out for future columns for specific details on tools and processes we have implemented to get the most efficient use out of AWS.

About the Author

Aaron Black is the director of informatics for the Inova Translational Medicine Institute (ITMI), where he and his team are creating a hybrid IT architecture of cloud and on-premises technologies to support the ever-changing data types being collected in ITMI studies. Aaron is a certified Project Management Professional (PMP) and a Certified Scrum Master (CSM), and has dozens of technical certifications from Microsoft and accounting software vendors. He can be reached at @TheDataGuru or via LinkedIn.
