Image by liz west
When considering the best choice for a data analytics warehouse, which is better: MySQL or Amazon’s Redshift? In this post, I’m going to look at some of the advantages of each, and give you ideas on which might be best for your organization.
Consider Their History
MySQL is a database built for Online Transaction Processing (OLTP), while Redshift was originally created from the ground up as a data warehouse, and forms part of Amazon’s hosted offerings in Amazon Web Services. While MySQL was originally designed to reside on a single machine, Redshift is distributed by default and so can theoretically scale to very large datasets. This means you can store a lot of information in a single table, and the system can handle databases as large as several petabytes.
##Large vs. Small Datasets
While Amazon Redshift is fast and adept at handling large datasets for analytical queries, many organizations have capped a low-budget, functional data warehouse on MySQL. So, do you really need Redshift? In evaluating the two technologies, I suggest you look not just at the overall volume of data but at the size of individual records, too.
MySQL is better suited when you are accessing smaller data chunks on a frequent basis, while Redshift shines when loading huge datasets less often. MySQL is still a solid workhorse, and using it in a sharded cluster is a reasonable solution for a more performant analytics backend, but large volumes of data are really suited better to a solution like Redshift.
Who Uses Redshift?
Redshift is often seriously considered when companies realize they have grown and their MySQL, PostgreSQL, or other database cannot handle the volume. For example:
Nokia’s huge volume of data was crushing its databases. Nokia moved its high data usage activity to Redshift and now runs queries two times faster.
VivaKi wanted to load 10 terabytes of data, which would have taken almost three weeks with the company’s previous solution, but took only six hours with Redshift.
Foursquare used Redshift to avoid onerous licensing costs, while maintaining compatibility with Tableau.
Consider Volume and Business Intelligence Needs
To summarize, it comes down to data volume. If you are working with a traditional MySQL database, you can use it as a data warehouse until it outgrows your needs. However, some firms find MySQL cannot keep up. In that case, something like Redshift is a better solution. It is built from the ground up to handle large scale data analytics, and is cost effective compared to other analytics specific databases on the market.
Have a database already? Get even more from your database with reporting from Beekeeper Data. Beekeeper can help you automate your reporting and make your customers much happier in the process. Learn more about Beekeeper by installing for free or scheduling a demo.