You can't say it's about SQL vs NoSQL
Recently we noticed that many people (ourselves included) are confused about which type of data storage to use when building software. So we took a look around to find out what's going on, and we have to admit we found some pretty interesting things. We also noticed that most of what we found was too technical for a general reader, so we decided to explain the topic in language most people can understand. Here we go.
Let's start with a thought experiment. Imagine you own two godowns (warehouses). Warehouse 1 has a very strict policy about where each thing goes, and it is a bit expensive because of the infrastructure it provides: racks of predefined sizes, designed for each kind of item the warehouse can store. Warehouse 2 has no such racks, just wide open spaces, labelled for identification, where anything can be put anywhere, so it is relatively cheap.
Keeping that in mind, let's look at which data storage techniques can be used where. SQL and NoSQL are the most prominent, so we'll keep our discussion around these two.
One Line Intro:
SQL (Structured Query Language) is not storage itself; it is the language used to query relational databases. Those databases store data in tables with rows and columns, much like warehouse 1 with its racks.
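NoSQL is a loose label for everything that is not a relational (SQL) database. It is like warehouse 2: no structure is imposed up front, so you can build whatever structure you want (for the techy folks: JSON documents are just one option, alongside key-value, wide-column and graph stores).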
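To make the two shapes concrete, here is a minimal sketch in Python. The `products` table, its columns and the sample phone are all made up for illustration; SQLite (bundled with Python) stands in for a SQL database, and a plain JSON document stands in for a NoSQL record.

```python
import json
import sqlite3

# Warehouse 1: a table with fixed columns (hypothetical "products" table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO products (id, name, price) VALUES (1, 'Phone', 299.0)")
row = conn.execute("SELECT id, name, price FROM products WHERE id = 1").fetchone()

# Warehouse 2: a free-form document. Fields can be nested and can differ
# from one record to the next.
document = json.loads(
    '{"id": 1, "name": "Phone", "price": 299.0, "tags": ["mobile", "android"]}'
)

print(row)               # one row, one value per predefined column
print(document["tags"])  # a nested list that has no column of its own
```

The row can only hold what the columns allow, while the document carries an extra `tags` list without anyone having to prepare a place for it first.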
Why make another if we already have one?
Good question, time for some history. When data storage first started, hardware was pretty expensive, so techniques were designed to store data in a highly optimised manner and avoid waste. The amount and type of data we had to store was also very specific. Things have changed: hardware is cheap, and the amount and variety of data we store is humongous, so we needed a different approach.
Quick fact: The name of the NoSQL database MongoDB comes from the word "humongous".
Anyway, why should I care?
If you run a business that stores and retrieves data for its processes, if you are planning a software project, or if you are a software developer, chances are that knowing these data storage techniques will help you make decisions that can make or break the system. Choosing the wrong one is like cutting a piece of cloth with a knife instead of scissors: you could do it, but the knife is not meant for that.
Seems important! Tell me more.
Let's start with SQL, the kind of technique used in warehouse 1.
- It gives assurance about the type of data stored, because the requirements are strict, just as the racks in warehouse 1 are designed for particular types of item. That can also be a problem: because the racks are specialised, adding a new type of item means altering the warehouse and bringing in new racks. The same is true of SQL databases: they strictly limit the kind of data that can be stored, and storing a new kind of data means altering the database schema.
- To scale up in SQL we can increase the power of the hardware, but only up to the limit it supports. In warehouse 1 that means adding more rows to the racks, but only up to the ceiling. We could buy a new warehouse, but we would have to set up the same structure there as in this one. So SQL is restrictive to scale.
- A product may be made of two or more items, but since each rack is designed for one particular item, we cannot store all the items that make up a product in one place. So we keep each item in its own rack and link them to each other somehow. In the SQL world this is known as relations between data. Notice that whenever we want the whole product, we have to join (another SQL term) the items together, which can be time consuming.
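The warehouse 1 points above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the `products` and `parts` tables, the bicycle and the `colour` column are hypothetical, and SQLite is just a convenient stand-in for any SQL database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE parts (
        id INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES products(id),
        name TEXT NOT NULL
    );
    INSERT INTO products VALUES (1, 'Bicycle');
    INSERT INTO parts VALUES (1, 1, 'Frame'), (2, 1, 'Wheel');
""")

# Storing a new kind of data fails until a "rack" (column) exists for it.
try:
    conn.execute("UPDATE products SET colour = 'red' WHERE id = 1")
    schema_rejected = False
except sqlite3.OperationalError:
    schema_rejected = True

# Bringing in the new rack: the schema has to be altered first.
conn.execute("ALTER TABLE products ADD COLUMN colour TEXT")
conn.execute("UPDATE products SET colour = 'red' WHERE id = 1")

# The product and its parts live in separate tables, so getting the
# whole product back means joining them.
rows = conn.execute("""
    SELECT products.name, products.colour, parts.name
    FROM products
    JOIN parts ON parts.product_id = products.id
    ORDER BY parts.id
""").fetchall()

print(schema_rejected)
print(rows)
```

The first update is refused because no `colour` column exists yet, and the final query has to visit both tables to put the bicycle back together.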
Let's now understand NoSQL, the kind of technique used in warehouse 2.
- The type of data cannot be guaranteed here, but that might be exactly the requirement. Someone might want to store all different kinds of data in one place, just as someone might pick warehouse 2 because it offers wide open space for items they had not predicted they would store. Also, everyone may build their own structures to store their data, so it can be difficult or even impossible to use the same searching techniques across different NoSQL databases.
- Whenever we need to scale up, we just buy new servers and start storing data. In warehouse 2 that means buying a new empty labelled space and putting stuff there, with no need to build racks as in warehouse 1. You could also buy a whole new warehouse if even more scaling is required. So NoSQL is easy to scale.
- The idea here is to put everything that's related in one place. In warehouse 2 that means storing all the items that together make a product in one place, so whenever we need a certain product we only have to go to one place.
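The warehouse 2 points above can be imitated in a few lines of plain Python. This is a toy sketch, not how any particular NoSQL database works: the `servers` list, the `put`/`get` helpers and the sample products are all hypothetical, and each dict simply plays the role of one server.

```python
import zlib

# Each dict plays the role of one server (one "labelled space").
servers = [{}, {}, {}]

def server_for(key: str) -> dict:
    # Hash the label to decide which server holds it.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]

def put(key: str, document: dict) -> None:
    server_for(key)[key] = document

def get(key: str):
    return server_for(key).get(key)

# Everything related is stored in one place: the parts live inside the product.
put("product:1", {"name": "Bicycle", "parts": [{"name": "Frame"}, {"name": "Wheel"}]})

# A completely different shape is accepted without preparing anything first.
put("product:2", {"name": "E-book", "pages": 120, "formats": ["pdf", "epub"]})

bicycle = get("product:1")
print([part["name"] for part in bicycle["parts"]])
```

One lookup returns the whole bicycle, parts included, and the e-book did not need any new racks. In this toy version, adding a server changes where existing labels map to; real databases use more careful placement schemes so that most data stays where it is.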
Quick fact: Experts say the world's data is doubling every two years.
So when can I use which one?
You go for SQL,
- When data consistency is the utmost requirement. Systems involving accounting, transactions, inventory management, etc. usually benefit from it.
- When the type and amount of your data is limited. In banking, for example, the type of data is fixed.
- When data is stored in and retrieved from only a few tables, i.e. we do not need joins across a large set of tables.
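To show what consistency buys you in something like accounting, here is a small sketch using Python's built-in `sqlite3` module. The `accounts` table, the two account holders and the `transfer` helper are hypothetical; the point is only that a transfer either happens completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # no overdrafts allowed
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(sender: str, receiver: str, amount: int) -> bool:
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint refused a negative balance; both updates are undone.
        return False

ok = transfer("alice", "bob", 30)       # alice can afford this
failed = transfer("alice", "bob", 500)  # alice cannot, so nothing changes

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second transfer the receiver is credited first, yet that credit disappears again when the debit is refused, so the books never show money that came from nowhere.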
And you go for NoSQL,
- When data availability is the utmost requirement. Systems involving real-time analytics, mobile apps and content management usually benefit from it.
- When the type of your data can vary and the amount can be humongous, for example the usage statistics (running into terabytes) collected by companies like Twitter, Google and Facebook.
- When super fast reads are required and the "everything in one place" idea is followed, since all the related data is already stored together in the collection.
Quick fact: NoSQL databases have existed since the 1960s, but have been recently gaining traction with popular options such as MongoDB, CouchDB, Redis and Apache Cassandra.
All that aside, when a system becomes very large its data requirements vary, and no single solution works for everything. That is why many big companies such as YouTube and Facebook use both of these data storage techniques.
That's it for today. We hope we were able to clear up some of the clutter.
Please leave a comment below if you have something wandering around your head, however crazy it may be. Also let us know if you liked this post, what we need to improve, and what kind of blogs you would like to see in future. You can leave a message in the chat box.
Ending with a quote
There are no problems. It's just situations that we connect with unpleasant emotions.