If you spend a good amount of time on the computer, you probably interact with databases several times a day—if not several times per hour. Databases power everything from banking software to scientific research to government records, as well as the websites you use every day, like Amazon, YouTube, Netflix, and Wikipedia. In fact, if you found this page through an Internet search engine, your search was powered by a (very, very large) database. But databases aren’t just for massive websites that serve millions of users per day—they can also be used by small businesses, or by a small group of people.
Databases are so common, for use cases big and small, because they make accessing information with a computer much easier. If you work with information on a daily basis (and most of us do!), it’s worth understanding what databases are. And once you understand what databases are, you might even want to build your own, which will work just the way you need them to.
What is a database?
To explain why databases are important, we first need to define what a database actually is:
A database is a logically organized collection of information, designed in such a way that the information within can be accessed for later use by a computer program.
Here’s another way to think about databases: a computer is a device that allows you to manipulate information, whether that information takes the form of words, numbers, pictures, or videos. However, a computer needs to store the information before that information can be referenced or changed, and it also needs to ensure that you can find the right information at the right time. Databases are how computers solve these two problems.
By definition, the data within a database needs to be arranged according to a consistent, logical set of underlying principles. The term data model describes the logical structure of a database, which determines the rules for how the information within can be organized and manipulated. There are many different types of databases, which are usually classified according to their underlying data models. For example, the most popular data model, the relational data model, organizes information into related two-dimensional tables, whereas the graph data model organizes information into nodes (points) and edges (the lines connecting those points).
Some abstract representations of data models: hierarchical data model (top left), network data model (top right), relational data model (bottom left), graph data model (bottom right).
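To make the contrast between the relational and graph data models more concrete, here is a minimal Python sketch that stores the same small set of facts both ways. Everything in it is invented for illustration: the authors and books tables, the sample names and titles, the "WROTE" edge label, and both helper functions are assumptions, not part of any particular database product. The relational half uses Python's built-in sqlite3 module; the graph half is only a plain list of edges standing in for a real graph database.

```python
import sqlite3

# Relational data model: two related two-dimensional tables.
# Table names, columns, and sample rows are hypothetical.
relational_db = sqlite3.connect(":memory:")
relational_db.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
relational_db.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, "
    "author_id INTEGER REFERENCES authors(id))"
)
relational_db.executemany(
    "INSERT INTO authors VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
)
relational_db.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [(1, "Notes", 1), (2, "Sketches", 1), (3, "Compilers", 2)],
)


def titles_by_author_relational(name):
    """Follow the relationship between tables with a JOIN."""
    rows = relational_db.execute(
        "SELECT books.title FROM books "
        "JOIN authors ON books.author_id = authors.id "
        "WHERE authors.name = ? ORDER BY books.title",
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


# Graph data model: nodes (points) and edges (lines connecting them).
# Each edge is a (source node, label, target node) tuple.
nodes = {"Ada", "Grace", "Notes", "Sketches", "Compilers"}
edges = [
    ("Ada", "WROTE", "Notes"),
    ("Ada", "WROTE", "Sketches"),
    ("Grace", "WROTE", "Compilers"),
]


def titles_by_author_graph(name):
    """Follow the relationship by walking the edges that leave a node."""
    return sorted(
        target for source, label, target in edges
        if source == name and label == "WROTE"
    )


print(titles_by_author_relational("Ada"))  # ['Notes', 'Sketches']
print(titles_by_author_graph("Ada"))       # ['Notes', 'Sketches']
```

Both versions answer the same question, but the relational one matches rows across tables while the graph one follows connections between points directly.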
Different data models have different benefits and drawbacks depending on what you’re trying to accomplish. If you’re trying to create a database with a high degree of data integrity that can scale flexibly, a relational database will likely work best; if you need a database that can handle massive quantities of highly interconnected data, a graph database could be a good option.
The implementation of a data model in a particular database is called a database schema. You can think of a schema as a blueprint for a specific database, describing details about how you want the database to be implemented, like required data types or other constraints. The schema of a database is what distinguishes it from a list or a spreadsheet: with a schema, you can rest assured that the data inside your database will be organized according to a certain…