Database Design. Best practics

In anticipation of the start of the next stream on the course “Databases”, we prepared a small authorial material with important tips on designing the database. We hope this material will be useful to you.








Databases are everywhere: from simple blogs and directories to reliable information systems and large social networks. It is not so important a simple or complex database, how much it is essential to design it correctly. When the base is designed thoughtlessly and without a clear understanding of the goal, it is not just not effective, but further work with the base will be a real torment, an impenetrable forest for users. Here are some database design tips to help you create a useful and easy-to-use product.



1. Determine what the table is for and what its structure is.







Today, development methods such as Scrum or RAD (rapid application development) help IT teams quickly develop databases. However, in the pursuit of time, the temptation is very great to plunge right into the construction of the base, vaguely imagining what the goal is, what the final results should be.



As if the team is focused on efficient, high-speed work, but this is a mirage. The further and faster you go deeper into the project, the more time will be required to identify and change errors in the project base.



Therefore, the first thing to decide is to determine the purpose for your database. For what type of application is the database developed? Will the user only work with records and need to pay attention to transactions or is he more interested in data analytics? Where should the base be deployed? Will she track customer behavior or just manage the relationship between them?



The sooner the design team answers these questions, the softer, smoother the database design process will go.



2. What data to choose for storage?







Plan ahead. Thoughts about what the site or system will do in the future for which the database is being designed. It is important to go beyond the simple requirements of the technical specifications. Just please, do not start thinking immediately about all the possible types of data that the user will ever store. Better think about whether users can post, upload documents or photos, or exchange messages. If so, then in the database you need to allocate space for them.



Work with a team, department or organization for which the project base will be supported in the future. Connect with people at all levels, from customer service specialists to department heads. So with the help of feedback you get a clear idea of ​​the requirements of the company.



Inevitably, the needs of users within even one department will conflict. If you encounter this, do not be afraid to rely on your own experience and find a compromise that will suit all parties and will satisfy the ultimate goal of the database. Be sure: in the future, +100500 in karma and a mountain of cookies will fly to you.



3. Model data with care







There are several key points that you should pay attention to when modeling data. As we said earlier, it depends on the purpose of the database which methods to use in the simulation. If we design a database for online processing of records (OLTP), in other words, to create, edit and delete them, then we use transaction modeling. If the database should be relational, then multidimensional modeling is best.



During the simulation, conceptual (CDM), physical (PDM), and logical (LDM) data models are built.



Conceptual models describe the entities and data types that they include, as well as the relationships between them. Divide your data into logical chunks - it’s much easier to live.

The main thing is a measure, do not overdo it.



If an entity is very difficult to classify in one word or phrase, then it's time to use subtypes (child entities).



If an entity leads its own life, has attributes that describe its behavior and its appearance, as well as relations with other objects, then it is safe to use not only a subtype, but also a supertype (parent entity).



If you neglect this rule, other developers will get confused in your model and will not fully understand the data and rules, how to collect them.



Conceptual models are implemented using logical ones. These models are like a roadmap for designing a physical database. In the logical model, the entities of business data are distinguished, data types, the status of the rule key that govern the relationship between data are determined.



Then the Logical data model is compared with the previously selected platform of the DBMS (database management system) and the Physical model is obtained. It describes a way to physically store data.



4. Use suitable data types







Using the wrong type of data can lead to less accurate data, difficulties in joining tables, synchronizing attributes, and bloating file sizes.

To guarantee the integrity of information, an attribute should contain only data types acceptable to it. If age is entered in the database, then make sure that integers from a maximum of 3 digits are stored in the column.



Create a minimum of null empty columns. If you create all columns as NULL, this is a blunder. If you need an empty column to perform a specific business function, when the data is unknown or does not yet make sense, then feel free to create. After all, we cannot fill in the columns “Date of death” or “Date of dismissal” in advance, we are not predictors to point a finger at the sky :-).



Most modeling software (ER / Studio, MySQL Workbench, SQL DBM, gliffy.com) data allows you to create prototypes of data areas. This guarantees not only the correct data type, application logic and good performance, but also the required value.



5. Prefer natural







When you decide which column in the table to choose as a key, always pay attention to which fields the user can edit. Never choose them as a key - a bad idea. Anything can happen, and you must guarantee uniqueness.



It is best to use a natural, or business, natural key. It makes sense, so you avoid duplication in the database.



If only the business key is not unique (name, surname, position) and is repeated in different rows of the table or it must be changed, then the generated key should be the generated artificial, surrogate key (artificial key).



6. Normalize in moderation







To effectively organize data in the database, you must follow a set of recommendations and normalize the database. There are five normal forms to follow.

Using normalization, you avoid redundancy and ensure the integrity of the data used in the application or on the site.



As always, everything should be in moderation, even normalization. If the database has too many tables with the same unique keys, then you get carried away and overly normalize the database. Excessive normalization negatively affects database performance.



7. Test early, test more often







A test plan and proper testing should be part of database design.



It is best to test the database through Continuous Integration. Model the scenario “One day in the life of the database” and check whether all boundary cases are handled and what user interactions are likely. The sooner you find bugs, the more you save both time and money.



These are just seven tips to help you design an excellent database of performance and efficiency. If you follow them, you will avoid most headaches in the future. These tips are just the tip of the iceberg in database modeling. There are a huge number of life hacks. Which ones do you use?



All Articles