There are sources that recommend database normalization as a best practice for relational databases. Is it always a best practice to normalize databases?
There is no clear answer as the sources vary.
A heavily voted-up answer on StackOverflow says that denormalization for OLAP performance is something to be avoided. We find recognition of this "best practice" on a venerated engineering blog:
Keeping data normalized is considered a best practice in MySQL. As we mentioned before, however, when using a NoSQL database like Cassandra, denormalizing data often improves query performance.https://engineeringblog.yelp.com/2016/08/how-we-scaled-our-ad-analytics-with-cassandra.html
The above sources have legitimacy, but authoritative sources dispute that normalization is a best practice. No one should dispute that there are proponents and opponents of normalization of SQL/relational databases as a best practice.
The opponents of normalization as a best practice say that you need to have denormalized data for OLAP databases (for reporting to reduce the expensive operation of a JOIN in SQL, and to avoid aggregate queries which can be time-intensive) and for datawarehouses. Datawarehouses are usually not updated that frequently; it is common for batch jobs to happen monthly with datawarehouses. The work to normalize such a database may be economically prohibitive, and the benefits may be negligible or have a negative value.
Kimball cubes are widely used in the I.T. industry; they have changed the database and BI verticals. The man behind the namesake of Kimball cubes says this on his company's website:
In general, dimensional designers must resist the normalization urges caused by years of operational database designs and instead denormalize the many-to-one ﬁxed depth hierarchies into separate attributes on a ﬂattened dimension row.https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/denormalized-flattened-dimension/
In a dataware house the best practice is to denormalize the data to increase the performanze [sic] of ad-hoc reporting; with the prize of a higher data volume.https://social.technet.microsoft.com/Forums/en-US/8c54d1d2-5fbd-4b62-a07f-16c34a863668/is-denormalization-best-practices?forum=transactsql
Other examples suggesting denormalization is beneficial (and indicated a "best practice") are the following:
If you want to read about the controversy of "best practices" in general, we recommend this posting.