When transforming a logical database design into a physical database schema, the first rule is not to focus on physical storage challenges or on the limitations of any particular Database Management System (DBMS). The thinking should be conceptual, not physical: always ensure that the integrity of the data model remains intact by honoring the business rules. The second rule is to ensure the integrity of the process by thinking about the structure of the process rather than its individual steps, so that all the required stages and feedback loops are taken into account. The third rule is to ensure that the whole process is practical by thinking about how things relate in the data model, and by making sure that the data model addresses those relationships as well as issues such as reporting, querying, and load performance.
Foreign keys form the foundation of referential integrity: a primary key in one table is referenced by a foreign key in a second table. Entities (rows) in a table are uniquely identified by the primary key; entity integrity maintains the primary keys, while referential integrity maintains the foreign keys. Entity integrity is the mechanism the system uses to ensure that every primary key value is unique and not null. Referential integrity is the mechanism the system uses to maintain foreign keys: a foreign key must specify the table whose primary key it references, and the system ensures that the referenced primary key value exists and is not null. Domain integrity requires that every column in a relational database takes its values from a defined domain. These integrity rules are important for keeping the database safe, so that errors cannot be introduced into it either accidentally or intentionally.
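The three kinds of integrity described above can be illustrated with a minimal sketch. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the customer and orders tables, their columns, and the helper names build_demo_db and try_insert are hypothetical and chosen only for illustration. Note that SQLite enforces foreign keys only after the foreign_keys pragma is switched on.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    # Entity integrity: the primary key must be unique and not null.
    conn.execute(
        """
        CREATE TABLE customer (
            customer_id TEXT NOT NULL PRIMARY KEY,
            name        TEXT NOT NULL
        )
        """
    )
    # Referential integrity: orders.customer_id must match an existing customer.
    # Domain integrity: quantity is restricted to positive integers by a CHECK.
    conn.execute(
        """
        CREATE TABLE orders (
            order_id    TEXT    NOT NULL PRIMARY KEY,
            customer_id TEXT    NOT NULL REFERENCES customer(customer_id),
            quantity    INTEGER NOT NULL CHECK (quantity > 0)
        )
        """
    )
    return conn


def try_insert(conn, sql, params):
    """Run one INSERT; return 'ok', or 'rejected' if a constraint fails."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


if __name__ == "__main__":
    db = build_demo_db()
    new_customer = "INSERT INTO customer VALUES (?, ?)"
    new_order = "INSERT INTO orders VALUES (?, ?, ?)"
    print(try_insert(db, new_customer, ("C1", "Ada")))   # valid row
    print(try_insert(db, new_customer, ("C1", "Bob")))   # duplicate primary key
    print(try_insert(db, new_customer, (None, "Eve")))   # null primary key
    print(try_insert(db, new_order, ("O1", "C1", 2)))    # valid row
    print(try_insert(db, new_order, ("O2", "C9", 1)))    # no such customer
    print(try_insert(db, new_order, ("O3", "C1", 0)))    # quantity outside domain
```

In this sketch the database itself rejects the duplicate key, the null key, the dangling foreign key, and the out-of-domain quantity, so the application cannot introduce those errors even by mistake.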
In Microsoft SQL Server, a database uses three types of files: primary data files, secondary data files, and log files. A database starts with a primary data file, which links to the other files in the database; each database has exactly one primary data file, and its recommended file extension is .mdf. Secondary data files are all data files other than the primary data file, and a database can have more than one of them. Log files record all the log information, which is critical for database recovery, and every database has at least one.
Several major considerations govern the development of a data warehouse. Data requirements must be defined, and how users will use the data must be understood, by defining hierarchies, dimensions, and aggregations. The automation and execution of ETL jobs and other batch-oriented processes must be determined, including the scheduling of cron jobs on the operating system servers. How third-party data is to be used must also be considered, along with an estimate of the amount of data required. The strengths and weaknesses of the ETL tools are a critical factor, as are the data dictionary and metadata. Other considerations include OLAP tools, error checking and logging, process flow and tolerance, the application server, and physical data design and performance.
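Several of these considerations (a simple year and month hierarchy, an aggregation, and error checking with logging in an ETL job) can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for the warehouse, and the RAW_SALES rows, the fact_sales table, and the functions transform, run_etl, and monthly_totals are all hypothetical. In practice a script like this would be one batch job that a scheduler such as cron starts.

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_sketch")

# Hypothetical extracted rows: (sale_date, product, amount as text).
RAW_SALES = [
    ("2024-01-15", "widget", "10.50"),
    ("2024-01-20", "widget", "4.50"),
    ("2024-02-03", "gadget", "7.00"),
    ("2024-02-10", "gadget", "oops"),  # bad amount: rejected and logged
]


def transform(row):
    """Split the date into a year/month hierarchy and parse the amount."""
    sale_date, product, amount = row
    year, month, _day = sale_date.split("-")
    return (int(year), int(month), product, float(amount))


def run_etl(conn, rows):
    """Load valid rows into the fact table; log and count the rejects."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(year INTEGER, month INTEGER, product TEXT, amount REAL)"
    )
    loaded, rejected = 0, 0
    for row in rows:
        try:
            record = transform(row)
        except (ValueError, TypeError) as exc:
            rejected += 1
            log.warning("rejected row %r: %s", row, exc)
            continue
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", record)
        loaded += 1
    conn.commit()
    return loaded, rejected


def monthly_totals(conn):
    """Aggregate the fact table at the month level of the hierarchy."""
    return conn.execute(
        "SELECT year, month, SUM(amount) FROM fact_sales "
        "GROUP BY year, month ORDER BY year, month"
    ).fetchall()


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    print(run_etl(warehouse, RAW_SALES))
    print(monthly_totals(warehouse))
```

Counting and logging rejected rows, rather than stopping at the first bad record, is one possible tolerance policy; a real job might instead abort once the reject count passes an agreed threshold.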