Somewhat surprisingly Microsoft Visio is a true Rapid Application Development tool as envisioned by James Martin. While Visio can perform many Rapid Application Development tasks (including Code Generation from UML diagrams), this article provides an in-depth view of how to use Microsoft Visio for data modeling.
Microsoft Visio, Enterprise Architect edition, is a powerful software development tool for numerous applications. In addition to a basic line-to-shape drawing tool, Visio can model UML, generate and reverse engineer interfaces in code (no implementation is generated, unlike Automated Architecture's Blue Ink tool), and as is the focus of this article: provide a fairly strong tool for data modeling. Visio is not particularly intuitive or well documented, and is occasionally buggy, but once you get used to it; it is very useful (and relatively inexpensive). In terms of data modeling Visio can:
I’ve used it extensively with SQL Server 7, SQL Server 2000, SQL Server 2005, and Oracle 8i, but it should work with any ODBC accessible database.
This document will go over how to use Visio to perform data modeling tasks and how to maintain a database with Visio, while providing tips from someone who has multiple years of experience with the tool. This document assumes a basic understanding of data modeling design concepts. For some background on how to read the crow's feet notation that Visio uses see my Blog post An Entity Relationship Diagram Example
The data modeling section will provide an overview of why to use a data modeling tool, then will delve into the details of using Visio, including how to create a new model, how to modify tables and relationships, and how to comment a data model.
I know numerous developers that build databases by hand, but of all my software development tools I believe a data modeler is by far the most important. A data model is like the foundation to a house; the better the data model the less custom development work in business rules and exceptions. The use of any Rapid Application Development tool or Code Generation tool requires a solid foundation, and a data modeling tool helps provide that by eliminating errors that are easy to make without a visual representation of your foundation. Visual modeling tools provide many secondary benefits beyond decreasing data model defects including the ability to keep multiple data models in sync, make on-the-fly data model changes (without loosing data), and validate models (enforce primary key requirements, warn against duplicate indexes, etc).
Create a new model by simply selecting File -> New -> Database -> Database Model Diagram. It is important when creating a diagram to set the Database Management System from the Drivers tab in Database -> Options -> Drivers.
Reverse engineering an existing database involves creating a new model from File -> New -> Database -> Database Model Diagram. Next select the type of Database Management System (DBMS) you want to connect to from the Drivers tab in Database -> Options -> Drivers. Next select Database -> Reverse Engineer and follow the wizard by selecting an ODBC data source or creating a new one. Clicking next allows you to select which elements of the database to retrieve, and the following screen allows you to select which tables to retrieve (you probably want all of them). The last screen allows you to automatically add the tables to the page or not. If you don’t automatically add them you can manually add them from the Tables and Views pane (that can be displayed via Database -> View -> Tables and Views).
There are a few options I recommend setting for a new diagram. I prefer to view crow’s feed to indicate one-to-many relationships instead of the default arrow. This can be set by selecting the "Show Crows Feet" checkbox on the Relationship tab from Database -> Options -> Document.
Setting the page area for the purposes of printing is another useful but non-intuitive process. Start by going to File -> Page Setup. Select the Page Size tab. Here you can select Portrait or Landscape at the bottom. If you want multiple pages the best way I’ve found to do it is by selecting Custom size and entering multiples of 8 inches and 10.5 inches (giving a quarter inch border around all pages). For instance three pages width-wise by two pages height-wise done landscape would be 24" X 21".
I also like to have the font smaller for tables and columns, but unfortunately I haven’t figured out how to set the default, so when adding a new table I select it and select 8 point from the toolbar.
Finally (for what it’s worth) I personally don’t like the default Window Pane setup. I like to have the Database Properties pane at the bottom of the screen with the Entity Relationship screen small and to the left of it. You may find the following setup more productive than the default:
Creating a table simply involves dragging it from the Entity Relationship pane to the design surface. You can modify a table’s name by selecting the table and editing the name from the Definition category in the Database Properties pane. Just modify the Physical Name, I’ve never found a need for the Conceptual name or Name space.
To modify columns just select the table on the design surface then the Columns category in the Database Properties pane. Adding and deleting columns is pretty straight forward. If you’re using SQL Server you can type "int identity" into the Data Type field to get an auto incrementing field. If you click edit you can change the default value and edit constraints.
Creating a new index is yet another not particularly intuitive process in Visio. Click indexes in database properties for a table and click new. When the prompt comes up for a name don’t enter anything and hit enter. Visio automatically names indexes for you and it I’ve seen it get confused if you enter something custom, so just leave it blank. Generally select "Unique Index Only", but there are probably reasons for entering other types of indexes. Selecting the columns is pretty straightforward.
Keep in mind that the design surface is not indicative of what is actually in the model. For instance if you click a table or relationship and hit the delete key it should ask you "Remove selected item from underlying model?" (if it doesn’t prompt you for this make sure to select "Ask user what to do" in the Logical Diagram tab in Database -> Options -> Modeling). If you say yes then it will truly delete the object, if you say no it will only make the object disappear from the design surface, but if you regenerate the database the item will still be generated.
You can view the real objects in the model by going to Database -> View -> Tables and Views (and you can drag these onto the design surface). You can also right click on a table and select "Show Related Tables" to view related non-visible tables and relationships. Hiding tables can be extremely useful for tables that are related to nearly every table in the model.
Creating a one-to-many relationship is easy, just drag a relationship object from the "Entity Relationship" pane to the design surface and connect the ends to tables. Visio will automatically add foreign keys to the table with the crow's feet on it provided the parent table has a primary key. Visio automatically names the relationship for you, but it isn’t smart enough to rename the relationship if you rename tables, so sometimes you’ll run into errors with duplicate relationship names, these you’ll have to fix by manually renaming the relationships.
Set up a one-to-one relationship just as you would a one-to-many relationship keeping in mind which table will be the parent and which will be the child. Next click on the relationship and in the Database Properties pane select the Miscellaneous category and set the cardinality to "Zero or One."
Visio does a great job of allowing you to comment a model and then run reports that describe the model. You can comment a table from the notes category in the Database Properties pane, and you can comment a column from the notes section in the Columns category. Oddly Visio puts default comments in for columns, but the default comments don’t show up in the reports.
Running reports is so customizable it is a little overwhelming. Start the process with Database -> Report. I find the most useful report to be the one that describes the tables and columns with their comments. Do this by selecting the "Table Report". If you run this report it will display a list of all tables and then follow it up with a couple pages per table describing the table and it’s columns, etc. You can select the "Default To" button to select preset options for different subsets of data. In particular the "Database Report" just shows tables and their comments, and the "Column Summary" report just shows tables and the comments for their columns.
Note: It ought to be easier than it is to export table and column names into a program like Microsoft Excel. If the topic interests you see my Blog post Export Visio Database Table Names to Excel which contains a step-by-step process with screenshots.
The Maintaining a Database with Visio section will detail data model validation, how to generate a new database using Visio's "Generate" command, and how to maintain an existing database including using Visio's "Update" command.
Before generating you need to validate the model. Visio will find all sorts of frustrating errors, some of them legit, some of them not. Perform the error check from Database -> Model -> Error Check. Errors will be displayed in the Output pane which will become visible after the check. You can double click on the error and sometimes Visio will take you to the problem.
The most common error is duplicate relationship names. In this case Visio will say something like "Foreign key relationship has the same parent and child tables as ... but does not have forward and inverse verb phrases." This occurs when there are two relationships between two tables that both have the same "Verb phrase" and "Inverse phrase" properties. To fix the problem select one of the relationships, select "Database Properties" select the "Name" category and change the values in the appropriate fields in the right-hand column (I typically just add a character to each as I have no real use for verb phrases).
The most annoying problem is identical column names (in different tables) that have different data types. The error will be "Column has same conceptual name as [Table].[Column], but has a different data type". To fix this select one of the columns, click the Edit button to bring up a column properties dialog box, uncheck "Sync names when typing," and change the "Conceptual name" field.
Generate a database from the model by selecting Database -> Generate. This will automatically run a model validation and then it will display a screen with some options. Generate a text file of the DDL script is a great idea, and you might as well always check this. Generate a new database is a bit of a misnomer, the checkbox is basically asking if Visio should automatically run the SQL commands (DDL script) against the database. It’s a nice convenience and I always check this, but more importantly it allows you to select the next checkbox "Store current database image in model." Visio keeps track of all changes since you updated the "image" of the database and when you regenerate it will only make the new changes, so clicking this checkbox is important as it stores an "image" of the database.
The next screen allows you to create a new database, which I would advise against, or selecting an existing database. I would suggest always creating the database by hand through SQL Server or Oracle and then selecting "database already exists." When you take this approach Visio will ask you for an ODBC data source. Select an existing one or more likely create a new system data source, select your DBMS and enter the parameters to connect to the database (don’t forget to set the default database to your newly created database). Finally Visio will summarize the changes it’ll make and then when you hit finish it will generate the tables, columns, indexes, relationships, constraints, etc into your database. The results of the generated code will be displayed in the Output Window. Review the output carefully; although Visio is pretty good, it sometimes has errors when generating and it will not tell you about the errors other than through reviewing the Output Window. You may need to manually fix errors it encounters by reviewing the DDL script that it ran and re-running sections.
Once you have generated a database from a model it is possible that manual changes to the database schema may get your model and the database out of sync. To relieve this problem Visio can review a database and identify elements that have changed and optionally update the model with the changes. Perform this operation by going to Database -> Model -> Refresh and selecting your database from the ODBC Data Sources. Visio will compare the image in the model of what the database should look like and the actual database and list all discrepancies. You can select discrepancies and choose "No Change" or "Refresh Model." Fixing the discrepancies should update the model, but I have occasionally run into problems where I have needed to just note the discrepancies and manually make the changes then re-run the refresh operation.
There are two ways to update a data model and sometimes one is more appropriate than the other. Visio is pretty crummy when it comes to making modifications to SQL Server tables that have fields of type "text," but otherwise it does an excellent job of modifying a database schema while maintaining the data in the database. In I go with using Visio’s update operation first and if there are errors then I’ll make the changes manually.
Making changes using this approach involves making the changes in the model (adding rows, changing relationships between tables, etc), and then selecting Database -> Update. Always select the first option "Generate a text file with the DDL script" as you can refer to this if there are any errors. "Update the Database" is the option I always go with for sheer laziness as it runs the script against the database for you and displays the output on screen. "Detect Changes in the Database" is an option that you can check if you are worried that the database might not be in sync with the data model. Finally, I haven’t used the "Log Conflicts to file" option, but it sounds useful and I should probably start.
If you select "Update the Database" the next screen has you select the ODBC Data Source to get to the database. If you selected "Detect Changes in the Database" the next screen will allow you to resolve conflicts. It gives you options to update the model from the database, update the database from the model, or ignore the conflict. I haven’t had problems with these options, and have found the detect changes useful on a number of occasions. Finally you will be presented with a list of changes that Visio will make to the database and to the model. Click Finish with the options I usually pick and the DDL Script will be displayed and run against the database and results will be displayed in the Output Window. Review the Output Window carefully, errors are more frequent with the Update operation than with the Generation operation. When errors occur (this happens occasionally) you will need to review the DDL script and re-run sections to get everything to work. I will briefly describe Visio’s process in the DDL script to handle updates:
For any table that has changed Visio will remove all foreign key referential constraints ("alter table drop constraint"). Visio will then create a duplicate of the table for backup purposes and name it "[TableName]_IM0." Visio will then copy all data into the backup table, delete the existing table, recreate the table with the new changes, and move the data from the backup table to the newly created table. When data is moved from the backup table to the new table Visio will often set a default value of 1 for new required fields. Note that this will fail for required foreign keys that do not have a parent record number "1" to point to. Finally Visio deletes the backup table and then recreates all referential constraints (alter table add constraint).
When errors occur they will sometimes cascade to all operations on all tables and sometimes just remain local to the table that is being changed. When errors do occur they often leave _IM0 tables sitting around that can usually be deleted. Once the operation is complete and any errors resolved I recommend refreshing the model to make sure everything worked and the model and database are equivalent.
Making changes manually is the best rout to use when any table that is involved in a change is in SQL Server and has a field of type text, but it is occasionally useful when errors occur during the Visio’s update operation. I will avoid making manual changes when relationships change or when there are more than a couple changes to make. Incidentally I find making schema changes to be much easier in SQL Server and I try to avoid making manual changes in Oracle without a tool such as TOAD.
The manual process is pretty straight forward: make the changes in the model, make the changes in the database, and run a refresh to make sure they are equivalent.
In summary I would never do database design work without a data modeling Rapid Application Development tool, and Visio is one that is quite powerful.
Copyright © 2019 Automated Architecture, Inc. All rights reserved.