Thursday, March 11, 2010

Announcements
Enterprise Data World - Karen and Rob are speaking

We Are Speaking at Enterprise Data World 2010

 
 


 


Welcome...

Welcome to InfoAdvisors' website dedicated to information technology processes.  You'll find subscriber-written articles on UML, data management, data modeling, process modeling, ITIL, and information governance, as well as materials to help you improve your information management resources.



Karen Lopez: Musings on Data, Process, and Architecture

Sara Yeager sent me a question about how to generate an XSD from her data model for developers on her team:

I used the ARTS model as a template for the definition of a Retail Transaction and Retail Transaction Line Item.  I know you've worked with ARTS and are familiar with this super/sub type structure.

In working with our solutions architects, they asked if I could produce XSD for them from the data model from which they could create the Retail Transaction business object.  So, what I did was exported the model using MIMB into XSD format and provided it to them.  What they said was that the XML created wasn't what they had expected.  It evidently wasn't formatted in the manner they expected, with relationships and "choices" not included.

I am absolutely not an XML/XSD expert, but what it appears to me is that the XSD created from ERwin defines the data model structure and what they are looking for is more functional XML to use in the creation of their classes etc.  So, perhaps there's an additional conversion or something which must be done in order for the XML to be usable, according to them.

What I’ve learned about these sorts of generation requests is to enter into a more detailed dialog with the developers to find out what they really want.  XSD is just the name of a very vague file format.  Calling it a format is even a stretch.  It’s sort of like someone asking you to provide the data model “in a spreadsheet”.  You can generate a million spreadsheets from a data model without providing any value, depending on what the consumer is really looking for.  The tips below also apply to requests for “some DDL” from the data models.

I have used the following questions to help derive what they need:

  • What, exactly, will you be using the XSD for?  This is not a query asking the developer to justify their request, but to be much more specific in what type of XSD they need.  It may also lead to a response where the best type of file to provide them is not an XSD.
  • Are you using this as a one-time method to get the data model information into a tool?  Which tool? What formats will it import?  Does it have any special requirements for the format or standards for the file?
  • Are you expecting this provision of the data model to be repeated over the course of the project?  When something changes in the model? On a fixed schedule?  Only when the model is officially released?
  • How will you incorporate changes to the model into your tool?  If you are annotating or modifying the model objects, how will you reconcile those changes with changes in the data model?
  • Do you need the entire data model or just parts of it?
  • Do you want the physical (database) names, the logical, or both?
  • What properties (datatypes, domains, definitions, notes, UDPs/Attachments) do you want with the file?

There is often a lot of resistance to answering these questions, because many times the developers just want to be able to browse the model in a tool that they are comfortable working with.  That is fine if they understand that data models include all kinds of information that their tool may not support.  Their tools may also require modifications to the model, such as truncation of names or definitions.  I can guarantee that every time you move data from one system to another, it gets changed.
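To make the truncation point concrete, here is a minimal sketch of a check you could run before handing a file over. It assumes a hypothetical target tool that limits names to 17 characters; the limit, the function name, and the attribute names are all made up for illustration, so substitute the real limits of whatever tool the developers are using.

```python
from collections import defaultdict


def truncation_report(names, max_len):
    """Show which names change, and which collide, when cut to max_len."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:max_len]].append(name)
    # Names that would be altered by the target tool's length limit.
    changed = {n: n[:max_len] for n in names if len(n) > max_len}
    # Truncated names that two or more original names would share.
    collisions = {short: full for short, full in groups.items() if len(full) > 1}
    return changed, collisions


# Hypothetical attribute names; the 17-character limit is an assumption.
attribute_names = [
    "RetailTransactionID",
    "RetailTransactionLineItemID",
    "TenderAmount",
]
changed, collisions = truncation_report(attribute_names, 17)

for short, originals in collisions.items():
    print(f"{short!r} would stand for: {', '.join(originals)}")
```

In this made-up case two different attributes collapse into the same truncated name, which is exactly the kind of change worth raising with the requester before the export rather than after.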

Other times, they want to do things that are less benign, such as modify the model to generate development databases.  Depending on your project methods, that may or may not be a good thing.

The reason you want to know which tool they are using is that the Meta Integration Model Bridge (MIMB) may offer a more targeted, direct format than just an XSD.  For instance, I get requests to generate XSDs for common development tools, but I can produce a much more meaningful and valuable file by generating one that has all the objects in the right places for the exact version of the tool they are using.  The XMI possibilities, for example, often lead to better integration with other tools.

Also, if they are reporting something is “missing”, then it means that either you didn’t choose to include it in the Destination tab of the MIMB or that the external format (or MIMB) does not support it.  For instance, some tools don’t include relationships in the data model when choosing XSD as the target format.

The best way to figure out what the requester is looking for is to ask for a good example of what they expected the file to look like.  You could ask them to prepare an XSD that has 3-4 entities and a handful of attributes.  In reviewing this, you can best determine which export format to use.
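As an illustration of what such a small sample might contain, here is a sketch that builds a tiny XSD with Python's standard `xml.etree.ElementTree` module. It is not what MIMB or ERwin produces, and the element names (`RetailTransaction`, `LineItem`, `SaleReturnLineItem`, `TenderLineItem`) are hypothetical stand-ins rather than actual ARTS names. It shows the two things the developers in the question said were missing: a relationship expressed as nesting, and a subtype structure expressed as an `xs:choice`.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def xs(tag):
    """Return a namespace-qualified XML Schema tag name."""
    return f"{{{XS}}}{tag}"


def build_sample_schema():
    """Build a small, hypothetical XSD for a transaction and its line items."""
    schema = ET.Element(xs("schema"))
    txn = ET.SubElement(schema, xs("element"), name="RetailTransaction")
    txn_seq = ET.SubElement(ET.SubElement(txn, xs("complexType")), xs("sequence"))
    ET.SubElement(txn_seq, xs("element"), name="TransactionID", type="xs:string")

    # The parent-to-child relationship is expressed as a nested, repeating element.
    line = ET.SubElement(txn_seq, xs("element"), name="LineItem", maxOccurs="unbounded")
    line_seq = ET.SubElement(ET.SubElement(line, xs("complexType")), xs("sequence"))
    ET.SubElement(line_seq, xs("element"), name="SequenceNumber", type="xs:int")

    # The subtypes are expressed as a choice: each line item is exactly one of these.
    choice = ET.SubElement(line_seq, xs("choice"))
    ET.SubElement(choice, xs("element"), name="SaleReturnLineItem", type="xs:string")
    ET.SubElement(choice, xs("element"), name="TenderLineItem", type="xs:string")
    return schema


schema = build_sample_schema()
xsd_text = ET.tostring(schema, encoding="unicode")
print(xsd_text)
```

A sample of about this size, whether the developers write it by hand or you sketch it together, is usually enough to tell whether they want a structure-oriented or a message-oriented file.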

If there are still issues with the format or content of what you are providing, you may have to negotiate with them to provide enough content that they can develop what they need from what you deliver.

You can provide them a file that has all the specifications about a data element so that they can derive a process or object-focused structure for their needs, but they will need to design a new file by bringing together all these data elements.  You can work collaboratively with them on this, but make it clear that the model you are providing won’t instantly do the design for them.

You must be able to review their designs for compliance with the underlying business rules inherent in the data model. For instance, it is typical for a message or object to leave out certain elements that are more appropriate for persistence and historical/traceability needs.  So they can leave this stuff out and still be compliant.  But they can’t re-design business rules by ignoring the fact, for instance, that ITEMS often have multiple SKUs or GTINs. Or that RETAIL TRANSACTIONs sometimes result in no tendering being made (exchanges, credits, etc.).

The key, as in all deliverables of the data model, is to not be tempted to just produce a file based on tool defaults, throw it over the wall and tell others to deal with it.  I see these requests for derivations of the data model as great indicators of the value of your hard work.  There are those who would not come to you to request these files, but would just try to key the model in by reading a printout.  That’s labor intensive and mistake prone.  So treat all these requests for better exports from the model as compliments.  They are.


 

We'll be starting a quick book study / discussion here at InfoAdvisors ITBoards of the book Semantic Web For Dummies by Jeff Pollock.  The study will have a special mailing list / conference on this board, so that we can have our discussion without drowning out any other conversations. I'll be setting up that conference shortly.

Why a For Dummies book?

  • It's summer.  I will enjoy starting my reading about semantic technologies with lighter fare.
  • My goal is to establish a literacy level of semantic web and modeling terms, notations, and methods.
  • This will be the first of more books about semantics, ontologies, and semantic modeling.
  • I'm hoping it will go faster than a more in-depth work.

Why Semantic Web?

  • Much of the work involved in the Semantic Web and related technologies depends on understanding the meaning of concepts and building models about those concepts, then building systems that use those concepts.  Sound familiar?
  • Much of the work in establishing standards for the Semantic Web and related technologies could really benefit from our experience in defining concepts (not just database objects), rationalizing terms, negotiating disputes, documenting meanings, and general modeling foundations.  But we can't be part of that if we don't know what the Semantic Web is.
  • Ontologies play a big part in the initial analysis of semantic solutions.  I believe that we in the data management community are in a great position to lend our experiences here.
  • Understanding new technologies, notations, tools, and approaches is good for career management.  Meaning your career.

The book is described here (affiliate link) http://www.infoadvisors.com/Bookstore/SemanticTechnologiesModelsandOntologies/tabid/443/Default.aspx (Amazon US).  If you are ordering this book, you might also want to pick up our next book study book, Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL http://astore.amazon.com/infoadvsiorsirmb-20/detail/0123735564 (affiliate link), at the same time, because we will be starting that discussion right after the study of Semantic Web For Dummies is complete.

Note that the book is available for the Kindle.  I don't see it available on Safari Online.  There is a sample chapter, Chapter 3: The Data Web at Work for Business, available at http://www.semanticuniverse.com/semanticwebfordummies .

Let's plan on starting the book discussions of content around 10 August 2009.   The book has 5 parts, so we'll tackle them by parts instead of just chapters.

Once you have acquired your copy, go ahead and start reading. I think it will be a quick read, since most of these For Dummies books are fairly high level.  Please do reply to this post if you are planning on joining this discussion.

We have set up a conference/mailing list/newsgroup/web board for our book study on our Data Modeling board.  Free registration is required to access the Groups.  To create an account or to use your existing discussion groups account, visit the Data Modeling Board. If you'd like to subscribe to the mailing list for this book study, log in to the Data Modeling Board, then go to OPTIONS / MY MAILING LIST and check the box for SemanticWebForDummiesBookStudy.

Looking forward to discussing this important data-related topic with all of you.

 

Ever have one of those long, back and forth email exchanges where when you’re done you feel as if you’ve written a few chapters of an upcoming book?  I’m going to be taking a few of these from my archive to share with others who may be getting the same types of questions from non-modelers.

Today, we will be discussing the question of why resolving a many-to-many relationship requires a new relationship.  Why do we have to create a new entity?


This is a fairly classical data modeling concept.  It might be referred to as an intersectional entity, a resolution entity, an associative entity, or in short form, an M:N entity.  Silverston calls this concept an Intersection or Association entity.  Simsion and Witt call these Intersection entities, Associative entities, Resolution entities, or Relationship entities.  Riordan calls these Junction tables, even in the data model.  I’d say that the most common terms are Intersection or Associative entity, but I think it depends on what tool one uses and what types of data modeling books one reads.

A typical example might start out with a many-to-many relationship, say between cars and people:

PERSON >-|---owns----o-< CAR

This is a many-to-many relationship:  A person may own more than one car and a car may be owned by more than one person.  In my made-up example, my business rule is that a car has to be owned by at least one person (which is sort of bending the real world rules, but bear with me).  We’ll also ignore the fact that there are other relationships between cars and people.  We’ll focus just on ownership.

We can’t leave many-to-many relationships that way in a relational database, so we need to resolve them.  There are also normally real business reasons why they need to be resolved, but I’ll leave that for another discussion, too.

To resolve a many-to-many, you create a new entity, in my example, OWNERSHIP:

PERSON - ||---registers----0-< OWNERSHIP >-|-----is registered on-----||-CAR

OWNERSHIP keeps track of the relationship between a specific person and a car.  It becomes the list of just two things: a person and a car.

Karen owns car 1234

Kirstin owns car 2345

Rob owns car 1234

Rob owns car 3456

In this list, notice that Karen and Rob jointly own car 1234.  Car 2345 is owned only by Kirstin and car 3456 is owned only by Rob.  Karen owns only one car, Rob owns two, and Kirstin owns one.  We can also assume that Richard owns no cars (according to the data) because he has no entry in OWNERSHIP. 

In the attributed model, the entity OWNERSHIP would look like this:

OWNERSHIP

=======================

Person.PersonID (fk)

Car.CarID (fk)

…in a very simple world where we don’t worry about time.   The real world reason why we need these associative entities is because they almost always involve an aspect of time and other attributes, but we’ll ignore that for now.
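For readers who think more easily in tables than in diagrams, here is a minimal sketch of the same structure using Python's built-in `sqlite3` module. The table and column names are my own choices for illustration, and the sketch does not enforce the made-up rule that every car must have at least one owner; it only shows OWNERSHIP acting as the list of person and car pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE car (car_id INTEGER PRIMARY KEY);
-- The associative entity: one row per (person, car) pair.
CREATE TABLE ownership (
    person_id INTEGER NOT NULL REFERENCES person (person_id),
    car_id    INTEGER NOT NULL REFERENCES car (car_id),
    PRIMARY KEY (person_id, car_id)
);
""")

conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "Karen"), (2, "Kirstin"), (3, "Rob"), (4, "Richard")],
)
conn.executemany("INSERT INTO car VALUES (?)", [(1234,), (2345,), (3456,)])
# The same four ownership facts listed above.
conn.executemany(
    "INSERT INTO ownership VALUES (?, ?)",
    [(1, 1234), (2, 2345), (3, 1234), (3, 3456)],
)

# Who owns car 1234?
owners_of_1234 = [row[0] for row in conn.execute(
    "SELECT p.name FROM ownership o JOIN person p ON p.person_id = o.person_id "
    "WHERE o.car_id = 1234 ORDER BY p.name"
)]

# Who has no entry in OWNERSHIP at all?
people_without_cars = [row[0] for row in conn.execute(
    "SELECT name FROM person "
    "WHERE person_id NOT IN (SELECT person_id FROM ownership) ORDER BY name"
)]

print(owners_of_1234)
print(people_without_cars)
```

Each row in `ownership` needs both foreign keys to mean anything, which is the table-level version of needing both relationships in the model.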

Each foreign key in this associative entity came from one of the relationships from CAR and PERSON to OWNERSHIP.  That’s why we need two relationships.  We could not drop one of them, because each plays the part of associating the two concepts to each other, one pair at a time.

The PARTY AFFILIATION entity is a special case of the associative entity above because it started out (at least conceptually) as a recursive relationship (a relationship from an entity to itself).  These are more difficult to draw in ASCII data modeling, so I’ll just duplicate the entity:

PARTY >-o-----is affiliated with-----o-< PARTY

So just imagine the relationship being “dog eared” back to the same entity.

We created PARTYAFFILIATION  to do the same job as OWNERSHIP:

PARTY -||-------is affiliated via-----o-< PARTY AFFILIATION >-0----------is affiliated via-----||- PARTY

It would result in an associative entity that looked like this:

PARTYAFFILIATION

=====================

PARTY.PARTYID (fk)

PARTY.PARTYID (fk)

…which we can’t have, since the foreign key would have migrated twice with the same name.  So we rolenamed one of the relationships for it to be:

PARTYAFFILIATION

=====================

PARTY.PARTYID (fk)

PARTY.SUBPARTYID (fk)

Personally, I prefer to rolename both in these cases so it is very clear which role the foreign key is taking, but it works either way.
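The same point can be sketched with Python's built-in `sqlite3` module. The table names, column names, and party names below are assumptions for illustration only. The first statement shows that a table cannot carry the same column name twice, and the second shows the rolenamed version, where both foreign keys point back to PARTY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE party (party_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Migrating the key twice under the same name is rejected by the database.
try:
    conn.execute("CREATE TABLE bad_affiliation (party_id INTEGER, party_id INTEGER)")
    duplicate_allowed = True
except sqlite3.OperationalError:
    duplicate_allowed = False

# Rolenaming one foreign key lets both reference the same parent table.
conn.execute("""
CREATE TABLE party_affiliation (
    party_id     INTEGER NOT NULL REFERENCES party (party_id),
    sub_party_id INTEGER NOT NULL REFERENCES party (party_id),
    PRIMARY KEY (party_id, sub_party_id)
)
""")

# Hypothetical parties, purely for illustration.
conn.executemany(
    "INSERT INTO party VALUES (?, ?)",
    [(1, "Acme Holdings"), (2, "Acme Retail"), (3, "Acme Logistics")],
)
conn.executemany("INSERT INTO party_affiliation VALUES (?, ?)", [(1, 2), (1, 3)])

# Join PARTY twice, once per role, to read the affiliations back.
affiliations = conn.execute("""
SELECT parent.name, child.name
FROM party_affiliation AS pa
JOIN party AS parent ON parent.party_id = pa.party_id
JOIN party AS child  ON child.party_id  = pa.sub_party_id
ORDER BY child.name
""").fetchall()

print(duplicate_allowed)
print(affiliations)
```

The query has to join PARTY once per role, which is a good argument for rolenaming both keys: the column names then say which side of the affiliation each one represents.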

So if you check out your books on data modeling for the terms I mentioned in the beginning, you might come up with some more examples of why many-to-many relationships resolve to two relationships with an associative entity in the middle.

 


Copyright 2006-8 InfoAdvisors, Inc.