Sunday, March 11, 2012

DB Redesign - use junction/xref table?

I'm currently redesigning our db. Mind you I was NOT the designer of the
current db. I have the following tables:
CREATE TABLE [CUSTOMERS] (
[CID] [int] IDENTITY (1, 1) NOT NULL ,
[customerName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[customerID] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[address] [varchar] (200) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[city] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[state] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[zip] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[phone] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[email] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[fax] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[businessName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[companyID] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
) ON [PRIMARY]
GO
CREATE TABLE [TRANSACTIONS] (
[Transaction_ID] [int] NOT NULL ,
[CID] [int] NULL ,
[Customer_ID] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Customer_name] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Total_Collection_Amount] [money] NULL ,
[Account_Type] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[user_name] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
) ON [PRIMARY]
GO
CREATE TABLE [ADMIN] (
[user_name] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[company_ID] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
) ON [PRIMARY]
GO
Now, there are multiple orphaned records in TRANSACTIONS (I will be using
TRANSACTIONS.CID = CUSTOMERS.CID, which is a new column I added to the TRANS
table). I can't rely on using the CustomerID field to join them as identica
l
Customers.CustomerID exist (even when grouping by Customers.CompanyID).
The problem I'm facing is that I have multiple records in the TRANS table
for the same Customer, but have different Company_IDs. These are the orphane
d
records I speak of that do not have a corresponding record in Customers. Th
e
Company_ID is derived by joining Transactions.User_Name = Admin.User_Name.
An example of a record in my Transactions table:
Tran_ID Customer_Name Admin.CompanyID
123 Home Repair R9
124 Home Repair R11
The company I work for operates under different dba (doing business as)
names. That's how this has happened (hence a customer_name having multiple
Company_IDs). What I plan to do is insert Distinct
Transactions.Customer_Name into Customers, retrieve the CID then populate
this value into theTransactions.CID field. Then I can simply remove the
Transactions.Customer_Name field. Now without adding two records to my
Customers table, how can I overcome this design flaw?As frightened as I am by the concept of having a many to many relationship
between transactions and customers, if that is what you need, then you will
need to have a xref table with the customerId and transactionId. It seems
to me that what needs to be done is have a company table, then a customer or
doingBusinessAs table that relates to the transaction. More thought needs
to go into your design.
Keep in mind the key part of database design, every table should represent
one thing. This is the basis of normalization, and it seems to me that your
customers table, and even your transactions table might be representing > 1
thing at a time, which generally will cause you problems like this as too
many things relate to too many things.
ou probably ought to standardize your names customerName customer_name, only
one naming style (clearly you probably only want to see that attribute once
in the db anyhow.)
The same concern is with CID and Transaction_ID or how about TransactionId.
I like TransactionId, but the key is to not make your users guess how
something will be named.
I know this is kind of a lot to swallow at once, but think about this
statement:

> I'm currently redesigning our db. Mind you I was NOT the designer of the
> current db.
The goal will be to not have the next person say the same about you :)
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"Eric" <Eric@.discussions.microsoft.com> wrote in message
news:4E996864-39D1-4866-B2C4-71AA3991C005@.microsoft.com...
> I'm currently redesigning our db. Mind you I was NOT the designer of the
> current db. I have the following tables:
> CREATE TABLE [CUSTOMERS] (
> [CID] [int] IDENTITY (1, 1) NOT NULL ,
> [customerName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [customerID] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [address] [varchar] (200) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [city] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [state] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [zip] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [phone] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [email] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [fax] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [businessName] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [companyID] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> ) ON [PRIMARY]
> GO
>
> CREATE TABLE [TRANSACTIONS] (
> [Transaction_ID] [int] NOT NULL ,
> [CID] [int] NULL ,
> [Customer_ID] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [Customer_name] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [Total_Collection_Amount] [money] NULL ,
> [Account_Type] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> [user_name] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
> ) ON [PRIMARY]
> GO
> CREATE TABLE [ADMIN] (
> [user_name] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
> [company_ID] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
> ,
> ) ON [PRIMARY]
> GO
> Now, there are multiple orphaned records in TRANSACTIONS (I will be using
> TRANSACTIONS.CID = CUSTOMERS.CID, which is a new column I added to the
> TRANS
> table). I can't rely on using the CustomerID field to join them as
> identical
> Customers.CustomerID exist (even when grouping by Customers.CompanyID).
> The problem I'm facing is that I have multiple records in the TRANS table
> for the same Customer, but have different Company_IDs. These are the
> orphaned
> records I speak of that do not have a corresponding record in Customers.
> The
> Company_ID is derived by joining Transactions.User_Name = Admin.User_Name.
> An example of a record in my Transactions table:
> Tran_ID Customer_Name Admin.CompanyID
> 123 Home Repair R9
> 124 Home Repair R11
> The company I work for operates under different dba (doing business as)
> names. That's how this has happened (hence a customer_name having
> multiple
> Company_IDs). What I plan to do is insert Distinct
> Transactions.Customer_Name into Customers, retrieve the CID then populate
> this value into theTransactions.CID field. Then I can simply remove the
> Transactions.Customer_Name field. Now without adding two records to my
> Customers table, how can I overcome this design flaw?
>|||You seem to have a composite candidate key ( customerID, companyID ) which
uniquely identify a customer in a transaction, right? If they are
duplicated then you should start over.
Generally, it is impossible to give you an accurate solution unless your
business model is familiar to others in this forum. However based on your
narrative one could reasonably conclude that you have an under-normalized
schema. In other words, you have various instances where multiple entity
types are bundled up into single table, for instance your transaction table
seems to have information about both transactions as well as customers, and
perhaps about accounts as well.
Unless, your business model and rules are thoroughly analyzed, it is hard to
provide any substantial advice. In the meantime, consider learning the data
design fundamentals and apply them to the business model in hand. If this is
time critical, considering a professional hire might be worth it - that last
statement in Louis' post has the gist.
Anith|||Louis - thanks for your input. Just some follow up:
"It seems to me that what needs to be done is have a company table..."
There actually already is one. However, the original db designer (who I
might add is no longer w/the company), decided to join Transactions.User_Nam
e
= Admin.User_Name, where the Admin table also contains the user's CompanyID.
So each user in Admin belongs to a Company_ID (so then Admin.Company_ID =
Company.Company_ID)
"...the key part of database design, every table should represent
one thing." I understand this, hence it's why I'm now redesigning it.
"The goal will be to not have the next person say the same about you :)" I
couldn't agree w/you more.
As for a proposed xref table, are you suggesting something like this:
Xref Table Columns:
XID
CID
CompanyID
So now, my Customers table will no longer have a CompanyID. Instead the
relationship will be Customers.CID = XREF.CID Next, I will have a new colum
n
in my TRANSACTIONS table, so that XREF.XID = TRANSACTIONS.XID. Some sample
data:
Customers Table:
CID CustomerName
2 A1 Home
XREF Table
XID CID CompanyID
33 2 R9
34 2 R11
Transactions Table
TranID XID Amount
1 33 $1.00
2 34 $1.25
Thanks for your help
"Louis Davidson" wrote:

> As frightened as I am by the concept of having a many to many relationship
> between transactions and customers, if that is what you need, then you wil
l
> need to have a xref table with the customerId and transactionId. It seems
> to me that what needs to be done is have a company table, then a customer
or
> doingBusinessAs table that relates to the transaction. More thought need
s
> to go into your design.
> Keep in mind the key part of database design, every table should represent
> one thing. This is the basis of normalization, and it seems to me that yo
ur
> customers table, and even your transactions table might be representing >
1
> thing at a time, which generally will cause you problems like this as too
> many things relate to too many things.
> ou probably ought to standardize your names customerName customer_name, on
ly
> one naming style (clearly you probably only want to see that attribute onc
e
> in the db anyhow.)
> The same concern is with CID and Transaction_ID or how about TransactionId
.
> I like TransactionId, but the key is to not make your users guess how
> something will be named.
> I know this is kind of a lot to swallow at once, but think about this
> statement:
>
> The goal will be to not have the next person say the same about you :)
> --
> ----
--
> Louis Davidson - http://spaces.msn.com/members/drsql/
> SQL Server MVP
> "Arguments are to be avoided: they are always vulgar and often convincing.
"
> (Oscar Wilde)
> "Eric" <Eric@.discussions.microsoft.com> wrote in message
> news:4E996864-39D1-4866-B2C4-71AA3991C005@.microsoft.com...
>
>|||Actually, from your data here:

> Customers Table:
> CID CustomerName
> 2 A1 Home
> XREF Table
> XID CID CompanyID
> 33 2 R9
> 34 2 R11
> Transactions Table
> TranID XID Amount
> 1 33 $1.00
> 2 34 $1.25
The xref table is not really a simple many to many table. It is more of a
company allocation. It works, I think, since now both transactions are
allocated to customer 2, but tran1 is for their company r9, and tran2 is for
company r11.
--
----
Louis Davidson - http://spaces.msn.com/members/drsql/
SQL Server MVP
"Arguments are to be avoided: they are always vulgar and often convincing."
(Oscar Wilde)
"Eric" <Eric@.discussions.microsoft.com> wrote in message
news:204589FB-CC82-40D6-A4F8-ED4EA500E7A0@.microsoft.com...
> Louis - thanks for your input. Just some follow up:
> "It seems to me that what needs to be done is have a company table..."
> There actually already is one. However, the original db designer (who I
> might add is no longer w/the company), decided to join
> Transactions.User_Name
> = Admin.User_Name, where the Admin table also contains the user's
> CompanyID.
> So each user in Admin belongs to a Company_ID (so then Admin.Company_ID =
> Company.Company_ID)
> "...the key part of database design, every table should represent
> one thing." I understand this, hence it's why I'm now redesigning it.
> "The goal will be to not have the next person say the same about you :)"
> I
> couldn't agree w/you more.
> As for a proposed xref table, are you suggesting something like this:
> Xref Table Columns:
> XID
> CID
> CompanyID
> So now, my Customers table will no longer have a CompanyID. Instead the
> relationship will be Customers.CID = XREF.CID Next, I will have a new
> column
> in my TRANSACTIONS table, so that XREF.XID = TRANSACTIONS.XID. Some
> sample
> data:
> Customers Table:
> CID CustomerName
> 2 A1 Home
> XREF Table
> XID CID CompanyID
> 33 2 R9
> 34 2 R11
> Transactions Table
> TranID XID Amount
> 1 33 $1.00
> 2 34 $1.25
> Thanks for your help
> "Louis Davidson" wrote:
>|||Find the guy that did this and kill him.
Almost every VARCHAR(n) is totally wrong or absurd. There are not
keys. All columns can be NULL, so you cannot ever have keys.
CHAR(20) as a ZIP code' Everything is a VARCHAR(<< magic number >> )
in this world. Give me an example of that stuff. The rest of the
stinking crap uses "magic numbers: like VARCHAR(50) for anything.
Codes without validation, etc.
Columns are not fields!! This is FOUNDATIONS of RDBMS!! And the
definition of an identifier is that it is unique to each entity. This
is a disaster without any hope of data integrity.
You need to throw the whole damn thing and start over. Other people
will tell you the same thing in a nicer way (i.e. "As frightened as I
am by the concept of having .."), but I tend to be blunt.|||--CELKO-- wrote:

>Other people
>will tell you the same thing in a nicer way (i.e. "As frightened as I
>am by the concept of having .."), but I tend to be blunt.
>
LOL!
Understatement of the century.
*mike hodgson*
http://sqlnerd.blogspot.com|||Actually, it was a woman.
"--CELKO--" wrote:

> Find the guy that did this and kill him.
> Almost every VARCHAR(n) is totally wrong or absurd. There are not
> keys. All columns can be NULL, so you cannot ever have keys.
> CHAR(20) as a ZIP code' Everything is a VARCHAR(<< magic number >> )
> in this world. Give me an example of that stuff. The rest of the
> stinking crap uses "magic numbers: like VARCHAR(50) for anything.
> Codes without validation, etc.
>
> Columns are not fields!! This is FOUNDATIONS of RDBMS!! And the
> definition of an identifier is that it is unique to each entity. This
> is a disaster without any hope of data integrity.
> You need to throw the whole damn thing and start over. Other people
> will tell you the same thing in a nicer way (i.e. "As frightened as I
> am by the concept of having .."), but I tend to be blunt.
>|||It's hard to believe that people are still modeling basic customer tables
and relationships from scratch. It's like re-developing a bubble sort
algorithm or re-inventing the wheel.
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1137741113.315703.103120@.g44g2000cwa.googlegroups.com...
> Find the guy that did this and kill him.
> Almost every VARCHAR(n) is totally wrong or absurd. There are not
> keys. All columns can be NULL, so you cannot ever have keys.
> CHAR(20) as a ZIP code' Everything is a VARCHAR(<< magic number >> )
> in this world. Give me an example of that stuff. The rest of the
> stinking crap uses "magic numbers: like VARCHAR(50) for anything.
> Codes without validation, etc.
>
> Columns are not fields!! This is FOUNDATIONS of RDBMS!! And the
> definition of an identifier is that it is unique to each entity. This
> is a disaster without any hope of data integrity.
> You need to throw the whole damn thing and start over. Other people
> will tell you the same thing in a nicer way (i.e. "As frightened as I
> am by the concept of having .."), but I tend to be blunt.
>|||Could you provide a link to a standard design?
"JT" <someone@.microsoft.com> wrote in message
news:eo%23aP6dHGHA.3448@.TK2MSFTNGP10.phx.gbl...
> It's hard to believe that people are still modeling basic customer tables
> and relationships from scratch. It's like re-developing a bubble sort
> algorithm or re-inventing the wheel.
> "--CELKO--" <jcelko212@.earthlink.net> wrote in message
> news:1137741113.315703.103120@.g44g2000cwa.googlegroups.com...
>

No comments:

Post a Comment