Closed Thread Icon

Topic awaiting preservation: Database design question(s) (Page 1 of 1) Pages that link to <a href="https://ozoneasylum.com/backlink?for=30110" title="Pages that link to Topic awaiting preservation: Database design question(s) (Page 1 of 1)" rel="nofollow" >Topic awaiting preservation: Database design question(s) <span class="small">(Page 1 of 1)</span>\

 
DL-44
Lunatic (VI) Inmate

From: under the bed
Insane since: Feb 2000

posted posted 03-14-2008 15:38

So I am curious about the possible pros and cons of different ways of setting up a database for a project I have going.
The main purpose of the databse is to track shipments. A lot of them.
The main table will potentially have a *lot* of fields. These fields can be broken up into several top level categories, related to the type of information being held.
These fields will all be a one to one relationship with the primary key.

Is there any benefit in breaking up these fields (let's say potentially 40-50 fields or so...) into separate tables, based on the category they relate to, or would it be best to leave them all in one large table?

All of the data for each of these fields will be relatively small - several date fields, several numeric (currency) fields, and several varchar fields ranging in size from 5 characters to maybe 60 characters (no blobs or text fields).

There will, of course, be other relational tables in addition to this main table regardless of which way the main table is set up.

Any thoughts on this would be appreciated.

WebShaman
Lunatic (VI) Mad Scientist

From: Happy Hunting Grounds...
Insane since: Mar 2001

posted posted 03-14-2008 16:04

This is from someone that I know that has looked at your question, DL. They have more knowledge on these things than I do.

quote:
well, it looks like each field will be small, so memory and read/write probably won't be limiting the speed. If most db queries are going to essentially reassemble the split table it probably will be worse to split.

It's probably more a security/use decision



I hope that helps, for what it is worth.

WebShaman | The keenest sorrow (and greatest truth) is to recognize ourselves as the sole cause of all our adversities.
- Sophocles

DL-44
Lunatic (VI) Inmate

From: under the bed
Insane since: Feb 2000

posted posted 03-14-2008 16:47

I had been thinking along those lines.

Most of the queries will *not* be reading all fields, and will settle mostly, though certainly not exclusively, along the 'category' lines. Or at least "core part" plus "this category" and then elsewhere "core part" plus "this other category".

That's really the main reason for my dilemma - I know it will work either way, but not sure what would be best, or if it will matter anyway.

I guess at this point it's more of a semantics or best-practice issue than a technical one...

Tyberius Prime
Maniac (V) Mad Scientist with Finglongers

From: Germany
Insane since: Sep 2001

posted posted 03-14-2008 17:27

Keep data normalized. Things that are one-to-one belong into the same table, and the engines are very good at optimizing, especially if you have fixed size fields and indices at the right columns.

Arthurio
Paranoid (IV) Inmate

From: cell 3736
Insane since: Jul 2003

posted posted 03-14-2008 18:28

Just make sure you don't query more fields than you need if you use one table. And as TP suggested in the case of one-to-one relations it's ok to hold the data in one table. This might be obvious to you but avoid things like comma separated lists in a string field etc ... if something requires extra processing every time it's queried it's probably done incorrectly. Somehow to alot of people this isn't obvious.

DL-44
Lunatic (VI) Inmate

From: under the bed
Insane since: Feb 2000

posted posted 03-15-2008 17:29

Thanks.
This re-affirms my thinking. I had been reading quite a few tips while pondering this advising people to de-normalize...was starting to doubt my instincts.

I think this will work out well. Main table is 40+ fields, all fixed length varchar/date/numeric fields. Anything requiring text or blob inputs are in separate tables and are 1:n or n:n relationships.

WarMage
Maniac (V) Mad Scientist

From: Rochester, New York, USA
Insane since: May 2000

posted posted 03-16-2008 10:48

Having lots of columns is not a bad thing.

The one thing I like to think about during the design phase is possible changes to the system that might require a data migration. I really dislike having to write code to move the data around.

If you find that you have to write a really ugly SQL query, or do some weird front end processing to handle massage your data into a usable format on the front end you might want to think about doing things differently.

Also joins can be your friend. I find that many people do not use joins to their advantage, they treat every table as a completely separate entity.

code:
select o.* from orders o join addresses a on o.address_id = a.id



The above code is not something you should run away from. It can be very useful, and can reduce the amount of data duplication. Thinking about cases where o.billing_address_id = o.shipping_address_id you would not need to store two addresses.

Again, just think about possible future use cases, and avoiding the need to do a data migration and you should be set, and if it works, don't fix it.

Good luck.

Dan
Code Town | Zombie Head | How Much TP? | Feed The Blob

« BackwardsOnwards »

Show Forum Drop Down Menu