Any DBA's in the house?

Mike McGrath mmcgrath at redhat.com
Tue Jun 16 20:32:28 UTC 2009


On Tue, 16 Jun 2009, Draciron Smith wrote:

> MySQL doesn't have a problem with that many fields in a table, however
> it's usually a bad idea performance wise.
>
> What you want to do is look at the data and see how it's typically
> being used. You may want to normalize it down a little bit, especially
> if some of the fields are extremely large.
>
> Remember over normalizing is worse than under normalizing. I once saw
> a supposedly proffesional DBA normalize Sex. He used a 6 char ID field
> no less for a 1 char max value ROFL.
>
> The way it works is every time you do a join, you make the DB create
> temp tables. So when you over normalize especially on large record
> sets you are gutting your performance unless you are reducing
> redundancy. On the flip side pulling up data you are not using is a
> performance hit. Redundancy will eat up amazing amounts of disk space
> with large record sets as well.
>
> For example if your normalizing projects. You might expect to have
> fields like Proj name, Current Revision, Proj lead, people working on
> the project, Liscence type, etc.
>
> Proj Name obvously you normally won't normalize it. Revision though if
> you have project turnover and want to maintain a list of who was
> working on specific revisions you'll need to normalize that field.
> Proj lead if you can have multiple proj leads you an either have
> multiple proj lead fields or normalize that. People working on the
> project should of course be normalized. Liscence type it depends. If
> you are using abreviations better to just have the redundant data
> since your only going to have a max of 5 chars. If you are exeeding 10
> chars you normally want to normalize it.  With large recordsets on a
> hard hit server you may want to denormalize something like that to
> improve performance. If your cramped for space and performance is
> great you may want to normalize with even as few as 5 or 6 chars.
>
> As a rule you want to normalize all blob type fields which large
> amounts of text. Those are typically drill down items anyway, pulling
> up a million such fields will choke even the fastest servers.
> Performance wise and practicality reasons it's usually better to just
> use a Char feild than text or blob fields. You create a separate table
> with the forign key(s) of the tables you want to use this with. Then a
> sequence value so you an assemble the parts back together. It's a
> little more work on the front end submitting the data as you have to
> chop it up into 254 char blocks but it gives you almost infinite space
> and none of the wierd bugs and limitations that come with using Blob
> and text fields. The data is stored more efficently and it's far
> easier to search char fields than text/blob fields.  A table of char
> fields an be accessed by just about anything and every database driver
> will not only handle them but also handle them in the same way. Blob
> and Text fields are unsupported in many database drivers and are
> handled differently by different drivers.
>
> In my experience 15 to 30 fields are usually the largest tables I've
> worked with. I've seen larger, seen 100 field tables composed almost
> entirely of  forign keys. One relationship diagram we used an entire
> wall to fit it on.Covered the entire thing top to bottom. You almost
> had to use a magnifying glass to read the field names.  That's very
> unusual however. Most of the time screen real estate is such that your
> not displaying more than 10-15 values at a time anyway. Anything
> that's a drill down item should be normalized unless it's a main field
> for another view.
>
> I know a kinda long winded way of saying that depends on x :)
> Would be happy to look at it for you. Volenteered to do some DBA work
> in the past but never got sent the schema, given access to the DB or
> got a follow up so I assume somebody else wound up volenteering and
> got picked instead.
>

So that's a no....?

only kidding :).  Looks like I'll need to think on it some on how this
data will be used a bit more.

	-Mike

> On Mon, Jun 15, 2009 at 2:56 PM, Mike McGrath<mmcgrath at redhat.com> wrote:
> >
> > Is it better to have:
> >
> > id | smallString1 | smallString2 | smallString3 | largerString1
> >
> > or put the largerString1 in it's own table?  largerString1 I'm
> > specifically thinking about is for http://bugzilla.redhat.com/503550
> >
> > Also, does mysql have a problem with having a table with many columns in
> > it?  Where many== 20, 30, 40?
> >
> > I'm specifically talking about a table with around 1.2 million rows in it
> > which is growing at a rate of several hundred thousand rows every 6
> > months.
> >
> >        -Mike
> >
> > _______________________________________________
> > Fedora-infrastructure-list mailing list
> > Fedora-infrastructure-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list
> >
>
> _______________________________________________
> Fedora-infrastructure-list mailing list
> Fedora-infrastructure-list at redhat.com
> https://www.redhat.com/mailman/listinfo/fedora-infrastructure-list
>


More information about the Fedora-infrastructure-list mailing list