Optimizing your database with Paul Tuckfield

Paul Tuckfield spoke at the MySQL conference in April about optimizing YouTube.

Toward the end of the talk, Paul shifts to a mainly system-oriented focus on optimization and presents a few tips:

2 Responses to Optimizing your database with Paul Tuckfield

  1. David says:

    I’d like to see you take this a bit further and talk about the different types of RAID (10, 50, 2, 3) and which ones work best for which database. Also, is there still a focus on installing the data on a DB-native filesystem (installing on raw devices), or does that not come into play much anymore?

  2. sergew says:

    I’ve never used RAID2 or RAID3. The most popular RAID levels remain
    1, 5 and 10.

    RAID50 seems too odd a hybrid and I haven’t seen it be useful in
    applications I’ve worked on.

    RAID50 means building RAID5 sets of, say, 3 disks each and then
    striping across those sets in a RAID0 configuration. That means your
    minimal RAID50 would be 6 disks, with two disks’ worth of capacity
    used for parity.

    RAID5 is quite slow for writes because every write also has to
    update parity, so several disks are busy for each write. This becomes
    a real problem when your array exceeds 4 disks, because a full-stripe
    write then keeps five or more spindles busy. With RAID50, each write
    is confined to one of the smaller RAID5 sets, which reduces the
    number of disks involved. If each disk is 1 TB, the 6-disk RAID50
    above gives you 4 TB of usable storage, with no hot spares.

    You can contrast that with, say, a RAID10, which is very fast and
    favors protection. Using the same 6 disks described above, you’d
    end up with only 3 TB, but you’d have better performance and
    better protection in case of disk problems.

    This is important because it’s common to find that when one disk in
    a RAID array fails, another disk in the same enclosure will fail soon
    after.

    Ultimately, in large environments, you won’t find much use of RAID50
    because the industry has moved away from this model of
    storage. Organizations like Google use individual hosts as storage
    nodes and replicate data across nodes. In these cases you’re limited
    to the storage in your computer enclosure and you’re inclined to
    either treat the entire host as disposable (i.e. no protection) or
    focus on reliability (RAID1/RAID10).

    On the other end of the spectrum are SAN solutions, which use more
    complex RAID schemes that provide greater protection and scalability.

    And lastly, you’re seeing and will continue to see a move away from
    block-level RAID to hybrid LVM/filesystem solutions such as those
    found in Sun’s ZFS and the Hammer filesystem (which is still in
    development). In these schemes, the filesystem itself provides the
    RAID abstraction, which means the RAID can be more flexible (able to
    accommodate different-sized disks, resize dynamically, etc.) and
    provide faster data access.

    The Sun Thumper is designed around ZFS and has been reported to
    provide good speed at low cost.
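
The six-disk capacity comparison in sergew’s comment can be checked with a little arithmetic. What follows is only a minimal Python sketch, not something taken from the talk or the comment: the function name `usable_tb` and its parameters are made up for illustration, and it assumes equal-size disks, no hot spares, one parity disk per RAID5 set, and two-way mirrors for RAID10.

```python
def usable_tb(level: str, n_disks: int, disk_tb: float = 1.0,
              group_size: int = 3) -> float:
    """Usable capacity in TB for a few common RAID levels.

    Hypothetical helper for illustration only. Assumes equal-size disks,
    no hot spares, single parity per RAID5 set, two-way RAID10 mirrors.
    """
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        # One disk's worth of capacity goes to parity.
        data_disks = n_disks - 1
    elif level == "raid10":
        if n_disks < 4 or n_disks % 2 != 0:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        # Every disk is mirrored, so half the raw capacity is usable.
        data_disks = n_disks // 2
    elif level == "raid50":
        groups, leftover = divmod(n_disks, group_size)
        if group_size < 3 or groups < 2 or leftover != 0:
            raise ValueError("RAID50 needs 2+ equal RAID5 sets of 3+ disks")
        # Each RAID5 set gives up one disk's worth of capacity to parity.
        data_disks = groups * (group_size - 1)
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    return data_disks * disk_tb


if __name__ == "__main__":
    for level in ("raid5", "raid50", "raid10"):
        print(level, usable_tb(level, n_disks=6), "TB usable from 6 x 1 TB")
```

With six 1 TB disks this gives 4 TB for RAID50 (two 3-disk sets) and 3 TB for RAID10, matching the figures in the comment. It says nothing about performance or rebuild risk, which is where the comment argues the two layouts really differ.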
