silverwhisper's tags:
OK, i've blown most of yesterday & today massaging data in this frigging database. now that i'm looking at this legacy stuff, it's a damned nightmare. whoever designed the original app didn't know much about database architecture and was apparently only passingly acquainted with the concept of a relational database.

you might well ask, "gee ed, what (the hell) is 'massaging data'?"

yesterday, TS asked about a few things about which i was ranting and although i did (finally) provide an answer to his questions, i'm gonna enlarge upon that a bit now--mainly b/c it's still pissing me the hell off. :D

but to answer the question: in brief, it's manipulating the data so that all the fields are structured properly. it's the manipulation of data you have to do to old data before you let someone go and, well...manipulate it. you coax, you sort, you adjust, you fiddle...you massage.



[deep breath]



people who work with databases detest using fields in which the data doesn't fit into some kind of format if at all possible. "free form text" is a set of three four-letter words, as a very wise DBA once observed to me.

the legacy app is designed to hold 2 phone numbers and doesn't provide much space for users to add comments or notes. i'm seeing records that have phone numbers stuffed into address fields, the country field, the ZIP code--and of course, none of it in any standardized way, forcing me to search high & low in every field. and i've just had to add a new table to account for all of this data--fortunately, not a big deal logistically, but philosophically not at all desirable.

do you know how many ways it's possible to indicate that a phone# is temporary? i've counted at least 2 dozen.

and all of this--all of it!--could have been avoided had the guy who designed this app knew thing one about restricting user input.

that ignorant fuck!

ed

del.icio.us Digg reddit StumbleUpon

Comments

  • fearing said on Jan 11, 2008....
    Not even going to tease you about "massaging data" - you sound a little edgy. 
    I'm sorry this has become such a pain for you. 
  • TinSoldier said on Jan 11, 2008....
    So, two phone numbers isn't enough? That seems to be the major problem.

    Sounds like fun.

    I was thinking about something you said earlier, where you don't/won't have any control over the people who do the data entry. But the thought occurred to me that when you finish redesigning this database properly, the data entry portion will probably make a lot more sense and be less prone to the kinds of errors and workarounds that you seem to be struggling with.

    These are kind of fun to read, and I'm sure that I should learn how to create databases eventually.
  • gingersoul said on Jan 11, 2008....

    Ed.......I don't envy you ...no even a little bit....sorry this stuff is making you crazy....

    A glass of wine might help?...:-)

  • silverwhisper said on Jan 12, 2008....
    fearing: yeah, i was pretty irritated about it last night, i'll confess. but i'm doing better now. :>

    TS: well, if i do my job properly as a db designer, whoever comes after me will definitely have an easier job of it, no question! hell, the data's gonna be stored in access--there isn't even a real database back-end, but b/c access supports ODBC, moving the data into SQL or whatever real database they might use next is relatively easy, really. i'm glad you're finding them fun to read. you'd probably do well first to find a good book on relational databases before moving to an app-specific text, though.

    GS: thank you. :> as for wine, well, that was the night before. :D hm...db design while buzzed...hey, it might help!

    ed
  • kelly said on Jan 12, 2008....
    Tin, if you're interested in learning more you might try "Database Design for Mere Mortals," by Hernandez.  It's been a while since I looked at it but it seemed like a nice introduction yet beefy enough to allow you to do real work.
  • gingersoul said on Jan 12, 2008....

    Hold on, Ed......you can make it and looking back you will even think..

    "Tsk tsk, that was actually a piece of cake and that wine was so good"....:-)

  • TinSoldier said on Jan 12, 2008....
    Wait -- Access isn't a real database like MySQL or PostgreSQL or Oracle or such things?
  • crybabylu said on Jan 13, 2008....
    This has been quite an experience for you.
  • travelr712 said on Jan 13, 2008....
    wow! no wonder you're so pissed off! sounds like the guy who designed the database didn't understand the first thing about data input! i mean, you have to at least separate each catagory of data into its own set of fields, if not into a complete table! how can you structure a search for people who live in a particular zip code if the whole address is in the phone number field? that's just crazy! and yes, i've run into a couple examples like this, with over 10,000 data points. it's truly an exercise in patience to go back and restructure the data set to be proper. cudos to you, ed my man!
  • silverwhisper said on Jan 14, 2008....
    kelly: perhaps i should pick up a copy myself. :>

    GS: :>

    TS: no. access is used for small apps where you aren't routinely exceeding one million records/table or so. most businesses wouldn't build an enterprise app on it mainly b/c it simply isn't sufficiently robust, nor does it really give a DBA many tools, whereas of course SQL or oracle certainly do.

    dee: that's certainly one way of putting it... :>

    trav: it so very, very truly is exactly that...i've been splitting their 4 basic tables into the dozen or so that i'll be using. populating the HOC tables...that's gonna be interesting. heh...i'm refraining from putting the field data in proper case and reining in my instinct to design the phone fields to display int'l phone#s correctly... :> there's only 2 records so i figure it'll be OK. :>

    ed
  • humanbeing said on Feb 03, 2008....
    In defense of Microsoft Access, I once worked on a marketing system for a large telecommunications company.  There were over 5 millions records in a customer table in Microsoft Access v3.0, together with numerous other tables containing hundreds of thousands of records.  We optimized the queries and we're getting results as fast as any Oracle database could have done it on a similar platform.
     
    The database was used by 20 concurrent users via VB apps over a network.
    It needs tweaking to get that sort of performance, but it's not the 'toy' database that people think it is. 

Comment on "rant: free form text is 3 four-letter words"


(Separate tags using commas, for example: New York, dating, vegetarian)