Saturday, December 20, 2008

np-complete

i wrote an entry last week discussing perfection. nobody’s perfect (np) was not meant to take the track it did. the thoughts that were lingering were overtaken by the events of the day, so the writing followed the terrain rather than the map. the thing is, the map is still out there needing to be read. "no longer young rob", this one's for you.

in the space of a month, two of my best friends have told me that they are getting married. being invited to the weddings of the people you most want to see happy in the world is a good thing… right? it’s a time to realize that there is love in the world, and that people can find happiness. having people you love find love themselves should be the opposite of a death in the family; it should be the goodness that proves the world is a happy place.

this should be even more true when the two people are the ones you wondered would ever tie the knot. but somehow the impending happiness has not given me the warm and fuzzy feelings i expected it to. rather, their announcements have reminded me of complex algorithms with large data sets.

computational complexity theory is not something that most liberal arts majors turned software consultants trying to make a buck concern themselves with. it takes exposure to someone interested in considering and measuring these things to even know what algorithm analysis is. my practical approach had always been to take a new solution and test the algorithm with varying inputs that modeled the actual use. this works well when the inputs and results are well understood, but begins to break down as datasets grow.
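something like this is what i mean by testing with varying inputs. a rough sketch in python; the bubble sort and the input sizes are stand-ins for illustration, not anything from a real project:

    import random
    import time

    def bubble_sort(items):
        # classic quadratic sort: fine for small lists, hopeless for big ones
        items = list(items)
        for i in range(len(items)):
            for j in range(len(items) - 1 - i):
                if items[j] > items[j + 1]:
                    items[j], items[j + 1] = items[j + 1], items[j]
        return items

    # run the same algorithm over growing inputs and watch the curve
    # (the last size already drags, which is exactly the point)
    for n in (10, 100, 1000, 5000):
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        bubble_sort(data)
        print(n, time.perf_counter() - start)

the exact numbers are not the lesson, the shape is: a quadratic curve hides at small n and bites hard at large n.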

years ago i was standing over an engineer, watching him do algorithm analysis on a sorting problem. i asked what the issue was; there was someone else waiting for a code solution, and scribbling on paper was not getting that person his fix. the engineer looked up at me and explained that the current solution would not scale to big datasets. the practical side of my personality and the bullheaded side of my nature took over. i asked how many elements were in the current set… the answer was 6. i asked how many we could expect worst case, and with a shy look the engineer told me 100. even this was a stretch, but i loved this guy and i smiled. “do you think we can just code this up and not worry about it?”
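(for the record, the arithmetic backs the instinct: a quadratic sort over 100 elements is on the order of 100 * 100 = 10,000 comparisons, microseconds on any machine. at 6 elements it is nothing at all. the trouble only starts when n grows past what you planned for.)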

that’s what we did; the waiting person got his fix, and things moved forward. the thing is, i have come back to this moment in my career over and over. i didn’t listen all the way through. i simply pushed for a solution, which at the time was the right thing, but i lost the opportunity to grasp the implications of driving large datasets through an inherently inefficient algorithm. the larger loss is not having learned how to prove the efficiency of the algorithm in the first place.

being able to accept that i am going to production with a less than perfect algorithm is part of my personality and my career. just-in-time development, dealing with the problem at hand and working in a highly triage-based method, is part of who i am. return on investment and risk versus reward are central themes. i will not spend time tuning to save cpu if i am not short on cpu; hardware is cheap, and scaling with hardware is easier than spending the time to analyze and find perfection.

the truth is, there is almost never a perfect solution. solutions need to be created based on the situation, on the conditions, and on the problems you are trying to solve. i now realize the sets of data have grown. i can no longer rely on having a very limited set of data that is easy to look at and analyze. the cost of analysis grows with the size of the data, and the need for efficiency now outweighs the need for perfection.

the need for perfection is most likely not a requirement in the first place. jpeg is my preferred graphic file format, and it is a lossy algorithm. it is the standard that most people use for all their graphics needs, and its popularity is partially due to its balance of quality against expense. the less than perfect results are an acceptable price for lower resource consumption. in many cases this is more than acceptable; it’s the only way to solve the problem.
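you can watch that trade happen. a quick sketch with the python imaging library (PIL/pillow); the save call and its quality option are real, but "photo.png" and the quality steps are just stand-ins:

    import os
    from PIL import Image

    img = Image.open("photo.png").convert("RGB")
    for quality in (95, 75, 50, 25):
        name = "out_q%d.jpg" % quality
        # jpeg throws detail away on save; lower quality, smaller file
        img.save(name, "JPEG", quality=quality)
        print(name, os.path.getsize(name), "bytes")

every step down the quality ladder discards information you cannot get back, and for most uses nobody ever notices.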

i have best friends who have selected the most important algorithm of their lives; they have either limited the set of data or found an efficient way to get to the quality that they need. the world is a large and complex place. in two weeks it is very possible all three of us will be in one place, and i am going to force a discussion of algorithm analysis to take place.

nobody’s perfect, but deciding on what is good enough is what we will help each other with.


it still didn't go the way i expected... i love writing.
