Friday, April 5, 2013

Binary search trees: structure for data retrieval





                                If we recall from earlier in the semester, we discussed a method for discovering the similarity of two different documents through a process that went through shingling, minhashing, and then local sensitive hashing.  Briefly we discussed hash tables and how they are used in mapping certain keys to values into a “signature matrix”. In homework 3 question two, we were required to perform a minhash on a data table given a certain order of rows. Essentially, it was a method for data retrieval. I would like to introduce a new type of data structure called a binary search tree which has certain characteristics that might be more useful for data retrieval than a typical hash table when searching under certain conditions.





I have drawn most of my understand on binary search trees from this video lecture from UC Berkeley's computer science department. Essentially, a binary search tree is a type of data structure which can be described as having 4 specific characteristics. Directly from Wikipedia, those characteristics are the following:

  1. The left subtree of a node contains only nodes with keys less than the node's key.
  2. The right subtree of a node contains only nodes with keys greater than the node's key.
  3. Both the left and right subtrees must also be binary search trees.
  4. There must be no duplicate nodes.
       To clear up some of the jargon, essential a key is some predefined value for which the search is based on. In the video this value is numeric. A node is a position along the tree which has a key value assigned to it.
     Say a search is running on some binary search tree, it will begin with the top , or root node, will check the key value of that root node, and decide if the key value being searched for is greater than or equal to the key of that node. If it is less, the search will go to the bottom left sub-tree  and if the key is greater, it will go to the bottom right sub-tree. After the search is sitting on the new node, it will repeat the previous process until it finds the key. If it doesn't find the key, it will return a null, and most algorithms can be written so that if a null is returned, the key that was being searched on can be indexed into where the null was previously located. 
      The interesting thing about binary trees vs hash tables is that a binary tree is better suited to find inexact matches as where hash tables can more efficiently withdraw exact matches to some particular key. In the video, the lecturer mentions two keys that will be found if they exist in the tree given that someone is searching for some arbitrary key with value k which does not  exist in the tree. These two nodes are as follows.
  1. Node containing the smallest key value greater than k.
  2. Node containing the largest key value less than k.

     If you pay attention to the lecture at time 20:11 in the video, the lecturer gives a more visual explanation of what I just described. Imagine you had a binary tree full of information and were searching for a key value you knew did not exist, a standard find algorithm run across this type of data structure will find, if they exist, the two nodes containing key values closest to the one which was being searched for. It is possible that meaningful information might exist in those two nodes. Hence, from my take on the advantage of binary search trees, the characteristics as a type of data structure good for finding inexact but potentially very closely related information is demonstrated. Please watch the video for a more in depth understanding.

Sources:




18 comments:

  1. Hello! This is my first visit to your blog! We are a team of volunteers and starting a new initiative in a community in the same niche. Your blog provided us useful information to work on. You have done an outstanding job.

    AWS Online Training | Online AWS Certification Course - Gangboard
    AWS Training in Chennai | AWS Training Institute in Chennai Velachery, Tambaram, OMR

    ReplyDelete
  2. Superb. I really enjoyed very much with this article here. Really it is an amazing article I had ever read. I hope it will help a lot for all. Thank you so much for this amazing posts and please keep update like this excellent article. thank you for sharing such a great blog with us.

    angularjs Training in chennai
    angularjs Training in chennai

    angularjs-Training in tambaram

    angularjs-Training in sholinganallur

    angularjs-Training in velachery

    ReplyDelete
  3. The site was so nice, I found out about a lot of great things. I like the way you make your blog posts. Keep up the good work and may you gain success in the long run.
    python training in tambaram | python training in annanagar | python training in jayanagar

    ReplyDelete
  4. Nice post. By reading your blog, i get inspired and this provides some useful information. Thank you for posting this exclusive post for our vision. 
    advanced excel training in bangalore

    ReplyDelete
  5. Your good knowledge and kindness in playing with all the pieces were very useful. I don’t know what I would have done if I had not encountered such a step like this.
    Java training in Marathahalli | Java training in Btm layout

    Java training in Jaya nagar | Java training in Electronic city

    ReplyDelete
  6. Your good knowledge and kindness in playing with all the pieces were very useful. I don’t know what I would have done if I had not encountered such a step like this.
    Data Science training in Chennai | Data science training in bangalore

    Data science training in pune | Data science online training

    Data Science Interview questions and answers

    ReplyDelete
  7. Do you have a spam issue on this website; I also am a blogger, and I wanted to know your situation; many of us have developed some nice methods, and we are looking to trade methods with others
    nebosh course in chennai

    ReplyDelete
  8. Hello, I read your blog occasionally, and I own a similar one, and I was just wondering if you get a lot of spam remarks? If so how do you stop it, any plugin or anything you can advise? I get so much lately it’s driving me insane, so any assistance is very much appreciated.
    Android Course Training in Chennai | Best Android Training in Chennai
    Selenium Course Training in Chennai | Best Selenium Training in chennai
    Devops Course Training in Chennai | Best Devops Training in Chennai

    ReplyDelete
  9. Wow, what an awesome spot to spend hours and hours! It's beautiful and I'm also surprised that you had it all to yourselves! Kindly visit us @ Best HIV Treatment in India | Top HIV Hospital in India | HIV AIDS Treatment in Mumbai
    HIV Specialist in Bangalore | HIV Positive Treatment in India | Medicine for AIDS in India

    ReplyDelete