Analytics and Visualization of Big Data: Binary search trees: structure for data retrieval

Friday, April 5, 2013

Earlier in the semester, we discussed a method for discovering the similarity of two documents through a process of shingling, minhashing, and then locality-sensitive hashing. We briefly discussed hash tables and how they are used to map keys to values when building a “signature matrix”. In homework 3, question two, we were required to perform a minhash on a data table given a certain order of rows. Essentially, it was a method for data retrieval. I would like to introduce a different type of data structure, the binary search tree, which has characteristics that can make it more useful for data retrieval than a typical hash table under certain search conditions.

I have drawn most of my understanding of binary search trees from this video lecture from UC Berkeley's computer science department. Essentially, a binary search tree is a data structure that can be described by four specific characteristics. Directly from Wikipedia, those characteristics are the following:

The left subtree of a node contains only nodes with keys less than the node's key.
The right subtree of a node contains only nodes with keys greater than the node's key.
Both the left and right subtrees must also be binary search trees.
There must be no duplicate nodes.
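
As a rough illustration of these four characteristics, here is a minimal Python sketch. The names `Node` and `is_bst` are my own and do not come from the lecture or from Wikipedia, and the keys are assumed to be integers. The check passes a lower and an upper bound down the tree, so every key in a left subtree must be strictly smaller than its ancestor's key and every key in a right subtree strictly larger, which also rules out duplicates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration: one key plus two optional children.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every key in this subtree lies strictly between low and high."""
    if node is None:
        return True  # an empty subtree is trivially a valid binary search tree
    if low is not None and node.key <= low:
        return False  # key too small for this position (or a duplicate)
    if high is not None and node.key >= high:
        return False  # key too large for this position (or a duplicate)
    # Left keys must stay below node.key, right keys must stay above it.
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


valid = Node(8, Node(3, Node(1), Node(6)), Node(10))
invalid = Node(8, Node(3, Node(1), Node(9)), Node(10))  # 9 sits in the left subtree of 8

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```

Note that the second tree looks fine if you only compare each node with its direct children; it fails because the whole left subtree of 8 must contain only keys less than 8.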

To clear up some of the jargon: a key is the value on which the search is based. In the video this value is numeric, but any values that can be put in order would work. A node is a position in the tree that has a key assigned to it.

Say a search is running on some binary search tree. It begins at the top, or root, node and compares the key being searched for with the key of that node. If the two are equal, the search is done. If the search key is less, the search moves to the left subtree; if it is greater, it moves to the right subtree. At the new node it repeats the same comparison until it finds the key. If it reaches an empty position without finding the key, it returns null, and most algorithms can be written so that the key being searched for is then inserted at the position where the null was found.

The interesting thing about binary search trees versus hash tables is that a binary search tree is better suited to finding inexact matches, whereas a hash table can more efficiently retrieve exact matches for a particular key. In the video, the lecturer mentions two nodes that a search will encounter, if they exist in the tree, when someone searches for some arbitrary key k that does not exist in the tree. These two nodes are as follows.

Node containing the smallest key value greater than k.
Node containing the largest key value less than k.

If you pay attention to the lecture at time 20:11 in the video, the lecturer gives a more visual explanation of what I just described. Imagine you had a binary search tree full of information and were searching for a key you knew did not exist. A standard find algorithm run on this data structure will pass through, if they exist, the two nodes whose keys are closest to the search key on either side. It is possible that meaningful information exists in those two nodes. To my mind, this is the main advantage of binary search trees: they are well suited to finding inexact but potentially very closely related information. Please watch the video for a more in-depth understanding.
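
To make the inexact-match idea concrete, here is a small Python sketch of a find that remembers the two neighboring keys as it descends. It is my own illustration rather than the lecturer's code: `Node` and `nearest_keys` are hypothetical names, keys are assumed to be integers, and returning the key twice on an exact match is simply a choice I made for this example.

```python
class Node:
    # Hypothetical node type for illustration.
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def nearest_keys(root, k):
    """Return (largest key less than k, smallest key greater than k).

    Either entry is None when no such key exists in the tree.
    """
    below = above = None
    node = root
    while node is not None:
        if k < node.key:
            above = node.key  # best candidate so far for "smallest key greater than k"
            node = node.left
        elif k > node.key:
            below = node.key  # best candidate so far for "largest key less than k"
            node = node.right
        else:
            return node.key, node.key  # exact match (my own convention for this sketch)
    return below, above


tree = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))

print(nearest_keys(tree, 7))   # (6, 8)
print(nearest_keys(tree, 0))   # (None, 1)
print(nearest_keys(tree, 20))  # (14, None)
```

A plain hash table offers no comparable operation, because hashing deliberately scatters keys that are close in value.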

Sources:

http://en.wikipedia.org/wiki/Binary_search_tree
