Abstract- The recent success of supervised and unsupervised common object discovery motivates us to focus this paper on entirely unsupervised common object discovery. Object co-localization is a generic term for the process of jointly locating objects of the same type across a set of photographs. Traditional object detection and localization techniques frequently require bounding box annotations of object instances or, at the very least, image-level labels indicating the presence or absence of objects in an image. Our entirely unsupervised technique searches a set of unannotated photos for images that contain similar objects and then locates those objects in the matching images. No prior knowledge of the common objects is required for this unsupervised object discovery task, which is formulated as a sub-graph mining task on a weighted graph of object proposals. Sub-graphs of densely connected nodes, each capturing a single object pattern, are discovered together with the positive images and the common objects. Speech and text are the two principal modes of human communication. Blind individuals can gather information by voice, and with the assistance of this work visually impaired individuals can have the text in a captured image read aloud to them. In this project we take photographs with a camera and then process them further using the ImageMagick software. The text is converted to speech using a TTS (Text to Speech) engine. According to scientific research, blind people stand to benefit from the analysis of a variety of photos.
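To make the sub-graph mining formulation more concrete, the following is a minimal sketch of one standard way to extract a densely connected sub-graph from a weighted graph: greedy peeling, which repeatedly removes the node with the smallest weighted degree and keeps the densest intermediate node set. This is an illustrative stand-in and not necessarily the mining procedure used in this paper. The function name `densest_subgraph`, the proposal identifiers, and the similarity weights are all hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def densest_subgraph(weights: Dict[Edge, float]) -> Tuple[Set[str], float]:
    """Greedy peeling on an undirected weighted graph.

    `weights` maps each undirected edge (listed once) to a similarity score.
    Density is defined here as total edge weight divided by node count.
    Returns the densest node set seen while peeling, and its density.
    """
    # Build a symmetric adjacency map from the edge list.
    adj: Dict[str, Dict[str, float]] = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w

    nodes = set(adj)
    if not nodes:
        return set(), 0.0

    # Weighted degree of each node within the current node set.
    degree = {n: sum(adj[n].values()) for n in nodes}
    total = sum(weights.values())

    best_nodes, best_density = set(nodes), total / len(nodes)
    while len(nodes) > 1:
        # Remove the weakest node (ties broken by name for determinism).
        worst = min(nodes, key=lambda n: (degree[n], n))
        nodes.remove(worst)
        total -= degree[worst]
        for neighbour, w in adj[worst].items():
            if neighbour in nodes:
                degree[neighbour] -= w
        density = total / len(nodes)
        if density > best_density:
            best_nodes, best_density = set(nodes), density
    return best_nodes, best_density


# Hypothetical proposals: a1, b1, c1 come from three different images and
# look alike; n1 and n2 are background proposals with weak similarity.
example_weights = {
    ("a1", "b1"): 0.9,
    ("a1", "c1"): 0.8,
    ("b1", "c1"): 0.85,
    ("a1", "n1"): 0.1,
    ("c1", "n2"): 0.05,
}
common, density = densest_subgraph(example_weights)
```

In this toy graph the three mutually similar proposals form the dense sub-graph, while the weakly connected background proposals are peeled away first. In the setting described above, the images that contribute a node to the recovered sub-graph would play the role of the positive images.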