Hi, I am just wondering how I would go about using Objective-C to analyze a photo and display text summarizing or describing its key points, main idea, etc. It's an algorithm similar to how Google's search by image works, but without the search, and more about analyzing by shapes, color, etc. Would I need a database of words too, since the code will need something to display the summary?
Writing a program to analyze a photograph and to extract information in order to accurately describe what’s in the photograph is not a trivial task.
If you really want to do this, first you need to learn serious digital signal processing, especially image processing and pattern recognition, because a picture is nothing but a 2D array of samples of a scene at a particular point in time.
This is a challenging problem and there is no deterministic algorithm to solve it. One can only hope to design or find an algorithm that does the job reasonably well.
To get an appreciation of the problem, just take your camera and go near your utility meters, water or electricity, snap a photo of the counter on one of them, and upload the picture onto your computer.
You can look at the picture and read the numbers in the counter effortlessly; this is a remarkable feat. You can do this because you are endowed with the most amazing pattern-recognition processor, thanks to your eyes and brain. But what sort of algorithm do you need to design to achieve this remarkable feat?
This is the sort of challenge you should tackle only if you are not afraid of mathematics because digital signal processing relies on mathematics.
You know, your post is exactly what I needed, because I have been to forum after forum and people have been telling me to give this idea up because it can't be done or it would just take too long, especially for one person to code. Thank you for the info and motivation, but besides signal processing, can you walk me through a rundown of what I would need to code/do for this app to summarize a photo?
Actually, what exactly are you trying to do? Is your intention to really analyse any picture and describe what’s in it or to do something maybe much simpler?
Assuming that you really want to analyse a picture in order to describe what’s in it, here is what you need to do.
The hard part
- Find or design your pattern recognition algorithm, which can recognise patterns in a picture.
For example, a pattern can be as simple as a rectangular shape or as complex as someone picking their nose. The important thing to recognise is that there is no generic pattern recognition algorithm. You can only hope to design or find one that is good at recognising only certain classes of patterns.
The easy part
- Write the code for a pattern-recognition function that implements your algorithm.
- Construct a two-dimensional array from a photo; this will just be an array of color and brightness values for each pixel.
- Transform this array into the form that your pattern-recognition function expects as input.
- Call your pattern-recognition function with this input; at the end of its computation, this function will give you a list of patterns.
- Go through the list of patterns and decide what to do with them.
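To make the flow of the easy part concrete, here is a minimal C++ sketch. Everything in it is an assumption made for illustration: the names `Grid`, `Pattern`, `recognisePatterns`, and `describe` are hypothetical, and the "algorithm" is a trivial placeholder (mean brightness above 0.5), not a real pattern recogniser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration; none of these come from a real library.
using Grid = std::vector<std::vector<double>>;  // brightness per pixel, row by row

struct Pattern {
    std::string name;  // what was found
    double score;      // how strongly it was detected, in [0, 1]
};

// Placeholder "algorithm": reports one pattern when the mean brightness is high.
std::vector<Pattern> recognisePatterns(const Grid& input) {
    double sum = 0.0;
    std::size_t count = 0;
    for (const auto& row : input) {
        for (double value : row) {
            sum += value;
            ++count;
        }
    }
    std::vector<Pattern> found;
    if (count > 0) {
        const double mean = sum / static_cast<double>(count);
        if (mean > 0.5) {
            found.push_back(Pattern{"mostly bright", mean});
        }
    }
    return found;
}

// Last step: go through the list of patterns and turn it into text.
std::string describe(const std::vector<Pattern>& patterns) {
    if (patterns.empty()) {
        return "nothing recognised";
    }
    std::string text;
    for (const auto& p : patterns) {
        if (!text.empty()) {
            text += ", ";
        }
        text += p.name;
    }
    return text;
}
```

Calling `describe(recognisePatterns(grid))` runs the last two steps of the list. All the real work would go into replacing the body of `recognisePatterns`.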
This will take you quite a long time if you attempt to do it all on your own. First you need to train yourself in digital signal processing (image processing + pattern recognition), assuming that you are already good at the mathematics required for this (Linear Algebra, Filtering, Fourier series, Fourier Transforms, etc.). Then you need to write the code, etc.
This is not impossible to do but requires time, energy, financial resources, and determination.
When you say -Construct a two dimensional array from a photo, this will just be an array of colors and brightness values for each pixel.- does that mean I have to make an array for each individual pixel in the image? It seems that in one picture there are thousands of pixels, especially for an HD image on an iPhone. Or would I use some kind of loop in the code to handle each individual pixel? Or did you simply mean one array to govern all the pixels' colors? Sorry, I just need to know the specifics so I know exactly what I'm dealing with. As for the digital signal processing, it helps that I major in electrical and computer engineering.
And since you said there is no generic algorithm to encompass all possible image structures, what if I put multiple algorithms together for different patterns and have the software choose which algorithm to use based on the image? That should increase the chances of providing a more accurate summary compared with using one generic algorithm for all combinations of images. The only downsides I see are how many algorithms I would need in that case and, in terms of speed, how efficiently it could go through those algorithms to choose the correct one. Most likely it will have to test each algorithm and somehow pick which one is the best? Wow, this sounds really complicated, but fun.
[quote]When you say -Construct a two dimensional array from a photo, this will just be an array of colors and brightness values for each pixel.- does that mean I have to make an array for each individual pixel in the image? It seems that in one picture there are thousands of pixels, especially for an HD image on an iPhone. Or would I use some kind of loop in the code to handle each individual pixel? Or did you simply mean one array to govern all the pixels' colors? Sorry, I just need to know the specifics so I know exactly what I'm dealing with. As for the digital signal processing, it helps that I major in electrical and computer engineering.[/quote]
Conceptually, a picture is just a two dimensional array of pixel values:
[code]
struct Color {
    typedef double value_type;
    value_type red, green, blue, alpha;
};
struct Pixel {
    Color color;
};
// The picture (WIDTH and HEIGHT are the image dimensions in pixels)
Pixel picture [WIDTH][HEIGHT];
[/code]
However, to get a good grasp on this, read the spec for the raw picture-data your camera produces.
Also, for this project it is better to use C or C++ rather than Objective-C.
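To answer the loop question concretely: one array covers the whole picture, and a loop visits each pixel in turn. Below is a minimal C++ sketch under a few assumptions of mine: the names `Picture`, `brightness`, and `toBrightness` are hypothetical, the pixels are kept in one flat `std::vector` (row by row) so the size can be chosen at run time, and brightness is computed with the Rec. 601 luma weights, which is only one common choice.

```cpp
#include <cstddef>
#include <vector>

struct Color {
    double red, green, blue, alpha;  // each channel assumed to be in [0, 1]
};

// One array holds the whole picture: pixels are stored row by row.
struct Picture {
    std::size_t width;
    std::size_t height;
    std::vector<Color> pixels;  // expected size: width * height

    // Access the pixel at a given row and column.
    const Color& at(std::size_t row, std::size_t col) const {
        return pixels[row * width + col];
    }
};

// Brightness (luma) of a single pixel using the Rec. 601 weights.
double brightness(const Color& c) {
    return 0.299 * c.red + 0.587 * c.green + 0.114 * c.blue;
}

// A single loop visits every pixel; no per-pixel array is needed.
std::vector<double> toBrightness(const Picture& pic) {
    std::vector<double> out;
    out.reserve(pic.pixels.size());
    for (const Color& c : pic.pixels) {
        out.push_back(brightness(c));
    }
    return out;
}
```

Even a multi-megapixel photo is still just one array here; the loop does the per-pixel work.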
Yes, using multiple algorithms is the way to go.
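One simple way to combine them, sketched below in C++, is to run every algorithm and keep the most confident answer. This assumes each algorithm can report a confidence value in [0, 1], which is my assumption and not a given; the names `Guess`, `Recogniser`, and `bestGuess` are hypothetical.

```cpp
#include <functional>
#include <string>
#include <vector>

using Grid = std::vector<std::vector<double>>;  // brightness per pixel

// Hypothetical result type: a label plus a confidence in [0, 1].
struct Guess {
    std::string label;
    double confidence;
};

// Each algorithm is wrapped as a function from an image to a guess.
using Recogniser = std::function<Guess(const Grid&)>;

// Run every recogniser on the image and keep the most confident guess.
Guess bestGuess(const std::vector<Recogniser>& recognisers, const Grid& image) {
    Guess best{"unknown", 0.0};
    for (const auto& recognise : recognisers) {
        const Guess g = recognise(image);
        if (g.confidence > best.confidence) {
            best = g;
        }
    }
    return best;
}
```

The cost grows linearly with the number of algorithms, since each one runs once per image. Whether that is fast enough depends on how expensive each algorithm is.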
But, before tackling this project head-on, make sure that you understand the full magnitude of the problem by solving a few simple problems first.
For example, given a 2D array of numbers, in the range [0, 1], determine if it contains:
- a triangular boundary;
- a rectangular boundary; or
- a circular boundary.
A boundary is a closed sequence of row-column index pairs denoting the points where the array has the same value (zero, for example).
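As a starting point for the rectangular case, here is a deliberately simplified C++ sketch. It assumes the boundary value is zero, that all rows have the same length, and that the rectangle is axis-aligned with at least one interior cell; rotated rectangles and noisy data are not handled. The function name `hasRectangularBoundary` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Returns true when the zero-valued cells form exactly the outline of an
// axis-aligned rectangle that encloses at least one interior cell.
bool hasRectangularBoundary(const Grid& g) {
    const std::size_t rows = g.size();
    const std::size_t cols = rows ? g[0].size() : 0;
    std::size_t top = rows, bottom = 0, left = cols, right = 0;
    bool any = false;

    // Pass 1: find the bounding box of all zero-valued cells.
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            if (g[r][c] == 0.0) {
                any = true;
                top = std::min(top, r);
                bottom = std::max(bottom, r);
                left = std::min(left, c);
                right = std::max(right, c);
            }
        }
    }
    if (!any || bottom - top < 2 || right - left < 2) {
        return false;  // no zeros, or the box is too thin to have an interior
    }

    // Pass 2: a cell must be zero if and only if it lies on the box's edge.
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            const bool inBox = r >= top && r <= bottom && c >= left && c <= right;
            const bool onEdge =
                inBox && (r == top || r == bottom || c == left || c == right);
            if ((g[r][c] == 0.0) != onEdge) {
                return false;
            }
        }
    }
    return true;
}
```

The triangular and circular cases are harder because their edges do not line up with rows and columns, which is part of what makes the exercise instructive.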
Bro, I just wanna thank you for all the advice. Undertaking a project of this magnitude while going to school will be difficult, time-wise, but I will try to keep you posted. I just hope you don't leave these forums over the next 2 years (lots of research to be done, coding-wise, since I have a handle on the math stuff).
By the way, I don't want this thread to be locked, since I am 100% sure I am going to try this even if I have a lot of failures throughout the process, which is why I need it open to post stuff.