Quantcast

Jump to content


Photo

[Tutorial] Python's Guide To Image Matching (and Mystery Pic)

python tutorial mystery pic

  • This topic is locked This topic is locked
2 replies to this topic

#1 juvian

juvian
  • 123 posts


Users Awards

Posted 11 July 2015 - 05:56 PM

Ever wondered how some people know the mystery pic answer? Well, some are born with supernatural powers that allows them to realize which image is it just by the colors in it, but unfortunately I was not born with them so as soon as they released mystery pic again after months of absence, I started to wonder if it was possible to make a program that solves it. Of course a program can solve it, but wanted to have a solution that took less than a minute to solve it, so that's the hard part ;).

Even if it sounds hard, DON'T PANIC. If you read enough and try hard, all is possible. So, after some weeks of programming and testing a looot of possible solutions, I'm finally happy with the results and wanted to share them to the community.


I did this with python, with the help of several libraries:

  • PIL (pillow) -> this handles image reading convertion to rgb
  • urllib -> to download images
  • ThreadPool -> to make some things faster with pools
  • numpy -> this does a lot of array handling in c++ to make it faster


1) Downloading an item image from the internet:

def doWork(img): # download image and save on images folder with image name
	urllib.urlretrieve("http://images.neopets.com/items/"+img, "../images/"+img)


def downloadImages(arr): # pass array of item image names to download

	print "downloding " + str(len(arr)) + " images"

	p = ThreadPool(processes=50) # makes it faster

	p.map_async(doWork, arr).get(999999999)


2) Reading an image and converting to rgb:

def ImagetoRgb(name): # pass name of the image inside our images folder, open and return rgb conversion

    return Image.open("../images/"+name).convert('RGB')

3) Figuring out mystery pic "real" dimensions: (even if its a 150x150 px image, we actually only care about 5x5). The idea is that we check every row and every column, to check in how many pixels it changes color. When it changes, we save this count if it is lower than our max. So even if the image has repetead pixels, as we only pick the lowest distance from one color to another, we will actually pick the right count: 150/5 = 30


 

def getImagetoSolve(link):
	im = Image.open(link).convert('RGB') #open our image
	imx = im.load() // load pixel data

	minCount = 9999 # we start with a big number of our lowest pixel distance

	for x in range(im.size[0]): # img.size[0] is the width
		difColors = [] # colors we have checked on this row
		difColors2 = [] # colors we have checked on this column
		r1=1 # our current distance in row is 1
		r2=1 # our current distance in column is 1
		for y in range(im.size[1]): # img.size[1] is the height
			col = imx[x,y] # this gives the pixel in the x,y coordinates
			col2 = imx[y,x] # this gives the pixel in the y,x coordinates

			if col not in difColors: # if its a pixel we haven't seen
				difColors.append(col) # we add it as seen
				if r1 > 1: # if its not the case when we actually start, this means we reached a new color 
					minCount = min(minCount, r1) # we update our ditance
			else:
				r1+=1 # if not, we have traveled 1 pixel more without a new color
			
                        # the same but with column
			if col2 not in difColors2:
				difColors2.append(col2)
				if r2 > 1:
					minCount = min(minCount, r2)
			else:
				r2+=1


        # now we know the real distance between pixels, that is to say, every x pixels there is a new pixel color

	tileSize = im.size[0] / minCount # tileSize represents each pixel size in the 150x150 image (30 pixels if its 5x5)

	minCount = im.size[0] * 1.0 / tileSize # 150x150 in a 9x9 image is not a round number, so we fix the minCount to represent exact value

	pixelData = [] # real data we care about (5x5 or 9x9)

	print link, tileSize, minCount

	for y in range(tileSize): # here we add each pixel we care about to our grid
		pixelData.append([])
		for x in range(tileSize): # we pick the pixel at coordinates x*mincount,y*mincount but that might give a value with decimals so we do a little math fix
			pixelData[y].append(imx[math.floor(x*minCount)+1,math.floor(y*minCount)+1])


	arr1=numpy.array(pixelData) # we pass it to numpy array which is much faster
        
        #return useful stuff
	return { 
		"img": im,
		"imx" : imx,
		"tileSize": tileSize,
		"pixelData": arr1
	} 


4) Comparing the new 5x5, 9x9 or 10x10 pixel image we now have with an image we want to check:

This can be done with a simple pixel x pixel match, but this would take hours to solve if you want to compare against 100k images and check for rotations as well. So after a bit of searching, there is an awesome concept called integral image, which is really nicely explained on wikipedia, but the idea is that we want to match a 5x5 pixel image window over a 80x80 (as example). The idea of integral image is that instead of having a grid with all the pixels, we pass that to a grid of sum of pixels, in a way that we can easily know the sum of pixels on a 5x5 window of the 80x80 image.

So instead of checking against every pixel, we actually only check if the sum of pixels of  the 5x5 image is the same as the window we are looking on the 80x80 one. If it doesn't match, there is no need to check if all pixels match. If it matches, its possible that it is a perfect match, so we add that region as possible regions to analize later pixel by pixel (note that except extreme cases, if it gives a posible region it most likely will be the match we want)

And the awesome part is that the integral image of the 5x5 one is the same as its possible rotations, as we just sum up all the pixels values
 

def find_image(im, tpl): #check if matches, if it does it returns the location of the match. If not, throws an error . im is our 80x80 image, tpl is our 5x5 template

	im = numpy.atleast_3d(im) # we make sure it has at least a depth of 3 (red,green,blue)
	tpl = numpy.atleast_3d(tpl) # we make sure it has at least depth of 3 (red,green,blue)

	H, W, D = im.shape[:3] # we get the heigh, width and depth
	h, w, d = tpl.shape[:3]

	# Integral image and template sum per channel
	sat = im.cumsum(1).cumsum(0)
	

	# Calculate lookup table for all the possible windows
	iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:] 
	lookup = iD - iB - iC + iA


	# Possible matches
	tplsum = numpy.array([tpl[:, :, i].sum() for i in range(D)])

	possible_match = numpy.where(numpy.logical_and(*[lookup[..., i] == tplsum[i] for i in range(D)]))

	hastoCheck = False

	for possibility in possible_match: # if there isn't a possible match, we don't have to check anything else
		if len(possibility) > 0:
			hastoCheck = True

	if hastoCheck: # if there is a possible match, we have to check pixel by pixel against the 5x5 one, and its possible rotations 
		zipped = zip(*possible_match)
		tp2 = numpy.rot90(tpl)
		tp3 = numpy.rot90(tp2)
		tp4 = numpy.rot90(tp3)
		for tp in [tpl,tp2,tp3,tp4]:

			# Find exact match
			for y, x in zipped:
				if numpy.all(im[y+1:y+h+1, x+1:x+w+1] == tp):

					return (y+1, x+1)



	raise Exception("Image not found")

Between this and a bit of threading, I was able to match a 5x5 image against 45723 item images I obtained in just 12 seconds using 100% cpu. Took more to read the images and start the threads than the actual matching which takes less than 0.02 seconds each. Using 50% cpu it took 20 seconds

So this awesome method of image matching is great for our purpose and quite useful for a lot of things in general when we want to match something exactly in an efficient way

However, all this image matching is not actually really necessary to match in less than a minute. Another choice is using a database and storing the images in a way we can filter the ones that do not even have the pixels that the 5x5 image has. Why check an image with no blue pixel if the 5x5 one has a blue pixel? That's the general idea, and I tried over 10 methods of doing probability filters and things like that but they all took a lot (30-60 seconds, which is more than manually matching) and besides were a pain to set up the data on the db.

Finally, in my last try I decided to go with a pretty straightforward and simple structure, just make a row on a images table with image name and all its unique pixels data. Then, I check the unique pixels of the 5x5 image and query the images that have all those pixels on it. With this, the program took 10 seconds and the actual query on mysql took 3-4 seconds.

The best part of this approach is that it will work better when you need to match against more than 40k images, and surprisingly enough, with the last 3 mystery pic competitions when I made the query on the db, only the picture that was the actual solution wasn't filtered out! This says a lot about the uniqueness of pixels, even if the 5x5 image had only 6 different pixels, from the 47k images I had only 1 had exatly the same pixels in them

All in all, my program is 400 lines, with both the db and integral image approach.

Feel free to ask any questions or give an idea for an ever better solving! I'm pretty sure this would solve any mystery pic in under a minute with all neo pictures though :)

 



#2 Neoquest

Neoquest
  • 1760 posts


Users Awards

Posted 12 July 2015 - 01:28 PM

The database approach sounds pretty neato. I would like to do something like that for funsies.


How did you get URLs for every image on neopets though?



#3 juvian

juvian
  • 123 posts


Users Awards

Posted 12 July 2015 - 02:18 PM

For items I had to crawl jn images, and the rest it was quite too hard for me until last week when drsloth images from jn upated their way of searching, so you can just search newest images and get all links with pagination. Did that yesterday, got 127k images in 1:30 hours, 2gb of images :p. Haven't put the info of those in db yet though ;). Have fun!





Also tagged with one or more of these keywords: python, tutorial, mystery pic

0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users