In my last blog, I explained how to take advantage of an information leakage arising from the repeated backgrounds in Kaggle's Denoising Dirty Documents competition. That approach removed the background fairly well, but the resulting score was not good enough for a strong placing. We need to do some further processing on the images to improve the score.

Today we will use an approach that does not require me to do any feature engineering – convolutional neural networks, which are neural networks where the first few layers repeatedly apply the same weights across overlapping regions of the input data. One intuitive way of thinking about this is that it is like applying an edge detection filter (much like I described here) where the algorithm finds the appropriate weights for several different edge filters.
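To make that intuition concrete, here is a minimal sketch (plain numpy, not part of the competition code) of sliding a single hand-crafted edge detection kernel across an image, which is the same operation a convolutional layer performs, except that the layer learns its kernel weights:

```python
import numpy as np

def conv2d_valid(img, kernel):
    # slide the kernel over every position of the image (no padding),
    # applying the same weights at each overlapping region
    kh, kw = kernel.shape
    h, w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# a hand-crafted vertical edge detector; a convolutional layer learns
# many kernels like this rather than having us specify them
edge = np.array([[1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0],
                 [1.0, 0.0, -1.0]])

img = np.zeros((5, 6))
img[:, 3:] = 1.0                  # a vertical edge at column 3
out = conv2d_valid(img, edge)
print(out.shape)                  # (3, 4)
```

The filter responds only where its window straddles the edge; a layer with 25 filters learns 25 such patterns and applies each of them across the whole image.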

[Figure: convolutional neural network architecture (mylenet.png)]

I’m told that convolutional neural networks are inspired by how vision works in the natural world. So if I test whether convolutional neural networks work well, am I giving them a robot eye test?


While I am comfortable coding in R, there is little support for convolutional neural networks in R, so I had to code this in Python, using the Theano library. I chose Theano because neural network model fitting can be quite time consuming, and Theano supports GPU-based processing, which can be orders of magnitude faster than CPU-based calculations. To simplify the code, I am using Daniel Nouri's nolearn library, which sits over the lasagne library, which in turn sits over Theano. This is the first time I have coded in Python and the first time I have used convolutional neural networks, so it was a good learning experience.

Since my PCs don't have top-of-the-line graphics cards with GPU processing support, I decided to run my code in the cloud on a virtual machine with GPU support. And since I didn't want to go through the effort of setting up a Linux machine and installing all of the libraries and compilers, I used Domino Data Labs to host my analysis. You can find my project here.

Before I progress to coding the model, I have to set up the environment.

import os
import shutil

def setup_theano():
    # copy the Theano settings file into the home folder on the
    # virtual machine so that Theano picks up the GPU configuration
    destfile = "/home/ubuntu/.theanorc"
    open(destfile, 'a').close()
    shutil.copyfile("/mnt/.theanorc", destfile)

    print "Finished setting up Theano"

The Python script shown above creates a function that copies the Theano settings file into the appropriate folder in the virtual machine, so that Theano knows to use GPU processing rather than CPU processing. This function gets called from my main script.
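For reference, a minimal .theanorc that switches Theano onto the GPU typically looks something like this (the exact flags depend on your CUDA setup; this is an assumption about a standard configuration, not the file from my project):

```ini
[global]
device = gpu
floatX = float32
```

The floatX = float32 line matters because GPU arithmetic in Theano is done in single precision, which is why the scripts below cast everything to float32.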

In order to reduce the number of calculations, I used a network architecture that inputs an image and outputs an image. The suggestion for this architecture comes from ironbar, a great guy who placed third in the competition. This is unlike all of the examples I found online, which have just one or two outputs, usually because the online example is a classification problem identifying objects appearing within the image. But there are two issues with this architecture:

  1. it doesn’t allow for fully connected layers before the output, and
  2. the target images are different sizes.

I chose to ignore the first problem, although if I had time I would have tried out a more traditional architecture that included fully connected layers but which only models one target pixel at a time.

def image_matrix(img):
    """
    Returns an array of shape (<number of images>, 1, <rows>, <columns>),
    ready for use as a Theano tensor.
    """
    # the images are either 420 x 540 or 258 x 540
    if img.shape[0] == 258:
        return (img[0:258, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540))
    if img.shape[0] == 420:
        result = []
        result.append((img[0:258, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540)))
        result.append((img[162:420, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540)))
        result = np.vstack(result).astype('float32').reshape((2, 1, 258, 540))
        return result

For the second problem, I used the script shown above to split the larger images into two smaller images that were the same size as the other small images in the data, thereby standardising the output dimensions.
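Concretely, each 420-row image is cut into rows 0 to 257 and rows 162 to 419: two 258 x 540 halves that overlap by 96 rows. Pasting the halves back in that order reconstructs the full image, as this toy round trip shows (a synthetic array stands in for a real scan):

```python
import numpy as np

# a stand-in for a 420 x 540 image (the competition's larger size)
img = np.arange(420.0 * 540).reshape((420, 540))

# split into two 258 x 540 halves that overlap in rows 162..257
top = img[0:258, :]
bottom = img[162:420, :]

# recombine by pasting the top half, then letting the bottom half
# overwrite the overlapping rows
rebuilt = np.empty((420, 540))
rebuilt[0:258, :] = top
rebuilt[162:420, :] = bottom

print(np.array_equal(rebuilt, img))  # True: the round trip is lossless
```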

def load_train_set(file_list):
    xs = []
    ys = []
    for fname in file_list:
        x = image_matrix(load_image(os.path.join('./train_foreground/', fname)))
        y = image_matrix(load_image(os.path.join('./train_cleaned/', fname)))
        for i in range(0, x.shape[0]):
            xs.append(x[i, :, :, :].reshape((1, 1, 258, 540)))
            ys.append(y[i, :, :, :].reshape((1, 1, 258, 540)))
    return np.vstack(xs), np.vstack(ys)

Theano uses tensors (multi-dimensional matrices) to store the training data and outputs. The first dimension is the index of the training data item. The second dimension is the colourspace information, e.g. RGB images would have 3 channels. Since our images are greyscale, this dimension has a size of only 1. The remaining dimensions are the dimensions of the input data / output data. The script shown above reshapes the data to meet this criterion.
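A short numpy sketch of that tensor layout (using synthetic data rather than the competition images):

```python
import numpy as np

# stack two greyscale 258 x 540 images into the 4D layout Theano
# expects: (example index, colour channels, rows, columns)
imgs = [np.random.rand(258, 540).astype('float32') for _ in range(2)]
batch = np.vstack([im.reshape((1, 1, 258, 540)) for im in imgs])

print(batch.shape)  # (2, 1, 258, 540): 2 examples, 1 channel each
```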
The nolearn library simplifies the process of defining the architecture of a neural network. I used 3 hidden convolutional layers, each with 25 image filters. The script below shows how this was achieved.

net2 = NeuralNet(
    layers = [
        ('input', layers.InputLayer),
        ('conv1', layers.Conv2DLayer),
        ('conv2', layers.Conv2DLayer),
        ('conv3', layers.Conv2DLayer),
        ('output', layers.FeaturePoolLayer),
        ],
    # layer parameters:
    input_shape = (None, 1, 258, 540),
    conv1_num_filters = 25, conv1_filter_size = (7, 7), conv1_pad = 'same',
    conv2_num_filters = 25, conv2_filter_size = (7, 7), conv2_pad = 'same',
    conv3_num_filters = 25, conv3_filter_size = (7, 7), conv3_pad = 'same',
    output_pool_size = 25,
    output_pool_function = T.sum,
    y_tensor_type = T.tensor4,

    # optimization parameters:
    update = nesterov_momentum,
    update_learning_rate = 0.005,
    update_momentum = 0.9,
    regression = True,
    max_epochs = 200,
    verbose = 1,
    batch_iterator_train = BatchIterator(batch_size=25),
    on_epoch_finished = [EarlyStopping(patience=20)],
    train_split = TrainSplit(eval_size=0.25)
    )
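The output layer deserves a comment: with output_pool_size = 25 and output_pool_function = T.sum, the FeaturePoolLayer collapses the 25 feature maps coming out of conv3 into a single channel by summing them, which is what turns the network's output back into one greyscale image. A numpy sketch of that pooling step (random activations standing in for the real ones):

```python
import numpy as np

# pretend activations from the last convolutional layer for one image:
# (1 example, 25 feature maps, 258 rows, 540 columns)
activity = np.random.rand(1, 25, 258, 540).astype('float32')

# summing across the 25 feature maps mimics a FeaturePoolLayer with
# pool_size=25 and T.sum as the pooling function
pooled = activity.sum(axis=1, keepdims=True)

print(pooled.shape)  # (1, 1, 258, 540): one channel per image
```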

Because this problem differs from the classification examples found online, my first attempt at this script was not successful. Here are the key changes that I needed to make:

  • set y_tensor_type=T.tensor4 because the target is 2 dimensional
  • your graphics card almost certainly doesn’t have enough RAM to process all of the images at once, so you need to use a batch iterator and experiment to find a suitable batch size e.g. batch_iterator_train=BatchIterator(batch_size=25)
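As a rough back-of-the-envelope check on why batching is needed, here is a simplified estimate of the float32 activation memory for a single convolutional layer at various batch sizes (real GPU usage is higher, since every layer plus the gradients must fit at once):

```python
# rough activation memory for one conv layer:
# batch x filters x rows x cols x 4 bytes (float32)
def layer_megabytes(batch_size, num_filters=25, rows=258, cols=540):
    return batch_size * num_filters * rows * cols * 4 / (1024.0 * 1024.0)

for b in (25, 50, 100):
    print("batch %3d: ~%.0f MB per conv layer" % (b, layer_megabytes(b)))
```

Even at a batch size of 25, each convolutional layer holds roughly a third of a gigabyte of activations, so processing the whole training set in one batch would exhaust the graphics card's RAM.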

plot_loss(net2)
plt.savefig("./results/plotloss.png")

I also wanted to plot the loss across the iterations. So I added the two lines above, giving me the graph below.

During the first several iterations, the neural network is balancing out the weights so that the pixels are the correct magnitude, and after that the serious work of image processing begins.


plot_conv_weights(net2.layers_[1], figsize=(4, 4))
plt.savefig("./results/convweights.png")

I wanted to see what some of the convolutional filters looked like. So I added the two lines shown above, giving me the set of images below.

These filters look like small parts of images of letters, which makes some sense because we are trying to identify whether a pixel sits on the stroke of a letter.

The output looks reasonable, although not as good as what I achieved using a combination of image processing techniques and deep learning.

In the output image above you can see some imperfections.

I think that if I had changed the network architecture to include fully connected layers then I would have achieved a better result. Maybe one day when I have enough time I will experiment with that architecture.
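As a hypothetical sketch of what the data preparation for that alternative architecture might look like, each training example could be a small patch centred on a target pixel, with the cleaned value of that single pixel as the label, so that fully connected layers could follow the convolutional ones. The function name and patch size below are my own illustration, not code from the competition:

```python
import numpy as np

def make_patches(img, target, size=11):
    # hypothetical per-pixel training data: one (size x size) patch
    # per pixel, labelled with the cleaned value of the centre pixel
    pad = size // 2
    padded = np.pad(img, pad, mode='edge')
    xs, ys = [], []
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            xs.append(padded[r:r + size, c:c + size].reshape(1, 1, size, size))
            ys.append(target[r, c])
    return np.vstack(xs), np.array(ys)

img = np.random.rand(6, 5).astype('float32')
xs, ys = make_patches(img, img)
print(xs.shape)  # (30, 1, 11, 11): one patch per pixel
```

The trade-off is cost: one example per pixel means vastly more training rows than the image-in, image-out architecture used above, which is why I avoided it in the first place.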

The full Python script is shown below:


import random
import numpy as np
import cv2
import os
import itertools
import math
import matplotlib.pyplot as plt

from setup_GPU import setup_theano
setup_theano()

from lasagne import layers
from lasagne.updates import nesterov_momentum
from lasagne.nonlinearities import softmax
from lasagne.nonlinearities import sigmoid
from nolearn.lasagne import BatchIterator
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import TrainSplit
from nolearn.lasagne import PrintLayerInfo
from nolearn.lasagne.visualize import plot_loss
from nolearn.lasagne.visualize import plot_conv_weights
from nolearn.lasagne.visualize import plot_conv_activity
from nolearn.lasagne.visualize import plot_occlusion

import theano.tensor as T

def load_image(path):
    return cv2.imread(path, cv2.IMREAD_GRAYSCALE)

def write_image(img, path):
    return cv2.imwrite(path, img)

def image_matrix(img):
    """
    Returns an array of shape (<number of images>, 1, <rows>, <columns>),
    ready for use as a Theano tensor.
    """
    # the images are either 420 x 540 or 258 x 540
    if img.shape[0] == 258:
        return (img[0:258, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540))
    if img.shape[0] == 420:
        result = []
        result.append((img[0:258, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540)))
        result.append((img[162:420, 0:540] / 255.0).astype('float32').reshape((1, 1, 258, 540)))
        result = np.vstack(result).astype('float32').reshape((2, 1, 258, 540))
        return result

def load_train_set(file_list):
    xs = []
    ys = []
    for fname in file_list:
        x = image_matrix(load_image(os.path.join('./train_foreground/', fname)))
        y = image_matrix(load_image(os.path.join('./train_cleaned/', fname)))
        for i in range(0, x.shape[0]):
            xs.append(x[i, :, :, :].reshape((1, 1, 258, 540)))
            ys.append(y[i, :, :, :].reshape((1, 1, 258, 540)))
    return np.vstack(xs), np.vstack(ys)

def load_test_file(fname, folder):
    xs = []
    x = image_matrix(load_image(os.path.join(folder, fname)))
    for i in range(0, x.shape[0]):
        xs.append(x[i, :, :, :].reshape((1, 1, 258, 540)))
    return np.vstack(xs)

def list_images(folder):
    included_extensions = ['jpg', 'bmp', 'png', 'gif']
    results = [fn for fn in os.listdir(folder) if any([fn.endswith(ext) for ext in included_extensions])]
    return results

def do_test(inFolder, outFolder, nn):
    test_images = list_images(inFolder)
    nTest = len(test_images)
    for x in range(0, nTest):
        fname = test_images[x]
        x1 = load_test_file(fname, inFolder)
        x1 = x1 - 0.5
        pred_y = nn.predict(x1)
        tempImg = []
        if pred_y.shape[0] == 1:
            tempImg = pred_y[0, 0, :, :].reshape(258, 540)
        if pred_y.shape[0] == 2:
            tempImg1 = pred_y[0, 0, :, :].reshape(258, 540)
            tempImg2 = pred_y[1, 0, :, :].reshape(258, 540)
            tempImg = np.empty((420, 540))
            tempImg[0:258, 0:540] = tempImg1
            tempImg[162:420, 0:540] = tempImg2
        tempImg[tempImg < 0] = 0
        tempImg[tempImg > 1] = 1
        tempImg = np.asarray(tempImg * 255.0, dtype=np.uint8)
        write_image(tempImg, os.path.join(outFolder, fname))

class EarlyStopping(object):
    def __init__(self, patience=100):
        self.patience = patience
        self.best_valid = np.inf
        self.best_valid_epoch = 0
        self.best_weights = None

    def __call__(self, nn, train_history):
        current_valid = train_history[-1]['valid_loss']
        current_epoch = train_history[-1]['epoch']
        if current_valid < self.best_valid:
            self.best_valid = current_valid
            self.best_valid_epoch = current_epoch
            self.best_weights = nn.get_all_params_values()
        elif self.best_valid_epoch + self.patience < current_epoch:
            print("Early stopping.")
            print("Best valid loss was {:.6f} at epoch {}.".format(
                self.best_valid, self.best_valid_epoch))
            nn.load_params_from(self.best_weights)
            raise StopIteration()

def main():
    random.seed(1234)

    training_images = list_images("./train_foreground/")
    random.shuffle(training_images)
    nTraining = len(training_images)
    TRAIN_IMAGES = training_images

    train_x, train_y = load_train_set(TRAIN_IMAGES)

    # centre on zero - the inputs have already been divided by 255
    train_x = train_x - 0.5

    # evaluate on the training data (no cleaned targets exist for the
    # test set); assigned after centring so the inputs match training
    test_x = train_x
    test_y = train_y

    net2 = NeuralNet(
        layers = [
            ('input', layers.InputLayer),
            ('conv1', layers.Conv2DLayer),
            ('conv2', layers.Conv2DLayer),
            ('conv3', layers.Conv2DLayer),
            ('output', layers.FeaturePoolLayer),
            ],
        # layer parameters:
        input_shape = (None, 1, 258, 540),
        conv1_num_filters = 25, conv1_filter_size = (7, 7), conv1_pad = 'same',
        conv2_num_filters = 25, conv2_filter_size = (7, 7), conv2_pad = 'same',
        conv3_num_filters = 25, conv3_filter_size = (7, 7), conv3_pad = 'same',
        output_pool_size = 25,
        output_pool_function = T.sum,
        y_tensor_type = T.tensor4,

        # optimization parameters:
        update = nesterov_momentum,
        update_learning_rate = 0.005,
        update_momentum = 0.9,
        regression = True,
        max_epochs = 200,
        verbose = 1,
        batch_iterator_train = BatchIterator(batch_size=25),
        on_epoch_finished = [EarlyStopping(patience=20)],
        train_split = TrainSplit(eval_size=0.25)
        )

    net2.fit(train_x, train_y)

    plot_loss(net2)
    plt.savefig("./results/plotloss.png")
    plot_conv_weights(net2.layers_[1], figsize=(4, 4))
    plt.savefig("./results/convweights.png")

    #layer_info = PrintLayerInfo()
    #layer_info(net2)

    import cPickle as pickle
    with open('results/net2.pickle', 'wb') as f:
        pickle.dump(net2, f, -1)

    y_pred2 = net2.predict(test_x)
    print "The mean absolute error of this network is: %0.2f" % (abs(y_pred2 - test_y)).mean()

    do_test("./train_foreground/", './train_predicted/', net2)
    do_test("./test_foreground/", './test_predicted/', net2)

if __name__ == '__main__':
    main()