Introduction
In this task we cover several machine learning topics: image pre-processing, neural networks, clustering, data description details, and model optimization with an error metric and optimizer.
In this project we cover problems such as Person Re-Identification and Semantic Person Search.
Data Pre-processing
Data pre-processing is a data mining technique used to transform raw data into a useful and efficient format.
It combines data cleaning, data transformation and data reduction.
Data cleaning removes missing values and unused data.
Data transformation includes normalization, attribute selection, discretization and concept hierarchy generation; a small sketch of two of these steps follows.
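As an illustration of two of these transformation steps, here is a minimal sketch using scikit-learn on toy data (not part of the project code):
import numpy as np
from sklearn.preprocessing import MinMaxScaler, KBinsDiscretizer

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])   # toy numeric data
X_norm = MinMaxScaler().fit_transform(X)                    # normalization: rescale each column to [0, 1]
X_disc = KBinsDiscretizer(n_bins=2, encode='ordinal', strategy='uniform').fit_transform(X)  # discretization into 2 bins
print(X_norm)
print(X_disc)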
Neural networks
Neural networks are a class of models in the machine learning literature, a specific set of algorithms that has revolutionized machine learning.
They are used to:
Understand how the brain actually works
Understand a style of parallel computation inspired by neurons
Solve practical problems by using novel learning algorithms inspired by the brain
Data description details
This covers how much data is used and describes the data columns and indexes, so the data is properly understood before operating on it, for example dividing it into target and feature variables to predict the result. A brief sketch is shown below.
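As a hedged illustration (the file name 'data.csv' and the column 'label' are hypothetical, not from the datasets used later):
import pandas as pd

df = pd.read_csv('data.csv')         # hypothetical dataset
print(df.shape)                      # how much data is used
df.info()                            # column names, datatypes and index
X = df.drop(columns=['label'])       # feature variables
y = df['label']                      # target variable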
Optimizer and error metric
To evaluate and optimize the machine learning results, the following metrics are used (a small computed example follows this list):
Recall
Recall is the proportion of actual positives that are correctly identified (true positives divided by true positives plus false negatives); it is a go-to performance measure in binary and multi-class classification problems.
Accuracy
Accuracy is the fraction of predictions the model gets right and is the headline figure when presenting results; data scientists always aim for good accuracy.
Error(R2)
A higher R2 generally indicates a better model; however, a value that is too high (close to 99%) can sometimes signal a risk of overfitting. R2 can also be misleading because of the correlation-versus-causation problem, which can produce an illogically high R2.
etc.
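As a hedged illustration of these metrics on toy values (not results from the datasets used later), scikit-learn provides ready-made functions:
from sklearn.metrics import recall_score, accuracy_score, r2_score

y_true = [1, 0, 1, 1, 0]                     # toy classification labels
y_pred = [1, 0, 0, 1, 0]
print(recall_score(y_true, y_pred))          # recall = TP / (TP + FN)
print(accuracy_score(y_true, y_pred))        # fraction of correct predictions

y_reg_true = [2.0, 3.0, 4.5]                 # toy regression targets
y_reg_pred = [2.1, 2.9, 4.2]
print(r2_score(y_reg_true, y_reg_pred))      # coefficient of determination (R2)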
Person re-identification
Person re-identification is the task of matching a detected person to a gallery of previously seen people, and determining their identity.
Keras with the TensorFlow backend is used to build the model.
Data Pre-processing
Below are some of the steps used to pre-process the data: train_coods.head() and train_coods.info().
train_coods.head(): shows the top 5 rows of the dataset.
train_coods.info(): shows all the data columns, including their datatypes.
The dataset and complete solution can be downloaded from here.
First read the train and test datasets:
# imports used throughout the pre-processing code
import os
import cv2
import numpy as np
import pandas as pd

train_coods = pd.read_csv('VIPeR/tags5.0.csv')
test_coods = pd.read_csv('VIPeR/tags5.0_plauy.csv')
Train Data
img_size = 150
training_data = []
testing_data = []
datadir = 'VIPeR/Train/'
def train_data():
    # read every image from camera A, resize it and store it
    path = datadir + 'cam_a'
    for img in os.listdir(path):
        img_arr = cv2.imread(os.path.join(path, img))
        new_arr = cv2.resize(img_arr, (img_size, img_size))
        # wrapping new_arr in a list adds the extra singleton dimension seen in the shape below
        training_data.append([new_arr])

train_data()
Here we can see the shape of the training data:
np.shape(training_data)
(482, 1, 150, 150, 3)
Test Data
Now read the test data from dataset:
def test_data():
    path = datadir + 'cam_b'
    for img in os.listdir(path):
        img_arr = cv2.imread(os.path.join(path, img))
        new_arr = cv2.resize(img_arr, (img_size, img_size))
        testing_data.append([new_arr])

test_data()
Here we can see the shape of a single test sample:
np.shape(testing_data[0])
(1, 150, 150, 3)
Resizing Images
# collapse the extra singleton dimension: (N, 1, 150, 150, 3) -> (N, 150, 150, 3)
train = np.array(training_data).reshape(-1,img_size,img_size,3)
test = np.array(testing_data).reshape(-1,img_size,img_size,3)
print('length of training data is : ',len(train))
print('length of test data is : ',len(test))
test_y = np.array(test_coods.iloc[:482,:])
train_y = np.array(train_coods.iloc[:482,:])
print('shape of train_y', train_y.shape)
print('shape of test_y' , test_y.shape)
print(train_y[:10])
Fit into model
After this, two models are used to predict the result: a multi-output random forest and a convolutional neural network.
#Import module
from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(random_state=1)
from sklearn.multioutput import MultiOutputClassifier
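The snippet above creates the random forest but never fits it. A minimal sketch of how this multi-output baseline could be trained on the flattened images is shown below; this step is an assumption (and it requires the tag columns to be discrete labels), not part of the original code:
# hedged sketch: fit the multi-output random forest on flattened images (assumed step)
multi_forest = MultiOutputClassifier(forest)
multi_forest.fit(train.reshape(len(train), -1), train_y)
rf_preds = multi_forest.predict(test.reshape(len(test), -1))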
# Keras layers used by the CNN below
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Flatten, Dense

batch_size = 200
input_shape = train.shape[1:]
model = Sequential()
model.add(Conv2D(32, kernel_size = (3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(96, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(22, activation = 'relu'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
model.fit(train, train_y, batch_size=32, epochs=5, validation_data=(test, test_y))
Output with Accuracy:
Train on 482 samples, validate on 482 samples
Epoch 1/5
482/482 [==============================] - 25s 52ms/step - loss: -144.6076 - accuracy: 0.0954 - val_loss: -151.9930 - val_accuracy: 0.0207
Epoch 2/5
482/482 [==============================] - 24s 50ms/step - loss: -178.9679 - accuracy: 0.1432 - val_loss: -175.8850 - val_accuracy: 0.1286
Epoch 3/5
482/482 [==============================] - 24s 50ms/step - loss: -190.1594 - accuracy: 0.1826 - val_loss: -186.6882 - val_accuracy: 0.1266
Epoch 4/5
482/482 [==============================] - 24s 50ms/step - loss: -196.8711 - accuracy: 0.1763 - val_loss: -201.6089 - val_accuracy: 0.1307
Epoch 5/5
482/482 [==============================] - 24s 49ms/step - loss: -205.7317 - accuracy: 0.1701 - val_loss: -204.0899 - val_accuracy: 0.1432
<keras.callbacks.callbacks.History at 0x7fde5cdcb7f0>
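Note that the negative loss values above indicate that the target matrix is not one-hot encoded class probabilities, which is what categorical cross-entropy expects, so the reported accuracy should be read with caution. A hedged sketch of a more conventional output stage, an assumption rather than the original author's code, would replace the final Dense layer and compile call, for example by treating the 22 tag columns as a regression target:
# hedged alternative output stage (assumed fix, not part of the original solution)
model.add(Dense(22, activation='linear'))      # linear output for regression on the tag values
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae'])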
Semantic Person Search
This problem is based on Semantic Person Search.
Find the complete solution here.
First, import all the training and testing data:
# import data
train_ = pd.read_csv('Semantic_Person_Search/Train_Data/Train.csv')
test_ = pd.read_csv('Semantic_Person_Search/Test_Data/Test.csv')
#pre-process data
train_.drop(['filename'],axis=1,inplace=True)
test_.drop(['filename'],axis=1,inplace=True)
# pre-processing train images
img_size = 150
training_data = []
testing_data = []
datadir = 'Semantic_Person_Search/Train_Data/'
def train_data():
    path = datadir + 'Originals'
    for img in os.listdir(path):
        img_arr = cv2.imread(os.path.join(path, img))
        new_arr = cv2.resize(img_arr, (img_size, img_size))
        training_data.append([new_arr])

train_data()
len(training_data)
520
520 training images were loaded.
# pre-processing test images
di = '/content/drive/My Drive/triple-proj/Data/Q3/Q3/Test_Data/'
def test_data():
    path = di + 'Originals'
    for img in os.listdir(path):
        img_arr = cv2.imread(os.path.join(path, img))
        new_arr = cv2.resize(img_arr, (img_size, img_size))
        testing_data.append([new_arr])

test_data()
len(testing_data)
196
196 test images were loaded.
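The original post does not show how these image lists are turned into the arrays and label matrices used by the model below. A minimal sketch, assuming the same reshaping as in the first task and that the remaining attribute columns of Train.csv / Test.csv are the targets:
# hedged sketch: build arrays and labels as in the first task (assumed, not shown in the original)
train = np.array(training_data).reshape(-1, img_size, img_size, 3)
test = np.array(testing_data).reshape(-1, img_size, img_size, 3)
train_y = np.array(train_.iloc[:len(train), :])   # assumed: attribute columns as targets
test_y = np.array(test_.iloc[:len(test), :])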
Fit into Model
# deep learning approach
batch_size = 200
input_shape = train.shape[1:]
model = Sequential()
model.add(Conv2D(32, kernel_size = (3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(64, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(96, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(BatchNormalization())
#model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
#model.add(Dropout(0.3))
model.add(Dense(13, activation = 'softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='rmsprop',
metrics=['accuracy'])
Here the model is fitted for 5 epochs to find the result:
# fitting and evaluation
model.fit(train, train_y, batch_size=32, epochs=5, validation_data=(test, test_y))
Output:
Train on 520 samples, validate on 196 samples
Epoch 1/5
520/520 [==============================] - 24s 46ms/step - loss: 25.1075 - accuracy: 0.2346 - val_loss: 169.9337 - val_accuracy: 0.1531
Epoch 2/5
520/520 [==============================] - 23s 45ms/step - loss: 10.8575 - accuracy: 0.4077 - val_loss: 50.2284 - val_accuracy: 0.2296
Epoch 3/5
520/520 [==============================] - 23s 44ms/step - loss: -27.9195 - accuracy: 0.4423 - val_loss: 57.6140 - val_accuracy: 0.3724
Epoch 4/5
520/520 [==============================] - 23s 44ms/step - loss: -94.1515 - accuracy: 0.4519 - val_loss: 81.4588 - val_accuracy: 0.3520
Epoch 5/5
520/520 [==============================] - 23s 44ms/step - loss: -209.7171 - accuracy: 0.4615 - val_loss: 66.5317 - val_accuracy: 0.3980
<keras.callbacks.callbacks.History at 0x7f3ad8151710>
Conclusion
This post covered two approaches to the given tasks and the accuracy each achieved. The first task covered Person Re-Identification and reported the accuracy of the approach used.
The second problem addressed Semantic Person Search, again reporting the accuracy of the approach and recommending matching IDs.
Send your request to realcode4you@gmail.com and get instant help at an affordable price.
We always focus on delivering unique, plagiarism-free, well-structured code, written by our highly educated professionals within your given time frame.
If you are looking for help with other programming languages like C, C++, Java, Python, PHP, Asp.Net, NodeJs, ReactJs, etc., or with different types of databases like MySQL, MongoDB, SQL Server, Oracle, etc., then contact us as well.