HW5 -- Artificial Artificial Intelligence

In [1]:
import pandas as pd
import numpy as np

# AMT result files for the negative- and positive-review batches
neg = pd.read_csv('AMT_neg.csv')
pos = pd.read_csv('AMT_pos.csv')

Initial EDA

In [5]:
neg[:3]
Out[5]:
HITId HITTypeId Title Description Keywords Reward CreationTime MaxAssignments RequesterAnnotation AssignmentDurationInSeconds ... RejectionTime RequesterFeedback WorkTimeInSeconds LifetimeApprovalRate Last30DaysApprovalRate Last7DaysApprovalRate Input.text Answer.sentiment.label Approve Reject
0 3IQ9O0AYW6ZI3GD740H32KGG2SWITJ 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:08:17 PDT 2019 3 BatchId:3821423;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 44 0% (0/0) 0% (0/0) 0% (0/0) Missed Opportunity\nI had been very excited to... Neutral NaN NaN
1 3IQ9O0AYW6ZI3GD740H32KGG2SWITJ 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:08:17 PDT 2019 3 BatchId:3821423;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 7 0% (0/0) 0% (0/0) 0% (0/0) Missed Opportunity\nI had been very excited to... Negative NaN NaN
2 3IQ9O0AYW6ZI3GD740H32KGG2SWITJ 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:08:17 PDT 2019 3 BatchId:3821423;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 449 0% (0/0) 0% (0/0) 0% (0/0) Missed Opportunity\nI had been very excited to... Positive NaN NaN

3 rows × 31 columns

In [6]:
pos[:3]
Out[6]:
HITId HITTypeId Title Description Keywords Reward CreationTime MaxAssignments RequesterAnnotation AssignmentDurationInSeconds ... RejectionTime RequesterFeedback WorkTimeInSeconds LifetimeApprovalRate Last30DaysApprovalRate Last7DaysApprovalRate Input.text Answer.sentiment.label Approve Reject
0 3VMV5CHJZ8F47P7CECH0H830NF4GTP 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:11:19 PDT 2019 3 BatchId:3821427;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 355 0% (0/0) 0% (0/0) 0% (0/0) funny like a clown\nGreetings again from the d... Positive NaN NaN
1 3VMV5CHJZ8F47P7CECH0H830NF4GTP 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:11:19 PDT 2019 3 BatchId:3821427;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 487 0% (0/0) 0% (0/0) 0% (0/0) funny like a clown\nGreetings again from the d... Neutral NaN NaN
2 3VMV5CHJZ8F47P7CECH0H830NF4GTP 3N0K7CX2I27L2NR2L8D93MF8LIRA5J Sentiment analysis Sentiment analysis sentiment, text $0.02 Fri Nov 01 12:11:19 PDT 2019 3 BatchId:3821427;OriginalHitTemplateId:928390909; 10800 ... NaN NaN 1052 0% (0/0) 0% (0/0) 0% (0/0) funny like a clown\nGreetings again from the d... Positive NaN NaN

3 rows × 31 columns

In [7]:
neg.columns.tolist()
Out[7]:
['HITId',
 'HITTypeId',
 'Title',
 'Description',
 'Keywords',
 'Reward',
 'CreationTime',
 'MaxAssignments',
 'RequesterAnnotation',
 'AssignmentDurationInSeconds',
 'AutoApprovalDelayInSeconds',
 'Expiration',
 'NumberOfSimilarHITs',
 'LifetimeInSeconds',
 'AssignmentId',
 'WorkerId',
 'AssignmentStatus',
 'AcceptTime',
 'SubmitTime',
 'AutoApprovalTime',
 'ApprovalTime',
 'RejectionTime',
 'RequesterFeedback',
 'WorkTimeInSeconds',
 'LifetimeApprovalRate',
 'Last30DaysApprovalRate',
 'Last7DaysApprovalRate',
 'Input.text',
 'Answer.sentiment.label',
 'Approve',
 'Reject']

How many unique turkers worked on each dataframe?

In [31]:
def get_unique(df, column):
    # Return the number of unique values in `column`, the (values, counts) arrays,
    # and a two-column dataframe of value/count pairs.
    unique = np.unique(df[column], return_counts=True)
    counts_df = pd.DataFrame(zip(unique[0], unique[1]))
    return len(unique[0]), unique, counts_df

num_neg, unique_neg, u_neg_df = get_unique(neg, 'WorkerId')    
num_pos, unique_pos, u_pos_df = get_unique(pos, 'WorkerId')

print(num_neg, 'Turkers worked on NEG batch')
print(num_pos, 'Turkers worked on POS batch')
53 Turkers worked on NEG batch
38 Turkers worked on POS batch
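
As a sanity check, pandas' built-in nunique() should give the same counts (a minimal sketch on the same dataframes):

In [ ]:
# Sanity check: nunique() should match the counts computed above
print(neg['WorkerId'].nunique(), 'Turkers worked on NEG batch')
print(pos['WorkerId'].nunique(), 'Turkers worked on POS batch')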

How many HITs did each unique Turker do?

In [32]:
u_neg_df.plot(kind='bar',x=0,y=1)
Out[32]:
<matplotlib.axes._subplots.AxesSubplot at 0x11cdcb978>
In [33]:
u_pos_df.plot(kind='bar',x=0,y=1)
Out[33]:
<matplotlib.axes._subplots.AxesSubplot at 0x11c5d1748>

What are the max and min HIT counts per unique Turker?

In [39]:
print('For {}, the min was: {} and the max was: {}'.format('neg', unique_neg[1].min(), unique_neg[1].max())) 
print('For {}, the min was: {} and the max was: {}'.format('pos', unique_pos[1].min(), unique_pos[1].max())) 
For neg, the min was: 1 and the max was: 37
For pos, the min was: 1 and the max was: 40

Did a specific sentiment take longer for Turkers to assess?

In [20]:
import seaborn as sns
import matplotlib.pyplot as plt
sns.catplot(x="Answer.sentiment.label", 
            y="WorkTimeInSeconds", 
            kind="bar", 
            order=['Negative', 'Neutral', 'Positive'], 
            data=neg);
plt.title('Negative')
Out[20]:
Text(0.5, 1, 'Negative')
In [19]:
sns.catplot(x="Answer.sentiment.label", 
            y="WorkTimeInSeconds", 
            kind="bar", 
            order=['Negative', 'Neutral', 'Positive'], 
            data=pos)
plt.title('Positive')
Out[19]:
Text(0.5, 1, 'Positive')

How many responses had a work time under 10 seconds?

In [44]:
response_time = neg[neg['WorkTimeInSeconds'] < 10]        # responses under 10 seconds
response_time_check = neg[neg['WorkTimeInSeconds'] > 10]  # responses over 10 seconds (exactly 10 falls in neither bucket)
In [45]:
len(response_time)
Out[45]:
48
In [46]:
len(response_time_check)
Out[46]:
312
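
The counts above are per response rather than per Turker; a quick follow-up sketch (using the same neg dataframe) for the per-worker version:

In [ ]:
# How many distinct Turkers posted at least one sub-10-second response?
fast_workers = neg.loc[neg['WorkTimeInSeconds'] < 10, 'WorkerId'].nunique()
fast_workers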

Checking for potential bots

Did anyone have a consistently low average response time?

In [74]:
count = pos.groupby(['WorkerId'])['HITId'].count()
work_time = pos.groupby(['WorkerId'])['WorkTimeInSeconds'].mean()
new_df = pd.DataFrame([work_time, count]).T
new_df[:5]
Out[74]:
WorkTimeInSeconds HITId
WorkerId
A13CLN8L5HFT46 7.230769 13.0
A18WFPSLFV4FKY 47.000000 2.0
A1IQV3QUWRA8G1 22.000000 1.0
A1N1ULK71RHVMM 10.000000 3.0
A1S2MN0E9BHPVA 173.444444 27.0
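
The same per-worker summary can be built in a single pass with agg (an equivalent sketch, assuming no missing WorkTimeInSeconds values):

In [ ]:
# One-pass equivalent of the two groupbys above; columns renamed to match new_df
new_df_alt = pos.groupby('WorkerId')['WorkTimeInSeconds'].agg(['mean', 'count'])
new_df_alt.columns = ['WorkTimeInSeconds', 'HITId']
new_df_alt[:5]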

Did anyone have a consistently high average response time?

In [75]:
new_df['WorkTimeInMin'] = new_df['WorkTimeInSeconds']/60
new_df[:5]
Out[75]:
WorkTimeInSeconds HITId WorkTimeInMin
WorkerId
A13CLN8L5HFT46 7.230769 13.0 0.120513
A18WFPSLFV4FKY 47.000000 2.0 0.783333
A1IQV3QUWRA8G1 22.000000 1.0 0.366667
A1N1ULK71RHVMM 10.000000 3.0 0.166667
A1S2MN0E9BHPVA 173.444444 27.0 2.890741
In [86]:
count = pos.groupby(['WorkerId', 'Answer.sentiment.label'])['Answer.sentiment.label'].count()
# count = pos.groupby(['WorkerId'])['Answer.sentiment.label'].count()
count
Out[86]:
WorkerId        Answer.sentiment.label
A13CLN8L5HFT46  Neutral                    2
                Positive                  11
A18WFPSLFV4FKY  Positive                   2
A1IQV3QUWRA8G1  Positive                   1
A1N1ULK71RHVMM  Negative                   1
                                          ..
AMC42JMQA8A5U   Positive                   1
AO2WNSGOXAX52   Neutral                    3
                Positive                   1
AOMFEAWQHU3D8   Neutral                    1
                Positive                   6
Name: Answer.sentiment.label, Length: 74, dtype: int64

Did anyone answer ONLY pos/neg/neutral?

In [117]:
pnn = pd.DataFrame()
pnn['Neutral'] = pos.groupby('WorkerId')['Answer.sentiment.label'].apply(lambda x: (x=='Neutral').sum())
pnn['Positive'] = pos.groupby('WorkerId')['Answer.sentiment.label'].apply(lambda x: (x=='Positive').sum())
pnn['Negative'] = pos.groupby('WorkerId')['Answer.sentiment.label'].apply(lambda x: (x=='Negative').sum())
pnn['Total'] = pos.groupby('WorkerId')['Answer.sentiment.label'].apply(lambda x: x.count())
pnn[:5]
Out[117]:
Neutral Positive Negative Total
WorkerId
A13CLN8L5HFT46 2 11 0 13
A18WFPSLFV4FKY 0 2 0 2
A1IQV3QUWRA8G1 0 1 0 1
A1N1ULK71RHVMM 0 2 1 3
A1S2MN0E9BHPVA 2 21 4 27
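
As an aside, the same label-count table can be produced in one shot with pd.crosstab (a sketch that should match pnn up to column order):

In [ ]:
# One-line equivalent of the four groupby/apply calls above
pnn_alt = pd.crosstab(pos['WorkerId'], pos['Answer.sentiment.label'])
pnn_alt['Total'] = pnn_alt.sum(axis=1)
pnn_alt[:5]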

This is getting a little confusing, so let's just look at our top performers by total HITs

In [122]:
top = pnn.sort_values(by=['Total'], ascending=False)
In [123]:
top[:10]
Out[123]:
Neutral Positive Negative Total
WorkerId
A681XM15AN28F 13 20 7 40
A1Y66T7FKJ8PJA 5 23 7 35
A33ENZVC1XB4BA 0 34 0 34
A1S2MN0E9BHPVA 2 21 4 27
A37L5E8MHHQGZM 6 13 3 22
AE03LUY7RH400 4 10 7 21
A2G44A4ZPWRPXU 4 12 2 18
A1YK1IKACUJMV4 0 15 0 15
A3AW887GI0NLKF 3 10 2 15
A3HAEQW13YPT6A 0 14 0 14

Interesting!! From this view, three workers ONLY chose Positive.

Let's look at their response times to see whether they might be bots!

In [130]:
top['Avg_WorkTimeInSeconds'] = pos.groupby('WorkerId')['WorkTimeInSeconds'].apply(lambda x: x.mean())
top['Avg_WorkTimeInMin'] = pos.groupby('WorkerId')['WorkTimeInSeconds'].apply(lambda x: x.mean()/60)
top['Min_WorkTimeInMin'] = pos.groupby('WorkerId')['WorkTimeInSeconds'].apply(lambda x: x.min()/60)
top['Max_WorkTimeInMin'] = pos.groupby('WorkerId')['WorkTimeInSeconds'].apply(lambda x: x.max()/60)
In [131]:
top[:10]
Out[131]:
Neutral Positive Negative Total Avg_WorkTimeInSeconds Avg_WorkTimeInMin Min_WorkTimeInMin Max_WorkTimeInMin
WorkerId
A681XM15AN28F 13 20 7 40 13.575000 0.226250 0.100000 0.833333
A1Y66T7FKJ8PJA 5 23 7 35 695.857143 11.597619 0.216667 22.000000
A33ENZVC1XB4BA 0 34 0 34 366.647059 6.110784 0.616667 9.916667
A1S2MN0E9BHPVA 2 21 4 27 173.444444 2.890741 0.400000 4.983333
A37L5E8MHHQGZM 6 13 3 22 346.272727 5.771212 2.150000 8.283333
AE03LUY7RH400 4 10 7 21 102.238095 1.703968 0.100000 3.433333
A2G44A4ZPWRPXU 4 12 2 18 221.277778 3.687963 0.383333 7.383333
A1YK1IKACUJMV4 0 15 0 15 593.600000 9.893333 1.716667 11.000000
A3AW887GI0NLKF 3 10 2 15 269.400000 4.490000 1.616667 7.216667
A3HAEQW13YPT6A 0 14 0 14 442.928571 7.382143 0.866667 11.100000

Even more interesting! Based on our current metric, time variability, these positive-only workers don't appear to be bots.

HOWEVER, worker A681XM15AN28F averaged only about 13.6 seconds per review, which doesn't seem like enough time to read and judge one...
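
To make the time-variability idea concrete, here is an illustrative flagging sketch; the 15-second mean and 10-second standard-deviation cutoffs are assumptions for illustration, not calibrated thresholds:

In [ ]:
# Illustrative bot heuristic (assumed cutoffs): flag workers whose work times are
# both fast on average and nearly constant across their HITs.
time_stats = pos.groupby('WorkerId')['WorkTimeInSeconds'].agg(['mean', 'std', 'count'])
possible_bots = time_stats[(time_stats['mean'] < 15) & (time_stats['std'].fillna(0) < 10)]
possible_bots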

PART 2: Second submission to AMT

TOO MANY REVIEWERS!

This is when we realized that computing a kappa score with over 30 individual reviewers would be tricky, so we resubmitted the batches to AMT and required the Turkers to hold the 'Master' qualification, in the hope that this additional barrier to entry would reduce the number of Turkers working on the project.
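
For reference, once every HIT has the same number of ratings (MaxAssignments is 3 here), agreement can be summarized with Fleiss' kappa; a hedged sketch using statsmodels, assuming each HIT in pos received exactly three assignments:

In [ ]:
# Sketch: Fleiss' kappa over the POS batch, keeping only HITs with all 3 ratings
from statsmodels.stats.inter_rater import fleiss_kappa

label_counts = pd.crosstab(pos['HITId'], pos['Answer.sentiment.label'])  # rows = HITs, cols = label counts
complete = label_counts[label_counts.sum(axis=1) == 3]
fleiss_kappa(complete.to_numpy())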

In [ ]: