How to use Data Science Superpowers for Useless Things: Getting a Job at Amazon, Take 2
IMPETUS: I’d like a job as a data scientist at Amazon.
PROBLEM: There are 3062 jobs listing “data scientist” somewhere in the title or description (at least this is the number of postings that comes back when I search “data science”)
SOLUTION: Open every single one and compare the requirements and job description to my resume and desires
PROBLEM: Assuming this takes me one minute per posting (which is generous – during trials it took me two minutes on average), this would take me 3062 minutes, or 51 hours or just over two days of uninterrupted job-post-reading.
[Ain’t nobody got time for that gif]
SOLUTION: Use our friend Beautiful Soup to scrape the webpage!
PROBLEM: They are using React and Beautiful Soup is trying to scrape before the whole document has been rendered! Que lastima!
SOLUTION: Do a complicated set of awaits. JUST KIDDING. Go straight to the source and find the XHR request that is generating all this data in the first place.
[SHOW CODE HERE]
PROBLEM: We now have a spreadsheet of 3062 job postings. How can we start to narrow this down?!
SOLUTION: Remove jobs outside of the US.
PROBLEM: We still have 2543 jobs to “look at.”
SOLUTION: Let’s help ourselves out by narrowing the search down by searching through ONLY the jobs that have the words “data scientist” in the title.
–HOORAY!! This brought us down to 556!! HOWEVER…
PROBLEM: A lot of positions are “senior”
SOLUTION: Remove any positions that include “senior” or “sr.” !
–HOORAY!! This brought us down from 642 to 350!!
PROBLEM: A lot of these positions have the word “manager” or “principal”
SOLUTION: Remove any positions that include “principal” or “manager”
–HOORAY!! This brought us down from 350 to 226!!
Now, lets see how many of those 226 are unique!!
ONLY 107!!